
Learning partially observable deterministic action models (2008)

by E Amir, A Chang
Venue: JAIR

Results 1 - 10 of 17

Learning symbolic models of stochastic domains

by Hanna M. Pasula, Luke S. Zettlemoyer, Leslie Pack Kaelbling - Journal of Artificial Intelligence Research, 2005
Cited by 26 (1 self)
In this article, we work towards the goal of developing agents that can learn to act in complex worlds. We develop a new probabilistic planning rule representation to compactly model noisy, nondeterministic action effects and show how these rules can be effectively learned. Through experiments in simple planning domains and a 3D simulated blocks world with realistic physics, we demonstrate that this learning algorithm allows agents to effectively model world dynamics.

Learning planning rules in noisy stochastic worlds

by Luke S. Zettlemoyer, Hanna M. Pasula, Leslie Pack Kaelbling - In AAAI, 2005
Cited by 18 (2 self)
We present an algorithm for learning a model of the effects of actions in noisy stochastic worlds. We consider learning in a 3D simulated blocks world with realistic physics. To model this world, we develop a planning representation with explicit mechanisms for expressing object reference and noise. We then present a learning algorithm that can create rules while also learning derived predicates, and evaluate this algorithm in the blocks world simulator, demonstrating that we can learn rules that effectively model the world dynamics.

Model-lite Planning for the Web Age Masses: The Challenges of Planning with Incomplete and Evolving Domain Models

by Subbarao Kambhampati, 2007
Cited by 9 (3 self)
The automated planning community has traditionally focused on the efficient synthesis of plans given a complete domain theory. In the past several years, this line of work met with significant successes, and the future course of the community seems to be set on efficient planning with even richer models. While this line of research has its applications, there are also many domains and scenarios where the first bottleneck is getting the domain model at any level of completeness. In these scenarios, the modeling burden automatically renders the planning technology unusable. To counter this, I will motivate model-lite planning technology aimed at reducing the domain-modeling burden (possibly at the expense of reduced functionality), and outline the research challenges that need to be addressed to realize it.

Learning Recursive HTN-Method Structures for planning

by Qiang Yang, Rong Pan, Sinno Jialin Pan
Cited by 2 (0 self)
HTN planning is one of the most effective planning methods in AI. However, designing the HTN-decomposition methods is a very difficult task which has been achieved mainly by humans. It would therefore be desirable to design automated learning methods to acquire these decomposition methods from observed action sequences. In this work, we explore how to apply model-based clustering in order to construct task decomposition hierarchies and summarize a database of action sequences. We present a probabilistic model for unsupervised learning of HTN methods from action sequences. Based on this model, we introduce a novel two-pronged approach by simultaneously learning a Markov model for action segment clusters from action sequences and then learning an action parameter model for recognizing tasks. These models are integrated together to construct action clusters. Then, an abstraction algorithm is applied to extract variables from the action parameters in each cluster to obtain succinct HTN methods. We introduce evaluation metrics for this approach, and test the algorithm in a logistics planning domain.

Quasi-Deterministic Partially Observable Markov Decision Processes

by Camille Besse, Brahim Chaib-draa
Cited by 1 (0 self)
We study a subclass of POMDPs, called quasi-deterministic POMDPs (QDET-POMDPs), characterized by deterministic actions and stochastic observations. While this framework does not model the same general problems as POMDPs, it still captures a number of interesting and challenging problems and, in some cases, has interesting properties. By studying the observability available in this subclass, we show that QDET-POMDPs may fall many steps in the complexity classes of the polynomial hierarchy.

An Architecture for Tool Use and Learning in Robots

by Solly Brown, Claude Sammut
Cited by 1 (0 self)
In this paper we address the problem of a robot learning to use environmental objects as tools, in order to help it achieve its goals. Learning to use an object as a tool involves understanding which goals it helps an agent to achieve, the properties of the tool that make it useful, and how the tool must be manipulated in the environment in order to achieve the desired goal. A cup, for example, can be used to hold objects or liquids, should be of the appropriate size and shape (concave-up), and needs to be held the right way up. We present an architecture for a robot agent that is able to learn about objects in this way, and thereby employ appropriate objects as tools to help it achieve its goals. Our agent learns through demonstration and experiment, with the main generalisation module being an Inductive Logic Programming algorithm.

Learning action models from plan examples using weighted MAX-SAT

by Qiang Yang, Kangheng Wu, Yunfei Jiang, 2007
AI planning requires the definition of action models using a formal action and plan description language, such as the standard Planning Domain Definition Language (PDDL), as input. However, building action models from scratch is a difficult and time-consuming task, even for experts. In this paper, we develop an algorithm called ARMS (action-relation modelling system) for automatically discovering action models from a set of successful observed plans. Unlike the previous work in action-model learning, we do not assume complete knowledge of states in the middle of observed plans. In fact, our approach works when no or partial intermediate states are given. These example plans are obtained by an observation agent who does not know the logical encoding of the actions and the full state information between the actions. In a real world application, the cost is prohibitively high in labelling the training examples by manually annotating every state in a plan example from snapshots of an environment. To learn action models, ARMS gathers knowledge on the statistical distribution of frequent sets of actions in the example plans. It then builds a weighted propositional satisfiability (weighted MAX-SAT) problem and solves it using a MAX-SAT solver. We lay the theoretical foundations of the learning problem and evaluate the effectiveness of ARMS empirically. © 2006 Elsevier B.V. All rights reserved.

Learning Applicability Conditions in AI Planning from Partial Observations

by Hankz Hankui Zhuo, Derek Hao Hu, Qiang Yang
AI planning has become more and more important in many real-world domains such as military applications and intelligent scheduling. However, planning systems require complete specifications of domain models, which can be difficult to encode, even for domain experts. Thus, research on effective and efficient methods to construct domain models or applicability conditions for planning automatically has become a hot topic for researchers. In this paper, we review our previous work ARMS, which can learn the applicability conditions for planning under STRIPS representations. Moreover, we provide two extensions to our ARMS system, LAMP, which can learn complex action models in PDDL representations with quantifiers and logical implications, and HTN-Learner, which can simultaneously learn method preconditions and action models in hierarchical task network (HTN) models. Our experimental results show that the two proposed algorithms could effectively learn complex action models and HTN models, thus having the ability to effectively acquire applicability conditions and relationships between actions in AI planning.

Learning HTN Method Preconditions and Action Models from Partial Observations

by Hankz Hankui Zhuo, Derek Hao Hu, Chad Hogg, Qiang Yang, Hector Munoz-Avila
To apply hierarchical task network (HTN) planning to real-world planning problems, one needs to encode the HTN schemata and action models beforehand. However, acquiring such domain knowledge is difficult and time-consuming because the HTN domain definition involves a significant knowledge-engineering effort. A system that can learn the HTN planning domain knowledge automatically would save time and allow HTN planning to be used in domains where such knowledge-engineering effort is not feasible. In this paper, we present a formal framework and algorithms to acquire HTN planning domain knowledge, by learning the preconditions and effects of actions and preconditions of methods. Our algorithm, HTN-learner, first builds constraints from given observed decomposition trees to build action models and method preconditions. It then solves these constraints using a weighted MAX-SAT solver. The solution can be converted to action models and method preconditions. Unlike prior work on HTN learning, we do not depend on complete action models or state information. We test the algorithm on several domains, and show that our HTN-learner algorithm is both effective and efficient.

Monitoring the Generation and Execution of Optimal Plans

by Christian Fritz, 2009
In dynamic domains, the state of the world may change in unexpected ways during the generation or execution of plans. Regardless of the cause of such changes, they raise the question of whether they interfere with ongoing planning efforts. Unexpected changes during plan generation may invalidate the current planning effort, while discrepancies between expected and actual state of the world during execution may render the executing plan invalid or sub-optimal, with respect to previously identified planning objectives. In this thesis we develop a general monitoring technique that can be used during both plan generation and plan execution to determine the relevance of unexpected changes and which supports recovery. This way, time-intensive replanning from scratch in the new and unexpected state can often be avoided. The technique can be applied to a variety of objectives, including monitoring the optimality of plans, rather than just their validity. Intuitively, the technique operates in two steps: during planning the plan is annotated with additional information that is relevant to the achievement of the objective; then, when an unexpected change occurs, this information is used to determine the relevance of the discrepancy with respect to the objective.