Results 1 - 10 of 493
Planning and acting in partially observable stochastic domains
Artificial Intelligence, 1998
"... In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (mdps) and partially observable mdps (pomdps). We then outline a novel algorithm ..."
Abstract
-
Cited by 1095 (38 self)
In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. We begin by introducing the theory of Markov decision processes (MDPs) and partially observable MDPs (POMDPs). We then outline a novel algorithm for solving POMDPs offline and show how, in some cases, a finite-memory controller can be extracted from the solution to a POMDP. We conclude with a discussion of how our approach relates to previous work, the complexity of finding exact solutions to POMDPs, and some possibilities for finding approximate solutions.
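The belief-state machinery behind this line of work reduces a POMDP to an MDP over probability distributions. A minimal sketch of the Bayes-filter belief update, with numbers loosely modeled on the paper's tiger example (the table layouts and names are our illustrative assumptions, not the authors' code):

```python
# b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)

def belief_update(b, a, o, T, O, states):
    """b: dict state -> prob; T[(s, a)]: dict next_state -> prob;
    O[(s2, a)]: dict observation -> prob. Returns the posterior belief."""
    new_b = {}
    for s2 in states:
        pred = sum(T[(s, a)].get(s2, 0.0) * p for s, p in b.items())
        new_b[s2] = O[(s2, a)].get(o, 0.0) * pred
    z = sum(new_b.values())
    if z == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / z for s, p in new_b.items()}

states = ["tiger-left", "tiger-right"]
T = {(s, "listen"): {s: 1.0} for s in states}   # listening does not move the tiger
O = {("tiger-left", "listen"): {"hear-left": 0.85, "hear-right": 0.15},
     ("tiger-right", "listen"): {"hear-left": 0.15, "hear-right": 0.85}}
b0 = {"tiger-left": 0.5, "tiger-right": 0.5}
print(belief_update(b0, "listen", "hear-left", T, O, states))
# -> {'tiger-left': 0.85, 'tiger-right': 0.15}
```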
Decision-Theoretic Planning: Structural Assumptions and Computational Leverage
Journal of Artificial Intelligence Research, 1999
"... Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision analysis, operations research, control theory and economics. While the assumptions and perspectives ..."
Abstract
-
Cited by 515 (4 self)
Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision analysis, operations research, control theory and economics. While the assumptions and perspectives adopted in these areas often differ in substantial ways, many planning problems of interest to researchers in these fields can be modeled as Markov decision processes (MDPs) and analyzed using the techniques of decision theory. This paper presents an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI. It also describes structural properties of MDPs that, when exhibited by particular classes of problems, can be exploited in the construction of optimal or approximately optimal policies or plans. Planning problems commonly possess structure in the reward and value functions used to de...
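As a concrete instance of the dynamic-programming techniques this survey covers, here is a minimal value-iteration sketch for a finite MDP; the dictionary layouts and names are our own illustrative assumptions, not code from the paper:

```python
# V_{k+1}(s) = max_a [ R(s, a) + gamma * sum_{s'} T(s' | s, a) * V_k(s') ]

def value_iteration(states, actions, T, R, gamma=0.95, eps=1e-6):
    """T[(s, a)]: dict next_state -> prob; R[(s, a)]: immediate reward."""
    V = {s: 0.0 for s in states}
    while True:
        new_V = {
            s: max(R[(s, a)] + gamma * sum(p * V[s2] for s2, p in T[(s, a)].items())
                   for a in actions)
            for s in states
        }
        if max(abs(new_V[s] - V[s]) for s in states) < eps:
            return new_V
        V = new_V

def greedy_policy(states, actions, T, R, V, gamma=0.95):
    """Extract the policy that acts greedily with respect to a value function V."""
    return {s: max(actions,
                   key=lambda a: R[(s, a)] + gamma *
                   sum(p * V[s2] for s2, p in T[(s, a)].items()))
            for s in states}
```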
An Algorithm for Probabilistic Planning
1995
"... We define the probabilistic planning problem in terms of a probability distribution over initial world states, a boolean combination of propositions representing the goal, a probability threshold, and actions whose effects depend on the execution-time state of the world and on random chance. Adoptin ..."
Abstract
-
Cited by 286 (20 self)
We define the probabilistic planning problem in terms of a probability distribution over initial world states, a boolean combination of propositions representing the goal, a probability threshold, and actions whose effects depend on the execution-time state of the world and on random chance. Adopting a probabilistic model complicates the definition of plan success: instead of demanding a plan that provably achieves the goal, we seek plans whose probability of success exceeds the threshold. In this paper, we present BURIDAN, an implemented least-commitment planner that solves problems of this form. We prove that the algorithm is both sound and complete. We then explore BURIDAN's efficiency by contrasting four algorithms for plan evaluation, using a combination of analytic methods and empirical experiments. We also describe the interplay between generating plans and evaluating them, and discuss the role of search control in probabilistic planning.
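The plan-evaluation step described here amounts to pushing a distribution over world states through a fixed action sequence and measuring the probability mass that satisfies the goal. A sketch under assumed data structures (not BURIDAN's actual representation):

```python
from collections import defaultdict

def success_probability(initial_dist, plan, goal, threshold):
    """initial_dist: dict frozenset(props) -> prob; plan: list of actions, each a
    function state -> [(prob, next_state), ...]; goal: predicate over a state."""
    dist = dict(initial_dist)
    for action in plan:
        nxt = defaultdict(float)
        for state, p in dist.items():
            for q, s2 in action(state):
                nxt[s2] += p * q
        dist = nxt
    p_goal = sum(p for s, p in dist.items() if goal(s))
    return p_goal, p_goal >= threshold

# Example: a "dry" action whose effect depends on the execution-time state.
def dry(state):
    if "wet" in state:
        return [(0.8, state - {"wet"}), (0.2, state)]
    return [(1.0, state)]

init = {frozenset({"wet"}): 0.5, frozenset(): 0.5}
print(success_probability(init, [dry], lambda s: "wet" not in s, 0.75))
# -> (0.9, True): the plan exceeds the 0.75 success threshold
```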
Probabilistic planning with information gathering and contingent execution
1993
"... Abstract One way of coping with uncertainty in the world is to build plans that include actions that will produce information about the world when executed, and constrain the execution of subsequent steps in the plan to depend on that information. Literature on decision making discusses the concept ..."
Abstract
-
Cited by 223 (17 self)
One way of coping with uncertainty in the world is to build plans that include actions that will produce information about the world when executed, and to constrain the execution of subsequent steps in the plan to depend on that information. The literature on decision making discusses the concept of information-producing actions (also called sensory actions, diagnostics, or tests), the value of information, and plans contingent on information learned from tests, but these concepts are missing from most AI representations and algorithms for plan generation. This paper presents a planning representation and algorithm that models information-producing actions and constructs plans that exploit the information produced by those actions. We extend the BURIDAN ...
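The value-of-information idea invoked here can be made concrete with a toy decision problem (entirely our own example, not from the paper): compare the best expected utility acting on the prior against the expected utility of first running a noisy test and then acting on the posterior.

```python
def value_of_information(prior, likelihood, utility, actions, outcomes, signals):
    """prior[w]: P(outcome w); likelihood[(sig, w)]: P(signal | w);
    utility[(a, w)]: payoff of action a when the outcome is w."""
    eu_prior = max(sum(prior[w] * utility[(a, w)] for w in outcomes)
                   for a in actions)
    eu_test = 0.0
    for sig in signals:
        p_sig = sum(likelihood[(sig, w)] * prior[w] for w in outcomes)
        if p_sig == 0.0:
            continue
        post = {w: likelihood[(sig, w)] * prior[w] / p_sig for w in outcomes}
        eu_test += p_sig * max(sum(post[w] * utility[(a, w)] for w in outcomes)
                               for a in actions)
    return eu_test - eu_prior   # expected gain from running the test first

outcomes, actions, signals = ["rain", "dry"], ["umbrella", "none"], ["wet-fc", "dry-fc"]
prior = {"rain": 0.3, "dry": 0.7}
likelihood = {("wet-fc", "rain"): 0.9, ("dry-fc", "rain"): 0.1,
              ("wet-fc", "dry"): 0.2, ("dry-fc", "dry"): 0.8}
utility = {("umbrella", "rain"): 0, ("umbrella", "dry"): -1,
           ("none", "rain"): -10, ("none", "dry"): 0}
print(value_of_information(prior, likelihood, utility, actions, outcomes, signals))
# positive (~0.26): the forecast is worth consulting before choosing
```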
Stochastic Dynamic Programming with Factored Representations
1997
"... Markov decision processes(MDPs) have proven to be popular models for decision-theoretic planning, but standard dynamic programming algorithms for solving MDPs rely on explicit, state-based specifications and computations. To alleviate the combinatorial problems associated with such methods, we pro-p ..."
Abstract
-
Cited by 189 (10 self)
Markov decision processes (MDPs) have proven to be popular models for decision-theoretic planning, but standard dynamic programming algorithms for solving MDPs rely on explicit, state-based specifications and computations. To alleviate the combinatorial problems associated with such methods, we propose new representational and computational techniques for MDPs that exploit certain types of problem structure. We use dynamic Bayesian networks (with decision trees representing the local families of conditional probability distributions) to represent stochastic actions in an MDP, together with a decision-tree representation of rewards. Based on this representation, we develop versions of standard dynamic programming algorithms that directly manipulate decision-tree representations of policies and value functions. This generally obviates the need for state-by-state computation, aggregating states at the leaves of these trees and requiring computations only for each aggregate state. The key to these algorithms is a decision-theoretic generalization of classic regression analysis, in which we determine the features relevant to predicting expected value. We demonstrate the method empirically on several planning problems ...
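A minimal illustration of the tree-structured CPTs described here: each post-action variable gets a decision tree over pre-action variables, so all states agreeing on the tested variables share a single leaf probability. The tree encoding and the little coffee-robot-style domain below are our own assumptions, not the paper's representation.

```python
def tree_prob(tree, state):
    """tree: either a float leaf giving P(var' = True), or a node
    (test_var, true_branch, false_branch) branching on a boolean state variable."""
    while not isinstance(tree, float):
        test_var, t_branch, f_branch = tree
        tree = t_branch if state[test_var] else f_branch
    return tree

# DBN fragment for an assumed action "deliver": the tree for user_has_coffee'
# tests only robot_has_coffee, so states differing elsewhere are aggregated.
deliver_user_has_coffee = ("robot_has_coffee", 0.9, 0.0)   # delivery succeeds w.p. 0.9
deliver_robot_has_coffee = ("robot_has_coffee", 0.1, 0.0)  # robot keeps the cup w.p. 0.1

state = {"robot_has_coffee": True, "user_has_coffee": False}
print(tree_prob(deliver_user_has_coffee, state))   # -> 0.9
```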
Extending planning graphs to an ADL subset
1997
"... We describe an extension of graphplan to a subset of ADL that allows conditional and universally quantified effects in operators in such away that almost all interesting properties of the original graphplan algorithm are preserved. ..."
Abstract
-
Cited by 186 (22 self)
We describe an extension of Graphplan to a subset of ADL that allows conditional and universally quantified effects in operators, in such a way that almost all interesting properties of the original Graphplan algorithm are preserved.
Exact and Approximate Algorithms for Partially Observable Markov Decision Processes
1998
"... Automated sequential decision making is crucial in many contexts. In the face of uncertainty, this task becomes even more important, though at the same time, computing optimal decision policies becomes more complex. The more sources of uncertainty there are, the harder the problem becomes to solve. ..."
Abstract
-
Cited by 186 (2 self)
Automated sequential decision making is crucial in many contexts. In the face of uncertainty, this task becomes even more important, though at the same time, computing optimal decision policies becomes more complex. The more sources of uncertainty there are, the harder the problem becomes to solve. In this work, we look at sequential decision making in environments where actions have probabilistic outcomes and the system state is only partially observable. We focus on a model called the partially observable Markov decision process (POMDP) and explore algorithms for computing both optimal and approximate policies for controlling processes modeled as POMDPs. Although solving for the optimal policy is PSPACE-complete (or worse), the study and improvement of exact algorithms lend insight into the structure of optimal solutions and provide a basis for approximate solutions. We present improvements, analysis, and empirical comparisons for some existing and some novel approaches to computing the optimal POMDP policy exactly. Since it is also hard (NP-complete or worse) to derive close approximations to the optimal solution for POMDPs, we consider a number of approaches for deriving policies that yield sub-optimal control and empirically explore their performance on a range of problems. These approaches ...
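One of the simplest controllers in this approximate family scores each action by the Q-values of the underlying fully observable MDP, weighted by the current belief: the QMDP heuristic. A sketch under assumed table layouts (our own code, not the thesis implementation):

```python
def q_values(states, actions, T, R, V, gamma=0.95):
    """Q(s, a) from an MDP value function V (e.g., computed by value iteration).
    T[(s, a)]: dict next_state -> prob; R[(s, a)]: immediate reward."""
    return {(s, a): R[(s, a)] + gamma * sum(p * V[s2] for s2, p in T[(s, a)].items())
            for s in states for a in actions}

def qmdp_action(belief, actions, Q):
    """Pick the action maximizing the belief-weighted MDP Q-value. This assumes
    full observability after one step, so it never values information gathering,
    which is exactly where such approximations fall short of the optimal policy."""
    return max(actions, key=lambda a: sum(p * Q[(s, a)] for s, p in belief.items()))
```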
UMCP: A Sound and Complete Procedure for Hierarchical Task-Network Planning
"... One big obstacle to understanding the nature of hierarchical task network (htn) planning has been the lack of a clear theoretical framework. In particular, no one has yet presented a clear and concise htn algorithm that is sound and complete. ..."
Abstract
-
Cited by 184 (18 self)
One big obstacle to understanding the nature of hierarchical task network (HTN) planning has been the lack of a clear theoretical framework. In particular, no one has yet presented a clear and concise HTN algorithm that is sound and complete.
Encoding Plans in Propositional Logic
1996
"... In recent work we showed that planning problems can be efficiently solved by general propositional satisfiability algorithms (Kautz and Selman 1996). A key issue in this approach is the development of practical reductions of planning to SAT. We introduce a series of different SAT encodings for STRIP ..."
Abstract
-
Cited by 173 (9 self)
In recent work we showed that planning problems can be efficiently solved by general propositional satisfiability algorithms (Kautz and Selman 1996). A key issue in this approach is the development of practical reductions of planning to SAT. We introduce a series of different SAT encodings for STRIPS-style planning, which are sound and complete representations of the original STRIPS specification, and relate our encodings to the Graphplan system of Blum and Furst (1995). We analyze the size complexity of the various encodings, both in terms of number of variables and total length of the resulting formulas. This paper complements the empirical evaluation of several of the encodings reported in Kautz and Selman (1996). We also introduce a novel encoding based on the theory of causal planning, which exploits the notion of "lifting" from the theorem-proving community. This new encoding strictly dominates the others in terms of asymptotic complexity. Finally, we consider further reductions ...
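To make the flavor of such encodings concrete, here is a toy linear (one-action-per-step) encoding of a two-fluent STRIPS problem over a fixed horizon, emitted as DIMACS-style clauses. The tiny domain, variable numbering, and the single frame axiom shown are our own simplifications, far cruder than the paper's encodings.

```python
fluents, actions, horizon = ["p", "q"], ["a"], 1
pre = {"a": ["p"]}        # action a requires p ...
add = {"a": ["q"]}        # ... and adds q; it deletes nothing.

index = {}
def var(name, t):
    """One propositional variable per fluent/action per time step."""
    return index.setdefault((name, t), len(index) + 1)

clauses = []
for t in range(horizon):
    for act in actions:
        for f in pre[act]:                     # a@t implies pre(a)@t
            clauses.append([-var(act, t), var(f, t)])
        for f in add[act]:                     # a@t implies add(a)@(t+1)
            clauses.append([-var(act, t), var(f, t + 1)])
    for f in fluents:                          # explanatory frame axiom:
        adders = [var(act, t) for act in actions if f in add[act]]
        clauses.append([-var(f, t + 1), var(f, t)] + adders)   # f@(t+1) -> f@t or some adder@t

clauses += [[var("p", 0)], [-var("q", 0)]]     # initial state: p and not q
clauses.append([var("q", horizon)])            # goal: q at the final step
print(clauses)   # feed to any DIMACS SAT solver; a satisfying model yields a plan
```

Note how the frame axiom does the planning work here: q can only become true if some adding action occurs, so any satisfying assignment must set a@0, recovering the one-step plan.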