  Decision-theoretic planning: Structural assumptions and computational leverage (1999) [246 citations — 0 self]

by Craig Boutilier, Thomas Dean, Steve Hanks
Journal of Artificial Intelligence Research
http://www.isi.edu/~blythe/cs541/Readings/boutilier-etal-jair99.ps

Abstract:

Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision analysis, operations research, control theory and economics. While the assumptions and perspectives adopted in these areas often differ in substantial ways, many planning problems of interest to researchers in these fields can be modeled as Markov decision processes (MDPs) and analyzed using the techniques of decision theory. This paper presents an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI. It also describes structural properties of MDPs that, when exhibited by particular classes of problems, can be exploited in the construction of optimal or approximately optimal policies or plans. Planning problems commonly possess structure in the reward and value functions used to describe performance criteria, in the functions used to describe state transitions and observations, and in the relationships among features used to describe states, actions, rewards, and observations.
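
To make the abstract's notion of constructing an optimal policy for an MDP concrete, here is a minimal sketch of value iteration, one standard dynamic-programming method for MDPs. It is an illustration only and is not taken from the paper: the function name `value_iteration`, the two-state toy problem, the discount factor `gamma`, and the nested-list encodings of `P` and `R` are all assumptions made for this example.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=100_000):
    """Approximate the optimal value function and a greedy policy.

    P[a][s][t] is the probability of moving from state s to state t
    under action a; R[s][a] is the immediate reward for taking action
    a in state s. Both layouts are assumptions of this sketch.
    """
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states

    def q_value(s, a, values):
        # Expected return of taking action a in state s, then following `values`.
        expected_next = sum(P[a][s][t] * values[t] for t in range(n_states))
        return R[s][a] + gamma * expected_next

    for _ in range(max_iter):
        # Bellman backup: best one-step lookahead value for every state.
        V_new = [max(q_value(s, a, V) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the (approximately) converged values.
    policy = [max(range(n_actions), key=lambda a: q_value(s, a, V))
              for s in range(n_states)]
    return V, policy


# Hypothetical toy MDP: action 0 stays in place, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
# Only staying in state 1 is rewarded.
R = [
    [0.0, 0.0],
    [1.0, 0.0],
]

V, policy = value_iteration(P, R)
print(V, policy)
```

In this toy problem the greedy policy switches out of state 0 and stays in state 1. The sketch enumerates every state explicitly, so its cost grows with the size of the state space; the structural properties the abstract describes are aimed at avoiding that kind of explicit enumeration.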

Citations

4387 Probabilistic Reasoning in Intelligent Systems – Pearl - 1988
2315 Graph-based algorithms for boolean function manipulation – Bryant - 1986
1875 Artificial Intelligence: A Modern Approach – Russell, Norvig - 1995
1673 Reinforcement learning: An introduction – Sutton, Barto - 1998
1397 STRIPS: A new approach in the application of theorem proving to problem solving – Fikes, Nilsson - 1971
1396 Applied Dynamic Programming – Bellman, Dreyfus - 1962
1224 Some philosophical problems from the standpoint of artificial intelligence – McCarthy, Hayes - 1969
686 Linear and Nonlinear Programming – Luenberger - 1984
609 Decisions with multiple objectives: Preferences and value tradeoffs – Keeney, Raiffa - 1976
579 Planning for conjunctive goals – Chapman - 1987
400 Learning to act using Real-Time Dynamic Programming – Barto, Bradtke, et al. - 1995
389 UCPOP: A sound, complete, partial order planner for ADL – Penberthy, Weld - 1992
388 Planning in a hierarchy of abstraction spaces – Sacerdoti - 1974
378 Systematic nonlinear planning – McAllester, Rosenblitt - 1991
361 Markov Decision Processes – Puterman - 1994
353 Dynamic Programming and Markov Processes – Howard - 1960
348 Learning and executing generalized robot plans – Fikes, Hart, et al. - 1972
324 A probabilistic approach to concurrent mapping and localization for mobile robots – Thrun, Fox, et al. - 1998
315 Universal plans for reactive robots in unpredictable environments – Schoppers - 1987
312 A model for reasoning about persistence and causation – Dean, Kanazawa - 1989
293 Real-Time Heuristic Search – Korf - 1990
292 Nonlinear Programming – Bertsekas - 1995
291 Planning and Control – Dean, Wellman - 1991
275 Evaluating influence diagrams – Shachter - 1986
241 An algorithm for probabilistic planning – Kushmerick, Hanks, et al. - 1995
238 ADL: Exploring the middle ground between STRIPS and the situation calculus – Pednault - 1989
235 Influence diagrams – Howard, Matheson - 1981
234 An Introduction to Least Commitment Planning – Weld - 1994
226 Probabilistic robot navigation in partially observable environments – Simmons, Koenig - 1995
221 The optimal control of Partially Observable Markov Processes – Sondik - 1971
219 State constraints revisited – Lin, Reiter - 1994
210 Acting optimally in partially observable stochastic domains – Cassandra, Kaelbling, et al. - 1994
201 Conditional nonlinear planning – Peot, Smith - 1992
199 The computational complexity of propositional STRIPS planning – Bylander - 1994
195 Algebraic decision diagrams and their applications – Bahar, Frohm, et al. - 1997
195 Probabilistic planning with information gathering and contingent execution – Draper, Hanks, et al. - 1994
187 Dynamic Programming and – Bertsekas - 1995
187 The Parti-game Algorithm for Variable Resolution Reinforcement Learning – Moore - 1993
186 Exploiting structure in policy construction – Boutilier, Dearden, et al. - 1995
183 Bucket elimination: A unifying framework for probabilistic inference – Dechter - 1999
182 An approach to planning with incomplete information – Etzioni, Hanks, et al. - 1992
176 Context-specific independence in bayesian networks – Boutilier, Friedman, et al. - 1996
168 Algorithm 97: Shortest path – Floyd - 1962
167 The nonlinear nature of plans – Sacerdoti - 1975
158 Reinforcement learning with hierarchies of machines – Parr, Russell - 1998
152 The optimal control of partially observed markov processes over the finite horizon – Smallwood, Sondik - 1973
149 TD-Gammon, a self-teaching backgammon program, achieves master-level play – Tesauro - 1994
145 Planning under time constraints in stochastic domains – Dean, Kaelbling, et al. - 1995
143 A survey of algorithmic methods for partially observed Markov decision processes – Lovejoy - 1991
133 Planning with deadlines in stochastic domains – Dean, Kaelbling, et al. - 1993