by Nevin L. Zhang, Wenju Liu
http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume7/zhang97a.ps.Z
Abstract:
Partially observable Markov decision processes (POMDPs) are a natural model for planning problems where the effects of actions are nondeterministic and the state of the world is not completely observable. It is difficult to solve POMDPs exactly. This paper proposes a new approximation scheme. The basic idea is to transform a POMDP into another one where additional information is provided by an oracle. The oracle informs the planning agent that the current state of the world is in a certain region. The transformed POMDP is consequently said to be region observable. It is easier to solve than the original POMDP. We propose to solve the transformed POMDP and use its optimal policy to construct an approximate policy for the original POMDP. By controlling the amount of additional information that the oracle provides, it is possible to find a proper tradeoff between computational time and approximation quality. In terms of algorithmic contributions, we study in detail how to exploit region observability in solving the transformed POMDP. To facilitate the study, we also propose a new exact algorithm for general POMDPs. The algorithm is conceptually simple and yet is significantly more efficient than all previous exact algorithms.
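To make the oracle idea in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It shows the standard POMDP belief update followed by conditioning the belief on a region reported by an oracle. The function names (`belief_update`, `restrict_to_region`), the nested-list layout of `T` and `O`, and the four-state toy model are all illustrative assumptions. The sketch also assumes the regions partition the state space, so the reported region is a deterministic function of the true state; the abstract does not state how the oracle selects regions.

```python
def belief_update(belief, action, obs, T, O):
    """Standard POMDP belief update.

    b'(s') is proportional to O[action][s'][obs] * sum_s T[action][s][s'] * b(s).
    T and O are hypothetical nested-list models, not taken from the paper.
    """
    n = len(belief)
    unnormalized = [
        O[action][s2][obs] * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


def restrict_to_region(belief, region):
    """Condition the belief on the oracle's report that the state lies in `region`.

    Assumes regions partition the state space (an assumption of this sketch),
    so conditioning reduces to zeroing states outside the region and renormalizing.
    """
    masked = [p if s in region else 0.0 for s, p in enumerate(belief)]
    total = sum(masked)
    if total == 0:
        raise ValueError("region has zero probability under this belief")
    return [p / total for p in masked]


# Toy model: 4 states, one action that leaves the state unchanged,
# and two observations that carry no information about the state.
T = {"stay": [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]}
O = {"stay": [[0.5, 0.5] for _ in range(4)]}

b0 = [0.25, 0.25, 0.25, 0.25]
b1 = belief_update(b0, "stay", 0, T, O)   # observation alone leaves the belief uniform
b2 = restrict_to_region(b1, {0, 1})       # the oracle's region halves the support
```

In this toy case the ordinary observation does not narrow the belief at all, while the region report concentrates it on two states. Smaller regions give the agent more information and a belief with smaller support, which is the lever the abstract describes for trading computation time against approximation quality.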