by Weihong Zhang
in Proceedings of the 6th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU
http://www2.cs.ust.hk/~wzhang/pub/euai01.wzhang.ps
Abstract:
Partially Observable Markov Decision Processes (POMDPs) provide an elegant framework for AI planning tasks with uncertainties. Value iteration is a well-known algorithm for solving POMDPs. It is notoriously difficult to carry out because at each step it needs to account for every belief state in a continuous space. In this paper, we show that value iteration can be conducted over a subset of belief space. Then, we study a class of POMDPs, namely informative POMDPs, where each observation provides good albeit incomplete information about world states. For informative POMDPs, value iteration can be conducted over a small subset of belief space. This yields two advantages: first, fewer vectors are needed to represent value functions; second, value iteration can be accelerated. Empirical studies are presented to demonstrate these two advantages.
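To illustrate why informative observations can shrink the part of belief space that matters, here is a minimal sketch of the textbook Bayesian belief update for a discrete POMDP. It is not the paper's algorithm. The function names (`belief_update`, `support`), the table layout (`T[a][s][s2]`, `O[a][s2][z]`), and the four-state toy model are all assumptions made for illustration. In the toy model each observation is emitted by only two of the four states, so every updated belief places zero probability on the other two.

```python
def belief_update(belief, action, observation, T, O):
    """Textbook Bayes update for a discrete POMDP (illustrative layout).

    T[a][s][s2] = P(s2 | s, a)   transition probabilities
    O[a][s2][z] = P(z | s2, a)   observation probabilities
    """
    n = len(belief)
    unnormalized = [
        O[action][s2][observation]
        * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


def support(belief, tol=1e-12):
    """States that carry non-negligible probability in a belief."""
    return {s for s, p in enumerate(belief) if p > tol}


# Hypothetical toy model: 4 states, 1 action, 2 observations.
# Observation 0 is only emitted in states {0, 1}, observation 1 only in {2, 3}.
T = [[[0.25, 0.25, 0.25, 0.25] for _ in range(4)]]
O = [[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]]

uniform = [0.25, 0.25, 0.25, 0.25]
after_z0 = belief_update(uniform, action=0, observation=0, T=T, O=O)
after_z1 = belief_update(uniform, action=0, observation=1, T=T, O=O)

print(support(after_z0))  # states consistent with observation 0
print(support(after_z1))  # states consistent with observation 1
```

In this sketch, every belief reachable after one step lies on a two-state face of the four-state belief simplex, which is the kind of reduced subset the abstract refers to.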