  Value iteration over belief subspace (2001) [1 citations — 1 self]

by Weihong Zhang
in Proceedings of the 6th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU
http://www2.cs.ust.hk/~wzhang/pub/euai01.wzhang.ps

Abstract:

Partially Observable Markov Decision Processes (POMDPs) provide an elegant framework for AI planning tasks under uncertainty. Value iteration is a well-known algorithm for solving POMDPs. It is notoriously difficult because at each step it must account for every belief state in a continuous space. In this paper, we show that value iteration can be conducted over a subset of belief space. We then study a class of POMDPs, namely informative POMDPs, where each observation provides good albeit incomplete information about world states. For informative POMDPs, value iteration can be conducted over a small subset of belief space. This yields two advantages: first, fewer vectors are needed to represent value functions; second, value iteration can be accelerated. Empirical studies are presented to demonstrate these two advantages.
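
The abstract does not give the formal definition of an informative POMDP, so the following is only a minimal sketch of one reading of the idea, not the paper's algorithm. It applies the standard Bayesian belief update to a toy model in which each observation is compatible with only two of four states, so every updated belief lies in a low-dimensional face of the belief simplex. The model (`T`, `O`) and the function names `belief_update` and `support` are hypothetical and chosen purely for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Standard Bayes update: b'(s') is proportional to
    O[a][s'][z] * sum_s T[a][s][s'] * b(s)."""
    n = len(belief)
    unnormalized = [
        O[action][s2][observation]
        * sum(T[action][s][s2] * belief[s] for s in range(n))
        for s2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability here")
    return [p / total for p in unnormalized]


def support(belief, tol=1e-12):
    """Set of states that carry non-negligible probability."""
    return {s for s, p in enumerate(belief) if p > tol}


# Hypothetical toy model: 4 states, 1 action, 2 observations.
# T[a][s][s2] is the transition probability, O[a][s2][z] the
# probability of observing z after landing in state s2.
T = [[[0.4, 0.1, 0.3, 0.2] for _ in range(4)]]
O = [[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]]

uniform = [0.25, 0.25, 0.25, 0.25]
after_z0 = belief_update(uniform, 0, 0, T, O)
after_z1 = belief_update(uniform, 0, 1, T, O)

# Observation 0 rules out states 2 and 3; observation 1 rules out 0 and 1.
print(support(after_z0), support(after_z1))
```

In this toy model every belief reachable after an observation has at most two nonzero entries, which is the kind of restricted belief subspace the abstract refers to; how the paper exploits that restriction inside value iteration is not shown here.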

Citations

1267 Data Networks – Bertsekas, Gallager - 1992
361 Markov Decision Processes – Puterman - 1994
189 The complexity of Markov decision processes – Papadimitriou, Tsitsiklis - 1987
186 Exploiting structure in policy construction – Boutilier, Dearden, et al. - 1995
145 Planning under time constraints in stochastic domains – Dean, Kaelbling, et al. - 1995
112 Incremental pruning: a simple, fast, exact method for partially observable Markov decision processes. UAI-97 – Cassandra, Littman, et al. - 1997
79 Exact and Approximate Algorithms for Partially Observable Markov Decision Processes – Cassandra - 1998
78 Optimal control of Markov decision processes with incomplete state estimation – Astrom - 1965
78 Computing optimal policies for partially observable decision processes using compact representations – Boutilier, Poole - 1996
78 Value-function approximations for Partially Observable Markov Decision Processes – Hauskrecht - 2000
29 Finite-Memory Control of Partially Observable Systems – Hansen - 1998
9 Planning medical therapy using partially observable Markov decision processes – Hauskrecht, Fraser - 1998
7 An environment model for nonstationary reinforcement learning – Choi, Yeung, et al. - 1999
6 The optimal control of partially observable decision processes – Sondik - 1971
4 A model approximation scheme for planning in stochastic domains – Zhang, Liu - 1997
3 Space-progressive value iteration: an anytime algorithm for a class of POMDPs – Zhang, Zhang - 2001