  Speeding up the convergence of value iteration in partially observable Markov decision processes (2001) [38 citations — 4 self]

by Nevin L. Zhang, Weihong Zhang
Journal of Artificial Intelligence Research
http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume14/zhang01a.pdf

Abstract:

Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs. It typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: it enabled value iteration to converge after only a few iterations on all the test problems.

Citations

361 Markov Decision Processes – Puterman - 1994
221 The optimal control of Partially Observable Markov Processes – Sondik - 1971
189 The complexity of Markov decision processes – Papadimitriou, Tsitsiklis - 1987
175 Learning Policies for Partially Observable Environments: Scaling Up – Littman, Cassandra, et al. - 1995
132 A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms – Monahan - 1982
131 Algorithms for Sequential Decision Making – Littman - 1996
118 Approximating optimal policies for partially observable stochastic domains (unpublished manuscript) – Parr, Russell - 1995
112 Incremental pruning: a simple, fast, exact method for partially observable Markov decision processes. UAI-97 – Cassandra, Littman, et al. - 1997
79 Exact and Approximate Algorithms for Partially Observable Markov Decision Processes – Cassandra - 1998
78 Optimal control of Markov decision processes with incomplete state estimation – Astrom - 1965
65 Solving POMDPs by searching in policy space – Hansen - 1998
61 Algorithms for partially observable Markov decision processes – Cheng - 1988
58 Computationally feasible bounds for partially observed Markov decision processes – Lovejoy - 1989
35 Planning and control in stochastic domains with imperfect information – Hauskrecht - 1997
32 Incremental methods for computing bounds in partially observable Markov decision processes – Hauskrecht - 1997
26 Efficient dynamic-programming updates in partially observable Markov decision processes – Littman, Cassandra, et al. - 1995
21 The optimal search for a moving target when the search path is constrained – Eagle - 1984
17 Solution procedures for partially observed Markov decision processes – White, Scherer - 1989
12 A survey of POMDP applications – Cassandra - 1998
12 A method for speeding up value iteration in partially observable Markov decision processes – Zhang, Lee, et al. - 1999
10 Optimal control of partially observable processes over the finite horizon – Smallwood, Sondik - 1973
8 Value function approximations for partially observable Markov decision processes – Hauskrecht - 2000
7 Suboptimal policies with bounds for parameter adaptive decision processes – Lovejoy - 1993
6 Efficient dynamic-programming updates in partially observable Markov decision processes – Littman, Cassandra, et al. - 1995
3 A heuristic variable grid solution for POMDPs – Brafman - 1997
2 Planning and Acting in Partially Observable Stochastic Domains – Cassandra - 1998
2 Optimal infinite-horizon undiscounted control of finite probabilistic systems – Platzman - 1980
1 Optimal infinite-horizon undiscounted control of finite probabilistic systems – Platzman
1 Speeding Up Value Iteration in POMDPs – Zhang, Liu - 1997