  Algorithms for Partially Observable Markov Decision Processes (2001) [5 citations — 0 self]

by Weihong Zhang, Goldsmith Mordecai J. Golin
Hong Kong University of Science and Technology
http://www2.cs.ust.hk/~wzhang/pub/thesis.ps.gz

Abstract:

I hereby declare that I am the sole author of the thesis.

Citations

1397 Dynamic Programming – Bellman - 1957
1397 STRIPS: A new approach in the application of theorem proving to problem solving – Fikes, Nilsson - 1971
1267 Data Networks – Bertsekas, Gallager - 1992
938 Learning from Delayed Rewards – Watkins - 1989
885 Learning to Predict by the Methods of Temporal Differences – Sutton - 1988
408 Planning and acting in partially observable stochastic domains – Kaelbling, Littman, et al. - 1998
400 Learning to act using Real-Time Dynamic Programming – Barto, Bradtke, et al. - 1995
389 UCPOP: A sound, complete, partial order planner for ADL – Penberthy, Weld - 1992
378 Systematic nonlinear planning – McAllester, Rosenblitt - 1991
374 Integrated Architecture for Learning, Planning, and Reacting Based on Approximating Dynamic Programming – Sutton - 1990
361 Markov Decision Processes – Puterman - 1994
314 Probabilistic logic – Nilsson
293 Real-time heuristic search – Korf - 1990
246 Decision-theoretic planning: Structural assumptions and computational leverage – Boutilier, Dean, et al. - 1999
241 An algorithm for probabilistic planning – Kushmerick, Hanks, et al. - 1995
239 Prioritized sweeping: Reinforcement learning with less data and less real time – Moore, Atkeson - 1993
215 Improving elevator performance using reinforcement learning – Crites, Barto - 1996
212 Temporal Credit Assignment in Reinforcement Learning – Sutton - 1984
175 Learning Policies for Partially Observable Environments: Scaling Up – Littman, Cassandra, et al. - 1995
162 On the convergence of stochastic iterative dynamic programming algorithms – Jaakkola, Jordan, et al. - 1994
158 Reinforcement learning with hierarchies of machines – Parr, Russell - 1998
156 On-line Q-learning using connectionist systems – Rummery, Niranjan - 1994
152 The optimal control of partially observed Markov processes over the finite horizon – Smallwood, Sondik - 1973
145 Planning with Incomplete Information as Heuristic Search – Bonet, Geffner
145 Planning under time constraints in stochastic domains – Dean, Kaelbling, et al. - 1995
133 Planning with deadlines in stochastic domains – Dean, Kaelbling, et al. - 1993
132 A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms – Monahan - 1982
124 Macro operators: A weak method for learning – Korf - 1985
122 Reinforcement Learning Algorithm for Partially Observable – Jaakkola, Singh, et al. - 1994
118 Approximating optimal policies for partially observable stochastic domains (unpublished manuscript) – Parr, Russell - 1995
109 Anytime synthetic projection: Maximizing the probability of goal satisfaction – Drummond, Bresina - 1990
94 Decomposition techniques for planning in stochastic domains – Dean, Lin - 1995
93 Learning without state-estimation in partially observable Markovian decision problems – Singh, Jaakkola, et al. - 1994
90 Optimization Theory for Large Systems – Lasdon - 1970
89 Hierarchical solution of Markov decision processes using macro-actions – Hauskrecht, Meuleau, et al. - 1998
88 The complexity of stochastic games – Condon - 1992
86 Utility models for goal-directed, decision-theoretic planners – Haddawy, Hanks - 1998
84 Model minimization in Markov decision processes – Dean, Givan - 1997
80 Finite State Markovian Decision Processes – Derman - 1970
80 Efficient Learning and Planning Within the Dyna Framework – Peng, Williams - 1993
80 Finding structure in reinforcement learning – Thrun, Schwartz - 1995
79 Exact and Approximate Algorithms for Partially Observable Markov Decision Processes – Cassandra - 1998
78 Optimal control of Markov decision processes with incomplete state estimation – Astrom - 1965
78 Computing optimal policies for partially observable decision processes using compact representations – Boutilier, Poole - 1996
78 Value-function approximations for Partially Observable Markov Decision Processes – Hauskrecht - 2000
72 Stochastic dynamic programming with factored representations – Boutilier, Dearden, et al.
68 Approximate planning in large POMDPs via reusable trajectories – Kearns, Mansour, et al. - 1999
68 A robot navigation architecture based on partially observable Markov decision process models – Koenig, Simmons
65 Decision Theory in Expert Systems and Artificial Intelligence – Horvitz, Breese, et al. - 1988
64 The convergence of TD(λ) for general λ – Dayan - 1992