
Algorithms for Partially Observable Markov Decision Processes (2001)
Weihong Zhang




 
View or download:
www2.cs.ust.hk/~wzhang/p...thesis.ps.gz

From:  wustl.edu/~wzhang/publications
Homepages:  W.Zhang  


Abstract: Partially Observable Markov Decision Process (POMDP) is a general sequential decision-making model where the effects of actions are...

Similar documents (at the sentence level):
6.8%:   Speeding Up the Convergence of Value Iteration in Partially.. - Zhang, Zhang (2001)
6.6%:   Restricted Value Iteration: Theory and Algorithms - Zhang, Zhang

Active bibliography (related documents):
1.4:   Value Iteration over Belief Subspace - Zhang (2001)
1.1:   Complexity results for Infinite-Horizon Markov Decision Processes - Madani (2000)
1.1:   Value-Function Approximations for Partially Observable Markov.. - Hauskrecht (2000)

Similar documents based on text:
0.1:   Space-Progressive Value Iteration: An Anytime Algorithm for a.. - Zhang, Zhang (2001)
0.1:   Development of Web-Based Educational Modules for Developing VHDL.. - Song (1997)
0.1:   A Method for Speeding Up Value Iteration in Partially.. - Zhang, Lee, Zhang (1999)

BibTeX entry:

@phdthesis{ zhang-algorithms,
  author = "Weihong Zhang",
  title = "Algorithms for Partially Observable Markov Decision Processes",
  year = "2001",
  url = "citeseer.ist.psu.edu/zhang01algorithms.html" }
Citations (may not include all citations):
891   STRIPS: a new approach to the application of theorem proving - Fikes, Nilsson - 1971
658   Learning from delayed rewards - Watkins - 1989
563   Learning to predict by the method of temporal differences - Sutton - 1988
408   Dynamic Programming, Princeton University Press - Bellman - 1957
257   Learning to act using real-time dynamic programming - Barto, Bradtke et al. - 1995
246   Markov decision processes - Puterman - 1990
226   Systematic nonlinear planning - McAllester, Rosenblitt - 1991
216   The optimal control of partially observable Markov processes.. - Smallwood, Sondik - 1973
191   Englewood Cliffs - Bertsekas, Gallagher et al. - 1995
188   Decision-theoretic planning: structural assumptions and comp.. - Boutilier, Dean et al. - 1999
187   Planning and acting in partially observable stochastic domai.. - Kaelbling, Littman et al. - 1998
168   Real-time heuristic search - Korf - 1990
162   Prioritized sweeping: reinforcement learning with less data .. - Moore, Atkeson - 1993
146   An algorithm for probabilistic planning - Kushmerick, Hanks et al. - 1995
141   Temporal credit assignment in reinforcement learning - Sutton - 1984
124   Improving elevator performance using reinforcement learning - Crites, Barto - 1996
120   Planning with deadlines in stochastic domains - Dean, Kaelbling et al. - 1993
113   Learning policies for partially observable environments: sca.. - Littman, Cassandra et al. - 1995
108   Probabilistic logic - Nilsson - 1986
107   On the convergence of stochastic iterative dynamic programming .. - Jaakkola, Jordan et al. - 1994
106   A survey of partially observable Markov decision processes: .. - Monahan - 1982
96   The complexity of Markov decision processes - Papadimitriou, Tsitsiklis - 1987
90   Planning under time constraints in stochastic domains - Dean, Kaelbling et al. - 1995
87   Reinforcement learning with hierarchies of machines - Parr, Russell - 1998
87   Anytime synthetic projection: maximizing the probability of .. - Drummond, Bresina - 1990
82   Finite state Markovian decision processes - Derman - 1970
81   Reinforcement learning algorithm for partially observable Ma.. - Jaakkola, Singh et al. - 1994
75   Approximating optimal policies for partially observable stoc.. - Parr, Russell - 1995
71   Macro-operators: a weak method for learning - Korf - 1985
70   Abstraction and approximate decision theoretic planning - Dearden, Boutilier - 1997
63   Decomposition techniques for planning in stochastic domains - Dean, Lin - 1995
60   Computing optimal policies for partially observable decision.. - Boutilier, Poole - 1996
56   Learning without state-estimation in partially observable Ma.. - Singh, Jaakkola et al. - 1994
55   Optimization Theory for Large Systems - Lasdon - 1970
51   Finding structure in reinforcement learning - Thrun, Schwartz - 1995
51   Model minimization in Markov decision processes - Dean, Givan - 1997
50   Efficient learning and planning within the dyna framework - Peng, Williams - 1993
49   Q-learning - Watkins, Dayan - 1992
49   Hierarchical solution of Markov decision processes using mac.. - Hauskrecht, Meuleau et al. - 1998
46   Decision theory in expert system and artificial intelligence - Horvitz, Breese et al. - 1988
46   The computational complexity of probabilistic planning - Littman, Goldsmith et al. - 1998
45   Games and Decisions: Introduction and Critical Survey - Luce, Raiffa - 1957
44   Value-function approximations for partially observable Marko.. - Hauskrecht - 2000
42   Optimal control of Markov decision processes with incomplete.. - Astrom - 1965
39   Multi-time models for temporally abstract planning - Precup, Sutton - 1998
37   Stochastic dynamic programming with factored representations - Boutilier, Dearden et al. - 2000
36   Approximate planning in large POMDPs via reusable trajectori.. - Kearns, Mansour et al. - 1999
35   Anytime problem solving using dynamic programming - Boddy - 1991
35   Exact and approximate algorithms for partially observable Ma.. - Cassandra - 1998
33   Computationally feasible bounds for partially observed Marko.. - Lovejoy - 1991
32   Xavier: a robot navigation architecture based on partially o.. - Koenig, Simmons - 1998
30   Algorithms for partially observable Markov decision Processe.. - Cheng - 1988
29   Planning and control in stochastic domains with imperfect in.. - Hauskrecht - 1997
29   Policy iteration for factored MDPs - Koller, Parr - 2000
28   Finite memory control of partially observable systems - Hansen - 1998
26   Markov Decision Processes - White - 1993
25   A heuristic variable grid solution method for POMDPs - Brafman - 1997
24   Integrated architectures for learning, planning, and reactin.. - Sutton - 1990
23   UCPOP: A sound, complete, partial order planner for ADL - Penberthy, Weld - 1992
22   Flexible decomposition algorithms for weakly coupled Markov .. - Parr - 1998
20   The witness algorithm: solving partially observable Markov d.. - Littman - 1994
20   On-line Q-learning using connectionist systems - Rummery, Niranjan - 1994
19   Dynamic Programming: Models and Applications - Denardo - 1982
19   On the complexity of partially observed Markov decision process.. - Burago, de Rougemont et al. - 1996
19   Hidden state and reinforcement learning with instance-based s.. - McCallum - 1996
18   Incremental methods for computing bounds in partially observ.. - Hauskrecht - 1997
18   Observation of a Markov process through a noisy channel - Drake - 1962
18   Planning with incomplete information as heuristic search in .. - Bonet, Geffner - 2000
18   A stochastic model of actions and plans for an anytime plann.. - Thiebaux, Hertzberg et al. - 1994
17   Optimal policies for partially observable markov decision pr.. - Cassandra - 1994
16   Speeding up the convergence of value iteration in partially .. - Zhang, Zhang - 2001
16   Dynamic programming for POMDPs using a factored state repres.. - Hansen, Feng - 2000
15   Active gesture recognition using partially observable Markov.. - Darrell, Pentland - 1996
15   Solving POMDPs by searching the space of finite policies - Meuleau, Kim et al. - 1999
15   The complexity of stochastic games - Condon - 1992
15   The optimal search for a moving target when the search path .. - Eagle - 1984
15   Complexity of finite-horizon Markov decision process problem.. - Mundhenk, Goldsmith et al. - 1997
14   Solution procedures for partially observed Markov decision p.. - Scherer - 1989
12   Using eligibility traces to find the best memoryless policy .. - Loch, Singh - 1998
11   Hierarchical reinforcement learning: preliminary results - Kaelbling - 1993
11   Incremental markov-model planning - Washington - 1996
11   Generalized prioritized sweeping - Andre, Friedman et al. - 1997
11   Decomposition of systems governed by Markov chains - Kushner, Chen - 1974
11   An improved policy iteration algorithm for partially observa.. - Hansen - 1997
11   Mathematical programming and the control of Markov chains - Kushner, Kleinman - 1971
10   Finite-sample convergence rates for Q-learning and indirect .. - Kearns, Singh - 1998
10   Complexity issues in Markov decision processes - Goldsmith, Mundhenk - 1998
9   Finite-memory suboptimal design for partially observed Marko.. - Scherer - 1994
9   The complexity of mean payoff games on graphs - Zwick, Paterson - 1996
8   Optimal infinite-horizon undiscounted control of finite prob.. - Platzman - 1980
7   A method for speeding up value iteration in partially observ.. - Zhang, Lee et al. - 1999
7   Finite-memory estimation and control of finite probabilistic.. - Platzman - 1997
7   An improved grid-based approximation algorithm for POMDPs - Zhou, Hansen - 2001
7   A POMDP approximation algorithm that anticipates the need to.. - Zubek, Dietterich - 2000
7   Decomposition principle for dynamic programs - Dantzig, Wolfe - 1960
7   Suboptimal policies with bounds for parameter adaptive decis.. - Lovejoy - 1993
6   The convergence of TD(λ) for general λ - Dayan - 1992
6   A model approximation scheme for planning in partially obser.. - Zhang, Liu - 1997
6   Solving planning problems with large state and action spaces - Dean, Givan et al. - 1998
6   Utility models for goal-directed, decision-theoretic planne.. - Haddawy, Hanks - 1998
6   Heuristic search in cyclic AND/OR graphs - Hansen, Zilberstein - 1998
5   Nonapproximability results for partially observable Markov d.. - Lusena, Mundhenk et al. - 2001
5   Solving large POMDPs using real time dynamic programming - Geffner, Bonet - 1998
5   Modeling treatment of ischemic heart disease with partially .. - Hauskrecht, Fraser - 1998
4   The optimal control of partially observable decision process.. - Sondik - 1971
4   The optimal control of partially observable decision process.. - Sondik - 1978
4   Planning treatment of ischemic heart disease with partially .. - Hauskrecht, Fraser - 2000
4   An environment model for nonstationary reinforcement learnin.. - Choi, Yeung et al. - 1999
3   Solving time-dependent problems - Boddy, Dean - 1989
2   Uncertainty and real-time therapy planning: incremental mark.. - Washington - 1996
2   Fast value iteration for goal-directed Markov decision proce.. - Zhang, Zhang - 1997
2   Space-progressive value iteration: an anytime algorithm for .. - Zhang, Zhang - 2001
1   Value iteration over belief subspace - Zhang - 2001
1   A survey of partially observable Markov decision processes - Lovejoy - 1991
1   Conditional nonlinear planner - Peot, Smith - 1992
1   Solving informative partially observable Markov decision pro.. - Zhang, Zhang - 2001
1   BI-POMDP: Bounded, incremental partially-observable Markov-m.. - Washington - 1997
1   On the undecidability of probabilistic planning and partially o.. - Madani, Condon et al. - 1999
1   Model minimization, regression, and propositional STRIPS pla.. - Givan, Dean - 1997
1   Multilayered control of large Markov chains - Forestier, Varaiya - 1978
1   Structured reachability analysis for Markov decision processe.. - Boutilier, Brafman et al. - 1998

Documents on the same site (http://www.cs.wustl.edu/~wzhang/publications.html):
Value Iteration over Belief Subspace - Zhang (2001)
Value Iteration Working With Belief Subset - Zhang, Zhang
Space-Progressive Value Iteration: An Anytime Algorithm for a.. - Zhang, Zhang (2001)

