### Citations

1076 | Planning and acting in partially observable stochastic domains - Kaelbling, Littman, et al. - 1998

418 | The optimal control of partially observable Markov processes over the infinite horizon: Discounted costs - Sondik - 1978
Citation Context: ...On the one hand, it draws from work on exact algorithms for POMDPs that use dynamic programming and a piecewise linear and convex representation of the value function (e.g., Smallwood & Sondik 1973; Sondik 1978; Cassandra et al. 1994; Cassandra et al. 1997). On the other, it draws from work on approximation algorithms for POMDPs that perform forward search from a starting belief state, including work on com...

414 | The optimal control of partially observable Markov processes over a finite horizon - Smallwood, Sondik - 1973
Citation Context: ...distributions to expected values. A value function defined for all possible state probability distributions can be represented in different ways; for example, as a set of vectors and a max operator (Smallwood & Sondik 1973) or as a grid of point values with an interpolation rule (e.g., Hauskrecht 1997). Given some explicit representation of the value function, a policy is represented implicitly by the same value functi...
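The vector-set representation mentioned in this context can be sketched briefly. In the Smallwood & Sondik formulation, the value function is piecewise linear and convex, and the value of a belief state is the maximum over the dot products of the belief with a set of alpha-vectors. The vectors and belief below are hypothetical illustrative values, not taken from any cited paper.

```python
import numpy as np

# Hypothetical alpha-vectors for a 3-state POMDP (illustrative values only).
# Each row is one alpha-vector: the value of each hidden state under one
# conditional plan.
alpha_vectors = np.array([
    [1.0, 0.0, 0.5],
    [0.2, 0.9, 0.1],
])

def value(belief, vectors):
    """Vector-set representation of a POMDP value function:
    V(b) = max over alpha of (alpha . b), piecewise linear and convex in b."""
    return np.max(vectors @ belief)

# A belief state is a probability distribution over hidden states.
b = np.array([0.6, 0.3, 0.1])
print(value(b, alpha_vectors))  # first vector dominates at this belief
```

Convexity here is what makes the max operator sufficient: each alpha-vector is linear in the belief, and the upper surface of a set of hyperplanes is convex.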

61 | A heuristic variable grid solution method for POMDPs - Brafman - 1997
Citation Context: ...n be found after a finite search (Satia and Lave 1973). Several possible upper bound functions for evaluating the fringe nodes of the search tree have been discussed by others (e.g., Hauskrecht 1997, Brafman 1997) and we do not add to that discussion here. For a lower bound function, we use the piecewise linear and convex value function of a finite-state controller and improve the lower bound during search by...
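The bound-based search this context describes relies on a simple pruning test: a fringe node of the belief-space search tree can be discarded when its upper bound cannot beat the best lower bound already established. A minimal sketch of that test, with assumed (not the cited authors') bound values:

```python
def should_prune(fringe_upper_bound: float, best_lower_bound: float) -> bool:
    """Prune a fringe node of a belief-space search tree when even its
    optimistic (upper-bound) value cannot exceed the best lower bound
    found elsewhere in the search."""
    return fringe_upper_bound <= best_lower_bound

# Illustrative values: a node whose upper bound is 1.0 is dominated by a
# lower bound of 1.2 and can be pruned; one with upper bound 2.0 cannot.
print(should_prune(1.0, 1.2), should_prune(2.0, 1.2))
```

The quality of the two bound functions determines how much of the tree is pruned, which is why the choice of upper bound (Hauskrecht 1997, Brafman 1997) and lower bound (here, a finite-state controller's value function) matters.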

42 | Heuristic search in restricted memory - Chakrabarti, Ghose, et al. - 1989
Citation Context: ...improve performance of the heuristic search algorithm and accelerate convergence of the error bound. We also plan to implement a memory-bounded version of AO* that can search more deeply in the tree (Chakrabarti et al. 1990; Washington 1997). The most promising aspect of this heuristic search algorithm is its potential for solving problems for which the dynamic-programming update is computationally prohibitive. Consider...

40 | An improved policy iteration algorithm for partially observable MDPs - Hansen - 1998
Citation Context: ...ged, the policy evaluation step is invoked to compute the value function of the transformed finite-state controller. We can prove the following generalization of Howard's policy improvement theorem (Hansen 1998b). Theorem 1: If a finite-state controller is not optimal, policy improvement transforms it into a finite-state controller with a value function that is as good or better for every belief state and be...

40 | Incremental methods for computing bounds in partially observable Markov decision processes - Hauskrecht - 1997
Citation Context: ...bility distributions can be represented in different ways; for example, as a set of vectors and a max operator (Smallwood & Sondik 1973) or as a grid of point values with an interpolation rule (e.g., Hauskrecht 1997). Given some explicit representation of the value function, a policy is represented implicitly by the same value function and one-step lookahead. Most algorithms for solving POMDPs represent a policy...

36 | Finite-memory controls of partially observable systems - Hansen - 1998
Citation Context: ...ged, the policy evaluation step is invoked to compute the value function of the transformed finite-state controller. We can prove the following generalization of Howard's policy improvement theorem (Hansen 1998b). Theorem 1: If a finite-state controller is not optimal, policy improvement transforms it into a finite-state controller with a value function that is as good or better for every belief state and be...

31 | Incremental pruning: A simple, fast, exact algorithm for partially observable Markov decision processes - Cassandra, Littman, et al. - 1997
Citation Context: ...on exact algorithms for POMDPs that use dynamic programming and a piecewise linear and convex representation of the value function (e.g., Smallwood & Sondik 1973; Sondik 1978; Cassandra et al. 1994; Cassandra et al. 1997). On the other, it draws from work on approximation algorithms for POMDPs that perform forward search from a starting belief state, including work on computing bounds for the fringe nodes of a search...

30 | Efficient dynamic-programming updates in partially observable Markov decision processes, Brown University - Littman, Cassandra, et al. - 1995
Citation Context: ...y expensive than the dynamic-programming update. For POMDPs, policy evaluation has low-order polynomial complexity compared to the worst-case exponential complexity of the dynamic-programming update (Littman et al. 1995). Therefore, policy iteration appears to have a clearer advantage over value iteration for POMDPs. Table 1 compares the performance of value iteration and policy iteration on nine test problems from...
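The complexity contrast in this context comes from the structure of the two steps: evaluating a fixed finite-state controller reduces to solving a linear system, whereas the dynamic-programming update can generate exponentially many vectors in the worst case. A minimal sketch of the polynomial-time evaluation step, with a hypothetical 2-node controller and illustrative numbers:

```python
import numpy as np

# Policy evaluation for a fixed finite-state controller: solve the linear
# Bellman system  V = r + gamma * P V,  i.e.  (I - gamma * P) V = r.
# P and r are hypothetical: the transition matrix and expected rewards
# induced by the controller (illustrative values only).
gamma = 0.95
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
r = np.array([1.0, 0.0])

# Solving an n x n linear system is O(n^3), low-order polynomial in the
# size of the controller.
V = np.linalg.solve(np.eye(2) - gamma * P, r)
print(V)
```

This is why, as the excerpt argues, policy iteration can favor POMDPs more clearly than it does fully observable MDPs: the cheap step (evaluation) replaces some of the work of the expensive step (the dynamic-programming update).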

13 | Incremental Markov model planning - Washington - 1996
Citation Context: ...imation algorithms for POMDPs that perform forward search from a starting belief state, including work on computing bounds for the fringe nodes of a search tree (e.g., Satia & Lave 1973; Larsen 1989; Washington 1996, 1997; Hauskrecht 1997). In the past, heuristic search has been used to find a solution that takes the form of a tree that grows as the depth of the search increases. The contribution of this paper i...

1 | A Decision Tree Approach to Maintaining a Deteriorating Physical System - Larsen - 1989
Citation Context: ...ork on approximation algorithms for POMDPs that perform forward search from a starting belief state, including work on computing bounds for the fringe nodes of a search tree (e.g., Satia & Lave 1973; Larsen 1989; Washington 1996, 1997; Hauskrecht 1997). In the past, heuristic search has been used to find a solution that takes the form of a tree that grows as the depth of the search increases. The contributio...