Greedy linear value-approximation for factored Markov decision processes (2002) [18 citations, 7 self]

by Relu Patrascu, Pascal Poupart
In Proceedings of the 18th National Conference on Artificial Intelligence
http://www.cs.uwaterloo.ca/~ppoupart/publications/basislp/paper.ps.gz

Abstract:

Significant recent work has focused on using linear representations to approximate value functions for factored Markov decision processes (MDPs). Current research has adopted linear programming as an effective means to calculate approximations for a given set of basis functions, tackling very large MDPs as a result. However, a number of issues remain unresolved: How accurate are the approximations produced by linear programs? How hard is it to produce better approximations? And where do the basis functions come from? To address these questions, we first investigate the complexity of minimizing the Bellman error of a linear value function approximation, showing that this is an inherently hard problem. Nevertheless, we provide a branch and bound method for calculating Bellman error and performing approximate policy iteration for general factored MDPs. These methods are more accurate than linear programming, but more expensive. We then consider linear programming itself and investigate methods for automatically constructing sets of basis functions that allow this approach to produce good approximations. The techniques we develop are guaranteed to reduce L1 error, but can also empirically reduce Bellman error.

Citations

246 Decision-theoretic planning: Structural assumptions and computational leverage – Boutilier, Dean, et al. - 1999
214 Hierarchical reinforcement learning with the MAXQ value function decomposition – Dietterich
81 Multiagent planning with factored MDPs – Guestrin, Koller, et al. - 2001
72 Stochastic dynamic programming with factored representations – Boutilier, Dearden, et al. - 2000
64 Computing factored value functions for policies in structured MDPs – Koller, Parr - 1999
55 The linear programming approach to approximate dynamic programming – de Farias, Van Roy - 2003
54 Policy iteration for factored MDPs – Koller, Parr - 2000
53 Max-norm projections for factored MDPs – Guestrin, Koller, et al. - 2001
47 Generalized polynomial approximations in Markovian decision processes – Schweitzer, Seidmann - 1985
31 Markov decision processes: Discrete dynamic programming – Puterman - 1994
29 Complexity of finite-horizon Markov decision processes – Mundhenk, Goldsmith, et al. - 2000
22 Direct value-approximation for factored MDPs – Schuurmans, Patrascu - 2001
21 Dynamic Programming and Optimal Control, volume 2. Athena Scientific – Bertsekas - 1995
21 Nonapproximability results for partially observable Markov decision processes – Lusena, Goldsmith, et al. - 2001
11 Spline approximations to value functions: A linear programming approach – Trick, Zin - 1997
8 Using free energies to represent Q-values in a multiagent reinforcement learning task – Sallans, Hinton - 2001
4 Nonlinear Optimization. Athena Scientific – Bertsekas - 1995