Results 1 - 10 of 2,597,120
Relative Value Function Approximation
, 1997
"... A form of temporal difference learning is presented that learns the relative utility of states, instead of the absolute utility. This formulation backs up decisions instead of values, making it possible to learn a simpler function for defining a decision-making policy. A nonlinear relative value function can be learned without increasing the dimensionality of the inputs. ..."
Cited by 4 (0 self)
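The snippet above contrasts relative-utility learning with standard absolute-utility temporal difference learning. As background, a minimal tabular TD(0) sketch of the absolute-value formulation it departs from (illustrative only; the chain, rewards, and step size are assumptions, not the paper's setup):

```python
# Minimal tabular TD(0) sketch of standard absolute-value learning
# (illustrative assumption, not the paper's relative-value algorithm).
# A 3-state chain: s0 -> s1 -> s2 (terminal), reward 1.0 on termination.
ALPHA, GAMMA = 0.1, 0.9
V = [0.0, 0.0, 0.0]  # absolute utility estimate per state

def td_update(s, r, s_next):
    """One TD(0) backup: move V[s] toward the target r + gamma * V[s_next]."""
    V[s] += ALPHA * (r + GAMMA * V[s_next] - V[s])

for _ in range(200):          # replay the same episode repeatedly
    td_update(0, 0.0, 1)      # s0 -> s1, no reward
    td_update(1, 1.0, 2)      # s1 -> terminal, reward 1
assert V[1] > V[0] > 0.0      # value propagates backwards along the chain
```

A relative formulation, as the snippet describes, would instead learn which successor state is preferred rather than these absolute numbers.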
Manifold Representations for Value-Function Approximation
In Working Notes of the Workshop on Markov Decision Processes, AAAI 2004
, 2005
"... Reinforcement learning (RL) has been shown to be an effective paradigm for learning control policies for problems with discrete state spaces. For problems with continuous multidimensional state spaces, the results are ..."
Global Optimization for Value Function Approximation
, 2010
"... Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provide ..."
Value Function Approximation in Zero-Sum Markov Games
In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence (UAI 2002)
, 2002
"... This paper investigates value function approximation in the context of zero-sum Markov games, which can be viewed as a generalization of the Markov decision process (MDP) framework to the two-agent case. We generalize error bounds from MDPs to Markov games and describe generalizations of reinf ..."
Cited by 13 (1 self)
High-accuracy value-function approximation with neural networks
In ESANN
, 2004
"... Abstract. Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this paper, we present experimental results obtained by using a feedforward neural network instead. The learning algori ..."
Cited by 1 (0 self)
Sketch-Based Linear Value Function Approximation
"... Hashing is a common method to reduce large, potentially infinite feature vectors to a fixed-size table. In reinforcement learning, hashing is often used in conjunction with tile coding to represent states in continuous spaces. Hashing is also a promising approach to value function approximation in l ..."
Cited by 7 (2 self)
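The snippet above describes hashing large tile-coding feature spaces into a fixed-size table. A minimal sketch of that general idea (the tile identifiers, table size, and hash choice are assumptions, not the paper's sketch-based method):

```python
# Minimal sketch of feature hashing as used with tile coding (general idea
# only; not the paper's method). Each active tile identifier, drawn from a
# potentially huge range, is mapped into a fixed-size weight table.
import zlib

TABLE_SIZE = 1024  # fixed table size (illustrative choice)

def hash_index(tile: tuple) -> int:
    """Map an arbitrary tile identifier to a slot in the fixed-size table."""
    return zlib.crc32(repr(tile).encode()) % TABLE_SIZE

def value(weights, active_tiles):
    """Linear value estimate: sum of the weights of the hashed active tiles."""
    return sum(weights[hash_index(t)] for t in active_tiles)

weights = [0.0] * TABLE_SIZE
tiles = [(0, 3, 7), (1, 3, 7), (2, 4, 7)]  # hypothetical active tiles
for t in tiles:
    weights[hash_index(t)] += 0.5
print(value(weights, tiles))  # 1.5 when no two tiles collide in the table
```

Distinct tiles can collide in the same slot; the fixed memory footprint is traded against that aliasing error.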
On the Smoothness of Linear Value Function Approximations
, 2006
"... Markov decision processes (MDPs) with discrete and continuous state and action components can be solved efficiently by hybrid approximate linear programming (HALP). The main idea of the approach is to approximate the optimal value function by a set of basis functions and optimize their weights b ..."
Parametric Value Function Approximation: a Unified View
"... Abstract. Reinforcement learning (RL) is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. An important RL subtopic is to approximate this function when the system is too large for an exact representation. This survey reviews and unifies state-of-the-art methods for parametric value function approximation by grouping them into three main categories: bootstrapping, residuals and projected fixed-point approaches ..."
Cited by 10 (6 self)
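The survey's "projected fixed-point" category covers least-squares methods such as LSTD. A minimal scalar-feature sketch of that fixed-point solve (the samples, features, and discount are assumptions for illustration, not the survey's code):

```python
# Minimal projected fixed-point sketch in the LSTD style (illustrative
# assumption, not the survey's code). With scalar features phi(s), the
# fixed-point system A w = b collapses to scalars, so w = b / A, where
#   A = sum_i phi_i * (phi_i - gamma * phi'_i),   b = sum_i phi_i * r_i.
GAMMA = 0.9

# (phi(s), reward, phi(s')) samples: s0 -> s1 (r=0), s1 -> terminal (r=1);
# the terminal state has feature 0.
samples = [(1.0, 0.0, 2.0), (2.0, 1.0, 0.0)]

A = sum(phi * (phi - GAMMA * phi_next) for phi, _, phi_next in samples)
b = sum(phi * r for phi, r, _ in samples)
w = b / A  # weight of the (single) basis function
print(w)   # 2 / 3.2 = 0.625
```

The approximate value of a state is then `w * phi(s)`; with more features, `A` becomes a matrix and the division a linear solve.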