Results 1 - 6 of 6
Reinforcement Learning I: Introduction
, 1998
Cited by 2829 (76 self)
In which we try to give a basic intuitive sense of what reinforcement learning is and how it differs from and relates to other fields, e.g., supervised learning and neural networks, genetic algorithms and artificial life, and control theory. Intuitively, RL is trial and error (variation and selection, search) plus learning (association, memory). We argue that RL is the only field that seriously addresses the special features of the problem of learning from interaction to achieve long-term goals.
Learning and Problem Solving with Multilayer Connectionist Systems
, 1986
Cited by 49 (1 self)
Charles William Anderson (B.S., University of Nebraska; M.S., University of Massachusetts; Ph.D., University of Massachusetts), September 1986. Directed by: Professor Andrew G. Barto.
The difficulties of learning in multilayered networks of computational units have limited the use of connectionist systems in complex domains. This dissertation elucidates the issues of learning in a network's hidden units, and reviews methods for addressing these issues that have been developed through the years. Issues of learning in hidden units are shown to be analogous to learning issues for multilayer systems employing symbolic representations.
Step size adaptation in reproducing kernel Hilbert space
- Journal of Machine Learning Research
, 2006
Cited by 5 (3 self)
This paper presents an online support vector machine (SVM) that uses the stochastic meta-descent (SMD) algorithm to adapt its step size automatically. We formulate the online learning problem as a stochastic gradient descent in reproducing kernel Hilbert space (RKHS) and translate SMD to the nonparametric setting, where its gradient trace parameter is no longer a coefficient vector but an element of the RKHS. We derive efficient updates that allow us to perform the step size adaptation in linear time. We apply the online SVM framework to a variety of loss functions, and in particular show how to handle structured output spaces and achieve efficient online multiclass classification. Experiments show that our algorithm outperforms more primitive methods for setting the gradient step size.
Generalized Gradient Adaptive Step Sizes For Stochastic Gradient Adaptive Filters
- IEEE International Conf. Acoust., Speech, Signal Processing
, 1995
Cited by 2 (2 self)
In this paper, we derive new adaptive step size algorithms for two general classes of modified stochastic gradient adaptive filters that include the sign-error, sign-data, sign-sign, and normalized gradient adaptive filters as specific cases. These computationally simple parameter adjustment algorithms are based on stochastic gradient approximations of steepest descent procedures for the unknown parameters. Analyses of the algorithms show that the stationary points of the steepest descent procedures yield the optimum step size values at each time instant as obtained from statistical analyses of the adaptive filter updates. Simulations verify the theoretical results and indicate that near-optimal tracking performance can be obtained from each of the adaptive step size algorithms without any knowledge of the rate of change of the unknown system.
1. INTRODUCTION Least-mean-square (LMS) adaptive finite-impulse-response (FIR) filters have proven to be extremely useful in a number of signal...
Reinforcement Learning in Situated Agents: Some Theoretical Problems and Practical Solutions
- In 8th European Workshop on Learning Robots
, 1999
Cited by 2 (0 self)
In on-line reinforcement learning, often a large number of estimation parameters (e.g. Q-value estimates for 1-step Q-learning) are maintained and dynamically updated as information comes to hand during the learning process. Excessive variance of these estimators can be problematic, resulting in uneven or unstable learning, or even making effective learning impossible. Estimator variance is usually managed only indirectly, by selecting global learning algorithm parameters (e.g. λ for TD(λ)-based methods) that are a compromise between an acceptable level of estimator perturbation and other desirable system attributes, such as reduced estimator bias. In this paper, we argue that this approach may not always be adequate, particularly for noisy and non-Markovian domains, and present a direct approach to managing estimator variance, the new ccBeta algorithm. Empirical results in an autonomous robotics domain are also presented showing improved performance using the ccBeta method....
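For readers unfamiliar with the Q-value estimates this abstract mentions, the following is a minimal sketch of the textbook 1-step Q-learning update with a fixed global step size. It is not the ccBeta algorithm, which the abstract does not specify. The function name `q_learning_update` and the parameters `alpha` (step size) and `gamma` (discount factor) are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one textbook 1-step Q-learning update in place.

    q is a mapping from (state, action) pairs to value estimates.
    alpha and gamma are hypothetical names for the global step size and
    the discount factor; a larger alpha tracks new information faster but
    makes each estimate noisier.
    """
    # Greedy value of the successor state under the current estimates.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference error for this single transition.
    td_error = reward + gamma * best_next - q[(state, action)]
    # Move the estimate a fraction alpha of the way toward the target.
    q[(state, action)] += alpha * td_error
    return td_error


q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
q_learning_update(q, state=0, action="a", reward=1.0, next_state=1,
                  actions=["a", "b"], alpha=0.5)
```

Because every estimate shares the same `alpha`, noise in the reward or in the successor value passes into the estimate at the same rate for all state-action pairs, which is the kind of indirect variance control the abstract describes.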
Stochastic Gradient Adaptive Step Size Algorithms for Adaptive Filtering
- In Proc. International Conference on Digital Signal Processing, Limassol, Cyprus
, 1995
Cited by 2 (0 self)
In this paper, we provide an overview of adaptive filtering algorithms that employ gradient adaptation for the step sizes as well as for the coefficients. Earlier works in this area have shown that LMS adaptive filters equipped with gradient adaptive step sizes have excellent convergence and tracking properties. We will consider the history of these schemes and their extensions to other adaptive filter structures. In particular, we show through a statistical analysis that the sign-error adaptive filter with gradient adaptive step size achieves near-optimal tracking of a nonstationary system with little to no knowledge of the underlying signal and noise statistics.
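As a rough illustration of what "gradient adaptation for the step sizes as well as for the coefficients" can look like, here is a minimal sketch of an LMS filter whose scalar step size is itself adjusted by a stochastic gradient rule. The update used is the commonly cited form in which the step size moves in proportion to the product of successive errors and the inner product of successive input vectors. Whether this matches the exact schemes surveyed in the paper is an assumption. The names `lms_adaptive_step`, `mu0`, `rho`, `mu_min` and `mu_max` are hypothetical.

```python
import random


def lms_adaptive_step(x, d, num_taps, mu0=0.05, rho=1e-4,
                      mu_min=1e-3, mu_max=0.2):
    """LMS FIR filter with a gradient-adapted scalar step size (sketch).

    x: input samples, d: desired samples, num_taps: filter length.
    rho is the (assumed) adaptation rate of the step size itself;
    mu_min and mu_max clip the step size to keep the filter stable.
    """
    w = [0.0] * num_taps        # filter coefficients
    mu = mu0                    # current step size
    prev_u = [0.0] * num_taps   # previous input vector
    prev_e = 0.0                # previous error
    for n in range(len(x)):
        # Input vector: the most recent num_taps samples, zero-padded.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        e = d[n] - sum(wi * ui for wi, ui in zip(w, u))
        # Step-size update: grow mu when successive gradients agree,
        # shrink it when they point in opposite directions.
        mu += rho * e * prev_e * sum(a * b for a, b in zip(u, prev_u))
        mu = min(max(mu, mu_min), mu_max)
        # Ordinary LMS coefficient update using the adapted step size.
        w = [wi + mu * e * ui for wi, ui in zip(w, u)]
        prev_u, prev_e = u, e
    return w, mu


# Usage: identify a hypothetical 2-tap system from noise-free data.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
d = [0.5 * x[n] - 0.3 * (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
w, mu = lms_adaptive_step(x, d, num_taps=2)
```

Clipping the step size to a fixed interval is a design choice made here for safety. The sign-error variant discussed in the abstract would replace `e` in the coefficient update with its sign, and is not shown.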

