Abstract:
This paper presents a direct reinforcement learning algorithm, called Finite-Element Reinforcement Learning, for the continuous case, i.e. continuous state space and continuous time. Evaluating the value function makes it possible to generate an optimal policy for reinforcement control problems such as target or obstacle problems, viability problems, or optimization problems. We propose a continuous formalism for the study of reinforcement learning using the continuous optimal control framework, then state the associated Hamilton-Jacobi-Bellman equation. First, we propose to approximate the value function by a numerical scheme based on a finite-element method. This generates a discrete Markov Decision Process, with finite state and control spaces, which can be solved