Reinforcement learning in continuous state and action space
Abstract: Continuous space reinforcement learning algorithms frequently fail to address the possibility of a continuous action space, presumably because of the difficulty of discovering the best action for a particular state. This can, in some cases, severely limit the ability of a learning algorithm to tackle some common problems where different portions of the state space require distinct action granularity. Naive action discretization does not suffice for problems of this nature, so traditional...
Cited by:
Variable Resolution Discretization in the Joint Space - Monson, Wingate, Seppi..
Active bibliography (related documents):
4.5: Reinforcement Learning in the Joint Space: Value Iteration in.. - Monson (2003)
0.3: A Study of Reinforcement Learning in the Continuous Case by the.. - Munos (1999)
0.3: Computational Geometry Column 40 - O'Rourke (2000)
Similar documents based on text:
0.2: Solving Large MDPs Quickly with Partitioned Value Iteration - Wingate, Seppi (2003)
0.2: An Initial Theoretical Foundation For Object-Oriented Systems.. - Clyde (1993)
0.2: Polynomial Real Root Finding in Bernstein Form - Spencer (1994)
BibTeX entry:
C. K. Monson. Reinforcement learning in the joint space: Value iteration in worlds with continuous states and actions. Master's thesis, Brigham Young University, Computer Science Department, Apr. 2003. http://citeseer.ist.psu.edu/article/monson03reinforcement.html
@mastersthesis{ ckmonson2003,
author = "Christopher K. Monson",
title = "Value Iteration in the Joint Space",
school = "Brigham Young University",
type = "Master's Thesis",
address = "Computer Science Department",
month = "April",
year = "2003",
url = "citeseer.ist.psu.edu/article/monson03reinforcement.html" }
Citations (may not include all citations):
374: Reinforcement learning: A survey - Kaelbling, Littman et al. (1996)
124: A new Voronoi-based surface reconstruction algorithm - Amenta, Bern et al. (1998)
102: Generalization in reinforcement learning: Safely approximati.. - Boyan, Moore (1995)
66: Stable function approximation in dynamic programming - Gordon (1995)
56: Constructing higher-dimensional convex hulls at logarithmic .. - Seidel (1986)
41: the randomized construction of the Delaunay tree - Boissonnat, Teillaud (1993)
28: Reinforcement learning with high-dimensional - Baird, Klopf (1993)
24: Applied optimal control: Optimization - Bryson, Ho (1969)
21: Improved incremental randomized Delaunay triangulation - Devillers (1998)
18: Memory-based learning for control - Moore, Atkeson et al. (1995)
15: Using local trajectory optimizers to speed up global optimiz.. - Atkeson (1994)
11: A comparison of direct and model-based reinforcement learnin.. - Atkeson, Santamaria (1997)
11: A Delaunay based shape reconstruction from large data - Dey, Giesen et al. (2001)
10: A convergent reinforcement learning algorithm in the continu.. - Munos (1996)
9: Multidimensional triangulation and interpolation for reinfor.. - Davies (1997)
9: Variable resolution dynamic programming: Efficiently learning .. - Moore (1991)
8: Reinforcement learning with dynamic covering of state-action.. - Munos, Patinel (1994)
7: Barycentric interpolators for continuous space and time rein.. - Munos, Moore (1998)
6: Decision boundary partitioning: Variable resolution model-fre.. - Reynolds (1999)
4: Optimal two-dimensional triangulations - Tan (1993)
4: Memory-based active learning for optimizing noisy continuous .. - Moore, Schneider et al. (1998)
3: The complexity of finding small triangulations of convex 3-p.. - Below, De Loera et al. (2000)
2: A robust implementation for three-dimensional Delaunay trian.. - Mucke (1995)
2: Machine Learning - discretization, control (2002)
2: Euclidean Geometry and Computers - Fortune, diagrams et al. (1992)
2: Kluwer Academic Publishing - Aurenhammer, Xu et al. (2000)
2: A multigrid form of value iteration applied to a markov deci.. - Heckendorn, Anderson (1998)
http://citeseer.nj.nec.com/392847.html
Documents on the same site (http://aml.cs.byu.edu/pubs.html):
Fixed vs Dynamic Sub-transfer in Reinforcement Learning - Carroll (2002)
Efficient Value Iteration Using Partitioned Models - Wingate, Seppi
Memory-guided Exploration in Reinforcement Learning - Carroll, Peterson, Owens