
Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding (1996)  (110 citations)
Richard S. Sutton
Advances in Neural Information Processing Systems




 
Abstract: On large problems, reinforcement learning systems must use parameterized function approximators such as neural networks in order to generalize between similar situations and actions. In these cases there are no strong theoretical results on the accuracy of convergence, and computational results have been mixed. In particular, Boyan and Moore reported at last year's meeting a series of negative results in attempting to apply dynamic programming together with function approximation to simple...

Cited by:
Planning In Hybrid Structured Stochastic Domains - Comenius University
Towards a Unified Theory of State Abstraction for MDPs - Li, Walsh, Littman (2006)
A Reinforcement Learning Algorithm based on Policy Iteration for.. - Gosavi (2004)

Similar documents (at the sentence level):
11.4%:   Reinforcement Learning for 3 vs. 2 Keepaway - Stone, Sutton, Singh

Active bibliography (related documents):
0.2:   Learning From State Differences: - Weaver, Baxter (1999)
0.2:   STD(λ): learning state differences with TD(λ) - Weaver, Baxter
0.2:   Minimum-Time Control of the Acrobot - Boone (1997)

Similar documents based on text:
0.1:   Efficient Value Function Approximation For Reinforcement Learning - Wang (1998)
0.1:   Model-Based Reinforcement Learning with an Approximate.. - Kuvayev, Sutton (1996)
0.1:   An Analysis of Temporal-Difference Learning with Function.. - Tsitsiklis, Van Roy (1996)

Related documents from co-citation:
42:   Learning from Delayed Rewards - Watkins - 1989
33:   Generalization in reinforcement learning: safely approximating the value functio.. - Boyan, Moore - 1995
33:   Learning to predict by the methods of temporal differences - Sutton - 1988

BibTeX entry:

R. S. Sutton. Generalization in reinforcement learning: Successful examples using sparse coarse coding. In D. Touretzky, M. Mozer, and M. Hasselmo, editors, Advances in Neural Information Processing Systems, volume 8. MIT Press, 1996. http://citeseer.ist.psu.edu/sutton96generalization.html

@inproceedings{ sutton96generalization,
    author = "Richard S. Sutton",
    title = "Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding",
    booktitle = "Advances in Neural Information Processing Systems",
    volume = "8",
    publisher = "The {MIT} Press",
    editor = "David S. Touretzky and Michael C. Mozer and Michael E. Hasselmo",
    pages = "1038--1044",
    year = "1996",
    url = "http://citeseer.ist.psu.edu/sutton96generalization.html"
}
Citations (may not include all citations):
563   Learning to predict by the methods of temporal differences - Sutton - 1988
219   Practical issues in temporal difference learning - Tesauro - 1992
141   Temporal Credit Assignment in Reinforcement Learning - Sutton - 1984
135   Self-improving reactive agents based on reinforcement learni.. - Lin - 1992
124   Improving elevator performance using reinforcement learning - Crites, Barto - 1996
102   Generalization in reinforcement learning: Safely approximati.. - Boyan, Moore - 1995
84   Residual Algorithms: Reinforcement Learning with Function Ap.. - Baird - 1995
84   Neuronlike elements that can solve difficult learning contro.. - Barto, Sutton et al. - 1983
80   A reinforcement learning approach to job-shop scheduling - Zhang, Dietterich - 1995
66   Stable function approximation in dynamic programming - Gordon - 1995
59   Feature-based methods for large-scale dynamic programming - Tsitsiklis, Van Roy - 1994
59   On-line Q-learning using connectionist systems - Rummery, Niranjan - 1994
33   Brains, Behavior, and Robotics - Albus - 1981
25   The convergence of TD(λ) for general λ - Dayan - 1992
25   Online learning with random representations - Sutton, Whitehead - 1993
13   A counterexample to temporal differences learning - Bertsekas - 1995
12   CMAC-based adaptive critic self-learning control - Lin - 1991
10   Reinforcement learning for planning and control - Dean, Basye et al. - 1992
4   Swinging up the acrobot: An example of intelligent control - DeJong, Spong - 1994
1   Robot Dynamics and Control. New York: Wiley - Spong, Vidyasagar - 1989
1   CMAC: An associative neural network alternative to backpropa.. - Miller, Glanz, Kraft - 1990
1   Reinforcement learning with replacing eligibility traces - Singh, Sutton - 1996
1   Learning from Delayed Rewards - Watkins - 1989





