Mitsubishi Electric Research Laboratories (2003)
http://www.merl.com Non-Linear Stochastic Control in Continuous State Spaces...



View or download:
merl.com/reports/docs/TR2003091.pdf

From: merl.com/reports/docs/

Abstract: We present an algorithm for sequential control of tasks with non-linear stochastic dynamics in continuous state spaces, characterized by inhomogeneous noise. The algorithm performs approximate value iteration steps on a select set of prototypical states whose cost-to-go is approximated by means of a radial-basis function network. This allows the resulting Bellman's equations to be integrated exactly with respect to the transition densities of a large class of stochastic dynamical systems,...
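
The idea in the abstract can be illustrated with a minimal sketch. It is not the report's algorithm: the one-dimensional state, the toy dynamics, the quadratic cost, the least-squares fit, and every function name below are assumptions made for illustration. The only part taken from the abstract is the structure: value iteration backups at prototype states, a Gaussian radial-basis cost-to-go, and an expectation over a Gaussian transition density with state-dependent (inhomogeneous) variance that is computed in closed form instead of by sampling.

```python
import numpy as np

def rbf_features(x, centers, width):
    """Gaussian RBF activations phi_j(x) = exp(-(x - c_j)^2 / (2 width^2))."""
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]
    return np.exp(-(x - centers[None, :]) ** 2 / (2.0 * width ** 2))

def expected_rbf_features(mean, var, centers, width):
    """Closed-form E[phi_j(x')] for x' ~ N(mean, var), one row per mean."""
    mean = np.atleast_1d(np.asarray(mean, dtype=float))[:, None]
    var = np.atleast_1d(np.asarray(var, dtype=float))[:, None]
    total = width ** 2 + var  # variances of the two Gaussians add
    return width / np.sqrt(total) * np.exp(
        -(mean - centers[None, :]) ** 2 / (2.0 * total))

def value_iteration_sweep(weights, prototypes, actions, centers, width, gamma):
    """One approximate value-iteration sweep over the prototype states.

    Assumed toy model (hypothetical, not from the report):
      x' = x + a + noise, noise variance 0.01 + 0.05 * |x|
      stage cost x^2 + 0.1 * a^2
    """
    var = 0.01 + 0.05 * np.abs(prototypes)  # state-dependent noise
    q_values = []
    for a in actions:
        mean = prototypes + a
        cost = prototypes ** 2 + 0.1 * a ** 2
        # Expected cost-to-go at the successor state, integrated exactly.
        future = expected_rbf_features(mean, var, centers, width) @ weights
        q_values.append(cost + gamma * future)
    targets = np.min(np.stack(q_values), axis=0)  # Bellman backup
    # Refit the RBF weights to the backed-up values (least squares).
    phi = rbf_features(prototypes, centers, width)
    return np.linalg.lstsq(phi, targets, rcond=None)[0]

# Usage: prototypes double as RBF centers in this sketch.
centers = np.linspace(-2.0, 2.0, 9)
actions = np.array([-0.5, 0.0, 0.5])
weights = np.zeros(len(centers))
for _ in range(5):
    weights = value_iteration_sweep(weights, centers, actions, centers, 0.5, 0.9)
```

Repeated fit-and-backup sweeps of this kind are not guaranteed to converge for arbitrary function approximators, so the sketch runs only a fixed, small number of sweeps.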

Active bibliography (related documents):
0.1:   Learning State Grounding for Optimal Visual Servo Control.. - Nikovski, Nourbakhsh (2001)
0.1:   Reinforcement Learning with Policy Constraints - Thrun, Schulte
0.1:   Learning Probabilistic Models for Optimal Visual Servo.. - Nikovski, Nourbakhsh

BibTeX entry:

@misc{ merl-mitsubishi,
  author = "Http Www Merl",
  title = "Mitsubishi Electric Research Laboratories",
  url = "citeseer.ist.psu.edu/742369.html" }

Citations (may not include all citations):
413   Neuro-Dynamic Programming - Bertsekas, Tsitsiklis - 1996
291   Markov Decision Processes: Discrete Stochastic Dynamic Pro.. - Puterman - 1994
278   Dynamic Programming and Optimal Control - Bertsekas - 2000
219   Practical issues in temporal difference learning - Tesauro - 1992
212   A probabilistic approach to concurrent mapping and localizat.. - Thrun, Burgard et al. - 1998
102   Generalization in reinforcement learning: Safely approximati.. - Boyan, Moore - 1995
74   Probabilistic algorithms in robotics - Thrun - 2000
61   Networks and the best approximation property - Girosi, Poggio - 1990
59   Feature-based methods for large scale dynamic programming - Tsitsiklis, van Roy - 1996
33   Optimal Control and Estimation - Stengel - 1994
30   High-performance job-shop scheduling with a time-delay TD - Zhang, Dietterich - 1996
28   The swingup control problem for the acrobot - Spong - 1995
24   Elevator group control using multiple reinforcement learning.. - Crites, Barto - 1998
9   Kernel-based reinforcement learning - Ormoneit, Sen - 1999
7   Stable fitted reinforcement learning - Gordon - 1996
7   Pole balancing on a real rig using a reinforcement learning .. - Jervis, Fallside - 1992

Documents on the same site (http://www.merl.com/reports/docs/):
Iterative Decoding Using Replicas - Zhang, Wang, Fossorier, Yedidia (2005)
A Unified Traffic Control Scheme For ATM Networks.. - Zheng, Lauer, Howard.. (1994)
Rapid Object Detection Using a Boosted Cascade of Simple Features - Viola, Jones (2004)

CiteSeer.IST - Copyright Penn State and NEC