Module-Based Reinforcement Learning: Experiments with a Real Robot (1998)
Zsolt Kalmar, Csaba Szepesvári, András Lörincz
Machine Learning



View or download:
inf.elte.hu/~lorincz/Fi...modbase.ps.gz
iserv.iki.kfki.hu/pub/p...modbase.ps.gz
sneaker.mindmaker.kfkipar...ml98.ps.gz

Abstract: The behavior of reinforcement learning (RL) algorithms is best understood in completely observable, discrete-time controlled Markov chains with finite state and action spaces. In contrast, robot-learning domains are inherently continuous in both time and space, and are moreover partially observable. Here we suggest a systematic approach to solving such problems, in which the available qualitative and quantitative knowledge is used to reduce the complexity of the learning task. The steps of the...

Context of citations to this paper:

.... of low level actuator control was proposed by Mataric in its PhD thesis [15] and later on this approach was adopted by other authors [16, 17]. Dorigo and Colombetti combined RL and Classifier Systems to produce highly autonomous robots [18] based on a formalisation of the...

Cited by:
Experiments in Robot Control for an Instance-Based.. - Ribeiro, Hemerly (1999)

Similar documents (at the sentence level):
27.0%:   Module Based Reinforcement Learning for a Real Robot - Kalmár

Active bibliography (related documents):
0.8:   Multi-criteria Reinforcement Learning - Gabor, Kalmar, Szepesvari (1998)
0.7:   Between MDPs and Semi-MDPs: A Framework for Temporal.. - Sutton, Precup, Singh (1999)
0.5:   A Unified Framework for Hybrid Control: Model and.. - Branicky, Borkar, Mitter (1998)

Similar documents based on text:
1.1:   Some Basic Facts Concerning Minimax Sequential Decision Processes - Szepesvari (1996)
1.1:   Certainty Equivalent Policies Are Self-Optimizing Under.. - Szepesvári (1996)
1.0:   Approximate Inverse-Dynamics Based Robust Control Using .. - Szepesvári.. (1997)

BibTeX entry:

Z. Kalmar, C. Szepesvari, and A. Lorincz. Module-based reinforcement learning: Experiments with a real robot. Machine Learning. to appear. http://citeseer.ist.psu.edu/kalmar98modulebased.html

@article{kalmar97module,
  author  = {Zs. Kalmar and Cs. Szepesvari and A. Lorincz},
  title   = {Module-Based Reinforcement Learning: Experiments with a Real Robot},
  journal = {Machine Learning},
  year    = {1998},
  volume  = {31},
  number  = {1--3},
  pages   = {55--85},
  month   = {April},
  url     = {citeseer.ist.psu.edu/kalmar98modulebased.html}
}

