| "Automatic Programming of Behavior-based Robots using Reinforcement Learning", Sridhar Mahadevan and Jonathan Connell, IBM T. J. Watson Research Center Research Report, December 5, 1990. |
....shortens the distance between the reinforcement signal and the individual actions. Consequently, the length of the action sequences to be learned decreases. However, breaking the problem up into an appropriate set of modules requires domain information about the particular learning task. [Mahadevan and Connell 90] give an example of breaking up a box-pushing task into three modules, effectively introducing three subgoals into the learning task. The three are carefully chosen to be orthogonal and non-conflicting, based on the particular task. The robot's behavior repertoire is designed so that whatever ....
"Automatic Programming of Behavior-based Robots using Reinforcement Learning", Sridhar Mahadevan and Jonathan Connell, IBM T. J. Watson Research Center Research Report, December 5, 1990.