Two Methods for Hierarchy Learning in Reinforcement Environments (1993)
| Venue: | Proceedings of the Second International Conference on Simulation of Adaptive Behavior (SAB-92) |
| Citations: | 13 - 0 self |
BibTeX
@INPROCEEDINGS{Ring93twomethods,
author = {Mark Ring},
title = {Two Methods for Hierarchy Learning in Reinforcement Environments},
booktitle = {Proceedings of the Second International Conference on Simulation of Adaptive Behavior (SAB-92)},
year = {1993},
pages = {148--155},
publisher = {MIT Press}
}
Abstract
This paper describes two methods for hierarchically organizing temporal behaviors. The first is more intuitive: grouping together common sequences of events into single units so that they may be treated as individual behaviors. This system immediately encounters problems, however, because the units are binary, meaning the behaviors must execute completely or not at all, and this hinders the construction of good training algorithms. The system also runs into difficulty when more than one unit is (or should be) active at the same time. The second system is a hierarchy of transition values. This hierarchy dynamically modifies the values that specify the degree to which one unit should follow another. These values are continuous, allowing the use of gradient descent during learning. Furthermore, many units are active at the same time as part of the system's normal functioning.

1 Introduction

The importance of hierarchy in adaptive systems that perform temporal tasks has been noted often...







