Suematsu, N., & Hayashi, A. (1999). A reinforcement learning algorithm in partially observable environments using short-term memory. In Advances in neural information processing systems (Vol. 11, pp. 1059--1065).
....way is to build a model online that learns to predict observations and rewards, and thereby learns to infer the environmental state at each point. A separate controller can then use these inferred environmental states as the basis for its control policy (Lin & Mitchell, 1993; Chrisman, 1992; Suematsu & Hayashi, 1999; Schmidhuber, 1991a, 1991b). Another way to proceed is to abandon the idea of such a full-blown predictive model of the environment. In certain restricted cases, the non-Markovian task can be hierarchically decomposed into a set of Markovian subtasks, each of which can be solved by a reactive ....