| I. Kwee, M. Hutter, and J. Schmidhuber. Market-based reinforcement learning in partially observable worlds. Proceedings of the International Conference on Artificial Neural Networks (ICANN-2001. |
....problems [Hut02a] arose from the construction of the time-bounded AIξ model [Hut01d]. Tight [Hut02b] error [Hut01c, Hut01a] and loss [Hut01b] bounds for Solomonoff's universal sequence prediction scheme have been proven. Loosely related ideas on a market-economy-based reinforcement learner [KHS01b] and a gradient-based reinforcement planner [KHS01a] have been implemented. These and other papers are available at http://www.idsia.ch/~marcus/ai. Acknowledgements. I am indebted to Shane Legg, who proofread this article. ....
I. Kwee, M. Hutter, and J. Schmidhuber. Market-based reinforcement learning in partially observable worlds. Proceedings of the International Conference on Artificial Neural Networks (ICANN-2001), pages 865--873, 2001.
No context found.
I. Kwee, M. Hutter, and J. Schmidhuber. Market-based reinforcement learning in partially observable worlds. Proceedings of the International Conference on Artificial Neural Networks (ICANN-2001.
....two-module idea. Promising candidates may be RL schemes based on economy and market models, such as classifier systems and their variants [9,44,45,42,43] or the related Prototypical Self-referential Associating Learning Mechanisms (PSALMs) [21], the Neural Bucket Brigade [22], Hayek Machines [1,12], Collective Intelligences (COINs) [46]. The basic ideas of the present chapter will probably remain unchanged, however: competing agents will agree on algorithmic experiments and bet on their outcomes, the winners profiting from outwitting others. Acknowledgments I would like to thank Jieyu ....
I. Kwee, M. Hutter, and J. Schmidhuber. Market-based reinforcement learning in partially observable worlds. Proceedings of the International Conference on Artificial Neural Networks (ICANN-2001), in press, (IDSIA-10-01, cs.AI/0105025), 2001.
.... learner based on Holland's artificial economy [15] to a simpler 3-peg blocks world problem where any disk may be placed on any other [3]; thus the required number of moves grows only linearly with the number of disks, not exponentially; we were able to replicate their results for n up to 5 [23]. Traditional AI planning procedures (e.g., chapter V of [36], [21]) do not learn but systematically explore all possible move combinations, using only absolutely necessary task-specific primitives (while OOPS will later use more than 70 general instructions, most of them unnecessary). On current ....