Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore. Reinforcement Learning: A Survey. J. of Artificial Intelligence Research, 4:237--285, 1996.


This paper is cited in the following contexts:
On Amount and Quality of Bias in Reinforcement Learning - Hailu, Sommer

....goal are neither useful nor available. But other biases, like environment, are more generic and could be applied across tasks. Unbiased, B0: In this case the agent does not know beforehand the nature of the problem; therefore, its action selection strategy is optimism in the face of uncertainty [4]. That is, entries of the belief matrices are like the unused blackboard and learning proceeds from scratch. Environment Bias, B1: In this type of bias, a part of the environment knowledge informs the agent to stay away from likely collision (which can be the boundary of the world or other ....

....and reinforcement, produce complete (optimal) policy. This reassuring quality, however, is useless in practical terms. An agent that quickly reaches a plateau at 99% of optimality may, in many applications, be preferable to an agent that guarantees eventual convergence but has sluggish early learning [4]. Insight bias, B2, has reduced the average action by 45.3%. This is an astonishing result, since the search space in bias B1 is 38, whereas in bias B2 it is 40. So, even with a larger search space, the insight bias has enabled the agent to learn faster than bias B1. This signifies the fact ....

Leslie P. Kaelbling, Michael L. Littman, and Andrew W. Moore. Reinforcement Learning: A Survey. J. of Artificial Intelligence Research, 4:237--285, 1996.


Continual Robot Learning with Constructive Neural Networks - Großmann, Poli (1998)

....robots for specific tasks. These and other features of robot learning are hoped to move autonomous robotics closer to real world applications. Reinforcement learning has been used by a number of researchers as a computational tool for constructing robots that improve themselves with experience [6]. Despite the impressive advances in this field in recent years, a number of technological gaps remain. For example, it has been found that traditional reinforcement learning techniques do not scale up well to larger problems and that, therefore, we must give up tabula rasa learning techniques ....

....action sequences through imitating an expert on one hand, and by reusing previously learned sensation-action rules on the other hand. To explore these ideas, we have decided to use Q-learning, which is probably the most popular and well understood model-free reinforcement learning algorithm [6]. The idea of Q-learning is to construct an evaluation function Q(s, a), called the Q-function, which returns an estimate of the discounted cumulative reinforcement, i.e. the utility, for each state-action pair (s, a) given that the learning agent is in state s and executes action a. Given an ....

Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore. Reinforcement Learning: A Survey. J. of Artificial Intelligence Research, 4:237--285, 1996.
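
The Q-function described in the excerpt above can be illustrated with a minimal tabular Q-learning sketch. This is not code from the cited paper: the corridor environment (chain_step), the function name q_learning, and the values chosen for the learning rate, discount factor, and exploration rate are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def q_learning(step, start_state, actions, episodes=500, alpha=0.5,
               gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning; all parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q(s, a) estimates, initialised to 0
    for _ in range(episodes):
        s = start_state
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.choice(actions)  # explore
            else:
                # Exploit, breaking ties between equal estimates at random.
                best = max(q[(s, b)] for b in actions)
                a = rng.choice([b for b in actions if q[(s, b)] == best])
            s_next, r, done = step(s, a)
            # Discounted value of the best action in the next state.
            best_next = 0.0 if done else max(q[(s_next, b)] for b in actions)
            # Move Q(s, a) toward the one-step return r + gamma * max Q(s', .).
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s_next
            if done:
                break
    return q


def chain_step(state, action):
    """Hypothetical corridor with states 0..4; reward 1 on reaching state 4."""
    nxt = max(0, min(4, state + (1 if action == "right" else -1)))
    done = nxt == 4
    return nxt, (1.0 if done else 0.0), done


ACTIONS = ["left", "right"]
q = q_learning(chain_step, start_state=0, actions=ACTIONS)
# Greedy action in each non-terminal state according to the learned estimates.
greedy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(4)]
print(greedy)
```

In this toy setting, Q(s, a) estimates the discounted cumulative reinforcement for taking action a in state s, so the greedy policy read off the table heads toward the rewarding end of the corridor.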


Machine Learning for Computer Graphics: A Manifesto and Tutorial - Hertzmann (2003)

No context found.

Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore. Reinforcement Learning: A Survey. J. of Artificial Intelligence Research, 4:237--285, 1996.

