Apprenticeship Learning via Inverse Reinforcement Learning. In Proceedings of the Twenty-first International Conference on Machine Learning, 2004. Cited by 382 (12 self).
We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we want to learn to perform. This setting is useful in applications (such as the task of driving) where it may be difficult to write down an explicit reward function specifying exactly how different desiderata should be traded off. We think of the expert as trying to maximize a reward function that is expressible as a linear combination of known features, and give an algorithm for learning the task demonstrated by the expert. Our algorithm is based on using "inverse reinforcement learning" to try to recover the unknown reward function. We show that our algorithm terminates in a small number of iterations, and that even though we may never recover the expert's reward function, the policy output by the algorithm will attain performance close to that of the expert, where performance is measured with respect to the expert's unknown reward function.
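A minimal sketch of a projection-style apprenticeship learning loop of the kind the abstract describes: alternate between fitting reward weights that separate the expert's feature expectations from the learner's, and solving the resulting MDP. The helpers solve_mdp and feature_expectations are hypothetical stand-ins for an RL solver and a rollout-based estimator, and this sketch is not claimed to be the authors' exact algorithm.

```python
# Sketch of apprenticeship learning via IRL (projection-style loop).
# Assumptions: solve_mdp(w) returns a policy optimal for reward w . phi(s);
# feature_expectations(pi) estimates discounted feature counts by rollout.
import numpy as np

def apprenticeship_learning(mu_expert, solve_mdp, feature_expectations,
                            n_iters=50, eps=1e-3):
    w = np.random.randn(len(mu_expert))   # arbitrary initial reward weights
    pi = solve_mdp(w)
    mu = feature_expectations(pi)
    policies, mus = [pi], [mu]
    mu_bar = mu
    for _ in range(n_iters):
        # Reward weights point from the current mixture toward the expert.
        w = mu_expert - mu_bar
        if np.linalg.norm(w) <= eps:      # expert's feature counts matched
            break
        pi = solve_mdp(w)
        mu = feature_expectations(pi)
        policies.append(pi); mus.append(mu)
        # Projection step: move mu_bar toward mu along the connecting segment.
        d = mu - mu_bar
        mu_bar = mu_bar + (d @ (mu_expert - mu_bar)) / (d @ d) * d
    return policies, mus
```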
A Survey of Socially Interactive Robots, 2002. Cited by 305 (26 self).
This paper reviews "socially interactive robots": robots for which social human-robot interaction is important. We begin by discussing the context for socially interactive robots, emphasizing the relationship to other research fields and the different forms of "social robots". We then present a taxonomy of design methods and system components used to build socially interactive robots. Finally, we describe the impact of these robots on humans and discuss open issues. An expanded version of this paper, which contains a survey and taxonomy of current applications, is available as a technical report [61].
A Survey of Robot Learning from Demonstration. Cited by 281 (19 self).
We present a comprehensive survey of robot Learning from Demonstration (LfD), a technique that develops policies from example state-to-action mappings. We introduce the LfD design choices in terms of demonstrator, problem space, policy derivation and performance, and contribute the foundations for a structure in which to categorize LfD research. Specifically, we analyze and categorize the multiple ways in which examples are gathered, ranging from teleoperation to imitation, as well as the various techniques for policy derivation, including matching functions, dynamics models and plans. To conclude, we discuss LfD limitations and related promising areas for future research.
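Of the policy-derivation techniques the survey categorizes, a matching function is the simplest to illustrate: act as the nearest demonstrated state did. A minimal sketch under that assumption (the nearest-neighbor choice and the class here are illustrative, not taken from the survey):

```python
# Minimal "matching function" policy: map a query state to the action
# taken at the nearest demonstrated state. Demo arrays are hypothetical.
import numpy as np

class NearestNeighborPolicy:
    def __init__(self, demo_states, demo_actions):
        self.S = np.asarray(demo_states)   # (N, state_dim)
        self.A = np.asarray(demo_actions)  # (N, action_dim)

    def act(self, state):
        # Euclidean distance to every demonstrated state.
        d = np.linalg.norm(self.S - np.asarray(state), axis=1)
        return self.A[np.argmin(d)]

# Usage: policy = NearestNeighborPolicy(states, actions); a = policy.act(s)
```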
Natural Methods for Robot Task Learning: Instructive Demonstrations, Generalization and Practice. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multi-Agent Systems, 2003. Cited by 162 (11 self).
Among humans, teaching various tasks is a complex process that relies on multiple means for interaction and learning, on the part of both the teacher and the learner. Used together, these modalities lead to effective approaches to teaching and learning. In the robotics domain, task teaching has mostly been addressed using only one or very few of these interactions. In this paper we present an approach for teaching robots that relies on the key features and the general approach people use when teaching each other: first give a demonstration, then allow the learner to refine the acquired capabilities by practicing under the teacher's supervision over a small number of trials. Depending on the quality of the learned task, the teacher may either demonstrate it again or provide specific feedback during the learner's practice trials for further refinement. Also, as people do during demonstrations, the teacher can provide simple instructions and informative cues that increase the performance of learning. Thus, instructive demonstrations, generalization over multiple demonstrations and practice trials are essential features of a successful human-robot teaching approach. We implemented a system that enables all these capabilities and validated these concepts with a Pioneer 2DX mobile robot learning tasks from multiple demonstrations and teacher feedback.
Learning to Perceive the World as Articulated: An Approach for Hierarchical Learning in Sensory-Motor Systems. Neural Networks, 1999. Cited by 141 (31 self).
This paper describes how agents can learn an internal model of the world structurally by focusing on the problem of behavior-based articulation. We develop an on-line learning scheme -- the so-called mixture of recurrent neural net (RNN) experts -- in which a set of RNN modules becomes self-organized as experts on multiple levels in order to account for the different categories of sensory-motor flow that the robot experiences. Autonomous switching of activated modules in the lower level represents the articulation of the sensory-motor flow. Meanwhile, a set of RNNs in the higher level competes to learn the sequences of module switching in the lower level, by which articulation at a still more abstract level can be achieved. The proposed scheme was examined through simulation experiments involving the navigation learning problem. Our dynamical systems analysis clarified the mechanism of the articulation; the possible correspondence between the articulation...
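A toy sketch of the gating idea behind a mixture of experts: each module predicts the next observation, responsibilities come from a softmax over prediction errors, and each module learns in proportion to its responsibility. The linear experts here are a deliberate simplification (the paper uses RNN modules), so this only illustrates the self-organization mechanism, not the paper's architecture.

```python
# Toy gating over a set of next-step predictors ("experts").
# Softmax responsibilities on prediction error stand in for the paper's
# self-organizing gate; linear maps replace RNNs purely for illustration.
import numpy as np

class MixtureOfExperts:
    def __init__(self, n_experts, dim, lr=0.1, temp=0.1):
        self.W = [np.eye(dim) + 0.01 * np.random.randn(dim, dim)
                  for _ in range(n_experts)]
        self.lr, self.temp = lr, temp

    def step(self, x_t, x_next):
        preds = [W @ x_t for W in self.W]
        errs = np.array([np.sum((x_next - p) ** 2) for p in preds])
        g = np.exp(-errs / self.temp)
        g /= g.sum()                      # gating responsibilities
        for i, W in enumerate(self.W):
            # Each expert updates in proportion to its responsibility,
            # so experts specialize on distinct regimes of the flow.
            W += self.lr * g[i] * np.outer(x_next - preds[i], x_t)
        return g.argmax()                 # active module = "articulation"
```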
Imitation as a dual-route process featuring predictive and learning components: a biologically plausible computational model, 2002.
Exploration and apprenticeship learning in reinforcement learning. In ICML, 2005. Cited by 102 (3 self).
We consider reinforcement learning in systems with unknown dynamics. Algorithms such as E3 (Kearns and Singh, 2002) learn near-optimal policies by using “exploration policies” to drive the system towards poorly modeled states, so as to encourage exploration. But this makes these algorithms impractical for many systems; for example, on an autonomous helicopter, overly aggressive exploration may well result in a crash. In this paper, we consider the apprenticeship learning setting in which a teacher demonstration of the task is available. We show that, given the initial demonstration, no explicit exploration is necessary, and we can attain near-optimal performance (compared to the teacher) simply by repeatedly executing “exploitation policies” that try to maximize rewards. In finite-state MDPs, our algorithm scales polynomially in the number of states; in continuous-state linear dynamical systems, it scales polynomially in the dimension of the state. These results are proved using a martingale construction over relative losses.
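A minimal sketch of the exploration-free loop the abstract describes: seed the dataset with the teacher demonstration, then repeatedly fit a dynamics model and execute a pure exploitation policy. The helpers fit_model, plan_in_model, and rollout are hypothetical stand-ins, and the sketch omits the paper's guarantees and stopping analysis.

```python
# Sketch of apprenticeship RL with no explicit exploration: fit a model
# to all data so far, plan greedily in it, execute, and refit.
def apprenticeship_rl(teacher_trajectories, fit_model, plan_in_model,
                      rollout, n_rounds=20):
    data = list(teacher_trajectories)    # seed with the demonstration
    for _ in range(n_rounds):
        model = fit_model(data)          # model of the unknown dynamics
        policy = plan_in_model(model)    # pure exploitation, no bonus
        traj = rollout(policy)           # execute on the real system
        data.append(traj)                # states reached under exploitation
    return policy
```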
Learning human arm movements by imitation: Evaluation of a biologically-inspired connectionist architecture. Robotics and Autonomous Systems, 2001. Cited by 101 (9 self).
This paper is concerned with the evaluation of a model of human imitation of arm movements. The model consists of a hierarchy of artificial neural networks, which are abstractions of brain regions involved in visuo-motor control. These are the spinal cord, the primary and pre-motor cortexes (M1 & PM), the cerebellum, and the temporal cortex. A biomechanical simulation is developed which models the muscles and the complete dynamics of a 37-degree-of-freedom humanoid. Inputs to the model are data from human arm movements recorded using video and marker-based tracking systems. The model's performance is evaluated for reproducing reaching movements and oscillatory movements of the two arms. Results show a high qualitative and quantitative agreement with human data. In particular, the model reproduces the well-known features of reaching movements in humans, namely the bell-shaped velocity curves and quasi-linear hand trajectories. Finally, the model's performance is compared to that o...
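The bell-shaped velocity profile mentioned above is commonly summarized by the classic minimum-jerk model; a sketch of that reference formula follows, as an illustration of the target behavior rather than of the paper's connectionist output.

```python
# Minimum-jerk trajectory between x0 and xf over duration T: the classic
# closed form whose speed profile is the bell shape mentioned above.
# (Illustrative reference model, not the paper's network.)
import numpy as np

def minimum_jerk(x0, xf, T, n=100):
    t = np.linspace(0.0, T, n)
    tau = t / T
    pos = x0 + (xf - x0) * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)
    vel = (xf - x0) / T * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)
    return t, pos, vel   # vel traces a symmetric bell-shaped curve
```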
Robot See, Robot Do: An Overview of Robot Imitation. In AISB96 Workshop on Learning in Robots and Animals, 1996. Cited by 92 (2 self).
There are currently two major approaches to robot teaching: explicitly tell the robot what to do (programming) or let the robot figure it out for itself (reinforcement learning/genetic algorithms). In this paper we give an overview of a new approach, in which the robot instead learns novel behaviours by observing the behaviour of others: imitation learning. We summarize the psychological background of this approach, propose a definition of imitation, and identify the important issues involved in implementing imitation in robotic systems. Based on this framework, we review recent published work in this area, and describe an imitation project currently underway at the Electrotechnical Laboratory in Japan.

Introduction. Ever since the conception of robots, researchers have been faced with the problem of how to make them behave: that is, how to endow robots with the ability to perform complex behaviours and interact intelligently with the environment. The two most widely-used solutions t...
Incremental learning of gestures by imitation in a humanoid robot. In Proceedings of the 2007 ACM/IEEE International Conference on Human-Robot Interaction, 2007. Cited by 92 (10 self).
We present an approach to incrementally teach human gestures to a humanoid robot. The learning process consists of first projecting the movement data into a latent space and encoding the resulting signals in a Gaussian Mixture Model (GMM). We compare the performance of two incremental training procedures against a batch training procedure. Qualitative and quantitative evaluations are performed on data acquired from motion sensors attached to a human demonstrator and on data acquired by kinesthetically demonstrating the task to the robot. We present experiments showing that these different modalities can be used to incrementally teach basketball officials' signals to a HOAP-3 humanoid robot.
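A minimal sketch of the batch variant of this pipeline: project motion data into a latent space with PCA, then encode the latent signals with a GMM. The data shapes, component counts, and use of scikit-learn are illustrative assumptions; the paper's incremental training procedures are not reproduced here.

```python
# Batch latent-space + GMM encoding sketch (scikit-learn).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

# X: (n_samples, n_joints) joint-angle observations from demonstrations;
# random placeholder data stands in for recorded gestures.
X = np.random.randn(500, 12)

latent = PCA(n_components=3).fit(X)     # latent-space projection
Z = latent.transform(X)

gmm = GaussianMixture(n_components=5, covariance_type='full').fit(Z)
print(gmm.means_.shape)                 # (5, 3): one Gaussian per component
```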