Learning from Observation Using Primitives (2004)

by D. C. Bentivegna

Results 1 - 10 of 70

A Survey of Robot Learning from Demonstration

by Brenna D. Argall, Sonia Chernova, Manuela Veloso, Brett Browning
"... We present a comprehensive survey of robot Learning from Demonstration (LfD), a technique that develops policies from example state to action mappings. We introduce the LfD design choices in terms of demonstrator, problem space, policy derivation and performance, and contribute the foundations for a ..."
Abstract - Cited by 281 (19 self) - Add to MetaCart
We present a comprehensive survey of robot Learning from Demonstration (LfD), a technique that develops policies from example state to action mappings. We introduce the LfD design choices in terms of demonstrator, problem space, policy derivation and performance, and contribute the foundations for a structure in which to categorize LfD research. Specifically, we analyze and categorize the multiple ways in which examples are gathered, ranging from teleoperation to imitation, as well as the various techniques for policy derivation, including matching functions, dynamics models and plans. To conclude we discuss LfD limitations and related promising areas for future research.

Citation Context

...approximation does not occur until a current observation point in need of mapping is present. The simplest Lazy Learning technique is kNN, which is applied to action selection within robotic marble-maze [Bentivegna, 2004] and simulated ball interception [Argall et al., 2007] domains. More complex approaches include Locally Weighted Regression (LWR) [Cleveland and Loader, 1995]. One LWR technique further anchors local...
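To make the lazy-learning approach in this excerpt concrete, here is a minimal kNN action selector in Python. It is a sketch only: the dataset, the state layout (position plus velocity, as in the marble-maze domain), and the plain Euclidean metric are assumptions for illustration, not code from the survey.

```python
import numpy as np

# Hypothetical demonstration data: each row pairs a state with the
# teacher's action. In the marble-maze domain a state might be
# (x, y, dx, dy) and an action a 2-D board command; both layouts are
# assumptions for illustration only.
demo_states = np.array([[0.1, 0.2, 0.0, 0.0],
                        [0.4, 0.2, 0.1, 0.0],
                        [0.4, 0.5, 0.0, 0.1]])
demo_actions = np.array([[0.5, 0.0],
                         [0.0, 0.5],
                         [-0.2, 0.3]])

def knn_action(query, k=1):
    """Lazy learning: nothing is fit ahead of time; when a query state
    arrives, return the mean action of its k nearest demonstrations."""
    dists = np.linalg.norm(demo_states - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return demo_actions[nearest].mean(axis=0)

print(knn_action(np.array([0.35, 0.25, 0.05, 0.0])))  # nearest teacher action
```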

Reinforcement Learning in Robotics: A Survey

by Jens Kober, J. Andrew Bagnell , Jan Peters
"... Reinforcement learning offers to robotics a framework and set oftoolsfor the design of sophisticated and hard-to-engineer behaviors. Conversely, the challenges of robotic problems provide both inspiration, impact, and validation for developments in reinforcement learning. The relationship between di ..."
Abstract - Cited by 39 (2 self) - Add to MetaCart
Reinforcement learning offers to robotics a framework and set of tools for the design of sophisticated and hard-to-engineer behaviors. Conversely, the challenges of robotic problems provide inspiration, impact, and validation for developments in reinforcement learning. The relationship between disciplines has sufficient promise to be likened to that between physics and mathematics. In this article, we attempt to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots. We highlight both key challenges in robot reinforcement learning as well as notable successes. We discuss how contributions tamed the complexity of the domain and study the role of algorithms, representations, and prior knowledge in achieving these successes. As a result, a particular focus of our paper lies on the choice between model-based and model-free as well as between value-function-based and policy-search methods. By analyzing a simple problem in some detail we demonstrate how reinforcement learning approaches may be profitably applied, and ...

Learning Robot Motion Control with Demonstration and Advice-Operators

by Brenna D. Argall, Brett Browning, Manuela Veloso
"... Abstract — As robots become more commonplace within society, the need for tools to enable non-robotics-experts to develop control algorithms, or policies, will increase. Learning from Demonstration (LfD) offers one promising approach, where the robot learns a policy from teacher task executions. Our ..."
Abstract - Cited by 33 (15 self) - Add to MetaCart
As robots become more commonplace within society, the need for tools to enable non-robotics-experts to develop control algorithms, or policies, will increase. Learning from Demonstration (LfD) offers one promising approach, where the robot learns a policy from teacher task executions. Our interests lie with robot motion control policies which map world observations to continuous low-level actions. In this work, we introduce Advice-Operator Policy Improvement (A-OPI) as a novel approach for improving policies within LfD. Two distinguishing characteristics of the A-OPI algorithm are its data source and its continuous state-action space. Within LfD, more example data can improve a policy. In A-OPI, new data is synthesized from a student execution and teacher advice. By contrast, typical demonstration approaches provide the learner with exclusively teacher executions. A-OPI is effective within continuous state-action spaces because high-level human advice is translated into continuous-valued corrections on the student execution. This work presents a first implementation of the A-OPI algorithm, validated on a Segway RMP robot performing a spatial positioning task. A-OPI is found to improve task performance, both in success and accuracy. Furthermore, performance is shown to be similar or superior to the typical exclusively-teacher-demonstrations approach.
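As a rough sketch of the data-synthesis idea in this abstract: an advice operator translates a high-level critique into continuous-valued corrections applied over the points of a recorded student execution, and the corrected points become new training data. The operator names, the 1-D action, and the correction magnitude below are invented for illustration; this is not the paper's implementation.

```python
import numpy as np

# A recorded student execution: (observation, action) pairs with a
# continuous 1-D action, kept small for illustration.
executed_obs = np.array([[0.0], [0.1], [0.2]])
executed_act = np.array([[1.0], [1.1], [1.2]])

# Hypothetical advice operators: each maps executed actions to
# continuous-valued corrected actions.
ADVICE_OPERATORS = {
    "turn_more_left":  lambda acts: acts + 0.2,
    "turn_more_right": lambda acts: acts - 0.2,
}

def synthesize_data(advice, obs, acts):
    """Apply one high-level advice operator across a student execution,
    yielding corrected (observation, action) pairs to add to the
    demonstration set."""
    return obs, ADVICE_OPERATORS[advice](acts)

new_obs, new_acts = synthesize_data("turn_more_left", executed_obs, executed_act)
```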

Citation Context

... There are three core approaches to policy derivation from demonstration data. In the first approach, the data is used to directly approximate the underlying function mapping observations to actions [5]. In the second approach, the data is used to determine the world dynamics model T(s'|s, a) and possibly a function R(s) associating reward with world state [4]. In the third approach, a sequence o...

Learning by demonstration with critique from a human teacher

by Brenna Argall, Brett Browning, Manuela Veloso - In Proc. of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2007
"... Learning by demonstration can be a powerful and natural tool for developing robot control policies. That is, instead of tedious hand-coding, a robot may learn a control policy by interacting with a teacher. In this work we present an algorithm for learning by demonstration in which the teacher opera ..."
Abstract - Cited by 31 (2 self) - Add to MetaCart
Learning by demonstration can be a powerful and natural tool for developing robot control policies. That is, instead of tedious hand-coding, a robot may learn a control policy by interacting with a teacher. In this work we present an algorithm for learning by demonstration in which the teacher operates in two phases. The teacher first demonstrates the task to the learner. The teacher next critiques learner performance of the task. This critique is used by the learner to update its control policy. In our implementation we utilize a 1-Nearest Neighbor technique which incorporates both training dataset and teacher critique. Since the teacher critiques performance only, they do not need to guess at an effective critique for the underlying algorithm. We argue that this method is particularly well-suited to human teachers, who are generally better at assigning credit to performances than to algorithms. We have applied this algorithm to the simulated task of a robot intercepting a ball. Our results demonstrate improved performance with teacher critiquing, where performance is measured by both execution success and efficiency.
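One plausible reading of the critique mechanism, sketched below: every training point carries a credit score, teacher critique adjusts the credit of the points that produced an execution, and the 1-Nearest Neighbor lookup scales distances by credit so poorly rated points are selected less often. The multiplicative update and the distance scaling are assumptions, not the paper's exact scheme.

```python
import numpy as np

# Demonstration set: states, their actions, and a per-point credit
# score that teacher critique adjusts (all values illustrative).
states = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
actions = np.array([0, 1, 2])      # action labels
credit = np.ones(len(states))      # neutral credit to start

def select_action(query):
    """1-NN lookup with credit-scaled distances: low-credit points look
    farther away and so are chosen less often."""
    idx = int(np.argmin(np.linalg.norm(states - query, axis=1) / credit))
    return idx, actions[idx]

def apply_critique(point_idx, good):
    """After the teacher rates an execution, adjust the credit of the
    point that produced it (the multiplicative update is an assumption)."""
    credit[point_idx] *= 1.2 if good else 0.8

idx, act = select_action(np.array([0.1, 0.1]))
apply_critique(idx, good=False)    # poor execution: demote that point
```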

Citation Context

...behaviors sparsely with via points. Local learning was used for motion control policy development by Atkeson et al. [2]. The work presented in this paper also uses local learning techniques, and like [5, 8] combines this with teaching by demonstration. In general, the example executions of a teacher will not apply directly to a learner. Many techniques rely upon an unknown mapping from teacher observati...

Knowledge transfer using local features

by Martin Stolle, Christopher G. Atkeson - In Proceedings of the IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL'07), 2007
"... Abstract-We present a method for reducing the effort required to compute policies for tasks based on solutions to previously solved tasks. The key idea is to use a learned intermediate policy based on local features to create an initial policy for the new task. In order to further improve this init ..."
Abstract - Cited by 19 (1 self) - Add to MetaCart
We present a method for reducing the effort required to compute policies for tasks based on solutions to previously solved tasks. The key idea is to use a learned intermediate policy based on local features to create an initial policy for the new task. In order to further improve this initial policy, we developed a form of generalized policy iteration. We achieve a substantial reduction in the computation needed to find policies when previous experience is available.

Citation Context

...(x, y, dx, dy), where x and y specify the 2d position on the plane and dx, dy specify the 2d velocity. Actions are also two dimensional (fx, fy) and are force vectors to be applied to the marble. This is similar, though not identical, to tilting the board. Other simplifications are made in the simulator: the marble is only a point and has no physical extent, the physics are simulated as a sliding block (which simplifies friction and inertia), and the simulator adds no artificial noise. Hence, all actions are deterministic. A more realistic but also higher-dimensional marble maze simulator was used by Bentivegna [19]. The reward structure used for reinforcement learning in this domain is very simple. Reaching the goal results in a large positive reward. Falling into a hole terminates the trial and results in a large negative reward. Additionally, each action incurs a small negative reward. The agent tries to maximize the reward received, resulting in policies that roughly minimize the time to reach the goal while avoiding holes. Solving the maze from scratch was done using value iteration. In value iteration, dynamic programming sweeps across all states and performs the following update to the value funct...
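The excerpt cuts off mid-sentence; the update it refers to is the standard value-iteration backup, V(s) <- max_a [r(s, a) + V(s')]. Below is a toy, position-only discretization of the maze in Python; the position-only grid is an assumption for brevity (the state described above also includes velocity), and the reward magnitudes are placeholders.

```python
import numpy as np

GOAL_R, HOLE_R, STEP_R = 100.0, -100.0, -1.0   # placeholder rewards
grid = np.array([[0, 0, 0],
                 [0, 1, 0],      # 1 marks a hole (terminal, bad)
                 [0, 0, 2]])     # 2 marks the goal (terminal, good)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # deterministic moves

V = np.zeros(grid.shape)
for _ in range(100):             # dynamic-programming sweeps
    for r in range(grid.shape[0]):
        for c in range(grid.shape[1]):
            if grid[r, c]:       # terminal states keep their reward
                V[r, c] = GOAL_R if grid[r, c] == 2 else HOLE_R
                continue
            # Bellman backup: best one-step reward plus next state's value.
            best = -np.inf
            for dr, dc in ACTIONS:
                nr = min(max(r + dr, 0), grid.shape[0] - 1)
                nc = min(max(c + dc, 0), grid.shape[1] - 1)
                best = max(best, STEP_R + V[nr, nc])
            V[r, c] = best
```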

Learning similar tasks from observation and practice

by Darrin C. Bentivegna, Christopher G. Atkeson - In International Conference on Intelligent Robots and Systems, 2006
"... Abstract- This paper presents a case study of learning to select behavioral primitives and generate subgoals from observation and practice. Our approach uses local features to generalize across tasks and global features to learn from practice. We demonstrate this approach applied to the marble maze ..."
Abstract - Cited by 14 (3 self) - Add to MetaCart
This paper presents a case study of learning to select behavioral primitives and generate subgoals from observation and practice. Our approach uses local features to generalize across tasks and global features to learn from practice. We demonstrate this approach applied to the marble maze task. Our robot uses local features to initially learn primitive selection and subgoal generation policies from observing a teacher maneuver a marble through a maze. The robot then uses this information as it tries to traverse another maze, and refines the information during learning from practice.

Citation Context

...ted by observing a sequence of critical events, such as the ball just hit a single wall or the ball is against two walls. For more details on our method of segmenting observed data into primitives see [14]. The policies used in the Action Generation module have previously been learned from observing and practicing similar mazes and were also used in the research of [8], [9]. Execution of a primitive en...

Tactile guidance for policy refinement and reuse

by Brenna D. Argall, Eric L. Sauser, Aude G. Billard
"... Abstract—Demonstration learning is a powerful and practical technique to develop robot behaviors. Even so, development remains a challenge and possible demonstration limitations can degrade policy performance. This work presents an approach for policy improvement and adaptation through a tactile int ..."
Abstract - Cited by 12 (5 self) - Add to MetaCart
Demonstration learning is a powerful and practical technique to develop robot behaviors. Even so, development remains a challenge and possible demonstration limitations can degrade policy performance. This work presents an approach for policy improvement and adaptation through a tactile interface located on the body of a robot. We introduce the Tactile Policy Correction (TPC) algorithm, which employs tactile feedback for the refinement of a demonstrated policy, as well as its reuse for the development of other policies. We validate TPC on a humanoid robot performing grasp-positioning tasks. The performance of the demonstrated policy is found to improve with tactile corrections. Tactile guidance is also shown to enable the development of policies able to successfully execute novel, undemonstrated tasks. Through the tactile interface, the human teacher indicates relative adjustments to the robot pose (the policy predictions) online, as the robot executes. The robot immediately modifies its pose to accommodate the adjustment, and the resulting, adjusted pose is treated as new training data for the policy.
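A minimal sketch of the correction loop this abstract describes, with an invented nearest-neighbor policy and a 3-D pose: the tactile adjustment is applied to the predicted pose immediately, and the adjusted pose is recorded as new training data. This illustrates the idea only; it is not the TPC implementation.

```python
import numpy as np

dataset_obs, dataset_pose = [], []   # grows as corrections arrive

def predict_pose(obs):
    """Stand-in policy: nearest neighbor over collected data, or a
    default pose while the dataset is empty (purely illustrative)."""
    if not dataset_obs:
        return np.zeros(3)
    dists = [np.linalg.norm(o - obs) for o in dataset_obs]
    return dataset_pose[int(np.argmin(dists))]

def execute_step(obs, tactile_offset):
    """One TPC-style step: predict a pose, apply the teacher's tactile
    adjustment at once, and keep the adjusted pose as training data."""
    pose = predict_pose(obs) + tactile_offset
    dataset_obs.append(obs)
    dataset_pose.append(pose)
    return pose

# Teacher nudges the end-effector 2 cm along z during execution.
execute_step(np.array([0.3, 0.1, 0.0]), np.array([0.0, 0.0, 0.02]))
```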

Citation Context

... the form of behavior primitives, or simpler policies that contribute to the execution of a more complex policy. Examples include hand-coded primitives used within [9] or automatically extracted from [11] demonstrated tasks, and primitives learned from demonstration [10]. In our work, policy reuse takes a bootstrapping, rather than behavior-primitives, form: the adapted policy performs a different tas...

Toward a vocabulary of primitive task programs for humanoid robots

by Evan Drumwright, Maja J. Matarić - Proc. of International Conference on Development and Learning (ICDL), Bloomington, IN, 2006
"... Abstract — Researchers and engineers have used primitive actions to facilitate programming of tasks since the days of Shakey [1]. Task-level programming, which requires the user to specify only subgoals of a task to be accomplished, depends on such a set of primitive task programs to perform these s ..."
Abstract - Cited by 11 (1 self) - Add to MetaCart
Researchers and engineers have used primitive actions to facilitate the programming of tasks since the days of Shakey [1]. Task-level programming, which requires the user to specify only subgoals of a task to be accomplished, depends on such a set of primitive task programs to perform these subgoals. Past research in this area has used the commands from robot programming languages as the vocabulary of primitive tasks for robotic manipulators. We propose drawing from work measurement systems to construct the vocabulary of primitive task programs. We describe one such work measurement system, present several primitive task programs for humanoid robots inspired by this system, and show how these primitive programs can be used to construct complex behaviors. Index Terms: robot programming, task-level programming, humanoid robots

Citation Context

...te complexity of planning [6], ascertain how to ignore extraneous data in constructing plans from observed task executions [7], [8], or learn policies for executing tasks with performance criteria [9]. Subsequently, research into "good" sets of primitive robot actions has been relatively overlooked; researchers [4], [7] tend to propose new primitives in an ad hoc manner. In contrast, we propose a ...

Learning from observation and practice using primitives

by Darrin C. Bentivegna, Christopher G. Atkeson, Gordon Cheng - In AAAI Fall Symposium Series, Symposium on Real-life Reinforcement Learning, Washington, USA, 2004
"... We explore how to enable robots to rapidly learn from watching a human or robot perform a task, and from practicing the task itself. A key component of our approach is to use small units of behavior, which we refer to as behavioral primitives. Another key component is to use the observed human behav ..."
Abstract - Cited by 10 (1 self) - Add to MetaCart
We explore how to enable robots to rapidly learn from watching a human or robot perform a task, and from practicing the task itself. A key component of our approach is to use small units of behavior, which we refer to as behavioral primitives. Another key component is to use the observed human behavior to define the space to be explored during learning from practice. In this paper we manually define task appropriate primitives by programming how to find them in the training data. We describe memory-based approaches to learning how to select and provide subgoals for behavioral primitives. We demonstrate both learning from observation and learning from practice on a marble maze task, Labyrinth. Using behavioral primitives greatly speeds up learning relative to learning using a direct mapping from states to actions.
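The memory-based selection described here can be sketched as a nearest-neighbor lookup over observed situations, each remembered with the primitive the teacher used and the subgoal it aimed for. The feature layout, primitive names, and subgoals below are invented for illustration, not taken from the paper.

```python
import numpy as np

# Memory built from observing the teacher: local feature vectors paired
# with the primitive chosen and the subgoal pursued (all illustrative).
memory_feats = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
memory_prims = ["roll_along_wall", "roll_off_wall", "guide"]
memory_goals = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])

def select_primitive(local_features):
    """Memory-based (1-NN) policy: reuse the primitive and subgoal of
    the most similar remembered situation."""
    i = int(np.argmin(np.linalg.norm(memory_feats - local_features, axis=1)))
    return memory_prims[i], memory_goals[i]

prim, subgoal = select_primitive(np.array([0.8, 0.2]))
print(prim, subgoal)   # -> the wall-following primitive and its subgoal
```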

Interactive Human Pose and Action Recognition Using Dynamical Motion Primitives

by Odest Chadwicke Jenkins, Germán González Serrano, 2007
"... There is currently a division between real-world human performance and the decision making of socially interactive robots. This circumstance is partially due to the difficulty in estimating human cues, such as pose and gesture, from robot sensing. Towards bridging this division, we present a method ..."
Abstract - Cited by 10 (2 self) - Add to MetaCart
There is currently a division between real-world human performance and the decision making of socially interactive robots. This circumstance is partially due to the difficulty of estimating human cues, such as pose and gesture, from robot sensing. Towards bridging this division, we present a method for kinematic pose estimation and action recognition from monocular robot vision through the use of dynamical human motion vocabularies. Our notion of a motion vocabulary is comprised of movement primitives that structure a human's action space for decision making and predict human movement dynamics. Through prediction, such primitives can be used both to generate motor commands for specific actions and to perceive humans performing those actions. In this paper, we focus specifically on the perception of human pose and performed actions using a known vocabulary of primitives. Given image observations over time, each primitive infers pose independently using its expected dynamics in the context of a particle filter. Pose estimates from a set of primitives inferring in parallel are arbitrated to estimate the action being performed. The efficacy of our approach is demonstrated through interactive-time pose and action recognition over extended motion trials. Results show that our approach requires small numbers of particles for tracking, is robust to unsegmented multi-action movement, movement speed, and camera viewpoint, and is able to recover from occlusions.
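To illustrate the arbitration scheme in this abstract: each primitive runs its own particle filter using its expected dynamics, and the primitive whose filter best explains the observations is taken as the recognized action. The 1-D "pose", the two primitives, and all noise parameters below are invented; real vocabularies use learned, high-dimensional motion models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical primitive dynamics: each predicts the next 1-D "pose"
# from the current one. Real primitives would be learned motion models.
PRIMITIVES = {
    "wave":  lambda pose: pose + rng.normal(0.1, 0.05, pose.shape),
    "punch": lambda pose: pose + rng.normal(0.5, 0.05, pose.shape),
}

def filter_step(particles, dynamics, obs, obs_noise=0.2):
    """One particle-filter step: propagate particles with the primitive's
    expected dynamics, weight them by observation likelihood, and
    resample. The summed weight says how well this primitive explains
    the observation."""
    particles = dynamics(particles)
    w = np.exp(-((particles - obs) ** 2).sum(axis=1) / (2 * obs_noise ** 2))
    total = w.sum()
    p = w / total if total > 0 else np.full(len(particles), 1 / len(particles))
    return particles[rng.choice(len(particles), len(particles), p=p)], total

# Arbitration: run every primitive's filter in parallel on the same
# observation and pick the best-fitting one as the recognized action.
particles = {name: np.zeros((100, 1)) for name in PRIMITIVES}
obs, scores = np.array([0.45]), {}
for name, dyn in PRIMITIVES.items():
    particles[name], scores[name] = filter_step(particles[name], dyn, obs)
print(max(scores, key=scores.get))   # -> "punch" for this observation
```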