Behavioral considerations suggest an average reward TD model . . .
- Neurocomputing, 2000
Cited by 5 (1 self)
Recently there has been much interest in modeling the activity of primate midbrain dopamine neurons as signalling reward prediction error. But since the models are based on temporal difference (TD) learning, they assume an exponential decline with time in the value of delayed reinforcers, an assumption long known to conflict with animal behavior. We show that a variant of TD learning that tracks variations in the average reward per timestep rather than cumulative discounted reward preserves the models' success at explaining neurophysiological data while significantly increasing their applicability to behavioral data.
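The abstract does not give the update rule, so the following is only a minimal sketch of a generic average-reward TD(0) learner, not the authors' model. The prediction error compares each reward with a running estimate of the average reward per timestep instead of discounting future rewards. The toy cyclic environment, the function name `average_reward_td`, and the step sizes `alpha` and `beta` are all assumptions made for illustration.

```python
def average_reward_td(n_steps=20000, n_states=4, alpha=0.1, beta=0.01):
    """Average-reward TD(0) on a hypothetical deterministic cycle of states.

    The agent moves 0 -> 1 -> ... -> n_states-1 -> 0 and receives a reward
    of 1 each time it returns to state 0 (an assumed toy task).
    """
    values = [0.0] * n_states   # differential (relative) state values
    avg_reward = 0.0            # running estimate of reward per timestep
    state = 0
    for _ in range(n_steps):
        next_state = (state + 1) % n_states
        reward = 1.0 if next_state == 0 else 0.0
        # Prediction error: reward relative to the average reward,
        # plus the change in differential value. No discount factor.
        delta = reward - avg_reward + values[next_state] - values[state]
        values[state] += alpha * delta
        # Track the average reward with an exponential moving average.
        avg_reward += beta * (reward - avg_reward)
        state = next_state
    return values, avg_reward


values, avg_reward = average_reward_td()
print("estimated average reward:", round(avg_reward, 3))
print("differential values:", [round(v, 2) for v in values])
```

In this sketch the differential values rise as the reward approaches and drop right after it, so the error signal stays centered on the average reward rate without any exponential discounting.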
Anticipatory learning: The animat as discovery engine
- In Butz, M. V., Gérard, P., & Sigaud, O. (Eds.), Adaptive Behavior in Anticipatory Learning Systems (ABiALS'02), 2002
Cited by 4 (0 self)
This paper takes an overtly anticipatory stance to the understanding of animat learning and behavior. It analyses four major animal learning theories and attempts to identify the anticipatory and predictive elements inherent to them, and to provide a new unifying approach based on the predictive nature of those elements. Parallels are then drawn with Karl Popper’s “Logic of Scientific Discovery” in order to show how an animat controller may be built inspired by those principles. The paper discusses the extent and limitations of this approach in an animat context and indicates how these principles were used to define the Dynamic Expectancy Model and construct its implementation, SRS/E.
Learning about Objects with Human Teachers
Cited by 4 (0 self)
A general learning task for a robot in a new environment is to learn about objects and what actions/effects they afford. To approach this, we look at ways that a human partner can intuitively help the robot learn: Socially Guided Machine Learning. We present experiments conducted with our robot, Junior, and make six observations characterizing how people approached teaching about objects. We show that Junior successfully used transparency to mitigate errors. Finally, we present the impact of “social” versus “nonsocial” data sets when training SVM classifiers.
Baby Steps: How “Less is More” in unsupervised dependency parsing
- In NIPS: Grammar Induction, Representation of Language and Language Learning, 2009
Cited by 4 (4 self)
We present an empirical study of two very simple approaches to unsupervised grammar induction. Both are based on Klein and Manning’s Dependency Model with Valence. The first, Baby Steps, requires no initialization and bootstraps itself via iterated learning of increasingly longer sentences. This method substantially exceeds Klein and Manning’s published numbers and achieves 39.4% accuracy on Section 23 of the Wall Street Journal corpus, a result that is already competitive with the recent state-of-the-art. The second, Less is More, is based on the observation that there is sometimes a trade-off between the quantity and complexity of training data. Using the standard linguistically-informed prior but training at the “sweet spot” (sentences up to length 15), it attains 44.1% accuracy, beating state-of-the-art. Both results generalize to the Brown corpus and shed light on opportunities in the present state of unsupervised dependency parsing.
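The two data-selection ideas in this abstract can be sketched as a training schedule. This is an illustrative outline only: the abstract does not specify the procedure in detail, so the cumulative "sentences up to length k" schedule, the warm start from the previous model, and the names `sentences_up_to`, `baby_steps`, and `train_step` are assumptions. `train_step` stands in for a hypothetical trainer, not for any real parser's API.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Model = TypeVar("Model")
Sentence = Sequence[str]


def sentences_up_to(corpus: List[Sentence], max_len: int) -> List[Sentence]:
    """Keep only sentences with at most max_len tokens.

    With max_len=15 this is the kind of subset "Less is More" trains on.
    """
    return [s for s in corpus if len(s) <= max_len]


def baby_steps(
    corpus: List[Sentence],
    train_step: Callable[[Optional[Model], List[Sentence]], Model],
    max_len: int,
    model: Optional[Model] = None,
) -> Optional[Model]:
    """Train on increasingly longer sentences, reusing the previous model.

    train_step is a hypothetical callable: it receives the current model
    (None at the start, i.e. no special initialization) and a training set,
    and returns an updated model.
    """
    for k in range(1, max_len + 1):
        batch = sentences_up_to(corpus, k)
        if not batch:
            continue  # no sentences this short; move on to the next length
        model = train_step(model, batch)
    return model
```

A caller would supply its own `train_step`, for example one that runs a few iterations of EM on the batch starting from the model it is given.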
Towards a four factor theory of anticipatory learning
- Lecture Notes in Artificial Intelligence, 2003
Cited by 3 (2 self)
This paper takes an overtly anticipatory stance to the understanding of animat learning and behavior. It analyses four major animal learning theories and attempts to identify the anticipatory and predictive elements inherent to them, and to provide a new unifying approach, based on the anticipatory nature of those elements, built on five simple predictive “rules”. These rules encapsulate all the principal properties of the four diverse theories (the four factors) and provide a simple framework for understanding how an individual animat may appear to operate according to different principles under varying circumstances. The paper then indicates how these anticipatory principles can be used to define a more detailed set of postulates for the Dynamic Expectancy Model of animat learning and behavior, and to construct its computer implementation, SRS/E. Some of the issues discussed are illustrated with an example experimental procedure using SRS/E.
How robot morphology and training order affect the learning of multiple behaviors
- In Proceedings of the IEEE Congress on Evolutionary Computation, 2009
Cited by 2 (1 self)
Automatically synthesizing behaviors for robots with articulated bodies poses a number of challenges beyond those encountered when generating behaviors for simpler agents. One such challenge is how to optimize a controller that can orchestrate dynamic motion of different parts of the body at different times. This paper presents an incremental shaping method that addresses this challenge: it trains a controller first to coordinate a robot’s leg motions to achieve directed locomotion toward an object, and then to coordinate gripper motion to achieve lifting once the object is reached. It is shown that success depends on the order in which these behaviors are learned, and that although one robot can master these behaviors better than another with a different morphology, this learning order is invariant across the two robot morphologies investigated here. This suggests that aspects of the task environment, learning algorithm or the controller dictate learning order more than the choice of morphology.
Learning from Human Teachers with Socially Guided Exploration
Cited by 2 (0 self)
We present a learning mechanism, Socially Guided Exploration, in which a robot learns new tasks through a combination of self-exploration and social interaction. The system’s motivational drives (novelty, mastery), along with social scaffolding from a human partner, bias behavior to create learning opportunities for a Reinforcement Learning mechanism. The system is able to learn on its own, but can flexibly use the guidance of a human partner to improve performance. An experiment with non-expert human subjects shows that a human is able to shape the learning process by suggesting actions and drawing attention to goal states. Human guidance results in a task set that is significantly more focused and efficient, while self-exploration results in a broader set.
Evolution of Functional Specialization in a Morphologically Homogeneous Robot
- 2009
Cited by 1 (1 self)
A central tenet of embodied artificial intelligence is that intelligent behavior arises out of the coupled dynamics between an agent’s body, brain and environment. It follows that the complexity of an agent’s controller and morphology must match the complexity of a given task. However, more complex task environments require the agent to exhibit different behaviors, which raises the question of how to distribute responsibility for these behaviors across the agent’s controller and morphology. In this work a robot is trained to locomote and manipulate an object, but the assumption of functional specialization is relaxed: the robot has a segmented body plan in which the front segment may participate in locomotion and object manipulation, or it may specialize to only participate in object manipulation. In this way, selection pressure dictates the presence and degree of functional specialization rather than such specialization being enforced a priori. It is shown that for the given task, evolution tends to produce functionally specialized controllers, even though successful generalized controllers can also be evolved. Moreover, the robot’s initial conditions and training order have little effect on the frequency of finding specialized controllers, while the inclusion of additional proprioceptive feedback increases this frequency.
Guarding Against Premature Convergence while Accelerating Evolutionary Search
Cited by 1 (1 self)
The fundamental dichotomy in evolutionary algorithms is that between exploration and exploitation. Recently, several

