Results 1 - 10 of 141
Intrinsic motivation systems for autonomous mental development
- IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2007
Cited by 255 (56 self)
Abstract:
Exploratory activities seem to be intrinsically rewarding for children and crucial for their cognitive development. Can a machine be endowed with such an intrinsic motivation system? This is the question we study in this paper, presenting a number of computational systems that try to capture this drive towards novel or curious situations. After discussing related research coming from developmental psychology, neuroscience, developmental robotics, and active learning, this paper presents the mechanism of Intelligent Adaptive Curiosity, an intrinsic motivation system which pushes a robot towards situations in which it maximizes its learning progress. This drive makes the robot focus on situations which are neither too predictable nor too unpredictable, thus permitting autonomous mental development. The complexity of the robot’s activities autonomously increases and complex developmental sequences self-organize without …
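The learning-progress drive described in this abstract can be sketched in a few lines. This is a hedged illustration, not the paper's exact algorithm: the `Region` class, the sliding-window size, and the greedy selection rule below are all illustrative assumptions.

```python
# Illustrative sketch of a learning-progress reward in the spirit of
# Intelligent Adaptive Curiosity. The Region abstraction and window
# size are assumptions, not the paper's exact formulation.

class Region:
    """Tracks recent prediction errors for one sensorimotor region."""

    def __init__(self, window=5):
        self.errors = []
        self.window = window

    def record(self, error):
        self.errors.append(error)

    def learning_progress(self):
        """Progress = decrease in mean error between two successive windows."""
        w = self.window
        if len(self.errors) < 2 * w:
            return 0.0
        recent = sum(self.errors[-w:]) / w
        older = sum(self.errors[-2 * w:-w]) / w
        return older - recent  # positive when prediction is improving


def choose_region(regions):
    """Greedily pick the region with the highest learning progress."""
    return max(regions, key=lambda r: r.learning_progress())
```

A region whose errors are falling yields positive progress and attracts the agent, while an already-predictable (flat-error) or unlearnable (noisy-error) region yields progress near zero, which is what keeps the agent between "too predictable" and "too unpredictable".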
Emergence of functional hierarchy in a multiple timescale neural network model: A humanoid robot experiment
- PLoS Computational Biology
Cited by 93 (15 self)
Abstract:
It is generally thought that skilled behavior in human beings results from a functional hierarchy of the motor control system, within which reusable motor primitives are flexibly integrated into various sensori-motor sequence patterns. The underlying neural mechanisms governing the way in which continuous sensori-motor flows are segmented into primitives and the way in which series of primitives are integrated into various behavior sequences have, however, not yet been clarified. In earlier studies, this functional hierarchy has been realized through the use of explicit hierarchical structure, with local modules representing motor primitives in the lower level and a higher module representing sequences of primitives switched via additional mechanisms such as gate-selecting. When sequences contain similarities and overlap, however, a conflict arises in such earlier models between generalization and segmentation, induced by this separated modular structure. To address this issue, we propose a different type of neural network model. The current model neither makes use of separate local modules to represent primitives nor introduces explicit hierarchical structure. Rather than forcing architectural hierarchy onto the system, functional hierarchy emerges through a form of self-organization that is based on two distinct types of neurons, each with different time properties (“multiple timescales”). Through the introduction of multiple timescales, continuous sequences of behavior are segmented into reusable primitives, and the primitives, in turn, are flexibly integrated into novel sequences. In experiments, the proposed network model, coordinating the physical body of a humanoid robot through …
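The multiple-timescales mechanism this abstract describes boils down to leaky-integrator units with different time constants. A minimal sketch, assuming a shared random weight matrix, a constant input, and a simple Euler update (all illustrative, not the paper's trained network):

```python
import numpy as np

# Sketch of the "multiple timescales" idea: continuous-time units whose
# internal state is a leaky integrator with time constant tau. Large tau
# gives slow dynamics (higher-level context), small tau gives fast
# dynamics (primitives). Weights, sizes, and input are illustrative.

def ctrnn_step(u, x, W, tau):
    """One Euler step: u <- (1 - 1/tau) * u + (1/tau) * (W @ x)."""
    return (1.0 - 1.0 / tau) * u + (1.0 / tau) * (W @ x)

rng = np.random.default_rng(0)
n = 4
W = rng.standard_normal((n, n)) * 0.1
u_fast = np.zeros(n)
u_slow = np.zeros(n)
x = np.ones(n)  # constant input, for illustration only

for _ in range(10):
    u_fast = ctrnn_step(u_fast, x, W, tau=2.0)   # fast units
    u_slow = ctrnn_step(u_slow, x, W, tau=50.0)  # slow units

# After the same number of steps, the slow units have moved much less
# from their initial state than the fast units.
```

Because the only architectural difference between the two groups is the time constant, any hierarchy that appears is a product of the dynamics rather than of separate modules, which is the abstract's central point.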
Learning to Forget: Continual Prediction with LSTM
- NEURAL COMPUTATION, 1999
Cited by 86 (25 self)
Abstract:
Long Short-Term Memory (LSTM, Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive “forget gate” that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review illustrative benchmark problems on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve continual versions of these problems. LSTM with forget gates, however, easily solves them in an elegant way.
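The forget-gate remedy can be illustrated with a scalar LSTM cell. The dictionary of per-gate weights below is a toy stand-in for learned weight matrices, so this is a sketch of the update equations rather than a usable layer:

```python
import math

# Sketch of one LSTM cell step with a forget gate:
#   c_t = f_t * c_{t-1} + i_t * g_t
# Without f_t, the cell state c could only accumulate and would grow
# without bound on continual streams; the forget gate lets the cell
# learn to reset itself. Scalar weights here are illustrative.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell_step(x, h, c, w):
    """One step of a scalar LSTM cell; w maps gate name -> (wx, wh, b)."""
    f = sigmoid(w["f"][0] * x + w["f"][1] * h + w["f"][2])   # forget gate
    i = sigmoid(w["i"][0] * x + w["i"][1] * h + w["i"][2])   # input gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h + w["g"][2]) # candidate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h + w["o"][2])   # output gate
    c = f * c + i * g        # forget gate scales the old cell state
    h = o * math.tanh(c)     # gated output
    return h, c
```

Driving the forget gate toward 0 (a large negative bias) discards the old state and keeps c bounded; driving it toward 1 recovers the original accumulate-only LSTM cell, which is exactly the failure mode on unsegmented streams.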
Multiple model-based reinforcement learning
- Neural Computation, 2002
Cited by 85 (5 self)
Abstract:
We propose a modular reinforcement learning architecture for non-linear, non-stationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state prediction model and a reinforcement learning controller. The “responsibility signal,” which is given by the softmax function of the prediction errors, is used to weight the outputs of multiple modules as well as to gate the learning of the prediction models and the reinforcement learning controllers. We formulate MMRL for both the discrete-time, finite-state case and the continuous-time, continuous-state case. The performance of MMRL was demonstrated for the discrete case in a non-stationary hunting task in a grid world and for the continuous case in a non-linear, non-stationary control task of swinging up a pendulum with variable physical parameters.
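The responsibility signal can be sketched directly: a softmax over (negative, scaled) squared prediction errors. The Gaussian scaling by a parameter sigma is a common way to write this; the particular value used below is an illustrative assumption:

```python
import numpy as np

# Sketch of an MMRL-style "responsibility signal": the module whose
# predictor currently fits the dynamics best receives most of the
# weight, both for mixing module outputs and for gating learning.
# sigma (the error scale) is an illustrative choice.

def responsibilities(prediction_errors, sigma=1.0):
    """lambda_i proportional to exp(-e_i^2 / (2 sigma^2)), normalized to sum to 1."""
    e = np.asarray(prediction_errors, dtype=float)
    logits = -(e ** 2) / (2.0 * sigma ** 2)
    logits -= logits.max()      # subtract max for numerical stability
    w = np.exp(logits)
    return w / w.sum()

# Example: module 0 predicts well (error 0.1), modules 1 and 2 do not.
lam = responsibilities([0.1, 2.0, 3.0])
```

Because the same weights gate learning, each prediction model is trained mostly on the data it already explains best, which is what drives the decomposition into spatial and temporal domains.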
Learning semantic combinatoriality from the interaction between linguistic and behavioral processes
- ADAPTIVE BEHAVIOR, 2005
The Challenges of Joint Attention
- Interaction Studies, 2004
Cited by 62 (7 self)
Abstract:
This paper discusses the concept of joint attention and the different skills underlying its development. We argue that joint attention is much more than gaze following or simultaneous looking because it implies a shared intentional relation to the world. The current state-of-the-art in robotic and computational models of the different prerequisites of joint attention is discussed in relation to a developmental timeline drawn from results in child studies.
Learning to generate articulated behavior through the bottom-up and the top-down interaction processes
- NEURAL NETWORKS 16: 11–23, 2003
Cited by 58 (23 self)
Abstract:
A novel hierarchical neural network architecture for sensory-motor learning and behavior generation is proposed. Two levels of forward model neural networks are operated on different time scales while parametric interactions are allowed between the two network levels in the bottom-up and top-down directions. The models are examined through experiments of behavior learning and generation using a real robot arm equipped with a vision system. The results of the learning experiments showed that the behavioral patterns are learned by self-organizing the behavioral primitives in the lower level and combining the primitives sequentially in the higher level. The results contrast with prior work …
Internal Models and Anticipations in Adaptive Learning Systems
- In Proceedings of the Workshop on Adaptive Behavior in Anticipatory Learning Systems
Cited by 44 (7 self)
Abstract:
The explicit investigation of anticipations in relation to adaptive behavior is a recent approach. This chapter first provides psychological background that motivates and inspires the study of anticipations in the adaptive behavior field. Next, a basic framework for the study of anticipations in adaptive behavior is suggested. Different anticipatory mechanisms are identified and characterized. First, fundamental distinctions are drawn between implicit anticipatory behavior, payoff anticipatory behavior, sensory anticipatory behavior, and state anticipatory behavior. A case study then provides further insight into these distinctions.
Maximizing learning progress: an internal reward system for development
- Embodied Artificial Intelligence, LNCS 3139, 2004
Cited by 42 (7 self)
Abstract:
This chapter presents a generic internal reward system that drives an agent to increase the complexity of its behavior. This reward system does not reinforce a predefined task. Its purpose is to drive the agent to progress in learning given its embodiment and the environment in which it is placed. The dynamics created by such a system are studied first in a simple environment and then in the context of active vision.
Extracting Regularities in Space and Time Through a Cascade of Prediction Networks: The Case of a Mobile Robot Navigating in a Structured Environment
, 1999
Cited by 42 (8 self)
Abstract:
We propose that the ability to extract regularities from time series through prediction learning can be enhanced if we use a hierarchical architecture in which higher layers are trained to predict the internal state of lower layers when such states change significantly. This hierarchical organization has two functions: (a) it forces the system to progressively re-code sensory information so as to enhance useful regularities and filter out useless information; (b) it progressively reduces the length of the sequences which should be predicted going from lower to higher layers. This, in turn, allows higher levels to extract higher-level regularities which are hidden at the sensory level. By training an architecture of this type to predict the next sensory state of a robot navigating in an environment divided into two rooms, we show how the first-level prediction layer extracts low-level regularities such as “walls”, “corners”, and “corridors”, while the second-level prediction layer…
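The re-coding step this abstract describes, passing the lower layer's state upward only when it changes significantly, can be sketched as follows; the scalar states and the fixed threshold are illustrative assumptions:

```python
# Sketch of the cascade idea: a higher layer receives the lower layer's
# internal state only at significant changes, so it predicts over a
# shorter, more abstract sequence. The threshold value is illustrative.

def significant_changes(states, threshold=0.5):
    """Keep each state whose distance from the last kept state exceeds threshold."""
    kept = [states[0]]
    for s in states[1:]:
        if abs(s - kept[-1]) > threshold:
            kept.append(s)
    return kept

# A lower-layer state trace (e.g. a unit's activation while the robot
# moves along a wall, turns a corner, and returns):
low_level = [0.0, 0.1, 0.1, 1.0, 1.1, 1.0, 0.0, 0.1]
high_level_input = significant_changes(low_level)
# The higher layer now predicts over a compressed sequence of events.
```

Compressing the sequence this way is what lets the higher layer see room-level transitions that are invisible at the raw sensory timescale.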