Results 1 - 10 of 85
Intrinsic motivation systems for autonomous mental development
- IEEE Transactions on Evolutionary Computation, 2007
Cited by 255 (56 self)
Exploratory activities seem to be intrinsically rewarding for children and crucial for their cognitive development. Can a machine be endowed with such an intrinsic motivation system? This is the question we study in this paper, presenting a number of computational systems that try to capture this drive towards novel or curious situations. After discussing related research coming from developmental psychology, neuroscience, developmental robotics, and active learning, this paper presents the mechanism of Intelligent Adaptive Curiosity, an intrinsic motivation system which pushes a robot towards situations in which it maximizes its learning progress. This drive makes the robot focus on situations which are neither too predictable nor too unpredictable, thus permitting autonomous mental development. The complexity of the robot’s activities autonomously increases and complex developmental sequences self-organize without …
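The learning-progress drive described in this abstract can be sketched in a few lines: a region-level monitor rewards a recent decrease in prediction error, so activities that are already mastered (flat, low error) or unlearnable (flat, high error) yield no reward, while learnable ones do. The class and function names below are illustrative, not the paper's implementation.

```python
import random

class LearningProgressRegion:
    """Tracks prediction errors for one sensorimotor region."""
    def __init__(self, window=10):
        self.errors = []
        self.window = window

    def record(self, error):
        self.errors.append(error)

    def learning_progress(self):
        # Intrinsic reward: drop in mean prediction error between the
        # previous window and the most recent one.
        if len(self.errors) < 2 * self.window:
            return 0.0
        recent = self.errors[-self.window:]
        older = self.errors[-2 * self.window:-self.window]
        return (sum(older) - sum(recent)) / self.window

def choose_region(regions, epsilon=0.1):
    """Mostly pick the region with the highest learning progress;
    explore at random with probability epsilon."""
    if random.random() < epsilon:
        return random.randrange(len(regions))
    progress = [r.learning_progress() for r in regions]
    return progress.index(max(progress))
```

A region whose errors are falling reports positive progress and attracts the robot's activity; one with constant errors, however low or high, reports zero.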
Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990-2010)
Cited by 75 (16 self)
The simple but general formal theory of fun & intrinsic motivation & creativity (1990-) is based on the concept of maximizing intrinsic reward for the active creation or discovery of novel, surprising patterns allowing for improved prediction or data compression. It generalizes the traditional field of active learning, and is related to old but less formal ideas in aesthetics theory and developmental psychology. It has been argued that the theory explains many essential aspects of intelligence including autonomous development, science, art, music, humor. This overview first describes theoretically optimal (but not necessarily practical) ways of implementing the basic computational principles on exploratory, intrinsically motivated agents or robots, encouraging them to provoke event sequences exhibiting previously unknown but learnable algorithmic regularities. Emphasis is put on the importance of limited computational resources for online prediction and compression. Discrete and continuous time formulations are given. Previous practical but non-optimal implementations (1991, 1995, 1997-2002) are reviewed, as well as several recent variants by others (2005-). A simplified typology addresses current confusion concerning the precise nature of intrinsic motivation.
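As a minimal illustration of reward for compression progress, the sketch below uses an order-0 counting model as a stand-in compressor: the intrinsic reward for an observation is the number of bits the model's latest update saves on it, so repeated patterns pay less and less as they become boring. `CountingCompressor` and the 256-symbol alphabet are assumptions of this sketch, not Schmidhuber's formulation.

```python
import math
from collections import Counter

class CountingCompressor:
    """Order-0 model: the code length of a symbol is -log2 of its
    Laplace-smoothed probability over an assumed 256-symbol alphabet."""
    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def code_length(self, symbol):
        p = (self.counts[symbol] + 1) / (self.total + 256)
        return -math.log2(p)

    def update(self, symbol):
        self.counts[symbol] += 1
        self.total += 1

def compression_progress(model, symbol):
    """Intrinsic reward: bits saved on this symbol by the model's update."""
    before = model.code_length(symbol)
    model.update(symbol)
    after = model.code_length(symbol)
    return before - after
```

Observing the same symbol repeatedly yields a positive but shrinking reward: the pattern is learnable at first, then fully predicted and no longer interesting.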
Self-Organization of Distributedly Represented Multiple Behavior Schemata in a Mirror System: . . .
- 2004
Cited by 66 (11 self)
The current paper reviews a connectionist model, the recurrent neural network with parametric biases (RNNPB), in which multiple behavior schemata can be learned by the network in a distributed manner. The parametric biases in the network play an essential role in both generating and recognizing behavior patterns. They act as a mirror system by means of self-organizing adequate memory structures. Three different robot experiments are reviewed: robot and user interactions; learning and generating different types of dynamic patterns; and linguistic-behavior binding. The hallmark of this study is explaining how self-organizing internal structures can contribute to generalization in learning, and diversity in behavior generation, in the proposed distributed representation scheme.
The Challenges of Joint Attention
- Interaction Studies, 2004
Cited by 62 (7 self)
This paper discusses the concept of joint attention and the different skills underlying its development. We argue that joint attention is much more than gaze following or simultaneous looking because it implies a shared intentional relation to the world. The current state of the art in robotic and computational models of the different prerequisites of joint attention is discussed in relation to a developmental timeline drawn from results in child studies.
From recurrent choice to skill learning: A reinforcement-learning model
- Journal of Experimental Psychology: General, 2006
Cited by 54 (12 self)
The authors propose a reinforcement-learning mechanism as a model for recurrent choice and extend it to account for skill learning. The model was inspired by recent research in neurophysiological studies of the basal ganglia and provides an integrated explanation of recurrent choice behavior and skill learning. The behavior includes effects of differential probabilities, magnitudes, variabilities, and delay of reinforcement. The model can also produce the violation of independence, preference reversals, and the goal gradient of reinforcement in maze learning. An experiment was conducted to study learning of action sequences in a multistep task. The fit of the model to the data demonstrated its ability to account for complex skill learning. The advantages of incorporating the mechanism into a larger cognitive architecture are discussed.
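The core of such a model, value estimates updated by a delta rule and turned into choices by a softmax, can be sketched as below; the function names and parameter values are illustrative, not the authors'.

```python
import math
import random

def softmax_choice(values, temperature=1.0):
    """Pick an option with probability proportional to exp(value / temperature)."""
    exps = [math.exp(v / temperature) for v in values]
    r = random.random() * sum(exps)
    for i, e in enumerate(exps):
        r -= e
        if r <= 0:
            return i
    return len(values) - 1

def simulate_recurrent_choice(probs, trials=5000, alpha=0.1, temperature=0.2):
    """Repeated choice between options with fixed reward probabilities.
    Values track experienced reward rates via a delta rule; the softmax
    converts them into trial-by-trial choices. Returns pick counts."""
    values = [0.0] * len(probs)
    picks = [0] * len(probs)
    for _ in range(trials):
        a = softmax_choice(values, temperature)
        reward = 1.0 if random.random() < probs[a] else 0.0
        values[a] += alpha * (reward - values[a])  # delta-rule value update
        picks[a] += 1
    return picks
```

With a low temperature the simulated agent comes to strongly prefer the richer option, the kind of recurrent-choice behavior the model is fit to before being extended to multi-step skill learning.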
How hierarchical control self-organizes in artificial adaptive systems
- Adaptive Behavior, 2005
Using feedback in collaborative reinforcement learning to adaptively optimize MANET routing
- IEEE Transactions on Systems, Man, and Cybernetics, 2005
Cited by 31 (7 self)
Designers face many system optimization problems when building distributed systems. Traditionally, designers have relied on optimization techniques that require either prior knowledge or centrally managed runtime knowledge of the system’s environment, but such techniques are not viable in dynamic networks where topology, resource, and node availability are subject to frequent and unpredictable change. To address this problem, we propose collaborative reinforcement learning (CRL) as a technique that enables groups of reinforcement learning agents to solve system optimization problems online in dynamic, decentralized networks. We evaluate an implementation of CRL in a routing protocol for mobile ad hoc networks, called SAMPLE. Simulation results show how feedback in the selection of links by routing agents enables SAMPLE to adapt and optimize its routing behavior to varying network conditions and properties, resulting in optimization of network throughput. In the experiments, SAMPLE displays emergent properties such as traffic flows that exploit stable routes and reroute around areas of wireless interference or congestion. SAMPLE is an example of a complex adaptive distributed system. Index Terms—Feedback, learning systems, mobile ad hoc network routing.
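Feedback between routing agents of the kind the abstract describes has a simple classic relative in Q-routing, sketched below: each node keeps a cost estimate per destination and neighbor, and blends in the remaining-path cost a neighbor advertises. This is an illustrative sketch in the spirit of CRL, not the SAMPLE protocol itself; all names are assumptions.

```python
class QRouter:
    """Per-node table q[dest][neighbor]: estimated cost to deliver
    a packet to dest via that neighbor."""
    def __init__(self, neighbors, alpha=0.5, initial_cost=1.0):
        self.q = {}  # dest -> {neighbor: cost estimate}
        self.neighbors = list(neighbors)
        self.alpha = alpha
        self.initial_cost = initial_cost

    def _table(self, dest):
        return self.q.setdefault(
            dest, {n: self.initial_cost for n in self.neighbors})

    def best_neighbor(self, dest):
        """Greedy next hop: the neighbor with the lowest estimated cost."""
        table = self._table(dest)
        return min(table, key=table.get)

    def feedback(self, dest, neighbor, link_cost, neighbor_estimate):
        """Neighbor advertises its own remaining cost to dest; move our
        estimate toward link cost plus that advertisement."""
        table = self._table(dest)
        target = link_cost + neighbor_estimate
        table[neighbor] += self.alpha * (target - table[neighbor])
```

As link costs change (congestion, interference), repeated feedback shifts the minimum and traffic reroutes, the decentralized adaptation the simulations demonstrate.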
Machine Learning for Motor Skills in Robotics.
- 2007
Cited by 25 (3 self)
Autonomous robots that can adapt to novel situations have been a long-standing vision of robotics, artificial intelligence, and the cognitive sciences. Early approaches to this goal during the heyday of artificial intelligence research in the late 1980s, however, made it clear that an approach purely based on reasoning or human insights would not be able to model all the perceptuomotor tasks of future robots. Instead, new hope was placed in the growing field of machine learning, which promised fully adaptive control algorithms that learn both by observation and by trial and error. However, to date, learning techniques have yet to fulfill this promise, as only a few methods manage to scale to the high-dimensional domains of manipulator and humanoid robotics, and scaling has usually been achieved only in precisely pre-structured domains. We have investigated the ingredients of a general approach to motor skill learning in order to get one step closer to human-like performance. To do so, we study two major components of such an approach: first, a theoretically well-founded general approach to representing the control structures required for task representation and execution; and second, appropriate learning algorithms that can be applied in this setting.
Codevelopmental learning between human and humanoid robot using a dynamic neural network model
- 2008
Cited by 22 (5 self)
The paper examines characteristics of interactive learning between human tutors and a robot having a dynamic neural network model which is inspired by human parietal cortex functions. A humanoid robot, with a recurrent neural network that has a hierarchical structure, learns to manipulate objects. Robots learn tasks in repeated self-trials with the assistance of human interaction which provides physical guidance until tasks are mastered and learning is consolidated within neural networks. Experimental results and the analyses showed that 1) codevelopmental shaping of task behaviors stems from interactions between the robot and tutor, 2) dynamic structures for articulating and sequencing of behavior primitives are self-organized in the hierarchically organized network, and 3) such structures can afford both generalization and context-dependency in generating skilled behaviors.
Actor-critic models of reinforcement learning in the basal ganglia: from natural to artificial rats
- Adaptive Behavior, 2005
Cited by 21 (7 self)
Since 1995, numerous Actor-Critic architectures for reinforcement learning have been proposed as models of dopamine-like reinforcement learning mechanisms in the rat’s basal ganglia. However, these models were usually tested in different tasks, and it is therefore difficult to compare their efficiency for an autonomous animat. We present here a comparison of four architectures in an animat as it performs the same reward-seeking task. This illustrates the consequences of different hypotheses about the management of different Actor submodules and Critic units, and their more or less autonomously determined coordination. We show that the classical method of coordinating modules by a mixture of experts, depending on each module's performance, did not allow our task to be solved. We then address the question of which principle should be applied to combine these units efficiently. Improvements to Critic modeling and the accuracy of Actor-Critic models for a natural task are finally discussed in the perspective of our Psikharpax project, an artificial rat that must survive autonomously in unpredictable environments.
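A tabular Actor-Critic of the kind these architectures build on can be sketched as below, with the temporal-difference (TD) error serving as the dopamine-like teaching signal shared by critic and actor. The class and parameters are an illustrative baseline, not any of the four compared architectures.

```python
import math
import random

class TabularActorCritic:
    """Critic learns state values; the actor's action preferences are
    nudged by the same TD error the critic uses to update its values."""
    def __init__(self, n_states, n_actions, alpha_v=0.1, alpha_p=0.1, gamma=0.95):
        self.v = [0.0] * n_states
        self.pref = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha_v, self.alpha_p, self.gamma = alpha_v, alpha_p, gamma

    def act(self, state, temperature=1.0):
        # Softmax over the actor's preferences for this state.
        exps = [math.exp(p / temperature) for p in self.pref[state]]
        r = random.random() * sum(exps)
        for a, e in enumerate(exps):
            r -= e
            if r <= 0:
                return a
        return len(exps) - 1

    def learn(self, state, action, reward, next_state, done):
        target = reward if done else reward + self.gamma * self.v[next_state]
        td_error = target - self.v[state]            # dopamine-like signal
        self.v[state] += self.alpha_v * td_error     # critic update
        self.pref[state][action] += self.alpha_p * td_error  # actor update
        return td_error
```

The coordination questions the abstract raises arise when several such Actor submodules and Critic units must share states and gating, rather than a single table as here.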