Results 1–10 of 19
Curious model-building control systems
In Proceedings of the International Joint Conference on Neural Networks, Singapore, 1991
Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010)
Abstract
Cited by 73 (16 self)
The simple but general formal theory of fun & intrinsic motivation & creativity (1990) is based on the concept of maximizing intrinsic reward for the active creation or discovery of novel, surprising patterns allowing for improved prediction or data compression. It generalizes the traditional field of active learning, and is related to old but less formal ideas in aesthetics theory and developmental psychology. It has been argued that the theory explains many essential aspects of intelligence including autonomous development, science, art, music, and humor. This overview first describes theoretically optimal (but not necessarily practical) ways of implementing the basic computational principles on exploratory, intrinsically motivated agents or robots, encouraging them to provoke event sequences exhibiting previously unknown but learnable algorithmic regularities. Emphasis is put on the importance of limited computational resources for online prediction and compression. Discrete and continuous time formulations are given. Previous practical but non-optimal implementations (1991, 1995, 1997–2002) are reviewed, as well as several recent variants by others (2005). A simplified typology addresses current confusion concerning the precise nature of intrinsic motivation.
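The basic computational principle the abstract describes, intrinsic reward for improvement of the agent's own predictor, can be sketched in a few lines. This is a toy illustration under assumed details (a one-weight predictor in a noisy linear world), not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy world: the observation is a noisy linear function of the action.
TRUE_W = 0.7
def observe(action):
    return TRUE_W * action + rng.normal(0, 0.01)

w = 0.0     # adaptive predictor: a single learned weight
lr = 0.5
intrinsic_rewards = []
for t in range(50):
    a = rng.uniform(-1, 1)            # exploratory action
    o = observe(a)
    err_before = (o - w * a) ** 2     # predictor's surprise
    w += lr * (o - w * a) * a         # improve the predictor
    err_after = (o - w * a) ** 2
    # Intrinsic reward = learning progress on this observation.
    intrinsic_rewards.append(err_before - err_after)
```

Once the regularity is learned the reward shrinks toward zero, pushing the agent on toward still-unlearned patterns.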
Training Recurrent Networks by Evolino, 2007
Abstract
Cited by 35 (5 self)
In recent years, gradient-based LSTM recurrent neural networks (RNNs) solved many previously RNN-unlearnable tasks. Sometimes, however, gradient information is of little use for training RNNs, due to numerous local minima. For such cases, we present a novel method: EVOlution of systems with LINear Outputs (Evolino). Evolino evolves weights to the nonlinear, hidden nodes of RNNs while computing optimal linear mappings from hidden state to output, using methods such as pseudoinverse-based linear regression. If we instead use quadratic programming to maximize the margin, we obtain the first evolutionary recurrent support vector machines. We show that Evolino-based LSTM can solve tasks that Echo State nets (Jaeger, 2004a) cannot and achieves higher accuracy in certain continuous function generation tasks than conventional gradient descent RNNs, including gradient-based LSTM.
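The core idea, evolving only the nonlinear hidden weights while computing the optimal linear output layer in closed form via the pseudoinverse, can be sketched as follows. Simple (1+1) hill climbing stands in for the paper's evolutionary search, and all network sizes and the toy task are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def run_hidden(W_in, W_rec, inputs):
    """Run the nonlinear (tanh) hidden layer over a sequence, collecting states."""
    h = np.zeros(W_rec.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(W_in @ x + W_rec @ h)
        states.append(h)
    return np.array(states)                      # shape (T, n_hidden)

def evaluate(genome, inputs, targets, n_hidden, n_in):
    """Decode a genome into hidden weights, fit the linear output layer
    in closed form via the pseudoinverse, and return the residual MSE."""
    W_in = genome[:n_hidden * n_in].reshape(n_hidden, n_in)
    W_rec = genome[n_hidden * n_in:].reshape(n_hidden, n_hidden)
    H = run_hidden(W_in, W_rec, inputs)
    W_out = np.linalg.pinv(H) @ targets          # optimal linear mapping
    return float(np.mean((H @ W_out - targets) ** 2))

# Toy task: generate one period of a sine wave from a ramp input.
T, n_in, n_hidden = 50, 1, 8
inputs = np.linspace(0.0, 1.0, T).reshape(T, 1)
targets = np.sin(2 * np.pi * inputs)

# Only the hidden weights evolve; the output layer is always computed exactly.
best = rng.normal(0.0, 0.5, n_hidden * n_in + n_hidden * n_hidden)
best_err = evaluate(best, inputs, targets, n_hidden, n_in)
first_err = best_err
for _ in range(200):
    child = best + rng.normal(0.0, 0.1, best.shape)
    err = evaluate(child, inputs, targets, n_hidden, n_in)
    if err < best_err:
        best, best_err = child, err
```

Replacing the pseudoinverse fit with a margin-maximizing quadratic program would give the support-vector variant the abstract mentions.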
Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes, 2009
Abstract
Cited by 30 (7 self)
I argue that data becomes temporarily interesting by itself to some self-improving, but computationally limited, subjective observer once he learns to predict or compress the data in a better way, thus making it subjectively simpler and more beautiful. Curiosity is the desire to create or discover more non-random, non-arbitrary, regular data that is novel and surprising not in the traditional sense of Boltzmann and Shannon but in the sense that it allows for compression progress because its regularity was not yet known. This drive maximizes interestingness, the first derivative of subjective beauty or compressibility, that is, the steepness of the learning curve. It motivates exploring infants, pure mathematicians, composers,
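The notion of compression progress can be illustrated with an off-the-shelf compressor standing in for the observer's adaptive compressor. This is a crude proxy, not the paper's formalism: zlib at two effort levels plays the role of the observer before and after discovering the regularity:

```python
import zlib

data = b"ABCD" * 64                      # data with a simple hidden regularity

# Observer before vs. after discovering the regularity, proxied by zlib.
bits_before = 8 * len(zlib.compress(data, level=0))   # stored, regularity unused
bits_after = 8 * len(zlib.compress(data, level=9))    # regularity exploited

compression_progress = bits_before - bits_after       # intrinsic reward
```

The reward is the drop in description length, i.e. the first derivative of compressibility; once the data is compressed as well as possible, no further progress (and hence no further interestingness) remains.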
Ultimate Cognition à la Gödel, COGN COMPUT, 2009
Abstract
Cited by 28 (11 self)
"All life is problem solving," said Popper. To deal with arbitrary problems in arbitrary environments, an ultimate cognitive agent should use its limited hardware in the "best" and "most efficient" possible way. Can we formally nail down this informal statement, and derive a mathematically rigorous blueprint of ultimate cognition? Yes, we can, using Kurt Gödel's celebrated self-reference trick of 1931 in a new way. Gödel exhibited the limits of mathematics and computation by creating a formula that speaks about itself, claiming to be unprovable by an algorithmic theorem prover: either the formula is true but unprovable, or math itself is flawed in an algorithmic sense. Here we describe an agent-controlling program that speaks about itself, ready to rewrite itself in arbitrary fashion once it has found a proof that the rewrite is useful according to a user-defined utility function. Any such rewrite is necessarily globally optimal (no local maxima!), since the proof must have demonstrated the uselessness of continuing the proof search for even better rewrites. Our self-referential program will optimally speed up its proof searcher and other program parts, but only if the speedup's utility is indeed provable: even ultimate cognition has limits of the Gödelian kind.
Adaptive Confidence And Adaptive Curiosity, Institut für Informatik, Technische Universität München, Arcisstr. 21, 800 München 2, 1991
Abstract
Cited by 21 (0 self)
Much of the recent research on adaptive neurocontrol and reinforcement learning focuses on systems with adaptive `world models'. Previous approaches, however, do not address the problem of modelling the reliability of the world model's predictions in uncertain environments. Furthermore, with previous approaches usually some ad-hoc method (like random search) is used to train the world model to predict future environmental inputs from previous inputs and control outputs of the system. This paper introduces ways of modelling the reliability of the outputs of adaptive predictors, and it describes more sophisticated and sometimes more efficient methods for their adaptive construction by online state space exploration: for instance, a 4-network reinforcement learning system is described which tries to maximize the expectation of the temporal derivative of the assumed reliability of future predictions. The system is `curious' in the sense that it actively tries to provoke situat...
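The two central modules, a predictor and a confidence module modeling the reliability of its predictions, can be sketched like this. It is a toy version with assumed learning rates and a made-up two-region world; the paper's 4-network reinforcement learning system is far richer:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy world: region 0 is deterministic, region 1 is pure noise.
def next_input(region):
    return 0.5 if region == 0 else rng.uniform(0.0, 1.0)

pred = np.zeros(2)   # world model: expected next input per region
conf = np.ones(2)    # confidence module: running estimate of squared error
lr = 0.2
for t in range(1000):
    region = int(rng.integers(0, 2))
    x = next_input(region)
    err = (x - pred[region]) ** 2
    pred[region] += lr * (x - pred[region])    # update the world model
    conf[region] += lr * (err - conf[region])  # update assumed (un)reliability
# A curious agent would be rewarded where conf drops quickly, i.e. for the
# temporal derivative of assumed reliability; in region 1 no amount of
# visiting makes the predictions reliable, so curiosity fades there.
```

The confidence module learns that region 0 is predictable and region 1 is not, which is exactly the information the ad-hoc exploration methods criticized in the abstract lack.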
Learning To Control Fast-Weight Memories: An Alternative To Dynamic Recurrent Networks, 1991
Abstract
Cited by 19 (13 self)
Previous algorithms for supervised sequence learning are based on dynamic recurrent networks. This paper describes alternative gradient-based systems consisting of two feedforward nets which learn to deal with temporal sequences by using fast weights: the first net learns to produce context-dependent weight changes for the second net, whose weights may vary very quickly. The method offers a potential for STM storage efficiency: a simple weight (instead of a full-fledged unit) may be sufficient for storing temporal information. Various learning methods are derived. Two experiments with unknown time delays illustrate the approach. One experiment shows how the system can be used for adaptive temporary variable binding.
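The forward mechanics of the two-network scheme, a slow net emitting context-dependent weight changes for a fast net, can be sketched as follows. This shows the mechanics only; the gradient-based training the paper derives is omitted, and all sizes are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

n_ctx, n_in, n_out = 3, 4, 2

# Slow net: maps a context vector to a weight UPDATE for the fast net.
W_slow = rng.normal(0.0, 0.1, (n_out * n_in, n_ctx))

def fast_weight_step(W_fast, context, x):
    """One step: the slow net emits a weight change, the fast net applies it
    and then processes the current input with the updated weights."""
    dW = (W_slow @ context).reshape(n_out, n_in)
    W_fast = W_fast + dW          # short-term memory lives in W_fast itself
    y = np.tanh(W_fast @ x)
    return W_fast, y

W_fast = np.zeros((n_out, n_in))
for t in range(5):
    context = rng.normal(size=n_ctx)
    x = rng.normal(size=n_in)
    W_fast, y = fast_weight_step(W_fast, context, x)
```

Temporal information is carried not by recurrent unit activations but by the state of `W_fast`, which is the storage-efficiency point the abstract makes: one weight, rather than a full unit, per stored item.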
Adaptive Decomposition Of Time, 1991
Abstract
Cited by 13 (9 self)
In this paper we introduce design principles for unsupervised detection of regularities (like causal relationships) in temporal sequences. One basic idea is to train an adaptive predictor module to predict future events from past events, and to train an additional confidence module to model the reliability of the predictor's predictions. We select system states at those points in time where there are changes in prediction reliability, and use them recursively as inputs for higher-level predictors. This can be beneficial for `adaptive subgoal generation' as well as for `conventional' goal-directed (supervised and reinforcement) learning: systems based on these design principles were successfully tested on tasks where conventional training algorithms for recurrent nets fail. Finally we describe the principles of the first neural sequence `chunker', which collapses a self-organizing multi-level predictor hierarchy into a single recurrent network.
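The selection principle, taking system states at points where prediction reliability changes and feeding them to a higher-level predictor, reduces in the simplest case to picking out regime boundaries. A deliberately tiny illustration with an assumed repeat-the-last-symbol predictor:

```python
# A sequence with two predictable regimes and one change point.
seq = "AAAAAABBBBBB"

# A trivial predictor expects each symbol to repeat; its reliability
# collapses exactly at the regime change, so that state is selected
# as input for the next level of the hierarchy.
boundaries = [i for i in range(1, len(seq)) if seq[i] != seq[i - 1]]
selected = [seq[i] for i in boundaries]
```

The higher level then operates on a far shorter sequence of selected states, which is what makes the recursion useful for subgoal generation.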
Neural Sequence Chunkers, 1991
Abstract
Cited by 10 (5 self)
This paper addresses the problem of learning to `divide and conquer' by meaningful hierarchical adaptive decomposition of temporal sequences. This problem is relevant for time-series analysis as well as for goal-directed learning, particularly if event sequences tend to have hierarchical temporal structure. The first neural systems for recursively chunking sequences are described. These systems are based on a principle called the `principle of history compression'. This principle essentially says: as long as a predictor is able to predict future environmental inputs from previous ones, no additional knowledge can be obtained by observing these inputs in reality. Only unexpected inputs deserve attention. A focus is on a class of 2-network systems which try to collapse a self-organizing (possibly multi-level) hierarchy of temporal predictors into a single recurrent network. Only those input events that were not expected by the first recurrent net are transferred to the second recurrent ...
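The principle of history compression, only unexpected inputs deserve attention, can be sketched with a counting predictor in place of the paper's recurrent nets. Only symbols the predictor gets wrong are passed up to the next level (a toy illustration; the sequence and the order-1 predictor are assumptions):

```python
from collections import defaultdict

seq = "abababababXabababab"

# Order-1 predictor: after each symbol, predict the most frequent
# successor observed so far.
counts = defaultdict(lambda: defaultdict(int))
compressed = [seq[0]]               # the first symbol is always unexpected
for prev, cur in zip(seq, seq[1:]):
    if counts[prev]:
        predicted = max(counts[prev], key=counts[prev].get)
    else:
        predicted = None
    if cur != predicted:
        compressed.append(cur)      # only unexpected events go up a level
    counts[prev][cur] += 1
```

The regular `ab` alternation is absorbed by the low-level predictor, so the higher level sees only the start-up symbols and the surprising `X`, a much shorter sequence carrying the same information.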
POWERPLAY: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem, 2011
Abstract
Cited by 10 (4 self)
Most of computer science focuses on automatically solving given computational problems. I focus on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion. At any given time, the novel algorithmic framework POWERPLAY searches the space of possible pairs of new tasks and modifications of the current problem solver, until it finds a more powerful problem solver that provably solves all previously learned tasks plus the new one, while the unmodified predecessor does not. The new task and its corresponding task-solving skill are those first found and validated. Newly invented tasks may require making previously learned skills more efficient. The greedy search of typical POWERPLAY variants orders candidate pairs of tasks and solver modifications by their conditional computational complexity, given the stored experience so far. This biases the search towards pairs that can be described compactly and validated quickly. Standard problem solver architectures of personal computers or neural networks tend to generalize by solving numerous tasks outside the self-invented training set; POWERPLAY's ongoing search for novelty keeps fighting to extend beyond the generalization abilities of its present solver. The continually increasing repertoire of problem solving procedures can be exploited
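The framework's core loop, enumerate candidate tasks in order of a complexity measure, search for a solver modification, and validate it on all previously solved tasks plus the new one, can be caricatured with a lookup-table solver. Every detail here (the task space, the target function, the one-rule modification) is an assumption for illustration only:

```python
# Task space: "given integer n, output target(n)". Candidate tasks are
# enumerated in order of a simple complexity measure (here: n itself).
def target(n):
    return n * n % 7            # hypothetical ground truth to be learned

solver = {}                     # current problem solver: a lookup table

def solves(s, n):
    return s.get(n) == target(n)

repertoire = []                 # tasks learned so far
for _ in range(5):
    # Find the simplest still-unsolvable task.
    n = 0
    while solves(solver, n):
        n += 1
    # Search for a solver modification that solves it...
    candidate = dict(solver)
    candidate[n] = target(n)    # simplest modification: add one rule
    # ...and validate it against all previously learned tasks plus the new one,
    # while the unmodified predecessor provably fails the new task.
    assert not solves(solver, n)
    assert all(solves(candidate, m) for m in repertoire + [n])
    solver = candidate
    repertoire.append(n)
```

A lookup table never forgets, so validation is trivial here; the interesting case in the paper is a solver (e.g. a neural network) whose modifications could break earlier skills, which is exactly what the validation step guards against.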