Results 1  10
of
77
An introduction to collective intelligence
 Handbook of Agent technology. AAAI
, 1999
"... ..."
(Show Context)
Formal Theory of Creativity, Fun, and Intrinsic Motivation (19902010)
"... The simple but general formal theory of fun & intrinsic motivation & creativity (1990) is based on the concept of maximizing intrinsic reward for the active creation or discovery of novel, surprising patterns allowing for improved prediction or data compression. It generalizes the traditio ..."
Abstract

Cited by 75 (15 self)
 Add to MetaCart
The simple but general formal theory of fun & intrinsic motivation & creativity (1990) is based on the concept of maximizing intrinsic reward for the active creation or discovery of novel, surprising patterns allowing for improved prediction or data compression. It generalizes the traditional field of active learning, and is related to old but less formal ideas in aesthetics theory and developmental psychology. It has been argued that the theory explains many essential aspects of intelligence including autonomous development, science, art, music, humor. This overview first describes theoretically optimal (but not necessarily practical) ways of implementing the basic computational principles on exploratory, intrinsically motivated agents or robots, encouraging them to provoke event sequences exhibiting previously unknown but learnable algorithmic regularities. Emphasis is put on the importance of limited computational resources for online prediction and compression. Discrete and continuous time formulations are given. Previous practical but nonoptimal implementations (1991, 1995, 19972002) are reviewed, as well as several recent variants by others (2005). A simplified typology addresses current confusion concerning the precise nature of intrinsic motivation.
Optimal Ordered Problem Solver
, 2002
"... We present a novel, general, optimally fast, incremental way of searching for a universal algorithm that solves each task in a sequence of tasks. The Optimal Ordered Problem Solver (OOPS) continually organizes and exploits previously found solutions to earlier tasks, eciently searching not only the ..."
Abstract

Cited by 73 (20 self)
 Add to MetaCart
We present a novel, general, optimally fast, incremental way of searching for a universal algorithm that solves each task in a sequence of tasks. The Optimal Ordered Problem Solver (OOPS) continually organizes and exploits previously found solutions to earlier tasks, eciently searching not only the space of domainspecific algorithms, but also the space of search algorithms. Essentially we extend the principles of optimal nonincremental universal search to build an incremental universal learner that is able to improve itself through experience.
Developmental Robotics, Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts
, 2006
"... Even in absence of external reward, babies and scientists and others explore their world. Using some sort of adaptive predictive world model, they improve their ability to answer questions such as: what happens if I do this or that? They lose interest in both the predictable things and those predict ..."
Abstract

Cited by 67 (18 self)
 Add to MetaCart
Even in absence of external reward, babies and scientists and others explore their world. Using some sort of adaptive predictive world model, they improve their ability to answer questions such as: what happens if I do this or that? They lose interest in both the predictable things and those predicted to remain unpredictable despite some effort. One can design curious robots that do the same. The author’s basic idea for doing so (1990, 1991): a reinforcement learning (RL) controller is rewarded for action sequences that improve the predictor. Here this idea is revisited in the context of recent results on optimal predictors and optimal RL machines. Several new variants of the basic principle are proposed. Finally it is pointed out how the fine arts can be formally understood as a consequence of the principle: given some subjective observer, great works of art and music yield observation histories exhibiting more novel, previously unknown compressibility / regularity / predictability (with respect to the observer’s particular learning algorithm) than lesser works, thus deepening the observer’s understanding of the world and what is possible in it.
A computer scientist’s view of life, the universe, and everything
 Foundations of Computer Science: Potential  Theory  Cognition
, 1997
"... Is the universe computable? If so, it may be much cheaper in terms of information requirements to compute all computable universes instead of just ours. I apply basic concepts of Kolmogorov complexity theory to the set of possible universes, and chat about perceived and true randomness, life, genera ..."
Abstract

Cited by 49 (15 self)
 Add to MetaCart
(Show Context)
Is the universe computable? If so, it may be much cheaper in terms of information requirements to compute all computable universes instead of just ours. I apply basic concepts of Kolmogorov complexity theory to the set of possible universes, and chat about perceived and true randomness, life, generalization, and learning in a given universe. Preliminaries Assumptions. A long time ago, the Great Programmer wrote a program that runs all possible universes on His Big Computer. “Possible ” means “computable”: (1) Each universe evolves on a discrete time scale. (2) Any universe’s state at a given time is describable by a finite number of bits. One of the many universes is ours, despite some who evolved in it and claim it is incomputable. Computable universes. Let TM denote an arbitrary universal Turing machine with unidirectional output tape. TM’s input and output symbols are “0”, “1”, and “, ” (comma). TM’s possible input programs can be ordered
Machine Learning Techniques for the Computer Security Domain of Anomaly Detection
, 2000
"... : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : xv 1 ..."
Abstract

Cited by 48 (2 self)
 Add to MetaCart
: : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : xv 1
The Fastest And Shortest Algorithm For All WellDefined Problems
, 2002
"... An algorithm M is described that solves any welldefined problem p as quickly as the fastest algorithm computing a solution to p, save for a factor of 5 and loworder additive terms. M optimally distributes resources between the execution of provably correct psolving programs and an enumeration of ..."
Abstract

Cited by 43 (7 self)
 Add to MetaCart
(Show Context)
An algorithm M is described that solves any welldefined problem p as quickly as the fastest algorithm computing a solution to p, save for a factor of 5 and loworder additive terms. M optimally distributes resources between the execution of provably correct psolving programs and an enumeration of all proofs, including relevant proofs of program correctness and of time bounds on program runtimes. M avoids Blum's speedup theorem by ignoring programs without correctness proof. M has broader applicability and can be faster than Levin's universal search, the fastest method for inverting functions save for a large multiplicative constant. An extension of Kolmogorov complexity and two novel natural measures of function complexity are used to show that the most efficient program computing some function f is also among the shortest programs provably computing f.
Algorithmic Theories Of Everything
, 2000
"... The probability distribution P from which the history of our universe is sampled represents a theory of everything or TOE. We assume P is formally describable. Since most (uncountably many) distributions are not, this imposes a strong inductive bias. We show that P(x) is small for any universe x lac ..."
Abstract

Cited by 36 (15 self)
 Add to MetaCart
The probability distribution P from which the history of our universe is sampled represents a theory of everything or TOE. We assume P is formally describable. Since most (uncountably many) distributions are not, this imposes a strong inductive bias. We show that P(x) is small for any universe x lacking a short description, and study the spectrum of TOEs spanned by two Ps, one reflecting the most compact constructive descriptions, the other the fastest way of computing everything. The former derives from generalizations of traditional computability, Solomonoff’s algorithmic probability, Kolmogorov complexity, and objects more random than Chaitin’s Omega, the latter from Levin’s universal search and a natural resourceoriented postulate: the cumulative prior probability of all x incomputable within time t by this optimal algorithm should be 1/t. Between both Ps we find a universal cumulatively enumerable measure that dominates traditional enumerable measures; any such CEM must assign low probability to any universe lacking a short enumerating program. We derive Pspecific consequences for evolving observers, inductive reasoning, quantum physics, philosophy, and the expected duration of our universe.
Exploring the Predictable
, 2002
"... Details of complex event sequences are often not predictable, but their reduced abstract representations are. I study an embedded active learner that can limit its predictions to almost arbitrary computable aspects of spatiotemporal events. It constructs probabilistic algorithms that (1) control in ..."
Abstract

Cited by 34 (13 self)
 Add to MetaCart
Details of complex event sequences are often not predictable, but their reduced abstract representations are. I study an embedded active learner that can limit its predictions to almost arbitrary computable aspects of spatiotemporal events. It constructs probabilistic algorithms that (1) control interaction with the world, (2) map event sequences to abstract internal representations (IRs), (3) predict IRs from IRs computed earlier. Its goal is to create novel algorithms generating IRs useful for correct IR predictions, without wasting time on those learned before. This requires an adaptive novelty measure which is implemented by a coevolutionary scheme involving two competing modules collectively designing (initially random) algorithms representing experiments. Using special instructions, the modules can bet on the outcome of IR predictions computed by algorithms they have agreed upon. If their opinions dier then the system checks who's right, punishes the loser (the surprised one), and rewards the winner. An evolutionary or reinforcement learning algorithm forces each module to maximize reward. This motivates both modules to lure each other into agreeing upon experiments involving predictions that surprise it. Since each module essentially can veto experiments it does not consider profitable, the system is motivated to focus on those computable aspects of the environment where both modules still have confident but different opinions. Once both share the same opinion on a particular issue (via the loser's learning process, e.g., the winner is simply copied onto the loser), the winner loses a source of reward  an incentive to shift the focus of interest onto novel experiments. My simulations include an example where surprisegeneration of this kind helps to speed up ...
Reinforcement Learning by Policy Search
, 2000
"... One objective of artificial intelligence is to model the behavior of an intelligent agent interacting with its environment. The environment's transformations could be modeled as a Markov chain, whose state is partially observable to the agent and affected by its actions; such processes are know ..."
Abstract

Cited by 31 (2 self)
 Add to MetaCart
One objective of artificial intelligence is to model the behavior of an intelligent agent interacting with its environment. The environment's transformations could be modeled as a Markov chain, whose state is partially observable to the agent and affected by its actions; such processes are known as partially observable Markov decision processes (POMDPs). While the environment's dynamics are assumed to obey certain rules, the agent does not know them and must learn. In this dissertation we focus on the agent's adaptation as captured by the reinforcement learning framework. Reinforcement learning means learning a policya mapping of observations into actionsbased on feedback from the environment. The learning can be viewed as browsing a set of policies while evaluating them by trial through interaction with the environment. The set of policies being searched is constrained by the architecture of the agent's controller. POMDPs require a controller to have a memory. We investigate various architectures for controllers with memory, including controllers with external memory, finite state controllers and distributed controllers for multiagent system. For these various controllers we work out the details of the algorithms which learn by ascending the gradient of expected cumulative reinforcement. Building on statistical learning theory and experiment design theory, a policy evaluation algorithm is developed for the case of experience reuse. We address the question of sufficient experience for uniform convergence of policy evaluation and obtain sample complexity bounds for various estimators. Finally, we demonstrate the performance of the proposed algorithms on several domains, the most complex of which is simulated adaptive packet routing in a telecommunication network.