Results 1 – 9 of 9
Rational general reinforcement learning
Cited by 7 (3 self)
Abstract:
We present a new algorithm for general reinforcement learning where the true environment is known to belong to a finite class of N arbitrary models. The algorithm is shown to be near-optimal for all but O(N log² N) timesteps with high probability. Infinite classes are also considered, where we show that compactness is a key criterion for determining the existence of uniform sample-complexity bounds. A matching lower bound is given for the finite case.
Optimistic Agents are Asymptotically Optimal
, 2012
Cited by 5 (5 self)
Abstract:
We use optimism to introduce generic asymptotically optimal reinforcement learning agents. They achieve, with an arbitrary finite or compact class of environments, asymptotically optimal behavior. Furthermore, in the finite deterministic case we provide finite error bounds.
One decade of universal artificial intelligence
 In Theoretical Foundations of Artificial General Intelligence
, 2012
Cited by 3 (3 self)
Abstract:
The first decade of this century has seen the nascency of the first mathematical theory of general artificial intelligence. This theory of Universal Artificial Intelligence (UAI) has made significant contributions to many theoretical, philosophical, and practical AI questions. In a series of papers culminating in the book (Hutter, 2005), an exciting, sound, and complete mathematical model for a superintelligent agent (AIXI) has been developed and rigorously analyzed. While nowadays most AI researchers avoid discussing intelligence, the award-winning PhD thesis (Legg, 2008) provided the philosophical embedding and investigated the UAI-based universal measure of rational intelligence, which is formal, objective, and non-anthropocentric. Recently, effective approximations of AIXI have been derived and experimentally investigated in the JAIR paper (Veness et al., 2011). This practical breakthrough has resulted in some impressive applications, finally muting earlier critique that UAI is only a theory. For the first time, without providing any domain knowledge, the same …
Optimistic AIXI
Cited by 1 (1 self)
Abstract:
We consider extending the AIXI agent by using multiple (or even a compact class of) priors. This has the benefit of weakening the conditions on the true environment needed to prove asymptotic optimality. Furthermore, it decreases the arbitrariness of picking the prior or reference machine. We connect this to removing the symmetry between accepting and rejecting bets in the rationality axiomatization of AIXI and replacing it with optimism. Optimism is often used to encourage exploration in the more restrictive Markov Decision Process setting, and it alleviates the problem that AIXI (with geometric discounting) stops exploring prematurely.
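The optimism principle described in this abstract can be illustrated with a minimal sketch: act as if the best-case model in the current class were true. Everything below — the hypothetical `optimistic_action` and `value_of` names and the toy two-model class — is an illustrative assumption, not the construction from the paper.

```python
# Sketch: optimism over a finite class of candidate environment models.
# The agent takes the action that looks best under the MOST favorable
# model still consistent with its experience.

def optimistic_action(models, value_of, actions):
    """Return the action maximizing value over models AND actions.

    models   -- candidate environments not yet refuted by observations
    value_of -- value_of(model, action): estimated return of `action`
                (followed by optimal play) if `model` were the truth
    """
    best_action, best_value = None, float("-inf")
    for model in models:
        for action in actions:
            v = value_of(model, action)
            if v > best_value:  # the double max is the optimism
                best_value, best_action = v, action
    return best_action

# Toy usage: two models agree about "stay" but m2 promises a big payoff
# for "explore", so the optimist explores; if m2 turns out wrong, it is
# refuted and dropped from the class, and exploration was still useful.
values = {("m1", "stay"): 1.0, ("m1", "explore"): 0.0,
          ("m2", "stay"): 1.0, ("m2", "explore"): 5.0}
act = optimistic_action(["m1", "m2"],
                        lambda m, a: values[(m, a)],
                        ["stay", "explore"])
```

This is the same mechanism that counteracts the premature-exploration-stopping problem mentioned above: an unrefuted promising model keeps exploration alive.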
Asymptotic Non-Learnability of Universal Agents with Computable Horizon Functions
Cited by 1 (1 self)
Abstract:
Finding the universal artificial intelligent agent is the old dream of AI scientists. Solomonoff Induction was one big step towards this, giving a universal solution to the general problem of sequence prediction by defining a universal prior distribution. Hutter defined the AIXI model, which extends the latter to the reinforcement learning framework, where almost all, if not all, AI problems can be formulated. However, new difficulties arise because the agent is now active, whereas it is only passive in the sequence prediction case. This makes proving AIXI’s optimality difficult. In fact, we prove that the current definition of AIXI can sometimes be suboptimal in a certain sense, but that this behavior is still the most rational one, hence emphasizing the difficulty of universal reinforcement learning.
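The Solomonoff-style prediction this abstract builds on can be sketched as Bayesian mixture prediction. The real construction mixes over all lower-semicomputable semimeasures with complexity-based weights; the tiny two-model class, the `mixture_predict`/`bayes_update` names, and the fixed 0.99/0.01 predictors below are toy assumptions for illustration only.

```python
# Sketch: Bayes-mixture sequence prediction over a finite model class.
# Each model assigns probabilities to the next symbol; the mixture
# averages them by posterior weight, and weights are updated by how
# well each model predicted what was actually observed.

def mixture_predict(weights, predictors, history):
    """P(next symbol) as the weight-averaged vote of all models."""
    probs = {}
    for model, w in weights.items():
        for symbol, p in predictors[model](history).items():
            probs[symbol] = probs.get(symbol, 0.0) + w * p
    return probs

def bayes_update(weights, predictors, history, observed):
    """Reweight each model by the probability it gave `observed`."""
    new = {m: w * predictors[m](history).get(observed, 0.0)
           for m, w in weights.items()}
    z = sum(new.values())
    return {m: w / z for m, w in new.items()}

# Toy class: a "mostly zeros" model vs a "mostly ones" model.
predictors = {
    "zeros": lambda h: {"0": 0.99, "1": 0.01},
    "ones":  lambda h: {"0": 0.01, "1": 0.99},
}
weights = {"zeros": 0.5, "ones": 0.5}
for _ in range(5):                     # observe a run of five ones
    weights = bayes_update(weights, predictors, "", "1")
pred = mixture_predict(weights, predictors, "")
# After a few ones, the mixture concentrates on the "ones" model and
# predicts "1" with high probability.
```

The passive nature of this scheme is exactly the contrast drawn above: prediction alone needs no exploration, which is what makes the active reinforcement learning case harder.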
Rationality, Optimism and Guarantees in General Reinforcement Learning
, 2015
Abstract:
In this article, we present a top-down theoretical study of general reinforcement learning agents. We begin with rational agents with unlimited resources and then move to a setting where an agent can only maintain a limited number of hypotheses and optimizes plans over a horizon much shorter than what the agent designer actually wants. We axiomatize what is rational in such a setting in a manner that enables optimism, which is important for achieving systematically explorative behavior. Then, within the class of agents deemed rational, we achieve convergence and finite-error bounds. Such results are desirable since they imply that the agent learns well from its experiences, but the bounds do not directly guarantee good performance and can be achieved by agents doing things one should obviously not. Good performance cannot in fact be guaranteed for any agent in fully general settings. Our approach is to design agents that learn well from experience and act rationally. We introduce a framework for general reinforcement learning agents based on rationality axioms for a decision function and a hypothesis-generating function designed so as to achieve guarantees on the number of errors. We consistently use an optimistic decision function, but the hypothesis-generating function needs to change depending on what is known or assumed. We investigate a number of natural situations having either a frequentist or Bayesian flavor, deterministic or stochastic environments, and either finite or countable hypothesis classes. Further, to achieve bounds good enough to hold promise for practical success, we introduce a notion of a class of environments being generated by a set of laws. None of the above has previously been done for fully general reinforcement learning environments.
Learning Agents with Evolving Hypothesis Classes
Abstract:
It has recently been shown that a Bayesian agent with a universal hypothesis class resolves most induction problems discussed in the philosophy of science. These ideal agents are, however, neither practical nor a good model for how real science works. We here introduce a framework for learning based on implicit beliefs over all possible hypotheses and limited sets of explicit theories sampled from an implicit distribution represented only by the process by which it generates new hypotheses. We address the questions of how to act based on a limited set of theories as well as what an ideal sampling process should be like. Finally, we discuss topics in the philosophy of science and cognitive science from the perspective of this framework.
Universal Knowledge-Seeking Agents for Stochastic Environments
Abstract:
We define an optimal Bayesian knowledge-seeking agent, KL-KSA, designed for countable hypothesis classes of stochastic environments and whose goal is to gather as much information about the unknown world as possible. Although this agent works for arbitrary countable classes and priors, we focus on the especially interesting case where all stochastic computable environments are considered and the prior is based on Solomonoff’s universal prior. Among other properties, we show that KL-KSA learns the true environment in the sense that it learns to predict the consequences of actions it does not take. We show that it does not consider noise to be information and avoids taking actions leading to inescapable traps. We also present a variety of toy experiments demonstrating that KL-KSA behaves according to expectation.
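The "noise is not information" property of a KL-based knowledge seeker can be illustrated with a minimal sketch: measure learning as the KL divergence from posterior to prior after a Bayes update. The two-environment setup, the `posterior`/`kl` helper names, and the specific likelihoods below are toy assumptions, not the agent construction from the paper.

```python
import math

# Sketch: information gain as KL(posterior || prior) after observing
# one outcome. An observation is informative only if its likelihood
# differs across candidate environments; pure noise moves no belief.

def posterior(prior, likelihoods):
    """Bayes update: P(env | obs) proportional to P(obs | env) * P(env)."""
    joint = {e: prior[e] * likelihoods[e] for e in prior}
    z = sum(joint.values())
    return {e: p / z for e, p in joint.items()}

def kl(p, q):
    """KL(p || q) in nats: how much belief moved from q to p."""
    return sum(p[e] * math.log(p[e] / q[e]) for e in p if p[e] > 0)

prior = {"env_a": 0.5, "env_b": 0.5}
# An observation that discriminates between the environments:
informative = {"env_a": 0.9, "env_b": 0.1}
# An observation equally likely under both (a fair coin flip):
pure_noise = {"env_a": 0.5, "env_b": 0.5}

gain_informative = kl(posterior(prior, informative), prior)
gain_noise = kl(posterior(prior, pure_noise), prior)
# gain_informative is strictly positive; gain_noise is zero, so an
# agent rewarded by this quantity has no incentive to chase noise.
```

Under this measure, a knowledge-seeking agent chooses actions whose expected observations maximize such gain, which is why noisy but uninformative outcomes hold no attraction for it.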