Results 1 - 5 of 5
Rational general reinforcement learning

Cited by 7 (3 self)
We present a new algorithm for general reinforcement learning where the true environment is known to belong to a finite class of N arbitrary models. The algorithm is shown to be near-optimal for all but O(N log² N) timesteps with high probability. Infinite classes are also considered, where we show that compactness is a key criterion for determining the existence of uniform sample-complexity bounds. A matching lower bound is given for the finite case.
Optimistic AIXI

Cited by 1 (1 self)
We consider extending the AIXI agent by using multiple (or even a compact class of) priors. This has the benefit of weakening the conditions on the true environment that we need to prove asymptotic optimality. Furthermore, it decreases the arbitrariness of picking the prior or reference machine. We connect this to removing symmetry between accepting and rejecting bets in the rationality axiomatization of AIXI and replacing it with optimism. Optimism is often used to encourage exploration in the more restrictive Markov Decision Process setting, and it alleviates the problem that AIXI (with geometric discounting) stops exploring prematurely.
Rationality, Optimism and Guarantees in General Reinforcement Learning, 2015
In this article, we present a top-down theoretical study of general reinforcement learning agents. We begin with rational agents with unlimited resources and then move to a setting where an agent can only maintain a limited number of hypotheses and optimizes plans over a horizon much shorter than what the agent designer actually wants. We axiomatize what is rational in such a setting in a manner that enables optimism, which is important to achieve systematic explorative behavior. Then, within the class of agents deemed rational, we achieve convergence and finite-error bounds. Such results are desirable since they imply that the agent learns well from its experiences, but the bounds do not directly guarantee good performance and can be achieved by agents doing things one should obviously not do. Good performance cannot in fact be guaranteed for any agent in fully general settings. Our approach is to design agents that learn well from experience and act rationally. We introduce a framework for general reinforcement learning agents based on rationality axioms for a decision function and a hypothesis-generating function designed so as to achieve guarantees on the number of errors. We consistently use an optimistic decision function, but the hypothesis-generating function needs to change depending on what is known or assumed. We investigate a number of natural situations having either a frequentist or Bayesian flavor, deterministic or stochastic environments, and either finite or countable hypothesis classes. Further, to achieve bounds good enough to hold promise for practical success, we introduce the notion of a class of environments being generated by a set of laws. None of the above has previously been done for fully general reinforcement learning environments.
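The optimistic decision functions recurring in these abstracts can be illustrated with a minimal sketch (our own, not code from any of the listed papers): the agent evaluates each candidate action under every environment hypothesis it still considers plausible and picks the action whose best-case value is highest. The names and the toy value table below are illustrative assumptions.

```python
# Minimal sketch of an optimistic decision rule over a finite
# hypothesis class. `value(a, m)` is a hypothetical estimate of the
# return of action `a` if model `m` were the true environment.

def optimistic_action(actions, models, value):
    """Return the action maximizing value under the most optimistic model."""
    return max(actions, key=lambda a: max(value(a, m) for m in models))

# Toy usage: two actions, two hypotheses about reward.
values = {("explore", "m1"): 0.2, ("explore", "m2"): 1.0,
          ("exploit", "m1"): 0.6, ("exploit", "m2"): 0.6}
best = optimistic_action(["explore", "exploit"], ["m1", "m2"],
                         lambda a, m: values[(a, m)])
# "explore" wins: its best case (1.0 under m2) beats exploit's 0.6.
```

The point of the rule is systematic exploration: an action that might be excellent under some still-plausible hypothesis gets tried, and if the hypothesis is wrong it is eliminated from the class.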
Bayesian Reinforcement Learning with Exploration
We consider a general reinforcement learning problem and show that carefully combining the Bayesian optimal policy and an exploring policy leads to minimax sample-complexity bounds in a very general class of (history-based) environments. We also prove lower bounds and show that the new algorithm displays adaptive behaviour when the environment is easier than worst-case.
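The combination described in this abstract can be caricatured as follows (a hedged sketch under our own assumptions, not the paper's actual algorithm): with a small, decaying probability the agent follows a dedicated exploring policy, and otherwise acts Bayes-optimally with respect to its posterior. The decay schedule `c / (t + 1)` here is an illustrative choice.

```python
import random

def mixed_policy(history, bayes_action, explore_action, t, c=1.0):
    """Follow explore_action(history) with probability min(1, c/(t+1)),
    otherwise bayes_action(history); t is the current timestep.

    bayes_action / explore_action are hypothetical callables standing in
    for the Bayes-optimal policy and the exploring policy, respectively.
    """
    if random.random() < min(1.0, c / (t + 1)):
        return explore_action(history)
    return bayes_action(history)
```

Setting `c = 0` recovers pure Bayes-optimal behaviour; larger `c` buys more exploration early on, which is what such combinations trade against worst-case sample complexity.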
Learning Agents with Evolving Hypothesis Classes
It has recently been shown that a Bayesian agent with a universal hypothesis class resolves most induction problems discussed in the philosophy of science. These ideal agents are, however, neither practical nor a good model for how real science works. We here introduce a framework for learning based on implicit beliefs over all possible hypotheses and limited sets of explicit theories sampled from an implicit distribution, represented only by the process by which it generates new hypotheses. We address the questions of how to act based on a limited set of theories as well as what an ideal sampling process should be like. Finally, we discuss topics in the philosophy of science and cognitive science from the perspective of this framework.