Results 1–9 of 9
On Universal Prediction and Bayesian Confirmation
Theoretical Computer Science, 2007
"... The Bayesian framework is a wellstudied and successful framework for inductive reasoning, which includes hypothesis testing and confirmation, parameter estimation, sequence prediction, classification, and regression. But standard statistical guidelines for choosing the model class and prior are not ..."
Cited by 31 (14 self)
Abstract:
The Bayesian framework is a well-studied and successful framework for inductive reasoning, which includes hypothesis testing and confirmation, parameter estimation, sequence prediction, classification, and regression. But standard statistical guidelines for choosing the model class and prior are not always available or can fail, in particular in complex situations. Solomonoff completed the Bayesian framework by providing a rigorous, unique, formal, and universal choice for the model class and the prior. I discuss in breadth how and in which sense universal (non-i.i.d.) sequence prediction solves various (philosophical) problems of traditional Bayesian sequence prediction. I show that Solomonoff's model possesses many desirable properties: strong total and future bounds, and weak instantaneous bounds; in contrast to most classical continuous prior densities it has no zero p(oste)rior problem, i.e. it can confirm universal hypotheses; it is reparametrization and regrouping invariant; and it avoids the old-evidence and updating problem. It even performs well (actually better) in non-computable environments.
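For orientation, the "strong total bound" referred to here has, in its simplest classical form, the following shape (a schematic statement; the exact constants and the future/instantaneous variants are in the paper). For a countable class \(\mathcal{M}\) with prior weights \(w_\nu\) and true environment \(\mu \in \mathcal{M}\), the Bayes mixture \(\xi\) satisfies

\[
\xi(x_{1:n}) \;=\; \sum_{\nu \in \mathcal{M}} w_\nu\, \nu(x_{1:n}), \qquad w_\nu > 0, \;\; \sum_{\nu} w_\nu \le 1,
\]
\[
\sum_{t=1}^{\infty} \mathbf{E} \sum_{a} \big( \xi(a \mid x_{<t}) - \mu(a \mid x_{<t}) \big)^2 \;\le\; 2 \ln w_\mu^{-1},
\]

which with Solomonoff's choice \(w_\nu = 2^{-K(\nu)}\) becomes \(2 K(\mu) \ln 2\), i.e. linear in the complexity of the true environment.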
On the foundations of universal sequence prediction
In Proc. 3rd Annual Conference on Theory and Applications of Models of Computation (TAMC'06), volume 3959 of LNCS, 2006
"... Solomonoff completed the Bayesian framework by providing a rigorous, unique, formal, and universal choice for the model class and the prior. We discuss in breadth how and in which sense universal (noni.i.d.) sequence prediction solves various (philosophical) problems of traditional Bayesian sequenc ..."
Cited by 11 (4 self)
Abstract:
Solomonoff completed the Bayesian framework by providing a rigorous, unique, formal, and universal choice for the model class and the prior. We discuss in breadth how and in which sense universal (non-i.i.d.) sequence prediction solves various (philosophical) problems of traditional Bayesian sequence prediction. We show that Solomonoff's model possesses many desirable properties: fast convergence and strong bounds; in contrast to most classical continuous prior densities it has no zero p(oste)rior problem, i.e. it can confirm universal hypotheses; it is reparametrization and regrouping invariant; and it avoids the old-evidence and updating problem. It even performs well (actually better) in non-computable environments.
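As a concrete illustration of the Bayes mixture these two papers analyze, here is a minimal sketch in Python, with the countable class truncated to a finite set of Bernoulli models and a uniform prior standing in for the (incomputable) Solomonoff weights 2^{-K(nu)}; all names are illustrative, not from the papers:

import random

# Countable model class, truncated: Bernoulli(theta) for theta = k/10.
thetas = [k / 10 for k in range(1, 10)]
# Prior weights w_nu > 0 with sum <= 1 (uniform here as a stand-in).
prior = [1 / len(thetas)] * len(thetas)

def bernoulli_prob(theta, seq):
    """Probability nu(x_{1:t}) of a binary sequence under Bernoulli(theta)."""
    ones = sum(seq)
    return theta ** ones * (1 - theta) ** (len(seq) - ones)

def mixture(seq):
    """Bayes mixture xi(x_{1:t}) = sum_nu w_nu * nu(x_{1:t})."""
    return sum(w * bernoulli_prob(t, seq) for w, t in zip(prior, thetas))

def predict_one(seq):
    """Predictive probability xi(1 | x_{1:t}) = xi(seq+[1]) / xi(seq)."""
    return mixture(seq + [1]) / mixture(seq)

# Usage: the predictive probability converges to the true parameter 0.7.
random.seed(0)
data = [int(random.random() < 0.7) for _ in range(200)]
for t in (10, 50, 200):
    print(t, round(predict_one(data[:t]), 3))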
Asymptotics of Discrete MDL for Online Prediction
2005
"... Minimum Description Length (MDL) is an important principle for induction and prediction, with strong relations to optimal Bayesian learning. This paper deals with learning noni.i.d. processes by means of twopart MDL, where the underlying model class is countable. We consider the online learning fr ..."
Cited by 10 (5 self)
Abstract:
Minimum Description Length (MDL) is an important principle for induction and prediction, with strong relations to optimal Bayesian learning. This paper deals with learning non-i.i.d. processes by means of two-part MDL, where the underlying model class is countable. We consider the online learning framework, i.e. observations arrive one by one, and the predictor is allowed to update its state of mind after each time step. We identify two ways of predicting by MDL for this setup, namely a static and a dynamic one. (A third variant, hybrid MDL, will turn out to be inferior.) We prove that, under the sole assumption that the data is generated by a distribution contained in the model class, the MDL predictions converge to the true values almost surely. This is accomplished by proving finite bounds on the quadratic, the Hellinger, and the Kullback-Leibler loss of the MDL learner, which are, however, exponentially worse than for Bayesian prediction. We demonstrate that these bounds are sharp, even for model classes containing only Bernoulli distributions. We show how these bounds imply regret bounds for arbitrary loss functions. Our results apply to a wide range of setups, among them sequence prediction, pattern classification, regression, and universal induction in the sense of Algorithmic Information Theory.
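The static/dynamic distinction can be made concrete with a small sketch (assumptions: a Bernoulli class with a uniform two-part code; the paper's setup is more general and its definitions differ in detail):

import math

thetas = [k / 10 for k in range(1, 10)]
codelen = {t: math.log2(len(thetas)) for t in thetas}  # model code, in bits

def neg_log2(theta, seq):
    """Data cost -log2 nu(x_{1:t}) under Bernoulli(theta)."""
    ones = sum(seq)
    return -(ones * math.log2(theta) + (len(seq) - ones) * math.log2(1 - theta))

def rho(seq):
    """Two-part MDL 'distribution' rho(x) = max_nu 2^{-codelen(nu)} nu(x)."""
    best = min(neg_log2(t, seq) + codelen[t] for t in thetas)
    return 2.0 ** (-best)

def static_predict(seq):
    """Static MDL: select one model on the past, predict with it."""
    theta = min(thetas, key=lambda t: neg_log2(t, seq) + codelen[t])
    return theta  # P(next bit = 1) under the selected model

def dynamic_predict(seq):
    """Dynamic MDL: re-run model selection for each candidate
    continuation a in {0, 1}, then normalize rho(seq + a)."""
    p0, p1 = rho(seq + [0]), rho(seq + [1])
    return p1 / (p0 + p1)

data = [1, 1, 0, 1, 1, 1, 0, 1]  # observations arriving one by one
print(static_predict(data), round(dynamic_predict(data), 3))

Static MDL commits to one model selected on the past; dynamic MDL re-runs the selection for each possible next symbol, so the two can disagree whenever the selected model switches.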
MDL Convergence Speed for Bernoulli Sequences
2006
"... The Minimum Description Length principle for online sequence estimation/prediction in a proper learning setup is studied. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is finitely bounded, implying ..."
Cited by 8 (3 self)
Abstract:
The Minimum Description Length principle for online sequence estimation/prediction in a proper learning setup is studied. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is finitely bounded, implying convergence with probability one, and (b) it additionally specifies the convergence speed. For MDL, in general one can only prove loss bounds which are finite but exponentially larger than those for Bayes mixtures. We show that this is the case even if the model class contains only Bernoulli distributions. We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. We discuss the application to Machine Learning tasks such as classification and hypothesis testing, and generalization to countable classes of i.i.d. models.
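Schematically, the gap referred to here is between a logarithmic and a linear dependence on the inverse prior weight of the true model \(\mu\) (worst case; exact constants and conditions are in the paper). With \(\theta\) the true Bernoulli parameter, \(\xi\) the Bayes mixture, and \(\varrho\) the MDL predictor:

\[
\sum_{t=1}^{\infty} \mathbf{E}\big(\xi(1 \mid x_{<t}) - \theta\big)^2 \;=\; O\!\left(\ln w_\mu^{-1}\right)
\quad \text{(Bayes mixture)},
\]
\[
\sum_{t=1}^{\infty} \mathbf{E}\big(\varrho(1 \mid x_{<t}) - \theta\big)^2 \;=\; O\!\left(w_\mu^{-1}\right)
\quad \text{(MDL, in general)},
\]

i.e. with \(w_\mu = 2^{-K(\mu)}\) the Bayes bound is linear in \(K(\mu)\) while the general MDL bound is exponential in it.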
Sequential predictions based on algorithmic complexity
2004
"... This paper studies sequence prediction based on the monotone Kolmogorov complexity Km=−log m, i.e. based on universal deterministic/onepart MDL. m is extremely close to Solomonoff’s universal prior M, the latter being an excellent predictor in deterministic as well as probabilistic environments, wh ..."
Cited by 6 (6 self)
Abstract:
This paper studies sequence prediction based on the monotone Kolmogorov complexity Km = −log m, i.e. based on universal deterministic/one-part MDL. m is extremely close to Solomonoff's universal prior M, the latter being an excellent predictor in deterministic as well as probabilistic environments, where performance is measured in terms of convergence of posteriors or losses. Despite this closeness to M, it is difficult to assess the prediction quality of m, since little is known about the closeness of their posteriors, which are the important quantities for prediction. We show that for deterministic computable environments, the "posterior" and losses of m converge, but rapid convergence could only be shown on-sequence; the off-sequence convergence can be slow. In probabilistic environments, neither the posterior nor the losses converge, …
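For reference, the quantities involved are the standard ones from algorithmic information theory, with U a universal monotone Turing machine, \(\ell(p)\) the length of program p, and \(U(p) = x*\) meaning p's output starts with x:

\[
M(x) \;=\; \sum_{p\,:\,U(p)=x*} 2^{-\ell(p)}, \qquad
m(x) \;=\; 2^{-Km(x)} \;=\; \max_{p\,:\,U(p)=x*} 2^{-\ell(p)},
\]
\[
m(x_t \mid x_{<t}) \;=\; \frac{m(x_{1:t})}{m(x_{<t})},
\]

so Km(x) is the length of the shortest program whose output starts with x, and the "posterior" is the ratio above, which, m being only a semimeasure, need not sum to one over \(x_t\); hence the quotation marks.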
Strong asymptotic assertions for discrete MDL in regression and classification
In Benelearn 2005 (Annual Machine Learning Conference of Belgium and the Netherlands), 2005
"... We study the properties of the MDL (or maximum penalized complexity) estimator for Regression and Classification, where the underlying model class is countable. We show in particular a finite bound on the Hellinger losses under the only assumption that there is a “true” model contained in the class. ..."
Cited by 5 (2 self)
Abstract:
We study the properties of the MDL (or maximum penalized complexity) estimator for regression and classification, where the underlying model class is countable. We show in particular a finite bound on the Hellinger losses under the sole assumption that there is a "true" model contained in the class. This implies almost sure convergence of the predictive distribution to the true one at a fast rate. It corresponds to Solomonoff's central theorem of universal induction, albeit with a bound that is exponentially larger.
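In these terms (a schematic reading of the abstract, not the paper's exact statement): with \(h_t\) the Hellinger distance between the true conditional \(\mu\) and the MDL predictive distribution \(\hat{\nu}_t\) at time t,

\[
h_t^2 \;=\; \sum_{a} \Big( \sqrt{\mu(a \mid x_{<t})} - \sqrt{\hat{\nu}_t(a \mid x_{<t})} \Big)^2, \qquad
\sum_{t=1}^{\infty} \mathbf{E}\, h_t^2 \;<\; \infty,
\]

and finiteness of the sum forces \(h_t \to 0\) almost surely, which is the convergence claim; the bound itself is exponential in the description length of the true model, versus linear for the Solomonoff/Bayes mixture.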
Recent Results in Universal and Non-Universal Induction
2006
"... We present and relate recent results in prediction based on countable classes of either probability (semi)distributions or base predictors. Learning by Bayes, MDL, and stochastic model selection will be considered as ..."
Abstract:
We present and relate recent results in prediction based on countable classes of either probability (semi)distributions or base predictors. Learning by Bayes, MDL, and stochastic model selection will be considered as …
Explicit Local Models: Towards “Optimal” Optimization Algorithms
"... We address the problem of minimizing functions on a continuous vector space with the help of models. A theoretical background is established in terms of the AIXI theory, which concerns optimal rational agents in unknown environments. It implies recommendations for the design of optimization algorith ..."
Abstract:
We address the problem of minimizing functions on a continuous vector space with the help of models. A theoretical background is established in terms of the AIXI theory, which concerns optimal rational agents in unknown environments. It implies recommendations for the design of optimization algorithms. In the light of this theory, existing model-based algorithms are reviewed. In the second part of the paper, an optimization algorithm using local quadratic models is stated and evaluated.
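To make the second part concrete, here is a minimal sketch of a local-quadratic-model step in Python; the sampling scheme, least-squares fit, and trust-region safeguard are illustrative assumptions, not the paper's algorithm:

import numpy as np

def quad_features(d):
    """Features [1, d_i, d_i*d_j (i<=j)] of a displacement vector d."""
    n = len(d)
    feats = [1.0] + list(d)
    for i in range(n):
        for j in range(i, n):
            feats.append(d[i] * d[j])
    return np.array(feats)

def local_quadratic_step(f, center, radius, n_samples=25, rng=None):
    """Fit a quadratic model of f around `center` by least squares and
    step toward its minimizer, clipped to the sampling radius."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n = len(center)
    D = radius * rng.uniform(-1.0, 1.0, size=(n_samples, n))  # displacements
    y = np.array([f(center + d) for d in D])
    A = np.vstack([quad_features(d) for d in D])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    g = coef[1:1 + n]                  # fitted gradient at `center`
    H = np.zeros((n, n))               # fitted Hessian
    k = 1 + n
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = coef[k] if i != j else 2 * coef[k]
            k += 1
    try:
        step = -np.linalg.solve(H, g)  # minimizer of the local model
    except np.linalg.LinAlgError:
        step = -g                      # fall back to a gradient step
    norm = np.linalg.norm(step)
    if norm > radius:                  # crude trust-region safeguard
        step *= radius / norm
    return center + step

# Usage on a toy objective: a few steps home in on the minimum at (1, -2).
f = lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 2.0) ** 2 + 0.5
x = np.zeros(2)
for _ in range(5):
    x = local_quadratic_step(f, x, radius=1.0)
print(np.round(x, 3))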