Results 1 -
8 of
8
A Decision-Theoretic Generalization of on-Line Learning and an Application to Boosting
, 1997
"... In the first part of the paper we consider the problem of dynamically apportioning resources among a set of options in a worst-case on-line framework. The model we study can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic set ..."
Abstract
-
Cited by 1714 (53 self)
- Add to MetaCart
In the first part of the paper we consider the problem of dynamically apportioning resources among a set of options in a worst-case on-line framework. The model we study can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting. We show that the multiplicative weightupdate rule of Littlestone and Warmuth [20] can be adapted to this model yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems. We show how the resulting learning algorithm can be applied to a variety of problems, including gambling, multiple-outcome prediction, repeated games and prediction of points in R n . In the second part of the paper we apply the multiplicative weight-update technique to derive a new boosting algorithm. This boosting algorithm does not require any prior knowledge about the performance of the weak learning algorithm. We also study generalizations of...
How to Use Expert Advice
- JOURNAL OF THE ASSOCIATION FOR COMPUTING MACHINERY
, 1997
"... We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called experts. Our analysis is for worst-case situations, i.e., we make no assumptions about the way the sequence of bits to be predicted is generated. We measure the performance of the ..."
Abstract
-
Cited by 267 (60 self)
- Add to MetaCart
We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called experts. Our analysis is for worst-case situations, i.e., we make no assumptions about the way the sequence of bits to be predicted is generated. We measure the performance of the algorithm by the difference between the expected number of mistakes it makes on the bit sequence and the expected number of mistakes made by the best expert on this sequence, where the expectation is taken with respect to the randomization in the predictions. We show that the minimum achievable difference is on the order of the square root of the number of mistakes of the best expert, and we give efficient algorithms that achieve this. Our upper and lower bounds have matching leading constants in most cases. We then show howthis leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best results currently known in this context. We also compare our analysis to the case in which log loss is used instead of the expected number of mistakes.
Predicting a Binary Sequence Almost as Well as the Optimal Biased Coin
, 1996
"... We apply the exponential weight algorithm, introduced and Littlestone and Warmuth [17] and by Vovk [24] to the problem of predicting a binary sequence almost as well as the best biased coin. We first show that for the case of the logarithmic loss, the derived algorithm is equivalent to the Bayes alg ..."
Abstract
-
Cited by 35 (4 self)
- Add to MetaCart
We apply the exponential weight algorithm, introduced and Littlestone and Warmuth [17] and by Vovk [24] to the problem of predicting a binary sequence almost as well as the best biased coin. We first show that for the case of the logarithmic loss, the derived algorithm is equivalent to the Bayes algorithm with Jeffrey's prior, that was studied by Xie and Barron under probabilistic assumptions [26]. We derive a uniform bound on the regret which holds for any sequence. We also show that if the empirical distribution of the sequence is bounded away from 0 and from 1, then, as the length of the sequence increases to infinity, the difference between this bound and a corresponding bound on the average case regret of the same algorithm (which is asymptotically optimal in that case) is only 1=2. We show that this gap of 1=2 is necessary by calculating the regret of the min-max optimal algorithm for this problem and showing that the asymptotic upper bound is tight. We also study the application...
Worst-case Quadratic Loss Bounds for Prediction Using Linear Functions and Gradient Descent
, 1996
"... In this paper we study the performance of gradient descent when applied to the problem of on-line linear prediction in arbitrary inner product spaces. We show worst-case bounds on the sum of the squared prediction errors under various assumptions concerning the amount of a priori information about t ..."
Abstract
-
Cited by 23 (3 self)
- Add to MetaCart
In this paper we study the performance of gradient descent when applied to the problem of on-line linear prediction in arbitrary inner product spaces. We show worst-case bounds on the sum of the squared prediction errors under various assumptions concerning the amount of a priori information about the sequence to predict. The algorithms we use are variants and extensions of on-line gradient descent. Whereas our algorithms always predict using linear functions as hypotheses, none of our results requires the data to be linearly related. In fact, the bounds proved on the total prediction loss are typically expressed as a function of the total loss of the best fixed linear predictor with bounded norm. All the upper bounds are tight to within constants. Matching lower bounds are provided in some cases. Finally, we apply our results to the problem of on-line prediction for classes of smooth functions.
Regret bounds for sleeping experts and bandits
- In 21st COLT
, 2008
"... We study on-line decision problems where the set of actions that are available to the decision algorithm vary over time. With a few notable exceptions, such problems remained largely unaddressed in the literature, despite their applicability to a large number of practical problems. Departing from pr ..."
Abstract
-
Cited by 13 (2 self)
- Add to MetaCart
We study on-line decision problems where the set of actions that are available to the decision algorithm vary over time. With a few notable exceptions, such problems remained largely unaddressed in the literature, despite their applicability to a large number of practical problems. Departing from previous work on this “Sleeping Experts” problem, we compare algorithms against the payoff obtained by the best ordering of the actions, which is a natural benchmark for this type of problem. We study both the full-information (best expert) and partial-information (multi-armed bandit) settings and consider both stochastic and adaptive adversaries. For all settings we give algorithms achieving (almost) information-theoretically optimal regret bounds (up to a constant or a sublogarithmic factor) with respect to the best-ordering benchmark. 1
Continuous Experts and the Binning Algorithm
"... Abstract. We consider the design of online master algorithms for combining the predictions from a set of experts where the absolute loss of the master is to be close to the absolute loss of the best expert. For the case when the master must produce binary predictions, the Binomial Weighting algorith ..."
Abstract
-
Cited by 4 (3 self)
- Add to MetaCart
Abstract. We consider the design of online master algorithms for combining the predictions from a set of experts where the absolute loss of the master is to be close to the absolute loss of the best expert. For the case when the master must produce binary predictions, the Binomial Weighting algorithm is known to be optimal when the number of experts is large. It has remained an open problem how to design master algorithms based on binomial weights when the predictions of the master are allowed to be real valued. In this paper we provide such an algorithm and call it the Binning algorithm because it maintains experts in an array of bins. We show that this algorithm is optimal in a relaxed setting in which we consider experts as continuous quantities. The algorithm is efficient and near-optimal in the standard experts setting. 1
Minimizing Wide Range Regret with Time Selection Functions
"... We consider the problem of minimizing regret with respect to a given set S of pairs of time selection functions and modifications rules. We give an online algorithm that has O ( √ T log |S|) regret with respect to S when the algorithm is run for T time steps and there are N actions allowed. This im ..."
Abstract
- Add to MetaCart
We consider the problem of minimizing regret with respect to a given set S of pairs of time selection functions and modifications rules. We give an online algorithm that has O ( √ T log |S|) regret with respect to S when the algorithm is run for T time steps and there are N actions allowed. This improves the upper bound of O ( √ T N log(|I||F|)) given by Blum and Mansour [BM07a] for the case when S = I × F for a set I of time selection functions and a set F of modification rules. We do so by giving a simple reduction that uses an online algorithm for external regret as a black box. 1

