Results 1–10 of 18
Beating the adaptive bandit with high probability
, 2009
"... We provide a principled way of proving Õ( √ T) highprobability guarantees for partialinformation (bandit) problems over convex decision sets. First, we prove a regret guarantee for the fullinformation problem in terms of “local ” norms, both for entropy and selfconcordant barrier regularization, ..."
Abstract

Cited by 16 (6 self)
 Add to MetaCart
(Show Context)
We provide a principled way of proving Õ(√T) high-probability guarantees for partial-information (bandit) problems over convex decision sets. First, we prove a regret guarantee for the full-information problem in terms of “local” norms, both for entropy and self-concordant barrier regularization, unifying these methods. Given one such algorithm as a black box, we can convert a bandit problem into a full-information problem using a sampling scheme. The main result states that a high-probability Õ(√T) bound holds whenever the black box, the sampling scheme, and the estimates of missing information satisfy a number of conditions, which are relatively easy to check. At the heart of the method is a construction of linear upper bounds on confidence intervals. As applications of the main result, we provide the first known efficient algorithm for the sphere with an Õ(√T) high-probability bound. We also derive the result for the n-simplex, improving the O(√(nT log(nT))) bound of Auer et al. [3] by replacing the log T term with log log T and closing the gap to the lower bound of Ω(√(nT)). The guarantees we obtain hold for adaptive adversaries (unlike the in-expectation results of [1]), and the algorithms are efficient, given that the linear upper bounds on confidence can be computed.
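The simplex case above is, in spirit, an exponential-weights scheme fed with importance-weighted loss estimates. As a minimal illustrative sketch (an EXP3-style learner, not the paper's algorithm, which additionally constructs linear upper bounds on confidence intervals to obtain high-probability rather than expected-regret guarantees; all parameter names are illustrative):

```python
import math
import random

def exp3(n_arms, T, losses, eta=0.1, gamma=0.01):
    """EXP3-style sketch: convert bandit feedback on the simplex into
    full-information updates via importance-weighted loss estimates.
    losses[t][i] is the (unobserved) loss of arm i at round t; only the
    loss of the arm actually pulled is revealed to the learner."""
    w = [1.0] * n_arms
    total_loss = 0.0
    for t in range(T):
        z = sum(w)
        # mix in uniform exploration so every arm keeps positive probability
        p = [(1 - gamma) * wi / z + gamma / n_arms for wi in w]
        arm = random.choices(range(n_arms), weights=p)[0]
        loss = losses[t][arm]           # bandit feedback: one coordinate only
        total_loss += loss
        est = loss / p[arm]             # unbiased importance-weighted estimate
        w[arm] *= math.exp(-eta * est)  # exponential-weights (entropy) update
    return total_loss
```

On a toy instance where one arm always has zero loss, the learner's cumulative loss quickly concentrates near the exploration floor.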
Interior-Point Methods for Full-Information and Bandit Online Learning
"... Abstract—We study the problem of predicting individual sequences with linear loss with full and partial (or bandit) feedback. Our main contribution is the first efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal regret. In addition, for ..."
Abstract

Cited by 9 (4 self)
 Add to MetaCart
(Show Context)
We study the problem of predicting individual sequences with linear loss under full and partial (or bandit) feedback. Our main contribution is the first efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal regret. In addition, for the full-information setting, we give a novel regret-minimization algorithm. These results are made possible by the introduction of interior-point methods for convex optimization to online learning. Index Terms—Bandit feedback, interior-point methods, online convex optimization, online learning.
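For reference, the central object behind these interior-point methods is a ν-self-concordant barrier: a convex function φ on the interior of the decision set that blows up at the boundary and satisfies, for all x in its domain and all directions h (this is the standard Nesterov–Nemirovski definition, not a construction specific to this paper):

```latex
\left|\nabla^3 \varphi(x)[h,h,h]\right| \;\le\; 2\left(\nabla^2 \varphi(x)[h,h]\right)^{3/2},
\qquad
\left|\nabla \varphi(x)[h]\right| \;\le\; \sqrt{\nu}\,\left(\nabla^2 \varphi(x)[h,h]\right)^{1/2}.
```

For example, φ(x) = −∑ᵢ log xᵢ is an n-self-concordant barrier for the positive orthant.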
Composite Self-Concordant Minimization
"... We propose a variable metric framework for minimizing the sum of a selfconcordant function and a possibly nonsmooth convex function endowed with a computable proximal operator. We theoretically establish the convergence of our framework without relying on the usual Lipschitz gradient assumption on ..."
Abstract

Cited by 6 (5 self)
 Add to MetaCart
(Show Context)
We propose a variable-metric framework for minimizing the sum of a self-concordant function and a possibly nonsmooth convex function endowed with a computable proximal operator. We theoretically establish the convergence of our framework without relying on the usual Lipschitz-gradient assumption on the smooth part. An important highlight of our work is a new set of analytic step-size selection and correction procedures based on the structure of the problem. We describe concrete algorithmic instances of our framework for several interesting large-scale applications and demonstrate them numerically on both synthetic and real data.
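The framework generalizes the classical proximal-gradient scheme. As a minimal baseline sketch of that scheme (fixed step size and a simple ℓ₁ prox; this is not the paper's variable-metric method or its self-concordance-based step-size rules):

```python
import math

def soft_threshold(v, lam):
    """Proximal operator of lam * ||x||_1, applied coordinate-wise."""
    return [math.copysign(max(abs(vi) - lam, 0.0), vi) for vi in v]

def prox_grad(grad_f, prox_g, x0, step, n_iter=500):
    """Generic proximal-gradient sketch for min f(x) + g(x):
    a gradient step on the smooth part f, followed by the prox of g.
    (The paper replaces the fixed step with a variable metric and
    analytic step-size selection/correction rules.)"""
    x = list(x0)
    for _ in range(n_iter):
        g = grad_f(x)
        x = prox_g([xi - step * gi for xi, gi in zip(x, g)], step)
    return x
```

For f(x) = ½‖x − b‖² and g(x) = λ‖x‖₁ the minimizer has the closed form soft_threshold(b, λ), which the iteration recovers.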
Online learning with predictable sequences
 In COLT
, 2013
"... We present methods for online linear optimization that take advantage of benign (as opposed to worstcase) sequences. Specifically if the sequence encountered by the learner is described well by a known “predictable process”, the algorithms presented enjoy tighter bounds as compared to the typical w ..."
Abstract

Cited by 5 (1 self)
 Add to MetaCart
(Show Context)
We present methods for online linear optimization that take advantage of benign (as opposed to worst-case) sequences. Specifically, if the sequence encountered by the learner is described well by a known “predictable process,” the algorithms presented enjoy tighter bounds than the typical worst-case bounds. Additionally, the methods achieve the usual worst-case regret bounds if the sequence is not benign. Our approach can be seen as a way of adding prior knowledge about the sequence within the paradigm of online learning. The setting is shown to encompass partial and side information. Variance and path-length bounds [11, 9] can be seen as particular examples of online learning with simple predictable sequences. We further extend our methods and results to include competing with a set of possible predictable processes (models), that is, “learning” the predictable process itself concurrently with using it to obtain better regret guarantees. We show that such model selection is possible under various assumptions on the available feedback. Our results suggest a promising direction of further research with potential applications to stock market and time-series prediction.
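A common way to exploit a predictable process is the optimistic (two-step) update: play a point shifted by a prediction M_t of the next gradient, then update with the true gradient once it is revealed. A minimal sketch on a Euclidean ball (the names grads/predictions are illustrative; the paper's methods are mirror-descent style and considerably more general):

```python
import math

def optimistic_ogd(grads, predictions, x0, radius, eta):
    """Optimistic online gradient descent sketch. grads[t] is the gradient
    revealed after round t; predictions[t] is the learner's guess for it.
    If predictions are accurate, the played points track the optimum more
    closely than plain online gradient descent. Decision set: ball of
    the given radius."""
    def project(v):
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm <= radius:
            return v
        return [vi * radius / norm for vi in v]

    y = list(x0)   # secondary iterate updated with true gradients
    played = []
    for g, m in zip(grads, predictions):
        x = project([yi - eta * mi for yi, mi in zip(y, m)])  # play with hint
        played.append(x)
        y = project([yi - eta * gi for yi, gi in zip(y, g)])  # true update
    return played
```

With a constant gradient and perfect predictions, the played points converge to the boundary point minimizing the linear loss.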
An inexact proximal path-following algorithm for constrained convex minimization. arXiv preprint arXiv:1311.1756
"... Abstract. Many scientific and engineering applications feature nonsmooth convex minimization problems over convex sets. In this paper, we address an important instance of this broad class where we assume that the nonsmooth objective is equipped with a tractable proximity operator and that the convex ..."
Abstract

Cited by 4 (4 self)
 Add to MetaCart
(Show Context)
Many scientific and engineering applications feature nonsmooth convex minimization problems over convex sets. In this paper, we address an important instance of this broad class where we assume that the nonsmooth objective is equipped with a tractable proximity operator and that the convex constraint set affords a self-concordant barrier. We provide a new joint treatment of proximal and self-concordant barrier concepts and illustrate that such problems can be efficiently solved without the need to lift the problem dimensions, as in the disciplined convex optimization approach. We propose an inexact path-following algorithmic framework and theoretically characterize the worst-case analytical complexity of this framework when the proximal subproblems are solved inexactly. To show the merits of our framework, we apply its instances to both synthetic and real-world applications, where it shows advantages over standard interior-point methods. As a byproduct, we describe how our framework can obtain points on the Pareto frontier of regularized problems with self-concordant objectives in a tuning-free fashion. Key words. Inexact path-following algorithm, self-concordant barrier, tractable proximity, proximal-Newton method, constrained convex optimization.
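For context, the classical barrier path-following template that such frameworks build on is as follows: to minimize a convex g over a set X equipped with a self-concordant barrier φ, one tracks the central path

```latex
x^\star(t) \;=\; \arg\min_{x} \big\{\, t\, g(x) + \varphi(x) \,\big\}, \qquad t \uparrow \infty,
```

following x*(t) toward a solution as t grows, each new point obtained by a Newton-type step from the previous one. The paper's contribution is carrying this out when g is nonsmooth but proximable, with the proximal-Newton subproblems solved only inexactly.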
A primal-dual algorithmic framework for constrained convex minimization. arXiv preprint arXiv:1406.5403
, 2014
"... We present a primaldual algorithmic framework to obtain approximate solutions to a prototypical constrained convex optimization problem, and rigorously characterize how common structural assumptions affect the numerical efficiency. Our main analysis technique provides a fresh perspective on Nester ..."
Abstract

Cited by 3 (2 self)
 Add to MetaCart
(Show Context)
We present a primal-dual algorithmic framework to obtain approximate solutions to a prototypical constrained convex optimization problem, and rigorously characterize how common structural assumptions affect the numerical efficiency. Our main analysis technique provides a fresh perspective on Nesterov’s excessive-gap technique in a structured fashion and unifies it with smoothing and primal-dual methods. For instance, through the choices of a dual smoothing strategy and a center point, our framework subsumes decomposition algorithms, the augmented Lagrangian method, and the alternating direction method-of-multipliers method as special cases, and provides optimal convergence rates on the primal objective residual as well as the primal feasibility gap of the iterates for all.
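To illustrate the “dual smoothing strategy and a center point” ingredient in a standard form (a sketch of Nesterov-style smoothing; the details may differ from the paper's construction): for the prototype problem min_x f(x) subject to Ax = b, the nonsmooth dual function is replaced by

```latex
d_\gamma(\lambda) \;=\; \min_{x} \Big\{\, f(x) + \langle \lambda,\, Ax - b \rangle + \tfrac{\gamma}{2}\,\| x - x_c \|^2 \,\Big\},
```

where x_c is the center point and γ > 0 the smoothness parameter. Sending γ → 0 recovers the ordinary dual, while particular choices of the smoothing term and center point yield augmented-Lagrangian and ADMM-type methods as special cases.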
The homogeneous interior-point algorithm: Nonsymmetric cones, ...
, 2013
"... The overall topic of this thesis is convex conic optimization, a subfield of mathematical optimization that attacks optimization problem with a certain geometric structure. These problems allow for modelling of an extremely wide range of realworld problems, but the availability of solution algori ..."
Abstract

Cited by 3 (0 self)
 Add to MetaCart
The overall topic of this thesis is convex conic optimization, a subfield of mathematical optimization that attacks optimization problems with a certain geometric structure. These problems allow for the modelling of an extremely wide range of real-world problems, but the availability of solution algorithms for them is still limited. The goal of this thesis is to investigate and shed light on two computational aspects of homogeneous interior-point algorithms for convex conic optimization: the first part studies the possibility of devising a homogeneous interior-point method aimed at solving problems involving constraints that require nonsymmetric cones in their formulation; the second part studies the possibility of warm-starting the homogeneous interior-point algorithm for conic problems. The main outcome of the first part is the introduction of a completely new homogeneous interior-point algorithm designed to solve nonsymmetric convex conic optimization problems. The algorithm is presented in detail and then analyzed, and we prove its convergence and complexity.
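For concreteness, the standard (simplified) homogeneous self-dual model for the conic pair min{cᵀx : Ax = b, x ∈ K} seeks (x, y, s, τ, κ) satisfying

```latex
\begin{aligned}
A x - b\,\tau &= 0, \\
A^{\mathsf T} y + s - c\,\tau &= 0, \\
-\,c^{\mathsf T} x + b^{\mathsf T} y - \kappa &= 0, \\
x \in \mathcal{K}, \quad s \in \mathcal{K}^{*}, &\quad \tau, \kappa \ge 0.
\end{aligned}
```

A solution with τ > 0 yields the optimal primal-dual pair (x/τ, y/τ, s/τ), while κ > 0 certifies primal or dual infeasibility; this built-in infeasibility detection is part of what makes the homogeneous embedding attractive.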
Improved Regret Guarantees for Online Smooth Convex Optimization with Bandit Feedback
"... The study of online convex optimization in the bandit setting was initiated by Kleinberg (2004) and Flaxman et al. (2005). Such a setting models a decision maker that has to make decisions in the face of adversarially chosen convex loss functions. Moreover, the only information the decision maker re ..."
Abstract

Cited by 2 (0 self)
 Add to MetaCart
The study of online convex optimization in the bandit setting was initiated by Kleinberg (2004) and Flaxman et al. (2005). Such a setting models a decision maker that has to make decisions in the face of adversarially chosen convex loss functions. Moreover, the only information the decision maker receives is the incurred losses; the identities of the loss functions themselves are not revealed. In this setting, we reduce the gap between the best known lower and upper bounds for the class of smooth convex functions, i.e., convex functions with a Lipschitz-continuous gradient. Building upon existing work on self-concordant regularizers and one-point gradient estimation, we give the first algorithm whose expected regret is O(T^(2/3)), ignoring constant and logarithmic factors.
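The one-point gradient estimator mentioned above evaluates the loss at a single randomly perturbed point and rescales; in expectation this yields the gradient of a smoothed version of f. A minimal pure-Python sketch of the Flaxman et al. estimator (parameter names are illustrative):

```python
import math
import random

def one_point_grad(f, x, delta, rng):
    """One-point gradient estimate: a single function evaluation at a
    randomly perturbed point gives, in expectation, the gradient of a
    smoothed version of f (f averaged over a delta-ball around x)."""
    d = len(x)
    # uniform direction on the unit sphere via a normalized Gaussian
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    nrm = math.sqrt(sum(ui * ui for ui in u))
    u = [ui / nrm for ui in u]
    val = f([xi + delta * ui for xi, ui in zip(x, u)])
    return [(d / delta) * val * ui for ui in u]
```

For f(x) = ‖x‖² the smoothed gradient coincides with the true gradient 2x, so averaging many estimates recovers it; the single-sample variance grows as δ shrinks, which is the source of the bias-variance trade-off behind the T^(2/3)-type rates.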