  Online Convex Programming and Generalized Infinitesimal Gradient Ascent

by Martin Zinkevich
http://reports-archive.adm.cs.cmu.edu/anon/2003/CMU-CS-03-110.ps

Abstract:

Convex programming involves a convex set $F \subseteq \mathbb{R}^n$ and a convex function $c : F \to \mathbb{R}$. The goal of convex programming is to find a point in $F$ which minimizes $c$. In this paper, we introduce online convex programming. In online convex programming, the convex set is known in advance, but in each step of some repeated optimization problem, one must select a point in $F$ before seeing the cost function for that step. This can be used to model factory production, farm production, and many other industrial optimization problems where one is unaware of the value of the items produced until they have already been constructed. We introduce an algorithm for this domain, apply it to repeated games, and show that it is really a generalization of infinitesimal gradient ascent, and the results here imply that generalized infinitesimal gradient ascent (GIGA) is universally consistent.
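
The abstract describes the online protocol but not the algorithm itself, so the following is only a minimal sketch of that protocol under stated assumptions: the feasible set F is taken to be a Euclidean ball, the update is a projected gradient step, and the step size is 1/sqrt(t). The function names, the ball-shaped feasible set, and the example cost vectors are all hypothetical choices for illustration, not details taken from this page.

```python
import math


def project_onto_ball(x, radius=1.0):
    """Euclidean projection of x onto the ball {y : ||y|| <= radius}."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


def online_gradient_descent(gradients, dim, radius=1.0):
    """Play one point per round; each cost gradient is revealed only afterwards.

    `gradients` is a list of callables (an assumed interface): gradients[t](x)
    returns the gradient of that round's cost function at the point x.
    """
    x = [0.0] * dim                      # starting point, assumed to lie in F
    played = []
    for t, grad in enumerate(gradients, start=1):
        played.append(list(x))           # commit to x before seeing the cost
        g = grad(x)                      # this round's cost is now revealed
        eta = 1.0 / math.sqrt(t)         # assumed decaying step size
        step = [xi - eta * gi for xi, gi in zip(x, g)]
        x = project_onto_ball(step, radius)  # return to the feasible set
    return played


# Hypothetical linear costs c_t(x) = <a_t, x>, whose gradient is a_t everywhere.
costs = [[1.0, 0.0], [1.0, 0.0], [0.0, -1.0]]
gradients = [lambda x, a=a: a for a in costs]
points = online_gradient_descent(gradients, dim=2)
```

Each entry of `points` is chosen using only the gradients from earlier rounds, which mirrors the requirement that a point be selected before the cost function for that step is seen.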

Citations

437 The weighted majority algorithm – Littlestone, Warmuth - 1994
279 The theory of learning in games – Fudenberg, Levine - 1998
159 Exponentiated gradient versus gradient descent for linear predictors – Kivinen, Warmuth - 1997
122 Probability inequalities for sums of bounded random variables – Hoeffding - 1963
89 An analog of the minimax theorem for vector payoffs – Blackwell - 1956
86 Adaptive game playing using multiplicative weights – Freund, Schapire - 1999
75 Regret in the on-line decision problem – Foster, Vohra - 1997
47 Relative loss bounds for multidimensional regression problems – Kivinen, Warmuth
34 A general class of adaptive strategies – Hart, Mas-Colell
26 Tracking the best linear predictor – Herbster, Warmuth
24 Efficient algorithms for on-line optimization – Kalai, Vempala - 2003
22 Natural gradient works efficiently in learning – Amari - 1998
21 Conditional universal consistency – Fudenberg, Levine - 1999
12 Convergence of gradient dynamics with a variable learning rate – Bowling, Veloso - 2001
9 Universal consistency and cautious play – Fudenberg, Levine - 1995
8 Proving relative loss bounds for on-line learning algorithms using Bregman divergences – Gentile, Warmuth - 2000
7 Online oblivious routing – Bansal, Blum, et al. - 2003
5 Prior knowledge and preferential structures in gradient descent algorithms – Mahony, Williamson - 2001
3 Duality and auxiliary functions for Bregman distances – Della Pietra, Della Pietra, et al. - 1999
3 A proof of calibration via Blackwell’s approachability theorem – Foster - 1999
2 Worst-case quadratic bounds for online prediction of linear functions by gradient descent – Cesa-Bianchi, Long, et al. - 1994
1 Approximation to Bayes risk in repeated play. Annals of Mathematics Studies – Hannan - 1957