
Worst-case Quadratic Loss Bounds for Prediction Using Linear Functions and Gradient Descent (1996)
Nicolo Cesa-Bianchi, Philip M. Long, Manfred K. Warmuth
IEEE Transactions on Neural Networks




Abstract: In this paper we study the performance of gradient descent when applied to the problem of on-line linear prediction in arbitrary inner product spaces. We show worst-case bounds on the sum of the squared prediction errors under various assumptions concerning the amount of a priori information about the sequence to predict. The algorithms we use are variants and extensions of on-line gradient descent. Whereas our algorithms always predict using linear functions as hypotheses, none of our results...
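
The setting in the abstract can be illustrated with a minimal sketch of plain on-line gradient descent for linear prediction under squared loss. This is not the paper's algorithm: it assumes finite-dimensional real vectors, a zero initial weight vector, and a fixed learning rate, whereas the paper studies variants that use a priori information about the sequence. The function name `online_gradient_descent` and its parameters are hypothetical.

```python
def online_gradient_descent(examples, dim, learning_rate):
    """Predict each y from x with a linear function, updating after every trial.

    examples: iterable of (x, y) pairs, x a list of `dim` floats, y a float.
    Returns the final weight vector and the sum of squared prediction errors.
    Assumption: fixed learning rate and zero start vector (illustrative only).
    """
    weights = [0.0] * dim
    total_loss = 0.0
    for x, y in examples:
        # Predict with the current linear hypothesis (inner product of w and x).
        prediction = sum(w * xi for w, xi in zip(weights, x))
        error = prediction - y
        total_loss += error ** 2
        # The gradient of (prediction - y)^2 with respect to w is 2 * error * x;
        # the constant factor 2 is folded into the learning rate here.
        weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
    return weights, total_loss
```

The returned `total_loss` is the quantity that worst-case bounds of this kind control: the sum of squared prediction errors accumulated over the whole sequence, with each prediction made before the corresponding outcome is used for the update.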

Context of citations to this paper:

..., whose predictions are kept hidden from the master. Using this sequential prediction model, we will show (extending results from [3, 10, 12]) that a well known algorithm for linear regression, Gradient Descent, and a recently proposed variant, Exponentiated Gradient, have a...

Cited by:
Analysis of Two Gradient-based Algorithms for On-line Regression - Dsi (1999)

Similar documents (at the sentence level):
72.4%:   Worst-case Quadratic Loss Bounds for On-line Prediction .. - Cesa-Bianchi, Long.. (1993)

Active bibliography (related documents):
0.9:   On the Complexity of Function Learning - Auer, Long, Maass, Woeginger (1994)
0.5:   Efficient Higher-order Neural Networks for Classification and.. - Ghosh, Shin (1995)
0.3:   New Adaptive-Filtering Techniques Applied To Speech Echo.. - Siqueira, Alwan

Similar documents based on text:
0.4:   On Bayes Methods for On-line Boolean Prediction - Cesa-Bianchi, Helmbold, Panizza (1997)
0.4:   The Perceptron algorithm vs. Winnow: linear vs. logarithmic.. - Kivinen, Warmuth (1997)
0.4:   Report for Publication of the Activity of the Working Group.. - Shawe-Taylor (1997)

BibTeX entry:

N. Cesa-Bianchi, P.M. Long, and M.K. Warmuth. Worst-case quadratic loss bounds for prediction using linear functions and gradient descent. IEEE Transactions on Neural Networks, 7(3):604--619, 1996. http://citeseer.ist.psu.edu/cesa-bianchi96worstcase.html

@article{ cesabianchi96worstcase,
    author = "N. Cesa-Bianchi and P. M. Long and M. K. Warmuth",
    title = "Worst-Case Quadratic Loss Bounds for Prediction Using Linear Functions and Gradient Descent",
    journal = "IEEE Transactions on Neural Networks",
    volume = "7",
    number = "3",
    month = "May",
    pages = "604--619",
    year = "1996",
    url = "citeseer.ist.psu.edu/cesa-bianchi96worstcase.html" }
