Abstract: In this paper we study the performance of gradient descent when applied to the problem of on-line linear prediction in arbitrary inner product spaces. We show worst-case bounds on the sum of the squared prediction errors under various assumptions concerning the amount of a priori information about the sequence to predict. The algorithms we use are variants and extensions of on-line gradient descent. Whereas our algorithms always predict using linear functions as hypotheses, none of our results...
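The setting in the abstract can be illustrated with a minimal sketch of on-line gradient descent for linear prediction under squared loss (the classical Widrow-Hoff style update). This is not taken from the paper: the function names, the zero initial weight vector, and the fixed learning rate `eta` are assumptions made for illustration, and the paper's own variants and tunings are not reproduced here.

```python
from typing import Iterable, List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def online_gradient_descent(
    trials: Iterable[Tuple[Sequence[float], float]],
    dim: int,
    eta: float,
) -> Tuple[List[float], float]:
    """Run on-line gradient descent over a sequence of (x, y) trials.

    In each trial the learner predicts y_hat = w . x, then sees y,
    suffers the squared error (y_hat - y)^2, and updates w.
    Returns the final weights and the sum of squared prediction errors.
    `eta` is a hypothetical fixed learning rate (an assumption of this sketch).
    """
    w = [0.0] * dim          # assumed starting hypothesis: the zero vector
    total_loss = 0.0
    for x, y in trials:
        y_hat = dot(w, x)    # predict before the outcome is revealed
        err = y_hat - y
        total_loss += err ** 2
        # The gradient of (w . x - y)^2 in w is 2 * err * x;
        # the constant factor 2 is folded into eta.
        w = [wi - eta * err * xi for wi, xi in zip(w, x)]
    return w, total_loss
```

For example, feeding the learner a noise-free sequence generated by a fixed linear function (here a hypothetical target `u = [1.0, -2.0]`) and calling `online_gradient_descent(trials, dim=2, eta=0.25)` returns weights close to `u` together with the cumulative squared error, which is the quantity the worst-case bounds in the abstract refer to.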
Context of citations to this paper:
..., whose predictions are kept hidden from the master. Using this sequential prediction model, we will show (extending results from [3, 10, 12]) that a well known algorithm for linear regression, Gradient Descent, and a recently proposed variant, Exponentiated Gradient, have a...
Cited by:
Analysis of Two Gradient-based Algorithms for On-line Regression - Dsi (1999)
Similar documents (at the sentence level):
72.4%: Worst-case Quadratic Loss Bounds for On-line Prediction .. - Cesa-Bianchi, Long.. (1993)
Active bibliography (related documents):
0.9: On the Complexity of Function Learning - Auer, Long, Maass, Woeginger (1994)
0.5: Efficient Higher-order Neural Networks for Classification and.. - Ghosh, Shin (1995)
0.3: New Adaptive-Filtering Techniques Applied To Speech Echo.. - Siqueira, Alwan
Similar documents based on text:
0.4: On Bayes Methods for On-line Boolean Prediction - Cesa-Bianchi, Helmbold, Panizza (1997)
0.4: The Perceptron algorithm vs. Winnow: linear vs. logarithmic.. - Kivinen, Warmuth (1997)
0.4: Report for Publication of the Activity of the Working Group.. - Shawe-Taylor (1997)
BibTeX entry:
N. Cesa-Bianchi, P.M. Long, and M.K. Warmuth. Worst-case quadratic loss bounds for prediction using linear functions and gradient descent. IEEE Transactions on Neural Networks, 7(3):604--619, 1996. http://citeseer.ist.psu.edu/cesa-bianchi96worstcase.html
@article{ cesabianchi96worstcase,
author = "N. Cesa-Bianchi and P. M. Long and M. K. Warmuth",
title = "Worst-Case Quadratic Loss Bounds for Prediction Using Linear Functions and Gradient Descent",
journal = "IEEE Transactions on Neural Networks",
volume = "7",
number = "3",
month = "May",
pages = "604--619",
year = "1996",
url = "citeseer.ist.psu.edu/cesa-bianchi96worstcase.html" }
Citations (may not include all citations):
2441: The Johns Hopkins University Press - Golub, Van Loan - 1990
2133: Pattern Classification and Scene Analysis - Duda, Hart - 1973
815: Adaptive Filter Theory - Haykin - 1991
708: Cambridge University Press - Horn, Johnson - 1985
530: Linear and Nonlinear Programming - Luenberger - 1984
441: Queries and concept learning - Angluin - 1988
373: Adaptive signal processing - Widrow, Stearns - 1985
317: Learning quickly when irrelevant attributes abound: a new li.. - Littlestone - 1988
222: Adaptive switching circuits - Widrow, Hoff - 1960
214: Universal approximation bounds for superpositions of a sigmo.. - Barron - 1993
133: Aggregating strategies - Vovk - 1990
118: How to use expert advice - Cesa-Bianchi, Freund et al. - 1993
88: Exponentiated gradient versus gradient descent for linear pr.. - Kivinen, Warmuth - 1994
81: Universal prediction of individual sequences - Feder, Merhav et al. - 1992
74: Mistake Bounds and Logarithmic Linear-threshold Learning Alg.. - Littlestone - 1989
35: Statistical theory: The prequential approach - Dawid - 1984
32: On-line learning of linear functions - Littlestone, Long et al. - 1991
24: Angenaherte Auflosung von systemen linearer gleichungen - Kaczmarz - 1937
12: Techniques of adaptive equalization of digital communication.. - Lucky - 1966
12: Universal sequential learning and decision from individual d.. - Merhav, Feder - 1992
9: Smoothing Techniques - Hardle - 1991
8: A learning algorithm for linear operators - Mycielski - 1988
5: The learning complexity of smooth functions of a single vari.. - Kimber, Long - 1992
3: An adaptive echo canceller - Sondhi - 1967
3: Silencing echoes in the telephone network - Sondhi, Berkley - 1980
2: Applications of learning theorems - Faber, Mycielski - 1991
2: General learning theorems - Mycielski, Swierczkowski - 1991
2: A preliminary version appeared in the Proceedings of the 30t.. - Littlestone, Warmuth et al. - 1991
2: Robust stability analysis of adaptation algorithms for singl.. - Hui, Zak - 1991
1: A Proof of Theorem 4 - Young, to et al. - 1988
CiteSeer.IST - Copyright Penn State and NEC