
The Sample Complexity of Pattern Classification With Neural Networks: The Size of the Weights is More Important Than the Size of the Network (1997)  (91 citations)
Peter L. Bartlett




Abstract: Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared...

Cited by:
A Simple Additive Re-weighting Strategy for Improving Margins - Fabio Aiolli et al.
Convexity, Classification, and Risk Bounds - Peter Bartlett
Journal of Machine Learning Research 5 (2004) 27-72.. - Gert Lanckriet

Active bibliography (related documents):
0.5:   For Valid Generalization, the Size of the Weights is More.. - Bartlett (1997)
0.5:   The VC-Dimension and Pseudodimension of Two-Layer Neural.. - Peter Bartlett (1993)
0.3:   Probabilistic Analysis of Learning in Artificial Neural Networks: .. - Anthony (1994)

Similar documents based on text:
0.4:   Generalization in decision trees and DNF: Does size matter? - Mostefa Golea et al. (1997)
0.0:   Advances in Large Margin Classifiers - (Eds.) (2000)
0.0:   A proof of independent Bartlett correctability of nested.. - Takemura, Kuriki (1995)

Related documents from co-citation:
38:   Boosting the margin: A new explanation for the effectiveness of voting methods - Schapire, Freund et al. - 1997
32:   Support-Vector Networks - Cortes, Vapnik - 1995
30:   Statistical Learning Theory - Vapnik

BibTeX entry:

Peter L. Bartlett, "The Sample Complexity of Pattern Classification with Neural Networks: The Size of the Weights is More Important than the Size of the Network," IEEE Trans. Inf. Theory, 44(2), 525-536 (1998). http://citeseer.ist.psu.edu/bartlett97sample.html

@techreport{ bartlett96sample,
    author = "P.L. Bartlett",
    title = "The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network",
    month = "May 7",
    year = "1996",
    url = "citeseer.ist.psu.edu/bartlett97sample.html" }
Citations (may not include all citations):
1056   Introduction to the Theory of Neural Computation - Hertz, Krogh et al. - 1991
500   Experiments with a new boosting algorithm - Freund, Schapire - 1996
465   Learnability and the Vapnik-Chervonenkis dimension - Blumer, Ehrenfeucht et al. - 1989
454   the uniform convergence of relative frequencies of events to.. - Vapnik, Chervonenkis - 1971
375   Probability inequalities for sums of bounded random variable.. - Hoeffding - 1963
348   Estimation of Dependences Based on Empirical Data - Vapnik - 1982
318   Convergence of Stochastic Processes - Pollard - 1984
296   A probabilistic theory of pattern recognition - Devroye, Gyorfi et al. - 1996
268   Decision theoretic generalizations of the PAC model for neur.. - Haussler - 1992
255   A training algorithm for optimal margin classifiers - Boser, Guyon et al. - 1992
243   Boosting the margin: a new explanation for the effectiveness.. - Schapire, Freund et al. - 1997
214   Universal approximation bounds for superposition of a sigmoi.. - Barron - 1993
203   What size net gives valid generalization - Baum, Haussler - 1989
151   A general lower bound on the number of examples needed for l.. - Ehrenfeucht, Haussler et al. - 1989
115   Efficient distribution-free learning of probabilistic concep.. - Kearns, Schapire - 1990
76   Learning to tell two spirals apart - Lang, Witbrock - 1988
71   Scale-sensitive dimensions - Alon, Ben-David et al. - 1997
71   A simple lemma on greedy approximation in Hilbert space and .. - Jones - 1992
67   Bounding the Vapnik-Chervonenkis dimension of concept classe.. - Goldberg, Jerrum - 1995
64   Feedforward nets for interpolation and classification - Sontag - 1992
59   A decision-theoretic generalization of online learning and a.. - Freund, Schapire - 1995
37   Structural risk minimization over data-dependent hierarchies - Shawe-Taylor, Bartlett et al. - 1996
32   A result of Vapnik with applications - Anthony, Shawe-Taylor - 1993
31   Approximation of functions (Holt, Rinehart and Winston) - Lorentz - 1966
21   A generalization of Sauer's Lemma - Haussler, Long - 1995
18   Vapnik-Chervonenkis dimension of neural nets - Maass - 1995
16   Characterizations of learnability for classes of {0, ...,.. - Ben-David, Cesa-Bianchi et al. - 1995
16   Function learning from interpolation - Anthony, Bartlett - 1995
15   Efficient agnostic learning of neural networks with bounded .. - Lee, Bartlett et al. - 1996
15   Approximation and learning of convex superpositions - Gurvits, Koiran - 1995
15   Automatic capacity tuning of very large VC-dimension classifi.. - Guyon, Boser et al. - 1993
13   ε-entropy and ε-capacity of sets in function spaces - Kolmogorov, Tihomirov - 1961
12   For valid generalization, the size of the weights is more important than the size of t.. - Bartlett - 1997
11   A framework for structural risk minimisation - Shawe-Taylor, Bartlett et al. - 1996
8   What size neural network gives optimal generalization - Lawrence, Giles et al. - 1996
7   A data-dependent skeleton estimate for learning - Lugosi, Pintér - 1996
6   Covering numbers for real-valued function classes - Bartlett, Kulkarni et al. - 1996
5   and scale-sensitive dimensions - Bartlett, Long et al. - 1997
2   Quadratic VC-dimension bounds for sigmoidal networks - Karpinski, Macintyre - 1995
1   A data-dependent skeleton estimate and a scale-sensitive dim.. - Horvath, Lugosi - 1996
1   Generalization and the size of the weights: an experimental .. - Loy, Bartlett - 1997





Documents on the same site (http://wwwsyseng.anu.edu.au/~bartlett/papers/):
Covering Numbers for Real-Valued Function Classes - Bartlett, Kulkarni, Posner (1997)
Structural Risk Minimization over Data-Dependent Hierarchies - Shawe-Taylor, Bartlett, et al. (1998)
For Valid Generalization, the Size of the Weights is More.. - Bartlett (1997)
