Abstract: We have used information-theoretic ideas to derive a class of practical
and nearly optimal schemes for adapting the size of a neural
network. By removing unimportant weights from a network, several
improvements can be expected: better generalization, fewer
training examples required, and improved speed of learning and/or
classification. The basic idea is to use second-derivative information
to make a tradeoff between network complexity and training
set error. Experiments confirm the...
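The tradeoff described in the abstract can be illustrated with a small sketch. It assumes the diagonal second-derivative approximation usually associated with this paper, in which deleting weight w_k raises the training error by roughly s_k = h_kk * w_k^2 / 2. The function names, the example weights, and the example curvature values are hypothetical and are not taken from this record.

```python
# Minimal sketch of saliency-based pruning (assumption: diagonal
# second-derivative approximation; all names and numbers are illustrative).

def saliencies(weights, hessian_diag):
    """Estimated rise in training error from deleting each weight:
    s_k = 0.5 * h_kk * w_k**2 (second-order Taylor term, diagonal only)."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]


def prune_lowest(weights, hessian_diag, n_remove):
    """Return a copy of weights with the n_remove lowest-saliency entries zeroed."""
    s = saliencies(weights, hessian_diag)
    order = sorted(range(len(weights)), key=lambda k: s[k])
    drop = set(order[:n_remove])
    return [0.0 if k in drop else w for k, w in enumerate(weights)]


weights = [2.0, -0.1, 0.5, 3.0]        # hypothetical trained weights
hessian_diag = [0.01, 1.0, 1.0, 2.0]   # hypothetical second derivatives h_kk
pruned = prune_lowest(weights, hessian_diag, 2)
```

In this made-up example the weight 2.0 is removed even though it is large, because its curvature is small. That is the difference between ranking by second-derivative saliency and ranking by magnitude alone. In practice the pruned network would be retrained and the procedure repeated.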
Cited by:
Center for Automated Learning and Discovery - Advisor Manuela Veloso
An Application of Pruning in the Design of Neural Networks.. - Corani, Guariso (2005)
Statistical Control of RBF-like Networks for Classification - Norbert Jankowski
Active bibliography (related documents):
0.5: Measuring the VC-dimension of a Learning Machine - Vapnik, Levin, Le Cun (1994)
0.3: Part 1: Overview of the Probably Approximately Correct (PAC).. - Haussler (1995)
0.2: Transforming Neural-Net Output Levels to Probability Distributions - Denker (1991)
Similar documents based on text:
0.6: Early Brain Damage - Tresp, Neuneier, Zimmermann (1996)
0.2: Loss Functions for Discriminative Training of Energy-Based.. - Yann Lecun And (2005)
0.1: Handwritten Digit Recognition with a Back-Propagation Network - Le Cun (1990)
Related documents from co-citation:
37: Second order derivatives for network pruning: Optimal brain surgeon - Hassibi, Stork - 1993
32: The cascade-correlation learning architecture - Fahlman, Lebiere - 1990
25: Skeletonization: a technique for trimming the fat from a network via relevance a.. - Mozer, Smolensky - 1989
BibTeX entry:
Y. LeCun, J. S. Denker, and S. A. Solla. Optimal brain damage. In D. S. Touretzky, editor, Advances in Neural Information Processing Systems 2, pages 598--605. Morgan Kaufmann, San Mateo, CA, 1990. http://citeseer.ist.psu.edu/lecun90optimal.html
@inproceedings{ lecun90optimal,
author = "Y. LeCun and J. Denker and S. Solla and R.~E. Howard and L.~D. Jackel",
title = "Optimal Brain Damage",
booktitle = "Advances in Neural Information Processing Systems {II}",
publisher = "Morgan Kaufmann",
address = "San Mateo, CA",
editor = "D.~S. Touretzky",
year = "1990",
url = "citeseer.ist.psu.edu/lecun90optimal.html" }
Citations (may not include all citations):
454: the Uniform Convergence of Relative Frequencies of Events to.. - Vapnik, Chervonenkis - 1971
417: Stochastic Complexity in Statistical Inquiry - Rissanen - 1989
96: Backpropagation Applied to Handwritten Zip Code Recognition - LeCun, Boser et al. - 1989
47: Handwritten digit recognition with a backpropagation network - LeCun, Boser et al. - 1990
37: Generalization and Network Design Strategies - LeCun - 1989
25: Large Automatic Learning - Denker, Schwartz et al. - 1987
18: Inductive Principles of the Search for Empirical Dependences - Vapnik - 1989
8: Modeles connexionnistes de l'apprentissage - LeCun - 1987
4: personal communication - Rumelhart - 1988
2: What Size Net Gives Valid Generalization - Baum, Haussler - 1989
1: Use of Statistical Models for Time Series Analysis - Akaike - 1986
Documents on the same site (http://www.research.att.com/~yann/exdb/publis/index.html):
Word-Level Training of a Handwritten Word Recognizer Based on.. - Le Cun, Bengio (1994)
Handwritten Digit Recognition with a Back-Propagation Network - Le Cun (1990)
Discriminative Feature And Model Design For Automatic.. - Rahim, Bengio, LeCun (1997)