2 Problems in training with Back-Propagation
2.1 The network architecture must be fixed a priori
|
3052
|
Neural Networks for Pattern Recognition
– Bishop
- 1995
|
|
2961
|
Pattern Classification and Scene Analysis
– Duda, Hart
- 1973
|
|
2141
|
Learning Internal Representations by Error Propagation
– Rumelhart, Hinton, et al.
- 1986
|
|
2138
|
UCI Repository of Machine Learning Databases
– Blake, Merz
- 1998
|
|
1390
|
Introduction to the Theory of Neural Computation
– Hertz, Krogh, et al.
- 1991
|
|
1364
|
A theory of the learnable
– Valiant
- 1984
|
|
800
|
Multilayer feedforward networks are universal approximators
– Hornik, Stinchcombe, et al.
- 1989
|
|
686
|
Linear and Nonlinear Programming
– Luenberger
- 1984
|
|
654
|
On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab
– Vapnik, Chervonenkis
- 1971
|
|
637
|
Estimation of Dependences Based on Empirical Data
– Vapnik
- 1982
|
|
536
|
Learnability and the Vapnik-Chervonenkis Dimension
– Blumer, Ehrenfeucht, et al.
- 1989
|
|
530
|
Approximation by superpositions of a sigmoidal function
– Cybenko
- 1989
|
|
440
|
Cross-Validatory Choice and Assessment of Statistical Predictions
– Stone
- 1974
|
|
431
|
Learning representations by back-propagating errors
– Rumelhart, Hinton, et al.
- 1986
|
|
366
|
Beyond Regression: New Tools for Prediction and Analysis
– Werbos
- 1974
|
|
363
|
Perceptrons: An Introduction To Computational Geometry
– Minsky, Papert
- 1969
|
|
350
|
Optimal brain damage
– Le Cun, Denker, et al.
- 1990
|
|
329
|
Learning decision lists
– Rivest
- 1987
|
|
293
|
What size net gives valid generalization
– Baum, Haussler
- 1989
|
|
265
|
On the approximate realization of continuous mappings by neural networks
– Funahashi
- 1989
|
|
245
|
Increased Rates of Convergence Through Learning Rate Adaptation. Neural Networks
– Jacobs
- 1988
|
|
219
|
Backpropagation applied to handwritten zip code recognition, Neural Computation
– Le Cun, Boser, et al.
- 1989
|
|
192
|
Computational vision and regularization theory
– Poggio, Torre, et al.
- 1985
|
|
189
|
Principles of Neurodynamics
– Rosenblatt
- 1962
|
|
188
|
Training a 3-Node Neural Network is NP-Complete
– Blum, Rivest
- 1992
|
|
165
|
The Upstart algorithm: a method for constructing and training feed-forward networks, Neural Comput
– Frean
- 1990
|
|
148
|
Second order derivatives for network pruning: optimal brain surgeon
– Hassibi, Stork, et al.
- 1993
|
|
144
|
A new polynomial time algorithm for linear programming
– Karmarkar
- 1984
|
|
132
|
Learning distributed representations of concepts
– Hinton
- 1986
|
|
131
|
Neural Network Learning and Expert Systems
– Gallant
- 1993
|
|
91
|
Accelerating the convergence of the backpropagation method
– Vogl, Mangis, et al.
- 1988
|
|
87
|
Learning in feedforward layered networks: the tiling algorithm
– Mézard, Nadal
- 1989
|
|
86
|
Local Learning Algorithms
– Bottou, Vapnik
- 1992
|
|
83
|
Boosting performance in neural networks
– Drucker, Schapire, et al.
- 1993
|
|
70
|
Review of Neural Networks for Speech Recognition
– Lippmann
- 1989
|
|
70
|
Perceptron-Based Learning Algorithms
– Gallant
- 1990
|
|
69
|
Experiments on learning by back propagation
– Plaut, Nowlan, et al.
- 1986
|
|
65
|
Large automatic learning, rule extraction, and generalization
– Denker, Schwartz, et al.
- 1987
|
|
63
|
Learning algorithms with optimal stability in neural networks
– Krauth, Mézard
- 1987
|
|
55
|
Learning logic
– Parker
- 1985
|
|
55
|
Neural net algorithms that learn in polynomial time from examples and queries
– Baum
- 1991
|
|
53
|
How neural nets work
– Lapedes, Farber
- 1987
|
|
53
|
Accelerated learning in layered neural networks
– Solla, Levin, et al.
- 1988
|
|
51
|
Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights
– Nguyen, Widrow
|
|
51
|
A thermal perceptron learning rule
– Frean
- 1992
|
|
50
|
A convergence theorem for sequential learning in two-layer perceptrons
– Marchand, Golea, et al.
- 1990
|
|
50
|
Structured induction in expert systems
– Shapiro
- 1987
|
|
42
|
Comparison of classifier methods: A case study in handwritten digit recognition
– Bottou, Cortes, et al.
- 1994
|
|
42
|
On the identification of the convex hull of a finite set of points
– Jarvis
- 1973
|
|
39
|
Efficient Parallel learning algorithms for neural networks
– Kramer, Sangiovanni-Vincentelli
- 1988
|