by Steve Lawrence, Andrew D. Back, Ah Chung Tsoi, C. Lee Giles
IEEE Trans. on Neural Networks
http://www.neci.nec.com/~lawrence/papers/results-tnn97/results-tnn97.ps.gz
Abstract:
The performance of neural-network simulations is often reported in terms of the mean and standard deviation of a number of simulations performed with different starting conditions. However, in many cases, the distribution of the individual results does not approximate a Gaussian distribution, may not be symmetric, and may be multimodal. We present the distribution of results for practical problems and show that assuming Gaussian distributions can significantly affect the interpretation of results, especially those of comparison studies. For a controlled task which we consider, we find that the distribution of performance is skewed toward better performance for smoother target functions and skewed toward worse performance for more complex target functions. We propose new guidelines for reporting performance which provide more information about the actual distribution.
Index Terms: Backpropagation, box whiskers, convergence, error analysis, Gaussian distribution, gradient training,