8 citations found.
S. Lawrence, C. L. Giles, and A. C. Tsoi. What size neural network gives optimal generalization? Convergence properties of backpropagation. Technical Report UMIACS-TR-96-22 and CS-TR-3617, University of Maryland, April 1996.

This paper is cited in the following contexts:
Deterministic Nonmonotone Strategies for Effective.. - Plagianakos..

....TO FUNCTION APPROXIMATION TABLE X NONMONOTONE BBP APPLIED TO THE ALPHABETIC FONT PROBLEM D. Generalization Performance Generalization performance is a decisive factor when selecting a training algorithm. It is well known that generalization depends on the size of the weights space [17], [35], [67], the noise in the data set [8], [22], [26], [45], the size of the training set [32], and the initial weight distribution [3]. A number of techniques have been proposed to avoid overfitting and improve generalization [22], [32], such as the use of two fold or fold cross validation [1] and ....

S. Lawrence, C. L. Giles, and A. C. Tsoi, "What size neural network gives optimal generalization? Convergence properties of backpropagation, " Univ. Maryland Tech. Rep. CS-TR-3617, 1996.


An Incremental Multivariate Regression Method for Function.. - Carozza, Rampone (2001)

.... regression functions from noisy data comes from kernel regression theory [2]. However, existence theorems do not give methods to define a network architecture (mainly the number of neurons in the hidden layers) well suited to the data, i.e. allowing a good approximation without overfitting [3]. By using a decomposition of the generalization error in estimation and approximation errors, Niyogi and Girosi [4] investigated the relationship between the number of network parameters and the training set size. They showed that simply fixing the number of data and increasing the number of ....

....to the noise. Then we stop the adding procedure when the residual error is under a fixed threshold related to the noise level. Our results extend to noisy data the strategy of Esposito et al. [11] and give a precise theoretical insight. Moreover, let us note that, with respect to other methods [3], such results appear better in terms of approximation and generalization capability, and computational cost. This paper is organized as follows. In Section 2 we describe the used incremental estimator, and, in Section 3, the network training mechanism, bounding the network nodes and summarizing ....

[Article contains additional citation context not shown here]

S. Lawrence, C. Lee Giles, A. Chung Tsoi, What Size Neural Network Gives Optimal Generalization?, University of Maryland Technical Report UMIACS-TR-96-22 and CS-TR-3617, 1996.


Modelling Chaotic Systems with Neural Networks: Application to.. - van Zyl

....The best model is one that models the data correctly, and has the minimum number of free parameters. This is known as the minimum description length principle. However, obtaining the correct model may be more difficult when training networks with only the minimum amount of free parameters [34]. A possible solution is to choose the model too large, to remove unimportant weights after training, and then to retrain the model. In this section we discuss a few methods for deciding which weights are important and which are irrelevant. 3.7.1 Optimal Brain Damage By trading the training ....

S. Lawrence, C. Giles, and A. Tsoi, "What size neural network gives optimal generalization?," Tech. Rep. UMIACS-TR-22, Institute for Advanced Computer Studies, University of Maryland, College Park, MD, 1996.


A Neuro-Fuzzy Approach to Aerobic Fitness.. - Väinämö.. (1998)

.... World Congress On Computational Intelligence, Anchorage, Alaska, USA, May 4-9, 1998. A Neuro-Fuzzy Approach to Aerobic Fitness Classification: a multistructure solution to the context-sensitive feature selection problem. Kauko Väinämö, Timo Mäkikallio, Mikko Tulppo, Juha Röning. Department of Electrical Engineering, University of Oulu, FIN-90570 OULU, FINLAND e-mail: ....

....The second chapter includes a summary of the techniques used and the related work. The third chapter introduces our approach to deal with context-sensitive features. Finally, the results and the discussion are given. 2. Related work 2.1. Neuro-Fuzzy Methods The neuro-fuzzy calculation approach can be implemented in notably different ways. These artificial hybrid techniques can be implemented in a very integrated manner, when fuzzy implementation is used as the learning control of the artificial ....

[Article contains additional citation context not shown here]

Lawrence, S., Giles, C.L., Tsoi, A.C., "What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation", Technical Report, Institute for Advanced Computer Studies, University of Maryland, June 1996.


A Case Study in Neural Network Training with the Breeder Genetic .. - Belanche

.... connectivity (usually full between adjacent layers, and no shortcuts) reduce the problem of hidden unit determination to a (possibly incremental) search over a predetermined small space of combinations, employing a model selection technique, to achieve a good compromise between bias and variance [14]. Even focusing only in this task, it is a difficult one that can become a painful design problem for certain applications. To tackle the second task (the real optimization problem), several strategies have been proposed. By far the most widely used in the neural network context are those ....

Lawrence, S., Giles, C. L., Tsoi, A. C. What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation. Technical Report UMIACS-TR-96-22 and CS-TR-3617. Institute for Advanced Computer Studies. Univ. of Maryland, 1996.


The Sample Complexity of Pattern Classification With Neural.. - Bartlett (1997)   (69 citations)

....a satisfactory explanation of the sample size requirements of neural networks for pattern classification applications, for several reasons. First, neural networks often perform successfully with training sets that are considerably smaller than the number of network parameters (see, for example, [29]). Second, the VC dimension of the class of functions computed by a network is sensitive to small perturbations of the computation unit transfer functions (to the extent that an arbitrarily small change can make the VC dimension infinite, see [39]). That this could affect the generalization ....

S. Lawrence, C. L. Giles, and A. C. Tsoi. What size neural network gives optimal generalization? Convergence properties of backpropagation. Technical Report UMIACS-TR-96-22 and CS-TR-3617, Institute for Advanced Computer Studies, University of Maryland, April 1996.


Center for Automated Learning and Discovery - Advisor Manuela Veloso

No context found.

S. Lawrence, C. L. Giles, and A. C. Tsoi. What size neural network gives optimal generalization? Convergence properties of backpropagation. Technical Report UMIACS-TR-96-22 and CS-TR-3617, University of Maryland, April 1996.


A Domain Independent Approach to 2D Object Detection Based on the.. - Zhang (2000)   (2 citations)

No context found.

S. Lawrence, C. Giles, and A. Tsoi. What size neural network gives optimal generalization? Convergence properties of backpropagation. Technical Report UMIACS-TR-96-22 and CS-TR-3617, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 2074.

CiteSeer.IST - Copyright Penn State and NEC