Evolving Artificial Neural Networks
1999
Cited by 328 (6 self)
This paper: 1) reviews different combinations between ANN's and evolutionary algorithms (EA's), including using EA's to evolve ANN connection weights, architectures, learning rules, and input features; 2) discusses different search operators which have been used in various EA's; and 3) points out possible future research directions. It is shown, through a considerably large literature review, that combinations between ANN's and EA's can lead to significantly better intelligent systems than relying on ANN's or EA's alone.
A scaled conjugate gradient algorithm for fast supervised learning
Neural Networks, 1993
Cited by 239 (0 self)
A supervised learning algorithm (Scaled Conjugate Gradient, SCG) with superlinear convergence rate is introduced. The algorithm is based upon a class of optimization techniques well known in numerical analysis as the Conjugate Gradient Methods. SCG uses second-order information from the neural network but requires only O(N) memory usage, where N is the number of weights in the network. The performance of SCG is benchmarked against the performance of the standard backpropagation algorithm (BP) [13], the conjugate gradient backpropagation (CGB) [6] and the one-step Broyden-Fletcher-Goldfarb-Shanno memoryless quasi-Newton algorithm (BFGS) [1]. SCG yields a speed-up of at least an order of magnitude relative to BP. The speed-up depends on the convergence criterion, i.e., the greater the demanded reduction in error, the greater the speed-up. SCG is fully automated, with no user-dependent parameters, and avoids the time-consuming line search that CGB and BFGS use in each iteration to determine an appropriate step size.
Incorporating problem-dependent structural information in the architecture of a neural network often lowers the overall complexity. The smaller the complexity of the neural network relative to the problem domain, the greater the possibility that the weight space contains long ravines characterized by sharp curvature. While BP is inefficient on these ravine phenomena, it is shown that SCG handles them effectively.
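The ravine behaviour described in this abstract can be illustrated on a toy problem. The sketch below is not SCG itself: it is the textbook linear conjugate gradient method applied to a fixed quadratic error E(x) = 0.5 x^T A x - b^T x, compared with fixed-step gradient descent. The matrix A, the vector b, the learning rate, and all function names are assumptions chosen for illustration. Unlike SCG, this sketch stores the curvature matrix A explicitly, which is only practical for a toy example.

```python
def matvec(A, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, x0, tol=1e-12, max_iter=None):
    """Linear CG for the quadratic 0.5*x^T A x - b^T x (A symmetric positive definite)."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]  # residual = negative gradient
    p = list(r)                                       # first search direction
    rs = dot(r, r)
    iterations = 0
    max_iter = max_iter or len(b)
    while iterations < max_iter and rs > tol:
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)                       # exact step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]  # next conjugate direction
        rs = rs_new
        iterations += 1
    return x, iterations


def gradient_descent(A, b, x0, lr, steps):
    """Fixed-step steepest descent on the same quadratic."""
    x = list(x0)
    for _ in range(steps):
        grad = [ai - bi for ai, bi in zip(matvec(A, x), b)]
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
    return x


# Toy "ravine": curvature 100 in one direction, 1 in the other; minimum at (1, 1).
A = [[100.0, 0.0], [0.0, 1.0]]
b = [100.0, 1.0]
cg_solution, cg_iterations = conjugate_gradient(A, b, [0.0, 0.0])
gd_solution = gradient_descent(A, b, [0.0, 0.0], lr=0.019, steps=2)
```

On this two-dimensional quadratic, conjugate gradient reaches the minimum in at most two iterations, while gradient descent with a step size small enough to stay stable in the steep direction has barely moved along the flat one after the same number of steps.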
A Review of Evolutionary Artificial Neural Networks
1993
Cited by 132 (22 self)
Research on potential interactions between connectionist learning systems, i.e., artificial neural networks (ANNs), and evolutionary search procedures, like genetic algorithms (GAs), has attracted a lot of attention recently. Evolutionary ANNs (EANNs) can be considered as the combination of ANNs and evolutionary search procedures. This paper first distinguishes among three kinds of evolution in EANNs, i.e., the evolution of connection weights, of architectures and of learning rules. Then it reviews each kind of evolution in detail and analyses critical issues related to different evolutions. The review shows that although a lot of work has been done on the evolution of connection weights and of architectures, few attempts have been made to understand the evolution of learning rules. Interactions among different evolutions are seldom mentioned in current research. However, the evolution of learning rules and its interactions with other kinds of evolution play a vital role in EANNs. As t...
First and Second-Order Methods for Learning: between Steepest Descent and Newton's Method
Neural Computation, 1992
Cited by 108 (6 self)
On-line first order backpropagation is sufficiently fast and effective for many large-scale classification problems but for very high precision mappings, batch processing may be the method of choice. This paper reviews first- and second-order optimization methods for learning in feedforward neural networks. The viewpoint is that of optimization: many methods can be cast in the language of optimization techniques, allowing the transfer to neural nets of detailed results about computational complexity and safety procedures to ensure convergence and to avoid numerical problems. The review is not intended to deliver detailed prescriptions for the most appropriate methods in specific applications, but to illustrate the main characteristics of the different methods and their mutual relations.
Making Use of Population Information in Evolutionary Artificial Neural Networks
1998
Cited by 65 (22 self)
This paper is concerned with the simultaneous evolution of artificial neural network (ANN) architectures and weights. The current practice in evolving ANNs is to choose the best ANN in the last generation as the final result. This paper proposes a different approach to form the final result by combining all the individuals in the last generation in order to make best use of all the information contained in the whole population. This approach regards a population of ANNs as an ensemble and uses a combination method to integrate them. Although there has been some work on integrating ANN modules [2], [3], little has been done in evolutionary learning to make best use of its population information. Four linear combination methods have been investigated in this paper to illustrate our ideas. Three real world data sets have been used in our experimental studies, which show that the recursive least square (RLS) algorithm always produces an integrated system that outperforms the best individua...
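One way to picture the idea of treating the final population as an ensemble is a linear combination whose weights are fitted by recursive least squares. The sketch below uses the textbook RLS recursion, not necessarily the formulation in the paper; it assumes each network has a single scalar output, and the function names, the initial value `delta`, and the toy data are all hypothetical.

```python
def rls_combination_weights(member_outputs, targets, delta=1e6, forgetting=1.0):
    """Fit linear combination weights with the standard RLS recursion.

    member_outputs[t][i] is the output of population member i on example t;
    targets[t] is the desired output on example t.
    """
    n = len(member_outputs[0])
    w = [0.0] * n
    # P approximates the inverse input correlation matrix; start from delta * I.
    P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    for x, d in zip(member_outputs, targets):
        Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = forgetting + sum(x[i] * Px[i] for i in range(n))
        gain = [v / denom for v in Px]
        error = d - sum(w[i] * x[i] for i in range(n))  # error before the update
        w = [w[i] + gain[i] * error for i in range(n)]
        # P stays symmetric, so x^T P equals (P x)^T and Px can be reused here.
        P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return w


def combine(weights, outputs):
    """Output of the integrated system: weighted sum of the members' outputs."""
    return sum(w * o for w, o in zip(weights, outputs))


# Toy data: two members whose ideal combination is 0.3 * first + 0.7 * second.
outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
targets = [0.3 * a + 0.7 * b for a, b in outputs]
weights = rls_combination_weights(outputs, targets)
```

Selecting the best individual corresponds to the special case where one weight is 1 and the rest are 0; the fitted weights let every member contribute.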
Evaluation of Pattern Classifiers for Fingerprint and OCR Applications
Pattern Recognition, 1993
Cited by 26 (2 self)
In this paper we evaluate the classification accuracy of four statistical and three neural network classifiers for two image based pattern classification problems. These are fingerprint classification and optical character recognition (OCR) for isolated handprinted digits. The evaluation results reported here should be useful for designers of practical systems for these two important commercial applications. For the OCR problem, the Karhunen-Loève (K-L) transform of the images is used to generate the input feature set. Similarly for the fingerprint problem, the K-L transform of the ridge directions is used to generate the input feature set. The statistical classifiers used were Euclidean minimum distance, quadratic minimum distance, normal, and k-nearest neighbor. The neural network classifiers used were multilayer perceptron, radial basis function, and probabilistic. The OCR data consisted of 7,480 digit images for training and 23,140 digit images for testing. The fingerprint data co...
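As a rough illustration of how a K-L transform turns raw measurements into features, the sketch below projects mean-centred samples onto the leading eigenvector of their covariance matrix. It is a minimal version under stated assumptions: only the first basis vector is computed, power iteration stands in for a full eigendecomposition, and the function names and two-dimensional toy data are hypothetical rather than taken from the paper.

```python
import math


def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]


def covariance(samples, mean):
    """Sample covariance matrix (normalised by the number of samples)."""
    dim = len(mean)
    n = len(samples)
    return [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / n
             for j in range(dim)] for i in range(dim)]


def leading_kl_basis_vector(cov, iterations=200):
    """Leading eigenvector of the covariance matrix by power iteration.

    Assumes the starting vector is not orthogonal to the leading eigenvector.
    """
    dim = len(cov)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v


def kl_feature(sample, mean, basis_vector):
    """First K-L feature: projection of the centred sample onto the basis vector."""
    return sum((s - m) * b for s, m, b in zip(sample, mean, basis_vector))


# Toy data lying on the line y = 2x, so one feature captures all the variance.
samples = [[-1.0, -2.0], [0.0, 0.0], [1.0, 2.0]]
mean = mean_vector(samples)
basis = leading_kl_basis_vector(covariance(samples, mean))
feature = kl_feature([1.0, 2.0], mean, basis)
```

A real feature extractor would keep several basis vectors, ordered by eigenvalue, and feed the resulting feature vector to the classifier.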
How to Make Best Use of Evolutionary Learning
in Complex Systems: From Local Interactions to Global Phenomena, 1996
Cited by 17 (10 self)
Evolutionary learning has been developing rapidly in the last decade. It is a powerful and general learning approach which has been used successfully in both symbolic systems, e.g., rule-based systems, and subsymbolic systems, e.g., artificial neural networks. However, most evolutionary learning systems have paid little attention to the fact that they are population-based learning. The common practice is to select the best individual in the last generation as the final learned system. Such practice in essence treats these learning systems as optimisation ones. This paper emphasises the difference between a learning system and an optimisation one, and shows that such difference requires a different approach to population-based learning and that the current practice of selecting the best individual as the learned system is not the best choice. The paper then argues that a population contains more information than the best individual and thus should be used as the final learned system. Tw...
Karhunen Loève Feature Extraction For Neural Handwritten Character Recognition
1992
Cited by 16 (3 self)
this paper investigate the effectiveness of Karhunen Loève transforms as classifiable features for handwritten digit recognition. The issues of interest include: 1. What is the optimal feature length? Generalization on an unseen test database is obtained as a function of the dimensionality of the basis space in which characters are represented;
Comparison of Handprinted Digit Classifiers
1993
Cited by 15 (3 self)
this report were trained and tested using feature vectors derived from the digit images of NIST Special Database 3 [13]. This database consists of binary 128 by 128 pixel raster images segmented from the sample forms of 2100 writers published on CD as [14]. External results on segmentation and recognition of this database have been reported [15]. The relative difficulties of the NIST OCR databases have been discussed in [16]. For this study samples are drawn randomly from the first 250 writers to yield a training set of 7480 digits with a priori class probabilities all equal to 0.1. Even for digits, depending on the application, certain classes may be more prevalent; in banking tasks, for example, "0" is more common. The test set is similarly constructed from the second 250 writers yielding 23140 samples. The images are size normalized by pixel deletion, stroke width is bounded by binary erosion and dilation, and consistent orientation is effected by row shearing.
[Figure 1: Components of Classification System. Blocks: Normalized Binary Image, Feature Extractor, Discriminant Functions, Class Finder, Rejector, Hypothesized Class, Accept or Reject]
On Langevin Updating in Multilayer Perceptrons
Neural Computation, 1993
Cited by 15 (1 self)
The Langevin updating rule, in which noise is added to the weights during learning, is presented and analyzed. It is well controlled and, being a natural extension to standard backpropagation learning, easily combined with other modifications of backpropagation. If the Hessian matrix is numerically ill-conditioned, Langevin updating converges faster than backpropagation and, probably, also higher order algorithms. This is particularly important for multilayer perceptrons with many hidden layers, which tend to have ill-conditioned Hessians. In addition, Manhattan updating is shown to have a similar effect as Langevin updating.
denni@thep.lu.se
Introduction
Performances of artificial neural networks (ANN) are often improved when external noise is present during the training phase. For instance in Hopfield-type networks the basins of attraction for the stored memory patterns are enlarged when noise-corrupted training patterns are used [1]. In linear perceptrons the generalization a...
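The two update rules named in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's exact scheme: the noise is taken to be zero-mean Gaussian with a fixed standard deviation, Manhattan updating is taken to mean a fixed-size step in the direction of the sign of each gradient component, and all function and parameter names are hypothetical.

```python
import random


def langevin_step(weights, grads, lr, noise_std, rng):
    """Gradient step plus zero-mean Gaussian noise added to every weight."""
    return [w - lr * g + rng.gauss(0.0, noise_std)
            for w, g in zip(weights, grads)]


def manhattan_step(weights, grads, lr):
    """Fixed-size step that uses only the sign of each gradient component."""
    def sign(x):
        return (x > 0) - (x < 0)
    return [w - lr * sign(g) for w, g in zip(weights, grads)]


rng = random.Random(0)

# With the noise switched off, the Langevin step is a plain gradient step.
plain = langevin_step([1.0, 2.0], [0.5, -1.0], lr=0.1, noise_std=0.0, rng=rng)

# With noise, each weight is perturbed around the plain gradient step.
noisy = langevin_step([1.0, 2.0], [0.5, -1.0], lr=0.1, noise_std=0.01, rng=rng)

# Manhattan updating moves every weight by the same amount, whatever the
# gradient magnitude; a zero gradient component leaves the weight unchanged.
manhattan = manhattan_step([1.0, -1.0, 0.0], [3.0, -0.2, 0.0], lr=0.1)
```

Because the Manhattan step ignores gradient magnitude, weights in flat directions of an ill-conditioned error surface move as far per update as weights in steep directions.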

