34 citations found.
Johansson, ?., Dowla, F. and Goodman, ?. (1990). Backpropagation learning for multi-layer feed-forward neural networks using the conjugate gradient method. UCRL-JC-104850, Lawrence Livermore National Laboratory.


This paper is cited in the following contexts:


Making Use of Population Information in Evolutionary Artificial.. - Yao, Liu (1998)   (22 citations)  (Correct)

....an error function (often a mean square error) of ANNs. The so-called learning problem here is a typical optimization problem in numerical analysis. Many improvements on the ANN learning algorithm are actually improvements over optimization algorithms [12] such as conjugate gradient methods [13], [14]. Learning is different from optimization because we want the learned system to have best generalization, which is different from minimizing an error function. The ANN with the minimum error does not necessarily mean that it has best generalization unless there is an equivalence between ....

E. M. Johansson, F. U. Dowla, and D. M. Goodman, "Backpropagation learning for multi-layer feed-forward neural networks using the conjugate gradient method," Int. J. Neural Syst., vol. 2, no. 4, pp. 291--301, 1991.


Evolving Artificial Neural Networks - Yao (1999)   (66 citations)  (Correct)

....this term does not need to be differentiable or even continuous. Weight sharing and weight decay can also be incorporated into the fitness function easily. Evolutionary training can be slow for some problems in comparison with fast variants of BP [131] and conjugate gradient algorithms [19], [132]. However, EAs are generally much less sensitive to initial conditions of training. They always search for a globally optimal solution, while a gradient descent algorithm can only find a local optimum in a neighborhood of the initial solution. For some problems, evolutionary training can be ....

E. M. Johansson, F. U. Dowla, and D. M. Goodman, "Backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method," Int. J. Neural Syst., vol. 2, no. 4, pp. 291--301, 1991.


First and Second-Order Methods for Learning: between Steepest.. - Battiti (1992)   (88 citations)  (Correct)

....1988] for details. Furthermore, summing the momentum term to the one proportional to the negative gradient may produce an ascent direction, so that the error increases after the weight update. Among the researchers using conjugate gradient methods for the MLP are [Barnard and Cole 1988], [Johansson et al. 1990], [Bengio and Moore 1989], [Drago and Ridella 1991], Hinton's group in Toronto, the groups at CMU, Bell Labs, etc. A version in which the one-dimensional minimization is substituted by a scaling of the step that depends on success in error reduction and goodness of a one-dimensional quadratic ....

Johansson, E.M., F.U. Dowla, D.M. Goodman. Backpropagation Learning for Multi-Layer Feed-Forward Neural Networks Using the Conjugate Gradient Method. Lawrence Livermore National Laboratory, Preprint UCRL-JC-104850.


Learning with First, Second, and No Derivatives: a.. - Roberto Battiti.. (1994)   (2 citations)  (Correct)

....Unfortunately, each iteration requires a sizable number of function calls (about 30) and gradient calls (about 25). The two variants do not present statistically significant differences. Some references about the use of conjugate gradient for training in neural networks are, for example, [18], [21] and [2]. 3.4 One Step Secant with Fast Line Search. Computing the exact Hessian requires order O(N^2) operations [8] and order O(N^2) memory to store the Hessian components; in addition, the solution of equation 13 to find the step (or search direction) in Newton's method requires ....

E. M. Johansson, F. U. Dowla and D. M. Goodman, Backpropagation learning for multi-layer feed-forward neural networks using the conjugate gradient method, Preprint UCRL-JC-104850, Lawrence Livermore Nat. Lab. (Sept. 1990).


A Constrained Learning Framework for Feedforward Neural.. - Perantonis, Ampazis..   (Correct)

....in Table 2. We assess the performance of a much wider range of algorithms than those tested in [6], namely the on-line and off-line versions of BP with momentum (BPM), resilient propagation (RPROP) [19], conjugate gradient methods of Fletcher-Reeves (CG-FR) and Polak-Ribière (CG-PR) with restarts [20], the quickprop algorithm of Fahlman (QP) [18] and the Delta-Bar-Delta algorithm of Jacobs (DBD) [17]. For each benchmark problem we performed 50 learning trials starting from different randomly chosen weights in the range -0.5 to 0.5. The maximum number of epochs per trial was set to 1000 and ....

E. M. Johansson, F. U. Dowla and D. M. Goodman, Backpropagation learning for multilayer feedforward networks using the conjugate gradient method. International Journal of Neural Systems 2 (1992) 291-301.


Knowledge Extracted From Trained Neural Networks - Yao (1999)   (66 citations)  (Correct)

....training, this term does not need to be differentiable or even continuous. Weight sharing and weight decay can also be incorporated into the fitness function easily. Evolutionary training can be slow for some problems in comparison with fast variants of BP [131] and conjugate gradient algorithms [19, 132]. However, EAs are generally much less sensitive to initial conditions of training. They always search for a globally optimal solution, while a gradient descent algorithm can only find a local optimum in a neighborhood of the initial solution. For some problems, evolutionary training can be ....

E. M. Johansson, F. U. Dowla, and D. M. Goodman, "Backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method," Int'l J. of Neural Systems, vol. 2, no. 4, pp. 291--301, 1991.


Orthogonal Incremental Learning of a Feedforward Network - Vysniauskas, Groen, Kröse (1995)   (6 citations)  (Correct)

....convergence of the backpropagation algorithm is very slow, on the order of 1=t at best. Numerical optimization technique offers a rich and robust set of methods which can be applied in an attempt to improve learning rates [3] In particular, the conjugate gradient method can be easily implemented [8] and the convergence rate is comparable with computationally expensive second order methods [15] It is an extended version of the paper presented in the proceedings of ICANN 95 conference (see vol. 1, p. 311) held in Paris, October 9 13. The extension covers experimental results with Net ....

E. M. Johansson, F. U. Dowla, and D. M. Goodman. Backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method. International Journal of Neural Systems, 2(4):291--301, 1992.


On the Improvement of the Real Time Recurrent Learning.. - Mak, Ku, Lu (1998)   (Correct)

....the characteristics of the extended Kalman filter methods and the EM algorithm are highlighted. 2.3.1 Second order methods. Research has shown that second order methods such as the conjugate gradient method can improve the learning speed of feedforward networks; see in particular the work of [18,19]. However, extending the same idea to the training of recurrent networks is not as straightforward as one might expect. This is because the conjugate gradient is inherently an off-line method. Despite this limitation, conjugate gradient has been formulated as an epoch based training algorithm for ....

E. M. Johansson, F. U. Dowla, and D. M. Goodman. Backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method. Journal of Neural Systems, 2(4):291--301, 1992.


Binary Decision Clustering for Neural Network Based OCR - Wilson, Grother, Barnes (1994)   (2 citations)  (Correct)

....to the training data. This has been shown to increase the generalization ability of the network [20] Networks of the MLP type are the most commonly used neural nets in use today, and they are usually trained using a backpropagation algorithm [21] A scaled conjugate gradient training method [22, 23, 24, 20] has been used in our research instead of the ubiquitous backpropagation method, training speed gains of an order of magnitude being typical. Neural nets of the RBF type get their name from the fact that they are built from radially symmetric Gaussian functions of the inputs. Actually, the RBF ....

E. M. Johansson, F. U. Dowla, and D. M. Goodman. Backpropagation learning for multi-layer feed-forward neural networks using the conjugate gradient method. IEEE Transactions on Neural Networks, 1991.


On Langevin Updating in Multilayer Perceptrons - Rögnvaldsson (1993)   (Correct)

....errors are E = 0.110 and E = 0.100 for BP and LV respectively. b) Average convergence curves for the first 5,000 epochs. The MH learning rate is η_M = 0.007. The third problem, parity [18], has been used in benchmark studies where Conjugate Gradient (CG) algorithms greatly outperformed BP [17]. I use 4, 5 and 6 dimensional parity problems, with architectures of n inputs, 8 hidden units and one output. Each network is trained for 10,000 epochs or until 100% correct classification. The weights are initialized in [-0.1, 0.1]. A large momentum term is used with α = 0.9. The ....

E. Johansson, F. Dowla and D. Goodman, "Backpropagation Learning for Multilayer Feed-forward Neural Networks using the Conjugate Gradient Algorithm", Int. J. Neur. Syst. 2, 291 (1992)


Learning with First, Second, and No Derivatives: a Case.. - Roberto Battiti, et al. (1994)   (2 citations)  (Correct)

....Unfortunately, each iteration requires a sizable number of function calls (about 30) and gradient calls (about 25). The two variants do not present statistically significant differences. Some references about the use of conjugate gradient for training in neural networks are, for example, [18], [21] and [2]. 3.4 One Step Secant with Fast Line Search. Computing the exact Hessian requires order O(N^2) operations [8] and order O(N^2) memory to store the Hessian components; in addition, the solution of equation 13 to find the step (or search direction) in Newton's method requires O(N ....

E. M. Johansson, F. U. Dowla and D. M. Goodman, Backpropagation learning for multi-layer feed-forward neural networks using the conjugate gradient method, Preprint UCRL-JC-104850, Lawrence Livermore Nat. Lab. (Sept. 1990).


Comparison of Handprinted Digit Classifiers - Grother, al. (1993)   (8 citations)  (Correct)

....data. This has been shown to increase the generalization ability of the network [27]. Networks of the MLP type are the most commonly used neural nets in use today, and they are usually trained using a backpropagation algorithm [28]. A scaled conjugate gradient training method instead [29, 30, 31, 27] has been preferred to the ubiquitous backpropagation method, speed gains of an order of magnitude being typical. Figure 7 shows MLP class regions resulting from varying the first two inputs to a trained 8 input, 48 hidden unit network. Neural nets of the Radial Basis Functions ....

E. M. Johansson, F. U. Dowla, and D. M. Goodman. Backpropagation learning for multi-layer feed-forward neural networks using the conjugate gradient method. IEEE Transactions on Neural Networks, 1991.


Evaluation of Pattern Classifiers for Fingerprint.. - Blue, Candela.. (1993)   (24 citations)  (Correct)

....to the training data. This has been shown to increase the generalization ability of the network [29] Networks of the MLP type are the most commonly used neural nets in use today, and they are usually trained using a backpropagation algorithm [45] A scaled conjugate gradient training method [46, 47, 28, 29] was used in our research instead of the ubiquitous backpropagation method, training speed gains of an order of magnitude being typical. Figure 5 shows MLP class regions resulting from varying the first two inputs to a trained 8 input, 48 hidden unit network. Figure 5: MLP classification and ....

E.M. Johansson, F.U. Dowla, and D.M. Goodman. Backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method. IEEE Transactions on Neural Networks. To be published.


Neural Networks with Adaptive Learning Rate and Momentum Terms - Moreira, Fiesler (1995)   (3 citations)  (Correct)

....on the function and gradient values is obtained which is considered to be the standard conjugate gradient as applied to the training of feed forward neural networks. As it is only briefly outlined here, the derivation and formal description of the conjugate direction principles can be found in [Johansson 92] or [Moller 93]. In this section, the vector containing the summation of the negative gradient vectors for all the pattern presentations in epoch k, -∇E(w_k), will be denoted as g_k. 1. Initialization. The weight vector w_0 is set. The initial direction d_0 is obtained by gradient descent ....

.... η_k d_k. 4. A new direction d_{k+1} is computed. If (k+1 mod W) = 0 then the algorithm is restarted with d_{k+1} = g_{k+1}. Otherwise d_{k+1} = g_{k+1} + β_k d_k. 5. If the minimum was reached then the search is terminated. Otherwise, a new iteration is performed: k = k+1 and jump to step 2. (Footnote 4) See [Johansson 92] for details on its derivation. (Footnote 5) The β_k parameter is always calculated in order to force the consecutive directions to be conjugate. Different formulas exist to calculate it, such as the following: Fletcher-Reeves: β_k = (g_{k+1}^T g_{k+1}) / (g_k^T g_k). Polak-Ribière: β_k = g_{k+1}^T [g_{k+1} ....

E.M. Johansson, F.U. Dowla, and D.M. Goodman. "Backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method." International Journal of Neural Systems (ISSN: 0129-0657), volume 2, number 4, pages 291--301. World Scientific Publishing Company, 1992.
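
The conjugate gradient steps quoted in the Moreira and Fiesler excerpt above (steepest-descent start, restart every W iterations, Fletcher-Reeves update) can be illustrated with a minimal sketch. It is not the cited authors' implementation: it assumes a quadratic error E(w) = 0.5 w^T A w - b^T w with A symmetric positive definite, so that the line minimisation has a closed form, and the names `conjugate_gradient`, `restart_every` and `tol` are hypothetical. As in the excerpt, g denotes the negative gradient.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, w0, restart_every=None, tol=1e-20, max_iter=100):
    """Minimise E(w) = 0.5 w^T A w - b^T w (assumed quadratic, A symmetric positive definite)."""
    w = list(w0)
    W = restart_every or len(w)  # restart period; defaults to the number of weights
    # g is the NEGATIVE gradient: -(A w - b) = b - A w
    g = [bi - awi for bi, awi in zip(b, matvec(A, w))]
    d = list(g)  # initial direction: steepest descent
    for k in range(max_iter):
        gg = dot(g, g)
        if gg < tol:  # squared gradient norm is negligible: minimum reached
            break
        Ad = matvec(A, d)
        eta = dot(g, d) / dot(d, Ad)  # exact line minimisation, valid for a quadratic only
        w = [wi + eta * di for wi, di in zip(w, d)]
        g_new = [gi - eta * adi for gi, adi in zip(g, Ad)]
        if (k + 1) % W == 0:
            d = list(g_new)  # periodic restart with the negative gradient
        else:
            beta = dot(g_new, g_new) / gg  # Fletcher-Reeves
            d = [gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return w


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
w = conjugate_gradient(A, b, [2.0, 1.0])  # approaches the solution of A w = b, (1/11, 7/11)
```

For a neural network error function the closed-form step `eta` is not available and would be replaced by a numerical line search, which is the costly part that the scaled conjugate gradient variants mentioned elsewhere on this page try to avoid.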


Efficient Training of Feed-Forward Neural Networks - Møller (1997)   (Correct)

....version of this paper. A.1 Abstract. A supervised learning algorithm (Scaled Conjugate Gradient, SCG) is introduced. The performance of SCG is benchmarked against that of the standard backpropagation algorithm (BP) [Rumelhart et al. 86], the conjugate gradient algorithm with line search (CGL) [Johansson et al. 91] and the one-step Broyden-Fletcher-Goldfarb-Shanno memoryless quasi-Newton algorithm (BFGS) [Battiti 89]. SCG is fully automated, includes no user dependent parameters and avoids a time consuming line search, which CGL and BFGS use in each iteration in order to determine an appropriate step ....

.... methods, called the Conjugate Gradient Methods, are well suited to handle large scale problems in an effective way [Hestenes and Stiefel 52], [Fletcher 75], [Gill et al. 81], [Powell 77]. Several conjugate gradient algorithms have recently been introduced as learning algorithms in neural networks [Johansson et al. 91], [Battiti 89], [Møller 90b]. Johansson, Dowla and Goodman describe the theory of general conjugate gradient methods and how to apply the methods in feed forward neural networks. They conclude that the standard conjugate gradient method with line search (CGL) is an order of magnitude faster than ....

[Article contains additional citation context not shown here]

E.M. Johansson, F.U. Dowla and D.M. Goodman (1991), Backpropagation Learning for Multi-Layer Feed-Forward Neural Networks Using the Conjugate Gradient Method, International Journal of Neural Systems, Vol. 2, No. 4, pp. 291-301.


A Conjugate Gradient Learning Algorithm for Recurrent Neural.. - Chang, Mak (1998)   (Correct)

....function to be optimized, they are typically referred to as second order methods. Over the years, a number of second order methods have been proposed [2]. In particular, the conjugate gradient method is commonly used in training BP networks due to its speed and simplicity. It has been shown [4], [5], [6], [7] that training time can be significantly reduced when feedforward networks are trained by second order methods. Recurrent neural networks (RNNs), which include feedback loops (connections by which a node's prior output influences its subsequent output), are capable of processing ....

.... (17) where e_k(n) is the error at the kth output neuron for the nth pattern and p_ijk(n) is computed as in (10). Since the CGRL algorithm requires both error function and gradient to be evaluated, the calculations should be performed together to maximize efficiency. Johansson et al. [5] proposed to use the conjugate gradient method to train backpropagation networks. We extend their idea to recurrent networks as follows. 1. A starting point, w_0, is selected by initializing the weights between -0.5 and 0.5 randomly. The gradient g_0 at this point is computed (as in (17)) and an ....

[Article contains additional citation context not shown here]

E. M. Johansson, F. U. Dowla, and D. M. Goodman. Backpropagation Learning for Multilayer Feed-Forward Neural Networks Using the Conjugate Gradient Method. International Journal of Neural Systems, vol. 2, no. 4, 1992, p.291 - p.301.


Using the Discriminability Based Transfer Algorithm to Selectively .. - Pratt   (2 citations)  (Correct)

....to neural networks instead. CG has been widely validated in the field of unconstrained nonlinear optimization [Powell, 1977, Fletcher, 1980]. Recent studies have shown that a variety of CG methods are much faster than back-propagation learning [Kramer and Sangiovanni-Vincentelli, 1989, Johansson et al. 1991, Barnard, 1992]. CG is also used on a regular basis in several research groups (cf. [Barnard and Cole, 1989, Nowlan and Hinton, 1992]). Since the goal of transfer is to speed up learning, it makes sense to base a transfer method on the fastest existing algorithm currently reliably used. ....

E. M. Johansson, F. U. Dowla, and D. M. Goodman. Backpropagation learning for multi-layer feed-forward neural networks using the conjugate gradient method. International Journal of Neural Systems, 2(4):291--301, 1991.


On Langevin Updating in Multilayer Perceptrons - Rögnvaldsson (1993)   (Correct)

....BP catches up with LV as the errors reach their lower limit (fig. 2c). By this time the noise level in LV is zero and there is no difference between LV and BP. The third problem, n parity, has previously been used in benchmark studies where Conjugate Gradient algorithms greatly outperformed BP [21]. We study 4, 5 and 6 dimensional parity problems, with the architecture n inputs, 8 hidden units and one output. Using more hidden units than inputs decreases the risk of getting stuck in a local minimum. Each network is trained for 10000 epochs or until 100% correct classification. The weights ....

Johansson, E., Dowla, F., and Goodman, D. 1992. Backpropagation Learning for Multilayer Feed-forward Neural Networks using the Conjugate Gradient Algorithm. Int. J. Neur. Syst. 2, 291-301.


Karhunen Loève Feature Extraction For Neural Handwritten.. - Grother (1992)   (2 citations)  (Correct)

....that convergence is slow [3] and that there are, in the usual implementation [16], two adjustable parameters, η and α, that have to be manually optimized for the particular problem. Conjugate gradient methods have been used for many years [5] for minimizing functions, and have recently [8] been discovered by the neural network community. The usual methods require an expensive line search or its equivalent. Møller [13] has introduced a scaled conjugate gradient method; instead of a line search, an estimate of the second derivative along the search direction is used to find the ....

E. M. Johansson, F. U. Dowla, and D. M. Goodman. Backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method. IEEE Transactions on Neural Networks, 1991.


Exploiting Population Information in Evolutionary Learning - Yao, Liu, Darwen   (Correct)

....an error function (often a mean square error) of ANNs. The so-called learning problem here is a typical optimisation problem in numerical analysis. Many improvements on the ANN learning algorithm are actually improvements over optimisation algorithms [3] such as conjugate gradient methods [4, 5]. Learning is different from optimisation because we want the learned system to have best generalisation, which is different from minimising an error function. An ANN with the minimum error does not necessarily mean that the ANN has best generalisation unless there is an accurate way to measure ....

E. M. Johansson, F. U. Dowla, and D. M. Goodman. Backpropagation learning for multi-layer feed-forward neural networks using the conjugate gradient method. Int'l J. of Neural Systems, 2(4):291--301, 1991.


Implementing the Conjugate Gradient Algorithm in a Functional.. - Serrarens (1996)   (3 citations)  (Correct)

....4 we compare our implementation in Clean with implementations in the languages C and Haskell, while Section 5 concludes. 1 The Conjugate Gradient Algorithm. An algorithm often used in scientific computing for solving a system of linear equations is the conjugate gradient algorithm [CSJ95], [EO94], [JDG92], [HS93]. The algorithm itself is an iterative approximation method to solve the linear equation Ax = b. A direct method is Gaussian elimination. But only when working with symbolic computation, which provides arbitrary precision integers, can we get an exact solution. Functional languages ....

E.M. Johansson, F.U. Dowla, and D.M. Goodman. Backpropagation learning for multi-layer feed-forward neural networks using the conjugate gradient method. International Journal of Neural Systems, 2(4), 1992.


Statistical Factors in Behaviour Learning - Chris Thornton Cognitive   (Correct)

No context found.

Johansson, ?., Dowla, F. and Goodman, ?. (1990). Backpropagation learning for multi-layer feed-forward neural networks using the conjugate gradient method. UCRL-JC-104850, Lawrence Livermore National Laboratory.


Truncated-Newton Training Algorithm for.. - Al-Haik, Garmestani..   (Correct)

No context found.

F. D. E.M. Johansson, D. Goodman, Backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method, Int. J. Neural Systems 2 (4) (1991) 291.


Evaluation and Improvement of Two Training Algorithms - Tae-Hoon Kim Jiang   (Correct)

No context found.

E.M. Johansson, F.U. Dowla and D.M. Goodman, Backpropagation learning for multilayer feed-forward neural networks using the conjugate gradient method, International Journal of Neural Systems, vol. 2, no. 4,

