13 citations found.
M. Bianchini, M. Gori, and M. Maggini, "On the Problem of Local Minima in Recurrent Neural Networks," IEEE Transactions on Neural Networks, Special Issue on Dynamic Recurrent Neural Networks, March 1994, pp. 167--177.


This paper is cited in the following contexts:
Trajectory Generation and Modulation Using Dynamic Neural.. - Zegers, Sundareshan (2003)

....degree of success with which neural networks have been used in static problems. It has been proven that a RNN can approximate any known dynamical system [16] and several techniques for training RNNs have been developed [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38], [39], [40]. A subset of these works directed to the trajectory generation problem have shown that a RNN can indeed be trained to produce desired trajectory behavior, and have demonstrated the success of their training algorithms in generating ....

....techniques that can solve the trajectory generation problem for arbitrarily complex trajectories with acceptable computational complexity. This inherent difficulty is explained by the fact that the solution spaces are plagued with local minima and it is not always possible to find global solutions [30]. Also, gradient descent approaches do not work well because the gradients tend to vanish as the dynamics of the neural networks evolve [31]. To circumvent this problem, other approaches not based on gradient descent [32], [33], [34], [35] have been recently proposed. Still, they cannot provide ....

M. Bianchini, M. Gori, and M. Maggini, "On the problem of local minima in recurrent neural networks," IEEE Trans. on Neural Networks, vol. 5, no. 2, pp. 167-177, March 1994.


New Machine Learning Methods for the Prediction Of.. - Baldi, Pollastri.. (2002)

....propagation algorithms for gradient computation. In practice however, it is not always trivial to get gradient descent learning procedures to work well in recurrent networks: error gradients can vanish rapidly as a function of time [11] and learning procedures can become stuck in poor local minima [13]. Another important factor in the architectures we are considering is the competition collaboration tradeoff between the hidden DAGs and their NN equivalents. Especially in homogeneous GRNNs where all the hidden components are equivalent, when the system is initialized with small weights there is ....

M. Bianchini, M. Gori, and M. Maggini. On the problem of local minima in recurrent neural networks. IEEE Transactions on Neural Networks 5(2):167--177, 1994.


Polynomial Softmax Functions for Pattern Classification - Tuerk, Young (2001)

....also means that such a model shares the same problems as all neural networks. For instance, it is not clear if the error surface has any local minima which correspond to suboptimal choices of parameters, or if there exists a global minimum at all. This problem has, for instance, been discussed in [2]. This paper presents an alternative to the combination of general neural networks and softmax functions in which independent polynomials or more general linear combinations of non linear functions are substituted for the network outputs. In [5] an example has been discussed where the input to ....

M. Bianchini, M. Gori, and M. Maggini. On the Problem of Local Minima in Recurrent Neural Networks. IEEE Transactions on Neural Networks, 5(2):167-177, 1994.


A Study of the Lamarckian Evolution of Recurrent Neural Networks - Ku, Mak, Siu (1999)

....Recurrent neural networks (RNNs) have closed paths in their topology that enable them to preserve their past states. Therefore, RNNs have the capability of dealing with spatiotemporal tasks that have been found to be difficult for feedforward networks [23]. Bianchini et al. [6] observed that the cost function of a feedforward network for any learning task is closely related to that of an equivalent RNN. As a result, any occurrence of local optima in the feedforward network can also be found in the equivalent RNN case. However, an RNN could have additional local ....

....is closely related to that of an equivalent RNN. As a result, any occurrence of local optima in the feedforward network can also be found in the equivalent RNN case. However, an RNN could have additional local optima that may not exist in the feedforward network. Therefore, Bianchini et al. [6] argue that local optima occur more frequently in RNNs and that the training of RNNs is more difficult. However, this difficulty could be overcome by combining the efforts of local search (learning) and evolutionary search as they could complement each other. There are two approaches to embedding ....

M. Bianchini, M. Gori, and M. Maggini. On the problem of local minima in recurrent neural networks. IEEE Transactions on Neural Networks, 5(2):167--177, 1994.


Global Search Methods For Solving Nonlinear Optimization Problems - Shang (1997)   (6 citations)

....heuristic methods that have fast learning speed include methods that learn layer by layer [70], iterative methods [9], hybrid learning algorithms [97], and methods developed from the field of optimal filtering [213, 233]. Recurrent neural networks have also been trained by gradient based methods [26, 179, 197, 274]. Local minimization algorithms have difficulties when the surface is flat (gradient close to zero), when gradients can be in a large range, or when the surface is very rugged. When gradients can vary greatly, local search may progress too slowly when the gradient is small and may overshoot ....

M. Bianchini, M. Gori, and M. Maggini. On the problem of local minima in recurrent neural networks. IEEE Transactions on Neural Networks, 5(2):167--177, March 1994.


Financial Time Series Forecasting Using K-Nearest Neighbors - Classification Maggini Giles   Self-citation (Maggini)

No context found.

M. Bianchini, M. Gori, and M. Maggini, "On the Problem of Local Minima in Recurrent Neural Networks," IEEE Transactions on Neural Networks, Special Issue on Dynamic Recurrent Neural Networks, March 1994, pp. 167--177.


Optimal Learning in Artificial Neural Networks: A Theoretical.. - Bianchini, Gori   (1 citation)  Self-citation (Bianchini Gori)

....regardless of the dynamic relationships among frames within the sequences. The following theorem gives a first insight on the role of the input structure. Theorem 6: If rank X_0 = F, then the cost function E^LMS_T(w^1_{i,j}; w^0_{k,l}; N; L_e) has no local minima. Proof: see [66]. This condition is hardly met in practice since it requires the adoption of networks with an exaggerated number of inputs. In particular, the condition is likely to hold provided that n(0) F - 1. On the other hand, this theorem does not fully exploit the network structure and the stated ....

....minima if the network N and the learning environment L_e satisfy the following hypotheses: • Network. The matrix W_1 is composed of non negative weights. • Learning environment. All the frames of L_e are linearly separable into two classes depending on the token they belong to. Proof: see [66]. Remark 8: Network architecture. In practice, the assumption on the sign of the w^1_{i,j} is not restrictive. In fact, for the case of symmetric squashing functions, a mapping from a general network, with no constraints on .... (Footnote 16: The proof of this theorem, as well as that of Theorem 7, follows ....)

[Article contains additional citation context not shown here]

M. Bianchini, M. Gori, and M. Maggini, "On the problem of local minima in recurrent neural networks," IEEE Transactions on Neural Networks, vol. 5, pp. 167--177, March 1994. Special Issue on Recurrent Neural Networks.


Representation of Finite State Automata in Recurrent.. - Frasconi, Gori.. (1996)   (25 citations)  Self-citation (Gori Maggini)

....1989) have been proposed which do not report serious convergence problems. Some attempts to understand the theoretical reasons for the successes and failures of supervised learning schemes have been carried out which explain when such schemes are likely to succeed in discovering optimal solutions (Bianchini et al. 1994; Gori & Tesi, 1992; Yu, 1992) and to generalize to new examples (Baum & Haussler, 1989). These results give some theoretical foundations to learning from tabula rasa configurations, but unfortunately, the conditions they provide for optimal convergence and for generalization are quite limited in ....

....(α) 0 if α 0; and ' stands for differentiation with respect to α. This threshold LMS error has been introduced by Sontag & Sussman (1989). This cost does not penalize outputs beyond the target values. It is very well suited both for theoretical analyses and practical applications (Bianchini et al. 1994). As for other recurrent networks (e.g. first and second order recurrent networks), the learning is based on the optimization of that function. The gradient of the cost can be computed by following approximate schemes like the one proposed by Elman (1990), or exact schemes based on ....

Bianchini, M., Gori, M., and Maggini, M. (1994). On the problem of local minima in recurrent neural networks. IEEE Transactions on Neural Networks, 5(2):167--177. Special Issue on Dynamic Recurrent Neural Networks.


Terminal attractor algorithms: A critical analysis - Bianchini, Fanelli, Gori.. (1997)   Self-citation (Bianchini Gori Maggini)

No context found.

M. Bianchini, M. Gori, and M. Maggini, "On the problem of local minima in recurrent neural networks," IEEE Transactions on Neural Networks, vol. 5, pp. 167--177, March 1994. Special Issue on Recurrent Neural Networks.


Recurrent Neural Networks and Prior Knowledge for Sequence.. - Paolo Frasconi (1995)   (9 citations)  Self-citation (Gori)

....to substantially reduce the number of training examples required to achieve a satisfactory level of generalization [9]. In addition to the sample complexity, adaptive systems may also face difficulties due to the computational complexity. Analyses on the problem of local minima in the cost surface [10, 11], for example, suggest that learning algorithms such as backpropagation may fail to discover optimal solutions for highly structured problems. These feelings are theoretically confirmed by results showing the NP completeness of the loading problem [12, 13]. As far as recurrent networks are ....

M. Bianchini, M. Gori, and M. Maggini, "On the problem of local minima in recurrent neural networks," IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 167--177, 1994.


Optimal Convergence of On-Line Backpropagation - Gori, Maggini   Self-citation (Gori Maggini)

....a clear theoretical foundation. Recently, some efforts have been made to understand the behavior of batch mode Backpropagation by the analysis of the shape of the error surfaces. In particular, the emphasis has been placed on the problem of local minima and on conditions that guarantee their absence [2, 3, 4, 5, 6, 7]. To the best of our knowledge, however, no attempt has been made to investigate the optimal convergence of on-line Backpropagation, that is used successfully in many practical applications. As earlier pointed out in [8], the on-line updating departs to some extent from the true gradient ....

.... 0 if α 0; (6) and ' stands for differentiation with respect to α. This threshold LMS error has been introduced by Sontag and Sussman in [3]. This cost does not penalize outputs beyond the target values. It is very well suited both for theoretical analyses and practical applications [3, 7]. After learning has taken place, a congruent test phase for pattern classification is based on the thresholding criteria: q ∈ C if x_o(q) d - ε and q ∈ C^- if x_o(q) d - ε, where ε is a positive threshold. The gradient of the function E(N; L_e) can be ....

M. Bianchini, M. Gori, and M. Maggini, "On the problem of local minima in recurrent neural networks," IEEE Transactions on Neural Networks, vol. 5, pp. 167--177, March 1994.


Scheduling of Modular Architectures for Inductive Inference .. - Gori, Maggini, Soda (1994)   (6 citations)  Self-citation (Gori Maggini)

....Finally, the effect of the constraints was that of improving the performance (100 for the constrained net vs. 96.0 for the unconstrained one). III. Scheduling the activation updating in modular architecture. Learning in recurrent networks may be seriously plagued by the presence of local minima [Bianchini et al. 1994], by information loss due to large plateaus, and by bifurcations in the weight space learning trajectory [Doya, 1993]. The last two problems are essentially due to the need of dealing with long term dependencies and seem to affect very seriously recent attempts to deal with inductive inference of ....

Bianchini, M., Gori, M., & Maggini, M. (1994). On the Problem of Local Minima in Recurrent Neural Networks. IEEE Transactions on Neural Networks, Special Issue on Dynamic Recurrent Neural Networks, March 1994.


Apprentissage Dans Les Réseaux Récurrents Pour La Modélisation.. - Szilas (1995)

No context found.

M. Bianchini, M. Gori & M. Maggini. On the Problem of Local Minima in Recurrent Neural Networks. IEEE Trans. on Neural Networks, 5(2), p. 167-177, March 1994.


CiteSeer.IST - Copyright Penn State and NEC