Abstract: Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These...
Cited by:
Recurrent Autoassociative - Networks Developing Distributed
Pro-active agents - With Recurrent Neural
Rule Extraction from Recurrent Neural Networks: A Taxonomy and.. - Jacobsson (2005)
Similar documents (at the sentence level):
27.9%: Recurrent Neural Networks for Adaptive Temporal Processing - Bengio, Frasconi, Gori, Soda (1993)
Active bibliography (related documents):
0.5: Training Multilayer Networks with Discrete Activation .. - Plagianakos..
0.4: Input/Output HMMs for Sequence Processing - Bengio, Frasconi (1995)
0.3: Dynamic Recurrent Neural Networks - Pearlmutter (1990)
Similar documents based on text:
0.3: An EM Algorithm for Asynchronous Input/Output Hidden Markov.. - Bengio, Bengio (1996)
0.2: An EM Approach to Learning Sequential Behavior - Bengio, Frasconi (1994)
0.2: Diffusion of Context and Credit Information in Markovian Models - Bengio (1995)
Related documents from co-citation:
36: Finding structure in time - Elman - 1990
20: The induction of dynamical recognizers - Pollack - 1991
19: A learning algorithm for continually running fully recurrent neural networks - Williams, Zipser - 1989
BibTeX entry:
Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5 (2), 157--166. http://citeseer.ist.psu.edu/bengio94learning.html
@article{ bengio94learning,
author = "Yoshua Bengio and Patrice Simard and Paolo Frasconi",
title = "Learning Long-Term Dependencies with Gradient Descent is Difficult",
journal = "IEEE Transactions on Neural Networks",
volume = "5",
number = "2",
month = "March",
pages = "157--166",
year = "1994",
url = "citeseer.ist.psu.edu/bengio94learning.html" }
Citations (may not include all citations):
1527: Optimization by simulated annealing - Kirkpatrick, Gelatt et al. - 1983
1491: Learning internal representation by error propagation - Rumelhart, Hinton et al. - 1986
482: Iterative Solution of Non-linear Equations in Several Variab.. - Ortega, Rheinboldt - 1960
49: Global Optimization of a Neural Network - Hidden Markov Mode.. - Bengio, De Mori et al. - 1992
42: Induction of multiscale temporal structure - Mozer - 1992
42: Local Feedback Multilayered Networks - Frasconi, Gori et al. - 1992
39: Minimizing Multimodal Functions of Continuous Variables with.. - Corana, Marchesi et al. - 1987
37: Improving the convergence of back-propagation learning with .. - Becker, Le Cun - 1988
36: A focused back-propagation algorithm for temporal pattern re.. - Mozer - 1989
33: Unified Integration of Explicit Rules and Learning by Exampl.. - Frasconi, Gori et al.
26: Inserting Rules into Recurrent Neural Networks - Giles, Omlin - 1992
12: The problem of learning long-term dependencies in recurrent .. - Bengio, Frasconi et al. - 1993
9: Learning Processes in an Asymmetric Threshold Network - Le Cun - 1986
6: Artificial Neural Networks and their Application to Sequence.. - Bengio - 1991
6: A first look at phonetic discrimination using connectionist .. - Kuhn - 1987
6: BPS: a learning algorithm for capturing the dynamic nature o.. - Gori, Bengio et al. - 1989
6: The development of the Time-Delay Neural Network architectur.. - Lang, Hinton - 1988
6: Nonlinear Dynamics and Stability of Analog Neural Networks - Marcus, Waugh et al. - 1991
4: A learning algorithm for continuously running fully recurren.. - Williams, Zipser - 1989
4: Moving Targets' Training Algorithm - The - 1990
3: Learning by choice of internal representation - Grossman, Meir et al.
2: Using Random Weights to train Multilayer Networks of Hard-Li.. - Bartlett, Downs - 1992
1: A Method of Training Multi-layer Networks with Heaviside Cha.. - Gaynier, Downs - 1993
Documents on the same site (http://www-dii.ing.unisi.it/research/neural/tech-rep.html):
An EM Approach to Learning Sequential Behavior - Bengio, Frasconi (1994)
Learning in Multilayered Networks Used as Autoassociators - Bianchini, Frasconi, Gori (1995)
Successes And Failures Of Backpropagation: A Theoretical.. - Frasconi, Gori, Tesi