Learning Long-Term Dependencies with Gradient Descent is Difficult (1994)  (132 citations)
Yoshua Bengio, Patrice Simard, Paolo Frasconi
IEEE Transactions on Neural Networks





Abstract: Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production, or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient-based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These...

Cited by:
Recurrent Autoassociative - Networks Developing Distributed
Pro-active agents - With Recurrent Neural
Rule Extraction from Recurrent Neural Networks: A Taxonomy and.. - Jacobsson (2005)

Similar documents (at the sentence level):
27.9%:   Recurrent Neural Networks for Adaptive Temporal Processing - Bengio, Frasconi, Gori, Soda (1993)

Active bibliography (related documents):
0.5:   Training Multilayer Networks with Discrete Activation .. - Plagianakos..
0.4:   Input/Output HMMs for Sequence Processing - Bengio, Frasconi (1995)
0.3:   Dynamic Recurrent Neural Networks - Pearlmutter (1990)

Similar documents based on text:
0.3:   An EM Algorithm for Asynchronous Input/Output Hidden Markov.. - Bengio, Bengio (1996)
0.2:   An EM Approach to Learning Sequential Behavior - Bengio, Frasconi (1994)
0.2:   Diffusion of Context and Credit Information in Markovian Models - Bengio (1995)

Related documents from co-citation:
36:   Finding structure in time - Elman - 1990
20:   The induction of dynamical recognizers - Pollack - 1991
19:   A learning algorithm for continually running fully recurrent neural networks - Williams, Zipser - 1989

BibTeX entry:

Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5 (2), 157--166. http://citeseer.ist.psu.edu/bengio94learning.html

@article{ bengio94learning,
    author = "Yoshua Bengio and Patrice Simard and Paolo Frasconi",
    title = "Learning Long-Term Dependencies with Gradient Descent is Difficult",
    journal = "IEEE Transactions on Neural Networks",
    volume = "5",
    number = "2",
    month = "March",
    pages = "157--166",
    year = "1994",
    url = "citeseer.ist.psu.edu/bengio94learning.html" }

Citations (may not include all citations):
1527   Optimization by simulated annealing - Kirkpatrick, Gelatt et al. - 1983
1491   Learning internal representations by error propagation - Rumelhart, Hinton et al. - 1986
482   Iterative Solution of Non-linear Equations in Several Variab.. - Ortega, Rheinboldt - 1960
49   Global Optimization of a Neural Network - Hidden Markov Mode.. - Bengio, De Mori et al. - 1992
42   Induction of multiscale temporal structure - Mozer - 1992
42   Local Feedback Multilayered Networks - Frasconi, Gori et al. - 1992
39   Minimizing Multimodal Functions of Continuous Variables with.. - Corana, Marchesi et al. - 1987
37   Improving the convergence of back-propagation learning with .. - Becker, Le Cun - 1988
36   A focused back-propagation algorithm for temporal pattern re.. - Mozer - 1989
33   Unified Integration of Explicit Rules and Learning by Exampl.. - Frasconi, Gori et al.
26   Inserting Rules into Recurrent Neural Networks - Giles, Omlin - 1992
12   The problem of learning long-term dependencies in recurrent .. - Bengio, Frasconi et al. - 1993
9   Learning Processes in an Asymmetric Threshold Network - Le Cun - 1986
6   Artificial Neural Networks and their Application to Sequence.. - Bengio - 1991
6   A first look at phonetic discrimination using connectionist .. - Kuhn - 1987
6   BPS: a learning algorithm for capturing the dynamic nature o.. - Gori, Bengio et al. - 1989
6   The development of the Time-Delay Neural Network architectur.. - Lang, Hinton - 1988
6   Nonlinear Dynamics and Stability of Analog Neural Networks - Marcus, Waugh et al. - 1991
4   A learning algorithm for continuously running fully recurren.. - Williams, Zipser - 1989
4   The 'Moving Targets' Training Algorithm - Rohwer - 1990
3   Learning by choice of internal representation - Grossman, Meir et al.
2   Using Random Weights to train Multilayer Networks of Hard-Li.. - Bartlett, Downs - 1992
1   A Method of Training Multi-layer Networks with Heaviside Cha.. - Gaynier, Downs - 1993





Documents on the same site (http://www-dii.ing.unisi.it/research/neural/tech-rep.html):
An EM Approach to Learning Sequential Behavior - Bengio, Frasconi (1994)
Learning in Multilayered Networks Used as Autoassociators - Bianchini, Frasconi, Gori (1995)
Successes And Failures Of Backpropagation: A Theoretical.. - Frasconi, Gori, Tesi
