
Long Short-Term Memory (1996)  (78 citations)
Sepp Hochreiter, Jürgen Schmidhuber
Neural Computation




Abstract: Learning to store information over extended time intervals via recurrent backpropagation takes a very long time, mostly due to insufficient, decaying error back flow. We briefly review Hochreiter's 1991 analysis of this problem, then address it by introducing a novel, efficient method called "Long Short-Term Memory" (LSTM). LSTM can learn to bridge time lags in excess of 1000 steps by enforcing constant error flow through "constant error carrousels" within special units. Multiplicative gate...
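The contrast the abstract draws between decaying error back flow and constant error flow can be illustrated with a little arithmetic. The following is a minimal sketch, not the authors' algorithm: it assumes a single self-connected unit whose per-step back-flow factor is constant, uses 0.25 (the maximum derivative of the logistic sigmoid) with an assumed weight of 1.0 for the conventional unit, and uses an identity activation with a fixed self-weight of 1.0 for the carrousel-style unit. The function name `backflow_scale` and both factor names are hypothetical.

```python
def backflow_scale(local_factor: float, steps: int) -> float:
    """Scale applied to an error signal after it flows back `steps` time
    steps through one self-connected unit with a constant per-step factor."""
    return local_factor ** steps


# Conventional unit (assumption): the logistic sigmoid derivative is at most
# 0.25, and the recurrent weight is taken to be 1.0.
sigmoid_factor = 0.25 * 1.0

# Carrousel-style unit (assumption): identity activation (derivative 1.0)
# and a fixed self-connection weight of 1.0.
cec_factor = 1.0 * 1.0

if __name__ == "__main__":
    for steps in (1, 10, 100, 1000):
        print(
            steps,
            backflow_scale(sigmoid_factor, steps),
            backflow_scale(cec_factor, steps),
        )
```

Under these assumptions the first factor shrinks geometrically with the time lag, while the second stays at exactly 1.0 for any lag, which is the sense in which the error flow is "constant".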

Cited by:
Pro-active agents - With Recurrent Neural
Classifying unprompted speech by retraining - Lstm Nets Nicole
Evolino for Recurrent Support Vector Machines - Schmidhuber, al. (2005)

Similar documents (at the sentence level):
6.0%:   Long Short-term Memory - Hochreiter, Schmidhuber (1995)

Active bibliography (related documents):
1.0:   Bridging Long Time Lags by Weight Guessing and.. - Sepp Hochreiter.. (1996)
0.7:   Recurrent Neural Net Learning and Vanishing Gradient - Of
0.6:   International Journal of Uncertainty, Fuzziness and.. - World Scientific..

Similar documents based on text:
1.1:   Gradient Flow in Recurrent Nets: the Difficulty of Learning .. - Hochreiter, Bengio
0.9:   Learning to Forget: Continual Prediction with LSTM - Gers, Schmidhuber, Cummins (1999)
0.8:   Feature extraction through LOCOCODE - Sepp Hochreiter Fakultat (1998)

Related documents from co-citation:
31:   Learning to forget: Continual prediction with LSTM - Gers, Schmidhuber et al. - 1999
28:   Learning long-term dependencies with gradient descent is difficult - Bengio, Simard et al. - 1994
24:   Gradient calculations for dynamic recurrent neural networks: A survey - Pearlmutter - 1995

BibTeX entry:

Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735--1780. http://citeseer.ist.psu.edu/hochreiter96long.html

@article{ hochreiter97long,
    author = "Sepp Hochreiter and J{\"u}rgen Schmidhuber",
    title = "Long Short-Term Memory",
    journal = "Neural Computation",
    volume = "9",
    number = "8",
    pages = "1735--1780",
    year = "1997",
    url = "http://citeseer.ist.psu.edu/hochreiter96long.html" }

Citations (may not include all citations):
644   Finding structure in time - Elman - 1988
240   Advances in Neural Information Processing Systems - Cowan, Tesauro et al. - 1994
116   Finite-state automata and simple recurrent networks - Cleeremans, Servan-Schreiber et al. - 1989
108   Induction of finite-state languages using second-order recur.. - Watrous, Kuhn - 1992
104   Learning state space trajectories in recurrent neural networ.. - Pearlmutter - 1989
75   Gradient calculations for dynamic recurrent neural networks:.. - Pearlmutter - 1995
73   A time-delay neural network architecture for isolated word r.. - Lang, Waibel et al. - 1990
67   The utility driven dynamic error propagation network - Robinson, Fallside - 1987
62   An efficient gradient-based algorithm for on-line training o.. - Williams, Peng - 1990
56   Gradient-based learning algorithms for recurrent networks an.. - Williams, Zipser - 1992
45   Advances in Neural Information Processing Systems - Lippmann, Moody et al. - 1989
43   Neurocontrol of nonlinear dynamical systems with Kalman filt.. - Puskorius, Feldkamp - 1994
30   Untersuchungen zu dynamischen neuronalen Netzen - Hochreiter - 1991
28   extended sequences using the principle of history compressio.. - Schmidhuber
25   Credit assignment through time: Alternatives to backpropagat.. - Bengio, Frasconi - 1994
24   Experimental comparison of the effect of order in recurrent .. - Miller, Giles - 1993
24   Complexity of exact gradient computation algorithms for recu.. - Williams - 1989
23   Learning sequential structures with the real-time recurrent .. - Smith, Zipser - 1989
21   Learning unambiguous reduced sequence descriptions - Schmidhuber
15   Learning sequential tasks by incrementally adding higher ord.. - Ring - 1993
14   LSTM can solve hard long time lag problems - Hochreiter, Schmidhuber - 1997
13   Guessing can outperform many long time lag algorithms - Schmidhuber, Hochreiter - 1996
12   Zielfunktionen und Kettenregel - Schmidhuber - 1993
11   The recurrent cascade-correlation learning algorithm - Fahlman - 1991
8   Time warping invariant neural networks - Sun, Chen et al. - 1993
8   A focused back-propagation algorithm for temporal sequence r.. - Mozer - 1989
5   Induction of multiscale temporal structure - Systems, Mozer - 1992
4   Language induction by phase transition in dynamical recogniz.. - Pollack - 1991
1   time complexity learning algorithm for fully recurrent conti.. - Science, Schmidhuber et al.
1   A local learning algorithm for dynamic feedforward and recur.. - CUED, TR et al. - 1989
1   Holographic recurrent networks - on, Networks et al. - 1993


