  LSTM can solve hard long time lag problems (1997) [16 citations — 7 self]

by Jurgen Schmidhuber
Advances in Neural Information Processing Systems 9
ftp://ftp.idsia.ch/pub/juergen/lstmnips97.ps.gz

Abstract:

Standard recurrent nets cannot deal with long minimal time lags between relevant signals. Several recent NIPS papers propose alternative methods. We first show: problems used to promote various previous algorithms can be solved more quickly by random weight guessing than by the proposed algorithms. We then use LSTM, our own recent algorithm, to solve a hard problem that can neither be quickly solved by random search nor by any other recurrent net algorithm we are aware of.

1 TRIVIAL PREVIOUS LONG TIME LAG PROBLEMS

Traditional recurrent nets fail in case of long minimal time lags between input signals and corresponding error signals [7, 3]. Many recent papers propose alternative methods, e.g., [16, 12, 1, 5, 9]. For instance, Bengio et al. investigate methods such as simulated annealing, multi-grid random search, time-weighted pseudo-Newton optimization, and discrete error propagation [3]. They also propose an EM approach
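
The random weight guessing baseline named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-unit tanh network, the toy "remember the first input" task, the weight range, and the names `run_net`, `make_task`, `solves`, and `random_weight_guessing` are all hypothetical. The sketch only shows the general idea of drawing all weights at random, testing the net, and repeating until every training sequence is classified correctly.

```python
import math
import random


def run_net(weights, sequence):
    """Run a one-unit tanh recurrent net (hypothetical toy architecture)."""
    w_in, w_rec, bias, w_out = weights
    h = 0.0
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h + bias)
    # The sign of the final output is read as the predicted class.
    return w_out * h


def make_task(lag):
    """Toy task (assumption): the first input is the class label (+1 or -1),
    followed by `lag` zero inputs; the label must be reported at the end."""
    return [([label] + [0.0] * lag, label) for label in (1.0, -1.0)]


def solves(weights, task):
    """True if the net's final output has the target's sign on every sequence."""
    return all(run_net(weights, seq) * target > 0 for seq, target in task)


def random_weight_guessing(task, max_trials=10000, scale=10.0, seed=0):
    """Draw all weights uniformly from [-scale, scale] until the task is solved.

    Returns (weights, number_of_trials), or (None, max_trials) on failure.
    The weight range and trial budget are illustrative assumptions.
    """
    rng = random.Random(seed)
    for trial in range(1, max_trials + 1):
        weights = tuple(rng.uniform(-scale, scale) for _ in range(4))
        if solves(weights, task):
            return weights, trial
    return None, max_trials


if __name__ == "__main__":
    task = make_task(lag=100)
    weights, trials = random_weight_guessing(task)
    print("solved:", weights is not None, "after", trials, "guesses")
```

No gradient is computed in this sketch, so the length of the time lag does not affect how an error signal would propagate; a guess succeeds whenever the drawn weights happen to make the unit latch onto the sign of the first input.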

Citations

201 Learning long-term dependencies with gradient descent is difficult – Bengio - 1994
193 The induction of dynamical recognizers – Pollack - 1991
127 Long short-term memory – Hochreiter, Schmidhuber - 1997
109 Gradient Calculations for Dynamic Recurrent Neural Networks: A Survey – Pearlmutter - 1995
101 An efficient gradient-based algorithm for on-line training of recurrent network trajectories – Williams, Peng - 1990
91 An input/output HMM architecture – Bengio, Frasconi - 1996
82 The utility driven dynamic error propagation network – Robinson, Fallside - 1987
62 The induction of multiscale temporal structure – Mozer - 1992
55 Learning complex, extended sequences using the principle of history compression – Schmidhuber - 1992
40 Untersuchungen zu dynamischen neuronalen Netzen – Hochreiter - 1991
33 Credit assignment through time: Alternatives to backpropagation – Bengio, Frasconi - 1994
30 The cascade-correlation learning algorithm – Fahlman, Lebiere - 1990
29 Experimental comparison of the effect of order in recurrent neural networks – Miller, Giles - 1993
26 Finite-state automata and simple recurrent networks – Cleeremans, Servan-Schreiber, et al. - 1989
24 Learning sequential structures with the real-time recurrent learning algorithm – Smith, Zipser - 1989
21 Induction of Finite-State Automata Using Second-Order Recurrent Networks – Watrous, Kuhn - 1992
20 Hierarchical Recurrent Neural Networks for Long-Term Dependencies – Hihi, Bengio - 1996
17 Learning long-term dependencies is not as difficult with NARX recurrent neural networks – Lin, Horne, et al. - 1995
13 First-order recurrent neural networks and deterministic finite state automata – Manolios, Fanelli - 1994
13 Dynamic construction of finite automata from examples using hill-climbing – Tomita - 1982