Learning long-term dependencies with gradient descent is difficult. (1994)

by Yoshua Bengio, Patrice Simard, Paolo Frasconi
Venue:IEEE Transactions on Neural Networks,