  Draft: Fast Exact Multiplication by the Hessian

by Barak A. Pearlmutter
ftp://ftp.cse.ogi.edu/pub/learning/bap/hessian-2-feb-93.ps.Z

Abstract:

Just storing the Hessian H (the matrix of second derivatives ∂ ...

Citations

366 Beyond Regression: New Tools for Prediction and Analysis – Werbos - 1974
350 Optimal brain damage – Le Cun, Denker, et al. - 1990
295 A learning algorithm for Boltzmann Machines – Ackley, Hinton, et al. - 1985
273 Connectionist learning procedures – Hinton - 1989
148 Second order derivatives for network pruning: Optimal brain surgeon – Hassibi, Stork, et al. - 1993
145 The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems – Moody - 1992
106 Generalization of back-propagation to recurrent neural networks – Pineda - 1987
73 Improving the convergence of back-propagation learning with second order methods – Becker, LeCun - 1989
62 A learning rule for asynchronous perceptrons with feedback in a combinatorial environment – Almeida - 1987
51 Learning Algorithms for Connectionist Networks: Applied Gradient Methods of Nonlinear Optimization – Watrous - 1987
41 Weight Perturbation: An Optimal Architecture and Learning Technique for Analog VLSI Feedforward and Recurrent Multilayer Networks – Jabri, Flower - 1992
22 Second order properties of error surfaces: Learning time and generalization – LeCun, Kanter, et al. - 1991
20 A Fast Stochastic Error-Descent Algorithm for Supervised Learning and Optimization – Cauwenberghs - 1993
20 Learning continuous probability distributions with symmetric diffusion networks (in press) – Cambridge, Movellan, et al. - 1991
19 Summed Weight Neuron Perturbation: An O(N) Improvement over Weight Perturbation – Flower, Jabri - 1993
12 A Parallel Gradient Descent Method for Learning in Analog VLSI Neural Networks – Alspector, Meir, et al. - 1993
6 Analog VLSI Implementation of Gradient Descent – Kirk, Kerns, et al. - 1993
6 Automatic learning rate maximization in large adaptive machines – LeCun, Simard, et al. - 1993
5 Gradient descent: Second-order momentum and saturating error – Pearlmutter - 1992