Y. LeCun, L. Bottou, G. Orr, and K. Müller, "Efficient BackProp," in Neural Networks: Tricks of the trade, ser. Lecture Notes in Computer Science, G. Orr and K. Müller, Eds. Springer Verlag, 1998, vol. 1524, pp. 9--50.
....of Eq. (6). According to [2], a choice of a = 10^3 is reasonable for real world optimization problems. An idea about the magnitude of a for neural networks comes from the analysis of the Hessian matrix. It is argued that a typical Hessian has few small, many medium and few very large eigenvalues [8]. This leads to a ratio between longest and shortest axis much larger than 10^3. As the Rprop algorithms (except iRprop ) depend only on the sign of the derivative, results obtained with test functions generated by Eq. (6) do not only hold for parabolic error surfaces, but for a larger class of ....
Y. LeCun, L. Bottou, G. B. Orr, and K.-R. Müller. Efficient backprop. In G. B. Orr and K.-R. Müller, editors, Neural Networks: Tricks of the Trade, number 1524 in LNCS, chapter 1. Springer-Verlag, 1998.
....(ANNs) There are two established ways for the initialization, differing in the use of problem dependent knowledge. Without such knowledge, initial weights are chosen small and at random in order to introduce as small a bias as possible and to allow for symmetry breaking between the neurons (LeCun, Bottou, Orr, and Müller 1998). When problem or domain specific knowledge is to be used for the initialization, some kind of information is transferred into the ANN prior to learning. See Reed and Marks (1999) for an overview of these methods. They all can be characterized as follows: They either focus on the acceleration of ....
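The knowledge-free initialization described in the context above (small random weights, so that neurons do not start out identical) can be sketched as follows. This is a minimal illustration, not the cited authors' procedure: the function name `init_weights`, the uniform distribution, and the default `scale` of 0.1 are assumptions chosen for the example.

```python
import random


def init_weights(n_in, n_out, scale=0.1, seed=0):
    """Return an n_out x n_in weight matrix with small random entries.

    Assumption: entries are drawn uniformly from [-scale, scale]; the
    snippet above only says "small and at random", so both the
    distribution and the scale are illustrative choices.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    return [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_out)]


# Two neurons with three inputs each. Because the rows differ, the two
# neurons receive different gradients and can learn different features
# (symmetry breaking); identical rows would stay identical under training.
weights = init_weights(n_in=3, n_out=2)
for row in weights:
    print(row)
```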
LeCun, Y., L. Bottou, G. B. Orr, and K.-R. Müller (1998). Efficient backprop. In G. B. Orr and K.-R. Müller (Eds.), Neural Networks: Tricks of the Trade, pp. 9--50. Springer-Verlag.
....of experiments with the resulting ANNs, whereby only the adaptability of the structure was examined. As it is well known that the capacity to learn as well as the learning speed of an ANN strongly depends on the initialization of the weights, we initialized the weights at random with small values (LeCun, Bottou, Orr, and Müller 1998) and repeated the adaptation towards different problems for 5000 cycles. Due to the reduced number of adjustable parameters compared to the fully connected 8-10-1 ANN, learning often got stuck very early with a high error. Therefore, only the best result after 10 initializations per structure ....
LeCun, Y., L. Bottou, G. B. Orr, and K.-R. Müller (1998). Efficient backprop. In G. B. Orr and K.-R. Müller (Eds.), Neural Networks: Tricks of the Trade, pp. 9--50. Springer-Verlag.
....however excellent its performance is. The present letter proposes an adaptive method of obtaining the inverse of the Fisher information matrix directly without any matrix inversion, by applying the Kalman filter technique. The proposed adaptive method generalizes the adaptive Gauss-Newton method (LeCun et al. 1998) and provides a solid theoretical justification based on different philosophical ideas. Computer experiments demonstrate that the proposed method has almost the same performance as the original natural gradient method and that its convergence speed is surprisingly faster than the conventional ....
....includes the second derivatives. To avoid calculations of the second derivatives and to keep $H(\theta)$ positive definite, the Gauss-Newton method approximates $H(\theta)$ by neglecting the terms of the second derivatives by the sample average, $G(\theta) = \frac{1}{T} \sum_{t=1}^{T} \nabla f(x_t; \theta)\, \nabla f(x_t; \theta)^{\top}$ (21). LeCun et al. (1998) also describes stochastic or adaptive Gauss-Newton method, when l is a quadratic error function. The proposed method generalizes the adaptive Gauss-Newton algorithms and provides a solid theoretical justification even when l is not quadratic. The proposed adaptive method is computationally ....
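The sample-average approximation in the context above, an average of outer products of the model gradient, can be sketched in a few lines. This is an illustration under stated assumptions rather than the cited method itself: the linear model f(x; theta) = theta[0] + theta[1] * x and the names `grad_f` and `gauss_newton_matrix` are hypothetical and chosen only to keep the arithmetic easy to check.

```python
def grad_f(x, theta):
    """Gradient with respect to theta of the hypothetical model
    f(x; theta) = theta[0] + theta[1] * x.

    For this linear model the gradient does not depend on theta.
    """
    return [1.0, x]


def gauss_newton_matrix(xs, theta):
    """Average of the outer products grad_f(x_t) grad_f(x_t)^T over xs."""
    n = len(theta)
    total = [[0.0] * n for _ in range(n)]
    for x in xs:
        g = grad_f(x, theta)
        for i in range(n):
            for j in range(n):
                total[i][j] += g[i] * g[j]  # accumulate the outer product
    count = len(xs)
    return [[value / count for value in row] for row in total]


xs = [0.0, 1.0, 2.0, 3.0]
G = gauss_newton_matrix(xs, theta=[0.5, -1.0])
print(G)  # [[1.0, 1.5], [1.5, 3.5]]
```

Only first derivatives are used, and a sum of outer products is positive semidefinite by construction. In this example the matrix is positive definite, since its determinant is 3.5 - 1.5 * 1.5 = 1.25.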
Y. LeCun, L. Bottou, G.B. Orr and K.-R. Muller, "Efficient backprop", in Neural Networks---Tricks of the Trade, Springer Lecture Notes in Computer Sciences 1524, pp.5-50, 1998.
Y. LeCun, L. Bottou, G. B. Orr, and K.-R. Müller. Efficient backprop. In G. Orr and K.-R. Müller, editors, Neural Networks: Tricks of the Trade, volume 1524, pages 9--53, Heidelberg, New York, 1998. Springer LNCS.
N. Littlestone, P. M. Long, and M. K. Warmuth. On-line learning of linear functions. Technical Report CRL-91-29, University of California at Santa Cruz, October 1991.
Y. LeCun, L. Bottou, G. B. Orr, and K.-R. Müller. Efficient backprop. In G.B. Orr and K.-R. Müller, editors, Neural Networks: Tricks of the Trade, pages 9--50. Springer, 1998.
....such as gradient descent, choosing small initial weights, and letting the norm of the weights grow slowly while the iterative algorithm is running. Although this algorithm is not exact, it is fast and efficient. This is in fact similar to what is usually done with back propagation neural networks (LeCun et al. 1998). The same algorithm can be used for the VRM. In that context early stopping is similar to choosing the optimal using cross validation. 4 New Algorithms and Results 4.1 Adaptive Kernel Widths It is known in density estimation theory that the quality of the density estimate can be improved ....
LeCun, Y., Bottou, L., Orr, G., and Muller, K. (1998). Efficient backprop. In Orr, G. and Muller, K., editors, Neural Networks: Tricks of the Trade. Springer.
Y. LeCun, L. Bottou, G. Orr, and K. Müller, "Efficient BackProp," in Neural Networks: Tricks of the trade, ser. Lecture Notes in Computer Science, G. Orr and K. Müller, Eds. Springer Verlag, 1998, vol. 1524, pp. 9--50.
Y. LeCun et al., "Efficient BackProp," in Neural Networks: Tricks of the trade, G. Orr and K. Müller, Eds., vol. 1524 of Lecture Notes in Computer Science, pp. 9--50. Springer Verlag, 1998.
Y. LeCun, L. Bottou, G. Orr, and K. Müller, "Efficient BackProp," in Neural Networks: Tricks of the trade, ser. Lecture Notes in Computer Science, G. Orr and K. Müller, Eds. Springer Verlag, 1998, vol. 1524, pp. 9--50.
CiteSeer.IST - Copyright Penn State and NEC