Results 1–10 of 613,724
Natural Gradient
, 2013
"... In derivative-free optimization one aims at minimizing an unknown objective function. The only accessible information consists of algorithm-selected function measurements. Evolution Strategies (ES) are among the state-of-the-art heuristics for this optimization problem. ES typically use parametrized probability distributions to generate correlated samples in promising regions. Recently, it was shown that applying gradient descent in the parameter space of the search distribution leads to algorithms that are very similar to the most successful ES. The development of those so-called Natural Evolution ..."
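The snippet above describes estimating a gradient in the parameter space of a search distribution. A minimal sketch of that idea, using the plain (not yet natural) search gradient on a 1-D Gaussian; the function names, constants, and toy objective are assumptions for illustration, not code from the listed paper:

```python
import random

def fitness(x):
    return -(x - 3.0) ** 2          # toy objective, maximized at x = 3

def search_gradient_es(mu=0.0, sigma=1.0, pop=20, lr=0.05, steps=300, seed=0):
    """Move the mean of a Gaussian search distribution along a Monte Carlo
    estimate of the gradient of expected fitness (the 'search gradient')."""
    rng = random.Random(seed)
    for _ in range(steps):
        xs = [rng.gauss(mu, sigma) for _ in range(pop)]
        fs = [fitness(x) for x in xs]
        baseline = sum(fs) / pop     # baseline subtraction reduces variance
        # d/dmu log N(x; mu, sigma) = (x - mu) / sigma^2
        grad_mu = sum((f - baseline) * (x - mu)
                      for f, x in zip(fs, xs)) / (pop * sigma ** 2)
        mu += lr * grad_mu           # ascend expected fitness
    return mu

mu_final = search_gradient_es()      # converges near the optimum at 3.0
```

Natural Evolution Strategies additionally precondition this search gradient with the inverse Fisher information of the distribution, which this sketch omits.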
Rprop Using the Natural Gradient
, 2005
"... Gradient-based optimization algorithms are the standard methods for adapting the weights of neural networks. The natural gradient gives the steepest-descent direction based on a non-Euclidean metric in the weight space that is, from a theoretical point of view, more appropriate. While the natural gradient ..."
Cited by 8 (1 self)
WHY NATURAL GRADIENT?
"... Gradient adaptation is a useful technique for adjusting a set of parameters to minimize a cost function. While often easy to implement, the convergence speed of gradient adaptation can be slow when the slope of the cost function varies widely for small changes in the parameters. In this paper, we outline an alternative technique, termed natural gradient adaptation, that overcomes the poor convergence properties of gradient adaptation in many cases. The natural gradient is based on differential geometry and employs knowledge of the Riemannian structure of the parameter space to adjust ..."
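The update this abstract describes is theta ← theta − lr · F⁻¹ ∇L, where F is the Riemannian metric (typically the Fisher information matrix). A minimal sketch under assumed toy values, using a diagonal metric so the inverse is elementwise:

```python
def cost(theta):
    # Ill-conditioned quadratic: steep in theta[0], shallow in theta[1].
    return 0.5 * (100.0 * theta[0] ** 2 + theta[1] ** 2)

def grad(theta):
    return [100.0 * theta[0], theta[1]]

def natural_gradient_step(theta, metric_diag, lr=0.5):
    """One step theta <- theta - lr * F^{-1} grad with diagonal metric F."""
    g = grad(theta)
    return [t - lr * gi / fi for t, gi, fi in zip(theta, g, metric_diag)]

theta = [1.0, 1.0]
# If the metric matches the local curvature (F = diag(100, 1)), one
# natural-gradient step shrinks BOTH coordinates uniformly, whereas a
# plain gradient step with the same lr would wildly overshoot theta[0].
theta_nat = natural_gradient_step(theta, [100.0, 1.0])   # -> [0.5, 0.5]
```

This is exactly the "slope varies widely for small changes in the parameters" pathology the snippet mentions: preconditioning by F⁻¹ rescales the steep and shallow directions onto equal footing.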
Natural Gradients for Deformable Registration
"... We apply the concept of natural gradients to deformable registration. The motivation stems from the lack of physical interpretation for gradients of image-based difference measures. The main idea is to endow the space of deformations with a distance metric which reflects the variation of the differe ..."
Cited by 4 (1 self)
Learning to rank using gradient descent
 In ICML
, 2005
"... We investigate using gradient descent methods for learning ranking functions; we propose a simple probabilistic cost function, and we introduce RankNet, an implementation of these ideas using a neural network to model the underlying ranking function. We present test results on toy data and on data f ..."
Cited by 510 (17 self)
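The "simple probabilistic cost function" the RankNet snippet refers to is a pairwise cross-entropy: the modeled probability that item i ranks above item j is sigma(s_i − s_j), where the scores s come from the underlying model (a neural network in the paper). A sketch of that cost alone; the function name and demo values are assumptions:

```python
import math

def ranknet_pair_loss(s_i, s_j, target=1.0):
    """Cross-entropy between the target P(i ranks above j) and the
    modeled probability sigma(s_i - s_j)."""
    p = 1.0 / (1.0 + math.exp(-(s_i - s_j)))
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

# Equal scores give P = 0.5, so the loss is log 2:
loss = ranknet_pair_loss(0.0, 0.0)    # -> ~0.6931
```

Because the loss is smooth in the score difference, ordinary gradient descent on the network's weights applies directly, which is the paper's central point.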
Natural Gradient Matrix Momentum
 in Proceedings of the Ninth International Conference on Neural Networks, The Institution of Electrical Engineers
, 1999
"... Natural gradient learning is an efficient and principled method for improving online learning. In practical applications there will be an increased cost required in estimating and inverting the Fisher information matrix. We propose to use the matrix momentum algorithm in order to carry out effic ..."
Cited by 1 (0 self)
Multichannel Blind Deconvolution and Equalization Using the Natural Gradient
 In The First Signal Processing Workshop on Signal Processing Advances in Wireless Communications
, 1997
"... Multichannel deconvolution and equalization is an important task for numerous applications in communications, signal processing, and control. In this paper, we extend the efficient natural gradient search method in [1] to derive a set of online algorithms for combined multichannel blind source separ ..."
Cited by 119 (24 self)
Topmoumoute online natural gradient algorithm
 Advances in Neural Information Processing Systems 20
, 2008
"... Guided by the goal of obtaining an optimization algorithm that is fast and yields good generalization, we study the descent direction maximizing the decrease in generalization error or the probability of not increasing generalization error. The surprising result is that from both the Bayesian and frequentist perspectives this can yield the natural gradient direction. Although that direction can be very expensive to compute, we develop an efficient, general, online approximation to natural gradient descent which is suited to large-scale problems. We report experimental results ..."
Cited by 28 (7 self)
Greedy Function Approximation: A Gradient Boosting Machine
 Annals of Statistics
, 2000
"... Function approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general gradient-descent "boosting" paradigm is developed for additi ..."
Cited by 951 (12 self)
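The gradient-descent-in-function-space view above can be sketched concretely for squared loss: each stage fits a weak learner to the current residuals, which are exactly the negative gradient of 0.5·(y − F(x))², and adds a shrunken copy to the ensemble. The stump fitter, learning rate, and toy data below are assumptions for illustration, not Friedman's code:

```python
def fit_stump(xs, residuals):
    """Best single-threshold stump minimizing squared error on residuals."""
    best = None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left) if left else 0.0
        rmean = sum(right) / len(right) if right else 0.0
        err = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean

def gradient_boost(xs, ys, stages=50, lr=0.3):
    preds = [0.0] * len(xs)
    stumps = []
    for _ in range(stages):
        # For squared loss the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]   # a step function
model = gradient_boost(xs, ys)
```

For other losses the same loop applies with the residuals replaced by the loss's negative gradient evaluated at the current predictions, which is the generality the abstract claims.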
Revisiting natural gradient for deep networks
 In International Conference on Learning Representations
, 2014
"... We evaluate natural gradient, an algorithm originally proposed in Amari (1997), for learning deep models. The contributions of this paper are as follows. We show the connection between natural gradient and three other recently proposed methods: Hessian-Free (Martens, 2010), Krylov Subspace Descent ..."
Cited by 8 (5 self)