Results 1–10 of 10
Proximal Newton-type methods for convex optimization
Abstract

Cited by 15 (0 self)
We seek to solve convex optimization problems in composite form: minimize over x ∈ R^n the function f(x) := g(x) + h(x), where g is convex and continuously differentiable and h : R^n → R is a convex but not necessarily differentiable function whose proximal mapping can be evaluated efficiently. We derive a generalization of Newton-type methods to handle such convex but nonsmooth objective functions. We prove that such methods are globally convergent and achieve superlinear rates of convergence in the vicinity of an optimal solution. We also demonstrate the performance of these methods on problems of relevance in machine learning and statistics.
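To make the composite setting concrete, here is a minimal sketch (not the paper's algorithm) of a proximal step for the common special case h(x) = lam * ||x||_1, whose proximal mapping is elementwise soft-thresholding; the quadratic g, the vector b, and the values of lam and t below are illustrative choices:

```python
def soft_threshold(v, lam):
    # Proximal mapping of h(x) = lam * ||x||_1, applied elementwise:
    # prox_h(v)_i = sign(v_i) * max(|v_i| - lam, 0).
    return [(1.0 if vi > 0 else -1.0) * max(abs(vi) - lam, 0.0) for vi in v]

def prox_gradient_step(x, b, lam, t):
    # One proximal gradient step for g(x) = 0.5 * ||x - b||^2 plus h as above:
    # x_next = prox_{t*h}(x - t * grad_g(x)), with grad_g(x) = x - b.
    z = [xi - t * (xi - bi) for xi, bi in zip(x, b)]
    return soft_threshold(z, t * lam)

b = [3.0, -0.5, 0.2]
x_next = prox_gradient_step([0.0, 0.0, 0.0], b, lam=1.0, t=1.0)
# Because the Hessian of this particular g is the identity, the step with
# t = 1 is already exact: x_next equals soft_threshold(b, 1.0).
```

For a general smooth g the step is repeated until convergence; proximal Newton-type methods replace the scaled-identity model of g with a Hessian (or quasi-Newton) model.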
PROXIMAL NEWTON-TYPE METHODS FOR MINIMIZING COMPOSITE FUNCTIONS
Abstract

Cited by 8 (1 self)
Abstract. We generalize Newton-type methods for minimizing smooth functions to handle a sum of two convex functions: a smooth function and a nonsmooth function with a simple proximal mapping. We show that the resulting proximal Newton-type methods inherit the desirable convergence behavior of Newton-type methods for minimizing smooth functions, even when search directions are computed inexactly. Many popular methods tailored to problems arising in bioinformatics, signal processing, and statistical learning are special cases of proximal Newton-type methods, and our analysis yields new convergence results for some of these methods.
Composite Self-Concordant Minimization
Abstract

Cited by 6 (5 self)
We propose a variable metric framework for minimizing the sum of a self-concordant function and a possibly nonsmooth convex function endowed with a computable proximal operator. We theoretically establish the convergence of our framework without relying on the usual Lipschitz gradient assumption on the smooth part. An important highlight of our work is a new set of analytic step-size selection and correction procedures based on the structure of the problem. We describe concrete algorithmic instances of our framework for several interesting large-scale applications and demonstrate them numerically on both synthetic and real data.
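The analytic step-size selection mentioned in this abstract can be illustrated with the classical damped Newton step for a self-concordant function, where the step length 1/(1 + lam(x)) is computed from the Newton decrement lam(x) rather than from a Lipschitz constant. The one-dimensional objective below is an illustrative toy, not an example from the paper:

```python
import math

def damped_newton(x, grad, hess, iters=100):
    # Damped Newton iteration for a 1-D self-concordant function:
    # step length 1/(1 + lam), with Newton decrement lam = |f'| / sqrt(f'').
    for _ in range(iters):
        g, h = grad(x), hess(x)
        lam = abs(g) / math.sqrt(h)
        x = x - (1.0 / (1.0 + lam)) * g / h
    return x

# Toy self-concordant objective f(x) = x - log(x), minimized at x = 1.
grad = lambda x: 1.0 - 1.0 / x
hess = lambda x: 1.0 / x ** 2
x_star = damped_newton(0.2, grad, hess)
# x_star converges to 1.0 (globally, with no step-size tuning).
```

The same decrement-based damping is what lets such frameworks dispense with the Lipschitz-gradient assumption on the smooth part.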
Forward-backward truncated Newton methods for convex composite optimization. ArXiv e-prints
, 2014
Scalable sparse covariance estimation via self-concordance
Abstract

Cited by 2 (2 self)
We consider the class of convex minimization problems composed of a self-concordant function, such as the log det metric, a convex data-fidelity term h(·), and a regularizing, possibly nonsmooth, function g(·). This class of problems has recently attracted a great deal of interest, mainly due to its prevalence in prominent applications. Under this locally Lipschitz continuous gradient setting, we analyze the convergence behavior of proximal Newton schemes, with the added twist of possibly inexact evaluations. We prove attractive convergence rate guarantees and enhance state-of-the-art optimization schemes to accommodate such developments. Experimental results on sparse covariance estimation show the merits of our algorithm, both in terms of recovery efficiency and complexity.
LM-CMA: an Alternative to L-BFGS for Large Scale Black-box Optimization
Abstract
The limited memory BFGS method (L-BFGS) of Liu and Nocedal (1989) is often considered the method of choice for continuous optimization when first- and/or second-order information is available. However, the use of L-BFGS can be complicated in a black-box scenario where gradient information is not available and therefore must be numerically estimated. The accuracy of this estimation, obtained by finite-difference methods, is often problem-dependent, which may lead to premature convergence of the algorithm. In this paper, we demonstrate an alternative to L-BFGS, the limited memory Covariance Matrix Adaptation Evolution Strategy (LM-CMA) proposed by Loshchilov (2014). LM-CMA is a stochastic derivative-free algorithm for numerical optimization of nonlinear, non-convex problems. Inspired by L-BFGS, LM-CMA samples candidate solutions according to a covariance matrix reconstructed from m direction vectors selected during the optimization process. The decomposition of the covariance matrix into Cholesky factors reduces the memory complexity to O(mn), where n is the number of decision variables. The time complexity of sampling one candidate solution is also O(mn), but amounts to only about 25 scalar-vector multiplications in practice. The algorithm is invariant with respect to strictly increasing transformations of the objective function; such transformations do not compromise its ability to approach the optimum. LM-CMA outperforms the original CMA-ES and its large-scale versions on non-separable ill-conditioned problems, by a factor that increases with problem dimension. These invariance properties do not prevent the algorithm from demonstrating performance comparable to L-BFGS on nontrivial large-scale smooth and nonsmooth optimization problems.
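The O(mn) sampling cost described in this abstract comes from applying an implicitly stored Cholesky factor, built from the m saved direction vectors, to a standard Gaussian vector. The sketch below illustrates only that product structure; the constants a and b are placeholders for the adaptation weights LM-CMA derives from its learning rates, and this is not the exact LM-CMA update rule:

```python
import random

def apply_factor(z, dirs, a=0.9, b=0.1):
    # Apply A = (a*I + b*v_m v_m^T) * ... * (a*I + b*v_1 v_1^T) to z.
    # Each stored direction vector v_j contributes one O(n) rank-one update,
    # so the whole product costs O(m*n) instead of O(n^2).
    y = list(z)
    for v in dirs:
        dot = sum(vi * yi for vi, yi in zip(v, y))
        y = [a * yi + b * dot * vi for yi, vi in zip(y, v)]
    return y

def sample_candidate(mean, sigma, dirs, rng):
    # Candidate x = mean + sigma * A z with z ~ N(0, I), in O(m*n) time;
    # A A^T plays the role of the covariance matrix, never formed explicitly.
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    y = apply_factor(z, dirs)
    return [mi + sigma * yi for mi, yi in zip(mean, y)]
```

Storing only the m direction vectors of length n gives the O(mn) memory footprint quoted in the abstract.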
Space Variant Blind Image Restoration
, 2012
Abstract
HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Editor: Unknown
Abstract
We propose a variable metric framework for minimizing the sum of a self-concordant function and a possibly nonsmooth convex function, endowed with an easily computable proximal operator. We theoretically establish the convergence of our framework without relying on the usual Lipschitz gradient assumption on the smooth part. An important highlight of our work is a new set of analytic step-size selection and correction procedures based on the structure of the problem. We describe concrete algorithmic instances of our framework for several interesting applications and demonstrate them numerically on both synthetic and real data.
DC Proximal Newton for Non-Convex Optimization Problems
, 2014
Abstract
We introduce a novel algorithm for solving learning problems where both the loss function and the regularizer are non-convex but belong to the class of difference-of-convex (DC) functions. Our contribution is a new general-purpose proximal Newton algorithm that is able to deal with such a situation. The algorithm consists of obtaining a descent direction from an approximation of the loss function and then performing a line search to ensure sufficient descent. A theoretical analysis shows that the limit points of the iterates of the proposed algorithm are stationary points of the DC objective function. Numerical experiments show that our approach is more efficient than the current state of the art for a problem with a convex loss function and a non-convex regularizer. We also illustrate the benefit of our algorithm on a high-dimensional transductive learning problem where both the loss function and the regularizer are non-convex.
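As background for the DC setting, the classical DC algorithm (DCA), which this proximal Newton approach refines with second-order surrogates and a line search, linearizes the concave part at the current iterate and minimizes the resulting convex surrogate. The one-dimensional decomposition below is invented purely for illustration:

```python
def dca_step(x, grad_g2, solve_surrogate):
    # One DCA iteration for f = g1 - g2 (g1, g2 both convex): linearize g2
    # at x and minimize the convex surrogate y -> g1(y) - grad_g2(x) * y.
    return solve_surrogate(grad_g2(x))

# Toy decomposition: f(x) = (x - 1)^2 - 0.5 * x^2,
# with g1(x) = (x - 1)^2 and g2(x) = 0.5 * x^2.
grad_g2 = lambda x: x                       # g2'(x)
solve_surrogate = lambda s: 1.0 + s / 2.0   # argmin_y (y - 1)^2 - s*y, from 2(y - 1) = s
x = 0.0
for _ in range(50):
    x = dca_step(x, grad_g2, solve_surrogate)
# The iterates x_{k+1} = 1 + x_k / 2 converge to x = 2, the stationary
# point of f: f'(x) = 2(x - 1) - x = x - 2 = 0.
```

Each surrogate minimization here has a closed form; in the paper's setting it is instead an approximate proximal Newton subproblem followed by a line search.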