Results 1 – 6 of 6
A General Iterative Shrinkage and Thresholding Algorithm for Nonconvex Regularized Optimization Problems
"... Nonconvex sparsityinducing penalties have recently received considerable attentions in sparse learning. Recent theoretical investigations have demonstrated their superiority over the convex counterparts in several sparse learning settings. However, solving the nonconvex optimization problems asso ..."
Abstract

Cited by 20 (6 self)
Nonconvex sparsity-inducing penalties have recently received considerable attention in sparse learning. Recent theoretical investigations have demonstrated their superiority over their convex counterparts in several sparse learning settings. However, solving the nonconvex optimization problems associated with nonconvex penalties remains a major challenge. A commonly used approach is Multi-Stage (MS) convex relaxation (or DC programming), which relaxes the original nonconvex problem to a sequence of convex problems. This approach is usually impractical for large-scale problems because its computational cost is a multiple of that of solving a single convex problem. In this paper, we propose a General Iterative Shrinkage and Thresholding (GIST) algorithm to solve the nonconvex optimization problem for a large class of nonconvex penalties. The GIST algorithm iteratively solves a proximal operator problem, which in turn has a closed-form solution for many commonly used penalties. At each outer iteration of the algorithm, we use a line search initialized by the Barzilai-Borwein rule. (Proceedings of the 30th International Conference on Machine Learning)
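The iteration this abstract describes can be sketched with the convex ℓ1 penalty, whose proximal operator (soft-thresholding) has the kind of closed form alluded to; GIST swaps in the closed-form proximal maps of nonconvex penalties. A minimal sketch, assuming a least-squares loss and a constant step size (the function names and inputs are illustrative, not the paper's implementation):

```python
import numpy as np

def soft_threshold(x, t):
    # Closed-form proximal operator of the l1 penalty t * ||x||_1
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ist(A, b, lam=0.1, step=None, iters=200):
    """Basic iterative shrinkage-thresholding for 0.5||Ax - b||^2 + lam||x||_1.

    Illustrative stand-in for the proximal step that GIST generalizes to
    nonconvex penalties; A, b, lam are hypothetical inputs.
    """
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L, L = Lipschitz const of grad
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - step * grad, step * lam)
    return x
```

GIST additionally wraps each step in a line search with a Barzilai-Borwein initial step size; the fixed step `1/L` above is the simplest convergent choice.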
Gradient Descent with Proximal Average for Nonconvex and Composite Regularization
, 2014
"... Sparse modeling has been highly successful in many realworld applications. While a lot of interests have been on convex regularization, recent studies show that nonconvex regularizers can outperform their convex counterparts in many situations. However, the resulting nonconvex optimization problem ..."
Abstract

Cited by 1 (1 self)
Sparse modeling has been highly successful in many real-world applications. While much interest has centered on convex regularization, recent studies show that nonconvex regularizers can outperform their convex counterparts in many situations. However, the resulting nonconvex optimization problems are often challenging, especially for composite regularizers such as the nonconvex overlapping group lasso. In this paper, using a recent mathematical tool known as the proximal average, we propose a novel proximal gradient descent method for optimization with a wide class of nonconvex and composite regularizers. Instead of directly solving the proximal step associated with a composite regularizer, we average the solutions of the proximal problems of the constituent regularizers. This simple strategy has guaranteed convergence and low per-iteration complexity. Experimental results on a number of synthetic and real-world data sets demonstrate the effectiveness and efficiency of the proposed optimization algorithm, as well as the improved classification performance resulting from the nonconvex regularizers.
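The averaging step the abstract describes can be sketched as follows: rather than computing the proximal map of a sum of regularizers, compute each constituent proximal map separately and average the results. This toy sketch uses two convex constituents (ℓ1 and squared ℓ2) purely for illustration; the method's interest lies in nonconvex constituents, and the least-squares loss and all function names are assumptions:

```python
import numpy as np

def prox_l1(v, t):
    # proximal map of t * ||x||_1 (soft-thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_sq(v, t):
    # proximal map of (t/2) * ||x||^2
    return v / (1.0 + t)

def pa_prox(v, t, proxes, weights):
    # proximal average: average the constituent proximal solutions
    return sum(w * p(v, t) for w, p in zip(weights, proxes))

def pa_gd(A, b, t=0.5, iters=200):
    """Proximal gradient descent with the proximal-average step for a
    least-squares loss plus the average of l1 and squared-l2 penalties
    (a convex toy stand-in for the composite nonconvex setting)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        v = x - step * (A.T @ (A @ x - b))
        x = pa_prox(v, step * t, [prox_l1, prox_sq], [0.5, 0.5])
    return x
```

Each constituent proximal problem is cheap and closed-form, which is what keeps the per-iteration complexity low.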
DC Proximal Newton for Non-Convex Optimization Problems
, 2014
"... We introduce a novel algorithm for solving learning problems where both the loss function and the regularizer are nonconvex but belong to the class of difference of convex (DC) functions. Our contribution is a new general purpose proximal Newton algorithm that is able to deal with such a situation. ..."
Abstract
We introduce a novel algorithm for solving learning problems where both the loss function and the regularizer are nonconvex but belong to the class of difference-of-convex (DC) functions. Our contribution is a new general-purpose proximal Newton algorithm that is able to handle such a situation. The algorithm consists of obtaining a descent direction from an approximation of the loss function and then performing a line search to ensure sufficient descent. A theoretical analysis shows that the limit points of the iterates of the proposed algorithm are stationary points of the DC objective function. Numerical experiments show that our approach is more efficient than the current state of the art on a problem with a convex loss function and a nonconvex regularizer. We also illustrate the benefit of our algorithm on a high-dimensional transductive learning problem where both the loss function and the regularizer are nonconvex.
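The DC structure this abstract exploits can be illustrated with the classical DC algorithm, a simpler first-order relative of the proposed proximal Newton scheme: linearize the concave part at the current iterate and minimize the resulting convex surrogate. The oracle `g_min` below, returning argmin over x of g(x) minus the inner product of y and x, is a hypothetical helper introduced only for this sketch:

```python
import numpy as np

def dca(g_min, grad_h, x0, iters=50):
    """Basic DC algorithm for an objective g(x) - h(x) with g, h convex:
    replace h by its linearization at the current iterate and solve the
    resulting convex subproblem (illustrative sketch, not the paper's
    proximal Newton method)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        y = grad_h(x)   # (sub)gradient of the subtracted convex part
        x = g_min(y)    # convex subproblem: argmin_x g(x) - <y, x>
    return x

# Toy usage: minimize 0.5*x**2 - |x|, i.e. g(x) = 0.5*x**2, h(x) = |x|.
# Then grad_h = sign and g_min(y) = y; iterates converge to a stationary point.
```

The proximal Newton variant in the paper replaces the plain convex subproblem with a second-order approximation plus a line search for sufficient descent.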
Efficient Learning with a Family of Nonconvex Regularizers by Redistributing Nonconvexity
"... Abstract The use of convex regularizers allow for easy optimization, though they often produce biased estimation and inferior prediction performance. Recently, nonconvex regularizers have attracted a lot of attention and outperformed convex ones. However, the resultant optimization problem is much ..."
Abstract
The use of convex regularizers allows for easy optimization, though they often produce biased estimates and inferior prediction performance. Recently, nonconvex regularizers have attracted much attention and have outperformed convex ones. However, the resulting optimization problem is much harder. In this paper, for a large class of nonconvex regularizers, we propose to move the nonconvexity from the regularizer to the loss. The nonconvex regularizer is transformed into a familiar convex regularizer, while the resulting loss function can still be guaranteed to be smooth. Learning with the convexified regularizer can then be performed by existing efficient algorithms originally designed for convex regularizers (such as the standard proximal algorithm and the Frank-Wolfe algorithm). Moreover, it can be shown that critical points of the transformed problem are also critical points of the original problem. Extensive experiments on a number of nonconvex regularization problems show that the proposed procedure is much faster than state-of-the-art nonconvex solvers.
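The redistribution idea can be made concrete with the capped-ℓ1 penalty, which satisfies min(|x|, θ) = |x| − max(|x| − θ, 0): the concave second term moves into the loss and the convex ℓ1 part remains as the regularizer, so a standard proximal gradient method applies. A toy sketch assuming a least-squares loss; the almost-everywhere gradient of the shifted term stands in for the smoothness guarantees developed in the paper:

```python
import numpy as np

def capped_l1(x, lam, theta):
    # nonconvex capped-l1 penalty: lam * sum_i min(|x_i|, theta)
    return lam * np.minimum(np.abs(x), theta).sum()

def shifted_loss_grad(A, b, x, lam, theta):
    # Decomposition: min(|x|, theta) = |x| - max(|x| - theta, 0),
    # so -lam * sum(max(|x_i| - theta, 0)) is folded into the loss.
    grad_f = A.T @ (A @ x - b)
    grad_q = -lam * np.sign(x) * (np.abs(x) > theta)  # a.e. gradient
    return grad_f + grad_q

def prox_grad(A, b, lam=0.1, theta=1.0, iters=300):
    """Proximal gradient on the transformed problem: shifted smooth loss
    plus the remaining convex l1 regularizer (illustrative sketch)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        v = x - step * shifted_loss_grad(A, b, x, lam, theta)
        x = np.sign(v) * np.maximum(np.abs(v) - step * lam, 0.0)  # l1 prox
    return x
```

Note how coefficients beyond the cap θ escape the ℓ1 shrinkage bias, which is the statistical motivation for the nonconvex penalty.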
Linearized Alternating Direction Method of Multipliers for Constrained Nonconvex Regularized Optimization
"... Abstract In this paper, we consider a wide class of constrained nonconvex regularized minimization problems, where the constraints are linearly constraints. It was reported in the literature that nonconvex regularization usually yields a solution with more desirable sparse structural properties bey ..."
Abstract
In this paper, we consider a wide class of constrained nonconvex regularized minimization problems in which the constraints are linear. It has been reported in the literature that nonconvex regularization usually yields solutions with more desirable sparse structural properties than convex regularization. However, the proximal mapping associated with nonconvex regularization is not easy to obtain under the imposed linear constraints. In this paper, the optimization problem with linear constraints is solved by the Linearized Alternating Direction Method of Multipliers (LADMM). Moreover, we present a detailed convergence analysis of the LADMM algorithm for solving nonconvex compositely regularized optimization with a large class of nonconvex penalties. Experimental results on several real-world datasets validate the efficacy of the proposed algorithm.
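A linearized-ADMM iteration of the kind the abstract analyzes can be sketched on a convex toy instance: the x-update replaces the augmented quadratic term with its linearization plus a proximity term, giving a closed-form update even under the linear constraint Ax = z. The least-squares loss, ℓ1 regularizer, and fixed penalty ρ below are assumptions made for illustration:

```python
import numpy as np

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ladmm(A, b, lam=0.1, rho=1.0, iters=2000):
    """Linearized ADMM sketch for min 0.5||x - b||^2 + lam||z||_1
    subject to Ax = z (convex stand-in for the paper's setting).
    tau < 1/(rho * ||A||^2) is the linearization step size."""
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(m), np.zeros(m)
    tau = 0.9 / (rho * np.linalg.norm(A, 2) ** 2)
    for _ in range(iters):
        z = soft(A @ x + u / rho, lam / rho)          # exact z-update
        c = A.T @ (u + rho * (A @ x - z))             # grad of augmented term
        x = (b - c + x / tau) / (1.0 + 1.0 / tau)     # linearized x-update
        u = u + rho * (A @ x - z)                     # dual ascent
    return x
```

The linearization is what keeps the x-update closed-form when A is not the identity; the paper's contribution is showing convergence of this scheme when the ℓ1 term is replaced by a nonconvex penalty.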
ACTIVE SET STRATEGY FOR HIGH-DIMENSIONAL NONCONVEX SPARSE OPTIMIZATION PROBLEMS
, 2014
"... The use of nonconvex sparse regularization has attracted much interest when estimating a very sparse model on high dimensional data. In this work we express the optimality conditions of the optimization problem for a large class of nonconvex regularizers. From those conditions, we derive an effici ..."
Abstract
The use of nonconvex sparse regularization has attracted much interest for estimating very sparse models on high-dimensional data. In this work, we express the optimality conditions of the optimization problem for a large class of nonconvex regularizers. From those conditions, we derive an efficient active set strategy that avoids computing unnecessary gradients. Numerical experiments on both synthetic and real-life datasets show a clear gain in computational cost over the state of the art when using our method to obtain very sparse solutions.
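An active set strategy of this flavor can be sketched on the convex lasso, where the optimality conditions are explicit: a zero coordinate is optimal as long as its partial gradient satisfies |A_jᵀ(Ax − b)| ≤ λ, so gradients only need to be checked, and subproblems solved, over a small working set. The coordinate-descent inner solver and all names below are assumptions for the sketch, not the paper's nonconvex method:

```python
import numpy as np

def lasso_active_set(A, b, lam=0.1, tol=1e-8):
    """Active-set loop for 0.5||Ax - b||^2 + lam||x||_1: solve restricted
    subproblems on a working set, then check the optimality conditions of
    the remaining zero coordinates and admit the worst violator."""
    n = A.shape[1]
    x = np.zeros(n)
    active = set()
    while True:
        # inner solve on the working set by coordinate descent
        for _ in range(500):
            for j in active:
                r = b - A @ x + A[:, j] * x[j]        # partial residual
                rho = A[:, j] @ r
                x[j] = np.sign(rho) * max(abs(rho) - lam, 0) / (A[:, j] @ A[:, j])
        grad = A.T @ (A @ x - b)
        violators = [j for j in range(n)
                     if j not in active and abs(grad[j]) > lam + tol]
        if not violators:
            return x                                  # all zeros are optimal
        active.add(max(violators, key=lambda j: abs(grad[j])))
```

When the solution is very sparse, most coordinates never enter the working set, which is the source of the computational gain the abstract reports.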