Results 1–10 of 28
Revisiting Frank-Wolfe: Projection-free sparse convex optimization
In ICML, 2013
Cited by 86 (2 self)
Abstract:
We provide stronger and more general primal-dual convergence results for Frank-Wolfe-type algorithms (a.k.a. conditional gradient) for constrained convex optimization, enabled by a simple framework of duality gap certificates. Our analysis also holds if the linear subproblems are only solved approximately (as well as if the gradients are inexact), and is proven to be worst-case optimal in the sparsity of the obtained solutions. On the application side, this allows us to unify a large variety of existing sparse greedy methods, in particular for optimization over convex hulls of an atomic set, even if those sets can only be approximated, including sparse (or structured sparse) vectors or matrices, low-rank matrices, permutation matrices, or max-norm bounded matrices. We present a new general framework for convex optimization over matrix factorizations, where every Frank-Wolfe iteration will consist of a low-rank update, and discuss the broad application areas of this approach.
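The duality-gap certificate mentioned in this abstract is easy to illustrate on a concrete feasible set. The sketch below runs Frank-Wolfe over the ℓ1 ball for a least-squares objective; the problem data and sizes are made up for illustration, and this is the textbook variant rather than the paper's general atomic-set framework.

```python
import numpy as np

def frank_wolfe_l1(grad, x0, radius=1.0, tol=1e-6, max_iter=1000):
    """Conditional gradient over the l1 ball with a duality-gap certificate."""
    x = x0.copy()
    gap = np.inf
    for k in range(max_iter):
        g = grad(x)
        # Linear minimization oracle for the l1 ball: a signed vertex.
        i = np.argmax(np.abs(g))
        s = np.zeros_like(x)
        s[i] = -radius * np.sign(g[i])
        gap = g @ (x - s)          # certificate: f(x) - f(x*) <= gap
        if gap < tol:
            break
        step = 2.0 / (k + 2)       # standard diminishing step size
        x = (1 - step) * x + step * s
    return x, gap

# Illustration: minimize 0.5*||A x - b||^2 over the unit l1 ball.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
x, gap = frank_wolfe_l1(lambda v: A.T @ (A @ v - b), np.zeros(10))
```

After k iterations the iterate is a convex combination of at most k+1 vertices, which is the sparsity the analysis above proves worst-case optimal.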
Structured Sparsity through Convex Optimization, 2012
Cited by 47 (6 self)
Abstract:
Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. While naturally cast as a combinatorial optimization problem, variable or feature selection admits a convex relaxation through the regularization by the ℓ1-norm. In this paper, we consider situations where we are not only interested in sparsity, but where some structural prior knowledge is available as well. We show that the ℓ1-norm can then be extended to structured norms built on either disjoint or overlapping groups of variables, leading to a flexible framework that can deal with various structures. We present applications to unsupervised learning, for structured sparse principal component analysis and hierarchical dictionary learning, and to supervised learning in the context of nonlinear variable selection.
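For disjoint groups, the extension described above reduces to the ℓ1/ℓ2 group norm, and the associated shrinkage is block soft-thresholding. A minimal sketch with made-up groups and values (the overlapping case treated in the paper requires more machinery):

```python
import numpy as np

def group_norm(w, groups):
    """l1/l2 norm over disjoint groups: sum of group Euclidean norms."""
    return sum(np.linalg.norm(w[g]) for g in groups)

def prox_group(w, groups, lam):
    """Proximal operator (block soft-thresholding) for disjoint groups:
    each group is zeroed if its norm is below lam, else scaled down."""
    out = w.copy()
    for g in groups:
        n = np.linalg.norm(w[g])
        out[g] = 0.0 if n <= lam else (1.0 - lam / n) * w[g]
    return out

w = np.array([3.0, 4.0, 0.1, 0.1])
groups = [np.array([0, 1]), np.array([2, 3])]
shrunk = prox_group(w, groups, lam=1.0)
```

The first group (norm 5) is shrunk but kept as a whole; the second (norm ≈ 0.14) is removed as a whole, which is exactly the group-level selection effect.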
Convex tensor decomposition via structured Schatten norm regularization
In Advances in NIPS 26, 2013
Cited by 15 (2 self)
Abstract:
We study a new class of structured Schatten norms for tensors that includes two recently proposed norms (“overlapped” and “latent”) for convex-optimization-based tensor decomposition. We analyze the performance of the “latent” approach for tensor decomposition, which was empirically found to perform better than the “overlapped” approach in some settings. We show theoretically that this is indeed the case. In particular, when the unknown true tensor is low-rank in a specific unknown mode, this approach performs as well as knowing the mode with the smallest rank. Along the way, we show a novel duality result for structured Schatten norms, which is also interesting in the general context of structured sparsity. We confirm through numerical simulations that our theory can precisely predict the scaling behaviour of the mean squared error.
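Of the two norms discussed, the “overlapped” one is the more direct to write down: it sums the nuclear (Schatten-1) norms of all mode-k unfoldings. A small NumPy sketch (the “latent” norm, an infimum over per-mode decompositions, is not shown here):

```python
import numpy as np

def overlapped_schatten(T):
    """Overlapped Schatten-1 norm of a tensor: the sum, over modes k,
    of the nuclear norm of the mode-k unfolding."""
    return sum(
        np.linalg.norm(np.moveaxis(T, k, 0).reshape(T.shape[k], -1), 'nuc')
        for k in range(T.ndim)
    )

# A rank-1 tensor: every unfolding has rank 1 and unit nuclear norm,
# so the overlapped norm equals the number of modes.
a = np.array([1.0, 0.0])
T = np.einsum('i,j,k->ijk', a, a, a)
val = overlapped_schatten(T)
```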
Convex relaxation of combinatorial penalties, 2011
Cited by 12 (8 self)
Abstract:
In this paper, we propose a unifying view of several recently proposed structured sparsity-inducing norms. We consider the situation of a model simultaneously (a) penalized by a set-function defined on the support of the unknown parameter vector, which represents prior knowledge on supports, and (b) regularized in ℓp-norm. We show that the natural combinatorial optimization problems obtained may be relaxed into convex optimization problems and introduce a notion, the lower combinatorial envelope of a set-function, that characterizes the tightness of our relaxations. We moreover establish links with norms based on latent representations, including the latent group Lasso and block-coding, and with norms obtained from submodular functions.
Group-sparse model selection: Hardness and relaxations, 2013
Cited by 11 (3 self)
Abstract:
Group-based sparsity models have proven instrumental in linear regression problems for recovering signals from far fewer measurements than standard compressive sensing requires. The main promise of these models is the recovery of “interpretable” signals along with the identification of their constituent groups. To this end, we establish a combinatorial framework for group-model selection problems and highlight the underlying tractability issues revolving around such notions of interpretability when the regression matrix is simply the identity operator. We show that, in general, claims of correctly identifying the groups with convex relaxations would lead to polynomial-time solution algorithms for a well-known NP-hard problem, called the weighted maximum cover problem. Instead, leveraging a graph-based understanding of group models, we describe group structures which enable correct model identification in polynomial time via dynamic programming. We also show that group structures that lead to totally unimodular constraints have tractable discrete as well as convex relaxations. Finally, we study the Pareto frontier of budgeted group-sparse approximations for the tree-based sparsity model and illustrate identification and computation trade-offs between our framework and the existing convex relaxations.
Structured Learning of Gaussian Graphical Models
Cited by 8 (1 self)
Abstract:
We consider estimation of multiple high-dimensional Gaussian graphical models corresponding to a single set of nodes under several distinct conditions. We assume that most aspects of the networks are shared, but that there are some structured differences between them. Specifically, the network differences are generated from node perturbations: a few nodes are perturbed across networks, and most or all edges stemming from such nodes differ between networks. This corresponds to a simple model for the mechanism underlying many cancers, in which the gene regulatory network is disrupted due to the aberrant activity of a few specific genes. We propose to solve this problem using the perturbed-node joint graphical lasso, a convex optimization problem that is based upon the use of a row-column overlap norm penalty. We then solve the convex problem using an alternating direction method of multipliers algorithm. Our proposal is illustrated on synthetic data and on an application to brain cancer gene expression data.
Social sparsity! Neighborhood systems enrich structured shrinkage operators
IEEE Trans. Signal Processing, 2013
Cited by 8 (2 self)
Abstract:
Sparse and structured signal expansions on dictionaries can be obtained through explicit modeling in the coefficient domain. The originality of the present article lies in the construction and the study of generalized shrinkage operators, whose goal is to identify structured significance maps and give rise to structured thresholding. These generalize Group Lasso and the previously introduced Elitist Lasso by introducing more flexibility in the coefficient-domain modeling, and lead to the notion of social sparsity. The proposed operators are studied theoretically and embedded in iterative thresholding algorithms. Moreover, a link between these operators and a convex functional is established. Numerical studies on both simulated and real signals confirm the benefits of such an approach.
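A toy example of neighborhood-driven shrinkage in this spirit: a coefficient is kept or discarded according to the energy of a window around it, not its own magnitude alone. The window size, threshold, and operator form below are illustrative, not the paper's exact definitions.

```python
import numpy as np

def windowed_group_shrink(x, lam, half=1):
    """Neighborhood-based shrinkage (illustrative): coefficient i is
    scaled by (1 - lam/||window around i||)+ rather than being
    thresholded on |x[i]| alone."""
    out = np.zeros_like(x)
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        energy = np.linalg.norm(x[lo:hi])
        if energy > lam:
            out[i] = x[i] * (1.0 - lam / energy)
    return out

# An isolated small coefficient is removed, while a coefficient whose
# neighborhood carries energy survives the same threshold.
coeffs = np.array([0.0, 5.0, 0.0, 0.0, 0.1, 0.0])
kept = windowed_group_shrink(coeffs, lam=1.0)
```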
Translation-invariant shrinkage/thresholding of group sparse signals
Signal Processing, 94:476–489, 2014
Cited by 7 (3 self)
Abstract:
This paper addresses signal denoising when large-amplitude coefficients form clusters (groups). The L1-norm and other separable sparsity models do not capture the tendency of coefficients to cluster (group sparsity). This work develops an algorithm, called ‘overlapping group shrinkage’ (OGS), based on the minimization of a convex cost function involving a group-sparsity-promoting penalty function. The groups are fully overlapping, so the denoising method is translation-invariant and blocking artifacts are avoided. Based on the principle of majorization-minimization (MM), we derive a simple iterative minimization algorithm that reduces the cost function monotonically. A procedure for setting the regularization parameter, based on attenuating the noise to a specified level, is also described. The proposed approach is illustrated on speech enhancement, wherein the OGS approach is applied in the short-time Fourier transform (STFT) domain. The OGS algorithm produces denoised speech that is relatively free of musical noise.
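A compact sketch of an MM iteration of this kind for the 1-D fully overlapping case; the window length, iteration count, and small eps safeguard are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def ogs(y, lam, K=3, n_iter=50, eps=1e-10):
    """Overlapping group shrinkage, MM-style sketch: approximately
    minimize 0.5*||y - x||^2 + lam * sum_j ||x[j:j+K]||_2 with fully
    overlapping length-K groups (zero-padded at the boundaries)."""
    x = y.astype(float).copy()
    ones = np.ones(K)
    for _ in range(n_iter):
        # Energy of every length-K window (including zero-padded ones).
        e = np.sqrt(np.convolve(x * x, ones, mode='full')) + eps
        # r[i] = sum over windows covering sample i of 1/||window||.
        r = np.convolve(1.0 / e, ones, mode='valid')
        # Majorize-minimize step: pointwise shrinkage toward zero.
        x = y / (1.0 + lam * r)
    return x

noisy = np.array([0.1, 2.0, 2.5, 1.8, 0.1, 0.05, 0.1])
den = ogs(noisy, lam=0.2)
```

With lam = 0 the operator is the identity; as lam grows, estimates are driven toward zero, and isolated small coefficients collapse faster than clustered ones.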