
## Social Sparsity! Neighborhood Systems Enrich Structured Shrinkage Operators (2013)

Venue: IEEE Trans. Signal Processing

Citations: 8 (2 self)

### Citations

4200 | Regression shrinkage and selection via the lasso
- Tibshirani
- 1997
Citation Context: ...sen from an appropriate dictionary. Here, efficiently may be understood in the sense that only few atoms are needed to reconstruct the signal. The same idea appeared in the machine learning community [2], where often only few variables are relevant in inference tasks based on observations living in very high dimensional spaces. The natural measure of the cardinality of a support set, and hence its sp...

2712 | Atomic decomposition by basis pursuit
- Chen, Donoho, et al.
- 1998
Citation Context: ...e Thresholding, Convex Optimization I. INTRODUCTION A wide range of inverse problems arising in signal processing have benefited from sparsity. Introduced in the mid 90’s by Chen, Donoho and Saunders [1], the idea is that a signal can be efficiently represented as a linear combination of elementary atoms chosen from an appropriate dictionary. Here, efficiently may be understood in the sense that only...

1530 | Embedded image coding using zerotrees of wavelet coefficients
- Shapiro
- 1993
Citation Context: ...ce that the idea of taking into account the persistence of the wavelet coefficients along the tree is not new. Several works propose to model this case of “structured sparsity”, such as [39], [40], [51]. Initial experiments on image denoising were conducted using the well-known Lena image, to which Gaussian white noise was added, yielding a peak signal-to-noise ratio (PSNR) of 20 dB. Fig. 13 compar...

1156 | Model selection and estimation in regression with grouped variables
- Yuan, Lin
- 2006
Citation Context: ...g the corresponding sum by the supremum. Two mixed norms appear quite naturally by playing with the different values of p and q: the ℓ21 and ℓ12 norms. The ℓ21 norm was used with the name Group-Lasso [22] (G-Lasso) in machine learning, but also Multiple Measurement Vectors [23] or joint sparsity [24] in signal processing. In the context of regression, the main aim of such a norm is to keep or discard ...
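The group-discarding behavior of the ℓ21 (Group-Lasso) penalty described in this excerpt comes from its proximity operator, which rescales each group by max(0, 1 − λ/‖x_g‖₂). A minimal sketch, not the paper's code; the group partition and λ below are illustrative:

```python
import numpy as np

def prox_group_lasso(x, groups, lam):
    """Proximity operator of lam * ||x||_{2,1}: each group g is
    scaled by max(0, 1 - lam / ||x_g||_2) (block soft thresholding)."""
    out = np.zeros_like(x)
    for g in groups:
        norm_g = np.linalg.norm(x[g])
        if norm_g > lam:
            out[g] = (1.0 - lam / norm_g) * x[g]
    return out

# A group whose energy falls below lam is discarded entirely.
x = np.array([3.0, 4.0, 0.1, 0.1])
groups = [[0, 1], [2, 3]]
print(prox_group_lasso(x, groups, 1.0))  # first group shrunk, second zeroed
```

This makes concrete why the penalty keeps or discards *entire* groups: the decision depends only on the group's ℓ2 energy, never on individual coefficients.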

1056 | A fast iterative shrinkage-thresholding algorithm for linear inverse problems
- Beck, Teboulle
- 2009
Citation Context: ...ng the proximity operators was given by the forward-backward algorithm studied by Combettes et al. [18]. We will refer to this algorithm as the Iterative Shrinkage/Thresholding Algorithm (ISTA) as in [19] and we restate it in Algorithm 1 for the problem studied here. Algorithm 1 (ISTA). Initialization: α^(0) ∈ C^N, k = 1, γ = ‖ΦΦ∗‖; repeat: α^(k) = prox_{λ/γ Ω}(α^(k−1) + (1/γ)...); k = k + 1; until convergence...
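As a hedged illustration of the restated Algorithm 1, here is a minimal ISTA sketch for the special case Ω = ‖·‖₁, where the proximity operator reduces to soft thresholding; the step size 1/γ with γ = ‖ΦΦ∗‖ follows the snippet's initialization (a real-valued Φ and a generic stopping count are assumed):

```python
import numpy as np

def soft(x, t):
    # Soft thresholding: proximity operator of t * ||.||_1.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(Phi, y, lam, n_iter=200):
    """Sketch of ISTA for min_a 0.5*||y - Phi a||^2 + lam*||a||_1.
    gamma = ||Phi Phi^T|| bounds the Lipschitz constant of the gradient."""
    gamma = np.linalg.norm(Phi @ Phi.T, 2)
    a = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        # Gradient step on the data term, then the shrinkage (prox) step.
        a = soft(a + (Phi.T @ (y - Phi @ a)) / gamma, lam / gamma)
    return a
```

For an orthonormal Φ this collapses to a single soft-thresholding pass, which matches the fixed point of the iteration.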

972 | Regularization and variable selection via the elastic net
- Zou, Hastie
- 2005
Citation Context: ...sets of a given cardinality, the proximity operator can be computed exactly. This particular case corresponds in fact to the so called k–support norm [31], which is closely related to the elastic net [32]. Furthermore, iterative algorithms exist, cf. [33] if one needs to compute the proximity operator in the general case. Despite the “discarding” behavior of the Group-Lasso, mixed norms with overlaps ...

745 | An iterative thresholding algorithm for linear inverse problems with a sparsity constraint
- Daubechies, Defrise, et al.
- 2004
Citation Context: ...inimizer of the convex functional (2) can be obtained by using proximal algorithms. The simplest proximal algorithm was found in the ℓ1 case by several researchers using very different approaches. In [16] Daubechies and coauthors derived the thresholded Landweber iterations using a surrogate and proved the convergence using Opial’s fixed point Theorem. In [17], Figueiredo et al. found the same algorit...

697 | Compressive sensing
- Baraniuk
- 2007
Citation Context: ... high dimensional space. In addition to the convex approaches, several other solutions were proposed for the structured sparsity problem. Among others, we can cite the model-based compressive sensing [39], and approaches based on coding theory [40] or Bayesian methods (see e.g. [41] and references therein). IV. STRUCTURED SHRINKAGE One of the main shortcomings of the various “structured block sparsity...

509 | Signal recovery by proximal forward-backward splitting
- Combettes, Wajs
- 2006
Citation Context: ...ound the same algorithm thanks to an expectation/maximization formulation. A more general version using the proximity operators was given by the forward-backward algorithm studied by Combettes et al. [18]. We will refer to this algorithm as the Iterative Shrinkage/Thresholding Algorithm (ISTA) as in [19] and we restate it in Algorithm 1 for the problem studied here. Algorithm 1 (ISTA). Initialization: α...

477 | Just relax: Convex programming methods for identifying sparse signals in noise
- Tropp
Citation Context: ... a slight abuse of notation, we choose the notation argmin to represent any minimizer, as the choice of a particular minimizer has no consequences for the rest of the paper. One can refer to [13] and [14] for discussions of the uniqueness of the ℓ1 problem. B. Short reminder of Convex optimization The algorithms proposed in this paper are issued from convex optimization methods and rely on the notion ...

352 | An EM algorithm for wavelet-based image restoration
- Figueiredo, Nowak
Citation Context: ...chers using very different approaches. In [16] Daubechies and coauthors derived the thresholded Landweber iterations using a surrogate and proved the convergence using Opial’s fixed point Theorem. In [17], Figueiredo et al. found the same algorithm thanks to an expectation/maximization formulation. A more general version using the proximity operators was given by the forward-backward algorithm studied...

335 | Sparse representations in unions of bases
- Gribonval, Nielsen
Citation Context: ...to the “morphological layer”. Such an approach has been proposed in [4] as hybrid model for audio signals, and in [5] as morphological model for images. A more theoretical study has been performed in [6], where sufficient conditions that guarantee the uniqueness of a sparse representation in union of orthogonal bases were obtained. In addition to these observations, we notice a grouping effect of the...

264 | Sparse solutions to linear inverse problems with multiple measurement vectors
- Cotter, Rao, et al.
- 2005
Citation Context: ...rally by playing with the different values of p and q: the ℓ21 and ℓ12 norms. The ℓ21 norm was used with the name Group-Lasso [22] (G-Lasso) in machine learning, but also Multiple Measurement Vectors [23] or joint sparsity [24] in signal processing. In the context of regression, the main aim of such a norm is to keep or discard entire groups of coefficients. Indeed, if we consider the special case of ...

255 | Proximité et dualité dans un espace hilbertien
- Moreau
- 1965
Citation Context: ...blem. B. Short reminder of Convex optimization The algorithms proposed in this paper are issued from convex optimization methods and rely on the notion of the proximity operator, introduced by Moreau [15], which allows one to deal with non-smooth functionals. Definition 1 (Proximity operator). Let ϕ : C^N → ℝ ∪ {+∞} be a lower semicon...
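To make the quoted Definition 1 concrete: for ϕ(u) = λ|u|, the proximity operator prox_ϕ(z) = argmin_u ½(u − z)² + λ|u| is the soft-thresholding map z ↦ sign(z)·max(|z| − λ, 0). A small numerical sketch (brute-force grid search; the grid and test point are illustrative, not from the paper):

```python
import numpy as np

def prox_numeric(phi, z, grid):
    """Brute-force scalar proximity operator:
    prox_phi(z) = argmin_u 0.5*(u - z)^2 + phi(u), searched on a grid."""
    vals = 0.5 * (grid - z) ** 2 + phi(grid)
    return grid[np.argmin(vals)]

lam = 1.0
grid = np.linspace(-5.0, 5.0, 100001)  # step 1e-4
# For phi = lam*|.|, the minimizer at z = 2.3 is 2.3 - lam = 1.3,
# i.e. exactly the soft threshold.
u = prox_numeric(lambda t: lam * np.abs(t), 2.3, grid)
print(u)
```

The same brute-force check works for any scalar convex penalty, which is a handy way to sanity-check a closed-form prox before using it inside an iterative algorithm.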

232 | Group lasso with overlap and graph lasso
- Jacob, Obozinski, et al.
Citation Context: ...rojection, we also propose another alternative operator, for which the convergence to a fixed point is warranted. Our framework also allows the inclusion of the recently introduced Latent-Group-Lasso [10], [11], to whose performance the new algorithms will also be compared. B. Outline Section II introduces the mathematical framework used for this article and Section III presents the state of the art r...

215 | Simultaneous cartoon and texture image inpainting using morphological component analysis
- Elad, Starck, et al.
- 2005
Citation Context: ...is. Hence an idea is to construct a dictionary as a union of two others, each adapted to the “morphological layer”. Such an approach has been proposed in [4] as hybrid model for audio signals, and in [5] as morphological model for images. A more theoretical study has been performed in [6], where sufficient conditions that guarantee the uniqueness of a sparse representation in union of orthogonal base...

209 | The Lasso and its dual
- Osborne, Presnell, et al.
Citation Context: ...ver, with a slight abuse of notation, we choose the notation argmin to represent any minimizer, as the choice of a particular minimizer has no consequences for the rest of the paper. One can refer to [13] and [14] for discussions of the uniqueness of the ℓ1 problem. B. Short reminder of Convex optimization The algorithms proposed in this paper are issued from convex optimization methods and rely on th...

184 | Structured variable selection with sparsity-inducing norms
- Jenatton, Audibert, et al.
Citation Context: ... a ℓ21 + ℓ1 composite norm, also known as the HiLasso [35]. Such a composite norm was used with success for Magnetoencephalography inverse problems with respect to time-frequency dictionaries [36]. In [37], the authors studied a very general mixed norm, allowing to generalize the Group-Lasso and the hierarchical sparse coding. In [38], the authors propose a family of “convex penalty functions, which en...

147 | Analysis versus synthesis in signal priors
- Elad, Milanfar, et al.
Citation Context: ... equal, then we found the mixed norms with overlaps studied in [29]. Notice that the operator E appears in the penalty, and can then be thought of as an analysis prior. Using the results discussed in [45], we can reformulate it as a constrained synthesis problem: α̂ = E^T argmin_{u : u = EE^T u} ‖y − ΦE^T u‖² + λ‖u‖∗, where ‖·‖∗ is the corresponding (possibly squared) norm used to define the penalty ...

136 | Solving monotone inclusions via compositions of nonexpansive averaged operators
- Combettes
- 2004

127 | Learning with structured sparsity
- Huang, Zhang, et al.
- 2009
Citation Context: ...convex approaches, several other solutions were proposed for the structured sparsity problem. Among others, we can cite the model-based compressive sensing [39], and approaches based on coding theory [40] or Bayesian methods (see e.g. [41] and references therein). IV. STRUCTURED SHRINKAGE One of the main shortcomings of the various “structured block sparsity” approaches exposed above, is that the defi...

109 | Recovery algorithms for vector valued data with joint sparsity constraints
- Fornasier, Rauhut
Citation Context: ...he different values of p and q: the ℓ21 and ℓ12 norms. The ℓ21 norm was used with the name Group-Lasso [22] (G-Lasso) in machine learning, but also Multiple Measurement Vectors [23] or joint sparsity [24] in signal processing. In the context of regression, the main aim of such a norm is to keep or discard entire groups of coefficients. Indeed, if we consider the special case of an orthogonal basis, on...

107 | Dictionaries for sparse representation modeling
- Rubinstein, Bruckstein, et al.
- 2010
Citation Context: ...of the signal: Gabor dictionaries (for audio signals for example), wavelet dictionaries (for images) are commonly used, among others. The dictionary can even be learned directly on a class of signals [3]. In order to be able to use the sparse principle, this step of choosing an appropriate dictionary is obviously crucial. Choose a loss in order to link the observations, or measured signals, to the so...

81 | Proximal methods for hierarchical sparse coding
- Jenatton, Mairal, et al.
- 2010
Citation Context: ...Various other kinds of structures have been proposed for refining the model of group-based sparsity. For example, in [34] a hierarchy on groups was introduced. Such a behavior allows for sparsity inside the group, in addition to sparsity between the groups. In particular, their hierarchical sparse coding included the su...

63 | Hybrid representations for audiophonic signal encoding
- Daudet, Torrésani
- 2002
Citation Context: ...ir structure) depends on the choice of the basis. Hence an idea is to construct a dictionary as a union of two others, each adapted to the “morphological layer”. Such an approach has been proposed in [4] as hybrid model for audio signals, and in [5] as morphological model for images. A more theoretical study has been performed in [6], where sufficient conditions that guarantee the uniqueness of a spa...

55 | Smoothing proximal gradient method for general structured sparse learning
- Chen, Lin, et al.
- 2011
Citation Context: ...e_N]^T, (7) [√(w_1^(i)) e_1^T, . . . , √(w_N^(i)) e_N^T]^T, where e_j is the j-th canonical basis vector of ℝ^N. A similar expansion matrix has also been used in the context of overlapping groups in [30], [44]. A direct consequence is that one can simply go back to the original space by using the adjoint operator. However, in order to be able to establish a link between the heuristic shrinkage operators pr...

51 | Sparsity and persistence: mixed norms provide simple signal models with dependent coefficients
- Kowalski, Torresani
- 2009
Citation Context: ...parsity: a certain, possibly weighted, neighborhood of a given coefficient is considered for deciding whether to keep or discard the coefficient under consideration. This idea was first introduced in [7]; it was equipped with weights and evaluated for various audio applications in [8]. For the realization of the intuitive idea that a coefficient’s neighborhood should be relevant for its impact, we co...

50 | Sparse regression using mixed norms
- Kowalski
- 2009
Citation Context: ...ators cannot be directly linked to a convex minimization problem. While the convergence of related iterative algorithms for the classical shrinkage operators and their generalizations was studied in [9] in a rather general setting, the theoretical properties of the new operators have not been considered so far. In the current contribution, we establish a formal relation between the structured shrink...

38 | Approximation accuracy, gradient methods, and error bound for structured convex optimization
- Tseng
Citation Context: ...adient descent for the non-smooth functional (2). The algorithm is very simple, but converges slowly in practice. Recent advances in convex optimization lead to more efficient algorithms; we refer to [20] for a thorough discussion of proximal algorithms and their accelerations. Algorithm 2 describes the Fast Iterative Shrinkage/Thresholding Algorithm as proposed in [19]. Algorithm 2 (FISTA). Initial...
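A hedged sketch of the accelerated variant described here (FISTA), again for the ℓ1 special case: the ISTA step is applied at an extrapolated point z, with the classical momentum sequence t_k; the γ step mirrors the ISTA initialization quoted above. Details such as the iteration count are illustrative:

```python
import numpy as np

def soft(x, t):
    # Soft thresholding: proximity operator of t * ||.||_1.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fista(Phi, y, lam, n_iter=200):
    """Sketch of FISTA for min_a 0.5*||y - Phi a||^2 + lam*||a||_1:
    an ISTA step taken at the extrapolated point z, plus momentum."""
    gamma = np.linalg.norm(Phi @ Phi.T, 2)
    a = np.zeros(Phi.shape[1])
    z, t = a.copy(), 1.0
    for _ in range(n_iter):
        a_next = soft(z + (Phi.T @ (y - Phi @ z)) / gamma, lam / gamma)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Extrapolation: overshoot in the direction of the last step.
        z = a_next + ((t - 1.0) / t_next) * (a_next - a)
        a, t = a_next, t_next
    return a
```

The only change with respect to plain ISTA is where the prox-gradient step is evaluated, which is what yields the improved O(1/k²) objective decay.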

36 | CHiLasso: A collaborative hierarchical sparse modeling framework (arXiv:1006.1346)
- Sprechmann, Ramírez, et al.
- 2010
Citation Context: ...e group, in addition to sparsity between the groups. In particular, their hierarchical sparse coding included the sums of convex penalties such as a ℓ21 + ℓ1 composite norm, also known as the HiLasso [35]. Such a composite norm was used with success for Magnetoencephalography inverse problems with respect to time-frequency dictionaries [36]. In [37], the authors studied a very general mixed norm, allow...

35 | Audio denoising by time-frequency block thresholding
- Yu, Mallat, et al.
Citation Context: ...o denoising, using “real-life” signals and an overcomplete Gabor dictionary. The latter dictionary was also employed by the state of the art in audio denoising, namely the Block-Thresholding algorithm [50]. To compare these approaches, we use WG-Lasso with a neighborhood extending over time with 4 coefficients before and after the center coefficient and employ a tight Gabor frame with Hann window of le...

30 | Nested iterative algorithms for convex constrained image recovery problems
- Chaux, Pesquet, et al.
Citation Context: ...ure synthesis approach, is actually exactly the problem of the Latent-Group-Lasso [10], [11]. Seeking an estimate of α as a minimizer of F, it is possible to apply an algorithm as the one proposed in [46], where the proximity operator of the sum of two convex functions is derived from a Douglas–Rachford algorithm. We present such an algorithm with the “ISTA” framework in Algorithm 4, but it can be emb...

29 | Exclusive Lasso for multi-task feature selection.
- Zhou, Jin, et al.
- 2010
Citation Context: ...der the special case of an orthogonal basis, only the most energetic groups remain. The ℓ12 norm was introduced under the name of Elitist-Lasso [7], [9] (E-Lasso), and later called Exclusive Lasso in [25]. With such a penalty, and if Φ is an orthogonal basis, we keep the biggest coefficients relative to the others. Such behavior can be expected in applications such as source separation [12]. 2) Exten...

28 | Group lasso with overlaps: The latent group lasso approach (arXiv:1110.0413)
- Obozinski, Jacob, et al.
- 2011
Citation Context: ...ion, we also propose another alternative operator, for which the convergence to a fixed point is warranted. Our framework also allows the inclusion of the recently introduced Latent-Group-Lasso [10], [11], to whose performance the new algorithms will also be compared. B. Outline Section II introduces the mathematical framework used for this article and Section III presents the state of the art related...

23 | Incorporating information on neighbouring coefficients into wavelet estimation
- Cai, Silverman
- 2001

22 | A primal-dual algorithm for group sparse regularization with overlapping groups
- Mosci, Villa, et al.
- 2010
Citation Context: ... can be computed exactly. This particular case corresponds in fact to the so called k–support norm [31], which is closely related to the elastic net [32]. Furthermore, iterative algorithms exist, cf. [33] if one needs to compute the proximity operator in the general case. Despite the “discarding” behavior of the Group-Lasso, mixed norms with overlaps have been studied in [29]. Again, the proximity ope...

20 | The Lp spaces with mixed norm
- Benedek, Panzone
- 1961
Citation Context: ...imity operators which will be used later. Other kinds of grouping structures which appear in the literature are presented afterwards. A. Mixed norms Mixed norms were introduced by Benedek and Panzone [21] in the early 1960’s in mathematics. 1) Definition on two levels: We give here the general definition as in [7], [9]. Definition 2 (Two-level mixed norms). Let x ∈ ℝ^N = ℝ^{G×M} be indexed by a double i...
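The two-level definition quoted here can be sketched numerically: arranging x as a G×M array (groups as rows), the ℓ_{p,q} mixed norm takes the ℓ_p norm inside each group and then the ℓ_q norm across the resulting group norms. Unit weights are assumed for simplicity (the paper's definition includes a weight sequence w):

```python
import numpy as np

def mixed_norm(x, p, q):
    """Two-level mixed norm of x indexed by (group g, member m):
    l_p norm inside each group (rows), then l_q norm across groups."""
    inner = np.sum(np.abs(x) ** p, axis=1) ** (1.0 / p)
    return np.sum(inner ** q) ** (1.0 / q)

x = np.array([[3.0, 4.0],
              [0.0, 0.0]])
print(mixed_norm(x, 2, 1))  # l21: group norms are 5 and 0, so the sum is 5
```

Swapping p and q changes the behavior completely: ℓ21 sums group energies (favoring few active groups), while ℓ12 penalizes the squared sum of within-group ℓ1 norms (favoring few active members per group), which is the Group-Lasso vs. Elitist-Lasso contrast discussed in the surrounding excerpts.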

19 | A family of penalty functions for structured sparsity
- Micchelli, Morales, et al.
- 2010
Citation Context: ...inverse problems with respect to time-frequency dictionaries [36]. In [37], the authors studied a very general mixed norm, allowing to generalize the Group-Lasso and the hierarchical sparse coding. In [38], the authors propose a family of “convex penalty functions, which encode this prior knowledge by means of a set of constraints on the absolute values of the regression coefficients”. In practice, the...

17 | Beyond the narrowband approximation: Wideband convex methods for under-determined reverberant audio source separation
- Kowalski, Vincent, et al.
- 2010
Citation Context: ...rrupted by an additive noise. However, this approach can be extended to more general inverse problems where several signals have to be estimated from several measurements such as in source separation [12]. Remark 2. The functionals appearing in (1) and (2) are convex but not necessarily strictly convex. Then, the set of minimizers is not necessarily a singleton. However, with a slight abuse of notatio...

17 | Sparse prediction with the k-support norm
- Argyriou, Foygel, et al.
- 2012

15 | Hierarchical penalization
- Szafranski, Grandvalet, et al.
- 2007
Citation Context: ...not computable in a closed form. In fact, one needs to compute a “Group-Lasso” proximity operator with weights varying in groups, which does not admit a closed form. It is interesting to note that in [27] a hierarchical formulation of the dependencies leads to an ℓ_{4/3,1} mixed norm. Notice that the mixed norms as defined here do not consider any overlap between the groups. The need for overlapping gro...

15 | Structured sparsity: from mixed norms to structured shrinkage
- Kowalski, Torresani
- 2009
Citation Context: ...ructure, and fixing the groups can be too rigid. Instead of defining groups, and therewith keeping or discarding entire blocks of coefficients, a notion of neighborhood-based selection was proposed in [42]. The introduction of this neighborhood gives rise to “social sparsity”: a decision can be made coefficient by coefficient by taking into account the “weight” of a coefficient’s neighborhood. The latt...

14 | Group sparsity with overlapping partition functions
- Peyre, Fadili
- 2011
Citation Context: ...ds to an ℓ_{4/3,1} mixed norm. Notice that the mixed norms as defined here do not consider any overlap between the groups. The need for overlapping groups was recognized by many authors, see [10], [28]–[30], and different strategies have been proposed. B. A step beyond mixed norms In [10], [11], starting from the observation that the Group-Lasso discards all coefficients in a given group, the authors def...

13 | Improving M/EEG source localization with an inter-condition sparse prior
- Gramfort, Kowalski
- 2009
Citation Context: ...r can be expected in applications such as source separation [12]. 2) Extension to 3 levels: This notion of mixed norms can be extended to more than two levels. On three levels, the definition becomes [26] Definition 3 (Three-level mixed norms). Let x ∈ ℝ^N = ℝ^{K×G×M} be indexed by a triple index (k, g, m) ∈ ℕ³ such that x = (x_{k,g,m}). Let p, q, r ≥ 1 and w ∈ ℝ^N_{+,∗} be a sequence...

9 | Functional brain imaging with M/EEG using structured sparsity in time-frequency dictionaries
- Gramfort, Strohmeier, et al.
- 2013
Citation Context: ...s such as a ℓ21 + ℓ1 composite norm, also known as the HiLasso [35]. Such a composite norm was used with success for Magnetoencephalography inverse problems with respect to time-frequency dictionaries [36]. In [37], the authors studied a very general mixed norm, allowing to generalize the Group-Lasso and the hierarchical sparse coding. In [38], the authors propose a family of “convex penalty functions,...

8 | Mixed norms with overlapping groups as signal priors
- Bayram
- 2011
Citation Context: ...ive algorithms exist, cf. [33] if one needs to compute the proximity operator in the general case. Despite the “discarding” behavior of the Group-Lasso, mixed norms with overlaps have been studied in [29]. Again, the proximity operator has no closed form, but an iterative scheme is proposed. The mixed norm with overlaps corresponds actually to a particular case of the regularizer proposed in [30], whe...

5 | A hybrid scheme for encoding audio signal using hidden Markov models of waveforms
- Molla, Torrésani
- 2005
Citation Context: ...gnals of the form y = ∑_{k∈∆} x_k ϕ_k + b where b is an additive Gaussian noise, and ∆ is a structured sparse significance map. The latter is generated using fixed frequency Markov chains as introduced in [49], drawing the synthesis coefficients x_k from a standard normal distribution. An example of such a map is displayed in Fig. 4. This produces an overall signal-to-noise ratio of about 5 dB.

4 | Extension of the global matched filter to structured groups of atoms: Application to harmonic signals
- Fuchs
- 2011
Citation Context: ...s leads to an ℓ_{4/3,1} mixed norm. Notice that the mixed norms as defined here do not consider any overlap between the groups. The need for overlapping groups was recognized by many authors, see [10], [28]–[30], and different strategies have been proposed. B. A step beyond mixed norms In [10], [11], starting from the observation that the Group-Lasso discards all coefficients in a given group, the author...

4 | Exploiting correlation in sparse signal recovery problems: Multiple measurement vectors, block sparsity, and time-varying sparsity
- Zhang, Rao
- 2011
Citation Context: ...lutions were proposed for the structured sparsity problem. Among others, we can cite the model-based compressive sensing [39], and approaches based on coding theory [40] or Bayesian methods (see e.g. [41] and references therein). IV. STRUCTURED SHRINKAGE One of the main shortcomings of the various “structured block sparsity” approaches exposed above, is that the definition of the groups must be done a...

2 | Audio denoising by generalized time-frequency thresholding
- Siedenburg, Dörfler
- 2012
Citation Context: ...volution algorithms. However, it can be interesting to have an asymmetric or a smoother window. A more detailed link between (audio) signal characteristics and optimal neighborhood choice is given in [48]. Here, we rather stick with basic neighborhood shapes in order to clearly present the underlying principles. Finally, let us note that for all the experiments, we chose to initialize the algorithms w...

1 | Structured sparsity for audio signals (Proc. DAFx)
- Siedenburg, Dörfler
- 2011
Citation Context: ...idered for deciding whether to keep or discard the coefficient under consideration. This idea was first introduced in [7]; it was equipped with weights and evaluated for various audio applications in [8]. For the realization of the intuitive idea that a coefficient’s neighborhood should be relevant for its impact, we construct structured shrinkage operators which are directly derived from classical p...