K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation
, 2006
Cited by 935 (41 self)
In recent years there has been a growing interest in the study of sparse representation of signals. Using an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse linear combinations of these atoms. Applications that use sparse representation are many and include compression, regularization in inverse problems, feature extraction, and more. Recent activity in this field has concentrated mainly on the study of pursuit algorithms that decompose signals with respect to a given dictionary. Designing dictionaries to better fit the above model can be done by either selecting one from a prespecified set of linear transforms or adapting the dictionary to a set of training signals. Both of these techniques have been considered, but this topic is largely still open. In this paper we propose a novel algorithm for adapting dictionaries in order to achieve sparse signal representations. Given a set of training signals, we seek the dictionary that leads to the best representation for each member in this set, under strict sparsity constraints. We present a new method, the K-SVD algorithm, generalizing the K-means clustering process. K-SVD is an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data. The update of the dictionary columns is combined with an update of the sparse representations, thereby accelerating convergence. The K-SVD algorithm is flexible and can work with any pursuit method (e.g., basis pursuit, FOCUSS, or matching pursuit). We analyze this algorithm and demonstrate its results both on synthetic tests and in applications on real image data.
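The alternation described in this abstract (sparse coding, then an atom-by-atom update that also refreshes the coefficients) can be sketched as follows. This is a minimal illustration and not the authors' reference implementation. The function names `omp` and `ksvd`, the choice of orthogonal matching pursuit as the pursuit method, the initialisation from random training signals, and the handling of unused atoms are all assumptions made for the example.

```python
import numpy as np


def omp(D, y, k):
    """Greedy sparse coding: pick at most k atoms of D to approximate y."""
    residual, support = y.copy(), []
    coef = np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:              # no new atom helps; stop early
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x


def ksvd(Y, n_atoms, k, n_iter=10, seed=0):
    """Alternate sparse coding with atom-by-atom rank-1 SVD updates."""
    rng = np.random.default_rng(seed)
    # Assumption: start from randomly chosen training signals, normalised.
    D = Y[:, rng.choice(Y.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    for _ in range(n_iter):
        # Stage 1: sparse-code every training signal with the current dictionary.
        X = np.column_stack([omp(D, y, k) for y in Y.T])
        # Stage 2: update one atom at a time, together with its coefficients.
        for j in range(n_atoms):
            users = np.flatnonzero(X[j])          # signals that use atom j
            if users.size == 0:
                continue                          # unused atom: left unchanged here
            # Error without atom j's contribution, restricted to its users.
            E = Y[:, users] - D @ X[:, users] + np.outer(D[:, j], X[j, users])
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, j] = U[:, 0]                     # new unit-norm atom
            X[j, users] = s[0] * Vt[0]            # matching coefficients
    return D, X
```

Restricting the SVD to the signals that already use an atom keeps each column of `X` at most `k`-sparse, which is how the sparsity constraint survives the dictionary update in this sketch.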
Greed is Good: Algorithmic Results for Sparse Approximation
, 2004
Cited by 916 (9 self)
This article presents new results on using a greedy algorithm, orthogonal matching pursuit (OMP), to solve the sparse approximation problem over redundant dictionaries. It provides a sufficient condition under which both OMP and Donoho’s basis pursuit (BP) paradigm can recover the optimal representation of an exactly sparse signal. It leverages this theory to show that both OMP and BP succeed for every sparse input signal from a wide class of dictionaries. These quasi-incoherent dictionaries offer a natural generalization of incoherent dictionaries, and the cumulative coherence function is introduced to quantify the level of incoherence. This analysis unifies all the recent results on BP and extends them to OMP. Furthermore, the paper develops a sufficient condition under which OMP can identify atoms from an optimal approximation of a nonsparse signal. From there, it argues that OMP is an approximation algorithm for the sparse problem over a quasi-incoherent dictionary. That is, for every input signal, OMP calculates a sparse approximant whose error is only a small factor worse than the minimal error that can be attained with the same number of terms.
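The abstract names the cumulative coherence function without defining it. As usually stated, mu_1(m) is the largest possible sum of absolute inner products between one atom and m other atoms, and the recovery condition is commonly quoted as mu_1(m - 1) + mu_1(m) < 1. The sketch below computes mu_1(m) under the assumption that the dictionary columns have unit norm. The function name `cumulative_coherence` is made up for this example.

```python
import numpy as np


def cumulative_coherence(D, m):
    """mu_1(m): the largest sum of m absolute inner products between one
    atom and m other atoms. Columns of D are assumed to have unit norm."""
    G = np.abs(D.T @ D)
    np.fill_diagonal(G, 0.0)                 # an atom is never compared with itself
    top_m = -np.sort(-G, axis=1)[:, :m]      # m largest off-diagonal entries per atom
    return float(top_m.sum(axis=1).max())
```

For an orthonormal basis every value is zero, and mu_1(1) reduces to the ordinary mutual coherence of the dictionary.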
The Dantzig selector: statistical estimation when p is much larger than n
, 2005
Cited by 879 (14 self)
In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations $y = Ax + z$, where $x \in \mathbb{R}^p$ is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, $n \ll p$, and the $z_i$'s are i.i.d. $N(0, \sigma^2)$. Is it possible to estimate x reliably based on the noisy data y? To estimate x, we introduce a new estimator, which we call the Dantzig selector, that is the solution to the $\ell_1$-regularization problem $\min_{\tilde{x} \in \mathbb{R}^p} \|\tilde{x}\|_{\ell_1}$ subject to $\|A^T r\|_{\ell_\infty} \le (1 + t^{-1}) \sqrt{2 \log p} \cdot \sigma$, where r is the residual vector $y - A\tilde{x}$ and t is a positive scalar. We show that if A obeys a uniform uncertainty principle (with unit-normed columns) and if the true parameter vector x is sufficiently sparse (which here roughly guarantees that the model is identifiable), then with very large probability $\|\hat{x} - x\|_{\ell_2}^2 \le C^2 \cdot 2 \log p \cdot \left(\sigma^2 + \sum_i \min(x_i^2, \sigma^2)\right)$. Our results are nonasymptotic and we give values for the constant C. In short, our estimator achieves a loss within a logarithmic factor of the ideal mean squared error one would achieve with an oracle which would supply perfect information about which coordinates are nonzero, and which were above the noise level. In multivariate regression and from a model selection viewpoint, our result says that it is possible nearly to select the best subset of variables, by solving a very simple convex program, which in fact can easily be recast as a convenient linear program (LP).
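One common way to write the linear program mentioned at the end of this abstract is shown below. It is a sketch and not taken from the abstract itself: the auxiliary vector $u$ and the shorthand $\lambda$ for the right-hand side of the constraint are introduced here for illustration.

```latex
\begin{aligned}
\min_{\tilde{x},\, u \in \mathbb{R}^p} \quad & \sum_{i=1}^{p} u_i \\
\text{subject to} \quad & -u \le \tilde{x} \le u, \\
& -\lambda \mathbf{1} \le A^{T}(y - A\tilde{x}) \le \lambda \mathbf{1},
\qquad \lambda = (1 + t^{-1}) \sqrt{2 \log p} \cdot \sigma .
\end{aligned}
```

At an optimum $u_i = |\tilde{x}_i|$, so the objective equals $\|\tilde{x}\|_{\ell_1}$, and the second pair of inequalities is the constraint $\|A^T r\|_{\ell_\infty} \le \lambda$ written componentwise.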
A tutorial on support vector regression
, 2004
Cited by 865 (3 self)
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.
Signal recovery from random measurements via Orthogonal Matching Pursuit
 IEEE TRANS. INFORM. THEORY
, 2007
Cited by 802 (9 self)
This technical report demonstrates theoretically and empirically that a greedy algorithm called Orthogonal Matching Pursuit (OMP) can reliably recover a signal with m nonzero entries in dimension d given O(m ln d) random linear measurements of that signal. This is a massive improvement over previous results for OMP, which require O(m^2) measurements. The new results for OMP are comparable with recent results for another algorithm called Basis Pursuit (BP). The OMP algorithm is faster and easier to implement, which makes it an attractive alternative to BP for signal recovery problems.
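The kind of experiment this abstract describes can be sketched in a few lines. The example assumes a Gaussian measurement matrix and signals whose nonzero entries all equal one. Both choices, along with the names `omp` and `recovery_rate`, are assumptions for illustration and are not stated in the abstract.

```python
import numpy as np


def omp(Phi, v, m):
    """Recover an m-sparse signal from measurements v = Phi @ s."""
    residual, support = v.copy(), []
    coef = np.zeros(0)
    for _ in range(m):
        if np.linalg.norm(residual) < 1e-12:
            break                                  # measurements already explained
        # Greedy step: column most correlated with the current residual.
        support.append(int(np.argmax(np.abs(Phi.T @ residual))))
        # Orthogonal step: least-squares fit on all columns chosen so far.
        coef, *_ = np.linalg.lstsq(Phi[:, support], v, rcond=None)
        residual = v - Phi[:, support] @ coef
    s_hat = np.zeros(Phi.shape[1])
    s_hat[support] = coef
    return s_hat


def recovery_rate(d, m, N, trials=20, seed=0):
    """Fraction of random trials in which OMP recovers the signal exactly."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        s = np.zeros(d)
        s[rng.choice(d, m, replace=False)] = 1.0         # m nonzero entries
        Phi = rng.standard_normal((N, d)) / np.sqrt(N)   # N random measurements
        hits += int(np.allclose(omp(Phi, Phi @ s, m), s, atol=1e-8))
    return hits / trials
```

Calling `recovery_rate(256, 4, N)` for increasing `N` gives a rough empirical picture of how many measurements are needed for a given sparsity level.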
An iterative thresholding algorithm for linear inverse problems with a sparsity constraint
, 2008
Regularization paths for generalized linear models via coordinate descent
, 2009
Cited by 724 (15 self)
We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems while the penalties include ℓ1 (the lasso), ℓ2 (ridge regression) and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods.
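For the simplest case covered by this abstract (linear regression with the lasso penalty), cyclical coordinate descent along a regularization path can be sketched as below. The objective scaling (1/2n), the warm-start strategy, the stopping rule and the names `soft_threshold` and `lasso_path` are assumptions for this illustration. It is not the authors' software, and it assumes no column of `X` is identically zero.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z towards zero by t."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_path(X, y, lambdas, n_sweeps=200, tol=1e-10):
    """Minimise (1/2n)||y - X b||^2 + lam * ||b||_1 for each lam, largest
    first, warm-starting every solve from the previous solution."""
    n, p = X.shape
    col_sq = (X ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    residual = y - X @ beta
    path = []
    for lam in sorted(lambdas, reverse=True):
        for _ in range(n_sweeps):
            max_change = 0.0
            for j in range(p):                       # cycle through coordinates
                old = beta[j]
                # Correlation of column j with the partial residual.
                rho = X[:, j] @ residual / n + col_sq[j] * old
                beta[j] = soft_threshold(rho, lam) / col_sq[j]
                if beta[j] != old:
                    residual -= X[:, j] * (beta[j] - old)
                    max_change = max(max_change, abs(beta[j] - old))
            if max_change < tol:
                break
        path.append((lam, beta.copy()))
    return path
```

Keeping the residual up to date makes each coordinate update cost O(n), and starting each solve from the previous solution is what makes computing the whole path cheap in this sketch.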
Compressive sensing
 IEEE Signal Processing Mag
, 2007
Cited by 696 (62 self)
The Shannon/Nyquist sampling theorem tells us that in order to not lose information when uniformly sampling a signal we must sample at least two times faster than its bandwidth. In many applications, including digital image and video cameras, the Nyquist rate can be so high that we end up with too many samples and must compress in order to store or transmit them. In other applications, including imaging systems (medical scanners, radars) and high-speed analog-to-digital converters, increasing the sampling rate or density beyond the current state-of-the-art is very expensive. In this lecture, we will learn about a new technique that tackles these issues using compressive sensing [1, 2]. We will replace the conventional sampling and reconstruction operations with a more general linear measurement scheme coupled with an optimization in order to acquire certain kinds of signals at a rate significantly below Nyquist.
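A toy version of "linear measurement plus optimization" is sketched below. The abstract does not say which measurement matrix or which optimization is used, so the Gaussian matrix, the ℓ1-regularized least-squares objective, the iterative soft-thresholding solver `ista` and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def ista(Phi, y, lam, n_iter=3000):
    """Minimise 0.5*||y - Phi x||^2 + lam*||x||_1 by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2       # 1 / Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x + step * Phi.T @ (y - Phi @ x)       # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft threshold
    return x


rng = np.random.default_rng(0)
n, K, M = 128, 5, 60                 # signal length, sparsity, number of measurements
x = np.zeros(n)
x[rng.choice(n, K, replace=False)] = rng.choice([-1.0, 1.0], K)
Phi = rng.standard_normal((M, n)) / np.sqrt(M)     # random linear measurement matrix
y = Phi @ x                                        # only M < n numbers are acquired
x_hat = ista(Phi, y, lam=0.01)                     # reconstruction by optimization
```

Here the signal has 128 entries but only 60 measurements are taken. The reconstruction relies on the signal being sparse, which is the kind of structure the lecture refers to with "certain kinds of signals".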
The adaptive LASSO and its oracle properties
 Journal of the American Statistical Association
Cited by 683 (10 self)
The lasso is a popular technique for simultaneous estimation and variable selection. Lasso variable selection has been shown to be consistent under certain conditions. In this work we derive a necessary condition for the lasso variable selection to be consistent. Consequently, there exist certain scenarios where the lasso is inconsistent for variable selection. We then propose a new version of the lasso, called the adaptive lasso, where adaptive weights are used for penalizing different coefficients in the ℓ1 penalty. We show that the adaptive lasso enjoys the oracle properties; namely, it performs as well as if the true underlying model were given in advance. Similar to the lasso, the adaptive lasso is shown to be near-minimax optimal. Furthermore, the adaptive lasso can be solved by the same efficient algorithm for solving the lasso. We also discuss the extension of the adaptive lasso in generalized linear models and show that the oracle properties still hold under mild regularity conditions. As a byproduct of our theory, the nonnegative garotte is shown to be consistent for variable selection.
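The claim that the adaptive lasso can be solved by a standard lasso algorithm can be illustrated with a change of variables. The abstract does not give the form of the weights. The choice $w_j = 1/|\tilde{\beta}_j|^{\gamma}$, built from an initial estimate $\tilde{\beta}$ such as ordinary least squares, is an assumption used here for illustration.

```latex
\hat{\beta} = \arg\min_{\beta} \; \|y - X\beta\|_2^2 + \lambda \sum_{j=1}^{p} w_j |\beta_j|,
\qquad w_j = \frac{1}{|\tilde{\beta}_j|^{\gamma}}, \quad \gamma > 0 .

\text{With } \theta_j = w_j \beta_j \text{ and } \tilde{X}_j = X_j / w_j : \qquad
\hat{\theta} = \arg\min_{\theta} \; \|y - \tilde{X}\theta\|_2^2 + \lambda \|\theta\|_1,
\qquad \hat{\beta}_j = \hat{\theta}_j / w_j .
```

Since $X\beta = \sum_j (X_j / w_j)(w_j \beta_j) = \tilde{X}\theta$, the weighted problem is an ordinary lasso on rescaled columns, so any lasso solver can be reused.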
Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ¹ minimization
 PROC. NATL ACAD. SCI. USA 100 2197–202
, 2002
Cited by 633 (38 self)
Given a ‘dictionary’ $D = \{d_k\}$ of vectors $d_k$, we seek to represent a signal S as a linear combination $S = \sum_k \gamma(k) d_k$, with scalar coefficients $\gamma(k)$. In particular, we aim for the sparsest representation possible. In general, this requires a combinatorial optimization process. Previous work considered the special case where D is an overcomplete system consisting of exactly two orthobases, and has shown that, under a condition of mutual incoherence of the two bases, and assuming that S has a sufficiently sparse representation, this representation is unique and can be found by solving a convex optimization problem: specifically, minimizing the ℓ¹ norm of the coefficients γ. In this paper, we obtain parallel results in a more general setting, where the dictionary D can arise from two or several bases, frames, or even less structured systems. We introduce the Spark, a measure of linear dependence in such a system; it is the size of the smallest linearly dependent subset $(d_k)$. We show that, when the signal S has a representation using fewer than Spark(D)/2 nonzeros, this representation is necessarily unique. We
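The definition of the Spark given here translates directly into a brute-force search, sketched below for small dictionaries only. The function name `spark`, the numerical tolerance and the convention of returning `None` when all columns are linearly independent are assumptions for this example.

```python
from itertools import combinations

import numpy as np


def spark(D, tol=1e-10):
    """Size of the smallest linearly dependent set of columns of D.
    Brute force, so only practical for small dictionaries.
    Returns None when all columns are linearly independent."""
    n_atoms = D.shape[1]
    for size in range(1, n_atoms + 1):
        for cols in combinations(range(n_atoms), size):
            # A set of `size` columns is dependent when its rank falls short.
            if np.linalg.matrix_rank(D[:, list(cols)], tol=tol) < size:
                return size
    return None
```

For the three vectors (1, 0), (0, 1) and (1, 1) in the plane, every pair is independent and the full set is dependent, so the Spark is 3. The uniqueness result then covers representations with fewer than 1.5 nonzeros, that is, a single atom.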