Fused sparsity and robust estimation for linear models with unknown variance (2012)
Venue: NIPS
Citations: 8 (2 self)
Citations
8902 | Distinctive image features from scale-invariant keypoints
- Lowe
- 2004
Citation Context: ...or. Thus, estimation of F from a family of matching points {x_i ↔ x′_i; i = 1, …, n} is a problem of linear regression. Typically, matches are computed by comparing local descriptors (such as SIFT [16]) and, for images of reasonable resolution, hundreds of matching points are found. The computation of the fundamental matrix would not be a problem in this context of large sample size / low dimension...
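The framing above treats each match as one linear observation: the epipolar constraint x′_i^⊤ F x_i = 0 is linear in the nine entries of F. The following is a minimal least-squares sketch of that regression view (standard eight-point-algorithm material, assumed here; it is exactly the outlier-fragile estimator the paper improves upon, not the paper's SRDS):

```python
import numpy as np

def fundamental_matrix_lstsq(pts1, pts2):
    """Plain least-squares estimate of F from matched points.

    pts1, pts2 : (n, 2) arrays of matching image coordinates, n >= 8.
    No outlier handling: a single bad match can ruin the estimate.
    """
    x1, y1 = pts1[:, 0], pts1[:, 1]
    x2, y2 = pts2[:, 0], pts2[:, 1]
    ones = np.ones_like(x1)
    # One row per match: x2^T F x1 = 0, rewritten as a dot product with
    # the flattened F: [x2*x1, x2*y1, x2, y2*x1, y2*y1, y2, x1, y1, 1].
    A = np.stack([x2*x1, x2*y1, x2, y2*x1, y2*y1, y2, x1, y1, ones], axis=1)
    # Minimize ||A f||_2 subject to ||f||_2 = 1: smallest right singular vector.
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Project onto rank-2 matrices, since a fundamental matrix is singular.
    U, s, Vt = np.linalg.svd(F)
    s[2] = 0.0
    return U @ np.diag(s) @ Vt
```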
4728 | Multiple View Geometry in Computer Vision, 2nd edn
- Hartley, Zisserman
- 2004
Citation Context: ...indices i ∈ I ⊂ {1, ..., n}, called inliers. The indices not belonging to I will be referred to as outliers. The setting we are interested in is the one frequently encountered in computer vision [13, 25]: the dimensionality k of θ∗ is small compared to n, but the presence of outliers causes the complete failure of the least squares estimator. In what follows, we use the standard assumption that the...
1361 | Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optimization Methods and Software
- Sturm
- 1999
Citation Context: ...be transformed into linear inequalities, except the last one, which is a second-order cone constraint. The problems of this type can be efficiently solved by various standard toolboxes such as SeDuMi [22] or TFOCS [1]. 2.4 Finite sample risk bound To provide theoretical guarantees for our estimator, we impose the by now usual assumption of restricted eigenvalues on a suitably chosen matrix. This assum...
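As context for the excerpt, here is a hedged, generic sketch of such a program, with linear inequalities plus one second-order cone constraint, written with cvxpy as a stand-in for the MATLAB toolboxes SeDuMi and TFOCS; all data here are synthetic placeholders, not values from the paper:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n = 3
F = rng.standard_normal((5, n))
g = rng.standard_normal(5)
c = rng.standard_normal(n)

x = cp.Variable(n)
constraints = [
    x >= -1, x <= 1,                        # linear inequalities
    cp.norm(F @ x + g, 2) <= c @ x + 10.0,  # second-order cone constraint
]
prob = cp.Problem(cp.Minimize(cp.sum_squares(x - 1)), constraints)
prob.solve()
print(x.value)
```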
861 | The Dantzig selector: Statistical estimation when p is much larger than n
- Candes, Tao
- 2007
Citation Context: ...Lasso and the Dantzig Selector (DS), rely on convex relaxation of the ℓ0-norm penalty leading to a convex program that involves the ℓ1-norm of β. More precisely, for a given λ̄ > 0, the Lasso and the DS [26, 4, 5, 3] are defined as

β̂^L = arg min_{β ∈ ℝ^p} { (1/2)‖Y − Xβ‖₂² + λ̄‖β‖₁ }   (Lasso)
β̂^DS = arg min ‖β‖₁ subject to ‖X^⊤(Y − Xβ)‖_∞ ≤ λ̄.   (DS)

The performance of these algorithms depends heavily on the choice of the...
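As a concrete companion to the two definitions quoted above, here is a minimal cvxpy sketch (cvxpy assumed available; `lam` plays the role of λ̄). This only restates the displayed programs, not the paper's variance-adaptive estimator:

```python
import cvxpy as cp

def lasso(X, Y, lam):
    # Penalized least squares with an l1 penalty on the coefficients.
    beta = cp.Variable(X.shape[1])
    obj = 0.5 * cp.sum_squares(Y - X @ beta) + lam * cp.norm1(beta)
    cp.Problem(cp.Minimize(obj)).solve()
    return beta.value

def dantzig_selector(X, Y, lam):
    # Minimal l1 norm subject to a sup-norm bound on the correlations
    # between covariates and residuals.
    beta = cp.Variable(X.shape[1])
    cons = [cp.norm(X.T @ (Y - X @ beta), 'inf') <= lam]
    cp.Problem(cp.Minimize(cp.norm1(beta)), cons).solve()
    return beta.value
```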
695 | Regression shrinkage and selection via the lasso
- Tibshirani
- 1996
Citation Context: ...Lasso and the Dantzig Selector (DS), rely on convex relaxation of the ℓ0-norm penalty leading to a convex program that involves the ℓ1-norm of β. More precisely, for a given λ̄ > 0, the Lasso and the DS [26, 4, 5, 3] are defined as

β̂^L = arg min_{β ∈ ℝ^p} { (1/2)‖Y − Xβ‖₂² + λ̄‖β‖₁ }   (Lasso)
β̂^DS = arg min ‖β‖₁ subject to ‖X^⊤(Y − Xβ)‖_∞ ≤ λ̄.   (DS)

The performance of these algorithms depends heavily on the choice of the...
678 | The restricted isometry property and its implications for compressed sensing. Comptes Rendus de l'Académie des Sciences
- Candès
- 2008
Citation Context: ...Lasso and the Dantzig Selector (DS), rely on convex relaxation of the ℓ0-norm penalty leading to a convex program that involves the ℓ1-norm of β. More precisely, for a given λ̄ > 0, the Lasso and the DS [26, 4, 5, 3] are defined as

β̂^L = arg min_{β ∈ ℝ^p} { (1/2)‖Y − Xβ‖₂² + λ̄‖β‖₁ }   (Lasso)
β̂^DS = arg min ‖β‖₁ subject to ‖X^⊤(Y − Xβ)‖_∞ ≤ λ̄.   (DS)

The performance of these algorithms depends heavily on the choice of the...
467 | Simultaneous analysis of Lasso and Dantzig selector. The Annals of Statistics
- Bickel, Ritov, et al.
- 2009
Citation Context: ...Lasso and the Dantzig Selector (DS), rely on convex relaxation of the ℓ0-norm penalty leading to a convex program that involves the ℓ1-norm of β. More precisely, for a given λ̄ > 0, the Lasso and the DS [26, 4, 5, 3] are defined as

β̂^L = arg min_{β ∈ ℝ^p} { (1/2)‖Y − Xβ‖₂² + λ̄‖β‖₁ }   (Lasso)
β̂^DS = arg min ‖β‖₁ subject to ‖X^⊤(Y − Xβ)‖_∞ ≤ λ̄.   (DS)

The performance of these algorithms depends heavily on the choice of the...
252 | Computer Vision: Algorithms and Applications
- Szeliski
- 2011
Citation Context: ...indices i ∈ I ⊂ {1, ..., n}, called inliers. The indices not belonging to I will be referred to as outliers. The setting we are interested in is the one frequently encountered in computer vision [13, 25]: the dimensionality k of θ∗ is small compared to n, but the presence of outliers causes the complete failure of the least squares estimator. In what follows, we use the standard assumption that the...
128 | On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery
- Strecha, Hansen, et al.
- 2008
Citation Context: ...obust estimation of F are crucial steps for performing 3D reconstruction. Here, we apply the SRDS to the problem of estimation of F for 10 pairs of consecutive images provided by the fountain dataset [21]: the 11 images are shown at the bottom of Fig. 1. Using SIFT descriptors, we found more than 17,000 point matches in most pairs of images among the 10 pairs we are considering. The CPU time for compu...
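The matching step mentioned in the excerpt can be reproduced along these lines with OpenCV's SIFT implementation; the file names are placeholders and the 0.75 ratio-test threshold is a conventional choice, not one taken from the paper:

```python
import cv2

img1 = cv2.imread("fountain_00.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("fountain_01.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test: keep a match only if its best descriptor distance is
# clearly smaller than the second-best one.
pairs = cv2.BFMatcher().knnMatch(des1, des2, k=2)
matches = [m for m, n in pairs if m.distance < 0.75 * n.distance]
points = [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in matches]
```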
120 | Templates for convex cone problems with applications to sparse signal recovery
- Becker, Candès, et al.
- 2011
Citation Context: ...d into linear inequalities, except the last one, which is a second-order cone constraint. The problems of this type can be efficiently solved by various standard toolboxes such as SeDuMi [22] or TFOCS [1]. 2.4 Finite sample risk bound To provide theoretical guarantees for our estimator, we impose the by now usual assumption of restricted eigenvalues on a suitably chosen matrix. This assumption, stated...
82 | Sparsity and smoothness via the fused lasso
- Tibshirani
- 2005
Citation Context: ..., ℓ2 and ℓ∞ norms, corresponding respectively to the sum of absolute values, the square root of the sum of squares, and the maximum of the coefficients of v. The term "fused" sparsity, introduced by [27], originates from the case where Mβ is the discrete derivative of a signal β and the aim is to minimize the total variation, see [12, 19] for a recent overview and some asymptotic results. For general...
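To make the total-variation special case concrete, here is a minimal fused-lasso sketch in which M is the discrete first-difference operator, so ‖Mβ‖₁ is the total variation of β (cvxpy assumed; this illustrates the penalty only, not the paper's estimator with unknown variance):

```python
import cvxpy as cp
import numpy as np

def tv_approximation(Y, lam):
    """Fused-lasso fit of a 1-D signal: data fidelity plus total variation."""
    p = len(Y)
    M = np.diff(np.eye(p), axis=0)   # (p-1) x p first-difference matrix
    beta = cp.Variable(p)
    obj = 0.5 * cp.sum_squares(Y - beta) + lam * cp.norm1(M @ beta)
    cp.Problem(cp.Minimize(obj)).solve()
    return beta.value                # piecewise-constant estimate
```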
57 | Square-root lasso: Pivotal recovery of sparse signals via conic programming
- Belloni, Chernozhukov, et al.
- 2011
Citation Context: ...ceived special attention in recent years, cf. [10] and the references therein, with the introduction of computationally efficient and theoretically justified σ-adaptive procedures: the square-root Lasso [2] (a.k.a. scaled Lasso [24]) and the ℓ1-penalized log-likelihood minimization [20]. In the present work, we are interested in the setting where β∗ is not necessarily sparse, but for a known q × p matri...
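For reference, the square-root Lasso of [2] replaces the squared residual norm of the Lasso by the norm itself, which makes the optimal tuning parameter free of σ; a minimal cvxpy rendering (my sketch, with the common 1/√n scaling):

```python
import cvxpy as cp
import numpy as np

def sqrt_lasso(X, Y, lam):
    # The residual norm (not its square) keeps the objective pivotal in sigma.
    n, p = X.shape
    beta = cp.Variable(p)
    obj = cp.norm(Y - X @ beta, 2) / np.sqrt(n) + lam * cp.norm1(beta)
    cp.Problem(cp.Minimize(obj)).solve()
    return beta.value
```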
50 | Highly robust error correction by convex programming.
- Candes, Randall
- 2008
Citation Context: ...causes the complete failure of the least squares estimator. In what follows, we use the standard assumption that the matrix (1/n)A^⊤A has diagonal entries equal to one. Following the ideas developed in [6, 7, 8, 18, 15], we introduce a new vector ω ∈ ℝ^n that serves to characterize the outliers. If an entry ω_i of ω is nonzero, then the corresponding observation Y_i is an outlier. This leads to the model: Y = Aθ∗ + √n...
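A simplified rendering of the augmented model quoted above: one extra coefficient per observation flags a potential outlier, and an ℓ1 penalty keeps ω sparse. This sketch assumes a known noise scale, so it illustrates the general device of [6, 7, 8, 18, 15] rather than the paper's SRDS with unknown variance:

```python
import cvxpy as cp
import numpy as np

def robust_regression(A, Y, lam):
    """Fit Y ~ A theta + sqrt(n) omega with a sparse outlier vector omega."""
    n, k = A.shape
    theta, omega = cp.Variable(k), cp.Variable(n)
    resid = Y - A @ theta - np.sqrt(n) * omega
    obj = 0.5 * cp.sum_squares(resid) + lam * cp.norm1(omega)
    cp.Problem(cp.Minimize(obj)).solve()
    return theta.value, omega.value   # nonzero omega_i flags observation i
```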
36 | Locally adaptive regression splines
- Mammen, van de Geer
- 1997
Citation Context: ...n the problem of multiple change-point detection, which is an important particular instance of fused sparsity. There are some workarounds to circumvent this limitation in that particular setting, see [17, 11]. The extension of this kind of argument to the case of unknown σ∗ is an open problem we intend to tackle in the near future. 3 Application to robust estimation This methodology can be applied in th...
29 | Properties and refinements of the fused lasso
- Rinaldo
- 2009
Citation Context: ...the coefficients of v. The term "fused" sparsity, introduced by [27], originates from the case where Mβ is the discrete derivative of a signal β and the aim is to minimize the total variation, see [12, 19] for a recent overview and some asymptotic results. For general matrices M, tight risk bounds were proved in [14]. We adopt here this framework of general M and aim at designing a computationally effi...
26 | Multiple change-point estimation with a total variation penalty
- Harchaoui, Lévy-Leduc
- 2010
Citation Context: ...n the problem of multiple change-point detection, which is an important particular instance of fused sparsity. There are some workarounds to circumvent this limitation in that particular setting, see [17, 11]. The extension of this kind of argument to the case of unknown σ∗ is an open problem we intend to tackle in the near future. 3 Application to robust estimation This methodology can be applied in th...
20 | Robust Lasso with missing and grossly corrupted observations. IEEE Transactions on Information Theory
- Nguyen, Tran
- 2013
Citation Context: ...causes the complete failure of the least squares estimator. In what follows, we use the standard assumption that the matrix (1/n)A^⊤A has diagonal entries equal to one. Following the ideas developed in [6, 7, 8, 18, 15], we introduce a new vector ω ∈ ℝ^n that serves to characterize the outliers. If an entry ω_i of ω is nonzero, then the corresponding observation Y_i is an outlier. This leads to the model: Y = Aθ∗ + √n...
16 | High-dimensional instrumental variables regression and confidence sets. arXiv preprint arXiv:1105.2454
- Gautier, Tsybakov
- 2011
Citation Context: ...the arguments used to prove analogous results for ordinary sparsity, but contains some qualitatively novel ideas. More precisely, the cornerstone of the proof of risk bounds for the Dantzig selector [4, 3, 9] is that the true parameter β∗ is a feasible solution. In our case, this argument cannot be used anymore. Our proposal is then to specify another vector β̃ that simultaneously satisfies the following...
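The feasibility argument alluded to here is, in its standard form, a union-bound tail estimate; it is reconstructed below under the usual assumptions (ξ ∼ N(0, σ²Iₙ), columns of X normalized so that ‖Xⱼ‖₂² = n), and is not taken verbatim from the paper:

```latex
% Standard feasibility step behind risk bounds for the Dantzig selector:
\[
  \bigl\| X^\top (Y - X\beta^*) \bigr\|_\infty
  = \| X^\top \xi \|_\infty
  \le \sigma \sqrt{2(1+\tau)\, n \log p}
  \qquad \text{with probability at least } 1 - 2p^{-\tau},
\]
% so choosing \bar\lambda at this level makes \beta^* feasible for (DS).
% Note that this choice requires knowing \sigma, which is precisely what
% the unknown-variance setting rules out.
```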
14 | ℓ1-penalization for mixture regression models
- Städler, Bühlmann, van de Geer
- 2010
Citation Context: ...h the introduction of computationally efficient and theoretically justified σ-adaptive procedures: the square-root Lasso [2] (a.k.a. scaled Lasso [24]) and the ℓ1-penalized log-likelihood minimization [20]. In the present work, we are interested in the setting where β∗ is not necessarily sparse, but for a known q × p matrix M, the vector Mβ∗ is sparse. We call this setting "fused sparsity scenario"...
10 | L1-penalized robust estimation for a class of inverse problems arising in multiview geometry
- Dalalyan, Keriven
Citation Context: ...causes the complete failure of the least squares estimator. In what follows, we use the standard assumption that the matrix (1/n)A^⊤A has diagonal entries equal to one. Following the ideas developed in [6, 7, 8, 18, 15], we introduce a new vector ω ∈ ℝ^n that serves to characterize the outliers. If an entry ω_i of ω is nonzero, then the corresponding observation Y_i is an outlier. This leads to the model: Y = Aθ∗ + √n...
10 | High-dimensional regression with unknown variance
- Giraud, Huet, et al.
- 2012
Citation Context: ...most applications, the latter is unavailable. It is therefore vital to design statistical procedures that estimate β and σ in a joint fashion. This topic received special attention in recent years, cf. [10] and the references therein, with the introduction of computationally efficient and theoretically justified σ-adaptive procedures: the square-root Lasso [2] (a.k.a. scaled Lasso [24]) and the ℓ1-penali...
9 | Robust regression through the Huber’s criterion and adaptive Lasso penalty
- Lambert-Lacroix, Zwald
- 2011
Citation Context: ...causes the complete failure of the least squares estimator. In what follows, we use the standard assumption that the matrix (1/n)A^⊤A has diagonal entries equal to one. Following the ideas developed in [6, 7, 8, 18, 15], we introduce a new vector ω ∈ ℝ^n that serves to characterize the outliers. If an entry ω_i of ω is nonzero, then the corresponding observation Y_i is an outlier. This leads to the model: Y = Aθ∗ + √n...
7 | Scaled sparse linear regression. arXiv:1104.4595
- Sun, Zhang
- 2011
Citation Context: ...in recent years, cf. [10] and the references therein, with the introduction of computationally efficient and theoretically justified σ-adaptive procedures: the square-root Lasso [2] (a.k.a. scaled Lasso [24]) and the ℓ1-penalized log-likelihood minimization [20]. In the present work, we are interested in the setting where β∗ is not necessarily sparse, but for a known q × p matrix M, the vector Mβ∗ is spa...
7 | On the conditions used to prove oracle results for the Lasso
- van de Geer, Bühlmann
- 2009
Citation Context: ...sumption of restricted eigenvalues on a suitably chosen matrix. This assumption, stated in Definition 2.1 below, was introduced and thoroughly discussed by [3]; we also refer the interested reader to [28]. Definition 2.1. We say that an n × q matrix A satisfies the restricted eigenvalue condition RE(s, 1) if

κ(s, 1) ≜ min_{|J| ≤ s} min_{‖δ_{J^c}‖₁ ≤ ‖δ_J‖₁} ‖Aδ‖₂ / (√n ‖δ_J‖₂) > 0.

We say that A satisfies the strong r...
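Computing κ(s, 1) exactly is combinatorial, but a Monte Carlo probe gives a numerical upper bound by sampling vectors in the cone ‖δ_{J^c}‖₁ ≤ ‖δ_J‖₁; this is an illustrative diagnostic of my own, not a procedure from the paper:

```python
import numpy as np

def re_constant_probe(A, s, trials=10000, seed=0):
    """Monte Carlo upper bound on kappa(s, 1) from Definition 2.1 (assumes s < q)."""
    rng = np.random.default_rng(seed)
    n, q = A.shape
    best = np.inf
    for _ in range(trials):
        J = rng.choice(q, size=s, replace=False)
        delta = np.zeros(q)
        delta[J] = rng.standard_normal(s)
        # Spread at most ||delta_J||_1 of l1-mass over the complement J^c,
        # so the cone condition of RE(s, 1) holds by construction.
        Jc = np.setdiff1d(np.arange(q), J)
        raw = rng.standard_normal(q - s)
        delta[Jc] = raw * rng.uniform() * np.abs(delta[J]).sum() / np.abs(raw).sum()
        ratio = np.linalg.norm(A @ delta) / (np.sqrt(n) * np.linalg.norm(delta[J]))
        best = min(best, ratio)
    return best   # every sampled ratio upper-bounds the true minimum
```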
6 | Comments on: ℓ1-penalization for mixture regression models
- Sun, Zhang
- 2010
Citation Context: ...eally an estimator but rather an oracle, since it exploits knowledge of the true σ∗. This is why the accuracy in estimating σ∗ is not reported in Table 1. To reduce the well-known bias toward zero [4, 23], we performed a post-processing step for all three procedures. It consisted of computing least squares estimators after removing all the covariates corresponding to vanishing coefficients of the estima...
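The refit described here can be sketched as follows (my rendering: keep the covariates whose estimated coefficients are numerically nonzero, then re-solve ordinary least squares on that support):

```python
import numpy as np

def refit_on_support(X, Y, beta_hat, tol=1e-8):
    """Debias a sparse estimate by OLS on its selected support."""
    support = np.flatnonzero(np.abs(beta_hat) > tol)
    beta_refit = np.zeros_like(beta_hat)
    if support.size:
        coef, *_ = np.linalg.lstsq(X[:, support], Y, rcond=None)
        beta_refit[support] = coef
    return beta_refit
```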
4 | Robust estimation for an inverse problem arising in multiview geometry
- Dalalyan, Keriven
Citation Context: ...causes the complete failure of the least squares estimator. In what follows, we use the standard assumption that the matrix (1/n)A^⊤A has diagonal entries equal to one. Following the ideas developed in [6, 7, 8, 18, 15], we introduce a new vector ω ∈ ℝ^n that serves to characterize the outliers. If an entry ω_i of ω is nonzero, then the corresponding observation Y_i is an outlier. This leads to the model: Y = Aθ∗ + √n...
4 | On the accuracy of ℓ1-filtering of signals with block-sparse structure
- Iouditski, Karzan, et al.
- 2011
Citation Context: ...crete derivative of a signal β and the aim is to minimize the total variation, see [12, 19] for a recent overview and some asymptotic results. For general matrices M, tight risk bounds were proved in [14]. We adopt here this framework of general M and aim at designing a computationally efficient procedure capable of handling the situation of unknown noise level and for which we are able to provide theor...