Results 11–20 of 98
Simultaneous support recovery in high dimensions: Benefits and perils of block ℓ1,∞ regularization
, 2009
The pros and cons of compressive sensing for wideband signal acquisition: Noise folding vs. dynamic range
, 2011
Abstract

Cited by 26 (5 self)
Compressive sensing (CS) exploits the sparsity present in many common signals to reduce the number of measurements needed for digital acquisition. With this reduction would come, in theory, commensurate reductions in the size, weight, power consumption, and/or monetary cost of both signal sensors and any associated communication links. This paper examines the use of CS in the design of a wideband radio receiver in a noisy environment. We formulate the problem statement for such a receiver and establish a reasonable set of requirements that a receiver should meet to be practically useful. We then evaluate the performance of a CS-based receiver in two ways: via a theoretical analysis of the expected performance, with a particular emphasis on noise and dynamic range, and via simulations that compare the CS receiver against the performance expected from a conventional implementation. On the one hand, we show that CS-based systems that aim to reduce the number of acquired measurements are somewhat sensitive to signal noise, exhibiting a 3 dB SNR loss per octave of subsampling, which parallels the classic noise-folding phenomenon. On the other hand, we demonstrate that since they sample at a lower rate, CS-based systems can potentially attain a significantly larger dynamic range. Hence, we conclude that while a CS-based system has inherent limitations that do impose some restrictions on its potential applications, it also has attributes that make it highly desirable in a number of important practical settings.
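The 3 dB-per-octave noise-folding loss quoted in this abstract can be reproduced with a short Monte Carlo sketch. Everything below is an illustrative assumption rather than the paper's experimental setup: the dimensions, the noise level, and the use of an oracle least-squares decoder on the known support.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, sigma, trials = 1024, 8, 0.1, 200

def oracle_recovery_mse(m):
    """Mean squared error of oracle least squares on the true support,
    given m compressive measurements y = Phi @ (x + z)."""
    errs = []
    for _ in range(trials):
        support = rng.choice(n, size=k, replace=False)
        x = np.zeros(n)
        x[support] = rng.choice([-1.0, 1.0], size=k)
        z = sigma * rng.standard_normal(n)            # full-rate signal noise
        Phi = rng.standard_normal((m, n)) / np.sqrt(m)
        y = Phi @ (x + z)                             # noise is folded in
        xhat, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        errs.append(np.sum((xhat - x[support]) ** 2))
    return np.mean(errs)

mse_one = oracle_recovery_mse(n // 8)    # 3 octaves of subsampling
mse_two = oracle_recovery_mse(n // 16)   # 4 octaves of subsampling
extra_loss_db = 10 * np.log10(mse_two / mse_one)
print(f"extra SNR loss for one more octave: {extra_loss_db:.1f} dB")
```

Because the noise is folded into the measurements before subsampling, halving the measurement rate roughly doubles the recovery error, i.e. costs about 3 dB.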
On the fundamental limits of adaptive sensing
, 2011
Abstract

Cited by 25 (3 self)
Suppose we can sequentially acquire arbitrary linear measurements of an n-dimensional vector x, resulting in the linear model y = Ax + z, where z represents measurement noise. If the signal is known to be sparse, one would expect the following folk theorem to be true: choosing an adaptive strategy which cleverly selects the next row of A based on what has been previously observed should do far better than a non-adaptive strategy which sets the rows of A ahead of time, thus not trying to learn anything about the signal in between observations. This paper shows that the folk theorem is false. We prove that the advantages offered by clever adaptive strategies and sophisticated estimation procedures, no matter how intractable, over classical compressed acquisition/recovery schemes are, in general, minimal.
Rate Minimaxity of the Lasso and Dantzig Selector for the ℓq Loss in ℓr Balls
 Journal of Machine Learning Research
, 2010
Abstract

Cited by 24 (5 self)
We consider the estimation of regression coefficients in a high-dimensional linear model. For regression coefficients in ℓr balls, we provide lower bounds for the minimax ℓq risk and minimax quantiles of the ℓq loss for all design matrices. Under an ℓ0 sparsity condition on a target coefficient vector, we sharpen and unify existing oracle inequalities for the Lasso and Dantzig selector. We derive oracle inequalities for target coefficient vectors with many small elements and smaller threshold levels than the universal threshold. These oracle inequalities provide sufficient conditions on the design matrix for the rate minimaxity of the Lasso and Dantzig selector for the ℓq risk and loss in ℓr balls, 0 ≤ r ≤ 1 ≤ q ≤ ∞. By allowing q = ∞, our risk bounds imply the variable selection consistency of threshold Lasso and Dantzig selectors.
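The universal-threshold regularization level discussed in this abstract can be illustrated with a small sketch that solves the Lasso by proximal gradient descent (ISTA) and reads off the selected variables. The design, sparsity level, and the choice λ = 2σ√(2 log p / n) are illustrative assumptions, not the paper's conditions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, k, sigma = 200, 500, 5, 0.5
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 3.0                                   # strong, well-separated signal
y = X @ beta + sigma * rng.standard_normal(n)

def ista_lasso(X, y, lam, iters=2000):
    """Proximal gradient (ISTA) for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n_obs = X.shape[0]
    L = np.linalg.norm(X, 2) ** 2 / n_obs        # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        b = b - X.T @ (X @ b - y) / (n_obs * L)  # gradient step
        b = np.sign(b) * np.maximum(np.abs(b) - lam / L, 0.0)  # soft threshold
    return b

lam = 2 * sigma * np.sqrt(2 * np.log(p) / n)     # universal-threshold scale
bhat = ista_lasso(X, y, lam)
support = set(np.flatnonzero(np.abs(bhat) > 0.5))
```

With a signal well above the threshold level, the thresholded estimate recovers the true support, in line with the q = ∞ selection-consistency conclusion.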
Likelihood-based selection and sharp parameter estimation
 Journal of the American Statistical Association
Abstract

Cited by 20 (8 self)
In high-dimensional data analysis, feature selection becomes one effective means of dimension reduction, which proceeds with parameter estimation. Concerning the accuracy of selection and estimation, we study nonconvex constrained and regularized likelihoods in the presence of nuisance parameters. Theoretically, we show that the constrained L0 likelihood and its computational surrogate are optimal in that they achieve feature selection consistency and sharp parameter estimation, under one necessary condition required for any method to be selection consistent and to achieve sharp parameter estimation. The condition permits up to exponentially many candidate features. Computationally, we develop difference convex methods to implement the computational surrogate through primal and dual subproblems. These results establish a central role of L0-constrained and regularized likelihoods in feature selection and parameter estimation involving selection. As applications of the general method and theory, we perform feature selection in linear regression and logistic regression, and estimate a precision matrix in Gaussian graphical models. In these situations, we gain new theoretical insight and obtain favorable numerical results. Finally, we discuss an application to predicting the metastasis status of breast cancer patients from their gene expression profiles.
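The difference convex computation can be sketched for the linear regression case with the standard truncated-L1 surrogate λ min(|b|, τ): at each stage, coordinates already exceeding τ sit on the flat part of the penalty, so the convex majorant drops their penalty and large coefficients are re-fit without shrinkage bias. The solver, the choice of λ and τ, and the data below are all illustrative assumptions, not the paper's algorithm in full.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, k = 150, 300, 4
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:k] = 2.5
y = X @ beta_true + 0.3 * rng.standard_normal(n)

def weighted_ista(X, y, lam_vec, b, iters=500):
    """Proximal gradient for (1/2n)||y - Xb||^2 + sum_j lam_vec[j] * |b_j|."""
    n_obs = X.shape[0]
    L = np.linalg.norm(X, 2) ** 2 / n_obs
    for _ in range(iters):
        b = b - X.T @ (X @ b - y) / (n_obs * L)
        b = np.sign(b) * np.maximum(np.abs(b) - lam_vec / L, 0.0)
    return b

lam, tau = 0.2, 1.0
b = np.zeros(p)
for _ in range(5):                    # difference-of-convex stages
    # Coordinates with |b_j| > tau lose their penalty under the convex
    # majorant of lam * min(|b|, tau); the rest keep the full lam.
    lam_vec = lam * (np.abs(b) <= tau).astype(float)
    b = weighted_ista(X, y, lam_vec, b)

support = np.flatnonzero(np.abs(b) > 1.0)
```

After the first stage identifies the strong coefficients, later stages fit them essentially unpenalized, which is the sharpness gain over a single convex fit.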
Restricted eigenvalue conditions on subgaussian random matrices, arXiv:0912.4045v2 [math.ST]
, 2009
The Geometry of Differential Privacy: The Sparse and Approximate Cases
, 2012
Abstract

Cited by 16 (5 self)
In this work, we study trade-offs between accuracy and privacy in the context of linear queries over histograms. This is a rich class of queries that includes contingency tables and range queries, and has been a focus of a long line of work [BLR08, RR10, DRV10, HT10, HR10, LHR+10, BDKT12]. For a given set of d linear queries over a database x ∈ R^N, we seek to find the differentially private mechanism that has the minimum mean squared error. For pure differential privacy, [HT10, BDKT12] give an O(log² d) approximation to the optimal mechanism. Our first contribution is to give an O(log² d) approximation guarantee for the case of (ε, δ)-differential privacy. Our mechanism is simple and efficient, and adds carefully chosen correlated Gaussian noise to the answers. We prove its approximation guarantee relative to the hereditary discrepancy lower bound of [MN12], using tools from convex geometry. We next consider this question in the case when the number of queries exceeds the number of individuals in the database, i.e. when d > n ≥ ‖x‖1. The lower bounds used in the previous approximation algorithm no longer apply, and in fact better mechanisms are known in this setting [BLR08, RR10, HR10, GHRU11, GRU12]. Our second main contribution is to give an (ε, δ)-differentially private mechanism that, for a given query set A and an upper bound n on ‖x‖1, has mean squared error within polylog(d, N) of the optimal for A and n. This approximation is achieved by coupling the Gaussian noise addition approach with linear regression over the ℓ1 ball. Additionally, we show a similar polylogarithmic approximation guarantee for the best ε-differentially private mechanism in this sparse setting. Our work also shows that for arbitrary counting queries, i.e. A with entries in {0, 1}, there is an ε-differentially private mechanism with expected error Õ(√n) per query, improving on the Õ(n^(2/3)) bound of [BLR08], and matching the lower bound implied by [DN03] up to logarithmic factors.
The connection between hereditary discrepancy and the privacy mechanism enables us to derive the first polylogarithmic approximation to the hereditary discrepancy of a matrix A.
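The noise-addition step can be illustrated with the plain (uncorrelated) Gaussian mechanism; the paper's mechanism adds correlated noise shaped to the query set, which this sketch does not attempt. The calibration σ = Δ₂√(2 ln(1.25/δ))/ε is the textbook (ε, δ) bound, and the query matrix and histogram below are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def gaussian_mechanism(A, x, eps, delta):
    """Answer linear queries A @ x with i.i.d. Gaussian noise calibrated to
    the L2 sensitivity of A (max column norm: the effect of one unit change
    in one histogram coordinate)."""
    delta2 = np.max(np.linalg.norm(A, axis=0))       # L2 sensitivity
    sigma = delta2 * np.sqrt(2 * np.log(1.25 / delta)) / eps
    return A @ x + sigma * rng.standard_normal(A.shape[0]), sigma

d, N = 64, 32
A = rng.integers(0, 2, size=(d, N)).astype(float)    # 0/1 counting queries
x = rng.integers(0, 10, size=N).astype(float)        # toy histogram
answers, sigma = gaussian_mechanism(A, x, eps=1.0, delta=1e-6)
```

The paper's improvement comes from replacing the i.i.d. noise here with noise whose covariance is optimized for the geometry of A.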
Hypothesis testing in high-dimensional regression under the Gaussian random design model: Asymptotic theory. arXiv
, 2013
Abstract

Cited by 13 (3 self)
We consider linear regression in the high-dimensional regime in which the number of observations n is smaller than the number of parameters p. A very successful approach in this setting uses ℓ1-penalized least squares (a.k.a. the Lasso) to search for a subset of s0 < n parameters that best explain the data, while setting the other parameters to zero. A considerable amount of work has been devoted to characterizing the estimation and model selection problems within this approach. In this paper we consider instead the fundamental, but far less understood, question of statistical significance. We study this problem under the random design model in which the rows of the design matrix are i.i.d. and drawn from a high-dimensional Gaussian distribution. This situation arises, for instance, in learning high-dimensional Gaussian graphical models. Leveraging an asymptotic distributional characterization of regularized least squares estimators, we develop a procedure for computing p-values and hence assessing statistical significance for hypothesis testing. We characterize the statistical power of this procedure, and evaluate it on synthetic and real data, comparing it with earlier proposals. Finally, we provide an upper bound on the minimax power of tests with a given significance level and show that our proposed procedure achieves this bound in the case of design matrices with i.i.d. Gaussian entries.
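For i.i.d. standard Gaussian designs, a p-value procedure in this spirit can be sketched via a one-step debiased Lasso, bu = bhat + Xᵀ(y − X bhat)/n. Calibrating the null scale by a median-absolute-deviation estimate across coordinates is a simplification made purely for this illustration; the paper derives the asymptotic variance analytically. All parameters below are assumptions.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(5)
n, p, s0, sigma = 300, 500, 3, 1.0
X = rng.standard_normal((n, p))                  # i.i.d. Gaussian random design
beta = np.zeros(p)
beta[:s0] = 5.0
y = X @ beta + sigma * rng.standard_normal(n)

def ista_lasso(X, y, lam, iters=2000):
    """Proximal gradient (ISTA) for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n_obs = X.shape[0]
    L = np.linalg.norm(X, 2) ** 2 / n_obs
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        b = b - X.T @ (X @ b - y) / (n_obs * L)
        b = np.sign(b) * np.maximum(np.abs(b) - lam / L, 0.0)
    return b

bhat = ista_lasso(X, y, 2 * sigma * np.sqrt(2 * np.log(p) / n))

# One-step debiasing; for a standardized Gaussian design the correction
# matrix can be taken to be the identity.
bu = bhat + X.T @ (y - X @ bhat) / n

# Null scale via the median absolute deviation (most coordinates are null).
tau = 1.4826 * np.median(np.abs(bu - np.median(bu)))
z = bu / tau
pvals = np.array([erfc(abs(t) / sqrt(2)) for t in z])   # two-sided p-values
```

Strong coordinates receive tiny p-values, while the null p-values are approximately uniform, which is what makes significance statements possible.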
Minimax sparse principal subspace estimation in high dimensions
 Annals of Statistics
, 2013
Optimal computational and statistical rates of convergence for sparse nonconvex learning problems. arXiv preprint
, 2013
Abstract

Cited by 11 (5 self)
We provide a theoretical analysis of the statistical and computational properties of penalized M-estimators that can be formulated as the solution to a possibly nonconvex optimization problem. Many important estimators fall into this category, including least squares regression with nonconvex regularization, generalized linear models with nonconvex regularization, and sparse elliptical random design regression. For these problems, it is intractable to calculate the global solution due to the nonconvex formulation. In this paper, we propose an approximate regularization path following method for solving a variety of learning problems with nonconvex objective functions. Under a unified analytic framework, we simultaneously provide explicit statistical and computational rates of convergence for any local solution obtained by the algorithm. Computationally, our algorithm attains a global geometric rate of convergence for calculating the full regularization path, which is optimal among all first-order algorithms. Unlike most existing methods, which only attain geometric rates of convergence for a single regularization parameter, our algorithm calculates the full regularization path with the same iteration complexity. In particular, we provide a refined iteration complexity bound to sharply characterize the performance of each stage along the regularization path. Statistically, we provide sharp sample complexity analysis for all the approximate local solutions along the regularization path. In particular, our analysis improves upon existing results by providing a more refined sample complexity bound as well as an exact support recovery result for the final estimator. These results show that the final estimator attains an oracle statistical property due to the use of the nonconvex penalty.
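The path-following idea, solving along a geometrically decreasing sequence of regularization parameters with warm starts, can be sketched with a convex ℓ1 stage solver standing in for the nonconvex penalties treated in the paper; the grid, data, and per-stage iteration budget below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, k = 150, 300, 4
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 2.5
y = X @ beta + 0.3 * rng.standard_normal(n)

def prox_grad_stage(X, y, lam, b, iters=200):
    """One stage: proximal gradient for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n_obs = X.shape[0]
    L = np.linalg.norm(X, 2) ** 2 / n_obs
    for _ in range(iters):
        b = b - X.T @ (X @ b - y) / (n_obs * L)
        b = np.sign(b) * np.maximum(np.abs(b) - lam / L, 0.0)
    return b

lam_max = np.max(np.abs(X.T @ y)) / n            # above this the solution is 0
lams = lam_max * 0.1 ** np.linspace(0, 1, 10)    # geometric grid to 0.1 * lam_max
b = np.zeros(p)
for lam in lams:                                 # warm start each stage
    b = prox_grad_stage(X, y, lam, b)

final = b
```

Each stage starts from the previous stage's solution, so a small fixed number of iterations per stage suffices, which is the source of the path-wide iteration complexity the abstract describes.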