2010): “Sparse Models and Methods for Optimal Instruments with an Application to Eminent Domain,” Arxiv Working Paper
Cited by 55 (19 self)
Abstract. We develop results for the use of Lasso and Post-Lasso methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, p. Our results apply even when p is much larger than the sample size, n. We show that the IV estimator based on using Lasso or Post-Lasso in the first stage is root-n consistent and asymptotically normal when the first stage is approximately sparse; i.e. when the conditional expectation of the endogenous variables given the instruments can be well-approximated by a relatively small set of variables whose identities may be unknown. We also show the estimator is semiparametrically efficient when the structural error is homoscedastic. Notably, our results allow for imperfect model selection, and do not rely upon the unrealistic “beta-min” conditions that are widely used to establish validity of inference following model selection. In simulation experiments, the Lasso-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the Lasso-based IV estimator outperforms an intuitive benchmark. Optimal instruments are conditional expectations. In developing the IV results, we estab
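As a rough illustration of the two-step idea described in this abstract, the sketch below selects instruments with a Lasso first stage, refits the first stage without a penalty on the selected instruments (Post-Lasso), and plugs the fitted values into the IV formula for a single endogenous regressor. The function names, the plain coordinate-descent solver, the fixed penalty `lam`, and the toy data are all assumptions made for illustration; the paper's data-driven penalty choice is not reproduced here.

```python
def soft(a, t):
    """Soft-thresholding operator."""
    if a > t:
        return a - t
    if a < -t:
        return a + t
    return 0.0


def lasso_cd(X, y, lam, sweeps=200):
    """Coordinate descent for 0.5*||y - Xb||^2 + lam*||b||_1 (illustrative)."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual (b_j excluded).
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                for i in range(n)
            )
            norm2 = sum(X[i][j] ** 2 for i in range(n))
            b[j] = soft(rho, lam) / norm2
    return b


def post_lasso_iv(Z, d, y, lam):
    """IV estimate of alpha in y = alpha*d + e with a Post-Lasso first stage."""
    selected = [j for j, bj in enumerate(lasso_cd(Z, d, lam)) if bj != 0.0]
    if not selected:
        raise ValueError("no instruments selected; try a smaller lam")
    Zs = [[row[j] for j in selected] for row in Z]
    # Post-Lasso: unpenalized refit (lam = 0) on the selected instruments only.
    gamma = lasso_cd(Zs, d, 0.0)
    d_hat = [sum(g * z for g, z in zip(gamma, row)) for row in Zs]
    num = sum(h * yi for h, yi in zip(d_hat, y))
    den = sum(h * di for h, di in zip(d_hat, d))
    return num / den, selected


# Hypothetical toy data: only the first instrument drives d, and the
# structural error equals the first-stage error v (so d is endogenous).
Z = [[1, 1, 1], [-1, 1, 1], [1, -1, 1], [-1, -1, 1],
     [1, 1, -1], [-1, 1, -1], [1, -1, -1], [-1, -1, -1]]
v = [0.1, -0.2, 0.05, 0.1, -0.1, 0.2, -0.05, -0.1]
d = [1.5 * row[0] + vi for row, vi in zip(Z, v)]
y = [2.0 * di + vi for di, vi in zip(d, v)]
alpha, selected = post_lasso_iv(Z, d, y, lam=1.0)
```

In this constructed example the error is orthogonal to the selected instrument, so the IV estimate recovers the coefficient 2 up to rounding, whereas a direct regression of `y` on `d` would not.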
Confidence intervals for low-dimensional parameters with high-dimensional data
ArXiv.org
Cited by 28 (1 self)
Abstract. The purpose of this paper is to propose methodologies for statistical inference of low-dimensional parameters with high-dimensional data. We focus on constructing confidence intervals for individual coefficients and linear combinations of several of them in a linear regression model, although our ideas are applicable in a much broader context. The theoretical results presented here provide sufficient conditions for the asymptotic normality of the proposed estimators along with a consistent estimator for their finite-dimensional covariance matrices. These sufficient conditions allow the number of variables to far exceed the sample size. The simulation results presented here demonstrate the accuracy of the coverage probability of the proposed confidence intervals, strongly supporting the theoretical results.
Spectral Analysis of Nonuniformly Sampled Data and Applications
, 2012
Cited by 27 (0 self)
Signal acquisition, signal reconstruction, and spectral analysis are the three most important steps in signal processing, and they are found in almost all modern hardware. In most signal processing hardware, the signal of interest is sampled at uniform intervals satisfying conditions such as the Nyquist rate. However, in some cases uniformly sampled data are not available because of constraints on hardware resources. In this thesis, the important problem of signal reconstruction and spectral analysis from nonuniformly sampled data is addressed and a variety of methods are presented. The proposed methods are tested via numerical experiments on both artificial and real-life data sets. The thesis starts with a brief review of methods available in the literature for signal reconstruction and spectral analysis from nonuniformly sampled data. The methods discussed in the thesis are classified into two broad categories, dense and sparse methods; the classification is based on the kind of spectra to which they are applicable. Under dense spectral methods, the main contribution of the thesis is a nonparametric approach named LIMES, which recovers the smooth spectrum from nonuniformly sampled data. Apart from recovering
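To make the setting concrete, here is a minimal sketch of spectral analysis at irregular sample times using the classical periodogram evaluated directly at the recorded time stamps. This is a textbook baseline, not LIMES or any other method from the thesis; the function name, the jittered sampling pattern, and the test signal are assumptions chosen for illustration.

```python
import cmath
import math


def periodogram(t, y, freqs):
    """Classical periodogram |sum_k y_k exp(-2*pi*i*f*t_k)|^2 / N.

    The sample times t need not be equally spaced.
    """
    n = len(t)
    return [
        abs(sum(yk * cmath.exp(-2j * math.pi * f * tk) for tk, yk in zip(t, y))) ** 2 / n
        for f in freqs
    ]


# Hypothetical test signal: a cosine at frequency 0.1 observed at
# jittered (nonuniform) sample times.
t = [k + 0.3 * math.sin(1.7 * k) for k in range(60)]
y = [math.cos(2 * math.pi * 0.1 * tk) for tk in t]
freqs = [0.01 * m for m in range(1, 51)]
power = periodogram(t, y, freqs)
peak = freqs[power.index(max(power))]  # grid frequency with the largest power
```

Because the sum is evaluated at the actual time stamps, no resampling onto a uniform grid is needed; the price is that the FFT cannot be used directly, so the cost grows with the number of samples times the number of frequencies.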
Robust Subspace Clustering
, 2013
Cited by 22 (1 self)
Subspace clustering refers to the task of finding a multi-subspace representation that best fits a collection of points taken from a high-dimensional space. This paper introduces an algorithm inspired by sparse subspace clustering (SSC) [17] to cluster noisy data, and develops some novel theory demonstrating its correctness. In particular, the theory uses ideas from geometric functional analysis to show that the algorithm can accurately recover the underlying subspaces under minimal requirements on their orientation, and on the number of samples per subspace. Synthetic as well as real data experiments complement our theoretical study, illustrating our approach and demonstrating its effectiveness.
High-dimensional regression with unknown variance
 SUBMITTED TO THE STATISTICAL SCIENCE
, 2012
Cited by 10 (1 self)
We review recent results for high-dimensional sparse linear regression in the practical case of unknown variance. Different sparsity settings are covered, including coordinate-sparsity, group-sparsity and variation-sparsity. The emphasis is put on non-asymptotic analyses and feasible procedures. In addition, a small numerical study compares the practical performance of three schemes for tuning the Lasso estimator and some references are collected for some more general models, including multivariate regression and nonparametric regression.
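One commonly used idea when the noise level is unknown is to alternate between a Lasso fit and a residual-based update of the noise estimate, with the penalty scaled by the current estimate, as in the scaled Lasso. The sketch below illustrates that alternation only; it is not claimed to be one of the three tuning schemes compared in the review. The function names, the constant `lam0`, and the toy design are assumptions.

```python
import math


def soft(a, t):
    """Soft-thresholding operator."""
    return math.copysign(max(abs(a) - t, 0.0), a)


def lasso_cd(X, y, lam, sweeps=100):
    """Coordinate descent for (1/(2n))*||y - Xb||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    r = list(y)  # residual y - Xb, kept up to date after every update
    for _ in range(sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm2 = sum(c * c for c in col) / n
            rho = sum(c * ri for c, ri in zip(col, r)) / n + norm2 * b[j]
            new = soft(rho, lam) / norm2
            r = [ri - c * (new - b[j]) for ri, c in zip(r, col)]
            b[j] = new
    return b


def scaled_lasso(X, y, lam0, rounds=50):
    """Alternate a Lasso fit (penalty lam0*sigma) with a noise-level update."""
    n = len(y)
    sigma = math.sqrt(sum(v * v for v in y) / n)  # start from the null model
    b = [0.0] * len(X[0])
    for _ in range(rounds):
        b = lasso_cd(X, y, lam0 * sigma)
        resid = [yi - sum(x * bj for x, bj in zip(row, b)) for row, yi in zip(X, y)]
        sigma = math.sqrt(sum(e * e for e in resid) / n)
    return b, sigma


# Hypothetical toy data: orthogonal design, signal on the first column only.
X = [[1, 1, 1], [-1, 1, 1], [1, -1, 1], [-1, -1, 1],
     [1, 1, -1], [-1, 1, -1], [1, -1, -1], [-1, -1, -1]]
w = [0.5, -1.0, 0.25, 0.5, -0.5, 1.0, -0.25, -0.5]
y = [3.0 * row[0] + wi for row, wi in zip(X, w)]
b, sigma = scaled_lasso(X, y, lam0=0.5)
support = [j for j, bj in enumerate(b) if bj != 0.0]
```

The point of the design is that the user supplies only the scale-free constant `lam0`; the effective penalty adapts to the data through `sigma`.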
2013): Program Evaluation with High-Dimensional Data, Working paper
Cited by 9 (2 self)
Abstract. In the first part of the paper, we consider estimation and inference on policy-relevant treatment effects, such as local average and quantile treatment effects, in a data-rich environment where there may be many more control variables available than there are observations. In addition to allowing many control variables, the setting we consider allows endogenous receipt of treatment, heterogeneous treatment effects, and function-valued outcomes. To make informative inference possible, we assume that some reduced form predictive relationships are approximately sparse. That is, we require that the relationship between the control variables and the outcome, treatment status, and instrument status can be captured up to a small approximation error using a small number of the control variables whose identities are unknown to the researcher. This condition allows estimation and inference for a wide variety of treatment parameters to proceed after data-driven selection of control variables. We provide conditions under which post-selection inference is uniformly valid across a wide range of models and show that a key condition underlying the uniform validity of post-selection inference allowing for imperfect model selection is the use of approximately unbiased estimating equations. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) participation on accumulated assets.
Fused sparsity and robust estimation for linear models with unknown variance
 In NIPS
, 2012
Tiger: A tuning-insensitive approach for optimally estimating Gaussian graphical models
, 2012
Cited by 7 (3 self)
We propose a new procedure for estimating high dimensional Gaussian graphical models. Our approach is asymptotically tuning-free and non-asymptotically tuning-insensitive: it requires very little effort to choose the tuning parameter in finite sample settings. Computationally, our procedure is significantly faster than existing methods due to its tuning-insensitive property. Theoretically, the obtained estimator is simultaneously minimax optimal for precision matrix estimation under different norms. Empirically, we illustrate the advantages of our method using thorough simulated and real examples. The R package bigmatrix implementing the proposed methods is available on the Comprehensive R Archive Network:
An R package flare for high dimensional linear regression and precision matrix estimator
, 2012
Cited by 7 (2 self)
This paper describes an R package named flare, which implements a family of new high dimensional regression methods (LAD lasso, SQRT lasso, Lq lasso and Dantzig selector) and their extensions to sparse precision matrix estimation (TIGER and CLIME). The proposed solver is based on the alternating direction method of multipliers (ADMM), which is further combined with the linearization and coordinate descent algorithm. The package flare is coded in C and has a friendly user interface. The memory usage is optimized by using the sparse matrix output. The experiments show that flare is efficient and can scale up to large problems.
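For readers unfamiliar with ADMM, the following sketch shows the textbook scaled-form ADMM iteration for the ordinary Lasso problem. It is not the flare solver: it uses an exact linear solve in the first update rather than the linearization and coordinate descent steps mentioned in the abstract, and it does not cover the LAD, SQRT, or Lq losses. The function names, `rho`, and the tiny example are assumptions.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def soft(a, t):
    """Soft-thresholding operator."""
    if a > t:
        return a - t
    if a < -t:
        return a + t
    return 0.0


def admm_lasso(A, y, lam, rho=1.0, iters=200):
    """ADMM for 0.5*||y - Ax||^2 + lam*||z||_1 subject to x = z."""
    n, p = len(A), len(A[0])
    # A'A + rho*I and A'y are fixed, so compute them once.
    G = [[sum(A[i][j] * A[i][k] for i in range(n)) + (rho if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    Aty = [sum(A[i][j] * y[i] for i in range(n)) for j in range(p)]
    z = [0.0] * p
    u = [0.0] * p  # scaled dual variable
    for _ in range(iters):
        x = solve(G, [Aty[j] + rho * (z[j] - u[j]) for j in range(p)])  # ridge-like step
        z = [soft(x[j] + u[j], lam / rho) for j in range(p)]            # shrinkage step
        u = [u[j] + x[j] - z[j] for j in range(p)]                      # dual update
    return z


# Identity design: the Lasso solution is the soft-thresholded response.
coef = admm_lasso([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0)
```

Returning `z` rather than `x` gives exact zeros for the coefficients that the shrinkage step removes, which is what makes a sparse output format natural.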