INFERENCE ON TREATMENT EFFECTS AFTER SELECTION AMONGST HIGH-DIMENSIONAL CONTROLS
Abstract

Cited by 30 (6 self)
We propose robust methods for inference on the effect of a treatment variable on a scalar outcome in the presence of very many controls. Our setting is a partially linear model with possibly non-Gaussian and heteroscedastic disturbances. Our analysis allows the number of controls to be much larger than the sample size. To make informative inference feasible, we require the model to be approximately sparse; that is, we require that the effect of confounding factors can be controlled for up to a small approximation error by conditioning on a relatively small number of controls whose identities are unknown. The latter condition makes it possible to estimate the treatment effect by selecting approximately the right set of controls. We develop a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the “post-double-selection” method. Our results apply to Lasso-type methods used for covariate selection as well as to any other model selection method that is able to find a sparse model with good approximation properties. The main attractive feature of our method is that it allows for imperfect selection of the controls and provides confidence intervals that are valid uniformly across a large class of models. In contrast, standard post-model-selection estimators fail to provide uniform inference even in simple cases with a small, fixed number of controls. Thus our method resolves the long-standing problem of uniform inference after model selection for a large, interesting class of models. We illustrate the use of the developed methods with numerical simulations and an application to the effect of abortion on crime rates.
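The post-double-selection recipe can be sketched in a few lines: select controls that predict the outcome, select controls that predict the treatment, take the union, and run OLS of the outcome on the treatment plus the union. A minimal numpy sketch with a hand-rolled coordinate-descent Lasso on simulated data; the penalty level and design are illustrative assumptions, not the paper's exact proposal:

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=50):
    """Plain coordinate-descent Lasso for 0.5*||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ beta + X[:, j] * beta[j]   # partial residual
            rho = X[:, j] @ r_j
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

def post_double_selection(y, d, X, lam):
    """Union of controls selected in two Lasso steps, then OLS of y on (d, union)."""
    s_y = np.flatnonzero(lasso_cd(X, y, lam))   # controls that predict the outcome
    s_d = np.flatnonzero(lasso_cd(X, d, lam))   # controls that predict the treatment
    keep = np.union1d(s_y, s_d).astype(int)
    W = np.column_stack([np.ones(len(y)), d, X[:, keep]])
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    return coef[1]                               # coefficient on the treatment

# toy design: one relevant control (column 0) among 50 candidates
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
d = X[:, 0] + rng.standard_normal(n)             # treatment confounded by control 0
y = 2.0 * d + X[:, 0] + rng.standard_normal(n)   # true treatment effect = 2
lam = 2.0 * np.sqrt(n * np.log(p))               # illustrative penalty level
alpha_hat = post_double_selection(y, d, X, lam)
```

The key point the abstract makes is that selecting on both equations protects the treatment coefficient against moderate selection mistakes in either one.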
High-dimensional instrumental variables regression and confidence sets, unpublished manuscript
, 2011
Abstract

Cited by 16 (1 self)
Abstract. We propose an instrumental variables method for estimation in linear models with endogenous regressors in the high-dimensional setting where the sample size n can be smaller than the number of possible regressors K, and L ≥ K instruments. We allow for heteroscedasticity and do not need prior knowledge of the variances of the errors. We suggest a new procedure called the STIV (Self-Tuning Instrumental Variables) estimator, which is realized as a solution of a conic optimization program. The main results of the paper are upper bounds on the estimation error of the vector of coefficients in ℓp-norms for 1 ≤ p ≤ ∞ that hold with probability close to 1, as well as the corresponding confidence intervals. All results are non-asymptotic. These bounds are meaningful under the assumption that the true structural model is sparse, i.e., the vector of coefficients has few nonzero coordinates (fewer than the sample size n) or many coefficients are too small to matter. In our IV regression setting, the standard tools from the literature on sparsity, such as the restricted eigenvalue assumption, are inapplicable. Therefore, for our analysis we develop a new approach based on data-driven sensitivity characteristics. We show that, under appropriate assumptions, a thresholded STIV estimator correctly selects the nonzero coefficients with probability close to 1. The price to pay for not knowing which coefficients are nonzero and which instruments to use is of the order √log(L) in the rate of convergence.
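STIV itself solves a conic program with a self-tuned level; as a rough illustration of the same ℓ1-plus-moment-constraint idea, here is a simplified Dantzig-selector-style ℓ1 IV estimator posed as a linear program. This is a toy relative of STIV, not the paper's estimator, and the tuning constant tau is an assumption rather than self-tuned:

```python
import numpy as np
from scipy.optimize import linprog

def l1_iv(y, X, Z, tau):
    """min ||b||_1 subject to ||Z'(y - Xb)/n||_inf <= tau, with b = b_pos - b_neg."""
    n, p = X.shape
    G = Z.T @ X / n
    c = Z.T @ y / n
    A_ub = np.vstack([np.hstack([G, -G]), np.hstack([-G, G])])
    b_ub = np.concatenate([c + tau, tau - c])
    res = linprog(np.ones(2 * p), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (2 * p))
    return res.x[:p] - res.x[p:]

# toy exogenous design (instruments = regressors), sparse truth
rng = np.random.default_rng(1)
n, p = 200, 8
X = rng.standard_normal((n, p))
Z = X.copy()
beta_true = np.zeros(p)
beta_true[0] = 1.0
y = X @ beta_true + 0.5 * rng.standard_normal(n)
b_hat = l1_iv(y, X, Z, tau=0.15)
```

The LP splits b into positive and negative parts so the ℓ1 objective and the sup-norm moment constraint both become linear.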
2013): Program Evaluation with High-Dimensional Data, Working paper
Abstract

Cited by 9 (2 self)
Abstract. In the first part of the paper, we consider estimation and inference on policy-relevant treatment effects, such as local average and quantile treatment effects, in a data-rich environment where there may be many more control variables available than there are observations. In addition to allowing many control variables, the setting we consider allows endogenous receipt of treatment, heterogeneous treatment effects, and function-valued outcomes. To make informative inference possible, we assume that some reduced-form predictive relationships are approximately sparse. That is, we require that the relationship between the control variables and the outcome, treatment status, and instrument status can be captured up to a small approximation error using a small number of the control variables whose identities are unknown to the researcher. This condition allows estimation and inference for a wide variety of treatment parameters to proceed after data-driven selection of control variables. We provide conditions under which post-selection inference is uniformly valid across a wide range of models and show that a key condition underlying the uniform validity of post-selection inference allowing for imperfect model selection is the use of approximately unbiased estimating equations. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) participation on accumulated assets.
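The role of approximately unbiased (Neyman-orthogonal) estimating equations can be illustrated numerically: perturbing the nuisance functions moves an orthogonal AIPW-type moment only at second order, while a plug-in moment moves at first order. A toy simulation with known nuisances; all functional forms here are assumptions for illustration, not the paper's estimators:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
x = rng.standard_normal(n)
p_true = 1.0 / (1.0 + np.exp(-x))             # propensity score
d = (rng.random(n) < p_true).astype(float)
m1, m0 = 1.0 + x, x                           # outcome regressions; true ATE = 1
y = d * m1 + (1 - d) * m0 + rng.standard_normal(n)

def aipw_moment(eps):
    """Orthogonal (AIPW) moment with all nuisances perturbed by eps."""
    ph = np.clip(p_true + 0.1 * eps * x, 0.01, 0.99)
    m1h, m0h = m1 + eps, m0 - eps
    return np.mean(m1h - m0h + d * (y - m1h) / ph - (1 - d) * (y - m0h) / (1 - ph))

def plugin_moment(eps):
    """Plain regression-adjustment moment with the same perturbation."""
    return np.mean((m1 + eps) - (m0 - eps))

drift_aipw = abs(aipw_moment(0.1) - aipw_moment(0.0))
drift_plugin = abs(plugin_moment(0.1) - plugin_moment(0.0))
```

The orthogonal moment's drift is an order of magnitude smaller, which is why moderate selection errors in the nuisances do not derail inference on the target parameter.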
Inference for high-dimensional sparse econometric models
 Advances in Economics and Econometrics. 10th World Congress of Econometric Society
, 2011
Robust Inference on Average Treatment Effects with Possibly More Covariates than Observations
, 2013
Abstract

Cited by 8 (1 self)
This paper concerns robust inference on average treatment effects following model selection. In the selection-on-observables framework, we show how to construct confidence intervals based on a doubly-robust estimator that are robust to model selection errors and prove that they are valid uniformly over a large class of treatment effect models. The class allows for multivalued treatments with heterogeneous effects (in observables), general heteroskedasticity, and selection amongst (possibly) more covariates than observations. Our estimator attains the semiparametric efficiency bound under appropriate conditions. Precise conditions are given for any model selector to yield these results, and we show how to combine data-driven selection with economic theory. For implementation, we give a specific proposal for selection based on the group lasso and derive new technical results for high-dimensional, sparse multinomial logistic regression. A simulation study shows our estimator performs very well in finite samples over a wide range of models. Revisiting the National Supported Work demonstration data, our method yields accurate estimates and tight confidence intervals.
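The doubly-robust property at the heart of this construction can be demonstrated in a toy simulation: the AIPW moment stays centered at the true ATE when either the outcome model or the propensity score is badly misspecified. This sketch uses oracle/misspecified nuisances and omits the paper's group-lasso selection step entirely:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
x = rng.standard_normal(n)
p_true = 1.0 / (1.0 + np.exp(-0.5 * x))
d = (rng.random(n) < p_true).astype(float)
y = d + x + rng.standard_normal(n)            # true ATE = 1

def aipw(m1h, m0h, ph):
    """Doubly robust ATE estimate from supplied nuisance values."""
    return np.mean(m1h - m0h + d * (y - m1h) / ph - (1 - d) * (y - m0h) / (1 - ph))

# wrong outcome model (all zeros), correct propensity: still consistent
ate_a = aipw(np.zeros(n), np.zeros(n), p_true)
# correct outcome model, wrong propensity (constant 0.5): still consistent
ate_b = aipw(1.0 + x, x, np.full(n, 0.5))
```

Either nuisance being correct suffices, which is what makes the moment forgiving of selection errors in the other.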
Uniform Post Selection Inference for LAD Regression and Other Z-Estimation Problems.” arXiv e-print
, 2013
Abstract

Cited by 7 (5 self)
Abstract. We develop uniformly valid confidence regions for a regression coefficient in a high-dimensional sparse LAD (least absolute deviation or median) regression model. The setting is one where the number of regressors p could be large in comparison to the sample size n, but only s ≪ n of them are needed to accurately describe the regression function. Our new methods are based on the instrumental LAD regression estimator that assembles the optimal estimating equation from either post-ℓ1-penalized LAD regression or ℓ1-penalized LAD regression. The estimating equation is immunized against non-regular estimation of the nuisance part of the regression function, in the sense of Neyman. We establish that in a homoscedastic regression model, under certain conditions, the instrumental LAD regression estimator of the regression coefficient is asymptotically root-n normal uniformly with respect to the underlying sparse model. The resulting confidence regions are valid uniformly with respect to the underlying model. The new inference methods outperform the naive, “oracle-based” inference methods, which are known not to be uniformly valid, with the coverage property failing to hold uniformly with respect to the underlying model even in the setting with p = 2. We also provide Monte Carlo experiments which demonstrate that standard post-selection inference breaks down over large parts of the parameter space, while the proposed method does not.
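For reference, the LAD criterion itself can be minimized exactly as a linear program by splitting each residual into positive and negative parts. A plain LAD fit on simulated heavy-tailed data, not the instrumental, immunized version the paper develops:

```python
import numpy as np
from scipy.optimize import linprog

def lad(X, y):
    """LAD regression min sum|y - Xb| as an LP with residuals split u+ - u-."""
    n, p = X.shape
    c = np.concatenate([np.zeros(p), np.ones(2 * n)])   # minimize sum(u+ + u-)
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])        # Xb + u+ - u- = y
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds)
    return res.x[:p]

rng = np.random.default_rng(4)
n = 200
X = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])
beta_true = np.array([0.5, 1.0, -2.0])
y = X @ beta_true + 0.1 * rng.standard_cauchy(n)   # heavy tails, median-zero noise
b_hat = lad(X, y)
```

Because the noise has median zero, the LAD fit recovers the coefficients even though the Cauchy errors have no mean.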
2010): “LASSO Methods for Gaussian Instrumental Variables Models,” arXiv:[math.ST
Abstract

Cited by 5 (1 self)
Abstract. In this note, we propose to use sparse methods (e.g. LASSO, Post-LASSO, √LASSO, and Post-√LASSO) to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, p, in the canonical Gaussian case. The methods apply even when p is much larger than the sample size, n. We derive asymptotic distributions for the resulting IV estimators and provide conditions under which these sparsity-based IV estimators are asymptotically oracle-efficient. In simulation experiments, a sparsity-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument-robust procedures. We illustrate the procedure in an empirical example using the Angrist and Krueger (1991) schooling data.
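The two-step logic — select instruments in the first stage, then run 2SLS with the fitted optimal instrument — can be sketched as follows, using simple covariance screening as a stand-in for the Lasso first stage. The threshold and data-generating process are illustrative assumptions, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(5)
n, L = 200, 100
Z = rng.standard_normal((n, L))
u = rng.standard_normal(n)                       # first-stage error
d = Z[:, 0] + 0.5 * Z[:, 1] + u                  # only two relevant instruments
y = d + u + 0.5 * rng.standard_normal(n)         # endogenous: u enters both equations

# first stage: screen instruments by |sample covariance| with d (Lasso stand-in)
cov = np.abs(Z.T @ d) / n
keep = cov > 2.0 * np.sqrt(np.log(L) / n)
pi_hat, *_ = np.linalg.lstsq(Z[:, keep], d, rcond=None)  # post-selection OLS fit
d_hat = Z[:, keep] @ pi_hat                      # estimated optimal instrument
beta_iv = (d_hat @ y) / (d_hat @ d)              # 2SLS with fitted instrument
beta_ols = (d @ y) / (d @ d)                     # biased benchmark
```

Screening out the irrelevant instruments avoids the many-instrument bias that would arise from fitting the first stage with all 100 columns.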
2013), “An Empirical Analysis of Systemic Risk in the EUROzone”
Abstract

Cited by 4 (1 self)
We propose a new measure of systemic risk, which is based on estimating spillovers between funding costs of individual banks. The estimation proceeds in three steps: First, using data from liquidity auctions of the European Central Bank, we estimate the funding costs in a given week for each individual bank. In the second step, we apply the elastic net (a LASSO-type estimator) to this panel to estimate the financial network. Finally, using the estimated network we propose a new measure of systemicness and vulnerability of each bank. Our measure of systemicness has quite a natural interpretation, since it can roughly be viewed as the total externality a bank would impose on the funding costs of all other banks in the system. We estimate that most of the banks have fairly weak links and, therefore, if one were to suffer an adverse shock there would likely be a rather limited effect on the other ones. On the other hand, there are a few banks that are quite central: an increase in their funding costs would result in a very significant increase (up to 140 basis points per 100 basis points shock) in the funding costs of the other banks.
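The "total externality" reading of systemicness suggests a Leontief-type aggregation of the estimated network: with spillover matrix A, the total direct-plus-indirect pass-through is (I − A)⁻¹. A toy sketch with a hypothetical 3-bank network; the matrix and the exact aggregation are assumptions for illustration, not the paper's estimates:

```python
import numpy as np

# hypothetical spillover matrix: A[i, j] = increase in bank i's funding cost
# per unit shock to bank j (assumed numbers, diagonal zero)
A = np.array([[0.0, 0.1, 0.0],
              [0.2, 0.0, 0.1],
              [0.5, 0.4, 0.0]])
# total direct + indirect pass-through via the Neumann series (I - A)^{-1}
T = np.linalg.inv(np.eye(3) - A)
systemicness = T.sum(axis=0) - 1.0    # externality bank j imposes on the others
vulnerability = T.sum(axis=1) - 1.0   # total exposure of bank i to the others
```

Column sums of the inverse capture how a shock to one bank propagates through all chains of links, matching the "central bank in the network" intuition in the abstract.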
The Data Revolution and Economic Analysis
, 2013
Abstract

Cited by 4 (0 self)
Abstract. Many believe that “big data” will transform business, government and other aspects of the economy. In this article we discuss how new data may impact economic policy and economic research. Large-scale administrative datasets and proprietary private sector data can greatly improve the way we measure, track and describe economic activity. They also can enable novel research designs that allow researchers to trace the consequences of different events or policies. We outline some of the challenges in accessing and making use of these data. We also consider whether the big data predictive modeling tools that have emerged in statistics and computer science may prove useful in economics.
INSTRUMENTAL VARIABLES ESTIMATION WITH MANY WEAK INSTRUMENTS USING REGULARIZED JIVE
Abstract

Cited by 3 (0 self)
Abstract. We consider instrumental variables regression in a setting where the number of instruments is large and the first-stage prediction signal is not necessarily sparse. In particular, we work with models where the number of available instruments may be larger than the sample size and consistent model selection in the first stage may not be possible. Such a situation may arise when there are many weak instruments. With many weak instruments, regularization or model selection in the first stage can lead to a large bias relative to standard errors. We propose a jackknife instrumental variables estimator (JIVE) with regularization at each jackknife iteration. We derive the limiting behavior for a ridge-regularized JIV estimator (RJIVE), verifying that the RJIVE is consistent and asymptotically normal under conditions which allow for more instruments than observations and do not require consistent model selection. We provide simulation results that demonstrate the proposed RJIVE performs favorably in terms of size of tests and risk properties relative to other many-weak-instrument estimation strategies in high-dimensional settings. We also apply the RJIVE to the Angrist and Krueger (1991) example where it performs favorably relative to other many-instrument-robust procedures. Key Words: ridge regression, high-dimensional models, endogeneity
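The RJIVE recipe — leave-one-out ridge predictions of the endogenous regressor, then IV with those predictions — can be sketched directly using the standard closed-form leave-one-out identity for ridge regression. A simulated many-weak-instrument design; the penalty choice here is an assumption, not the paper's rule:

```python
import numpy as np

def rjive(y, d, Z, lam):
    """Jackknife IV with a ridge first stage (leave-one-out fitted values)."""
    H = Z @ np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T)
    h = np.diag(H)
    d_loo = (H @ d - h * d) / (1.0 - h)   # closed-form leave-one-out ridge fit
    return (d_loo @ y) / (d_loo @ d)

# many weak instruments: L = n/2 instruments, each with a small coefficient
rng = np.random.default_rng(6)
n, L = 300, 150
Z = rng.standard_normal((n, L))
u = rng.standard_normal(n)
d = Z @ np.full(L, 0.1) + u               # dense, individually weak first stage
y = d + 0.8 * u + rng.standard_normal(n)  # true effect = 1, endogenous via u
beta_rjive = rjive(y, d, Z, lam=float(L))
beta_ols = (d @ y) / (d @ d)              # biased benchmark
```

Leaving out observation i makes the fitted instrument independent of that observation's structural error, which is what removes the many-instrument bias even though the first stage is not sparse.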