Results 1 - 10 of 38
Confidence intervals and hypothesis testing for high-dimensional regression. arXiv:1306.3171
"... Fitting high-dimensional statistical models often requires the use of non-linear parameter estimation procedures. As a consequence, it is generally impossible to obtain an exact characterization of the probability distribution of the parameter estimates. This in turn implies that it is extremely cha ..."
Abstract
-
Cited by 31 (3 self)
- Add to MetaCart
(Show Context)
Fitting high-dimensional statistical models often requires the use of non-linear parameter estimation procedures. As a consequence, it is generally impossible to obtain an exact characterization of the probability distribution of the parameter estimates. This in turn implies that it is extremely challenging to quantify the uncertainty associated with a certain parameter estimate. Concretely, no commonly accepted procedure exists for computing classical measures of uncertainty and statistical significance such as confidence intervals or p-values. We consider here a broad class of regression problems, and propose an efficient algorithm for constructing confidence intervals and p-values. The resulting confidence intervals have nearly optimal size. When testing for the null hypothesis that a certain parameter is vanishing, our method has nearly optimal power. Our approach is based on constructing a 'de-biased' version of regularized M-estimators. The new construction improves over recent work in the field in that it does not assume a special structure on the design matrix. Furthermore, the proofs are remarkably simple. We test our method on a diabetes prediction problem.
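A minimal sketch of the de-biasing construction this abstract describes, assuming centered data and a standard lasso fit; as a simplification, a ridge-regularized inverse of the sample covariance stands in for the precision-matrix estimate, which the paper builds differently. Function name and tuning defaults are illustrative, not the authors' code.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import Lasso

def debiased_lasso_ci(X, y, alpha_lasso=0.1, level=0.95, ridge=1e-2):
    """Illustrative de-biased lasso confidence intervals (not the paper's code)."""
    n, p = X.shape
    beta_hat = Lasso(alpha=alpha_lasso, fit_intercept=False).fit(X, y).coef_
    Sigma_hat = X.T @ X / n                               # sample covariance
    Theta = np.linalg.inv(Sigma_hat + ridge * np.eye(p))  # crude precision-matrix proxy
    resid = y - X @ beta_hat
    # De-biased estimator: one-step correction of the lasso using Theta
    beta_d = beta_hat + Theta @ X.T @ resid / n
    # Plug-in noise level, with a degrees-of-freedom adjustment for the support size
    sigma_hat = np.sqrt(resid @ resid / max(n - np.sum(beta_hat != 0), 1))
    se = sigma_hat * np.sqrt(np.diag(Theta @ Sigma_hat @ Theta.T) / n)
    z = stats.norm.ppf(0.5 + level / 2)
    return beta_d - z * se, beta_d + z * se
```

The key point the sketch makes concrete is that the correction term Theta @ X.T @ resid / n removes (to first order) the bias the ℓ₁ penalty introduces, leaving an asymptotically Gaussian pivot for each coordinate.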
Variance estimation using refitted cross-validation in ultrahigh dimensional regression, forthcoming, 2011
A significance test for the lasso
"... In the sparse linear regression setting, we consider testing the significance of the predictor variable that enters the current lasso model, in the sequence of models visited along the lasso solution path. We propose a simple test statistic based on lasso fitted values, called the covariance test st ..."
Abstract
-
Cited by 20 (2 self)
- Add to MetaCart
In the sparse linear regression setting, we consider testing the significance of the predictor variable that enters the current lasso model, in the sequence of models visited along the lasso solution path. We propose a simple test statistic based on lasso fitted values, called the covariance test statistic, and show that when the true model is linear, this statistic has an Exp(1) asymptotic distribution under the null hypothesis (the null being that all truly active variables are contained in the current lasso model). Our proof of this result for the special case of the first predictor to enter the model (i.e., testing for a single significant predictor variable against the global null) requires only weak assumptions on the predictor matrix X. On the other hand, our proof for a general step in the lasso path places further technical assumptions on X and the generative model, but still allows for the important high-dimensional case p > n, and does not necessarily require that the current lasso model achieves perfect recovery of the truly active variables. Of course, for testing the significance of an additional variable between two nested linear models, one typically uses the chi-squared test, comparing the drop in residual sum of squares (RSS) to a χ²₁ distribution. But when this additional variable is not fixed, and has been chosen adaptively or greedily, this test is no longer appropriate: adaptivity makes the drop in RSS stochastically much larger than χ²₁ under the null hypothesis. Our analysis explicitly accounts for adaptivity, as it must, since the lasso builds an adaptive sequence of linear models as the tuning parameter λ decreases. In this analysis, shrinkage plays a key role: though additional variables are chosen adaptively, the coefficients of lasso active variables are shrunken due to the ℓ₁ penalty. Therefore the test statistic (which is based on lasso fitted values) is in a sense balanced by these two opposing properties, adaptivity and shrinkage, and its null distribution is tractable and asymptotically Exp(1).
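A rough sketch of how the first-step covariance test could be computed from a lasso path, assuming standardized predictors and a known noise variance sigma2; this is illustrative code, not the authors' implementation. At step one the reduced model is empty, so the second inner-product term in the general statistic vanishes and only one fitted-value inner product is needed.

```python
import numpy as np
from sklearn.linear_model import lars_path

def covariance_test_first_step(X, y, sigma2):
    """Covariance test against the global null at the first lasso knot (sketch)."""
    alphas, _, coefs = lars_path(X, y, method="lasso")
    # coefs[:, 1] is the lasso solution at the second knot lambda_2,
    # just before the second variable would enter the path
    fit = X @ coefs[:, 1]
    T1 = (y @ fit) / sigma2    # asymptotically Exp(1) under the global null
    p_value = np.exp(-T1)      # Exp(1) survival function
    return T1, p_value
```

Note how the statistic uses the shrunken lasso fit at λ₂ rather than a least-squares refit; that shrinkage is exactly what offsets the adaptivity of the selection, per the argument in the abstract.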
High-dimensional Inference: Confidence intervals, p-values and R-software hdi. arXiv:1408.4026v1, 2014
"... Abstract. We present a (selective) review of recent frequentist high-dimensional inference methods for constructing p-values and confidence intervals in linear and generalized linear models. We include a broad, comparative empirical study which complements the viewpoint from statistical methodology ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
We present a (selective) review of recent frequentist high-dimensional inference methods for constructing p-values and confidence intervals in linear and generalized linear models. We include a broad, comparative empirical study which complements the viewpoint from statistical methodology and theory. Furthermore, we introduce and illustrate the R-package hdi which easily allows the use of different methods and supports reproducibility.
Exact Post Model Selection Inference for Marginal Screening, 2014
"... We develop a framework for post model selection inference, via marginal screening, in linear regression. At the core of this framework is a result that characterizes the exact distribution of linear functions of the response y, conditional on the model being selected (“condi-tion on selection ” fram ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
(Show Context)
We develop a framework for post model selection inference, via marginal screening, in linear regression. At the core of this framework is a result that characterizes the exact distribution of linear functions of the response y, conditional on the model being selected ("condition on selection" framework). This allows us to construct valid confidence intervals and hypothesis tests for regression coefficients that account for the selection procedure. In contrast to recent work in high-dimensional statistics, our results are exact (non-asymptotic) and require no eigenvalue-like assumptions on the design matrix X. Furthermore, the computational cost of marginal regression, constructing confidence intervals and hypothesis testing is negligible compared to the cost of linear regression, thus making our methods particularly suitable for extremely large datasets. Although we focus on marginal screening to illustrate the applicability of the condition on selection framework, this framework is much more broadly applicable. We show how to apply the proposed framework to several other selection procedures including orthogonal matching pursuit, non-negative least squares, and marginal screening+Lasso.
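A toy illustration of the condition-on-selection idea, not the paper's construction: the exact conditional law in the paper is a Gaussian truncated to a selection-dependent polyhedron, whereas this sketch conditions on the much simpler event that a single z-statistic crossed a fixed screening cutoff. The function name and cutoff are hypothetical.

```python
from scipy import stats

def post_selection_pvalue(z_obs, cutoff):
    """P(Z > z_obs | Z > cutoff) for Z ~ N(0, 1): a truncated-normal p-value.

    Conditioning on the screening event keeps the p-value uniform under the
    null, which a naive N(0, 1) p-value would not be after selection.
    """
    return stats.norm.sf(z_obs) / stats.norm.sf(cutoff)

# z = 2.2 looks significant marginally (p ~ 0.014), but after conditioning on
# having crossed a screening cutoff of 2.0 the evidence is weak:
print(post_selection_pvalue(2.2, 2.0))   # ~ 0.61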
Incorporating Group Correlations in Genome-Wide Association Studies Using Smoothed Group Lasso, 2011
"... In genome-wide association studies, penalization is becoming an important approach for identifying genetic markers associated with disease. Motivated by the fact that there exists natural grouping structure in SNPs and more importantly such groups are correlated, we propose a new penalization method ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
In genome-wide association studies, penalization is becoming an important approach for identifying genetic markers associated with disease. Motivated by the fact that there exists natural grouping structure in SNPs and more importantly such groups are correlated, we propose a new penalization method for group variable selection which can properly accommodate the correlation between adjacent groups. This method is based on a combination of the group Lasso penalty and a quadratic penalty on differences of regression coefficients of adjacent groups. The new method is referred to as Smoothed Group Lasso, or SGL. It encourages group sparsity and smoothes regression coefficients for adjacent groups. Canonical correlations are used to determine the weights between groups in the quadratic difference penalty. We derive a group coordinate descent algorithm for computing the solution path. This algorithm uses the closed-form SGL solution for a single-group model and is efficient and stable in high-dimensional settings. The SGL method is further extended to logistic regression for binary response. With the assistance of the MM algorithm, the logistic regression model ...
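A schematic of what the SGL objective could look like under the description above, assuming a squared-error loss plus the two penalties it names; the exact penalty form and scaling in the paper may differ, the canonical-correlation weights are passed in as given, and group means stand in for coefficient differences since adjacent groups may differ in size.

```python
import numpy as np

def sgl_objective(X, y, beta, groups, lam, eta, weights):
    """Illustrative smoothed-group-lasso objective (sketch, not the paper's exact form).

    groups  -- list of index arrays, one per (ordered) group of SNPs
    lam     -- group lasso penalty strength
    eta     -- smoothness penalty strength
    weights -- one weight per adjacent pair of groups (e.g., from canonical correlations)
    """
    rss = 0.5 * np.sum((y - X @ beta) ** 2)
    # Group lasso term: encourages whole groups of coefficients to be zero
    group_lasso = lam * sum(np.sqrt(len(g)) * np.linalg.norm(beta[g]) for g in groups)
    # Quadratic smoothness term on adjacent groups (group means as a stand-in)
    smooth = eta * sum(
        w * (np.mean(beta[g1]) - np.mean(beta[g2])) ** 2
        for w, (g1, g2) in zip(weights, zip(groups[:-1], groups[1:]))
    )
    return rss + group_lasso + smooth
```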
High dimensional expectation-maximization algorithm: Statistical optimization and asymptotic normality. arXiv:1412.8729, 2014
"... We provide a general theory of the expectation-maximization (EM) algorithm for infer-ring high dimensional latent variable models. In particular, we make two contributions: (i) For parameter estimation, we propose a novel high dimensional EM algorithm which naturally incorporates sparsity structure ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
We provide a general theory of the expectation-maximization (EM) algorithm for inferring high dimensional latent variable models. In particular, we make two contributions: (i) For parameter estimation, we propose a novel high dimensional EM algorithm which naturally incorporates sparsity structure into parameter estimation. With an appropriate initialization, this algorithm converges at a geometric rate and attains an estimator with the (near-)optimal statistical rate of convergence. (ii) Based on the obtained estimator, we propose new inferential procedures for testing hypotheses and constructing confidence intervals for low dimensional components of high dimensional parameters. For a broad family of statistical models, our framework establishes the first computationally feasible approach for optimal estimation and asymptotic inference in high dimensions. Our theory is supported by thorough numerical results.
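A bare-bones sketch of a sparsity-truncated EM iteration for one concrete latent-variable model, a symmetric two-component Gaussian mixture x_i ~ 0.5 N(beta, I) + 0.5 N(-beta, I) with s-sparse beta; the model choice, function name, and random initialization are illustrative assumptions, not the authors' algorithm as stated. Each iteration runs a standard EM update and then hard-thresholds the estimate to its s largest coordinates, which is one way to incorporate sparsity into the M-step.

```python
import numpy as np

def truncated_em(X, s, n_iter=50, beta0=None):
    """Sparsity-truncated EM for a symmetric two-Gaussian mixture (sketch)."""
    n, p = X.shape
    beta = np.random.default_rng(0).normal(size=p) if beta0 is None else beta0
    for _ in range(n_iter):
        # E-step: posterior probability each point came from the +beta component
        w = 1.0 / (1.0 + np.exp(-2.0 * (X @ beta)))
        # M-step: closed-form weighted mean update for this model
        beta = ((2.0 * w - 1.0) @ X) / n
        # Truncation: keep the s largest coordinates in magnitude, zero the rest
        keep = np.argsort(np.abs(beta))[-s:]
        mask = np.zeros(p, dtype=bool)
        mask[keep] = True
        beta[~mask] = 0.0
    return beta
```

The truncation step is what ties the sketch to the abstract's claim: restricting each iterate to an s-sparse set is how the iterates can contract at a geometric rate toward an estimator with a near-optimal statistical rate, given a suitable initialization.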