Results 11–20 of 646
On the conditions used to prove oracle results for the Lasso
 Electron. J. Stat.
Abstract

Cited by 107 (5 self)
Abstract: Oracle inequalities and variable selection properties for the Lasso in linear models have been established under a variety of different assumptions on the design matrix. We show in this paper how the different conditions and concepts relate to each other. The restricted eigenvalue condition [2] or the slightly weaker compatibility condition [18] is sufficient for oracle results. We argue that both conditions allow for a fairly general class of design matrices. Hence, optimality of the Lasso for prediction and estimation holds in more general situations than coherence [5, 4] or restricted isometry [10] assumptions would suggest.
Boosting algorithms: Regularization, prediction and model fitting
 Statistical Science
, 2007
Abstract

Cited by 96 (12 self)
Abstract. We present a statistical perspective on boosting. Special emphasis is given to estimating potentially complex parametric or nonparametric models, including generalized linear and additive models as well as regression models for survival analysis. Concepts of degrees of freedom and corresponding Akaike or Bayesian information criteria, particularly useful for regularization and variable selection in high-dimensional covariate spaces, are discussed as well. The practical aspects of boosting procedures for fitting statistical models are illustrated by means of the dedicated open-source software package mboost. This package implements functions which can be used for model fitting, prediction and variable selection. It is flexible, allowing for the implementation of new boosting algorithms optimizing user-specified loss functions. Key words and phrases: Generalized linear models, generalized additive models, gradient boosting, survival analysis, variable selection, software.
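The componentwise L2 boosting at the core of this line of work can be sketched in a few lines. This Python version is an illustrative stand-in for the R package mboost mentioned in the abstract, not its actual implementation; the step size `nu` and step count are arbitrary choices:

```python
import numpy as np

def l2_boost(X, y, n_steps=200, nu=0.1):
    """Componentwise L2 boosting for linear regression (minimal sketch).

    Each step regresses the current residual on every single column,
    picks the column that reduces the residual sum of squares most,
    and takes a small step of size nu in that coordinate."""
    n, p = X.shape
    beta = np.zeros(p)
    intercept = y.mean()
    resid = y - intercept
    col_norms = np.sum(X ** 2, axis=0)
    for _ in range(n_steps):
        # Simple least-squares coefficient of the residual on each column.
        coefs = X.T @ resid / col_norms
        # Column whose one-variable fit reduces the RSS the most.
        rss = np.sum((resid[:, None] - X * coefs) ** 2, axis=0)
        j = np.argmin(rss)
        beta[j] += nu * coefs[j]
        resid -= nu * coefs[j] * X[:, j]
    return intercept, beta
```

Because each step updates only one coordinate by a small amount, early stopping acts as implicit regularization, which is the connection to the Lasso drawn in this literature.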
Adaptive Lasso for sparse high-dimensional regression
 University of Iowa
, 2006
Abstract

Cited by 92 (10 self)
Summary. We study the asymptotic properties of adaptive LASSO estimators in sparse, high-dimensional, linear regression models when the number of covariates may increase with the sample size. We consider variable selection using the adaptive LASSO, where the L1 norms in the penalty are reweighted by data-dependent weights. We show that, if a reasonable initial estimator is available, then under appropriate conditions the adaptive LASSO correctly selects covariates with nonzero coefficients with probability converging to one, and that the estimators of nonzero coefficients have the same asymptotic distribution they would have if the zero coefficients were known in advance. Thus, the adaptive LASSO has an oracle property in the sense of Fan and Li (2001) and Fan and Peng (2004). In addition, under a partial orthogonality condition in which the covariates with zero coefficients are weakly correlated with the covariates with nonzero coefficients, univariate regression can be used to obtain the initial estimator. With this initial estimator, the adaptive LASSO has the oracle property even when the number of covariates is greater than the sample size. Key words and phrases. Penalized regression, high-dimensional data, variable selection, asymptotic normality, oracle property, zero-consistency. Short title. Sparse high-dimensional regression
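The reweighted ℓ1 penalty described above can be implemented by rescaling each column with the initial-estimator weight and then solving an ordinary Lasso. A sketch using scikit-learn, with `alpha` and `gamma` as illustrative tuning values rather than anything from the paper:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

def adaptive_lasso(X, y, alpha=0.1, gamma=1.0):
    """Adaptive Lasso via column rescaling (sketch).

    With weights w_j = 1/|beta_init_j|^gamma from an initial OLS fit,
    the penalty sum_j w_j |beta_j| becomes a plain L1 penalty after the
    substitution theta_j = w_j * beta_j, i.e. a Lasso on X_j / w_j."""
    beta_init = LinearRegression().fit(X, y).coef_
    w = 1.0 / (np.abs(beta_init) ** gamma + 1e-8)  # guard against /0
    X_w = X / w                     # column j scaled by 1/w_j
    theta = Lasso(alpha=alpha).fit(X_w, y).coef_
    return theta / w                # map back to the original scale
```

OLS is only a sensible initial estimator when p < n; the partial-orthogonality setting in the abstract replaces it with univariate regression coefficients.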
H (2007): Network-constrained regularization and variable selection for analysis of genomic data. UPenn Biostatistics Working Paper
Abstract

Cited by 91 (5 self)
Motivation: Graphs or networks are common ways of depicting information. In biology in particular, many different biological processes are represented by graphs, such as regulatory networks or metabolic pathways. This kind of a priori information, gathered over many years of biomedical research, is a useful supplement to standard numerical genomic data such as microarray gene expression data. How to incorporate information encoded by the known biological networks or graphs into the analysis of numerical data raises interesting statistical challenges. In this paper, we introduce a network-constrained regularization procedure for linear regression analysis in order to incorporate the information from these graphs into an analysis of the numerical data, where the network is represented as a graph and its corresponding Laplacian matrix. We define a network-constrained penalty function that penalizes the L1-norm of the coefficients but encourages smoothness of the coefficients on the network. Results: Simulation studies indicated that the method is quite effective in identifying genes and subnetworks that are related to disease and has higher sensitivity than the commonly used procedures that do not use the pathway structure information. Application to one glioblastoma microarray gene expression dataset identified several subnetworks on several of the KEGG transcriptional pathways that are related to survival from glioblastoma, many of which were supported by the published literature. Conclusions: The proposed network-constrained regularization procedure efficiently utilizes the known pathway structures in identifying the relevant genes and the subnetworks that might be related to phenotype in a general regression framework. As more biological networks are identified and documented in databases, the proposed method should find more applications in identifying the subnetworks that are related to diseases and other biological processes. Contact: Hongzhe Li,
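A penalty combining an L1 term with a Laplacian smoothness term, λ1‖β‖1 + λ2 βᵀLβ, can be folded into an ordinary Lasso by data augmentation, mirroring the familiar elastic-net trick. The sketch below assumes that form of the penalty; the λ values are illustrative, not from the paper:

```python
import numpy as np
from scipy.linalg import sqrtm
from sklearn.linear_model import Lasso

def network_lasso(X, y, L, lam1=0.1, lam2=1.0):
    """Network-constrained regression via data augmentation (sketch).

    Stacking sqrt(lam2) * L^{1/2} under X and zeros under y turns
    ||y - Xb||^2 + lam2 * b'Lb + lam1 * ||b||_1 into a plain Lasso."""
    n, p = X.shape
    L_half = np.real(sqrtm(L))          # L is symmetric PSD
    X_aug = np.vstack([X, np.sqrt(lam2) * L_half])
    y_aug = np.concatenate([y, np.zeros(p)])
    # sklearn divides the squared loss by 2 * (augmented sample size),
    # so rescale lam1 to match the unnormalized objective above.
    fit = Lasso(alpha=lam1 / (2 * (n + p)), fit_intercept=False)
    fit.fit(X_aug, y_aug)
    return fit.coef_
```

The quadratic term pulls coefficients of neighboring nodes in the graph toward each other, which is how the method encourages smoothness over the network.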
Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators
 ELECTRONIC JOURNAL OF STATISTICS
, 2008
Bolasso: model consistent lasso estimation through the bootstrap
 In Proceedings of the Twenty-fifth International Conference on Machine Learning (ICML)
, 2008
Abstract

Cited by 84 (15 self)
We consider the least-squares linear regression problem with regularization by the ℓ1-norm, a problem usually referred to as the Lasso. In this paper, we present a detailed asymptotic analysis of model consistency of the Lasso. For various decays of the regularization parameter, we compute asymptotic equivalents of the probability of correct model selection (i.e., variable selection). For a specific rate decay, we show that the Lasso selects all the variables that should enter the model with probability tending to one exponentially fast, while it selects all other variables with strictly positive probability. We show that this property implies that if we run the Lasso for several bootstrapped replications of a given sample, then intersecting the supports of the Lasso bootstrap estimates leads to consistent model selection. This novel variable selection algorithm, referred to as the Bolasso, is compared favorably to other linear regression methods on synthetic data and datasets from the UCI machine learning repository.
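The intersect-the-supports idea is short enough to sketch directly; the `alpha` and `n_boot` values below are illustrative choices, not the paper's tuning:

```python
import numpy as np
from sklearn.linear_model import Lasso

def bolasso_support(X, y, alpha=0.1, n_boot=32, seed=0):
    """Bolasso support estimate (sketch).

    Fit the Lasso on bootstrap resamples of the data and keep only the
    variables selected in every replication: true variables survive all
    replications, while each spurious variable is eventually dropped."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    support = np.ones(p, dtype=bool)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # bootstrap resample
        coef = Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_
        support &= coef != 0                       # intersect supports
    return np.flatnonzero(support)
```

A final unpenalized refit on the estimated support then gives the coefficient estimates.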
High-dimensional additive modeling
 Annals of Statistics
Abstract

Cited by 81 (3 self)
We propose a new sparsity-smoothness penalty for high-dimensional generalized additive models. The combination of sparsity and smoothness is crucial for mathematical theory as well as performance on finite-sample data. We present a computationally efficient algorithm, with provable numerical convergence properties, for optimizing the penalized likelihood. Furthermore, we provide oracle results which yield asymptotic optimality of our estimator for high-dimensional but sparse additive models. Finally, an adaptive version of our sparsity-smoothness penalized approach yields large additional performance gains.
Distributed Spectrum Sensing for Cognitive Radio Networks by Exploiting Sparsity
Abstract

Cited by 79 (7 self)
Abstract—A cooperative approach to the sensing task of wireless cognitive radio (CR) networks is introduced based on a basis expansion model of the power spectral density (PSD) map in space and frequency. Joint estimation of the model parameters enables identification of the (un)used frequency bands at arbitrary locations, and thus facilitates spatial frequency reuse. The novel scheme capitalizes on two forms of sparsity: the first one introduced by the narrowband nature of transmit-PSDs relative to the broad swaths of usable spectrum; and the second one emerging from sparsely located active radios in the operational space. An estimator of the model coefficients is developed based on the Lasso algorithm to exploit these forms of sparsity and reveal the unknown positions of transmitting CRs. The resultant scheme can be implemented via distributed online iterations, which solve quadratic programs locally (one per radio), and are adaptive to changes in the system. Simulations corroborate that exploiting sparsity in CR sensing reduces spatial and frequency spectrum leakage by 15 dB relative to least-squares (LS) alternatives. Index Terms—Cognitive radios, compressive sampling, cooperative systems, distributed estimation, parallel network processing, sensing, sparse models, spectral analysis.
An interior-point method for large-scale ℓ1-regularized logistic regression
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2007
Abstract

Cited by 76 (4 self)
Recently, a lot of attention has been paid to ℓ1-regularization based methods for sparse signal reconstruction (e.g., basis pursuit denoising and compressed sensing) and feature selection (e.g., the Lasso algorithm) in signal processing, statistics, and related fields. These problems can be cast as ℓ1-regularized least-squares programs (LSPs), which can be reformulated as convex quadratic programs and then solved by several standard methods such as interior-point methods, at least for small- and medium-size problems. In this paper, we describe a specialized interior-point method for solving large-scale ℓ1-regularized LSPs that uses the preconditioned conjugate gradients algorithm to compute the search direction. The interior-point method can solve large sparse problems, with a million variables and observations, in a few tens of minutes on a PC. It can efficiently solve large dense problems that arise in sparse signal recovery with orthogonal transforms, by exploiting fast algorithms for these transforms. The method is illustrated on a magnetic resonance imaging data set.
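The reformulation the abstract refers to rests on the standard split β = u − v with u, v ≥ 0, which turns the nonsmooth ℓ1 term into a linear one under bound constraints. The sketch below solves the resulting smooth problem with a generic bound-constrained solver; it is not the paper's specialized PCG interior-point method, just the reformulation it builds on:

```python
import numpy as np
from scipy.optimize import minimize

def l1_ls_split(X, y, lam=1.0):
    """l1-regularized least squares via the split beta = u - v (sketch).

    minimize ||X(u - v) - y||^2 + lam * sum(u + v),  u >= 0, v >= 0,
    which is smooth with simple bounds, so L-BFGS-B applies."""
    n, p = X.shape

    def obj(z):
        u, v = z[:p], z[p:]
        r = X @ (u - v) - y
        return r @ r + lam * np.sum(u + v)

    def grad(z):
        u, v = z[:p], z[p:]
        g = 2 * X.T @ (X @ (u - v) - y)
        return np.concatenate([g + lam, -g + lam])

    res = minimize(obj, np.zeros(2 * p), jac=grad, method="L-BFGS-B",
                   bounds=[(0, None)] * (2 * p))
    u, v = res.x[:p], res.x[p:]
    return u - v
```

At the optimum at most one of u_j, v_j is active for each j, so u − v recovers the signed coefficients.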
A SELECTIVE OVERVIEW OF VARIABLE SELECTION IN HIGH DIMENSIONAL FEATURE SPACE
, 2010
Abstract

Cited by 70 (6 self)
High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high dimensional variable selection. Questions of what limits on dimensionality such methods can handle, what role the penalty functions play, and what the resulting statistical properties are rapidly drive the advances of the field. The properties of nonconcave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods.
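Independence screening, mentioned at the end of the abstract, ranks predictors by their marginal association with the response and keeps only the top d before any joint fitting. A minimal sketch, with the cutoff d as an illustrative choice:

```python
import numpy as np

def sis(X, y, d):
    """Sure independence screening (sketch): rank predictors by absolute
    marginal correlation with the response and keep the top d indices."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0)
                                * np.linalg.norm(yc))
    return np.argsort(corr)[::-1][:d]
```

A second-stage penalized method (e.g., the Lasso or a nonconcave penalty) is then run on only the d retained predictors, which is the "two-scale" strategy the abstract refers to.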