Results 1-10 of 313
Regularization and variable selection via the Elastic Net
Journal of the Royal Statistical Society, Series B, 2005
Cited by 922 (13 self)
Summary. We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the lasso is not a very satisfactory variable selection method in the p ≫ n case. An algorithm called LARS-EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
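The grouping effect described in this abstract can be illustrated with a small sketch. It is not the paper's path algorithm: it uses plain coordinate descent on an assumed objective, (1/2)·||y − Xβ||² + lam1·||β||₁ + (lam2/2)·||β||², and the function names, the scaling of the penalties, and the toy data are all assumptions made for illustration.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net_cd(X, y, lam1, lam2, n_sweeps=500):
    """Coordinate descent for the assumed objective
    0.5*||y - X b||^2 + lam1*||b||_1 + 0.5*lam2*||b||^2.

    X is a list of rows, y a list of responses. Assumes every column
    has a nonzero squared norm or lam2 > 0 (otherwise division by zero).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X beta (beta starts at zero)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # x_j^T (residual with predictor j's own contribution added back)
            rho = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam1) / (col_sq[j] + lam2)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Toy data: two identical (perfectly correlated) predictors.
X = [[1.0, 1.0], [-1.0, -1.0], [2.0, 2.0], [-2.0, -2.0]]
y = [2.0, -2.0, 4.0, -4.0]
print("elastic net:", elastic_net_cd(X, y, lam1=1.0, lam2=1.0))
print("lasso      :", elastic_net_cd(X, y, lam1=1.0, lam2=0.0))
```

On this toy data the run with lam2 > 0 should spread the weight almost equally over the two identical predictors, while the lam2 = 0 (lasso) run, started from zero, loads essentially all of it onto the first copy.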
Consensus clustering: A resampling-based method for class discovery and visualization of gene expression microarray data
MACHINE LEARNING 52 (2003) 91–118 FUNCTIONAL GENOMICS SPECIAL ISSUE, 2003
An Empirical Bayes Approach to Inferring Large-Scale Gene Association Networks
BIOINFORMATICS, 2004
Cited by 234 (6 self)
Motivation: Genetic networks are often described statistically by graphical models (e.g. Bayesian networks). However, inferring the network structure offers a serious challenge in microarray analysis where the sample size is small compared to the number of considered genes. This renders many standard algorithms for graphical models inapplicable, and inferring genetic networks an “ill-posed” inverse problem.
Methods: We introduce a novel framework for small-sample inference of graphical models from gene expression data. Specifically, we focus on so-called graphical Gaussian models (GGMs) that are now frequently used to describe gene association networks and to detect conditionally dependent genes. Our new approach is based on (i) improved (regularized) small-sample point estimates of partial correlation, (ii) an exact test of edge inclusion with adaptive estimation of the degree of freedom, and (iii) a heuristic network search based on false discovery rate multiple testing. Steps (ii) and (iii) correspond to an empirical Bayes estimate of the network topology.
Results: Using computer simulations we investigate the sensitivity (power) and specificity (true negative rate) of the proposed framework to estimate GGMs from microarray data. This shows that it is possible to recover the true network topology with high accuracy even for small-sample data sets. Subsequently, we analyze gene expression data from a breast cancer tumor study and illustrate our approach by inferring a corresponding large-scale gene association network for 3,883 genes.
Availability: The authors have implemented the approach in the R package “GeneTS” that is freely available from
Correcting sample selection bias by unlabeled data
Cited by 205 (12 self)
We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first recover sampling distributions and then make appropriate corrections based on the distribution estimate. We present a nonparametric method which directly produces resampling weights without distribution estimation. Our method works by matching distributions between training and testing sets in feature space. Experimental results demonstrate that our method works well in practice.
Sparse graphical models for exploring gene expression data
Journal of Multivariate Analysis, 2004
Cited by 202 (24 self)
DMS0112069. Any opinions, findings, and conclusions or recommendations expressed in this material are
BagBoosting for tumor classification with gene expression data
Bioinformatics, 2004
Cited by 192 (2 self)
Motivation: Microarray experiments are expected to contribute significantly to the progress in cancer treatment by enabling a precise and early diagnosis. They create a need for class prediction tools, which can deal with a large number of highly correlated input variables, perform feature selection and provide class probability estimates that serve as a quantification of the predictive uncertainty. A very promising solution is to combine the two ensemble schemes bagging and boosting to a novel algorithm called BagBoosting.
Results: When bagging is used as a module in boosting, the resulting classifier consistently improves the predictive performance and the probability estimates of both bagging and boosting on real and simulated gene expression data. This quasi-guaranteed improvement can be obtained by simply making a bigger computing effort. The advantageous predictive potential is also confirmed by comparing BagBoosting to several established class prediction tools for microarray data.
Bayesian Factor Regression Models in the "Large p, Small n" Paradigm
Bayesian Statistics, 2003
Cited by 184 (16 self)
TOR REGRESSION MODELS 1.1 SVD Regression Begin with the linear model y = Xβ + ε where y is the n-vector of responses, X is the n × p matrix of predictors, β is the p-vector regression parameter, and ε ~ N(0, σ²I) is the n-vector error term. Of key interest are cases when p >> n, when X is "long and skinny." The standard empirical factor (principal component) regression is best represented using the reduced singular-value decomposition (SVD) of X, namely X = FA where F is the n × k factor matrix (columns are factors, rows are samples) and A is the k × p SVD "loadings" matrix, subject to AA′ = I and F′F = D² where D is the diagonal matrix of k positive singular values, arranged in decreasing order. This reduced form assumes factors with zero singular values have been ignored without loss; k ≤ n with equality only if all singular values are positive. Now the regression transforms via Xβ = Fθ where θ = Aβ is the k-vector of regression parameters for the factor variables, representing
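The reduced-SVD reparameterization in this excerpt can be checked numerically. The sketch below assumes NumPy is available; the helper name `svd_factors`, the tolerance used to drop zero singular values, and the random test data are illustrative assumptions, not part of the paper. It builds F and A from the SVD of X and verifies that the p-dimensional regression on X equals a k-dimensional regression on the factors.

```python
import numpy as np


def svd_factors(X, tol=1e-10):
    """Return (F, A, d) with X = F @ A, A @ A.T = I and F.T @ F = diag(d**2).

    d holds the positive singular values in decreasing order; singular
    values below tol * d[0] are treated as zero and dropped (an assumed
    cutoff, chosen only for this illustration).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    keep = s > tol * s[0]
    d = s[keep]
    F = U[:, keep] * d   # n x k factor matrix (columns scaled by singular values)
    A = Vt[keep, :]      # k x p loadings matrix with orthonormal rows
    return F, A, d


# "Large p, small n" toy example: n = 5 samples, p = 20 predictors.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))
beta = rng.standard_normal(20)

F, A, d = svd_factors(X)
theta = A @ beta  # k-vector of factor regression parameters

print("k =", len(d))
print("X beta == F theta:", np.allclose(X @ beta, F @ theta))
```

The design point is that only the k ≤ n coordinates θ = Aβ affect the fitted values, so the linear predictor can be handled in k dimensions even when p is far larger than n.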
Boosting algorithms: Regularization, prediction and model fitting
Statistical Science, 2007
Cited by 96 (12 self)
Abstract. We present a statistical perspective on boosting. Special emphasis is given to estimating potentially complex parametric or nonparametric models, including generalized linear and additive models as well as regression models for survival analysis. Concepts of degrees of freedom and corresponding Akaike or Bayesian information criteria, particularly useful for regularization and variable selection in high-dimensional covariate spaces, are discussed as well. The practical aspects of boosting procedures for fitting statistical models are illustrated by means of the dedicated open-source software package mboost. This package implements functions which can be used for model fitting, prediction and variable selection. It is flexible, allowing for the implementation of new boosting algorithms optimizing user-specified loss functions. Key words and phrases: Generalized linear models, generalized additive models, gradient boosting, survival analysis, variable selection, software.
Boosting for high-dimensional linear models
THE ANNALS OF STATISTICS, 2006
Cited by 80 (4 self)
We prove that boosting with the squared error loss, L2Boosting, is consistent for very high-dimensional linear models, where the number of predictor variables is allowed to grow essentially as fast as O(exp(sample size)), assuming that the true underlying regression function is sparse in terms of the ℓ1-norm of the regression coefficients. In the language of signal processing, this means consistency for de-noising using a strongly overcomplete dictionary if the underlying signal is sparse in terms of the ℓ1-norm. We also propose here an AIC-based method for tuning, namely for choosing the number of boosting iterations. This makes L2Boosting computationally attractive since it is not required to run the algorithm multiple times for cross-validation as commonly used so far. We demonstrate L2Boosting for simulated data, in particular where the predictor dimension is large in comparison to sample size, and for a difficult tumor-classification problem with gene expression microarray data.
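A minimal sketch of boosting with squared error loss for a linear model follows. It assumes the componentwise least-squares variant with a shrinkage factor: each iteration refits the current residuals on the single predictor that reduces the residual sum of squares most. The function name `l2boost`, the step size `nu`, and the fixed iteration count are assumptions for illustration; the AIC-based stopping rule mentioned in the abstract is not implemented here.

```python
def l2boost(X, y, n_iter=200, nu=0.1):
    """Componentwise L2 boosting for a linear model without intercept.

    X is a list of rows, y a list of responses. Each iteration picks the
    single predictor whose least-squares fit to the current residuals
    reduces the residual sum of squares most, then moves that coefficient
    by a fraction nu of the fitted value.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        best_j, best_coef, best_gain = None, 0.0, 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero predictor cannot explain anything
            dot = sum(X[i][j] * resid[i] for i in range(n))
            gain = dot * dot / col_sq[j]  # drop in RSS from fitting predictor j
            if gain > best_gain:
                best_j, best_coef, best_gain = j, dot / col_sq[j], gain
        if best_j is None:
            break  # residuals are orthogonal to every predictor
        step = nu * best_coef
        beta[best_j] += step
        for i in range(n):
            resid[i] -= step * X[i][best_j]
    return beta


# Toy data: the response depends only on the first of three predictors.
X = [[1.0, 1.0, 0.0], [-1.0, 1.0, 2.0], [2.0, 0.0, -1.0], [-2.0, 1.0, 1.0]]
y = [3.0, -3.0, 6.0, -6.0]
print(l2boost(X, y))
```

Because a coefficient only becomes nonzero once its predictor has been selected, stopping early yields a sparse fit, which is why the number of iterations acts as the tuning parameter.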