Results 1–10 of 74
Regularization paths for generalized linear models via coordinate descent
2009
Cited by 698 (14 self)
We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems, while the penalties include ℓ1 (the lasso), ℓ2 (ridge regression), and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods.
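As a sketch of the method this abstract describes — cyclical coordinate descent for an ℓ1-penalized model — here is a minimal lasso solver. The soft-thresholding update is the standard one; everything else (standardized columns, a fixed iteration count, no active sets or weights) is a simplification relative to the authors' glmnet implementation.

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator S(z, gamma), the core of the lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclical coordinate descent for the lasso:
    minimize (1/(2n)) ||y - Xb||^2 + lam ||b||_1.
    Assumes the columns of X are standardized (mean 0, variance 1)."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's contribution removed
            r = y - X @ b + X[:, j] * b[j]
            # For standardized X the coordinate update is a single
            # soft-thresholding step
            b[j] = soft_threshold(X[:, j] @ r / n, lam)
    return b
```

With `lam = 0` this reduces to cycling ordinary least-squares updates; larger `lam` shrinks coefficients and sets some exactly to zero.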
Top 10 algorithms in data mining
2007
Cited by 113 (2 self)
This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, k-NN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. For each algorithm, we provide a description, discuss its impact, and review current and further research on it. These 10 algorithms cover classification, …
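One of the listed algorithms, k-NN, is compact enough to sketch in full. A minimal NumPy version (Euclidean distance, majority vote) might look like this; real implementations add tie-breaking, distance weighting, and spatial indexing.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points
    (Euclidean distance) -- the k-NN algorithm from the ICDM top-10 list."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]          # indices of the k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]         # most common label among them
```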
Building predictive models in R using the caret package
Journal of Statistical Software, 2008
Cited by 53 (1 self)
The caret package, short for classification and regression training, contains numerous tools for developing predictive models using the rich set of models available in R. The package focuses on simplifying model training and tuning across a wide variety of modeling techniques. It also includes methods for preprocessing training data, calculating variable importance, and visualizing models. An example from computational chemistry is used to illustrate the functionality on a real data set and to benchmark the benefits of parallel processing with several types of models.
FlexMix Version 2: Finite Mixtures with Concomitant Variables and Varying and Constant Parameters
Cited by 41 (13 self)
This article is a (slightly) modified version of Grün and Leisch (2008b), published in the Journal of Statistical Software. flexmix provides infrastructure for flexible fitting of finite mixture models in R using the expectation-maximization (EM) algorithm or one of its variants. The functionality of the package has been enhanced: concomitant variable models, as well as varying and constant parameters for the component-specific generalized linear regression models, can now be fitted. The application of the package is demonstrated on several examples, the implementation is described, and examples are given to illustrate how new drivers for the component-specific models and the concomitant variable models can be defined.
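The EM algorithm the package builds on can be illustrated on the simplest case, a two-component 1-D Gaussian mixture. This sketch shows only the E-step (posterior responsibilities) and M-step (weighted parameter updates); it is not flexmix's implementation, which handles general component models and concomitant variables.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch of
    the algorithm family flexmix builds on)."""
    # Crude initialization from the data quantiles
    mu = np.array([np.quantile(x, 0.25), np.quantile(x, 0.75)])
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        dens = (pi / (sigma * np.sqrt(2 * np.pi))
                * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted updates of weights, means, sds
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sigma
```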
Regularization Paths for Cox’s Proportional Hazards Model via Coordinate Descent
Cited by 30 (0 self)
We introduce a pathwise algorithm for the Cox proportional hazards model, regularized by convex combinations of ℓ1 and ℓ2 penalties (elastic net). Our algorithm fits via cyclical coordinate descent and employs warm starts to find a solution along a regularization path. We demonstrate the efficacy of our algorithm on real and simulated data sets, and find a considerable speedup over competing methods.
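The pathwise/warm-start structure the abstract describes can be sketched with a simpler loss: the toy below uses squared error rather than the Cox partial likelihood, so it illustrates only the path-with-warm-starts idea, not the paper's algorithm.

```python
import numpy as np

def soft_threshold(z, gamma):
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_path(X, y, lams, n_iter=100):
    """Pathwise coordinate descent with warm starts: solve the lasso for a
    decreasing sequence of penalties, starting each fit from the previous
    solution. Assumes the columns of X are standardized."""
    n, p = X.shape
    b = np.zeros(p)
    path = []
    for lam in lams:              # warm start: b carries over between penalties
        for _ in range(n_iter):
            for j in range(p):
                r = y - X @ b + X[:, j] * b[j]
                b[j] = soft_threshold(X[:, j] @ r / n, lam)
        path.append(b.copy())
    return np.array(path)
```

Because consecutive penalties give similar solutions, each warm-started fit needs far fewer sweeps than a cold start, which is the source of the speedup the abstract reports.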
The VGAM Package for Categorical Data Analysis
Cited by 30 (0 self)
Classical categorical regression models such as the multinomial logit and proportional odds models are shown to be readily handled by the vector generalized linear and additive model (VGLM/VGAM) framework. Additionally, there are natural extensions, such as reduced-rank VGLMs for dimension reduction, and allowing covariates that have values specific to each linear/additive predictor, e.g., for consumer choice modeling. This article describes some of the framework behind the VGAM R package, its usage, and implementation details.
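The multinomial logit model mentioned first can be sketched independently of the VGLM framework. Below is a minimal NumPy version of its class probabilities in the baseline-category parameterization; the names and shapes are illustrative only and are not VGAM's API.

```python
import numpy as np

def multinomial_logit_probs(X, B):
    """Class probabilities for the multinomial logit model: softmax over
    K-1 linear predictors, with the last class as baseline (eta = 0).
    X has shape (n, d); B has shape (d, K-1)."""
    eta = X @ B                                      # (n, K-1) linear predictors
    eta = np.column_stack([eta, np.zeros(len(X))])   # baseline class
    e = np.exp(eta - eta.max(axis=1, keepdims=True)) # numerically stable softmax
    return e / e.sum(axis=1, keepdims=True)
```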
The Goldilocks effect: Human infants allocate attention to visual sequences that are neither too simple nor too complex
PLoS ONE, 2012
Cited by 26 (5 self)
Human infants, like immature members of any species, must be highly selective in sampling information from their environment to learn efficiently. Failure to be selective would waste precious computational resources on material that is already known (too simple) or unknowable (too complex). In two experiments with 7 and 8montholds, we measure infants ’ visual attention to sequences of events varying in complexity, as determined by an ideal learner model. Infants’ probability of looking away was greatest on stimulus items whose complexity (negative log probability) according to the model was either very low or very high. These results suggest a principle of infant attention that may have broad applicability: infants implicitly seek to maintain intermediate rates of information absorption and avoid wasting cognitive resources on overly simple or overly complex events.
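The complexity measure the study uses — negative log probability under a learner model — is easy to illustrate. The count-based predictive model below, with add-alpha smoothing, is a hypothetical stand-in for the paper's ideal-learner model, not its actual implementation.

```python
import math

def sequence_surprisals(events, alpha=1.0):
    """Surprisal (negative log probability, in bits) of each item in a
    sequence under a count-based predictive model with add-alpha smoothing.
    Illustrative stand-in for an ideal-learner model."""
    counts = {}
    vocab = len(set(events))          # assume the event alphabet is known
    out = []
    for e in events:
        total = sum(counts.values())
        p = (counts.get(e, 0) + alpha) / (total + alpha * vocab)
        out.append(-math.log2(p))     # high surprisal = complex/unexpected
        counts[e] = counts.get(e, 0) + 1
    return out
```

Repeated events drive surprisal down (too simple), while a rare event spikes it (too complex) — the two ends of the U-shape where infant look-away probability was greatest.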
Identifying Differentially Expressed Genes in Time Course Microarray Data
Cited by 8 (0 self)
Identifying differentially expressed (DE) genes across conditions or treatments is a typical problem in microarray experiments. In time course microarray experiments (under two or more conditions/treatments), it is sometimes of interest to identify two classes of DE genes: those with no time-condition interactions (called parallel DE genes, or PDE), and those with time-condition interactions (non-parallel DE genes, NPDE). Although many methods have been proposed for identifying DE genes in time course experiments, methods for discerning NPDE genes from the general DE genes are still lacking. We propose a functional ANOVA mixed-effect model for time course gene expression observations. The fixed effect (the mean curve) of the model decomposes bivariate functions of time and treatments (or experimental conditions) as in the classic ANOVA method and provides the associated notions of main effects and interactions. Random effects capture time-dependent correlation structures. In this model, identifying NPDE genes is equivalent to testing the significance of the time-condition interaction, for which an approximate F-test is suggested. We examined the performance of the proposed method on simulated data sets in comparison with some existing methods, and applied the method to a study of human reaction to endotoxin stimulation, as well as to a cell cycle expression data set.
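The central test — does the time trend differ across conditions? — can be sketched in a plain linear-model setting. The sketch below compares a "parallel" model against one with an interaction term and is a simplified stand-in (linear trends, no random effects) for the paper's functional ANOVA mixed-effect F-test.

```python
import numpy as np

def interaction_fstat(time, cond, y):
    """F statistic for a time-by-condition interaction: compare a parallel
    model (common time trend + condition offset) against a full model whose
    time slope may differ by condition. Large F = evidence of interaction,
    i.e. a non-parallel (NPDE-like) pattern."""
    n = len(y)
    X_red = np.column_stack([np.ones(n), time, cond])          # parallel curves
    X_full = np.column_stack([np.ones(n), time, cond, time * cond])
    rss = lambda X: np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
    df1 = 1                      # one interaction parameter
    df2 = n - 4                  # residual df under the full model
    return ((rss(X_red) - rss(X_full)) / df1) / (rss(X_full) / df2)
```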
A framework for unbiased model selection based on boosting
Journal of Computational and Graphical Statistics
Cited by 7 (4 self)
Variable selection and model choice are of major concern in many statistical applications, especially in high-dimensional regression models. Boosting is a convenient statistical method that combines model fitting with intrinsic model selection. We investigate the impact of base-learner specification on the performance of boosting as a model selection procedure. We show that variable selection may be biased if the covariates are of a different nature. Important examples are models combining continuous and categorical covariates, especially if the number of categories is large. In this case, least squares base-learners offer increased flexibility for the categorical covariate and are preferred even if the categorical covariate is non-informative. Similar difficulties arise when comparing linear and nonlinear base-learners for a continuous covariate. The additional flexibility in the nonlinear base-learner again yields a preference for the more complex modeling alternative. We investigate these problems from a theoretical perspective and suggest a framework for unbiased model selection based on a general class of penalized least squares base-learners. Making all base-learners comparable in terms of their degrees of freedom strongly reduces the selection bias observed in naive boosting specifications. The importance of unbiased model selection is demonstrated in simulations and in an application to forest health models.
Z-score linear discriminant analysis for EEG-based brain-computer interfaces
PLoS ONE, 2013; 8(9):e74433. doi:10.1371/journal.pone.0074433
Cited by 4 (1 self)
Linear discriminant analysis (LDA) is one of the most popular classification algorithms for brain-computer interfaces (BCI). LDA assumes a Gaussian distribution of the data, with equal covariance matrices for the classes concerned; however, this assumption usually does not hold in actual BCI applications, where heteroscedastic class distributions are common. This paper proposes an enhanced version of LDA, namely z-score linear discriminant analysis (Z-LDA), which introduces a new decision boundary definition strategy to handle heteroscedastic class distributions. Z-LDA defines the decision boundary through a z-score that uses both the mean and the standard deviation of the projected data, and can therefore adaptively adjust the boundary to fit heteroscedastic distributions. Results from both a simulation data set and two actual BCI data sets consistently show that Z-LDA achieves significantly higher average classification accuracies than conventional LDA, indicating the superiority of the new decision boundary definition strategy.
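The decision rule described — classify by z-score so each class uses its own projected spread — can be sketched as follows. This is an illustrative reconstruction from the abstract, not the authors' code; the Fisher direction and the two-class setup are assumptions.

```python
import numpy as np

def fit_zlda(X0, X1):
    """Z-score variant of LDA (sketch): project onto the Fisher direction,
    then classify by the smaller absolute z-score, so each class uses its
    own projected standard deviation (handles heteroscedastic classes)."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter for the Fisher discriminant direction
    Sw = np.cov(X0.T) * (len(X0) - 1) + np.cov(X1.T) * (len(X1) - 1)
    w = np.linalg.solve(Sw, mu1 - mu0)
    m0, s0 = (X0 @ w).mean(), (X0 @ w).std()
    m1, s1 = (X1 @ w).mean(), (X1 @ w).std()
    def predict(x):
        # Class whose projected distribution yields the smaller |z-score|
        z0 = abs(x @ w - m0) / s0
        z1 = abs(x @ w - m1) / s1
        return 0 if z0 < z1 else 1
    return predict
```

Unlike standard LDA's single pooled threshold, the per-class standard deviations shift the effective boundary toward the tighter class, which is the adaptive behavior the abstract claims.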