Results 1–10 of 106
On Discriminative vs. Generative classifiers: A comparison of logistic regression and naive Bayes
2001
Cited by 520 (8 self)
We compare discriminative and generative learning as typified by logistic regression and naive Bayes. We show, contrary to a widely held belief that discriminative classifiers are almost always to be preferred, that there can often be two distinct regimes of performance as the training set size is increased, one in which each algorithm does better. This stems from the observation, borne out in repeated experiments, that while discriminative learning has lower asymptotic error, a generative classifier may also approach its (higher) asymptotic error much faster.
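The two-regime behavior described in this abstract can be reproduced in a toy setting. The sketch below is a hypothetical, much-simplified version of the paper's experiments (one Gaussian feature, equal class priors; all function names and constants are invented here): it fits a generative classifier (class-mean "naive Bayes") and a discriminative one (logistic regression by gradient descent) on increasing training set sizes and compares test accuracy.

```python
import math
import random

random.seed(0)

def sample(n):
    # toy data: class 0 ~ N(0, 1), class 1 ~ N(2, 1), equal priors
    data = []
    for _ in range(n):
        y = random.randint(0, 1)
        data.append((random.gauss(2.0 * y, 1.0), y))
    return data

def fit_generative(data):
    # generative fit: estimate each class mean (unit variance assumed)
    means = {}
    for c in (0, 1):
        xs = [x for x, y in data if y == c]
        means[c] = sum(xs) / len(xs) if xs else float(c)
    return means

def generative_predict(means, x):
    # equal priors and variances: pick the nearer class mean
    return 0 if abs(x - means[0]) <= abs(x - means[1]) else 1

def fit_logistic(data, steps=2000, lr=0.1):
    # discriminative fit: batch gradient descent on the logistic loss
    w = b = 0.0
    n = len(data)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

test_set = sample(2000)
for n in (10, 500):
    train = sample(n)
    means = fit_generative(train)
    w, b = fit_logistic(train)
    acc_gen = accuracy(lambda x: generative_predict(means, x), test_set)
    acc_dis = accuracy(lambda x: 1 if w * x + b > 0.0 else 0, test_set)
    print(f"n={n}: generative {acc_gen:.3f}, discriminative {acc_dis:.3f}")
```

Note that because the generative model is correctly specified in this toy, both classifiers converge to the same error and the gap is only visible at small n; the asymptotic advantage of the discriminative fit that the paper discusses arises when the generative assumptions are wrong.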
Flexible Discriminant Analysis by Optimal Scoring
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
1993
Cited by 143 (12 self)
Fisher's linear discriminant analysis is a valuable tool for multigroup classification. With a large number of predictors, one can find a reduced number of discriminant coordinate functions that are "optimal" for separating the groups. With two such functions one can produce a classification map that partitions the reduced space into regions that are identified with group membership, and the decision boundaries are linear. This paper is about richer nonlinear classification schemes. Linear discriminant analysis is equivalent to multiresponse linear regression using optimal scorings to represent the groups. We obtain nonparametric versions of discriminant analysis by replacing linear regression by any nonparametric regression method. In this way, any multiresponse regression technique (such as MARS or neural networks) can be post-processed to improve its classification performance.
Influence and Measurement Error in Logistic Regression
1983
Cited by 44 (10 self)
This dissertation concerns the use of logistic regression when certain standard model assumptions are violated. Chapters I and II study the problem of estimating regression parameters when covariates are subject to measurement error. The latter chapters study robust methods applicable to logistic regression. To facilitate study of the errors-in-variables problem, a small-measurement-error asymptotic theory is developed. This allows comparison of certain estimators which have appeared in the literature and also suggests new estimators which are shown to have better asymptotic properties. A small Monte Carlo study confirms the superiority of the new estimators in certain settings. In the course of studying the asymptotic behavior of the various estimators, interesting use is made of some random convex analysis. To deal with the problem of messy data, i.e. outliers and extreme covariables, several bounded-influence estimators are proposed. The optimality properties of these estimators are studied in Chapter III. Asymptotic theory for the robust procedures is given in Chapter IV. Finally, Chapter V concludes the thesis with an application of these methods to two sets of data.
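The measurement-error problem the first chapters address can be illustrated with a minimal simulation. This sketch uses the simpler ordinary-least-squares setting rather than logistic regression, and every data-generating number is invented for illustration: regressing on a noisily observed covariate attenuates the slope estimate toward zero by the reliability ratio var(x) / (var(x) + var(u)).

```python
import random

random.seed(1)

# true model: y = 2*x + noise, but we only observe w_obs = x + u
n = 5000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 * xi + random.gauss(0.0, 0.5) for xi in x]
w_obs = [xi + random.gauss(0.0, 1.0) for xi in x]  # measurement error u ~ N(0, 1)

def ols_slope(xs, ys):
    # ordinary least-squares slope of ys on xs
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

print(ols_slope(x, y))      # close to the true slope 2
print(ols_slope(w_obs, y))  # attenuated toward 2 * 1/(1 + 1) = 1
```

The same attenuation phenomenon, in a more delicate form, is what the small-measurement-error asymptotics of the dissertation quantify for logistic regression.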
Leafsnap: A Computer Vision System for Automatic Plant Species Identification
 Computer Vision – ECCV
Cited by 43 (6 self)
Abstract. We describe the first mobile app for identifying plant species using automatic visual recognition. The system – called Leafsnap – identifies tree species from photographs of their leaves. Key to this system are computer vision components for discarding non-leaf images, segmenting the leaf from an untextured background, extracting features representing the curvature of the leaf's contour over multiple scales, and identifying the species from a dataset of the 184 tree species of the Northeastern United States. Our system obtains state-of-the-art performance on the real-world images from the new Leafsnap Dataset – the largest of its kind. Throughout the paper, we document many of the practical steps needed to produce a computer vision system such as ours, which currently has nearly a million users.
Feature extraction by burst-like spike patterns in multiple sensory maps
 J Neurosci
1998
Cited by 37 (1 self)
In most sensory systems, higher order central neurons extract those stimulus features from the sensory periphery that are behaviorally relevant (e.g., Marr, 1982; Heiligenberg, 1991). Recent studies have quantified the time-varying information carried by the spike trains of sensory neurons in various systems using stimulus estimation methods (Bialek et al., 1991; Wessel et al., 1996). Here, we address the question of how this information is transferred from the sensory neuron level to higher order neurons across multiple sensory maps by using the electrosensory system of weakly electric fish as a model. To determine how electric field amplitude modulations are temporally encoded and processed at two subsequent stages of the amplitude coding pathway, we recorded the responses of P-type afferents and E- and I-type pyramidal cells in the electrosensory lateral line lobe (ELL) to random distortions of a mimic of the fish's own ...
A Theory of Multiple Classifier Systems And Its Application to Visual Word Recognition
1992
Cited by 34 (8 self)
Despite the success of many pattern recognition systems in constrained domains, problems that involve noisy input and many classes remain difficult. A promising direction is to use several classifiers simultaneously, such that they can complement each other in correctness. This thesis is concerned with decision combination in a multiple classifier system, which is critical to its success. A multiple classifier system consists of a set of classifiers and a decision combination function. It is a preferred solution to a complex recognition problem because it allows the simultaneous use of feature descriptors of many types, corresponding measures of similarity, and many classification procedures. It also allows dynamic selection, so that classifiers adapted to inputs of a particular type may be applied only when those inputs are encountered. Decisions by the classifiers are represented as rankings of the class set that are derivable from the results of feature matching. Rank scores contain more ...
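A minimal sketch of rank-based decision combination, as described in the abstract. This uses the Borda count, one standard rank-combination rule, not necessarily the specific combination function developed in the thesis; the example classes and rankings are invented.

```python
def borda_combine(rankings):
    """Combine per-classifier rankings of a class set by summed Borda scores.

    Each ranking lists classes from most to least preferred; a class at
    position i in a ranking of length m contributes m - i points.
    """
    scores = {}
    for ranking in rankings:
        m = len(ranking)
        for i, cls in enumerate(ranking):
            scores[cls] = scores.get(cls, 0) + (m - i)
    # highest total score first; break ties alphabetically for determinism
    return sorted(scores, key=lambda c: (-scores[c], c))

# three classifiers rank the same four classes differently
rankings = [
    ["A", "B", "C", "D"],
    ["B", "A", "D", "C"],
    ["A", "C", "B", "D"],
]
print(borda_combine(rankings))  # "A" wins the combined ranking
```

Rank-level combination like this needs no calibrated scores from the individual classifiers, which is why it suits the thesis's setting where only input/output specifications of the classifiers are assumed.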
Reduced-rank Vector Generalized Linear Models
 Statistical Modelling
2000
Cited by 23 (3 self)
In this article we extend the reduced-rank idea to the VGLM/VGAM classes to obtain subclasses which we term RR-VGLMs and RR-VGAMs. The multinomial logit model (MLM; Nerlove and Press, 1973) for categorical data is used as the main example to bring out some of the characteristics of the RR subclasses, and to investigate its use in regression and classification problems. Recently, Srivastava (1997) considered the problem of reduced-rank regression for classification or discrimination, but only for the Gaussian model. Hastie and Tibshirani (1996) also apply the ideas of reduced-rank regression to discrimination problems, but in a larger framework involving mixture models. Gabriel (1998) and Aldrin (2000) are also recent works. One model where the reduced-rank regression idea has been applied to non-Gaussian errors is the MLM. This was proposed and referred to as the stereotype model by Anderson (1984). However, in that paper and in subsequent papers by others, the reduced-rank regression idea was not explicitly stated in the framework presented below. The aim of this paper is twofold. Firstly, we extend the reduced-rank concept to the VGLM and VGAM classes. Secondly, we describe and motivate the reduced-rank idea applied to regression models for categorical data analysis, especially the MLM. We do this by elaborating on its connections to other statistical models such as neural networks, projection pursuit regression, linear discriminant analysis, canonical correspondence analysis and biplots. An outline of this paper is as follows. In the remainder of this section we briefly review VGLMs and VGAMs; further details can be found in Yee and Wild (1996). In Section 2 we propose reduced-rank regression for the VGLM class. In Section 3 we focus on the RR-MLM, and show how it relates ...
A Theory Of Classifier Combination: The Neural Network Approach
1995
Cited by 21 (0 self)
There is a trend in recent OCR development to improve system performance by combining the recognition results of several complementary algorithms. This thesis examines the classifier combination problem under a strict separation of classifier and combinator design. Nothing other than the fact that every classifier has the same input and output specification is assumed about the training, design or implementation of the classifiers. A general theory of combination should possess the following properties. It must be able to combine any type of classifiers regardless of the level of information content in the outputs. In addition, a general combinator must be able to combine any mixture of classifier types and utilize all information available. Since classifier independence is difficult to achieve and to detect, it is essential for a combinator to handle correlated classifiers robustly. Although the performance of a robust (against correlation) combinator can be improved by adding classifiers indiscriminately, it is generally of interest to achieve comparable performance with the minimum number of classifiers. Therefore, the combinator should have the ability to eliminate redundant classifiers. Furthermore, it is desirable to have a complexity control mechanism for the combinator. In the past, simplifications came from assumptions and constraints imposed by the system designers. In the general theory, there should be a mechanism to reduce solution complexity by exercising non-classifier-specific constraints. Finally, a combinator should capture classifier/image dependencies. Nearly all combination methods have ignored the fact that classifier performances (and outputs) depend on various image characteristics, and this dependency is manifested in classifier output patterns in relation to the input imag...
Feature selection in omics prediction problems using cat scores and false nondiscovery rate control
 Ann. Appl. Stat
2009
Cited by 21 (11 self)
We revisit the problem of feature selection in linear discriminant analysis (LDA), i.e. when features are correlated. First, we introduce a pooled centroids formulation of the multiclass LDA predictor function, in which the relative weights of Mahalanobis-transformed predictors are given by correlation-adjusted t scores (cat scores). Second, for feature selection we propose thresholding cat scores by controlling false nondiscovery rates (FNDR). We show that, contrary to previous claims, this FNDR procedure performs very well, similarly to "higher criticism". Third, training of the classifier function is conducted by plug-in of James-Stein shrinkage estimates of correlations and variances, using analytic procedures for choosing the regularization parameters. Overall, this results in an effective and computationally inexpensive framework for high-dimensional prediction with natural feature selection. The proposed shrinkage discriminant procedures are implemented in the R package "sda" available from the R repository CRAN.
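A stripped-down sketch of the score-and-threshold idea in this abstract, under two strong simplifications: features are assumed uncorrelated, in which case cat scores reduce to ordinary pooled-variance t-scores, and FNDR control is replaced by a fixed hypothetical threshold. The toy data and function names are invented.

```python
import math

def t_scores(group0, group1):
    """Pooled-variance two-sample t-score for each feature.

    group0/group1: lists of samples, each sample a list of feature values.
    With an identity correlation matrix, cat scores reduce to these
    ordinary t-scores.
    """
    n0, n1 = len(group0), len(group1)
    scores = []
    for j in range(len(group0[0])):
        x0 = [s[j] for s in group0]
        x1 = [s[j] for s in group1]
        m0, m1 = sum(x0) / n0, sum(x1) / n1
        ss = sum((v - m0) ** 2 for v in x0) + sum((v - m1) ** 2 for v in x1)
        pooled_var = ss / (n0 + n1 - 2)
        scale = math.sqrt(pooled_var * (1.0 / n0 + 1.0 / n1)) or 1.0
        scores.append((m0 - m1) / scale)
    return scores

def select_features(scores, threshold):
    # keep feature indices whose |t| exceeds the threshold
    return [j for j, t in enumerate(scores) if abs(t) > threshold]

# feature 0 separates the groups strongly; feature 1 is pure noise
group0 = [[5.0, 0.1], [5.2, -0.2], [4.8, 0.0]]
group1 = [[1.0, 0.2], [1.1, -0.1], [0.9, 0.1]]
scores = t_scores(group0, group1)
print(select_features(scores, threshold=3.0))  # only the informative feature
```

The paper's contribution lies precisely in the two parts elided here: decorrelating the t-scores (the "correlation adjustment") with James-Stein shrinkage estimates, and choosing the cutoff by FNDR control instead of a fixed threshold.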