Model-Based Clustering, Discriminant Analysis, and Density Estimation
Journal of the American Statistical Association, 2000
Cluster analysis is the automated search for groups of related observations in a data set. Most clustering done in practice is based largely on heuristic but intuitively reasonable procedures, and most clustering methods available in commercial software are also of this type. However, there is little systematic guidance associated with these methods for solving important practical questions that arise in cluster analysis, such as "How many clusters are there?", "Which clustering method should be used?" and "How should outliers be handled?". We outline a general methodology for model-based clustering that provides a principled statistical approach to these issues. We also show that this can be useful for other problems in multivariate analysis, such as discriminant analysis and multivariate density estimation. We give examples from medical diagnosis, minefield detection, cluster recovery from noisy data, and spatial density estimation. Finally, we mention limitations of the methodology, a...
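The question "How many clusters are there?" can be illustrated with a small sketch. It is not the authors' software: it assumes a one-dimensional Gaussian mixture fitted by a basic EM loop, scores each candidate number of components with the Bayesian information criterion in its "lower is better" form (-2 log-likelihood plus a parameter penalty), and uses synthetic data. The function names `fit_gmm_1d` and `bic`, the initialisation scheme, and the variance floor are all illustrative assumptions.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution with the given variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_loglik(data, weights, means, variances):
    """Log-likelihood of the data under a 1-D Gaussian mixture."""
    total = 0.0
    for x in data:
        dens = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(dens, 1e-300))
    return total


def fit_gmm_1d(data, k, n_iter=100, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM and return its log-likelihood."""
    xs = sorted(data)
    n = len(xs)
    # Assumed initialisation: split the sorted data into k contiguous chunks.
    chunks = [xs[j * n // k:(j + 1) * n // k] for j in range(k)]
    means = [sum(c) / len(c) for c in chunks]
    variances = [max(sum((x - m) ** 2 for x in c) / len(c), var_floor)
                 for c, m in zip(chunks, means)]
    weights = [len(c) / n for c in chunks]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            norm = max(sum(dens), 1e-300)
            resp.append([d / norm for d in dens])
        # M-step: weighted updates of mixing proportions, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-300)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, var_floor)  # floor guards against collapse
    return mixture_loglik(xs, weights, means, variances)


def bic(loglik, k, n):
    """BIC in the 'lower is better' convention; 3k - 1 free parameters in 1-D."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


# Synthetic data (an assumption for illustration): two well-separated groups.
rng = random.Random(0)
data = [rng.gauss(-4.0, 1.0) for _ in range(200)] + [rng.gauss(4.0, 1.0) for _ in range(200)]

bic_by_k = {k: bic(fit_gmm_1d(data, k), k, len(data)) for k in range(1, 5)}
best_k = min(bic_by_k, key=bic_by_k.get)
print(bic_by_k, best_k)
```

Some authors define BIC with the opposite sign, in which case the largest value is preferred; the comparison between candidate models is the same either way.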
Unsupervised learning of finite mixture models
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002
This paper proposes an unsupervised algorithm for learning a finite mixture model from multivariate data. The adjective "unsupervised" is justified by two properties of the algorithm: 1) it is capable of selecting the number of components and 2) unlike the standard expectation-maximization (EM) algorithm, it does not require careful initialization. The proposed method also avoids another drawback of EM for mixture fitting: the possibility of convergence toward a singular estimate at the boundary of the parameter space. The novelty of our approach is that we do not use a model selection criterion to choose one among a set of pre-estimated candidate models; instead, we seamlessly integrate estimation and model selection in a single algorithm. Our technique can be applied to any type of parametric mixture model for which it is possible to write an EM algorithm; in this paper, we illustrate it with experiments involving Gaussian mixtures. These experiments testify to the good performance of our approach.
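The "singular estimate at the boundary of the parameter space" can be made concrete with a short sketch. It does not implement the paper's algorithm; it only evaluates the log-likelihood of a two-component Gaussian mixture in which one component is centred exactly on a data point while its standard deviation shrinks. The data values, the equal mixing weights, and the function name `loglik_with_collapsing_component` are assumptions chosen for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution with standard deviation sd."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def loglik_with_collapsing_component(data, sd_small):
    """Log-likelihood of a 50/50 mixture: one broad component, one narrow spike.

    The broad component is N(sample mean, 1); the spike is centred on the
    first data point with standard deviation sd_small.
    """
    broad_mean = sum(data) / len(data)
    spike_mean = data[0]
    total = 0.0
    for x in data:
        dens = 0.5 * normal_pdf(x, broad_mean, 1.0) + 0.5 * normal_pdf(x, spike_mean, sd_small)
        total += math.log(dens)
    return total


# Hypothetical toy data set.
data = [-1.0, 0.0, 0.5, 1.2, 2.0]
scales = [1.0, 0.1, 0.01, 0.001]
logliks = [loglik_with_collapsing_component(data, s) for s in scales]

for s, ll in zip(scales, logliks):
    print(f"spike sd = {s:<6} log-likelihood = {ll:.3f}")
```

The log-likelihood keeps growing as the spike narrows, because the density at the point under the spike grows without bound while every other point retains the contribution of the broad component. An unconstrained EM run that drifts into such a configuration therefore never reaches a finite maximum.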
Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study
Structural Equation Modeling 14, 2007
Mixture modeling is a widely applied data analysis technique used to identify unobserved heterogeneity in a population. Despite mixture models’ usefulness in practice, one unresolved issue in the application of mixture models is that there is not one commonly accepted statistical indicator for deciding on the number of classes in a study population. This article presents the results of a simulation study that examines the performance of likelihood-based tests and the traditionally used information criteria (ICs) for determining the number of classes in mixture modeling. We look at the performance of these tests and indexes for 3 types of mixture models: latent class analysis (LCA), a factor mixture model (FMA), and growth mixture models (GMM). We evaluate the ability of the tests and indexes to correctly identify the number of classes at three different sample sizes (n = 200, 500, 1,000). Whereas the Bayesian Information Criterion performed the best of the ICs, the bootstrap likelihood ratio test proved to be a very consistent indicator of classes across all of the models considered.
Distributional assumptions of growth mixture models: Implications for overextraction of latent trajectory classes
Psychological Methods, 2003
Growth mixture models are often used to determine if subgroups exist within the population that follow qualitatively distinct developmental trajectories. However, statistical theory developed for finite normal mixture models suggests that latent trajectory classes can be estimated even in the absence of population heterogeneity if the distribution of the repeated measures is nonnormal. By drawing on this theory, this article demonstrates that multiple trajectory classes can be estimated and appear optimal for nonnormal data even when only 1 group exists in the population. Further, the within-class parameter estimates obtained from these models are largely uninterpretable. Significant predictive relationships may be obscured or spurious relationships identified. The implications of these results for applied research are highlighted, and future directions for quantitative developments are suggested. Over the last decade, random coefficient growth modeling has become a centerpiece of longitudinal data analysis. These models have been adopted enthusiastically by applied psychological researchers in part because they provide a more dynamic analysis of repeated measures data than do many traditional techniques. However, these methods are not ideally suited for testing theories that posit the existence of qualitatively different developmental pathways, that is, theories in which distinct developmental pathways are thought to hold within subpopulations. One widely cited theory of this type is Moffitt’s (1993) distinction between “life-course persistent” and “adolescent-limited” antisocial behavior trajectories. Moffitt’s theory is prototypical of other developmental taxonomies that have been proposed in such diverse areas as developmental psychopathology (Schulenberg,
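The claim that extra classes can "appear optimal" for nonnormal data from a single group can be sketched in a much simpler setting than a growth mixture model. The example below is an illustration under stated assumptions, not the article's analysis: it draws skewed (lognormal) data from one population, fits one- and two-component univariate Gaussian mixtures with a basic EM loop, and compares them by BIC in the "lower is better" form. The sample size, seed, and the helper names `fit_gmm_1d` and `bic` are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution with the given variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=200, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM and return its log-likelihood."""
    xs = sorted(data)
    n = len(xs)
    # Assumed initialisation: k contiguous chunks of the sorted data.
    chunks = [xs[j * n // k:(j + 1) * n // k] for j in range(k)]
    means = [sum(c) / len(c) for c in chunks]
    variances = [max(sum((x - m) ** 2 for x in c) / len(c), var_floor)
                 for c, m in zip(chunks, means)]
    weights = [len(c) / n for c in chunks]
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each observation.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            norm = max(sum(dens), 1e-300)
            resp.append([d / norm for d in dens])
        # M-step: re-estimate weights, means and (floored) variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-300)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, var_floor)
    loglik = 0.0
    for x in xs:
        dens = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        loglik += math.log(max(dens, 1e-300))
    return loglik


def bic(loglik, k, n):
    """BIC (lower is better) with 3k - 1 free parameters for a 1-D mixture."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


# One homogeneous but skewed population: there are no true subgroups here.
rng = random.Random(1)
data = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]

bic_one = bic(fit_gmm_1d(data, 1), 1, len(data))
bic_two = bic(fit_gmm_1d(data, 2), 2, len(data))
print(f"BIC, 1 class: {bic_one:.1f}   BIC, 2 classes: {bic_two:.1f}")
```

The second component here is absorbing the long right tail of a single skewed distribution, so a lower BIC for two classes reflects a better density approximation rather than evidence of two subpopulations.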
The integration of continuous and discrete latent variable models: Potential problems and promising opportunities
Psychological Methods, 2004
Structural equation mixture modeling (SEMM) integrates continuous and discrete latent variable models. Drawing on prior research on the relationships between continuous and discrete latent variable models, the authors identify 3 conditions that may lead to the estimation of spurious latent classes in SEMM: misspecification of the structural model, nonnormal continuous measures, and nonlinear relationships among observed and/or latent variables. When the objective of a SEMM analysis is the identification of latent classes, these conditions should be considered as alternative hypotheses and results should be interpreted cautiously. However, armed with greater knowledge about the estimation of SEMMs in practice, researchers can exploit the flexibility of the model to gain a fuller understanding of the phenomenon under study. In recent years, many exciting developments have taken place in structural equation modeling, but perhaps none more so than the development of structural equation models that account for unobserved popula
Risk and Rationality: Uncovering Heterogeneity in Probability Distortion
Econometrica, 2010
It has long been recognized that there is considerable heterogeneity in individual risk taking behavior but little is known about the distribution of risk taking types. We present a parsimonious characterization of risk taking behavior by estimating a finite mixture model for three different experimental data sets, two Swiss and one Chinese, over a large number of real gains and losses. We find two major types of individuals: in all three data sets, the choices of roughly 80% of the subjects exhibit significant deviations from linear probability weighting of varying strength, consistent with prospect theory. 20% of the subjects weight probabilities near linearly and behave essentially as expected value maximizers. Moreover, individuals are cleanly assigned to one type with probabilities close to unity. The reliability and robustness of our classification suggest using a mix of preference theories in applied economic modeling.
Analyzing criminal trajectory profiles: Bridging multilevel and group-based approaches using growth mixture modeling
Journal of Quantitative Criminology, 2008
Over the last 25 years, a life-course perspective on criminal behavior has assumed increasing prominence in the literature. This theoretical development has been accompanied by changes in the statistical models used to analyze criminological data. There are two main statistical modeling techniques currently used to model longitudinal data. These are growth curve models and latent class growth models, also known as group-based trajectory models. Using the well-known Cambridge data and the Philadelphia cohort study, this article compares the two “classical” models: the conventional growth curve model and group-based trajectory models. In addition, two growth mixture models are introduced that bridge the gap between conventional growth models and group-based trajectory models. For the Cambridge data, the different mixture models yield quite consistent inferences regarding the nature of the underlying trajectories of convictions. For the Philadelphia cohort study, the statistical indicators give stronger guidance on relative model fit. The main goals of this article are to contribute to the discussion about different modeling techniques for analyzing data from a life-course perspective and to provide a concrete step-by-step illustration of such an analysis and model checking.
Breakdown points for maximum likelihood estimators of location-scale mixtures
The Annals of Statistics, 2004
ML-estimation based on mixtures of Normal distributions is a widely used tool for cluster analysis. However, a single outlier can make the parameter estimation of at least one of the mixture components break down. Among others, the estimation of mixtures of t-distributions by McLachlan and Peel [Finite Mixture Models (2000) Wiley, New York] and the addition of a further mixture component accounting for “noise” by Fraley and Raftery [The Computer J. 41 (1998) 578–588] were suggested as more robust alternatives. In this paper, the definition of an adequate robustness measure for cluster analysis is discussed and bounds for the breakdown points of the mentioned methods are given. It turns out that the two alternatives, while adding stability in the presence of outliers of moderate size, do not possess a substantially better breakdown behavior than estimation based on Normal mixtures. If the number of clusters s is treated as fixed, r additional points suffice for all three methods to let the
Job Security and Job Protection
2005
We construct indicators of the perception of job security for various types of jobs in 12 European countries using individual data from the European Community Household Panel (ECHP). We then consider the relation between reported job security and OECD summary measures of Employment Protection Legislation (EPL) strictness on one hand, and Unemployment Insurance Benefit (UIB) generosity on the other. We find that, after controlling for selection into job types, workers feel most secure in permanent public sector jobs, least secure in temporary jobs, with permanent private sector jobs occupying an intermediate position. We also find that perceived job security in both permanent private and temporary jobs is positively correlated with UIB generosity, while the relationship with EPL strictness is negative: workers feel less secure in countries where jobs are more protected. These correlations are absent for permanent public jobs, suggesting that such jobs are perceived to be by and large insulated from labor market fluctuations.