Gilles Celeux and Gerard Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28(5):781--793, 1995.
....to use marginal mixture EM segmentation as a first step in our analysis. The idea is to model the marginal distribution of (possibly multivariate) pixel intensities as a finite Gaussian mixture model, and use the EM (Expectation Maximization) algorithm to estimate the model parameters (e.g. [20], [4]). For multidimensional images, the relevant distributions are multivariate, and there has been much recent progress on estimation methods that combine agglomerative hierarchical clustering methods based on maximum classification likelihood ([1], [8]) with the EM algorithm ([5], [10]). When the ....
....feasible for very large images. We used the MCLUST software for model based clustering for our analysis, which is described in [10], [11] and [9]. It combines model based Gaussian agglomerative hierarchical clustering methods ([1], [8]) with the EM algorithm for Gaussian mixture models ([4]). The EM algorithm ([6], [17]) is an iterative method widely used in parameter estimation for incomplete data. For clustering applications via mixture models, the missing values are the cluster memberships of each pixel. To be effective, the EM algorithm generally requires a good initial estimate. ....
[Article contains additional citation context not shown here]
G. Celeux and G. Govaert. Gaussian Parsimonious Clustering Models. Pattern Recognition, 28:781--793, 1995.
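The contexts above describe EM for a finite Gaussian mixture, where the missing data are the cluster memberships. As a minimal sketch only, assuming univariate data and hypothetical function names (`normal_pdf`, `em_step`) that do not come from MCLUST or any cited paper, one EM iteration could look like this:

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D Gaussian mixture (illustrative sketch)."""
    # E-step: posterior probability that each point belongs to each component.
    resp = []
    for x in data:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: re-estimate proportions, means and variances from the memberships.
    n = len(data)
    new_w, new_m, new_v = [], [], []
    for k in range(len(weights)):
        nk = sum(r[k] for r in resp)
        mk = sum(r[k] * x for r, x in zip(resp, data)) / nk
        vk = sum(r[k] * (x - mk) ** 2 for r, x in zip(resp, data)) / nk
        new_w.append(nk / n)
        new_m.append(mk)
        new_v.append(max(vk, 1e-6))  # assumed floor to avoid a degenerate variance
    return resp, new_w, new_m, new_v
```

The starting values matter, as the context notes: the sketch takes them as arguments so that they can come from a hierarchical clustering or any other initial partition.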
....To compare MMDL versus MDL/BIC, we have performed experiments on real and synthetic data. All the experiments confirm that MMDL allows a better fit to the observed data. Finally, we mention the parameterization of the covariance matrices (based on eigen decomposition) introduced in [1] (see also [4]). That parameterization allows taking selected characteristics of the components to be common (for example, same shape, arbitrary orientation). MMDL can also be used to perform model selection among the options provided by that approach. The goal is to simultaneously choose the number of components ....
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28(5):781--793, 1995.
....model. Each row of the N x p array belongs to a cluster; there exist K signal clusters and one noise cluster. The accuracy of the model based clustering methods has been proven in several applications, but the slow convergence prevents their use, especially when the sample size N is large [3]. For gene expression data, Gene Shaving (GS) provides a faster alternative. The Principal Component GS [10] produces at each shaving stage a sequence of nested cluster candidates, $S_N \supset S_n \supset S_{n_2} \supset \cdots \supset S_1$, where $S_n$ denotes a cluster of n genes. At each shaving stage, the optimal size h is ....
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognit., 28(5):781-793, 1995.
....compare MMDL versus MDL/BIC, we have performed experiments on real and synthetic data. All the experiments confirm that MMDL allows a better fit to the observed data. Finally, we mention the parameterization of the covariance matrices (based on eigen decomposition) introduced in [1] (see also [4]). That parameterization allows taking selected characteristics of the components to be common (for example, same shape, arbitrary orientation). MMDL can also be used to perform model selection among the options provided by that approach. The goal is to simultaneously choose the number of ....
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28(5):781--793, 1995.
....range of elliptical models with other constraints and fewer parameters. For example, with the parameterization $\Sigma_k = \lambda D A D^T$, each component is elliptical, but all have equal volume, shape and orientation (denoted by EEE). All of these models are implemented in MCLUST (Fraley and Raftery 1998). Celeux and Govaert (1995) also considered the model in which $\Sigma_k = \lambda_k B_k$, where $B_k$ is a diagonal matrix with $|B_k| = 1$. Geometrically, the diagonal model corresponds to axis aligned elliptical components. In the experiments reported in this paper, we considered the equal volume spherical (EI), unequal volume ....
Celeux, G. and G. Govaert (1995). Gaussian parsimonious clustering models. The Journal of the Pattern Recognition Society 28, 781--793.
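The parameterization quoted above writes a component covariance as volume times orientation times shape. A small two-dimensional sketch, assuming a hypothetical helper `covariance_2d` and a shape matrix of the form diag(a, 1/a) so that its determinant is 1, may make the geometry concrete:

```python
import math

def covariance_2d(volume, shape_ratio, angle):
    """Build Sigma = volume * D A D^T for a 2-D component (illustrative sketch).

    volume      -- lambda > 0, scales the ellipse
    shape_ratio -- a > 0; A = diag(a, 1/a), so det(A) = 1
    angle       -- rotation angle in radians defining the orientation matrix D
    """
    c, s = math.cos(angle), math.sin(angle)
    a1, a2 = shape_ratio, 1.0 / shape_ratio
    # D A D^T written out element by element for D = [[c, -s], [s, c]].
    s11 = volume * (a1 * c * c + a2 * s * s)
    s12 = volume * (a1 - a2) * c * s
    s22 = volume * (a1 * s * s + a2 * c * c)
    return [[s11, s12], [s12, s22]]
```

Under these assumptions, an EEE-style model would reuse the same three arguments for every component, while setting `angle` to 0 gives an axis aligned (diagonal) covariance.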
....of elliptical models with other constraints and fewer parameters. For example, with the parameterization $\Sigma_k = \lambda D A D^T$, each component is elliptical, but all have equal volume, shape and orientation (denoted by EEE). All of these models are implemented in MCLUST (Fraley and Raftery 1998). Celeux and Govaert (1995) also considered the model in which $\Sigma_k = \lambda_k B_k$, where $B_k$ is a diagonal matrix with $|B_k| = 1$. Geometrically, the diagonal model corresponds to axis aligned elliptical components. In the experiments reported in this paper, we considered the equal volume spherical (EI), unequal volume ....
Celeux, G. and G. Govaert (1995). Gaussian parsimonious clustering models. The Journal of the Pattern Recognition Society 28, 781--793.
....a range of elliptical models with other constraints and fewer parameters. For example, with the parameterization $\Sigma_k = \lambda D A D^T$, each component is elliptical, but all have equal volume, shape and orientation (denoted EEE). All of these models are implemented in MCLUST (Fraley and Raftery, 1998). (Celeux and Govaert, 1995) also considered the model in which $\Sigma_k = \lambda_k B_k$, where $B_k$ is a diagonal matrix with $|B_k| = 1$. Geometrically, the diagonal model corresponds to axis aligned elliptical components. In the experiments reported in this paper, we considered the equal volume spherical (EI), unequal volume spherical ....
Celeux, G. and Govaert, G. (1995) Gaussian parsimonious clustering models. The Journal of the Pattern Recognition Society, 28, 781--793.
....in Fortran and interfaced to the S-PLUS commercial software package 1 and the freely available R language 2 which has a similar look and feel. It implements parameterized Gaussian hierarchical clustering algorithms [16, 1, 7] and the EM algorithm for parameterized Gaussian mixture models [5, 13, 3, 14] with the possible addition of a Poisson noise term. MCLUST also includes functions that combine hierarchical clustering, EM and the Bayesian Information Criterion (BIC) in a comprehensive clustering strategy [4, 8]. Methods of this type have shown promise in a number of practical applications, ....
.... strategy [4, 8]. Methods of this type have shown promise in a number of practical applications, including character recognition [16], tissue segmentation [1], minefield and seismic fault detection [4], identification of textile flaws from images [2], and classification of astronomical data [3, 15]. A web page with related links can be found at http: www.stat.washington.edu fraley mclust home.html. 1 Models In MCLUST, each cluster is represented by a Gaussian model $\phi_k(x \mid \mu_k, \Sigma_k) = (2\pi)^{-p/2} |\Sigma_k|^{-1/2} \exp\{-\tfrac{1}{2}(x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k)\}$ (1) ....
[Article contains additional citation context not shown here]
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28:781--793, 1995.
....comments greatly improved the style and presentation of this manuscript. 1 Introduction The problem of grouping data into a previously unknown number of homogeneous classes or clusters arises in a number of applications. Although there is a large body of methodological work in this regard [2, 3, 4, 6, 7, 8, 9, 11, 14, 16, 19, 20, 22, 23, 31, 32, 37, 39, 41, 42], most of the attention has been directed to the problem of finding groups and patterns in moderate dimensional datasets. There are two broad classes of clustering algorithms: these are the class of hierarchical clustering techniques and that of optimization partitioning algorithms [26]. The former ....
Celeux, G. and Govaert, G. (1995). Gaussian parsimonious clustering models. Patt. Recog. 28:781-93.
....factor analytic models (Hinton, Dayan, & Revow, 1997) and mixtures of probabilistic principal component analytic models (Tipping & Bishop, 1997). Another interesting class of mixture models that has been considered can be obtained by enforcing equality constraints across the mixture components. Celeux & Govaert (1995) 21 (extending the work of Banfield & Raftery, 1993) developed a class of models for continuous variables in which the covariance matrix of each component is reparameterized into a volume, a shape, and an orientation. In terms of mixtures of DAG models, these authors consider equality constraints ....
Celeux, G. and Govaert, G. (1995). Gaussian parsimonious clustering models. Pattern Recognition, Vol. 28, No. 5, pages 781--793.
.... important open problem in density estimation and probabilistic model based clustering is the issue of choosing the optimal number of components k when fitting finite mixture densities to data sets (e.g. Titterington, Makov, and Smith, 1985; McLachlan and Basford, 1988; Banfield and Raftery, 1993; Celeux and Govaert, 1995). Optimality is usually defined as choosing the model which is closest in a Kullback Leibler (KL) sense to the true data generating mechanism. There are several standard approaches (in a mixture density estimation context) for generating approximate estimators of this KL distance (within an ....
Celeux, G. and Govaert, G., `Gaussian parsimonious clustering models,' Pattern Recognition, 28, 781--793, 1995.
....of grid based algorithms is that they are fast if the grid is coarse. This is mainly because grid based algorithms use statistical summaries of cells for clustering operations, instead of individuals. But this lowers both quality and accuracy of clustering results. Model based algorithms [3, 5, 11, 12, 29] are widely adopted within the statistics community. Here, the strong assumption is that a mixture of underlying probability distributions generates the data and each component represents a different cluster. Typically, a hierarchical technique or a partitioning technique is used to obtain clusters ....
....partitioning technique is used to obtain clusters from these models. In the former [3, 11] a criterion function (like the Maximum Likelihood or Minimum Message Length) is used to compare models to merge at each stage in the construction of hierarchy. Also, the algorithmic process of partitioning [5, 34] is similar to that of partitioning algorithms. Undoubtedly, deriving optimal partitions from these models is usually an elusive task [29]. Further, model based algorithms inherit drawbacks of hierarchical approaches and/or partitioning approaches. AUTOCLUST belongs to the family of graph based ....
G. Celeux and G. Govaert. Gaussian Parsimonious Clustering Models. Pattern Recognition, 28(5):781-793, 1995.
.... character recognition (Murtagh and Raftery [53]), tissue segmentation (Banfield and Raftery [7]), minefield and seismic fault detection (Dasgupta and Raftery [27]), identification of textile flaws from images (Campbell et al. [21]), and classification of astronomical data (Celeux and Govaert [24], Mukerjee et al. [51]). Bayes factors, approximated by the Bayesian Information Criterion (BIC), have been applied successfully to the problem of determining the number of components in a model [27], [51] and for deciding which among two or more partitions most closely matches the data for a ....
.... $\theta_k)$ (2), where $\tau_k$ is the probability that an observation belongs to the kth component ($\tau_k \geq 0$; $\sum_{k=1}^{G} \tau_k = 1$). We are mainly concerned with the case where $f_k(x_i \mid \theta_k)$ is multivariate normal (Gaussian), a model that has been used with considerable success in a number of applications [53, 7, 24, 27, 21, 51]. In this instance, the parameters $\theta_k$ consist of a mean vector $\mu_k$ and a covariance matrix $\Sigma_k$, and the density has the form $f_k(x_i \mid \mu_k, \Sigma_k) = \exp\{-\tfrac{1}{2}(x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k)\} / \left((2\pi)^{p/2} |\Sigma_k|^{1/2}\right)$ (3). Clusters are ellipsoidal, centered at the means $\mu_k$. ....
[Article contains additional citation context not shown here]
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28:781--793, 1995.
....Two algorithms have been devised for finding good suboptimal solutions. One of these involves hierarchical agglomeration (HMCLUST) (Banfield and Raftery, 1993, and Fraley, 1998). The other, iterative relocation, uses the EM algorithm (Dempster, Laird and Rubin, 1977) or the classification EM algorithm (Celeux and Govaert, 1995). Both approaches can be combined. The relocation technique critically depends on a good starting solution such as the one provided by the first algorithm. Conversely, it can be used to refine the partition obtained by agglomeration. However, the EM algorithm exhibits some limitations in the context ....
Celeux, G. and Govaert, G. (1995). Gaussian Parsimonious Clustering Models. Pattern Recognition 28, 781-793.
.... density f(x) the log likelihood of $\Phi^{(k)}$ is defined as $l(\Phi^{(k)} \mid D^{\mathrm{train}}) = \log p(D^{\mathrm{train}} \mid \Phi^{(k)}) = \sum_{i=1}^{N} \log \sum_{j=1}^{k} \alpha_j g_j(x_i \mid \theta_j)$ (2). Note that there are alternative objective functions which can be maximized in the clustering context (e.g. see Celeux and Govaert (1995) for clustering using the classification likelihood function). Direct maximization of the mixture log likelihood expression in Equation (2) is difficult except in trivial special cases. Thus, much of the popularity of mixture models in recent years is due to the existence of efficient iterative ....
Celeux, G. and Govaert, G., `Gaussian parsimonious clustering models,' Pattern Recognition, 28, 781--793, 1995.
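The mixture log likelihood in the context above is a sum over observations of the log of a weighted sum of component densities. The following is a minimal sketch for univariate Gaussian components; the function names are hypothetical, and the log-sum-exp step is an implementation choice made here for numerical stability, not something taken from the cited text:

```python
import math

def normal_logpdf(x, mean, var):
    """Log density of a univariate normal distribution at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def mixture_loglik(data, weights, means, variances):
    """Sum over points of log(sum_j weight_j * g_j(x)); weights must be > 0."""
    total = 0.0
    for x in data:
        terms = [math.log(w) + normal_logpdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        top = max(terms)
        # log-sum-exp: factor out the largest term before exponentiating
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total
```

With a single component of weight 1 the function reduces to the ordinary Gaussian log likelihood, which gives a quick sanity check.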
....massive, data, sequential sampling, Gaussian distribution, likelihood ratio test 1 Introduction The problem of grouping data into a previously unknown number of homogeneous classes or clusters arises in a number of applications. Although there is a large body of methodological work in this regard [1, 2, 3, 5, 6, 7, 8, 12, 14, 17, 18, 20, 21, 26, 27, 29, 30, 32, 33], most of the attention has been directed to the problem of finding groups and patterns in moderate dimensional datasets. There are two broad classes of clustering algorithms: these are the hierarchical clustering techniques and the optimization partitioning algorithms [23]. The former partition ....
Celeux, G. and Govaert, G. (1995). Gaussian parsimonious clustering models. Patt. Recog. 28:781-93.
....hardly occur in practical knowledge discovery situations. The weaknesses of k-Means result in poor quality clustering, and thus, more statistically sophisticated alternatives have been proposed. Representatives of these alternatives are Expectation Maximization (and model based clustering [5, 12, 22]), Data Augmentation [49], and Gibbs sampling Markov chain Monte Carlo algorithms [3, 24, 37, 48]. While these alternatives offer more statistical accuracy, robustness and less bias, they trade this for substantially more computational requirements and more detailed prior knowledge. This paper ....
....of the part $f_j$ of the mixture. Thus, different Expectation Maximization methods update their parameters at each iteration slightly differently. In order to have analytical solutions, typically, it is assumed that each $f_j$ is a multivariate Gaussian density $N(\mu_j, \Sigma_j)$ (normal density) [5, 12, 22]. A further simplification assumes that all components have the same known covariance $\Sigma$, and thus, the only unknown parameter of each component $f_j$ in the mixture is the mean $\mu_j$. Delicate aspects of the iteration occur at different levels. For example, even in the simple case where the ....
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28(5):781--793, 1995.
....These optimization algorithms have several notable weaknesses. The first is that they heavily favor spherical clusters. Secondly, they do not deal adequately with noise, i.e. elements of S which do not cluster naturally with any other elements. Banfield and Raftery [2] and Celeux and Govaert [4] both develop frameworks in the context of statistical mixture models for clustering which subsume the optimization models above and deal with these issues. Mixture models in general and Banfield and Raftery's work in particular will be discussed in Section 2.1. An alternative k clustering ....
Gilles Celeux and Gérard Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28(5):781--793, 1995.
....Breast Prognosis Cancer database to generate well separated patient survival curves. In contrast, the k-mean algorithm did not generate such well separated survival curves. Keywords: Clustering, k-mean, linear regression 1. Introduction There are many approaches to clustering such as statistical [2, 9, 6], machine learning [7, 8] and mathematical programming [15, 16, 4]. In this work we take a mathematical programming approach with a novel idea. Instead of generating cluster centers as points that minimize the sum of squares of distances of each given point to a nearest cluster center, we change ....
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28:781--793, 1995.
....front, clearly there is room for improvement over the basic algorithm described in this paper. The probabilistic cluster models can easily be extended beyond the full covariance model to incorporate, for example, the geometric shape and Poisson outlier models of Banfield and Raftery (1993) and Celeux and Govaert (1995), and the discrete variable models in AutoClass. Diagnostic tests for detecting non-Gaussianity could also easily be included (cf. McLachlan and Basford (1988), Section 2.5) and would be a useful practical safeguard. Some obvious improvements could also be made to the search strategy. Instead of ....
Celeux, G., and Govaert, G. 1995. `Gaussian parsimonious clustering models,' Pattern Recognition, 28(5), 781--793.
....was tested on the publicly available Wisconsin Breast Prognosis Cancer database to generate well separated patient survival curves. In contrast, the k-mean algorithm did not generate such well separated survival curves. 1 Introduction There are many approaches to clustering such as statistical [2, 9, 6], machine learning [7, 8] and mathematical programming [15, 16, 4]. In this work we take a mathematical programming approach with a novel idea. Instead of generating cluster centers as points that minimize the sum of squares of distances of each given point to a nearest cluster center, we change ....
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28:781--793, 1995.
....cluster analysis written in Fortran and interfaced to the S-PLUS commercial software package 1. It implements parameterized Gaussian hierarchical clustering algorithms [15, 1, 6] and the EM algorithm for parameterized Gaussian mixture models with the possible addition of a Poisson noise term [5, 12, 3, 13]. MCLUST also includes functions that combine hierarchical clustering, EM and the Bayesian Information Criterion (BIC) in a comprehensive clustering strategy [4, 7]. Methods of this type have shown promise in a number of practical applications, including character recognition [15], tissue ....
.... strategy [4, 7]. Methods of this type have shown promise in a number of practical applications, including character recognition [15], tissue segmentation [1], minefield and seismic fault detection [4], identification of textile flaws from images [2], and classification of astronomical data [3, 14]. A web page with related links can be found at http: www.stat.washington.edu fraley mclusthome.html. 1 Models In MCLUST, each cluster is represented by a Gaussian model $\phi_k(x \mid \mu_k, \Sigma_k) = (2\pi)^{-p/2} |\Sigma_k|^{-1/2} \exp\{-\tfrac{1}{2}(x_i - \mu_k)^T \Sigma$ ....
[Article contains additional citation context not shown here]
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28:781--793, 1995.
.... character recognition (Murtagh and Raftery [1]), tissue segmentation (Banfield and Raftery [2]), minefield and seismic fault detection (Dasgupta and Raftery [3]), identification of textile flaws from images (Campbell et al. [4]), and classification of astronomical data (Celeux and Govaert [5], Mukerjee et al. [6]). An advantage of the model based approach is that there is an associated Bayesian criterion for assessing the model. The Bayesian Information Criterion (BIC) has been applied successfully to the problem of determining the number of components in a model [3], [6] and for ....
.... (2) where $\tau_k$ is the probability that an observation belongs to the kth component ($\tau_k \geq 0$; $\sum_{k=1}^{G} \tau_k = 1$). We are mainly concerned with the case where $f_k(x_i \mid \theta_k)$ is multivariate normal (Gaussian), a model that has been used with considerable success in a number of applications [1], [2], [5], [3], [4], [6]. In this instance, the parameters $\theta_k$ consist of a mean vector $\mu_k$ and a covariance matrix $\Sigma_k$, and the density has the form $f_k(x_i \mid \mu_k, \Sigma_k) = (2\pi)^{-p/2} |\Sigma_k|^{-1/2} \exp\{-\tfrac{1}{2}(x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k)\}$ ....
[Article contains additional citation context not shown here]
Celeux, G. and Govaert, G. (1995). Gaussian parsimonious clustering models. Pattern Recognition, 28, 781--793.
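The contexts above use BIC to choose the number of mixture components. A hedged sketch of that comparison follows, assuming the convention BIC = 2 log L - m log n (larger is better), which is the sign convention commonly used in the model based clustering literature; other sources negate it. The helper names and the candidate values in the usage note are invented for illustration:

```python
import math

def bic(loglik, n_params, n_obs):
    """BIC under the assumed 'larger is better' convention: 2 log L - m log n."""
    return 2.0 * loglik - n_params * math.log(n_obs)

def select_model(candidates, n_obs):
    """Pick the best candidate name.

    candidates -- dict mapping a model name to (maximized loglik, number of
                  free parameters); both values would come from a fitted model.
    """
    return max(candidates, key=lambda name: bic(*candidates[name], n_obs))
```

For example, `select_model({"G=2": (-120.0, 5), "G=3": (-118.0, 8)}, 100)` prefers the two-component model, because the small gain in log likelihood does not pay for three extra parameters.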
....WA 98195 4322 USA MCLUST is a software package for cluster analysis written in Fortran and interfaced to the S-PLUS commercial software package 1. It implements parameterized Gaussian hierarchical clustering algorithms [16, 1, 7] and the EM algorithm for parameterized Gaussian mixture models [5, 13, 3, 14] with the possible addition of a Poisson noise term. MCLUST also includes functions that combine hierarchical clustering, EM and the Bayesian Information Criterion (BIC) in a comprehensive clustering strategy [4, 8]. Methods of this type have shown promise in a number of practical applications, ....
.... strategy [4, 8]. Methods of this type have shown promise in a number of practical applications, including character recognition [16], tissue segmentation [1], minefield and seismic fault detection [4], identification of textile flaws from images [2], and classification of astronomical data [3, 15]. A web page with related links can be found at http: www.stat.washington.edu fraley mclusthome.html. 1 Models In MCLUST, each cluster is represented by a Gaussian model $\phi_k(x \mid \mu_k, \Sigma_k) = (2\pi)^{-p/2} |\Sigma_k|^{-1/2} \exp\{-\tfrac{1}{2}(x_i - \mu_k)^T \Sigma$ ....
[Article contains additional citation context not shown here]
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28:781--793, 1995.
.... analysis and statistical pattern recognition (see for instance McLachlan 1992 and Ripley 1996). Recently several authors have exploited the eigenvalue decomposition of the group variance matrices in Gaussian mixtures to propose numerous and powerful models for clustering (Banfield and Raftery 1993, Celeux and Govaert 1995, Bensmail, Celeux, Raftery and Robert 1997) and discriminant analysis (Flury, Schmid and Narayanan 1993, Bensmail and Celeux 1996). This parametrization of the mixture components provides a general and flexible framework to give rise to efficient, although somewhat unusual, clustering criteria and ....
....the volume of the kth group, $D_k$ its orientation and $A_k$ its shape. By allowing some but not all of these quantities to vary between groups, we obtain parsimonious and easily interpreted models which are appropriate to describe various clustering or classification situations. For instance Celeux and Govaert (1995) and Bensmail and Celeux (1996) considered 14 different models related to different assumptions on the group variance matrices. Eight of these models are obtained by assuming equal or different volumes, shapes or orientations ($[\lambda D A D']$, $[\lambda_k D A D']$, $[\lambda D A_k D']$, $[\lambda_k D A_k D']$, $[\lambda D_k A D']$ ....
Celeux, G. and Govaert, G. (1995). Gaussian Parsimonious Clustering Models. Pattern Recognition, 28, 781-793.
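The contexts above split each group variance matrix into a volume, an orientation and a shape through its eigenvalue decomposition. As an illustration only, restricted to a symmetric positive definite 2 x 2 matrix so that the eigenvalues have a closed form, a hypothetical helper `decompose_2d` could recover the three parts, taking the volume as the determinant raised to the power 1/p with p = 2:

```python
import math

def decompose_2d(sigma):
    """Split a symmetric positive definite 2x2 matrix into volume, shape, angle.

    Returns (volume, (a1, a2), angle) with a1 * a2 = 1, a1 >= a2, and angle
    the direction (radians) of the leading eigenvector.
    """
    a, b, c = sigma[0][0], sigma[0][1], sigma[1][1]
    half_trace = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    e1, e2 = half_trace + radius, half_trace - radius  # eigenvalues, e1 >= e2
    volume = math.sqrt(e1 * e2)                        # |Sigma|^(1/2)
    shape = (e1 / volume, e2 / volume)                 # normalized, product 1
    angle = 0.5 * math.atan2(2.0 * b, a - c)
    return volume, shape, angle
```

Constraining which of the three returned quantities are shared across groups is what distinguishes the models listed in the context.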
....the EM algorithm (Dempster et al. 1977) and takes into account the spatial constraints without requiring a partition made of one region classes. 2 Introducing Spatial Constraints in the EM Algorithm 2.1 Fuzzy Clustering and the EM Algorithm In cluster analysis based on Gaussian mixture models (Celeux and Govaert 1995), data are $\mathbb{R}^d$-valued vectors $x_1, \ldots, x_n$ assumed to be a sample from a mixture of densities: $f(x_i \mid \Phi) = \sum_{k=1}^{K} p_k f_k(x_i \mid \theta_k)$ (1) where the $p_k$ are the mixing proportions ($0 < p_k < 1$ for all $k = 1, \ldots, K$ and $\sum_k p_k = 1$) and $f_k$ ....
Celeux, G. and G. Govaert (1995). "Gaussian parsimonious clustering models". Pattern Recognition 28, 781--793.
....procedure to derive m.l. estimates. And, in some circumstances, especially for models assuming different shape group variance matrices, designing these algorithms needs some effort. In this section, we do not provide details on the m.l. calculations, since those details appear in a paper of Celeux and Govaert (1994) where the same models were considered in a cluster analysis context. In the following, we only give the formulas of m.l. estimators of the variance matrices for the 14 models. First, we need to define some matrices: The within group scattering matrix W: $W = \sum_{k=1}^{K} \sum_{i : z_i = k} (x_i - x$ ....
Celeux, G. and Govaert, G. (1994). Gaussian parsimonious clustering models.
....limitations of that approach. It appears to work well in several examples. Alternative frequentist approaches, which might be easier to implement, consist of maximizing the likelihood using the EM algorithm or of maximizing the classification likelihood using the Classification EM (CEM) algorithm. Celeux and Govaert (1995) considered those approaches to the full range of clustering models derived from the eigenvalue decomposition of the group variance matrices, including those considered here. They have shown in particular how it is possible to find the maximum likelihood estimate of the shape matrix A. Both ....
Celeux, G. and Govaert, G. (1995), "Gaussian parsimonious clustering models," Pattern Recognition, to appear.
No context found.
Gilles Celeux and Gerard Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28(5):781--793, 1995.
No context found.
G. Celeux & G. Govaert (1995). Gaussian parsimonious clustering models. Pattern Recognition 28(5):781-793.
No context found.
Celeux, G. and Govaert, G., `Gaussian parsimonious clustering models,' Pattern Recognition, 28, 781--793, 1995.
No context found.
Gilles Celeux and Gerard Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28(5):781--793, 1995.
No context found.
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28:781--793, 1995.
No context found.
G. Celeux and G. Govaert. Gaussian parsimonious clustering models. Pattern Recognition, 28:781-793, 1995.
No context found.
Celeux, G. and Govaert, G., "Gaussian parsimonious clustering models", Pattern Recognition, 28, 781-793, 1995.
No context found.
G. Celeux and G. Govaert, Gaussian parsimonious clustering models, Pattern Recognition, 28, 781--793 (1995).
CiteSeer.IST - Copyright Penn State and NEC