13 citations found.
H. Bensmail, G. Celeux, A.E. Raftery, and C.P. Robert. Inference in model-based cluster analysis. Statistics and Computing, 7:1--10, 1997.


This paper is cited in the following contexts:
Non-Parametric Expectation Maximization: A Learning.. - Abd-Almageed.. (2003)   (Correct)

....approaches and stochastic approaches. Deterministic methods, such as [6], [7] and [8], are based on selecting the number of components according to some model selection criterion, which usually contains an increasing function that penalizes a higher number of components. In [9], [10] and [11], stochastic approaches based on Markov chain Monte Carlo methods are used. These approaches have the disadvantage of being too computationally intensive. This paper introduces a new stochastic approach for overcoming the drawbacks of the Expectation Maximization algorithm. The proposed approach ....

H. Bensmail, G. Celeux, A. Raftery, and C. Robert, "Inference in Model-Based Cluster Analysis," Statistics and Computing, vol. 7, pp. 1--10, 1997.


On Fitting Mixture Models - Figueiredo, Leitão, Jain (1999)   (5 citations)  (Correct)

....in the previous paragraph and will not be further considered here. Stochastic approaches involve Markov chain Monte Carlo (MCMC) sampling and are far more computationally intensive than EM. MCMC is used in two different ways: to implement model selection criteria to actually estimate k (e.g. [2], [18], [26]) and, in a more fully Bayesian way, to sample from the full a posteriori distribution with k considered unknown [19], [21]. Despite their formal appeal, we think that MCMC based techniques are still far too computationally demanding to be useful in pattern recognition applications. ....

H. Bensmail, G. Celeux, A. Raftery, and C. Robert. Inference in model-based cluster analysis. Statistics and Computing, 7:1--10, 1997.


Unsupervised Learning of Finite Mixture Models - Figueiredo, Jain (2000)   (21 citations)  (Correct)

....techniques (see below) than to deterministic ones. Stochastic methods. These methods resort to Markov chain Monte Carlo (MCMC) sampling and are far more computationally intensive than EM. MCMC can be used in two different ways: to implement model selection criteria to actually estimate k (e.g. [29], [30], [31]) or, in a more fully Bayesian way, to sample from the full a posteriori distribution with k considered unknown [32], [33], [34]. Despite their formal appeal, we think that MCMC based techniques are still far too computationally demanding to be useful in pattern recognition ....

H. Bensmail, G. Celeux, A. Raftery, and C. Robert, "Inference in model-based cluster analysis," Statistics and Computing, vol. 7, pp. 1--10, 1997.


Bayesian Latent Semantic Analysis - de Freitas, Barnard (2000)   (Correct)

....Dirichlet prior over the word probabilities a;c as follows p(j ) nc Y c=1 ( 0;c ) 1;c ) na ;c ) na Y a=1 a;c 1 a;c (10) 2.2.3 Priors for Gaussian attributes. When considering multivariate normal distributions, one can adopt the following normal-inverse Wishart prior (Bensmail, Celeux, Raftery and Robert 1997, Diebolt and Robert 1994, McLachlan and Peel 2000) a;c N nga ( a;c ; a;c = a;c ) 1 a;c W nga (r a;c ; a;c ) (11) where W nga (r a;c ; a;c ) denotes a Wishart distribution, with density p( 1 a;c jr a;c ; a;c ) j a;c j 1 2 (ra;c nga 1) exp 1 2 tr 1 a;c 1 ....

....EM iterations was fixed to 10. We chose the same discrete prior parameters as before and selected the Gaussian hyperparameters as follows: a;c = 1, r a;c = 5, a;c = T a;c;2 T 1 a;c;1 and a;c = T a;c;3 T 1 a;c;1 T a;c;2 T 0 a;c;2 T 1 a;c;1 1 . We copied these values from (Bensmail et al. 1997) as they seem to ensure that the estimates are relatively insensitive to reasonable changes in the prior. The EB method in this experiment was only applied to estimate the hyperparameters c . Figure 9 shows the contour plot of the bivariate Gaussian clusters and the means of the six clusters ....

Bensmail, H., Celeux, G., Raftery, A. E. and Robert, C. P. (1997). Inference in model-based cluster analysis, Statistics and Computing 7: 1-10.


Choosing Models in Model-based Clustering and Discriminant.. - Biernacki, Govaert (1998)   (1 citation)  (Correct)

.... pattern recognition (see for instance McLachlan 1992 and Ripley 1996). Recently several authors have exploited the eigenvalue decomposition of the group variance matrices in Gaussian mixtures to propose numerous and powerful models for clustering (Banfield and Raftery 1993, Celeux and Govaert 1995, Bensmail, Celeux, Raftery and Robert 1997) and discriminant analysis (Flury, Schmid and Narayanan 1993, Bensmail and Celeux 1996). This parametrization of the mixture components provides a general and flexible framework to give rise to efficient, although somewhat unusual, clustering criteria and classification rules. It consists in writing ....

Bensmail H., Celeux G., Raftery A. E. and Robert C. P. (1997). Inference in Model-based Cluster Analysis. Statistics and Computing, 7, 1-10.


Clustering Via Normal Mixture Models - McLachlan, Peel, Prado   (1 citation)  (Correct)

.... Less restrictive constraints can be imposed by a reparameterization of the component covariance matrices in terms of their eigenvalue decompositions as, for example, in Banfield and Raftery (1993), Flury, Schmid, and Narayanan (1993), Celeux and Govaert (1995), Bensmail and Celeux (1996), and Bensmail et al. (1997). Under the assumption that $y_1, \ldots, y_n$ are independent realizations of the feature vector Y, the log likelihood function for $\Psi$ is given by $\log L(\Psi) = \sum_{j=1}^{n} \log \sum_{i=1}^{g} \pi_i c_i(y_j; \theta_i)$ (3). With the maximum likelihood approach to the estimation of $\Psi$, an estimate is ....

Bensmail, H., Celeux, G., Raftery, A.E., and Robert, C.P. (1997). Inference in model-based cluster analysis. Statistics and Computing 7, 1--10.


Model Selection for Probabilistic Clustering Using Cross-Validated .. - Smyth (1998)   (9 citations)  (Correct)

.... true density f(x) the log likelihood of $\Phi^{(k)}$ is defined as $l(\Phi^{(k)} \mid D^{\mathrm{train}}) = \log p(D^{\mathrm{train}} \mid \Phi^{(k)}) = \sum_{i=1}^{N} \log \sum_{j=1}^{k} \alpha_j g_j(x_i \mid \theta_j)$ (2). Note that there are alternative objective functions which can be maximized in the clustering context (e.g. see Bensmail et al. (1997) for clustering using the classification likelihood function). Direct maximization of the mixture log likelihood expression in Equation 2 is difficult except in trivial special cases. Thus, much of the popularity of mixture models in recent years is due to the existence of efficient iterative ....

....In this paper, only the problem of finding the correct numbers of components for Gaussian mixture models was discussed. However, one can in principle easily apply the methodology to a much broader class of mixture problems, such as selecting among different independence structures (e.g. see Bensmail et al. (1997) and Thiesson et al. (1998)) or model selection in the context of Markov models (e.g. see Smyth (1997) for an application to hidden Markov models). Directions for further work on cross validated likelihood include a bias variance characterization for better understanding of the trade offs involved ....

Bensmail, H., Celeux, G., Raftery, A., and Robert, C. P., `Inference in model-based cluster analysis,' Statistics and Computing, 7, 1--10, 1997.
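The mixture log likelihood quoted in the Smyth excerpt above can be illustrated with a minimal sketch. It assumes one-dimensional Gaussian components; the function names, the held-out sample, and all parameter values are hypothetical and are not taken from the cited paper. It only evaluates the log likelihood of fixed parameters on held-out points and does not reproduce the paper's cross-validation procedure.

```python
import math


def log_normal_pdf(x, mean, var):
    """Log density of a 1-D Gaussian (an assumed component family)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def mixture_log_likelihood(xs, weights, means, variances):
    """Sum over points of log(sum_j weight_j * g_j(x)), computed via log-sum-exp."""
    total = 0.0
    for x in xs:
        terms = [
            math.log(w) + log_normal_pdf(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        top = max(terms)  # subtract the largest term for numerical stability
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total


# Hypothetical held-out points drawn near -2 and +2.
held_out = [-2.2, -1.8, 1.9, 2.1]

# Candidate models with hypothetical, already-fitted parameters.
one_component = mixture_log_likelihood(held_out, [1.0], [0.0], [4.0])
two_components = mixture_log_likelihood(
    held_out, [0.5, 0.5], [-2.0, 2.0], [0.25, 0.25]
)
print(one_component, two_components)
```

Comparing such held-out scores across candidate numbers of components is the general idea behind likelihood-based model selection; the comparison here is only as meaningful as the assumed parameters.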


Region-of-Interest Selection and Statistical Analysis of.. - Forbes, al. (2001)   Self-citation (Raftery)   (Correct)

....value $z^*_{ik}$ of $z_{ik}$ at a maximum of (1) is the conditional probability that observation i belongs to group k. The maximum likelihood classification of observation i is $\{m \mid z_{im} = \max_k z_{ik}\}$, so that $(1 - \max_k z_{ik})$ constitutes a measure of the uncertainty in the classification (e.g. [2]). For the M step, estimates of the means and probabilities have simple closed form expressions involving the data and $z_{ik}$ from the E step: $\hat{p}_k \leftarrow n_k / n$; $\hat{\mu}_k \leftarrow \sum_{i=1}^{n} z_{ik} y_i / n_k$; $n_k \equiv \sum_{i=1}^{n} z_{ik}$ (8). Computation of the covariance estimate $\hat{\Sigma}_k$ depends on its parameterization. For ....

H. Bensmail, G. Celeux, A.E. Raftery, and C.P. Robert. Inference in model-based cluster analysis. Statistics and Computing, 7:1--10, 1997.
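The E step, the closed-form M-step updates for means and probabilities, and the classification uncertainty described in the Forbes excerpt above can be sketched as follows. This is a minimal illustration assuming one-dimensional Gaussian groups with fixed variances; the function names, data, and starting values are hypothetical, and the covariance update, which the excerpt says depends on the parameterization, is omitted.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a 1-D Gaussian (an assumed component family)."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e_step(ys, props, means, variances):
    """Return z[i][k], the conditional probability that observation i is in group k."""
    z = []
    for y in ys:
        weighted = [
            p * normal_pdf(y, m, v) for p, m, v in zip(props, means, variances)
        ]
        total = sum(weighted)
        z.append([w / total for w in weighted])
    return z


def m_step_means_and_props(ys, z):
    """Closed-form updates: n_k = sum_i z_ik, p_k = n_k / n, mu_k = sum_i z_ik y_i / n_k."""
    n = len(ys)
    n_groups = len(z[0])
    n_k = [sum(row[k] for row in z) for k in range(n_groups)]
    props = [nk / n for nk in n_k]
    means = [
        sum(row[k] * y for row, y in zip(z, ys)) / n_k[k] for k in range(n_groups)
    ]
    return props, means


def uncertainty(z):
    """Per-observation classification uncertainty, 1 - max_k z_ik."""
    return [1.0 - max(row) for row in z]


# Hypothetical data: two tight groups plus one ambiguous point near 0.
ys = [-2.1, -1.9, -2.0, 0.1, 1.9, 2.0, 2.2]
z = e_step(ys, props=[0.5, 0.5], means=[-2.0, 2.0], variances=[1.0, 1.0])
props, means = m_step_means_and_props(ys, z)
unc = uncertainty(z)
print(props, means, unc)
```

In this toy setup the point near 0 sits between the two groups, so it receives the largest uncertainty value, while points close to a group mean have uncertainty near 0.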


Choosing Models in Model-Based Clustering and.. - Celeux, Biernacki..   Self-citation (Celeux)   (Correct)

.... recognition (see for instance McLachlan 1992 and Ripley 1996). Recently several authors have exploited the eigenvalue decomposition of the group variance matrices in Gaussian mixtures to propose numerous and powerful models for clustering (Banfield and Raftery 1993, Celeux and Govaert 1995, Bensmail, Celeux, Raftery and Robert 1997) and discriminant analysis (Flury, Schmid and Narayanan 1993, Bensmail and Celeux 1996). This parametrization of variance matrices of the mixture components provides a general and flexible framework to give rise to efficient, although somewhat unusual, clustering criteria and classification rules. ....

....BIC approximation is valid for large n when standard regularity conditions regarding the likelihood are verified. In particular, must be well within the parameter space. Clearly, this condition does not hold when testing K = K 0 versus K = K 1 (see for instance Aitkin and Rubin 1985). 3 In Bensmail et al. (1997) the integrated likelihood is computed using the Gibbs sampler output from informative conjugate priors for the models [I], [k I], [Sigma] and [k Sigma]. It is done by using the Laplace Metropolis estimator of the integrated likelihood (Raftery 1996). The Laplace method for integrals yields ....

[Article contains additional citation context not shown here]

Bensmail H., Celeux G., Raftery A. E. and Robert C. P. (1997). Inference in model-based cluster analysis. Statistics and Computing, 7, 1-10.


How Many Clusters? Which Clustering Method? Answers Via.. - Fraley, Raftery (1998)   (7 citations)  Self-citation (Raftery)   (Correct)

....the conditions under which convergence has been proven do not always hold in practice, the method is widely used in the mixture modeling context with good results. Moreover, for each observation i, $(1 - \max_k z^*_{ik})$ is a measure of uncertainty in the associated classification (Bensmail et al. [9]). The EM algorithm for clustering has a number of limitations. First, the rate of convergence can be very slow. This does not appear to be a problem in practice for well separated mixtures when started with reasonable values. Second, the number of conditional probabilities associated with each ....

....by means of mathematical morphology (e.g. [39]). Principal curve clustering in the presence of noise using BIC is discussed in Stanford and Raftery [62]. In situations where the BIC is not definitive, more computationally intensive Bayesian analysis may provide a solution. Bensmail et al. [9] showed that exact Bayesian inference via Gibbs sampling, with calculations of Bayes factors using the Laplace Metropolis estimator, works well in several real and simulated examples. Approaches to clustering based on the classification likelihood (1) are also known as classification maximum ....

H. Bensmail, G. Celeux, A. E. Raftery, and C. P. Robert. Inference in model-based cluster analysis. Statistics and Computing, 7:1--10, March 1997.


How Many Clusters? Which Clustering Method? Answers Via.. - Fraley, Raftery (1998)   (7 citations)  Self-citation (Raftery)   (Correct)

....The model and classification can be considered to be an accurate representation of the data if all of the $z_{ik}$ are close to either 0 or 1 at a local maximum. Moreover, for each observation i, $(1 - \max_k z_{ik})$ is a measure of uncertainty in the associated classification (Bensmail et al. [21]). The EM algorithm for clustering has a number of limitations. First, the asymptotic rate of convergence can be very slow. This does not appear to be a problem in practice for well separated mixtures when started with reasonable values. Second, the number of conditional probabilities associated ....

....by means of mathematical morphology (e.g. [40]). Principal curve clustering in the presence of noise using BIC is discussed in Stanford and Raftery [41]. In situations where the BIC is not definitive, more computationally intensive Bayesian analysis may provide a solution. Bensmail et al. [21] showed that exact Bayesian inference via Gibbs sampling, with calculations of Bayes factors using the Laplace Metropolis estimator, works well in several real and simulated examples. Other model based clustering methodologies include Cheeseman and Stutz [42, 43] implemented in the AutoClass ....

Bensmail, H., Celeux, G., Raftery, A. E., and Robert, C. P. (1997). Inference in model-based cluster analysis. Statistics and Computing, 7, 1--10.


Bayesian Estimation and Segmentation of Spatial Point.. - Byers, Raftery (1997)   Self-citation (Raftery)   (Correct)

No context found.

Bensmail, H., G. Celeux, A. Raftery, and C. Robert (1997). Inference in model-based cluster analysis. Statistics and Computing 7, 1--10.
