37 citations found.
G. McLachlan and K. Basford. Mixture Models. Marcel Dekker, New York, 1988.

This paper is cited in the following contexts:

First 50 documents

Heuristic Classifier Performance Bounds in High Dimensional.. - Baggenstoss

.... data leading some researchers to avoid it [17] or constrain the covariances of the kernels to be identical [19] or of uniform size with variable rotation [14]. Adding to the covariance estimates based on a Bayesian prior density argument is the preferred method of dealing with the problem [20] [21]. This involves simply adding a diagonal matrix, representing an independent measurement noise prior, to the kernel covariances at each iteration. We have obtained excellent results with this method. When the PDF is estimated by optimizing the likelihood function, such as with an EM algorithm, the ....

G. J. McLachlan, Mixture Models. Dekker, 1988.
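
The covariance adjustment described in the Baggenstoss excerpt above (adding a diagonal matrix to each kernel covariance at every iteration) might be sketched as follows. This is only an illustration under stated assumptions: the function names `add_noise_prior` and `det2`, the 2x2 example matrix, and the noise variance 0.01 are all hypothetical and do not come from the cited paper.

```python
def add_noise_prior(cov, noise_var):
    """Return cov + noise_var * I for a square matrix stored as nested lists.

    noise_var stands in for the variance of an independent measurement-noise
    prior; its value here is an assumption chosen for illustration.
    """
    n = len(cov)
    return [
        [cov[i][j] + (noise_var if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def det2(m):
    """Determinant of a 2x2 matrix, used only to check invertibility."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# A collapsed (singular) kernel covariance, as can arise with too little data.
collapsed = [[1.0, 1.0], [1.0, 1.0]]

# Adding the diagonal term makes the matrix positive definite again.
regularized = add_noise_prior(collapsed, 0.01)
```

In an EM loop this adjustment would be applied to every kernel covariance after each M step, so that no kernel can collapse onto a lower-dimensional subspace.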


Using Unlabeled Data to Improve Text Classification - Nigam (2001)   (10 citations)

....(Little, 1977) was recognized immediately. Since then, this approach continues to be used and studied (McLachlan & Ganesalingam, 1982; Ganesalingam, 1989; Shahshahani & Landgrebe, 1994). Three excellent surveys of the history of EM and its application to mixture modeling are the books by McLachlan and Basford (1988), McLachlan and Krishnan (1997) and McLachlan and Peel (2000). Using likelihood maximization of mixture models for combining labeled and unlabeled data for classification has only recently made its way to the machine learning community (Miller & Uyar, 1996; Nigam et al., 1998; Baluja, 1999) ....

McLachlan, G., & Basford, K. (1988). Mixture models. New York: Marcel Dekker.


Learning with Labeled and Unlabeled Data - Seeger (2001)   (28 citations)

....xju. The functional relationships are represented either by linear or by more powerful nonlinear models, in the latter case the model family is tightly regularized by an appropriate prior P ( on the model parameter . The noise model is usually a Gaussian. Other examples are mixture models (e.g. [58], [90], [69]) where the latent variable is a grouping variable from a finite set (similar to the class label in supervised classification) and the conditional models come from simple families such as Gaussians with structurally restricted covariance matrices. Combinations of mixture and latent ....

....EM equations then parallels very closely the case of mixture models, which can be found in many textbooks, e.g. [11]. Indeed, using EM to fill in labels on D_u has already been suggested very early, namely in a note by R. J. Little in the discussion of [26]. Chapter 1.11 of [58] gives the idea and further references, however it is not clear whether the authors suggest using the approach for classification or merely for partially unsupervised learning, where unsupervised fitting of a mixture model to P(x) is aided by a few labeled points D_l. It has been used to attack ....

[Article contains additional citation context not shown here]

G. McLachlan and K. Basford. Mixture Models. Marcel Dekker, New York, 1988.


Mixture Codebook Classification - Part 1: Method Outline - Langaas (1995)

....finite mixtures. The problem of constructing classification rules is not addressed in this section. 3.1 One and Two Level Mixture Models We will in this section define one and two level mixtures and the mixture codebook, M. General references to mixture models are Everitt and Hand (1981), McLachlan and Basford (1988) and Titterington et al. (1985). 3.1.1 One Level Mixture Models Assume that X has a finite mixture distribution defined as $p(x) = \sum_{l=1}^{K} p(c_l)\, f_l(x \mid \theta_l)$ (1) with mixing weights $p(c_1), \ldots, p(c_K)$ and component densities $f_1(x \mid \theta_1)$ ....

....The problem of choosing K is not addressed in this report. 3.2 Estimating the Parameters and Weights of a Mixture The problem of estimating the parameters and weights of a one level mixture is one of the oldest estimation problems in the statistical literature, see Everitt and Hand (1981) and McLachlan and Basford (1988) for an historical account. Estimation is difficult even for a one level mixture of two univariate Gaussian component distributions, where the likelihood surface is littered with singularities. Karl Pearson used in 1894 the method of moments to estimate the five parameters and weights in a mixture ....

McLachlan, G. J. and Basford, K. E. (1988), Mixture Models, Inference and Applications to Clustering, New York: Marcel Dekker.
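
As a rough illustration of the one level mixture density quoted in the Langaas excerpt above, the sketch below evaluates a weighted sum of component densities. It assumes univariate Gaussian components; the function names, the parameter values, and the choice of Gaussians are assumptions made for this example and are not taken from the cited report.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, stds):
    """Finite mixture density: the sum over components of weight * component density."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))


# Hypothetical two-component mixture; the weights are non-negative and sum to 1.
weights = [0.3, 0.7]
means = [0.0, 4.0]
stds = [1.0, 0.5]

density_at_zero = mixture_pdf(0.0, weights, means, stds)
```

Because the weights sum to one and each component integrates to one, the mixture is itself a proper density.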


Feature Subset Selection and Order Identification for.. - Dy, Brodley (2000)   (16 citations)

.... Subset Selection wrapped around EM clustering) and FSSEM k (FSSEM with order identification). In this paper, the term EM clustering represents the expectation maximization (EM) algorithm (Dempster et al., 1977) applied to estimating the maximum likelihood parameters of a finite Gaussian mixture (McLachlan & Basford, 1988). Although we apply the wrapper approach to EM clustering, it can be applied to any clustering method. 2. Unsupervised Feature Selection Literature To maintain the wrapper filter model distinction used to characterize feature subset selection in supervised learning, we define the wrapper ....

....In the experiments reported, we applied sequential forward search. In the future, we plan to explore the effect of other search methods on FSSEM. Note that EM is initialized for each new feature subset. In this paper, we assume that the data comes from a mixture model of multivariate Gaussians (McLachlan & Basford, 1988). We apply the EM algorithm to estimate the maximum likelihood mixture model parameters and the cluster probabilities of each data point. EM clustering results in soft clusters (i.e., each data point belongs to every cluster with some probability). Note that the framework introduced in this ....

[Article contains additional citation context not shown here]

McLachlan, G. J., & Basford, K. E. (1988). Mixture models, inference and applications to clustering.
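
The soft clustering described in the Dy and Brodley excerpt above can be illustrated with a minimal EM loop. This sketch makes several simplifying assumptions that are not in the cited paper: the data are univariate rather than multivariate, the number of components is fixed, and the names `em_gmm_1d` and `var_floor`, the toy data, and the starting values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, n_iter=50, var_floor=1e-6):
    """Fit a univariate Gaussian mixture by EM; returns parameters and soft assignments."""
    means, variances, weights = list(means), list(variances), list(weights)
    k, n = len(means), len(data)
    resp = []
    for _ in range(n_iter):
        # E step: probability that each point belongs to each component.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M step: re-estimate weights, means and variances from the soft counts.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, var_floor)  # floor guards against collapse
    # resp holds the soft assignments computed in the final E step.
    return means, variances, weights, resp


# Toy data with two well-separated groups (hypothetical values).
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, variances, weights, resp = em_gmm_1d(data, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
```

Each row of `resp` sums to one, which is the sense in which every data point belongs to every cluster with some probability.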


Performance Comparison of Smoothing and Gamma Priors for.. - Hsiao, Wang, Gindi (1999)

....optimization in (9) is to use an alternating algorithm: At each iteration k, solve for given current estimates of , then solve for given the latest estimate. This calculation is intractable, but may be made tractable by using an EM algorithm and an appropriate complete data space [10]. The complete data is given by Z aj , which may be interpreted as the probability that pixel j belongs to class a. Note that the complete data satisfies 0 Z aj 1, P a Z aj = 1 and a = 1=N P j Z aj . In terms of objective functions, the problem becomes: L ( jg) mix P ( ....

G.J. McLachlan and K.E. Basford, Mixture Models, Marcel Dekker, 1987.


Joint-MAP Reconstruction/Segmentation for Transmission.. - Hsiao, Rangarajan, Gindi (1998)

....the form of an indicator variable Z = fZ ai g #a=1; L;i=1; N# is introduced, where Z ai = # 1 : pixel i belongs to class a 0 : otherwise (8) with P a Z ai = 1. Note that Z ai can be also viewed as a segmentation of #. The log likelihood function for the complete data turns out to be [11] log L c = X i X a Z ai log## a p## i j# a ## (9) Now, let s perform the E step of EM by taking the expectation of (9) over Z ai by given # k =## k ;# k # and #, E z #log L c j# k ;## = X ai log## k a p## i j# k a ##EZ #Z ai j# i ; # k # = Q##j# k # (10) where EZ #Z ....

G.J. McLachlan and K.E. Basford, Mixture Models, Marcel Dekker, 1987.


Automating the Construction of Internet Portals with.. - McCallum, Nigam..   (29 citations)

....computer science hierarchy would allow the unlabeled documents to benefit classification more. However, even without a complete hierarchy, we could use these documents if we could identify these outliers. Some techniques for robust estimation with EM are discussed by McLachlan and Basford (McLachlan & Basford, 1988). One specific technique for these text hierarchies is to add extra leaf nodes containing uniform word distributions to each interior node of the hierarchy in order to capture documents not belonging in any of the predefined topic leaves. This should allow EM to perform well even when a large ....

McLachlan, G., & Basford, K. (1988). Mixture Models. Marcel Dekker, New York.


MDL-Based Selection of the Number of Components in Mixture .. - Tenmoto, Kudo, Shimbo (1998)   (9 citations)

.... [1]. In this situation, two problems arise: 1) How should we select the number of components? 2) How should we construct initial components? The first problem is crucial but difficult, and there have been many efforts made to resolve this problem (these efforts are discussed in reference [2]). For example, Ichimura [3] has proposed a method for selecting the number of components based on information criteria. In this method, the optimal number of components is selected on the basis of the MDL principle [4] in a trade-off between the likelihood of the model to the training samples and ....

McLachlan, G. J., Basford, K. E.: Mixture Models. Marcel Dekker, Inc., New York (1988) 21--29


From Isolation to Cooperation: An Alternative View of a.. - Schaal, Atkeson (1995)   (15 citations)

....learning rather slowly. In incremental learning, another potential danger arises when the input distribution of the data changes. The expert selection system usually makes either implicit or explicit prior assumptions about the input data distribution. For example, in the classical mixture model (McLachlan & Basford, 1988), which was employed in several local expert approaches, the prior probabilities of each mixture model can be interpreted as the fraction of data points each expert expects to experience. Therefore, a change in input distribution will cause all experts to change their domains of expertise in order ....

McLachlan, G. J., & Basford, K. E. (1988). Mixture models. New York: Marcel Dekker.


Analysis of Three-Dimensional Protein Images - Leherte, al. (1997)   (3 citations)

.... analysis approach to the problem, we decided to treat the estimation and combination issues together by fitting mixtures of continuous distributions to the data for each class, under the conditional independence assumption commonly used in mixture model approaches to classification and clustering (McLachlan & Basford, 1988). In a latent class analysis approach to finding structure in a set of datapoints, one begins with an underlying parameterized model. For example, one might posit that a set of points represented by a 2D scatterplot was generated by a 2D Gaussian (normal) distribution, with means μ1, μ2 and ....

McLachlan, G., & Basford, K. (1988). Mixture Models. Inference and Applications to Clustering.


Bootstrapping for Text Learning Tasks - Jones, McCallum, Nigam, Riloff (1999)   (12 citations)

....estimation. A more inclusive computer science hierarchy would allow the unlabeled documents to benefit classification more. However, even without a complete hierarchy, we could use these documents if we could identify these outliers. Some techniques for robust estimation with EM are discussed by McLachlan and Basford [1988]. One specific technique for these text hierarchies is to add extra leaf nodes containing uniform word distributions to each interior node of the hierarchy in order to capture documents not belonging in any of the predefined topic leaves. This should allow EM to perform well even when a large ....

G.J. McLachlan and K.E. Basford. Mixture Models. Marcel Dekker, New York, 1988.


SIGMA: Integrating Learning Techniques in Computational.. - Grigoris Karakoulas   (1 citation)

....in coping with the intricacies of the IF task may inherently be limited. Within the statistics and the computational learning communities, the idea of distributing a learning problem among a set of local experts has been proposed for robustness against partial observability and non-stationarity (McLachlan & Basford 1988; Jordan & Jacobs 1994). These local experts compete with each other in order to acquire local expertise in regions of the input space which may be overlapping. In the models so developed, gating of the experts is fixed and often depends on prior assumptions about the input data distributions. ....

McLachlan, G.; and Basford, K. 1988. Mixture models. Marcel Dekker.


MML mixture modelling of multi-state, Poisson, von Mises.. - Wallace, Dowe (1997)

.... Bernoulli sampling). 6 Alternative mixture modelling programs The first Snob program (since outdated) [37] was possibly the first program for Gaussian mixture modelling, although many statistical and machine learning approaches to this problem have been developed since (e.g. McLachlan et al. [23, 22], D. Fisher's CobWeb [17]). Discussions of early alternative algorithms for Gaussian mixture modelling have been given by Boulton [3]. 6.1 Comparison with AutoClass II Like Snob, AutoClass II [10] assumes a prior distribution over the number of classes and independent prior densities over the ....

G.J. McLachlan and K.E. Basford. Mixture Models. Marcel Dekker, New York, 1988.


Continuous Gaussian Mixture Modeling - Stephen Aylward

....FGMMs offer poor consistency. This inconsistency is aggravated by the reliance on the user to specify the number of components. While much research has focused on automatically determining an appropriate number of components for a given problem, a generally applicable approach has not been found [9, 18]. An FGMM's expected accuracy does not vary monotonically as a function of the number of components. Additionally, MLEM's non-optimal maxima can lead to poorly utilized components; the effective number of components in an FGMM may be less than the user-specified number of components. GPGDs are ....

McLachlan, G.J. and Basford, K.E., Mixture Models. Marcel Dekker, Inc., New York, vol. 84, 1988 p. 253


Neural Networks and Statistical Models - Sarle (1994)   (31 citations)

.... algorithms have been developed in statistics, numerical taxonomy, and many other fields, as described in countless articles and numerous books such as Everitt (1980), Massart and Kaufman (1983), Anderberg (1973), Sneath and Sokal (1973), Hartigan (1975), Titterington, Smith, and Makov (1985), McLachlan and Basford (1988), Kaufmann and Rousseeuw (1990) and Spath (1980). In adaptive vector quantization (AVQ), the inputs are acknowledged to be target values that are predicted by the means of the cluster to which a given observation belongs. This network is therefore essentially the same as that in Figure 12 except ....

McLachlan, G.J. and Basford, K.E. (1988), Mixture Models, New York: Marcel Dekker, Inc.


Document Preprocessing for Naive Bayes.. - Pavlov.. (2004)

No context found.

G. McLachlan and K. Basford. Mixture Models. Marcel Dekker, New York, 1988.


Sequence Modeling with Mixtures of Conditional Maximum.. - Dmitry Pavlov Yahoo (2003)

No context found.

G. McLachlan and K. Basford. Mixture Models. Marcel Dekker, New York, 1988.


Unsupervised Learning Using MML - Jonathan Oliver Computer (1996)   (21 citations)

No context found.

G.J. McLachlan and K.E. Basford. Mixture Models. Marcel Dekker, New York, 1988.


Mixture Modeling for Digital Mammogram Display and Analysis - Aylward, Hemminger, Pisano (1998)

No context found.

McLachlan GJ, Basford KE (1988) Mixture Models. New York, Marcel Dekker, Inc.


Continuous Gaussian Mixture Modeling - Stephen Aylward And (1997)

No context found.

McLachlan, G.J. and Basford, K.E., Mixture Models. Marcel Dekker, Inc., New York, vol. 84, 1988 p. 253


An Empirical Comparison of Four Initialization Methods.. - Pena, Lozano, Larranaga (1999)   (15 citations)

No context found.

McLachlan, G.J. and Basford, K.E. (1988). Mixture Models. Marcel Dekker, Inc., New York, NY.


Annealed Competition of Experts for a Segmentation and .. - Pawelzik, Kohlmorgen, .. (1996)   (30 citations)

No context found.

McLachlan, G.J., Basford, K.J. (1988), Mixture models, Marcel Dekker, NY and Basel.


Unsupervised Learning Using MML - Oliver, Baxter, Wallace (1996)   (21 citations)

No context found.

G.J. McLachlan and K.E. Basford. Mixture Models. Marcel Dekker, New York, 1988.

CiteSeer.IST - Copyright Penn State and NEC