### Citations

11964 | Maximum Likelihood from Incomplete Data via the EM Algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ...distribution corresponds to a cluster in the target consensus partition, and is assumed to be a multivariate, multinomial distribution. The maximum likelihood problem is solved using the EM algorithm [6]. There are three main advantages to our approach: 1. It completely avoids solving the label correspondence problem. 2. The low computational complexity of the EM consensus function – O(kNH) for k clu...
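The context above only outlines the mixture-of-multinomials consensus function. The following sketch (hypothetical variable names, plain NumPy, not the authors' code) illustrates the idea: each component keeps an independent categorical distribution per base partition, so no label correspondence between partitions is ever needed, and each EM iteration is O(kNH):

```python
import numpy as np

def em_consensus(Y, n_clusters, n_iter=50, seed=0):
    """Consensus clustering via EM on a mixture of multinomials.

    Y : (N, H) int array -- cluster labels from H base partitions.
    Each mixture component m keeps, for every partition j, a categorical
    distribution over that partition's label set.
    """
    rng = np.random.default_rng(seed)
    N, H = Y.shape
    K = [Y[:, j].max() + 1 for j in range(H)]           # label count per partition
    pi = np.full(n_clusters, 1.0 / n_clusters)          # mixing weights
    # theta[j][m, l] = P(label l in partition j | component m)
    theta = [rng.dirichlet(np.ones(K[j]), size=n_clusters) for j in range(H)]
    for _ in range(n_iter):
        # E-step: responsibilities, O(k N H) per iteration
        log_r = np.log(pi)[None, :] + sum(
            np.log(theta[j][:, Y[:, j]].T + 1e-12) for j in range(H))
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights and per-partition categoricals
        pi = r.mean(axis=0)
        for j in range(H):
            counts = np.stack(
                [r[Y[:, j] == l].sum(axis=0) for l in range(K[j])], axis=1)
            theta[j] = (counts + 1e-12) / (counts + 1e-12).sum(axis=1, keepdims=True)
    return r.argmax(axis=1)                             # consensus labels
```

Note that the second partition in a test can use flipped label names and the consensus is unaffected, which is exactly the correspondence-free property claimed in the context.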

2797 | Algorithms for clustering data
- Jain, Dubes
- 1988
Citation Context ...mixture model, EM algorithm. 1 Introduction Data clustering is a difficult inverse problem, and as such is ill-posed when prior information about the underlying data distributions is not well defined [15, 16, 21]. Numerous clustering algorithms are capable of producing different partitions of the same data that capture various distinct aspects of the data. The exploratory nature of clustering tasks demands ef...

1938 | Data clustering: a review
- Jain, Murty, et al.
- 1999
Citation Context ...mixture model, EM algorithm. 1 Introduction Data clustering is a difficult inverse problem, and as such is ill-posed when prior information about the underlying data distributions is not well defined [15, 16, 21]. Numerous clustering algorithms are capable of producing different partitions of the same data that capture various distinct aspects of the data. The exploratory nature of clustering tasks demands ef...

1467 | The EM Algorithm and Extensions
- McLachlan, Krishnan
- 1996
Citation Context ...y function Q(Θ; Θ′) that serves as a lower bound on the observed data likelihood in Eq. (3): Q(Θ; Θ′) = Σ_z log(P(Y, z | Θ)) p(z | Y, Θ′). (11) Classical convergence analysis of the EM algorithm [6, 24] establishes that the maximization of the function Q(Θ; Θ′) with respect to Θ is equivalent to increasing the observed likelihood function in Eq. (3). Evaluation of Q(Θ; Θ′) is the first step of t...
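Written out in full, the auxiliary function in the quoted context is the standard EM lower bound, with the summation over the latent component assignments z restored:

```latex
Q(\Theta;\,\Theta') \;=\; \sum_{z}\,\log\!\big(P(Y, z \mid \Theta)\big)\, p(z \mid Y, \Theta') \tag{11}
```

Maximizing Q(Θ; Θ′) over Θ at each iteration cannot decrease the observed-data log-likelihood log P(Y | Θ); this monotonicity is the convergence property the context attributes to [6, 24].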

1189 | A fast and high quality multilevel scheme for partitioning irregular graphs
- Karypis, Kumar
Citation Context ...hypergraph operations to search for a solution. The Cluster-based Similarity Partitioning Algorithm (CSPA) [30] induces a graph from a co-association matrix and clusters it using the METIS algorithm [19]. The hypergraph partitioning algorithm (HGPA) [30] represents each cluster by a hyperedge in a graph where the nodes correspond to a given set of objects. Good hypergraph partitions are found using minim...
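As a concrete illustration of the co-association matrix from which CSPA induces its graph (a plain-NumPy sketch; the METIS graph-partitioning step itself is external and not reproduced here):

```python
import numpy as np

def co_association(partitions):
    """Build the N x N co-association matrix used by CSPA.

    partitions : (H, N) int array of labels from H base clusterings.
    Entry (a, b) is the fraction of partitions placing objects a and b
    in the same cluster; CSPA treats this as a similarity graph and
    hands it to a partitioner such as METIS.
    """
    P = np.asarray(partitions)
    H, N = P.shape
    S = np.zeros((N, N))
    for labels in P:
        S += (labels[:, None] == labels[None, :])   # 1 where co-clustered
    return S / H
```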

817 | On the optimality of the simple Bayesian classifier under zero-one loss
- Domingos, Pazzani
- 1997
Citation Context ...e clusters in πC are much less sensitive to the conditional independence approximation than the estimated values of probabilities P(y | Θ), as supported by the analysis of the naïve Bayes classifier in [7]. The last ingredient of the mixture model is the choice of a probability density P^(j)(y | θ^(j)) for the components of the vectors yi. Since the variables yij take on nominal values from a set of...

603 | Cluster ensembles - a knowledge reuse framework for combining multiple partitions
- Strehl, Ghosh
- 2002
Citation Context ...ssue is the choice of the clustering algorithms for the ensemble. Diversity of the individual clusterings can be achieved by a number of approaches, including: using different conventional algorithms [30], their relaxed versions [31], built-in randomness [11, 12], or by data sampling [8, 9, 25]. This work focuses on the primary problem of clustering ensembles, namely the consensus function, which crea...

455 | Latent variable models and factor analysis
- Bartholomew
- 1987
Citation Context ...ective functions during the inference, such as the minimum description length of the model. In addition, the proposed consensus algorithm can be viewed as a version of Latent Class Analysis (e.g. see [3]), which has rigorous statistical means for quantifying plausibility of a candidate mixture model. 5 Empirical Study The experiments were conducted with artificial and real-world datasets, where true n...

439 | An analysis of Bayesian classifiers
- Langley, Iba, et al.
- 1992
Citation Context ...nt clustering algorithms (indexed by j) are not truly independent, the approximation by product in Eq. (5) can be justified by the excellent performance of naive Bayes classifiers in discrete domains [22]. Our ultimate goal is to make a discrete label assignment to the data in X through an indirect route of density estimation of Y. The assignments of patterns to the clusters in πC are much less sensit...

418 | Unsupervised learning of finite mixture models
- Figueiredo, Jain
- 2002
Citation Context ...ssume that the target number of clusters is predetermined. It should be noted, however, that mixture models in unsupervised classification greatly facilitate estimation of the true number of clusters [10]. Maximum likelihood formulation of the problem specifically allows us to estimate M by using additional objective functions during the inference, such as the minimum description length of the model. ...

230 | Supervised learning from incomplete data via an EM approach
- Ghahramani, Jordan
- 1994
Citation Context ...rmation can occur in clustering combination of distributed data or ensemble of clusterings of non-identical replicas of a dataset. It is possible to apply the EM algorithm in the case of missing data [14], namely missing cluster labels for some of the data points. In these situations, each vector yi in Y can be split into observed and missing components yi = (yi^obs, yi^mis). Incorporation of a miss...
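A minimal sketch of the modification described above (hypothetical names; assumes the mixture parameters pi and theta are given): in the E-step, the missing components of each y_i are simply skipped, which marginalizes them out of the likelihood:

```python
import numpy as np

def responsibilities_missing(Y, mask, pi, theta):
    """E-step responsibilities when some label entries are missing.

    Y     : (N, H) int labels; mask : (N, H) bool, True where observed.
    pi    : (M,) mixing weights.
    theta : list of H arrays of shape (M, K_j), categorical parameters.
    Each point's likelihood is a product over its *observed* partitions
    only, so missing components y_i^mis are marginalized out.
    """
    N, H = Y.shape
    log_r = np.tile(np.log(pi), (N, 1))
    for j in range(H):
        obs = mask[:, j]                                # observed entries
        log_r[obs] += np.log(theta[j][:, Y[obs, j]].T)
    log_r -= log_r.max(axis=1, keepdims=True)           # numerical stability
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)
```

For a point with no observed components, the responsibilities reduce to the mixing weights pi, as expected.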

215 | Support vector clustering
- Ben-Hur, Horn, et al.
Citation Context ...rtitions). It is somewhat reminiscent of classification approaches based on kernel methods which rely on linear discriminant functions in the transformed space. For example, Support Vector Clustering [4] seeks spherical clusters after the kernel transformation that correspond to more complex cluster shapes in the original pattern space. Section 2 describes relevant research on clustering combination...

139 | Dimensionality Reduction Using Genetic Algorithms
- Raymer, Punch, et al.
- 2000
Citation Context ...either conserved or non-conserved type of molecules in the bound structure of proteins [1, 28]. Molecules are described by 8 physical and chemical features. The first feature, atomic density, was not used in the experiments because of its high correlation with atomic hydrophilicity. We also us...

137 | An impossibility theorem for clustering
- Kleinberg
- 2002
Citation Context ...mixture model, EM algorithm. 1 Introduction Data clustering is a difficult inverse problem, and as such is ill-posed when prior information about the underlying data distributions is not well defined [15, 16, 21]. Numerous clustering algorithms are capable of producing different partitions of the same data that capture various distinct aspects of the data. The exploratory nature of clustering tasks demands ef...

131 | Bagging to improve the accuracy of a clustering procedure
- Dudoit, Fridlyand
Citation Context ...dividual clusterings can be achieved by a number of approaches, including: using different conventional algorithms [30], their relaxed versions [31], built-in randomness [11, 12], or by data sampling [8, 9, 25]. This work focuses on the primary problem of clustering ensembles, namely the consensus function, which creates the combined clustering. We propose a new fusion method for these kinds of unsupervised...

131 | Data clustering using evidence accumulation
- Fred, Jain
- 2002
Citation Context ... ensemble. Diversity of the individual clusterings can be achieved by a number of approaches, including: using different conventional algorithms [30], their relaxed versions [31], built-in randomness [11, 12], or by data sampling [8, 9, 25]. This work focuses on the primary problem of clustering ensembles, namely the consensus function, which creates the combined clustering. We propose a new fusion method...

83 | Combining Multiple Weak Clusterings
- Topchy, Jain, et al.
- 2003

55 | Finding consistent clusters in data partitions
- Fred
Citation Context ... ensemble. Diversity of the individual clusterings can be achieved by a number of approaches, including: using different conventional algorithms [30], their relaxed versions [31], built-in randomness [11, 12], or by data sampling [8, 9, 25]. This work focuses on the primary problem of clustering ensembles, namely the consensus function, which creates the combined clustering. We propose a new fusion method...

53 | Path-based clustering for grouping of smooth curves and texture segmentation
- Fischer, Buhmann
- 2003
Citation Context ...dividual clusterings can be achieved by a number of approaches, including: using different conventional algorithms [30], their relaxed versions [31], built-in randomness [11, 12], or by data sampling [8, 9, 25]. This work focuses on the primary problem of clustering ensembles, namely the consensus function, which creates the combined clustering. We propose a new fusion method for these kinds of unsupervised...

53 | Collective, hierarchical clustering from distributed, heterogeneous data
- Johnson, Kargupta
- 1999
Citation Context ...ns are accumulated during the course of merging. The final clustering is obtained by assigning each object to a derived cluster with the highest membership value. The distributed clustering algorithm [17] constructs a global dendrogram for a set of objects from multiple local models produced by single-link algorithms. Collective hierarchical clustering combines dendrograms built on different subsets o...

52 | Protein data bank
- Abola, Bernstein, et al.
- 1987
Citation Context ...either conserved or non-conserved type of molecules in the bound structure of proteins [1, 28]. Molecules are described by 8 physical and chemical features. The first feature, atomic density, was not used in the experiments because of its high correlation with atomic hydrophilicity. We also us...

46 | Automated star/galaxy discrimination with neural networks
- Odewahn, Stockwell, et al.
- 1992
Citation Context ...the experiments. Two large real-world benchmarks: (i) The dataset of galaxies and stars, characterized by 14 features extracted from their images, with known classification provided by domain experts [26], (ii) Biochemical dataset of water molecules found in protein structures and categorized as ...

40 | The median procedure for partitions
- Barthelemy, Leclerc
- 1995
Citation Context ...ntractable label correspondence problem. The combination of multiple clusterings can also be viewed as finding a median partition with respect to the given partitions, which is proven to be NP-complete [2]. Another challenging issue is the choice of the clustering algorithms for the ensemble. Diversity of the individual clusterings can be achieved by a number of approaches, including: using different c...

25 | Voting-merging: An ensemble method for clustering
- Dimitriadou, Weingessel, et al.
- 2001
Citation Context ... objective function evaluates multiple partitions according to changes caused by data perturbations and prefers those clusterings that are least susceptible to those perturbations. Dimitriadou et al. [5] proposed a voting/merging procedure that combines clusterings pair-wise and iteratively. The cluster correspondence problem must be solved at each iteration and the solution is not unique. Fuzzy memb...

17 | Ensembles of Partitions via Data Resampling
- Minaei, Topchy, et al.
- 2004
Citation Context ...dividual clusterings can be achieved by a number of approaches, including: using different conventional algorithms [30], their relaxed versions [31], built-in randomness [11, 12], or by data sampling [8, 9, 25]. This work focuses on the primary problem of clustering ensembles, namely the consensus function, which creates the combined clustering. We propose a new fusion method for these kinds of unsupervised...

15 | Contrasting and Combining Clusters in Viral Gene Expression Data
- Kellam, Liu, et al.
- 2001
Citation Context ...ure. Such an approach [23] is unique in that the components of the combination and the final clustering are defined implicitly via prototypes rather than by explicit labelings. Kellam et al. [20] also combined clusterings through a type of co-association matrix. However, this matrix is used only to find the clusters with the highest value of support based on object co-occurrences. As a result...

15 | Inference with missing data
- Rubin
- 1976
Citation Context ...dling missing data can be found in [14, 24]. Though data with missing cluster labels can be obtained in different ways, we analyze only the case when components of yi are missing completely at random [29]. This means that the probability that a component is missing does not depend on other observed or unobserved variables. Note that the outcome of clustering of data subsamples (e.g., bootstrap) is dif...

6 | Robust clustering by evolutionary computation
- Gablentz, Koppen, et al.
- 2000
Citation Context ...hest value of support based on object co-occurrences. As a result, only a set of so-called robust clusters is produced, which may not contain all the initial objects. A genetic algorithm is employed in [13] to produce the most stable partitions from an evolving ensemble (population) of clustering algorithms along with a special objective function. The objective function evaluates multiple partitions acc...

6 | Distributed Data Mining
- Park, Kargupta
- 2003
Citation Context ...ering ensembles can also be used in multiobjective clustering as a compromise between individual clusterings with conflicting objective functions and play an important role in distributed data mining [27]. The problem of combination of multiple clusterings brings its own new challenges. The major difficulty is in finding a consensus partition from the output partitions of various clustering algorithms...

3 | Bagged clustering, Working Papers SFB "Adaptive Information Systems and Modeling"
- Leisch
- 1999
Citation Context ...ing of the entire data set. After an overall consistent re-labeling, voting can be applied to determine cluster membership for each pattern. Bagging of multiple k-means clustering results was done in [23] by clustering k-means centers and assigning the objects to the closest cluster center. In fact, their component clusterings do not keep information about the individual object labels but only informa...
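The bagged-clustering procedure described above can be sketched as follows (a hypothetical NumPy-only reconstruction of the idea, not Leisch's implementation): run k-means on bootstrap samples, pool the resulting centers, cluster the pooled centers themselves, then assign each object to the nearest meta-center.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Tiny Lloyd's k-means with farthest-point init; returns centers."""
    rng = np.random.default_rng(seed)
    C = X[[rng.integers(len(X))]]
    while len(C) < k:                       # farthest-point init spreads centers
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).min(1)
        C = np.vstack([C, X[d.argmax()]])
    for _ in range(n_iter):
        lab = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        C = np.stack([X[lab == m].mean(0) if (lab == m).any() else C[m]
                      for m in range(k)])   # keep old center if cluster empties
    return C

def bagged_clustering(X, k, n_runs=10, seed=0):
    """Bagged clustering sketch: k-means on bootstrap samples, then
    cluster the pooled centers and assign objects to the nearest one."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([kmeans(X[rng.integers(0, len(X), len(X))], k, seed=s)
                        for s in range(n_runs)])
    meta = kmeans(pooled, k, seed=seed)     # cluster the centers, not the data
    return ((X[:, None, :] - meta[None, :, :]) ** 2).sum(-1).argmin(1)
```

Note how, exactly as the context says, the combination step never touches the individual object labels of the component clusterings, only their centers.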