  Unsupervised and semisupervised clustering: a brief survey (2004) [6 citations — 0 self]

by Nizar Grira, Michel Crucianu, Nozha Boujemaa
in ‘A Review of Machine Learning Techniques for Processing Multimedia Content’, Report of the MUSCLE European Network of Excellence (FP6
http://www-rocq.inria.fr/~crucianu/src/BriefSurveyClustering.pdf

Abstract:

Clustering (or cluster analysis) aims to organize a collection of data items into clusters, such that items within a cluster are more “similar” to each other than they are to items in the other clusters. This notion of similarity can be expressed in very different ways, depending on the purpose of the study, on domain-specific assumptions and on prior knowledge of the problem. Clustering is usually performed when no information is available concerning the membership of data items in predefined classes. For this reason, clustering is traditionally seen as part of unsupervised learning. We nevertheless speak here of unsupervised clustering to distinguish it from a more recent and less common approach that makes use of a small amount of supervision to “guide” or “adjust” clustering (see section 2). To support the extensive use of clustering in computer vision, pattern recognition, information retrieval, data mining, etc., a great many methods have been developed in several communities. Detailed surveys of this domain can be found in [25], [27] or [26]. In the following, we briefly review a few core concepts of cluster analysis and describe the categories of clustering methods that are best represented in the literature. We also take this opportunity to provide some pointers to more recent work on clustering.

Citations

4704 Maximum likelihood from incomplete data via the EM algorithm – Dempster, Laird, et al. - 1977
1479 Algorithms for Clustering Data – Jain, Dubes - 1988
728 Finding Groups in Data: An Introduction to Cluster Analysis – Kaufman, Rousseeuw - 1990
597 Data clustering: A review – Jain, Murty, et al. - 1999
572 A density-based algorithm for discovering clusters in large spatial databases with noise – Ester, Kriegel, et al. - 1996
431 On spectral clustering: Analysis and an algorithm – Ng, Jordan, et al. - 2001
405 Automatic subspace clustering of high dimensional data for data mining applications – Agrawal, Gehrke, et al. - 1998
371 CURE: an efficient clustering algorithm for large databases – Guha, Rastogi, et al. - 1998
294 Cluster Analysis – Everitt - 1993
204 Model-based Gaussian and non-Gaussian clustering – Banfield, Raftery - 1993
203 OPTICS: Ordering Points To Identify the Clustering Structure – Ankerst, Breunig, et al. - 1999
203 Distance metric learning, with application to clustering with side-information – Xing, Ng, et al. - 2003
194 Hierarchical grouping to optimize an objective function – Ward - 1963
152 Graph-theoretical Methods for Detecting and Describing Gestalt Clusters – Zahn - 1971
151 Unsupervised optimal fuzzy clustering – Gath, Geva - 1989
136 An efficient approach to clustering in large multimedia databases with noise – Hinneburg, Keim - 1998
108 Adaptive duplicate detection using learnable string similarity measures – Bilenko, Mooney - 2003
97 A validity measure for fuzzy clustering – Xie, Beni - 1991
83 From instance-level constraints to space-level constraints: Making the most of prior knowledge in data clustering – Klein, Kamvar, et al. - 2002
82 Support vector clustering – Ben-Hur, Horn, et al. - 2001
78 Clustering with instance-level constraints – Wagstaff, Cardie - 2000
75 Density-based clustering in spatial databases: the algorithm GDBSCAN and its applications (Data Mining and Knowledge Discovery) – Sander, Ester, et al. - 1998
62 A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining – Huang - 1997
61 Semi-supervised clustering by seeding – Basu, Banerjee, et al. - 2002
61 Some New Indexes of Cluster Validity – Bezdek, Pal - 1998
58 Semi-supervised clustering with user feedback – Cohn, Caruana, et al. - 2003
47 Step-wise clustering procedures – King - 1967
40 Semi-supervised clustering using genetic algorithms – Demiriz, Bennett, et al. - 1999
19 Comparing and unifying search-based and similarity-based approaches to semi-supervised clustering – Basu, Bilenko, et al. - 2003
5 Use of the adaptive fuzzy clustering algorithm to detect lines in digital images – Davé - 1989
3 Structure of hierarchic clusterings: implications for information retrieval and for multivariate data analysis – Murtagh - 1984
2 FCM: Fuzzy c-means algorithm (Computers & Geosciences) – Bezdek, Ehrlich, et al. - 1984
2 Robust clustering methods: A unified view – Davé, Krishnapuram - 1997
2 Some methods for classification and analysis of multivariate observations – MacQueen - 1967
1 Gaussian parsimonious clustering models – Celeux, Govaert - 1995
1 Two general geometric cluster validity indices – Frederix, Pauwels - 2004
1 Unsupervised robust clustering for image database categorization – Le Saux, Boujemaa - 2002