Data Clustering: 50 Years Beyond K-means
2008
Cited by 294 (7 self)
Organizing data into sensible groupings is one of the most fundamental modes of understanding and learning. As an example, a common scheme of scientific classification puts organisms into taxonomic ranks: domain, kingdom, phylum, class, etc. Cluster analysis is the formal study of algorithms and methods for grouping, or clustering, objects according to measured or perceived intrinsic characteristics or similarity. Cluster analysis does not use category labels that tag objects with prior identifiers, i.e., class labels. The absence of category information distinguishes data clustering (unsupervised learning) from classification or discriminant analysis (supervised learning). Clustering is exploratory in nature: its aim is to find structure in data. Clustering has a long and rich history in a variety of scientific fields. One of the most popular and simple clustering algorithms, K-means, was first published in 1955. Although K-means was proposed over 50 years ago and thousands of clustering algorithms have been published since then, K-means is still widely used. This speaks to the difficulty of designing a general-purpose clustering algorithm and to the ill-posed nature of the clustering problem. We provide a brief overview of clustering, summarize well-known clustering methods, discuss the major challenges and key issues in designing clustering algorithms, and point out some of the emerging and useful research directions, including semi-supervised clustering, ensemble clustering, simultaneous feature selection during data clustering, and large-scale data clustering.
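As a reminder of what the K-means algorithm named in this abstract does, here is a minimal sketch of the standard Lloyd iteration (assign each point to its nearest center, then recompute each center as the mean of its points). The function names `sq_dist` and `kmeans`, the seeding from random data points, and the toy data are illustrative assumptions, not taken from the paper.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's iteration; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: seed with k distinct data points
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # no assignment changed, so we have converged
            break
        labels = new_labels
        # Update step: each center becomes the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's center where it was
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Two well-separated groups of three points each.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(pts, k=2)
print(labels, centers)
```

Which group receives label 0 depends on the random seeding, so only the partition itself, not the label ids, should be compared across runs.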

Weighted graph cuts without eigenvectors: A multilevel approach
IEEE Trans. Pattern Anal. Mach. Intell., 2007
Cited by 175 (22 self)
A variety of clustering algorithms have recently been proposed to handle data that is not linearly separable; spectral clustering and kernel k-means are two of the main methods. In this paper, we discuss an equivalence between the objective functions used in these seemingly different methods; in particular, a general weighted kernel k-means objective is mathematically equivalent to a weighted graph clustering objective. We exploit this equivalence to develop a fast, high-quality multilevel algorithm that directly optimizes various weighted graph clustering objectives, such as the popular ratio cut, normalized cut, and ratio association criteria. This eliminates the need for any eigenvector computation for graph clustering problems, which can be prohibitive for very large graphs. Previous multilevel graph partitioning methods such as Metis have suffered from the restriction of equal-sized clusters; our multilevel algorithm removes this restriction by using kernel k-means to optimize weighted graph cuts. Experimental results show that our multilevel algorithm outperforms a state-of-the-art spectral clustering algorithm in terms of speed, memory usage, and quality. We demonstrate that our algorithm is applicable to large-scale clustering tasks such as image segmentation, social network analysis, and gene network analysis.
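The abstract states the equivalence without formulas. The following is a sketch of how it is usually written for the normalized cut case; the notation is assumed here and not taken from the abstract ($A$ is the graph affinity matrix, $D$ the diagonal degree matrix with entries $d_i$, $\phi$ the feature map of the kernel matrix $K$, and $\sigma$ a diagonal shift chosen large enough to make $K$ positive semidefinite).

```latex
% Weighted kernel k-means objective over clusters \pi_1, ..., \pi_k
\[
J\bigl(\{\pi_c\}_{c=1}^{k}\bigr)
  = \sum_{c=1}^{k} \sum_{a_i \in \pi_c} w_i \,\lVert \phi(a_i) - m_c \rVert^2,
\qquad
m_c = \frac{\sum_{a_i \in \pi_c} w_i \,\phi(a_i)}{\sum_{a_i \in \pi_c} w_i}.
\]

% Normalized cut of a graph partition V_1, ..., V_k
\[
\mathrm{NCut}(V_1,\dots,V_k)
  = \sum_{c=1}^{k} \frac{\mathrm{links}(V_c,\, V \setminus V_c)}{\mathrm{degree}(V_c)}.
\]

% Choosing node weights and kernel as below makes J differ from NCut
% only by an additive constant that does not depend on the partition.
\[
w_i = d_i,
\qquad
K = \sigma D^{-1} + D^{-1} A D^{-1}
\;\;\Longrightarrow\;\;
J = \mathrm{NCut} + \text{const}.
\]
```

Under these assumptions, any procedure that lowers $J$ (such as kernel k-means iterations) also lowers the normalized cut, which is why no eigenvector computation is needed.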

Iterative quantization: A Procrustean approach to learning binary codes
In Proc. of the IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR), 2011
Cited by 157 (6 self)
This paper addresses the problem of learning similarity-preserving binary codes for efficient retrieval in large-scale image collections. We propose a simple and efficient alternating minimization scheme for finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube. This method, dubbed iterative quantization (ITQ), has connections to multi-class spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and supervised embeddings such as canonical correlation analysis (CCA). Our experiments show that the resulting binary coding schemes decisively outperform several other state-of-the-art methods.
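The alternating scheme described above can be illustrated in a deliberately reduced setting. The sketch below assumes 2-D zero-centered data and 2-bit codes, and it restricts the search to proper rotations so that the Procrustes step has a closed form via `atan2`; the general method would instead solve an orthogonal Procrustes problem with an SVD. The name `itq_2d`, the identity starting rotation, and the toy data are assumptions for illustration only.

```python
import math


def sign(x):
    """Map a real number to a hypercube coordinate (+1 or -1)."""
    return 1.0 if x >= 0 else -1.0


def rotate(V, theta):
    """Apply R = [[c, -s], [s, c]] on the right of each row vector in V."""
    c, s = math.cos(theta), math.sin(theta)
    return [(x * c + y * s, -x * s + y * c) for x, y in V]


def itq_2d(V, iters=20, theta=0.0):
    """Alternate between binarizing V R and re-fitting the rotation angle."""
    errors = []
    for _ in range(iters):
        # Step 1: fix R, set B = sign(V R), the nearest hypercube vertices.
        Z = rotate(V, theta)
        B = [(sign(zx), sign(zy)) for zx, zy in Z]
        errors.append(
            sum((bx - zx) ** 2 + (by - zy) ** 2 for (bx, by), (zx, zy) in zip(B, Z))
        )
        # Step 2: fix B, pick the rotation maximizing trace(R^T M), M = V^T B.
        m00 = sum(v[0] * b[0] for v, b in zip(V, B))
        m01 = sum(v[0] * b[1] for v, b in zip(V, B))
        m10 = sum(v[1] * b[0] for v, b in zip(V, B))
        m11 = sum(v[1] * b[1] for v, b in zip(V, B))
        new_theta = math.atan2(m10 - m01, m00 + m11)
        if abs(new_theta - theta) < 1e-12:
            break
        theta = new_theta
    codes = [(sign(zx), sign(zy)) for zx, zy in rotate(V, theta)]
    return codes, theta, errors


# Zero-centered toy data stretched along the diagonal.
V = [(2.0, 1.9), (-2.0, -2.1), (1.5, 1.6), (-1.5, -1.4), (0.2, -0.1), (-0.2, 0.1)]
codes, theta, errors = itq_2d(V)
print(codes, theta, errors)
```

Because each step minimizes the same quantization error over one variable while the other is held fixed, the recorded errors should never increase from one iteration to the next.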

Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators
In Advances in Neural Information Processing Systems 18, 2005
Cited by 110 (14 self)
This paper presents a diffusion-based probabilistic interpretation of spectral clustering and dimensionality reduction algorithms that use the eigenvectors of the normalized graph Laplacian. Given the pairwise adjacency matrix of all points, we define a diffusion distance between any two data points and show that the low-dimensional representation of the data by the first few eigenvectors of the corresponding Markov matrix is optimal under a certain mean squared error criterion. Furthermore, assuming that data points are random samples from a density p(x) = e^{-U(x)}, we identify these eigenvectors as discrete approximations of eigenfunctions of a Fokker-Planck operator in a potential 2U(x) with reflecting boundary conditions. Finally, applying known results regarding the eigenvalues and eigenfunctions of the continuous Fokker-Planck operator, we provide a mathematical justification for the success of spectral clustering and dimensionality reduction algorithms based on these first few eigenvectors. This analysis elucidates, in terms of the characteristics of diffusion processes, many empirical findings regarding spectral clustering algorithms.
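To make the diffusion distance concrete, the sketch below computes it directly from a small affinity matrix rather than through eigenvectors. It assumes the usual definition, which the abstract does not spell out: the Markov matrix is M = D^{-1} W, and the squared distance at time t is the sum over z of (p_t(x, z) - p_t(y, z))^2 / phi0(z), where phi0 is the stationary distribution. All function names and the example graph are illustrative.

```python
def markov_matrix(W):
    """Row-normalize a symmetric affinity matrix W into M = D^{-1} W."""
    return [[w / sum(row) for w in row] for row in W]


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matrix_power(M, t):
    """M raised to a non-negative integer power t."""
    n = len(M)
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        P = matmul(P, M)
    return P


def diffusion_distance_sq(W, x, y, t):
    """Squared diffusion distance between nodes x and y after t steps."""
    degrees = [sum(row) for row in W]
    total = sum(degrees)
    phi0 = [d / total for d in degrees]  # stationary distribution of the walk
    P = matrix_power(markov_matrix(W), t)
    return sum((P[x][z] - P[y][z]) ** 2 / phi0[z] for z in range(len(W)))


# Two triangles (nodes 0-2 and 3-5) joined by one weak edge between 2 and 3.
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
print(diffusion_distance_sq(W, 0, 1, t=3))  # within one triangle
print(diffusion_distance_sq(W, 0, 3, t=3))  # across the weak edge
```

Nodes in the same triangle should come out much closer than nodes separated by the weak edge, which is the behaviour that makes this distance useful for clustering.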

Semi-supervised graph clustering: a kernel approach
2008
Cited by 94 (3 self)
Semi-supervised clustering algorithms aim to improve clustering results using limited supervision. The supervision is generally given as pairwise constraints; such constraints are natural for graphs, yet most semi-supervised clustering algorithms are designed for data represented as vectors. In this paper, we unify vector-based and graph-based approaches. We first show that a recently proposed objective function for semi-supervised clustering based on Hidden Markov Random Fields, with squared Euclidean distance and a certain class of constraint penalty functions, can be expressed as a special case of the weighted kernel k-means objective (Dhillon et al., in Proceedings of the 10th International Conference on Knowledge Discovery and Data Mining, 2004a). A recent theoretical connection between weighted kernel k-means and several graph clustering objectives enables us to perform semi-supervised clustering of data given either as vectors or as a graph. For graph data, this result leads to algorithms for optimizing several new semi-supervised graph clustering objectives. For vector data, the kernel approach also enables us to find clusters with nonlinear boundaries in the input data space. Furthermore, we show that recent work on spectral learning (Kamvar et al., in Proceedings of the 17th International Joint Conference on Artificial Intelligence, 2003) may be viewed as a special case of our formulation. We empirically show that our algorithm is able to outperform current state-of-the-art semi-supervised algorithms on both vector-based and graph-based data sets.
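One plausible way to feed pairwise constraints into a kernel method, assumed here rather than taken from the abstract, is to add a reward to the similarity of each must-link pair, subtract a penalty from each cannot-link pair, and shift the diagonal by a constant. The helper name `constrained_kernel`, the uniform constraint weight, and the shift `sigma` are all hypothetical choices for this sketch.

```python
def constrained_kernel(S, must_link, cannot_link, weight=1.0, sigma=0.0):
    """Return a copy of similarity matrix S adjusted by pairwise constraints.

    Assumed construction: K = sigma * I + S + C, where C holds +weight for
    must-link pairs and -weight for cannot-link pairs (symmetrically).
    """
    K = [row[:] for row in S]  # copy so the caller's matrix is untouched
    for i, j in must_link:
        K[i][j] += weight
        K[j][i] += weight
    for i, j in cannot_link:
        K[i][j] -= weight
        K[j][i] -= weight
    for i in range(len(K)):
        K[i][i] += sigma  # diagonal shift, e.g. to push K toward positive semidefinite
    return K


# Inner-product similarities of three 1-D points: 1.0, 1.1, -1.0.
xs = [1.0, 1.1, -1.0]
S = [[a * b for b in xs] for a in xs]
K = constrained_kernel(S, must_link=[(0, 1)], cannot_link=[(0, 2)], weight=0.5, sigma=2.0)
print(K)
```

The resulting matrix can then be handed to any kernel clustering routine; how large `weight` and `sigma` should be is a tuning question this sketch does not address.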

A survey of kernel and spectral methods for clustering
Pattern Recognit., 2008
Cited by 88 (5 self)
Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem, with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM and Neural Gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem in which an appropriate objective function has to be optimized. Because these two seemingly different approaches have been proven to share the same mathematical foundation, an explicit proof that the two paradigms have the same objective is reported. In addition, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm.

A unified view of kernel k-means, spectral clustering and graph cuts
2004
Cited by 73 (6 self)
Recently, a variety of clustering algorithms have been proposed to handle data that is not linearly separable. Spectral clustering and kernel k-means are two such methods that are seemingly quite different. In this paper, we show that a general weighted kernel k-means objective is mathematically equivalent to a weighted graph partitioning objective. Special cases of this graph partitioning objective include ratio cut, normalized cut and ratio association. Our equivalence has important consequences: the weighted kernel k-means algorithm may be used to directly optimize the graph partitioning objectives, and conversely, spectral methods may be used to optimize the weighted kernel k-means objective. Hence, in cases where eigenvector computation is prohibitive, we eliminate the need for any eigenvector computation for graph partitioning. Moreover, we show that the Kernighan-Lin objective can also be incorporated into our framework, leading to an incremental weighted kernel k-means algorithm for local optimization of the objective. We further discuss the issue of convergence of weighted kernel k-means for an arbitrary graph affinity matrix and provide a number of experimental results. These results show that non-spectral methods for graph partitioning are as effective as spectral methods and can be used for problems such as image segmentation in addition to data clustering.
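A small sketch may help show how kernel k-means can act on a graph with no eigenvector computation. It uses the batch weighted kernel k-means update, in which the distance from point i to cluster c is K_ii - 2 * sum_j w_j K_ij / s_c + sum_{j,l} w_j w_l K_jl / s_c^2 with s_c the total weight of c. The ratio association setting (unit weights, K = sigma * I + A) is assumed here as the simplest special case; the function names, the value of sigma, and the example graph are illustrative, not the paper's code.

```python
def ratio_association_kernel(A, sigma):
    """Assumed kernel for ratio association: K = sigma * I + A."""
    n = len(A)
    return [[A[i][j] + (sigma if i == j else 0.0) for j in range(n)] for i in range(n)]


def weighted_kernel_kmeans(K, w, labels, k, iters=50):
    """Batch weighted kernel k-means starting from the given labels."""
    n = len(K)
    labels = list(labels)
    for _ in range(iters):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        mass = [sum(w[i] for i in m) for m in members]
        # Cluster self-similarity term, shared by every point compared to c.
        self_sim = [
            sum(w[j] * w[l] * K[j][l] for j in m for l in m) / mass[c] ** 2 if m else 0.0
            for c, m in enumerate(members)
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c in range(k):
                if not members[c]:
                    continue  # skip empty clusters
                cross = sum(w[j] * K[i][j] for j in members[c]) / mass[c]
                d = K[i][i] - 2.0 * cross + self_sim[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Two triangles (nodes 0-2 and 3-5) joined by one weak edge between 2 and 3.
A = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
K = ratio_association_kernel(A, sigma=2.0)
start = [0, 0, 1, 1, 1, 1]  # node 2 starts in the wrong cluster
print(weighted_kernel_kmeans(K, [1.0] * 6, start, k=2))  # [0, 0, 0, 1, 1, 1]
```

A larger `sigma` keeps the kernel positive semidefinite but also makes points more reluctant to leave their current cluster, so the shift is a trade-off rather than a free parameter.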