Results 1-10 of 54
Weighted graph cuts without eigenvectors: A multilevel approach
IEEE Trans. Pattern Anal. Mach. Intell., 2007
Cited by 175 (22 self)
A variety of clustering algorithms have recently been proposed to handle data that is not linearly separable; spectral clustering and kernel k-means are two of the main methods. In this paper, we discuss an equivalence between the objective functions used in these seemingly different methods. In particular, a general weighted kernel k-means objective is mathematically equivalent to a weighted graph clustering objective. We exploit this equivalence to develop a fast, high-quality multilevel algorithm that directly optimizes various weighted graph clustering objectives, such as the popular ratio cut, normalized cut, and ratio association criteria. This eliminates the need for any eigenvector computation for graph clustering problems, which can be prohibitive for very large graphs. Previous multilevel graph partitioning methods such as Metis have suffered from the restriction of equal-sized clusters; our multilevel algorithm removes this restriction by using kernel k-means to optimize weighted graph cuts. Experimental results show that our multilevel algorithm outperforms a state-of-the-art spectral clustering algorithm in terms of speed, memory usage, and quality. We demonstrate that our algorithm is applicable to large-scale clustering tasks such as image segmentation, social network analysis, and gene network analysis.
Trajectory Clustering: A Partition-and-Group Framework
In SIGMOD, 2007
Cited by 168 (12 self)
Existing trajectory clustering algorithms group similar trajectories as a whole, thus discovering common trajectories. Our key observation is that clustering trajectories as a whole could miss common sub-trajectories. Discovering common sub-trajectories is very useful in many applications, especially if we have regions of special interest for analysis. In this paper, we propose a new partition-and-group framework for clustering trajectories, which partitions a trajectory into a set of line segments and then groups similar line segments together into a cluster. The primary advantage of this framework is to discover common sub-trajectories from a trajectory database. Based on this partition-and-group framework, we develop a trajectory clustering algorithm, TRACLUS. Our algorithm consists of two phases: partitioning and grouping. For the first phase, we present a formal trajectory partitioning algorithm using the minimum description length (MDL) principle. For the second phase, we present a density-based line-segment clustering algorithm. Experimental results demonstrate that TRACLUS correctly discovers common sub-trajectories from real trajectory data.
A survey of kernel and spectral methods for clustering
Pattern Recognit., 2008
Cited by 88 (5 self)
Clustering algorithms are a useful tool to explore data structures and have been employed in many disciplines. The focus of this paper is the partitioning clustering problem, with a special interest in two recent approaches: kernel and spectral methods. The aim of this paper is to present a survey of kernel and spectral clustering methods, two approaches able to produce nonlinear separating hypersurfaces between clusters. The presented kernel clustering methods are the kernel versions of many classical clustering algorithms, e.g., K-means, SOM and Neural Gas. Spectral clustering arises from concepts in spectral graph theory, and the clustering problem is configured as a graph cut problem where an appropriate objective function has to be optimized. An explicit proof of the fact that these two paradigms have the same objective is reported, since it has been proven that these two seemingly different approaches have the same mathematical foundation. In addition, fuzzy kernel clustering methods are presented as extensions of the kernel K-means clustering algorithm.
A unified view of kernel k-means, spectral clustering and graph cuts
2004
Cited by 73 (6 self)
Recently, a variety of clustering algorithms have been proposed to handle data that is not linearly separable. Spectral clustering and kernel k-means are two such methods that are seemingly quite different. In this paper, we show that a general weighted kernel k-means objective is mathematically equivalent to a weighted graph partitioning objective. Special cases of this graph partitioning objective include ratio cut, normalized cut and ratio association. Our equivalence has important consequences: the weighted kernel k-means algorithm may be used to directly optimize the graph partitioning objectives, and conversely, spectral methods may be used to optimize the weighted kernel k-means objective. Hence, in cases where eigenvector computation is prohibitive, we eliminate the need for any eigenvector computation for graph partitioning. Moreover, we show that the Kernighan-Lin objective can also be incorporated into our framework, leading to an incremental weighted kernel k-means algorithm for local optimization of the objective. We further discuss the issue of convergence of weighted kernel k-means for an arbitrary graph affinity matrix and provide a number of experimental results. These results show that non-spectral methods for graph partitioning are as effective as spectral methods and can be used for problems such as image segmentation in addition to data clustering.
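To make the weighted kernel k-means iteration mentioned in this abstract concrete, here is a minimal batch sketch in Python. It is an illustration only, not the authors' implementation: the function names are hypothetical, and the normalized-cut construction (degree weights with kernel σD⁻¹ + D⁻¹AD⁻¹) is recalled from the paper rather than stated in the abstract, so treat it as an assumption.

```python
import numpy as np

def weighted_kernel_kmeans(K, w, labels, n_clusters, max_iter=100):
    """Batch weighted kernel k-means on kernel matrix K with point weights w.

    Hypothetical helper: reassigns every point to the cluster whose weighted
    mean is closest in feature space, using only kernel entries.
    """
    K = np.asarray(K, dtype=float)
    w = np.asarray(w, dtype=float)
    labels = np.asarray(labels).copy()
    n = K.shape[0]
    for _ in range(max_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            wc = w[members]
            s = wc.sum()
            if s == 0:
                continue  # an empty cluster stays at infinite distance
            # ||phi(a_i) - m_c||^2
            #   = K_ii - 2 * sum_j w_j K_ij / s + sum_{j,l} w_j w_l K_jl / s^2
            second = K[:, members] @ wc / s
            third = wc @ K[np.ix_(members, members)] @ wc / s ** 2
            dist[:, c] = np.diag(K) - 2.0 * second + third
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

def normalized_cut_kernel(A, sigma=0.0):
    """Assumed construction: K = sigma * D^-1 + D^-1 A D^-1, weights = degrees."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)                 # node degrees, assumed strictly positive
    K = A / np.outer(d, d)            # D^-1 A D^-1
    K[np.diag_indices_from(K)] += sigma / d
    return K, d

# With a linear kernel and unit weights the update reduces to ordinary k-means.
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
result = weighted_kernel_kmeans(np.outer(x, x), np.ones(6), [0, 0, 0, 0, 1, 1], 2)
print(result.tolist())  # [0, 0, 0, 1, 1, 1]
```

For a graph, one would pass the pair returned by `normalized_cut_kernel(A, sigma)` as `K` and `w`; a sufficiently large `sigma` makes the kernel positive definite, which is the usual condition for the iteration to decrease the objective monotonically.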
Similarity-based Classification: Concepts and Algorithms
2008
Cited by 57 (3 self)
This report reviews and extends the field of similarity-based classification, presenting new analyses, algorithms, data sets, and the most comprehensive set of experimental results to date. Specifically, the generalizability of using similarities as features is analyzed, design goals and methods for weighting nearest-neighbors for similarity-based learning are proposed, and different methods for consistently converting similarities into kernels are compared. Experiments on eight real data sets compare eight approaches and their variants to similarity-based learning.
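As an illustration of what converting a similarity matrix into a kernel can look like, the sketch below shows three commonly used spectrum modifications (clip, flip and shift). The abstract does not name the methods the report compares, so these functions and their names are assumptions chosen for illustration.

```python
import numpy as np

def _symmetrize(S):
    """Return the symmetric part of S as a float array."""
    S = np.asarray(S, dtype=float)
    return 0.5 * (S + S.T)

def spectrum_clip(S):
    """Set negative eigenvalues to zero."""
    vals, vecs = np.linalg.eigh(_symmetrize(S))
    return (vecs * np.maximum(vals, 0.0)) @ vecs.T

def spectrum_flip(S):
    """Replace every eigenvalue by its absolute value."""
    vals, vecs = np.linalg.eigh(_symmetrize(S))
    return (vecs * np.abs(vals)) @ vecs.T

def spectrum_shift(S):
    """Add |lambda_min| to the diagonal when the smallest eigenvalue is negative."""
    S = _symmetrize(S)
    lam_min = np.linalg.eigvalsh(S).min()
    return S + max(0.0, -lam_min) * np.eye(S.shape[0])

# Example: an indefinite similarity matrix with eigenvalues 3 and -1.
S = np.array([[1.0, 2.0], [2.0, 1.0]])
for convert in (spectrum_clip, spectrum_flip, spectrum_shift):
    print(convert.__name__, np.linalg.eigvalsh(convert(S)).round(6))
```

All three return a positive semidefinite matrix; they differ in how much of the original similarity structure is altered, which is the kind of trade-off such a comparison would examine.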
Linear-Time Computation of Similarity Measures for Sequential Data
2008
Cited by 38 (24 self)
Efficient and expressive comparison of sequences is an essential procedure for learning with sequential data. In this article we propose a generic framework for computation of similarity measures for sequences, covering various kernel, distance and non-metric similarity functions. The basis for comparison is embedding of sequences using a formal language, such as a set of natural words, k-grams or all contiguous subsequences. As realizations of the framework we provide linear-time algorithms of different complexity and capabilities using sorted arrays, tries and suffix trees as underlying data structures. Experiments on data sets from bioinformatics, text processing and computer security illustrate the efficiency of the proposed algorithms, enabling peak performances of up to 10^6 pairwise comparisons per second. The utility of distances and non-metric similarity measures for sequences as alternatives to string kernels is demonstrated in applications of text categorization, network intrusion detection and transcription site recognition in DNA.
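The sorted-array idea can be sketched in a few lines of Python: each sequence is embedded as a sorted array of k-gram counts, and two embeddings are compared in a single merge pass. This is a simplified illustration with hypothetical function names, not the article's implementation; in particular, building the array with a comparison sort costs O(n log n), so only the comparison step is linear here.

```python
from collections import Counter

def kgram_counts(seq, k):
    """Embed a sequence as a sorted list of (k-gram, count) pairs."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return sorted(counts.items())

def kgram_kernel(a, b):
    """Dot product of two sorted k-gram count arrays via one merge pass."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        if a[i][0] == b[j][0]:
            total += a[i][1] * b[j][1]
            i += 1
            j += 1
        elif a[i][0] < b[j][0]:
            i += 1
        else:
            j += 1
    return total

def kgram_manhattan(a, b):
    """Manhattan distance between the same embeddings, again by merging."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        if a[i][0] == b[j][0]:
            total += abs(a[i][1] - b[j][1])
            i += 1
            j += 1
        elif a[i][0] < b[j][0]:
            total += a[i][1]
            i += 1
        else:
            total += b[j][1]
            j += 1
    # k-grams left over in either array have no counterpart in the other.
    total += sum(count for _, count in a[i:])
    total += sum(count for _, count in b[j:])
    return total

u = kgram_counts("abab", 2)   # [('ab', 2), ('ba', 1)]
v = kgram_counts("abba", 2)   # [('ab', 1), ('ba', 1), ('bb', 1)]
print(kgram_kernel(u, v), kgram_manhattan(u, v))  # 3 2
```

The same merge skeleton serves both a kernel and a distance, which mirrors the abstract's point that one framework can cover several kinds of similarity measure.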
Feature discovery in non-metric pairwise data
Journal of Machine Learning Research, 2004
Cited by 34 (3 self)
Pairwise proximity data, given as a similarity or dissimilarity matrix, can violate metricity. This occurs either due to noise, fallible estimates, or due to intrinsic non-metric features such as those arising from human judgments. So far the problem of non-metric pairwise data has been tackled by essentially omitting the negative eigenvalues or shifting the spectrum of the associated (pseudo-)covariance matrix for a subsequent embedding. However, little attention has been paid to the negative part of the spectrum itself. In particular, no answer was given to whether the directions associated with the negative eigenvalues would at all code variance other than noise-related variance. We show by a simple, exploratory analysis that the negative eigenvalues can code for relevant structure in the data, thus leading to the discovery of new features, which were lost by conventional data analysis techniques. The information hidden in the negative eigenvalue part of the spectrum is illustrated and discussed for three data sets, namely USPS handwritten digits, text-mining data and data from cognitive psychology.
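A rough sketch of how the negative part of the spectrum can be inspected is given below. It assumes the classical-scaling convention of double-centering the squared dissimilarities, which may differ in detail from the paper's construction; the function names are hypothetical.

```python
import numpy as np

def centered_pseudo_covariance(D):
    """C = -1/2 * Q * D^2 * Q with centering matrix Q = I - (1/n) * ones."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    Q = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * Q @ (D ** 2) @ Q

def split_spectrum(C, tol=1e-9):
    """Return embeddings built from the positive and the negative eigenvalues."""
    vals, vecs = np.linalg.eigh(C)
    pos = vals > tol
    neg = vals < -tol
    X_pos = vecs[:, pos] * np.sqrt(vals[pos])    # conventional embedding
    X_neg = vecs[:, neg] * np.sqrt(-vals[neg])   # part usually discarded
    return X_pos, X_neg

# Dissimilarities that violate the triangle inequality (5 > 1 + 1).
D = np.array([[0.0, 1.0, 5.0],
              [1.0, 0.0, 1.0],
              [5.0, 1.0, 0.0]])
X_pos, X_neg = split_spectrum(centered_pseudo_covariance(D))
print(X_pos.shape[1], "positive and", X_neg.shape[1], "negative directions")
```

Plotting or clustering the rows of `X_neg` separately is one way to check whether those directions carry structure rather than noise.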
Learning with distance substitution kernels
In Pattern Recognition: Proc. of the 26th DAGM Symposium, 2004
Cited by 31 (2 self)
During recent years much effort has been spent in incorporating problem-specific a priori knowledge into kernel methods for machine learning. A common example is a priori knowledge given by a distance measure between objects. A simple but effective approach for kernel construction consists of substituting the Euclidean distance in ordinary kernel functions by the problem-specific distance measure. We formalize this distance substitution procedure and investigate theoretical and empirical effects. In particular we state criteria for definiteness of the resulting kernels. We demonstrate the wide applicability by solving several classification tasks with SVMs. Regularization of the kernel matrices can additionally increase the recognition accuracy.
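The substitution idea can be illustrated with a Gaussian (RBF) kernel whose Euclidean distance is replaced by a user-supplied distance. The sketch below uses string edit distance as an example of a problem-specific distance; the function names and the choice of distance are assumptions for illustration, not taken from the paper.

```python
import math

def edit_distance(s, t):
    """Levenshtein distance, used here as an example problem-specific distance."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        cur = [i]
        for j, ct in enumerate(t, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution
        prev = cur
    return prev[-1]

def ds_rbf_kernel(x, y, dist, gamma=1.0):
    """RBF kernel with the Euclidean distance replaced by dist(x, y)."""
    return math.exp(-gamma * dist(x, y) ** 2)

def gram_matrix(items, dist, gamma=1.0):
    """Kernel matrix over a list of objects for the substituted kernel."""
    return [[ds_rbf_kernel(a, b, dist, gamma) for b in items] for a in items]

words = ["kitten", "sitting", "mitten"]
G = gram_matrix(words, edit_distance, gamma=0.1)
print(edit_distance("kitten", "sitting"))  # 3
```

For an arbitrary distance the resulting matrix `G` is not guaranteed to be positive semidefinite, which is why criteria for definiteness, as mentioned in the abstract, matter before handing such a kernel to an SVM.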
Learning with constrained and unlabeled data
In CVPR, 2005
Cited by 28 (3 self)
Classification problems arise abundantly in many computer vision tasks, whether supervised, semi-supervised or unsupervised in nature. Even when class labels are not available, a user still might favor certain grouping solutions over others. This bias can be expressed either by providing a clustering criterion or cost function and, in addition to that, by specifying pairwise constraints on the assignment of objects to classes. In this work, we discuss a unifying formulation for labelled and unlabelled data that can incorporate constrained data for model fitting. Our approach models the constraint information by the maximum entropy principle. This modeling strategy allows us (i) to handle constraint violations and soft constraints, and, at the same time, (ii) to speed up the optimization process. Experimental results on face classification and image segmentation indicate that the proposed algorithm is computationally efficient and generates superior groupings when compared with alternative techniques.
Training SVM with Indefinite Kernels
Cited by 23 (0 self)
Similarity matrices generated from many applications may not be positive semidefinite, and hence can’t fit into the kernel machine framework. In this paper, we study the problem of training support vector machines with an indefinite kernel. We consider a regularized SVM formulation, in which the indefinite kernel matrix is treated as a noisy observation of some unknown positive semidefinite one (proxy kernel) and the support vectors and the proxy kernel can be computed simultaneously. We propose a semi-infinite quadratically constrained linear program formulation for the optimization, which can be solved iteratively to find a global optimum solution. We further propose to employ an additional pruning strategy, which significantly improves the efficiency of the algorithm, while retaining the convergence property of the algorithm. In addition, we show the close relationship between the proposed formulation and multiple kernel learning. Experiments on a collection of benchmark data sets demonstrate the efficiency and effectiveness of the proposed algorithm.