Results 11-20 of 265
Learning spectral clustering, with application to speech separation
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2006
Cited by 70 (6 self)
Spectral clustering refers to a class of techniques which rely on the eigenstructure of a similarity matrix to partition points into disjoint clusters, with points in the same cluster having high similarity and points in different clusters having low similarity. In this paper, we derive new cost functions for spectral clustering based on measures of error between a given partition and a solution of the spectral relaxation of a minimum normalized cut problem. Minimizing these cost functions with respect to the partition leads to new spectral clustering algorithms. Minimizing with respect to the similarity matrix leads to algorithms for learning the similarity matrix from fully labelled datasets. We apply our learning algorithm to the blind one-microphone speech separation problem, casting the problem as one of segmentation of the spectrogram.
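The normalized cut objective that this abstract refers to can be illustrated with a minimal sketch. It is not the paper's cost function or learning algorithm; it only evaluates the standard normalized cut of a given partition on a small, hypothetical weight matrix `W`, assuming every cluster has positive total degree.

```python
def normalized_cut(W, labels):
    """Sum over clusters of cut(cluster, rest) / volume(cluster).

    W is a symmetric similarity matrix (list of lists); labels[i] is the
    cluster of point i. Assumes every cluster has positive volume.
    """
    n = len(W)
    degrees = [sum(row) for row in W]
    total = 0.0
    for c in set(labels):
        members = [i for i in range(n) if labels[i] == c]
        volume = sum(degrees[i] for i in members)
        # Weight of edges leaving the cluster.
        cut = sum(W[i][j] for i in members for j in range(n) if labels[j] != c)
        total += cut / volume
    return total


# Hypothetical example: two triangles joined by one weak edge (weight 0.1).
W = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
good = normalized_cut(W, [0, 0, 0, 1, 1, 1])  # splits at the weak edge
bad = normalized_cut(W, [0, 0, 1, 1, 1, 0])   # cuts through both triangles
print(good, bad)
```

The partition that separates the two triangles gets a much lower value than the one that cuts through them, which is the behavior the objective is meant to reward.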
Parallel Spectral Clustering in Distributed Systems
Cited by 63 (1 self)
Spectral clustering algorithms have been shown to be more effective in finding clusters than some traditional algorithms such as k-means. However, spectral clustering suffers from a scalability problem in both memory use and computational time when the size of a data set is large. To perform clustering on large data sets, we investigate two representative ways of approximating the dense similarity matrix. We compare one approach by sparsifying the matrix with another by the Nyström method. We then pick the strategy of sparsifying the matrix via retaining nearest neighbors and investigate its parallelization. We parallelize both memory use and computation on distributed computers. Through
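Sparsifying a similarity matrix by retaining nearest neighbors can be sketched on a single machine as follows. The Gaussian similarity, the `sigma` parameter, the symmetrization rule (keep an entry if either point selects the other), and the function name are assumptions made for illustration; the paper's distributed implementation is not reproduced here.

```python
import math


def sparse_knn_similarity(points, t, sigma=1.0):
    """Keep only each point's t most similar neighbors.

    Returns a list of dicts: S[i][j] is the similarity of i and j, stored
    only when j is among i's t nearest neighbors or i is among j's.
    """
    n = len(points)

    def sim(i, j):
        d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
        return math.exp(-d2 / (2.0 * sigma ** 2))

    S = [dict() for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: sim(i, j), reverse=True)
        for j in others[:t]:
            w = sim(i, j)
            S[i][j] = w
            S[j][i] = w  # symmetrize so the matrix stays symmetric
    return S


# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (0.1, 0), (0.2, 0), (5, 0), (5.1, 0), (5.2, 0)]
S = sparse_knn_similarity(points, t=2)
print(sum(len(row) for row in S), "stored entries instead of", len(points) ** 2)
```

This sketch still evaluates all pairwise similarities before discarding most of them, so it saves memory but not computation.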
Randomized Cuts for 3D Mesh Analysis
Cited by 60 (2 self)
The goal of this paper is to investigate a new shape analysis method based on randomized cuts of 3D surface meshes. The general strategy is to generate a random set of mesh segmentations and then to measure how often each edge of the mesh lies on a segmentation boundary in the randomized set. The resulting “partition function” defined on edges provides a continuous measure of where natural part boundaries occur in a mesh, and the set of “most consistent cuts” provides a stable list of global shape features. The paper describes methods for generating random distributions of mesh segmentations, studies sensitivity of the resulting partition functions to noise, tessellation, pose, and intraclass shape variations, and investigates applications in mesh visualization, segmentation, deformation, and registration.
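The edge "partition function" described above amounts to a frequency count, which the following minimal sketch illustrates. It assumes a simplified representation that is not taken from the paper: mesh faces are integer ids, each mesh edge is a pair of adjacent faces, and each segmentation is a list of face labels. The segmentations here are supplied by hand rather than generated randomly.

```python
def partition_function(edges, segmentations):
    """Fraction of segmentations in which each edge lies on a boundary.

    edges: list of (face_u, face_v) pairs for adjacent faces.
    segmentations: list of label lists, one label per face.
    """
    counts = {e: 0 for e in edges}
    for labels in segmentations:
        for (u, v) in edges:
            if labels[u] != labels[v]:  # edge separates two segments
                counts[(u, v)] += 1
    return {e: counts[e] / len(segmentations) for e in edges}


# Hypothetical strip of four faces with three shared edges.
edges = [(0, 1), (1, 2), (2, 3)]
segmentations = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]
pf = partition_function(edges, segmentations)
print(pf)
```

Edges with values near 1 are cut in almost every segmentation and so mark consistent boundaries; values near 0 mark edges that are rarely cut.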
Spectral Matting
, 2008
Cited by 60 (2 self)
We present spectral matting: a new approach to natural image matting that automatically computes a set of fundamental fuzzy matting components from the smallest eigenvectors of a suitably defined Laplacian matrix. Thus, our approach extends spectral segmentation techniques, whose goal is to extract hard segments, to the extraction of soft matting components. These components may then be used as building blocks to easily construct semantically meaningful foreground mattes, either in an unsupervised fashion, or based on a small amount of user input.
Fast Approximate Spectral Clustering
, 2009
Cited by 58 (1 self)
Spectral clustering refers to a flexible class of clustering procedures that can produce high-quality clusterings on small data sets but which has limited applicability to large-scale problems due to its computational complexity of O(n^3), with n the number of data points. We extend the range of spectral clustering by developing a general framework for fast approximate spectral clustering in which a distortion-minimizing local transformation is first applied to the data. This framework is based on a theoretical analysis that provides a statistical characterization of the effect of local distortion on the misclustering rate. We develop two concrete instances of our general framework, one based on local k-means clustering (KASP) and one based on random projection trees (RASP). Extensive experiments show that these algorithms can achieve significant speedups with little degradation in clustering accuracy. Specifically, our algorithms outperform k-means by a large margin in terms of accuracy, and run several times faster than approximate spectral clustering based on the Nyström method, with comparable accuracy and significantly smaller memory footprint. Remarkably, our algorithms make it possible for a single machine to spectral cluster data sets with a million observations within several minutes.
Data fusion and multicue data matching by diffusion maps
 IEEE Transactions on Pattern Analysis and Machine Intelligence
Cited by 57 (5 self)
Data fusion and multicue data matching are fundamental tasks of high-dimensional data analysis. In this paper, we apply the recently introduced diffusion framework to address these tasks. Our contribution is threefold: First, we present the Laplace-Beltrami approach for computing density-invariant embeddings which are essential for integrating different sources of data. Second, we describe a refinement of the Nyström extension algorithm called “geometric harmonics.” We also explain how to use this tool for data assimilation. Finally, we introduce a multicue data matching scheme based on nonlinear spectral graphs alignment. The effectiveness of the presented schemes is validated by applying them to the problems of lipreading and image sequence alignment. Index Terms: Pattern matching, graph theory, graph algorithms, Markov processes, machine learning, data mining, image databases.
Video behavior profiling for anomaly detection
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2008
Cited by 57 (10 self)
This paper aims to address the problem of modeling video behavior captured in surveillance videos for the applications of online normal behavior recognition and anomaly detection. A novel framework is developed for automatic behavior profiling and online anomaly sampling/detection without any manual labeling of the training data set. The framework consists of the following key components: 1) A compact and effective behavior representation method is developed based on discrete-scene event detection. The similarity between behavior patterns is measured based on modeling each pattern using a Dynamic Bayesian Network (DBN). 2) The natural grouping of behavior patterns is discovered through a novel spectral clustering algorithm with unsupervised model selection and feature selection on the eigenvectors of a normalized affinity matrix. 3) A composite generative behavior model is constructed that is capable of generalizing from a small training set to accommodate variations in unseen normal behavior patterns. 4) A runtime accumulative anomaly measure is introduced to detect abnormal behavior, whereas normal behavior patterns are recognized when sufficient visual evidence has become available based on an online Likelihood Ratio Test (LRT) method. This ensures robust and reliable anomaly detection and normal behavior recognition at the shortest possible time. The effectiveness and robustness of our approach are demonstrated through experiments using noisy and sparse data sets collected from both indoor and outdoor surveillance scenarios. In particular, it is shown that a behavior model trained using an unlabeled data set is superior to those trained using the same but labeled data set in detecting anomalies from an unseen video. The experiments also suggest that our online LRT-based behavior recognition approach is advantageous over the commonly used Maximum Likelihood (ML) method in differentiating ambiguities among different behavior classes observed online.
Untangling Cycles for Contour Grouping
Cited by 56 (11 self)
We introduce a novel topological formulation for contour grouping. Our grouping criterion, called untangling cycles, exploits the inherent topological 1D structure of salient contours to extract them from the otherwise 2D image clutter. To define a measure for topological classification robust to clutter and broken edges, we use a graph formulation instead of the standard computational topology. The key insight is that a pronounced 1D contour should have a clear ordering of edgels, to which all graph edges adhere, and no long-range entanglements persist. Finding the contour grouping by optimizing these topological criteria is challenging. We introduce a novel concept of circular embedding to encode this combinatorial task. Our solution leads to computing the dominant complex eigenvectors/eigenvalues of the random walk matrix of the contour grouping graph. We demonstrate major improvements over state-of-the-art approaches on challenging real images.
Multi-Label Image Segmentation for Medical Applications Based on Graph-Theoretic Electrical Potentials
 ECCV
, 2004
Cited by 48 (10 self)
A novel method is proposed for performing multi-label, semiautomated image segmentation. Given a small number of pixels with user-defined labels, one can analytically (and quickly) determine the probability that a random walker starting at each unlabeled pixel will first reach one of the prelabeled pixels. By assigning each pixel to the label for which the greatest probability is calculated, a high-quality image segmentation may be obtained. Theoretical properties of this algorithm are developed along with the corresponding connections to discrete potential theory and electrical circuits. This algorithm is formulated in discrete space (i.e., on a graph) using combinatorial analogues of standard operators and principles from continuous potential theory, allowing it to be applied in arbitrary dimension.
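The random-walker probabilities described above can be illustrated on a tiny graph. This is only a sketch under stated assumptions: the abstract describes an analytic solution, whereas the code below substitutes a simple Gauss-Seidel iteration that repeatedly replaces each unlabeled node's value with the weighted average of its neighbors. The weight matrix, seed placement, and function name are hypothetical, and every unlabeled node is assumed to have positive degree and a path to some seed.

```python
def random_walker(W, seeds, iterations=500):
    """Probability that a walker from each node first reaches each label.

    W: symmetric weight matrix with zero diagonal.
    seeds: dict mapping node index -> label for prelabeled nodes.
    Returns (probs, assignment), where probs[label][i] is the probability
    for node i and assignment[i] is the most probable label.
    """
    n = len(W)
    labels = sorted(set(seeds.values()))
    probs = {}
    for lab in labels:
        # Seeds of this label are fixed at 1, all other seeds at 0.
        x = [1.0 if seeds.get(i) == lab else 0.0 for i in range(n)]
        for _ in range(iterations):
            for i in range(n):
                if i in seeds:
                    continue
                # Harmonic condition: weighted mean of the neighbors.
                x[i] = sum(W[i][j] * x[j] for j in range(n)) / sum(W[i])
        probs[lab] = x
    assignment = [max(labels, key=lambda lab: probs[lab][i]) for i in range(n)]
    return probs, assignment


# Hypothetical chain of five pixels with a weak link between nodes 2 and 3.
W = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0.1, 0],
    [0, 0, 0.1, 0, 1],
    [0, 0, 0, 1, 0],
]
probs, assignment = random_walker(W, seeds={0: "A", 4: "B"})
print(assignment)
```

Treating the weights as electrical conductances, the chain has resistances 1, 1, 10, and 1 in series, so node 2 should reach "A" first with probability 11/13. The segmentation boundary therefore falls on the weak link.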
A fast kernel-based multilevel algorithm for graph clustering
 In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
, 2005
Cited by 45 (3 self)
Graph clustering (also called graph partitioning), that is, clustering the nodes of a graph, is an important problem in diverse data mining applications. Traditional approaches involve optimization of graph clustering objectives such as normalized cut or ratio association; spectral methods are widely used for these objectives, but they require eigenvector computation which can be slow. Recently, graph clustering with a general cut objective has been shown to be mathematically equivalent to an appropriate weighted kernel k-means objective function. In this paper, we exploit this equivalence to develop a very fast multilevel algorithm for graph clustering. Multilevel approaches involve coarsening, initial partitioning and refinement phases, all of which may be specialized to different graph clustering objectives. Unlike existing multilevel clustering approaches, such as METIS, our algorithm does not constrain the cluster sizes to be nearly equal. Our approach gives a theoretical guarantee that the refinement step decreases the graph cut objective under consideration. Experiments show that we achieve better final objective function values as compared to a state-of-the-art spectral clustering algorithm: on a series of benchmark test graphs with up to thirty thousand nodes and one million edges, our algorithm achieves lower normalized cut values in 67% of our experiments and higher ratio association values in 100% of our experiments. Furthermore, on large graphs, our algorithm is significantly faster than spectral methods. Finally, our algorithm requires far less memory than spectral methods; we cluster a 1.2 million node movie network into 5000 clusters, which due to memory requirements cannot be done directly with spectral methods.