Results 1–10 of 19
Learning spectral clustering, with application to speech separation
Journal of Machine Learning Research, 2006
Cited by 69 (6 self)
Abstract:
Spectral clustering refers to a class of techniques which rely on the eigenstructure of a similarity matrix to partition points into disjoint clusters, with points in the same cluster having high similarity and points in different clusters having low similarity. In this paper, we derive new cost functions for spectral clustering based on measures of error between a given partition and a solution of the spectral relaxation of a minimum normalized cut problem. Minimizing these cost functions with respect to the partition leads to new spectral clustering algorithms. Minimizing with respect to the similarity matrix leads to algorithms for learning the similarity matrix from fully labelled datasets. We apply our learning algorithm to the blind one-microphone speech separation problem, casting the problem as one of segmentation of the spectrogram.
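The spectral relaxation of the normalized cut that this abstract builds on can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: it uses a Gaussian similarity matrix (the bandwidth and toy data are arbitrary choices) and takes the sign of the second eigenvector of the normalized affinity matrix as a two-way partition.

```python
import numpy as np

def spectral_bipartition(X, sigma=1.0):
    """Two-way spectral clustering sketch: sign of the second eigenvector
    of the symmetrically normalized similarity matrix (the spectral
    relaxation of the minimum normalized cut)."""
    # Gaussian similarity matrix W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    # Normalized affinity D^{-1/2} W D^{-1/2}
    d = W.sum(axis=1)
    Wn = W / np.sqrt(np.outer(d, d))
    # The top eigenvector is the constant-like Perron vector;
    # the second-largest eigenvector changes sign across the cut
    _, vecs = np.linalg.eigh(Wn)
    return (vecs[:, -2] > 0).astype(int)

# Two well-separated 1-D clusters
X = np.array([[0.0], [0.1], [0.2], [3.0], [3.1], [3.2]])
labels = spectral_bipartition(X, sigma=1.0)
```

On data like this the second eigenvector takes opposite signs on the two groups, so thresholding at zero recovers them; learning the similarity matrix itself, as the paper does, replaces the fixed Gaussian kernel above.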
Clustering by weighted cuts in directed graphs
In Proceedings of the 2007 SIAM International Conference on Data Mining, 2007
Cited by 29 (1 self)
Abstract:
In this paper we formulate spectral clustering in directed graphs as an optimization problem, the objective being a weighted cut in the directed graph. This objective extends several popular criteria like the normalized cut and the averaged cut to asymmetric affinity data. We show that this problem can be relaxed to a Rayleigh quotient problem for a symmetric matrix obtained from the original affinities and therefore a large body of the results and algorithms developed for spectral clustering of symmetric data immediately extends to asymmetric cuts.
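A minimal sketch of the kind of reduction the abstract describes: the directed affinities are mapped to a symmetric matrix, after which the relaxed cut becomes an ordinary symmetric eigenproblem (a Rayleigh quotient minimization). The plain averaging (A + A^T)/2 used here is an illustrative choice, not the paper's exact weighted construction.

```python
import numpy as np

# Asymmetric (directed) affinity matrix: two strongly coupled pairs,
# weak links across. The values are arbitrary toy numbers.
A = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.8, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 0.9],
              [0.0, 0.1, 0.8, 0.0]])

H = 0.5 * (A + A.T)              # symmetric matrix built from the affinities
L = np.diag(H.sum(axis=1)) - H   # graph Laplacian of the symmetrized graph
# Minimizing the Rayleigh quotient x^T L x / x^T x over x orthogonal to
# the constant vector is solved by the second-smallest eigenvector,
# whose sign pattern gives the relaxed cut.
_, vecs = np.linalg.eigh(L)
labels = (vecs[:, 1] > 0).astype(int)
```

Once the problem is symmetric, any standard spectral clustering machinery applies, which is the point the abstract emphasizes.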
Model averaging and dimension selection for the singular value decomposition
Journal of the American Statistical Association, 2007
Cited by 17 (2 self)
Abstract:
Many multivariate data analysis techniques for an m × n matrix Y are related to the model Y = M + E, where Y is an m × n matrix of full rank and M is an unobserved mean matrix of rank K < (m ∧ n). Typically the rank of M is estimated in a heuristic way and then the least-squares estimate of M is obtained via the singular value decomposition of Y, yielding an estimate that can have a very high variance. In this paper we suggest a model-based alternative to the above approach by providing prior distributions and posterior estimation for the rank of M and the components of its singular value decomposition. In addition to providing more accurate inference, such an approach has the advantage of being extendable to more general data-analysis situations, such as inference in the presence of missing data and estimation in a generalized linear modeling framework.
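The heuristic baseline the abstract contrasts itself with (fix a rank K, then take the least-squares estimate of M via the truncated SVD, per Eckart-Young) can be sketched as follows, on toy data with the true rank assumed known:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate Y = M + E with a rank-2 mean matrix M and small noise E
U = rng.standard_normal((20, 2))
V = rng.standard_normal((15, 2))
M = U @ V.T
Y = M + 0.01 * rng.standard_normal((20, 15))

def svd_truncate(Y, K):
    """Least-squares rank-K estimate of the mean matrix (Eckart-Young):
    keep only the top K singular values/vectors of Y."""
    u, s, vt = np.linalg.svd(Y, full_matrices=False)
    return (u[:, :K] * s[:K]) @ vt[:K]

M_hat = svd_truncate(Y, K=2)
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
```

The paper's point is that K is rarely known and plugging in a heuristic estimate inflates variance, which motivates putting a prior on the rank and the SVD components instead.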
An Information Theoretic Approach to Machine Learning
2005
Cited by 9 (2 self)
Abstract:
In this thesis, theory and applications of machine learning systems based on information theoretic criteria as performance measures are studied. A new clustering algorithm based on maximizing the Cauchy-Schwarz (CS) divergence measure between probability density functions (pdfs) is proposed. The CS divergence is estimated nonparametrically using the Parzen window technique for density estimation. The problem domain is transformed from discrete 0/1 cluster membership values to continuous membership values. A constrained gradient descent maximization algorithm is implemented. The gradients are stochastically approximated to reduce computational complexity, making the algorithm more practical. Parzen window annealing is incorporated into the algorithm to help avoid convergence to a local maximum. The clustering results obtained on synthetic and real data are encouraging. The Parzen window-based estimator for the CS divergence is shown to have a dual expression as a measure of the cosine of the angle between cluster mean vectors in a feature space determined by the eigenspectrum of a Mercer kernel matrix. A spectral clustering …
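A hedged sketch of the Parzen-window estimator for the Cauchy-Schwarz divergence, D_CS(p, q) = -log(⟨p, q⟩² / (⟨p, p⟩⟨q, q⟩)), for 1-D samples. It uses Gaussian windows and the fact that the convolution of two Gaussians of variance sigma² is a Gaussian of variance 2·sigma², which makes the cross-integrals closed-form; the bandwidth and toy data are arbitrary choices, not the thesis's settings.

```python
import numpy as np

def cs_divergence(x, y, sigma=1.0):
    """Parzen-window estimate of the Cauchy-Schwarz divergence between
    the densities of 1-D samples x and y, using Gaussian windows."""
    def inner(a, b):
        # Estimate of the integral of p_a * p_b: average of Gaussian
        # kernels of variance 2*sigma^2 over all cross pairs
        s2 = 2.0 * sigma ** 2
        d2 = (a[:, None] - b[None, :]) ** 2
        return np.exp(-d2 / (2.0 * s2)).mean() / np.sqrt(2.0 * np.pi * s2)
    pq, pp, qq = inner(x, y), inner(x, x), inner(y, y)
    return -np.log(pq ** 2 / (pp * qq))

rng = np.random.default_rng(1)
near = cs_divergence(rng.normal(0, 1, 200), rng.normal(0, 1, 200))
far = cs_divergence(rng.normal(0, 1, 200), rng.normal(6, 1, 200))
```

The quantity pq² / (pp·qq) is exactly the squared cosine of the angle between the two kernel mean vectors, which is the dual expression the abstract mentions; by the Cauchy-Schwarz inequality the divergence is nonnegative.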
Information Theoretic Spectral Clustering
In Proceedings of the International Joint Conference on Neural Networks
Cited by 8 (3 self)
Abstract:
We discuss a new information-theoretic framework for spectral clustering that is founded on the recently introduced Information Cut. A novel spectral clustering algorithm is proposed, where the clustering solution is given as a linearly weighted combination of certain top eigenvectors of the data affinity matrix. The Information Cut provides us with a theoretically well-defined graph-spectral cost function, and also establishes a close link between spectral clustering and nonparametric density estimation. As a result, a natural criterion for creating the data affinity matrix is provided. We present preliminary clustering results to illustrate some of the properties of our algorithm, and we also make comparative remarks.
On Potts Model Clustering, Kernel K-Means and Density Estimation
Journal of Computational and Graphical Statistics, 2008
Cited by 4 (2 self)
Abstract:
… follow the same recipe: (i) choose a measure of similarity between observations; (ii) define a figure of merit assigning a large value to partitions of the data that put similar observations in the same cluster; and (iii) optimize this figure of merit over partitions. Potts model clustering represents an interesting variation on this recipe. Blatt, Wiseman, and Domany defined a new figure of merit for partitions that is formally similar to the Hamiltonian of the Potts model for ferromagnetism, extensively studied in statistical physics. For each temperature T, the Hamiltonian defines a distribution assigning a probability to each possible configuration of the physical system or, in the language of clustering, to each partition. Instead of searching for a single partition optimizing the Hamiltonian, they sampled a large number of partitions from this distribution for a range of temperatures. They proposed a heuristic for choosing an appropriate temperature and, from the sample of partitions associated with this chosen temperature, they then derived what we call a consensus clustering: two observations are put in the same consensus cluster if they belong to the same cluster in the majority of the random partitions. In a sense, the consensus clustering is an “average” of plausible …
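The consensus step described above can be sketched directly: count how often each pair of observations shares a cluster across the sampled partitions, then take connected components of the majority-vote relation. The sampled partitions below are hand-made toy label vectors, not draws from a Potts distribution.

```python
import numpy as np

# Sampled partitions as label vectors over 4 observations. Labels in the
# third partition are permuted relative to the first, which co-occurrence
# counting is insensitive to.
partitions = np.array([
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
])
n = partitions.shape[1]

# Fraction of partitions in which observations i and j share a cluster
co = (partitions[:, :, None] == partitions[:, None, :]).mean(axis=0)
same = co > 0.5   # majority vote

# Consensus clusters = connected components of the majority relation
labels = -np.ones(n, dtype=int)
cluster = 0
for i in range(n):
    if labels[i] == -1:
        stack, labels[i] = [i], cluster
        while stack:                        # flood-fill one component
            j = stack.pop()
            for k in np.flatnonzero(same[j] & (labels == -1)):
                labels[k] = cluster
                stack.append(k)
        cluster += 1
```

Working with co-occurrence of pairs rather than raw labels is what makes the average over partitions well defined despite arbitrary label permutations.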
Spectral clustering for speech separation
Cited by 2 (0 self)
Abstract:
Spectral clustering refers to a class of recent techniques which rely on the eigenstructure of a similarity matrix to partition points into disjoint clusters, with points in the same cluster having high similarity and points in different clusters having low similarity. In this chapter, we introduce the main concepts and algorithms together with recent advances in learning the similarity matrix from data. The techniques are illustrated on the blind one-microphone speech separation problem, by casting the problem as one of segmentation of the spectrogram.
The stability of a good clustering
2011
Abstract:
If we have found a “good” clustering C of a data set, can we prove that C is not far from the (unknown) best clustering C_opt of these data? Perhaps surprisingly, the answer to this question is sometimes yes. This paper proves spectral bounds on the distance d(C, C_opt) for the case when “goodness” is measured by a quadratic cost, such as the squared distortion of K-means clustering, or the Normalized Cut criterion of spectral clustering. The bounds exist if the data admits a “good”, low-cost clustering.
Unsupervised Learning of Face Detection Models from Unlabeled Image Streams
Abstract:
Modern artificial face detection shows impressive performance in a variety of application areas. This success comes at the cost of supervised training, using large-scale databases provided by human experts. In this paper, we propose a face detection system based on Organic Computing [vdM08] paradigms that acquires the necessary domain knowledge autonomously and learns a conceptual model of the human face/head region. Performance of the novel approach is experimentally compared to state-of-the-art face detection, yielding competitive results in scenarios of moderate complexity.
Segmentation of Images
Abstract:
If an image has been preprocessed appropriately to remove noise and artifacts, segmentation is often the key step in interpreting the image. Image segmentation is a process in which regions or features sharing similar characteristics are identified and grouped together. Image segmentation may use statistical classification, thresholding, edge detection, region detection, or any combination of these techniques. The output of the segmentation step is usually a set of classified elements. Segmentation techniques are either region-based or edge-based.
• Region-based techniques rely on common patterns in intensity values within a cluster of neighboring pixels. The cluster is referred to as the region, and the goal of the segmentation algorithm is to group regions according to their anatomical or functional roles.
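A minimal illustration of the region-based idea: classify pixels by an intensity threshold, then group foreground pixels into 4-connected regions. The toy image and threshold value are arbitrary.

```python
import numpy as np

# Toy grayscale image with two bright structures on a dark background
img = np.array([
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [7, 0, 0, 0],
    [7, 7, 0, 0],
], dtype=float)

mask = img > 5                         # classification by intensity threshold
H, W = img.shape
labels = np.zeros((H, W), dtype=int)   # 0 = background
region = 0
for start in zip(*np.nonzero(mask)):   # scan foreground pixels in raster order
    if labels[start] == 0:
        region += 1
        stack = [start]
        labels[start] = region
        while stack:                   # flood-fill one 4-connected region
            r, c = stack.pop()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < H and 0 <= nc < W and mask[nr, nc] and labels[nr, nc] == 0:
                    labels[nr, nc] = region
                    stack.append((nr, nc))
```

On this image the flood fill finds two regions: the bright block in the upper right and the L-shaped structure in the lower left. Edge-based methods would instead locate the intensity discontinuities between these regions and the background.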