Results 1–10 of 70
DIFFRAC: a discriminative and flexible framework for clustering
In Advances in Neural Information Processing Systems 20, 2007
"... We present a novel linear clustering framework (DIFFRAC) which relies on a linear discriminative cost function and a convex relaxation of a combinatorial optimization problem. The large convex optimization problem is solved through a sequence of lower dimensional singular value decompositions. Thi ..."
Abstract

Cited by 54 (11 self)
We present a novel linear clustering framework (DIFFRAC) which relies on a linear discriminative cost function and a convex relaxation of a combinatorial optimization problem. The large convex optimization problem is solved through a sequence of lower-dimensional singular value decompositions. This framework has several attractive properties: (1) although apparently similar to K-means, it exhibits clustering performance superior to K-means, in particular in terms of robustness to noise. (2) It can be readily extended to non-linear clustering if the discriminative cost function is based on positive definite kernels, and can then be seen as an alternative to spectral clustering. (3) Prior information on the partition is easily incorporated, leading to state-of-the-art performance for semi-supervised learning, for clustering or classification. We present empirical evaluations of our algorithms on synthetic and real medium-scale datasets.
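The discriminative cost the abstract refers to can be made concrete with a small sketch: for a fixed candidate partition, ridge regression of centered one-hot cluster indicators onto centered features has a closed-form residual, and better partitions score lower. This is only an illustration of the kind of cost being relaxed, not the authors' convex relaxation or SVD-based solver; the function name and the value of λ are invented for the example.

```python
import numpy as np

def discriminative_clustering_cost(X, labels, lam=0.1):
    """Cost of a candidate partition: residual of ridge regression from
    centered features onto centered one-hot cluster indicators.
    (Illustrative sketch only; DIFFRAC relaxes the minimization of this
    kind of cost over all partitions, which is not reproduced here.)"""
    n, d = X.shape
    Y = np.eye(labels.max() + 1)[labels]      # one-hot indicators, n x k
    Xc = X - X.mean(axis=0)                   # centering absorbs the bias term
    Yc = Y - Y.mean(axis=0)
    # closed-form ridge solution W = (Xc'Xc + n*lam*I)^{-1} Xc'Yc
    W = np.linalg.solve(Xc.T @ Xc + n * lam * np.eye(d), Xc.T @ Yc)
    return np.sum((Yc - Xc @ W) ** 2) / n

# A labeling that matches two well-separated blobs costs less than a
# labeling that alternates within each blob (unpredictable from X).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
good = np.repeat([0, 1], 20)
bad = np.tile([0, 1], 20)
```

The well-aligned labeling is nearly a linear function of the features, so its residual is close to zero, while the alternating labeling is not.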
Learning to recognize activities from the wrong viewpoint
In ECCV 2008, Part I. LNCS, 2008
"... Abstract. Appearance features are good at discriminating activities in a fixed view, but behave poorly when aspect is changed. We describe a method to build features that are highly stable under change of aspect. It is not necessary to have multiple views to extract our features. Our features make i ..."
Abstract

Cited by 52 (2 self)
Appearance features are good at discriminating activities in a fixed view, but behave poorly when the aspect is changed. We describe a method to build features that are highly stable under change of aspect. It is not necessary to have multiple views to extract our features. Our features make it possible to learn a discriminative model of activity in one view, and spot that activity in another view, for which one might possess no labeled examples at all. Our construction uses labeled examples to build activity models, and unlabeled, but corresponding, examples to build an implicit model of how appearance changes with aspect. We demonstrate our method with challenging sequences of real human motion, where discriminative methods built on appearance alone fail badly.
Self-Paced Learning for Latent Variable Models
, 2010
"... Latent variable models are a powerful tool for addressing several tasks in machine learning. However, the algorithms for learning the parameters of latent variable models are prone to getting stuck in a bad local optimum. To alleviate this problem, we build on the intuition that, rather than conside ..."
Abstract

Cited by 51 (5 self)
Latent variable models are a powerful tool for addressing several tasks in machine learning. However, the algorithms for learning the parameters of latent variable models are prone to getting stuck in a bad local optimum. To alleviate this problem, we build on the intuition that, rather than considering all samples simultaneously, the algorithm should be presented with the training data in a meaningful order that facilitates learning. The order of the samples is determined by how easy they are. The main challenge is that often we are not provided with a readily computable measure of the easiness of samples. We address this issue by proposing a novel, iterative self-paced learning algorithm where each iteration simultaneously selects easy samples and learns a new parameter vector. The number of samples selected is governed by a weight that is annealed until the entire training data has been considered. We empirically demonstrate that the self-paced learning algorithm outperforms the state-of-the-art method for learning a latent structural SVM on four applications: object localization, noun phrase coreference, motif finding and handwritten digit recognition.
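The select-easy-then-refit loop with an annealed selection weight can be sketched in a few lines. This is a toy ridge-regression instantiation, not the latent structural SVM of the paper; the function name and all numeric settings (K, anneal rate, number of iterations) are invented for the demo.

```python
import numpy as np

def self_paced_ridge(X, y, lam=1e-3, K=1.0, anneal=0.5, iters=6):
    """Self-paced learning sketch: alternate between (1) selecting the
    'easy' samples, those whose loss under the current model is below
    1/K, and (2) refitting on that subset, while K is annealed so the
    threshold 1/K grows and eventually all samples can be included.
    (Toy ridge-regression instantiation, not the paper's latent SSVM.)"""
    n, d = X.shape
    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)  # warm start
    for _ in range(iters):
        loss = (X @ w - y) ** 2              # per-sample loss
        easy = loss < 1.0 / K                # self-paced sample selection
        if easy.sum() >= d:                  # refit on the easy subset
            w = np.linalg.solve(X[easy].T @ X[easy] + lam * np.eye(d),
                                X[easy].T @ y[easy])
        K *= anneal                          # anneal the selection weight
    return w

# Line y = 2x with a few gross outliers: the outliers never look 'easy',
# so the self-paced fit recovers the slope better than fitting all data.
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 3.0, 60)
y = 2.0 * x + rng.normal(0.0, 0.05, 60)
y[:6] += 10.0                                # contaminate six samples
X = x[:, None]
w_full = np.linalg.solve(X.T @ X + 1e-3, X.T @ y)
w_spl = self_paced_ridge(X, y)
```

Because the outliers' loss stays far above the growing threshold, they are never selected, which is the same curriculum effect the paper exploits for avoiding bad local optima.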
Tighter and convex maximum margin clustering
In AISTATS, 2009
"... Maximum margin principle has been successfully applied to many supervised and semisupervised problems in machine learning. Recently, this principle was extended for clustering, referred to as Maximum Margin Clustering (MMC) and achieved promising performance in recent studies. To avoid the problem ..."
Abstract

Cited by 41 (14 self)
The maximum margin principle has been successfully applied to many supervised and semi-supervised problems in machine learning. Recently, this principle was extended to clustering, referred to as Maximum Margin Clustering (MMC), and achieved promising performance in recent studies. To avoid the problem of local minima, MMC can be solved globally via convex semidefinite programming (SDP) relaxation. Although many efficient approaches have been proposed to alleviate the computational burden of SDP, convex MMCs are still not scalable for medium-sized data sets. In this paper, we propose a novel convex optimization method, LG-MMC, which maximizes the margin of opposite clusters via “Label Generation”. It can be shown that LG-MMC is much more scalable than existing convex approaches. Moreover, we show that our convex relaxation is tighter than state-of-the-art convex MMCs. Experiments on seventeen UCI datasets and the MNIST dataset show significant improvement over existing MMC algorithms.
Efficient Multi-Class Maximum Margin Clustering
"... This paper presents a cutting plane algorithm for multiclass maximum margin clustering (MMC). The proposed algorithm constructs a nested sequence of successively tighter relaxations of the original MMC problem, and each optimization problem in this sequence could be efficiently solved using the cons ..."
Abstract

Cited by 28 (7 self)
This paper presents a cutting plane algorithm for multi-class maximum margin clustering (MMC). The proposed algorithm constructs a nested sequence of successively tighter relaxations of the original MMC problem, and each optimization problem in this sequence can be efficiently solved using the constrained concave-convex procedure (CCCP). Experimental evaluations on several real-world datasets show that our algorithm converges much faster than existing MMC methods with guaranteed accuracy, and can thus handle much larger datasets efficiently.
Discriminative Clustering by Regularized Information Maximization
"... Is there a principled way to learn a probabilistic discriminative classifier from an unlabeled data set? We present a framework that simultaneously clusters the data and trains a discriminative classifier. We call it Regularized Information Maximization (RIM). RIM optimizes an intuitive information ..."
Abstract

Cited by 27 (1 self)
Is there a principled way to learn a probabilistic discriminative classifier from an unlabeled data set? We present a framework that simultaneously clusters the data and trains a discriminative classifier. We call it Regularized Information Maximization (RIM). RIM optimizes an intuitive information-theoretic objective function which balances class separation, class balance and classifier complexity. The approach can flexibly incorporate different likelihood functions, express prior assumptions about the relative size of different classes and incorporate partial labels for semi-supervised learning. In particular, we instantiate the framework as unsupervised, multi-class kernelized logistic regression. Our empirical evaluation indicates that RIM outperforms existing methods on several real data sets, and demonstrates that RIM is an effective model selection method.
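The information-theoretic objective described here (class balance minus prediction uncertainty, penalized by classifier complexity) can be written down directly for a plain linear softmax classifier. This is a sketch of the objective only, not the paper's kernelized model or its optimizer; the value of λ and the linear parameterization are choices for the demo.

```python
import numpy as np

def rim_objective(X, W, lam=0.01):
    """RIM-style objective (sketch): empirical mutual information between
    input and predicted label, H(p_bar) - mean_x H(p(y|x)), minus an l2
    complexity penalty on the weights. p(y|x) is a linear softmax; the
    non-kernelized model and lam are assumptions for this demo."""
    Z = X @ W
    Z = Z - Z.max(axis=1, keepdims=True)          # stabilized softmax
    P = np.exp(Z)
    P /= P.sum(axis=1, keepdims=True)
    H = lambda p: -np.sum(p * np.log(p + 1e-12), axis=-1)
    p_bar = P.mean(axis=0)                        # marginal label distribution
    mi = H(p_bar) - H(P).mean()                   # balance minus uncertainty
    return mi - lam * np.sum(W ** 2)

# A weight matrix that splits two blobs confidently and evenly scores
# higher than all-zero weights, whose uniform predictions carry no
# information about the input.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-2.0, 0.3, (25, 2)), rng.normal(2.0, 0.3, (25, 2))])
W_split = np.array([[2.0, -2.0], [0.0, 0.0]])     # separates the blobs
W_zero = np.zeros((2, 2))                          # uniform predictions
```

Maximizing this objective over W is what drives the clustering: confident (low conditional entropy) yet balanced (high marginal entropy) label assignments win.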
Efficient Maximum Margin Clustering via Cutting Plane Algorithm
"... Maximum margin clustering (MMC) is a recently proposed clustering method, which extends the theory of support vector machine to the unsupervised scenario and aims at finding the maximum margin hyperplane which separates the data from different classes. Traditionally, MMC is formulated as a nonconve ..."
Abstract

Cited by 18 (3 self)
Maximum margin clustering (MMC) is a recently proposed clustering method, which extends the theory of support vector machines to the unsupervised scenario and aims at finding the maximum margin hyperplane which separates the data from different classes. Traditionally, MMC is formulated as a non-convex integer programming problem and is thus difficult to solve. Several methods have been proposed in the literature to solve the MMC problem based on either semidefinite programming or alternating optimization. However, these methods are time-consuming when handling large-scale datasets and are therefore unsuitable for real-world applications. In this paper, we propose the cutting plane maximum margin clustering (CPMMC) algorithm to solve the MMC problem. Specifically, we construct a nested sequence of successively tighter relaxations of the original MMC problem, and each optimization problem in this sequence can be efficiently solved using the constrained concave-convex procedure (CCCP). Moreover, we prove theoretically that the CPMMC algorithm takes time O(sn) to converge with guaranteed accuracy, where n is the total number of samples in the dataset and s is the average number of nonzero features, i.e. the sparsity. Experimental evaluations on several real-world datasets show that CPMMC performs better than existing MMC methods, both in efficiency and accuracy.
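The MMC problem itself, searching over labelings for the one admitting the largest-margin separator, can be seen in miniature in one dimension, where a linear separator is a threshold and the margin is half the gap between the two clusters. This brute force is for intuition only; it is not the cutting-plane/CCCP algorithm above, and the minimum-cluster-size rule is a stand-in for the class-balance constraints MMC needs to avoid trivial solutions.

```python
import numpy as np

def mmc_1d(x, min_per_cluster=2):
    """Brute-force maximum margin clustering of 1-D points: a linear
    separator is just a threshold, so try every split of the sorted
    points and keep the one with the largest margin (half the gap
    between the clusters), subject to a minimum cluster size that
    rules out the trivial everything-in-one-cluster solution."""
    xs = np.sort(x)
    best_margin, best_thr = -np.inf, None
    for i in range(min_per_cluster, len(xs) - min_per_cluster + 1):
        margin = (xs[i] - xs[i - 1]) / 2.0    # left cluster gets i points
        if margin > best_margin:
            best_margin = margin
            best_thr = (xs[i] + xs[i - 1]) / 2.0
    labels = (x > best_thr).astype(int)
    return labels, best_margin

x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
labels, margin = mmc_1d(x)
```

In higher dimensions this search couples the labeling with a full SVM training problem, which is exactly the combinatorial explosion the relaxations in the paper are designed to tame.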
Multiple Kernel Clustering
"... Maximum margin clustering (MMC) has recently attracted considerable interests in both the data mining and machine learning communities. It first projects data samples to a kernelinduced feature space and then performs clustering by finding the maximum margin hyperplane over all possible cluster lab ..."
Abstract

Cited by 17 (1 self)
Maximum margin clustering (MMC) has recently attracted considerable interest in both the data mining and machine learning communities. It first projects data samples into a kernel-induced feature space and then performs clustering by finding the maximum margin hyperplane over all possible cluster labelings. As in other kernel methods, choosing a suitable kernel function is imperative to the success of maximum margin clustering. In this paper, we propose a multiple kernel clustering (MKC) algorithm that simultaneously finds the maximum margin hyperplane, the best cluster labeling, and the optimal kernel. Moreover, we provide a detailed analysis of the time complexity of the MKC algorithm and also extend multiple kernel clustering to the multi-class scenario. Experimental results on both toy and real-world data sets demonstrate the effectiveness and efficiency of the MKC algorithm.
Towards Making Unlabeled Data Never Hurt
"... It is usually expected that, when labeled data are limited, the learning performance can be improved by exploiting unlabeled data. In many cases, however, the performances of current semisupervised learning approaches may be even worse than purely using the limited labeled data. It is desired to ha ..."
Abstract

Cited by 17 (5 self)
It is usually expected that, when labeled data are limited, learning performance can be improved by exploiting unlabeled data. In many cases, however, the performance of current semi-supervised learning approaches may be even worse than purely using the limited labeled data. It is desirable to have safe semi-supervised learning approaches which never degrade learning performance by using unlabeled data. In this paper, we focus on semi-supervised support vector machines (S3VMs) and propose S4VMs, i.e., safe S3VMs. Unlike S3VMs, which typically aim at approaching an optimal low-density separator, S4VMs try to exploit the candidate low-density separators simultaneously to reduce the risk of identifying a poor separator with unlabeled data. We describe two implementations of S4VMs, and our comprehensive experiments show that the overall performance of S4VMs is highly competitive with S3VMs, while, in contrast to S3VMs, which degrade performance in many cases, S4VMs are never significantly inferior to inductive SVMs.
Spectral Embedded Clustering
"... In this paper, we propose a new spectral clustering method, referred to as Spectral Embedded Clustering (SEC), to minimize the normalized cut criterion in spectral clustering as well as control the mismatch between the cluster assignment matrix and the low dimensional embedded representation of the ..."
Abstract

Cited by 12 (9 self)
In this paper, we propose a new spectral clustering method, referred to as Spectral Embedded Clustering (SEC), to minimize the normalized cut criterion in spectral clustering as well as to control the mismatch between the cluster assignment matrix and the low-dimensional embedded representation of the data. SEC is based on the observation that the cluster assignment matrix of high-dimensional data can be represented by a low-dimensional linear mapping of the data. We also uncover the connection between SEC and other clustering methods, such as spectral clustering, clustering with local and global regularization, K-means and Discriminative K-means. Experiments on many real-world data sets show that SEC significantly outperforms existing spectral clustering methods as well as K-means-related clustering methods.
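SEC builds on the normalized-cut machinery of spectral clustering; a compact sketch of that backbone for two clusters (Gaussian affinities, symmetric normalized Laplacian, split on the second eigenvector) is below. SEC's own contribution, tying the assignment matrix to a low-dimensional linear mapping of the data, is not reproduced; the bandwidth sigma and the median split are choices made for this demo.

```python
import numpy as np

def spectral_bipartition(X, sigma=1.0):
    """Two-way normalized-cut spectral clustering (the backbone SEC
    extends): Gaussian affinity matrix, symmetric normalized Laplacian
    L = I - D^{-1/2} A D^{-1/2}, then split on the eigenvector of the
    second-smallest eigenvalue (the Fiedler vector)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. dists
    A = np.exp(-d2 / (2.0 * sigma ** 2))                  # Gaussian affinities
    np.fill_diagonal(A, 0.0)
    dinv = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(X)) - dinv[:, None] * A * dinv[None, :]
    _, vecs = np.linalg.eigh(L)           # eigh returns ascending eigenvalues
    fiedler = vecs[:, 1]                  # second-smallest eigenvector
    return (fiedler > np.median(fiedler)).astype(int)

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.0, 0.4, (20, 2)), rng.normal(4.0, 0.4, (20, 2))])
labels = spectral_bipartition(X)
```

SEC's observation is that for high-dimensional data this assignment can additionally be constrained to be (close to) a linear function of the features, which regularizes the embedding.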