Results 11–20 of 27
Maximum volume clustering
 In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS)
, 2011
"... The large volume principle proposed by Vladimir Vapnik, which advocates that hypotheses lying in an equivalence class with a larger volume are preferable, is a useful alternative to the large margin principle. In this paper, we introduce a clustering model based on the large volume principle c ..."
Abstract

Cited by 2 (1 self)
The large volume principle proposed by Vladimir Vapnik, which advocates that hypotheses lying in an equivalence class with a larger volume are preferable, is a useful alternative to the large margin principle. In this paper, we introduce a clustering model based on the large volume principle called maximum volume clustering (MVC), and propose two algorithms to solve it approximately: a soft-label and a hard-label MVC algorithm based on sequential quadratic programming and semidefinite programming, respectively. Our MVC model includes spectral clustering and maximum margin clustering as special cases, and is substantially more general. We also establish the finite sample stability and an error bound for the soft-label MVC method. Experiments show that the proposed MVC approach compares favorably with state-of-the-art clustering algorithms.
The utility of cognitive plausibility in language acquisition modeling: Evidence from word segmentation
 Manuscript
, 2014
"... Abstract The informativity of a computational model of language acquisition is directly related to how closely it approximates the actual acquisition task, sometimes referred to as the model's cognitive plausibility. We suggest that though every computational model necessarily idealizes the mo ..."
Abstract

Cited by 2 (1 self)
The informativity of a computational model of language acquisition is directly related to how closely it approximates the actual acquisition task, sometimes referred to as the model's cognitive plausibility. We suggest that though every computational model necessarily idealizes the modeled task, an informative language acquisition model can aim to be cognitively plausible in multiple ways. We discuss these cognitive plausibility checkpoints in general terms, and then apply them to a case study in word segmentation, investigating a promising Bayesian segmentation strategy. We create a more cognitively plausible model of this learning strategy that uses an age-appropriate unit of perceptual representation, evaluates the model output in terms of its utility, and incorporates cognitive constraints into the inference process. Our more cognitively plausible model of the Bayesian word segmentation strategy not only yields better performance than previous implementations but also shows more strongly the beneficial effect of cognitive constraints on segmentation. One interpretation of this effect is as a synergy between the naive theories of language structure that infants may have and the cognitive constraints that limit the fidelity of their inference processes, where less accurate inference approximations are better when the underlying assumptions about how words are generated are less accurate. More generally, these results highlight the utility of incorporating cognitive plausibility more fully into computational models of language acquisition.
Which Distance Metric is Right: An Evolutionary K-Means View
"... It is well known that the distance metric plays an important role in the clustering process. Indeed, many clustering problems can be treated as an optimization problem of a criterion function defined over one distance metric. While many distance metrics have been developed, it is not clear how ..."
Abstract

Cited by 1 (0 self)
It is well known that the distance metric plays an important role in the clustering process. Indeed, many clustering problems can be treated as an optimization problem of a criterion function defined over one distance metric. While many distance metrics have been developed, it is not clear how these distance metrics impact the clustering/optimization process. To that end, in this paper, we study the impact of a set of popular cosine-based distance metrics on K-means clustering. Specifically, by revealing the common order-preserving property, we first show that K-means has exactly the same cluster assignment for these metrics during the E-step. Next, by both theoretical and empirical studies, we prove that the cluster centroid is a good approximator of their respective optimal centers in the M-step. As such, we identify a problem with K-means: it cannot differentiate these metrics. To explore the nature of these metrics, we propose an evolutionary K-means framework that integrates K-means and genetic algorithms. This framework not only enables inspection of arbitrary distance metrics, but also can be used to investigate different formulations of the optimization problem. Finally, this framework is used in extensive experiments on real-world data sets. The results validate our theoretical findings on the characteristics and interrelationships of these metrics. Most importantly, this paper furthers our understanding of the impact of the distance metrics on the optimization process of K-means.
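The order-preserving property this abstract refers to can be illustrated with a small sketch (an illustrative example, not the paper's code): for L2-normalized vectors, squared Euclidean distance equals twice the cosine distance, so it is a monotone function of it, and the nearest-centroid (E-step) assignment is identical under both metrics.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))   # data points
C = rng.normal(size=(3, 8))    # centroids

# L2-normalize points and centroids onto the unit sphere.
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
Cn = C / np.linalg.norm(C, axis=1, keepdims=True)

# Cosine distance: 1 - x.c ; squared Euclidean on the sphere: 2 - 2 x.c.
cos_dist = 1.0 - Xn @ Cn.T
euc_dist = ((Xn[:, None, :] - Cn[None, :, :]) ** 2).sum(-1)

# Both metrics induce the same nearest-centroid (E-step) assignment.
assert np.array_equal(cos_dist.argmin(axis=1), euc_dist.argmin(axis=1))
print("identical E-step assignments")
```

This is exactly why vanilla K-means cannot differentiate such metrics, which motivates the evolutionary framework described in the abstract.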
Recent developments in clustering algorithms
 Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
, 2012
"... Abstract. In this paper, we give a short review of recent developments in clustering. We briefly summarize important clustering paradigms before addressing important topics including metric adaptation in clustering, dealing with non-Euclidean data or large data sets, clustering evaluation, and lear ..."
Abstract

Cited by 1 (0 self)
In this paper, we give a short review of recent developments in clustering. We briefly summarize important clustering paradigms before addressing important topics including metric adaptation in clustering, dealing with non-Euclidean data or large data sets, clustering evaluation, and learning-theoretical foundations.
Open-Box Spectral Clustering: Applications to Medical Image Analysis
"... (a) Our system for interactive open-box spectral clustering. (b) Results from transferring the user actions performed on the subject shown in (a) to ten other subjects. Fig. 1. Adapting spectral clustering to specific image analysis problems involves tuning parameters and explori ..."
Abstract
Fig. 1. Adapting spectral clustering to specific image analysis problems involves tuning parameters and exploring hierarchical clustering strategies, different graph types and spectral embeddings. We present a Visual Analytics system that supports this process and finally outputs rules that can be successfully applied to similar data. (a) Our system for interactive open-box spectral clustering. (b) Results from transferring the user actions performed on the subject shown in (a) to ten other subjects. Abstract—Spectral clustering is a powerful and versatile technique, whose broad range of applications includes 3D image analysis. However, its practical use often involves a tedious and time-consuming process of tuning parameters and making application-specific choices. In the absence of training data with labeled clusters, help from a human analyst is required to decide the number of clusters, to determine whether hierarchical clustering is needed, and to define the appropriate distance measures, parameters of the underlying graph, and type of graph Laplacian. We propose to simplify this process via an open-box approach, in which an interactive system visualizes the involved mathematical quantities, suggests parameter values, and provides immediate feedback to support the required decisions. Our framework focuses ...
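The choices this abstract enumerates (graph construction, Laplacian type, embedding dimension) can be seen in a minimal spectral-embedding pipeline. This is an illustrative sketch under assumed defaults (Gaussian affinities, symmetric normalized Laplacian); the function and parameter names are not the authors' system:

```python
import numpy as np

def spectral_embedding(X, k, sigma=1.0, normalized=True):
    """Build a Gaussian similarity graph and return the k-dim spectral embedding."""
    # Pairwise squared distances -> Gaussian (RBF) affinities.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    deg = W.sum(axis=1)
    L = np.diag(deg) - W                      # unnormalized graph Laplacian
    if normalized:                            # symmetric normalized Laplacian
        d_inv_sqrt = 1.0 / np.sqrt(deg)
        L = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    # Eigenvectors of the k smallest eigenvalues carry the cluster structure.
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, :k]

# Two well-separated blobs: the second eigenvector separates them cleanly.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
emb = spectral_embedding(X, k=2)
labels = (emb[:, 1] > np.median(emb[:, 1])).astype(int)
print(labels)
```

Each hedged default here (sigma, normalized vs. unnormalized Laplacian, k) is one of the decisions the open-box system is designed to expose interactively rather than hard-code.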
ICML 2011 Workshop on Unsupervised and Transfer Learning
"... We organized a data mining challenge in “unsupervised and transfer learning” (the UTL challenge) followed by a workshop of the same name at the ICML 2011 conference in Bellevue, Washington. This introduction presents the highlights of the outstanding contributions that were made, which are regrou ..."
Abstract
We organized a data mining challenge in “unsupervised and transfer learning” (the UTL challenge) followed by a workshop of the same name at the ICML 2011 conference in Bellevue, Washington. This introduction presents the highlights of the outstanding contributions that were made, which are regrouped in this issue of JMLR W&CP. Novel methodologies emerged to capitalize on large volumes of unlabeled data from tasks related to (but different from) a target task, including a method to learn data kernels (similarity measures) and new deep architectures for feature learning.
Maximum Volume Clustering
"... The large volume principle proposed by Vladimir Vapnik, which advocates that hypotheses lying in an equivalence class with a larger volume are preferable, is a useful alternative to the large margin principle. In this paper, we introduce a clustering model based on the large volume principle ca ..."
Abstract
The large volume principle proposed by Vladimir Vapnik, which advocates that hypotheses lying in an equivalence class with a larger volume are preferable, is a useful alternative to the large margin principle. In this paper, we introduce a clustering model based on the large volume principle called maximum volume clustering (MVC), and propose two algorithms to solve it approximately: a soft-label and a hard-label MVC algorithm based on sequential quadratic programming and semidefinite programming, respectively. Our MVC model includes spectral clustering and maximum margin clustering as special cases, and is substantially more general. We also establish the finite sample stability and an error bound for the soft-label MVC method. Experiments show that the proposed MVC approach compares favorably with state-of-the-art clustering algorithms.
École Doctorale Information, Structures et Systèmes (I2S)
"... It will be replaced for the final print by a version provided by the academic service. Defended on 12 December 2014 at the Faculté des Sciences ..."
Abstract
It will be replaced for the final print by a version provided by the academic service. Defended on 12 December 2014 at the Faculté des Sciences.
A unified view of generative models for networks: models, methods, opportunities, and challenges
"... Research on probabilistic models of networks now spans a wide variety of fields, including physics, sociology, biology, statistics, and machine learning. These efforts have produced a diverse ecology of models and methods. Despite this diversity, many of these models share a common underlying struct ..."
Abstract
Research on probabilistic models of networks now spans a wide variety of fields, including physics, sociology, biology, statistics, and machine learning. These efforts have produced a diverse ecology of models and methods. Despite this diversity, many of these models share a common underlying structure: pairwise interactions (edges) are generated with probability conditional on latent vertex attributes. Differences between models generally stem from different philosophical choices about how to learn from data or different empirically motivated goals. The highly interdisciplinary nature of work on these generative models, however, has inhibited the development of a unified view of their similarities and differences. For instance, novel theoretical models and optimization techniques developed in machine learning are largely unknown within the social and biological sciences, which have instead emphasized model interpretability. Here, we describe a unified view of generative models for networks that draws together many of these disparate threads and highlights the fundamental similarities and differences that span these fields. We then describe a number of opportunities and challenges for future work that are revealed by this view.
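The shared structure this abstract identifies, edges generated with probability conditional on latent vertex attributes, can be made concrete with one minimal instance: a two-parameter stochastic block model. This is an illustrative sketch; the function name `sample_sbm` and the parameter values are assumptions for the example, not taken from the survey.

```python
import random

def sample_sbm(block_sizes, p_in, p_out, seed=0):
    """Sample an undirected graph from a stochastic block model: each vertex
    carries a latent block label, and each edge (i, j) appears independently
    with probability p_in (same block) or p_out (different blocks)."""
    rng = random.Random(seed)
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.append((i, j))
    return labels, edges

labels, edges = sample_sbm([10, 10], p_in=0.9, p_out=0.05)
within = sum(labels[i] == labels[j] for i, j in edges)
print(f"{len(edges)} edges, {within} within-block")
```

With assortative parameters (p_in much larger than p_out) most sampled edges fall within blocks; the learning task the survey discusses is the inverse problem, inferring the latent labels and edge probabilities from an observed graph.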