Results 1–10 of 13
Optimistic concurrency control for distributed unsupervised learning
In Advances in Neural Information Processing Systems 26 (NIPS '13), 2013
Cited by 7 (1 self)

Abstract
Research on distributed machine learning algorithms has focused primarily on one of two extremes: algorithms that obey strict concurrency constraints, or algorithms that obey few or no such constraints. We consider an intermediate alternative in which algorithms optimistically assume that conflicts are unlikely, and if conflicts do arise, a conflict-resolution protocol is invoked. We view this "optimistic concurrency control" paradigm as particularly appropriate for large-scale machine learning algorithms, especially in the unsupervised setting. We demonstrate our approach in three problem areas: clustering, feature learning, and online facility location. We evaluate our methods via large-scale experiments in a cluster computing environment.
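The optimistic pattern this abstract describes — workers proceed in parallel as if no conflicts will occur, then a serial validation step repairs the ones that do — can be illustrated with a toy cluster-creation epoch. Everything here (`occ_epoch`, the squared-distance threshold `lam`) is an illustrative sketch, not the authors' implementation:

```python
import numpy as np

def occ_epoch(partitions, centers, lam):
    """One optimistic epoch of distributed cluster creation.

    Each worker scans its data partition against the *current* centers and
    optimistically proposes a new center for any point farther than lam
    (squared distance) from all of them.  A serial validation pass then
    resolves conflicts: a proposal is accepted only if it is still far
    from every already-accepted center.
    """
    proposals = []
    for X in partitions:            # conceptually parallel, one loop per worker
        for x in X:
            d2 = min((np.dot(x - c, x - c) for c in centers), default=np.inf)
            if d2 > lam:
                proposals.append(x)

    accepted = list(centers)        # serial conflict-resolution step
    for p in proposals:
        d2 = min((np.dot(p - c, p - c) for c in accepted), default=np.inf)
        if d2 > lam:
            accepted.append(p)
    return accepted

parts = [np.array([[0.0], [10.0]]), np.array([[0.05], [10.05]])]
centers = occ_epoch(parts, [], lam=1.0)
print(len(centers))  # near-duplicate proposals conflict; only 2 centers survive
```

The point of the toy: all four points optimistically propose themselves as centers, but the serial resolution step rejects the two near-duplicates, so the outcome matches what a fully serial pass would have produced.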
A Convex Exemplar-based Approach to MAD-Bayes Dirichlet Process Mixture Models
Cited by 2 (1 self)

Abstract
MAD-Bayes (MAP-based Asymptotic Derivations) has recently been proposed as a general technique for deriving scalable algorithms for Bayesian nonparametric models. However, the combinatorial nature of objective functions derived from MAD-Bayes results in hard optimization problems, for which current practice employs heuristic algorithms, analogous to k-means, to find a local minimum. In this paper, we consider the exemplar-based version of the MAD-Bayes formulation for DP and Hierarchical DP (HDP) mixture models. We show that an exemplar-based MAD-Bayes formulation can be relaxed to a convex, structurally regularized program that, under cluster-separation conditions, shares the same optimal solution as its combinatorial counterpart. An algorithm based on the Alternating Direction Method of Multipliers (ADMM) is then proposed to solve this program. In our experiments on several benchmark data sets, the proposed method finds the optimal solution of the combinatorial problem and significantly improves on existing methods in terms of the exemplar-based objective.
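As a reminder of the ADMM template the abstract invokes (not the paper's actual exemplar-based program), here is a minimal scaled-form ADMM applied to a standard splitting example, the lasso problem min ½‖Ax − b‖² + λ‖x‖₁; all variable names are generic:

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, n_iters=200):
    """Scaled-form ADMM for  min_x 0.5*||Ax - b||^2 + lam*||x||_1.

    Splits x = z and alternates:
      x-update: a ridge-like linear solve,
      z-update: soft-thresholding (the prox of lam*||.||_1),
      u-update: running sum of the residual x - z.
    """
    n = A.shape[1]
    AtA, Atb = A.T @ A, A.T @ b
    x = z = u = np.zeros(n)
    for _ in range(n_iters):
        x = np.linalg.solve(AtA + rho * np.eye(n), Atb + rho * (z - u))
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)
        u = u + x - z
    return z

# With A = I the lasso solution is plain soft-thresholding of b:
x_hat = admm_lasso(np.eye(3), np.array([3.0, 0.2, -2.0]), lam=1.0)
print(np.round(x_hat, 3))  # ≈ [ 2.  0. -1.]
```

The sanity check works because for A = I the problem decouples coordinate-wise, so the known closed-form answer lets one verify the alternating updates converge to it.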
DP-space: Bayesian nonparametric subspace clustering with small-variance asymptotic analysis
In ICML, 2015
Cited by 1 (1 self)

Abstract
Subspace clustering separates data points approximately lying on a union of affine subspaces into several clusters. This paper presents a novel nonparametric Bayesian subspace clustering model that infers both the number of subspaces and the dimension of each subspace from the observed data. Though posterior inference is hard, our model leads to a very efficient deterministic algorithm, DP-space, which retains the nonparametric ability under a small-variance asymptotic analysis. DP-space monotonically minimizes an intuitive objective with an explicit trade-off between data fitness and model complexity. Experimental results demonstrate that DP-space outperforms various competitors in terms of clustering accuracy while remaining highly efficient.
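The kind of deterministic algorithm a small-variance asymptotic analysis yields can be seen in plain DP-means (the well-known precursor of this approach, not DP-space itself, which additionally infers subspaces): a point farther than a penalty λ from every current center spawns a new cluster, so the number of clusters is inferred rather than fixed. A minimal sketch:

```python
import numpy as np

def dp_means(X, lam, n_iters=20):
    """DP-means: k-means-like updates plus a penalty lam for opening
    a new cluster.  A point whose squared distance to every current
    center exceeds lam becomes a new center, so the number of clusters
    is inferred from the data rather than fixed in advance."""
    centers = [X.mean(axis=0)]
    for _ in range(n_iters):
        assign = np.empty(len(X), dtype=int)
        for i, x in enumerate(X):
            d2 = np.array([np.dot(x - c, x - c) for c in centers])
            j = int(d2.argmin())
            if d2[j] > lam:          # too far from everything: open a cluster
                centers.append(x.copy())
                j = len(centers) - 1
            assign[i] = j
        # recompute centers, dropping any that lost all their points
        centers = [X[assign == j].mean(axis=0)
                   for j in range(len(centers)) if np.any(assign == j)]
    return centers

X = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
print(len(dp_means(X, lam=4.0)))  # two well-separated groups -> 2 centers
```

Each sweep monotonically decreases the objective Σᵢ‖xᵢ − c(i)‖² + λ·K, which is the "explicit trade-off between data fitness and model complexity" the abstract refers to.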
MAD-Bayes for Tumor Heterogeneity – Feature Allocation with Exponential Family Sampling
Learning Scalable Discriminative Dictionary with Sample Relatedness
Cited by 1 (0 self)

Abstract
Attributes are widely used as mid-level descriptors of object properties in object recognition and retrieval. Mostly, such attributes are manually predefined based on domain knowledge, and their number is fixed. However, predefined attributes may fail to adapt to the properties of the data at hand, may not necessarily be discriminative, and/or may not generalize well. In this work, we propose a dictionary learning framework that flexibly adapts to the complexity of the given data set and reliably discovers the inherent discriminative mid-level binary features in the data. We use sample relatedness information to improve the generalization of the learned dictionary. We demonstrate that our framework is applicable to both object recognition and complex image retrieval tasks, even with few training examples. Moreover, the learned dictionary also helps classify novel object categories. Experimental results on the Animals with Attributes, ILSVRC2010, and PASCAL VOC2007 datasets indicate that using relatedness information leads to significant performance gains over established baselines.
Small-variance Asymptotics for Dirichlet Process Mixtures of SVMs
Cited by 1 (0 self)

Abstract
Infinite SVM (iSVM) is a Dirichlet process (DP) mixture of large-margin classifiers. Though flexible in learning nonlinear classifiers and discovering latent clustering structures, iSVM has a difficult inference task, and existing methods can hinder its applicability to large-scale problems. This paper presents a small-variance asymptotic analysis to derive a simple and efficient algorithm, which monotonically optimizes a max-margin DP-means (M²DPM) problem, an extension of DP-means for both predictive learning and descriptive clustering. Our analysis is built on Gibbs infinite SVMs, an alternative DP mixture of large-margin machines, which admits a partially collapsed Gibbs sampler without truncation by exploiting data augmentation techniques. Experimental results show that M²DPM runs much faster than similar algorithms without sacrificing prediction accuracy.
Assessment and application of clustering techniques to atmospheric particle number size distribution for the purpose of source apportionment
2014
Abstract
www.atmos-chem-phys.net/14/11883/2014/, doi:10.5194/acp-14-11883-2014. © Author(s) 2014. CC Attribution 3.0 License. Assessment and application of clustering techniques to atmospheric particle number size distribution for the purpose of source apportionment.
Scalable Approximate Bayesian Inference for Outlier Detection under Informative Sampling
2016
Abstract
Government surveys of business establishments receive a large volume of submissions, of which a small subset contains errors. Analysts need a fast-computing algorithm to flag this subset due to the short time window between collection and reporting. We offer a computationally scalable optimization method based on nonparametric mixtures of hierarchical Dirichlet processes that allows discovery of multiple industry-indexed local partitions linked to a set of global cluster centers. Outliers are nominated as those clusters containing few observations. We extend an existing approach with a new "merge" step that reduces sensitivity to hyperparameter settings. Survey data are typically acquired under an informative sampling design, where the probability of inclusion depends on the surveyed response, such that the distribution of the observed sample differs from that of the population. We extend the derivation of a penalized objective function to use a pseudo-posterior that incorporates sampling weights that "undo" the informative design. We provide a simulation study to demonstrate that our approach produces unbiased estimation for the outlying cluster under informative sampling. The method is applied to outlier nomination for the Current Employment Statistics survey conducted by the Bureau of Labor Statistics.
Power-Law Graph Cuts
2014
Abstract
Algorithms based on spectral graph cut objectives such as normalized cuts, ratio cuts, and ratio association have become popular in recent years because they are widely applicable and simple to implement via standard eigenvector computations. Despite strong performance on a number of clustering tasks, spectral graph cut algorithms still suffer from several limitations: first, they require the number of clusters to be known in advance, but this information is often unavailable a priori; second, they tend to produce clusters of uniform size. In some cases, the true clusters exhibit a known size distribution; in image segmentation, for instance, human-segmented images tend to yield segment sizes that follow a power-law distribution. In this paper, we propose a general framework of power-law graph cut algorithms that produce clusters whose sizes are power-law distributed and that do not fix the number of clusters upfront. To achieve these goals, we treat the Pitman-Yor exchangeable partition probability function (EPPF) as a regularizer for graph cut objectives. Because the resulting objectives cannot be solved by eigenvector relaxation, we derive a simple iterative algorithm to locally optimize them. Moreover, we show that our proposed algorithm can be viewed as performing MAP inference on a particular Pitman-Yor mixture model. Our experiments on various data sets show the effectiveness of our algorithms.
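The Pitman-Yor EPPF regularizer has a closed form: a partition with cluster sizes n_1, ..., n_K of n points gets log-probability Σ_{i=1}^{K-1} log(θ + iα) − log (θ+1)_{n−1} + Σ_c log (1−α)_{n_c−1}, with rising factorials (x)_m = Γ(x+m)/Γ(x) evaluated via log-gamma. The helper below only evaluates that score (it is not the paper's graph cut solver):

```python
from math import lgamma, log

def log_eppf(sizes, alpha, theta):
    """Log Pitman-Yor EPPF of a partition with the given cluster sizes.

    alpha: discount in [0, 1); theta: concentration > -alpha.
    alpha = 0 recovers the Dirichlet process / CRP partition probability.
    Rising factorials (x)_m = Gamma(x+m)/Gamma(x) are computed with lgamma.
    """
    k, n = len(sizes), sum(sizes)
    new_tables = sum(log(theta + i * alpha) for i in range(1, k))
    norm = lgamma(theta + 1) - lgamma(theta + n)   # -log (theta+1)_{n-1}
    within = sum(lgamma(nc - alpha) - lgamma(1 - alpha) for nc in sizes)
    return new_tables + norm + within

# alpha = 0, theta = 1: the probability that 2 points share one cluster is 1/2
print(log_eppf([2], 0.0, 1.0))  # == log(1/2)
```

Because the EPPF depends only on the multiset of cluster sizes, adding it to a graph cut objective penalizes partitions whose size profile is unlikely under the Pitman-Yor prior, which is exactly what steers the cuts toward power-law cluster sizes.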