Results 1 - 9 of 9
Fast Computation of Wasserstein Barycenters
Cited by 6 (0 self)
We present new algorithms to compute the mean of a set of empirical probability measures under the optimal transport metric. This mean, known as the Wasserstein barycenter, is the measure that minimizes the sum of its Wasserstein distances to each element in that set. We propose two original algorithms to compute Wasserstein barycenters that build upon the subgradient method. A direct implementation of these algorithms is, however, too costly because it would require the repeated resolution of large primal and dual optimal transport problems to compute subgradients. Extending the work of Cuturi (2013), we propose to smooth the Wasserstein distance used in the definition of Wasserstein barycenters with an entropic regularizer and recover in doing so a strictly convex objective whose gradients can be computed for a considerably cheaper computational cost using matrix scaling algorithms. We use these algorithms to visualize a large family of images and to solve a constrained clustering problem.
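The matrix scaling idea mentioned in this abstract can be illustrated with a minimal Sinkhorn-style sketch. This is not the authors' barycenter algorithm: it only computes an entropy-regularized transport plan between two histograms. The function name `sinkhorn`, the parameters `reg` and `n_iter`, and the toy histograms are assumptions made for illustration.

```python
import math

def sinkhorn(a, b, cost, reg=0.5, n_iter=500):
    """Entropy-regularized transport between histograms a and b (illustrative sketch)."""
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / reg); reg is an assumed regularization strength
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns so the marginals approach a and b
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # The plan has the form diag(u) K diag(v)
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    value = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return plan, value

# Toy example (assumed data): two bins, unit cost for moving mass between them
a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan, value = sinkhorn(a, b, cost)
```

With a larger `reg` the plan becomes more diffuse and `value` moves further above the unregularized transport cost; with a very small `reg` the kernel entries can underflow, so a log-domain variant would be needed in practice.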
Ground Metric Learning
Cited by 2 (0 self)
Optimal transport distances have been used for more than a decade in machine learning to compare histograms of features. They have one parameter: the ground metric, which can be any metric between the features themselves. As is the case for all parameterized distances, optimal transport distances can only prove useful in practice when this parameter is carefully chosen. To date, the only option available to practitioners to set the ground metric parameter was to rely on a priori knowledge of the features, which limited considerably the scope of application of optimal transport distances. We propose to lift this limitation and consider instead algorithms that can learn the ground metric using only a training set of labeled histograms. We call this approach ground metric learning. We formulate the problem of learning the ground metric as the minimization of the difference of two convex polyhedral functions over a convex set of metric matrices. We follow the presentation of our algorithms with promising experimental results which show that this approach is useful both for retrieval and binary/multiclass classification tasks.
Larry Wasserman, Co-Chair
, 2015
This thesis makes fundamental computational and statistical advances in testing and estimation, making critical progress in theory and application of classical statistical methods like classification, regression and hypothesis testing, and understanding the relationships between them. Our work connects multiple fields in often counterintuitive and surprising ways, leading to new theory, new algorithms, and new insights, and ultimately to a cross-fertilization of varied fields like optimization, statistics and machine learning. The first of three thrusts has to do with active learning, a form of sequential learning from feedback-driven queries that often has a provable statistical advantage over passive learning. We unify concepts from two seemingly different areas: active learning and stochastic first-order optimization. We use this unified view to develop new lower bounds for stochastic optimization using tools from active learning and new algorithms for active learning using ideas from optimization. We also study the effect of feature noise, or errors-in-variables, on
Where is the Soho of Rome? Measures and algorithms for finding similar neighborhoods in cities
Data generated on location-aware social media provide rich information about the places (shopping malls, restaurants, cafés, etc.) where citizens spend their time. That information can, in turn, be used to describe city neighborhoods in terms of the activity that takes place therein. For example, the data might reveal that citizens visit one neighborhood mainly for shopping, while another for its dining venues. In this paper, we present a methodology to analyze such data, describe neighborhoods in terms of the activity they host, and discover similar neighborhoods across cities. Using millions of Foursquare check-ins from cities in Europe and the US, we conduct an extensive study on features and measures that can be used to quantify similarity of city neighborhoods. We find that the earth mover’s distance outperforms other candidate measures in finding similar neighborhoods. Subsequently, using the earth mover’s distance as our measure of choice, we address the issue of computational efficiency: given a neighborhood in one city, how to efficiently retrieve the k most similar neighborhoods in other cities. We propose a similarity-search strategy that yields significant speed improvement over the brute-force search, with minimal loss in accuracy. We conclude with a case study that compares neighborhoods of Paris to neighborhoods of other cities.
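As a rough illustration of the earth mover's distance named in this abstract, the sketch below handles only the simplest case: two histograms over the same evenly spaced, ordered bins, where the distance reduces to the summed absolute difference of the cumulative distributions. The paper's actual features and ground distance are not given in this listing, so the function name `emd_1d` and the toy histograms are assumptions.

```python
from itertools import accumulate

def emd_1d(p, q):
    """Earth mover's distance between two histograms on the same ordered,
    unit-spaced bins (a special case, not the paper's setting)."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    # Normalize so both histograms carry unit mass
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # In one dimension the optimal cost equals the L1 gap between the CDFs
    return sum(abs(cp - cq) for cp, cq in zip(accumulate(p), accumulate(q)))

# Toy example (assumed data): all mass moves two bins to the right
distance = emd_1d([1, 0, 0], [0, 0, 1])
```

For histograms over unordered categories, such as venue types, a general ground distance and a linear-program or network-flow solver would be needed instead of this shortcut.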
A Numerical Method to solve Optimal Transport Problems with Coulomb Cost
, 2015
HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
WASP: Scalable Bayes via barycenters of subset posteriors
The promise of Bayesian methods for big data sets has not fully been realized due to the lack of scalable computational algorithms. For massive data, it is necessary to store and process subsets on different machines in a distributed manner. We propose a simple, general, and highly efficient approach, which first runs a posterior sampling algorithm in parallel on different machines for subsets of a large data set. To combine these subset posteriors, we calculate the Wasserstein barycenter via a highly efficient linear program. The resulting estimate for the Wasserstein posterior (WASP) has an atomic form, facilitating straightforward estimation of posterior summaries of functionals of interest. The WASP approach allows posterior sampling algorithms for smaller data sets to be trivially scaled to huge data. We provide theoretical justification in terms of posterior consistency and algorithm efficiency. Examples are provided in complex settings including Gaussian process regression and nonparametric Bayes mixture models.
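The idea of combining subset posteriors through a barycenter can be sketched in a one-dimensional special case. This is not the linear program the abstract refers to: for scalar parameters, when every subset contributes the same number of equally weighted draws, the 2-Wasserstein barycenter with equal weights is obtained by averaging the sorted draws position by position. The function name `barycenter_1d` and the toy draws are assumptions.

```python
def barycenter_1d(sample_sets):
    """Equal-weight W2 barycenter of 1-D empirical measures that all have the
    same number of equally weighted atoms (a special case, not the WASP LP)."""
    sizes = {len(s) for s in sample_sets}
    if len(sizes) != 1:
        raise ValueError("all sample sets must have the same size")
    # In one dimension the barycenter's quantiles are the averaged quantiles
    sorted_sets = [sorted(s) for s in sample_sets]
    k = len(sorted_sets)
    return [sum(column) / k for column in zip(*sorted_sets)]

# Toy example (assumed data): posterior draws from two subsets of the data
atoms = barycenter_1d([[0.0, 2.0], [4.0, 2.0]])
```

The result is again a set of atoms, which mirrors the atomic form described in the abstract; multivariate parameters would require a genuine optimal transport solver.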
Transport between RGB Images Motivated by Dynamic Optimal Transport
, 2016
We propose two models for the interpolation between RGB images based on the dynamic optimal transport model of Benamou and Brenier [8]. While the application of dynamic optimal transport and its extensions to unbalanced transform were examined for gray-value images in various papers, this is the first attempt to generalize the idea to color images. The nontrivial task to incorporate color into the model is tackled by considering RGB images as three-dimensional arrays, where the transport in the RGB direction is performed in a periodic way. Following the approach of Papadakis et al. [35] for gray-value images we propose two discrete variational models, a constrained and a penalized one which can also handle unbalanced transport. We show that a minimizer of our discrete model exists, but it is not unique for some special initial/final images. For minimizing the resulting functionals we apply a primal-dual algorithm. One step of this algorithm requires the solution of a four-dimensional discretized Poisson equation with various boundary conditions in each dimension. For instance, for the penalized approach we have simultaneously zero, mirror and periodic boundary conditions. The solution can be computed efficiently using fast Sin-I, Cos-II and Fourier transforms. Numerical examples demonstrate the meaningfulness of our model.