Results

**1 - 4**of**4**### Diverse Yet Efficient Retrieval using Hash Functions

"... ABSTRACT Typical retrieval systems have three requirements: a) Accurate retrieval i.e., the method should have high precision, b) Diverse retrieval, i.e., the obtained set of points should be diverse, c) Retrieval time should be small. However, most of the existing methods address only one or two o ..."

Abstract
- Add to MetaCart

(Show Context)
ABSTRACT Typical retrieval systems have three requirements: a) Accurate retrieval i.e., the method should have high precision, b) Diverse retrieval, i.e., the obtained set of points should be diverse, c) Retrieval time should be small. However, most of the existing methods address only one or two of the above mentioned requirements. In this work, we present a method based on randomized locality sensitive hashing which tries to address all of the above requirements simultaneously. While earlier hashing approaches considered approximate retrieval to be acceptable only for the sake of efficiency, we argue that one can further exploit approximate retrieval to provide impressive trade-offs between accuracy and diversity. We extend our method to the problem of multi-label prediction, where the goal is to output a diverse and accurate set of labels for a given document in real-time. Moreover, we introduce a new notion to simultaneously evaluate a method's performance for both the precision and diversity measures. Finally, we present empirical results on several different retrieval tasks and show that our method retrieves diverse and accurate images/labels while ensuring 100x-speed-up over the existing diverse retrieval approaches.

### Large-Scale Bayesian Multi-Label Learning via Topic-Based Label Embeddings

"... Abstract We present a scalable Bayesian multi-label learning model based on learning lowdimensional label embeddings. Our model assumes that each label vector is generated as a weighted combination of a set of topics (each topic being a distribution over labels), where the combination weights (i.e. ..."

Abstract
- Add to MetaCart

(Show Context)
Abstract We present a scalable Bayesian multi-label learning model based on learning lowdimensional label embeddings. Our model assumes that each label vector is generated as a weighted combination of a set of topics (each topic being a distribution over labels), where the combination weights (i.e., the embeddings) for each label vector are conditioned on the observed feature vector. This construction, coupled with a Bernoulli-Poisson link function for each label of the binary label vector, leads to a model with a computational cost that scales in the number of positive labels in the label matrix. This makes the model particularly appealing for real-world multi-label learning problems where the label matrix is usually very massive but highly sparse. Using a data-augmentation strategy leads to full local conjugacy in our model, facilitating simple and very efficient Gibbs sampling, as well as an Expectation Maximization algorithm for inference. Also, predicting the label vector at test time does not require doing an inference for the label embeddings and can be done in closed form. We report results on several benchmark data sets, comparing our model with various state-of-the art methods.

### Locally Non-linear Embeddings for Extreme Multi-label Learning

, 2015

"... The objective in extreme multi-label learning is to train a classifier that can automatically tag a novel data point with the most relevant subset of labels from an extremely large label set. Embedding based approaches make training and prediction tractable by assuming that the training label matrix ..."

Abstract
- Add to MetaCart

(Show Context)
The objective in extreme multi-label learning is to train a classifier that can automatically tag a novel data point with the most relevant subset of labels from an extremely large label set. Embedding based approaches make training and prediction tractable by assuming that the training label matrix is low-rank and hence the effective number of labels can be reduced by projecting the high dimensional label vectors onto a low dimensional linear subspace. Still, leading embedding approaches have been unable to deliver high prediction accuracies or scale to large problems as the low rank assumption is violated in most real world applications. This paper develops the X1 classifier to address both limitations. The main technical contribution in X1 is a formulation for learning a small ensemble of local distance preserving embeddings which can accurately predict infrequently occurring (tail) labels. This allows X1 to break free of the traditional low-rank assumption and boost classification accuracy by learning embeddings which preserve pairwise distances between only the nearest label vectors. We conducted extensive experiments on several real-world as well as benchmark data sets and compared our method against state-of-the-art methods for extreme multi-label classification. Experiments reveal that X1 can make significantly more accurate predictions then the state-of-the-art methods including both embeddings (by as much as 35%) as well as trees (by as much as 6%). X1 can also scale efficiently to data sets with a million labels which are beyond the pale of leading embedding methods. 1

### Large-Scale Bayesian Multi-Label Learning via Topic-Based Label Embeddings

"... We present a scalable Bayesian multi-label learning model based on learning low-dimensional label embeddings. Our model assumes that each label vector is gen-erated as a weighted combination of a set of topics (each topic being a distribution over labels), where the combination weights (i.e., the em ..."

Abstract
- Add to MetaCart

(Show Context)
We present a scalable Bayesian multi-label learning model based on learning low-dimensional label embeddings. Our model assumes that each label vector is gen-erated as a weighted combination of a set of topics (each topic being a distribution over labels), where the combination weights (i.e., the embeddings) for each label vector are conditioned on the observed feature vector. This construction, coupled with a Bernoulli-Poisson link function for each label of the binary label vector, leads to a model with a computational cost that scales in the number of posi-tive labels in the label matrix. This makes the model particularly appealing for real-world multi-label learning problems where the label matrix is usually very massive but highly sparse. Using a data-augmentation strategy leads to full local conjugacy in our model, facilitating simple and very efficient Gibbs sampling, as well as an Expectation Maximization algorithm for inference. Also, predicting the label vector at test time does not require doing an inference for the label em-beddings and can be done in closed form. We report results on several benchmark data sets, comparing our model with various state-of-the art methods. 1