Results 1–10 of 14
Generalized Relational Topic Models with Data Augmentation
Abstract

Cited by 6 (4 self)
Relational topic models have shown promise on analyzing document network structures and discovering latent topic representations. This paper presents three extensions: 1) unlike the common link likelihood with a diagonal weight matrix that allows the same-topic interactions only, we generalize it to use a full weight matrix that captures all pairwise topic interactions and is applicable to asymmetric networks; 2) instead of doing standard Bayesian inference, we perform regularized Bayesian inference with a regularization parameter to deal with the imbalanced link structure issue in common real networks; and 3) instead of doing variational approximation with strict mean-field assumptions, we present a collapsed Gibbs sampling algorithm for the generalized relational topic models without making restricting assumptions. Experimental results demonstrate the significance of these extensions on improving the prediction performance, and the time efficiency can be dramatically improved with a simple fast approximation method.
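Collapsed Gibbs sampling, the inference approach named in this abstract, integrates out the document-topic and topic-word distributions and samples only the per-token topic assignments from their exact full conditionals. As a hedged illustration of the general technique only — a minimal sampler for plain LDA, not the paper's generalized relational topic model — the update might look like:

```python
import numpy as np

def lda_collapsed_gibbs(docs, K, V, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Minimal collapsed Gibbs sampler for plain LDA (illustrative sketch).

    docs: list of documents, each a list of word ids in [0, V); K topics.
    Theta (doc-topic) and Phi (topic-word) are integrated out analytically;
    only the topic assignments z are sampled.
    """
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), K))   # doc-topic counts
    n_kw = np.zeros((K, V))           # topic-word counts
    n_k = np.zeros(K)                 # per-topic token totals
    z = [[0] * len(d) for d in docs]
    # Random initialization of assignments and count tables.
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = rng.integers(K)
            z[d][i] = k
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token, then resample from the closed-form
                # conditional p(z=k | rest) ∝ (n_dk+α)(n_kw+β)/(n_k+Vβ).
                k = z[d][i]
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    phi = (n_kw + beta) / (n_kw.sum(axis=1, keepdims=True) + V * beta)
    return z, phi
```

No mean-field factorization is assumed anywhere: each draw uses the exact conditional given all other assignments, which is the "without restricting assumptions" property the abstract contrasts with variational approximation.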
Fast Max-Margin Matrix Factorization with Data Augmentation
Abstract

Cited by 4 (3 self)
Existing max-margin matrix factorization (M³F) methods either are computationally inefficient or need a model selection procedure to determine the number of latent factors. In this paper we present a probabilistic M³F model that admits a highly efficient Gibbs sampling algorithm through data augmentation. We further extend our approach to incorporate Bayesian nonparametrics and build accordingly a truncation-free nonparametric M³F model where the number of latent factors is literally unbounded and inferred from data. Empirical studies on two large real-world data sets verify the efficacy of our proposed methods.
Improved Bayesian Logistic Supervised Topic Models with Data Augmentation
Abstract

Cited by 3 (2 self)
Supervised topic models with a logistic likelihood have two issues that potentially limit their practical use: 1) response variables are usually overweighted by document word counts; and 2) existing variational inference methods make strict mean-field assumptions. We address these issues by: 1) introducing a regularization constant to better balance the two parts based on an optimization formulation of Bayesian inference; and 2) developing a simple Gibbs sampling algorithm by introducing auxiliary Polya-Gamma variables and collapsing out Dirichlet variables. Our augment-and-collapse sampling algorithm has analytical forms of each conditional distribution without making any restricting assumptions and can be easily parallelized. Empirical results demonstrate significant improvements on prediction performance and time efficiency.
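The Polya-Gamma trick mentioned here (due to Polson, Scott & Windle) makes the logistic likelihood conditionally Gaussian, so every Gibbs conditional has a closed form. A minimal sketch for plain Bayesian logistic regression — not the supervised topic model itself — assuming NumPy and a truncated-series approximation of the PG draw:

```python
import numpy as np

def sample_pg(b, c, rng, trunc=200):
    """Approximate draw from PG(b, c) via the truncated sum-of-gammas
    representation: PG(b,c) = (1/2π²) Σ_k g_k / ((k-1/2)² + c²/(4π²))."""
    c = np.atleast_1d(c)
    k = np.arange(1, trunc + 1)
    g = rng.gamma(b, 1.0, size=(len(c), trunc))
    denom = (k - 0.5) ** 2 + (c[:, None] / (2 * np.pi)) ** 2
    return (g / denom).sum(axis=1) / (2 * np.pi ** 2)

def pg_logistic_gibbs(X, y, iters=300, prior_var=100.0, seed=0):
    """Gibbs sampler for Bayesian logistic regression with a N(0, prior_var*I)
    prior on beta. Augmentation: omega_i | beta ~ PG(1, x_i'beta); then
    beta | omega, y is an exact Gaussian conditional -- no variational step."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)
    kappa = y - 0.5                      # (y_i - 1/2) pseudo-observations
    B_inv = np.eye(d) / prior_var
    draws = []
    for t in range(iters):
        omega = sample_pg(1.0, X @ beta, rng)
        V = np.linalg.inv(X.T @ (X * omega[:, None]) + B_inv)
        m = V @ (X.T @ kappa)
        beta = rng.multivariate_normal(m, V)
        if t >= iters // 2:              # discard first half as burn-in
            draws.append(beta)
    return np.mean(draws, axis=0)
```

The supervised topic model in the paper additionally collapses out the Dirichlet variables; the sketch above shows only the augmentation half of the augment-and-collapse idea.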
Max-Margin Majority Voting for Learning from Crowds
Abstract

Cited by 1 (0 self)
Learning-from-crowds aims to design proper aggregation strategies to infer the unknown true labels from the noisy labels provided by ordinary web workers. This paper presents max-margin majority voting (M³V) to improve the discriminative ability of majority voting and further presents a Bayesian generalization to incorporate the flexibility of generative methods on modeling noisy observations with worker confusion matrices. We formulate the joint learning as a regularized Bayesian inference problem, where the posterior regularization is derived by maximizing the margin between the aggregated score of a potential true label and that of any alternative label. Our Bayesian model naturally covers the Dawid-Skene estimator and M³V. Empirical results demonstrate that our methods are competitive, often achieving better results than state-of-the-art estimators.
Supervised topic models with word order . . .
, 2015
Abstract

Cited by 1 (1 self)
One limitation of most existing probabilistic latent topic models for document classification is that the topic model itself does not consider useful side-information, namely, class labels of documents. Topic models which in turn consider the side-information, popularly known as supervised topic models, do not consider the word order structure in documents. One of the motivations behind considering the word order structure is to capture the semantic fabric of the document. We investigate a low-dimensional latent topic model for document classification. Class label information and word order structure are integrated into a supervised topic model enabling a more effective interaction among such information for solving document classification. We derive a ...
Constrained relative entropy minimization with applications to multitask learning
, 2013
Robust Bayesian Max-Margin Clustering
Abstract

Cited by 1 (1 self)
We present max-margin Bayesian clustering (BMC), a general and robust framework that incorporates the max-margin criterion into Bayesian clustering models, as well as two concrete models of BMC to demonstrate its flexibility and effectiveness in dealing with different clustering tasks. The Dirichlet process max-margin Gaussian mixture is a nonparametric Bayesian clustering model that relaxes the underlying Gaussian assumption of Dirichlet process Gaussian mixtures by incorporating max-margin posterior constraints, and is able to infer the number of clusters from data. We further extend the ideas to present the max-margin clustering topic model, which can learn the latent topic representation of each document while at the same time clustering documents in the max-margin fashion. Extensive experiments are performed on a number of real datasets, and the results indicate superior clustering performance of our methods compared to related baselines.
Kernel Bayesian Inference with Posterior Regularization
Abstract
We propose a vector-valued regression problem whose solution is equivalent to the reproducing kernel Hilbert space (RKHS) embedding of the Bayesian posterior distribution. This equivalence provides a new understanding of kernel Bayesian inference. Moreover, the optimization problem induces a new regularization for the posterior embedding estimator, which is faster and has comparable performance to the squared regularization in kernel Bayes' rule. This regularization coincides with a former thresholding approach used in kernel POMDPs whose consistency remains to be established. Our theoretical work solves this open problem and provides consistency analysis in regression settings. Based on our optimization formulation, we propose a flexible Bayesian posterior regularization framework which for the first time enables us to put regularization at the distribution level. We apply this method to nonparametric state-space filtering tasks with extremely nonlinear dynamics and show performance gains over all other baselines.
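The posterior-embedding estimators this abstract compares all reduce to regularized kernel regressions. As a hedged sketch of the baseline idea only — the standard conditional mean embedding with the squared (ridge) regularization, not the paper's thresholding-style regularizer; the `rbf` kernel, `gamma`, and `lam` values are illustrative choices, not the paper's:

```python
import numpy as np

def rbf(A, B, gamma=10.0):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def conditional_mean_weights(X, x_query, lam=1e-3):
    """Weights beta(x) of the empirical conditional mean embedding
    mu_{Y|x} = sum_i beta_i(x) k(., y_i), obtained by kernel ridge
    regression: beta(x) = (K_x + n*lam*I)^{-1} k_x(x)."""
    n = len(X)
    K = rbf(X, X)
    return np.linalg.solve(K + n * lam * np.eye(n), rbf(X, x_query[None, :]))[:, 0]

# With the weights in hand, any conditional expectation E[f(Y) | x]
# is approximated by sum_i beta_i(x) f(y_i).
```

The point of contrast in the paper is precisely this `(K + n*lam*I)^{-1}` step: the proposed regularizer replaces the squared penalty while keeping the same regression view of the embedding.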
Diversity-Promoting Bayesian Learning of Latent Variable Models
Abstract
In learning latent variable models (LVMs), it is important to effectively capture infrequent patterns and shrink model size without sacrificing modeling power. Various studies have been done to "diversify" an LVM, which aim to learn a diverse set of latent components in LVMs. Most existing studies fall into a frequentist-style regularization framework, where the components are learned via point estimation. In this paper, we investigate how to "diversify" LVMs in the paradigm of Bayesian learning, which has advantages complementary to point estimation, such as alleviating overfitting via model averaging and quantifying uncertainty. We propose two approaches that have complementary advantages. One is to define diversity-promoting mutual angular priors which assign larger density to components with larger mutual angles, based on the Bayesian network and von Mises-Fisher distribution, and use these priors to affect the posterior via Bayes' rule. We develop two efficient approximate posterior inference algorithms based on variational inference and Markov chain Monte Carlo sampling. The other approach is to impose diversity-promoting regularization directly over the post-data distribution of components. These two methods are applied to the Bayesian mixture of experts model to encourage the "experts" to be diverse, and experimental results demonstrate the effectiveness and efficiency of our methods.
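One way to picture a mutual angular prior is as an unnormalized log-density over the component matrix that grows with the average pairwise angle between components. The sketch below is only a caricature of that idea — the paper's actual prior is constructed from a Bayesian network of von Mises-Fisher distributions, and the `tau` scale here is a made-up hyperparameter:

```python
import numpy as np

def mutual_angle_log_prior(W, tau=1.0):
    """Toy unnormalized log-density over component vectors (rows of W)
    that assigns higher mass to components with larger mutual angles.
    Diversity score: tau times the mean pairwise angle between the
    lines spanned by the components (angles lie in [0, pi/2])."""
    U = W / np.linalg.norm(W, axis=1, keepdims=True)   # unit directions
    cos = np.clip(U @ U.T, -1.0, 1.0)
    iu = np.triu_indices(len(W), k=1)                  # distinct pairs
    angles = np.arccos(np.abs(cos[iu]))
    return tau * angles.mean()
```

Under such a score, a set of mutually orthogonal components receives a higher log-prior than a set of nearly parallel ones, which is exactly the pressure the posterior inherits via Bayes' rule.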
A Unified Posterior Regularized Topic Model with Maximum Margin for Learning-to-Rank
Abstract
While most methods for learning-to-rank documents only consider relevance scores as features, better results can often be obtained by taking into account the latent topic structure of the document collection. Existing approaches that consider latent topics follow a two-stage approach, in which topics are discovered in an unsupervised way, as usual, and then used as features for the learning-to-rank task. In contrast, we propose a learning-to-rank framework which integrates the supervised learning of a maximum margin classifier with the discovery of a suitable probabilistic topic model. In this way, the labelled data that is available for the learning-to-rank task can be exploited to identify the most appropriate topics. To this end, we use a unified constrained optimization framework, which can dynamically compute the latent topic similarity score between the query and the document. Our experimental results show a consistent improvement over the state-of-the-art learning-to-rank models.