Results 1–10 of 209

Probabilistic topic models
IEEE Signal Processing Magazine, 2010. Cited by 235 (6 self).
Abstract: Probabilistic topic models are a suite of algorithms whose aim is to discover the ...

Stochastic Variational Inference
Journal of Machine Learning Research, 2013 (in press). Cited by 131 (27 self).
Abstract: We develop stochastic variational inference, a scalable algorithm for approximating posterior distributions. We develop this technique for a large class of probabilistic models and we demonstrate it with two probabilistic topic models, latent Dirichlet allocation and the hierarchical Dirichlet process topic model. Using stochastic variational inference, we analyze several large collections of documents: 300K articles from Nature, 1.8M articles from The New York Times, and 3.8M articles from Wikipedia. Stochastic inference can easily handle data sets of this size and outperforms traditional variational inference, which can only handle a smaller subset. (We also show that the Bayesian nonparametric topic model outperforms its parametric counterpart.) Stochastic variational inference lets us apply complex Bayesian models to massive data sets.
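
To make the update concrete, here is a minimal sketch of stochastic variational inference for LDA in Python: each iteration samples one document, fits its local variational parameters by coordinate ascent, and takes a noisy natural-gradient step on the global topic-word parameters with a decreasing step size. The corpus format, hyperparameter values, and single-document batches are illustrative assumptions rather than the paper's reference implementation.

```python
import numpy as np
from scipy.special import digamma

def svi_lda(docs, V, K=10, alpha=0.1, eta=0.01,
            tau=1.0, kappa=0.7, n_iters=1000, seed=0):
    """docs: list of (word_ids, counts) array pairs, one per document,
    with unique word ids; V: vocabulary size. All settings illustrative."""
    rng = np.random.default_rng(seed)
    D = len(docs)                                 # corpus size
    lam = rng.gamma(100., 1e-2, (K, V))           # global topic-word params
    for t in range(n_iters):
        ids, cts = docs[rng.integers(D)]          # sample one document
        Elogbeta = digamma(lam[:, ids]) - digamma(lam.sum(1))[:, None]
        gamma = np.ones(K)                        # local doc-topic params
        for _ in range(50):                       # local coordinate ascent
            Elogtheta = digamma(gamma) - digamma(gamma.sum())
            phi = np.exp(Elogtheta[:, None] + Elogbeta)
            phi /= phi.sum(0, keepdims=True)
            gamma = alpha + phi @ cts
        lam_hat = np.full((K, V), eta)            # intermediate global params
        lam_hat[:, ids] += D * phi * cts          # noisy natural gradient
        rho = (t + tau) ** -kappa                 # decreasing step size
        lam = (1 - rho) * lam + rho * lam_hat     # stochastic update
    return lam
```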

Practical Bayesian optimization of machine learning algorithms
2012. Cited by 130 (16 self).
Abstract: In this section we specify additional details of our Bayesian optimization algorithm which, for brevity, were omitted from the paper. For more detail, the code used in this work is made publicly available at ...
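
The abstract defers details to the released code, so the following is only a generic sketch of the Gaussian-process Bayesian optimization loop with an expected-improvement acquisition that this line of work builds on; the toy one-dimensional objective, Matern kernel choice, and grid of candidate points are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(X, gp, y_best, xi=0.01):
    mu, sigma = gp.predict(X, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu - xi) / sigma           # improvement when minimizing
    return (y_best - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt(f, bounds=(0.0, 1.0), n_init=3, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(*bounds, (n_init, 1))    # random initial design
    y = np.array([f(x[0]) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    grid = np.linspace(*bounds, 1000)[:, None]
    for _ in range(n_iter):
        gp.fit(X, y)                         # refit surrogate to all queries
        ei = expected_improvement(grid, gp, y.min())
        x_next = grid[np.argmax(ei)]         # most promising candidate
        X = np.vstack([X, x_next])
        y = np.append(y, f(x_next[0]))
    return X[np.argmin(y)], y.min()

# e.g. tune a one-dimensional "hyperparameter" of a toy loss surface:
# best_x, best_loss = bayes_opt(lambda x: (x - 0.3) ** 2 + 0.05 * np.sin(20 * x))
```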

Optimizing Semantic Coherence in Topic Models
Cited by 80 (5 self).
Abstract: Latent variable models have the potential to add value to large document collections by discovering interpretable, low-dimensional subspaces. In order for people to use such models, however, they must trust them. Unfortunately, typical dimensionality reduction methods for text, such as latent Dirichlet allocation, often produce low-dimensional subspaces (topics) that are obviously flawed to human domain experts. The contributions of this paper are threefold: (1) an analysis of the ways in which topics can be flawed; (2) an automated evaluation metric for identifying such topics that does not rely on human annotators or reference collections outside the training data; (3) a novel statistical topic model based on this metric that significantly improves topic quality in a large-scale document collection from the National Institutes of Health (NIH).
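
A sketch of the kind of intrinsic metric the abstract refers to: score a topic's top words by how often they co-occur in the training documents themselves, so no human annotators or external reference collection is needed. The smoothed log-ratio form below is the one commonly associated with this work, but treat the exact weighting here as an assumption.

```python
import numpy as np
from itertools import combinations

def topic_coherence(top_words, docs):
    """top_words: a topic's top word ids, ordered by probability;
    docs: one set of word ids per training document. Assumes every
    top word occurs in at least one training document."""
    df = {w: sum(w in d for d in docs) for w in top_words}   # document freq.
    score = 0.0
    for l, m in combinations(range(len(top_words)), 2):      # pairs, l < m
        v_l, v_m = top_words[l], top_words[m]
        co_df = sum((v_l in d) and (v_m in d) for d in docs) # co-doc freq.
        score += np.log((co_df + 1.0) / df[v_l])             # smoothed ratio
    return score
```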

A Spectral Algorithm for Latent Dirichlet Allocation
Cited by 49 (11 self).
Abstract: Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. This increased representational power comes at the cost of a more challenging unsupervised learning problem of estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of topic models, including Latent Dirichlet Allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third-order moments, which may be estimated with documents containing just three words). The method, called Excess Correlation Analysis, is based on a spectral decomposition of low-order moments via two singular value decompositions (SVDs). Moreover, the algorithm is scalable, since the SVDs are carried out only on k × k matrices, where k is the number of latent factors (topics) and is typically much smaller than the dimension of the observation (word) space.
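
The first stage of such spectral methods is easy to sketch: form an empirical second-order word co-occurrence moment and compute a whitening transform from its SVD. The fragment below shows only that whitening step on the full V x V moment; the paper's actual ECA procedure additionally uses third-order (trigram) moments and reduces the SVDs to k x k matrices, so everything here is an illustrative simplification.

```python
import numpy as np

def whitening_matrix(docs, V, k):
    """docs: list of word-id sequences (length >= 2 each); V: vocab size;
    k: number of topics. Returns W with W.T @ M2 @ W = I_k."""
    M2 = np.zeros((V, V))
    n_pairs = 0
    for d in docs:
        x, y = d[0], d[1]                # one word pair per document
        M2[x, y] += 1.0
        M2[y, x] += 1.0
        n_pairs += 2
    M2 /= n_pairs
    U, s, _ = np.linalg.svd(M2)          # SVD of the empirical pairs moment
    return U[:, :k] / np.sqrt(s[:k])     # whitening transform
```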

Sparse stochastic inference for latent Dirichlet allocation
International Conference on Machine Learning, 2012. Cited by 43 (4 self).
Abstract: We present a hybrid algorithm for Bayesian topic models that combines the efficiency of sparse Gibbs sampling with the scalability of online stochastic inference. We used our algorithm to analyze a corpus of 1.2 million books (33 billion words) with thousands of topics. Our approach reduces the bias of variational inference and generalizes to many Bayesian hidden-variable models.
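
Schematically, the hybrid replaces the dense local variational step of stochastic inference with a few sweeps of collapsed Gibbs sampling over a document's topic assignments, so each global update touches only the topics actually sampled. The sketch below (with assumed names and update details) produces such sparse per-document statistics; they would then be plugged into a stochastic update of the global parameters, as in the SVI sketch earlier on this page.

```python
import numpy as np

def gibbs_local_stats(word_ids, lam, alpha=0.1, n_sweeps=5, seed=0):
    """Estimate one document's topic-word statistics by collapsed Gibbs
    sampling, in place of a dense variational step (details assumed)."""
    rng = np.random.default_rng(seed)
    K = lam.shape[0]
    phi_hat = lam / lam.sum(1, keepdims=True)  # point estimate of topics
    z = rng.integers(K, size=len(word_ids))    # initial topic assignments
    n_k = np.bincount(z, minlength=K).astype(float)
    counts = np.zeros_like(lam)
    for s in range(n_sweeps):
        for i, w in enumerate(word_ids):
            n_k[z[i]] -= 1
            p = (n_k + alpha) * phi_hat[:, w]  # collapsed conditional
            z[i] = rng.choice(K, p=p / p.sum())
            n_k[z[i]] += 1
        if s >= n_sweeps - 2:                  # average the last two sweeps
            for i, w in enumerate(word_ids):
                counts[z[i], w] += 0.5
    return counts                              # sparse sufficient statistics
```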

Streaming variational Bayes
Neural Information Processing Systems (NIPS), 2013. Cited by 31 (0 self).
Abstract: We present SDA-Bayes, a framework for (S)treaming, (D)istributed, (A)synchronous computation of a Bayesian posterior. The framework makes streaming updates to the estimated posterior according to a user-specified approximation batch primitive. We demonstrate the usefulness of our framework, with variational Bayes (VB) as the primitive, by fitting the latent Dirichlet allocation model to two large-scale document collections. We demonstrate the advantages of our algorithm over stochastic variational inference (SVI) by comparing the two after a single pass through a known amount of data (a case where SVI may be applied) and in the streaming setting, where SVI does not apply.
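
The streaming pattern the abstract describes can be sketched for a conjugate exponential-family model: run the user-specified batch primitive on each arriving minibatch with the current posterior as its prior, and keep the result as the new posterior. In the toy below, exact Beta-Bernoulli conjugate updating stands in for a VB pass, purely as an illustrative assumption.

```python
import numpy as np

def batch_primitive(prior, batch):
    """Posterior params for a Beta given Bernoulli data; stands in for one
    VB pass over a minibatch (an illustrative assumption)."""
    a, b = prior
    return a + batch.sum(), b + (1 - batch).sum()

rng = np.random.default_rng(0)
posterior = (1.0, 1.0)                     # Beta(1, 1) prior
stream = (rng.random(20) < 0.3).astype(int)
for start in range(0, len(stream), 5):     # minibatches arrive in order
    posterior = batch_primitive(posterior, stream[start:start + 5])
print(posterior)                           # streaming posterior over the rate
```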

A Probabilistic Model of Syntactic and Semantic Acquisition from Child-Directed Utterances and their Meanings
Cited by 29 (6 self).
Abstract: This paper presents an incremental probabilistic learner that models the acquisition of syntax and semantics from a corpus of child-directed utterances paired with possible representations of their meanings. These meaning representations approximate the contextual input available to the child; they do not specify the meanings of individual words or syntactic derivations. The learner then has to infer the meanings and syntactic properties of the words in the input along with a parsing model. We use the CCG grammatical framework and train a nonparametric Bayesian model of parse structure with online variational Bayesian expectation maximization. When tested on utterances from the CHILDES corpus, our learner outperforms a state-of-the-art semantic parser. In addition, it models such aspects of child acquisition as “fast mapping,” while also countering previous criticisms of statistical syntactic learners.

Beta-negative binomial process and Poisson factor analysis
AISTATS, 2012. Cited by 28 (15 self).
Abstract: A beta-negative binomial (BNB) process is proposed, leading to a beta-gamma-Poisson process, which may be viewed as a “multi-scoop” generalization of the beta-Bernoulli process. The BNB process is augmented into a beta-gamma-gamma-Poisson hierarchical structure, and applied as a nonparametric Bayesian prior for an infinite Poisson factor analysis model. A finite approximation for the beta process Lévy random measure is constructed for convenient implementation. Efficient MCMC computations are performed with data augmentation and marginalization techniques. Encouraging results are shown on document count matrix factorization.
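
For orientation, here is a generative sketch of finite, truncated Poisson factor analysis, the model family the BNB process acts as a nonparametric prior for: a document-word count matrix is modeled as Poisson with a low-rank factorization of gamma- and Dirichlet-distributed factors. The shapes, rates, and truncation level are illustrative assumptions; the paper replaces them with the beta-negative binomial process prior.

```python
import numpy as np

rng = np.random.default_rng(0)
D, V, K = 100, 500, 10                     # documents, vocabulary, factors
theta = rng.gamma(0.5, 1.0, (D, K))        # per-document factor scores
phi = rng.dirichlet(np.full(V, 0.05), K)   # per-factor word distributions
X = rng.poisson(theta @ phi)               # D x V counts: X ~ Poisson(Theta @ Phi)
```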

ParallelTopics: A Probabilistic Approach to Exploring Document Collections
Cited by 26 (9 self).
Abstract: Scalable and effective analysis of large text corpora remains a challenging problem as our ability to collect textual data continues to increase at an exponential rate. To help users make sense of large text corpora, we present a novel visual analytics system, ParallelTopics, which integrates a state-of-the-art probabilistic topic model, latent Dirichlet allocation (LDA), with interactive visualization. To describe a corpus of documents, ParallelTopics first extracts a set of semantically meaningful topics using LDA. Unlike most traditional clustering techniques, in which a document is assigned to a specific cluster, the LDA model accounts for different topical aspects of each individual document. This permits effective full-text analysis of larger documents that may contain multiple topics. To highlight this property of the model, ParallelTopics utilizes the parallel-coordinates metaphor to present the probabilistic distribution of a document across topics. Such a representation allows users to discover single-topic vs. multi-topic documents and the relative importance of each topic to a document of interest. In addition, since ...
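
The parallel-coordinates view the abstract describes is straightforward to sketch: draw each document's topic distribution as one polyline across K topic axes, so single-topic documents show a sharp peak and multi-topic documents spread across several axes. The random Dirichlet draws below stand in for LDA's inferred document-topic proportions.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
K, n_docs = 8, 40
doc_topics = rng.dirichlet(np.full(K, 0.3), n_docs)  # stand-in for LDA output

fig, ax = plt.subplots(figsize=(8, 3))
for row in doc_topics:                    # one polyline per document
    ax.plot(range(K), row, alpha=0.4)
ax.set_xticks(range(K))
ax.set_xticklabels([f"topic {k}" for k in range(K)])
ax.set_ylabel("probability")
fig.tight_layout()
plt.show()
```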