Topics over time: A non-Markov continuous-time model of topical trends - in SIGKDD, 2006
Abstract - Cited by 236 (10 self)
This paper presents an LDA-style topic model that captures not only the low-dimensional structure of data, but also how the structure changes over time. Unlike other recent work that relies on Markov assumptions or discretization of time, here each topic is associated with a continuous distribution over timestamps, and for each generated document, the mixture distribution over topics is influenced by both word co-occurrences and the document's timestamp. Thus, the meaning of a particular topic can be relied upon as constant, but the topics' occurrence and correlations change significantly over time. We present results on nine months of personal email, 17 years of NIPS research papers and over 200 years of presidential state-of-the-union addresses, showing improved topics, better timestamp prediction, and interpretable trends.
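The generative process the abstract describes can be sketched minimally: each topic carries both a word distribution and a continuous (here Beta) distribution over normalized timestamps, so every token yields a word and a timestamp from its topic. All dimensions and parameter values below are hypothetical toy choices, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: K topics, V-word vocabulary.
K, V, alpha = 3, 10, 0.5

# Per-topic word distributions, and per-topic Beta parameters over
# timestamps normalized to [0, 1].
phi = rng.dirichlet(np.ones(V), size=K)               # K x V word distributions
psi = np.array([[2.0, 8.0], [5.0, 5.0], [8.0, 2.0]])  # Beta(a, b) per topic

def generate_document(n_tokens=20):
    """Sample one document: each token draws a topic, then a word AND a
    timestamp from that topic's distributions."""
    theta = rng.dirichlet(alpha * np.ones(K))  # document's topic mixture
    words, times = [], []
    for _ in range(n_tokens):
        z = rng.choice(K, p=theta)             # topic for this token
        words.append(rng.choice(V, p=phi[z]))  # word from topic's multinomial
        times.append(rng.beta(*psi[z]))        # timestamp from topic's Beta
    return words, times

words, times = generate_document()
```

Because the timestamp distribution is continuous per topic, no Markov chain over discretized epochs is needed; a topic's prevalence at any time falls out of its Beta density.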
Nested Chinese Restaurant Franchise Processes: Applications to User Tracking and Document Modeling, 2013
Abstract - Cited by 8 (1 self)
Much natural data is hierarchical in nature. Moreover, this hierarchy is often shared between different instances. We introduce the nested Chinese Restaurant Franchise Process to obtain both hierarchical tree-structured representations for objects, akin to (but more general than) the nested Chinese Restaurant Process, while sharing their structure akin to the Hierarchical Dirichlet Process. Moreover, by decoupling the structure-generating part of the process from the components responsible for the observations, we are able to apply the same statistical approach to a variety of user-generated data. In particular, we model the joint distribution of microblogs and locations for Twitter users. This leads to a 40% reduction in location uncertainty relative to the best previously published results. Moreover, we model documents from the NIPS papers dataset, obtaining excellent perplexity relative to (hierarchical) Pachinko allocation and LDA.
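The basic building block underlying both the nested CRP and the franchise construction is the sequential Chinese Restaurant Process, which can be sketched in a few lines. This shows only the single-restaurant seating rule, not the full nested franchise; the concentration value and seed are arbitrary illustration choices.

```python
import random

def crp_assignments(n_customers, gamma=1.0, seed=0):
    """Sequential Chinese Restaurant Process: customer i sits at an
    occupied table with probability proportional to its occupancy,
    or opens a new table with probability proportional to gamma."""
    rng = random.Random(seed)
    counts = []       # counts[k] = customers seated at table k
    assignments = []
    for _ in range(n_customers):
        total = sum(counts) + gamma
        r = rng.uniform(0, total)
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if r < acc:
                counts[k] += 1          # join an occupied table
                assignments.append(k)
                break
        else:
            counts.append(1)            # open a new table
            assignments.append(len(counts) - 1)
    return assignments, counts

assignments, counts = crp_assignments(100)
```

In the nested and franchise variants, each table in one restaurant points to a restaurant one level deeper (yielding the tree), and restaurants share a franchise-wide menu of dishes (yielding the HDP-style sharing the abstract refers to).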
The Beta Diffusion Tree
Abstract
We define the beta diffusion tree, a random tree structure with a set of leaves that defines a collection of overlapping subsets of objects, known as a feature allocation. A generative process for the tree structure is defined in terms of particles (representing the objects) diffusing in some continuous space, analogously to the Dirichlet diffusion tree (Neal, 2003b), which defines a tree structure over partitions (i.e., non-overlapping subsets) of the objects. Unlike in the Dirichlet diffusion tree, multiple copies of a particle may exist and diffuse along multiple branches in the beta diffusion tree, and an object may therefore belong to multiple subsets of particles. We demonstrate how to build a hierarchically-clustered factor analysis model with the beta diffusion tree and how to perform inference over the random tree structures with a Markov chain Monte Carlo algorithm. We conclude with several numerical experiments on missing data problems with data sets of gene expression microarrays, international development statistics, and intranational socioeconomic measurements.
Markov Mixed Membership Models
Abstract
We present a Markov mixed membership model (Markov M3) for grouped data that learns a fully connected graph structure among mixing components. A key feature of Markov M3 is that it interprets the mixed membership assignment as a Markov random walk over this graph of nodes. This is in contrast to tree-structured models in which the assignment is done according to a tree structure on the mixing components. The Markov structure results in a simple parametric model that can learn a complex dependency structure between nodes, while still maintaining full conjugacy for closed-form stochastic variational inference. Empirical results demonstrate that Markov M3 performs well compared with tree-structured topic models, and can learn meaningful dependency structure between topics.
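The core idea, reading membership assignment as a random walk over a complete graph of components, can be sketched as follows. The transition matrix, initial distribution, and sizes here are made-up toy values, not the model's learned parameters or its variational inference scheme.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 4  # mixing components (topics); the graph over them is complete

# Row-stochastic transition matrix over the complete graph of topics:
# assignments are read off as the states visited by a Markov walk,
# rather than as a path down a fixed tree.
T = rng.dirichlet(np.ones(K), size=K)  # K x K, each row sums to 1
pi0 = rng.dirichlet(np.ones(K))        # initial state distribution

def walk_assignments(n_tokens):
    """Assign each token the topic at the walk's current state,
    then step to a neighbor along a (weighted) edge of the graph."""
    states = []
    z = rng.choice(K, p=pi0)
    for _ in range(n_tokens):
        states.append(int(z))
        z = rng.choice(K, p=T[z])      # step along an edge
    return states

states = walk_assignments(50)
```

Because every node can reach every other node in one step, the walk can express arbitrary pairwise topic dependencies, which is exactly the flexibility the abstract contrasts with tree-structured assignment.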
Scalable Deep Poisson Factor Analysis for Topic Modeling
Abstract
A new framework for topic modeling is developed, based on deep graphical models, where interactions between topics are inferred through deep latent binary hierarchies. The proposed multi-layer model employs a deep sigmoid belief network or restricted Boltzmann machine, the bottom binary layer of which selects topics for use in a Poisson factor analysis model. Under this setting, topics live on the bottom layer of the model, while the deep specification serves as a flexible prior for revealing topic structure. Scalable inference algorithms are derived by applying a Bayesian conditional density filtering algorithm, in addition to extending recently proposed work on stochastic gradient thermostats. Experimental results on several corpora show that the proposed approach readily handles very large collections of text documents, infers structured topic representations, and obtains superior test perplexities when compared with related models.
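The coupling the abstract describes, a binary layer switching topics on and off for a Poisson factor analysis likelihood, can be sketched in its forward direction. The independent Bernoulli layer below is a stand-in for the deep sigmoid belief network, and all sizes and hyperparameters are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
K, V, D = 5, 12, 3  # topics, vocabulary size, documents

phi = rng.dirichlet(np.ones(V), size=K).T   # V x K topic-word factors
theta = rng.gamma(1.0, 1.0, size=(K, D))    # K x D per-document topic intensities

# Binary selection layer (stand-in for the bottom layer of a deep
# sigmoid belief net): h[k, d] = 1 means topic k is active in document d.
h = (rng.uniform(size=(K, D)) < 0.5).astype(float)

# Word counts: Poisson rates come only from the selected topics,
# so switching a topic off zeroes its contribution entirely.
rate = phi @ (theta * h)                    # V x D rate matrix
X = rng.poisson(rate)                       # V x D observed count matrix
```

Deeper binary layers would place a structured prior on `h` instead of independent coin flips, which is where the "flexible prior for revealing topic structure" enters.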