Results 1 - 10 of 273
Hierarchical beta processes and the Indian buffet process
- In Practical Nonparametric and Semiparametric Bayesian Statistics, 2007
"... We show that the beta process is the de Finetti mixing distribution underlying the Indian buffet process of [2]. This result shows that the beta process plays the role for the Indian buffet process that the Dirichlet process plays for Chinese restaurant process, a parallel that guides us in deriving ..."
Cited by 131 (18 self)
We show that the beta process is the de Finetti mixing distribution underlying the Indian buffet process of [2]. This result shows that the beta process plays the role for the Indian buffet process that the Dirichlet process plays for the Chinese restaurant process, a parallel that guides us in deriving analogs for the beta process of the many known extensions of the Dirichlet process. In particular we define Bayesian hierarchies of beta processes and use the connection to the beta process to develop posterior inference algorithms for the Indian buffet process. We also present an application to document classification, exploring a relationship between the hierarchical beta process and smoothed naive Bayes models.
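To make the generative process referenced above concrete, the following is a minimal sketch of sampling a binary feature matrix from the standard Indian buffet process via its culinary metaphor; it is not code from the paper, and the function name, the `alpha` parameter, and the use of NumPy are assumptions.

```python
import numpy as np

def sample_ibp(num_customers, alpha, rng=None):
    """Sample a binary feature matrix Z from the Indian buffet process.

    Customer 1 takes Poisson(alpha) dishes; customer n takes each previously
    sampled dish k with probability m_k / n (m_k = number of earlier customers
    who took dish k), then samples Poisson(alpha / n) new dishes.
    """
    rng = np.random.default_rng() if rng is None else rng
    rows = []          # one list of 0/1 indicators per customer
    dish_counts = []   # m_k for every dish sampled so far

    for n in range(1, num_customers + 1):
        row = [int(rng.random() < m / n) for m in dish_counts]
        for k, taken in enumerate(row):        # update counts of old dishes
            dish_counts[k] += taken
        new_dishes = rng.poisson(alpha / n)    # dishes unique to this customer
        row.extend([1] * new_dishes)
        dish_counts.extend([1] * new_dishes)
        rows.append(row)

    K = len(dish_counts)
    Z = np.zeros((num_customers, K), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

# Example: 10 objects, concentration alpha = 2.0
Z = sample_ibp(10, alpha=2.0)
print(Z.shape, Z.sum(axis=0))
```

The expected number of sampled features grows like alpha times the harmonic number of the sample size, the logarithmic growth that the beta-process view of the IBP accounts for.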
The nested Chinese restaurant process and Bayesian inference of topic hierarchies
2007
"... We present the nested Chinese restaurant process (nCRP), a stochastic process which assigns probability distributions to infinitely-deep, infinitely-branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Spe ..."
Cited by 128 (15 self)
We present the nested Chinese restaurant process (nCRP), a stochastic process which assigns probability distributions to infinitely-deep, infinitely-branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP lead to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics and allocations of words to levels of the tree. We demonstrate this algorithm on collections of scientific abstracts from several journals. This model exemplifies a recent trend in statistical machine learning—the use of Bayesian nonparametric methods to infer distributions on flexible data structures.
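To illustrate the preferential attachment dynamics mentioned above, here is a hedged sketch of drawing root-to-leaf paths of fixed depth from a nested CRP; it is not the authors' code, and the dict-based tree, the `gamma` parameter name, and the fixed depth are assumptions.

```python
import random
from collections import defaultdict

def sample_ncrp_path(tree, counts, depth, gamma):
    """Sample one root-to-leaf path of length `depth` from a nested CRP.

    At each level, a child of the current node is chosen with probability
    proportional to how many earlier paths passed through it, and a new
    child is created with probability proportional to gamma.
    """
    path, node = [], "root"
    for _ in range(depth):
        children = tree[node]
        weights = [counts[c] for c in children] + [gamma]
        r = random.random() * sum(weights)
        chosen, acc = None, 0.0
        for c, w in zip(children, weights[:-1]):
            acc += w
            if r < acc:
                chosen = c
                break
        if chosen is None:                       # open a new table / child node
            chosen = f"{node}/{len(children)}"
            tree[node].append(chosen)
        counts[chosen] += 1
        path.append(chosen)
        node = chosen
    return path

# Example: paths for 5 documents through a depth-3 topic tree
tree, counts = defaultdict(list), defaultdict(int)
for _ in range(5):
    print(sample_ncrp_path(tree, counts, depth=3, gamma=1.0))
```

Because popular subtrees accumulate counts, later documents tend to reuse them, which is what drives the clustering of documents by shared topics.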
Nonparametric Latent Feature Models for Link Prediction
"... As the availability and importance of relational data—such as the friendships summarized on a social networking website—increases, it becomes increasingly important to have good models for such data. The kinds of latent structure that have been considered for use in predicting links in such networks ..."
Cited by 106 (1 self)
As the availability and importance of relational data—such as the friendships summarized on a social networking website—increases, it becomes increasingly important to have good models for such data. The kinds of latent structure that have been considered for use in predicting links in such networks have been relatively limited. In particular, the machine learning community has focused on latent class models, adapting Bayesian nonparametric methods to jointly infer how many latent classes there are while learning which entities belong to each class. We pursue a similar approach with a richer kind of latent variable—latent features—using a Bayesian nonparametric approach to infer the number of features at the same time as we learn which entities have each feature. Our model combines these inferred features with known covariates in order to perform link prediction. We demonstrate that the greater expressiveness of this approach allows us to improve performance on three datasets.
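The abstract does not state the likelihood, but latent feature relational models of this kind are commonly written with a link probability of the form sigma(z_i^T W z_j) plus a covariate term; the sketch below illustrates that form under those assumptions, with all names (`Z`, `W`, `link_probability`) chosen here rather than taken from the paper.

```python
import numpy as np

def link_probability(Z, W, covariates=None, beta=None):
    """Pairwise link probabilities for a binary latent feature model.

    Z : (N, K) binary matrix, Z[i, k] = 1 if entity i has latent feature k.
    W : (K, K) real-valued feature-interaction weights.
    covariates, beta : optional (N, N, D) pairwise covariates and their weights.
    """
    logits = Z @ W @ Z.T                        # latent-feature contribution
    if covariates is not None and beta is not None:
        logits = logits + covariates @ beta     # known-covariate contribution
    return 1.0 / (1.0 + np.exp(-logits))        # elementwise sigmoid

# Toy example: 4 entities, 2 latent features
Z = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
W = np.array([[2.0, -1.0], [-1.0, 1.5]])
print(np.round(link_probability(Z, W), 2))
```

In the nonparametric version, the number of columns of Z is not fixed in advance but inferred, for example with an IBP prior over Z.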
Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations
"... Non-parametric Bayesian techniques are considered for learning dictionaries for sparse image representations, with applications in denoising, inpainting and compressive sensing (CS). The beta process is employed as a prior for learning the dictionary, and this non-parametric method naturally infers ..."
Cited by 92 (34 self)
Non-parametric Bayesian techniques are considered for learning dictionaries for sparse image representations, with applications in denoising, inpainting and compressive sensing (CS). The beta process is employed as a prior for learning the dictionary, and this non-parametric method naturally infers an appropriate dictionary size. The Dirichlet process and a probit stick-breaking process are also considered to exploit structure within an image. The proposed method can learn a sparse dictionary in situ; training images may be exploited if available, but they are not required. Further, the noise variance need not be known, and can be nonstationary. Another virtue of the proposed method is that sequential inference can be readily employed, thereby allowing scaling to large images. Several example results are presented, using both Gibbs and variational Bayesian inference, with comparisons to other state-of-the-art approaches.
Modeling dyadic data with binary latent factors
- Neural Information Processing Systems, 2008
"... We introduce binary matrix factorization, a novel model for unsupervised matrix decomposition. The decomposition is learned by fitting a non-parametric Bayesian probabilistic model with binary latent variables to a matrix of dyadic data. Unlike bi-clustering models, which assign each row or column t ..."
Cited by 81 (17 self)
We introduce binary matrix factorization, a novel model for unsupervised matrix decomposition. The decomposition is learned by fitting a non-parametric Bayesian probabilistic model with binary latent variables to a matrix of dyadic data. Unlike bi-clustering models, which assign each row or column to a single cluster based on a categorical hidden feature, our binary feature model reflects the prior belief that items and attributes can be associated with more than one latent cluster at a time. We provide simple learning and inference rules for this new model and show how to extend it to an infinite model in which the number of features is not a priori fixed but is allowed to grow with the size of the data.
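As one concrete reading of the model described above, here is a hedged sketch in which a dyadic data matrix is generated as U W V^T with binary row and column feature matrices and real-valued interaction weights; the Gaussian noise model and all names are assumptions, not the authors' code.

```python
import numpy as np

def bmf_generate(U, W, V, noise_std=0.0, rng=None):
    """Generate a dyadic data matrix under a binary matrix factorization model:
    X = U W V^T + Gaussian noise.

    U : (N, K) binary latent features of the row entities.
    V : (M, L) binary latent features of the column entities.
    W : (K, L) real-valued interactions between row and column features.
    """
    rng = np.random.default_rng() if rng is None else rng
    X = U @ W @ V.T
    if noise_std > 0:
        X = X + rng.normal(scale=noise_std, size=X.shape)
    return X

# Toy dyadic data: 5 row entities x 4 column entities, 2 and 3 binary features
rng = np.random.default_rng(0)
U = rng.integers(0, 2, size=(5, 2))
V = rng.integers(0, 2, size=(4, 3))
W = rng.normal(size=(2, 3))
print(np.round(bmf_generate(U, W, V, noise_std=0.1, rng=rng), 2))
```

Because the rows of U and V are binary vectors rather than one-of-K indicators, an entity can belong to several latent clusters at once, which is the contrast with bi-clustering drawn in the abstract.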
Stick-breaking construction for the Indian buffet process
- In Proceedings of the International Conference on Artificial Intelligence and Statistics
"... The Indian buffet process (IBP) is a Bayesian nonparametric distribution whereby objects are modelled using an unbounded number of latent features. In this paper we derive a stick-breaking representation for the IBP. Based on this new representation, we develop slice samplers for the IBP that are ef ..."
Cited by 79 (13 self)
The Indian buffet process (IBP) is a Bayesian nonparametric distribution whereby objects are modelled using an unbounded number of latent features. In this paper we derive a stick-breaking representation for the IBP. Based on this new representation, we develop slice samplers for the IBP that are efficient, easy to implement and more generally applicable than the currently available Gibbs sampler. This representation, along with the work of Thibaux and Jordan [17], also illuminates interesting theoretical connections between the IBP, Chinese restaurant processes, beta processes and Dirichlet processes.
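The stick-breaking representation the abstract refers to can be stated compactly: with nu_i ~ Beta(alpha, 1) i.i.d., the ordered feature probabilities are mu_k = nu_1 * ... * nu_k, and each object includes feature k independently with probability mu_k. Below is a truncated sampler written as an illustrative sketch, not the paper's code; the truncation level and names are assumptions.

```python
import numpy as np

def ibp_stick_breaking(num_objects, alpha, truncation, rng=None):
    """Sample a binary feature matrix from a truncated stick-breaking
    representation of the IBP: mu_k = prod_{i <= k} nu_i with nu_i ~ Beta(alpha, 1),
    and Z[n, k] ~ Bernoulli(mu_k) independently.
    """
    rng = np.random.default_rng() if rng is None else rng
    nu = rng.beta(alpha, 1.0, size=truncation)
    mu = np.cumprod(nu)                    # decreasing feature probabilities
    Z = (rng.random((num_objects, truncation)) < mu).astype(int)
    return Z, mu

Z, mu = ibp_stick_breaking(num_objects=10, alpha=2.0, truncation=20)
print(np.round(mu, 3))       # stick lengths decay toward zero
print(Z.sum(axis=0))         # how many objects carry each feature
```

The decreasing stick lengths are what make slice sampling convenient: only finitely many features have mu_k above any given threshold.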
Nonparametric Factor Analysis with Beta Process Priors
"... We propose a nonparametric extension to the factor analysis problem using a beta process prior. This beta process factor analysis (BP-FA) model allows for a dataset to be decomposed into a linear combination of a sparse set of factors, providing information on the underlying structure of the observa ..."
Cited by 79 (26 self)
We propose a nonparametric extension to the factor analysis problem using a beta process prior. This beta process factor analysis (BP-FA) model allows a dataset to be decomposed into a linear combination of a sparse set of factors, providing information on the underlying structure of the observations. As with the Dirichlet process, the beta process is a fully Bayesian conjugate prior, which allows for analytical posterior calculation and straightforward inference. We derive a variational Bayes inference algorithm and demonstrate the model on the MNIST digits and HGDP-CEPH cell line panel datasets.
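A hedged sketch of the generative side of BP-FA as described above, using a finite beta-Bernoulli approximation to the beta process; the hyperparameter names and the approximation itself are assumptions rather than the paper's exact construction.

```python
import numpy as np

def bpfa_generate(num_obs, dim, K, a=1.0, b=1.0, noise_std=0.1, rng=None):
    """Generate data from a finite approximation to beta process factor analysis:
        pi_k  ~ Beta(a/K, b(K-1)/K)    factor usage probabilities
        z_nk  ~ Bernoulli(pi_k)        which factors observation n uses
        w_nk  ~ N(0, 1)                weights on the selected factors
        phi_k ~ N(0, I)                factor loadings
        x_n   = sum_k z_nk * w_nk * phi_k + Gaussian noise
    """
    rng = np.random.default_rng() if rng is None else rng
    pi = rng.beta(a / K, b * (K - 1) / K, size=K)
    Z = (rng.random((num_obs, K)) < pi).astype(float)    # sparse binary usage
    W = rng.normal(size=(num_obs, K))                     # factor weights
    Phi = rng.normal(size=(K, dim))                       # factor loadings
    X = (Z * W) @ Phi + rng.normal(scale=noise_std, size=(num_obs, dim))
    return X, Z, Phi

X, Z, Phi = bpfa_generate(num_obs=100, dim=16, K=30)
print(X.shape, round(Z.mean(), 3))   # fraction of active (observation, factor) pairs
```

As K grows, only a few of the pi_k remain appreciably large, so the effective number of factors is inferred from the data rather than fixed in advance.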
A Bayesian Sampling Approach to Exploration in Reinforcement Learning
"... We present a modular approach to reinforcement learning that uses a Bayesian representation of the uncertainty over models. The approach, BOSS (Best of Sampled Set), drives exploration by sampling multiple models from the posterior and selecting actions optimistically. It extends previous work by pr ..."
Cited by 74 (9 self)
We present a modular approach to reinforcement learning that uses a Bayesian representation of the uncertainty over models. The approach, BOSS (Best of Sampled Set), drives exploration by sampling multiple models from the posterior and selecting actions optimistically. It extends previous work by providing a rule for deciding when to resample and how to combine the models. We show that our algorithm achieves near-optimal reward with high probability with a sample complexity that is low relative to the speed at which the posterior distribution converges during learning. We demonstrate that BOSS performs quite favorably compared to state-of-the-art reinforcement-learning approaches and illustrate its flexibility by pairing it with a non-parametric model that generalizes across states.
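The core planning step of "Best of Sampled Set" can be sketched roughly as follows, under simplifying assumptions (Dirichlet posteriors over transitions, known mean rewards, names chosen here): sample several MDPs from the posterior, merge them into one MDP whose action set ranges over model-action pairs, solve it, and act greedily. This is an illustration, not the authors' implementation.

```python
import numpy as np

def boss_plan(counts, rewards, num_models=5, gamma=0.95, iters=300, rng=None):
    """One BOSS-style planning step (simplified sketch).

    counts  : (S, A, S) transition visit counts; the posterior over each row
              P(. | s, a) is taken to be Dirichlet(1 + counts[s, a]).
    rewards : (S, A) mean rewards, assumed known for simplicity.

    Returns a greedy policy over the original actions for the merged,
    optimistic MDP built from `num_models` posterior samples.
    """
    rng = np.random.default_rng() if rng is None else rng
    S, A, _ = counts.shape
    P = np.empty((num_models, S, A, S))          # sampled transition models
    for m in range(num_models):
        for s in range(S):
            for a in range(A):
                P[m, s, a] = rng.dirichlet(1.0 + counts[s, a])

    V = np.zeros(S)
    for _ in range(iters):
        Q = rewards[None, :, :] + gamma * np.einsum("msat,t->msa", P, V)
        V = Q.max(axis=(0, 2))                   # optimistic over models and actions
    Q = rewards[None, :, :] + gamma * np.einsum("msat,t->msa", P, V)
    policy = Q.max(axis=0).argmax(axis=1)        # best original action per state
    return policy, V

# Toy example: 3 states, 2 actions, a handful of observed transitions
rng = np.random.default_rng(0)
counts = rng.integers(0, 4, size=(3, 2, 3)).astype(float)
rewards = np.array([[0.0, 0.1], [0.2, 0.0], [1.0, 0.5]])
print(boss_plan(counts, rewards, rng=rng))
```

In the full algorithm this planning step is rerun only when some state-action pair accumulates enough new visits, which is the resampling rule the abstract alludes to.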
Bayesian nonparametric latent feature models
- In Bayesian Statistics 8, 2007
"... We are often interested in explaining data through a set of hidden factors or features. When the number of hidden features is unknown, the Indian Buffet Process (IBP) is a nonparametric latent feature model that does not bound the number of active features in dataset. However, the IBP assumes that a ..."
Cited by 66 (9 self)
We are often interested in explaining data through a set of hidden factors or features. When the number of hidden features is unknown, the Indian Buffet Process (IBP) is a nonparametric latent feature model that does not bound the number of active features in a dataset. However, the IBP assumes that all latent features are uncorrelated, making it inadequate for many real-world problems. We introduce a framework for correlated nonparametric feature models, generalising the IBP. We use this framework to generate several specific models and demonstrate applications on real-world datasets.