Results 1-10 of 515
Markov chains for exploring posterior distributions
Annals of Statistics, 1994
Cited by 1136 (6 self)
Bayesian density estimation and inference using mixtures.
J. Amer. Statist. Assoc., 1995
Cited by 653 (18 self)
We describe and illustrate Bayesian inference in models for density estimation using mixtures of Dirichlet processes. These models provide natural settings for density estimation and are exemplified by special cases where data are modeled as a sample from mixtures of normal distributions. Efficient simulation methods are used to approximate various prior, posterior, and predictive distributions. This allows for direct inference on a variety of practical issues, including problems of local versus global smoothing, uncertainty about density estimates, assessment of modality, and the inference on the numbers of components. Also, convergence results are established for a general class of normal mixture models.
Evaluating the Accuracy of Sampling-Based Approaches to the Calculation of Posterior Moments
In Bayesian Statistics, 1992
Cited by 604 (12 self)
Data augmentation and Gibbs sampling are two closely related, sampling-based approaches to the calculation of posterior moments. The fact that each produces a sample whose constituents are neither independent nor identically distributed complicates the assessment of convergence and numerical accuracy of the approximations to the expected value of functions of interest under the posterior. In this paper methods from spectral analysis are used to evaluate numerical accuracy formally and construct diagnostics for convergence. These methods are illustrated in the normal linear model with informative priors, and in the Tobit-censored regression model.
General methods for monitoring convergence of iterative simulations
J. Comput. Graph. Statist., 1998
Cited by 551 (8 self)
We generalize the method proposed by Gelman and Rubin (1992a) for monitoring the convergence of iterative simulations by comparing between and within variances of multiple chains, in order to obtain a family of tests for convergence. We review methods of inference from simulations in order to develop convergence-monitoring summaries that are relevant for the purposes for which the simulations are used. We recommend applying a battery of tests for mixing based on the comparison of inferences from individual sequences and from the mixture of sequences. Finally, we discuss multivariate analogues, for assessing convergence of several parameters simultaneously.
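The comparison of between-chain and within-chain variances mentioned in this abstract can be illustrated with a minimal sketch of the basic potential scale reduction factor. This is only the simplest univariate form, without the corrections and multivariate extensions the paper develops; the function names and the AR(1) toy sampler are hypothetical and are not taken from the paper.

```python
import random
import statistics


def potential_scale_reduction(chains):
    """Basic potential scale reduction factor for m chains of equal length n."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: sample variance of the chain means.
    between_over_n = statistics.variance(chain_means)
    # Pooled variance estimate combining both sources of variation.
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5


def ar1_chain(start, n, rng, rho=0.9):
    """Hypothetical toy sampler: AR(1) chain whose stationary law is N(0, 1)."""
    scale = (1.0 - rho * rho) ** 0.5
    x = start
    out = []
    for _ in range(n):
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        out.append(x)
    return out


rng = random.Random(0)
starts = [-10.0, -3.0, 3.0, 10.0]  # deliberately overdispersed starting points

# Short runs are still dominated by their starting values.
short_runs = [ar1_chain(s, 20, rng) for s in starts]
# Long runs, with the first half discarded as burn-in.
long_runs = [ar1_chain(s, 5000, rng)[2500:] for s in starts]

r_short = potential_scale_reduction(short_runs)
r_long = potential_scale_reduction(long_runs)
print(f"short runs: {r_short:.3f}, long runs: {r_long:.3f}")
```

A value near 1 suggests that the chains have mixed; a value well above 1 suggests that the between-chain variation is still large relative to the within-chain variation.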
Markov chain Monte Carlo convergence diagnostics
JASA, 1996
Cited by 371 (6 self)
A critical issue for users of Markov Chain Monte Carlo (MCMC) methods in applications is how to determine when it is safe to stop sampling and use the samples to estimate characteristics of the distribution of interest. Research into methods of computing theoretical convergence bounds holds promise for the future but currently has yielded relatively little that is of practical use in applied work. Consequently, most MCMC users address the convergence problem by applying diagnostic tools to the output produced by running their samplers. After giving a brief overview of the area, we provide an expository review of thirteen convergence diagnostics, describing the theoretical basis and practical implementation of each. We then compare their performance in two simple models and conclude that all the methods can fail to detect the sorts of convergence failure they were designed to identify. We thus recommend a combination of strategies aimed at evaluating and accelerating MCMC sampler convergence, including applying diagnostic procedures to a small number of parallel chains, monitoring autocorrelations and cross-correlations, and modifying parameterizations or sampling algorithms appropriately. We emphasize, however, that it is not possible to say with certainty that a finite sample from an MCMC algorithm is representative of an underlying stationary distribution.
Being Bayesian about network structure
Machine Learning, 2000
Cited by 299 (3 self)
In many multivariate domains, we are interested in analyzing the dependency structure of the underlying distribution, e.g., whether two variables are in direct interaction. We can represent dependency structures using Bayesian network models. To analyze a given data set, Bayesian model selection attempts to find the most likely (MAP) model, and uses its structure to answer these questions. However, when the amount of available data is modest, there might be many models that have non-negligible posterior. Thus, we want to compute the Bayesian posterior of a feature, i.e., the total posterior probability of all models that contain it. In this paper, we propose a new approach for this task. We first show how to efficiently compute a sum over the exponential number of networks that are consistent with a fixed order over network variables. This allows us to compute, for a given order, both the marginal probability of the data and the posterior of a feature. We then use this result as the basis for an algorithm that approximates the Bayesian posterior of a feature. Our approach uses a Markov Chain Monte Carlo (MCMC) method, but over orders rather than over network structures. The space of orders is smaller and more regular than the space of structures, and has a much smoother posterior "landscape". We present empirical results on synthetic and real-life datasets that compare our approach to full model averaging (when possible), to MCMC over network structures, and to a non-Bayesian bootstrap approach.
Variational inference for Dirichlet process mixtures
Bayesian Analysis, 2005
Cited by 244 (27 self)
Dirichlet process (DP) mixture models are the cornerstone of nonparametric Bayesian statistics, and the development of Monte-Carlo Markov chain (MCMC) sampling methods for DP mixtures has enabled the application of nonparametric Bayesian methods to a variety of practical data analysis problems. However, MCMC sampling can be prohibitively slow, and it is important to explore alternatives. One class of alternatives is provided by variational methods, a class of deterministic algorithms that convert inference problems into optimization problems (Opper and Saad 2001; Wainwright and Jordan 2003). Thus far, variational methods have mainly been explored in the parametric setting, in particular within the formalism of the exponential family (Attias 2000; Ghahramani and Beal 2001; Blei et al. 2003). In this paper, we present a variational inference algorithm for DP mixtures. We present experiments that compare the algorithm to Gibbs sampling algorithms for DP mixtures of Gaussians and present an application to a large-scale image analysis problem.
Efficient Simulation from the Multivariate Normal and Student-t Distributions Subject to Linear Constraints and the Evaluation of Constraint Probabilities
1991
Cited by 211 (10 self)
The construction and implementation of a Gibbs sampler for efficient simulation from the truncated multivariate normal and Student-t distributions is described. It is shown how the accuracy and convergence of integrals based on the Gibbs sample may be constructed, and how an estimate of the probability of the constraint set under the unrestricted distribution may be produced.
Keywords: Bayesian inference; Gibbs sampler; Monte Carlo; multiple integration; truncated normal
This paper was prepared for a presentation at the meeting Computing Science and Statistics: the Twenty-Third Symposium on the Interface, Seattle, April 22-24, 1991. Research assistance from Zhenyu Wang and financial support from National Science Foundation Grant SES-8908365 are gratefully acknowledged. The software for the examples may be requested by electronic mail, and will be returned by that medium.
1. Introduction
The generation of random samples from a truncated multivariate normal distribution, that is, a ...
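As a rough illustration of the idea in this abstract, the sketch below runs a Gibbs sampler on a bivariate normal with unit variances and correlation rho, truncated to the positive orthant. It is a toy special case, not the paper's algorithm: the constraint set, the function names, and the inverse-CDF draw for each univariate truncated normal are all assumptions made for this example.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def draw_truncated_below(mean, sd, lower, rng):
    """Draw from N(mean, sd^2) restricted to [lower, inf) by inverting the CDF."""
    p_low = STD_NORMAL.cdf((lower - mean) / sd)
    # Uniform draw on the part of the probability scale above the cut point.
    p = p_low + rng.random() * (1.0 - p_low)
    # Keep p strictly inside (0, 1) so that inv_cdf is defined.
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    # max() guards the constraint against floating-point rounding.
    return max(lower, mean + sd * STD_NORMAL.inv_cdf(p))


def gibbs_positive_orthant(rho, n_draws, burn_in=200, seed=0):
    """Gibbs sampler for a correlated bivariate normal with x1 >= 0 and x2 >= 0."""
    rng = random.Random(seed)
    cond_sd = (1.0 - rho * rho) ** 0.5  # sd of each full conditional
    x1, x2 = 1.0, 1.0                   # arbitrary feasible starting point
    draws = []
    for step in range(burn_in + n_draws):
        # Each full conditional is a univariate normal truncated at zero.
        x1 = draw_truncated_below(rho * x2, cond_sd, 0.0, rng)
        x2 = draw_truncated_below(rho * x1, cond_sd, 0.0, rng)
        if step >= burn_in:
            draws.append((x1, x2))
    return draws


draws = gibbs_positive_orthant(rho=0.7, n_draws=5000)
mean_x1 = sum(x1 for x1, _ in draws) / len(draws)
print(f"estimated E[x1 | x1 >= 0, x2 >= 0]: {mean_x1:.3f}")
```

Successive draws are dependent, so the accuracy of an average such as `mean_x1` cannot be judged as if the sample were independent.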
Simulating ratios of normalizing constants via a simple identity: A theoretical exploration
Statistica Sinica, 1996
Cited by 187 (3 self)
Let $p_i(w)$, $i = 1, 2$, be two densities with common support where each density is known up to a normalizing constant: $p_i(w) = q_i(w)/c_i$. We have draws from each density (e.g., via Markov chain Monte Carlo), and we want to use these draws to simulate the ratio of the normalizing constants, $c_1/c_2$. Such a computational problem is often encountered in likelihood and Bayesian inference, and arises in fields such as physics and genetics. Many methods proposed in statistical and other literature (e.g., computational physics) for dealing with this problem are based on various special cases of the following simple identity: $\frac{c_1}{c_2} = \frac{E_2[q_1(w)\alpha(w)]}{E_1[q_2(w)\alpha(w)]}$. Here $E_i$ denotes the expectation with respect to $p_i$ ($i = 1, 2$), and $\alpha$ is an arbitrary function such that the denominator is nonzero. A main purpose of this paper is to provide a theoretical study of the usefulness of this identity, with focus on (asymptotically) optimal and practical choices of $\alpha$. Using a simple but informative example, we demonstrate that with sensible (not necessarily optimal) choices of $\alpha$, we can reduce the simulation error by orders of magnitude when compared to the conventional importance sampling method, which corresponds to $\alpha = 1/q_2$. We also introduce several generalizations of this identity for handling more complicated settings (e.g., estimating several ratios simultaneously) and pose several open problems that appear to have practical as well as theoretical value. Furthermore, we discuss related theoretical and empirical work.
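The identity in this abstract can be checked numerically with a small sketch. Everything concrete here is an assumption chosen for illustration rather than the paper's example: $q_1$ is an unnormalized N(0, 1) density, $q_2$ an unnormalized N(0, 4) density (so $c_1/c_2 = 1/2$), and independent draws stand in for Markov chain output. Two choices of $\alpha$ are tried: $\alpha = 1/q_2$, which the abstract identifies with importance sampling, and $\alpha = 1/\sqrt{q_1 q_2}$, a geometric-mean choice.

```python
import math
import random


def q1(w):
    """Unnormalized N(0, 1) density; its normalizing constant is sqrt(2*pi)."""
    return math.exp(-0.5 * w * w)


def q2(w):
    """Unnormalized N(0, 2^2) density; its normalizing constant is 2*sqrt(2*pi)."""
    return math.exp(-0.5 * (w / 2.0) ** 2)


def ratio_estimate(alpha, draws1, draws2):
    """Estimate c1/c2 as E2[q1 * alpha] / E1[q2 * alpha] from the two samples."""
    numerator = sum(q1(w) * alpha(w) for w in draws2) / len(draws2)
    denominator = sum(q2(w) * alpha(w) for w in draws1) / len(draws1)
    return numerator / denominator


rng = random.Random(1)
n = 20000
draws1 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # sample from p1
draws2 = [rng.gauss(0.0, 2.0) for _ in range(n)]  # sample from p2

# alpha = 1/q2: the denominator reduces to 1, leaving an importance-sampling average.
importance = ratio_estimate(lambda w: 1.0 / q2(w), draws1, draws2)
# alpha = 1/sqrt(q1 * q2): a geometric-mean choice that uses both samples.
geometric = ratio_estimate(lambda w: 1.0 / math.sqrt(q1(w) * q2(w)), draws1, draws2)

true_ratio = 0.5  # sqrt(2*pi) / (2*sqrt(2*pi))
print(f"importance: {importance:.4f}, geometric: {geometric:.4f}, true: {true_ratio}")
```

Both estimators target the same ratio; which one has the smaller error depends on how much the two densities overlap, which is the question the paper studies.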