Variational inference for Dirichlet process mixtures
- Bayesian Analysis, 2005
Cited by 90 (12 self)
Abstract. Dirichlet process (DP) mixture models are the cornerstone of nonparametric Bayesian statistics, and the development of Monte-Carlo Markov chain (MCMC) sampling methods for DP mixtures has enabled the application of nonparametric Bayesian methods to a variety of practical data analysis problems. However, MCMC sampling can be prohibitively slow, and it is important to explore alternatives. One class of alternatives is provided by variational methods, a class of deterministic algorithms that convert inference problems into optimization problems (Opper and Saad 2001; Wainwright and Jordan 2003). Thus far, variational methods have mainly been explored in the parametric setting, in particular within the formalism of the exponential family (Attias 2000; Ghahramani and Beal 2001; Blei et al. 2003). In this paper, we present a variational inference algorithm for DP mixtures. We present experiments that compare the algorithm to Gibbs sampling algorithms for DP mixtures of Gaussians and present an application to a large-scale image analysis problem.
CODA: Convergence Diagnosis and Output Analysis Software for Gibbs sampling output Version 0.30, 1995
Cited by 47 (4 self)
ing beta ... 200 valid values Abstracting alpha ... 200 valid values Abstracting sigma ... 200 valid values Reading Data file... Abstracting beta ... 200 valid values Abstracting alpha ... 200 valid values Abstracting sigma ... 200 valid values Next, you will be prompted to specify which (if any) variables take values restricted to either the range (0, 1) or to the positive real line. CODA requires this information in order to correctly compute Gelman and Rubin (1992)'s convergence diagnostic for non-normal variables (see §4.2), and to produce kernel density estimates within the appropriate range (see §3.1). Are any variables restricted to values between 0 and 1 (y/n) ? 1: For the line example, you should respond n to this question. The next prompt to appear is as follows: Are any variables restricted to all positive values (y/n) ? 1: For the line example, you should respond y to this question, which causes the following display to appear: Available variables: +---------------+--...
Variational methods for the Dirichlet process
- In Proceedings of the 21st International Conference on Machine Learning, 2004
Cited by 33 (4 self)
Variational inference methods, including mean field methods and loopy belief propagation, have been widely used for approximate probabilistic inference in graphical models. While often less accurate than MCMC, variational methods provide a fast deterministic approximation to marginal and conditional probabilities. Such approximations can be particularly useful in high dimensional problems where sampling methods are too slow to be effective. A limitation of current methods, however, is that they are restricted to parametric probabilistic models. MCMC does not have such a limitation; indeed, MCMC samplers have been developed for the Dirichlet process (DP), a nonparametric distribution on distributions (Ferguson, 1973) that is the cornerstone of Bayesian nonparametric statistics (Escobar & West, 1995; Neal, 2000). In this paper, we develop a mean-field variational approach to approximate inference for the Dirichlet process, where the approximate posterior is based on the truncated stick-breaking construction (Ishwaran & James, 2001). We compare our approach to DP samplers for Gaussian DP mixture models.
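The abstract above names the truncated stick-breaking construction without spelling it out. The following is a minimal sketch of how truncated stick-breaking weights are commonly drawn, not the paper's algorithm: the function name `truncated_stick_breaking`, the Beta(1, alpha) proportions, and the convention that the last component absorbs the remaining stick are assumptions made for illustration.

```python
import random


def truncated_stick_breaking(alpha, truncation, rng):
    """Draw mixture weights from a stick-breaking prior cut off at `truncation` components.

    Assumed convention (not stated in the abstract): V_k ~ Beta(1, alpha) for
    k < truncation, and the final proportion is fixed at 1 so the weights sum to 1.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick to break off
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)  # truncation step: last component takes what is left
    return weights


rng = random.Random(0)
weights = truncated_stick_breaking(alpha=1.0, truncation=10, rng=rng)
```

Smaller values of `alpha` tend to put most of the mass on the first few components, which is why a modest truncation level can be a workable approximation.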
The Number of Iterations, Convergence Diagnostics and Generic Metropolis Algorithms
- In Practical Markov Chain Monte Carlo (W.R. Gilks, D.J. Spiegelhalter and ..., 1995
Cited by 17 (3 self)
Introduction. In order to use Markov chain Monte Carlo (MCMC), it is necessary to determine how long the simulation needs to be run. It is also a good idea to discard a number of initial "burn-in" simulations, since from an arbitrary starting point it would be unlikely that the initial simulations came from the stationary distribution intended for the Markov chain. Also, consecutive simulations from Markov chains are dependent, sometimes highly so. Since saving all simulations can require a large amount of storage, researchers using MCMC sometimes prefer saving only every third, fifth, tenth, etc. simulation, especially if the chain is highly dependent. This is sometimes referred to as thinning the chain. While neither burn-in nor thinning is a mandatory practice, both reduce the amount of data saved from an MCMC run. In this chapter, we outline a way of determining in advance the number of iterations needed for a given level of precision in an MCMC algorithm.
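Burn-in and thinning as described above amount to simple slicing of the saved draws. The sketch below illustrates this on a toy first-order autoregressive chain that stands in for sampler output; the chain, the helper names `ar1_chain` and `burn_and_thin`, and the particular burn-in and thinning values are all hypothetical and are not taken from the chapter.

```python
import random


def ar1_chain(n_steps, rho, start, rng):
    """Simulate a dependent chain x_t = rho * x_{t-1} + noise (a stand-in for MCMC output)."""
    chain = []
    x = start
    for _ in range(n_steps):
        x = rho * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def burn_and_thin(chain, burn_in, thin):
    """Discard the first `burn_in` draws, then keep every `thin`-th draw of the rest."""
    return chain[burn_in::thin]


rng = random.Random(1)
# Start far from the stationary mean (0) so the early draws are unrepresentative.
chain = ar1_chain(n_steps=5000, rho=0.9, start=50.0, rng=rng)
kept = burn_and_thin(chain, burn_in=500, thin=10)
```

Here 5000 draws shrink to 450 stored values; how large the burn-in and thinning interval should be is exactly the question the chapter addresses, and the numbers used here are placeholders.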
Publication Bias in Meta-Analysis: A Bayesian Data-Augmentation Approach to Account for Issues Exemplified in the Passive Smoking Debate
- Statistical Science, 1997
Cited by 10 (5 self)
'Publication bias' is a relatively new statistical phenomenon that only arises when one attempts through a meta-analysis to review all studies, significant or insignificant, in order to provide a total perspective on a particular issue. This has recently received some notoriety as an issue in the evaluation of the relative risk of lung cancer associated with passive smoking, following legal challenges to a 1992 EPA analysis which concluded that such exposure is associated with significant excess risk of lung cancer. We introduce a Bayesian approach which estimates and adjusts for publication bias. Estimation is based on a data augmentation principle within a hierarchical model, and the number and outcomes of unobserved studies are simulated using Gibbs sampling methods. This technique yields a quantitative adjustment for the passive smoking meta-analysis. We estimate that there may be both negative and positive but insignificant studies omitted, and that failing to allow for these woul...
Bayesian Methods for Change-point Detection in Long-range Dependent Processes, 2001
Cited by 4 (0 self)
We describe a Bayesian method for detecting structural changes in a long-range dependent process. In particular, we focus on changes in the long-range dependence parameter, d, and changes in the process level, µ. Markov chain Monte Carlo methods are used to estimate the posterior probability and size of a change at time t, along with other model parameters. A time-dependent Kalman filter approach is used to evaluate the likelihood of the fractionally integrated ARMA model characterizing the long-range dependence. The method allows for multiple change points and can be extended to the long-memory stochastic volatility case. We apply the method to investigate a change in persistence of the yearly Nile River minima. We also use the method to investigate structural changes in the series of durations between intraday trades of IBM stock on the New York Stock Exchange and to detect structural breaks in daily stock returns for the Coca Cola Company during the 1990’s.
Factor analysis and outliers: A Bayesian approach, 1999
Cited by 4 (0 self)
Classical factor analysis decomposes n observations of dimension p into K (< p) orthogonal factors. In a Bayesian approach we decompose the observation matrix into a product of a factor score and a factor loading matrix of unknown rank by using a normal-Wishart conjugate density family. We assume an informative prior and show how the posterior distribution can be simulated in multivariate blocks by a Gibbs sampling algorithm. The number of factors is determined using the ordinary marginal likelihood and the posterior marginal likelihood criteria. Furthermore, the sensitivity of the factor analysis with respect to outliers in the data set is explored. Assuming additive outliers, a Gibbs sampling approach is suggested for a multivariate outlier model in extension of the approach of Verdinelli and Wasserman (1991). The approach is demonstrated for the language data set of Fuller (1987). Keywords: Factor analysis, Gibbs sampling, marginal likelihood, multivariate outliers.
Markov Chain Monte Carlo and Gibbs Sampling, 2004
Cited by 3 (0 self)
A major limitation towards more widespread implementation of Bayesian approaches is that obtaining the posterior distribution often requires the integration of high-dimensional functions. This can be computationally very difficult, but several approaches short of direct integration have been proposed (reviewed by Smith 1991, Evans and Swartz 1995, Tanner 1996). We focus here on Markov Chain Monte Carlo (MCMC) methods, which attempt to simulate direct draws from some complex distribution of interest. MCMC approaches are so-named because one uses the previous sample values to randomly generate the next sample value, generating a Markov chain (as the transition probabilities between sample values are only a function of the most recent sample value). The realization in the early 1990’s (Gelfand and Smith 1990) that one particular MCMC method, the Gibbs sampler, is very widely applicable to a broad class of Bayesian problems has sparked a major increase in the application of Bayesian analysis, and this interest is likely to continue expanding for some time to come. MCMC methods have their roots in the Metropolis algorithm (Metropolis and Ulam 1949, Metropolis et al. 1953), an attempt by physicists to compute complex integrals by expressing them as expectations for some distribution and then estimating this expectation by drawing samples from that distribution. The Gibbs sampler (Geman and Geman 1984) has its origins in image processing. It is thus somewhat ironic that the powerful machinery of MCMC methods had essentially no impact on the field of statistics until rather recently. Excellent (and detailed) treatments of MCMC methods are found in Tanner (1996) and Chapter two of Draper (2000). Additional references are given in the particular sections below.
MONTE CARLO INTEGRATION
The original Monte Carlo approach was a method developed by physicists to use random number generation to compute integrals.
Suppose we wish to compute a complex integral
$$\int_a^b h(x)\,dx \tag{1a}$$
If we can decompose h(x) into the product of a function f(x) and a probability ...
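The excerpt breaks off before the decomposition is completed, so the following is only a sketch of one standard way to estimate the integral of h(x) from a to b by random sampling. It assumes the probability density is uniform on (a, b), so that the integral equals (b - a) times the expected value of h(x); that choice, the function name `mc_integrate`, and the test integrand sin(x) on (0, pi) are illustrative assumptions and not taken from the text.

```python
import math
import random


def mc_integrate(h, a, b, n_samples, rng):
    """Estimate the integral of h over (a, b) as (b - a) * mean of h at uniform draws."""
    total = 0.0
    for _ in range(n_samples):
        x = rng.uniform(a, b)  # draw from the assumed uniform density on (a, b)
        total += h(x)
    return (b - a) * total / n_samples


rng = random.Random(42)
# The exact value of the integral of sin(x) over (0, pi) is 2.
estimate = mc_integrate(math.sin, 0.0, math.pi, 100_000, rng)
```

The error of such an estimate shrinks roughly like one over the square root of the number of samples, so more draws buy accuracy only slowly.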
Estimating L¹ Error of Kernel Estimator: Monitoring Convergence of Markov Samplers
In many Markov chain Monte Carlo problems, the target density function is known up to a normalization constant. In this paper, we take advantage of this knowledge to facilitate the convergence diagnostic of a Markov sampler by estimating the L¹ error of a kernel estimator. Firstly, we propose an estimator of the normalization constant which is shown to be asymptotically normal under mixing and moment conditions. Secondly, the L¹ error of the kernel estimator is estimated using the normalization constant estimator, and the ratio of the estimated L¹ error to the true L¹ error is shown to converge to 1 in probability under similar conditions. Thirdly, we propose a sequential plot of the estimated L¹ error as a tool to monitor the convergence of the Markov sampler. Finally, a 2-dimensional bimodal example is given to illustrate the proposal, and two Markov samplers are compared in the example using the proposed diagnostic plot. KEY WORDS: β-mixing; Diagnostic; Normalization...

