Results 1  10
of
732
The calculation of posterior distributions by data augmentation.
 Journal of the American Statistical Association
, 1987
"... ..."
(Show Context)
Probabilistic Inference Using Markov Chain Monte Carlo Methods
, 1993
"... Probabilistic inference is an attractive approach to uncertain reasoning and empirical learning in artificial intelligence. Computational difficulties arise, however, because probabilistic models with the necessary realism and flexibility lead to complex distributions over highdimensional spaces. R ..."
Abstract

Cited by 735 (24 self)
 Add to MetaCart
Probabilistic inference is an attractive approach to uncertain reasoning and empirical learning in artificial intelligence. Computational difficulties arise, however, because probabilistic models with the necessary realism and flexibility lead to complex distributions over highdimensional spaces. Related problems in other fields have been tackled using Monte Carlo methods based on sampling using Markov chains, providing a rich array of techniques that can be applied to problems in artificial intelligence. The "Metropolis algorithm" has been used to solve difficult problems in statistical physics for over forty years, and, in the last few years, the related method of "Gibbs sampling" has been applied to problems of statistical inference. Concurrently, an alternative method for solving problems in statistical physics by means of dynamical simulation has been developed as well, and has recently been unified with the Metropolis algorithm to produce the "hybrid Monte Carlo" method. In computer science, Markov chain sampling is the basis of the heuristic optimization technique of "simulated annealing", and has recently been used in randomized algorithms for approximate counting of large sets. In this review, I outline the role of probabilistic inference in artificial intelligence, present the theory of Markov chains, and describe various Markov chain Monte Carlo algorithms, along with a number of supporting techniques. I try to present a comprehensive picture of the range of methods that have been developed, including techniques from the varied literature that have not yet seen wide application in artificial intelligence, but which appear relevant. As illustrative examples, I use the problems of probabilistic inference in expert systems, discovery of latent classes from data, and Bayesian learning for neural networks.
Bayesian Interpolation
 NEURAL COMPUTATION
, 1991
"... Although Bayesian analysis has been in use since Laplace, the Bayesian method of modelcomparison has only recently been developed in depth. In this paper, the Bayesian approach to regularisation and modelcomparison is demonstrated by studying the inference problem of interpolating noisy data. T ..."
Abstract

Cited by 726 (17 self)
 Add to MetaCart
Although Bayesian analysis has been in use since Laplace, the Bayesian method of modelcomparison has only recently been developed in depth. In this paper, the Bayesian approach to regularisation and modelcomparison is demonstrated by studying the inference problem of interpolating noisy data. The concepts and methods described are quite general and can be applied to many other problems. Regularising constants are set by examining their posterior probability distribution. Alternative regularisers (priors) and alternative basis sets are objectively compared by evaluating the evidence for them. `Occam's razor' is automatically embodied by this framework. The way in which Bayes infers the values of regularising constants and noise levels has an elegant interpretation in terms of the effective number of parameters determined by the data set. This framework is due to Gull and Skilling.
Image denoising using a scale mixture of Gaussians in the wavelet domain
 IEEE TRANS IMAGE PROCESSING
, 2003
"... We describe a method for removing noise from digital images, based on a statistical model of the coefficients of an overcomplete multiscale oriented basis. Neighborhoods of coefficients at adjacent positions and scales are modeled as the product of two independent random variables: a Gaussian vecto ..."
Abstract

Cited by 512 (17 self)
 Add to MetaCart
(Show Context)
We describe a method for removing noise from digital images, based on a statistical model of the coefficients of an overcomplete multiscale oriented basis. Neighborhoods of coefficients at adjacent positions and scales are modeled as the product of two independent random variables: a Gaussian vector and a hidden positive scalar multiplier. The latter modulates the local variance of the coefficients in the neighborhood, and is thus able to account for the empirically observed correlation between the coefficient amplitudes. Under this model, the Bayesian least squares estimate of each coefficient reduces to a weighted average of the local linear estimates over all possible values of the hidden multiplier variable. We demonstrate through simulations with images contaminated by additive white Gaussian noise that the performance of this method substantially surpasses that of previously published methods, both visually and in terms of mean squared error.
A Bayesian Framework for the Analysis of Microarray Expression Data: Regularized tTest and Statistical Inferences of Gene Changes
 Bioinformatics
, 2001
"... Motivation: DNA microarrays are now capable of providing genomewide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory ..."
Abstract

Cited by 491 (6 self)
 Add to MetaCart
Motivation: DNA microarrays are now capable of providing genomewide patterns of gene expression across many different conditions. The first level of analysis of these patterns requires determining whether observed differences in expression are significant or not. Current methods are unsatisfactory due to the lack of a systematic framework that can accommodate noise, variability, and low replication often typical of microarray data. Results: We develop a Bayesian probabilistic framework for microarray data analysis. At the simplest level, we model logexpression values by independent normal distributions, parameterized by corresponding means and variances with hierarchical prior distributions. We derive point estimates for both parameters and hyperparameters, and regularized expressions for the variance of each gene by combining the empirical variance with a local background variance associated with neighboring genes. An additional hyperparameter, inversely related to the number of empirical observations, determines the strength of the background variance. Simulations show that these point estimates, combined with a ttest, provide a systematic inference approach that compares favorably with simple ttest or fold methods, and partly compensate for the lack of replication. Availability: The approach is implemented in a software called CyberT accessible through a Web interface at www.genomics.uci.edu/software.html. The code is available as Open Source and is written in the freely available statistical language R. and Department of Biological Chemistry, College of Medicine, University of California, Irvine. To whom all correspondence should be addressed. Contact: pfbaldi@ics.uci.edu, tdlong@uci.edu. 1
Using confidence intervals in withinsubject designs
 PSYCHONOMIC BULLETIN & REVIEW
, 1994
"... ..."
Predictive regressions
 Journal of Financial Economics
, 1999
"... When a rate of return is regressed on a lagged stochastic regressor, such as a dividend yield, the regression disturbance is correlated with the regressor's innovation. The OLS estimator's "nitesample properties, derived here, can depart substantially from the standard regression set ..."
Abstract

Cited by 462 (20 self)
 Add to MetaCart
(Show Context)
When a rate of return is regressed on a lagged stochastic regressor, such as a dividend yield, the regression disturbance is correlated with the regressor's innovation. The OLS estimator's "nitesample properties, derived here, can depart substantially from the standard regression setting. Bayesian posterior distributions for the regression parameters are obtained under speci"cations that di!er with respect to (i) prior beliefs about the autocorrelation of the regressor and (ii) whether the initial observation of the regressor is speci"ed as "xed or stochastic. The posteriors di!er across such speci"cations, and asset allocations in the presence of estimation risk exhibit sensitivity to those
Prior distributions for variance parameters in hierarchical models
 Bayesian Analysis
, 2006
"... Various noninformative prior distributions have been suggested for scale parameters in hierarchical models. We construct a new foldednoncentralt family of conditionally conjugate priors for hierarchical standard deviation parameters, and then consider noninformative and weakly informative priors i ..."
Abstract

Cited by 428 (15 self)
 Add to MetaCart
(Show Context)
Various noninformative prior distributions have been suggested for scale parameters in hierarchical models. We construct a new foldednoncentralt family of conditionally conjugate priors for hierarchical standard deviation parameters, and then consider noninformative and weakly informative priors in this family. We use an example to illustrate serious problems with the inversegamma family of “noninformative ” prior distributions. We suggest instead to use a uniform prior on the hierarchical standard deviation, using the halft family when the number of groups is small and in other settings where a weakly informative prior is desired.
Bayesian Experimental Design: A Review
 Statistical Science
, 1995
"... This paper reviews the literature on Bayesian experimental design, both for linear and nonlinear models. A unified view of the topic is presented by putting experimental design in a decision theoretic framework. This framework justifies many optimality criteria, and opens new possibilities. Various ..."
Abstract

Cited by 320 (1 self)
 Add to MetaCart
(Show Context)
This paper reviews the literature on Bayesian experimental design, both for linear and nonlinear models. A unified view of the topic is presented by putting experimental design in a decision theoretic framework. This framework justifies many optimality criteria, and opens new possibilities. Various design criteria become part of a single, coherent approach.