Results 1 - 9 of 9
Reversible jump Markov chain Monte Carlo computation and Bayesian model determination
- Biometrika, 1995
Cited by 578 (18 self)
This article proposes a new framework for the construction of reversible Markov chain samplers that jump between parameter subspaces of differing dimensionality, which is flexible and entirely constructive. It should therefore have wide applicability in model determination problems. The methodology is illustrated with applications to multiple change-point analysis in one and two dimensions, and to a Bayesian comparison of binomial experiments.
Some key words: Change-point analysis, Image segmentation, Jump diffusion, Markov chain Monte Carlo, Multiple binomial experiments, Multiple shrinkage, Step function, Voronoi tessellation.
Multiple Shrinkage and Subset Selection in Wavelets
- 1997
Cited by 91 (16 self)
This paper discusses Bayesian methods for multiple shrinkage estimation in wavelets. Wavelets are used in applications for data denoising, via shrinkage of the coefficients towards zero, and for data compression, by shrinkage and setting small coefficients to zero. We approach wavelet shrinkage by using Bayesian hierarchical models, assigning a positive prior probability to the wavelet coefficients being zero. The resulting estimator for the wavelet coefficients is a multiple shrinkage estimator that exhibits a wide variety of nonlinear shrinkage patterns. We discuss fast computational implementations, with a focus on easy-to-compute analytic approximations as well as importance sampling and Markov chain Monte Carlo methods. Multiple shrinkage estimators prove to have excellent mean squared error performance in reconstructing standard test functions. We demonstrate this in simulated test examples, comparing various implementations of multiple shrinkage to commonly used shrinkage rules. Finally, we illustrate our approach with an application to the so-called "glint" data.
Prediction via Orthogonalized Model Mixing
- Journal of the American Statistical Association, 1994
Cited by 38 (8 self)
In this paper we introduce an approach and algorithms for model mixing in large prediction problems with correlated predictors. We focus on the choice of predictors in linear models, and mix over possible subsets of candidate predictors. Our approach is based on expressing the space of models in terms of an orthogonalization of the design matrix. Advantages are both statistical and computational. Statistically, orthogonalization often leads to a reduction in the number of competing models by eliminating correlations. Computationally, large model spaces cannot be enumerated; recent approaches are based on sampling models with high posterior probability via Markov chains. Based on orthogonalization of the space of candidate predictors, we can approximate the posterior probabilities of models by products of predictor-specific terms. This leads to an importance sampling function for sampling directly from the joint distribution over the model space, without resorting to Markov chains. Comp...
Orthogonalizations and Prior Distributions for Orthogonalized Model Mixing
- In Modelling and Prediction, 1996
Cited by 6 (3 self)
Prediction methods based on mixing over a set of plausible models can help alleviate the sensitivity of inference and decisions to modeling assumptions. One important application area is prediction in linear models. Computing techniques for model mixing in linear models include Markov chain Monte Carlo methods as well as importance sampling. Clyde, DeSimone and Parmigiani (1996) developed an importance sampling strategy based on expressing the space of predictors in terms of an orthogonal basis. This leads both to a better identified problem and to simple approximations to the posterior model probabilities. Such approximations can be used to construct efficient importance samplers. For brevity, we call this strategy orthogonalized model mixing. Two key elements of orthogonalized model mixing are: a) the orthogonalization method and b) the prior probability distributions assigned to the models and the coefficients. In this paper we consider in further detail the specification of these t...
Distribution of eigenvalues and eigenvectors of Wishart matrix when the population eigenvalues are infinitely dispersed
- 2002
Cited by 6 (5 self)
We consider the asymptotic joint distribution of the eigenvalues and eigenvectors of Wishart matrix when the population eigenvalues become infinitely dispersed. We show that the normalized sample eigenvalues and the relevant elements of the sample eigenvectors are asymptotically all mutually independently distributed. The limiting distributions of the normalized sample eigenvalues are chi-squared distributions with varying degrees of freedom and the distribution of the relevant elements of the eigenvectors is the standard normal distribution. As an application of this result, we investigate tail minimaxity in the estimation of the population covariance matrix of Wishart distribution with respect to Stein's loss function and the quadratic loss function. Under mild regularity conditions, we show that the behavior of a broad class of minimax estimators is identical when the sample eigenvalues become infinitely dispersed.
Keywords and phrases: asymptotic distribution, covariance matrix, minimax estimator, quadratic loss, singular parameter, Stein's loss, tail minimaxity.
Improved minimax predictive densities under Kullback–Leibler loss
- Ann. Statist., 2006
Cited by 3 (0 self)
Let $X \mid \mu \sim N_p(\mu, v_x I)$ and $Y \mid \mu \sim N_p(\mu, v_y I)$ be independent p-dimensional multivariate normal vectors with common unknown mean $\mu$. Based on only observing $X = x$, we consider the problem of obtaining a predictive density $\hat{p}(y \mid x)$ for $Y$ that is close to $p(y \mid \mu)$ as measured by expected Kullback–Leibler loss. A natural procedure for this problem is the (formal) Bayes predictive density $\hat{p}_U(y \mid x)$ under the uniform prior $\pi_U(\mu) \equiv 1$, which is best invariant and minimax. We show that any Bayes predictive density will be minimax if it is obtained by a prior yielding a marginal that is superharmonic or whose square root is superharmonic. This yields wide classes of minimax procedures that dominate $\hat{p}_U(y \mid x)$, including Bayes predictive densities under superharmonic priors. Fundamental similarities and differences with the parallel theory of estimating a multivariate normal mean under quadratic loss are described.
Robust Hierarchical Bayes Methodology for Clinical Studies
- 1996
Outlier observations can have an adverse effect on statistical inference. Identification and elimination of such observations is one option; however, dealing with outliers in this manner has many drawbacks. An alternative approach is to utilize statistical methods that are robust to outliers. Robustness is a desirable property of statistical estimators because it ensures that the estimator reflects the pattern in the majority of the data, without being too sensitive to a handful of outliers. In this dissertation robust methodology for constructing empirical Bayes confidence intervals is presented. Three different robust models are proposed: a variance inflation model, a location-shift model and a heavy-tailed model. These three general types of models are described within a hierarchical Bayes framework and are applied in two separate contexts. In the first, we apply the robust methodologies to the normal means problem, and in the second we apply them to the modelling of longitudinal data by random-effects models. The Gibbs sampler is used for analysis of these complex models. Four alternative types of confidence intervals are proposed and evaluated. The proposed
a discussion
In this discussion of Polson and Scott, we emphasize the links with the classical shrinkage literature. It is quite pleasant to witness the links made by Polson and Scott between the current sparse modeling strategies and the more classical (or James-Stein) shrinkage literature of the 70’s and 80’s that was instrumental in the first author’s (CPR) personal Bayesian epiphany! Nevertheless, we have some reservations about this unification process in that (a) MAP estimators do not fit a decision-theoretic framework and (b) the classical shrinkage approach is somewhat adverse to sparsity. Indeed, as shown in Judge and Bock (1978), the so-called pre-test estimators that took the value zero with positive probability are inadmissible and dominated by smooth shrinkage estimators under the classical losses. While the efficiency of priors (relative to others) is not clearly defined in Polson and Scott’s paper, the use of a mean sum of squared errors in Table 1 seems to indicate the authors favour the quadratic loss (Berger, 1985) at the core of the James-Stein literature. It would be of considerable interest to connect sparseness and minimaxity, if at all possible.

