Results 1-10 of 22
On Sequential Monte Carlo Sampling Methods for Bayesian Filtering
 STATISTICS AND COMPUTING
, 2000
Cited by 1051 (76 self)
In this article, we present an overview of methods for sequential simulation from posterior distributions. These methods are of particular interest in Bayesian filtering for discrete-time dynamic models that are typically nonlinear and non-Gaussian. A general importance sampling framework is developed that unifies many of the methods which have been proposed over the last few decades in several different scientific disciplines. Novel extensions to the existing methods are also proposed. We show in particular how to incorporate local linearisation methods similar to those which have previously been employed in the deterministic filtering literature; these lead to very effective importance distributions. Furthermore, we describe a method which uses Rao-Blackwellisation in order to take advantage of the analytic structure present in some important classes of state-space models. In a final section we develop algorithms for prediction, smoothing and evaluation of the likelihood in dynamic models.
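As a rough illustration of the kind of sequential importance sampling scheme this abstract surveys, the sketch below runs a plain bootstrap particle filter. It is not the article's algorithm: the scalar model (x_t = phi * x_{t-1} + noise, y_t = x_t + noise), the parameter names, and the choice of the prior transition as importance distribution are all assumptions made for the example. The article argues for better importance distributions, such as those obtained by local linearisation.

```python
import math
import random


def bootstrap_filter(observations, n_particles=500, phi=0.9,
                     sigma_x=1.0, sigma_y=1.0, seed=0):
    """Return filtered posterior-mean estimates for a toy AR(1)-plus-noise model.

    Hypothetical model (an assumption for this sketch):
        x_t = phi * x_{t-1} + N(0, sigma_x^2)
        y_t = x_t + N(0, sigma_y^2)
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, sigma_x) for _ in range(n_particles)]
    means = []
    for y in observations:
        # Propagate each particle through the state transition
        # (importance distribution = prior transition).
        particles = [phi * x + rng.gauss(0.0, sigma_x) for x in particles]
        # Weight by the observation likelihood, in log space for stability.
        log_w = [-0.5 * ((y - x) / sigma_y) ** 2 for x in particles]
        top = max(log_w)
        w = [math.exp(lw - top) for lw in log_w]
        total = sum(w)
        w = [wi / total for wi in w]
        # Weighted average approximates E[x_t | y_1..t].
        means.append(sum(wi * xi for wi, xi in zip(w, particles)))
        # Multinomial resampling to counter weight degeneracy.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return means
```

Calling `bootstrap_filter([5.0] * 20)` returns one estimate per observation; for this linear Gaussian toy model the estimates can be checked against a Kalman filter.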
On sequential simulation-based methods for Bayesian filtering
, 1998
Cited by 251 (12 self)
In this report, we present an overview of sequential simulation-based methods for Bayesian filtering of nonlinear and non-Gaussian dynamic models. It includes in a general framework numerous methods proposed independently in various areas of science and proposes some original developments.
Clustering microarray gene expression data using weighted Chinese restaurant process
 Bioinformatics
, 2006
Cited by 31 (3 self)
Motivation: Clustering microarray gene expression data is a powerful tool for elucidating co-regulatory relationships among genes. Many different clustering techniques have been successfully applied and the results are promising. However, substantial fluctuation contained in microarray data, lack of knowledge on the number of clusters and complex regulatory mechanisms underlying biological systems make the clustering problems tremendously challenging. Results: We devised an improved model-based Bayesian approach to cluster microarray gene expression data. Cluster assignment is carried out by an iterative weighted Chinese restaurant seating scheme such that the optimal number of clusters can be determined simultaneously with cluster assignment. The predictive updating technique was applied to improve the efficiency of the Gibbs sampler. An additional step is added during reassignment to allow genes that display complex correlation relationships, such as time-shifted and/or inverted, to be clustered together. Analysis done on a real dataset showed that as much as 30% of significant genes clustered in the same group display complex relationships with the consensus pattern of the cluster. Other notable features include automatic handling of missing data, quantitative measures of cluster strength and assignment confidence. Synthetic and real microarray gene expression datasets were analyzed to demonstrate its performance. Availability: A computer program named Chinese restaurant cluster (CRC) has been developed based on this algorithm. The program can be downloaded at
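To make the seating metaphor concrete, here is a minimal sketch of the plain (unweighted) Chinese restaurant process, in which each new item joins an existing table with probability proportional to its current size or opens a new table with probability proportional to a concentration parameter. The function name and the `alpha` parameter are assumptions for illustration; the weighting, predictive updating and reassignment steps described in the abstract are not reproduced here.

```python
import random


def crp_assignments(n_items, alpha=1.0, seed=0):
    """Seat n_items sequentially under a plain Chinese restaurant process.

    Returns (labels, counts): the table index of each item and the
    number of items at each table. `alpha` is the concentration parameter.
    """
    rng = random.Random(seed)
    counts = []   # counts[k] = number of items currently at table k
    labels = []
    for _ in range(n_items):
        # Existing tables are weighted by occupancy; the last slot is a new table.
        weights = counts + [alpha]
        table = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if table == len(counts):
            counts.append(1)      # open a new table
        else:
            counts[table] += 1    # join an existing table
        labels.append(table)
    return labels, counts
```

Because new tables can always be opened, the number of clusters is not fixed in advance, which is the property the abstract relies on when it says the number of clusters is determined together with the assignment.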
Bayesian Linear Regression
, 1999
Cited by 20 (0 self)
This note derives the posterior, evidence, and predictive density for linear multivariate regression under zero-mean Gaussian noise. Many Bayesian texts, such as Box & Tiao (1973), cover linear regression. This note contributes to the discussion by paying careful attention to invariance issues, demonstrating model selection based on the evidence, and illustrating the shape of the predictive density. Piecewise regression and basis function regression are also discussed. 1 Introduction. The data model is that an input vector x of length m multiplies a coefficient matrix A to produce an output vector y of length d, with Gaussian noise added: $y = Ax + e$ (1), $e \sim \mathcal{N}(0, V)$ (2), $p(y \mid x, A, V) \sim \mathcal{N}(Ax, V)$ (3). This is a conditional model for y only: the distribution of x is not needed and is in fact irrelevant to all inferences in this paper. As we shall see, conditional models create subtleties in Bayesian inference. In the special case x = 1 and m = 1, the conditioning disappears and we simply have a ...
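The data model in equations (1) to (3) can be sketched in a few lines. The example below draws y given x and evaluates the conditional log-density, under the simplifying assumption (not made in the note) that the noise covariance V is diagonal, so it is passed as a list of per-output variances. All function names are hypothetical.

```python
import math
import random


def mat_vec(A, x):
    """Multiply a d-by-m matrix (list of rows) by a length-m vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def sample_y(A, x, noise_var, rng):
    """Draw y = A x + e with e ~ N(0, diag(noise_var))."""
    mean = mat_vec(A, x)
    return [m + rng.gauss(0.0, math.sqrt(v)) for m, v in zip(mean, noise_var)]


def log_density(y, A, x, noise_var):
    """Evaluate log p(y | x, A, V) for diagonal V = diag(noise_var)."""
    mean = mat_vec(A, x)
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - 0.5 * (yi - m) ** 2 / v
        for yi, m, v in zip(y, mean, noise_var)
    )
```

Note that nothing here involves a distribution over x, which matches the abstract's point that the model is conditional on x.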
Some Further Developments for Stick-Breaking Priors: Finite and Infinite Clustering and Classification
 Sankhya Series A
, 2003
Cited by 19 (0 self)
this paper will be to develop new surrounding theory for the hierarchical model (7) and show how these may be used to develop computational algorithms for computing posterior quantities. Our theoretical contributions include developing key properties for the class of extended stick-breaking measures, which includes establishing a conjugacy property of their random weights to i.i.d. sampling, and a characterization of the posterior for the extended stick-breaking prior under i.i.d. sampling. See Section 3. These properties then lead us in Section 4 to a general characterization for the posterior of (7). In Section 5 we outline a collapsed Gibbs sampling algorithm and an i.i.d. SIS (sequential importance sampling) algorithm that can be used for inference in (7). One important implication is our ability to fit the posterior of (6) subject to infinite dimensional stick-breaking measures. The paper begins with a brief discussion of stick-breaking priors in Section 2.
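For readers unfamiliar with the construction, a stick-breaking prior builds random mixture weights by repeatedly breaking off a Beta-distributed fraction of the remaining stick. The sketch below shows the standard truncated version with Beta(1, alpha) fractions (the Dirichlet process case); it is a generic illustration, not the extended stick-breaking measures or the algorithms developed in the paper, and the function and parameter names are assumptions.

```python
import random


def stick_breaking_weights(n_atoms, alpha=1.0, seed=0):
    """Draw truncated stick-breaking weights p_1..p_N that sum to 1.

    p_k = V_k * prod_{j<k} (1 - V_j), with V_k ~ Beta(1, alpha) and
    V_N fixed at 1 so that the truncated weights use the whole stick.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of stick not yet assigned
    for k in range(n_atoms):
        v = 1.0 if k == n_atoms - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights
```

Smaller values of `alpha` tend to concentrate mass on the first few weights, while larger values spread it over more atoms.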
Model Likelihoods and Bayes Factors for Switching and Mixture Models
, 2002
Cited by 13 (8 self)
In the present paper we discuss the problem of estimating model likelihoods from the MCMC output for a general mixture and switching model. Estimation is based on the method of bridge sampling (Meng and Wong, 1996), where the MCMC sample is combined with an iid sample from an importance density. The importance density is constructed in an unsupervised manner from the MCMC output using a mixture of complete data posteriors. Whereas the importance sampling estimator as well as the reciprocal importance sampling estimator are sensitive to the tail behaviour of the importance density, we demonstrate that the bridge sampling estimator is far more robust in this concern. Our case studies range from computing marginal likelihoods for a mixture of multivariate normal distributions, testing for the inhomogeneity of a discrete time Poisson process, to testing for the presence of Markov switching and order selection in the MSAR model.
MCMC Estimation of Classical and Dynamic Switching and Mixture Models
 Journal of the American Statistical Association
, 2001
Bayesian Time Series: Analysis Methods Using Simulation-Based Computation
, 2000
Cited by 4 (0 self)
This dissertation introduces new simulation-based analysis approaches, including both sequential and offline learning algorithms, for various Bayesian time series models. We provide a Markov Chain Monte Carlo (MCMC) method for an autoregressive (AR) model with innovations following exponential power distributions using the fact that an exponential power distribution is a scale mixture of normals. This model has application in signal processing, specifically image processing, with orthogonal wave...
Query Large Scale Microarray Compendium Datasets using a Model-Based Bayesian Approach with Variable Selection
 PLoS ONE
, 2009
Cited by 3 (1 self)
In microarray gene expression data analysis, it is often of interest to identify genes that share similar expression profiles with a particular gene such as a key regulatory protein. Multiple studies have been conducted using various correlation measures to identify coexpressed genes. While working well for small datasets, the heterogeneity introduced from increased sample size inevitably reduces the sensitivity and specificity of these approaches. This is because most coexpression relationships do not extend to all experimental conditions. With the rapid increase in the size of microarray datasets, identifying functionally related genes from large and diverse microarray gene expression datasets is a key challenge. We develop a model-based gene expression query algorithm built under the Bayesian model selection framework. It is capable of detecting coexpression profiles under a subset of samples/experimental conditions. In addition, it allows linearly transformed expression patterns to be recognized and is robust against sporadic outliers in the data. Both features are critically important for increasing the power of identifying coexpressed genes in large scale gene expression datasets. Our simulation studies suggest that this method outperforms existing correlation coefficients or mutual information-based query tools. When we apply this new method to the Escherichia coli microarray compendium data, it identifies a majority of known regulons as well as novel potential target genes of numerous key transcription factors.
Unified Gibbs Method For Biological Sequence Analysis
Cited by 2 (2 self)
The biotechnology revolution stems from rapid advances in the biological sciences. One important product of these advances is a large and rapidly growing data base of biopolymer (DNA, RNA, and protein) sequences, which has attracted much attention from researchers in different fields. The great majority of the techniques generated for studying these data have been designed to analyze a single sequence or for the comparison of a pair of sequences. Multiple sequence analysis has remained a difficult challenge. In recent years, formal statistical models have shown potential in one such problem, multiple sequence alignment. In this article we describe a general statistical paradigm, the unified Gibbs method, for the conversion of nearly any existing method for the analysis of a single sequence or for the comparison of a pair of sequences into a multiple sequence analysis method. Our previous successful experiences with the unified Gibbs include the development of the site sampler, the moti...