Results 1–10 of 13
Ancestor Sampling for Particle Gibbs
Cited by 14 (7 self)
We present a novel method in the family of particle MCMC methods that we refer to as particle Gibbs with ancestor sampling (PGAS). Similarly to the existing PG with backward simulation (PGBS) procedure, we use backward sampling to (considerably) improve the mixing of the PG kernel. Instead of using separate forward and backward sweeps as in PGBS, however, we achieve the same effect in a single forward sweep. We apply the PGAS framework to the challenging class of non-Markovian state-space models. We develop a truncation strategy for these models that is applicable in principle to any backward-simulation-based method, but which is particularly well suited to the PGAS framework. In particular, as we show in a simulation study, PGAS can yield an order-of-magnitude improvement in accuracy relative to PGBS due to its robustness to the truncation error. Several application examples are discussed, including Rao-Blackwellized particle smoothing and inference in degenerate state-space models.
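The ancestor-sampling move at the heart of PGAS can be illustrated with a minimal sketch (our own rendering, not the authors' code; the function names and the generic Markov transition density are assumptions for illustration). At each step of the forward sweep, the reference trajectory's ancestor index is drawn with probability proportional to the previous weights times the transition density to the reference state:

```python
import numpy as np

def ancestor_sampling_step(x_prev, log_w, x_ref, log_f, rng):
    """Sample the ancestor index of the reference particle.

    x_prev : (N, d) array of particles at time t-1
    log_w  : (N,)   log-weights at time t-1
    x_ref  : (d,)   reference trajectory's state at time t
    log_f  : callable, log_f(x_t, x_prev_i) = log f(x_t | x_{t-1} = x_prev_i)
    """
    log_probs = log_w + np.array([log_f(x_ref, xp) for xp in x_prev])
    log_probs -= log_probs.max()          # stabilise before exponentiating
    probs = np.exp(log_probs)
    probs /= probs.sum()
    return rng.choice(len(x_prev), p=probs)
```

For example, with a Gaussian random-walk transition, `log_f = lambda x, xp: -0.5 * float((x - xp) ** 2)` (up to a constant) suffices, since constants cancel in the normalization.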
Evolutionary inference via the Poisson indel process
Proc. Natl. Acad. Sci., doi:10.1073/pnas.1220450110, 2012
Cited by 7 (2 self)
We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classical evolutionary process, the TKF91 model [62], is a continuous-time Markov chain model comprised of insertion, deletion and substitution events. Unfortunately this model gives rise to an intractable computational problem—the computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa [39]. In this work, we present a new stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The new model is closely related to the TKF91 model, differing only in its treatment of insertions, but the new model has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared to separate inference of phylogenies and alignments.
Top-down particle filtering for Bayesian decision trees
Cited by 4 (2 self)
Decision tree learning is a popular approach for classification and regression in machine learning and statistics, and Bayesian formulations—which introduce a prior distribution over decision trees, and formulate learning as posterior inference given data—have been shown to produce competitive performance. Unlike classic decision tree learning algorithms like ID3, C4.5 and CART, which work in a top-down manner, existing Bayesian algorithms produce an approximation to the posterior distribution by evolving a complete tree (or collection thereof) iteratively via local Monte Carlo modifications to the structure of the tree, e.g., using Markov chain Monte Carlo (MCMC). We present a sequential Monte Carlo (SMC) algorithm that instead works in a top-down manner, mimicking the behavior and speed of classic algorithms. We demonstrate empirically that our approach delivers accuracy comparable to the most popular MCMC method, but operates more than an order of magnitude faster, and thus represents a better computation-accuracy trade-off.
Filtering with abstract particles
In International Conference on Machine Learning (ICML), 2014
Cited by 3 (1 self)
By using particles, beam search and sequential Monte Carlo can approximate distributions in an extremely flexible manner. However, they can suffer from sparsity and inadequate coverage on large state spaces. We present a new filtering method for discrete spaces that addresses this issue by using “abstract particles,” each of which represents an entire region of state space. These abstract particles are combined into a hierarchical decomposition, yielding a compact and flexible representation. Empirically, our method outperforms beam search and sequential Monte Carlo on both a text reconstruction task and a multiple object tracking task.
Efficient Continuous-Time Markov Chain Estimation
Cited by 3 (0 self)
Many problems of practical interest rely on continuous-time Markov chains (CTMCs) defined over combinatorial state spaces, rendering the computation of transition probabilities, and hence probabilistic inference, difficult or impossible with existing methods. For problems with countably infinite states, where classical methods such as matrix exponentiation are not applicable, the main alternative has been particle Markov chain Monte Carlo methods imputing both the holding times and sequences of visited states. We propose a particle-based Monte Carlo approach where the holding times are marginalized analytically. We demonstrate that in a range of realistic inferential setups, our scheme dramatically reduces the variance of the Monte Carlo approximation and yields more accurate parameter posterior approximations given a fixed computational budget. These experiments are performed on both synthetic and real datasets, drawing from two important examples of CTMCs having combinatorial state spaces: string-valued mutation models in phylogenetics and nucleic acid folding pathways.
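For context on the classical finite-state computation that the abstract contrasts against, transition probabilities of a finite CTMC can be computed by uniformization, a standard textbook alternative to a dense matrix exponential (a generic sketch, not the paper's method; names are ours):

```python
import numpy as np

def ctmc_transition_matrix(Q, t, tol=1e-12):
    """P(t) = exp(t Q) for a finite-state CTMC via uniformization:
    exp(tQ) = sum_k e^{-lam t} (lam t)^k / k! * B^k, with B = I + Q/lam."""
    n = Q.shape[0]
    lam = max(-Q.diagonal().min(), 1e-12)  # uniformization rate >= max exit rate
    B = np.eye(n) + Q / lam                # embedded discrete-time jump kernel
    term = np.exp(-lam * t) * np.eye(n)    # k = 0 term of the Poisson mixture
    P = np.zeros((n, n))
    k = 0
    while True:
        P += term
        if term.max() < tol and k > lam * t:
            break                          # remaining Poisson tail is negligible
        k += 1
        term = term @ B * (lam * t / k)
    return P
```

For a two-state chain with rates a (0→1) and b (1→0), this reproduces the analytic solution P00(t) = (b + a e^{-(a+b)t}) / (a + b), which makes the routine easy to sanity-check.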
Divide-and-conquer with Sequential Monte Carlo
arXiv preprint, 2014
Cited by 1 (0 self)
We develop a Sequential Monte Carlo (SMC) procedure for inference in probabilistic graphical models using the divide-and-conquer methodology. The method is based on an auxiliary tree-structured decomposition of the model of interest, turning the overall inferential task into a collection of recursively solved subproblems. Unlike a standard SMC sampler, the proposed method employs multiple independent populations of weighted particles, which are resampled, merged, and propagated as the method progresses. We illustrate empirically that this approach can outperform standard methods in estimation accuracy. It also opens up novel parallel implementation options and the possibility of concentrating the computational effort on the most challenging subproblems. The proposed method is applicable to a broad class of probabilistic graphical models. We demonstrate its performance on a Markov random field and on a hierarchical Bayesian model.
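The merge of two child populations can be sketched as follows (a minimal illustration with names of our own choosing, not the paper's code; it assumes each parent particle pairs one resampled particle from each child and is reweighted by the log-ratio of the parent target to the product of the child targets):

```python
import numpy as np

def merge_populations(xs_left, w_left, xs_right, w_right, log_ratio, rng):
    """Merge two weighted child populations into one parent population.

    log_ratio(xl, xr) = log pi_parent(xl, xr)
                        - log pi_left(xl) - log pi_right(xr)
    """
    N = len(xs_left)
    # Resample each child population independently, then pair them up.
    il = rng.choice(N, size=N, p=w_left / w_left.sum())
    ir = rng.choice(N, size=N, p=w_right / w_right.sum())
    pairs = list(zip(xs_left[il], xs_right[ir]))
    # Reweight each pair toward the parent target.
    log_w = np.array([log_ratio(xl, xr) for xl, xr in pairs])
    log_w -= log_w.max()                  # stabilise before exponentiating
    w = np.exp(log_w)
    return pairs, w / w.sum()
```

When the parent target factorizes exactly over the children, `log_ratio` is zero and the merged weights come out uniform, which is a convenient degenerate check.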
Memory (and Time) Efficient Sequential Monte Carlo
Seong-Hwan Jun
Memory efficiency is an important issue in Sequential Monte Carlo (SMC) algorithms, arising for example in inference of high-dimensional latent variables via Rao-Blackwellized SMC algorithms, where the size of individual particles combined with the required number of particles can stress the main memory. Standard SMC methods have a memory requirement that scales linearly in the number of particles present at all stages of the algorithm. Our contribution is a simple scheme that makes the memory cost of SMC methods depend on the number of distinct particles that survive resampling. We show that this difference has a large empirical impact on the quality of the approximation in realistic scenarios, and also, since memory access is generally slow, on the running time. The method is based on a two-pass generation of the particles, which are represented implicitly in the first pass. We parameterize the accuracy of our algorithm with a memory budget rather than with a fixed number of particles, and the algorithm adaptively selects an optimal number of particles to exploit this fixed budget. We show that this adaptation does not interfere with the usual consistency guarantees that come with SMC algorithms.
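The implicit two-pass representation can be sketched on a toy model (our own rendering, not the paper's implementation): because a child state is a deterministic function of its parent state and an integer seed, the first pass needs to store only (ancestor, seed) pairs, and the second pass rematerialises just the particles that actually survive resampling:

```python
import numpy as np

def propagate(x_parent, seed):
    """Toy deterministic move: the child state is a pure function of
    (parent state, seed); here, a Gaussian random-walk step."""
    return x_parent + np.random.default_rng(int(seed)).normal()

def first_pass(x_parents, w, n, master_seed):
    """Record only ancestor indices and seeds; no child states are stored."""
    rng = np.random.default_rng(master_seed)
    ancestors = rng.choice(len(x_parents), size=n, p=w / w.sum())
    seeds = rng.integers(0, 2**32, size=n)
    return ancestors, seeds

def second_pass(x_parents, ancestors, seeds, keep):
    """Materialise only the particles in `keep` (e.g. the distinct
    survivors of resampling); memory scales with len(keep), not n."""
    return {i: propagate(x_parents[ancestors[i]], seeds[i]) for i in keep}
```

Since `propagate` is deterministic given its seed, the regenerated particles are bit-identical to the ones the first pass would have produced, so nothing is lost by deferring their materialisation.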
Bayesian Phylogenetic Inference using a Combinatorial Sequential Monte Carlo Method
The application of Bayesian methods to large-scale phylogenetics problems is increasingly limited by computational issues, motivating the development of methods that can complement existing Markov chain Monte Carlo (MCMC) schemes. Sequential Monte Carlo (SMC) methods are approximate inference algorithms that have become very popular for time series models. Such methods have recently been developed to address phylogenetic inference problems, but currently available techniques are only applicable to a restricted class of phylogenetic tree models compared to MCMC. In this paper, we propose an original Combinatorial SMC (CSMC) method to approximate posterior phylogenetic tree distributions that is applicable to a general class of models and can easily be combined with MCMC to infer evolutionary parameters. Our method relies only on the existence of a flexible partially ordered set structure, and is more generally applicable to sampling problems on combinatorial spaces. We demonstrate that the proposed CSMC algorithm provides consistent estimates under weak assumptions, is computationally fast, and is easily parallelizable.
A Note on Probabilistic Models over Strings: the Linear Algebra Approach
2013
Probabilistic models over strings have played a key role in developing methods that allow indels to be treated as phylogenetically informative events. There is an extensive literature on using automata and transducers on phylogenies to do inference in these probabilistic models, and an important theoretical question in the field is the complexity of computing the normalization of a class of string-valued graphical models. This question has been investigated using tools from combinatorics, dynamic programming, and graph theory, and has practical applications in Bayesian phylogenetics. In this work, we revisit this theoretical question from a different point of view, based on linear algebra. The main contribution is a new proof of a known result on the complexity of inference on TKF91, a well-known probabilistic model over strings. Our proof uses a different approach based on classical linear algebra results, and is in some cases easier to extend to other models. The proof technique also has consequences for the implementation and complexity of inference algorithms.
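A small example of the linear-algebra viewpoint (generic weighted-automaton folklore, not the paper's TKF91-specific argument): the total weight of all paths through a weighted automaton is a geometric series of matrix powers, which collapses to a single linear solve whenever the series converges:

```python
import numpy as np

def automaton_normalization(u, M, v):
    """Total weight of all strings under a weighted automaton with start
    weights u, transition-weight matrix M, and stop weights v:

        Z = u^T (I + M + M^2 + ...) v = u^T (I - M)^{-1} v,

    valid when the spectral radius of M is strictly below 1."""
    if np.max(np.abs(np.linalg.eigvals(M))) >= 1:
        raise ValueError("geometric series diverges: spectral radius >= 1")
    n = M.shape[0]
    return u @ np.linalg.solve(np.eye(n) - M, v)
```

The closed form can be checked against a truncated sum of `u @ M^k @ v` over string lengths k, which converges geometrically.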
A Simulation Approach for Change-Points on Phylogenetic Trees
We observe n sequences at each of m sites, and assume that they have evolved from an ancestral sequence that forms the root of a binary tree of known topology and branch lengths, but the sequence states at internal nodes are unknown. The topology of the tree and branch lengths are the same for all sites, but the parameters of the evolutionary model can vary over sites. We assume a piecewise constant model for these parameters, with an unknown number of change-points and hence a trans-dimensional parameter space over which we seek to perform Bayesian inference. We propose two novel ideas to deal with the computational challenges of such inference. Firstly, we approximate the model based on the time machine principle: the top nodes of the binary tree (near the root) are replaced by an approximation of the true distribution; as more nodes are removed from the top of the tree, the cost of computing the likelihood is reduced linearly in n. The approach introduces a bias, which we investigate empirically. Secondly, we develop a particle marginal Metropolis-Hastings (PMMH) algorithm that employs a sequential Monte Carlo (SMC) sampler and can use the first idea. Our time-machine PMMH algorithm copes well with one of the bottlenecks of standard computational algorithms: the trans-dimensional nature of the posterior distribution. The algorithm is implemented on simulated and real data examples, and we empirically demonstrate its potential to outperform competing methods based on approximate Bayesian computation (ABC) techniques.
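The PMMH skeleton underlying the second idea can be sketched on a toy scalar state-space model (our own minimal illustration, not the paper's phylogenetic change-point model; a flat prior on the parameter and a random-walk proposal are assumed): a bootstrap particle filter supplies an unbiased marginal-likelihood estimate, which is plugged into a Metropolis-Hastings acceptance ratio:

```python
import numpy as np

def pf_loglik(theta, ys, n_particles, rng):
    """Bootstrap particle filter estimate of log p(y_{1:T} | theta) for the
    toy model x_t = theta * x_{t-1} + v_t, y_t = x_t + e_t, v, e ~ N(0, 1)."""
    x = rng.normal(size=n_particles)
    ll = 0.0
    for y in ys:
        x = theta * x + rng.normal(size=n_particles)       # propagate
        logw = -0.5 * (y - x) ** 2 - 0.5 * np.log(2 * np.pi)
        m = logw.max()
        w = np.exp(logw - m)
        ll += m + np.log(w.mean())                         # likelihood factor
        x = x[rng.choice(n_particles, size=n_particles, p=w / w.sum())]
    return ll

def pmmh(ys, n_iters, n_particles, step=0.1, seed=0):
    """Particle marginal Metropolis-Hastings over theta (flat prior,
    Gaussian random-walk proposal)."""
    rng = np.random.default_rng(seed)
    theta = 0.5
    ll = pf_loglik(theta, ys, n_particles, rng)
    chain = []
    for _ in range(n_iters):
        prop = theta + step * rng.normal()
        ll_prop = pf_loglik(prop, ys, n_particles, rng)
        if np.log(rng.uniform()) < ll_prop - ll:           # MH accept/reject
            theta, ll = prop, ll_prop
        chain.append(theta)
    return np.array(chain)
```

The same skeleton applies unchanged when the particle filter targets a more elaborate latent structure; only `pf_loglik` needs replacing.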