Inducing Features of Random Fields
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997
Cited by 669 (10 self)
We present a technique for constructing random fields from a set of training samples. The learning paradigm builds increasingly complex fields by allowing potential functions, or features, that are supported by increasingly large subgraphs. Each feature has a weight that is trained by minimizing the Kullback-Leibler divergence between the model and the empirical distribution of the training data. A greedy algorithm determines how features are incrementally added to the field and an iterative scaling algorithm is used to estimate the optimal values of the weights. The random field models and techniques introduced in this paper differ from those common to much of the computer vision literature in that the underlying random fields are non-Markovian and have a large number of parameters that must be estimated. Relations to other learning approaches, including decision trees, are given. As a demonstration of the method, we describe its application to the problem of automatic word classifica...
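The weight-fitting step described in this abstract can be illustrated with a small sketch. It uses a generalized-iterative-scaling style update on a toy four-state space; the sample space, the three features (including a slack feature added only so that the features sum to a constant), and the empirical distribution are all assumptions made for illustration, and the update shown is a generic textbook form that may differ from the algorithm used in the paper.

```python
import math

# Toy sample space (an assumption): all pairs (a, b) of binary values.
STATES = [(a, b) for a in (0, 1) for b in (0, 1)]

# Nonnegative features that sum to the constant C = 2 on every state.
# The third is a slack feature, added only to make the sum constant.
FEATURES = [
    lambda s: s[0],
    lambda s: s[1],
    lambda s: 2 - s[0] - s[1],
]
C = 2.0


def model(weights):
    """Exponential-family model p(s) proportional to exp(sum_i w_i * f_i(s))."""
    scores = [math.exp(sum(w * f(s) for w, f in zip(weights, FEATURES)))
              for s in STATES]
    z = sum(scores)
    return [v / z for v in scores]


def expectations(dist):
    """Expected value of each feature under a distribution over STATES."""
    return [sum(p * f(s) for p, s in zip(dist, STATES)) for f in FEATURES]


def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for distributions over STATES."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def iterative_scaling(empirical, n_iter=500):
    """Move each weight by (1/C) * log(empirical / model feature expectation)."""
    weights = [0.0] * len(FEATURES)
    target = expectations(empirical)
    for _ in range(n_iter):
        current = expectations(model(weights))
        weights = [w + math.log(t / c) / C
                   for w, t, c in zip(weights, target, current)]
    return weights


# Hypothetical empirical distribution, ordered as STATES:
# (0,0), (0,1), (1,0), (1,1).
empirical = [0.1, 0.2, 0.3, 0.4]
weights = iterative_scaling(empirical)
fitted = model(weights)
```

After fitting, `expectations(fitted)` should match `expectations(empirical)`, and `kl(empirical, fitted)` should be smaller than the divergence from the uniform starting model.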
Error bounds for computing the expectation by Markov chain Monte Carlo
2009
Cited by 117 (2 self)
We study the error of reversible Markov chain Monte Carlo methods for approximating the expectation of a function. Explicit error bounds with respect to the l2-, l4- and l∞-norm of the function are proven. The estimate attains the well-known asymptotic limit of the error, i.e. there is no gap between the estimate and the asymptotic behavior. We discuss the dependence of the error on a burn-in of the Markov chain. Furthermore, we suggest and justify a specific burn-in for optimizing the algorithm.
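As a rough illustration of the setting in this abstract, the sketch below runs a reversible chain, discards a burn-in, and averages a function over the rest of the path. The four-state ring, the target weights, and the burn-in length are assumptions chosen for the example; the code does not reproduce the paper's error bounds or its recommended burn-in.

```python
import random


def metropolis_chain(weights, n_steps, start=0, seed=0):
    """Random-walk Metropolis chain on states 0..k-1 arranged in a ring.

    The target distribution is proportional to `weights`. The proposal
    (one step left or right) is symmetric, so the chain is reversible
    with respect to the target.
    """
    rng = random.Random(seed)
    k = len(weights)
    x = start
    path = []
    for _ in range(n_steps):
        y = (x + rng.choice((-1, 1))) % k
        if rng.random() < min(1.0, weights[y] / weights[x]):
            x = y
        path.append(x)
    return path


def ergodic_average(path, f, burn_in):
    """Average f over the path after discarding the first `burn_in` states."""
    kept = path[burn_in:]
    return sum(f(x) for x in kept) / len(kept)


# Hypothetical target with weights 1:2:3:4, so the exact mean of f(x) = x is 2.0.
path = metropolis_chain([1.0, 2.0, 3.0, 4.0], n_steps=100_000, seed=1)
estimate = ergodic_average(path, lambda x: x, burn_in=1_000)
```

Varying `burn_in` and `n_steps` in this sketch gives a feel for the trade-off between discarded samples and estimation error.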
Ordering Monte Carlo Markov Chains
School of Statistics, University of Minnesota, 1999
Cited by 31 (7 self)
Markov chains having the same stationary distribution π can be partially ordered by performance in the central limit theorem. We say that one chain is at least as good as another in the efficiency partial ordering if the variance in the central limit theorem is at least as small for every L²(π) functional of the chain. Peskun partial ordering implies efficiency partial ordering [25, 30]. Here we show that Peskun partial ordering implies, for finite state spaces, ordering of all the eigenvalues of the transition matrices, and, for general state spaces, ordering of the suprema of the spectra of the transition operators. We also define a covariance partial ordering based on lag one autocovariances and show that it is equivalent to the efficiency partial ordering when restricted to reversible Markov chains. Similar but weaker results are provided for nonreversible Markov chains. Keywords: Peskun ordering, Eigenvalues, Spectral decomposition, Nonreversible kernels. 1 Introduction I...
Ordering and improving the performance of Monte Carlo Markov chains
Statistical Science, 2001
Cited by 28 (0 self)
Abstract. An overview of orderings defined on the space of Markov chains having a prespecified unique stationary distribution is given. The intuition gained by studying these orderings is used to improve existing Markov chain Monte Carlo algorithms. Key words and phrases: Asymptotic variance, convergence ordering,
Ordering, Slicing And Splitting Monte Carlo Markov Chains
1998
Cited by 8 (3 self)
Markov chain Monte Carlo is a method of approximating the integral of a function f with respect to a distribution π. A Markov chain that has π as its stationary distribution is simulated, producing samples X_1, X_2, ... . The integral is approximated by taking the average of f(X_n) over the sample path. The standard way to construct such Markov chains is the Metropolis-Hastings algorithm. The class P of all Markov chains having π as their unique stationary distribution is very large, so it is important to have criteria telling when one chain performs better than another. The Peskun ordering is a partial ordering on P. If two Markov chains are Peskun ordered, then the better chain has smaller variance in the central limit theorem for every function f that has a variance. Peskun ordering is sufficient for this but not necessary. We study the implications of the Peskun ordering both in finite and general state spaces. Unfortunately there are many Metropolis-Hastings samplers that are...
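On a finite state space, the Peskun ordering mentioned in this abstract reduces to an entrywise comparison of off-diagonal transition probabilities. The sketch below checks that comparison for two kernels sharing one stationary distribution. The four-state target, the uniform proposal, and the use of Barker's acceptance rule as the comparison chain are assumptions for illustration and do not come from the listed paper.

```python
def metropolis_matrix(pi):
    """Metropolis kernel with a uniform proposal over the other states."""
    k = len(pi)
    P = [[0.0] * k for _ in range(k)]
    for x in range(k):
        for y in range(k):
            if y != x:
                P[x][y] = min(1.0, pi[y] / pi[x]) / (k - 1)
        P[x][x] = 1.0 - sum(P[x])  # remaining mass stays at x
    return P


def barker_matrix(pi):
    """Same proposal, but with Barker's acceptance pi[y] / (pi[x] + pi[y])."""
    k = len(pi)
    P = [[0.0] * k for _ in range(k)]
    for x in range(k):
        for y in range(k):
            if y != x:
                P[x][y] = pi[y] / (pi[x] + pi[y]) / (k - 1)
        P[x][x] = 1.0 - sum(P[x])
    return P


def peskun_dominates(P, Q, tol=1e-12):
    """True if P(x, y) >= Q(x, y) for every pair of distinct states x, y."""
    k = len(P)
    return all(P[x][y] >= Q[x][y] - tol
               for x in range(k) for y in range(k) if x != y)


# Hypothetical target distribution on four states.
pi = [0.1, 0.2, 0.3, 0.4]
M = metropolis_matrix(pi)
B = barker_matrix(pi)
metropolis_beats_barker = peskun_dominates(M, B)
barker_beats_metropolis = peskun_dominates(B, M)
```

Because min(1, r) is never smaller than r / (1 + r), the Metropolis kernel dominates the Barker kernel off the diagonal in this example, while the reverse comparison fails.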
Outperforming the Gibbs sampler empirical estimator for nearest neighbor random fields
1996
Cited by 6 (3 self)
Given a Markov chain sampling scheme, does the standard empirical estimator make best use of the data? We show that this is not so and construct better estimators. We restrict attention to nearest neighbor random fields and to Gibbs samplers with deterministic sweep, but our approach applies to any sampler that uses reversible variable-at-a-time updating with deterministic sweep. The structure of the transition distribution of the sampler is exploited to construct further empirical estimators that are combined with the standard empirical estimator to reduce asymptotic variance. The extra computational cost is negligible. When the random field is spatially homogeneous, symmetrizations of our estimator lead to further variance reduction. The performance of the estimators is evaluated in a simulation study of the Ising model.
Von Mises type statistics for single site updated local interaction random fields
1998
Cited by 4 (4 self)
Random field models in image analysis and spatial statistics usually have local interactions. They can be simulated by Markov chains which update a single site at a time. The updating rules typically condition on only a few neighboring sites. If we want to approximate the expectation of a bounded function, can we make better use of the simulations than through the empirical estimator? We describe symmetrizations of the empirical estimator which are computationally feasible and can lead to considerable variance reduction. The method is reminiscent of the idea behind generalized von Mises statistics. To simplify the exposition, we consider mainly nearest neighbor random fields and the Gibbs sampler. Key words and Phrases. Asymptotic relative efficiency, Gibbs sampler, Ising model, Markov chain Monte Carlo, Metropolis algorithm, parallel updating, variance reduction. Research partially supported by NSF Grant ATM9417528. 1 All three authors were partially supported by NSERC, Canada. ...
Information bounds for Gibbs samplers
In preparation, 1995
Cited by 3 (2 self)
If we wish to efficiently estimate the expectation of an arbitrary function on the basis of the output of a Gibbs sampler, which is better: deterministic or random sweep? In each case we calculate the asymptotic variance of the empirical estimator, the average of the function over the output, and determine the minimal asymptotic variance for estimators that use no information about the underlying distribution. The empirical estimator has noticeably smaller variance for deterministic sweep. The variance bound for random sweep is in general smaller than for deterministic sweep, but the two are equal if the target distribution is continuous. If the components of the target distribution are not strongly dependent, the empirical estimator is close to efficient under deterministic sweep, and its asymptotic variance approximately doubles under random sweep. 1 Introduction The Gibbs sampler is a widely used Markov chain Monte Carlo (MCMC) method for estimating analytically intractable feature...
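The two sweep strategies compared in this abstract can be sketched for a pair of binary variables. The joint table, the number of sweeps, and the choice to give the random sweep two single-site updates per recorded state (so both variants do the same amount of work) are assumptions for illustration. The sketch computes only the plain empirical estimator, not the variance bounds discussed in the paper.

```python
import random


def gibbs(joint, n_sweeps, sweep="deterministic", seed=0):
    """Gibbs sampler for two binary variables with joint[a][b] > 0.

    "deterministic" updates component 0 and then component 1 in every sweep;
    any other value picks the component to update uniformly at random, twice
    per recorded state.
    """
    rng = random.Random(seed)
    state = [0, 0]
    samples = []
    for _ in range(n_sweeps):
        if sweep == "deterministic":
            order = (0, 1)
        else:
            order = (rng.randrange(2), rng.randrange(2))
        for i in order:
            other = state[1 - i]
            # Conditional probability that component i equals 1
            # given the current value of the other component.
            if i == 0:
                p1 = joint[1][other] / (joint[0][other] + joint[1][other])
            else:
                p1 = joint[other][1] / (joint[other][0] + joint[other][1])
            state[i] = 1 if rng.random() < p1 else 0
        samples.append(tuple(state))
    return samples


def empirical_estimate(samples, f):
    """Empirical estimator: the average of f over the sampler output."""
    return sum(f(s) for s in samples) / len(samples)


def both_ones(s):
    """Indicator of the event that both components equal 1."""
    return 1.0 if s == (1, 1) else 0.0


# Hypothetical joint distribution with positively dependent components.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
est_deterministic = empirical_estimate(gibbs(joint, 50_000, "deterministic", seed=1), both_ones)
est_random = empirical_estimate(gibbs(joint, 50_000, "random", seed=2), both_ones)
```

Both estimates target the same probability, 0.4 in this table. Repeating the runs over many seeds and comparing the spread of the two estimators is one way to observe the variance difference the abstract describes.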