Results 1 - 10
of
54
Constructing Free Energy Approximations and Generalized Belief Propagation Algorithms
- IEEE Transactions on Information Theory
, 2005
"... Important inference problems in statistical physics, computer vision, error-correcting coding theory, and artificial intelligence can all be reformulated as the computation of marginal probabilities on factor graphs. The belief propagation (BP) algorithm is an efficient way to solve these problems t ..."
Abstract
-
Cited by 279 (11 self)
- Add to MetaCart
Important inference problems in statistical physics, computer vision, error-correcting coding theory, and artificial intelligence can all be reformulated as the computation of marginal probabilities on factor graphs. The belief propagation (BP) algorithm is an efficient way to solve these problems that is exact when the factor graph is a tree, but only approximate when the factor graph has cycles. We show that BP fixed points correspond to the stationary points of the Bethe approximation of the free energy for a factor graph. We explain how to obtain regionbased free energy approximations that improve the Bethe approximation, and corresponding generalized belief propagation (GBP) algorithms. We emphasize the conditions a free energy approximation must satisfy in order to be a “valid ” or “maxent-normal ” approximation. We describe the relationship between four different methods that can be used to generate valid approximations: the “Bethe method, ” the “junction graph method, ” the “cluster variation method, ” and the “region graph method.” Finally, we explain how to tell whether a region-based approximation, and its corresponding GBP algorithm, is likely to be accurate, and describe empirical results showing that GBP can significantly outperform BP.
MAP estimation via agreement on (hyper)trees: Message-passing and linear programming approaches
- IEEE Transactions on Information Theory
, 2002
"... We develop an approach for computing provably exact maximum a posteriori (MAP) configurations for a subclass of problems on graphs with cycles. By decomposing the original problem into a convex combination of tree-structured problems, we obtain an upper bound on the optimal value of the original ..."
Abstract
-
Cited by 77 (7 self)
- Add to MetaCart
We develop an approach for computing provably exact maximum a posteriori (MAP) configurations for a subclass of problems on graphs with cycles. By decomposing the original problem into a convex combination of tree-structured problems, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is met with equality if and only if the tree problems share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original problem. Next we present and analyze two methods for attempting to obtain tight upper bounds: (a) a tree-reweighted messagepassing algorithm that is related to but distinct from the max-product (min-sum) algorithm; and (b) a tree-relaxed linear program (LP), which is derived from the Lagrangian dual of the upper bounds. Finally, we discuss the conditions that govern when the relaxation is tight, in which case the MAP configuration can be obtained. The analysis described here generalizes naturally to convex combinations of hypertree-structured distributions.
MAP estimation via agreement on trees: Message-passing and linear programming
, 2002
"... We develop and analyze methods for computing provably optimal maximum a posteriori (MAP) configurations for a subclass of Markov random fields defined on graphs with cycles. By decomposing the original distribution into a convex combination of tree-structured distributions, we obtain an upper bound ..."
Abstract
-
Cited by 76 (7 self)
- Add to MetaCart
We develop and analyze methods for computing provably optimal maximum a posteriori (MAP) configurations for a subclass of Markov random fields defined on graphs with cycles. By decomposing the original distribution into a convex combination of tree-structured distributions, we obtain an upper bound on the optimal value of the original problem (i.e., the log probability of the MAP assignment) in terms of the combined optimal values of the tree problems. We prove that this upper bound is tight if and only if all the tree distributions share an optimal configuration in common. An important implication is that any such shared configuration must also be a MAP configuration for the original distribution. Next we develop two approaches to attempting to obtain tight upper bounds: (a) a tree-relaxed linear program (LP), which is derived from the Lagrangian dual of the upper bounds; and (b) a tree-reweighted max-product messagepassing algorithm that is related to but distinct from the max-product algorithm. In this way, we establish a connection between a certain LP relaxation of the modefinding problem, and a reweighted form of the max-product (min-sum) message-passing algorithm.
Walk-Sums and Belief Propagation in Gaussian Graphical Models
- Journal of Machine Learning Research
, 2006
"... We present a new framework based on walks in a graph for analysis and inference in Gaussian graphical models. The key idea is to decompose the correlation between each pair of variables as a sum over all walks between those variables in the graph. The weight of each walk is given by a product of edg ..."
Abstract
-
Cited by 49 (10 self)
- Add to MetaCart
We present a new framework based on walks in a graph for analysis and inference in Gaussian graphical models. The key idea is to decompose the correlation between each pair of variables as a sum over all walks between those variables in the graph. The weight of each walk is given by a product of edge-wise partial correlation coefficients. This representation holds for a large class of Gaussian graphical models which we call walk-summable. We give a precise characterization of this class of models, and relate it to other classes including diagonally dominant, attractive, nonfrustrated, and pairwise-normalizable. We provide a walk-sum interpretation of Gaussian belief propagation in trees and of the approximate method of loopy belief propagation in graphs with cycles. The walk-sum perspective leads to a better understanding of Gaussian belief propagation and to stronger results for its convergence in loopy graphs.
Consensus propagation
- IEEE TRANSACTIONS ON INFORMATION THEORY
, 2006
"... We propose consensus propagation, an asynchronous distributed protocol for averaging numbers across a network. We establish convergence, characterize the convergence rate for regular graphs, and demonstrate that the protocol exhibits better scaling properties than pairwise averaging, an alternative ..."
Abstract
-
Cited by 44 (6 self)
- Add to MetaCart
We propose consensus propagation, an asynchronous distributed protocol for averaging numbers across a network. We establish convergence, characterize the convergence rate for regular graphs, and demonstrate that the protocol exhibits better scaling properties than pairwise averaging, an alternative that has received much recent attention. Consensus propagation can be viewed as a special case of belief propagation, and our results contribute to the belief propagation literature. In particular, beyond singly-connected graphs, there are very few classes of relevant problems for which belief propagation is known to converge.
Tree Consistency and Bounds on the Performance of the Max-Product Algorithm and Its Generalizations
, 2002
"... Finding the maximum a posteriori (MAP) assignment of a discrete-state distribution specified by a graphical model requires solving an integer program. The max-product algorithm, also known as the max-plus or min-sum algorithm, is an iterative method for (approximately) solving such a problem on gr ..."
Abstract
-
Cited by 43 (4 self)
- Add to MetaCart
Finding the maximum a posteriori (MAP) assignment of a discrete-state distribution specified by a graphical model requires solving an integer program. The max-product algorithm, also known as the max-plus or min-sum algorithm, is an iterative method for (approximately) solving such a problem on graphs with cycles.
Kernel-Based Learning of Hierarchical Multilabel Classification Models
- JOURNAL OF MACHINE LEARNING RESEARCH
, 2006
"... We present a kernel-based algorithm for hierarchical text classification where the documents are allowed to belong to more than one category at a time. The classification model is a variant of the Maximum Margin Markov Network framework, where the classification hierarchy is represented as a Mark ..."
Abstract
-
Cited by 33 (5 self)
- Add to MetaCart
We present a kernel-based algorithm for hierarchical text classification where the documents are allowed to belong to more than one category at a time. The classification model is a variant of the Maximum Margin Markov Network framework, where the classification hierarchy is represented as a Markov tree equipped with an exponential family defined on the edges. We present an efficient optimization algorithm based on incremental conditional gradient ascent in single-example subspaces spanned by the marginal dual variables. The optimization is facilitated with a dynamic programming based algorithm that computes best update directions in the feasible set. Experiments show
Embedded Trees: Estimation of Gaussian Processes on Graphs with Cycles
- IEEE Transactions on Signal Processing
, 2002
"... Graphical models provide a powerful general framework for encoding the structure of large-scale estimation problems. However, the graphs describing typical real-world phenomena contain many cycles, making direct estimation procedures prohibitively costly. In this paper, we develop an iterative infer ..."
Abstract
-
Cited by 33 (12 self)
- Add to MetaCart
Graphical models provide a powerful general framework for encoding the structure of large-scale estimation problems. However, the graphs describing typical real-world phenomena contain many cycles, making direct estimation procedures prohibitively costly. In this paper, we develop an iterative inference algorithm for general Gaussian graphical models. It operates by exactly solving a series of modified estimation problems on spanning trees embedded within the original cyclic graph. When these subproblems are suitably chosen, the algorithm converges to the correct conditional means. Moreover, and in contrast to many other iterative methods, the tree-based procedures we propose can also be used to calculate exact error variances. Although the conditional mean iteration is effective for quite densely connected graphical models, the error variance computation is most efficient for sparser graphs. In this context, we present a modeling example which suggests that very sparsely connected graphs with cycles may provide significant advantages relative to their tree-structured counterparts, thanks both to the expressive power of these models and to the efficient inference algorithms developed herein.
Learning conditional random fields for stereo
- In CVPR
, 2007
"... State-of-the-art stereo vision algorithms utilize color changes as important cues for object boundaries. Most methods impose heuristic restrictions or priors on disparities, for example by modulating local smoothness costs with intensity gradients. In this paper we seek to replace such heuristics wi ..."
Abstract
-
Cited by 28 (2 self)
- Add to MetaCart
State-of-the-art stereo vision algorithms utilize color changes as important cues for object boundaries. Most methods impose heuristic restrictions or priors on disparities, for example by modulating local smoothness costs with intensity gradients. In this paper we seek to replace such heuristics with explicit probabilistic models of disparities and intensities learned from real images. We have constructed a large number of stereo datasets with ground-truth disparities, and we use a subset of these datasets to learn the parameters of Conditional Random Fields (CRFs). We present experimental results illustrating the potential of our approach for automatically learning the parameters of models with richer structure than standard hand-tuned MRF models. 1. Introduction and
Estimating the "Wrong" Graphical Model: Benefits in the Computation-Limited Setting
- Journal of Machine Learning Research
, 2006
"... Consider the problem of joint parameter estimation and prediction in a Markov random field: that is, the model parameters are estimated on the basis of an initial set of data, and then the fitted model is used to perform prediction (e.g., smoothing, denoising, interpolation) on a new noisy observa ..."
Abstract
-
Cited by 25 (2 self)
- Add to MetaCart
Consider the problem of joint parameter estimation and prediction in a Markov random field: that is, the model parameters are estimated on the basis of an initial set of data, and then the fitted model is used to perform prediction (e.g., smoothing, denoising, interpolation) on a new noisy observation.

