Results 1–10 of 33
Norm-Product Belief Propagation: Primal-Dual Message-Passing for Approximate Inference
, 2008
"... Inference problems in graphical models can be represented as a constrained optimization of a free energy function. In this paper we treat both forms of probabilistic inference, estimating marginal probabilities of the joint distribution and finding the most probable assignment, through a unified me ..."
Abstract

Cited by 53 (11 self)
Inference problems in graphical models can be represented as a constrained optimization of a free energy function. In this paper we treat both forms of probabilistic inference, estimating marginal probabilities of the joint distribution and finding the most probable assignment, through a unified message-passing algorithm architecture. In particular, we generalize the sum-product and max-product Belief Propagation (BP) algorithms and the tree-reweighted (TRW) sum- and max-product algorithms (TRBP), and introduce a new set of convergent algorithms based on "convex free energy" and Linear Programming (LP) relaxation as a zero-temperature limit of a convex free energy. The main idea of this work arises from taking a general perspective on the existing BP and TRBP algorithms while observing that they are all reductions from the basic optimization formula of f + ∑_i h_i
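The generic sum-product update this abstract builds on is standard BP; a minimal sketch on a 3-node Ising chain, where BP is exact (the chain, potentials, and parameter theta below are illustrative choices, not taken from the paper):

```python
import itertools
import math

# Sum-product sketch on a 3-node chain x1 - x2 - x3 with binary states
# and pairwise potential psi(a, b) = exp(theta * a * b), a, b in {-1, +1}.
# theta and the chain are illustrative.
theta = 0.5
states = [-1, +1]

def psi(a, b):
    return math.exp(theta * a * b)

# Leaf-to-center messages: m_{1->2}(b) = sum_a psi(a, b); likewise m_{3->2}.
m12 = {b: sum(psi(a, b) for a in states) for b in states}
m32 = {b: sum(psi(c, b) for c in states) for b in states}

# Belief at node 2: product of incoming messages, normalized.
belief2 = {b: m12[b] * m32[b] for b in states}
Z2 = sum(belief2.values())
belief2 = {b: v / Z2 for b, v in belief2.items()}

# Brute-force marginal of x2 for comparison; on a tree BP is exact.
weights = {x: psi(x[0], x[1]) * psi(x[1], x[2])
           for x in itertools.product(states, repeat=3)}
Z = sum(weights.values())
marg2 = {b: sum(w for x, w in weights.items() if x[1] == b) / Z
         for b in states}
```

On this symmetric chain both come out uniform; the point is that the BP belief matches the exact marginal on any tree, while the paper's algorithms address the general (loopy, convexified) case.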
Ising models on locally tree-like graphs
, 2008
"... Abstract We consider Ising models on graphs that converge locally to trees. Examples include random regulargraphs with bounded degree and uniformly random graphs with bounded average degree. We prove that the `cavity ' prediction for the limiting free energy per spin is correct for any temperat ..."
Abstract

Cited by 35 (4 self)
We consider Ising models on graphs that converge locally to trees. Examples include random regular graphs with bounded degree and uniformly random graphs with bounded average degree. We prove that the 'cavity' prediction for the limiting free energy per spin is correct for any temperature and external field. Further, local marginals can be approximated by iterating a set of mean-field (cavity) equations. Both results are achieved by proving the local convergence of the Boltzmann distribution on the original graph to the Boltzmann distribution on the appropriate infinite random tree. An Ising model on the finite graph G (with vertex set V and edge set E) is defined by the following Boltzmann distributions over x = {x_i : i ∈ V}, with x_i ∈ {+1, −1}
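The definition above is cut off in this listing. In the usual cavity-method convention (inverse temperature β and external field B, both assumptions here rather than details recovered from the abstract), the Ising Boltzmann distribution and the cavity message recursion read:

```latex
\mu(x) \;=\; \frac{1}{Z(\beta,B)}\,
  \exp\Big(\beta \sum_{(i,j)\in E} x_i x_j \;+\; B \sum_{i\in V} x_i\Big),
  \qquad x_i \in \{+1,-1\},

h_{i\to j} \;=\; B \;+\; \sum_{k\in \partial i \setminus j}
  \operatorname{atanh}\!\big(\tanh(\beta)\,\tanh(h_{k\to i})\big).
```

Iterating the second equation to a fixed point is the "set of mean-field (cavity) equations" the abstract refers to.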
Counting in graph covers: a combinatorial characterization of the Bethe entropy function
 SUBMITTED TO IEEE TRANS. INF. THEORY
, 2012
"... ..."
(Show Context)
Learning and evaluating Boltzmann machines
, 2008
"... We provide a brief overview of the variational framework for obtaining deterministic approximations or upper bounds for the logpartition function. We also review some of the Monte Carlo based methods for estimating partition functions of arbitrary Markov Random Fields. We then develop an annealed i ..."
Abstract

Cited by 13 (2 self)
We provide a brief overview of the variational framework for obtaining deterministic approximations or upper bounds for the log-partition function. We also review some of the Monte Carlo-based methods for estimating partition functions of arbitrary Markov Random Fields. We then develop an annealed importance sampling (AIS) procedure for estimating partition functions of restricted Boltzmann machines (RBMs), semi-restricted Boltzmann machines (SRBMs), and Boltzmann machines (BMs). Our empirical results indicate that the AIS procedure provides much better estimates of the partition function than some of the popular variational-based methods. Finally, we develop a new learning algorithm for training general Boltzmann machines and show that it can be successfully applied to learning good generative models.
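The AIS procedure described here can be sketched on a tiny, fully enumerable model, so the estimate is checkable against the exact partition function; the 3-spin chain, temperature schedule, and run count below are illustrative choices, not the paper's RBM setup:

```python
import itertools
import math
import random

random.seed(0)

# Tiny 3-spin Ising chain; small enough that the exact Z is computable.
theta = 0.8
states = [-1, +1]

def neg_energy(x):
    return theta * (x[0] * x[1] + x[1] * x[2])

exact_Z = sum(math.exp(neg_energy(x))
              for x in itertools.product(states, repeat=3))

# AIS: anneal from the uniform distribution (beta = 0, Z_0 = 2^3) to the
# target (beta = 1) through intermediate temperatures, one Gibbs sweep
# per temperature, accumulating log importance weights.
betas = [k / 50 for k in range(51)]

def gibbs_sweep(x, beta):
    x = list(x)
    for i in range(3):
        w = {}
        for s in states:
            x[i] = s
            w[s] = math.exp(beta * neg_energy(x))
        x[i] = +1 if random.random() < w[+1] / (w[+1] + w[-1]) else -1
    return x

n_runs = 500
log_weights = []
for _ in range(n_runs):
    x = [random.choice(states) for _ in range(3)]
    logw = 0.0
    for b_prev, b_next in zip(betas, betas[1:]):
        logw += (b_next - b_prev) * neg_energy(x)  # f_k(x) / f_{k-1}(x)
        x = gibbs_sweep(x, b_next)
    log_weights.append(logw)

# Z estimate: Z_0 times the mean importance weight (log-sum-exp for safety).
Z0 = 2 ** 3
m = max(log_weights)
Z_est = Z0 * math.exp(m) * sum(math.exp(lw - m) for lw in log_weights) / n_runs
```

With 50 intermediate temperatures the estimate lands close to the exact value; the paper's point is that the same machinery scales to RBMs where exact enumeration is impossible.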
What Cannot be Learned with Bethe Approximations
"... We address the problem of learning the parameters in graphical models when inference is intractable. A common strategy in this case is to replace the partition function with its Bethe approximation. We show that there exists a regime of empirical marginals where such Bethe learning will fail. By fai ..."
Abstract

Cited by 13 (1 self)
We address the problem of learning the parameters in graphical models when inference is intractable. A common strategy in this case is to replace the partition function with its Bethe approximation. We show that there exists a regime of empirical marginals where such Bethe learning will fail. By failure we mean that the empirical marginals cannot be recovered from the approximated maximum likelihood parameters (i.e., moment matching is not achieved). We provide several conditions on empirical marginals that yield outer and inner bounds on the set of Bethe learnable marginals. An interesting implication of
The Bethe partition function of log-supermodular graphical models
 In Neural Information Processing Systems
, 2012
"... Sudderth, Wainwright, and Willsky conjectured that the Bethe approximation corresponding to any fixed point of the belief propagation algorithm over an attractive, pairwise binary graphical model provides a lower bound on the true partition function. In this work, we resolve this conjecture in the a ..."
Abstract

Cited by 13 (1 self)
Sudderth, Wainwright, and Willsky conjectured that the Bethe approximation corresponding to any fixed point of the belief propagation algorithm over an attractive, pairwise binary graphical model provides a lower bound on the true partition function. In this work, we resolve this conjecture in the affirmative by demonstrating that, for any graphical model with binary variables whose potential functions (not necessarily pairwise) are all log-supermodular, the Bethe partition function always lower bounds the true partition function. The proof of this result follows from a new variant of the "four functions" theorem that may be of independent interest.
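The log-supermodularity condition in this abstract can be checked numerically; a minimal sketch for an attractive pairwise potential (the potential and parameter values are illustrative):

```python
import itertools
import math

# f on {0,1}^n is log-supermodular if, for all x, y,
#   f(x ∨ y) * f(x ∧ y) >= f(x) * f(y),
# with ∨ / ∧ the coordinatewise max / min. An attractive pairwise
# potential exp(theta * a * b) with theta >= 0 satisfies this.
theta = 0.7

def f(x):
    a, b = x
    return math.exp(theta * a * b)

def meet(x, y):
    return tuple(min(a, b) for a, b in zip(x, y))

def join(x, y):
    return tuple(max(a, b) for a, b in zip(x, y))

configs = list(itertools.product([0, 1], repeat=2))
attractive_ok = all(
    f(join(x, y)) * f(meet(x, y)) >= f(x) * f(y) - 1e-12
    for x in configs for y in configs
)

# A repulsive potential (theta < 0) violates the condition at
# x = (1, 0), y = (0, 1), so the Bethe lower bound need not apply.
g = lambda x: math.exp(-0.7 * x[0] * x[1])
repulsive_ok = all(
    g(join(x, y)) * g(meet(x, y)) >= g(x) * g(y) - 1e-12
    for x in configs for y in configs
)
```

The attractive check passes and the repulsive one fails, matching the attractive/log-supermodular scope of the paper's result.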
Counting independent sets using the Bethe approximation
 SIAM J. Discr. Math
, 2011
"... Abstract. We consider the #Pcomplete problem of counting the number of independent sets in a given graph. Our interest is in understanding the effectiveness of the popular belief propagation (BP) heuristic. BP is a simple iterative algorithm that is known to have at least one fixed point, where eac ..."
Abstract

Cited by 12 (3 self)
We consider the #P-complete problem of counting the number of independent sets in a given graph. Our interest is in understanding the effectiveness of the popular belief propagation (BP) heuristic. BP is a simple iterative algorithm that is known to have at least one fixed point, where each fixed point corresponds to a stationary point of the Bethe free energy (introduced by Yedidia, Freeman, and Weiss [IEEE Trans. Inform. Theory, 51 (2004), pp. 2282–2312] in recognition of Bethe's earlier work in 1935). The evaluation of the Bethe free energy at such a stationary point (or BP fixed point) leads to the Bethe approximation for the number of independent sets of the given graph. BP is not known to converge in general, nor is an efficient, convergent procedure for finding stationary points of the Bethe free energy known. Furthermore, the effectiveness of the Bethe approximation is not well understood. As the first result of this paper we propose a BP-like algorithm that always converges to a stationary point of the Bethe free energy for any graph for the independent set problem. This procedure finds an ε-approximate stationary point in O(n^2 d^4 2^d ε^{-4} log^3(nε^{-1})) iterations for a graph of n nodes with max-degree d. We study the quality of the resulting Bethe approximation using the recently developed "loop series" framework of Chertkov and Chernyak [J. Stat. Mech. Theory Exp., 6 (2006), P06009]. As this characterization is applicable only for exact stationary points of the Bethe free energy, we provide a slightly modified characterization that holds for ε-approximate stationary points. We establish that for any graph on n nodes with max-degree d and girth larger than 8d log_2 n, the multiplicative error between the number of independent sets and the Bethe approximation decays as 1 + O(n^{-γ}) for some γ > 0. This provides a deterministic counting algorithm that leads to strictly different results compared to
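On a tree the Bethe approximation is exact, so sum-product recovers the independent-set count directly; a minimal sketch on a 4-node path (the graph and message schedule are illustrative):

```python
import itertools
import math

# Hardcore model: x_i in {0, 1} (out of / in the independent set);
# edge potential psi(a, b) = 0 if a == b == 1 else 1, so the partition
# function Z equals the number of independent sets. On a tree,
# sum-product computes Z exactly.
n = 4
edges = [(0, 1), (1, 2), (2, 3)]
nbrs = {i: [] for i in range(n)}
for u, v in edges:
    nbrs[u].append(v)
    nbrs[v].append(u)

def psi(a, b):
    return 0.0 if (a == 1 and b == 1) else 1.0

def message(src, dst):
    # m_{src->dst}(b) = sum_a psi(a, b) * prod_{k in N(src)\{dst}} m_{k->src}(a)
    out = []
    for b in (0, 1):
        total = 0.0
        for a in (0, 1):
            prod = psi(a, b)
            for k in nbrs[src]:
                if k != dst:
                    prod *= message(k, src)[a]
            total += prod
        out.append(total)
    return out

# Z read off at an arbitrary root (exact because the graph is a tree).
root = 0
Z = sum(math.prod(message(k, root)[a] for k in nbrs[root]) for a in (0, 1))

# Brute-force count for comparison.
count = sum(all(not (x[u] and x[v]) for u, v in edges)
            for x in itertools.product((0, 1), repeat=n))
```

Both give 8 here; on loopy graphs the same fixed-point evaluation yields only the Bethe approximation whose quality the paper analyzes.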
Improving on Expectation Propagation
"... A series of corrections is developed for the fixed points of Expectation Propagation (EP), which is one of the most popular methods for approximate probabilistic inference. These corrections can lead to improvements of the inference approximation or serve as a sanity check, indicating when EP yields ..."
Abstract

Cited by 10 (2 self)
A series of corrections is developed for the fixed points of Expectation Propagation (EP), which is one of the most popular methods for approximate probabilistic inference. These corrections can lead to improvements of the inference approximation or serve as a sanity check, indicating when EP yields unreliable results.
On sampling from the Gibbs distribution with random maximum a-posteriori perturbations
 Advances in Neural Information Processing Systems
, 2013
"... In this paper we describe how MAP inference can be used to sample efficiently from Gibbs distributions. Specifically, we provide means for drawing either approximate or unbiased samples from Gibbs ’ distributions by introducing low dimensional perturbations and solving the corresponding MAP assign ..."
Abstract

Cited by 9 (3 self)
In this paper we describe how MAP inference can be used to sample efficiently from Gibbs distributions. Specifically, we provide means for drawing either approximate or unbiased samples from Gibbs distributions by introducing low-dimensional perturbations and solving the corresponding MAP assignments. Our approach also leads to new ways to derive lower bounds on partition functions. We demonstrate empirically that our method excels in the typical "high signal-high coupling" regime. The setting results in ragged energy landscapes that are challenging for alternative approaches to sampling and/or lower bounds.
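In the full, exponential-dimensional case the perturbation idea reduces to the classical Gumbel-max trick: perturb each configuration's log-weight with i.i.d. Gumbel noise and return the MAP. A minimal sketch on an explicit 4-state distribution (the weights and sample size are illustrative):

```python
import math
import random
from collections import Counter

random.seed(1)

# Gumbel-max trick: argmax_i (log w_i + g_i), with g_i ~ Gumbel(0, 1)
# i.i.d., is an exact sample from p_i = w_i / sum_j w_j. Perturbing every
# configuration is exponential in general; the paper's contribution is
# low-dimensional perturbation schemes that avoid this.
weights = [1.0, 2.0, 3.0, 4.0]
log_w = [math.log(w) for w in weights]

def gumbel():
    u = random.random()
    while u == 0.0:       # random() is in [0, 1); exclude the endpoint
        u = random.random()
    return -math.log(-math.log(u))

def sample_via_map():
    perturbed = [lw + gumbel() for lw in log_w]
    return max(range(len(weights)), key=perturbed.__getitem__)

n = 40000
counts = Counter(sample_via_map() for _ in range(n))
freqs = [counts[i] / n for i in range(len(weights))]
target = [w / sum(weights) for w in weights]  # [0.1, 0.2, 0.3, 0.4]
```

The empirical frequencies track the target probabilities, confirming that each MAP call over the perturbed weights is an exact Gibbs sample.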
Approximate Inference in Gaussian Graphical Models
, 2008
"... The focus of this thesis is approximate inference in Gaussian graphical models. A graphical model is a family of probability distributions in which the structure of interactions among the random variables is captured by a graph. Graphical models have become a powerful tool to describe complex highd ..."
Abstract

Cited by 5 (0 self)
The focus of this thesis is approximate inference in Gaussian graphical models. A graphical model is a family of probability distributions in which the structure of interactions among the random variables is captured by a graph. Graphical models have become a powerful tool to describe complex high-dimensional systems specified through local interactions. While such models are extremely rich and can represent a diverse range of phenomena, inference in general graphical models is a hard problem. In this thesis we study Gaussian graphical models, in which the joint distribution of all the random variables is Gaussian, and the graphical structure is exposed in the inverse of the covariance matrix. Such models are commonly used in a variety of fields, including remote sensing, computer vision, biology and sensor networks. Inference in Gaussian models reduces to matrix inversion, but for very large-scale models and for models requiring distributed inference, matrix inversion is not feasible. We first study a representation of inference in Gaussian graphical models in terms of computing sums of weights of walks in the graph, where means, variances and correlations can be represented as such walk-sums. This representation holds in a wide class
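The walk-sum idea can be sketched numerically: writing the information (inverse covariance) matrix as J = I − R, the covariance J⁻¹ equals Σ_{k≥0} Rᵏ whenever the spectral radius of R is below 1, with entry (i, j) of Rᵏ summing weights of length-k walks from i to j. A 2×2 example (the matrix is illustrative):

```python
# Walk-sum sketch: for J = I - R with spectral radius(R) < 1,
# J^{-1} = sum_{k>=0} R^k, so covariances are sums of walk weights.
r = 0.4
R = [[0.0, r], [r, 0.0]]

# Exact inverse of J = [[1, -r], [-r, 1]] in closed form.
d = 1.0 - r * r
exact = [[1 / d, r / d], [r / d, 1 / d]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Partial sums of the walk-sum series, sum_{k=0..K} R^k.
S = [[1.0, 0.0], [0.0, 1.0]]  # R^0 = I
P = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(60):
    P = matmul(P, R)
    S = [[S[i][j] + P[i][j] for j in range(2)] for i in range(2)]
```

After 60 terms the partial sum matches the exact inverse to high precision; this series view is what makes the distributed, iteration-based inference the thesis studies possible when direct matrix inversion is not.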