Results 1-10 of 67
An Augmented Lagrangian Approach to Constrained MAP Inference
Cited by 37 (3 self)
We propose a new algorithm for approximate MAP inference on factor graphs, by combining augmented Lagrangian optimization with the dual decomposition method. Each slave subproblem is given a quadratic penalty, which pushes toward faster consensus than in previous subgradient approaches. Our algorithm is provably convergent, parallelizable, and suitable for fine decompositions of the graph. We show how it can efficiently handle problems with (possibly global) structural constraints via simple sort operations. Experiments on synthetic and real-world data show that our approach compares favorably with the state of the art.
An Alternating Direction Method for Dual MAP LP Relaxation
Cited by 32 (2 self)
Maximum a posteriori (MAP) estimation is an important task in many applications of probabilistic graphical models. Although finding an exact solution is generally intractable, approximations based on linear programming (LP) relaxation often provide good approximate solutions. In this paper we present an algorithm for solving the LP relaxation optimization problem. In order to overcome the lack of strict convexity, we apply an augmented Lagrangian method to the dual LP. The algorithm, based on the alternating direction method of multipliers (ADMM), is guaranteed to converge to the global optimum of the LP relaxation objective. Our experimental results show that this algorithm is competitive with other state-of-the-art algorithms for approximate MAP estimation.
A Tutorial on Dual Decomposition and Lagrangian Relaxation for Inference in Natural Language Processing
Cited by 25 (4 self)
Dual decomposition, and more generally Lagrangian relaxation, is a classical method for combinatorial optimization; it has recently been applied to several inference problems in natural language processing (NLP). This tutorial gives an overview of the technique. We describe example algorithms, describe formal guarantees for the method, and describe practical issues in implementing the algorithms. While our examples are predominantly drawn from the NLP literature, the material should be of general relevance to inference problems in machine learning. A central theme of this tutorial is that Lagrangian relaxation is naturally applied in conjunction with a broad class of combinatorial algorithms, allowing inference in models that go significantly beyond previous work on Lagrangian relaxation for inference in graphical models.
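As a rough illustration of the technique this tutorial surveys, the sketch below runs subgradient dual decomposition on a toy problem. It is not taken from the tutorial: the function name `dual_decompose`, the linear scores `a` and `b`, and the `1/t` step size are all assumptions chosen for brevity. Two subproblems each pick a binary vector to maximize their own score, and Lagrange multipliers `u` are adjusted until the two choices agree.

```python
def dual_decompose(a, b, max_iters=100):
    """Toy subgradient dual decomposition (illustrative sketch only).

    Maximizes sum(a[i]*y[i]) + sum(b[i]*z[i]) subject to y == z,
    where y and z are binary vectors. The scores here are separable,
    so each subproblem is solved by a simple sign check; in real
    applications each would be a combinatorial solver.
    """
    n = len(a)
    u = [0.0] * n  # Lagrange multipliers for the constraint y == z
    for t in range(1, max_iters + 1):
        # Subproblem 1: maximize (a + u) . y over binary y
        y = [1 if a[i] + u[i] > 0 else 0 for i in range(n)]
        # Subproblem 2: maximize (b - u) . z over binary z
        z = [1 if b[i] - u[i] > 0 else 0 for i in range(n)]
        if y == z:
            # Agreement certifies optimality for the original problem
            return y, t
        step = 1.0 / t  # assumed decaying step size
        # Subgradient step on the dual: move u against (y - z)
        u = [u[i] - step * (y[i] - z[i]) for i in range(n)]
    return None, max_iters  # no agreement within the iteration budget


solution, iterations = dual_decompose([2.0, -3.0, 1.0], [-1.0, 1.0, 0.5])
print(solution, iterations)  # prints [1, 0, 1] 3
```

On this input the two subproblems disagree on the first coordinate for two iterations before the multipliers bring them into agreement; the result matches the sign pattern of `a[i] + b[i]`, which is the exact optimum for this separable toy case.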
Variational algorithms for marginal MAP
In UAI, 2011
Cited by 17 (4 self)
Marginal MAP problems are notoriously difficult tasks for graphical models. We derive a general variational framework for solving marginal MAP problems, in which we apply analogues of the Bethe, tree-reweighted, and mean field approximations. We then derive a “mixed” message passing algorithm and a convergent alternative using CCCP to solve the BP-type approximations. Theoretically, we give conditions under which the decoded solution is a global or local optimum, and obtain novel upper bounds on solutions. Experimentally we demonstrate that our algorithms outperform related approaches. We also show that EM and variational EM comprise a special case of our framework.
Efficiently Searching for Frustrated Cycles in MAP Inference
Cited by 11 (0 self)
Dual decomposition provides a tractable framework for designing algorithms for finding the most probable (MAP) configuration in graphical models. However, for many real-world inference problems, the typical decomposition has a large integrality gap, due to frustrated cycles. One way to tighten the relaxation is to introduce additional constraints that explicitly enforce cycle consistency. Earlier work showed that cluster-pursuit algorithms, which iteratively introduce cycle and other higher-order consistency constraints, allow one to exactly solve many hard inference problems. However, these algorithms explicitly enumerate a candidate set of clusters, limiting them to triplets or other short cycles. We solve the search problem for cycle constraints, giving a nearly linear time algorithm for finding the most frustrated cycle of arbitrary length. We show how to use this search algorithm together with the dual decomposition framework and cluster-pursuit. The new algorithm exactly solves MAP inference problems arising from relational classification and stereo vision.
Convergence rate analysis of MAP coordinate minimization algorithms
In NIPS, 2012
Cited by 10 (2 self)
Finding maximum a posteriori (MAP) assignments in graphical models is an important task in many applications. Since the problem is generally hard, linear programming (LP) relaxations are often used. Solving these relaxations efficiently is thus an important practical problem. In recent years, several authors have proposed message passing updates corresponding to coordinate descent in the dual LP. However, these are generally not guaranteed to converge to a global optimum. One approach to remedy this is to smooth the LP, and perform coordinate descent on the smoothed dual. However, little is known about the convergence rate of this procedure. Here we perform a thorough rate analysis of such schemes and derive primal and dual convergence rates. We also provide a simple dual-to-primal mapping that yields feasible primal solutions with a guaranteed rate of convergence. Empirical evaluation supports our theoretical claims and shows that the method is highly competitive with state-of-the-art approaches that yield global optima.
Comparing apples to oranges: a scalable solution with heterogeneous hashing
In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2013
Cited by 10 (6 self)
Although hashing techniques have been popular for the large-scale similarity search problem, most of the existing methods for designing optimal hash functions focus on homogeneous similarity assessment, i.e., the data entities to be indexed are of the same type. Realizing that heterogeneous entities and relationships are also ubiquitous in real-world applications, there is an emerging need to retrieve and search similar or relevant data entities from multiple heterogeneous domains, e.g., recommending relevant posts and images to a certain Facebook user. In this paper, we address the problem of “comparing apples to oranges” in the large-scale setting. Specifically, we propose a novel Relation-aware Heterogeneous Hashing (RaHH), which provides a general framework for generating hash codes of data entities sitting in multiple heterogeneous domains. Unlike some existing hashing methods that map heterogeneous data into a common Hamming space, the RaHH approach constructs a Hamming space for each type of data entity, and learns optimal mappings between them simultaneously. This lets the learned hash codes flexibly cope with the characteristics of different data domains. Moreover, the RaHH framework encodes both homogeneous and heterogeneous relationships between the data entities to design hash functions with improved accuracy. To validate the proposed RaHH method, we conduct extensive evaluations on two large datasets; one is crawled from a popular social media site, Tencent Weibo, and the other is an open dataset of Flickr (NUS-WIDE). The experimental results clearly demonstrate that RaHH outperforms several state-of-the-art hashing methods with significant performance gains.
Bethe-ADMM for Tree Decomposition based Parallel MAP Inference
Cited by 9 (3 self)
We consider the problem of maximum a posteriori (MAP) inference in discrete graphical models. We present a parallel MAP inference algorithm called Bethe-ADMM based on two ideas: tree decomposition of the graph and the alternating direction method of multipliers (ADMM). However, unlike the standard ADMM, we use an inexact ADMM augmented with a Bethe-divergence based proximal function, which makes each subproblem in ADMM easy to solve in parallel using the sum-product algorithm. We rigorously prove global convergence of Bethe-ADMM. The proposed algorithm is extensively evaluated on both synthetic and real datasets to illustrate its effectiveness. Further, the parallel Bethe-ADMM is shown to scale almost linearly with increasing number of cores.
Exploring Compositional High Order Pattern Potentials for Structured Output Learning
Cited by 8 (0 self)
When modeling structured outputs such as image segmentations, prediction can be improved by accurately modeling structure present in the labels. A key challenge is developing tractable models that are able to capture complex high-level structure like shape. In this work, we study the learning of a general class of pattern-like high order potential, which we call Compositional High Order Pattern Potentials (CHOPPs). We show that CHOPPs include the linear deviation pattern potentials of Rother et al. [26] and also Restricted Boltzmann Machines (RBMs); we also establish the near equivalence of these two models. Experimentally, we show that performance is affected significantly by the degree of variability present in the datasets, and we define a quantitative variability measure to aid in studying this. We then improve the performance of CHOPPs on high-variability datasets with two primary contributions: (a) developing a loss-sensitive joint learning procedure, so that internal pattern parameters can be learned in conjunction with other model potentials to minimize expected loss; and (b) learning an image-dependent mapping that encourages or inhibits patterns depending on image features. We also explore varying how multiple patterns are composed, and learning convolutional patterns. Quantitative results on challenging highly variable datasets show that the joint learning and image-dependent high order potentials can improve performance.
Generalized sequential tree-reweighted message passing
arXiv:1205.6352
Cited by 8 (3 self)
This paper addresses the problem of approximate MAP-MRF inference in general graphical models. Following [36], we consider a family of linear programming relaxations of the problem where each relaxation is specified by a set of nested pairs of factors for which the marginalization constraint needs to be enforced. We develop a generalization of the TRW-S algorithm [9] for this problem, where we use a decomposition into junction chains, monotonic w.r.t. some ordering on the nodes. This generalizes the monotonic chains in [9] in a natural way. We also show how to deal with nested factors in an efficient way. Experiments show an improvement over min-sum diffusion, MPLP and subgradient ascent algorithms on a number of computer vision and natural language processing problems.