Learning structured prediction models: a large margin approach
, 2004
Cited by 127 (7 self)
We consider large margin estimation in a broad range of prediction models where inference involves solving combinatorial optimization problems, for example, weighted graph cuts or matchings. Our goal is to learn parameters such that inference using the model reproduces correct answers on the training data. Our method relies on the expressive power of convex optimization problems to compactly capture inference or solution optimality in structured prediction models. Directly embedding this structure within the learning formulation produces concise convex problems for efficient estimation of very complex and diverse models. We describe experimental results on a matching task, disulfide connectivity prediction, showing significant improvements over state-of-the-art methods.
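As a rough illustration of the large-margin idea for matchings, the sketch below computes a structured hinge loss by brute force. It is not the paper's convex formulation: it assumes a small bipartite matching problem, a Hamming loss, and exhaustive enumeration of permutations, and the names `matching_score`, `hamming_loss`, and `structured_hinge` are hypothetical.

```python
from itertools import permutations


def matching_score(weights, matching):
    """Score of a perfect bipartite matching: item i is paired with matching[i]."""
    return sum(weights[i][j] for i, j in enumerate(matching))


def hamming_loss(matching, truth):
    """Number of items paired differently from the ground-truth matching."""
    return sum(1 for a, b in zip(matching, truth) if a != b)


def structured_hinge(weights, truth):
    """max over y of [score(y) + loss(y, truth)] minus score(truth).

    Brute force over all n! matchings (an assumption made for clarity);
    only feasible for very small n.
    """
    n = len(weights)
    best_augmented = max(
        matching_score(weights, y) + hamming_loss(y, truth)
        for y in permutations(range(n))
    )
    return best_augmented - matching_score(weights, truth)


if __name__ == "__main__":
    # The true matching wins by a margin larger than the loss: hinge is zero.
    print(structured_hinge([[3, 0], [0, 3]], (0, 1)))
    # The margin is too small: the hinge loss is positive.
    print(structured_hinge([[0.5, 0], [0, 0.5]], (0, 1)))
```

The loss is zero exactly when the true matching outscores every alternative by at least that alternative's Hamming loss, which is the margin condition that large-margin training tries to enforce.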
Discriminative learning of Markov random fields for segmentation of 3d scan data
- In Proc. of the Conf. on Computer Vision and Pattern Recognition (CVPR)
, 2005
Cited by 65 (5 self)
We address the problem of segmenting 3D scan data into objects or object classes. Our segmentation framework is based on a subclass of Markov Random Fields (MRFs) which support efficient graph-cut inference. The MRF models incorporate a large set of diverse features and enforce the preference that adjacent scan points have the same classification label. We use a recently proposed maximum-margin framework to discriminatively train the model from a set of labeled scans; as a result we automatically learn the relative importance of the features for the segmentation task. Performing graph-cut inference in the trained MRF can then be used to segment new scenes very efficiently. We test our approach on three large-scale datasets produced by different kinds of 3D sensors, showing its applicability to both outdoor and indoor environments containing diverse objects.
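To make "graph-cut inference" concrete, here is a minimal sketch for a binary MRF with non-negative unary costs and a Potts-style penalty on neighbours that disagree. It assumes a plain Edmonds-Karp max-flow rather than any solver used in the paper, and the function names and toy costs are hypothetical.

```python
from collections import deque


def energy(unary, edges, labels):
    """Unary costs plus weight w for every edge whose endpoints disagree."""
    total = sum(unary[i][lab] for i, lab in enumerate(labels))
    return total + sum(w for i, j, w in edges if labels[i] != labels[j])


def min_cut_labels(unary, edges):
    """Exact MAP labels for a binary associative MRF via s-t min cut.

    unary[i] = (cost of label 0, cost of label 1), both non-negative.
    edges = [(i, j, w)] with w >= 0. Returns (labels, min-cut value).
    """
    n = len(unary)
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # make sure the reverse residual edge exists

    for i, (cost0, cost1) in enumerate(unary):
        add_edge(s, i, cost1)  # cut when i lands on the sink side (label 1)
        add_edge(i, t, cost0)  # cut when i stays on the source side (label 0)
    for i, j, w in edges:
        add_edge(i, j, w)
        add_edge(j, i, w)

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break  # no augmenting path left: the flow is maximal
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        flow += push

    # Nodes still reachable from the source form the source side (label 0).
    labels = [0 if i in parent else 1 for i in range(n)]
    return labels, flow


if __name__ == "__main__":
    unary = [(0, 5), (5, 0), (2, 3)]
    edges = [(0, 1, 1), (1, 2, 2)]
    labels, cut = min_cut_labels(unary, edges)
    print(labels, cut, energy(unary, edges, labels))
```

In the toy chain, the third node on its own would prefer label 0, but the strong link to its neighbour pulls it to label 1, which is the smoothing behaviour the abstract describes for adjacent scan points.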
Get out the vote: Determining support or opposition from Congressional floor-debate transcripts
- In Proceedings of EMNLP
, 2006
Cited by 56 (2 self)
We investigate whether one can determine from the transcripts of U.S. Congressional floor debates whether the speeches represent support of or opposition to proposed legislation. To address this problem, we exploit the fact that these speeches occur as part of a discussion; this allows us to use sources of information regarding relationships between discourse segments, such as whether a given utterance indicates agreement with the opinion expressed by another. We find that the incorporation of such information yields substantial improvements over classifying speeches in isolation.
Supervised clustering with support vector machines
- In ICML
, 2005
Cited by 37 (4 self)
Supervised clustering is the problem of training a clustering algorithm to produce desirable clusterings: given sets of items and complete clusterings over these sets, we learn how to cluster future sets of items. Example applications include noun-phrase coreference clustering and clustering news articles by whether they refer to the same topic. In this paper we present an SVM algorithm that trains a clustering algorithm by adapting the item-pair similarity measure. The algorithm may optimize a variety of different clustering functions to a variety of clustering performance measures. We empirically evaluate the algorithm for noun-phrase and news article clustering.
Learning CRFs using Graph Cuts
Cited by 35 (4 self)
Many computer vision problems are naturally formulated as random fields, specifically MRFs or CRFs. The introduction of graph cuts has enabled efficient and optimal inference in associative random fields, greatly advancing applications such as segmentation, stereo reconstruction and many others. However, while fast inference is now widespread, parameter learning in random fields has remained an intractable problem. This paper shows how to apply fast inference algorithms, in particular graph cuts, to learn parameters of random fields with similar efficiency. We find optimal parameter values under standard regularized objective functions that ensure good generalization. Our algorithm enables learning of many parameters in reasonable time, and we explore further speedup techniques. We also discuss extensions to non-associative and multi-class problems. We evaluate the method on image segmentation and geometry recognition.
Kernel-Based Learning of Hierarchical Multilabel Classification Models
- Journal of Machine Learning Research
, 2006
Cited by 33 (5 self)
We present a kernel-based algorithm for hierarchical text classification where the documents are allowed to belong to more than one category at a time. The classification model is a variant of the Maximum Margin Markov Network framework, where the classification hierarchy is represented as a Markov tree equipped with an exponential family defined on the edges. We present an efficient optimization algorithm based on incremental conditional gradient ascent in single-example subspaces spanned by the marginal dual variables. The optimization is facilitated with a dynamic programming based algorithm that computes best update directions in the feasible set. Experiments show
Structured learning with approximate inference
- Advances in Neural Information Processing Systems
Cited by 31 (1 self)
In many structured prediction problems, the highest-scoring labeling is hard to compute exactly, leading to the use of approximate inference methods. However, when inference is used in a learning algorithm, a good approximation of the score may not be sufficient. We show in particular that learning can fail even with an approximate inference method with rigorous approximation guarantees. There are two reasons for this. First, approximate methods can effectively reduce the expressivity of an underlying model by making it impossible to choose parameters that reliably give good predictions. Second, approximations can respond to parameter changes in such a way that standard learning algorithms are misled. In contrast, we give two positive results in the form of learning bounds for the use of LP-relaxed inference in structured perceptron and empirical risk minimization settings. We argue that without understanding combinations of inference and learning, such as these, that are appropriately compatible, learning performance under approximate inference cannot be guaranteed.
Structured prediction, dual extragradient and Bregman projections
- Journal of Machine Learning Research
, 2006
Cited by 30 (2 self)
We present a simple and scalable algorithm for maximum-margin estimation of structured output models, including an important class of Markov networks and combinatorial models. We formulate the estimation problem as a convex-concave saddle-point problem that allows us to use simple projection methods based on the dual extragradient algorithm (Nesterov, 2003). The projection step can be solved using dynamic programming or combinatorial algorithms for min-cost convex flow, depending on the structure of the problem. We show that this approach provides a memory-efficient alternative to formulations based on reductions to a quadratic program (QP). We analyze the convergence of the method and present experiments on two very different structured prediction tasks: 3D image segmentation and word alignment, illustrating the favorable scaling properties of our algorithm.
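The following toy sketch shows why an extragradient step helps on saddle-point problems. It uses the basic (primal) extragradient update on the unconstrained bilinear problem min over x, max over y of x*y, which is an assumption made for illustration; it is not the dual extragradient variant with projections that the abstract refers to, and the function names are hypothetical.

```python
def gradient_descent_ascent(x, y, step=0.5, iters=200):
    """Simultaneous descent in x and ascent in y on f(x, y) = x * y."""
    for _ in range(iters):
        x, y = x - step * y, y + step * x
    return x, y


def extragradient(x, y, step=0.5, iters=200):
    """Basic extragradient on f(x, y) = x * y (no constraints, no projection)."""
    for _ in range(iters):
        # Prediction step: take a trial gradient step from the current point.
        x_half = x - step * y
        y_half = y + step * x
        # Correction step: start again from the current point, but use the
        # gradients evaluated at the trial point.
        x, y = x - step * y_half, y + step * x_half
    return x, y


def norm(point):
    """Euclidean distance of (x, y) from the saddle point at the origin."""
    return (point[0] ** 2 + point[1] ** 2) ** 0.5


if __name__ == "__main__":
    print(norm(extragradient(1.0, 1.0)))            # shrinks toward 0
    print(norm(gradient_descent_ascent(1.0, 1.0)))  # grows without bound
```

On this problem each plain descent-ascent step multiplies the squared distance from the saddle point by 1 + step^2, so the iterates spiral outward, while each extragradient step multiplies it by 1 - step^2 + step^4, which is below 1 for any step size between 0 and 1.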
Using combinatorial optimization within max-product belief propagation
- Advances in Neural Information Processing Systems (NIPS)
, 2007
Cited by 25 (4 self)
In general, the problem of computing a maximum a posteriori (MAP) assignment in a Markov random field (MRF) is computationally intractable. However, in certain subclasses of MRF, an optimal or close-to-optimal assignment can be found very efficiently using combinatorial optimization algorithms: certain MRFs with mutual exclusion constraints can be solved using bipartite matching, and MRFs with regular potentials can be solved using minimum cut methods. However, these solutions do not apply to the many MRFs that contain such tractable components as sub-networks, but also other non-complying potentials. In this paper, we present a new method, called COMPOSE, for exploiting combinatorial optimization for sub-networks within the context of a max-product belief propagation algorithm. COMPOSE uses combinatorial optimization for computing exact max-marginals for an entire sub-network; these can then be used for inference in the context of the network as a whole. We describe highly efficient methods for computing max-marginals for sub-networks corresponding both to bipartite matchings and to regular networks. We present results on both synthetic and real networks encoding correspondence problems between images, which involve both matching constraints and pairwise geometric constraints. We compare to a range of current methods, showing that the ability of COMPOSE to transmit information globally across the network leads to improved convergence, decreased running time, and higher-scoring assignments.
Associative hierarchical CRFs for object class image segmentation
- In Proc. ICCV
, 2009
Cited by 18 (4 self)
Most methods for object class segmentation are formulated as a labelling problem over a single choice of quantisation of an image space: pixels, segments or groups of segments. It is well known that each quantisation has its fair share of pros and cons, and the existence of a common optimal quantisation level suitable for all object categories is highly unlikely. Motivated by this observation, we propose a hierarchical random field model that allows integration of features computed at different levels of the quantisation hierarchy. MAP inference in this model can be performed efficiently using powerful graph-cut-based move-making algorithms. Our framework generalises much of the previous work based on pixels or segments. We evaluate its efficiency on some of the most challenging datasets for object class segmentation, and show it obtains state-of-the-art results.

