Robust higher order potentials for enforcing label consistency. In: CVPR (2008)

by P Kohli, L Ladicky, P Torr
Results 1 - 10 of 259

The PASCAL Visual Object Classes (VOC) Challenge

by M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, A. Zisserman - INTERNATIONAL JOURNAL OF COMPUTER VISION
Cited by 629 (20 self)
... and detection, providing the vision and machine learning communities with a standard dataset of images and annotation, and standard evaluation procedures. Organised annually from 2005 to present, the challenge and its associated dataset has become accepted as the benchmark for object detection. This paper describes the dataset and evaluation procedure. We review the state-of-the-art in evaluated methods for both classification and detection, analyse whether the methods are statistically different, what they are learning from the images (e.g. the object or its context), and what the methods find easy or confuse. The paper concludes with lessons learnt in the three year history of the challenge, and proposes directions for future improvement and extension.

Computer Vision: Algorithms and Applications

by Richard Szeliski, 2010
Cited by 252 (2 self)
Abstract not found

Decomposing a Scene into Geometric and Semantically Consistent Regions

by Stephen Gould, Richard Fulton, Daphne Koller
Cited by 174 (11 self)
High-level, or holistic, scene understanding involves reasoning about objects, regions, and the 3D relationships between them. This requires a representation above the level of pixels that can be endowed with high-level attributes such as class of object/region, its orientation, and (rough 3D) location within the scene. Towards this goal, we propose a region-based model which combines appearance and scene geometry to automatically decompose a scene into semantically meaningful regions. Our model is defined in terms of a unified energy function over scene appearance and structure. We show how this energy function can be learned from data and present an efficient inference technique that makes use of multiple over-segmentations of the image to propose moves in the energy-space. We show, experimentally, that our method achieves state-of-the-art performance on the tasks of both multi-class image segmentation and geometric reasoning. Finally, by understanding region classes and geometry, we show how our model can be used as the basis for 3D reconstruction of the scene.

Citation Context

...how how our model can be used as the basis for 3D reconstruction of the scene. 1. Introduction With recent success on many vision subtasks—object detection [21, 18, 3], multi-class image segmentation [17, 7, 13], and 3D reconstruction [10, 16]—holistic scene understanding has emerged as one of the next great challenges for computer vision [11, 9, 19]. Here the aim is to reason jointly about objects, regions ...

Associative hierarchical CRFs for object class image segmentation

by Chris Russell, Philip H. S. Torr, Pushmeet Kohli - in Proc. ICCV, 2009
Cited by 172 (25 self)
Most methods for object class segmentation are formulated as a labelling problem over a single choice of quantisation of an image space: pixels, segments or groups of segments. It is well known that each quantisation has its fair share of pros and cons, and the existence of a common optimal quantisation level suitable for all object categories is highly unlikely. Motivated by this observation, we propose a hierarchical random field model that allows integration of features computed at different levels of the quantisation hierarchy. MAP inference in this model can be performed efficiently using powerful graph cut based move making algorithms. Our framework generalises much of the previous work based on pixels or segments. We evaluate its efficiency on some of the most challenging data-sets for object class segmentation, and show it obtains state-of-the-art results.

Citation Context

...erson and the sign board. a loss in the information content/discriminative power associated with each segment. Another interesting method, and one closely related to ours, was proposed by Kohli et al. [15]. By formulating the labelling problem as a CRF defined over pixels, they were able to recover from misleading segments which spanned multiple object classes. Further, they were able to encourage ind...

Fast approximate energy minimization with label costs

by Andrew Delong, Anton Osokin, Hossam N. Isack, Yuri Boykov, 2010
Cited by 110 (9 self)
The α-expansion algorithm [7] has had a significant impact in computer vision due to its generality, effectiveness, and speed. Thus far it can only minimize energies that involve unary, pairwise, and specialized higher-order terms. Our main contribution is to extend α-expansion so that it can simultaneously optimize “label costs” as well. An energy with label costs can penalize a solution based on the set of labels that appear in it. The simplest special case is to penalize the number of labels in the solution. Our energy is quite general, and we prove optimality bounds for our algorithm. A natural application of label costs is multi-model fitting, and we demonstrate several such applications in vision: homography detection, motion segmentation, and unsupervised image segmentation. Our C++/MATLAB implementation is publicly available.
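
As a rough illustration of what "an energy with label costs" means in the abstract above, the following Python sketch evaluates such an energy for one fixed labeling. It covers only the simplest per-label case, in which each label that appears in the solution is charged once. The function names, the argument layout and the Potts pairwise term are assumptions made for this example; they are not taken from the authors' C++/MATLAB implementation, and the sketch does not perform the α-expansion optimization itself.

```python
from typing import Callable, Dict, Sequence, Tuple


def energy_with_label_costs(
    labeling: Sequence[int],
    unary: Sequence[Sequence[float]],
    edges: Sequence[Tuple[int, int]],
    pairwise: Callable[[int, int], float],
    label_cost: Dict[int, float],
) -> float:
    """Evaluate unary + pairwise + per-label costs for one labeling.

    All argument names are hypothetical; unary[p][l] is the cost of
    giving variable p the label l.
    """
    data = sum(unary[p][label] for p, label in enumerate(labeling))
    smooth = sum(pairwise(labeling[p], labeling[q]) for p, q in edges)
    # Each label that appears anywhere in the solution is charged once,
    # no matter how many variables use it.
    labels = sum(label_cost[label] for label in set(labeling))
    return data + smooth + labels


def potts(a: int, b: int) -> float:
    """Assumed pairwise term: unit penalty when neighbouring labels differ."""
    return 0.0 if a == b else 1.0
```

With equal per-label costs, the last term is proportional to the number of distinct labels used, which is the special case the abstract calls the simplest.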

Citation Context

...combination of label subset cost potentials, but not the other way around. Section 6 elaborates on this point, and mentions a possible extension to our work based on the Robust P^n Potts construction [21]. A final detail is how to handle the case when label α was not used in the current labeling f ′ . The corrective term C α in (6) incorporates the label costs for α itself: C α (x) = ∑ ( ∏ ) hL − hL ¯...

Recognition using Regions

by Chunhui Gu, Joseph J. Lim, Pablo Arbeláez, Jitendra Malik
Cited by 106 (5 self)
This paper presents a unified framework for object detection, segmentation, and classification using regions. Region features are appealing in this context because: (1) they encode shape and scale information of objects naturally; (2) they are only mildly affected by background clutter. Regions have not been popular as features due to their sensitivity to segmentation errors. In this paper, we start by producing a robust bag of overlaid regions for each image using Arbeláez et al., CVPR 2009. Each region is represented by a rich set of image cues (shape, color and texture). We then learn region weights using a max-margin framework. In detection and segmentation, we apply a generalized Hough voting scheme to generate hypotheses of object locations, scales and support, followed by a verification classifier and a constrained segmenter on each hypothesis. The proposed approach significantly outperforms the state of the art on the ETHZ shape database (87.1% average detection rate compared to Ferrari et al.’s 67.2%), and achieves competitive performance on the Caltech 101 database.

Citation Context

...or recognition has been widely acknowledged in the computer vision community but it is often represented by a holistic descriptor of an image [72], or pairwise potentials of neighboring pixels/segments [43, 51, 90, 78, 99, 9, 39, 49, 77]. However, contextual cues are naturally encoded through a “partonomy” of the image, the hierarchical representation relating parts to objects and to the scene. Our second ancestry framework [62] natur...

Graph Cut based Inference with Co-occurrence Statistics

by Lubor Ladicky, Chris Russell, Pushmeet Kohli, Philip H. S. Torr, Oxford Brookes
Cited by 100 (13 self)
Markov and conditional random fields (CRFs) used in computer vision typically model only local interactions between variables, as this is computationally tractable. In this paper we consider a class of global potentials defined over all variables in the CRF. We show how they can be readily optimised using standard graph cut algorithms at little extra expense compared to a standard pairwise field. This result can be directly used for the problem of class based image segmentation which has seen increasing recent interest within computer vision. Here the aim is to assign a label to each pixel of a given image from a set of possible object classes. Typically these methods use random fields to model local interactions between pixels or super-pixels. One of the cues that helps recognition is global object co-occurrence statistics, a measure of which classes (such as chair or motorbike) are likely to occur in the same image together. There have been several approaches proposed to exploit this property, but all of them suffer from different limitations and typically carry a high computational cost, preventing their application on large images. We find that the new model we propose produces an improvement in the labelling compared to just using a pairwise model.

Citation Context

...et Kohli, and Philip H.S. Torr L successively. The transformation function $T_\alpha(x_i, t_i)$ for an α-expansion move transforms the label of a random variable $x_i$ as:

$$T_\alpha(x_i, t_i) = \begin{cases} x_i & \text{if } t_i = 0 \\ \alpha & \text{if } t_i = 1 \end{cases} \qquad (13)$$

To derive a graph-construction that approximates the true cost of an α-expansion move we rewrite $C(L)$ as:

$$C(L) = \sum_{B \subseteq L} k_B, \qquad (14)$$

where the coefficients $k_B$ are calculated recursively as: $k_B = C(B) - \sum$ ...
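
The transformation function in (13) of the excerpt above can be sketched in a few lines of Python. This is only an illustration of how a binary move vector t turns a current labeling into a new one; the function name and the list-based representation are assumptions made for this example, and the sketch does not include the graph construction or the choice of t, which is where the actual optimization happens.

```python
from typing import List, Sequence


def apply_expansion_move(
    labels: Sequence[int], alpha: int, t: Sequence[int]
) -> List[int]:
    """Apply T_alpha elementwise: keep x_i when t_i = 0, switch to alpha when t_i = 1."""
    if len(labels) != len(t):
        raise ValueError("labels and t must have the same length")
    return [alpha if t_i == 1 else x_i for x_i, t_i in zip(labels, t)]
```

A move with every t_i equal to 0 leaves the labeling unchanged, so the current labeling is always among the candidates of a move and the best move cannot increase the energy.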

Beyond pairwise energies: Efficient optimization for higher-order MRFs

by Nikos Komodakis, Nikos Paragios - in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009
Cited by 80 (11 self)
In this paper, we introduce a higher-order MRF optimization framework. On the one hand, it is very general; we thus use it to derive a generic optimizer that can be applied to almost any higher-order MRF and that provably optimizes a dual relaxation related to the input MRF problem. On the other hand, it is also extremely flexible and thus can be easily adapted to yield far more powerful algorithms when dealing with subclasses of high-order MRFs. We thus introduce a new powerful class of high-order potentials, which are shown to offer enough expressive power and to be useful for many vision tasks. To address them, we derive, based on the same framework, a novel and extremely efficient message-passing algorithm, which goes beyond the aforementioned generic optimizer and is able to deliver almost optimal solutions of very high quality. Experimental results on vision problems demonstrate the extreme effectiveness of our approach. For instance, we show that in some cases we are even able to compute the global optimum for NP-hard higher-order MRFs in a very efficient manner.

Citation Context

...bility of higher-order models to vision. We should note that not many MRF algorithms for high-order vision problems have been proposed up to now. A notable exception is the recent work of Kohli et al. [3, 4], where an efficient inference technique was proposed for a specific class of higher-order MRFs. Lan et al. [8] presented an efficient but approximate version of BP, while Potetz [10] proposed a BP ad...

Minimizing Sparse Higher Order Energy Functions of Discrete Variables

by Carsten Rother, Pushmeet Kohli, Wei Feng, Jiaya Jia
Cited by 74 (13 self)
Higher order energy functions have the ability to encode high level structural dependencies between pixels, which have been shown to be extremely powerful for image labeling problems. Their use, however, is severely hampered in practice by the intractable complexity of representing and minimizing such functions. We observed that higher order functions encountered in computer vision are very often “sparse”, i.e. many labelings of a higher order clique are equally unlikely and hence have the same high cost. In this paper, we address the problem of minimizing such sparse higher order energy functions. Our method works by transforming the problem into an equivalent quadratic function minimization problem. The resulting quadratic function can be minimized using popular message passing or graph cut based algorithms for MAP inference. Although this is primarily a theoretical paper, it also shows how higher order functions can be used to obtain impressive results for the binary texture restoration problem.

Citation Context

...instance, the function shown in figure 4(c) is a lower envelope of the higher order functions shown in figure 4(b). This transformation method can be seen as a generalization of the method proposed in [8] for transforming the Robust P^n model potentials. 4. Transforming Pseudo-Boolean Functions In the previous section, we discussed how to transform multi-label higher order functions to quadratic ones ...

Stacked Hierarchical Labeling

by Daniel Munoz, J. Andrew Bagnell, Martial Hebert
Cited by 63 (17 self)
In this work we propose a hierarchical approach for labeling semantic objects and regions in scenes. Our approach is reminiscent of early vision literature in that we use a decomposition of the image in order to encode relational and spatial information. In contrast to much existing work on structured prediction for scene understanding, we bypass a global probabilistic model and instead directly train a hierarchical inference procedure inspired by the message passing mechanics of some approximate inference procedures in graphical models. This approach mitigates both the theoretical and empirical difficulties of learning probabilistic models when exact inference is intractable. In particular, we draw from recent work in machine learning and break the complex inference process into a hierarchical series of simple machine learning subproblems. Each subproblem in the hierarchy is designed to capture the image and contextual statistics in the scene. This hierarchy spans coarse-to-fine regions and explicitly models the mixtures of semantic labels that may be present due to imperfect segmentation. To avoid cascading of errors and overfitting, we train the learning problems in sequence to ensure robustness to likely errors earlier in the inference sequence and leverage the stacking approach developed by Cohen et al.

Citation Context

...vex nature (assuming no latent states) [19]. Furthermore, although exact inference is NP-hard over these models, there has been much recent progress towards efficient approximate inference techniques [12,14]. However, correctly optimizing these convex models requires exact inference during learning. Unfortunately, when exact inference cannot be performed, converging to the optimum is no longer guaranteed...
