Harmony Potentials for Joint Classification and Segmentation
- In Conference on Computer Vision and Pattern Recognition, 2010
Cited by 9 (1 self)
Hierarchical conditional random fields have been successfully applied to object segmentation. One reason is their ability to incorporate contextual information at different scales. However, these models do not allow multiple labels to be assigned to a single node. At higher scales in the image, this yields an oversimplified model, since multiple classes can reasonably be expected to appear within one region. This simplified model especially limits the impact that observations at larger scales may have on the CRF model. Neglecting the information at larger scales is undesirable, since class-label estimates based on these scales are more reliable than those at smaller, noisier scales. To address this problem, we propose a new potential, called the harmony potential, which can encode any possible combination of class labels. We propose an effective sampling strategy that renders the underlying optimization problem tractable. Results show that our approach obtains state-of-the-art results on two challenging datasets: Pascal VOC 2009 and MSRC-21.
SLIC Superpixels
Cited by 4 (1 self)
Superpixels are becoming increasingly popular for use in computer vision applications. However, there are few algorithms that output a desired number of regular, compact superpixels with a low computational overhead. We introduce a novel algorithm that clusters pixels in the combined five-dimensional color and image plane space to efficiently generate compact, nearly uniform superpixels. The simplicity of our approach makes it extremely easy to use (a single parameter specifies the number of superpixels), and the efficiency of the algorithm makes it very practical. Experiments show that our approach produces superpixels at a lower computational cost while achieving a segmentation quality equal to or greater than four state-of-the-art methods, as measured by boundary recall and under-segmentation error. We also demonstrate the benefits of our superpixel approach in contrast to existing methods for two tasks in which superpixels have already been shown to increase performance over pixel-based methods.
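The clustering in a combined color and image-plane space can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `compactness` weight, and the exact way color and spatial distances are combined are assumptions made for illustration. Each point is a 5-tuple (l, a, b, x, y), and each pixel is assigned to the nearest cluster center under a distance that scales the spatial term by the expected superpixel spacing.

```python
import math

def distance_5d(pixel, center, spacing, compactness=10.0):
    """Distance between two (l, a, b, x, y) points.

    The spatial term is divided by the expected superpixel spacing and
    weighted by `compactness` (a hypothetical parameter, not from the
    abstract) so that color and position are comparable.
    """
    d_color = math.dist(pixel[:3], center[:3])
    d_space = math.dist(pixel[3:], center[3:])
    return math.sqrt(d_color ** 2 + (d_space / spacing) ** 2 * compactness ** 2)

def assign_pixels(pixels, centers, spacing, compactness=10.0):
    """Return, for each pixel, the index of its nearest center."""
    return [
        min(range(len(centers)),
            key=lambda k: distance_5d(p, centers[k], spacing, compactness))
        for p in pixels
    ]
```

A full algorithm would alternate this assignment step with recomputing each center as the mean of its assigned pixels; only the assignment step is shown here.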
Video-based descriptors for object recognition
- Image and Vision Computing, 29(10):639
Cited by 3 (3 self)
We describe a visual recognition system operating on a hand-held device, based on a video-based feature descriptor, and characterize its invariance and discriminative properties. Feature selection and tracking are performed in real time and used to train a template-based classifier during a capture phase prompted by the user. During normal operation, the system scores objects in the field of view based on their ranking. Severe resource constraints have prompted a re-evaluation of existing algorithms, improving their performance (accuracy and robustness) as well as their computational efficiency. We motivate the design choices in the implementation with a characterization of the stability properties of local invariant detectors, and of the conditions under which a template-based descriptor is optimal. The analysis also highlights the role of time as a “weak supervisor” during training, which we exploit in our implementation.
Extracting Foreground Masks towards Object Recognition
Cited by 2 (0 self)
Effective segmentation prior to recognition has been shown to improve recognition performance. However, most segmentation algorithms adopt methods which are not explicitly linked to the goal of object recognition. Here we solve a related but slightly different problem in order to assist object recognition more directly: the extraction of a foreground mask, which identifies the locations of objects in the image. We propose a novel foreground/background segmentation algorithm that attempts to segment the interesting objects from the rest of the image, while maximizing an objective function which is tightly related to object recognition. We do this in a manner which requires no class-specific knowledge of object categories, using a probabilistic formulation which is derived from manually segmented images. The model includes a geometric prior and an appearance prior, whose parameters are learnt on the fly from images that are similar to the query image. We use graph-cut-based energy minimization to enforce spatial coherence on the model's output. The method is tested on the challenging VOC09 and VOC10 segmentation datasets, achieving excellent results in providing a foreground mask. We also provide comparisons to the recent segmentation method of [7].
Really quick shift: Image segmentation on a GPU
Cited by 1 (1 self)
The paper presents an exact GPU implementation of the quick shift image segmentation algorithm. Variants of the implementation which use global memory and texture caching are presented, and the paper shows that a method backed by texture caching can produce a 10-50X speedup for practical images, making computation of super-pixels possible at 5-10 Hz on modestly sized (256x256) images.
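For readers unfamiliar with quick shift itself, the following is a minimal CPU sketch of the general idea, not the GPU implementation described above: each point is linked to its nearest neighbor of higher estimated density, and links longer than a threshold are cut so that the remaining trees form segments. The function name, the Gaussian density estimate, and the parameters `sigma` and `tau` are illustrative assumptions; the quadratic loops are exactly the kind of work a GPU version would parallelize.

```python
import math

def quick_shift(points, sigma=1.0, tau=3.0):
    """Return a parent index for each point (a root points to itself).

    points: list of equal-length coordinate tuples (for an image these
    would typically combine position and color).
    sigma:  bandwidth of the Gaussian density estimate (assumed).
    tau:    maximum allowed link length; longer links are not created.
    """
    n = len(points)
    # Parzen-style density estimate at every point.
    density = [
        sum(math.exp(-math.dist(p, q) ** 2 / (2.0 * sigma * sigma)) for q in points)
        for p in points
    ]
    parents = []
    for i in range(n):
        best, best_dist = i, tau
        for j in range(n):
            d = math.dist(points[i], points[j])
            # Link to the closest point that has strictly higher density.
            if density[j] > density[i] and d < best_dist:
                best, best_dist = j, d
        parents.append(best)
    return parents
```

Points that share a root after following parent links belong to the same segment.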
Using Global Bag of Features Models in Random Fields for Joint Categorization and Segmentation of Objects
Cited by 1 (0 self)
We propose to bridge the gap between Random Field (RF) formulations for joint categorization and segmentation (JCaS), which model local interactions among pixels and superpixels, and Bag of Features categorization algorithms, which use global descriptors. For this purpose, we introduce new higher-order potentials that encode the classification cost of a histogram extracted from all the objects in an image that belong to a particular category, where the cost is given as the output of a classifier when applied to the histogram. The potentials efficiently encode the classification costs of several histograms resulting from the different possible segmentations of an image. They can be integrated with existing potentials, hence providing a natural unification of global and local interactions. The potentials' parameters can be treated as parameters of the RF and hence be jointly learnt along with the other parameters of the RF. Experiments show that our framework can be used to improve the performance of existing JCaS algorithms.
Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials
Cited by 1 (0 self)
Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image regions. While region-level models often feature dense pairwise connectivity, pixel-level models are considerably larger and have only permitted sparse graph structures. In this paper, we consider fully connected CRF models defined on the complete set of pixels in an image. The resulting graphs have billions of edges, making traditional inference algorithms impractical. Our main contribution is a highly efficient approximate inference algorithm for fully connected CRF models in which the pairwise edge potentials are defined by a linear combination of Gaussian kernels. Our experiments demonstrate that dense connectivity at the pixel level substantially improves segmentation and labeling accuracy.
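A pairwise term defined by a linear combination of Gaussian kernels can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's formulation: every kernel here acts on the same feature vector with an isotropic bandwidth, and the names `weights` and `bandwidths` are hypothetical. It shows only how one edge's kernel value is computed, not the efficient inference algorithm.

```python
import math

def gaussian_mixture_kernel(f_i, f_j, weights, bandwidths):
    """Weighted sum of Gaussian kernels on two feature vectors.

    f_i, f_j:   feature vectors of two pixels (e.g. position and color).
    weights:    one linear-combination weight per kernel (assumed).
    bandwidths: one isotropic standard deviation per kernel (assumed).
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(f_i, f_j))
    return sum(
        w * math.exp(-sq_dist / (2.0 * s * s))
        for w, s in zip(weights, bandwidths)
    )
```

Evaluating this naively for every pixel pair is what makes the fully connected model expensive: an image with N pixels has on the order of N squared edges.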
Fast and Robust Object Segmentation with the Integral Linear Classifier
We propose an efficient method, built on the popular Bag of Features approach, that obtains robust multiclass pixel-level object segmentation of an image in less than 500ms, with results comparable to or better than most state-of-the-art methods. We introduce the Integral Linear Classifier (ILC), which can readily obtain the classification score for any image sub-window with only 6 additions and 1 product by fusing the accumulation and classification steps in a single operation. In order to design a method as efficient as possible, our building blocks are carefully selected from the quickest in the state of the art. More precisely, we evaluate the performance of three popular local descriptors that can be very efficiently computed using integral images, and two fast quantization methods: the Hierarchical K-Means and the Extremely Randomized Forest. Finally, we explore the utility of adding spatial bins to the Bag of Features histograms and that of cascade classifiers to improve the obtained segmentation. Our method is compared to the state of the art on the difficult Graz-02 and PASCAL 2007 Segmentation Challenge datasets.
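The idea of fusing accumulation and classification can be sketched with an integral image of per-pixel classifier weights. This is an illustrative reading under assumptions, not the paper's exact ILC: the function names, the `word_map` layout (one visual-word index per pixel), and the unnormalized histogram are hypothetical, and the sketch does not reproduce the "6 additions and 1 product" operation count. It relies on the fact that a linear score on a histogram equals the sum, over the pixels in the window, of the weight of each pixel's visual word.

```python
def integral_image(values):
    """Summed-area table with one extra leading row and column of zeros."""
    h, w = len(values), len(values[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += values[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def build_score_table(word_map, weights):
    """Integral image of the classifier weight of each pixel's visual word."""
    return integral_image([[weights[word] for word in row] for row in word_map])

def window_score(ii, top, left, bottom, right, bias=0.0):
    """Linear score of the half-open window [top, bottom) x [left, right)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left] + bias
```

Once the table is built, scoring any sub-window costs a constant number of lookups, independent of the window size.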
IMAGE-BASED BUILDING CLASSIFICATION AND 3D MODELING WITH SUPER-PIXELS
Due to an increasing amount of aerial data, there is significant demand for automatic large-scale modeling of buildings. This work presents an image-driven method for automatic building extraction and 3D modeling from large-scale aerial imagery. We introduce a fast unsupervised segmentation technique based on super-pixels. Considering the super-pixels as the smallest units in the image space, these regions offer important spatial support for an information fusion step and enable a generic modeling of arbitrary building footprints and rooftop shapes. In our three-stage approach, we integrate both appearance information and height data to accurately classify building pixels and to model complex rooftops. We apply our approach to datasets with challenging characteristics, consisting of many overlapping aerial images. The classification pipeline is evaluated on ground truth data in terms of correctly labeled pixels. We use the building classification together with color and height for large-scale modeling of buildings.
A Fully Automated Approach to Segmentation of Irregularly Shaped Cellular Structures in EM Images
While there has been substantial progress in segmenting natural images, state-of-the-art methods that perform well in such tasks unfortunately tend to underperform when confronted with the different challenges posed by electron microscope (EM) data. For example, in EM imagery of neural tissue, numerous cells and subcellular structures appear within a single image, they exhibit irregular shapes that cannot be easily modeled by standard techniques, and confusing textures clutter the background. We propose a fully automated approach that handles these challenges by using sophisticated cues that capture global shape and texture information, and by learning the specific appearance of object boundaries. We demonstrate that our approach significantly outperforms state-of-the-art techniques and closely matches the performance of human annotators.

