Results 1-10 of 219
"GrabCut": interactive foreground extraction using iterated graph cuts
ACM Trans. Graph., 2004
Cited by 1130 (36 self)
The problem of efficient, interactive foreground/background segmentation in still images is of great practical importance in image editing. Classical image segmentation tools use either texture (colour) information, e.g. Magic Wand, or edge (contrast) information, e.g. Intelligent Scissors. Recently, an approach based on optimization by graph-cut has been developed which successfully combines both types of information. In this paper we extend the graph-cut approach in three respects. First, we have developed a more powerful, iterative version of the optimisation. Secondly, the power of the iterative algorithm is used to simplify substantially the user interaction needed for a given quality of result. Thirdly, a robust algorithm for “border matting” has been developed to estimate simultaneously the alpha-matte around an object boundary and the colours of foreground pixels. We show that for moderately difficult examples the proposed method outperforms competitive tools.
Graph Cuts and Efficient N-D Image Segmentation
2006
Cited by 307 (7 self)
Combinatorial graph cut algorithms have been successfully applied to a wide range of problems in vision and graphics. This paper focusses on possibly the simplest application of graph-cuts: segmentation of objects in image data. Despite its simplicity, this application epitomizes the best features of combinatorial graph cuts methods in vision: global optima, practical efficiency, numerical robustness, ability to fuse a wide range of visual cues and constraints, unrestricted topological properties of segments, and applicability to N-D problems. Graph cuts based approaches to object extraction have also been shown to have interesting connections with earlier segmentation methods such as snakes, geodesic active contours, and level-sets. The segmentation energies optimized by graph cuts combine boundary regularization with region-based properties in the same fashion as Mumford-Shah style functionals. We present motivation and detailed technical description of the basic combinatorial optimization framework for image segmentation via s/t graph cuts. After the general concept of using binary graph cut algorithms for object segmentation was first proposed and tested in Boykov and Jolly (2001), this idea was widely studied in computer vision and graphics communities. We provide links to a large number of known extensions based on iterative parameter re-estimation and learning, multi-scale or hierarchical approaches, narrow bands, and other techniques for demanding photo, video, and medical applications.
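The s/t graph cut formulation described in this abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a 1-D signal, squared-difference region costs against two hypothetical mean intensities (`fg_mean`, `bg_mean`), a constant boundary penalty (`smooth`), and a plain Edmonds-Karp max-flow in place of the specialised solvers used in practice.

```python
from collections import deque

def min_cut_segment(pixels, fg_mean, bg_mean, smooth=1.0):
    """Label each value of a 1-D signal 1 (object) or 0 (background)
    by computing a minimum s/t cut with Edmonds-Karp max-flow."""
    n = len(pixels)
    s, t = n, n + 1                      # source = object terminal, sink = background terminal
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i, v in enumerate(pixels):
        cap[s][i] = (v - bg_mean) ** 2   # region cost paid if pixel i is cut from the source (background)
        cap[i][t] = (v - fg_mean) ** 2   # region cost paid if pixel i is cut from the sink (object)
    for i in range(n - 1):
        cap[i][i + 1] = cap[i + 1][i] = smooth   # boundary cost between neighbours
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break                        # no path left: the flow is maximal
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
    # Nodes still reachable from the source lie on the object side of the cut.
    return [1 if parent[i] != -1 else 0 for i in range(n)]

if __name__ == "__main__":
    print(min_cut_segment([9, 8, 9, 1, 2, 1], fg_mean=9, bg_mean=1))
```

Raising `smooth` makes isolated outliers cheaper to absorb into the surrounding segment than to cut out, which is the boundary regularization the abstract refers to.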
Robust Higher Order Potentials for Enforcing Label Consistency
2009
Cited by 259 (34 self)
This paper proposes a novel framework for labelling problems which is able to combine multiple segmentations in a principled manner. Our method is based on higher order conditional random fields and uses potentials defined on sets of pixels (image segments) generated using unsupervised segmentation algorithms. These potentials enforce label consistency in image regions and can be seen as a generalization of the commonly used pairwise contrast sensitive smoothness potentials. The higher order potential functions used in our framework take the form of the Robust P^n model and are more general than the P^n Potts model recently proposed by Kohli et al. We prove that the optimal swap and expansion moves for energy functions composed of these potentials can be computed by solving an st-mincut problem. This enables the use of powerful graph cut based move making algorithms for performing inference in the framework. We test our method on the problem of multi-class object segmentation by augmenting the conventional CRF used for object segmentation with higher order potentials defined on image regions. Experiments on challenging data sets show that integration of higher order potentials quantitatively and qualitatively improves results leading to much better definition of object boundaries. We
Learning to Detect A Salient Object
In CVPR, 2007
Cited by 240 (12 self)
We study visual attention by detecting a salient object in an input image. We formulate salient object detection as an image segmentation problem, where we separate the salient object from the image background. We propose a set of novel features including multi-scale contrast, center-surround histogram, and color spatial distribution to describe a salient object locally, regionally, and globally. A Conditional Random Field is learned to effectively combine these features for salient object detection. We also constructed a large image database containing tens of thousands of images carefully labeled by multiple users. To our knowledge, it is the first large image database for quantitative evaluation of visual attention algorithms. We validate our approach on this image database, which is publicly available with this paper.
TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context
2007
Cited by 217 (9 self)
This paper details a new approach for learning a discriminative model of object classes, incorporating texture, layout, and context information efficiently. The learned model is used for automatic visual understanding and semantic segmentation of photographs. Our discriminative model exploits texture-layout filters, novel features based on textons, which jointly model patterns of texture and their spatial layout. Unary classification and feature selection is achieved using shared boosting to give an efficient classifier which can be applied to a large number of classes. Accurate image segmentation is achieved by incorporating the unary classifier in a conditional random field, which (i) captures the spatial interactions between class labels of neighboring pixels, and (ii) improves the segmentation of specific object instances. Efficient training of the model on large datasets is achieved by exploiting both random feature selection and piecewise training methods. High classification and segmentation accuracy is
Multi-view Stereo via Volumetric Graph-cuts and Occlusion Robust Photo-Consistency
2007
Cited by 189 (9 self)
This paper presents a volumetric formulation for the multi-view stereo problem which is amenable to a computationally tractable global optimisation using Graph-cuts. Our approach is to seek the optimal partitioning of 3D space into two regions labelled as ‘object’ and ‘empty’ under a cost functional consisting of the following two terms: (1) a term that forces the boundary between the two regions to pass through photo-consistent locations and (2) a ballooning term that inflates the ‘object’ region. To take account of the effect of occlusion on the first term we use an occlusion robust photo-consistency metric based on Normalised Cross Correlation, which does not assume any geometric knowledge about the reconstructed object. The globally optimal 3D partitioning can be obtained as the minimum cut solution of a weighted graph.
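The abstract names Normalised Cross Correlation as the basis of its photo-consistency metric without giving a formula. As a minimal sketch, the textbook zero-mean NCC score between two equally sized intensity patches can be computed as follows; the function name `ncc` and the handling of flat patches are assumptions made here, and the occlusion-robust aggregation across views that the paper describes is not reproduced.

```python
import math

def ncc(patch_a, patch_b):
    """Zero-mean normalised cross-correlation of two equal-length
    intensity sequences; the result lies in [-1, 1]."""
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and of equal length")
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    dev_a = [x - mean_a for x in patch_a]
    dev_b = [y - mean_b for y in patch_b]
    denom = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denom == 0.0:
        # A constant patch has no texture, so the score is undefined;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom

if __name__ == "__main__":
    # The second patch is the first under a gain and offset change.
    print(ncc([1, 2, 3, 4], [7, 9, 11, 13]))
```

Because the means are subtracted and the result is divided by the patch norms, the score is unaffected by per-patch gain and offset changes, which is a common reason for preferring it to raw intensity differences when comparing views.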
Accelerated training of conditional random fields with stochastic gradient methods
In ICML, 2006
Cited by 141 (6 self)
We apply Stochastic Meta-Descent (SMD), a stochastic gradient optimization method with gain vector adaptation, to the training of Conditional Random Fields (CRFs). On several large data sets, the resulting optimizer converges to the same quality of solution over an order of magnitude faster than limited-memory BFGS, the leading method reported to date. We report results for both exact and inexact inference techniques.
Bilayer segmentation of live video
In: IEEE Conference on Computer Vision and Pattern Recognition, 2006
Cited by 108 (3 self)
Figure 1 (a: input sequence; b: automatic layer separation and background substitution in three different frames): An example of automatic foreground/background segmentation in monocular image sequences. Despite the challenging foreground motion the person is accurately extracted from the sequence and then composited free of aliasing upon a different background; a useful tool in videoconferencing applications. The sequences and ground truth data used throughout this paper are available from [1].
This paper presents an algorithm capable of real-time separation of foreground from background in monocular video sequences. Automatic segmentation of layers from colour/contrast or from motion alone is known to be error-prone. Here motion, colour and contrast cues are probabilistically fused together with spatial and temporal priors to infer layers accurately and efficiently. Central to our algorithm is the fact that pixel velocities are not needed, thus removing the need for optical flow estimation, with its tendency to error and computational expense. Instead, an efficient motion vs non-motion classifier is trained to operate directly and jointly on intensity-change and contrast. Its output is then fused with colour information. The prior on segmentation is represented by a second order, temporal, Hidden Markov Model, together with a spatial MRF favouring coherence except where contrast is high. Finally, accurate layer segmentation and explicit occlusion detection are efficiently achieved by binary graph cut. The segmentation accuracy of the proposed algorithm is quantitatively evaluated with respect to existing ground-truth data and found to be comparable to the accuracy of a state of the art stereo segmentation algorithm. Foreground/background segmentation is demonstrated in the application of live background substitution and shown to generate convincingly good quality composite video.
P³ & beyond: Solving energies with higher order cliques
In Computer Vision and Pattern Recognition, 2007
Cited by 102 (17 self)
In this paper we extend the class of energy functions for which the optimal α-expansion and αβ-swap moves can be computed in polynomial time. Specifically, we introduce a class of higher order clique potentials and show that the expansion and swap moves for any energy function composed of these potentials can be found by minimizing a submodular function. We also show that for a subset of these potentials, the optimal move can be found by solving an st-mincut problem. We refer to this subset as the P^n Potts model. Our results enable the use of powerful move making algorithms, i.e. α-expansion and αβ-swap, for minimization of energy functions involving higher order cliques. Such functions have the capability of modelling the rich statistics of natural scenes and can be used for many applications in computer vision. We demonstrate their use on one such application, i.e. the texture based video segmentation problem.
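The abstract does not spell out the P^n Potts potential. The sketch below assumes the form in which the model is usually stated: a clique pays a label-specific cost when all of its variables agree on that label, and a larger fixed cost otherwise. The names `gamma` and `gamma_max` are illustrative choices made here, not identifiers from the paper.

```python
def pn_potts(labels, gamma, gamma_max):
    """Cost of one clique under an assumed P^n Potts form:
    gamma[l] if every variable in the clique takes label l,
    gamma_max (taken to be at least every gamma[l]) otherwise."""
    if not labels:
        raise ValueError("clique must contain at least one variable")
    if any(cost > gamma_max for cost in gamma.values()):
        raise ValueError("gamma_max must be at least every per-label cost")
    first = labels[0]
    if all(label == first for label in labels):
        return gamma[first]
    return gamma_max

if __name__ == "__main__":
    costs = {0: 1.0, 1: 2.0}             # hypothetical per-label costs
    print(pn_potts([0, 0, 0], costs, 5.0))
    print(pn_potts([0, 1, 0], costs, 5.0))
```

With a clique of two variables this reduces to the familiar pairwise Potts penalty, which is one way to read the "higher order" generalisation the abstract describes.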