Results 1-10 of 175
Object Detection with Discriminatively Trained Part-Based Models
Cited by 1422 (49 self)
We describe an object detection system based on mixtures of multiscale deformable part models. Our system is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges. While deformable part models have become quite popular, their value had not been demonstrated on difficult benchmarks such as the PASCAL datasets. Our system relies on new methods for discriminative training with partially labeled data. We combine a margin-sensitive approach for data-mining hard negative examples with a formalism we call latent SVM. A latent SVM is a reformulation of MI-SVM in terms of latent variables. A latent SVM is semi-convex and the training problem becomes convex once latent information is specified for the positive examples. This leads to an iterative training algorithm that alternates between fixing latent values for positive examples and optimizing the latent SVM objective function.
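The alternating scheme described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`best_latent`, `latent_score`, `train_latent_svm`), the representation of each example as a list of candidate feature vectors (one per latent value), the plain subgradient-descent optimizer, and all hyperparameter values are assumptions made for illustration.

```python
def dot(w, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(w, v))


def best_latent(w, candidates):
    """Index of the candidate feature vector with the highest score under w."""
    return max(range(len(candidates)), key=lambda z: dot(w, candidates[z]))


def latent_score(w, candidates):
    """Latent score of an example: the maximum over its candidate vectors."""
    return max(dot(w, phi) for phi in candidates)


def train_latent_svm(positives, negatives, dim, C=1.0,
                     outer_iters=5, inner_iters=200, lr=0.01):
    """Alternate between fixing latent values for positives and
    running subgradient descent on the resulting convex objective.

    Each example is a list of candidate feature vectors (hypothetical
    stand-ins for the features at each latent placement).
    """
    w = [0.0] * dim
    for _ in range(outer_iters):
        # Step 1: fix the latent value of every positive under the current w.
        fixed_pos = [cands[best_latent(w, cands)] for cands in positives]

        # Step 2: minimize 0.5*||w||^2 + C * (sum of hinge losses).
        # Positives use their fixed vector; negatives keep the max over
        # latent values, which is still convex in w.
        for _ in range(inner_iters):
            grad = list(w)  # gradient of the regularizer
            for phi in fixed_pos:
                if dot(w, phi) < 1.0:  # positive violates the margin
                    grad = [g - C * p for g, p in zip(grad, phi)]
            for cands in negatives:
                phi = cands[best_latent(w, cands)]
                if dot(w, phi) > -1.0:  # negative violates the margin
                    grad = [g + C * p for g, p in zip(grad, phi)]
            w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


# Toy data: the last feature is a constant bias term.
toy_pos = [[[0.0, 1.0], [2.0, 1.0]], [[2.5, 1.0], [0.2, 1.0]]]
toy_neg = [[[-2.0, 1.0], [-1.0, 1.0]], [[-1.5, 1.0], [-2.5, 1.0]]]
toy_w = train_latent_svm(toy_pos, toy_neg, dim=2)
```

On the toy data, the first positive starts with an uninformative latent choice; once the weights begin to favor the first feature, the relabeling step switches it to the better candidate, which is the behavior the alternation is meant to capture.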
A discriminatively trained, multiscale, deformable part model
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008
Cited by 555 (11 self)
This paper describes a discriminatively trained, multiscale, deformable part model for object detection. Our system achieves a twofold improvement in average precision over the best performance in the 2006 PASCAL person detection challenge. It also outperforms the best results in the 2007 challenge in ten out of twenty categories. The system relies heavily on deformable parts. While deformable part models have become quite popular, their value had not been demonstrated on difficult benchmarks such as the PASCAL challenge. Our system also relies heavily on new methods for discriminative training. We combine a margin-sensitive approach for data mining hard negative examples with a formalism we call latent SVM. A latent SVM, like a hidden CRF, leads to a non-convex training problem. However, a latent SVM is semi-convex and the training problem becomes convex once latent information is specified for the positive examples. We believe that our training methods will eventually make possible the effective use of more latent information such as hierarchical (grammar) models and models involving latent three-dimensional pose.
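The semi-convexity statement in this abstract can be made concrete with the usual way a latent SVM is written down. The notation below (the feature map $\Phi$, the latent set $Z(x)$, the constant $C$, and the fixed latent value $z_i$) does not appear in the abstract and is assumed here for illustration.

```latex
% Score of an example x: best setting of the latent variables z
f_w(x) = \max_{z \in Z(x)} w \cdot \Phi(x, z)

% Training objective over labeled examples (x_i, y_i), with y_i \in \{-1, +1\}
L(w) = \tfrac{1}{2}\,\|w\|^2 + C \sum_{i} \max\bigl(0,\; 1 - y_i f_w(x_i)\bigr)

% Negatives (y_i = -1): \max(0,\, 1 + f_w(x_i)) is convex in w,
% because f_w is a maximum of functions that are linear in w.
% Positives (y_i = +1): \max(0,\, 1 - f_w(x_i)) is not convex in general,
% but fixing z_i gives \max(0,\, 1 - w \cdot \Phi(x_i, z_i)), which is convex.
```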
Contour-based learning for object detection
In Proceedings, International Conference on Computer Vision, 2005
Cited by 152 (1 self)
We present a novel categorical object detection scheme that uses only local contour-based features. A two-stage, partially supervised learning architecture is proposed: a rudimentary detector is learned from a very small set of segmented images and applied to a larger training set of unsegmented images; the second stage bootstraps these detections to learn an improved classifier while explicitly training against clutter. The detectors are learned with a boosting algorithm which creates a location-sensitive classifier using a discriminative set of features from a randomly chosen dictionary of contour fragments. We present results that are very competitive with other state-of-the-art object detection schemes and show robustness to object articulations, clutter, and occlusion. Our major contributions are the application of boosted local contour-based features for object detection in a partially supervised learning framework, and an efficient new boosting procedure for simultaneously selecting features and estimating per-feature parameters.
Three-dimensional shape knowledge for joint image segmentation and pose estimation
Pattern Recognition, volume 3663 of LNCS, 2005
Cited by 59 (30 self)
In this article we present the integration of 3D shape knowledge into a variational model for level set based image segmentation and tracking. Given a 3D surface model of an object that is visible in the image of one or multiple cameras calibrated to the same world coordinate system, the object contour extracted by the segmentation method is applied to estimate the 3D pose parameters of the object. Vice versa, the surface model projected to the image plane helps in a top-down manner to improve the extraction of the contour. While common alternative segmentation approaches, which integrate 2D shape knowledge, face the problem that an object can look very different from various viewpoints, a 3D free-form model ensures that for each view the model can fit the data in the image very well. Moreover, one additionally solves the higher-level problem of determining the object pose in 3D space. Due to the variational formulation, the approach clearly states all model assumptions in a single energy functional that is locally minimized by our method. Its performance is demonstrated by experiments with a monocular and a stereo camera system.
Estimating 3D shape and texture using pixel intensity, edges, specular highlights, texture constraints and a prior
Proceedings of Computer Vision and Pattern Recognition, 2005
Cited by 38 (4 self)
We present a novel algorithm aiming to estimate the 3D shape and texture of a human face, along with the 3D pose and the light direction, from a single photograph by recovering the parameters of a 3D Morphable Model. Generally, the algorithms tackling the problem of 3D shape estimation from image data use only the pixel intensity as input to drive the estimation process. This was previously achieved using a simple model, such as the Lambertian reflectance model, leading to a linear fitting algorithm. Alternatively, this problem was addressed using a more precise model and minimizing a non-convex cost function with many local minima. One way to reduce the local minima problem is to use a stochastic optimization algorithm. However, the convergence properties (such as the radius of convergence) of such algorithms are limited. Here, as well as the pixel intensity, we use various image features such as the edges or the location of the specular highlights. The 3D shape, texture and imaging parameters are then estimated by maximizing the posterior of the parameters given these image features. The overall cost function obtained is smoother and, hence, a stochastic optimization algorithm is not needed to avoid the local minima problem. This leads to the Multi-Features Fitting algorithm that has a wider radius of convergence and a higher level of precision. This is shown on some example photographs, and on a recognition experiment performed on the CMU-PIE image database.
Automatic salient object segmentation based on context and shape prior
In Proc. British Machine Vision Conference (BMVC), 2011
Cited by 38 (5 self)
We propose a novel automatic salient object segmentation algorithm which integrates both bottom-up salient stimuli and an object-level shape prior, i.e., a salient object has a well-defined closed boundary. Our approach is formalized as an iterative energy minimization framework, leading to binary segmentation of the salient object. Such energy minimization is initialized with a saliency map which is computed through context analysis based on multiscale superpixels. The object-level shape prior is then extracted by combining saliency with object boundary information. Both the saliency map and the shape prior are updated after each iteration. Experimental results on two public benchmark datasets show that our proposed approach outperforms state-of-the-art methods.
Shared Parts for Deformable Part-based Models
Cited by 36 (1 self)
The deformable part-based model (DPM) proposed by Felzenszwalb et al. has demonstrated state-of-the-art results in object localization. The model offers a high degree of learnt invariance by utilizing viewpoint-dependent mixture components and movable parts in each mixture component. One might hope to increase the accuracy of the DPM by increasing the number of mixture components and parts to give a more faithful model, but limited training data prevents this from being effective. We propose an extension to the DPM which allows for sharing of object part models among multiple mixture components as well as object classes. This results in more compact models and allows training examples to be shared by multiple components, ameliorating the effect of a limited size training set. We (i) reformulate the DPM to incorporate part sharing, and (ii) propose a novel energy function allowing for coupled training of mixture components and object classes. We report state-of-the-art results on the PASCAL VOC dataset.
Object Segmentation in Video: A Hierarchical Variational Approach for Turning Point Trajectories into Dense Regions
Cited by 34 (5 self)
Point trajectories have emerged as a powerful means to obtain high-quality and fully unsupervised segmentation of objects in video shots. They can exploit the long-term motion difference between objects, but they tend to be sparse due to computational reasons and the difficulty in estimating motion in homogeneous areas. In this paper we introduce a variational method to obtain dense segmentations from such sparse trajectory clusters. Information is propagated with a hierarchical, nonlinear diffusion process that runs in the continuous domain but takes superpixels into account. We show that this process raises the density from 3% to 100% and even increases the average precision of labels.
Simultaneous segmentation and pose estimation of humans using dynamic graph cuts
Int. J. Comput. Vision
Cited by 34 (0 self)
This paper presents a novel algorithm for performing integrated segmentation and 3D pose estimation of a human body from multiple views. Unlike other state-of-the-art methods which focus on either segmentation or pose estimation individually, our approach tackles these two tasks together. Our method works by optimizing a cost function based on a Conditional Random Field (CRF). This has the advantage that all information in the image (edges, background and foreground appearances), as well as the prior information on the shape and pose of the subject can be combined and used in a Bayesian framework. Optimizing such a cost function would have been computationally infeasible. However, our recent research in dynamic graph cuts allows this to be done much more efficiently than before. We demonstrate the efficacy of our approach on challenging motion sequences. Although we target the human pose inference problem in the paper, our method is completely generic and can be used to segment and infer the pose of any rigid, deformable or articulated object.