Results 1 - 10
of
66
Monocular Pedestrian Detection: Survey and Experiments
, 2008
"... Pedestrian detection is a rapidly evolving area in computer vision with key applications in intelligent vehicles, surveillance and advanced robotics. The objective of this paper is to provide an overview of the current state of the art from both methodological and experimental perspective. The first ..."
Abstract
-
Cited by 153 (13 self)
- Add to MetaCart
Pedestrian detection is a rapidly evolving area in computer vision with key applications in intelligent vehicles, surveillance and advanced robotics. The objective of this paper is to provide an overview of the current state of the art from both methodological and experimental perspective. The first part of the paper consists of a survey. We cover the main components of a pedestrian detection system and the underlying models. The second (and larger) part of the paper contains a corresponding experimental study. We consider a diverse set of state-of-the-art systems: wavelet-based AdaBoost cascade [74], HOG/linSVM [11], NN/LRF [75] and combined shape-texture detection [23]. Experiments are performed on an extensive dataset captured on-board a vehicle driving through urban environment. The dataset includes many thousands of training samples as well as a 27 minute test sequence involving more than 20000 images with annotated pedestrian locations. We consider a generic evaluation setting and one specific to pedestrian detection on-board a vehicle. Results indicate a clear advantage of HOG/linSVM at higher image resolutions and lower processing speeds, and a superiority of the wavelet-based AdaBoost cascade approach at lower image resolutions and (near) real-time processing speeds. The dataset (8.5GB) is made public for benchmarking purposes.
Contour-based learning for object detection
- In Proceedings, International Conference on Computer Vision
, 2005
"... We present a novel categorical object detection scheme that uses only local contour-based features. A two-stage, partially supervised learning architecture is proposed: a rudimentary detector is learned from a very small set of segmented images and applied to a larger training set of unsegmented ima ..."
Abstract
-
Cited by 152 (1 self)
- Add to MetaCart
We present a novel categorical object detection scheme that uses only local contour-based features. A two-stage, partially supervised learning architecture is proposed: a rudimentary detector is learned from a very small set of segmented images and applied to a larger training set of unsegmented images; the second stage bootstraps these detections to learn an improved classifier while explicitly training against clutter. The detectors are learned with a boosting algorithm which creates a location-sensitive classifier using a discriminative set of features from a randomly chosen dictionary of contour fragments. We present results that are very competitive with other state-of-the-art object detection schemes and show robustness to object articulations, clutter, and occlusion. Our major contributions are the application of boosted local contour-based features for object detection in a partially supervised learning framework, and an efficient new boosting procedure for simultaneously selecting features and estimating per-feature parameters. 1.
Finding and tracking people from the bottom up
- In CVPR
, 2003
"... Abstract We ..."
(Show Context)
ZISSERMAN A.: Tracking people by learning their appearance.
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2007
"... ..."
Model-Based Hand Tracking Using A Hierarchical Bayesian Filter
, 2004
"... This thesis focuses on the automatic recovery of three-dimensional hand motion from one or more views. A 3D geometric hand model is constructed from truncated cones, cylinders and ellipsoids and is used to generate contours, which can be compared with edge contours and skin colour in images. The han ..."
Abstract
-
Cited by 104 (3 self)
- Add to MetaCart
(Show Context)
This thesis focuses on the automatic recovery of three-dimensional hand motion from one or more views. A 3D geometric hand model is constructed from truncated cones, cylinders and ellipsoids and is used to generate contours, which can be compared with edge contours and skin colour in images. The hand tracking problem is formulated as state estimation, where the model parameters define the internal state, which is to be estimated from image observations. In thew first
A bayesian, exemplar-based approach to hierarchical shape matching
- IEEE Trans. Pattern Anal. Mach. Intell
"... Abstract—This paper presents a novel probabilistic approach to hierarchical, exemplar-based shape matching. No feature correspondence is needed among exemplars, just a suitable pairwise similarity measure. The approach uses a template tree to efficiently represent and match the variety of shape exem ..."
Abstract
-
Cited by 74 (8 self)
- Add to MetaCart
(Show Context)
Abstract—This paper presents a novel probabilistic approach to hierarchical, exemplar-based shape matching. No feature correspondence is needed among exemplars, just a suitable pairwise similarity measure. The approach uses a template tree to efficiently represent and match the variety of shape exemplars. The tree is generated offline by a bottom-up clustering approach using stochastic optimization. Online matching involves a simultaneous coarse-to-fine approach over the template tree and over the transformation parameters. The main contribution of this paper is a Bayesian model to estimate the a posteriori probability of the object class, after a certain match at a node of the tree. This model takes into account object scale and saliency and allows for a principled setting of the matching thresholds such that unpromising paths in the tree traversal process are eliminated early on. The proposed approach was tested in a variety of application domains. Here, results are presented on one of the more challenging domains: real-time pedestrian detection from a moving vehicle. A significant speed-up is obtained when comparing the proposed probabilistic matching approach with a manually tuned nonprobabilistic variant, both utilizing the same template tree structure. Index Terms—Hierarchical shape matching, chamfer distance, Bayesian models. 1
A Bayesian Framework for Multi-cue 3D Object Tracking
- In Proceedings of European Conference on Computer Vision
, 2004
"... This paper presents a Bayesian framework for multi-cue 3D object tracking of deformable objects. The proposed spatio-temporal object representation involves a set of distinct linear subspace models or Dynamic Point Distribution Models (DPDMs), which can deal with both continuous and discontinuou ..."
Abstract
-
Cited by 54 (1 self)
- Add to MetaCart
(Show Context)
This paper presents a Bayesian framework for multi-cue 3D object tracking of deformable objects. The proposed spatio-temporal object representation involves a set of distinct linear subspace models or Dynamic Point Distribution Models (DPDMs), which can deal with both continuous and discontinuous appearance changes; the representation is learned fully automatically from training data. The representation is enriched with texture information by means of intensity histograms, which are compared using the Bhattacharyya coe#cient. Direct 3D measurement is furthermore provided by a stereo system.
Monocular Human Motion Capture with a Mixture of Regressors
- IEEE Workshop on Vision for Human-Computer Interaction
, 2005
"... We address 3D human motion capture from monocular images, taking a learning based approach to construct a probabilistic pose estimation model from a set of labelled human silhouettes. To compensate for ambiguities in the pose reconstruction problem, our model explicitly calculates several possible p ..."
Abstract
-
Cited by 54 (3 self)
- Add to MetaCart
(Show Context)
We address 3D human motion capture from monocular images, taking a learning based approach to construct a probabilistic pose estimation model from a set of labelled human silhouettes. To compensate for ambiguities in the pose reconstruction problem, our model explicitly calculates several possible pose hypotheses. It uses locality on a manifold in the input space and connectivity in the output space to identify regions of multi-valuedness in the mapping from silhouette to 3D pose. This information is used to fit a mixture of regressors on the input manifold, giving us a global model capable of predicting the possible poses with corresponding probabilities. These are then used in a dynamicalmodel based tracker that automatically detects tracking failures and re-initializes in a probabilistically correct manner. The system is trained on conventional motion capture data, using both the corresponding real human silhouettes and silhouettes synthesized artificially from several different models for improved robustness to inter-person variations. Static pose estimation is illustrated on a variety of silhouettes. The robustness of the method is demonstrated by tracking on a real image sequence requiring multiple automatic re-initializations. 1.
Non-metric affinity propagation for unsupervised image categorization
"... Unsupervised categorization of images or image parts is often needed for image and video summarization or as a preprocessing step in supervised methods for classification, tracking and segmentation. While many metric-based techniques have been applied to this problem in the vision community, often, ..."
Abstract
-
Cited by 52 (2 self)
- Add to MetaCart
(Show Context)
Unsupervised categorization of images or image parts is often needed for image and video summarization or as a preprocessing step in supervised methods for classification, tracking and segmentation. While many metric-based techniques have been applied to this problem in the vision community, often, the most natural measures of similarity (e.g., number of matching SIFT features) between pairs of images or image parts is non-metric. Unsupervised categorization by identifying a subset of representative exemplars can be efficiently performed with the recently-proposed ‘affinity propagation ’ algorithm. In contrast to k-centers clustering, which iteratively refines an initial randomly-chosen set of exemplars, affinity propagation simultaneously considers all data points as potential exemplars and iteratively exchanges messages between data points until a good solution emerges. When applied to the Olivetti face data set using a translation-invariant non-metric similarity, affinity propagation achieves a much lower reconstruction error and nearly halves the classification error rate, compared to state-of-the-art techniques. For the more challenging problem of unsupervised categorization of images from the Caltech101 data set, we derived non-metric similarities between pairs of images by matching SIFT features. Affinity propagation successfully identifies meaningful categories, which provide a natural summarization of the training images and can be used to classify new input images. 1.
Features for recognition: viewpoint invariance for non-planar scenes
- In Technical Report UCLA-CSD 040049
, 2004
"... We present a technique for local image representation that is invariant to viewpoint for scenes with arbitrary non-planar shape. We show that generic viewpoint invariance can be achieved, under suitable conditions, although the resulting invariant is not shape-discriminative. Our results serve both ..."
Abstract
-
Cited by 31 (13 self)
- Add to MetaCart
(Show Context)
We present a technique for local image representation that is invariant to viewpoint for scenes with arbitrary non-planar shape. We show that generic viewpoint invariance can be achieved, under suitable conditions, although the resulting invariant is not shape-discriminative. Our results serve both to validate existing approaches to local feature detection and description, as well as to complement them where they are not applicable. We illustrate our approach on images of 3-D corners where existing approaches fail. Contents 1