A Boosted Particle Filter: Multitarget Detection and Tracking
- In ECCV, 2004
Cited by 308 (7 self)
The problem of tracking a varying number of non-rigid objects has two major difficulties. First, the observation models and target distributions can be highly non-linear and non-Gaussian. Second, the presence of a large, varying number of objects creates complex interactions with overlap and ambiguities. To surmount these difficulties, we introduce a vision system that is capable of learning, detecting and tracking the objects of interest. The system is demonstrated in the context of tracking hockey players using video sequences. Our approach combines the strengths of two successful algorithms: mixture particle filters and Adaboost. The mixture particle filter [17] is ideally suited to multi-target tracking as it assigns a mixture component to each player. The crucial design issues in mixture particle filters are the choice of the proposal distribution and the treatment of objects leaving and entering the scene.
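For readers unfamiliar with particle filters, the following is a minimal sketch of one predict, weight and resample cycle for a single 1D target. It is a plain bootstrap filter, not the boosted mixture filter of the paper: the random-walk motion model, the Gaussian likelihood, and the names `particle_filter_step`, `motion_std` and `obs_std` are all assumptions made for illustration.

```python
import math
import random


def particle_filter_step(particles, observation, motion_std, obs_std, rng):
    """One bootstrap particle filter cycle for a scalar state (illustrative only)."""
    # Predict: propagate each particle through an assumed random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: score each particle with an assumed Gaussian observation likelihood.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


rng = random.Random(0)
particles = [rng.uniform(-10.0, 10.0) for _ in range(500)]
for _ in range(10):
    # Hypothetical scenario: the target is repeatedly observed at position 2.0.
    particles = particle_filter_step(particles, 2.0, 0.5, 1.0, rng)
estimate = sum(particles) / len(particles)
```

In this sketch the proposal is simply the motion model. The abstract singles out the choice of proposal distribution as a crucial design issue, which is where a detector-driven proposal would replace the predict step.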
3D Articulated Models and Multi-View Tracking with Physical Forces
Cited by 194 (0 self)
In this article we focus on the study of the gestures of a person, but the same methodology could be applied to the study of robot motions or of other kinds of articulated objects. Some examples of applications are listed in Table 1.
Covariance scaled sampling for monocular 3D body tracking
- CVPR, 2001
Cited by 156 (3 self)
We present a method for recovering 3D human body motion from monocular video sequences using robust image matching, joint limits and non-self-intersection constraints, and a new sample-and-refine search strategy guided by rescaled cost-function covariances. Monocular 3D body tracking is challenging: for reliable tracking at least 30 joint parameters need to be estimated, subject to highly nonlinear physical constraints; the problem is chronically ill-conditioned as about 1/3 of the d.o.f. (the depth-related ones) are almost unobservable in any given monocular image; and matching an imperfect, highly flexible, self-occluding model to cluttered image features is intrinsically hard. To reduce correspondence ambiguities we use a carefully designed robust matching-cost metric that combines robust optical flow, edge energy, and motion boundaries. Even so, the ambiguity, nonlinearity and non-observability make the parameter-space cost surface multi-modal, unpredictable and ill-conditioned, so minimizing it is difficult. We discuss the limitations of CONDENSATION-like samplers, and introduce a novel hybrid search algorithm that combines inflated-covariance-scaled sampling and continuous optimization subject to physical constraints. Experiments on some challenging monocular sequences show that robust cost modelling, joint and self-intersection constraints, and informed sampling are all essential for reliable monocular 3D body tracking.
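The sampling idea in this abstract can be illustrated with a small sketch. It assumes the covariance at a cost minimum is taken as the inverse of the cost Hessian, that the Hessian is supplied through its eigen-decomposition, and that samples are Gaussian; the function name, the `scale` parameter and the toy numbers are hypothetical, and the refinement (continuous optimization) stage is omitted.

```python
import math
import random
import statistics


def covariance_scaled_sample(x_min, eigvals, eigvecs, scale, rng):
    """Draw one sample around x_min from a Gaussian with covariance scale^2 * H^-1.

    eigvals, eigvecs: eigenvalues and orthonormal eigenvectors of the cost
    Hessian H at x_min (assumed positive definite).
    """
    sample = list(x_min)
    for lam, vec in zip(eigvals, eigvecs):
        # A flat cost direction (small eigenvalue) receives a large step.
        step = scale * rng.gauss(0.0, 1.0) / math.sqrt(lam)
        for i, v in enumerate(vec):
            sample[i] += step * v
    return sample


rng = random.Random(0)
# Toy 2D cost: stiff along axis 0 (curvature 100), flat along axis 1 (curvature 1).
eigvals = [100.0, 1.0]
eigvecs = [(1.0, 0.0), (0.0, 1.0)]
samples = [
    covariance_scaled_sample([0.0, 0.0], eigvals, eigvecs, 3.0, rng)
    for _ in range(2000)
]
spread_stiff = statistics.pstdev([s[0] for s in samples])
spread_soft = statistics.pstdev([s[1] for s in samples])
```

The samples spread furthest along the poorly constrained direction, which is where neighbouring minima are most likely to lie. In a sample-and-refine scheme each sample would then be passed to a local optimizer.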
Monocular Pedestrian Detection: Survey and Experiments
- 2008
Cited by 153 (13 self)
Pedestrian detection is a rapidly evolving area in computer vision with key applications in intelligent vehicles, surveillance and advanced robotics. The objective of this paper is to provide an overview of the current state of the art from both methodological and experimental perspectives. The first part of the paper consists of a survey. We cover the main components of a pedestrian detection system and the underlying models. The second (and larger) part of the paper contains a corresponding experimental study. We consider a diverse set of state-of-the-art systems: wavelet-based AdaBoost cascade [74], HOG/linSVM [11], NN/LRF [75] and combined shape-texture detection [23]. Experiments are performed on an extensive dataset captured on-board a vehicle driving through an urban environment. The dataset includes many thousands of training samples as well as a 27-minute test sequence involving more than 20,000 images with annotated pedestrian locations. We consider a generic evaluation setting and one specific to pedestrian detection on-board a vehicle. Results indicate a clear advantage of HOG/linSVM at higher image resolutions and lower processing speeds, and a superiority of the wavelet-based AdaBoost cascade approach at lower image resolutions and (near) real-time processing speeds. The dataset (8.5GB) is made public for benchmarking purposes.
Strike a pose: Tracking people by finding stylized poses
- In CVPR, 2005
Cited by 152 (14 self)
We develop an algorithm for finding and kinematically tracking multiple people in long sequences. Our basic assumption is that people tend to take on certain canonical poses, even when performing unusual activities like throwing a baseball or figure skating. We build a person detector that quite accurately detects and localizes limbs of people in lateral walking poses. We use the estimated limbs from a detection to build a discriminative appearance model; we assume the features that discriminate a figure in one frame will discriminate the figure in other frames. We then use the models as limb detectors in a pictorial structure framework, detecting figures in unrestricted poses in both previous and successive frames. We have run our tracker on hundreds of thousands of frames, and present and apply a methodology for evaluating tracking on such a large scale. We test our tracker on real sequences including a feature-length film, an hour of footage from a public park, and various sports sequences. We find that we can quite accurately automatically find and track multiple people interacting with each other while performing fast and unusual motions.
Shape-From-Silhouette of Articulated Objects and its Use for Human Body Kinematics Estimation and Motion Capture
- 2003
Cited by 150 (3 self)
Shape-From-Silhouette (SFS), also known as Visual Hull (VH) construction, is a popular 3D reconstruction method which estimates the shape of an object from multiple silhouette images. The original SFS formulation assumes that all of the silhouette images are captured either at the same time or while the object is static. This assumption is violated when the object moves or changes shape. Hence the use of SFS with moving objects has been restricted to treating each time instant sequentially and independently. Recently we have successfully extended the traditional SFS formulation to refine the shape of a rigidly moving object over time. Here we further extend SFS to apply to dynamic articulated objects. Given silhouettes of a moving articulated object, the process of recovering the shape and motion requires two steps: (1) correctly segmenting (points on the boundary of) the silhouettes to each articulated part of the object, (2) estimating the motion of each individual part using the segmented silhouette. In this paper, we propose an iterative algorithm to solve this simultaneous assignment and alignment problem. Once we have estimated the shape and motion of each part of the object, the articulation points between each pair of rigid parts are obtained by solving a simple motion constraint between the connected parts. To validate our algorithm, we first apply it to segment the different body parts and estimate the joint positions of a person. The acquired kinematic (shape and joint) information is then used to track the motion of the person in new video sequences.
Finding and tracking people from the bottom up
- In CVPR, 2003
Kinematic Jump Processes For Monocular 3D Human Tracking
- In Int. Conf. Computer Vision & Pattern Recognition, 2003
Cited by 138 (17 self)
A major difficulty for 3D human body tracking from monocular image sequences is the near non-observability of kinematic degrees of freedom that generate motion in depth. For known link (body segment) lengths, the strict non-observabilities reduce to twofold 'forwards/backwards flipping' ambiguities for each link. These imply 2^(# links) formal inverse kinematics solutions for the full model, and hence linked groups of O(2^(# links)) local minima in the model-image matching cost function. Choosing the wrong minimum leads to rapid mistracking, so for reliable tracking, rapid methods of investigating alternative minima within a group are needed. Previous approaches to this have used generic search methods that do not exploit the specific problem structure. Here, we complement these by using simple kinematic reasoning to enumerate the tree of possible forwards/backwards flips, thus greatly speeding the search within each linked group of minima. Our methods can be used either deterministically, or within stochastic 'jump-diffusion' style search processes. We give experimental results on some challenging monocular human tracking sequences, showing how the new kinematic-flipping based sampling method improves and complements existing ones.
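The count of two solutions per link can be made concrete with a short sketch. It assumes a scaled-orthographic camera, so that a link of known 3D length L whose image projection has length l has a relative depth of plus or minus sqrt(L^2 - l^2). The function name and the toy link lengths are hypothetical, and this flat enumeration only illustrates the combinatorics; it is not the tree-based search described in the abstract.

```python
import math
from itertools import product


def flip_candidates(link_lengths, projected_lengths):
    """Yield every forwards/backwards depth assignment for a chain of links.

    Assumes scaled-orthographic projection: each link's relative depth is
    +/- sqrt(L^2 - l^2), where L is its 3D length and l its projected length.
    """
    magnitudes = [
        math.sqrt(max(L * L - l * l, 0.0))
        for L, l in zip(link_lengths, projected_lengths)
    ]
    # One sign choice per link gives 2 ** (number of links) combinations.
    for signs in product((+1, -1), repeat=len(magnitudes)):
        yield tuple(s * m for s, m in zip(signs, magnitudes))


# Three hypothetical links (3D lengths) and their measured image projections.
candidates = list(flip_candidates([5.0, 5.0, 13.0], [3.0, 4.0, 5.0]))
```

With three links the list holds 2^3 = 8 depth configurations, each of which projects to the same image measurements under the stated camera assumption.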
ZISSERMAN A.: Tracking people by learning their appearance.
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007