Object Tracking: A Survey
, 2006
Cited by 701 (7 self)
The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns of both the object and the scene, nonrigid object structures, object-to-object and object-to-scene occlusions, and camera motion. Tracking is usually performed in the context of higher-level applications that require the location and/or shape of the object in every frame. Typically, assumptions are made to constrain the tracking problem in the context of a particular application. In this survey, we categorize the tracking methods on the basis of the object and motion representations used, provide detailed descriptions of representative methods in each category, and examine their pros and cons. Moreover, we discuss the important issues related to tracking including the use of appropriate image features, selection of motion models, and detection of objects.
Real-Time Combined 2D+3D Active Appearance Models
- In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
, 2004
Cited by 159 (19 self)
Active Appearance Models (AAMs) are generative models commonly used to model faces. Another closely related type of face model is the 3D Morphable Model (3DMM). Although AAMs are 2D, they can still be used to model 3D phenomena such as faces moving across pose. We first study the representational power of AAMs and show that they can model anything a 3DMM can, but possibly require more shape parameters. We quantify the number of additional parameters required and show that 2D AAMs can generate model instances that are not possible with the equivalent 3DMM. We proceed to describe how a non-rigid structure-from-motion algorithm can be used to construct the corresponding 3D shape modes of a 2D AAM. We then show how the 3D modes can be used to constrain the AAM so that it can only generate model instances that can also be generated with the 3D modes. Finally, we propose a real-time algorithm for fitting the AAM while enforcing the constraints, creating what we call a "Combined 2D+3D AAM."
Tracking and Modeling Non-Rigid Objects with Rank Constraints
, 2001
Cited by 159 (7 self)
This paper presents a novel solution for flow-based tracking and 3D reconstruction of deforming objects in monocular image sequences. A non-rigid 3D object undergoing rotation and deformation can be effectively approximated using a linear combination of 3D basis shapes. This puts a bound on the rank of the tracking matrix. The rank constraint is used to achieve robust and precise low-level optical flow estimation without prior knowledge of the 3D shape of the object. The bound on the rank is also exploited to handle occlusion at the tracking level, making it possible to recover the complete trajectories of occluded/disoccluded points. Following the same low-rank principle, the resulting flow matrix can be factored to get the 3D pose, configuration coefficients, and 3D basis shapes. The flow matrix is factored in an iterative manner, looping between solving for pose, configuration, and basis shapes. The flow-based tracking is applied to several video sequences and provides the input to the 3D non-rigid reconstruction task. Additional results on synthetic data and comparisons to ground truth complete the experiments.
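The rank bound mentioned in this abstract can be illustrated with a small synthetic experiment. The sketch below is not the paper's tracker. It assumes an orthographic camera, centered points, and K = 3 random basis shapes (all function and variable names are hypothetical), and it only checks that the stacked 2F x P tracking matrix has rank at most 3K.

```python
import numpy as np

def nonrigid_tracking_matrix(num_frames, num_points, num_basis, seed=0):
    """Build a synthetic 2F x P tracking matrix from K random 3D basis shapes.

    Assumptions (not from the paper): orthographic projection, no translation,
    random orthonormal camera matrices, Gaussian shape coefficients.
    """
    rng = np.random.default_rng(seed)
    basis = rng.standard_normal((num_basis, 3, num_points))  # K basis shapes, each 3 x P
    rows = []
    for _ in range(num_frames):
        # Random orthonormal 3x3 matrix; its first two rows act as the camera.
        q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
        coeffs = rng.standard_normal(num_basis)
        shape = np.tensordot(coeffs, basis, axes=1)  # weighted sum of bases, 3 x P
        rows.append(q[:2] @ shape)                   # 2 x P image coordinates
    return np.vstack(rows)

num_basis = 3
W = nonrigid_tracking_matrix(num_frames=40, num_points=60, num_basis=num_basis)
rank = np.linalg.matrix_rank(W)
print(W.shape, rank, "bound:", 3 * num_basis)
```

Because every frame is a 2 x 3 camera applied to a combination of the same K shapes, the matrix factors into a 2F x 3K motion part and a 3K x P shape part, which is where the bound comes from.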
Face Transfer with Multilinear Models
- TO APPEAR IN SIGGRAPH 2005
, 2005
Cited by 145 (3 self)
Face Transfer is a method for mapping video-recorded performances of one individual to facial animations of another. It extracts visemes (speech-related mouth articulations), expressions, and three-dimensional (3D) pose from monocular video or film footage. These parameters are then used to generate and drive a detailed 3D textured face mesh for a target identity, which can be seamlessly rendered back into target footage. The underlying face model automatically adjusts for how the target performs facial expressions and visemes. The performance data can be easily edited to change the visemes, expressions, pose, or even the identity of the target; the attributes are separably controllable. This supports
A general framework for motion segmentation: Independent, articulated, rigid, non-rigid, degenerate and nondegenerate
- In ECCV
, 2006
Cited by 139 (0 self)
We cast the problem of motion segmentation of feature trajectories as a linear manifold finding problem and propose a general framework for motion segmentation under affine projections which utilizes two properties of trajectory data: geometric constraint and locality. The geometric constraint states that the trajectories of the same motion lie in a low-dimensional linear manifold and different motions result in different linear manifolds; locality, by which we mean that in a transformed space a data point and its neighbors tend to lie in the same linear manifold, provides a cue for efficient estimation of these manifolds. Our algorithm estimates a number of linear manifolds, whose dimensions are unknown beforehand, and segments the trajectories accordingly. First, it transforms and normalizes the trajectories; second, for each trajectory it estimates a local linear manifold through local sampling; third, it derives the affinity matrix based on principal subspace angles between these estimated linear manifolds; finally, spectral clustering is applied to the matrix and gives the segmentation result. Our algorithm is general, with no restriction on the number of linear manifolds and no prior knowledge of their dimensions. We demonstrate in our experiments that it can segment a wide range of motions including independent, articulated, rigid, non-rigid, degenerate, non-degenerate, or any combination of them. In some highly challenging cases where other state-of-the-art motion segmentation algorithms may fail, our algorithm gives the expected results.
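The local-manifold and affinity steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes neighbors are chosen by Euclidean distance, that the subspace dimension is given rather than estimated, and that the affinity is exp(-sum of squared sines of the principal angles), which is one common choice. The function names are hypothetical, and the final spectral clustering step is left out.

```python
import numpy as np

def local_subspace(points, index, num_neighbors, dim):
    """Orthonormal basis (columns) of a dim-dimensional linear subspace fitted
    to one point and its nearest neighbors. points has shape (N, D)."""
    dists = np.linalg.norm(points - points[index], axis=1)
    nearest = np.argsort(dists)[:num_neighbors + 1]  # the point itself plus neighbors
    u, _, _ = np.linalg.svd(points[nearest].T, full_matrices=False)
    return u[:, :dim]

def subspace_affinity(basis_a, basis_b):
    """Affinity from principal angles: the singular values of A^T B are the
    cosines of the principal angles between the two subspaces."""
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)
    return float(np.exp(-np.sum(1.0 - cosines ** 2)))

# Toy data: unit vectors on two orthogonal planes of R^4 (a stand-in for
# normalized trajectories from two different motions).
angles = np.linspace(0.0, np.pi, 10, endpoint=False)
circle = np.stack([np.cos(angles), np.sin(angles)], axis=1)
zeros = np.zeros_like(circle)
points = np.vstack([np.hstack([circle, zeros]), np.hstack([zeros, circle])])

bases = [local_subspace(points, i, num_neighbors=2, dim=2) for i in range(len(points))]
same = subspace_affinity(bases[0], bases[1])        # both points on the first plane
different = subspace_affinity(bases[0], bases[10])  # one point on each plane
print(same, different)
```

Points from the same plane get an affinity close to 1, while points from orthogonal planes get exp(-2); a full affinity matrix built this way would be the input to spectral clustering.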
Multi-View Scene Capture by Surfel Sampling: From Video Streams to Non-Rigid 3D Motion, Shape & Reflectance
, 2001
Cited by 117 (0 self)
In this paper we study the problem of recovering the 3D shape, reflectance, and non-rigid motion of a dynamic 3D scene. Because these properties are completely unknown, our approach uses multiple views to build a piecewise-continuous geometric and radiometric representation of the scene's trace in space-time. The basic primitive of this representation is the dynamic surfel, which (1) encodes the instantaneous local shape, reflectance, and motion of a small region in the scene, and (2) enables accurate prediction of the region's dynamic appearance under known illumination conditions. We show that complete surfel-based reconstructions can be created by repeatedly applying an algorithm called Surfel Sampling that combines sampling and parameter estimation to fit a single surfel to a small, bounded region of space-time. Experimental results with the Phong reflectance model and complex real scenes (clothing, skin, shiny objects) illustrate our method's ability to explain pixels and pixel variations in terms of their physical causes: shape, reflectance, motion, illumination, and visibility.
A Closed-Form Solution to Non-Rigid Shape and Motion Recovery
- In European Conference on Computer Vision
, 2004
Cited by 113 (11 self)
Recovery of three-dimensional (3D) shape and motion of non-static scenes from a monocular video sequence is important for applications like robot navigation and human-computer interaction. If every point in the scene randomly moves, it is impossible to recover the non-rigid shapes. In practice, many non-rigid objects, e.g. the human face under various expressions, deform with certain structures. Their shapes can be regarded as a weighted combination of certain shape bases. Shape and motion recovery under such situations has attracted much interest. Previous work on this problem [6, 4, 13] utilized only orthonormality constraints on the camera rotations (rotation constraints). This paper proves that using only the rotation constraints results in ambiguous and invalid solutions. The ambiguity arises from the fact that the shape bases are not unique because their linear transformation is a new set of eligible bases. To eliminate the ambiguity, we propose a set of novel constraints, basis constraints, which uniquely determine the shape bases. We prove that, under the weak-perspective projection model, enforcing both the basis and the rotation constraints leads to a closed-form solution to the problem of non-rigid shape and motion recovery. The accuracy and robustness of our closed-form solution is evaluated quantitatively on synthetic data and qualitatively on real video sequences.
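The basis ambiguity this abstract refers to can be written out in a few lines. The notation below is an assumption for illustration (F frames, P points, K shape bases, translation already removed); it is not copied from the paper.

```latex
% Shape in frame t as a weighted combination of K bases B_i (each 3 x P),
% observed through a scaled 2 x 3 rotation R_t (weak perspective):
S_t = \sum_{i=1}^{K} c_{ti} B_i, \qquad W_t = R_t S_t .

% Stacking all frames gives a factorization of the 2F x P measurement matrix:
W = M B, \qquad M \in \mathbb{R}^{2F \times 3K}, \quad B \in \mathbb{R}^{3K \times P} .

% The factorization is not unique: any invertible G yields another one,
W = (M G)\,(G^{-1} B), \qquad G \in \mathbb{R}^{3K \times 3K},
% so G^{-1} B is also an eligible set of bases, and extra constraints are
% needed to pin down a single G.
```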
Damped Newton algorithms for matrix factorization with missing data
- in CVPR05
, 2005
Cited by 99 (0 self)
The problem of low-rank matrix factorization in the presence of missing data has seen significant attention in recent computer vision research. The approach that dominates the literature is EM-like alternation of closed-form solutions for the two factors of the matrix. An obvious alternative is nonlinear optimization of both factors simultaneously, a strategy which has seen little published research. This paper provides a comprehensive comparison of the two strategies by evaluating previously published factorization algorithms as well as some second-order methods not previously presented for this problem. We conclude that, although alternation approaches can be very quick, their propensity to glacial convergence in narrow valleys of the cost function means that average-case performance is worse than second-order strategies. Further, we demonstrate the importance of two main observations: one, that schemes based on closed-form solutions alone are not suitable and that nonlinear optimization strategies are faster, more accurate, and provide more flexible frameworks for continued progress; and two, that basic objective functions are not adequate and that regularization priors must be incorporated, a process that is easier with nonlinear methods.
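For readers unfamiliar with the alternation baseline this abstract compares against, a minimal sketch follows. It is a generic alternating least-squares loop over the observed entries, not any specific algorithm evaluated in the paper; the small ridge term, the iteration count, and all names are assumptions made for this example. The damped Newton methods the paper advocates are not shown.

```python
import numpy as np

def alternate_factorization(data, mask, rank, num_iters=200, ridge=1e-6, seed=0):
    """Fit data ~= left @ right.T using only entries where mask is True.

    Each pass solves a small regularized least-squares problem per row with
    `right` fixed, then per column with `left` fixed.
    """
    rng = np.random.default_rng(seed)
    num_rows, num_cols = data.shape
    left = rng.standard_normal((num_rows, rank))
    right = rng.standard_normal((num_cols, rank))
    reg = ridge * np.eye(rank)  # keeps every normal-equation system invertible
    for _ in range(num_iters):
        for i in range(num_rows):
            obs = mask[i]
            r = right[obs]
            left[i] = np.linalg.solve(r.T @ r + reg, r.T @ data[i, obs])
        for j in range(num_cols):
            obs = mask[:, j]
            l = left[obs]
            right[j] = np.linalg.solve(l.T @ l + reg, l.T @ data[obs, j])
    return left, right

# Synthetic rank-2 matrix with roughly 30% of the entries missing.
rng = np.random.default_rng(1)
truth = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
mask = rng.random((20, 15)) < 0.7
left, right = alternate_factorization(truth, mask, rank=2)

residual = (truth - left @ right.T)[mask]
rmse = float(np.sqrt(np.mean(residual ** 2)))
baseline = float(np.sqrt(np.mean(truth[mask] ** 2)))  # error of predicting all zeros
print(rmse, baseline)
```

Each inner solve is closed-form, which is why alternation is cheap per iteration; the abstract's point is that the number of iterations needed can become very large on harder problems.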
Non-Rigid Structure-From-Motion: Estimating Shape and Motion with Hierarchical Priors
, 2007
Cited by 91 (1 self)
This paper describes methods for recovering time-varying shape and motion of non-rigid 3D objects from uncalibrated 2D point tracks. For example, given a video recording of a talking person, we would like to estimate the 3D shape of the face at each instant, and learn a model of facial deformation. Time-varying shape is modeled as a rigid transformation combined with a non-rigid deformation. Reconstruction is ill-posed if arbitrary deformations are allowed, and thus additional assumptions about deformations are required. We first suggest restricting shapes to lie within a low-dimensional subspace, and describe estimation algorithms. However, this restriction alone is insufficient to constrain reconstruction. To address these problems, we propose a reconstruction method using a Probabilistic Principal Components Analysis (PPCA) shape model, and an estimation algorithm that simultaneously estimates 3D shape and motion for each instant, learns the PPCA model parameters, and robustly fills in missing data points. We then extend the model to capture temporal dynamics in object shape, allowing the algorithm to robustly handle severe cases of missing data.