Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
Cited by 770 (3 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and bio-sequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
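As a point of reference for the models named in this abstract, the following is a minimal sketch of forward filtering in a discrete HMM, the simplest special case that DBNs generalize. It is not taken from the thesis: the function name `hmm_forward` and the parameter names `pi`, `A` and `B` are illustrative assumptions. The loop makes a single pass over the sequence, so for a fixed number of states the cost grows linearly with T.

```python
import math


def hmm_forward(pi, A, B, obs):
    """Forward filtering for a discrete HMM (illustrative sketch).

    pi[i]   : assumed initial probability of state i
    A[i][j] : assumed probability of moving from state i to state j
    B[i][k] : assumed probability of emitting symbol k in state i
    obs     : list of observed symbol indices

    Returns (log-likelihood of obs, filtered distribution over the last state).
    """
    n = len(pi)
    log_lik = 0.0
    belief = list(pi)  # predicted distribution over the state at time 0
    for t, y in enumerate(obs):
        if t > 0:
            # Prediction step: push the previous belief through the transitions.
            belief = [sum(belief[i] * A[i][j] for i in range(n)) for j in range(n)]
        # Update step: weight each state by how well it explains the observation.
        unnorm = [belief[j] * B[j][y] for j in range(n)]
        norm = sum(unnorm)
        log_lik += math.log(norm)  # the normalizers multiply to p(obs)
        belief = [u / norm for u in unnorm]
    return log_lik, belief
```

Normalizing at every step keeps the numbers in a safe range on long sequences; the product of the normalizers is the likelihood of the whole observation sequence.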
A Survey of Computer Vision-Based Human Motion Capture
- Computer Vision and Image Understanding
, 2001
Cited by 515 (14 self)
A comprehensive survey of computer vision-based human motion capture literature from the past two decades is presented. The focus is on a general overview based on a taxonomy of system functionalities, broken down into four processes: initialization, tracking, pose estimation, and recognition. Each process is discussed and divided into subprocesses and/or categories of methods to provide a reference to describe and compare the more than 130 publications covered by the survey. References are included throughout the paper to exemplify important issues and their relations to the various methods. A number of general assumptions used in this research field are identified and the character of these assumptions indicates that the research field is still in an early stage of development. To evaluate the state of the art, the major application areas are identified and performances are analyzed in light of the methods
Stochastic Tracking of 3D Human Figures Using 2D Image Motion
- In European Conference on Computer Vision
, 2000
Cited by 383 (33 self)
A probabilistic method for tracking 3D articulated human figures in monocular image sequences is presented. Within a Bayesian framework, we define a generative model of image appearance, a robust likelihood function based on image graylevel differences, and a prior probability distribution over pose and joint angles that models how humans move. The posterior probability distribution over model parameters is represented using a discrete set of samples and is propagated over time using particle filtering. The approach extends previous work on parameterized optical flow estimation to exploit a complex 3D articulated motion model. It also extends previous work on human motion tracking by including a perspective camera model, by modeling limb self occlusion, and by recovering 3D motion from a monocular sequence. The explicit posterior probability distribution represents ambiguities due to image matching, model singularities, and perspective projection. The method relies only on a...
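The sample-based propagation mentioned in this abstract can be illustrated with a minimal bootstrap particle filter. This is a toy sketch under loudly stated assumptions, not the paper's tracker: the state is a single scalar following a random walk, the likelihood is Gaussian rather than image-based, and the names `particle_filter`, `q_std` and `r_std` are hypothetical.

```python
import math
import random


def particle_filter(observations, n_particles=500, q_std=0.5, r_std=0.2, seed=0):
    """Bootstrap particle filter for a 1D random walk (illustrative sketch).

    Assumed model: x_t = x_{t-1} + N(0, q_std^2), y_t = x_t + N(0, r_std^2).
    Returns the posterior-mean estimate of x_t after each observation.
    """
    rng = random.Random(seed)
    # Initial samples from an assumed standard-normal prior over the state.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Predict: move every sample through the (assumed) dynamics.
        particles = [x + rng.gauss(0.0, q_std) for x in particles]
        # Weight: score every sample by the Gaussian observation likelihood.
        weights = [math.exp(-0.5 * ((y - x) / r_std) ** 2) for x in particles]
        total = sum(weights)
        estimates.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Resample: draw a new equally weighted sample set from the weighted one.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Because the posterior is carried as a set of samples rather than a single Gaussian, the same loop can in principle represent multi-modal ambiguity, which is the property the abstract relies on.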
Recent Developments in Human Motion Analysis
Cited by 264 (3 self)
Visual analysis of human motion is currently one of the most active research topics in computer vision. This strong interest is driven by a wide spectrum of promising applications in many areas such as virtual reality, smart surveillance, perceptual interface, etc. Human motion analysis concerns the detection, tracking and recognition of people, and more generally, the understanding of human behaviors, from image sequences involving humans. This paper provides a comprehensive survey of research on computer vision based human motion analysis. The emphasis is on three major issues involved in a general human motion analysis system, namely human detection, tracking and activity understanding. Various methods for each issue are discussed in order to examine the state of the art. Finally, some research challenges and future directions are discussed.
Motion Texture: A Two-Level Statistical Model for Character Motion Synthesis
- ACM Transactions on Graphics
, 2002
Cited by 211 (2 self)
In this paper, we describe a novel technique, called motion texture, for synthesizing complex human-figure motion (e.g., dancing) that is statistically similar to the original motion captured data. We define motion texture as a set of motion textons and their distribution, which characterize the stochastic and dynamic nature of the captured motion. Specifically, a motion texton is modeled by a linear dynamic system (LDS) while the texton distribution is represented by a transition matrix indicating how likely each texton is switched to another. We have designed a maximum likelihood algorithm to learn the motion textons and their relationship from the captured dance motion. The learnt motion texture can then be used to generate new animations automatically and/or edit animation sequences interactively. Most interestingly, motion texture can be manipulated at different levels, either by changing the fine details of a specific motion at the texton level or by designing a new choreography at the distribution level. Our approach is demonstrated by many synthesized sequences of visually compelling dance motion.
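The two-level structure described in this abstract (a transition matrix over textons, with each texton driving its own linear dynamics) can be sketched as a small sampler. This is a simplified illustration and not the paper's algorithm: each texton is assumed to be a scalar LDS given as a hypothetical `(a, noise_std)` pair, every segment has a fixed length, and the function name `sample_two_level` is made up for this example.

```python
import random


def sample_two_level(trans, textons, n_segments, seg_len, x0=1.0, seed=0):
    """Sample from a toy two-level motion model (illustrative sketch).

    trans[i][j] : assumed probability of switching from texton i to texton j
    textons[i]  : assumed (a, noise_std) pair for the scalar LDS
                  x_t = a * x_{t-1} + N(0, noise_std^2)
    Returns (labels, states), one entry per generated frame.
    """
    rng = random.Random(seed)
    label, x = 0, x0
    labels, states = [], []
    for _ in range(n_segments):
        a, noise_std = textons[label]
        # Lower level: run the active texton's linear dynamics for one segment.
        for _ in range(seg_len):
            x = a * x + rng.gauss(0.0, noise_std)
            labels.append(label)
            states.append(x)
        # Upper level: choose the next texton from the transition matrix row.
        label = rng.choices(range(len(trans)), weights=trans[label], k=1)[0]
    return labels, states
```

Editing a texton's `(a, noise_std)` pair changes the fine detail within a segment, while editing `trans` changes the ordering of segments, which mirrors the two levels of manipulation the abstract mentions.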
Implicit Probabilistic Models of Human Motion for Synthesis and Tracking by Hedvig Sidenbladh
- In European Conference on Computer Vision
, 2002
Cited by 201 (4 self)
This paper addresses the problem of probabilistically modeling 3D human motion for synthesis and tracking. Given the high dimensional nature of human motion, learning an explicit probabilistic model from available training data is currently impractical. Instead we exploit methods from texture synthesis that treat images as representing an implicit empirical distribution. These methods replace the problem of representing the probability of a texture pattern with that of searching the training data for similar instances of that pattern. We extend this idea to temporal data representing 3D human motion with a large database of example motions. To make the method useful in practice, we must address the problem of efficient search in a large training set
Video-based face recognition using probabilistic appearance manifolds
- In Proc. IEEE Conference on Computer Vision and Pattern Recognition
, 2003
Cited by 176 (5 self)
This paper presents a novel method to model and recognize human faces in video sequences. Each registered person is represented by a low-dimensional appearance manifold in the ambient image space. The complex nonlinear appearance manifold is expressed as a collection of subsets (named pose manifolds), and the connectivity among them. Each pose manifold is approximated by an affine plane. To construct this representation, exemplars are sampled from videos, and these exemplars are clustered with a K-means algorithm; each cluster is represented as a plane computed through principal component analysis (PCA). The connectivity between the pose manifolds encodes the transition probability between images in each of the pose manifolds and is learned from training video sequences. A maximum a posteriori formulation is presented for face recognition in test video sequences by integrating the likelihood that the input image comes from a particular pose manifold and the transition probability to this pose manifold from the previous frame. To recognize faces with partial occlusion, we introduce a weight mask into the process. Extensive experiments demonstrate that the proposed algorithm outperforms existing frame-based face recognition methods with temporal voting schemes.
Learning Switching Linear Models of Human Motion
, 2000
Cited by 136 (2 self)
The human figure exhibits complex and rich dynamic behavior that is both nonlinear and time-varying. Effective models of human dynamics can be learned from motion capture data using switching linear dynamic system (SLDS) models. We present results for human motion synthesis, classification, and visual tracking using learned SLDS models. Since exact inference in SLDS is intractable, we present three approximate inference algorithms and compare their performance. In particular, a new variational inference algorithm is obtained by casting the SLDS model as a Dynamic Bayesian Network. Classification experiments show the superiority of SLDS over conventional HMMs for our problem domain.
1 Introduction: The human figure exhibits complex and rich dynamic behavior. Dynamics are essential to the classification of human motion (e.g. gesture recognition) as well as to the synthesis of realistic figure motion for computer graphics. In visual tracking applications, dynamics can provide a p...
ZISSERMAN A.: Tracking people by learning their appearance.
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2007
Shape-from-Silhouette Across Time - Part I: Theory and Algorithms
- International Journal of Computer Vision
, 2005
Cited by 107 (3 self)
Shape-From-Silhouette (SFS) is a shape reconstruction method which constructs a 3D shape estimate of an object using silhouette images of the object. The output of an SFS algorithm is known as the Visual Hull (VH). Traditionally SFS is either performed on static objects, or separately at each time instant in the case of videos of moving objects. In this paper we develop a theory of performing SFS across time: estimating the shape of a dynamic object (with unknown motion) by combining all of the silhouette images of the object over time. We first introduce a one dimensional element called a Bounding Edge to represent the Visual Hull. We then show that aligning two Visual Hulls using just their silhouettes is in general ambiguous and derive the geometric constraints (in terms of Bounding Edges) that govern the alignment. To break the alignment ambiguity, we combine stereo information with silhouette information and derive a Temporal SFS algorithm which consists of two steps: (1) estimate the motion of the objects over time (Visual Hull Alignment) and (2) combine the silhouette information using the estimated motion (Visual Hull Refinement). The algorithm is first developed for rigid objects and then extended to articulated objects. In Part II of this paper we apply our temporal SFS algorithm to two human-related applications: (1) the acquisition of detailed human kinematic models and (2) marker-less motion tracking.
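For readers unfamiliar with the static case that this paper builds on, a Visual Hull can be sketched as simple voxel carving. The example below is an assumption-laden toy and not the paper's Bounding Edge representation or its temporal algorithm: it assumes three orthographic, axis-aligned views on an n x n x n grid, and the names `visual_hull`, `sil_xy`, `sil_xz` and `sil_yz` are hypothetical.

```python
def visual_hull(sil_xy, sil_xz, sil_yz):
    """Voxel-carving visual hull from three orthographic views (illustrative sketch).

    Each argument is an n x n grid of booleans, True where the object's
    silhouette covers that pixel:
      sil_xy[x][y] : assumed view looking along the z axis
      sil_xz[x][z] : assumed view looking along the y axis
      sil_yz[y][z] : assumed view looking along the x axis
    Returns the set of (x, y, z) voxels consistent with all silhouettes.
    """
    n = len(sil_xy)
    hull = set()
    for x in range(n):
        for y in range(n):
            for z in range(n):
                # Keep a voxel only if it projects inside every silhouette.
                if sil_xy[x][y] and sil_xz[x][z] and sil_yz[y][z]:
                    hull.add((x, y, z))
    return hull
```

Each additional silhouette can only carve voxels away, which is why combining silhouettes from more views, or from more time instants once the motion is known, tightens the shape estimate.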