Results 1 - 10 of 47
Motion Layer Extraction in the Presence of Occlusion Using Graph Cuts
2005. Cited by 98 (9 self).
Abstract:
Extracting layers from video is very important for video representation, analysis, compression, and synthesis. Assuming that a scene can be approximately described by multiple planar regions, this paper describes a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions, detect the occlusion pixels over multiple consecutive frames, and segment the scene into several motion layers. First, after determining a number of seed regions using correspondences in two frames, we expand the seed regions and reject the outliers employing the graph cuts method integrated with level set representation. Next, these initial regions are merged into several initial layers according to the motion similarity. Third, an occlusion order constraint on multiple frames is explored, which enforces that the occlusion area increases with the temporal order in a short period and effectively maintains segmentation consistency over multiple consecutive frames. Then, the correct layer segmentation is obtained by using a graph cuts algorithm and the occlusions between the overlapping layers are explicitly determined. Several experimental results are demonstrated to show that our approach is effective and robust.
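The first step the abstract describes, recovering an affine transformation for a seed region from point correspondences between two frames, reduces to linear least squares. A minimal sketch of that sub-step (not the paper's implementation; the correspondences are made up):

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine transform A mapping src points to dst points.

    src, dst: (N, 2) arrays of corresponding points, N >= 3.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Design matrix [x, y, 1]; solve dst = X @ A.T in the least-squares sense.
    X = np.hstack([src, np.ones((len(src), 1))])
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return A.T

# Correspondences related by a pure translation of (+1, +2)
src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = [(1, 2), (2, 2), (1, 3), (2, 3)]
A = fit_affine(src, dst)
```

In the paper, transforms like this would be fit per seed region and regions merged when their motions agree; outlier rejection (here absent) is what the graph-cuts/level-set machinery handles.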
Layered Image Motion with Explicit Occlusions, Temporal Consistency, and Depth Ordering
Cited by 28 (6 self).
Abstract:
Layered models are a powerful way of describing natural scenes containing smooth surfaces that may overlap and occlude each other. For image motion estimation, such models have a long history but have not achieved the wide use or accuracy of non-layered methods. We present a new probabilistic model of optical flow in layers that addresses many of the shortcomings of previous approaches. In particular, we define a probabilistic graphical model that explicitly captures: 1) occlusions and disocclusions; 2) depth ordering of the layers; 3) temporal consistency of the layer segmentation. Additionally the optical flow in each layer is modeled by a combination of a parametric model and a smooth deviation based on an MRF with a robust spatial prior; the resulting model allows roughness in layers. Finally, a key contribution is the formulation of the layers using an image-dependent hidden field prior based on recent models for static scene segmentation. The method achieves state-of-the-art results on the Middlebury benchmark and produces meaningful scene segmentations as well as detected occlusion regions.
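The "MRF with a robust spatial prior" the abstract mentions is typically an energy over neighbouring flow differences under a robust penalty such as the Charbonnier function. A toy illustration of that idea (an assumed stand-in, not the paper's model):

```python
import numpy as np

def charbonnier(x, eps=1e-3):
    """Robust penalty sqrt(x^2 + eps^2): quadratic near zero, linear in the
    tails, so motion boundaries are penalised less than a quadratic would."""
    return np.sqrt(x * x + eps * eps)

def smoothness_energy(u, lam=1.0):
    """First-order MRF smoothness energy of one flow component u (H x W):
    sum of robust penalties over horizontal and vertical neighbour pairs."""
    dx = np.diff(u, axis=1)   # horizontal neighbour differences
    dy = np.diff(u, axis=0)   # vertical neighbour differences
    return lam * (charbonnier(dx).sum() + charbonnier(dy).sum())

flat = np.zeros((4, 4))                 # perfectly smooth flow
bumpy = np.zeros((4, 4)); bumpy[2, 2] = 5.0   # a single outlier pixel
```

A smooth field scores lower than one with a discontinuity; in the paper this prior applies only to the deviation from each layer's parametric motion.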
Tracking multiple objects through occlusions
In CVPR, 2005. Cited by 24 (0 self).
3D occlusion inference from silhouette cues
In CVPR, 2007. Cited by 23 (3 self).
Abstract:
We consider the problem of detecting and accounting for the presence of occluders in a 3D scene based on silhouette cues in video streams obtained from multiple, calibrated views. While well studied and robust in controlled environments, silhouette-based reconstruction of dynamic objects fails in general environments where uncontrolled occlusions are commonplace, due to inherent silhouette corruption by occluders. We show that occluders in the interaction space of dynamic objects can be detected and their 3D shape fully recovered as a byproduct of shape-from-silhouette analysis. We provide a Bayesian sensor fusion formulation to process all occlusion cues occurring in a multi-view sequence. Results show that the shape of static occluders can be robustly recovered from pure dynamic object motion, and that this information can be used for online self-correction and consolidation of dynamic object shape reconstruction.
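Bayesian fusion of per-frame occlusion cues, as the abstract describes, is commonly done as an independent log-odds update per voxel. A minimal sketch of that general pattern (the probabilities and update rule are illustrative assumptions, not the paper's formulation):

```python
import math

def logodds(p):
    return math.log(p / (1.0 - p))

def sigmoid(l):
    return 1.0 / (1.0 + math.exp(-l))

def update_occluder(prior_logodds, cue, p_cue=0.7, p_no_cue=0.4):
    """One Bayes update for a voxel's 'static occluder' belief. `cue` is True
    when a dynamic object known to pass behind the voxel left no silhouette
    pixel there, which is evidence that the voxel is occupied by an occluder."""
    return prior_logodds + (logodds(p_cue) if cue else logodds(p_no_cue))

l = 0.0  # uninformative prior: P(occupied) = 0.5
for cue in [True, True, False, True]:   # cues accumulated over frames
    l = update_occluder(l, cue)
p_occupied = sigmoid(l)
```

Mostly-positive evidence pushes the occupancy probability above 0.5; repeated negative evidence would drive it down, which is how motion of the dynamic objects gradually carves out the static occluder shape.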
Joint recognition of complex events and track matching
In Proc. Computer Vision and Pattern Recognition (CVPR), 2006. Cited by 15 (3 self).
Abstract:
We present a novel method for jointly performing recognition of complex events and linking fragmented tracks into coherent, long-duration tracks. Many event recognition methods require highly accurate tracking, and may fail when tracks corresponding to event actors are fragmented or partially missing. However, these conditions occur frequently from occlusions, traffic and tracking errors. Recently, methods have been proposed for linking track fragments from multiple objects under these difficult conditions. Here, we develop a method for solving these two problems jointly. A hypothesized event model, represented as a Dynamic Bayes Net, supplies data-driven constraints on the likelihood of proposed track fragment matches. These event-guided constraints are combined with appearance and kinematic constraints used in the previous track linking formulation. The result is the most likely track linking solution given the event model, and the highest event score given all of the track fragments. The event model with the highest score is determined to have occurred, if the score exceeds a threshold. Results demonstrated on a busy scene of airplane servicing activities, where many non-event movers and long fragmented tracks are present, show the promise of the approach to solving the joint problem.
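The appearance-plus-kinematic linking cost the abstract builds on can be sketched very simply: predict where a fragment's end should be one frame later and compare with candidate fragment starts. This is a hypothetical toy (scalar appearance, greedy assignment), not the paper's DBN-constrained formulation:

```python
import math

def link_cost(end, start, w_app=1.0, w_kin=1.0):
    """Cost of linking a fragment ending at `end` to one starting at `start`.
    Each endpoint is (x, y, vx, vy, appearance); the frame gap is assumed 1."""
    x, y, vx, vy, a1 = end
    sx, sy, _, _, a2 = start
    kin = math.hypot(x + vx - sx, y + vy - sy)  # predicted vs observed position
    app = abs(a1 - a2)                          # appearance dissimilarity
    return w_kin * kin + w_app * app

def greedy_link(ends, starts, max_cost=5.0):
    """Greedily match fragment ends to fragment starts by lowest cost."""
    pairs = sorted((link_cost(e, s), i, j)
                   for i, e in enumerate(ends)
                   for j, s in enumerate(starts))
    used_e, used_s, links = set(), set(), []
    for c, i, j in pairs:
        if c <= max_cost and i not in used_e and j not in used_s:
            used_e.add(i); used_s.add(j); links.append((i, j))
    return links

ends   = [(0, 0, 1, 0, 0.2), (10, 10, 0, 1, 0.8)]
starts = [(10, 11, 0, 1, 0.8), (1, 0, 1, 0, 0.2)]
links = greedy_link(ends, starts)
```

The paper's contribution is precisely that a hypothesized event model reweights these costs; here they are purely appearance/kinematic.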
Background updating for visual surveillance
In Proceedings of the International Symposium on Visual Computing, 2005. Cited by 11 (0 self).
Abstract:
Scene changes such as moved objects, parked vehicles, or opened/closed doors need to be carefully handled so that interesting foreground targets can be detected along with the short-term background layers created by those changes. A simple layered modeling technique is embedded into a codebook-based background subtraction algorithm to update a background model. In addition, important issues related to background updating for visual surveillance are discussed. Experimental results on surveillance examples, such as unloaded packages and unattended objects, are presented by showing those objects as short-term background layers.
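The promotion of a persistent scene change into a "short-term background layer" can be illustrated with a toy per-pixel model. Scalar intensities stand in for the paper's codebook codewords, and the promotion rule is an assumption for illustration:

```python
class PixelCodebook:
    """Toy per-pixel model: a persistent foreground value is promoted to a
    short-term background layer, mimicking e.g. a newly parked vehicle."""

    def __init__(self, tol=10, promote_after=3):
        self.background = []    # accepted background codewords
        self.candidate = None   # (value, consecutive_hits) for a possible layer
        self.tol = tol
        self.promote_after = promote_after

    def observe(self, value):
        if any(abs(value - w) <= self.tol for w in self.background):
            return "background"
        if self.candidate is not None and abs(value - self.candidate[0]) <= self.tol:
            v, n = self.candidate
            if n + 1 >= self.promote_after:
                self.background.append(v)   # promote to short-term layer
                self.candidate = None
                return "short-term background"
            self.candidate = (v, n + 1)
        else:
            self.candidate = (value, 1)
        return "foreground"

px = PixelCodebook()
labels = [px.observe(v) for v in (100, 101, 100, 100)]
```

A real system would also age and retire short-term layers; this sketch only shows the detection-then-promotion flow.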
TRACKING HUMANS USING PRIOR AND LEARNED REPRESENTATIONS OF SHAPE AND APPEARANCE
2003. Cited by 8 (0 self).
Abstract:
Tracking a moving person is challenging because a person’s appearance in images changes significantly due to articulation, viewpoint changes, and lighting variation across a scene. And different people appear differently due to numerous factors such as body shape, clothing, skin color, and hair. In this thesis, a multi-cue tracking technique is introduced that uses prior information about the 2-D image shape of people in general along with an appearance model that is learned on-line for a specific individual. Assuming a static camera, the background is modeled and updated on-line. Rather than performing thresholding and blob detection during tracking, a foreground probability map (FPM) is computed which indicates the likelihood that a pixel is not the projection of the background. Off-line, a shape model of walking people is estimated from the FPMs computed from training sequences. During tracking, this generic prior model of human shape is used for person detection and to initialize a tracking process. As this prior model is very generic, a model of an individual’s appearance is learned on-line during tracking. As the person is tracked through a sequence using both shape and appearance, the appearance model is refined and multi-cue tracking becomes more robust.
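A foreground probability map as described here replaces hard thresholding with a per-pixel likelihood of not being background. A minimal sketch under an assumed per-pixel Gaussian background model (the squashing function is illustrative, not the thesis's exact formulation):

```python
import numpy as np

def foreground_probability_map(frame, bg_mean, bg_std):
    """Per-pixel probability that a pixel is NOT the projection of the
    background, from a per-pixel Gaussian background model."""
    z = np.abs(frame - bg_mean) / np.maximum(bg_std, 1e-6)
    # Squash the normalised deviation into [0, 1): 0 at the mean,
    # approaching 1 for large deviations.
    return 1.0 - np.exp(-0.5 * z * z)

bg_mean = np.full((3, 3), 100.0)       # learned background intensities
bg_std = np.full((3, 3), 5.0)          # learned per-pixel variability
frame = bg_mean.copy()
frame[1, 1] = 160.0                    # a strong foreground deviation
fpm = foreground_probability_map(frame, bg_mean, bg_std)
```

Because the map is soft, it can be matched directly against the off-line shape model of walking people instead of committing early to a binary mask.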
Multiview Segmentation and Tracking of Dynamic Occluding Layers
Cited by 7 (0 self).
Abstract:
We present an algorithm for the layered segmentation of video data in multiple views. The approach is based on computing the parameters of a layered representation of the scene in which each layer is modelled by its motion, appearance and occupancy, where occupancy describes, probabilistically, the layer’s spatial extent and not simply its segmentation in a particular view. The problem is formulated as the MAP estimation of all layer parameters conditioned on those at the previous time step; i.e. a sequential estimation problem that is equivalent to tracking multiple objects in a given number of views. Expectation-Maximisation is used to estimate the posterior probabilities of layer occupancy and visibility (which are represented distinctly). Evidence from areas in each view which are described poorly under the model is used to propose new layers automatically. Since these potential new layers often occur at the fringes of images, the algorithm is able to segment and track these in a single view until such time as a suitable candidate match is discovered in the other views. The algorithm is shown to be very effective at segmenting and tracking non-rigid objects and can cope with extreme occlusion.
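The EM machinery the abstract relies on assigns each pixel soft responsibilities over layers in the E-step. A minimal one-dimensional sketch with Gaussian appearance models per layer (a generic illustration, not the paper's occupancy/visibility model):

```python
import math

def e_step(pixels, means, stds, priors):
    """E-step: posterior responsibility of each layer for each pixel under a
    Gaussian appearance model per layer (normalisation constants cancel)."""
    resp = []
    for x in pixels:
        lik = [p * math.exp(-0.5 * ((x - m) / s) ** 2) / s
               for m, s, p in zip(means, stds, priors)]
        z = sum(lik)
        resp.append([l / z for l in lik])
    return resp

# Two layers with well-separated appearance; pixels near 10 vs near 50.
pixels = [10.0, 11.0, 50.0, 52.0]
resp = e_step(pixels, means=[10.0, 50.0], stds=[2.0, 2.0], priors=[0.5, 0.5])
```

The M-step (omitted) would re-estimate each layer's motion, appearance and occupancy from these responsibilities, and pixels poorly explained by all layers would seed new-layer proposals.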
Figure–Ground Segmentation from Occlusion
IEEE Transactions on Image Processing, 2004. Cited by 7 (2 self).
Abstract:
Layered video representations are increasingly popular, see [2] for a recent review. Segmentation of moving objects is a key step for automating such representations. Current motion segmentation methods either fail to segment moving objects in low textured regions or are computationally very expensive. This paper presents a computationally simple algorithm that segments moving objects even in low texture/low contrast scenes. Our method infers the moving object templates directly from the image intensity values, rather than computing the motion field as an intermediate step. Our model takes into account the rigidity of the moving object and the occlusion of the background by the moving object. We formulate the segmentation problem as the minimization of a penalized likelihood cost-function and present an algorithm to estimate all the unknown parameters: the motions, the template of the moving object, and the intensity levels of the object and of the background pixels. The cost function combines a maximum likelihood estimation term with a term that penalizes large templates. The minimization algorithm performs two alternate steps for which we derive closed-form solutions. Relaxation improves the convergence even when low texture makes it very challenging to segment the moving object from the background. Experiments demonstrate the good performance of our method.
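The penalised-likelihood cost the abstract describes balances a data term against a penalty on template size. A 1-D toy version of that trade-off (the cost weights and candidate templates are made-up illustrations):

```python
def template_cost(mask, frame, obj_val, bg_val, lam):
    """Penalised-likelihood cost: squared intensity residual under each
    pixel's label, plus a penalty lam per template pixel, which
    discourages overly large templates."""
    data = sum((v - (obj_val if m else bg_val)) ** 2
               for m, v in zip(mask, frame))
    return data + lam * sum(mask)

frame = [0, 0, 9, 10, 0]   # 1-D "image": object pixels have intensity near 10
candidates = [
    [0, 0, 0, 0, 0],       # no template at all
    [0, 0, 1, 1, 0],       # the true object support
    [1, 1, 1, 1, 1],       # template covering everything
]
best = min((template_cost(m, frame, obj_val=10, bg_val=0, lam=0.5), m)
           for m in candidates)
```

The paper minimises a cost of this flavour over motions, template and intensity levels by alternating closed-form steps; here the minimisation is just an explicit enumeration of three candidate templates.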
Layered graphical models for tracking partially-occluded objects
Cited by 7 (1 self).
Abstract:
Partial occlusions are commonplace in a variety of real world computer vision applications: surveillance, intelligent environments, assistive robotics, autonomous navigation, etc. While occlusion handling methods have been proposed, most methods tend to break down when confronted with numerous occluders in a scene. In this paper, a layered image-plane representation for tracking people through substantial occlusions is proposed. An image-plane representation of motion around an object is associated with a pre-computed graphical model, which can be instantiated efficiently during online tracking. A global state and observation space is obtained by linking transitions between layers. A Reversible Jump Markov Chain Monte Carlo approach is used to infer the number of people and track them online. The method outperforms two state-of-the-art methods for tracking over extended occlusions, given videos of a parking lot with numerous vehicles and a laboratory with many desks and workstations.
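The distinctive ingredient here, inferring the *number* of people with Reversible Jump MCMC, can be illustrated in one dimension: a Metropolis-Hastings chain over an integer person count with symmetric birth/death moves. This is a deliberately simplified stand-in (Poisson prior, Gaussian blob-count likelihood, all parameters invented), not the paper's sampler:

```python
import math
import random

def log_post(k, n_obs, lam=2.0):
    """Toy log-posterior over person count k: Poisson(lam) prior on how many
    people are present, times a Gaussian likelihood that the observed
    foreground blob count n_obs matches k."""
    if k < 0:
        return float("-inf")
    log_prior = k * math.log(lam) - lam - math.lgamma(k + 1)
    log_lik = -0.5 * ((n_obs - k) / 0.5) ** 2
    return log_prior + log_lik

def mh_person_count(n_obs, steps=5000, seed=0):
    """Metropolis-Hastings over k with +/-1 birth/death moves: a 1-D stand-in
    for the trans-dimensional jumps RJMCMC performs over full person states."""
    random.seed(seed)
    k, samples = 0, []
    for _ in range(steps):
        k_new = k + random.choice([-1, 1])
        if math.log(random.random()) < log_post(k_new, n_obs) - log_post(k, n_obs):
            k = k_new
        samples.append(k)
    return samples

samples = mh_person_count(n_obs=3)
```

In the real tracker each "birth" adds a whole person state (position, layer assignment) rather than just incrementing a counter, and the acceptance ratio includes a Jacobian for the dimension change.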