MLESAC: A New Robust Estimator with Application to Estimating Image Geometry
Computer Vision and Image Understanding, 2000
Cited by 164 (5 self)
A new method is presented for robustly estimating multiple view relations from point correspondences. The method comprises two parts. The first is a new robust estimator MLESAC which is a generalization of the RANSAC estimator. It adopts the same sampling strategy as RANSAC to generate putative solutions, but chooses the solution that maximizes the likelihood rather than just the number of inliers. The second part of the algorithm is a general purpose method for automatically parameterizing these relations, using the output of MLESAC. A difficulty with multiview image relations is that there are often nonlinear constraints between the parameters, making optimization a difficult task. The parameterization method overcomes the difficulty of nonlinear constraints and conducts a constrained optimization. The method is general and its use is illustrated for the estimation of fundamental matrices, image–image homographies, and quadratic transformations. Results are given for both synthetic and real images. It is demonstrated that the method gives results equal or superior to those of previous approaches. © 2000 Academic Press.
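The difference between counting inliers and maximizing a likelihood can be illustrated with a small sketch. It is not the paper's implementation: it fits a 2-D line instead of a multiple view relation, and it assumes a residual model that mixes a zero-mean Gaussian for inliers with a uniform density for outliers. The function names and the parameters `sigma`, `gamma`, and `window` are hypothetical, and the mixing weight `gamma` is held fixed for brevity.

```python
import math
import random


def fit_line(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1 through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0.0:
        return None  # coincident points do not define a line
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def residuals(line, points):
    """Signed point-to-line distances."""
    a, b, c = line
    return [a * x + b * y + c for x, y in points]


def inlier_count(res, threshold):
    """RANSAC-style score: number of residuals below a threshold."""
    return sum(1 for r in res if abs(r) < threshold)


def neg_log_likelihood(res, sigma, gamma, window):
    """Likelihood-style score under an assumed mixture model:
    gamma * N(0, sigma^2) + (1 - gamma) * Uniform(width=window)."""
    total = 0.0
    for r in res:
        gauss = math.exp(-0.5 * (r / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total -= math.log(gamma * gauss + (1.0 - gamma) / window)
    return total


def sample_and_score(points, n_iters=200, sigma=0.05, gamma=0.5, window=10.0, seed=0):
    """Draw minimal samples as RANSAC does, but keep the hypothesis
    with the lowest negative log-likelihood."""
    rng = random.Random(seed)
    best_line, best_cost = None, float("inf")
    for _ in range(n_iters):
        line = fit_line(*rng.sample(points, 2))
        if line is None:
            continue
        cost = neg_log_likelihood(residuals(line, points), sigma, gamma, window)
        if cost < best_cost:
            best_line, best_cost = line, cost
    return best_line, best_cost
```

Two hypotheses whose residuals all fall below the threshold receive the same inlier count, while the likelihood score still prefers the one with the smaller residuals.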
Learning Flexible Sprites in Video Layers
In CVPR, 2001
Cited by 107 (14 self)
We propose a technique for automatically learning layers of "flexible sprites" -- probabilistic 2-dimensional appearance maps and masks of moving, occluding objects. The model explains each input image as a layered composition of flexible sprites. A variational expectation maximization algorithm is used to learn a mixture of sprites from a video sequence. For each input image, probabilistic inference is used to infer the sprite class, translation, mask values and pixel intensities (including obstructed pixels) in each layer. Exact inference is intractable, but we show how a variational inference technique can be used to process 320 × 240 images at 1 frame/second. The only inputs to the learning algorithm are the video sequence, the number of layers and the number of flexible sprites. We give results on several tasks, including summarizing a video sequence with sprites, point-and-click video stabilization, and point-and-click object removal.
Generalized principal component analysis (GPCA)
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
Cited by 75 (27 self)
This paper presents an algebro-geometric solution to the problem of segmenting an unknown number of subspaces of unknown and varying dimensions from sample data points. We represent the subspaces with a set of homogeneous polynomials whose degree is the number of subspaces and whose derivatives at a data point give normal vectors to the subspace passing through the point. When the number of subspaces is known, we show that these polynomials can be estimated linearly from data; hence, subspace segmentation is reduced to classifying one point per subspace. We select these points optimally from the data set by minimizing a certain distance function, thus dealing automatically with moderate noise in the data. A basis for the complement of each subspace is then recovered by applying standard PCA to the collection of derivatives (normal vectors). Extensions of GPCA that deal with data in a high-dimensional space and with an unknown number of subspaces are also presented. Our experiments on low-dimensional data show that GPCA outperforms existing algebraic algorithms based on polynomial factorization and provides a good initialization to iterative techniques such as K-subspaces and Expectation Maximization. We also present applications of GPCA to computer vision problems such as face clustering, temporal video segmentation, and 3D motion segmentation from point correspondences in multiple affine views.
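The central idea, that a single polynomial vanishes on all subspaces and its gradient at a data point is normal to the subspace through that point, can be sketched for the simplest case: two lines through the origin in the plane. This is an illustrative toy under strong assumptions, not the paper's algorithm: the data are noise-free, the number of lines is known to be two, no data point lies at the origin, and the polynomial coefficients are obtained from a cross product of two embedded points rather than from a least-squares fit over all data. All function names are hypothetical.

```python
import math
from itertools import combinations


def veronese2(p):
    """Degree-2 embedding of a 2-D point: (x^2, x*y, y^2)."""
    x, y = p
    return (x * x, x * y, y * y)


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def fit_two_line_polynomial(points):
    """Coefficients (c1, c2, c3) of c1*x^2 + c2*x*y + c3*y^2 vanishing on the data.
    With noise-free points on two lines, the embedded points span a plane in R^3,
    so the cross product of any two independent embedded points is its normal."""
    best, best_norm = None, 0.0
    for p, q in combinations(points, 2):
        c = cross(veronese2(p), veronese2(q))
        n = math.sqrt(sum(t * t for t in c))
        if n > best_norm:
            best, best_norm = c, n
    if best is None:
        raise ValueError("data do not determine the polynomial")
    return tuple(t / best_norm for t in best)


def unit_normal(coeffs, p):
    """Normalized gradient of the polynomial at p (p must not be the origin)."""
    c1, c2, c3 = coeffs
    x, y = p
    gx, gy = 2 * c1 * x + c2 * y, c2 * x + 2 * c3 * y
    n = math.hypot(gx, gy)
    return gx / n, gy / n


def segment(points, coeffs, tol=1e-6):
    """Group points whose gradient directions agree up to sign."""
    groups = []  # list of (normal, members)
    for p in points:
        n = unit_normal(coeffs, p)
        for normal, members in groups:
            if abs(normal[0] * n[0] + normal[1] * n[1]) > 1.0 - tol:
                members.append(p)
                break
        else:
            groups.append((n, [p]))
    return groups
```

For points on the lines y = x and x + 2y = 0, the fitted polynomial is proportional to (x - y)(x + 2y), and the gradients at the data points are parallel to (1, -1) and (1, 2) respectively, which is what separates the two groups.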
Object tracking with Bayesian estimation of dynamic layer representations
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002
Cited by 61 (4 self)
Decomposing video frames into coherent two-dimensional motion layers is a powerful method for representing videos. Such a representation provides an intermediate description that enables applications such as object tracking, video summarization and visualization, video insertion, and sprite-based video compression. Previous work on motion layer analysis has largely concentrated on two-frame or multiframe batch formulations. The temporal coherency of motion layers and the domain constraints on shapes have not been exploited. This paper introduces a complete dynamic motion layer representation in which spatial and temporal constraints on shape, motion, and layer appearance are modeled and estimated in a maximum a posteriori (MAP) framework using the generalized expectation-maximization (EM) algorithm. In order to limit the computational complexity of tracking arbitrarily shaped layer ownership, we propose a shape prior that parameterizes the representation of shape and prevents motion layers from evolving into arbitrary shapes. In this work, a Gaussian shape prior is chosen to specifically develop a near real-time tracker for vehicle tracking in aerial videos. However, the general idea of using a parametric shape representation as part of the state of a tracker is a powerful one that can be extended to other domains as well. Based on the dynamic layer representation, an iterative algorithm is developed for continuous object tracking over time. The proposed method has been successfully applied in an airborne vehicle tracking system. Its performance is compared with that of a correlation-based tracker and a motion change-based tracker to demonstrate the advantages of the new method. Examples of tracking when the backgrounds are cluttered and the vehicles undergo various rigid motions and complex interactions such as passing, turning, and stop-and-go demonstrate the strength of the complete dynamic layer representation.
A probabilistic framework for space carving
2001
Cited by 52 (3 self)
for the degree of Doctor of Philosophy. This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration. This dissertation is not substantially the same as any I have submitted for a degree or diploma or other qualification at any other University. I further state that no part of my dissertation/thesis has already been, or is being concurrently submitted for any such degree, diploma or other qualification. This dissertation contains 78 figures and approximately 46000 words. This dissertation was revised December 2001. This thesis investigates the problem of reconstructing three-dimensional objects from image sequences. There are two major contributions in this thesis. The first contribution is an extension to the Space Carving framework that elimi-
Bilayer segmentation of live video
In IEEE Conference on Computer Vision and Pattern Recognition, 2006
Cited by 48 (3 self)
Figure 1: (a) Input sequence; (b) automatic layer separation and background substitution in three different frames. An example of automatic foreground/background segmentation in monocular image sequences. Despite the challenging foreground motion the person is accurately extracted from the sequence and then composited free of aliasing upon a different background; a useful tool in video-conferencing applications. The sequences and ground truth data used throughout this paper are available from [1].
This paper presents an algorithm capable of real-time separation of foreground from background in monocular video sequences. Automatic segmentation of layers from colour/contrast or from motion alone is known to be error-prone. Here motion, colour and contrast cues are probabilistically fused together with spatial and temporal priors to infer layers accurately and efficiently. Central to our algorithm is the fact that pixel velocities are not needed, thus removing the need for optical flow estimation, with its tendency to error and computational expense. Instead, an efficient motion vs nonmotion classifier is trained to operate directly and jointly on intensity-change and contrast. Its output is then fused with colour information. The prior on segmentation is represented by a second order, temporal, Hidden Markov Model, together with a spatial MRF favouring coherence except where contrast is high. Finally, accurate layer segmentation and explicit occlusion detection are efficiently achieved by binary graph cut. The segmentation accuracy of the proposed algorithm is quantitatively evaluated with respect to existing ground-truth data and found to be comparable to the accuracy of a state-of-the-art stereo segmentation algorithm. Foreground/background segmentation is demonstrated in the application of live background substitution and shown to generate convincingly good quality composite video.
A Robust Subspace Approach to Layer Extraction
2002
Cited by 47 (6 self)
Representing images with layers has many important applications, such as video compression, motion analysis, and 3D scene analysis. This paper presents a robust subspace approach to reliably extracting layers from images by taking advantage of the fact that homographies induced by planar patches in the scene form a low dimensional linear subspace. Such a subspace provides not only a feature space where layers in the image domain are mapped onto denser and better-defined clusters, but also a constraint for detecting outliers in the local measurements, thus making the algorithm robust to outliers. By enforcing the subspace constraint, spatial and temporal redundancy from multiple frames are simultaneously utilized, and noise can be effectively reduced. Good layer descriptions are shown to be extracted in the experimental results.
What Went Where
2003
Cited by 43 (4 self)
We present a novel framework for motion segmentation that combines the concepts of layer-based methods and feature-based motion estimation. We estimate the initial correspondences by comparing vectors of filter outputs at interest points, from which we compute candidate scene relations via random sampling of minimal subsets of correspondences. We achieve a dense, piecewise smooth assignment of pixels to motion layers using a fast approximate graph-cut algorithm based on a Markov random field formulation. We demonstrate our approach on image pairs containing large inter-frame motion and partial occlusion. The approach is efficient and it successfully segments scenes with inter-frame disparities previously beyond the scope of layer-based motion segmentation methods.
Two-View Multibody Structure from Motion
2006
Cited by 35 (15 self)
We present an algebraic geometric approach to 3-D motion estimation and segmentation of multiple rigid-body motions from noise-free point correspondences in two perspective views. Our approach exploits the algebraic and geometric properties of the so-called multibody epipolar constraint and its associated multibody fundamental matrix, which are natural generalizations of the epipolar constraint and of the fundamental matrix to multiple motions. We derive a rank constraint on a polynomial embedding of the correspondences, from which one can estimate the number of independent motions as well as linearly solve for the multibody fundamental matrix. We then show how to compute the epipolar lines from the first-order derivatives of the multibody epipolar constraint and the epipoles by solving a plane clustering problem using Generalized PCA (GPCA). Given the epipoles and epipolar lines, the estimation of individual fundamental matrices becomes a linear problem. The clustering of the feature points is then automatically obtained from either the epipoles and epipolar lines or from the individual fundamental matrices. Although our approach is mostly designed for noise-free correspondences, we also test its performance on synthetic and real data with moderate levels of noise.
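For orientation, the multibody epipolar constraint mentioned in this abstract can be written out as follows. The symbols are not given in the abstract and are assumptions of this note: x_1 and x_2 are homogeneous image points of one correspondence in the two views, F_i is the fundamental matrix of the i-th of n motions, \nu_n is the degree-n Veronese embedding, and \mathcal{F} is the multibody fundamental matrix.

```latex
\prod_{i=1}^{n} \left( x_2^{\top} F_i \, x_1 \right)
  \;=\; \nu_n(x_2)^{\top} \, \mathcal{F} \, \nu_n(x_1) \;=\; 0
```

Each correspondence satisfies the ordinary epipolar constraint of the motion it belongs to, so at least one factor of the product vanishes. The product is homogeneous of degree n in x_1 and in x_2, hence bilinear in the embedded points, which is why the entries of \mathcal{F} can be solved for linearly.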
A Unified Algebraic Approach to 2-D and 3-D Motion Segmentation
In European Conference on Computer Vision, 2004
Cited by 30 (15 self)
We present an analytic solution to the problem of estimating multiple 2-D and 3-D motion models from two-view correspondences or optical flow. The key to our approach is to view the estimation of multiple motion models as the estimation of a single multibody motion model. This is possible thanks to two important algebraic facts. First, we show that all the image measurements, regardless of their associated motion model, can be fit with a real or complex polynomial. Second, we show

