Shape and motion from image streams under orthography: a factorization method
International Journal of Computer Vision, 1992
Cited by 775 (33 self)
Inferring scene geometry and camera motion from a stream of images is possible in principle, but is an ill-conditioned problem when the objects are distant with respect to their size. We have developed a factorization method that can overcome this difficulty by recovering shape and motion under orthography without computing depth as an intermediate step. An image stream can be represented by the 2F × P measurement matrix of the image coordinates of P points tracked through F frames. We show that under orthographic projection this matrix is of rank 3. Based on this observation, the factorization method uses the singular-value decomposition technique to factor the measurement matrix into two matrices which represent object shape and camera rotation respectively. Two of the three translation components are computed in a preprocessing stage. The method can also handle and obtain a full solution from a partially filled-in measurement matrix that may result from occlusions or tracking failures. The method gives accurate results, and does not introduce smoothing in either shape or motion. We demonstrate this with a series of experiments on laboratory and outdoor image streams, with and without occlusions.
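The rank-3 observation in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the synthetic data, the names `synthetic_stream` and `factorize_measurements`, and the interleaved row ordering are assumptions made for illustration. The sketch subtracts each row's mean (removing the two image-plane translation components) and stops at a factorization that is only determined up to an invertible 3 × 3 matrix; any further normalization of the published method is omitted.

```python
import numpy as np


def synthetic_stream(num_frames=8, num_points=20, seed=0):
    """Build a noise-free 2F x P measurement matrix under orthography."""
    rng = np.random.default_rng(seed)
    shape = rng.standard_normal((3, num_points))        # 3 x P object points
    cameras = []
    for _ in range(num_frames):
        # QR of a random matrix yields an orthogonal matrix; two of its
        # rows serve as the orthonormal image axes of one frame.
        q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
        cameras.append(q[:2])
    motion = np.vstack(cameras)                         # 2F x 3
    translation = rng.standard_normal((2 * num_frames, 1))
    return motion @ shape + translation


def factorize_measurements(W):
    """Split W into a 2F x 3 motion factor and a 3 x P shape factor."""
    row_mean = W.mean(axis=1, keepdims=True)            # image-plane translation
    registered = W - row_mean
    u, s, vt = np.linalg.svd(registered, full_matrices=False)
    root = np.sqrt(s[:3])                               # keep the top 3 components
    motion_hat = u[:, :3] * root
    shape_hat = root[:, None] * vt[:3]
    return motion_hat, shape_hat, row_mean, s


W = synthetic_stream()
motion_hat, shape_hat, row_mean, singular_values = factorize_measurements(W)
residual = np.abs(motion_hat @ shape_hat + row_mean - W).max()
print("fourth / first singular value:", singular_values[3] / singular_values[0])
print("max reconstruction error:", residual)
```

On noise-free data the fourth singular value is zero up to rounding, so the three retained components reproduce the registered matrix.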
A Multi-body Factorization Method for Independently Moving Objects
International Journal of Computer Vision, 1997
Cited by 133 (10 self)
In this paper we present a new method for separating and recovering the motion and shape of multiple independently moving objects in a sequence of images. The method does not require prior knowledge of the number of objects, nor is dependent on any grouping of features into an object at the image level. For this purpose, we introduce a mathematical construct of object shapes, called the shape interaction matrix, which is invariant to both the object motions and the selection of coordinate systems. This invariant structure is computable solely from the observed trajectories of image features without grouping them into individual objects.
A Multi-body Factorization Method for Motion Analysis
1995
Cited by 121 (2 self)
The structure-from-motion problem has been extensively studied in the field of computer vision. Yet, the bulk of the existing work assumes that the scene contains only a single moving object. The more realistic case where an unknown number of objects move in the scene has received little attention, especially for its theoretical treatment. In this paper we present a new method for separating and recovering the motion and shape of multiple independently moving objects in a sequence of images. The method does not require prior knowledge of the number of objects, nor is dependent on any grouping of features into an object at the image level. For this purpose, we introduce a mathematical construct of object shapes, called the shape interaction matrix, which is invariant to both the object motions and the selection of coordinate systems. This invariant structure is computable solely from the observed trajectories of image features without grouping them into individual objects. Once the matr...
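As a rough illustration of the shape interaction matrix, the sketch below uses the commonly cited definition Q = V_r V_rᵀ, where V_r holds the first r right singular vectors of the trajectory matrix and r is its rank. The abstract itself does not spell out this formula, so treat it as an assumption here, along with the synthetic affine trajectories, the relative rank threshold, and the helper names. Columns are already grouped by object purely so that the block structure is easy to check.

```python
import numpy as np


def shape_interaction_matrix(W, rel_tol=1e-9):
    """Return Q = V_r V_r^T and the estimated rank r of W (2F x P)."""
    _, s, vt = np.linalg.svd(W, full_matrices=False)
    rank = int(np.sum(s > rel_tol * s[0]))      # assumed threshold, noise-free data
    v_r = vt[:rank].T                           # P x r right singular vectors
    return v_r @ v_r.T, rank


def affine_trajectories(rng, num_frames, num_points):
    """Trajectories of one rigid object under a random affine camera path."""
    motion = rng.standard_normal((2 * num_frames, 4))
    shape = np.vstack([rng.standard_normal((3, num_points)),
                       np.ones((1, num_points))])   # homogeneous coordinates
    return motion @ shape


rng = np.random.default_rng(0)
W = np.hstack([affine_trajectories(rng, 10, 12),    # object A: columns 0..11
               affine_trajectories(rng, 10, 15)])   # object B: columns 12..26
Q, rank = shape_interaction_matrix(W)
cross = np.abs(Q[:12, 12:]).max()
within = np.abs(Q[:12, :12]).max()
print("estimated rank:", rank)
print("largest cross-object entry:", cross)
print("largest within-object entry:", within)
```

For independently moving objects the entries of Q that pair features from different objects vanish, which is what makes grouping possible without knowing the objects in advance.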
Generalized principal component analysis (GPCA)
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
Cited by 75 (27 self)
This paper presents an algebro-geometric solution to the problem of segmenting an unknown number of subspaces of unknown and varying dimensions from sample data points. We represent the subspaces with a set of homogeneous polynomials whose degree is the number of subspaces and whose derivatives at a data point give normal vectors to the subspace passing through the point. When the number of subspaces is known, we show that these polynomials can be estimated linearly from data; hence, subspace segmentation is reduced to classifying one point per subspace. We select these points optimally from the data set by minimizing a certain distance function, thus dealing automatically with moderate noise in the data. A basis for the complement of each subspace is then recovered by applying standard PCA to the collection of derivatives (normal vectors). Extensions of GPCA that deal with data in a high-dimensional space and with an unknown number of subspaces are also presented. Our experiments on low-dimensional data show that GPCA outperforms existing algebraic algorithms based on polynomial factorization and provides a good initialization to iterative techniques such as K-subspaces and Expectation Maximization. We also present applications of GPCA to computer vision problems such as face clustering, temporal video segmentation, and 3D motion segmentation from point correspondences in multiple affine views.
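The two central steps named in this abstract, fitting the polynomial linearly and reading a normal vector off its derivative, can be sketched for the simplest case: two planes through the origin in 3D, with the number of subspaces known to be two. The data, the helper names, and the restriction to noise-free hyperplanes are assumptions for illustration; the optimal point selection and the PCA step on collections of normals are left out.

```python
import numpy as np


def veronese2(X):
    """Degree-2 monomials of each row (x, y, z): x^2, xy, xz, y^2, yz, z^2."""
    x, y, z = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([x * x, x * y, x * z, y * y, y * z, z * z])


def fit_polynomial(X):
    """Coefficients of the quadratic vanishing on all rows of X (null vector)."""
    _, _, vt = np.linalg.svd(veronese2(X), full_matrices=False)
    return vt[-1]                       # right singular vector of smallest value


def gradient(c, p):
    """Gradient of the fitted quadratic at point p = (x, y, z)."""
    x, y, z = p
    return np.array([
        2 * c[0] * x + c[1] * y + c[2] * z,
        c[1] * x + 2 * c[3] * y + c[4] * z,
        c[2] * x + c[4] * y + 2 * c[5] * z,
    ])


def sample_plane(rng, unit_normal, count):
    """Random points projected onto the plane through the origin."""
    pts = rng.standard_normal((count, 3))
    return pts - np.outer(pts @ unit_normal, unit_normal)


rng = np.random.default_rng(0)
n1 = np.array([0.0, 0.0, 1.0])
n2 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
X = np.vstack([sample_plane(rng, n1, 30), sample_plane(rng, n2, 30)])

c = fit_polynomial(X)
g = gradient(c, X[0])                   # X[0] lies on the first plane
g /= np.linalg.norm(g)
alignment = abs(g @ n1)                 # 1.0 means parallel to the true normal
print("alignment with the true normal:", alignment)
```

The fitted quadratic is proportional to the product of the two plane equations, so its gradient at a point on one plane is a multiple of that plane's normal.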
Motion Segmentation with Missing Data using PowerFactorization and GPCA
In CVPR, 2004
Cited by 52 (8 self)
We consider the problem of segmenting multiple rigid motions from point correspondences in multiple affine views. We cast this problem as a subspace clustering problem in which the motion of each object lives in a subspace of dimension two, three or four. Unlike previous work, we do not restrict the motion subspaces to be four-dimensional or linearly independent. Instead, our approach deals gracefully with the full spectrum of possible affine motions: from two-dimensional and partially dependent to four-dimensional and fully independent. In addition, our method handles the case of missing data, meaning that point tracks do not have to be visible in all images. Our approach involves projecting the trajectories of all the points into a 5-dimensional space, using the PowerFactorization method to fill in missing data. Then multiple linear subspaces representing independent motions are fitted to the projected points using GPCA. We test our algorithm on various real sequences with degenerate and nondegenerate motions, missing data, perspective effects, transparent motions, etc. Our algorithm achieves a misclassification error of less than 5% for sequences with up to 30% of missing data points.
Spatio-Temporal Segmentation of Video by Hierarchical Mean Shift Analysis
Center for Automat. Res., U. of Md, College Park, 2002
Cited by 47 (4 self)
We describe a simple new technique for spatio-temporal segmentation of video sequences. Each pixel of a 3D space-time video stack is mapped to a 7D feature point whose coordinates include three color components, two motion angle components and two motion position components. The clustering of these feature points provides color segmentation and motion segmentation, as well as a consistent labeling of regions over time which amounts to region tracking. For this task we have adopted a hierarchical clustering method which operates by repeatedly applying mean shift analysis over increasingly large ranges, using at each pass the cluster centers of the previous pass, with weights equal to the counts of the points that contributed to the clusters. This technique has lower complexity for large mean shift radii than regular mean shift analysis because it can use binary tree structures more efficiently during range search. In addition, it provides a hierarchical segmentation of the data. Applications include video compression and compact descriptions of video sequences for video indexing and retrieval applications.
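The hierarchical scheme described here (repeated mean shift passes over growing radii, each pass fed the previous centers weighted by their point counts) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' code: it uses a flat kernel, brute-force distance computation instead of tree-based range search, 2D toy points instead of 7D video features, and an arbitrary merge threshold of half the radius. The function names are hypothetical.

```python
import numpy as np


def mean_shift_pass(points, weights, radius, max_iter=100, tol=1e-7):
    """One weighted flat-kernel mean shift pass; returns centers and counts."""
    modes = points.copy()
    for _ in range(max_iter):
        # Distance from every current mode estimate to every input point.
        dist = np.linalg.norm(modes[:, None, :] - points[None, :, :], axis=2)
        window = (dist <= radius) * weights[None, :]
        shifted = (window @ points) / window.sum(axis=1, keepdims=True)
        step = np.linalg.norm(shifted - modes, axis=1).max()
        modes = shifted
        if step < tol:
            break
    # Merge modes that landed close together, accumulating their weights.
    centers, counts = [], []
    for mode, weight in zip(modes, weights):
        for k, center in enumerate(centers):
            if np.linalg.norm(mode - center) < 0.5 * radius:
                counts[k] += weight
                break
        else:
            centers.append(mode)
            counts.append(weight)
    return np.array(centers), np.array(counts, dtype=float)


def hierarchical_mean_shift(points, radii):
    """Apply mean shift with increasing radii, reusing centers and counts."""
    centers = np.asarray(points, dtype=float)
    counts = np.ones(len(centers))
    levels = []
    for radius in radii:
        centers, counts = mean_shift_pass(centers, counts, radius)
        levels.append((centers, counts))
    return levels


rng = np.random.default_rng(0)
blob_a = rng.normal(0.0, 0.3, size=(50, 2))
blob_b = rng.normal(10.0, 0.3, size=(50, 2))
levels = hierarchical_mean_shift(np.vstack([blob_a, blob_b]), radii=[1.0, 3.0])
final_centers, final_counts = levels[-1]
print("clusters per level:", [len(c) for c, _ in levels])
print("final counts:", final_counts)
```

Because later passes operate on a handful of weighted centers rather than on all points, each level is cheaper than the one before, and the list of levels doubles as a coarse-to-fine segmentation.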
A general framework for motion segmentation: Independent, articulated, rigid, non-rigid, degenerate and nondegenerate
In ECCV, 2006
Cited by 39 (0 self)
We cast the problem of motion segmentation of feature trajectories as linear manifold finding problems and propose a general framework for motion segmentation under affine projections which utilizes two properties of trajectory data: geometric constraint and locality. The geometric constraint states that the trajectories of the same motion lie in a low-dimensional linear manifold and different motions result in different linear manifolds; locality, by which we mean that in a transformed space a data point and its neighbors tend to lie in the same linear manifold, provides a cue for efficient estimation of these manifolds. Our algorithm estimates a number of linear manifolds, whose dimensions are unknown beforehand, and segments the trajectories accordingly. It first transforms and normalizes the trajectories; secondly, for each trajectory it estimates a local linear manifold through local sampling; then it derives the affinity matrix based on principal subspace angles between these estimated linear manifolds; finally, spectral clustering is applied to the matrix and gives the segmentation result. Our algorithm is general, without restriction on the number of linear manifolds and without prior knowledge of the dimensions of the linear manifolds. We demonstrate in our experiments that it can segment a wide range of motions including independent, articulated, rigid, non-rigid, degenerate, non-degenerate or any combination of them. In some highly challenging cases where other state-of-the-art motion segmentation algorithms may fail, our algorithm gives expected results.
A benchmark for the comparison of 3D motion segmentation algorithms
In CVPR, 2007
Cited by 37 (5 self)
Over the past few years, several methods for segmenting a scene containing multiple rigidly moving objects have been proposed. However, most existing methods have been tested on a handful of sequences only, and each method has often been tested on a different set of sequences. Therefore, the comparison of different methods has been fairly limited. In this paper, we compare four 3-D motion segmentation algorithms for affine cameras on a benchmark of 155 motion sequences of checkerboard, traffic, and articulated scenes.
Two-View Multibody Structure from Motion
2006
Cited by 35 (15 self)
We present an algebraic geometric approach to 3-D motion estimation and segmentation of multiple rigid-body motions from noise-free point correspondences in two perspective views. Our approach exploits the algebraic and geometric properties of the so-called multibody epipolar constraint and its associated multibody fundamental matrix, which are natural generalizations of the epipolar constraint and of the fundamental matrix to multiple motions. We derive a rank constraint on a polynomial embedding of the correspondences, from which one can estimate the number of independent motions as well as linearly solve for the multibody fundamental matrix. We then show how to compute the epipolar lines from the first-order derivatives of the multibody epipolar constraint and the epipoles by solving a plane clustering problem using Generalized PCA (GPCA). Given the epipoles and epipolar lines, the estimation of individual fundamental matrices becomes a linear problem. The clustering of the feature points is then automatically obtained from either the epipoles and epipolar lines or from the individual fundamental matrices. Although our approach is mostly designed for noise-free correspondences, we also test its performance on synthetic and real data with moderate levels of noise.
Segmenting, Modeling, and Matching Video Clips Containing Multiple Moving Objects
2005
Cited by 25 (4 self)
This paper presents a novel representation for dynamic scenes composed of multiple rigid objects that may undergo different motions and are observed by a moving camera. Multi-view constraints associated with groups of affine-covariant scene patches and a normalized description of their appearance are used to segment a scene into its rigid components, construct three-dimensional models of these components, and match instances of models recovered from different image sequences. The proposed approach has been implemented, and it is applied to the detection and matching of moving objects in video sequences and to shot matching, i.e., the identification of shots that depict the same scene in a video clip.

