A taxonomy and evaluation of dense two-frame stereo correspondence algorithms.
- In IEEE Workshop on Stereo and Multi-Baseline Vision, 2001
Cited by 1546 (22 self)
Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods. Our taxonomy is designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms. We have also produced several new multi-frame stereo data sets with ground truth and are making both the code and data sets available on the Web. Finally, we include a comparative evaluation of a large set of today's best-performing stereo algorithms.
What energy functions can be minimized via graph cuts?
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004
Cited by 1047 (23 self)
In the last few years, several new algorithms based on graph cuts have been developed to solve energy minimization problems in computer vision. Each of these techniques constructs a graph such that the minimum cut on the graph also minimizes the energy. Yet, because these graph constructions are complex and highly specific to a particular energy function, graph cuts have seen limited application to date. In this paper, we give a characterization of the energy functions that can be minimized by graph cuts. Our results are restricted to functions of binary variables. However, our work generalizes many previous constructions and is easily applicable to vision problems that involve large numbers of labels, such as stereo, motion, image restoration, and scene reconstruction. We give a precise characterization of what energy functions can be minimized using graph cuts, among the energy functions that can be written as a sum of terms containing three or fewer binary variables. We also provide a general-purpose construction to minimize such an energy function. Finally, we give a necessary condition for any energy function of binary variables to be minimized by graph cuts. Researchers who are considering the use of graph cuts to optimize a particular energy function can use our results to determine if this is possible and then follow our construction to create the appropriate graph. A software implementation is freely available.
Multi-camera Scene Reconstruction via Graph Cuts
- In European Conference on Computer Vision, 2002
Cited by 317 (9 self)
We address the problem of computing the 3-dimensional shape of an arbitrary scene from a set of images taken at known viewpoints.
Advances in computational stereo
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
Cited by 224 (3 self)
Extraction of three-dimensional structure of a scene from stereo images is a problem that has been studied by the computer vision community for decades. Early work focused on the fundamentals of image correspondence and stereo geometry. Stereo research has matured significantly throughout the years and many advances in computational stereo continue to be made, allowing stereo to be applied to new and more demanding problems. In this paper, we review recent advances in computational stereo, focusing primarily on three important topics: correspondence methods, methods for occlusion, and real-time implementations. Throughout, we present tables that summarize and draw distinctions among key ideas and approaches. Where available, we provide comparative analyses and we make suggestions for analyses yet to be done. Index Terms: Computational stereo, stereo correspondence, occlusion, real-time stereo, review.
Detailed real-time urban 3D reconstruction from video.
- IJCV, 2008
Cited by 185 (22 self)
The paper presents a system for automatic, georegistered, real-time 3D reconstruction from video of urban scenes. The system collects video streams, as well as GPS and inertial measurements, in order to place the reconstructed models in georegistered coordinates. It is designed using current state-of-the-art real-time modules for all processing steps. It employs commodity graphics hardware and standard CPUs to achieve real-time performance. We present the main considerations in designing the system and the steps of the processing pipeline. Our system extends existing algorithms to meet the robustness and variability necessary to operate outside the lab. To account for the large dynamic range of outdoor videos, the processing pipeline estimates global camera gain changes in the feature tracking stage and efficiently compensates for these in stereo estimation without impacting the real-time performance. The required accuracy for many applications is achieved with a two-step stereo reconstruction process exploiting the redundancy across frames. We show results on real video sequences comprising hundreds of thousands of frames.
Comparison of Graph Cuts with Belief Propagation for Stereo, Using Identical MRF Parameters
- In ICCV, 2003
Cited by 172 (0 self)
Recent stereo algorithms have achieved impressive results by modelling the disparity image as a Markov Random Field (MRF). An important component of an MRF-based approach is the inference algorithm used to find the most likely setting of each node in the MRF. Algorithms have been proposed which use Graph Cuts or Belief Propagation for inference. These stereo algorithms differ in both the inference algorithm used and the formulation of the MRF. It is unknown whether to attribute the responsibility for differences in performance to the MRF or the inference algorithm. We address this through controlled experiments by comparing the Belief Propagation algorithm and the Graph Cuts algorithm on the same MRFs, which have been created for calculating stereo disparities. We find that the labellings produced by the two algorithms are comparable. The solutions produced by Graph Cuts have a lower energy than those produced with Belief Propagation, but this does not necessarily lead to increased performance relative to the ground truth.
Handling Occlusions in Dense Multi-view Stereo
- 2001
Cited by 140 (9 self)
While stereo matching was originally formulated as the recovery of 3D shape from a pair of images, it is now generally recognized that using more than two images can dramatically improve the quality of the reconstruction. Unfortunately, as more images are added, the prevalence of semi-occluded regions (pixels visible in some but not all images) also increases. In this paper, we propose some novel techniques to deal with this problem. Our first idea is to use a combination of shiftable windows and a dynamically selected subset of the neighboring images to do the matches. Our second idea is to explicitly label occluded pixels within a global energy minimization framework, and to reason about visibility within this framework so that only truly visible pixels are matched. Experimental results show a dramatic improvement using the first idea over conventional multibaseline stereo, especially when used in conjunction with a global energy minimization technique. These results also show that explicit occlusion labeling and visibility reasoning do help, but not significantly, if the spatial and temporal selection is applied first.
Global Stereo Reconstruction under Second Order Smoothness Priors
Cited by 127 (8 self)
Second-order priors on the smoothness of 3D surfaces are a better model of typical scenes than first-order priors. However, stereo reconstruction using global inference algorithms, such as graph-cuts, has not been able to incorporate second-order priors because the triple cliques needed to express them yield intractable (non-submodular) optimization problems. This paper shows that inference with triple cliques can be effectively optimized. Our optimization strategy is a development of recent extensions to α-expansion, based on the “QPBO” algorithm [5, 14, 26]. The strategy is to repeatedly merge proposal depth maps using a novel extension of QPBO. Proposal depth maps can come from any source, for example fronto-parallel planes as in α-expansion, or indeed any existing stereo algorithm, with arbitrary parameter settings. Experimental results demonstrate the usefulness of the second-order prior and the efficacy of our optimization framework. An implementation of our stereo framework is available online [34].
Fast approximate energy minimization with label costs
- 2010
Cited by 110 (9 self)
The α-expansion algorithm [7] has had a significant impact in computer vision due to its generality, effectiveness, and speed. Thus far it can only minimize energies that involve unary, pairwise, and specialized higher-order terms. Our main contribution is to extend α-expansion so that it can simultaneously optimize “label costs” as well. An energy with label costs can penalize a solution based on the set of labels that appear in it. The simplest special case is to penalize the number of labels in the solution. Our energy is quite general, and we prove optimality bounds for our algorithm. A natural application of label costs is multi-model fitting, and we demonstrate several such applications in vision: homography detection, motion segmentation, and unsupervised image segmentation. Our C++/MATLAB implementation is publicly available.
Motion Layer Extraction in the Presence of Occlusion Using Graph Cuts
- 2005
Cited by 98 (9 self)
Extracting layers from video is very important for video representation, analysis, compression, and synthesis. Assuming that a scene can be approximately described by multiple planar regions, this paper describes a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions, detect the occlusion pixels over multiple consecutive frames, and segment the scene into several motion layers. First, after determining a number of seed regions using correspondences in two frames, we expand the seed regions and reject the outliers by employing the graph cuts method integrated with a level set representation. Next, these initial regions are merged into several initial layers according to their motion similarity. Third, an occlusion order constraint on multiple frames is explored, which enforces that the occlusion area increases with the temporal order over a short period and effectively maintains segmentation consistency over multiple consecutive frames. Then, the correct layer segmentation is obtained by using a graph cuts algorithm, and the occlusions between the overlapping layers are explicitly determined. Experimental results show that our approach is effective and robust.