Computing visual correspondence with occlusions using graph cuts. (2001)

by V Kolmogorov, R Zabih
Venue: ICCV
Results 1 - 10 of 365

Fast approximate energy minimization via graph cuts

by Yuri Boykov, Olga Veksler, Ramin Zabih - IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
Cited by 2120 (61 self)
In this paper we address the problem of minimizing a large class of energy functions that occur in early vision. The major restriction is that the energy function’s smoothness term must only involve pairs of pixels. We propose two algorithms that use graph cuts to compute a local minimum even when very large moves are allowed. The first move we consider is an α-β swap: for a pair of labels α, β, this move exchanges the labels between an arbitrary set of pixels labeled α and another arbitrary set labeled β. Our first algorithm generates a labeling such that there is no swap move that decreases the energy. The second move we consider is an α-expansion: for a label α, this move assigns an arbitrary set of pixels the label α. Our second

Citation Context

...ch does not treat the images symmetrically and allows inconsistent disparities. For example, two pixels in the first image may be assigned to one pixel in the second image. Occlusions are also ignored. [28] presents a stereo algorithm based on expansion moves that addresses these problems. ... data terms $D_p(f_p)$ in Section 8.1. We use different interactions $V_{p,q}(f_p, f_q)$ and we state them for each...

A taxonomy and evaluation of dense two-frame stereo correspondence algorithms.

by Daniel Scharstein, Richard Szeliski - In IEEE Workshop on Stereo and Multi-Baseline Vision, 2001
Cited by 1546 (22 self)
Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods. Our taxonomy is designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms. We have also produced several new multi-frame stereo data sets with ground truth and are making both the code and data sets available on the Web. Finally, we include a comparative evaluation of a large set of today's best-performing stereo algorithms.

Citation Context

..., 23] encourages disparity discontinuities to coincide with intensity/color edges and appears to account for some of the good performance of global optimization approaches. Once the global energy has been defined, a variety of algorithms can be used to find a (local) minimum. Traditional approaches associated with regularization and Markov Random Fields include continuation [17], simulated annealing [47, 75, 6], highest confidence first [28], and mean-field annealing [45]. More recently, max-flow and graph-cut methods have been proposed to solve a special class of global optimization problems [92, 55, 23, 123, 65]. Such methods are more efficient than simulated annealing and have produced good results. Dynamic programming. A different class of global optimization algorithms are those based on dynamic programming. While the 2D-optimization of Equation (3) can be shown to be NP-hard for common classes of smoothness functions [123], dynamic programming can find the global minimum for independent scanlines in polynomial time. Dynamic programming was first used for stereo vision in sparse, edge-based methods [3, 83]. More recent approaches have focused on the dense (intensity-based) scanline optimization pr...
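As a rough illustration of the scanline dynamic programming mentioned in the excerpt above, the following Python sketch finds a minimum-cost disparity assignment along one independent scanline. The cost table layout, the linear penalty `lam * |d - e|`, and the name `scanline_dp` are assumptions made for this example; the excerpt does not reproduce Equation (3), so this is not necessarily the formulation used in the paper.

```python
from typing import List, Sequence, Tuple


def scanline_dp(cost: Sequence[Sequence[float]], lam: float = 1.0) -> Tuple[List[int], float]:
    """Minimize sum_x cost[x][d_x] + lam * sum_x |d_x - d_{x-1}| along one scanline.

    cost[x][d] is a hypothetical matching cost for pixel x at disparity d.
    Returns the optimal disparity per pixel and the minimum total cost.
    """
    n, k = len(cost), len(cost[0])
    best = [list(cost[0])]          # best[x][d]: cheapest path ending at (x, d)
    back: List[List[int]] = []      # back[x-1][d]: predecessor disparity of (x, d)
    for x in range(1, n):
        row, ptr = [], []
        for d in range(k):
            # Pick the predecessor disparity that minimizes cost plus jump penalty.
            prev = min(range(k), key=lambda e: best[-1][e] + lam * abs(d - e))
            row.append(cost[x][d] + best[-1][prev] + lam * abs(d - prev))
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final state.
    d = min(range(k), key=lambda e: best[-1][e])
    path = [d]
    for ptr in reversed(back):
        d = ptr[d]
        path.append(d)
    path.reverse()
    return path, min(best[-1])
```

Each pixel considers every predecessor disparity, so the sketch runs in O(n·k²) time for n pixels and k disparities, which is polynomial as the excerpt states. Because scanlines are solved independently, nothing here enforces consistency between neighbouring rows.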

An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision

by Yuri Boykov, Vladimir Kolmogorov - IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
Cited by 1315 (53 self)
After [10, 15, 12, 2, 4] minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper

Citation Context

...ow algorithms from combinatorial optimization can be used to minimize certain important energy functions in vision. The energies addressed by Greig et al. and by most later graph-based methods (e.g. [15, 12, 2, 11, 4, 1, 18, 13, 16, 17, 3, 14]) can be represented as a posterior energy in the MAP-MRF framework: $E(L) = \sum_{p \in P} D_p(L_p) + \sum_{(p,q) \in N} V_{p,q}(L_p, L_q)$ (1), where $L = \{L_p \mid p \in P\}$ is a labeling of image $P$, $D_p(\cdot)$ is a data penal...
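To make the MAP-MRF energy in the excerpt above concrete, here is a minimal Python sketch that evaluates a data term plus a pairwise interaction term for a labeling on a small grid. The 4-connected neighbourhood, the Potts interaction, and the names `potts` and `energy` are assumptions chosen for illustration; the excerpt itself leaves the neighbourhood and the interaction penalty general.

```python
from typing import Sequence


def potts(lp: int, lq: int, weight: float = 1.0) -> float:
    """Potts interaction (assumed here): a constant penalty when labels differ."""
    return weight if lp != lq else 0.0


def energy(labels: Sequence[Sequence[int]],
           data_cost: Sequence[Sequence[Sequence[float]]],
           weight: float = 1.0) -> float:
    """Evaluate E(L) = sum_p D_p(L_p) + sum_{(p,q) in N} V(L_p, L_q).

    labels[r][c] is the label of pixel (r, c); data_cost[r][c][l] is D_p(l).
    N is taken to be the 4-connected grid, with each pair counted once.
    """
    rows, cols = len(labels), len(labels[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            lp = labels[r][c]
            total += data_cost[r][c][lp]                        # data penalty
            if c + 1 < cols:
                total += potts(lp, labels[r][c + 1], weight)    # right neighbour
            if r + 1 < rows:
                total += potts(lp, labels[r + 1][c], weight)    # lower neighbour
    return total
```

A function like this only scores a given labeling; finding the labeling that minimizes it is the hard part, and that is where the min-cut/max-flow algorithms discussed in the entry come in.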

What energy functions can be minimized via graph cuts?

by Vladimir Kolmogorov, Ramin Zabih - IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004
Cited by 1047 (23 self)
In the last few years, several new algorithms based on graph cuts have been developed to solve energy minimization problems in computer vision. Each of these techniques constructs a graph such that the minimum cut on the graph also minimizes the energy. Yet, because these graph constructions are complex and highly specific to a particular energy function, graph cuts have seen limited application to date. In this paper, we give a characterization of the energy functions that can be minimized by graph cuts. Our results are restricted to functions of binary variables. However, our work generalizes many previous constructions and is easily applicable to vision problems that involve large numbers of labels, such as stereo, motion, image restoration, and scene reconstruction. We give a precise characterization of what energy functions can be minimized using graph cuts, among the energy functions that can be written as a sum of terms containing three or fewer binary variables. We also provide a general-purpose construction to minimize such an energy function. Finally, we give a necessary condition for any energy function of binary variables to be minimized by graph cuts. Researchers who are considering the use of graph cuts to optimize a particular energy function can use our results to determine if this is possible and then follow our construction to create the appropriate graph. A software implementation is freely available.

Citation Context

... can be computed very efficiently by max flow algorithms. These methods have been successfully used for a wide variety of vision problems including image restoration [7, 8, 12, 14], stereo and motion [4, 7, 8, 13, 16, 20, 21], voxel occupancy [23], multicamera scene reconstruction [18] and medical imaging [5, 6, 15]. The output of these algorithms is generally a solution with some interesting theoretical quality guarantee...
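The abstract of this entry does not state the characterization itself. In the full paper, the condition for a pairwise term of two binary variables is usually written as the regularity inequality E(0,0) + E(1,1) ≤ E(0,1) + E(1,0). The sketch below checks that inequality for 2x2 cost tables; the table layout and the names `is_regular` and `all_regular` are hypothetical, and the check covers only pairwise terms, not the three-variable case the abstract also mentions.

```python
from typing import Iterable, Sequence


def is_regular(term: Sequence[Sequence[float]]) -> bool:
    """Check E(0,0) + E(1,1) <= E(0,1) + E(1,0) for one pairwise binary term.

    term[a][b] is the cost of assigning the two variables the values a and b.
    """
    return term[0][0] + term[1][1] <= term[0][1] + term[1][0]


def all_regular(terms: Iterable[Sequence[Sequence[float]]]) -> bool:
    """True when every pairwise term in the collection passes the check."""
    return all(is_regular(t) for t in terms)
```

For example, a Potts-style term `[[0, 1], [1, 0]]` passes the check, while a term that rewards disagreement, such as `[[1, 0], [0, 1]]`, does not.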

Stereo matching using belief propagation

by Jian Sun, Nan-Ning Zheng, Heung-Yeung Shum, 2003
Cited by 350 (4 self)
In this paper, we formulate the stereo matching problem as a Markov network and solve it using Bayesian belief propagation. The stereo Markov network consists of three coupled Markov random fields that model the following: a smooth field for depth/disparity, a line process for depth discontinuity, and a binary process for occlusion. After eliminating the line process and the binary process by introducing two robust functions, we apply the belief propagation algorithm to obtain the maximum a posteriori (MAP) estimation in the Markov network. Other low-level visual cues (e.g., image segmentation) can also be easily incorporated in our stereo model to obtain better stereo results. Experiments demonstrate that our methods are comparable to the state-of-the-art stereo algorithms for many test cases.

Citation Context

...rporating image segmentation improves stereo matching significantly, with 40% error reduction in $B_{\bar{O}}$. In fact, our algorithm ranks as the best for “Tsukuba” and outperforms Graph Cut (with occlusion) [17] which was widely considered the state-of-art stereo matching algorithm. Our algorithm competes well with other stereo algorithms for the three other data sets, “Sawtooth”, “Venus” and “Map”. It is in...

Multi-camera Scene Reconstruction via Graph Cuts

by Vladimir Kolmogorov, Ramin Zabih - in European Conference on Computer Vision, 2002
Cited by 317 (9 self)
We address the problem of computing the 3-dimensional shape of an arbitrary scene from a set of images taken at known viewpoints.

Citation Context

...nerally viewed as too slow for early vision to be practical. Our approach is motivated by some recent work in stereo, where fast energy minimization algorithms have been developed based on graph cuts [6, 7, 12, 14, 20, 21]. These methods give strong experimental results in practice; for example, two recent evaluations of stereo algorithms (using real imagery with ground truth) found that a graph cut method gave the bes...

Advances in computational stereo

by Myron Z. Brown, Darius Burschka, Gregory D. Hager - IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
Cited by 224 (3 self)
Extraction of three-dimensional structure of a scene from stereo images is a problem that has been studied by the computer vision community for decades. Early work focused on the fundamentals of image correspondence and stereo geometry. Stereo research has matured significantly throughout the years and many advances in computational stereo continue to be made, allowing stereo to be applied to new and more demanding problems. In this paper, we review recent advances in computational stereo, focusing primarily on three important topics: correspondence methods, methods for occlusion, and real-time implementations. Throughout, we present tables that summarize and draw distinctions among key ideas and approaches. Where available, we provide comparative analyses and we make suggestions for analyses yet to be done. Index Terms: Computational stereo, stereo correspondence, occlusion, real-time stereo, review.

Citation Context

...mage, ground truth, Muhlmann et al.’s area correlation algorithm [57], dynamic programming (similar to Intille and Bobick [36]), Roy and Cox’s maximum flow [65], and Kolmogorov and Zabih’s graph cuts [45]. Table 2: Common Block-Matching Methods (see Fig. 4 for visual description of terms): cross correlation, ...

Stereo Processing by Semi-Global Matching and Mutual Information

by Heiko Hirschmüller - IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007
Cited by 218 (1 self)
This paper describes the Semi-Global Matching (SGM) stereo method. It uses a pixelwise, Mutual Information based matching cost for compensating radiometric differences of input images. Pixelwise matching is supported by a smoothness constraint that is usually expressed as a global cost function. SGM performs a fast approximation by pathwise optimizations from all directions. The discussion also addresses occlusion detection, subpixel refinement and multi-baseline matching. Additionally, postprocessing steps for removing outliers, recovering from specific problems of structured environments and the interpolation of gaps are presented. Finally, strategies for processing almost arbitrarily large images and fusion of disparity images using orthographic projection are proposed. A comparison on standard stereo images shows that SGM is among the currently top-ranked algorithms and is best, if subpixel accuracy is considered. The complexity is linear to the number of pixels and disparity range, which results in a runtime of just 1-2s on typical test images. An in depth evaluation of the Mutual Information based matching cost demonstrates a tolerance against a wide range of radiometric transformations. Finally, examples of reconstructions from huge aerial frame and pushbroom images demonstrate that the presented ideas are working well on practical problems.
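The "pathwise optimizations" in the abstract above can be sketched for a single path direction as follows. The recurrence uses a small penalty `p1` for disparity changes of one and a larger penalty `p2` for bigger jumps, as commonly given for SGM in the full paper; the abstract does not spell it out, so the parameter names, the cost table layout, and the function `aggregate_path` should be read as assumptions for this example.

```python
from typing import List, Sequence


def aggregate_path(cost: Sequence[Sequence[float]], p1: float, p2: float) -> List[List[float]]:
    """Aggregate matching costs along one 1D path (e.g. left to right).

    cost[x][d] is a hypothetical pixelwise matching cost at path position x
    and disparity d. Returns the aggregated path cost for every (x, d).
    """
    k = len(cost[0])
    agg = [list(cost[0])]                          # first pixel: no predecessor
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(k):
            candidates = [prev[d], prev_min + p2]  # same disparity, or any jump
            if d > 0:
                candidates.append(prev[d - 1] + p1)    # disparity change of -1
            if d + 1 < k:
                candidates.append(prev[d + 1] + p1)    # disparity change of +1
            # Subtracting prev_min keeps values bounded without changing the argmin.
            row.append(cost[x][d] + min(candidates) - prev_min)
        agg.append(row)
    return agg
```

In a full implementation, costs aggregated along several path directions would be summed per pixel and disparity before taking the minimum. Each path costs work proportional to pixels times disparities, consistent with the linear complexity claimed in the abstract.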

Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite

by Andreas Geiger, Philip Lenz, Raquel Urtasun
Cited by 174 (10 self)
Today, visual recognition systems are still rarely employed in robotics applications. Perhaps one of the main reasons for this is the lack of demanding benchmarks that mimic such scenarios. In this paper, we take advantage of our autonomous driving platform to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry / SLAM and 3D object detection. Our recording platform is equipped with four high resolution video cameras, a Velodyne laser scanner and a state-of-the-art localization system. Our benchmarks comprise 389 stereo and optical flow image pairs, stereo visual odometry sequences of 39.2 km length, and more than 200k 3D object annotations captured in cluttered scenarios (up to 15 cars and 30 pedestrians are visible per image). Results from state-of-the-art algorithms reveal that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world. Our goal is to reduce this bias by providing challenging benchmarks with novel difficulties to the computer vision community. Our benchmarks are available online at: www.cvlibs.net/datasets/kitti

Citation Context

...d that algorithms ranking high on existing benchmarks often fail when confronted with more realistic scenarios. This section tells their story. 3.1. Stereo Matching For stereo matching, we run global [26, 37, 46], semiglobal [23], local [5, 20, 38] and seed-growing [27, 10, 9] methods. The parameter settings we have employed can be found on www.cvlibs.net/datasets/kitti. Missing disparities are filled-in for ...

Segment-Based Stereo Matching Using Belief Propagation and a Self-Adapting Dissimilarity Measure

by Andreas Klaus, et al.
Cited by 171 (0 self)
A novel stereo matching algorithm is proposed that utilizes color segmentation on the reference image and a self-adapting matching score that maximizes the number of reliable correspondences. The scene structure is modeled by a set of planar surface patches which are estimated using a new technique that is more robust to outliers. Instead of assigning a disparity value to each pixel, a disparity plane is assigned to each segment. The optimal disparity plane labeling is approximated by applying belief propagation. Experimental results using the Middlebury stereo test bed demonstrate the superior performance of the proposed method.