Results 1 - 10 of 47
Multi-View Stereo for Community Photo Collections
"... We present a multi-view stereo algorithm that addresses the extreme changes in lighting, scale, clutter, and other effects in large online community photo collections. Our idea is to intelligently choose images to match, both at a per-view and per-pixel level. We show that such adaptive view selecti ..."
Cited by 187 (23 self)
We present a multi-view stereo algorithm that addresses the extreme changes in lighting, scale, clutter, and other effects in large online community photo collections. Our idea is to intelligently choose images to match, both at a per-view and per-pixel level. We show that such adaptive view selection enables robust performance even with dramatic appearance variability. The stereo matching technique takes as input sparse 3D points reconstructed from structure-from-motion methods and iteratively grows surfaces from these points. Optimizing for surface normals within a photoconsistency measure significantly improves the matching results. While the focus of our approach is to estimate high-quality depth maps, we also show examples of merging the resulting depth maps into compelling scene reconstructions. We demonstrate our algorithm on standard multi-view stereo datasets and on casually acquired photo collections of famous scenes gathered from the Internet.
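The per-view selection idea described in this abstract can be sketched as ranking candidate images by how many sparse SfM points they share with the reference view and keeping the top few as stereo neighbours. This is a hypothetical simplification (the paper's actual criteria also weigh viewing direction and scale); all function and variable names below are illustrative, not the paper's API:

```python
# Hypothetical sketch of per-view selection: rank candidate images by the
# number of sparse SfM points they share with the reference view.
# (The full method also considers viewing angle and scale; omitted here.)

def select_views(reference_points, candidate_views, k=4):
    """reference_points: set of 3D point ids observed by the reference view.
    candidate_views: dict mapping view id -> set of point ids it observes.
    Returns the k view ids sharing the most points with the reference."""
    scored = sorted(
        candidate_views.items(),
        key=lambda item: len(reference_points & item[1]),
        reverse=True,
    )
    return [view_id for view_id, _ in scored[:k]]

ref = {1, 2, 3, 4, 5}
cands = {"A": {1, 2, 3}, "B": {9}, "C": {2, 3, 4, 5}}
print(select_views(ref, cands, k=2))  # ['C', 'A']
```

The shared-point count is a cheap proxy for overlap that needs only the structure-from-motion output the abstract mentions as input.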
Using multiple hypotheses to improve depth-maps for multi-view stereo
In Proc. 10th European Conference on Computer Vision (ECCV), 2008
"... Abstract. We propose an algorithm to improve the quality of depth-maps used for Multi-View Stereo (MVS). Many existing MVS techniques make use of a two stage approach which estimates depth-maps from neighbouring images and then merges them to extract a final surface. Often the depth-maps used for th ..."
Cited by 56 (5 self)
Abstract. We propose an algorithm to improve the quality of depth-maps used for Multi-View Stereo (MVS). Many existing MVS techniques make use of a two stage approach which estimates depth-maps from neighbouring images and then merges them to extract a final surface. Often the depth-maps used for the merging stage will contain outliers due to errors in the matching process. Traditional systems exploit redundancy in the image sequence (the surface is seen in many views), in order to make the final surface estimate robust to these outliers. In the case of sparse data sets there is often insufficient redundancy and thus performance degrades as the number of images decreases. In order to improve performance in these circumstances it is necessary to remove the outliers from the depth-maps. We identify the two main sources of outliers in a top performing algorithm: (1) spurious matches due to repeated texture and (2) matching failure due to occlusion, distortion and lack of texture. We propose two contributions to tackle these failure modes. Firstly, we store multiple depth hypotheses and use a spatial consistency constraint to extract the true depth. Secondly, we allow the algorithm to return an unknown state when a true depth estimate cannot be found. By combining these in a discrete label MRF optimisation we are able to obtain high accuracy depth-maps with low numbers of outliers. We evaluate our algorithm in a multi-view stereo framework and find that it achieves state-of-the-art performance among the leading techniques, in particular on the standard evaluation sparse data sets.
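The two contributions above, multiple depth hypotheses plus an explicit unknown state, can be illustrated with a greedy per-pixel stand-in. Note this is only a sketch: the paper combines these ideas in a discrete-label MRF optimisation, whereas the toy function below picks, per pixel, the hypothesis best supported by neighbouring depths and returns `None` (unknown) when no hypothesis has enough support:

```python
# Toy version of hypothesis selection with an explicit unknown state.
# Real method: a discrete-label MRF over the whole depth-map; this is a
# greedy per-pixel approximation for illustration only.

def pick_depth(hypotheses, neighbour_depths, tol=0.05, min_support=2):
    """hypotheses: candidate depths stored for one pixel.
    neighbour_depths: depths already chosen at neighbouring pixels.
    Returns the hypothesis agreeing (within tol) with at least
    min_support neighbours, or None for the 'unknown' state."""
    best, best_support = None, 0
    for d in hypotheses:
        support = sum(abs(d - n) <= tol for n in neighbour_depths)
        if support > best_support:
            best, best_support = d, support
    return best if best_support >= min_support else None

print(pick_depth([1.00, 2.50], [0.98, 1.02, 3.70]))  # 1.0 (two neighbours agree)
print(pick_depth([1.00, 2.50], [0.50, 3.70, 5.00]))  # None -> unknown state
```

Returning an unknown state instead of a forced guess is what keeps outliers out of the merged surface.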
Accurate multiview reconstruction using robust binocular stereo and surface meshing
In Proc. of CVPR, 2008
"... This paper presents a new algorithm for multi-view reconstruction that demonstrates both accuracy and efficiency. Our method is based on robust binocular stereo matching, followed by adaptive point-based filtering of the merged point clouds, and efficient, high-quality mesh generation. All aspects o ..."
Cited by 50 (9 self)
This paper presents a new algorithm for multi-view reconstruction that demonstrates both accuracy and efficiency. Our method is based on robust binocular stereo matching, followed by adaptive point-based filtering of the merged point clouds, and efficient, high-quality mesh generation. All aspects of our method are designed to be highly scalable with the number of views. Our technique produces the most accurate results among current algorithms for a sparse number of viewpoints according to the Middlebury datasets. Additionally, ours proves to be the most efficient method among non-GPU algorithms on the same datasets. Finally, our scaled-window matching technique also excels at reconstructing deformable objects with high-curvature surfaces, which we demonstrate with a number of examples.
Manipulator and object tracking for in-hand 3D object modeling
Int’l Journal of Robotics Research (IJRR), 2011
"... Recognizing and manipulating objects is an important task for mobile robots performing useful services in everyday environments. While existing techniques for object recognition related to manipulation provide very good results even for noisy and incomplete data, they are typically trained using dat ..."
Cited by 29 (4 self)
Recognizing and manipulating objects is an important task for mobile robots performing useful services in everyday environments. While existing techniques for object recognition related to manipulation provide very good results even for noisy and incomplete data, they are typically trained using data generated in an offline process. As a result, they do not enable a robot to acquire new object models as it operates in an environment. In this paper, we develop an approach to building 3D models of unknown objects based on a depth camera observing the robot’s hand while moving an object. The approach integrates both shape and appearance information into an articulated ICP approach to track the robot’s manipulator and the object. Objects are modeled by sets of surfels, which are small patches providing occlusion and appearance information. Experiments show that our approach provides very good 3D models even when the object is highly symmetric, lacks visual features, and the manipulator motion is noisy. Autonomous object modeling represents a step toward improved semantic understanding, which will eventually enable robots to reason about their environments in terms of objects and their relations rather than through raw sensor data.
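The ICP tracking at the core of this approach builds on a standard rigid alignment step. The sketch below is not the paper's articulated, shape-plus-appearance ICP; it is only the minimal point-to-point building block, estimating a rotation and translation from known 1:1 correspondences via the SVD of the cross-covariance (the Kabsch method). A full tracker would re-estimate correspondences (e.g. nearest neighbours) at every iteration:

```python
import numpy as np

def icp_step(source, target):
    """One point-to-point ICP iteration with known 1:1 correspondences:
    find R, t minimising sum ||R @ s_i + t - t_i||^2 via SVD of the
    cross-covariance matrix (Kabsch). source, target: (N, 3) arrays."""
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    H = (source - mu_s).T @ (target - mu_t)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                  # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_t - R @ mu_s
    return R, t

# Recover a known 30-degree rotation about z plus a translation.
src = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
theta = np.pi / 6
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0., 0, 1]])
t_true = np.array([0.1, -0.2, 0.3])
tgt = src @ R_true.T + t_true
R, t = icp_step(src, tgt)
print(np.allclose(R, R_true), np.allclose(t, t_true))  # True True
```

The articulated version in the paper chains several such rigid estimates across the manipulator's joints and adds appearance terms; the rigid step above is the common denominator.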
Anisotropic Minimal Surfaces Integrating Photoconsistency and Normal Information for Multiview Stereo
In European Conference on Computer Vision, 2010
"... Abstract. In this work the weighted minimal surface model traditionally used in multiview stereo is revisited. We propose to generalize the classical photoconsistency-weighted minimal surface approach by means of an anisotropic metric which allows to integrate a specified surface orientation into th ..."
Cited by 16 (4 self)
Abstract. In this work the weighted minimal surface model traditionally used in multiview stereo is revisited. We propose to generalize the classical photoconsistency-weighted minimal surface approach by means of an anisotropic metric which allows us to integrate a specified surface orientation into the optimization process. In contrast to the conventional isotropic case, where all spatial directions are treated equally, the anisotropic metric adaptively weights the regularization along different directions so as to favor certain surface orientations over others. We show that the proposed generalization preserves all properties and global optimality guarantees of continuous convex relaxation methods. We make use of a recently introduced efficient primal-dual algorithm to solve the arising saddle point problem. In multiple experiments on real image sequences we demonstrate that the proposed anisotropic generalization allows us to overcome oversmoothing of small-scale surface details, giving rise to more precise reconstructions.
Continuous depth estimation for multi-view stereo
In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009
"... Depth-map merging approaches have become more and more popular in multi-view stereo (MVS) because of their flexibility and superior performance. The quality of depth map used for merging is vital for accurate 3D reconstruction. While traditional depth map estimation has been performed in a discrete ..."
Cited by 12 (2 self)
Depth-map merging approaches have become more and more popular in multi-view stereo (MVS) because of their flexibility and superior performance. The quality of the depth maps used for merging is vital for accurate 3D reconstruction. While traditional depth map estimation has been performed in a discrete manner, we suggest the use of a continuous counterpart. In this paper, we first integrate silhouette information and the epipolar constraint into the variational method for continuous depth map estimation. Then, several depth candidates are generated based on a multiple starting scales (MSS) framework. From these candidates, refined depth maps for each view are synthesized according to a patch-based NCC (normalized cross-correlation) metric. Finally, the multi-view depth maps are merged to produce 3D models. Our algorithm excels at detail capture and produces one of the most accurate results among current algorithms for sparse MVS datasets according to the Middlebury benchmark. Additionally, our approach shows outstanding robustness and accuracy in free-viewpoint video scenarios.
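The NCC metric used to score depth candidates here is standard: subtract each patch's mean, then take the cosine of the angle between the centred patches. This makes the score invariant to affine intensity changes (gain and bias), which is why it is the usual photoconsistency measure across views with different exposure. A minimal implementation:

```python
import numpy as np

def ncc(a, b, eps=1e-8):
    """Normalized cross-correlation of two equally sized patches.
    Result lies in [-1, 1]; invariant to per-patch gain and bias."""
    a = a.astype(float).ravel()
    b = b.astype(float).ravel()
    a = a - a.mean()                      # remove bias
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / (denom + eps))   # eps guards flat patches

p = np.array([[1., 2.], [3., 4.]])
print(round(ncc(p, 2 * p + 5), 6))  # 1.0  -> identical up to gain/bias
print(round(ncc(p, -p), 6))         # -1.0 -> anti-correlated
```

In an MVS pipeline this score is evaluated over patches reprojected into neighbouring views at each candidate depth, and the depth maximising the score (or its aggregate over views) is kept.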
Interactive pixel-accurate free viewpoint rendering from images with silhouette-aware sampling
Computer Graphics Forum
"... We present an integrated, fully GPU-based processing pipeline to interactively render new views of arbitrary scenes from calibrated but otherwise unstructured input views. In a two-step procedure, our method first generates for each input view a dense proxy of the scene using a new multi-view stereo ..."
Cited by 12 (0 self)
We present an integrated, fully GPU-based processing pipeline to interactively render new views of arbitrary scenes from calibrated but otherwise unstructured input views. In a two-step procedure, our method first generates for each input view a dense proxy of the scene using a new multi-view stereo formulation. Each scene proxy consists of a structured cloud of feature-aware particles which automatically have their image space footprints aligned to depth discontinuities of the scene geometry and hence effectively handle sharp object boundaries and occlusions. We propose a particle optimization routine combined with a special parameterization of the view space which enables an efficient proxy generation as well as robust and intuitive filter operators for noise and outlier removal. Moreover, our generic proxy generation allows us to flexibly handle scene complexities ranging from small objects up to complete outdoor scenes. The second phase of the algorithm combines these particle clouds in real-time into a view-dependent proxy for the desired output view and performs a pixel-accurate accumulation of the color contributions from each available input view. This makes it possible to reconstruct even fine-scale view-dependent illumination effects. We demonstrate how all these processing stages of the pipeline can be implemented entirely on the GPU with memory efficient, scalable data structures for maximum performance. This allows us to generate new output renderings of high visual quality from input images in real-time.
Incremental free-space carving for real-time 3D reconstruction
2010
"... Almost all current multi-view methods are slow, and thus suited to offline reconstruction. This paper presents a set of heuristic space-carving algorithms with a focus on speed over detail. The algorithms discretize space via the 3D Delaunay triangulation, and they carve away the volumes that violat ..."
Cited by 9 (0 self)
Almost all current multi-view methods are slow, and thus suited to offline reconstruction. This paper presents a set of heuristic space-carving algorithms with a focus on speed over detail. The algorithms discretize space via the 3D Delaunay triangulation, and they carve away the volumes that violate free-space or visibility constraints. While similar methods exist, our algorithms are fast and fully incremental. They encompass a dynamic event-driven approach to reconstruction that is suitable for integration with online SLAM or Structure-from-Motion. We integrate our algorithms with PTAM [12], and we realize a complete system that reconstructs 3D geometry from video in real-time. Experiments on typical real-world inputs demonstrate online performance with modest hardware. We provide run-time complexity analysis and show that the per-event processing time is independent of the number of images previously processed: a requirement for real-time operation on lengthy image sequences.
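The free-space constraint the abstract relies on is simple: any volume crossed by the line of sight from a camera to a reconstructed feature must be empty. The sketch below illustrates that idea on a voxel grid rather than the paper's 3D Delaunay triangulation, purely to keep the example short; the segment is sampled and every crossed voxel is carved, stopping just short of the feature, which lies on a surface:

```python
import numpy as np

# Simplified free-space carving on a voxel grid (the paper carves Delaunay
# tetrahedra instead; a grid keeps this sketch short). Any voxel crossed by
# a camera-to-feature ray is observed empty and carved away.

def carve(grid, camera, point, steps=200):
    """Mark voxels along the segment camera -> point as free (0).
    Stops at 98% of the segment: the feature itself lies on a surface."""
    for s in np.linspace(0.0, 0.98, steps):
        x, y, z = np.floor(camera + s * (point - camera)).astype(int)
        if (0 <= x < grid.shape[0] and 0 <= y < grid.shape[1]
                and 0 <= z < grid.shape[2]):
            grid[x, y, z] = 0

grid = np.ones((8, 8, 8), dtype=np.uint8)  # 1 = potentially occupied
carve(grid, np.array([0., 0., 0.]), np.array([7., 7., 7.]))
print(grid[0, 0, 0], grid[3, 3, 3])  # 0 0  (carved along the ray)
print(grid[7, 7, 7], grid[0, 7, 0])  # 1 1  (surface voxel / never observed)
```

The incremental aspect of the paper is that each new camera-point observation triggers only local carving events, so per-event cost stays constant regardless of sequence length.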
Manipulator and Object Tracking for In Hand Model Acquisition
"... Abstract — Recognizing and manipulating objects is an important task for mobile robots performing useful services in everyday environments. While existing techniques for object recognition related to manipulation provide very good results even for noisy and incomplete data, they are typically traine ..."
Cited by 9 (0 self)
Abstract — Recognizing and manipulating objects is an important task for mobile robots performing useful services in everyday environments. While existing techniques for object recognition related to manipulation provide very good results even for noisy and incomplete data, they are typically trained using data generated in an offline process. As a result, they do not enable a robot to acquire new object models as it operates in an environment. In this paper, we develop an approach to building 3D models of unknown objects based on a depth camera observing the robot’s hand while moving an object. The approach integrates both shape and appearance information into an articulated ICP approach to track the robot’s manipulator and the object. Objects are modeled by sets of surfels, which are small patches providing occlusion and appearance information. Experiments show that our approach provides very good 3D models even when the object is highly symmetric, lacks visual features, and the manipulator motion is noisy.