Results 21 - 30
of
206
Robust Rotation and Translation Estimation in Multiview Reconstruction
"... It is known that the problem of multiview reconstruction can be solved in two steps: first estimate camera rotations and then translations using them. This paper presents new robust techniques for both of these steps. (i) Given pairwise relative rotations, global camera rotations are estimated linea ..."
Abstract
-
Cited by 24 (4 self)
- Add to MetaCart
It is known that the problem of multiview reconstruction can be solved in two steps: first estimate camera rotations and then translations using them. This paper presents new robust techniques for both of these steps. (i) Given pairwise relative rotations, global camera rotations are estimated linearly in least squares. (ii) Camera translations are estimated using a standard technique based on Second Order Cone Programming. Robustness is achieved by using only a subset of points according to a new criterion that diminishes the risk of chosing a mismatch. It is shown that only four points chosen in a special way are sufficient to represent a pairwise reconstruction almost equally as all points. This leads to a significant speedup. In image sets with repetitive or similar structures, non-existent epipolar geometries may be found. Due to them, some rotations and consequently translations may be estimated incorrectly. It is shown that iterative removal of pairwise reconstructions with the largest residual and reregistration removes most non-existent epipolar geometries. The performance of the proposed method is demonstrated on difficult wide base-line image sets. 1.
Interactive 3d architectural modeling from unordered photo collections
- Proc. of SIGGRAPH Asia 2008
, 2008
"... Figure 1: Our interactive image-based modeling system provides an intuitive sketch-based interface for reconstructing a photorealistic textured piecewise planar 3D model of a building or architectural scene from an unordered collection of photographs. We present an interactive system for generating ..."
Abstract
-
Cited by 22 (3 self)
- Add to MetaCart
Figure 1: Our interactive image-based modeling system provides an intuitive sketch-based interface for reconstructing a photorealistic textured piecewise planar 3D model of a building or architectural scene from an unordered collection of photographs. We present an interactive system for generating photorealistic, textured, piecewise-planar 3D models of architectural structures and urban scenes from unordered sets of photographs. To reconstruct 3D geometry in our system, the user draws outlines overlaid on 2D photographs. The 3D structure is then automatically computed by combining the 2D interaction with the multi-view geometric information recovered by performing structure from motion analysis on the input photographs. We utilize vanishing point constraints at multiple stages during the reconstruction, which is particularly useful for architectural scenes where parallel lines are abundant. Our approach enables us to accurately model polygonal faces from 2D interactions in a single image. Our system also supports useful operations such as edge snapping and extrusions. Seamless texture maps are automatically generated by combining multiple input photographs using graph cut optimization and Poisson blending. The user can add brush strokes as hints during the texture generation stage to remove artifacts caused by unmodeled geometric structures. We build models for a variety of architectural scenes from collections of up to about a hundred photographs. CR Categories: I.3.7 [Computer Graphics]: Three-dimensional graphics and realism, image-based modeling, texture mapping—
Perception-motivated interpolation of image sequences
-
, 2011
"... We present a method for image interpolation that is able to create high-quality, perceptually convincing transitions between recorded images. By implementing concepts derived from human vision, the problem of a physically correct image interpolation is relaxed to that of image interpolation which is ..."
Abstract
-
Cited by 19 (11 self)
- Add to MetaCart
We present a method for image interpolation that is able to create high-quality, perceptually convincing transitions between recorded images. By implementing concepts derived from human vision, the problem of a physically correct image interpolation is relaxed to that of image interpolation which is perceived as visually correct by human observers. We find that it suffices to focus on exact edge correspondences, homogeneous regions and coherent motion to compute convincing results. A user study confirms the visual quality of the proposed image interpolation approach. We show how each aspect of our approach increases perceived quality of the result. We compare the results to other methods and assess achievable quality for different types of scenes.
From structure-from-motion point clouds to fast location recognition
- In IEEE Conference on Computer Vision and Pattern Recognition (CVPR
, 2009
"... Efficient view registration with respect to a given 3D reconstruction has many applications like inside-out tracking in indoor and outdoor environments, and geo-locating images from large photo collections. We present a fast location recognition technique based on structure from motion point clouds. ..."
Abstract
-
Cited by 18 (4 self)
- Add to MetaCart
Efficient view registration with respect to a given 3D reconstruction has many applications like inside-out tracking in indoor and outdoor environments, and geo-locating images from large photo collections. We present a fast location recognition technique based on structure from motion point clouds. Vocabulary tree-based indexing of features directly returns relevant fragments of 3D models instead of documents from the images database. Additionally, we propose a compressed 3D scene representation which improves recognition rates while simultaneously reducing the computation time and the memory consumption. The design of our method is based on algorithms that efficiently utilize modern graphics processing units to deliver real-time performance for view registration. We demonstrate the approach by matching hand-held outdoor videos to known 3D urban models, and by registering images from online photo collections to the corresponding landmarks. 1.
Building Rome on a Cloudless Day
"... Abstract. This paper introduces an approach for dense 3D reconstruction from unregistered Internet-scale photo collections with about 3 million images within the span of a day on a single PC (“cloudless”). Our method advances image clustering, stereo, stereo fusion and structure from motion to achie ..."
Abstract
-
Cited by 18 (4 self)
- Add to MetaCart
Abstract. This paper introduces an approach for dense 3D reconstruction from unregistered Internet-scale photo collections with about 3 million images within the span of a day on a single PC (“cloudless”). Our method advances image clustering, stereo, stereo fusion and structure from motion to achieve high computational performance. We leverage geometric and appearance constraints to obtain a highly parallel implementation on modern graphics processors and multi-core architectures. This leads to two orders of magnitude higher performance on an order of magnitude larger dataset than competing state-of-the-art approaches. 1
Online Metric Learning and Fast Similarity Search
"... Metric learning algorithms can provide useful distance functions for a variety of domains, and recent work has shown good accuracy for problems where the learner can access all distance constraints at once. However, in many real applications, constraints are only available incrementally, thus necess ..."
Abstract
-
Cited by 17 (1 self)
- Add to MetaCart
Metric learning algorithms can provide useful distance functions for a variety of domains, and recent work has shown good accuracy for problems where the learner can access all distance constraints at once. However, in many real applications, constraints are only available incrementally, thus necessitating methods that can perform online updates to the learned metric. Existing online algorithms offer bounds on worst-case performance, but typically do not perform well in practice as compared to their offline counterparts. We present a new online metric learning algorithm that updates a learned Mahalanobis metric based on LogDet regularization and gradient descent. We prove theoretical worst-case performance bounds, and empirically compare the proposed method against existing online metric learning algorithms. To further boost the practicality of our approach, we develop an online locality-sensitive hashing scheme which leads to efficient updates to data structures used for fast approximate similarity search. We demonstrate our algorithm on multiple datasets and show that it outperforms relevant baselines. 1
Rgbd mapping: Using depth cameras for dense 3d modeling of indoor environments
- In RGB-D: Advanced Reasoning with Depth Cameras Workshop in conjunction with RSS
, 2010
"... Abstract RGB-D cameras are novel sensing systems that capture RGB images along with per-pixel depth information. In this paper we investigate how such cameras can be used in the context of robotics, specifically for building dense 3D maps of indoor environments. Such maps have applications in robot ..."
Abstract
-
Cited by 16 (5 self)
- Add to MetaCart
Abstract RGB-D cameras are novel sensing systems that capture RGB images along with per-pixel depth information. In this paper we investigate how such cameras can be used in the context of robotics, specifically for building dense 3D maps of indoor environments. Such maps have applications in robot navigation, manipulation, semantic mapping, and telepresence. We present RGB-D Mapping, a full 3D mapping system that utilizes a novel joint optimization algorithm combining visual features and shape-based alignment. Visual and depth information are also combined for view-based loop closure detection, followed by pose optimization to achieve globally consistent maps. We evaluate RGB-D Mapping on two large indoor environments, and show that it effectively combines the visual and shape information available from RGB-D cameras. 1
Skeletal graphs for efficient structure from motion
"... We address the problem of efficient structure from motion for large, unordered, highly redundant, and irregularly sampled photo collections, such as those found on Internet photo-sharing sites. Our approach computes a small skeletal subset of images, reconstructs the skeletal set, and adds the remai ..."
Abstract
-
Cited by 15 (3 self)
- Add to MetaCart
We address the problem of efficient structure from motion for large, unordered, highly redundant, and irregularly sampled photo collections, such as those found on Internet photo-sharing sites. Our approach computes a small skeletal subset of images, reconstructs the skeletal set, and adds the remaining images using pose estimation. Our technique drastically reduces the number of parameters that are considered, resulting in dramatic speedups, while provably approximating the covariance of the full set of parameters. To compute a skeletal image set, we first estimate the accuracy of two-frame reconstructions between pairs of overlapping images, then use a graph algorithm to select a subset of images that, when reconstructed, approximates the accuracy of the full set. A final bundle adjustment can then optionally be used to restore any loss of accuracy. 1.
Out-of-Core Bundle Adjustment for Large-Scale 3D Reconstruction
"... Large-scale 3D reconstruction has recently received much attention from the computer vision community. Bundle adjustment is a key component of 3D reconstruction problems. However, traditional bundle adjustment algorithms require a considerable amount of memory and computational resources. In this pa ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
Large-scale 3D reconstruction has recently received much attention from the computer vision community. Bundle adjustment is a key component of 3D reconstruction problems. However, traditional bundle adjustment algorithms require a considerable amount of memory and computational resources. In this paper, we present an extremely efficient, inherently out-of-core bundle adjustment algorithm. We decouple the original problem into several submaps that have their own local coordinate systems and can be optimized in parallel. A key contribution to our algorithm is making as much progress towards optimizing the global non-linear cost function as possible using the fragments of the reconstruction that are currently in core memory. This allows us to converge with very few global sweeps (often only two) through the entire reconstruction. We present experimental results on large-scale 3D reconstruction datasets, both synthetic and real. 1.
Tour the world: Building a web-scale landmark recognition engine
- in: IEEE Conference on Computer Vision and Pattern Recognition, Electronic Proceedings
, 2009
"... Modeling and recognizing landmarks at world-scale is a useful yet challenging task. There exists no readily available list of worldwide landmarks. Obtaining reliable visual models for each landmark can also pose problems, and efficiency is another challenge for such a large scale system. This paper ..."
Abstract
-
Cited by 14 (0 self)
- Add to MetaCart
Modeling and recognizing landmarks at world-scale is a useful yet challenging task. There exists no readily available list of worldwide landmarks. Obtaining reliable visual models for each landmark can also pose problems, and efficiency is another challenge for such a large scale system. This paper leverages the vast amount of multimedia data on the web, the availability of an Internet image search engine, and advances in object recognition and clustering techniques, to address these issues. First, a comprehensive list of landmarks is mined from two sources: (1) ∼20 million GPS-tagged photos and (2) online tour guide web pages. Candidate images for each landmark are then obtained from photo sharing websites or by querying an image search engine. Second, landmark visual models are built by pruning candidate images using efficient image matching and unsupervised clustering techniques. Finally, the landmarks and their visual models are validated by checking authorship of their member images. The resulting landmark recognition engine incorporates 5312 landmarks from 1259 cities in 144 countries. The experiments demonstrate that the engine can deliver satisfactory recognition performance with high efficiency. 1.

