Results 1 - 7 of 7
Camera Self-Calibration for Sequential Bayesian Structure From Motion
Cited by 3 (1 self)
Abstract. Computer vision researchers have proved the feasibility of camera self-calibration, the estimation of a camera’s internal parameters from an image sequence without any known scene structure. Various self-calibration algorithms have been published. Nevertheless, all of the recent sequential approaches to 3D structure and motion estimation from image sequences which have arisen in robotics and aim at real-time operation (often classed as visual SLAM or visual odometry) have relied on pre-calibrated cameras and have not attempted online calibration. In this paper, we present a sequential filtering algorithm for simultaneous estimation of 3D scene structure, camera trajectory and full camera calibration from an image sequence captured by a camera of fixed but unknown calibration. This calibration comprises the standard projective parameters of focal length and principal point along with two radial distortion coefficients. To deal with the large non-linearities introduced by the unknown calibration parameters, we use a Sum of Gaussians (SOG) filter rather than the simpler Extended Kalman Filter (EKF). To our knowledge, this is the first sequential Bayesian autocalibration algorithm which achieves complete fixed camera calibration using as input only a sequence of uncalibrated monocular images. The approach is validated with experimental results using natural images, including a demonstration of loop closing for a sequence with unknown camera calibration.
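To make the calibration parameters named in this abstract concrete, here is a minimal sketch of a pinhole projection with a focal length, a principal point and two radial distortion coefficients. The polynomial form of the distortion and the names `project`, `f`, `cx`, `cy`, `k1`, `k2` are assumptions chosen for illustration; the abstract does not say which distortion model the paper uses.

```python
def project(point_cam, f, cx, cy, k1, k2):
    """Project a 3D point given in the camera frame to pixel coordinates.

    Assumed model (not taken from the paper): ideal pinhole projection,
    followed by a two-coefficient polynomial radial distortion applied
    to the normalized image coordinates.
    """
    X, Y, Z = point_cam
    x, y = X / Z, Y / Z                # normalized image coordinates
    r2 = x * x + y * y                 # squared radius from the optical axis
    d = 1.0 + k1 * r2 + k2 * r2 * r2   # radial distortion factor
    return (f * d * x + cx, f * d * y + cy)
```

A point on the optical axis always lands on the principal point whatever the distortion coefficients are, which is one reason these parameters are hard to observe from points near the image centre alone.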
1-Point RANSAC for EKF-Based Structure from Motion
Cited by 2 (1 self)
(SfM) techniques have been combined with non-linear global optimization (Bundle Adjustment, BA) over a sliding window to recursively provide camera pose and feature location estimation from long image sequences. Normally called Visual Odometry, these algorithms are nowadays able to estimate trajectories of hundreds of meters with impressive accuracy, either from an image sequence (usually stereo) as the only input, or by combining visual and proprioceptive information from inertial sensors or wheel odometry. This paper has a double objective. First, we aim to illustrate for the first time how similar accuracy and trajectory length can be achieved by filtering-based visual SLAM methods. Specifically, a camera-centered Extended Kalman Filter is used here to process a monocular sequence as the only input, with 6DOF motion estimated. Features are kept live in the filter while visible as the camera explores forward, and are deleted from the state once they go out of view. This permits an increase in the number of tracked features per frame from tens to around a hundred. While this improves the accuracy of the estimation, it makes the exhaustive Branch and Bound search performed by standard JCBB for match outlier rejection computationally infeasible. As a second contribution that overcomes this problem, we present here a RANSAC-like algorithm that exploits the probabilistic prediction of the filter. This use of prior information makes it possible to reduce the size of the minimal data subset needed to instantiate a hypothesis to the minimum possible of 1 point, greatly increasing the efficiency of the outlier rejection stage. Experimental results from real image sequences covering trajectories of hundreds of meters are presented and compared against RTK GPS ground truth. Estimation errors are about 1% of the trajectory for trajectories up to 650 meters.
Automatically and Efficiently Inferring the Hierarchical Structure of Visual Maps
Cited by 1 (1 self)
(SLAM), it is well known that probabilistic filtering approaches which aim to estimate the robot and map state sequentially suffer from poor computational scaling to large map sizes. Various authors have demonstrated that this problem can be mitigated by approximations which treat estimates of features in different parts of a map as conditionally independent, allowing them to be processed separately. When it comes to the choice of how to divide a large map into such ‘submaps’, straightforward heuristics may be sufficient in maps built using sensors such as laser range-finders with limited range, where a regular grid of submap boundaries performs well. With visual sensing, however, the ideal division of submaps is less clear, since a camera has potentially unlimited range and will often observe spatially distant parts of a scene simultaneously. In this paper we present an efficient and generic method for automatically determining a suitable submap division for SLAM maps, and apply this to visual maps built with a single agile camera. We use the mutual information between predicted measurements of features as an absolute measure of correlation, and cluster highly correlated features into groups. Via tree factorisation, we are able to determine not just a single level submap division but a powerful fully hierarchical correlation and clustering structure. Our analysis and experiments reveal particularly interesting structure in visual maps and give pointers to more efficient approximate visual SLAM algorithms.
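The idea of using mutual information as a correlation measure and grouping strongly correlated features can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: each predicted measurement is treated as a scalar Gaussian, and the grouping is a plain threshold followed by connected components rather than the tree factorisation the abstract describes. The function names and the `threshold` parameter are hypothetical.

```python
import math


def gaussian_mutual_information(var_a, var_b, cov_ab):
    """Mutual information (in nats) between two jointly Gaussian scalars.

    For scalars this reduces to -0.5 * ln(1 - rho^2), where rho is the
    correlation coefficient.
    """
    rho_sq = (cov_ab * cov_ab) / (var_a * var_b)
    if rho_sq >= 1.0:
        return math.inf  # perfectly correlated variables
    return -0.5 * math.log(1.0 - rho_sq)


def cluster_by_mutual_information(cov, threshold):
    """Group feature indices whose pairwise MI exceeds `threshold`.

    `cov` is the joint covariance of the (scalar) predicted measurements,
    given as a list of lists. Groups are the connected components of the
    graph that links every pair with MI above the threshold.
    """
    n = len(cov)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            mi = gaussian_mutual_information(cov[i][i], cov[j][j], cov[i][j])
            if mi > threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

With a block-structured covariance, features inside each block end up in the same group while uncorrelated blocks stay separate. A hierarchical scheme would instead keep the full ordering of pairwise strengths rather than applying a single cut.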
Simultaneous Point Matching and 3D Deformable Surface Reconstruction
Cited by 1 (0 self)
It has been shown that the 3D shape of a deformable surface in an image can be recovered by establishing correspondences between that image and a reference one in which the shape is known. These matches can then be used to set up a convex optimization problem in terms of the shape parameters, which is easily solved. However, in many cases, the correspondences are hard to establish reliably. In this paper, we show that we can solve simultaneously for both 3D shape and correspondences, thereby using 3D shape constraints to guide the image matching and increasing robustness, for example when the textures are repetitive. This involves solving a mixed integer quadratic problem. While optimizing this problem is NP-hard in general, we show that its solution can nevertheless be approximated effectively by a branch-and-bound algorithm.
1-Point RANSAC for EKF Filtering. Application to Real-Time Structure from Motion and Visual
Cited by 1 (0 self)
Random Sample Consensus (RANSAC) has become one of the most successful techniques for robust estimation from a data set that may contain outliers. It works by constructing model hypotheses from random minimal data subsets and evaluating their validity from the support of the whole data. In this paper we present a novel combination of RANSAC plus Extended Kalman Filter (EKF) that uses the available prior probabilistic information from the EKF in the RANSAC model hypothesis stage. This allows the minimal sample size to be reduced to one, resulting in large computational savings without the loss of discriminative power. 1-Point RANSAC is shown to outperform, in both accuracy and computational cost, the Joint Compatibility Branch and Bound (JCBB) algorithm, a gold-standard technique for spurious rejection within the EKF framework. Two visual estimation scenarios are used in the experiments. First, six degrees of freedom motion estimation from a monocular sequence (Structure from Motion). Here, a new method for benchmarking six DOF visual estimation algorithms based on the use of high resolution images is presented, validated and used to show the superiority of 1-Point RANSAC. Second, we demonstrate long-term robot trajectory estimation combining monocular vision and wheel odometry (Visual Odometry). Here, a comparison against GPS shows an accuracy comparable to state-of-the-art visual odometry methods.
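The core idea, that a filter prior plus a single match is enough to instantiate a hypothesis, can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a scalar state `x` (a 1-D camera offset), a linear measurement model `z = landmark - x + noise`, and a fixed residual threshold. All names and parameters are hypothetical.

```python
import random


def kalman_update(x, P, z, landmark, R):
    """One scalar Kalman update for the assumed model z = landmark - x + noise."""
    H = -1.0                          # derivative of the measurement w.r.t. x
    innovation = z - (landmark - x)   # measured minus predicted
    S = H * P * H + R                 # innovation variance
    K = P * H / S                     # Kalman gain
    return x + K * innovation, (1.0 - K * H) * P


def one_point_ransac(x_prior, P_prior, landmarks, measurements, R,
                     threshold, num_hypotheses, rng):
    """Toy 1-point RANSAC: each hypothesis is the prior updated with ONE match."""
    n = len(measurements)
    best_inliers = []
    # Draw distinct single-match samples; one match suffices because the
    # prior already constrains the state.
    for i in rng.sample(range(n), min(num_hypotheses, n)):
        x_hyp, _ = kalman_update(x_prior, P_prior,
                                 measurements[i], landmarks[i], R)
        # Support: matches whose residual under the hypothesis is small.
        inliers = [j for j in range(n)
                   if abs(measurements[j] - (landmarks[j] - x_hyp)) < threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Final estimate: update the prior with every match in the best support set.
    x, P = x_prior, P_prior
    for j in best_inliers:
        x, P = kalman_update(x, P, measurements[j], landmarks[j], R)
    return x, P, best_inliers
```

A possible usage, with four consistent matches and two gross outliers:

```python
landmarks = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
measurements = [0.5, 1.5, 2.5, 3.5, 8.0, 4.0]   # last two are outliers
x, P, inliers = one_point_ransac(0.0, 1.0, landmarks, measurements,
                                 0.01, 0.1, 6, random.Random(0))
```

Because each hypothesis needs only one match, the number of samples required to hit an all-inlier subset is far smaller than for a sampler that needs several points per hypothesis.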
Real-Time Camera Tracking: When is High Frame-Rate Best?
Abstract. Higher frame-rates promise better tracking of rapid motion, but advanced real-time vision systems rarely exceed the standard 10–60 Hz range, arguing that the computation required would be too great. In fact, the cost of increasing frame-rate is mitigated by reduced computational cost per frame in trackers which take advantage of prediction. Additionally, when we consider the physics of image formation, high frame-rate implies that the upper bound on shutter time is reduced, leading to less motion blur but more noise. So, putting these factors together, how are application-dependent performance requirements of accuracy, robustness and computational cost optimised as frame-rate varies? Using 3D camera tracking as our test problem, and analysing a fundamental dense whole image alignment approach, we open up a route to a systematic investigation via the careful synthesis of photorealistic video using ray-tracing of a detailed 3D scene, experimentally obtained photometric response and noise models, and rapid camera motions. Our multi-frame-rate, multi-resolution, multi-light-level dataset is based on tens of thousands of hours of CPU rendering time. Our experiments lead to quantitative conclusions about frame-rate selection and highlight the crucial role of full consideration of physical image formation in pushing tracking performance.

