Putting objects in perspective
- In CVPR
, 2006
Cited by 106 (10 self)
Image understanding requires not only individually estimating elements of the visual world but also capturing the interplay among them. In this paper, we provide a framework for placing local object detection in the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint. Most object detection methods consider all scales and locations in the image as equally likely. We show that with probabilistic estimates of 3D geometry, both in terms of surfaces and world coordinates, we can put objects into perspective and model the scale and location variance in the image. Our approach reflects the cyclical nature of the problem by allowing probabilistic object hypotheses to refine geometry and vice versa. Our framework allows painless substitution of almost any object detector and is easily extended to include other aspects of image understanding. Our results confirm the benefits of our integrated approach.
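To make the idea of putting objects "into perspective" concrete, the sketch below shows how viewpoint constrains object scale in the simplest possible setting. It is an illustration only, not the paper's probabilistic model: it assumes an ideal pinhole camera with no tilt or roll, an object standing upright on a flat ground plane, and image rows counted downward from the top. The function name and parameters are hypothetical.

```python
def expected_image_height(world_height, camera_height, horizon_row, foot_row):
    """Approximate pixel height of an upright object on the ground plane.

    Assumes a level pinhole camera (an assumption of this sketch, not a
    claim from the abstract). Rows increase downward, so an object standing
    on the ground below the horizon has foot_row > horizon_row.

    For a level camera at height camera_height, a ground point at depth z
    projects f * camera_height / z rows below the horizon, and an object of
    height world_height spans f * world_height / z rows. Dividing the two
    removes both the focal length f and the depth z.
    """
    drop = foot_row - horizon_row  # rows between the horizon and the feet
    if drop <= 0:
        raise ValueError("foot_row must lie below the horizon row")
    return world_height * drop / camera_height


if __name__ == "__main__":
    # Hypothetical numbers: camera 1.6 m above ground, horizon at row 200,
    # a 1.7 m tall person whose feet are at row 360.
    print(expected_image_height(1.7, 1.6, 200, 360))
```

Under these assumptions, a detector hypothesis whose bounding-box height differs greatly from this prediction is implausible for the given viewpoint, which is the kind of scale and location constraint the abstract describes.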
S.: Counting crowded moving objects
, 2006
Cited by 19 (0 self)
In its full generality, motion analysis of crowded objects necessitates recognition and segmentation of each moving entity. The difficulty of these tasks increases considerably with occlusions and therefore with crowding. When the objects are constrained to be of the same kind, however, partitioning of densely crowded semi-rigid objects can be accomplished by means of clustering tracked feature points. We base our approach on a highly parallelized version of the KLT tracker in order to process the video into a set of feature trajectories. While such a set of trajectories provides a substrate for motion analysis, their unequal lengths and fragmented nature present difficulties for subsequent processing. To address this, we propose a simple means of spatially and temporally conditioning the trajectories. Given this representation, we integrate it with a learned object descriptor to achieve a segmentation of the constituent motions. We present experimental results for the problem of estimating the number of moving objects in a dense crowd as a function of time.
What Can Casual Walkers Tell Us About A 3D Scene?
, 2007
Cited by 6 (0 self)
An approach for incremental learning of a 3D scene from a single static video camera is presented in this paper. In particular, we exploit the presence of casual people walking in the scene to infer relative depth, learn shadows, and segment the critical ground structure. Considering that this type of video data is so ubiquitous, this work provides an important step towards 3D scene analysis from single cameras in readily available ordinary videos and movies. On-line 3D scene learning, as presented here, is very important for applications such as scene analysis, foreground refinement, tracking, biometrics, automated camera collaboration, activity analysis, identification, and real-time computer-graphics applications. The main contributions of this work are two-fold. First, we use the people in the scene to continuously learn and update the 3D scene parameters using an incremental robust (L1) error minimization. Second, models of shadows in the scene are learned using a statistical framework. A symbiotic relationship between the shadow model and the estimated scene geometry is exploited towards incremental mutual improvement. We illustrate the effectiveness of the proposed framework with applications in foreground refinement, automatic segmentation as well as relative depth mapping of the floor/ground, and estimation of 3D trajectories of people in the scene.
Autocalibration from tracks of walking people
- in Proc. British Machine Vision Conference (BMVC)
, 2006
Cited by 5 (1 self)
It has been shown that under a small number of assumptions, observations of people can be used to obtain metric calibration information of a camera, which is particularly useful for surveillance applications. However, previous work had to exclude the common critical configurations of the camera’s principal point falling on the horizon line and of very long focal lengths, both of which occur commonly in practice. Due to noise, the quality of the calibration quickly degrades at and in the vicinity of these configurations. This paper provides a robust solution to this problem by incorporating information about the motion of people into the estimation process. It is shown that under the assumption that people walk with a constant velocity, calibration performance can be improved significantly. In addition to solving the above problem, the incorporation of temporal data also helps to take correlations between subsequent detections into consideration, which leads to an up-front reduction of the noise in the measurements and an overall improvement in auto-calibration performance.
An Integrated Background Model for Video Surveillance Based on Primal Sketch and 3D Scene Geometry
Cited by 3 (1 self)
This paper presents a novel integrated background model for video surveillance. Our model uses a primal sketch representation for image appearance and 3D scene geometry to capture the ground plane and major surfaces in the scene. The primal sketch model divides the background image into three types of regions: flat, sketchable and textured. The three types of regions are modeled respectively by a mixture of Gaussians, image primitives and LBP histograms. We calibrate the camera and recover important planes such as ground, horizontal surfaces, walls, stairs in the 3D scene, and use geometric information to predict the sizes and locations of foreground blobs to further reduce false alarms. Compared with the state-of-the-art background modeling methods, our approach is more effective, especially for indoor scenes where shadows, highlights and reflections of moving objects and camera exposure adjustment usually cause problems. Experimental results demonstrate that our approach improves the performance of background/foreground separation at the pixel level, and of the integrated video surveillance system at the object and trajectory levels.
Camera Calibration for Uneven Terrains by Observing Pedestrians
Cited by 1 (0 self)
A calibrated camera is essential for computer vision systems, primarily because such a camera acts as an angle-measuring device. Once the camera is calibrated, applications such as 3D reconstruction, metrology, and others requiring real-world information from video sequences can be envisioned. Motivated by this, we address the problem of calibrating multiple cameras, with an overlapping field of view (FoV), observing pedestrians in a scene walking on an uneven terrain. This problem of calibration from an uneven terrain has so far not been addressed in the vision community. We automatically estimate the infinite homography between the cameras by using the special geometric information obtained from observing pedestrians. This homography provides constraints on the intrinsic (or interior) camera parameters while also enabling us to estimate the extrinsic (or exterior) camera parameters. We test the proposed method on real as well as synthetic data; encouraging results demonstrate the applicability of the proposed method.
London, UK Camera Auto-Calibration from Articulated Motion
This paper presents a novel auto-calibration method from unconstrained human body motion. It relies on the underlying biomechanical constraints associated with human bipedal locomotion. By analysing positions of key points during a sequence, our technique is able to detect frames where the human body adopts a particular posture which ensures the coplanarity of those key points and therefore allows a successful camera calibration. Our technique includes a 3D model adaptation phase which removes the requirement for a precise geometrical 3D description of those points. Our method is validated using a variety of human bipedal motions and camera configurations.
Object detection · Camera calibration · 3D reconstruction ·
Image understanding requires not only individually estimating elements of the visual world but also capturing the interplay among them. In this paper, we provide a framework for placing local object detection in the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint. Most object detection methods consider all scales and locations in the image as equally likely. We show that with probabilistic estimates of 3D geometry, both in terms of surfaces and world coordinates, we can put objects into perspective and model the scale and location variance in the image. Our approach reflects the cyclical nature of the problem by allowing probabilistic object hypotheses to refine geometry and vice versa. Our framework allows painless substitution of almost any object detector and is easily extended to include other aspects of image understanding. Our results confirm the benefits of our integrated approach.
Self-calibrating Cameras . . .
, 2008
This thesis addresses the automatic calibration of two static surveillance cameras in a man-made world with orthogonal and parallel structures and a common ground plane. An approach is taken where the calibration of the interior orientation, the undistortion of the lens and the calibration of a camera’s rotation to the world are performed before calibrating the camera centers, which allows methods that work in slightly overlapping as well as non-overlapping views. We present a new incremental calibration composed of Expectation Maximization and Simulated Annealing that uses the uncertainties of noisy line segments to process a video stream instead of a single image. The advantage of video is that orthogonal and parallel edge
CVPR 2012 Submission #1003
We present an approach which exploits the coupling between human actions and scene geometry. We investigate the use of human pose as a cue for single-view 3D scene understanding. Our method builds upon recent advances in still-image action recognition and pose estimation, to extract functional and geometric constraints about the scene from people detections. These constraints are then used to improve state-of-the-art single-view 3D scene understanding approaches. The proposed method is validated on a collection of single-viewpoint time-lapse image sequences as well as a dataset of still images of indoor scenes. We demonstrate that observing people performing different actions can significantly improve estimates of scene geometry and 3D layout.

