Floor Fields for Tracking in High Density Crowd Scenes
In Proc. of European Conf. on Computer Vision, 2008
Cited by 102 (6 self)
Abstract. This paper presents an algorithm for tracking individual targets in high density crowd scenes containing hundreds of people. Tracking in such a scene is extremely challenging due to the small number of pixels on the target, appearance ambiguity resulting from the dense packing, and severe inter-object occlusions. The tracking algorithm outlined in this paper overcomes these challenges using a scene-structure-based force model. In this force model, an individual moving in a particular scene is subjected to global and local forces that are functions of the layout of that scene and the locomotive behavior of other individuals in the scene. The key ingredients of the force model are three floor fields, inspired by research in the field of evacuation dynamics: the Static Floor Field (SFF), the Dynamic Floor Field (DFF), and the Boundary Floor Field (BFF). These fields determine the probability of moving from one location to another by converting the long-range forces into local ones. The SFF specifies regions of the scene that are attractive in nature (e.g. an exit location). The DFF specifies the immediate behavior of the crowd in the vicinity of the individual being tracked. The BFF specifies influences exhibited by the barriers in the scene (e.g. walls, no-go areas). By combining cues from all three fields with the available appearance information, we track individual targets in high density crowds.
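The idea of turning three floor fields plus appearance into a move probability can be illustrated with a minimal sketch. The exponential weighting, the coefficients `k_s`, `k_d`, `k_b`, and the function name `move_probabilities` are assumptions borrowed from the general style of cellular-automaton evacuation models; they are not taken from the paper, whose exact formula is not given in the abstract.

```python
import math

def move_probabilities(sff, dff, bff, appearance, k_s=1.0, k_d=1.0, k_b=1.0):
    """Combine per-cell field values into a normalized move distribution.

    Each argument maps a candidate cell (row, col) to a score, where a
    higher score means a more favorable cell. The exponential weighting
    is an illustrative assumption, not the paper's formula.
    """
    scores = {}
    for cell in sff:
        energy = k_s * sff[cell] + k_d * dff[cell] + k_b * bff[cell]
        # Appearance similarity scales the field-based score.
        scores[cell] = math.exp(energy) * appearance[cell]
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all candidate cells have zero score")
    return {cell: score / total for cell, score in scores.items()}

# Toy neighborhood of three candidate cells (hypothetical values).
sff = {(0, 0): 0.0, (0, 1): 0.5, (0, 2): 1.0}      # attraction toward an exit
dff = {(0, 0): 0.2, (0, 1): 0.2, (0, 2): 0.2}      # local crowd motion
bff = {(0, 0): -2.0, (0, 1): 0.0, (0, 2): 0.0}     # penalty next to a wall
appearance = {(0, 0): 0.5, (0, 1): 0.8, (0, 2): 0.9}

probs = move_probabilities(sff, dff, bff, appearance)
best_cell = max(probs, key=probs.get)
```

In this toy setup the cell nearest the exit and away from the wall receives the largest probability, which is the qualitative behavior the abstract describes.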
A linear programming approach for multiple object tracking
In IEEE Conf. on Computer Vision and Pattern Recognition, 2007
Cited by 84 (1 self)
We propose a linear programming relaxation scheme for the class of multiple object tracking problems where the inter-object interaction metric is convex and the intra-object term quantifying object state continuity may use any metric. The proposed scheme models object tracking as a multi-path searching problem. It explicitly models track interaction, such as object spatial layout consistency or mutual occlusion, and optimizes multiple object tracks simultaneously. The proposed scheme does not rely on track initialization or complex heuristics. It has much lower average complexity than previous efficient exhaustive search methods such as extended dynamic programming, and it finds the global optimum with high probability. We have successfully applied the proposed method to multiple object tracking in video streams.
Using particles to track varying numbers of interacting people
In CVPR, 2005
Cited by 72 (3 self)
In this paper, we present a Bayesian framework for the fully automatic tracking of a variable number of interacting targets using a fixed camera. This framework uses a joint multi-object state-space formulation and a transdimensional Markov Chain Monte Carlo (MCMC) particle filter to recursively estimate the multi-object configuration and efficiently search the state-space. We also define a global observation model composed of color and binary measurements capable of discriminating between different numbers of objects in the scene. We present results which show that our method is capable of tracking varying numbers of people through several challenging real-world tracking situations such as full/partial occlusion and entering/leaving the scene.
Tracking Multiple Occluding People by Localizing on Multiple Scene Planes
Cited by 54 (0 self)
Abstract. Occlusion and lack of visibility in crowded and cluttered scenes make it difficult to track individual people correctly and consistently, particularly in a single view. We present a multiview approach to solve this problem. In our approach, we neither detect nor track objects from any single camera or camera pair; rather, evidence is gathered from all of the cameras into a synergistic framework and detection and tracking results are propagated back to each view. Unlike other multiview approaches that require fully calibrated views, our approach is purely image-based and uses only 2D constructs. To this end, we develop a planar homographic occupancy constraint that fuses foreground likelihood information from multiple views to resolve occlusions and localize people on a reference scene plane. For greater robustness, this process is extended to multiple planes parallel to the reference plane in the framework of plane-to-plane homologies. Our fusion methodology also models scene clutter using the Schmieder and Weathersby clutter measure, which acts as a confidence prior, to assign higher fusion weight to views with less clutter. Detection and tracking are performed simultaneously by graph cuts segmentation of tracks in the space-time occupancy likelihood data. Experimental results with detailed qualitative and quantitative analysis are demonstrated in challenging multiview crowded scenes. Index Terms: Tracking, sensor fusion, graph-theoretic methods.
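A rough sketch of how foreground likelihoods from several views might be fused on a reference plane through homographies is given below. The function names `project` and `fused_occupancy`, the use of per-view weights as exponents, and the toy homographies are all illustrative assumptions; the paper's actual fusion rule and its clutter measure are not reproduced here.

```python
def project(h, x, y):
    """Apply a 3x3 homography (nested lists) to the point (x, y)."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w

def fused_occupancy(point, views):
    """Fuse foreground likelihoods for one reference-plane point.

    `views` is a list of (homography, foreground_map, weight) tuples.
    Each homography maps reference-plane coordinates to that view's
    pixel coordinates. Using the weight as an exponent is an assumed
    way to let less cluttered views count more.
    """
    x, y = point
    fused = 1.0
    for h, fg, weight in views:
        u, v = project(h, x, y)
        col, row = int(round(u)), int(round(v))
        inside = 0 <= row < len(fg) and 0 <= col < len(fg[0])
        likelihood = fg[row][col] if inside else 0.0
        fused *= likelihood ** weight
    return fused

# Two toy views: an identity mapping and a one-pixel shift in x.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
shift_x = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]
fg_a = [[0.1, 0.9, 0.2], [0.1, 0.1, 0.1]]   # foreground likelihoods, view A
fg_b = [[0.1, 0.9, 0.8], [0.1, 0.1, 0.1]]   # foreground likelihoods, view B
views = [(identity, fg_a, 1.0), (shift_x, fg_b, 1.0)]

occupied = fused_occupancy((1, 0), views)   # both views agree on foreground
empty = fused_occupancy((0, 0), views)      # view A sees background here
```

A point is scored highly only when every view reports foreground at its projected pixel, which is the intuition behind resolving occlusions by combining views.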
High-throughput ethomics in large groups of Drosophila
Nature Methods, 2009; 6:451–457. [PubMed: 19412169]
Cited by 50 (6 self)
We present a camera-based method for automatically quantifying the individual and social behaviors of fruit flies, Drosophila melanogaster, interacting within a planar arena. Our system includes machine vision algorithms that accurately track many individuals without swapping identities and classification algorithms that detect behaviors. The data may be represented as an ethogram that plots the time course of behaviors exhibited by each fly, or as a vector that concisely captures the statistical properties of all behaviors displayed within a given period. We found that behavioral differences between individuals are consistent over time and are sufficient to accurately predict gender and genotype. In addition, we show that the relative positions of flies during social interactions vary according to gender, genotype, and social environment. We expect that our software, which permits high-throughput screening, will complement existing molecular methods available in Drosophila, facilitating new investigations into the genetic and cellular basis of behavior. The fruit fly, Drosophila melanogaster, has emerged as an important genetic model organism for the study of neurobiology and behavior. Research on fruit flies has led to ...
Multi-target tracking - linking identities using Bayesian network inference
In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, IEEE Computer Society Press, Los Alamitos, 2006
Cited by 49 (0 self)
Multi-target tracking requires locating the targets and labeling their identities. The latter is a challenge when many targets, with indistinct appearances, frequently occlude one another, as in football and surveillance tracking. We present an approach to solving this labeling problem. When isolated, a target can be tracked and its identity maintained. If targets interact, however, this is not always the case. This paper assumes a track graph exists, denoting when targets are isolated and describing how they interact. Measures of similarity between isolated tracks are defined. The goal is to associate the identities of the isolated tracks by exploiting the graph constraints and similarity measures. We formulate this as a Bayesian network inference problem, allowing us to use standard message propagation to find the most probable set of paths in an efficient way. The high complexity inevitable in large problems is gracefully reduced by removing dependency links between tracks. We apply the method to a 10 min sequence of an international football game and compare results to ground truth.
Audiovisual probabilistic tracking of multiple speakers in meetings
IEEE Transactions on Audio, Speech, and Language Processing, 2007
Tracking in Unstructured Crowded Scenes
Cited by 40 (5 self)
This paper presents a target tracking framework for unstructured crowded scenes. Unstructured crowded scenes are defined as those scenes where the motion of a crowd appears to be random, with different participants moving in different directions over time. This means each spatial location in such scenes supports more than one, or multi-modal, crowd behavior. The case of tracking in structured crowded scenes, where the crowd moves coherently in a common direction and the direction of motion does not vary over time, was previously handled in [1]. In this work, we propose to model the various crowd behavior (or motion) modalities at different locations of the scene by employing the Correlated Topic Model (CTM) of [16]. In our construction, words correspond to low-level quantized motion features and topics correspond to crowd behaviors. It is then assumed that motion at each location in an unstructured crowd scene is generated by a set of behavior proportions, where behaviors represent distributions over low-level motion features. This way, any one location in the scene may support multiple crowd behavior modalities, and these can be used as prior information for tracking. Our approach enables us to model a diverse set of unstructured crowd domains, which range from cluttered time-lapse microscopy videos of cell populations in vitro to footage of crowded sporting events.
Contextual Identity Recognition in Personal Photo Albums
Cited by 37 (1 self)
We present an efficient probabilistic method for identity recognition in personal photo albums. Personal photos are usually taken under uncontrolled conditions – the captured faces exhibit significant variations in pose, expression and illumination that limit the success of traditional face recognition algorithms. We show how to improve recognition rates by incorporating additional cues present in personal photo collections, such as clothing appearance and information about when the photo was taken. This is done by constructing a Markov Random Field (MRF) that effectively combines all available contextual cues in a principled recognition framework. Performing inference in the MRF produces markedly improved recognition results on a challenging dataset consisting of the personal photo collections of multiple people. At the same time, the computational cost of our approach remains comparable to that of standard face recognition approaches.
Tracking the visual focus of attention for a varying number of wandering people
2008
Cited by 37 (2 self)
Abstract. In this paper, we define and address the problem of finding the visual focus of attention for a varying number of wandering people (VFOA-W), determining where a person is looking when their movement is unconstrained. VFOA-W estimation is a new and important problem with implications for behavior understanding, cognitive science, and real-world applications. One such application, presented in this paper, monitors the attention passers-by pay to an outdoor advertisement by using a single video camera. In our approach to the VFOA-W problem, we propose a multiperson tracking solution based on a dynamic Bayesian network that simultaneously infers the number of people in a scene, their body locations, their head locations, and their head pose. For efficient inference in the resulting variable-dimensional state-space, we propose a Reversible-Jump Markov Chain Monte Carlo (RJMCMC) sampling scheme and a novel global observation model, which determines the number of people in the scene and their locations. To determine whether a person is looking at the advertisement, we propose Gaussian Mixture Model (GMM)-based and Hidden Markov Model (HMM)-based VFOA-W models, which use head pose and location information. Our models are evaluated for tracking performance and ability to recognize people looking at an outdoor advertisement, with results indicating good performance on sequences where up to three mobile observers pass in front of an advertisement. Index Terms: Computer vision, tracking, video analysis, consumer products.