Modeling search for people in 900 scenes: A combined source model of eye guidance
- Visual Cognition, 2009
Abstract - Cited by 12 (2 self)
How predictable are human eye movements during search in real-world scenes? We recorded 14 observers' eye movements as they performed a search task (person detection) in 912 outdoor scenes. Observers were highly consistent in the regions fixated during search, even when the target was absent from the scene. These eye movements were used to evaluate computational models of search guidance from three sources: saliency, target features, and scene context. Each of these models independently outperformed a cross-image control in predicting human fixations. Models that combined sources of guidance ultimately predicted 94% of human agreement, with the scene context component providing the most explanatory power. None of the models, however, could reach the precision and fidelity of an attentional map defined by human fixations. This work puts forth a benchmark for computational models of search in real-world scenes. Further improvements in ...
Please address all correspondence to Aude Oliva, Department of Brain and Cognitive ...
Ranking the Local Invariant Features for the Robust Visual Saliencies
Abstract
Local invariant feature-based methods have proven effective in computer vision for object recognition and learning. For a given image, however, the number of points detected and matched may be very large, and many of them may represent the shape information redundantly. Since selective attention is a basic mechanism of the visual system, we explore whether there is a subset of salient points that can be robustly detected and matched. We propose a method to rank the redundant local invariant features. The results show that the top-ranked points capture the salient information effectively. The method can be used as a preprocessing step for Bag-of-Features-based or graph-based methods, where it reduces the complexity of processes such as training, matching, and tracking.

