Evaluation of Selective Attention under Similarity Transforms
, 2003
Cited by 12 (0 self)
Computational selective attention systems have mostly been developed as models of human attention, and they have been evaluated on that basis. Now, however, they are being used as front ends to object recognition systems, and in particular to appearance-based recognition systems. As such, they need to be evaluated by other criteria. A common goal for object recognition systems is invariance to 2D similarity transformations (i.e., in-plane translations, rotations, reflections and scales). This implies that attention systems used as front ends should also be invariant to similarity transforms. This paper evaluates the Neuromorphic Vision Toolkit (NVT), a well-known and publicly available selective attention system, and finds it to be highly sensitive to 2D similarity transforms. Further investigation, however, suggests that this sensitivity is an artifact of the publicly available implementation, and not of the neuromorphic principles it is based on. We have therefore developed a new system, called SAFE (Selective Attention as a Front End), that is conceptually similar to NVT. However, SAFE is largely invariant to 2D similarity transformations of the source image and selects scales as well as spatial locations for fixations, implementing a combined "zoom-spotlight" model of attention.
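To make the invariance criterion concrete, here is a minimal sketch of one way such an evaluation could be scored. It is not the procedure used in the paper: the function names, the pixel tolerance, and the fixation coordinates are all hypothetical. The idea is to map the fixations found on the original image through a known similarity transform and count how many reappear, within a tolerance, among the fixations found on the transformed image.

```python
import math

def similarity_transform(points, scale=1.0, angle=0.0, shift=(0.0, 0.0), reflect=False):
    """Apply a 2D similarity transform: optional reflection, rotation, scale, translation."""
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y in points:
        if reflect:
            x = -x  # reflect about the y-axis before rotating
        out.append((scale * (c * x - s * y) + shift[0],
                    scale * (s * x + c * y) + shift[1]))
    return out

def repeatability(expected, observed, tol):
    """Fraction of expected fixations that have an observed fixation within tol pixels."""
    if not expected:
        return 0.0
    hits = sum(
        1 for ex, ey in expected
        if any(math.hypot(ex - ox, ey - oy) <= tol for ox, oy in observed)
    )
    return hits / len(expected)

# Hypothetical fixations found on the original image.
fixations = [(10.0, 20.0), (50.0, 5.0), (30.0, 40.0)]
# Where they should land after a 90-degree rotation, 2x scaling, and a shift.
expected = similarity_transform(fixations, scale=2.0, angle=math.pi / 2, shift=(100.0, 0.0))
# Hypothetical fixations reported on the transformed image: two match, one drifts.
observed = [(60.5, 20.0), (90.0, 100.0), (0.0, 0.0)]
score = repeatability(expected, observed, tol=2.0)
```

A fully invariant attention system would score 1.0 under every transform tested; a score well below that indicates sensitivity of the kind the abstract reports for NVT.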
An Attention-Driven Model for Grouping Similar Images with Image Retrieval Applications
, 2006
Cited by 6 (4 self)
Recent work in the computational modeling of visual attention has demonstrated that a purely bottom-up approach to identifying salient regions within an image can be successfully applied to diverse and practical problems, from target recognition to advertisement placement. This paper proposes applying a combination of computational models of visual attention to the image retrieval problem. We demonstrate that certain shortcomings of existing content-based image retrieval solutions can be addressed by implementing a biologically-motivated, unsupervised way of grouping together images whose salient regions of interest (ROIs) are perceptually similar regardless of the visual contents of other (less relevant) parts of the image. We propose a model in which only the salient regions of an image are encoded as ROIs whose features are then compared against previously seen ROIs and assigned cluster membership accordingly. Experimental results show that the proposed approach works well for several combinations of feature extraction techniques and clustering algorithms, suggesting a promising avenue for future improvements, such as the addition of a top-down component and the inclusion of a relevance feedback mechanism.
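The step of comparing ROI features against previously seen ROIs and assigning cluster membership can be illustrated with a small sketch. The abstract does not say which clustering algorithm is used, so the leader-style incremental scheme below is an assumption chosen for brevity, and the function name, distance threshold, and two-dimensional feature vectors are hypothetical.

```python
import math

def assign_cluster(feature, centroids, counts, threshold):
    """Assign an ROI feature vector to the nearest cluster, or start a new one.

    Returns the index of the chosen cluster. `centroids` and `counts` are
    updated in place. This is an assumed leader-style scheme, not the
    clustering used in the paper.
    """
    best, best_dist = None, float("inf")
    for i, c in enumerate(centroids):
        d = math.dist(feature, c)  # Euclidean distance (Python 3.8+)
        if d < best_dist:
            best, best_dist = i, d
    if best is None or best_dist > threshold:
        # No sufficiently similar ROI seen so far: open a new cluster.
        centroids.append(list(feature))
        counts.append(1)
        return len(centroids) - 1
    # Fold the new ROI into the running mean of its cluster.
    counts[best] += 1
    n = counts[best]
    centroids[best] = [c + (f - c) / n for c, f in zip(centroids[best], feature)]
    return best

centroids, counts = [], []
# Hypothetical 2D feature vectors for five ROIs.
rois = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (0.0, 0.2)]
labels = [assign_cluster(r, centroids, counts, threshold=1.0) for r in rois]
```

Because each ROI is compared only against cluster centroids, images are grouped by their salient regions alone, which mirrors the abstract's point that the rest of the image does not influence the grouping.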
Searching for Image Information Content, its Discovery, Extraction, and Representation
, 2005
Cited by 4 (4 self)
Image information content is known to be a complicated and controversial problem. We posit a new definition of image information content. Following the theory of Solomonoff-Kolmogorov-Chaitin complexity, we define image information content as a set of descriptions of image data structures. Three levels of such description can be generally distinguished: (1) the global level, where the coarse structure of the entire scene is initially outlined; (2) the intermediate level, where structures of separate, nonoverlapping image regions usually associated with individual scene objects are delineated; and (3) the low-level description, where local image structures observed in a limited and restricted field of view are resolved. A technique for creating such image information content descriptors is developed. Its algorithm is presented and elucidated with some examples, which demonstrate the effectiveness of the proposed approach. © 2005 SPIE and IS&T. [DOI: 10.1117/1.1867476]
Using Visual Attention to Extract Regions of Interest in the Context of Image Retrieval
, 2006
Cited by 2 (1 self)
Recent research on computational modeling of visual attention has demonstrated that a bottom-up approach to identifying salient regions within an image can be applied to diverse and practical problems for which conventional machine vision techniques have not succeeded in producing robust solutions. This paper proposes a new method for extracting regions of interest (ROIs) from images using models of visual attention. It is presented in the context of improving content-based image retrieval (CBIR) solutions by implementing a biologically-motivated, unsupervised technique of grouping together images whose salient ROIs are perceptually similar. In this paper we focus on the process of extracting the salient regions of an image. The excellent results obtained with the proposed method have demonstrated that the ROIs of the images can be independently indexed for comparison against other regions on the basis of similarity for use in a CBIR solution.
Attention and the Minimal Subscene
- ACTION TO LANGUAGE VIA THE MIRROR NEURON SYSTEM
, 2006
Cited by 1 (0 self)
We describe a computational framework that explores the interaction between focal visual attention, the recognition of objects and actions, and the related use of language. We introduce the notions of "minimal subscene" and "anchored subscene" to provide a middle ground representation, in which an agent is linked to objects or other agents via some action. We offer a preliminary model of visual attention which links bottom-up salience, contextual cues, object recognition, top-down attention, and short-term memory in building representations of subscenes. We then examine how this framework links to low-level visual perception, on the one end, and to sentences which describe a subscene or raise questions about the scene, on the other.
An Attention-Based Method for Extracting Salient Regions of Interest from Stereo Images
, 2007
The fundamental problem of computer vision is caused by the translation of a three-dimensional world onto one or more two-dimensional planes. As a result, methods for extracting regions of interest (ROIs) have certain limitations that cannot be overcome with traditional techniques that only utilize a single projection of the image. For example, while it is difficult to distinguish two overlapping, homogeneous regions with a single intensity or color image, depth information can usually be used to separate the regions easily. In this paper we present an extension to an existing saliency-based ROI extraction method. By adding depth information to the existing method, many previously difficult scenarios can now be handled. Experimental results show consistently improved ROI segmentation.
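The overlapping-regions example can be illustrated with a toy sketch. The abstract does not describe how depth is integrated into the saliency-based method, so the gap-based layering below is purely an assumption, and the function name, the `min_gap` parameter, and the tiny intensity and depth maps are hypothetical.

```python
def split_region_by_depth(region_pixels, depth, min_gap):
    """Partition a region's pixels into depth layers separated by gaps larger than min_gap.

    Assumed illustration only: pixels are sorted by depth and a new layer
    starts whenever consecutive depths differ by more than min_gap.
    """
    ordered = sorted(region_pixels, key=lambda p: depth[p[0]][p[1]])
    layers = []
    prev = None
    for r, c in ordered:
        d = depth[r][c]
        if prev is None or d - prev > min_gap:
            layers.append([])  # depth discontinuity: start a new layer
        layers[-1].append((r, c))
        prev = d
    return layers

# Two overlapping surfaces with identical intensity but different depth.
intensity = [[200, 200, 200, 200],
             [200, 200, 200, 200]]
depth = [[1.0, 1.0, 3.0, 3.0],
         [1.1, 1.0, 3.0, 3.1]]

# Intensity alone yields a single homogeneous region covering every pixel.
region = [(r, c) for r in range(2) for c in range(4) if intensity[r][c] == 200]
layers = split_region_by_depth(region, depth, min_gap=0.5)
```

Here the intensity image cannot separate the two surfaces, while the depth map splits the same pixels into a near layer and a far layer.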

