Invariant visual object recognition: A model, with lighting invariance
Journal of Physiology - Paris, 2006
Cited by 4 (1 self)
How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach which focusses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in Continuous Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and, as we show in this paper, lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects, such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The model has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene.
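The associative learning rule with a short-term memory trace mentioned in this abstract can be illustrated with a small sketch. The code below is not the model's implementation; it assumes one commonly used form of a trace rule, in which the weight change is proportional to the current input multiplied by an exponentially decaying average of the neuron's recent output. The function name, the linear activation, the weight normalization, and the parameters `alpha` (learning rate) and `eta` (trace persistence) are all illustrative assumptions.

```python
def trace_rule_update(weights, inputs_over_time, alpha=0.1, eta=0.8):
    """Hebbian-style update gated by a decaying trace of the neuron's output.

    Assumed (hypothetical) form of the rule:
        trace_t = (1 - eta) * y_t + eta * trace_{t-1}
        w      += alpha * trace_t * x_t
    followed by normalization of the weight vector to unit length.
    """
    w = list(weights)
    trace = 0.0
    for x in inputs_over_time:
        # Linear activation: weighted sum of the current input pattern.
        y = sum(wi * xi for wi, xi in zip(w, x))
        # The trace carries activity over from recently seen inputs.
        trace = (1.0 - eta) * y + eta * trace
        # Associate the current input with the (traced) postsynaptic activity.
        w = [wi + alpha * trace * xi for wi, xi in zip(w, x)]
        # Keep the weight vector bounded.
        norm = sum(wi * wi for wi in w) ** 0.5
        if norm > 0:
            w = [wi / norm for wi in w]
    return w


# Two successive "views" of the same object, encoded as non-overlapping inputs.
views = [[1.0, 0.0], [0.0, 1.0]]
with_trace = trace_rule_update([1.0, 0.0], views, eta=0.8)
without_trace = trace_rule_update([1.0, 0.0], views, eta=0.0)
```

With `eta=0.0` the rule reduces to a plain Hebbian update, so the second view, which does not activate the neuron on its own, leaves its weight at zero. With a non-zero trace, activity from the first view persists and the second view gains a positive weight, which is the sense in which temporal continuity can bind different views of one object.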
Representing and Learning Visual Schemas in Neural Networks for Scene Analysis
Center for Neural Engineering, University of Southern California
Cited by 3 (2 self)
this paper, consists of three main components. The Low-Level Visual Module (simulated using procedural programs) extracts featural and positional information from the visual input. The Schema Module encodes structured knowledge about possible objects, and provides top-down information for the Low-Level Visual Module to focus attention at different parts of the scene. The Response Module learns to associate the schema activation patterns with external responses. It enables the external environment to provide reinforcement feedback for the learning of schematic structures.
A Unified Theory of Exogenous and Endogenous Attentional Control
2007
Cited by 3 (1 self)
Although diverse, theories of visual attention generally share the notion that attention is controlled by some combination of three distinct strategies: (1) exogenous cueing from locally contrasting primitive visual features, such as abrupt onsets or color singletons (e.g., Itti & Koch, 2001); (2) endogenous gain modulation of exogenous activations, used to guide attention to task-relevant features (e.g., Navalpakkam & Itti, 2005; Wolfe, 1994); and (3) endogenous prediction of likely locations of interest, based on task and scene gist (e.g., Torralba, Oliva, Castelhano, & Henderson, 2006). Because these strategies can make conflicting suggestions, theories posit arbitration and combination rules. We propose an alternative conceptualization consisting of a single unified mechanism that is controlled along two dimensions: the degree of task focus, and the spatial scale of operation. Previously proposed strategies, and their combinations, can be viewed as instances of this mechanism. Our theory offers a means of integrating data from a wide range of attentional phenomena. More importantly, the theory yields an unusual perspective on attention that places a fundamental emphasis on the role of experience and task-related knowledge.
Viewpoint Information Provided By a Familiar Environment Facilitates Object Identification
1999
Cited by 2 (0 self)
We studied whether contextual information regarding an observer's location within a familiar scene could influence the identification of objects. The context was provided by a 3D virtual living room, which allowed natural familiarization of the scene and objects together with a high level of interactivity. Results of initial self-orientation judgments obtained in the room showed observers could make accurate judgments of their instantaneous orientation with respect to a reference point. We wanted to know if this information could in turn be used as an aid to identify objects from unfamiliar viewpoints. Our main experiment showed that after familiarization of objects within the virtual room, the presence of the room during identification produced significantly fewer errors than when the objects were shown in isolation. This reduction in error was attributed to the provision of a consistent reference frame by the room. This was tested by a control experiment, in which we randomly varie...
Visual short-term memory for two sequential arrays: One integrated representation or two separate representations?
2003
Cited by 2 (1 self)
Two dot arrays, each containing a different set of 6 randomly selected locations from a 5x5 matrix, were presented briefly and separated by an inter-stimulus interval (ISI) of 0, 200, 500, or 1500 ms. Subjects were asked to remember these locations and to report whether a probe dot matched the memory locations. To find out whether subjects formed an integrated representation of the two arrays, the probe dot was accompanied by matrix elements from the first array, from the second array, or from both arrays. Memory for array 1 was significantly impaired when the retrieval context was drawn from array 2, and vice versa, suggesting that the two arrays were represented separately. This effect was observed only at an ISI of 500 ms or longer. We propose that as array 1 is better consolidated, its representation becomes more separated from that of array 2.
An important challenge confronting the human visual system during natural viewing is to extract complex visual information and to retain it momentarily. A lot of vision research has focused on how the visual system perceives objects and scenes (Palmer, 1999), and how such information is retained in visual short-term memory (VSTM) (Intraub, 1997; Jiang, Olson, & Chun, 2000; Luck & Vogel, 1997; Phillips, 1974; Rensink, O'Regan, & Clark, 1997; Wheeler & Treisman, 2002). Recent studies suggest that approximately four visual objects or six spatial locations can be stored in VSTM simultaneously (Luck & Vogel, 1997; Pashler, 1988). More features can be stored in VSTM if they conjoin to form integrated objects than if they are separate (Lee & Chun, 2001; Luck & Vogel, 1997; Olson & Jiang, 2002; Xu, 2002; Wheeler & Treisman, 2002). These studies have focused on the representation of a single visual display in VSTM. However, visual events evolv...
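The stimulus design described in this abstract (two arrays of 6 locations each, drawn from a 5x5 matrix) can be sketched as follows. This is only an illustration, not the authors' procedure: the function name and parameters are hypothetical, and the code assumes that "a different set" means the two arrays share no locations, which the abstract does not state explicitly.

```python
import random


def make_two_arrays(grid_size=5, dots_per_array=6, rng=None):
    """Return two lists of (row, col) cells sampled from a grid.

    Assumption (not confirmed by the abstract): the two arrays
    are non-overlapping, so all cells are drawn in one sample
    without replacement and then split in half.
    """
    rng = rng or random.Random()
    cells = [(r, c) for r in range(grid_size) for c in range(grid_size)]
    chosen = rng.sample(cells, 2 * dots_per_array)
    return chosen[:dots_per_array], chosen[dots_per_array:]


# A seeded generator makes a trial reproducible.
array1, array2 = make_two_arrays(rng=random.Random(0))
```

Sampling all 12 cells in a single draw and splitting the result guarantees that the arrays are disjoint without any rejection loop. If overlapping arrays were intended instead, two independent calls to `rng.sample(cells, dots_per_array)` would be the natural alternative.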
On the Potential of Incorporating Knowledge of Human Visual Attention into CBIR Systems
2006
Cited by 1 (1 self)
Content-based image retrieval (CBIR) systems have been actively investigated over the past decade. Several existing CBIR prototypes claim to be designed based on perceptual characteristics of the human visual system, but even those that do are far from recognizing that they could benefit further by incorporating ongoing research in vision science. This paper explores the inclusion of human visual perception knowledge into the design and implementation of CBIR systems. In particular, it addresses the latest developments in computational modeling of human visual attention. This fresh way of revisiting concepts in CBIR based on the latest findings and open questions in vision science research has the potential to overcome some of the challenges faced by CBIR systems.
Top-Down Attentional Guidance Based On Implicit Learning Of Visual Covariation
Psychological Science
The visual environment is extremely rich and complex, producing information overload for the visual system. But the environment also embodies structure in the form of redundancies and regularities that may serve to reduce complexity. How do perceivers internalize this complex informational structure? We present new evidence of visual learning that illustrates how observers learn how objects and events covary in the visual world. This information serves to guide visual processes such as object recognition and search. Our first experiment demonstrates that search and object recognition are facilitated by learned associations (covariation) between novel visual shapes. Our second experiment shows that regularities in dynamic visual environments can also be learned to guide search behavior. In both experiments, learning occurred incidentally and the memory representations were implicit. These experiments show how top-down visual knowledge, acquired through implicit learning, constrains what to expect and guides where to attend and look.
IMAGE RETRIEVAL USING VISUAL ATTENTION
Let the honor of your student be as dear to you as your own, the honor of your colleague as the reverence for your teacher, and the reverence for your teacher as the fear of Heaven.
Rabbi Elazar ben Shammua, Pirkei Avot
My mentor and dear friend Dr. Oge Marques deserves special thanks. His genuine dedication to learning had an impact on me from the moment this work began. Our many discussions were both academically challenging and enlightening. His advice and support were essential to the successful completion of this research. The guidance of Dr. Borko Furht, not only during the course of this dissertation, but since the start of my undergraduate studies, has been invaluable. It was his encouragement that first motivated me to pursue this degree, and for that I will always be grateful. Dr. Hari Kalva provided thoughtful insight as well as resources without which many of the results in this dissertation would not have been possible to obtain. I truly appreciate his help and support.
Try It, You’ll Like It: The Influence of Expectation, Consumption, and Revelation
Psychological Science, Research Article
Patrons of a pub evaluated regular beer and “MIT brew” (regular beer plus a few drops of balsamic vinegar) in one of three conditions. One group tasted the samples blind (the secret ingredient was never disclosed). A second group was informed of the contents before tasting. A third group learned of the secret ingredient immediately after tasting, but prior to indicating their preference. Not surprisingly, preference for the MIT brew was higher in the blind condition than in either of the two disclosure conditions. However, the timing of the information mattered substantially. Disclosure of the secret ingredient significantly reduced preference only when the disclosure preceded tasting, suggesting that disclosure affected preferences by influencing the experience itself, rather than by acting as an independent negative input or by modifying
Hierarchical Mixture of Classification Experts Uncovers Interactions between Brain Regions
The human brain can be described as containing a number of functional regions. These regions, as well as the connections between them, play a key role in information processing in the brain. However, most existing multi-voxel pattern analysis approaches treat multiple regions either as one large uniform region or as several independent regions, ignoring the connections between them. In this paper we propose to model such connections in a Hidden Conditional Random Field (HCRF) framework, where the classifier of one region of interest (ROI) makes predictions based not only on its voxels but also on the predictions from the ROIs that it connects to. Furthermore, we propose a structural learning method in the HCRF framework to automatically uncover the connections between ROIs. We illustrate this approach with fMRI data acquired while human subjects viewed images of different natural scene categories and show that our model can improve the top-level (the classifier combining information from all ROIs) and ROI-level prediction accuracy, as well as uncover some meaningful connections between ROIs.

