Results 1 - 10 of 111
Learning Optimized Features for Hierarchical Models of Invariant Object Recognition
, 2002
Cited by 93 (28 self)
There is an ongoing debate over the capabilities of hierarchical neural feed-forward architectures for performing real-world invariant object recognition. Although a variety of hierarchical models exists, appropriate supervised and unsupervised learning methods are still an issue of intense research. We propose a feedforward model for recognition that shares components such as weight sharing, pooling stages, and competitive nonlinearities with earlier approaches, but focuses on new methods for learning optimal feature-detecting cells in intermediate stages of the hierarchical network.
A hierarchical Bayesian model of invariant pattern recognition in the visual cortex
- In Proceedings of the International Joint Conference on Neural Networks. IEEE
, 2005
Cited by 71 (2 self)
We describe a hierarchical model of invariant visual pattern recognition in the visual cortex. In this model, the knowledge of how patterns change when objects move is learned and encapsulated in terms of high-probability sequences at each level of the hierarchy. The configuration of object parts is captured by the patterns of coincident high-probability sequences. This knowledge is then encoded in a highly efficient Bayesian network structure. The learning algorithm uses a temporal stability criterion to discover object concepts and movement patterns. We show that the architecture and algorithms are biologically plausible. The large-scale architecture of the system matches the large-scale organization of the cortex, and the micro-circuits derived from the local computations match the anatomical data on cortical circuits. The system exhibits invariance across a wide variety of transformations and is robust in the presence of noise. Moreover, the model also offers alternative explanations for various known cortical phenomena.
Unsupervised Learning of Visual Features through Spike Timing Dependent Plasticity
Cited by 61 (6 self)
Spike timing dependent plasticity (STDP) is a learning rule that modifies synaptic strength as a function of the relative timing of pre- and postsynaptic spikes. When a neuron is repeatedly presented with similar inputs, STDP is known to have the effect of concentrating high synaptic weights on afferents that systematically fire early, while postsynaptic spike latencies decrease. Here we use this learning rule in an asynchronous feedforward spiking neural network that mimics the ventral visual pathway, and show that when the network is presented with natural images, selectivity to intermediate-complexity visual features emerges. Those features, which correspond to prototypical patterns that are both salient and consistently present in the images, are highly informative and enable robust object recognition, as demonstrated on various classification tasks. Taken together, these results show that temporal codes may be a key to understanding the phenomenal processing speed achieved by the visual system and that STDP can lead to fast and selective responses.
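The timing dependence this abstract describes can be illustrated with a minimal pair-based STDP update; the amplitudes and time constants below are illustrative choices, not values from the paper:

```python
import math

# Pair-based STDP sketch: the sign of the weight change depends on
# whether the presynaptic spike precedes or follows the postsynaptic one.
A_PLUS, A_MINUS = 0.05, 0.055     # potentiation / depression amplitudes (assumed)
TAU_PLUS, TAU_MINUS = 20.0, 20.0  # decay time constants in ms (assumed)

def stdp_dw(t_pre, t_post):
    """Weight change for one pre/post spike pair.

    Pre before post (t_post >= t_pre) gives potentiation that decays with
    the delay; post before pre gives depression.
    """
    dt = t_post - t_pre
    if dt >= 0:
        return A_PLUS * math.exp(-dt / TAU_PLUS)
    return -A_MINUS * math.exp(dt / TAU_MINUS)
```

Because potentiation decays with the pre-to-post delay, afferents that fire early relative to the postsynaptic spike accumulate the largest weight increases, which is the concentration effect the abstract refers to.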
Attention, short-term memory, and action selection: A unifying theory
, 2005
Cited by 58 (13 self)
Cognitive behaviour requires complex context-dependent processing of information that emerges from the links between attentional perceptual processes, working memory, and reward-based evaluation of the performed actions. We describe a computational neuroscience theoretical framework which shows how an attentional state held in a short-term memory in the prefrontal cortex can, by top-down processing, influence ventral and dorsal stream cortical areas, using biased competition to account for many aspects of visual attention. We also show how, within the prefrontal cortex, an attentional bias can influence the mapping of sensory inputs to motor outputs, and thus play an important role in decision making. We also show how the absence of expected rewards can switch an attentional bias signal, and thus rapidly and flexibly alter cognitive performance. This theoretical framework incorporates spiking and synaptic dynamics which enable single-neuron responses, fMRI activations, psychophysical results, the effects of pharmacological agents, and the effects of damage to parts of the system to be explicitly simulated and predicted. This computational neuroscience framework provides an approach for integrating different levels of investigation of brain function, and for understanding the relations between them. The models also directly address how bottom-up and top-down processes interact in visual cognition.
A high-throughput screening approach to discovering good forms of visual representation.
- Computational and Systems Neuroscience (COSYNE).
, 2008
Cited by 52 (9 self)
While many models of biological object recognition share a common set of "broad-stroke" properties, the performance of any one model depends strongly on the choice of parameters in a particular instantiation of that model, e.g., the number of units per layer, the size of pooling kernels, exponents in normalization operations, etc. Since the number of such parameters (explicit or implicit) is typically large and the computational cost of evaluating one particular parameter set is high, the space of possible model instantiations goes largely unexplored. Thus, when a model fails to approach the abilities of biological visual systems, we are left uncertain whether this failure is because we are missing a fundamental idea or because the correct "parts" have not been tuned correctly, assembled at sufficient scale, or provided with enough training. Here, we present a high-throughput approach to the exploration of such parameter sets, leveraging recent advances in stream processing hardware (high-end NVIDIA graphics cards and the PlayStation 3's IBM Cell processor). In analogy to high-throughput screening approaches in molecular biology and genetics, we explored thousands of potential network architectures and parameter instantiations, screening those that show promising object recognition performance for further analysis. We show that this approach can yield significant, reproducible gains in performance across an array of basic object recognition tasks, consistently outperforming a variety of state-of-the-art purpose-built vision systems from the literature. As the scale of available computational power continues to expand, we argue that this approach has the potential to greatly accelerate progress in both artificial vision and our understanding of the computational underpinnings of biological vision.
How the brain might work: A hierarchical and temporal model for learning and recognition
- STANFORD UNIVERSITY
, 2008
The representation of information about faces in the temporal and frontal lobes
- Neuropsychologia
, 2006
Position Invariant Recognition in the Visual System with . . .
, 2000
Cited by 24 (6 self)
The effects of cluttered environments on the performance of a hierarchical multilayer model of invariant object recognition in the visual system (VisNet) are investigated; the model employs learning rules that utilise a trace of previous neural activity. This class of model relies on the spatio-temporal statistics of natural visual inputs to associate together different exemplars of the same stimulus or object, which will tend to occur in temporal proximity. In this paper the different exemplars of a stimulus are the same stimulus in different positions. First, it is shown that if the stimuli have been learned previously against a plain background, then the stimuli can be correctly recognised even in environments with cluttered (e.g. natural) backgrounds which form complex scenes. Second, it is shown that the functional architecture has difficulty in learning new objects if they are presented against cluttered backgrounds. It is suggested that processes such as the use of a high-resolution fovea, or attention, may be particularly useful in suppressing the effects of background noise and in segmenting objects from their background when new objects need to be learned. Third, however, it is shown that this problem may be ameliorated by the prior existence of stimulus-tuned feature-detecting neurons in the early layers of VisNet, and that these feature-detecting neurons may be set up through previous exposure to the relevant class of objects. Fourth, we extend these results to partially occluded objects, showing that (in contrast with many artificial vision systems) correct recognition in this class of architecture can occur if the objects have been learned previously without occlusion.
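The "trace of previous neural activity" mentioned above refers to trace-rule learning of the kind used in VisNet-style models: the output neuron keeps a decaying memory of its recent firing, so inputs that arrive close together in time (e.g. the same stimulus at successive positions) get bound onto the same neuron. A minimal sketch, with illustrative parameter values:

```python
# Trace learning rule sketch (parameter values are assumptions):
# the trace is an exponentially decaying memory of the output firing,
# and weights move toward inputs active while the trace is high.
ALPHA, ETA = 0.1, 0.8  # learning rate, trace persistence

def trace_update(weights, x, y, trace):
    """One learning step for a single output neuron.

    x      -- input firing rates (one per afferent)
    y      -- current output firing rate
    trace  -- memory trace of recent output firing
    """
    trace = (1.0 - ETA) * y + ETA * trace
    weights = [w + ALPHA * trace * xi for w, xi in zip(weights, x)]
    return weights, trace
```

Because the trace outlives any single input, an exemplar seen at a new position still strengthens the same neuron's weights, which is how temporally adjacent views of one object become associated.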
A computational model of auditory selective attention
- IEEE Transactions on Neural Networks
, 2004
Cited by 22 (2 self)
The auditory system must separate an acoustic mixture in order to create a perceptual description of each sound source. It has been proposed that this is achieved by a process of auditory scene analysis (ASA) in which a number of streams are produced, each describing a single sound source. Few computer models of ASA attempt to incorporate attentional effects, since ASA is typically seen as a precursor to attentional mechanisms. This assumption may be flawed: recent work has suggested that attention plays a key role in the formation of streams, as opposed to the conventional view that attention merely selects a pre-constructed stream. This study presents a conceptual framework for auditory selective attention in which the formation of groups and streams is heavily influenced by conscious and subconscious attention. This framework is implemented as a computational model comprising a network of neural oscillators which perform stream segregation on the basis of oscillatory correlation. Within the network, attentional interest is modelled as a Gaussian distribution in frequency. This determines the connection weights between the oscillators and the attentional process, the attentional leaky integrator (ALI). A segment or group of segments is said to be attended to if its oscillatory activity coincides temporally with a peak in the ALI activity. The output of the model is an 'attentional stream': a description of which frequencies are being attended to at each epoch. The model successfully simulates a range of psychophysical phenomena. Furthermore, a number of predictions are made, and a psychophysical experiment is conducted to investigate the time course of attentional allocation in a binaural streaming task. The results support the model's prediction that attention is subject to a form of 'reset' when the attentional focus is moved in space.
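Two of the ingredients named above can be sketched concretely: the Gaussian attentional-interest profile over frequency channels, and a leaky integrator of the kind the ALI is built on. The function names and parameter values here are illustrative assumptions, not the paper's implementation:

```python
import math

def attentional_interest(freqs_hz, focus_hz, sigma_hz):
    """Gaussian weighting of each frequency channel around the attended
    frequency: channels near the focus get weights near 1, distant
    channels near 0."""
    return [math.exp(-0.5 * ((f - focus_hz) / sigma_hz) ** 2) for f in freqs_hz]

def leaky_integrate(ali, inputs, leak=0.9, gain=0.1):
    """One step of a leaky integrator per channel (a simplified stand-in
    for the ALI): activity decays by `leak` and accumulates weighted input."""
    return [leak * a + gain * i for a, i in zip(ali, inputs)]
```

In the model proper, these weighted inputs come from oscillator activity, so ALI peaks coincide with the oscillations of the attended group; here the integrator is shown in isolation.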
Invariant visual object recognition: A model, with lighting invariance
- Journal of Physiology - Paris
, 2006
Cited by 21 (6 self)
How are invariant representations of objects formed in the visual cortex? We describe a neurophysiological and computational approach which focuses on a feature hierarchy model in which invariant representations can be built by self-organizing learning based on the statistics of the visual input. The model can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in Continuous Transformation learning. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, and size, and, as we show in this paper, lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects, such as looming, rotation, and object-based movement. It has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. It has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene.