Results 1 - 10 of 30
What and where: A Bayesian inference theory of attention
2010
Cited by 36 (6 self)
In the theoretical framework described in this thesis, attention is part of the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that predicts some of its main properties at the level of psychophysics and physiology. In our approach, the main goal of the visual system is to infer the identity and the position of objects in visual scenes: spatial attention emerges as a strategy to reduce the uncertainty in shape information while feature-based attention reduces the uncertainty in spatial information. Featural and spatial attention represent two distinct modes of a computational process solving the problem of recognizing and localizing objects, especially in difficult recognition tasks such as in cluttered natural scenes. We describe a specific computational model and relate it to the known functional anatomy of attention. We show that several well-known attentional phenomena – including bottom-up pop-out effects, multiplicative modulation of neuronal tuning curves, and shifts in contrast responses – emerge naturally as predictions of the model.
Learning invariant features using inertial priors
Annals of Mathematics and Artificial Intelligence, 2006
Cited by 7 (2 self)
We address the technical challenges involved in combining key features from several theories of the visual cortex in a single coherent model. The resulting model is a hierarchical Bayesian network factored into modular component networks embedding variable-order Markov models. Each component network has an associated receptive field corresponding to components residing in the level directly below it in the hierarchy. The variable-order Markov models account for features that are invariant to naturally occurring transformations in their inputs. These invariant features give rise to increasingly stable, persistent representations as we ascend the hierarchy. The receptive fields of proximate components on the same level overlap to restore selectivity that might otherwise be lost to invariance.
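As an illustration of the variable-order Markov models mentioned above (a generic sketch, not the paper's implementation), a minimal back-off predictor can count next-symbol frequencies for every context up to a maximum order and predict from the longest context it has actually observed. All class and method names here are hypothetical.

```python
from collections import defaultdict

class VariableOrderMarkov:
    """Variable-order Markov predictor: stores next-symbol counts for every
    context up to max_order and predicts from the longest context seen in
    training, backing off to shorter contexts when necessary."""

    def __init__(self, max_order=3):
        self.max_order = max_order
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, seq):
        for i in range(len(seq)):
            for order in range(self.max_order + 1):
                if i - order < 0:
                    break
                ctx = tuple(seq[i - order:i])   # preceding `order` symbols
                self.counts[ctx][seq[i]] += 1

    def predict(self, history):
        # Back off from the longest matching context down to the empty one.
        for order in range(min(self.max_order, len(history)), -1, -1):
            ctx = tuple(history[len(history) - order:])
            if ctx in self.counts:
                dist = self.counts[ctx]
                return max(dist, key=dist.get)
        return None
```

An unseen context (e.g. a novel symbol) falls back to the unconditional symbol distribution, which is one simple way such models stay robust to input variation.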
On the prospects for building a working model of the visual cortex
In Proc. Nat. Conf. Artificial Intelligence
Cited by 5 (0 self)
Human visual capability has remained largely beyond the reach of engineered systems despite intensive study and considerable progress in problem understanding, algorithms and computing power. We posit that significant progress can be made by combining existing technologies from computer vision, ideas from theoretical neuroscience and the availability of large-scale computing power for experimentation. From a theoretical standpoint, our primary point of departure from current practice is our reliance on exploiting time in order to turn an otherwise intractable unsupervised problem into a locally semi-supervised, and plausibly tractable, learning problem. From a pragmatic perspective, our system architecture follows what we know of cortical neuroanatomy and provides a solid foundation for scalable hierarchical inference. This combination of features promises to provide a range of robust object-recognition capabilities. In July of 2005, one of us (Dean) presented a paper at AAAI entitled “A Computational Model of the Cerebral Cortex” (Dean 2005). The paper described a graphical model of the visual cortex inspired by David Mumford’s computational architecture (1991; 1992; 2003). At that same meeting, Jeff Hawkins gave an invited talk entitled “From …
What and where: A Bayesian inference theory of visual attention
Vision Research
Cited by 5 (1 self)
In the theoretical framework described in this thesis, attention is part of the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that predicts some of its main properties at the level of psychophysics and physiology. In our approach, the main goal of the visual system is to infer the identity and the position of objects in visual scenes: spatial attention emerges as a strategy to reduce the uncertainty in shape information while feature-based attention reduces the uncertainty in spatial information. Featural and spatial attention represent two distinct modes of a computational process solving the problem of recognizing and localizing objects, especially in difficult recognition tasks such as in cluttered natural scenes. We describe a specific computational model and relate it to the known functional anatomy of attention. We show that several well-known attentional phenomena – including bottom-up pop-out effects, multiplicative modulation of neuronal tuning curves and shifts in contrast responses – emerge naturally as predictions of the model. We also show that the Bayesian model predicts well human eye fixations (considered as a proxy …
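The what-and-where decomposition described above can be illustrated with exact Bayesian inference on a toy problem. This is a sketch with made-up likelihood values, not the paper's model: a joint posterior over identity and location is computed, and conditioning on a location (a crude stand-in for spatial attention) sharpens the identity posterior, i.e. reduces its entropy.

```python
import math

# Toy problem: 2 identities ("what") x 3 locations ("where").
# likelihood[w][l] = p(image | what=w, where=l); numbers are illustrative.
likelihood = {
    "cat": [0.70, 0.20, 0.10],
    "dog": [0.10, 0.15, 0.75],
}
prior_what = {"cat": 0.5, "dog": 0.5}
prior_where = [1.0 / 3] * 3

# Joint posterior p(what, where | image)
#   ∝ p(image | what, where) p(what) p(where)
joint = {(w, l): likelihood[w][l] * prior_what[w] * prior_where[l]
         for w in likelihood for l in range(3)}
Z = sum(joint.values())
joint = {k: v / Z for k, v in joint.items()}

# Marginal posteriors over identity and position.
p_what = {w: sum(joint[(w, l)] for l in range(3)) for w in likelihood}
p_where = [sum(joint[(w, l)] for w in likelihood) for l in range(3)]

def entropy(ps):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in ps if p > 0)

# "Spatial attention": condition on the most probable location; the
# resulting identity posterior is sharper than the unconditioned marginal.
l_star = max(range(3), key=lambda l: p_where[l])
p_what_given_l = {w: joint[(w, l_star)] / p_where[l_star] for w in likelihood}
```

With these numbers the identity posterior's entropy drops once the location is fixed, which is the sense in which attending to a position reduces uncertainty about shape.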
Hierarchical Expectation Refinement for Learning Generative Perception Models
2005
Cited by 4 (0 self)
We present a class of generative models well suited to modeling perceptual processes and an algorithm for learning their parameters that promises to scale to learning very large models. The models are hierarchical, composed of multiple levels, and allow input only at the lowest level, the base of the hierarchy. Connections within a level are generally local and may or may not be directed. Connections between levels are directed and generally do not span multiple levels. The learning algorithm falls within the general family of expectation maximization algorithms. Parameter estimation proceeds level-by-level starting with components in the lowest level and moving up the hierarchy.
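The level-by-level estimation strategy described above can be illustrated with a hard-EM (k-means) analogue: fit the lowest level on raw inputs, encode those inputs as cluster indices, then fit the next level on statistics of the codes. This is a deliberately simplified stand-in for the paper's expectation-maximization procedure; the data, window size, and function names are all illustrative.

```python
def kmeans_1d(xs, k, iters=20):
    """Hard-EM (k-means) in 1-D: the E-step assigns each point to its nearest
    centre, the M-step re-estimates each centre as the mean of its points.
    Centres are initialized from evenly spaced order statistics."""
    srt = sorted(xs)
    centres = [srt[i * (len(srt) - 1) // max(1, k - 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:  # E-step: hard assignment
            groups[min(range(k), key=lambda j: (x - centres[j]) ** 2)].append(x)
        for j, grp in enumerate(groups):  # M-step: re-estimate centres
            if grp:
                centres[j] = sum(grp) / len(grp)
    return centres

def encode(xs, centres):
    """Replace each input by the index of its nearest centre."""
    return [min(range(len(centres)), key=lambda j: (x - centres[j]) ** 2)
            for x in xs]

# Level-by-level estimation: fit level 1 on the raw inputs, encode them,
# then fit level 2 on statistics of the level-1 codes (here: window means).
data = [0.1, 0.12, 0.9, 0.88, 0.11, 0.92, 0.09, 0.91]
c1 = kmeans_1d(data, k=2)
codes = encode(data, c1)
windows = [sum(codes[i:i + 2]) / 2 for i in range(0, len(codes), 2)]
c2 = kmeans_1d(windows, k=2)
```

The key structural point survives the simplification: each level is trained only on the outputs of the level directly below it, so parameter estimation proceeds strictly bottom-up.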
Cortical columns: Building blocks for intelligent systems
In Proceedings of the Symposium Series on Computational Intelligence, 2009
Cited by 4 (4 self)
The neocortex appears to be a very efficient, uniformly structured, and hierarchical computational system [25], [23], [24]. Researchers have made significant efforts to model intelligent systems that mimic these neocortical properties to perform a broad variety of pattern recognition and learning tasks. Unfortunately, many of these systems have drifted away from their cortical origins and incorporate or rely on attributes and algorithms that are not biologically plausible. In contrast, this paper describes a model for an intelligent system that is motivated by the properties of cortical columns, which can be viewed as the basic functional unit of the neocortex [35], [16]. Our model extends predictability minimization [30] to mimic the behavior of cortical columns, incorporates neocortical properties such as hierarchy, structural uniformity, and plasticity, and enables adaptive, hierarchical independent feature detection. Initial results for an unsupervised learning task – identifying independent features in image data – are quite promising, both in a single-level and a hierarchical organization modeled after the visual cortex. The model is also able to forget learned patterns that no longer appear in the dataset, demonstrating its adaptivity, resilience, and stability under changing input conditions.
FPGA implementation of Izhikevich spiking neurons for character recognition
2009
Cited by 3 (0 self)
There has been a strong push recently to examine biological-scale simulations of neuromorphic algorithms to achieve stronger inference capabilities than current computing algorithms. The recent Izhikevich spiking neuron model is ideally suited for such large-scale cortical simulations due to its efficiency and biological accuracy. In this paper we explore the feasibility of using FPGAs for large-scale simulations of the Izhikevich model. We developed a modularized processing element to evaluate a large number of Izhikevich spiking neurons in a pipelined manner. This approach allows for easy scalability of the model to larger FPGAs. We utilized a character recognition algorithm based on the Izhikevich model for this study and scaled up the algorithm to use over 9000 neurons. The FPGA implementation of the algorithm on a Xilinx Virtex 4 provided a speedup of approximately 8.5× over an equivalent software implementation on a 2.2 GHz AMD Opteron core. Our results indicate that FPGAs are suitable for large-scale cortical simulations utilizing the Izhikevich spiking neuron model. Keywords: FPGA; spiking neural networks
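The Izhikevich model referenced above is compact enough to state directly; a minimal forward-Euler simulation with the standard regular-spiking parameters might look like the following. The FPGA pipeline itself is not reproduced here, only the per-neuron update it evaluates.

```python
def izhikevich(I, T=1000.0, dt=0.25, a=0.02, b=0.2, c=-65.0, d=8.0):
    """Forward-Euler simulation of an Izhikevich neuron (defaults are the
    regular-spiking parameter set).  Returns spike times in ms.

        v' = 0.04 v^2 + 5 v + 140 - u + I
        u' = a (b v - u)
        if v >= 30 mV:  v <- c,  u <- u + d
    """
    v, u = c, b * c          # start at the reset potential
    spikes, t = [], 0.0
    while t < T:
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
        u += dt * a * (b * v - u)
        if v >= 30.0:        # spike: record the time and reset
            spikes.append(t)
            v, u = c, u + d
        t += dt
    return spikes
```

With a constant input of I = 10 the regular-spiking neuron fires tonically, while with I = 0 it settles to rest and stays silent; the two-variable update and single conditional reset are what make the model cheap to replicate thousands of times in a pipelined hardware design.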
Character Recognition Using Hierarchical Vector Quantization and Temporal Pooling
Cited by 2 (0 self)
In recent years, there has been a cross-fertilization of ideas between computational neuroscience models of the operation of the neocortex and artificial intelligence models of machine learning. Much of this work has focussed on the mammalian visual cortex, treating it as a hierarchically-structured pattern recognition machine that exploits statistical regularities in retinal input. It has further been proposed that the neocortex represents sensory information probabilistically, using some form of Bayesian inference to disambiguate noisy data. In the current paper, we focus on a particular model of the neocortex developed by Hawkins, known as hierarchical temporal memory (HTM). Our aim is to evaluate an important and recently implemented aspect of this model, namely its ability to represent temporal sequences of input within a hierarchically structured vector quantization algorithm. We test this temporal pooling feature of HTM on a benchmark of cursive handwriting recognition problems and compare it to a current state-of-the-art support vector machine implementation. We also examine whether two pre-processing techniques can enhance the temporal pooling algorithm’s performance. Our results show that a relatively simple temporal pooling approach can produce recognition rates that approach the current state-of-the-art without the need for extensive tuning of parameters. We also show that temporal pooling performance is surprisingly unaffected by the use of preprocessing techniques.
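The pairing of vector quantization with temporal pooling can be sketched roughly as follows: frames are quantized to their nearest prototype, and codes that frequently follow one another in time are merged into a single group label, so a whole temporal sequence maps to one stable representation. This is an illustrative reading of temporal pooling, not the HTM implementation evaluated in the paper; the codebook, threshold, and function names are all hypothetical.

```python
from collections import Counter

def vq_encode(frames, codebook):
    """Vector quantization: map each frame to its nearest prototype's index."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda j: dist2(f, codebook[j]))
            for f in frames]

def temporal_pool(codes, threshold=2):
    """Temporal pooling: merge codes whose transition count meets the
    threshold (union-find), then relabel the sequence by group root."""
    parent = list(range(max(codes) + 1))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    trans = Counter(zip(codes, codes[1:]))
    for (i, j), n in trans.items():
        if i != j and n >= threshold:
            parent[find(i)] = find(j)
    return [find(c) for c in codes]
```

Codes that merely co-occur once (e.g. at a boundary between two patterns) fall below the threshold, so distinct temporal groups stay separate while codes inside a repeating sequence collapse to one label.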
Predictive Encoding of Contextual Relationships for Perceptual Inference, Interpolation and Prediction
Cited by 1 (0 self)
We propose a new neurally-inspired model that can learn to encode the global relationship context of visual events across time and space and to use the contextual information to modulate the analysis-by-synthesis process in a predictive coding framework. The model is based on the principle of mutual predictability. It learns latent contextual representations by maximizing the predictability of visual events based on local and global context information. The model can therefore interpolate missing events or predict future events in image sequences. The contextual representations modulate the prediction synthesis process by adaptively rescaling the contribution of each neuron’s basis function. In contrast to standard predictive coding models, the prediction error in this model is used to update the context representation but does not alter the feedforward input for the next layer, and is thus more consistent with neurophysiological observations. We establish the computational feasibility of this model by demonstrating its ability to simultaneously infer context, as well as interpolate and predict input image sequences, in a unified framework.
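A minimal sketch of the gain-modulated synthesis step can be written down directly, assuming a fixed illustrative gain vector: the prediction is a gain-rescaled combination of basis functions, and the prediction error drives inference of the coefficients. The paper's model additionally learns the context (and hence the gains) from the error; here the gains are held fixed and all weights and inputs are made up.

```python
import random

random.seed(0)  # fixed weights for reproducibility

# Toy setup: an input x is synthesized from basis functions (columns of W)
# whose contributions are rescaled by contextual gains g:  x_hat = W (g * r).
n_in, n_units = 4, 3
W = [[random.uniform(-1.0, 1.0) for _ in range(n_units)] for _ in range(n_in)]
g = [1.0, 1.0, 0.2]           # contextual gains (fixed here, for illustration)
x = [0.5, -0.3, 0.8, 0.1]     # input to be explained

def synthesize(r):
    """Top-down prediction x_hat = W (g * r)."""
    return [sum(W[i][j] * g[j] * r[j] for j in range(n_units))
            for i in range(n_in)]

def sq_error(r):
    """Squared prediction error ||x - x_hat||^2."""
    return sum((x[i] - xh) ** 2 for i, xh in enumerate(synthesize(r)))

# Inference: the prediction error drives gradient updates of the
# coefficients r; the gains rescale the effective basis, so they also
# rescale each unit's update.
r = [0.0] * n_units
lr = 0.05
for _ in range(200):
    e = [x[i] - xh for i, xh in enumerate(synthesize(r))]
    for j in range(n_units):
        r[j] += lr * g[j] * sum(W[i][j] * e[i] for i in range(n_in))
```

Shrinking a unit's gain (as with the third unit here) suppresses both its contribution to the synthesized prediction and the error gradient it receives, which is the mechanism by which context can adaptively reweight individual basis functions.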