Results 1 - 10 of 85
Robust object recognition with cortex-like mechanisms
IEEE Trans. Pattern Analysis and Machine Intelligence, 2007
"... Abstract—We introduce a new general framework for the recognition of complex visual scenes, which is motivated by biology: We describe a hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating b ..."
Cited by 389 (47 self)
Abstract—We introduce a new general framework for the recognition of complex visual scenes, which is motivated by biology: We describe a hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating between a template matching and a maximum pooling operation. We demonstrate the strength of the approach on a range of recognition tasks: From invariant single object recognition in clutter to multiclass categorization problems and complex scene understanding tasks that rely on the recognition of both shape-based as well as texture-based objects. Given the biological constraints that the system had to satisfy, the approach performs surprisingly well: It has the capability of learning from only a few training examples and competes with state-of-the-art systems. We also discuss the existence of a universal, redundant dictionary of features that could handle the recognition of most object categories. In addition to its relevance for computer vision, the success of this approach suggests a plausibility proof for a class of feedforward models of object recognition in cortex.
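The alternation between template matching and maximum pooling that this abstract describes can be illustrated with a minimal toy sketch (hypothetical code, not the authors' implementation; patch size, stride, and the Gaussian-like tuning function are placeholder choices): an S-like layer compares local patches against stored templates, and a C-like layer takes the maximum response over a neighborhood to gain tolerance to small shifts.

```python
import numpy as np

def s_layer(image, templates, patch=8, stride=4):
    """Template matching: Gaussian-like similarity between image patches
    and stored templates (one response map per template)."""
    h, w = image.shape
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    out = np.zeros((len(templates), rows, cols))
    for t, tmpl in enumerate(templates):
        for i in range(rows):
            for j in range(cols):
                p = image[i*stride:i*stride+patch, j*stride:j*stride+patch]
                out[t, i, j] = np.exp(-np.sum((p - tmpl) ** 2))
    return out

def c_layer(s_maps, pool=2):
    """Max pooling over local neighborhoods: invariance to small translations."""
    t, r, c = s_maps.shape
    out = np.zeros((t, r // pool, c // pool))
    for i in range(r // pool):
        for j in range(c // pool):
            out[:, i, j] = s_maps[:, i*pool:(i+1)*pool, j*pool:(j+1)*pool].max(axis=(1, 2))
    return out

# Toy usage: random image, templates sampled from image patches.
rng = np.random.default_rng(0)
img = rng.random((64, 64))
templates = [img[k:k+8, k:k+8].copy() for k in (0, 16, 32)]
features = c_layer(s_layer(img, templates))
print(features.shape)   # (3, 7, 7): one pooled response map per template
```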
Convex multi-task feature learning
Machine Learning, 2007
"... We present a method for learning sparse representations shared across multiple tasks. This method is a generalization of the well-known single-task 1-norm regularization. It is based on a novel non-convex regularizer which controls the number of learned features common across the tasks. We prove th ..."
Cited by 258 (25 self)
We present a method for learning sparse representations shared across multiple tasks. This method is a generalization of the well-known single-task 1-norm regularization. It is based on a novel non-convex regularizer which controls the number of learned features common across the tasks. We prove that the method is equivalent to solving a convex optimization problem for which there is an iterative algorithm which converges to an optimal solution. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step, where in the former step it learns task-specific functions and in the latter step it learns common-across-tasks sparse representations for these functions. We also provide an extension of the algorithm which learns sparse nonlinear representations using kernels. We report experiments on simulated and real data sets which demonstrate that the proposed method can both improve the performance relative to learning each task independently and lead to a few learned features common across related tasks. Our algorithm can also be used, as a special case, to simply select – not learn – a few common variables across the tasks.
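A minimal sketch of the supervised/unsupervised alternation described in this abstract, under simplifying assumptions (squared loss, linear features, and a shared positive semidefinite matrix D updated in closed form as (W Wᵀ)^(1/2) normalized to unit trace); this is one standard way to realize such an alternating scheme, not necessarily the authors' exact algorithm, and all names below are hypothetical.

```python
import numpy as np

def multitask_feature_learning(Xs, ys, gamma=1.0, iters=50, eps=1e-6):
    """Alternating minimization sketch: the supervised step learns per-task
    weights given the shared matrix D; the unsupervised step updates D from
    the stacked weight matrix W."""
    d = Xs[0].shape[1]
    D = np.eye(d) / d                        # start with an isotropic D
    W = np.zeros((d, len(Xs)))
    for _ in range(iters):
        Dinv = np.linalg.inv(D + eps * np.eye(d))
        # Supervised step: ridge-like solution for each task t.
        for t, (X, y) in enumerate(zip(Xs, ys)):
            W[:, t] = np.linalg.solve(X.T @ X + gamma * Dinv, X.T @ y)
        # Unsupervised step: shared feature directions from W.
        U, s, _ = np.linalg.svd(W, full_matrices=False)
        sqrtWW = (U * s) @ U.T               # (W W^T)^(1/2)
        D = sqrtWW / np.trace(sqrtWW)
    return W, D

# Toy usage: three related regression tasks sharing two latent feature directions.
rng = np.random.default_rng(0)
B = rng.standard_normal((10, 2))
Xs = [rng.standard_normal((40, 10)) for _ in range(3)]
ys = [X @ B @ rng.standard_normal(2) for X in Xs]
W, D = multitask_feature_learning(Xs, ys)
print(np.round(np.linalg.eigvalsh(D)[-3:], 3))  # a few dominant shared directions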
Multi-task feature learning
Advances in Neural Information Processing Systems 19, 2007
"... We present a method for learning a low-dimensional representation which is shared across a set of multiple related tasks. The method builds upon the wellknown 1-norm regularization problem using a new regularizer which controls the number of learned features common for all the tasks. We show that th ..."
Cited by 240 (8 self)
We present a method for learning a low-dimensional representation which is shared across a set of multiple related tasks. The method builds upon the well-known 1-norm regularization problem using a new regularizer which controls the number of learned features common for all the tasks. We show that this problem is equivalent to a convex optimization problem and develop an iterative algorithm for solving it. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step, where in the latter step we learn common-across-tasks representations and in the former step we learn task-specific functions using these representations. We report experiments on a simulated and a real data set which demonstrate that the proposed method dramatically improves the performance relative to learning each task independently. Our algorithm can also be used, as a special case, to simply select – not learn – a few common features across the tasks.
Object class recognition and localization using sparse features with limited receptive fields
2006
"... ..."
Unsupervised Learning of Visual Features through Spike Timing Dependent Plasticity
"... Spike timing dependent plasticity (STDP) is a learning rule that modifies synaptic strength as a function of the relative timing of pre- and postsynaptic spikes. When a neuron is repeatedly presented with similar inputs, STDP is known to have the effect of concentrating high synaptic weights on affe ..."
Cited by 61 (6 self)
Spike timing dependent plasticity (STDP) is a learning rule that modifies synaptic strength as a function of the relative timing of pre- and postsynaptic spikes. When a neuron is repeatedly presented with similar inputs, STDP is known to have the effect of concentrating high synaptic weights on afferents that systematically fire early, while postsynaptic spike latencies decrease. Here we use this learning rule in an asynchronous feedforward spiking neural network that mimics the ventral visual pathway and show that when the network is presented with natural images, selectivity to intermediate-complexity visual features emerges. Those features, which correspond to prototypical patterns that are both salient and consistently present in the images, are highly informative and enable robust object recognition, as demonstrated on various classification tasks. Taken together, these results show that temporal codes may be a key to understanding the phenomenal processing speed achieved by the visual system and that STDP can lead to fast and selective responses.
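A minimal sketch of a pair-based STDP rule of the kind summarized above (illustrative toy code, not the authors' network; learning rates and time constants are placeholder values): synapses whose presynaptic spike precedes the postsynaptic spike are potentiated, those whose spike follows it are depressed.

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012,
                tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP: potentiate synapses whose presynaptic spike precedes
    the postsynaptic spike, depress those that follow it. Times in ms;
    weights are clipped to [0, 1]."""
    dt = t_post - t_pre                      # dt > 0: pre fires before post
    dw = np.where(dt >= 0,
                  a_plus * np.exp(-dt / tau_plus),
                  -a_minus * np.exp(dt / tau_minus))
    return np.clip(w + dw, 0.0, 1.0)

# Toy usage: one postsynaptic spike at t = 50 ms, afferents with fixed latencies.
rng = np.random.default_rng(0)
w = np.full(8, 0.5)
t_pre = rng.uniform(0, 100, size=8)          # presynaptic spike times (ms)
for _ in range(200):                          # repeated presentations of the same input
    w = stdp_update(w, t_pre, t_post=50.0)
# Afferents firing shortly before the postsynaptic spike end up strengthened;
# afferents firing after it are depressed toward zero.
print(np.round(w, 2))
```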
A quantitative theory of immediate visual recognition
Prog. Brain Res., 2007
"... Human and non-human primates excel at visual recognition tasks. The primate visual system exhibits a strong degree of selectivity while at the same time being robust to changes in the input image. We have developed a quantitative theory to account for the computations performed by the feedforward p ..."
Cited by 47 (14 self)
Human and non-human primates excel at visual recognition tasks. The primate visual system exhibits a strong degree of selectivity while at the same time being robust to changes in the input image. We have developed a quantitative theory to account for the computations performed by the feedforward path in the ventral stream of the primate visual cortex. Here we review recent predictions by a model instantiating the theory about physiological observations in higher visual areas. We also show that the model can perform recognition tasks on datasets of complex natural images at a level comparable to psychophysical measurements on human observers during rapid categorization tasks. In sum, the evidence suggests that the theory may provide a framework to explain the first 100–150 ms of visual object recognition. The model also constitutes a vivid example of how computational models can interact with experimental observations in order to advance our understanding of a complex phenomenon. We conclude by suggesting a number of open questions, predictions, and specific experiments for visual physiology and psychophysics.
What and where: A Bayesian inference theory of attention
2010
"... In the theoretical framework described in this thesis, attention is part of the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that predicts some of its main properties at the level of psychop ..."
Cited by 36 (6 self)
In the theoretical framework described in this thesis, attention is part of the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that predicts some of its main properties at the level of psychophysics and physiology. In our approach, the main goal of the visual system is to infer the identity and the position of objects in visual scenes: spatial attention emerges as a strategy to reduce the uncertainty in shape information while feature-based attention reduces the uncertainty in spatial information. Featural and spatial attention represent two distinct modes of a computational process solving the problem of recognizing and localizing objects, especially in difficult recognition tasks such as in cluttered natural scenes. We describe a specific computational model and relate it to the known functional anatomy of attention. We show that several well-known attentional phenomena – including bottom-up pop-out effects, multiplicative modulation of neuronal tuning
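As a toy illustration of the "what is where" inference described above (a hypothetical sketch, not the thesis model; all names and values are made up): a factorized posterior over object identity and location, where conditioning on a location prior plays the role of spatial attention and sharpens the identity estimate.

```python
import numpy as np

def what_where_posterior(evidence, prior_obj=None, prior_loc=None):
    """Toy joint inference: evidence[o, x] ~ p(features | object o at location x).
    Returns the joint posterior p(o, x | features) and its two marginals."""
    n_obj, n_loc = evidence.shape
    p_o = np.full(n_obj, 1.0 / n_obj) if prior_obj is None else prior_obj
    p_x = np.full(n_loc, 1.0 / n_loc) if prior_loc is None else prior_loc
    joint = evidence * p_o[:, None] * p_x[None, :]
    joint /= joint.sum()
    return joint, joint.sum(axis=1), joint.sum(axis=0)   # joint, p(what), p(where)

# Toy usage: 3 object classes, 5 candidate locations.
rng = np.random.default_rng(0)
likelihood = rng.random((3, 5))
likelihood[1, 3] = 5.0                        # strong evidence for object 1 at location 3
_, p_what, p_where = what_where_posterior(likelihood)
print(np.round(p_what, 2), np.round(p_where, 2))

# "Spatial attention": a sharp location prior reduces uncertainty about identity.
loc_prior = np.array([0.01, 0.01, 0.01, 0.96, 0.01])
_, p_what_attended, _ = what_where_posterior(likelihood, prior_loc=loc_prior)
print(np.round(p_what_attended, 2))
```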
A model of V4 shape selectivity and invariance
J. Neurophysiol., 2007
"... Poggio T. A model of V4 shape selectivity and invariance. J ..."
Cited by 26 (6 self)
Interactions of visual attention and object recognition: computational modeling, algorithms, and psychophysics
2006
"... iii iv I would like to thank my advisor, Dr. Christof Koch, for his guidance and patience throughout the work that led to this thesis. He and the other members of my advisory committee, Dr. Pietro Perona, Dr. Laurent Itti, Dr. Shinsuke Shimojo, and Dr. Richard Andersen, helped me to stay focused whe ..."
Cited by 21 (0 self)
I would like to thank my advisor, Dr. Christof Koch, for his guidance and patience throughout the work that led to this thesis. He and the other members of my advisory committee, Dr. Pietro Perona, Dr. Laurent Itti, Dr. Shinsuke Shimojo, and Dr. Richard Andersen, helped me to stay focused when I was about to embark on yet another project. It was an honor and pleasure to collaborate with Ueli Rutishauser and Dr. Fei-Fei Li at Caltech;
Attention in hierarchical models of object recognition
Prog. Brain Res., 2007
"... Object recognition and visual attention are tightly linked processes in human perception. Over the last three decades, many models have been suggested to explain these two processes and their interactions, and in some cases these models appear to contradict each other. We suggest a unifying framewor ..."
Cited by 13 (0 self)
Object recognition and visual attention are tightly linked processes in human perception. Over the last three decades, many models have been suggested to explain these two processes and their interactions, and in some cases these models appear to contradict each other. We suggest a unifying framework for object recognition and attention and review the existing modeling literature in this context. Furthermore, we demonstrate a proof-of-concept implementation for sharing complex features between recognition and attention as a mode of top-down attention to particular objects or object categories. “At first he’d most easily make out the shadows; and after that the phantoms of the human beings and the other things in water; and, later, the things themselves.” — Socrates describing the visual experience of a man exposed to the richness of the visual world outside his cave for the first time (Plato, The Republic).
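The feature-sharing idea can be illustrated with a toy sketch (hypothetical code, not the authors' proof-of-concept; the weighting scheme and names are assumptions): the same feature maps that feed recognition are reweighted by how diagnostic each feature is for a target category and summed into a top-down priority map over locations.

```python
import numpy as np

def top_down_attention(feature_maps, category_weights):
    """Reuse recognition features for attention: weight each feature map by its
    diagnosticity for the target category and sum into a spatial priority map."""
    # feature_maps: (n_features, H, W); category_weights: (n_features,)
    amap = np.tensordot(category_weights, feature_maps, axes=(0, 0))
    amap -= amap.min()
    return amap / (amap.max() + 1e-12)        # normalized to [0, 1]

# Toy usage: 4 shared feature maps, target category defined by features 0 and 2.
rng = np.random.default_rng(0)
maps = rng.random((4, 16, 16))
weights = np.array([1.0, 0.0, 0.8, 0.0])
priority = top_down_attention(maps, weights)
print(np.unravel_index(priority.argmax(), priority.shape))  # highest-priority location
```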