A theory of object recognition: computations and circuits in the feedforward path of the ventral stream in primate visual cortex. http://dspace.mit.edu/handle/1721.1/36407 (2005)

by T Serre, M Kouh, C Cadieu
Results 1 - 10 of 85

Robust object recognition with cortex-like mechanisms

by Thomas Serre, Lior Wolf, Stanley Bileschi, Maximilian Riesenhuber, Tomaso Poggio - IEEE Trans. Pattern Analysis and Machine Intelligence , 2007
"... Abstract—We introduce a new general framework for the recognition of complex visual scenes, which is motivated by biology: We describe a hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating b ..."
Cited by 389 (47 self)
Abstract—We introduce a new general framework for the recognition of complex visual scenes, which is motivated by biology: We describe a hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating between a template matching and a maximum pooling operation. We demonstrate the strength of the approach on a range of recognition tasks: From invariant single object recognition in clutter to multiclass categorization problems and complex scene understanding tasks that rely on the recognition of both shape-based as well as texture-based objects. Given the biological constraints that the system had to satisfy, the approach performs surprisingly well: It has the capability of learning from only a few training examples and competes with state-of-the-art systems. We also discuss the existence of a universal, redundant dictionary of features that could handle the recognition of most object categories. In addition to its relevance for computer vision, the success of this approach suggests a plausibility proof for a class of feedforward models of object recognition in cortex.

Citation Context

...dedicated to shape processing and object recognition. The system described in this paper may be the first counterexample to this situation: it is based on a model of object recognition in cortex [14], [15], it respects the properties of cortical processing (including the absence of geometrical information) while showing performance at least comparable to the best computer vision systems. It has been su...
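
The alternation between template matching and max pooling that the abstract describes can be illustrated with a minimal sketch of one such layer pair. The Gaussian tuning function, patch size, pooling grid, and random templates below are illustrative assumptions for exposition, not the parameters or feature dictionary used by Serre et al.

    import numpy as np

    def s_layer(image, templates, sigma=1.0):
        # Template matching: Gaussian (RBF) response of every image patch
        # to every stored template. templates has shape (n_templates, p, p).
        n, p, _ = templates.shape
        h, w = image.shape
        out = np.zeros((n, h - p + 1, w - p + 1))
        for i in range(h - p + 1):
            for j in range(w - p + 1):
                patch = image[i:i + p, j:j + p]
                d2 = ((templates - patch) ** 2).sum(axis=(1, 2))
                out[:, i, j] = np.exp(-d2 / (2.0 * sigma ** 2))
        return out

    def c_layer(s_maps, pool=8):
        # Maximum pooling over non-overlapping pool x pool neighborhoods,
        # which provides the position/scale tolerance of the C units.
        n, h, w = s_maps.shape
        out = np.zeros((n, h // pool, w // pool))
        for i in range(h // pool):
            for j in range(w // pool):
                block = s_maps[:, i * pool:(i + 1) * pool, j * pool:(j + 1) * pool]
                out[:, i, j] = block.max(axis=(1, 2))
        return out

    rng = np.random.default_rng(0)
    image = rng.random((64, 64))          # stand-in for a preprocessed input
    templates = rng.random((4, 7, 7))     # stand-in for learned patch templates
    features = c_layer(s_layer(image, templates))
    print(features.shape)                 # (4, 7, 7)

Stacking several such pairs, with the intermediate templates learned from natural images, yields the increasingly complex and invariant representation the abstract refers to.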

Convex multi-task feature learning

by Andreas Argyriou, Theodoros Evgeniou, Massimiliano Pontil - MACHINE LEARNING , 2007
"... We present a method for learning sparse representations shared across multiple tasks. This method is a generalization of the well-known single-task 1-norm regularization. It is based on a novel non-convex regularizer which controls the number of learned features common across the tasks. We prove th ..."
Cited by 258 (25 self)
We present a method for learning sparse representations shared across multiple tasks. This method is a generalization of the well-known single-task 1-norm regularization. It is based on a novel non-convex regularizer which controls the number of learned features common across the tasks. We prove that the method is equivalent to solving a convex optimization problem for which there is an iterative algorithm which converges to an optimal solution. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step, where in the former step it learns task-specific functions and in the latter step it learns common-across-tasks sparse representations for these functions. We also provide an extension of the algorithm which learns sparse nonlinear representations using kernels. We report experiments on simulated and real data sets which demonstrate that the proposed method can both improve the performance relative to learning each task independently and lead to a few learned features common across related tasks. Our algorithm can also be used, as a special case, to simply select – not learn – a few common variables across the tasks.

Citation Context

...detecting a specific object in images is treated as a single supervised learning task. Images of different objects may share a number of features that are different from the pixel representation of images [28, 41, 43]. In modeling users/consumers’ preferences [1, 33], there may be common product features (e.g., for cars, books, webpages, consumer electronics, etc) that are considered to be important by a number of...
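
As a rough illustration of the alternating supervised/unsupervised scheme sketched in the abstract, the toy code below fits per-task weights by regularized least squares given a shared matrix D and then re-estimates D from the pooled task weights. The closed-form updates, the epsilon smoothing, and all hyperparameters are assumptions made for this sketch, not the paper's exact convex formulation.

    import numpy as np

    def sqrtm_psd(a):
        # Symmetric PSD matrix square root via eigendecomposition.
        vals, vecs = np.linalg.eigh(a)
        return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

    def multitask_feature_learning(xs, ys, gamma=1.0, iters=20, eps=1e-6):
        d = xs[0].shape[1]
        shared = np.eye(d) / d                     # shared feature-importance matrix
        weights = np.zeros((d, len(xs)))
        for _ in range(iters):
            # Supervised step: per-task regularized least squares given shared.
            shared_inv = np.linalg.inv(shared + eps * np.eye(d))
            for t, (x, y) in enumerate(zip(xs, ys)):
                weights[:, t] = np.linalg.solve(x.T @ x + gamma * shared_inv, x.T @ y)
            # "Unsupervised" step: re-estimate the shared matrix from all tasks.
            c = sqrtm_psd(weights @ weights.T + eps * np.eye(d))
            shared = c / np.trace(c)
        return weights, shared

    # Toy usage: two related regression tasks driven by the same few features.
    rng = np.random.default_rng(0)
    x1, x2 = rng.standard_normal((50, 10)), rng.standard_normal((50, 10))
    w_shared = np.zeros(10)
    w_shared[:3] = 1.0                             # only the first 3 features matter
    y1 = x1 @ w_shared + 0.1 * rng.standard_normal(50)
    y2 = x2 @ w_shared + 0.1 * rng.standard_normal(50)
    weights, shared = multitask_feature_learning([x1, x2], [y1, y2])

After a few iterations the diagonal of the shared matrix concentrates on the features used by both tasks, which is the "few learned features common across related tasks" behaviour the abstract reports.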

Multi-task feature learning

by Andreas Argyriou, Theodoros Evgeniou, Massimiliano Pontil - Advances in Neural Information Processing Systems 19 , 2007
"... We present a method for learning a low-dimensional representation which is shared across a set of multiple related tasks. The method builds upon the wellknown 1-norm regularization problem using a new regularizer which controls the number of learned features common for all the tasks. We show that th ..."
Cited by 240 (8 self)
We present a method for learning a low-dimensional representation which is shared across a set of multiple related tasks. The method builds upon the well-known 1-norm regularization problem using a new regularizer which controls the number of learned features common for all the tasks. We show that this problem is equivalent to a convex optimization problem and develop an iterative algorithm for solving it. The algorithm has a simple interpretation: it alternately performs a supervised and an unsupervised step, where in the latter step we learn common-across-tasks representations and in the former step we learn task-specific functions using these representations. We report experiments on a simulated and a real data set which demonstrate that the proposed method dramatically improves the performance relative to learning each task independently. Our algorithm can also be used, as a special case, to simply select – not learn – a few common features across the tasks.

Citation Context

...visual system is organized in a way such that all objects are represented – at the earlier stages of the visual system – using a common set of features learned, e.g. local filters similar to wavelets [16]. In modeling users’ preferences/choices, it may also be the case that people make product choices (e.g. of books, music CDs, etc.) using a common set of features describing these products. In this pa...
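
For concreteness, the "regularizer which controls the number of learned features common for all the tasks" can be read as a squared (2,1)-type matrix norm on a coefficient matrix A whose rows index learned features and whose columns index tasks; the notation below is a paraphrase for illustration, not a quotation from the paper:

    \[
      \|A\|_{2,1} \;=\; \sum_{i=1}^{d} \Big( \sum_{t=1}^{T} a_{it}^{2} \Big)^{1/2},
      \qquad
      \min_{U,\,A}\;\; \sum_{t=1}^{T} \sum_{m=1}^{m_t} L\big(y_{tm},\, \langle a_t, U^{\top} x_{tm} \rangle\big)
      \;+\; \gamma\, \|A\|_{2,1}^{2}.
    \]

Penalizing the row norms of A jointly across tasks drives entire rows to zero, so only a small set of learned feature directions is shared by all tasks.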

Object class recognition and localization using sparse features with limited receptive fields

by Jim Mutch, David G. Lowe , 2006
"... ..."
Cited by 93 (4 self)
Abstract not found

Unsupervised Learning of Visual Features through Spike Timing Dependent Plasticity

by Timothée Masquelier, Simon J. Thorpe
"... Spike timing dependent plasticity (STDP) is a learning rule that modifies synaptic strength as a function of the relative timing of pre- and postsynaptic spikes. When a neuron is repeatedly presented with similar inputs, STDP is known to have the effect of concentrating high synaptic weights on affe ..."
Cited by 61 (6 self)
Spike timing dependent plasticity (STDP) is a learning rule that modifies synaptic strength as a function of the relative timing of pre- and postsynaptic spikes. When a neuron is repeatedly presented with similar inputs, STDP is known to have the effect of concentrating high synaptic weights on afferents that systematically fire early, while postsynaptic spike latencies decrease. Here we use this learning rule in an asynchronous feedforward spiking neural network that mimics the ventral visual pathway and show that when the network is presented with natural images, selectivity to intermediate-complexity visual features emerges. Those features, which correspond to prototypical patterns that are both salient and consistently present in the images, are highly informative and enable robust object recognition, as demonstrated on various classification tasks. Taken together, these results show that temporal codes may be a key to understanding the phenomenal processing speed achieved by the visual system and that STDP can lead to fast and selective responses.

Citation Context

...le spike train generated by the image. This final potential can be seen as the number of early spikes in common between a current input and a stored prototype (this contrasts with HMAX and extensions [6,7,26], where a Euclidean distance or a normalized dot product is used to measure the difference between a stored prototype ...
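
To make the learning rule concrete, here is a generic pair-based STDP update. It illustrates the dependence on relative spike timing that the abstract describes, but note that the paper uses its own simplified STDP variant; the exponential form, time constants, and learning rates below are arbitrary textbook choices, not the paper's.

    import numpy as np

    def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012,
                    tau_plus=20.0, tau_minus=20.0):
        # Pair-based STDP: potentiate when the presynaptic spike precedes the
        # postsynaptic one, depress otherwise (spike times in milliseconds).
        dt = t_post - t_pre
        if dt >= 0:
            w += a_plus * np.exp(-dt / tau_plus)
        else:
            w -= a_minus * np.exp(dt / tau_minus)
        return float(np.clip(w, 0.0, 1.0))        # keep the weight bounded in [0, 1]

    # An afferent that consistently fires 7 ms before the postsynaptic spike is
    # repeatedly potentiated, illustrating the concentration of weight on
    # early-firing inputs described in the abstract.
    w = 0.5
    for _ in range(200):
        w = stdp_update(w, t_pre=5.0, t_post=12.0)
    print(w)   # approaches the upper bound of 1.0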

A quantitative theory of immediate visual recognition

by Thomas Serre, Gabriel Kreiman, Minjoon Kouh, Charles Cadieu, Ulf Knoblich, Tomaso Poggio - PROG BRAIN RES , 2007
"... Human and non-human primates excel at visual recognition tasks. The primate visual system exhibits a strong degree of selectivity while at the same time being robust to changes in the input image. We have developed a quantitative theory to account for the computations performed by the feedforward p ..."
Cited by 47 (14 self)
Human and non-human primates excel at visual recognition tasks. The primate visual system exhibits a strong degree of selectivity while at the same time being robust to changes in the input image. We have developed a quantitative theory to account for the computations performed by the feedforward path in the ventral stream of the primate visual cortex. Here we review recent predictions by a model instantiating the theory about physiological observations in higher visual areas. We also show that the model can perform recognition tasks on datasets of complex natural images at a level comparable to psychophysical measurements on human observers during rapid categorization tasks. In sum, the evidence suggests that the theory may provide a framework to explain the first 100–150 ms of visual object recognition. The model also constitutes a vivid example of how computational models can interact with experimental observations in order to advance our understanding of a complex phenomenon. We conclude by suggesting a number of open questions, predictions, and specific experiments for visual physiology and psychophysics.

Citation Context

... phenomenological level. Plausible biophysical circuits for the TUNING and MAX operations have been proposed based on feedforward and/or feedback shunting inhibition combined with normalization (see (Serre et al., 2005) and references therein). 2.2.2 Building a dictionary of shape-components from V1 to IT The overall architecture is sketched in Figure 1 and reflects the general organization of the visual cortex...
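
The TUNING and MAX operations mentioned in this context are commonly written as a Gaussian-like template match and a maximum over afferent inputs. The expressions below are one standard way to write these two operations in this family of models, given as a paraphrase rather than a quotation from the memo:

    \[
      \text{TUNING:}\quad y \;=\; \exp\!\Big(-\frac{1}{2\sigma^{2}} \sum_{j=1}^{n} (x_j - w_j)^{2}\Big),
      \qquad
      \text{MAX:}\quad y \;=\; \max_{j=1,\dots,n} \; x_j .
    \]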

What and where: A Bayesian inference theory of attention

by Sharat Chikkerur , 2010
"... In the theoretical framework described in this thesis, attention is part of the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that predicts some of its main properties at the level of psychop ..."
Cited by 36 (6 self)
In the theoretical framework described in this thesis, attention is part of the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that predicts some of its main properties at the level of psychophysics and physiology. In our approach, the main goal of the visual system is to infer the identity and the position of objects in visual scenes: spatial attention emerges as a strategy to reduce the uncertainty in shape information while feature-based attention reduces the uncertainty in spatial information. Featural and spatial attention represent two distinct modes of a computational process solving the problem of recognizing and localizing objects, especially in difficult recognition tasks such as in cluttered natural scenes. We describe a specific computational model and relate it to the known functional anatomy of attention. We show that several well-known attentional phenomena – including bottom-up pop-out effects, multiplicative modulation of neuronal tuning
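
One compact way to read the what/where inference described in the abstract is as posterior inference over object identity O and location L given an image I, with the two attentional modes corresponding to the two marginals. This factorization is an illustrative summary, not the thesis's exact graphical model:

    \[
      P(O, L \mid I) \;\propto\; P(I \mid O, L)\, P(O)\, P(L),
      \qquad
      P(O \mid I) = \sum_{L} P(O, L \mid I),
      \quad
      P(L \mid I) = \sum_{O} P(O, L \mid I).
    \]

Under this reading, sharpening the prior or posterior over L (spatial attention) reduces uncertainty about identity, while conditioning on a target object or feature (feature-based attention) sharpens the location marginal, matching the two modes described in the abstract.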

A model of V4 shape selectivity and invariance

by Charles Cadieu, Minjoon Kouh, Anitha Pasupathy, Charles E. Connor, Maximilian Riesenhuber, Tomaso Poggio - J. Neurophysiol , 2007
"... Poggio T. A model of V4 shape selectivity and invariance. J ..."
Cited by 26 (6 self)
Poggio T. A model of V4 shape selectivity and invariance. J

Citation Context

...d to transformed versions of the same stimuli. Based on these hypotheses, quantitative models of the ventral pathway have been developed (Fukushima et al. 1983; Mel 1997; Riesenhuber and Poggio 1999; Serre et al. 2005, 2007a) with the goal of explaining object recognition. The V4 model presented here is part of a model (Serre et al. 2005, 2007a) of the entire ventral pathway. Within this framework, we sought to ex...

Interactions of visual attention and object recognition: computational modeling, algorithms, and psychophysics

by Dirk Walther , 2006
"... iii iv I would like to thank my advisor, Dr. Christof Koch, for his guidance and patience throughout the work that led to this thesis. He and the other members of my advisory committee, Dr. Pietro Perona, Dr. Laurent Itti, Dr. Shinsuke Shimojo, and Dr. Richard Andersen, helped me to stay focused whe ..."
Cited by 21 (0 self)
I would like to thank my advisor, Dr. Christof Koch, for his guidance and patience throughout the work that led to this thesis. He and the other members of my advisory committee, Dr. Pietro Perona, Dr. Laurent Itti, Dr. Shinsuke Shimojo, and Dr. Richard Andersen, helped me to stay focused when I was about to embark on yet another project. It was an honor and pleasure to collaborate with Ueli Rutishauser and Dr. Fei-Fei Li at Caltech;

Attention in hierarchical models of object recognition

by Dirk B. Walther, Christof Koch - Prog. Brain Res , 2007
"... Object recognition and visual attention are tightly linked processes in human perception. Over the last three decades, many models have been suggested to explain these two processes and their interactions, and in some cases these models appear to contradict each other. We suggest a unifying framewor ..."
Cited by 13 (0 self)
Object recognition and visual attention are tightly linked processes in human perception. Over the last three decades, many models have been suggested to explain these two processes and their interactions, and in some cases these models appear to contradict each other. We suggest a unifying framework for object recognition and attention and review the existing modeling literature in this context. Furthermore, we demonstrate a proof-of-concept implementation for sharing complex features between recognition and attention as a mode of top-down attention to particular objects or object categories. “At first he’d most easily make out the shadows; and after that the phantoms of the human beings and the other things in water; and, later, the things themselves.” — Socrates describing the visual experience of a man exposed to the richness of the visual world outside his cave for the first time (Plato, The Republic).