Results 1–10 of 47
Learning Deep Architectures for AI
Abstract

Cited by 183 (30 self)
Theoretical results suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g. in vision, language, and other AI-level tasks), one may need deep architectures. Deep architectures are composed of multiple levels of nonlinear operations, such as in neural nets with many hidden layers or in complicated propositional formulae reusing many subformulae. Searching the parameter space of deep architectures is a difficult task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success, beating the state-of-the-art in certain areas. This paper discusses the motivations and principles regarding learning algorithms for deep architectures, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
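The greedy layer-wise construction the abstract describes (train a single-layer RBM, then feed its hidden activations to the next RBM) can be sketched as follows. This is a minimal illustration, not the paper's method: the CD-1 update, toy data, layer sizes, and learning rate are all assumptions chosen for brevity.

```python
# Minimal sketch of greedy layer-wise pre-training with Restricted
# Boltzmann Machines, stacked as in a Deep Belief Network.
# Sizes, learning rate, and the toy data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=50, lr=0.1):
    """Train one binary RBM with CD-1; return (weights, hidden biases)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_h = np.zeros(n_hidden)
    b_v = np.zeros(n_visible)
    for _ in range(epochs):
        # Positive phase: hidden probabilities given the data.
        h_prob = sigmoid(data @ W + b_h)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: one step of Gibbs sampling (CD-1).
        v_recon = sigmoid(h_sample @ W.T + b_v)
        h_recon = sigmoid(v_recon @ W + b_h)
        # Contrastive divergence update of weights and biases.
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / len(data)
        b_h += lr * (h_prob - h_recon).mean(axis=0)
        b_v += lr * (data - v_recon).mean(axis=0)
    return W, b_h

# Toy binary data: two repeated patterns.
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 20, dtype=float)

# Greedy stacking: train layer 1, feed its activations to layer 2.
W1, b1 = train_rbm(data, n_hidden=3)
layer1_out = sigmoid(data @ W1 + b1)
W2, b2 = train_rbm(layer1_out, n_hidden=2)
layer2_out = sigmoid(layer1_out @ W2 + b2)
print(layer2_out.shape)  # (40, 2): a two-level representation
```

Each layer is trained only on the output of the layer below it, which is what makes the procedure "greedy": no global objective ties the layers together during pre-training.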
Deep Sparse Rectifier Neural Networks
Abstract

Cited by 57 (17 self)
While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multilayer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of the hard nonlinearity and non-differentiability at zero, creating sparse representations with true zeros, which seem remarkably suitable for naturally sparse data. Even though they can take advantage of semi-supervised setups with extra unlabeled data, deep rectifier networks can reach their best performance without requiring any unsupervised pre-training on purely supervised tasks with large labeled datasets. Hence, these results can be seen as a new milestone in the attempts at understanding the difficulty in training deep but purely supervised neural networks, and closing the performance gap between neural networks learnt with and without unsupervised pre-training.
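The "true zeros" point is easy to see numerically: a rectifier clamps every negative pre-activation to exactly zero, while tanh merely squashes it toward zero. A tiny sketch (the random pre-activations are an illustrative assumption):

```python
# Rectifier vs. tanh on random pre-activations: only the rectifier
# produces exact zeros, i.e. a genuinely sparse representation.
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    return np.maximum(0.0, x)  # hard nonlinearity; exact zeros below 0

x = rng.standard_normal((100, 50))              # toy inputs
W = rng.standard_normal((50, 50)) / np.sqrt(50) # toy weights
pre = x @ W                                     # pre-activations

relu_act = relu(pre)
tanh_act = np.tanh(pre)

relu_sparsity = np.mean(relu_act == 0.0)  # roughly half the units
tanh_sparsity = np.mean(tanh_act == 0.0)  # essentially none
print(f"ReLU zeros: {relu_sparsity:.0%}, tanh zeros: {tanh_sparsity:.0%}")
```

With roughly zero-mean pre-activations, about half the rectifier units are exactly off, which is the sparsity property the abstract highlights.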
What and where: A Bayesian inference theory of attention
, 2010
Abstract

Cited by 36 (6 self)
In the theoretical framework described in this thesis, attention is part of the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that predicts some of its main properties at the level of psychophysics and physiology. In our approach, the main goal of the visual system is to infer the identity and the position of objects in visual scenes: spatial attention emerges as a strategy to reduce the uncertainty in shape information while feature-based attention reduces the uncertainty in spatial information. Featural and spatial attention represent two distinct modes of a computational process solving the problem of recognizing and localizing objects, especially in difficult recognition tasks such as in cluttered natural scenes. We describe a specific computational model and relate it to the known functional anatomy of attention. We show that several well-known attentional phenomena – including bottom-up pop-out effects, multiplicative modulation of neuronal tuning curves and shift in contrast responses – emerge naturally as predictions of the model.
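The core claim – that attending to one variable (where) reduces uncertainty about the other (what) – is just conditioning in a joint posterior. A minimal numerical sketch, with made-up likelihood values and only two objects and two locations (all assumptions, not the thesis' model):

```python
# "What is where" as Bayesian inference: a joint posterior over object
# identity (what) and location (where). Spatial attention is modeled as
# conditioning on a location, which sharpens the belief about identity.
# All numbers are illustrative.
import numpy as np

objects = ["face", "car"]       # "what"
locations = ["left", "right"]   # "where"

# Likelihood p(features | object, location); rows = objects.
lik = np.array([[0.6, 0.1],
                [0.2, 0.1]])
prior = np.full((2, 2), 0.25)   # flat prior over (what, where)
post = lik * prior
post /= post.sum()              # joint posterior p(what, where | features)

def entropy(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

p_what = post.sum(axis=1)                      # marginal belief in "what"
p_what_given_left = post[:, 0] / post[:, 0].sum()  # attend to "left"
print(entropy(p_what), entropy(p_what_given_left))
```

Conditioning on the attended location lowers the entropy of the identity marginal, which is the uncertainty-reduction role the abstract assigns to spatial attention.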
Shallow vs. deep sum-product networks
 In NIPS
, 2011
Abstract

Cited by 8 (2 self)
We investigate the representational power of sum-product networks (computation networks analogous to neural networks, but whose individual units compute either products or weighted sums), through a theoretical analysis that compares deep (multiple hidden layers) vs. shallow (one hidden layer) architectures. We prove there exist families of functions that can be represented much more efficiently with a deep network than with a shallow one, i.e. with substantially fewer hidden units. Such results were not available until now, and contribute to motivate recent research involving learning of deep sum-product networks, and more generally motivate research in Deep Learning.
1 Introduction and prior work
Many learning algorithms are based on searching a family of functions so as to identify one member of said family which minimizes a training criterion. The choice of this family of functions and how members of that family are parameterized can be a crucial one. Although there is no universally optimal choice of parameterization or family of functions (or “architecture”), as demonstrated by …
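The deep-vs-shallow efficiency gap can be illustrated on a tiny hypothetical sum-product network (the specific weights and inputs below are made up): a product of two weighted sums computes, when flattened into a single "shallow" layer, a sum over every cross-term.

```python
# A two-layer sum-product network on four inputs: one product unit over
# two weighted-sum units. Expanding it into a flat sum of products shows
# the blow-up the paper analyzes. Weights and inputs are illustrative.
import numpy as np

def deep_spn(x, w):
    # Layer 1: two weighted sums; layer 2: one product unit.
    s1 = w[0] * x[0] + w[1] * x[1]
    s2 = w[2] * x[2] + w[3] * x[3]
    return s1 * s2

def shallow_spn(x, w):
    # Same function as one flat weighted sum of 4 product units.
    return (w[0] * w[2] * x[0] * x[2] + w[0] * w[3] * x[0] * x[3]
            + w[1] * w[2] * x[1] * x[2] + w[1] * w[3] * x[1] * x[3])

x = np.array([0.5, 2.0, 1.5, 3.0])
w = np.array([0.1, 0.9, 0.4, 0.6])
assert np.isclose(deep_spn(x, w), shallow_spn(x, w))
# With n sum units of k terms each under one product unit, the shallow
# expansion needs k**n product units, while the deep network uses only
# n sum units and one product unit.
```

This toy case only hints at the result; the paper proves the exponential separation for specific function families.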
Latent Hierarchical Model of Temporal Structure for Complex Activity Classification
Abstract

Cited by 8 (7 self)
Abstract — Modeling the temporal structure of sub-activities is an important yet challenging problem in complex activity classification. This paper proposes a latent hierarchical model (LHM) to describe the decomposition of a complex activity into sub-activities in a hierarchical way. The LHM has a tree structure, where each node corresponds to a video segment (sub-activity) at a certain temporal scale. The starting and ending time points of each sub-activity are represented by two latent variables, which are automatically determined during the inference process. We formulate the training problem of the LHM in a latent kernelized SVM framework and develop an efficient cascade inference method to speed up classification. The advantages of our methods come from: 1) the LHM models the complex activity with a deep structure, which is decomposed into sub-activities in a coarse-to-fine manner; and 2) the starting and ending time points of each segment are adaptively determined to deal with the temporal displacement and duration variation of sub-activities. We conduct experiments on three datasets: 1) the KTH; 2) the Hollywood2; and 3) the Olympic Sports. The experimental results show the effectiveness of the LHM in complex activity classification. With dense features, our LHM achieves state-of-the-art performance on the Hollywood2 dataset and the Olympic Sports dataset. Index Terms — Activity classification, hierarchical model, deep …
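The latent-boundary idea – segment start/end points chosen during inference to maximize the score – can be sketched in one dimension. This is a toy stand-in, not the paper's LHM: the "video" is a scalar feature sequence, there are only two sub-activities, and the scoring function is an invented template match.

```python
# Toy sketch of latent segment boundaries: the split point between two
# sub-activities is a latent variable, and inference searches every
# candidate split for the highest-scoring decomposition.
# Scoring and "templates" are illustrative stand-ins, not the paper's model.
import numpy as np

def segment_score(features, template):
    # Similarity of a segment's mean feature to a sub-activity template.
    return -abs(features.mean() - template)

def infer_split(features, templates):
    """Return (best_score, split) maximizing over the latent split point."""
    best = (-np.inf, None)
    for t in range(1, len(features)):
        score = (segment_score(features[:t], templates[0])
                 + segment_score(features[t:], templates[1]))
        best = max(best, (score, t))
    return best

# Sequence: low activity followed by high activity, change at frame 5.
video = np.array([0.1, 0.0, 0.2, 0.1, 0.0, 0.9, 1.0, 0.8, 1.1, 0.9])
score, split = infer_split(video, templates=(0.1, 0.95))
print(split)  # → 5 (the boundary recovered by the latent search)
```

The LHM does this recursively over a tree of segments and uses cascade inference to avoid the exhaustive search shown here.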
What and where: a Bayesian inference theory of visual attention
 Vision Research
Abstract

Cited by 5 (1 self)
In the theoretical framework described in this thesis, attention is part of the inference process that solves the visual recognition problem of what is where. The theory proposes a computational role for attention and leads to a model that predicts some of its main properties at the level of psychophysics and physiology. In our approach, the main goal of the visual system is to infer the identity and the position of objects in visual scenes: spatial attention emerges as a strategy to reduce the uncertainty in shape information while feature-based attention reduces the uncertainty in spatial information. Featural and spatial attention represent two distinct modes of a computational process solving the problem of recognizing and localizing objects, especially in difficult recognition tasks such as in cluttered natural scenes. We describe a specific computational model and relate it to the known functional anatomy of attention. We show that several well-known attentional phenomena including bottom-up pop-out effects, multiplicative modulation of neuronal tuning curves and shift in contrast responses emerge naturally as predictions of the model. We also show that the Bayesian model predicts well human eye fixations (considered as a proxy …
Learning hierarchical sparse representations using iterative dictionary learning and dimension reduction
 In Proc. of BICA
, 2011
Abstract

Cited by 3 (0 self)
This paper introduces an elemental building block which combines Dictionary Learning and Dimension Reduction (DRDL). We show how this foundational element can be used to iteratively construct a Hierarchical Sparse Representation (HSR) of a sensory stream. We compare our approach to existing models showing the generality of our simple prescription. We then perform preliminary experiments using this framework, illustrating with the example of an object recognition task using standard datasets. This work introduces the very first steps towards an integrated framework for designing and analyzing various computational tasks from learning to attention to action. The ultimate goal is building a mathematically rigorous, integrated theory of intelligence.
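The DRDL recipe – sparse-code the input against a learned dictionary, reduce the code's dimension, and stack the resulting blocks – can be sketched crudely. Everything below is an assumption-laden stand-in: the "dictionary learning" is a naive k-means-style refinement, the sparse code is 1-sparse nearest-atom assignment, and the dimension reduction is plain PCA via SVD; the paper's actual algorithms differ.

```python
# Rough sketch of one DRDL block (Dictionary learning + Dimension
# Reduction), stacked twice to form a two-level hierarchy.
# All components are crude illustrative stand-ins for the paper's methods.
import numpy as np

rng = np.random.default_rng(2)

def learn_dictionary(X, n_atoms, iters=10):
    # Naive k-means-style dictionary: atoms are cluster means.
    D = X[rng.choice(len(X), n_atoms, replace=False)]
    for _ in range(iters):
        assign = np.argmax(X @ D.T, axis=1)  # nearest atom by inner product
        for k in range(n_atoms):
            if np.any(assign == k):
                D[k] = X[assign == k].mean(axis=0)
    return D

def drdl_block(X, n_atoms, n_dims):
    D = learn_dictionary(X, n_atoms)
    codes = np.zeros((len(X), n_atoms))
    codes[np.arange(len(X)), np.argmax(X @ D.T, axis=1)] = 1.0  # 1-sparse
    # Dimension reduction: PCA via truncated SVD of the centered codes.
    C = codes - codes.mean(axis=0)
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    return C @ Vt[:n_dims].T

X = rng.standard_normal((60, 8))            # toy "sensory stream"
h1 = drdl_block(X, n_atoms=6, n_dims=4)     # level 1 of the hierarchy
h2 = drdl_block(h1, n_atoms=4, n_dims=2)    # level 2, built on level 1
print(h2.shape)  # (60, 2)
```

The point of the sketch is only the composition pattern: each block's reduced output becomes the next block's input, which is how the hierarchical sparse representation is built.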
Derived Distance: towards a mathematical theory of visual cortex.
, 2007
Abstract

Cited by 3 (2 self)
We describe a “natural” metric on the space of images motivated by the neuroscience of visual cortex. We propose the notion of a hierarchical derived distance and suggest that it could be applied to the classification of imagery and text and to the analysis of genomics data.
Informatics in neuroscience
 Brief. Bioinformatics
, 2007
Abstract

Cited by 2 (2 self)
The application of informatics to neuroscience goes far beyond ‘traditional’ bioinformatics modalities such as DNA sequences. In this review, we describe how informatics is being used to study the nervous system at multiple levels, spanning scales from molecules to behavior. The continuing development of standards for data exchange and interoperability, together with increasing awareness and acceptance of the importance of data sharing, are among the key efforts required to advance the field.
Mathematics of the neural response
, 2008
Abstract

Cited by 2 (2 self)
We propose a natural image representation, the neural response, motivated by the neuroscience of the visual cortex. The inner product defined by the neural response leads to a similarity measure between functions which we call the derived kernel. Based on a hierarchical architecture, we give a recursive definition of the neural response and associated derived kernel. The derived kernel can be used in a variety of applications. The goal of this paper is to define a distance function on a space of images which reflects how humans see the images. The distance between two images corresponds to how similar they appear to an observer. Most learning algorithms critically depend on a suitably defined similarity measure, though the theory of learning so far provides no general rule to choose …
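One layer of the recursive construction can be sketched as follows: represent an image by its maximal normalized similarity to a set of stored templates over all patch positions, then define the derived kernel as the normalized inner product of those responses. The 1-D "images", random templates, and patch extraction below are toy stand-ins for the paper's construction, shown only to make the recursion concrete.

```python
# Hedged sketch of one layer of the neural response / derived kernel:
# max over patch positions of template similarity, then a normalized
# inner product of the resulting response vectors.
# Templates and 1-D "images" are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)

def normalize(v):
    return v / (np.linalg.norm(v) + 1e-12)

def neural_response(image, templates, patch=4):
    # For each template: max similarity over all patch positions
    # (a max-pooling step, as in hierarchical models of visual cortex).
    resp = []
    for t in templates:
        sims = [float(normalize(image[i:i + patch]) @ normalize(t))
                for i in range(len(image) - patch + 1)]
        resp.append(max(sims))
    return np.array(resp)

def derived_kernel(a, b, templates):
    ra, rb = neural_response(a, templates), neural_response(b, templates)
    return float(normalize(ra) @ normalize(rb))

templates = [rng.standard_normal(4) for _ in range(5)]
img = rng.standard_normal(16)
similar = img + 0.01 * rng.standard_normal(16)  # small perturbation
different = rng.standard_normal(16)

k_same = derived_kernel(img, similar, templates)
k_diff = derived_kernel(img, different, templates)
print(round(k_same, 3), round(k_diff, 3))
```

In the paper this definition is applied recursively: the responses at one layer become the "images" compared against the next layer's templates, yielding the hierarchical derived kernel.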