Results 11–17 of 17
Marginalized Denoising Autoencoders for Nonlinear Representations
Abstract

Cited by 1 (0 self)
Denoising autoencoders (DAEs) have been successfully used to learn new representations for a wide range of machine learning tasks. During training, DAEs make many passes over the training dataset and reconstruct it from partial corruption generated by a pre-specified corrupting distribution. This process learns robust representations, though at the expense of requiring many training epochs in which the data is explicitly corrupted. In this paper we present the marginalized Denoising Autoencoder (mDAE), which (approximately) marginalizes out the corruption during training. Effectively, the mDAE takes into account infinitely many corrupted copies of the training data in every epoch, and is therefore able to match or outperform the DAE with far fewer training epochs. We analyze our proposed algorithm and show that it can be understood as a classic autoencoder with a special form of regularization. In empirical evaluations we show that it attains a 1–2 order-of-magnitude speedup in training time over competing approaches.
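The marginalization idea in this abstract can be illustrated with a minimal sketch (not the paper's algorithm): for a linear reconstruction map `W` and additive Gaussian corruption, the expected denoising loss has a closed form, namely the clean reconstruction error plus a ridge-like penalty `sigma^2 * ||W||_F^2`. The snippet below, with illustrative dimensions and noise level, compares this closed form against an explicit Monte Carlo average over many corrupted copies:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
x = rng.normal(size=d)            # one training point
W = rng.normal(size=(d, d)) * 0.1  # linear "autoencoder" map
sigma = 0.5                        # corruption noise level

# Explicit-corruption DAE loss: average squared reconstruction error
# over many corrupted copies of x (what repeated epochs approximate).
K = 20000
noise = rng.normal(scale=sigma, size=(K, d))
explicit = np.mean([np.sum((x - W @ (x + e)) ** 2) for e in noise])

# Marginalized loss (closed form): E||x - W(x+eps)||^2
#   = ||x - Wx||^2 + sigma^2 * ||W||_F^2
marginalized = np.sum((x - W @ x) ** 2) + sigma**2 * np.sum(W**2)

print(abs(explicit - marginalized) < 0.05)  # True: the two agree
```

The closed form is exact here because the cross term between the residual and the zero-mean noise vanishes in expectation; the general mDAE handles nonlinear networks approximately, as the abstract states.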
The 1st International Workshop "Feature Extraction: Modern Questions and Challenges" FEAST at Play: Feature ExtrAction using Score function Tensors
Abstract
Feature learning forms the cornerstone for tackling challenging learning problems in domains such as speech, computer vision and natural language processing. In this paper, we build upon a novel framework called FEAST (Feature ExtrAction using Score function Tensors) which incorporates generative models for discriminative learning. FEAST considers a novel class of matrix- and tensor-valued feature transforms, which can be pre-trained using unlabeled samples. It uses an efficient algorithm for extracting discriminative information, given these pre-trained features and labeled samples for any related task. The class of features it adopts is based on higher-order score functions, which capture local variations in the probability density function of the input. We employ efficient spectral decomposition algorithms (on matrices and tensors) for extracting discriminative components. The advantage of employing tensor-valued features is that we can extract richer discriminative information in the form of overcomplete representations (where the number of discriminative features is greater than the input dimension). In this paper, we provide preliminary experimental results on real datasets.
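The central object in this abstract is the score function of the input density. As an illustrative (non-FEAST) example, the first-order score `S1(x) = -d/dx log p(x)` of a 1-D Gaussian has the closed form `(x - mu) / sigma^2`, which the sketch below verifies against a numerical derivative:

```python
import numpy as np

# First-order score function: S1(x) = -d/dx log p(x).
# For a 1-D Gaussian N(mu, sigma^2) this is (x - mu) / sigma^2;
# FEAST builds features from such (and higher-order) score functions.
mu, sigma = 1.0, 2.0

def log_p(x):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

x = 3.0
eps = 1e-6
score_numeric = -(log_p(x + eps) - log_p(x - eps)) / (2 * eps)
score_closed = (x - mu) / sigma**2

print(abs(score_numeric - score_closed) < 1e-6)  # True
```

Higher-order score functions (derivatives of higher order, arranged as tensors) generalize this idea, which is what makes the paper's tensor decompositions applicable.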
GSNs: Generative Stochastic Networks
Abstract
We introduce a novel training principle for generative probabilistic models that is an alternative to maximum likelihood. The proposed Generative Stochastic Networks (GSN) framework generalizes Denoising Auto-Encoders (DAE) and is based on learning the transition operator of a Markov chain whose stationary distribution estimates the data distribution. The transition distribution is a conditional distribution that generally involves a small move, so it has fewer dominant modes and is unimodal in the limit of small moves. This simplifies the learning problem, making it less like density estimation and more akin to supervised function approximation, with gradients that can be obtained by backprop. The theorems presented here give a probabilistic interpretation for denoising autoencoders and generalize them; seen in the context of this framework, autoencoders that learn with injected noise are a special case of GSNs and can be interpreted as generative models. The theorems also provide an interesting justification for dependency networks and generalized pseudo-likelihood, and define an appropriate joint distribution and sampling mechanism even when the conditionals are not consistent. GSNs can be used with missing inputs and can be used to sample subsets of variables given the rest. Experiments validating these theoretical results are conducted on both synthetic datasets and image datasets. The experiments employ a particular architecture that mimics the Deep Boltzmann Machine Gibbs sampler but that allows training to proceed with backprop through a recurrent neural network with noise injected inside, without the need for layer-wise pretraining.
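The sampling mechanism the abstract describes, running a Markov chain that alternates corruption with a learned reconstruction, can be sketched with a toy hand-coded "denoiser" standing in for a trained network (the function, data modes, and noise level below are all illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical trained denoiser: snaps a corrupted point to the nearer
# of two data modes at -1 and +1 (stands in for a learned DAE).
def denoise(x):
    return 1.0 if x > 0 else -1.0

# GSN-style sampling: alternate corruption and reconstruction. The
# chain's stationary distribution estimates the data distribution.
sigma = 0.5
x = 0.0
samples = []
for _ in range(5000):
    x_corrupt = x + rng.normal(scale=sigma)  # local move
    x = denoise(x_corrupt)
    samples.append(x)

print(sorted(set(samples)))  # the chain visits both data modes
```

Because each transition is a small local move, the conditional the denoiser has to model is far simpler than the full two-mode data distribution, which is the learning advantage the abstract points to.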
Smart Decisions by Small Adjustments: Iterating Denoising Autoencoders
, 2014
Abstract
An iterative neural architecture based on repeated application of the Denoising Autoencoder is introduced. The architecture is placed in the family of other approaches involving networks of simple units and iteration at the exploitation stage. It is shown that repeatedly feeding a pattern to a Denoising Autoencoder often yields nontrivial, sensible improvements to the pattern. This claim is supported by a classification experiment, in which the data transformed by our architecture is shown to be more linearly separable than the original samples.
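The iteration idea can be sketched with a toy reconstruction function in place of a trained DAE (the prototype pattern and the averaging rule below are illustrative assumptions): repeated application contracts a noisy input toward a clean fixed point.

```python
import numpy as np

# Hypothetical reconstruction function standing in for a trained DAE:
# averages the input with its nearest binary prototype, so each
# application halves the distance to that prototype.
def reconstruct(x):
    return 0.5 * (x + np.sign(x))

pattern = np.array([1.0, 1.0, -1.0, 1.0, -1.0])      # clean pattern
noisy = pattern + np.array([0.4, -0.3, 0.2, -0.5, 0.3])

x = noisy
for _ in range(20):          # iterate the autoencoder at exploitation time
    x = reconstruct(x)

print(np.allclose(x, pattern, atol=1e-4))  # True: iteration recovers the pattern
```

Each pass makes only a small adjustment, but the composition of many passes converges to the attractor, which is the "smart decisions by small adjustments" behavior the title refers to.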
Predicting Images using Convolutional Networks: Visual Scene Understanding with Pixel Maps
, 2015
The Potential Energy of an Autoencoder
Abstract
Autoencoders are popular feature learning models that are conceptually simple, easy to train and allow for efficient inference and training. Recent work has shown how certain autoencoders can be associated with an energy landscape, akin to negative log-probability in a probabilistic model, which measures how well the autoencoder can represent regions in the input space. The energy landscape has commonly been inferred heuristically, by using a training criterion that relates the autoencoder to a probabilistic model such as a Restricted Boltzmann Machine (RBM). In this paper we show how most common autoencoders are naturally associated with an energy function, independent of the training procedure, and that the energy landscape can be inferred analytically by integrating the reconstruction function of the autoencoder. For autoencoders with sigmoid hidden units, the energy function is identical to the free energy of an RBM, which helps shed light on the relationship between these two types of model. We also show that the autoencoder energy function allows us to explain common regularization procedures, such as contractive training, from the perspective of dynamical systems. As a practical application of the energy function, a generative classifier based on class-specific autoencoders is presented. Index Terms—Autoencoders, representation learning, unsupervised learning, generative classification
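The key relation in this abstract, that the energy follows from integrating the reconstruction function, can be checked numerically for a tied-weight autoencoder with sigmoid hidden units and linear output (weights below are random placeholders). Integrating the vector field `r(x) - x` gives an energy whose softplus terms match the RBM free-energy form, and whose negative gradient recovers `r(x) - x`:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
d, h = 4, 3
W = rng.normal(size=(h, d))   # tied encoder/decoder weights
b = rng.normal(size=h)        # hidden biases
c = rng.normal(size=d)        # output biases

def reconstruct(x):           # autoencoder with sigmoid hidden units
    return W.T @ sigmoid(W @ x + b) + c

def energy(x):                # obtained by integrating r(x) - x
    # softplus terms mirror the free energy of an RBM
    return -(np.sum(np.logaddexp(0.0, W @ x + b)) + c @ x - 0.5 * x @ x)

# Verify: -grad(energy)(x) == r(x) - x, via central differences.
x = rng.normal(size=d)
eps = 1e-5
grad = np.array([
    (energy(x + eps * np.eye(d)[i]) - energy(x - eps * np.eye(d)[i])) / (2 * eps)
    for i in range(d)
])
print(np.allclose(-grad, reconstruct(x) - x, atol=1e-6))  # True
```

Viewing `r(x) - x` as a force field descending this energy is also what connects the abstract's dynamical-systems reading of contractive regularization.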