Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations
In ICML ’09, 2009
Restricted Boltzmann machines for collaborative filtering
In Machine Learning, Proceedings of the Twenty-fourth International Conference (ICML 2007). ACM, 2007
Cited by 220 (12 self)
Most of the existing approaches to collaborative filtering cannot handle very large data sets. In this paper we show how a class of two-layer undirected graphical models, called Restricted Boltzmann Machines (RBM’s), can be used to model tabular data, such as user’s ratings of movies. We present efficient learning and inference procedures for this class of models and demonstrate that RBM’s can be successfully applied to the Netflix data set, containing over 100 million user/movie ratings. We also show that RBM’s slightly outperform carefully-tuned SVD models. When the predictions of multiple RBM models and multiple SVD models are linearly combined, we achieve an error rate that is well over 6% better than the score of Netflix’s own system.
Learning Deep Architectures for AI
Cited by 183 (30 self)
Theoretical results suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g. in vision, language, and other AI-level tasks), one may need deep architectures. Deep architectures are composed of multiple levels of nonlinear operations, such as in neural nets with many hidden layers or in complicated propositional formulae reusing many sub-formulae. Searching the parameter space of deep architectures is a difficult task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success, beating the state-of-the-art in certain areas. This paper discusses the motivations and principles regarding learning algorithms for deep architectures, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
Rectified Linear Units Improve Restricted Boltzmann Machines
by Vinod Nair
Cited by 154 (8 self)
Restricted Boltzmann machines were developed using binary stochastic hidden units. These can be generalized by replacing each binary unit by an infinite number of copies that all have the same weights but have progressively more negative biases. The learning and inference rules for these “Stepped Sigmoid Units” are unchanged. They can be approximated efficiently by noisy, rectified linear units. Compared with binary units, these units learn features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset. Unlike binary units, rectified linear units preserve information about relative intensities as information travels through multiple layers of feature detectors.
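The approximation described in this abstract can be checked numerically. The sketch below is illustrative only: the bias offsets of -0.5, -1.5, -2.5, ... and the sampling rule max(0, x + N(0, sigmoid(x))) are the commonly cited form of this construction and are assumptions here, since the abstract does not state them. The function names are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    # Logistic function, written to avoid overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def stepped_sigmoid_sum(x: float, n_copies: int = 1000) -> float:
    # Expected total activity of n_copies binary units that share the
    # input x but have biases shifted by -0.5, -1.5, -2.5, ... (assumed offsets).
    return sum(sigmoid(x - i + 0.5) for i in range(1, n_copies + 1))


def softplus(x: float) -> float:
    # log(1 + exp(x)), computed stably; the smooth curve the sum approaches.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def noisy_relu(x: float, rng: random.Random) -> float:
    # Cheap stochastic stand-in: rectify the input plus Gaussian noise
    # whose variance is sigmoid(x) (assumed form, not from the abstract).
    return max(0.0, x + rng.gauss(0.0, math.sqrt(sigmoid(x))))


if __name__ == "__main__":
    rng = random.Random(0)
    for x in (-2.0, 0.0, 2.0, 5.0):
        print(x, stepped_sigmoid_sum(x), softplus(x), noisy_relu(x, rng))
```

The sum of shifted sigmoids stays close to the softplus curve, which is in turn close to a rectified linear function for inputs away from zero; this is why a single rectified unit can stand in for the whole set of copies.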
Classification using discriminative restricted Boltzmann machines
In ICML ’08: Proceedings of the 25th international conference on Machine learning. ACM, 2008
Cited by 99 (13 self)
Recently, many applications for Restricted Boltzmann Machines (RBMs) have been developed for a large variety of learning problems. However, RBMs are usually used as feature extractors for another learning algorithm or to provide a good initialization for deep feed-forward neural network classifiers, and are not considered as a standalone solution to classification problems. In this paper, we argue that RBMs provide a self-contained framework for deriving competitive nonlinear classifiers. We present an evaluation of different learning algorithms for RBMs which aim at introducing a discriminative component to RBM training and improve their performance as classifiers. This approach is simple in that RBMs are used directly to build a classifier, rather than as a stepping stone. Finally, we demonstrate how discriminative RBMs can also be successfully employed in a semi-supervised setting.
Deep Belief Networks for phone recognition
Cited by 89 (12 self)
Hidden Markov Models (HMMs) have been the state-of-the-art techniques for acoustic modeling despite their unrealistic independence assumptions and the very limited representational capacity of their hidden states. There are many proposals in the research community for deeper models that are capable of modeling the many types of variability present in the speech generation process. Deep Belief Networks (DBNs) have recently proved to be very effective for a variety of machine learning problems and this paper applies DBNs to acoustic modeling. On the standard TIMIT corpus, DBNs consistently outperform other techniques and the best DBN achieves a phone error rate (PER) of 23.0% on the TIMIT core test set.
On the Quantitative Analysis of Deep Belief Networks
Cited by 84 (17 self)
Deep Belief Networks (DBN’s) are generative models that contain many layers of hidden variables. Efficient greedy algorithms for learning and approximate inference have allowed these models to be applied successfully in many application domains. The main building block of a DBN is a bipartite undirected graphical model called a restricted Boltzmann machine (RBM). Due to the presence of the partition function, model selection, complexity control, and exact maximum likelihood learning in RBM’s are intractable. We show that Annealed Importance Sampling (AIS) can be used to efficiently estimate the partition function of an RBM, and we present a novel AIS scheme for comparing RBM’s with different architectures. We further show how an AIS estimator, along with approximate inference, can be used to estimate a lower bound on the log-probability that a DBN model with multiple hidden layers assigns to the test data. This is, to our knowledge, the first step towards obtaining quantitative results that would allow us to directly assess the performance of Deep Belief Networks as generative models of data.
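To make the intractability claim concrete, the sketch below computes the log partition function of a tiny binary RBM exactly, by enumeration. It is not AIS and is not taken from the paper: it assumes the standard binary RBM energy E(v, h) = -b·v - c·h - v^T W h, and the function names are hypothetical. The point is that the loop over visible configurations grows as 2^n, so exact evaluation is only feasible for toy models like this one.

```python
import itertools
import math
import random


def log_z_enumerate(W, b, c):
    # Sum exp(-E(v, h)) over every joint binary configuration.
    # Cost: 2 ** (n_visible + n_hidden) terms.
    n_v, n_h = len(b), len(c)
    total = 0.0
    for v in itertools.product((0, 1), repeat=n_v):
        for h in itertools.product((0, 1), repeat=n_h):
            neg_energy = (
                sum(b[i] * v[i] for i in range(n_v))
                + sum(c[j] * h[j] for j in range(n_h))
                + sum(v[i] * W[i][j] * h[j]
                      for i in range(n_v) for j in range(n_h))
            )
            total += math.exp(neg_energy)
    return math.log(total)


def log_z_sum_out_hidden(W, b, c):
    # The hidden units are conditionally independent given v, so they can
    # be summed out analytically; 2 ** n_visible terms still remain.
    n_v, n_h = len(b), len(c)
    total = 0.0
    for v in itertools.product((0, 1), repeat=n_v):
        log_term = sum(b[i] * v[i] for i in range(n_v))
        for j in range(n_h):
            activation = c[j] + sum(v[i] * W[i][j] for i in range(n_v))
            log_term += math.log1p(math.exp(activation))
        total += math.exp(log_term)
    return math.log(total)


if __name__ == "__main__":
    rng = random.Random(0)
    n_v, n_h = 6, 4  # toy sizes chosen for illustration
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_h)] for _ in range(n_v)]
    b = [rng.gauss(0.0, 0.1) for _ in range(n_v)]
    c = [rng.gauss(0.0, 0.1) for _ in range(n_h)]
    print(log_z_enumerate(W, b, c), log_z_sum_out_hidden(W, b, c))
```

Both functions return the same value; the second halves the exponent but is still exponential in the number of visible units, which is the gap a sampling-based estimator is meant to fill.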
Learning to Represent Spatial Transformations with Factored Higher-Order Boltzmann Machines
2010
Cited by 75 (18 self)
To allow the hidden units of a restricted Boltzmann machine to model the transformation between two successive images, Memisevic and Hinton (2007) introduced three-way multiplicative interactions that use the intensity of a pixel in the first image as a multiplicative gain on a learned, symmetric weight between a pixel in the second image and a hidden unit. This creates cubically many parameters, which form a three-dimensional interaction tensor. We describe a low-rank approximation to this interaction tensor that uses a sum of factors, each of which is a three-way outer product. This approximation allows efficient learning of transformations between larger image patches. Since each factor can be viewed as an image filter, the model as a whole learns optimal filter pairs for efficiently representing transformations. We demonstrate the learning of optimal filter pairs from various synthetic and real image sequences. We also show how learning about image transformations allows the model to perform a simple visual analogy task, and we show how a completely unsupervised network trained on transformations perceives multiple motions of transparent dot patterns in the same way as humans.
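The low-rank idea in this abstract can be illustrated with a small sketch. It assumes the interaction tensor is written as W[i][j][k] = sum over f of A[i][f] * B[j][f] * C[k][f]; the matrix names A, B, C and the helper functions are hypothetical, and the sizes are arbitrary. The sketch checks that the three-way interaction term can be evaluated from three filter responses per factor without ever building the full tensor.

```python
import random


def full_tensor(A, B, C):
    # Materialize W[i][j][k] as a sum of three-way outer products.
    n_f = len(A[0])
    return [[[sum(A[i][f] * B[j][f] * C[k][f] for f in range(n_f))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]


def interaction_full(W, x, y, h):
    # Direct evaluation: sum_ijk W_ijk * x_i * y_j * h_k (cubic cost).
    return sum(W[i][j][k] * x[i] * y[j] * h[k]
               for i in range(len(x))
               for j in range(len(y))
               for k in range(len(h)))


def interaction_factored(A, B, C, x, y, h):
    # Factored evaluation: each factor applies one filter to each of the
    # three inputs and multiplies the three scalar responses.
    total = 0.0
    for f in range(len(A[0])):
        fx = sum(A[i][f] * x[i] for i in range(len(x)))
        fy = sum(B[j][f] * y[j] for j in range(len(y)))
        fh = sum(C[k][f] * h[k] for k in range(len(h)))
        total += fx * fy * fh
    return total


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    rng = random.Random(0)
    n_x, n_y, n_h, n_f = 6, 6, 4, 3  # toy sizes for illustration
    A = random_matrix(n_x, n_f, rng)
    B = random_matrix(n_y, n_f, rng)
    C = random_matrix(n_h, n_f, rng)
    x = [rng.random() for _ in range(n_x)]
    y = [rng.random() for _ in range(n_y)]
    h = [rng.randint(0, 1) for _ in range(n_h)]
    W = full_tensor(A, B, C)
    print(interaction_full(W, x, y, h), interaction_factored(A, B, C, x, y, h))
```

The columns of A and B play the role of the paired image filters mentioned in the abstract: each factor responds to one filter applied to the first image and one applied to the second.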
Factored Conditional Restricted Boltzmann Machines for Modeling Motion Style
Cited by 59 (10 self)
The Conditional Restricted Boltzmann Machine (CRBM) is a recently proposed model for time series that has a rich, distributed hidden state and permits simple, exact inference. We present a new model, based on the CRBM that preserves its most important computational properties and includes multiplicative three-way interactions that allow the effective interaction weight between two units to be modulated by the dynamic state of a third unit. We factorize the three-way weight tensor implied by the multiplicative model, reducing the number of parameters from O(N^3) to O(N^2). The result is an efficient, compact model whose effectiveness we demonstrate by modeling human motion. Like the CRBM, our model can capture diverse styles of motion with a single set of parameters, and the three-way interactions greatly improve the model’s ability to blend motion styles or to transition smoothly between them.
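The parameter reduction quoted in this abstract can be made concrete with a quick count. This is a rough sketch under two assumptions that the abstract does not spell out: all three groups of units have the same size N, and the number of factors grows in proportion to N (here it is set equal to N). The function names are hypothetical.

```python
def full_three_way_params(n: int) -> int:
    # One weight per (unit, unit, unit) triple: an n x n x n tensor.
    return n ** 3


def factored_three_way_params(n: int, n_factors: int) -> int:
    # Three n x n_factors matrices, one per group of units.
    return 3 * n * n_factors


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Assumption: number of factors equals n.
        print(n, full_three_way_params(n), factored_three_way_params(n, n))
```

With these assumptions the full tensor grows cubically while the factored form grows quadratically, so the gap widens by a factor proportional to N as the model gets larger.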