A fast learning algorithm for deep belief nets (2006)

by Geoffrey E. Hinton, Simon Osindero, Yee-Whye Teh
Venue: Neural Computation
Citations: 241 (40 self)

BibTeX

@ARTICLE{Hinton06afast,
    author = {Geoffrey E. Hinton and Simon Osindero and Yee-Whye Teh},
    title = {A fast learning algorithm for deep belief nets},
    journal = {Neural Computation},
    year = {2006},
    volume = {18},
    pages = {1527--1554}
}

Abstract

We show how to use “complementary priors” to eliminate the explaining away effects that make inference difficult in densely-connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modelled by long ravines in the free-energy landscape of the top-level associative memory and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
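
The greedy, one-layer-at-a-time idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the per-layer learner, so this sketch assumes each layer is a binary restricted Boltzmann machine trained with one step of contrastive divergence (CD-1), which is a common way to realize this kind of stacking. The names `RBM`, `cd1_update` and `train_greedy` are hypothetical, the code uses plain Python lists for self-containment rather than speed, and it omits both the label units and the contrastive wake-sleep fine-tuning stage.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample(probs, rng):
    """Draw binary states from independent Bernoulli probabilities."""
    return [1.0 if rng.random() < p else 0.0 for p in probs]


class RBM:
    """Binary restricted Boltzmann machine (hypothetical helper, not the authors' code)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # Small random weights, zero biases.
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_update(self, v0, lr=0.1):
        """One CD-1 step: data statistics minus one-step reconstruction statistics."""
        h0 = self.hidden_probs(v0)
        v1 = self.visible_probs(sample(h0, self.rng))
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def train_greedy(data, layer_sizes, epochs=50, seed=0):
    """Train a stack of RBMs bottom-up, one layer at a time."""
    rng = random.Random(seed)
    rbms = []
    layer_input = data
    n_in = len(data[0])
    for n_hidden in layer_sizes:
        rbm = RBM(n_in, n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v)
        rbms.append(rbm)
        # Freeze this layer; its hidden activations become the "data" for the
        # next layer (using probabilities here is a simplifying assumption).
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
        n_in = n_hidden
    return rbms


# Toy usage: two 4-pixel binary patterns, two stacked hidden layers.
toy_data = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
stack = train_greedy(toy_data, layer_sizes=[3, 2], epochs=20, seed=0)
top_features = stack[1].hidden_probs(stack[0].hidden_probs(toy_data[0]))
```

In this sketch the last RBM in the stack plays the role of the undirected top-level pair of layers, while the lower RBMs would be reinterpreted as directed layers after training. That reinterpretation, like the fine-tuning step, is not implemented here.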
