Greedy layer-wise training of deep networks (2006)

by Yoshua Bengio, Pascal Lamblin, Dan Popovici, Hugo Larochelle
Citations: 394 (48 self)

BibTeX

@MISC{Bengio06greedylayer-wise,
    author = {Yoshua Bengio and Pascal Lamblin and Dan Popovici and Hugo Larochelle},
    title = {Greedy layer-wise training of deep networks},
    year = {2006}
}


Abstract

Complexity theory of circuits strongly suggests that deep architectures can be much more efficient (sometimes exponentially) than shallow architectures, in terms of computational elements required to represent some functions. Deep multi-layer neural networks have many levels of non-linearities allowing them to compactly represent highly non-linear and highly-varying functions. However, until recently it was not clear how to train such deep networks, since gradient-based optimization starting from random initialization appears to often get stuck in poor solutions. Hinton et al. recently introduced a greedy layer-wise unsupervised learning algorithm for Deep Belief Networks (DBN), a generative model with many layers of hidden causal variables. In the context of the above optimization problem, we study this algorithm empirically and explore variants to better understand its success and extend it to cases where the inputs are continuous or where the structure of the input distribution is not revealing enough about the variable to be predicted in a supervised task. Our experiments also confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.
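
The abstract describes the greedy layer-wise strategy: each layer is trained with an unsupervised criterion on the representation produced by the layers below it, and the resulting weights are used to initialize a deep network for supervised fine-tuning. Below is a minimal sketch of that idea using single-layer tied-weight autoencoders (one of the variants the paper explores alongside RBM-based DBNs); the NumPy implementation, hyperparameters, and toy data are illustrative assumptions, not the authors' code or experimental settings.

    # Greedy layer-wise unsupervised pretraining, sketched with tied-weight
    # autoencoders in NumPy. Hyperparameters and data are assumptions made
    # for illustration, not the paper's settings.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def pretrain_autoencoder(X, n_hidden, lr=0.5, epochs=100, seed=0):
        """Train one autoencoder layer on X; return its encoder parameters."""
        rng = np.random.default_rng(seed)
        n_in = X.shape[1]
        W = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
        b = np.zeros(n_hidden)          # encoder bias
        c = np.zeros(n_in)              # decoder bias (decoder weights tied to W.T)
        for _ in range(epochs):
            H = sigmoid(X @ W + b)      # hidden code
            R = sigmoid(H @ W.T + c)    # reconstruction of the layer's input
            E = (R - X) / len(X)        # output delta (sigmoid + cross-entropy loss)
            G = (E @ W) * H * (1.0 - H) # delta backpropagated to the hidden units
            W -= lr * (X.T @ G + E.T @ H)   # tied weights: encoder + decoder gradients
            b -= lr * G.sum(axis=0)
            c -= lr * E.sum(axis=0)
        return W, b

    def greedy_pretrain(X, layer_sizes):
        """Pretrain a stack greedily: each new layer is fit on the codes
        produced by the (frozen) layers below it."""
        params, H = [], X
        for n_hidden in layer_sizes:
            W, b = pretrain_autoencoder(H, n_hidden)
            params.append((W, b))
            H = sigmoid(H @ W + b)      # codes become the next layer's training input
        return params                   # initialization for supervised fine-tuning

    # Toy usage on random binary data: pretrain a 32-16-8 stack.
    X = (np.random.default_rng(1).random((200, 32)) > 0.5).astype(float)
    stack = greedy_pretrain(X, layer_sizes=[16, 8])
    print([W.shape for W, _ in stack])  # [(32, 16), (16, 8)]

The sketch covers only the unsupervised initialization phase; in the paper the pretrained stack is subsequently fine-tuned with supervised gradient descent, which is where the "initializing weights in a region near a good local minimum" effect described above is measured.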

Keyphrases

montréal, generative model, input distribution, hidden causal variable, greedy layer-wise unsupervised learning algorithm, complexity theory, high-level abstraction, supervised task, shallow architecture, gradient-based optimization, deep network, deep multi-layer neural network, optimization problem, internal distributed representation, deep architecture, computational element, good local minimum, poor solution, highly-varying function, greedy layer-wise unsupervised training strategy, many layer, hinton et al, random initialization, deep belief network, many level
