Large-scale deep unsupervised learning using graphics processors (2009)

by Rajat Raina, Anand Madhavan, Andrew Y. Ng
Venue: International Conference on Machine Learning
Citations: 51 (8 self)

BibTeX

@INPROCEEDINGS{Raina09largescaledeep,
    author = {Rajat Raina and Anand Madhavan and Andrew Y. Ng},
    title = {Large-scale deep unsupervised learning using graphics processors},
    booktitle = {International Conf. on Machine Learning},
    year = {2009}
}

Abstract

The promise of unsupervised learning methods lies in their potential to use vast amounts of unlabeled data to learn complex, highly nonlinear models with millions of free parameters. We consider two well-known unsupervised learning models, deep belief networks (DBNs) and sparse coding, that have recently been applied to a flurry of machine learning applications (Hinton & Salakhutdinov, 2006; Raina et al., 2007). Unfortunately, current learning algorithms for both models are too slow for large-scale applications, forcing researchers to focus on smaller-scale models or to use fewer training examples. In this paper, we suggest massively parallel methods to help resolve these problems. We argue that modern graphics processors far surpass the computational capabilities of multicore CPUs and have the potential to revolutionize the applicability of deep unsupervised learning methods. We develop general principles for massively parallelizing unsupervised learning tasks using graphics processors. We show that these principles can be applied to successfully scale up learning algorithms for both DBNs and sparse coding. Our implementation of DBN learning is up to 70 times faster than a dual-core CPU implementation for large models. For example, we are able to reduce the time required to learn a four-layer DBN with 100 million free parameters from several weeks to around a single day. For sparse coding, we develop a simple, inherently parallel algorithm that leads to a 5- to 15-fold speedup over previous methods.
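
The abstract only summarizes the approach, but the reason GPUs help is that DBN pre-training with contrastive divergence reduces almost entirely to large dense matrix products and elementwise nonlinearities over a mini-batch, exactly the primitives that graphics processors execute in a massively parallel fashion. The NumPy-style sketch below of one CD-1 update for a binary restricted Boltzmann machine is illustrative only, not the authors' implementation; all function names, shapes, and learning-rate values are assumptions, and swapping the array library for a GPU-backed one (for example CuPy) is what would move the same computation onto the GPU.

    import numpy as np  # assumption: replace with a GPU array library (e.g. cupy) to run on the GPU

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_update(W, b_vis, b_hid, batch, lr=0.01, rng=np.random.default_rng(0)):
        """One contrastive-divergence (CD-1) step for a binary RBM (illustrative sketch).

        W      : (n_visible, n_hidden) weight matrix
        b_vis  : (n_visible,) visible biases
        b_hid  : (n_hidden,)  hidden biases
        batch  : (n_examples, n_visible) mini-batch of training vectors
        """
        # Positive phase: hidden probabilities given the data (one large dense matmul).
        h_prob = sigmoid(batch @ W + b_hid)
        h_sample = (rng.random(h_prob.shape) < h_prob) * 1.0

        # Negative phase: one Gibbs step to reconstruct the visibles, then re-infer hiddens.
        v_prob = sigmoid(h_sample @ W.T + b_vis)
        h_prob_neg = sigmoid(v_prob @ W + b_hid)

        # Gradient estimates: difference of data and reconstruction statistics, averaged over the batch.
        n = batch.shape[0]
        dW = (batch.T @ h_prob - v_prob.T @ h_prob_neg) / n
        db_vis = (batch - v_prob).mean(axis=0)
        db_hid = (h_prob - h_prob_neg).mean(axis=0)

        W += lr * dW
        b_vis += lr * db_vis
        b_hid += lr * db_hid
        return W, b_vis, b_hid

Because every phase is a single matrix multiplication over the whole mini-batch, larger batches and wider layers keep the hardware saturated, which is consistent with the large-model speedups the abstract reports over a dual-core CPU implementation.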

Keyphrases

graphics processor, large-scale deep learning, sparse coding, free parameter, DBN learning, four-layer DBN, parallel method, dual-core CPU implementation, unlabeled data, vast amount, general principle, nonlinear model, unsupervised learning task, current learning algorithm, machine learning application, 15-fold speedup, large model, deep belief network, large-scale application, well-known unsupervised learning model, modern graphics processor, Hinton & Salakhutdinov, training example, previous method, single day, computational capability, parallel algorithm, deep unsupervised learning method, smaller-scale model, multicore CPU, several weeks
