Results 1 - 4 of 4

Deep learning via Hessian-free optimization

by James Martens
"... We develop a 2 nd-order optimization method based on the “Hessian-free ” approach, and apply it to training deep auto-encoders. Without using pre-training, we obtain results superior to those reported by Hinton & Salakhutdinov (2006) on the same tasks they considered. Our method is practical, ea ..."
Abstract - Cited by 76 (5 self)
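
The “Hessian-free” approach named above replaces explicit Newton steps with conjugate-gradient solves that need only Hessian-vector products, never the Hessian itself. The following is a minimal sketch of that idea, assuming a toy regularized logistic-regression objective in place of the paper's deep auto-encoders; the data, sizes, and hyperparameters are illustrative, and the damping, preconditioning, and mini-batching of Martens' full method are omitted.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))                        # toy design matrix (assumption)
    w_true = rng.normal(size=10)
    y = np.where(X @ w_true + 0.1 * rng.normal(size=200) > 0, 1.0, -1.0)  # labels in {-1, +1}
    lam = 1e-1                                            # ridge term keeps the Hessian positive definite

    def grad(w):
        # Gradient of the regularized logistic loss, written with logaddexp for stability.
        z = y * (X @ w)
        return -(X.T @ (y * np.exp(-np.logaddexp(0.0, z)))) / len(y) + lam * w

    def hvp(w, v, eps=1e-4):
        # The "Hessian-free" trick: a Hessian-vector product from two gradient calls,
        # without ever forming the Hessian.
        return (grad(w + eps * v) - grad(w - eps * v)) / (2 * eps)

    def cg(w, b, iters=50, tol=1e-10):
        # Approximately solve H p = b by conjugate gradient, using only hvp().
        p = np.zeros_like(b)
        r, d = b.copy(), b.copy()
        rs = r @ r
        for _ in range(iters):
            Hd = hvp(w, d)
            alpha = rs / (d @ Hd)
            p += alpha * d
            r -= alpha * Hd
            rs_new = r @ r
            if rs_new < tol:
                break
            d = r + (rs_new / rs) * d
            rs = rs_new
        return p

    w = np.zeros(10)
    for _ in range(10):
        w += cg(w, -grad(w))            # Newton-style step from the CG solve
    print(np.linalg.norm(grad(w)))      # gradient norm should be close to zero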

Large-scale deep unsupervised learning using graphics processors

by Rajat Raina, Anand Madhavan, Andrew Y. Ng - International Conf. on Machine Learning, 2009
"... The promise of unsupervised learning methods lies in their potential to use vast amounts of unlabeled data to learn complex, highly nonlinear models with millions of free parameters. We consider two well-known unsupervised learning models, deep belief networks (DBNs) and sparse coding, that have rec ..."
Abstract - Cited by 51 (8 self)
recently been applied to a flurry of machine learning applications (Hinton & Salakhutdinov, 2006; Raina et al., 2007). Unfortunately, current learning algorithms for both models are too slow for large-scale applications, forcing researchers to focus on smaller-scale models, or to use fewer training ...

Experiments with Stochastic Gradient Descent: Condensations of the Real line

by Gustavo Lacerda
"... It is well-known that training Restricted Boltzmann Machines (RBMs) can be difficult in practice. In the realm of stochastic gradient methods, several tricks have been used to obtain faster convergence. These include gradient averaging (known as momentum), averaging the parameters w t, and different ..."
Abstract
report on experiments applying condensations to Hinton & Salakhutdinov’s (2006) Contrastive Divergence procedure on the MNIST dataset, and show a statistically-significant improvement relative to constant and inverse-log schedules of the learning rate.
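
The excerpt above names two of the moving parts: Contrastive Divergence updates and the learning-rate schedule applied to them. Below is a rough sketch, assuming a tiny Bernoulli RBM trained with CD-1 on synthetic binary data, comparing a constant schedule against one common reading of an “inverse-log” schedule (eta_0 / log(t + 2)); the data, sizes, and that reading are assumptions, and the paper's own “condensation” schedules are not reproduced here.

    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    # Toy binary data: noisy copies of two prototype patterns (a stand-in for MNIST).
    prototypes = rng.random((2, 16)) < 0.5
    data = prototypes[rng.integers(0, 2, size=500)].astype(float)
    data = np.abs(data - (rng.random(data.shape) < 0.05))   # flip 5% of the bits

    def train_rbm(data, n_hidden, schedule, epochs=30, batch=50):
        n_visible = data.shape[1]
        W = 0.01 * rng.normal(size=(n_visible, n_hidden))
        b = np.zeros(n_visible)          # visible biases
        c = np.zeros(n_hidden)           # hidden biases
        t = 0
        for _ in range(epochs):
            for i in range(0, len(data), batch):
                v0 = data[i:i + batch]
                # Positive phase, one Gibbs step, negative phase (CD-1).
                h0p = sigmoid(v0 @ W + c)
                h0 = (rng.random(h0p.shape) < h0p).astype(float)
                v1p = sigmoid(h0 @ W.T + b)
                v1 = (rng.random(v1p.shape) < v1p).astype(float)
                h1p = sigmoid(v1 @ W + c)
                lr = schedule(t); t += 1
                W += lr * (v0.T @ h0p - v1.T @ h1p) / len(v0)
                b += lr * (v0 - v1).mean(axis=0)
                c += lr * (h0p - h1p).mean(axis=0)
        recon = sigmoid(sigmoid(data @ W + c) @ W.T + b)
        return np.mean((data - recon) ** 2)   # reconstruction error as a rough progress check

    constant = lambda t: 0.1
    inverse_log = lambda t: 0.1 / np.log(t + 2)
    print("constant:   ", train_rbm(data, 8, constant))
    print("inverse-log:", train_rbm(data, 8, inverse_log))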

Large Image Databases and Small Codes for Object Recognition

by Rob Fergus, Antonio Torralba, Yair Weiss, William T. Freeman, 2008
"... With the advent of the Internet, billions of images are now freely available online and constitute a dense sampling of the visual world. Using a variety of non‐parametric methods, we explore this world with the aid of a large dataset of 79,302,017 images collected from the Web. Motivated by psychoph ..."
Abstract
little memory, enabling their use on standard hardware or even on handheld devices. Our approach uses the Semantic Hashing idea of Salakhutdinov and Hinton [1], based on Restricted Boltzmann Machines [2] to convert the Gist descriptor (a real-valued vector that describes orientation energies at different ...
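
The excerpt describes the retrieval pattern behind the paper: compress each real-valued Gist descriptor into a short binary code so that search reduces to Hamming-distance comparisons and the whole collection fits in little memory. The sketch below illustrates only that pattern, with codes produced by a random projection (an LSH-style stand-in) rather than the trained Restricted Boltzmann Machines of Semantic Hashing that the paper actually uses; the dimensions and code length are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_encoder(dim, n_bits):
        # Random hyperplanes; the sign of each projection gives one bit of the code.
        R = rng.normal(size=(dim, n_bits))
        return lambda X: (X @ R > 0).astype(np.uint8)

    def hamming_search(codes, query_code, k=5):
        # Hamming distance = number of differing bits; return indices of the k closest codes.
        dists = np.count_nonzero(codes != query_code, axis=1)
        return np.argsort(dists)[:k]

    # Toy "descriptors" standing in for 512-dimensional Gist vectors.
    descriptors = rng.normal(size=(10_000, 512))
    encode = make_encoder(512, 30)        # 30-bit codes (length chosen arbitrarily for the sketch)
    codes = encode(descriptors)

    query = descriptors[42] + 0.1 * rng.normal(size=512)   # a slightly perturbed copy of item 42
    neighbors = hamming_search(codes, encode(query[None, :])[0])
    print(neighbors)   # index 42 should rank among the closest matches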