Results 1 - 10 of 4,602

Why does unsupervised pre-training help deep learning?

by Dumitru Erhan, Aaron Courville, Yoshua Bengio, Pascal Vincent, 2010
"... Much recent research has been devoted to learning algorithms for deep architectures such as Deep Belief Networks and stacks of autoencoder variants with impressive results being obtained in several areas, mostly on vision and language datasets. The best results obtained on supervised learning tasks ..."
Abstract - Cited by 155 (20 self)
results in an online setting, with a virtually unlimited data stream, point to a somewhat more nuanced interpretation of the roles of optimization and regularization in the unsupervised pre-training effect.
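
For readers new to the topic, the procedure under study here is greedy layer-wise pre-training: each layer is first fit as an unsupervised model of the layer below, and the learned weights initialize a supervised network that is then fine-tuned on labels. Below is a minimal numpy sketch using denoising autoencoders with tied weights; the corruption rate, layer sizes, and learning rate are illustrative, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_layer(X, n_hidden, lr=0.1, epochs=50):
    """One denoising autoencoder with tied weights; returns encoder params."""
    n_in = X.shape[1]
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b, c = np.zeros(n_hidden), np.zeros(n_in)
    for _ in range(epochs):
        Xn = X * (rng.random(X.shape) > 0.3)   # masking corruption
        H = sigmoid(Xn @ W + b)                # encode
        R = sigmoid(H @ W.T + c)               # decode (tied weights)
        d_out = (R - X) * R * (1 - R)          # squared-error backprop
        d_hid = (d_out @ W) * H * (1 - H)
        W -= lr * (Xn.T @ d_hid + d_out.T @ H) / len(X)
        b -= lr * d_hid.mean(axis=0)
        c -= lr * d_out.mean(axis=0)
    return W, b

def greedy_pretrain(X, layer_sizes):
    """Train each layer on the codes of the layer below; the resulting
    weights initialize a supervised net, which is then fine-tuned."""
    params, H = [], X
    for n_hidden in layer_sizes:
        W, b = pretrain_layer(H, n_hidden)
        params.append((W, b))
        H = sigmoid(H @ W + b)
    return params

X_unlabeled = rng.random((256, 64))            # stand-in for real data
stack = greedy_pretrain(X_unlabeled, [32, 16])
```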

A Maximum-Entropy-Inspired Parser

by Eugene Charniak, 1999
"... We present a new parser for parsing down to Penn tree-bank style parse trees that achieves 90.1% average precision/recall for sentences of length 40 and less, and 89.5% for sentences of length 100 and less when trained and tested on the previously established [5,9,10,15,17] "standard" se ..."
Abstract - Cited by 971 (19 self)
and combine many different conditioning events. We also present some partial results showing the effects of different conditioning information, including a surprising 2% improvement due to guessing the lexical head's pre-terminal before guessing the lexical head.
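
The surprising 2% comes from reordering the generative decisions: the parser guesses the head's pre-terminal t (its part-of-speech tag) before the head word h itself, i.e., schematically (our notation, not the paper's):

```latex
P(h, t \mid \text{history}) \;=\; P(t \mid \text{history}) \cdot P(h \mid t, \text{history})
```

Conditioning the word on its own tag sharpens the distribution over h.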

Manifold regularization: A geometric framework for learning from labeled and unlabeled examples

by Mikhail Belkin, Partha Niyogi, Vikas Sindhwani - JOURNAL OF MACHINE LEARNING RESEARCH, 2006
"... We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner. Some transductive graph learning al ..."
Abstract - Cited by 578 (16 self)
graph-based approaches) we obtain a natural out-of-sample extension to novel examples and so are able to handle both transductive and truly semi-supervised settings. We present experimental evidence suggesting that our semi-supervised algorithms are able to use unlabeled data effectively. Finally we
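
A minimal sketch of the idea in its simplest linear form (the paper works in an RKHS; this is the linear special case with invented hyperparameters): a graph Laplacian built over labeled and unlabeled points adds a smoothness penalty along the estimated manifold, and the learned weight vector scores unseen points, which is the out-of-sample extension mentioned above.

```python
import numpy as np

def laplacian(X, k=5):
    """kNN graph Laplacian over labeled + unlabeled points (binary weights)."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros_like(D2)
    for i in range(len(X)):
        nn = np.argsort(D2[i])[1:k + 1]
        W[i, nn] = W[nn, i] = 1.0
    return np.diag(W.sum(1)) - W

def lap_rls(X_lab, y, X_unlab, gamma_a=1e-2, gamma_i=1e-1):
    """Linear Laplacian-regularized least squares: the graph term pulls
    predictions to vary smoothly over labeled AND unlabeled data."""
    X = np.vstack([X_lab, X_unlab])
    L = laplacian(X)
    d = X.shape[1]
    A = X_lab.T @ X_lab + gamma_a * np.eye(d) + gamma_i * X.T @ L @ X
    return np.linalg.solve(A, X_lab.T @ y)   # w @ x scores any new point

rng = np.random.default_rng(1)
X_lab = rng.normal(size=(10, 3)); y = np.sign(X_lab[:, 0])
X_unlab = rng.normal(size=(90, 3))
w = lap_rls(X_lab, y, X_unlab)
```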

The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training

by n.n.
"... Whereas theoretical work suggests that deep architectures might be more efficient at representing highly-varying functions, training deep architectures was unsuccessful until the recent advent of algorithms based on unsupervised pretraining. Even though these new algorithms have enabled training de ..."
Abstract
the advantage of unsupervised pre-training. They demonstrate the robustness of the training procedure with respect to the random initialization, the positive effect of pre-training in terms of optimization and its role as a regularizer. We empirically show the influence of pre-training with respect

Context-Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition

by George E. Dahl, Dong Yu, Li Deng, Alex Acero - IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, 2012
"... We propose a novel context-dependent (CD) model for large vocabulary speech recognition (LVSR) that leverages recent advances in using deep belief networks for phone recognition. We describe a pretrained deep neural network hidden Markov model (DNN-HMM) hybrid architecture that trains the DNN to pr ..."
Abstract - Cited by 254 (50 self)
to produce a distribution over senones (tied triphone states) as its output. The deep belief network pre-training algorithm is a robust and often helpful way to initialize deep neural networks generatively that can aid in optimization and reduce generalization error. We illustrate the key components of our
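
To make the hybrid setup concrete: the network's softmax output estimates senone posteriors p(s|x), and the HMM decoder consumes scaled likelihoods obtained by dividing out the senone priors, p(x|s) ∝ p(s|x)/p(s). A toy numpy sketch, with illustrative layer sizes and uniform priors standing in for priors estimated from forced alignments:

```python
import numpy as np

rng = np.random.default_rng(2)

def dnn_senone_posteriors(x, weights):
    """Toy forward pass: sigmoid hidden layers, softmax over senones."""
    h = x
    for W, b in weights[:-1]:
        h = 1.0 / (1.0 + np.exp(-(h @ W + b)))
    W, b = weights[-1]
    z = h @ W + b
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def scaled_log_likelihoods(x, weights, log_priors):
    """Bayes' rule: log p(x|s) = log p(s|x) - log p(s) + const,
    which is what the HMM decoder consumes in the hybrid approach."""
    return np.log(dnn_senone_posteriors(x, weights) + 1e-10) - log_priors

dims = [40, 64, 64, 500]   # illustrative: 40-d features, 500 senones
weights = [(rng.normal(0, 0.1, (a, b)), np.zeros(b))
           for a, b in zip(dims[:-1], dims[1:])]
log_priors = np.log(np.full(500, 1 / 500))   # placeholder uniform priors
ll = scaled_log_likelihoods(rng.normal(size=(1, 40)), weights, log_priors)
```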

Using the Longitudinal Structure of Earnings to Estimate the Effect of Training Programs

by Orley Ashenfelter, David Card - Review of Economics and Statistics, 1985
"... In this paper we set out some methods that utilize the longitudinal structure of earnings of trainees and a comparison group to estimate the effectiveness of training for the 1976 cohort of CETA trainees. By fitting a components-of-variance model of earnings to the control group, and by posing a sim ..."
Abstract - Cited by 302 (8 self)
simple model of program participation, we are able to predict the entire pre-training and post-training earnings histories of the trainees. The fit of these predictions to the pre-training earnings of the CETA participants provides a test of the model of earnings generation and program participation
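
The components-of-variance idea is concrete enough to sketch: earnings decompose into a permanent person effect, common year effects, and transitory noise, and fitting those pieces on a comparison group yields predicted earnings paths to compare against trainees' observed histories. A toy simulation (all variances invented; the paper's richer error structure, e.g. serial correlation, is not reproduced):

```python
import numpy as np

rng = np.random.default_rng(3)

# y[i, t] = person effect + year effect + transitory noise
n_people, n_years = 500, 8
alpha = rng.normal(0, 2.0, n_people)           # permanent component
tau = rng.normal(0, 1.0, n_years)              # common year effects
eps = rng.normal(0, 1.5, (n_people, n_years))  # transitory component
y = alpha[:, None] + tau[None, :] + eps

# Method-of-moments decomposition fit on the comparison group
year_fx = y.mean(axis=0)                       # estimates tau (+ constant)
resid = y - year_fx                            # person + transitory parts
person_fx = resid.mean(axis=1)
var_trans = (resid - person_fx[:, None]).var()
var_person = person_fx.var()                   # small-sample bias ignored

# Predicted earnings path per person: year effects + person effect,
# the kind of counterfactual compared to pre/post-training earnings.
pred = year_fx[None, :] + person_fx[:, None]
```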

Transfer Learning by Supervised Pre-training for Audio-based Music Classification

by Aäron Van Den Oord, Sander Dieleman, Benjamin Schrauwen
"... Very few large-scale music research datasets are publicly available. There is an increasing need for such datasets, be-cause the shift from physical to digital distribution in the music industry has given the listener access to a large body of music, which needs to be cataloged efficiently and be ea ..."
Abstract
, the Million Song Dataset (MSD), for classification tasks on other datasets, by reusing models trained on the MSD for feature extraction. This transfer learning approach, which we refer to as supervised pre-training, was previously shown to be very effective for computer vision problems. We show
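
A minimal sketch of supervised pre-training as transfer, with random arrays standing in for MSD audio features and tags: train a small network on the large source task, then reuse its hidden layer as a fixed feature extractor for the small target dataset.

```python
import numpy as np

rng = np.random.default_rng(4)

def one_hot(y, k):
    return np.eye(k)[y]

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def train_mlp(X, y, n_hidden, n_classes, lr=0.5, epochs=300):
    """One-hidden-layer net fit on the large source task (e.g. MSD tags)."""
    W1 = rng.normal(0, 0.1, (X.shape[1], n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, n_classes)); b2 = np.zeros(n_classes)
    Y = one_hot(y, n_classes)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        P = softmax(H @ W2 + b2)
        d2 = (P - Y) / len(X)                 # cross-entropy gradient
        d1 = (d2 @ W2.T) * (1 - H ** 2)
        W2 -= lr * H.T @ d2; b2 -= lr * d2.sum(0)
        W1 -= lr * X.T @ d1; b1 -= lr * d1.sum(0)
    return W1, b1

# Supervised pre-training: fit on the source task, then reuse the hidden
# layer as a fixed feature extractor for the target dataset.
X_src = rng.normal(size=(400, 20)); y_src = rng.integers(0, 10, 400)
W1, b1 = train_mlp(X_src, y_src, n_hidden=32, n_classes=10)
X_tgt = rng.normal(size=(50, 20))
features = np.tanh(X_tgt @ W1 + b1)   # train any small classifier on these
```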

Mining anomalies using traffic feature distributions

by Anukool Lakhina, Mark Crovella, Christophe Diot - In ACM SIGCOMM, 2005
"... The increasing practicality of large-scale flow capture makes it possible to conceive of traffic analysis methods that detect and identify a large and diverse set of anomalies. However the challenge of effectively analyzing this massive data source for anomaly diagnosis is as yet unmet. We argue tha ..."
Abstract - Cited by 322 (8 self)

An Analysis of Single-Layer Networks in Unsupervised Feature Learning

by Adam Coates, Honglak Lee, Andrew Y. Ng
"... A great deal of research has focused on algorithms for learning features from unlabeled data. Indeed, much progress has been made on benchmark datasets like NORB and CIFAR by employing increasingly complex unsupervised learning algorithms and deep models. In this paper, however, we show that several ..."
Abstract - Cited by 223 (19 self)
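
One of the simple single-layer learners compared in this line of work is k-means with a soft "triangle" encoding; the sketch below shows that pipeline on random stand-in data (a real run would first extract and whiten image patches), with all sizes illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def kmeans(X, k, iters=20):
    """Plain Lloyd's algorithm; centroids become the learned 'features'."""
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None] - C[None]) ** 2).sum(-1)
        lab = d.argmin(1)
        for j in range(k):
            if (lab == j).any():
                C[j] = X[lab == j].mean(0)
    return C

def triangle_features(X, C):
    """Soft assignment: f_k = max(0, mean_dist - dist_k), so only
    centroids closer than average fire, giving a sparse code."""
    z = np.sqrt(((X[:, None] - C[None]) ** 2).sum(-1))
    return np.maximum(0.0, z.mean(1, keepdims=True) - z)

patches = rng.normal(size=(1000, 36))   # stand-in for whitened 6x6 patches
C = kmeans(patches, k=50)
F = triangle_features(patches, C)       # features for a linear classifier
```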

Unsupervised Pre-training across Image Domains Improves Lung Tissue Classification

by Thomas Schlegl, Joachim Ofner, Georg Langs
"... Abstract. The detection and classification of anomalies relevant for disease diagnosis or treatment monitoring is important during compu-tational medical image analysis. Often, obtaining sufficient annotated training data to represent natural variability well is unfeasible. At the same time, data is ..."
Abstract
spatial appearance patterns and classify lung tissue in high-resolution computed tomography data. We perform domain adaptation via unsupervised pre-training of convolutional neural networks to inject information from sites or image classes for which no annotations are available. Results show that across