Results 1 - 10 of 28
Representation learning: A review and new perspectives.
- IEEE Trans. Pattern Analysis and Machine Intelligence (TPAMI)
, 2013
Cited by 173 (4 self)
The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation, and manifold learning.
Learning Character-level Representations for Part-of-Speech Tagging
Cited by 10 (0 self)
Distributed word representations have recently been proven to be an invaluable resource for NLP. These representations are normally learned using neural networks and capture syntactic and semantic information about words. Information about word morphology and shape is normally ignored when learning word representations. However, for tasks like part-of-speech tagging, intra-word information is extremely useful, especially when dealing with morphologically rich languages. In this paper, we propose a deep neural network that learns character-level representations of words and associates them with usual word representations to perform POS tagging. Using the proposed approach, while avoiding the use of any handcrafted features, we produce state-of-the-art POS taggers for two languages: English, with 97.32% accuracy on the Penn Treebank WSJ corpus; and Portuguese, with 97.47% accuracy on the Mac-Morpho corpus, where the latter represents an error reduction of 12.2% over the previous best known result.
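The idea of pairing a word-level embedding with character-level information can be sketched in a few lines. This is an illustrative simplification, not the paper's architecture (which learns the character-level representation with a convolutional layer): here the character features are just hashed trigram counts, and the embedding table and dimensions are hypothetical.

```python
# Sketch: combine a word-level embedding with a simple character-level
# feature vector, so out-of-vocabulary or morphologically rich words
# still get a useful signal. Hashed trigrams stand in for the paper's
# learned convolutional character representation.
from collections import defaultdict

CHAR_DIM = 8  # hypothetical sizes, for illustration only
WORD_DIM = 4

def char_features(word, dim=CHAR_DIM):
    """Hash character trigrams into a fixed-size count vector."""
    vec = [0.0] * dim
    padded = "^" + word.lower() + "$"
    for i in range(len(padded) - 2):
        vec[hash(padded[i:i + 3]) % dim] += 1.0
    return vec

# Toy word-embedding table; unknown words fall back to zeros but
# still carry character-level information after concatenation.
embeddings = defaultdict(lambda: [0.0] * WORD_DIM,
                         {"running": [0.1, 0.2, -0.3, 0.5]})

def word_representation(word):
    return embeddings[word] + char_features(word)  # concatenation

rep_known = word_representation("running")
rep_oov = word_representation("runnings")  # OOV, but shares trigrams
```

An out-of-vocabulary inflection like "runnings" gets a zero word embedding yet a non-zero character vector, which is exactly the intra-word signal the abstract argues POS tagging needs.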
From generic to specific deep representations for visual recognition
- CoRR
Cited by 7 (2 self)
Evidence is mounting that CNNs are currently the most efficient and successful way to learn visual representations. This paper addresses the questions of why CNN representations are so effective and how to improve them if one wants to maximize performance for a single task or a range of tasks. We experimentally assess the importance of different aspects of learning and choosing a CNN representation to its performance on a diverse set of visual recognition tasks. In particular, we investigate how altering the parameters of a network's architecture and its training impacts the representation's ability to specialize and generalize. We also study the effect of fine-tuning a generic network towards a particular task. Extensive experiments indicate the following trends: (a) increasing specialization increases performance on the target task but can hurt the ability to generalize to other tasks, and (b) the less specialized the original network, the more likely it is to benefit from fine-tuning. As by-products we have learnt several deep CNN image representations which, when combined with a simple linear SVM classifier or similarity measure, produce the best performance on 12 standard datasets measuring the ability to solve visual recognition tasks ranging from image classification to image retrieval.
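The "frozen generic features plus simple linear classifier" recipe from this abstract can be sketched without any deep learning framework. Everything here is hypothetical: a fixed feature function stands in for a pre-trained CNN's penultimate layer, and a plain perceptron stands in for the linear SVM.

```python
# Sketch of the generic-deep-features + linear-classifier recipe:
# the feature extractor is frozen (faked here by a fixed function)
# and only a linear classifier is trained on top of its output.
def frozen_features(x):
    """Stand-in for a pre-trained CNN's penultimate-layer output."""
    return [x[0], x[1], x[0] * x[1]]  # fixed, never updated

def train_perceptron(data, epochs=20, lr=0.1):
    w, b = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:  # y in {-1, +1}
            f = frozen_features(x)
            score = sum(wi * fi for wi, fi in zip(w, f)) + b
            if y * score <= 0:  # misclassified: perceptron update
                w = [wi + lr * y * fi for wi, fi in zip(w, f)]
                b += lr * y
    return w, b

# XOR-like task: not linearly separable in the raw inputs, but
# separable in the (frozen) feature space thanks to x0*x1.
data = [((1.0, 1.0), 1), ((-1.0, -1.0), 1),
        ((1.0, -1.0), -1), ((-1.0, 1.0), -1)]
w, b = train_perceptron(data)
correct = sum(
    1 for x, y in data
    if y * (sum(wi * fi for wi, fi in zip(w, frozen_features(x))) + b) > 0
)
```

The point mirrors the abstract's finding: a good fixed representation lets an extremely simple linear model solve a task the raw inputs cannot.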
Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks
Cited by 6 (1 self)
Sentiment analysis of online user-generated content is important for many social media analytics tasks. Researchers have largely relied on textual sentiment analysis to develop systems to predict political elections, measure economic indicators, and so on. Recently, social media users are increasingly using images and videos to express their opinions and share their experiences. Sentiment analysis of such large-scale visual content can help better extract user sentiments toward events or topics, such as those in image tweets, so that prediction of sentiment from visual content is complementary to textual sentiment analysis. Motivated by the need to leverage large-scale yet noisy training data to solve the extremely challenging problem of image sentiment analysis, we employ Convolutional Neural Networks (CNN). We first design a suitable CNN architecture for image sentiment analysis. We obtain half a million training samples by using a baseline sentiment algorithm to label Flickr images. To make use of such noisy machine-labeled data, we employ a progressive strategy to fine-tune the deep network. Furthermore, we improve the performance on Twitter images by inducing domain transfer with a small number of manually labeled Twitter images. We have conducted extensive experiments on manually labeled Twitter images. The results show that the proposed CNN can achieve better performance in image sentiment analysis than competing algorithms.
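The core of a progressive fine-tuning strategy on noisy machine labels is a selection step: keep only the samples the current model agrees with confidently, then retrain on that cleaner subset. A minimal sketch, with a hypothetical `predict_proba` and threshold (the paper's actual selection criterion may differ):

```python
# Sketch of progressive training on noisy labels: filter the
# machine-labeled pool down to samples where the current model
# confidently agrees with the noisy label, for the next round.
def select_confident(samples, predict_proba, threshold=0.8):
    """samples: list of (features, noisy_label) pairs."""
    kept = []
    for x, noisy_label in samples:
        probs = predict_proba(x)            # dict: label -> probability
        label = max(probs, key=probs.get)
        if label == noisy_label and probs[label] >= threshold:
            kept.append((x, noisy_label))
    return kept

# Toy model: probability of "pos" grows with the feature value.
def toy_predict_proba(x):
    p_pos = min(max(x, 0.0), 1.0)
    return {"pos": p_pos, "neg": 1.0 - p_pos}

noisy = [(0.95, "pos"), (0.9, "neg"), (0.05, "neg"), (0.5, "pos")]
subset = select_confident(noisy, toy_predict_proba)
# Only samples where model and noisy label agree confidently survive.
```

Iterating select-then-retrain is what makes the strategy "progressive": each round's model produces a cleaner training set for the next.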
Training Neural Networks with Stochastic Hessian-Free Optimization
Cited by 4 (0 self)
Hessian-free (HF) optimization has been successfully used for training deep autoencoders and recurrent networks. HF uses the conjugate gradient algorithm to construct update directions through curvature-vector products that can be computed in the same order of time as gradients. In this paper we exploit this property and study stochastic HF with gradient and curvature mini-batches independent of the dataset size. We modify Martens' HF for these settings and integrate dropout, a method for preventing co-adaptation of feature detectors, to guard against overfitting. Stochastic Hessian-free optimization provides an intermediary between SGD and HF that achieves competitive performance on both classification and deep autoencoder experiments.
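The curvature-vector products HF relies on cost about as much as a gradient because Hv can be formed from gradients alone, e.g. via the finite-difference approximation Hv ≈ (g(w + εv) − g(w)) / ε (HF implementations typically use the exact R-operator instead; this sketch uses the finite-difference variant on a toy quadratic, where it is exact up to rounding):

```python
# Hessian-vector product from two gradient evaluations:
#   H v ≈ (g(w + eps*v) - g(w)) / eps
# For a quadratic objective the approximation is exact.
def grad(w):
    """Gradient of f(w) = 0.5 * (3*w0^2 + w1^2)."""
    return [3.0 * w[0], w[1]]

def hessian_vector_product(grad_fn, w, v, eps=1e-5):
    g0 = grad_fn(w)
    g1 = grad_fn([wi + eps * vi for wi, vi in zip(w, v)])
    return [(a - b) / eps for a, b in zip(g1, g0)]

w = [1.0, -2.0]
v = [0.5, 1.0]
hv = hessian_vector_product(grad, w, v)
# True Hessian here is diag(3, 1), so Hv = (1.5, 1.0).
```

Conjugate gradient only ever needs such products, never the Hessian itself, which is what keeps each HF update affordable even for large networks.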
Neural Decision Forests for Semantic Image Labelling
Cited by 3 (1 self)
In this work we present Neural Decision Forests, a novel approach to jointly tackle data representation and discriminative learning within randomized decision trees. Recent advances in deep learning architectures demonstrate the power of embedding representation learning within the classifier – an idea not exploited by the hierarchical decision forest model, where the input space is typically left unchanged during training and testing. We bridge this gap by introducing randomized Multi-Layer Perceptrons (rMLP) as new split nodes which are capable of learning non-linear, data-specific representations and taking advantage of them by finding optimal predictions for the emerging child nodes. To prevent overfitting, we i) randomly select the image data fed to the input layer, ii) automatically adapt the rMLP topology to meet the complexity of the data arriving at the node, and iii) introduce an ℓ1-norm based regularization that additionally sparsifies the network. The key findings in our experiments on three different semantic image labelling datasets are consistently improved results and significantly compressed trees compared to conventional classification trees.
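The structural change can be pictured as replacing a tree's axis-aligned threshold test with a small learned network that routes samples. A deliberately stripped-down sketch: a single randomly initialised linear unit plays the role of the split node (the paper's rMLP split nodes are multi-layer and trained, which this does not show):

```python
# Sketch: a decision-tree split whose routing function is a tiny
# randomly initialised perceptron instead of an axis-aligned
# threshold, in the spirit of the rMLP split nodes described above.
import random

def make_perceptron_split(dim, rng):
    """Random linear split: route left if w.x + b < 0, else right."""
    w = [rng.uniform(-1, 1) for _ in range(dim)]
    b = rng.uniform(-1, 1)
    def route(x):
        s = sum(wi * xi for wi, xi in zip(w, x)) + b
        return "left" if s < 0 else "right"
    return route

rng = random.Random(0)          # seeded for reproducibility
split = make_perceptron_split(2, rng)
points = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0)]
routes = [split(p) for p in points]
```

Because the split is an oblique (and, in the full model, non-linear) function of all input dimensions, each node can carve the data in ways a single-feature threshold cannot.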
REVISITING HYBRID AND GMM-HMM SYSTEM COMBINATION TECHNIQUES
Cited by 2 (1 self)
In this paper we investigate techniques to combine hybrid HMM-DNN (hidden Markov model – deep neural network) and tandem HMM-GMM (hidden Markov model – Gaussian mixture model) acoustic models using: (1) model averaging, and (2) lattice combination with Minimum Bayes Risk decoding. We have performed experiments on the “TED Talks” task following the protocol of the IWSLT-2012 evaluation. Our experimental results suggest that DNN-based and GMM-based acoustic models are complementary, with error rates being reduced by up to 8% relative when the DNN and GMM systems are combined at the model level in a multi-pass automatic speech recognition (ASR) system. Additionally, further gains were obtained by combining model-averaged lattices with the one obtained from the baseline systems. Index Terms: deep neural networks, tandem, hybrid, system combination, TED
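One common form of model-level combination is a weighted log-linear interpolation of the two systems' per-frame distributions. A minimal sketch, with an illustrative weight (the paper's exact averaging scheme and weights are not specified here):

```python
# Sketch of frame-level model averaging: blend DNN and GMM
# distributions log-linearly with weight alpha, then renormalise.
import math

def combine_posteriors(p_dnn, p_gmm, alpha=0.6):
    """Log-linear interpolation of two probability distributions."""
    log_comb = [alpha * math.log(a) + (1 - alpha) * math.log(b)
                for a, b in zip(p_dnn, p_gmm)]
    z = sum(math.exp(v) for v in log_comb)
    return [math.exp(v) / z for v in log_comb]

p = combine_posteriors([0.7, 0.2, 0.1], [0.5, 0.3, 0.2])
# Both models favour state 0, so the blend does too.
```

The complementarity claim in the abstract is exactly why such a blend can beat either system alone: where the DNN and GMM disagree, the interpolation tempers each model's confident mistakes.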
(Better than) State-of-the-Art PoS-tagging for Italian Texts
Cited by 1 (1 self)
This paper presents some experiments in the construction of a high-performance PoS-tagger for Italian using deep neural network (DNN) techniques integrated with a powerful Italian morphological analyser. The results obtained by the proposed system on standard datasets taken from the EVALITA campaigns show large accuracy improvements when compared with previous systems from the literature.
Fast Classification of Meat Spoilage Markers Using Nanostructured ZnO Thin Films and Unsupervised Feature Learning
, 2013
Toward deep learning software repositories
- in Proceedings of the 12th Working Conference on Mining Software Repositories
, 2015
Cited by 1 (0 self)
Deep learning subsumes algorithms that automatically learn compositional representations. The ability of these models to generalize well has ushered in tremendous advances in many fields such as natural language processing (NLP). Recent research in the software engineering (SE) community has demonstrated the usefulness of applying NLP techniques to software corpora. Hence, we motivate deep learning for software language modeling, highlighting fundamental differences between state-of-the-practice software language models and connectionist models. Our deep learning models are applicable to source code files (since they only require lexically analyzed source code written in any programming language) and other types of artifacts. We show how a particular deep learning model can remember its state to effectively model sequential data, e.g., streaming software tokens, and the state is shown to be much more expressive than discrete tokens in a prefix. Then we instantiate deep learning models and show that deep learning induces high-quality models compared to n-grams and cache-based n-grams on a corpus of Java projects. We experiment with two of the models' hyperparameters, which govern their capacity and the amount of context they use to inform predictions, before building several committees of software language models to aid generalization. Then we apply the deep learning models to code suggestion and demonstrate their effectiveness at a real SE task compared to state-of-the-practice models. Finally, we propose avenues for future work, where deep learning can be brought to bear to support model-based testing, improve software lexicons, and conceptualize software artifacts. Thus, our work serves as the first step toward deep learning software repositories. Keywords: software repositories, machine learning, deep learning, software language models, n-grams, neural networks