Results 1  10
of
28,597
Imagenet classification with deep convolutional neural networks.
 In Advances in the Neural Information Processing System,
, 2012
"... Abstract We trained a large, deep convolutional neural network to classify the 1.2 million highresolution images in the ImageNet LSVRC2010 contest into the 1000 different classes. On the test data, we achieved top1 and top5 error rates of 37.5% and 17.0% which is considerably better than the pr ..."
Abstract

Cited by 1010 (11 self)
 Add to MetaCart
Abstract We trained a large, deep convolutional neural network to classify the 1.2 million highresolution images in the ImageNet LSVRC2010 contest into the 1000 different classes. On the test data, we achieved top1 and top5 error rates of 37.5% and 17.0% which is considerably better than
A Neural Probabilistic Language Model
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2003
"... A goal of statistical language modeling is to learn the joint probability function of sequences of words in a language. This is intrinsically difficult because of the curse of dimensionality: a word sequence on which the model will be tested is likely to be different from all the word sequences seen ..."
Abstract

Cited by 447 (19 self)
 Add to MetaCart
is itself a significant challenge. We report on experiments using neural networks for the probability function, showing on two text corpora that the proposed approach significantly improves on stateoftheart ngram models, and that the proposed approach allows to take advantage of longer contexts.
Active Learning with Statistical Models
, 1995
"... For manytypes of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992# Cohn, 1994]. We then showhow the same principles may be used to select data for two alternative, statist ..."
Abstract

Cited by 679 (10 self)
 Add to MetaCart
For manytypes of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992# Cohn, 1994]. We then showhow the same principles may be used to select data for two alternative
RealTime Computing Without Stable States: A New Framework for Neural Computation Based on Perturbations
"... A key challenge for neural modeling is to explain how a continuous stream of multimodal input from a rapidly changing environment can be processed by stereotypical recurrent circuits of integrateandfire neurons in realtime. We propose a new computational model for realtime computing on timevar ..."
Abstract

Cited by 469 (38 self)
 Add to MetaCart
A key challenge for neural modeling is to explain how a continuous stream of multimodal input from a rapidly changing environment can be processed by stereotypical recurrent circuits of integrateandfire neurons in realtime. We propose a new computational model for realtime computing on time
A Model of Saliencybased Visual Attention for Rapid Scene Analysis
, 1998
"... A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented. Multiscale image features are combined into a single topographical saliency map. A dynamical neural network then selects attended locations in order of decreasing salie ..."
Abstract

Cited by 1748 (72 self)
 Add to MetaCart
A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented. Multiscale image features are combined into a single topographical saliency map. A dynamical neural network then selects attended locations in order of decreasing
Learning to rank using gradient descent
 In ICML
, 2005
"... We investigate using gradient descent methods for learning ranking functions; we propose a simple probabilistic cost function, and we introduce RankNet, an implementation of these ideas using a neural network to model the underlying ranking function. We present test results on toy data and on data f ..."
Abstract

Cited by 534 (17 self)
 Add to MetaCart
We investigate using gradient descent methods for learning ranking functions; we propose a simple probabilistic cost function, and we introduce RankNet, an implementation of these ideas using a neural network to model the underlying ranking function. We present test results on toy data and on data
Improved prediction of signal peptides  SignalP 3.0
 J. MOL. BIOL.
, 2004
"... We describe improvements of the currently most popular method for prediction of classically secreted proteins, SignalP. SignalP consists of two different predictors based on neural network and hidden Markov model algorithms, where both components have been updated. Motivated by the idea that the cle ..."
Abstract

Cited by 654 (7 self)
 Add to MetaCart
We describe improvements of the currently most popular method for prediction of classically secreted proteins, SignalP. SignalP consists of two different predictors based on neural network and hidden Markov model algorithms, where both components have been updated. Motivated by the idea
Gradientbased learning applied to document recognition
 Proceedings of the IEEE
, 1998
"... Multilayer neural networks trained with the backpropagation algorithm constitute the best example of a successful gradientbased learning technique. Given an appropriate network architecture, gradientbased learning algorithms can be used to synthesize a complex decision surface that can classify hi ..."
Abstract

Cited by 1533 (84 self)
 Add to MetaCart
Multilayer neural networks trained with the backpropagation algorithm constitute the best example of a successful gradientbased learning technique. Given an appropriate network architecture, gradientbased learning algorithms can be used to synthesize a complex decision surface that can classify
A new learning algorithm for blind signal separation

, 1996
"... A new online learning algorithm which minimizes a statistical dependency among outputs is derived for blind separation of mixed signals. The dependency is measured by the average mutual information (MI) of the outputs. The source signals and the mixing matrix are unknown except for the number of ..."
Abstract

Cited by 622 (80 self)
 Add to MetaCart
of the sources. The GramCharlier expansion instead of the Edgeworth expansion is used in evaluating the MI. The natural gradient approach is used to minimize the MI. A novel activation function is proposed for the online learning algorithm which has an equivariant property and is easily implemented on a neural
Results 1  10
of
28,597