Results 1–10 of 7,425
Linear spatial pyramid matching using sparse coding for image classification
in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009
"... Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n 2 ∼ n 3) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scaleup the algo ..."
Abstract

Cited by 497 (21 self)
 Add to MetaCart
Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n 2 ∼ n 3) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scaleup
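As a rough illustration of the pipeline this abstract describes (sparse-code local descriptors over a learned dictionary, then max-pool the codes in spatial pyramid cells so a linear SVM suffices), here is a minimal sketch; the dictionary, descriptors, and parameters below are stand-ins, not the paper's setup:

```python
# Sketch: sparse-code descriptors, max-pool over spatial pyramid cells,
# producing one image feature vector for a linear SVM. All data is synthetic.
import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.default_rng(0)
D = rng.standard_normal((256, 128))            # hypothetical 256-atom dictionary
D /= np.linalg.norm(D, axis=1, keepdims=True)  # unit-norm atoms

descriptors = rng.standard_normal((500, 128))  # e.g. dense SIFT descriptors
positions = rng.uniform(0, 1, size=(500, 2))   # normalized (x, y) locations

coder = SparseCoder(dictionary=D, transform_algorithm="lasso_lars",
                    transform_alpha=0.15)
codes = coder.transform(descriptors)           # (500, 256) sparse codes

def pyramid_max_pool(codes, positions, levels=(1, 2, 4)):
    """Max-pool sparse codes over each cell of a spatial pyramid."""
    feats = []
    for l in levels:
        cell = np.minimum((positions * l).astype(int), l - 1)  # cell per point
        for i in range(l):
            for j in range(l):
                mask = (cell[:, 0] == i) & (cell[:, 1] == j)
                pooled = (codes[mask].max(axis=0) if mask.any()
                          else np.zeros(codes.shape[1]))
                feats.append(pooled)
    return np.concatenate(feats)

image_feature = pyramid_max_pool(codes, positions)  # feed to a linear SVM
print(image_feature.shape)                          # (1+4+16) * 256 = 5376 dims
```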
Optimal Brain Damage
1990
"... We have used informationtheoretic ideas to derive a class of practical and nearly optimal schemes for adapting the size of a neural network. By removing unimportant weights from a network, several improvements can be expected: better generalization, fewer training examples required, and improved sp ..."
Abstract

Cited by 510 (5 self)
 Add to MetaCart
We have used informationtheoretic ideas to derive a class of practical and nearly optimal schemes for adapting the size of a neural network. By removing unimportant weights from a network, several improvements can be expected: better generalization, fewer training examples required, and improved
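A hedged sketch of the pruning rule this abstract implies: rank each weight by the saliency s_k = h_kk w_k^2 / 2, where h_kk is the diagonal of the Hessian of the loss, and delete the least salient weights. The squared-gradient estimate of h_kk below is a common approximation, an assumption on my part rather than the paper's exact recipe:

```python
# Sketch of Optimal-Brain-Damage-style pruning by saliency.
import numpy as np

def obd_prune(weights, grads_per_example, frac=0.3):
    """Zero out the fraction `frac` of weights with the smallest saliency."""
    h_diag = np.mean(grads_per_example ** 2, axis=0)  # approx. diagonal Hessian
    saliency = 0.5 * h_diag * weights ** 2            # s_k = h_kk * w_k^2 / 2
    cutoff = np.quantile(saliency, frac)
    mask = saliency > cutoff                          # keep high-saliency weights
    return weights * mask, mask

rng = np.random.default_rng(1)
w = rng.standard_normal(1000)
g = rng.standard_normal((64, 1000))                   # per-example gradients
w_pruned, kept = obd_prune(w, g)
print(f"kept {kept.mean():.0%} of weights")           # ~70%
```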
Inductive learning algorithms and representations for text categorization
in Proceedings of the International Conference on Information and Knowledge Management, 1998
"... ABSTRACT Text categorization the assignment of natural language texts to one or more predefined categories based on their content is an important component in many information organization and management tasks. We compare the effectiveness of five different automatic learning algorithms for text ..."
Abstract

Cited by 652 (8 self)
 Add to MetaCart
categorization in terms of learning speed, realtime classification speed, and classification accuracy. We also examine training set size, and alternative document representations. Very accurate text classifiers can be learned automatically from training examples. Linear Support Vector Machines (SVMs
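As a minimal sketch of the kind of classifier the study evaluates, a linear SVM over bag-of-words features can be set up as below; the toy corpus and labels are invented:

```python
# Linear SVM text classifier on tf-idf bag-of-words features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

docs = ["wheat prices rose sharply", "the fed raised interest rates",
        "corn and wheat harvest outlook", "central bank cuts rates"]
labels = ["grain", "money", "grain", "money"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(docs, labels)
print(clf.predict(["wheat harvest news", "interest rate decision"]))
# expected: ['grain' 'money']
```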
Sequential minimal optimization: A fast algorithm for training support vector machines
in Advances in Kernel Methods: Support Vector Learning, 1999
"... This paper proposes a new algorithm for training support vector machines: Sequential Minimal Optimization, or SMO. Training a support vector machine requires the solution of a very large quadratic programming (QP) optimization problem. SMO breaks this large QP problem into a series of smallest possi ..."
Abstract

Cited by 461 (3 self)
 Add to MetaCart
possible QP problems. These small QP problems are solved analytically, which avoids using a timeconsuming numerical QP optimization as an inner loop. The amount of memory required for SMO is linear in the training set size, which allows SMO to handle very large training sets. Because matrix computation
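A sketch of SMO's core analytic step, under the usual dual formulation: the smallest viable QP subproblem optimizes two multipliers jointly, and that two-variable problem has a closed-form solution. The pair-selection heuristics, threshold update, and outer loop are omitted here:

```python
# One analytic SMO step for a linear-kernel SVM dual problem.
import numpy as np

def smo_step(alpha, i, j, X, y, b, C=1.0):
    """Analytically optimize alpha[i], alpha[j]; all other multipliers fixed."""
    K = X @ X.T
    f = (alpha * y) @ K + b                        # current decision values
    E_i, E_j = f[i] - y[i], f[j] - y[j]            # prediction errors
    eta = K[i, i] + K[j, j] - 2 * K[i, j]          # curvature along the pair
    if eta <= 0:
        return alpha                               # skip degenerate pair
    # box constraints keep sum(alpha * y) fixed and 0 <= alpha <= C
    if y[i] != y[j]:
        L, H = max(0, alpha[j] - alpha[i]), min(C, C + alpha[j] - alpha[i])
    else:
        L, H = max(0, alpha[i] + alpha[j] - C), min(C, alpha[i] + alpha[j])
    a_j = np.clip(alpha[j] + y[j] * (E_i - E_j) / eta, L, H)
    alpha = alpha.copy()
    alpha[i] += y[i] * y[j] * (alpha[j] - a_j)     # keep constraint satisfied
    alpha[j] = a_j
    return alpha

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2)); y = np.where(X[:, 0] > 0, 1, -1)
alpha = smo_step(np.zeros(20), 0, 1, X, y, b=0.0)
```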
Boosting the margin: A new explanation for the effectiveness of voting methods
in Proceedings of the International Conference on Machine Learning, 1997
"... One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero. In this paper, we show that this ..."
Abstract

Cited by 897 (52 self)
 Add to MetaCart
One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero. In this paper, we show
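A worked sketch of the quantity the paper studies: the normalized margin of a voting classifier on example (x, y) is y * sum_t(a_t h_t(x)) / sum_t(a_t), which lies in [-1, 1]. The toy AdaBoost below (decision stumps, synthetic data) just makes that margin distribution concrete:

```python
# AdaBoost with stumps, then the margin distribution on the training set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)
y = 2 * y - 1                                      # labels in {-1, +1}

w = np.full(len(y), 1 / len(y))                    # example weights
hyps, alphas = [], []
for _ in range(100):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = w @ (pred != y)
    a = 0.5 * np.log((1 - err) / max(err, 1e-12))  # hypothesis weight
    w = w * np.exp(-a * y * pred); w /= w.sum()    # reweight examples
    hyps.append(stump); alphas.append(a)

votes = sum(a * h.predict(X) for a, h in zip(alphas, hyps))
margins = y * votes / sum(alphas)                  # normalized margins in [-1, 1]
print(f"min margin {margins.min():.3f}, median {np.median(margins):.3f}")
```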
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
"... We describe and analyze a simple and effective stochastic subgradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ɛ is Õ(1/ɛ), where each iteration operates on a singl ..."
Abstract

Cited by 542 (20 self)
 Add to MetaCart
runtime of our method is Õ(d/(λɛ)), where d is a bound on the number of nonzero features in each example. Since the runtime does not depend directly on the size of the training set, the resulting algorithm is especially suited for learning from large datasets. Our approach also extends to non
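A minimal sketch of the update the abstract describes, assuming the standard Pegasos formulation: at step t, sample one example and take a subgradient step of size 1/(λt) on the regularized hinge loss (the optional projection step is omitted):

```python
# Pegasos-style stochastic subgradient descent for a linear SVM.
import numpy as np

def pegasos(X, y, lam=0.1, iters=10_000, seed=0):
    """Minimize (lam/2)||w||^2 + mean hinge loss, one example per step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for t in range(1, iters + 1):
        i = rng.integers(len(y))                  # sample one example
        eta = 1.0 / (lam * t)                     # Pegasos step size
        if y[i] * (w @ X[i]) < 1:                 # hinge loss is active
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:
            w = (1 - eta * lam) * w
    return w

rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 5))
y = np.sign(X[:, 0] + 0.1 * rng.standard_normal(1000))
w = pegasos(X, y)
print(f"train accuracy {np.mean(np.sign(X @ w) == y):.2f}")
```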
Distance metric learning for large margin nearest neighbor classification
in NIPS, 2006
"... We show how to learn a Mahanalobis distance metric for knearest neighbor (kNN) classification by semidefinite programming. The metric is trained with the goal that the knearest neighbors always belong to the same class while examples from different classes are separated by a large margin. On seven ..."
Abstract

Cited by 695 (14 self)
 Add to MetaCart
. On seven data sets of varying size and difficulty, we find that metrics trained in this way lead to significant improvements in kNN classification—for example, achieving a test error rate of 1.3 % on the MNIST handwritten digits. As in support vector machines (SVMs), the learning problem reduces to a
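A hedged sketch of the objective the abstract describes, for a Mahalanobis metric d_M(x, x') = (x - x')^T M (x - x') with M positive semidefinite: pull each point's target neighbors close and hinge-penalize differently labeled points that come within a unit margin of them. Solving the actual semidefinite program is beyond this sketch; the function name and the mu trade-off are assumptions:

```python
# LMNN-style loss: "pull" term on target neighbors, "push" hinge on impostors.
import numpy as np

def lmnn_loss(M, X, y, targets, mu=0.5):
    """targets[i] lists same-class target neighbors of example i."""
    d = lambda i, j: (X[i] - X[j]) @ M @ (X[i] - X[j])
    pull, push = 0.0, 0.0
    for i in range(len(X)):
        for j in targets[i]:
            pull += d(i, j)                              # keep neighbors close
            for l in np.flatnonzero(y != y[i]):          # candidate impostors
                push += max(0.0, 1 + d(i, j) - d(i, l))  # unit-margin hinge
    return (1 - mu) * pull + mu * push

rng = np.random.default_rng(2)
X = rng.standard_normal((30, 4)); y = rng.integers(0, 2, 30)
targets = [[j for j in np.flatnonzero(y == y[i]) if j != i][:3]
           for i in range(len(X))]
print(lmnn_loss(np.eye(4), X, y, targets))  # loss under the Euclidean metric
```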
The cascade-correlation learning architecture
in Advances in Neural Information Processing Systems 2, 1990
"... CascadeCorrelation is a new architecture and supervised learning algorithm for artificial neural networks. Instead of just adjusting the weights in a network of fixed topology, CascadeCorrelation begins with a minimal network, then automatically trains and adds new hidden units one by one, creatin ..."
Abstract

Cited by 801 (6 self)
 Add to MetaCart
Correlation architecture has several advantages over existing algorithms: it learns very quickly, the network determines its own size and topology, it retains the structures it has built even if the training set changes, and it requires no backpropagation of error signals through the connections of the network.
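A rough sketch of the loop the abstract describes: repeatedly train a candidate unit to maximize the correlation between its output and the current residual error, freeze it, cascade its output in as a new input feature, and retrain the output weights. The tanh units, linear output layer, and plain gradient steps below are simplifying assumptions:

```python
# Cascade-Correlation-style training on a toy regression problem.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3)); t = np.sin(X[:, 0]) + 0.1 * X[:, 1]
F = np.hstack([X, np.ones((len(X), 1))])          # growing feature matrix

for _ in range(5):                                 # add 5 hidden units
    w_out = np.linalg.lstsq(F, t, rcond=None)[0]   # retrain output weights
    resid = t - F @ w_out
    v = rng.standard_normal(F.shape[1]) * 0.1      # candidate unit weights
    for _ in range(200):                           # maximize |cov(h, resid)|
        h = np.tanh(F @ v)
        cov = (h - h.mean()) @ (resid - resid.mean())
        grad = np.sign(cov) * ((1 - h**2) * (resid - resid.mean())) @ F
        v += 0.01 * grad / len(X)
    F = np.hstack([F, np.tanh(F @ v)[:, None]])    # freeze unit, cascade input

w_out = np.linalg.lstsq(F, t, rcond=None)[0]
print(f"final MSE {np.mean((t - F @ w_out) ** 2):.4f}")
```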
An Empirical Study of Smoothing Techniques for Language Modeling
1998
"... We present an extensive empirical comparison of several smoothing techniques in the domain of language modeling, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate for the first time how factors such as training data size, corpus (e.g., Br ..."
Abstract

Cited by 1224 (21 self)
 Add to MetaCart
We present an extensive empirical comparison of several smoothing techniques in the domain of language modeling, including those described by Jelinek and Mercer (1980), Katz (1987), and Church and Gale (1991). We investigate for the first time how factors such as training data size, corpus (e
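As a toy example of one technique such a comparison covers, Jelinek-Mercer smoothing interpolates the bigram maximum-likelihood estimate with the unigram distribution: p(w | h) = λ p_ML(w | h) + (1 - λ) p(w). The corpus and λ below are made up; in practice λ is tuned on held-out data:

```python
# Jelinek-Mercer interpolated bigram model on a tiny corpus.
from collections import Counter

corpus = "the cat sat on the mat the cat ate".split()
unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def p_jm(w, h, lam=0.7):
    """P(w | h) interpolating the bigram MLE with the unigram distribution."""
    p_ml = bigrams[(h, w)] / unigrams[h] if unigrams[h] else 0.0
    p_uni = unigrams[w] / len(corpus)
    return lam * p_ml + (1 - lam) * p_uni

print(f"p(cat | the) = {p_jm('cat', 'the'):.3f}")
```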
On Discriminative vs. Generative classifiers: A comparison of logistic regression and naive Bayes
2001
"... We compare discriminative and generative learning as typified by logistic regression and naive Bayes. We show, contrary to a widely held belief that discriminative classifiers are almost always to be preferred, that there can often be two distinct regimes of performance as the training set size is i ..."
Abstract

Cited by 520 (8 self)
 Add to MetaCart
We compare discriminative and generative learning as typified by logistic regression and naive Bayes. We show, contrary to a widely held belief that discriminative classifiers are almost always to be preferred, that there can often be two distinct regimes of performance as the training set size
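A sketch of the experiment shape the abstract suggests: train naive Bayes (generative) and logistic regression (discriminative) on growing subsets and compare held-out accuracy; on many datasets the generative model wins at small n and the discriminative one overtakes it later. The dataset and sizes here are arbitrary:

```python
# Learning curves: naive Bayes vs. logistic regression on growing subsets.
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2200, n_features=20, random_state=0)
X_test, y_test = X[2000:], y[2000:]               # held-out evaluation set

for n in (20, 100, 500, 2000):
    nb = GaussianNB().fit(X[:n], y[:n])
    lr = LogisticRegression(max_iter=1000).fit(X[:n], y[:n])
    print(f"n={n:4d}  NB acc={nb.score(X_test, y_test):.2f}  "
          f"LR acc={lr.score(X_test, y_test):.2f}")
```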