  Chunking-Synthetic Approaches to Large-Scale Kernel Machines

by Francisco J. González-Castaño, Robert R. Meyer
ftp://ftp.cs.wisc.edu/math-prog/tech-reports/00-04.pdf

Abstract:

We consider a kernel-based approach to nonlinear classification that combines the generation of “synthetic” points (to be used in the kernel) with “chunking” (working with subsets of the data) to significantly reduce the size of the optimization problems required to construct classifiers for massive datasets. Rather than solving a single massive classification problem involving all points in the training set, we solve a series of problems that gradually increase in size and that use kernels based on small numbers of synthetic points. These synthetic points are generated by solving relatively small nonlinear unconstrained optimization problems. In addition to greatly reducing optimization problem size, the procedure that we describe has the advantage of being easily parallelized. Computational results show that our method efficiently generates high-performance classifiers on a variety of problems involving both real and randomly generated datasets.
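The combination of synthetic kernel points and chunking described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not specify the kernel, the loss, the optimizer, or the chunk schedule, so the Gaussian kernel, the logistic loss, plain gradient descent, the joint update of weights and synthetic points, the toy disk dataset, and every function name below are assumptions chosen only for illustration.

```python
import math
import random

GAMMA = 0.5  # assumed Gaussian kernel width; not taken from the paper


def kernel(x, z):
    """Gaussian kernel between a data point x and a synthetic point z."""
    return math.exp(-GAMMA * sum((a - b) ** 2 for a, b in zip(x, z)))


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def decision(x, Z, w, b):
    """Classifier output f(x) = sum_j w_j K(x, z_j) + b."""
    return sum(wj * kernel(x, zj) for wj, zj in zip(w, Z)) + b


def fit_chunk(X, y, Z, w, b, steps=150, lr=0.5):
    """Gradient descent on the mean logistic loss over w, b and the
    synthetic points Z (a small unconstrained nonlinear problem).
    Z and w are updated in place; the new (w, b) is returned."""
    n, k, d = len(X), len(Z), len(X[0])
    for _ in range(steps):
        gw, gb = [0.0] * k, 0.0
        gZ = [[0.0] * d for _ in range(k)]
        for xi, yi in zip(X, y):
            phi = [kernel(xi, zj) for zj in Z]
            f = sum(wj * pj for wj, pj in zip(w, phi)) + b
            g = -yi * sigmoid(-yi * f)  # derivative of the loss w.r.t. f
            gb += g
            for j in range(k):
                gw[j] += g * phi[j]
                for c in range(d):
                    # d phi_j / d z_jc = phi_j * 2 * GAMMA * (x_c - z_jc)
                    gZ[j][c] += g * w[j] * phi[j] * 2.0 * GAMMA * (xi[c] - Z[j][c])
        b -= lr * gb / n
        for j in range(k):
            w[j] -= lr * gw[j] / n
            for c in range(d):
                Z[j][c] -= lr * gZ[j][c] / n
    return w, b


def train_chunked(X, y, k=4, chunk_sizes=(50, 150, 400), seed=0):
    """Solve a series of problems of growing size, warm-starting each one
    from the previous solution. The kernel only ever uses k synthetic points."""
    rng = random.Random(seed)
    # Initialise the synthetic points at k data points from the first chunk.
    Z = [list(X[i]) for i in rng.sample(range(chunk_sizes[0]), k)]
    w, b = [0.0] * k, 0.0
    for m in chunk_sizes:
        w, b = fit_chunk(X[:m], y[:m], Z, w, b)
    return Z, w, b


def make_disk_data(n, seed=0):
    """Toy data: label +1 inside a disk of radius 1.6, -1 outside."""
    rng = random.Random(seed)
    X = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(n)]
    y = [1 if math.hypot(p[0], p[1]) < 1.6 else -1 for p in X]
    return X, y


def accuracy(X, y, Z, w, b):
    """Fraction of points whose predicted sign matches the label."""
    hits = sum((decision(x, Z, w, b) > 0) == (yi > 0) for x, yi in zip(X, y))
    return hits / len(X)


X, y = make_disk_data(400)
Z, w, b = train_chunked(X, y)
print("synthetic points:", len(Z), "training accuracy:", accuracy(X, y, Z, w, b))
```

In this sketch the number of optimization variables depends only on the number of synthetic points and the input dimension, not on the number of training points, which is the size reduction the abstract refers to. The per-point gradient terms inside `fit_chunk` are independent of one another, so the inner loop is the natural place to parallelize.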
