A Comparison of Methods for Multiclass Support Vector Machines
- IEEE TRANS. NEURAL NETWORKS
, 2002
Cited by 369 (12 self)
Support vector machines (SVMs) were originally designed for binary classification. How to extend them effectively to multiclass classification is still an ongoing research issue. Several methods have been proposed in which a multiclass classifier is typically constructed by combining several binary classifiers. Some authors have also proposed methods that consider all classes at once. Because multiclass problems are computationally more expensive to solve, comparisons of these methods on large-scale problems have not been seriously conducted. In particular, methods that solve the multiclass SVM in one step require a much larger optimization problem, so experiments to date have been limited to small data sets. In this paper we give decomposition implementations for two such “all-together” methods. We then compare their performance with three methods based on binary classification: “one-against-all,” “one-against-one,” and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the “one-against-one” and DAG methods are more suitable for practical use than the other methods. Results also show that, for large problems, methods that consider all data at once generally need fewer support vectors.
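The two simplest binary-combination schemes named in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: `train_binary` is a hypothetical stand-in that builds a linear rule from class centroids instead of training an SVM, and the toy data are invented for illustration. Only the way the binary decision functions are combined is the point here.

```python
from collections import Counter
from itertools import combinations

def train_binary(pos, neg):
    """Hypothetical stand-in for a binary SVM (assumption, not an SVM):
    a linear rule through the midpoint of the two class centroids.
    Returns f, where f(x) > 0 means 'positive class'."""
    dim = len(pos[0])
    mean_pos = [sum(p[i] for p in pos) / len(pos) for i in range(dim)]
    mean_neg = [sum(q[i] for q in neg) / len(neg) for i in range(dim)]
    w = [a - b for a, b in zip(mean_pos, mean_neg)]
    mid = [(a + b) / 2 for a, b in zip(mean_pos, mean_neg)]
    bias = -sum(wi * mi for wi, mi in zip(w, mid))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + bias

def one_against_all(data):
    """Train k binary rules (class c vs. everything else) and
    predict the class whose rule gives the largest value."""
    rules = {}
    for c in data:
        rest = [x for d, pts in data.items() if d != c for x in pts]
        rules[c] = train_binary(data[c], rest)
    return lambda x: max(rules, key=lambda c: rules[c](x))

def one_against_one(data):
    """Train k(k-1)/2 binary rules (one per pair of classes) and
    predict by majority vote; ties go to the first class counted."""
    rules = {(a, b): train_binary(data[a], data[b])
             for a, b in combinations(sorted(data), 2)}
    def predict(x):
        votes = Counter(a if f(x) > 0 else b for (a, b), f in rules.items())
        return votes.most_common(1)[0][0]
    return predict

# Invented toy data: three well-separated 2-D clusters.
data = {
    0: [(0, 0), (1, 0), (0, 1)],
    1: [(10, 0), (9, 0), (10, 1)],
    2: [(0, 10), (1, 10), (0, 9)],
}
queries = [(9, 1), (1, 1), (1, 9)]
ova = one_against_all(data)
ovo = one_against_one(data)
ova_predictions = [ova(q) for q in queries]
ovo_predictions = [ovo(q) for q in queries]
```

With k classes, one-against-all trains k rules on all the data each, while one-against-one trains k(k-1)/2 rules on two classes' data each. DAGSVM uses the same pairwise rules as one-against-one but evaluates them along a decision graph instead of voting; that variant is not sketched here.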
Multicategory Support Vector Machines, theory, and application to the classification of microarray data and satellite radiance data
- Journal of the American Statistical Association
, 2004
Cited by 116 (10 self)
Two-category support vector machines (SVM) have been very popular in the machine learning community for classification problems. Solving multicategory problems by a series of binary classifiers is quite common in the SVM paradigm; however, this approach may fail under various circumstances. We propose the multicategory support vector machine (MSVM), which extends the binary SVM to the multicategory case and has good theoretical properties. The proposed method provides a unifying framework when there are either equal or unequal misclassification costs. As a tuning criterion for the MSVM, an approximate leave-one-out cross-validation function, called Generalized Approximate Cross Validation, is derived, analogous to the binary case. The effectiveness of the MSVM is demonstrated through applications to cancer classification using microarray data and cloud classification with satellite radiance profiles.
Everything Old Is New Again: A Fresh Look at Historical Approaches in Machine Learning
- PhD thesis, MIT
, 2002
Cited by 68 (5 self)
Support vector machines classification with a very large-scale taxonomy
- SIGKDD Explorations
, 2005
Cited by 32 (2 self)
Very large-scale classification taxonomies typically have hundreds
A Clustering Technique for the Identification of Piecewise Affine Systems
, 2001
Cited by 30 (7 self)
We propose a new technique for the identification of discrete-time hybrid systems in the Piece-Wise Affine (PWA) form. This problem can be formulated as the reconstruction of a possibly discontinuous PWA map with a multi-dimensional domain. To achieve this goal, we provide an algorithm that combines clustering, linear identification, and pattern recognition techniques. This makes it possible to identify both the affine submodels and the polyhedral partition of the domain on which each submodel is valid, avoiding gridding procedures. Moreover, the clustering step (used for classifying the data points) is performed in a suitably defined feature space, which also makes it possible to reconstruct different submodels that share the same coefficients but are defined on different regions. Measures of confidence on the samples are introduced and exploited to improve the performance of both the clustering and the final linear regression procedure.
On the consistency of multiclass classification methods
- In Proceedings of the 18th Conference on Computational Learning Theory (COLT)
, 2005
Cited by 25 (1 self)
Binary classification is a well studied special case of the classification problem. Statistical properties of binary classifiers, such as consistency, have been investigated in a variety of settings. Binary classification methods can be generalized in many ways to handle multiple classes. It turns out that one can lose consistency in generalizing a binary classification method to deal with multiple classes. We study a rich family of multiclass methods and provide a necessary and sufficient condition for their consistency. We illustrate our approach by applying it to some multiclass methods proposed in the literature.
Combining protein secondary structure prediction models with ensemble methods of optimal complexity
, 2004
Multicategory proximal support vector machine classifiers
- Machine Learning
, 2001
Cited by 16 (0 self)
Given a dataset, each element of which is labeled by one of k labels, we construct, by a very fast algorithm, a k-category proximal support vector machine (PSVM) classifier. Proximal support vector machines and related approaches (Fung & Mangasarian, 2001; Suykens & Vandewalle, 1999) can be interpreted as ridge regression applied to classification problems (Evgeniou, Pontil, & Poggio, 2000). Extensive computational results have shown the effectiveness of PSVM for two-class classification problems, where the separating plane is constructed in time that can be as little as two orders of magnitude shorter than that of conventional support vector machines. When PSVM is applied to problems with more than two classes, the well-known one-from-the-rest approach is a natural choice for taking advantage of its fast performance. However, this one-from-the-rest approach has a drawback: the resulting two-class problems are often very unbalanced, leading in some cases to poor performance. We propose balancing the k classes and a novel Newton refinement modification to PSVM to deal with this problem. Computational results indicate that these two modifications preserve the speed of PSVM while often leading to significant test set improvement over a plain PSVM one-from-the-rest application. The modified approach is considerably faster than other one-from-the-rest methods that use conventional SVM formulations, while still giving comparable test set correctness.
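The ridge-regression reading of this abstract can be illustrated with a rough sketch of a plain one-from-the-rest classifier: each class gets a regularized least-squares fit to +1/-1 targets, and prediction takes the largest fitted value. This is a generic ridge fit under stated assumptions, not the paper's PSVM formulation, and it omits the balancing and Newton refinement steps. The function names, the regularization weight `lam`, and the toy data are all hypothetical.

```python
def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def ridge_fit(points, targets, lam):
    """Minimize ||X beta - t||^2 + lam * ||beta||^2, where X is the data
    with a constant column appended (so the bias is regularized too)."""
    X = [list(p) + [1.0] for p in points]
    d = len(X[0])
    gram = [[sum(x[i] * x[j] for x in X) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(x[i] * t for x, t in zip(X, targets)) for i in range(d)]
    return solve(gram, rhs)

def one_from_the_rest(points, labels, lam=0.1):
    """One ridge fit per class: targets are +1 for that class, -1 for
    the rest. Predict the class with the largest fitted value."""
    classes = sorted(set(labels))
    models = {c: ridge_fit(points, [1.0 if l == c else -1.0 for l in labels], lam)
              for c in classes}
    def predict(x):
        xa = list(x) + [1.0]
        return max(classes, key=lambda c: sum(w * v for w, v in zip(models[c], xa)))
    return predict

# Invented toy data: three well-separated 2-D clusters.
points = [(0, 0), (1, 0), (0, 1), (10, 0), (9, 0), (10, 1), (0, 10), (1, 10), (0, 9)]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
predict = one_from_the_rest(points, labels)
predictions = [predict(q) for q in [(9, 1), (1, 1), (1, 9)]]
```

Each fit needs only one small linear solve, which fits the speed claim in the abstract. The imbalance drawback also shows up in this sketch: with k classes, each fit sees roughly 1/k positive targets against (k-1)/k negative ones.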
Maximum margin training of generative kernels
, 2004
Cited by 10 (4 self)
Generative kernels, a generalised form of Fisher kernels, are a powerful form of kernel that allows the kernel parameters to be tuned to a specific task. The standard approach to training these kernels is to use maximum likelihood estimation. This paper describes a novel approach based on maximum-margin training of both the kernel parameters and a Support Vector Machine (SVM) classifier. It combines standard SVM training with a gradient-descent based kernel parameter optimisation scheme. This allows the kernel parameters to be explicitly trained for the data set and the SVM score-space. Initial results on an artificial task and the Deterding data show that such an approach can reduce classification error rates.

