Results 1-10 of 40
A tutorial on support vector regression
2004
Cited by 865 (3 self)
Abstract:
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from an SV perspective.
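The tube idea at the heart of SV regression is the ε-insensitive loss: deviations smaller than ε cost nothing, and larger deviations grow linearly. A minimal sketch (the function name and the default ε are illustrative choices of ours, not taken from the tutorial):

```python
def eps_insensitive(y_true, y_pred, eps=0.1):
    """Epsilon-insensitive loss used in SV regression:
    zero inside the tube |y - f(x)| <= eps, linear (|error| - eps) outside."""
    return max(0.0, abs(y_true - y_pred) - eps)
```

Only points falling outside the tube incur loss, which is why the fitted function depends on only a subset of the training data (the support vectors).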
An introduction to kernel-based learning algorithms
IEEE Transactions on Neural Networks, 2001
Cited by 598 (55 self)
Abstract:
This paper provides an introduction to support vector machines (SVMs), kernel Fisher discriminant analysis, and ...
Ridge Regression Learning Algorithm in Dual Variables
In Proceedings of the 15th International Conference on Machine Learning, 1998
Cited by 164 (8 self)
Abstract:
In this paper we study a dual version of the Ridge Regression procedure. It allows us to perform nonlinear regression by constructing a linear regression function in a high-dimensional feature space. The feature space representation can result in a large increase in the number of parameters used by the algorithm. In order to combat this "curse of dimensionality", the algorithm allows the use of kernel functions, as used in Support Vector methods. We also discuss a powerful family of kernel functions which is constructed using the ANOVA decomposition method from the kernel corresponding to splines with an infinite number of nodes. This paper introduces a regression estimation algorithm which is a combination of these two elements: the dual version of Ridge Regression is applied to the ANOVA enhancement of the infinite-node splines. Experimental results are then presented (based on the Boston Housing data set) which indicate the performance of this algorithm relative to other algorithms.
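The dual construction can be sketched in a few lines: solve (K + λI)α = y for the dual variables α, then predict with the kernel expansion f(x) = Σᵢ αᵢ k(xᵢ, x). This is a generic illustration with an RBF kernel on 1-D data, not the ANOVA-spline kernel of the paper; all names and parameter values are our own:

```python
import math

def rbf(x, z, gamma=1.0):
    """Gaussian (RBF) kernel on scalars."""
    return math.exp(-gamma * (x - z) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_dual_ridge(xs, ys, lam=0.1, gamma=1.0):
    """Dual ridge regression: alpha = (K + lam*I)^{-1} y."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, ys)
    # Prediction is a kernel expansion over the training points.
    return lambda x: sum(a * rbf(xi, x, gamma) for a, xi in zip(alpha, xs))

xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.sin(x) for x in xs]
f = fit_dual_ridge(xs, ys, lam=1e-3)
```

Note that the number of dual variables equals the number of training points, not the (possibly huge) dimension of the feature space, which is exactly the point of the dual formulation.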
A Review of Kernel Methods in Machine Learning
2006
Cited by 95 (4 self)
Abstract:
We review recent methods for learning with positive definite kernels. All these methods formulate learning and estimation problems as linear tasks in a reproducing kernel Hilbert space (RKHS) associated with a kernel. We cover a wide range of methods, ranging from simple classifiers to sophisticated methods for estimation with structured data.
Shrinking the Tube: A New Support Vector Regression Algorithm
1999
Cited by 57 (5 self)
Abstract:
A new algorithm for Support Vector regression is described. For a priori chosen ν, it automatically adjusts a flexible tube of minimal radius to the data such that at most a fraction ν of the data points lie outside. Moreover, it is shown how to use parametric tube shapes with non-constant radius. The algorithm is analysed theoretically and experimentally.
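The defining property of the ν parameter can be illustrated without the optimization machinery: the minimal tube radius leaving at most a fraction ν of the points outside is a quantile of the absolute residuals. The sketch below computes that quantile directly (our own illustration of the property, not the paper's training algorithm):

```python
def minimal_tube_radius(residuals, nu):
    """Smallest radius eps such that at most a fraction nu of the
    residuals lie strictly outside the tube |r| > eps (0 <= nu < 1)."""
    abs_r = sorted(abs(r) for r in residuals)
    n = len(abs_r)
    max_outside = int(nu * n)          # at most this many points may be outside
    # The (n - max_outside)-th largest |r| must sit on or inside the tube.
    return abs_r[n - max_outside - 1]
```

For example, with residuals [0.1, 0.2, 0.5, 1.0, 2.0] and ν = 0.4, the tube shrinks to radius 0.5, leaving exactly two of the five points outside.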
Structural modelling with sparse kernels
In Proceedings of the 16th ACM-SIAM Symposium on Discrete Algorithms, 2002
Cited by 49 (0 self)
Abstract:
A widely acknowledged drawback of many statistical modelling techniques, commonly used in machine learning, is that the resulting model is extremely difficult to interpret. A number of new concepts and algorithms have been introduced by researchers to address this problem. They focus primarily on determining which inputs are relevant in predicting the output. This work describes a transparent, advanced nonlinear modelling approach that enables the constructed predictive models to be visualised, allowing model validation and assisting in interpretation. The technique combines the representational advantage of a sparse ANOVA decomposition with the good generalisation ability of a kernel machine. It achieves this by employing two forms of regularisation: a 1-norm based structural regulariser to enforce transparency, and a 2-norm based regulariser to control smoothness. The resulting model structure can be visualised, showing the overall effects of different inputs, their interactions, and the strength of the interactions. The robustness of the technique is illustrated using a range of both artificial and “real world” datasets. The performance is compared to other modelling techniques, and it is shown to exhibit competitive generalisation performance together with improved interpretability.
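An order-d ANOVA kernel sums, over every size-d subset of input dimensions, the product of univariate base kernels on those dimensions. One standard way to evaluate it efficiently is the Newton-Girard recursion over power sums of the base-kernel values. A sketch under our own naming, using a simple product base kernel rather than the spline kernels discussed above:

```python
def anova_kernel(x, z, d, base=lambda a, b: a * b):
    """Order-d ANOVA kernel: sum over all size-d subsets S of the
    product over i in S of base(x_i, z_i), via Newton-Girard recursion."""
    k = [base(a, b) for a, b in zip(x, z)]
    s = [sum(ki ** p for ki in k) for p in range(d + 1)]  # s[p] = power sum
    e = [1.0] + [0.0] * d                                 # e[m] = order-m value
    for m in range(1, d + 1):
        e[m] = sum((-1) ** (p - 1) * e[m - p] * s[p]
                   for p in range(1, m + 1)) / m
    return e[d]
```

With the product base kernel, the order-m value reduces to the m-th elementary symmetric polynomial of the per-dimension products, which gives an easy correctness check; the recursion costs O(n·d) instead of enumerating all subsets.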
A pattern search method for model selection of support vector regression
In Proceedings of the SIAM International Conference on Data Mining, 2002
Cited by 44 (4 self)
Abstract:
We develop a fully automated pattern search methodology for model selection of support vector machines (SVMs) for regression and classification. Pattern search (PS) is a derivative-free optimization method suitable for low-dimensional optimization problems for which it is difficult or impossible to calculate derivatives. This methodology was motivated by an application in drug design in which regression models are constructed based on a few high-dimensional exemplars. Automatic model selection in such underdetermined problems is essential to avoid overfitting and overestimates of generalization capability caused by selecting parameters based on testing results. We focus on SVM model selection for regression based on leave-one-out (LOO) and cross-validated estimates of mean squared error, but the search strategy is applicable to any model criterion. Because the resulting error surface produces an extremely noisy map of the model quality with many local minima, the generalization capacity of any single locally optimal model exhibits high variance. Thus several locally optimal SVM models are generated and then bagged or averaged to produce the final SVM. This strategy of pattern search combined with model averaging has proven to be very effective on benchmark tests and in high-variance drug design domains with high potential of overfitting.
This work is supported by NSF Grants IIS9979860 and IRI97092306.
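Pattern (compass) search itself is easy to sketch: probe ± step along each coordinate, accept any improving point, and halve the step when no probe improves. Below is a generic illustration on a toy quadratic stand-in for a model-selection surface over, say, (log C, log γ); the names and the objective are ours, not the paper's LOO criterion:

```python
def pattern_search(objective, x0, step=1.0, tol=1e-3, max_iter=200):
    """Derivative-free compass search: probe +/- step along each
    coordinate, move to any improving point, otherwise halve the step."""
    x = list(x0)
    best = objective(x)
    it = 0
    while step > tol and it < max_iter:
        it += 1
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                trial = x[:]
                trial[i] += d
                val = objective(trial)
                if val < best:          # greedy: accept any improvement
                    x, best, improved = trial, val, True
        if not improved:
            step *= 0.5                 # refine the mesh and retry
    return x, best

# Toy "model-selection" objective with optimum at (1, -2).
obj = lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2
xopt, fopt = pattern_search(obj, [0.0, 0.0])
```

Because only function values are compared, the method tolerates the noisy, non-smooth error surfaces that LOO and cross-validation estimates typically produce.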
Multifactor Gaussian Process Models for StyleContent Separation
Cited by 35 (5 self)
Abstract:
We introduce models for density estimation with multiple, hidden, continuous factors. In particular, we propose a generalization of multilinear models using nonlinear basis functions. By marginalizing over the weights, we obtain a multifactor form of the Gaussian process latent variable model. In this model, each factor is kernelized independently, allowing nonlinear mappings from any particular factor to the data. We learn models for human locomotion data, in which each pose is generated by factors representing the person’s identity, gait, and the current state of motion. We demonstrate our approach using time-series prediction, and by synthesizing novel animation from the model.
Machine learning techniques for brain-computer interfaces
Biomedical Engineering, 2004
Cited by 26 (3 self)
Abstract:
This review discusses machine learning methods and their application to Brain-Computer Interfacing. A particular focus is placed on feature selection. We also point out common flaws when validating machine learning methods in the context of BCI. Finally, we provide a brief overview of the Berlin Brain-Computer Interface (BBCI).