M. Stitson, A. Gammerman, V.N. Vapnik, V. Vovk, C. Watkins, and J. Weston, "Support vector regression with ANOVA decomposition kernels," in Advances in Kernel Methods --- Support Vector Learning, B. Schölkopf, C.J.C. Burges, and A.J. Smola, Eds., pp. 285--291. MIT Press, Cambridge, MA, 1999.
.... a number of competing methods and their generalisation performance was found to either match or be significantly better than these methods [Müller et al. 1997, Drucker et al. 1997, Mattera and Haykin, 1999]. The use of SVMs for density estimation [Weston et al. 1997] and ANOVA decomposition [Stitson et al. 1999] has also been studied. Regarding extensions, the basic SVMs contain no prior knowledge of the problem and much work has been done on incorporating prior knowledge into SVMs [Schölkopf et al. 1996, Schölkopf et al. 1998a, Burges, 1998a]. Although SVMs have good generalisation performance, they ....
Stitson, M. O., Gammerman, A., Vapnik, V., Vovk, V., Watkins, C., and Weston, J. (1999). Support Vector Regression with ANOVA Decomposition Kernels. In Schölkopf, B., Burges, C. J. C., and Smola, A. J., editors, Advances in Kernel Methods - Support Vector Learning, pages 285-292. MIT Press, Cambridge, MA.
....We randomly choose 50 samples for training and 100 samples as unlabeled data. The 356 remaining test samples are used for measuring the generalization error. The regression function is described as $f(x) = \sum_{i=1}^{50} \alpha_i k(x, x_i)$ (45), where $k$ is the third order ANOVA decomposition kernel [35, 1]: $k(x_i, x_j) = \sum_{1 \le k_1 < k_2 < k_3 \le 13} \kappa(x_{ik_1}, x_{jk_1}) \, \kappa(x_{ik_2}, x_{jk_2}) \, \kappa(x_{ik_3}, x_{jk_3})$ (46), constructed from a linear spline kernel $\kappa$ [36]: $\kappa(x_i, x_j) = 1 + x_i x_j + x_i x_j \min(x_i, x_j) - \frac{x_i + x_j}{2} (\min(x_i, x_j))^2 + \frac{(\min(x_i, x_j))^3}{3}$ (47). Here, all of ....
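The three formulas quoted in this context (45 to 47) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `linear_spline`, `anova3` and `predict` are hypothetical, the coefficients are taken as given rather than learned, and the triple sum is evaluated by brute force over all index triples instead of by any recursion the cited authors may have used.

```python
from itertools import combinations


def linear_spline(u, v):
    """One-dimensional linear spline kernel (Eq. 47) for scalars u, v >= 0."""
    m = min(u, v)
    return 1.0 + u * v + u * v * m - (u + v) / 2.0 * m ** 2 + m ** 3 / 3.0


def anova3(x, y):
    """Third-order ANOVA kernel (Eq. 46): sum over index triples k1 < k2 < k3."""
    # Per-coordinate base kernel values, one per input dimension.
    base = [linear_spline(u, v) for u, v in zip(x, y)]
    return sum(
        base[a] * base[b] * base[c]
        for a, b, c in combinations(range(len(base)), 3)
    )


def predict(x, train_x, coef):
    """Regression function (Eq. 45): weighted sum of kernel values.

    `coef` holds one (assumed, already fitted) coefficient per training point.
    """
    return sum(c * anova3(x, xi) for c, xi in zip(coef, train_x))
```

With 13 input dimensions, as in the quoted passage, the brute-force sum visits 286 index triples per kernel evaluation, which is cheap enough for 50 training samples.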
M. Stitson, A. Gammerman, V.N. Vapnik, V. Vovk, C. Watkins, and J. Weston, "Support vector regression with ANOVA decomposition kernels," in Advances in Kernel Methods --- Support Vector Learning, B. Schölkopf, C.J.C. Burges, and A.J. Smola, Eds., pp. 285--291. MIT Press, Cambridge, MA, 1999.
....error bounds) is customized; in the first case, by changing the loss function, in the second case, by changing the class of functions that the estimate is taken from. Empirical studies using ε-SVR have reported excellent performance on the widely used Boston housing regression benchmark set (Stitson et al. 1999). Due to Proposition 2, the only difference between ν-SVR and standard ε-SVR lies in the fact that different parameters, ν vs. ε, have to be specified a priori. Consequently, we are in this experiment only interested in these parameters and simply adjusted C and the width 2σ² in k(x, y) ....
....performances which are close to the best performances that can be achieved by selecting ε a priori by looking at the test set. Finally, note that although we did not use validation techniques to select the optimal values for C and 2σ², we obtained performances which are state of the art (Stitson et al. (1999) report an MSE of 7.6 for SVR using ANOVA kernels, and 11.7 for Bagging trees). Table 1 moreover shows that ν can be used to control the fraction of SVs and errors. Discussion. The theoretical and experimental analysis suggest that ν provides a way to control an upper bound on the number of training ....
M. Stitson, A. Gammerman, V. Vapnik, V. Vovk, C. Watkins, and J. Weston. Support vector regression with ANOVA decomposition kernels. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods --- Support Vector Learning, pages 285--291. MIT Press, Cambridge, MA, 1999.
....are consistent with the asymptotics predicted theoretically: for ℓ = 200, we got 0.24 and 0.19 for the fraction of SVs and errors, respectively. Boston housing benchmark. Empirical studies using ε-SVR have reported excellent performance on the widely used Boston housing regression benchmark set (Stitson et al. 1999). Due to Proposition 2, the only difference between ν-SVR and standard ε-SVR lies in the fact that different parameters, ν vs. ε, have to be specified a priori. Accordingly, the goal of the following experiment was not to show that ν-SVR is better than ε-SVR, but that ν is a useful parameter to ....
.... C = 10 · 50 (i.e. the original value of 10 was corrected since in the present case, the maximal y value is 50 rather than 1). We performed 100 runs, where each time the overall set of 506 examples was randomly split into a training set of ℓ = 481 examples and a test set of 25 examples (cf. Stitson et al. 1999). Table 3 shows that over a wide range of ν (note that only 0 ≤ ν ≤ 1 makes sense) we obtained performances which are close to the best performances that can be achieved by selecting ε a priori by looking at the test set. Finally, note that although we did not use validation techniques to select the ....
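The resampling protocol quoted above (100 runs, each splitting 506 examples into 481 for training and 25 for testing) might look roughly like the following sketch. The function name `random_splits` and the fixed seed are assumptions made for reproducibility of the illustration; the quoted text does not say how the random splits were generated.

```python
import random


def random_splits(n_examples=506, n_train=481, n_runs=100, seed=0):
    """Yield (train_indices, test_indices) pairs, one pair per run.

    Each run reshuffles all example indices and cuts them at n_train,
    so the two parts are disjoint and together cover every example.
    """
    rng = random.Random(seed)  # seed is a hypothetical choice
    indices = list(range(n_examples))
    for _ in range(n_runs):
        rng.shuffle(indices)
        # Slicing copies, so later shuffles do not alter earlier splits.
        yield indices[:n_train], indices[n_train:]
```

A per-run test error would then be averaged over the 100 pairs produced by `random_splits()`.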
[Article contains additional citation context not shown here]
M. Stitson, A. Gammerman, V. Vapnik, V. Vovk, C. Watkins, and J. Weston. Support vector regression with ANOVA decomposition kernels. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods --- Support Vector Learning, pages 285--291. MIT Press, Cambridge, MA, 1999. To appear.
.... 1969] .... et al. 1996, Schölkopf et al. 1998]. A comprehensive tutorial on SV classifiers has been published by Burges [1998]. But also in regression and time series prediction applications, excellent performances were soon obtained [Müller et al. 1997, Drucker et al. 1997, Stitson et al. 1999, Mattera and Haykin, 1999]. A snapshot of the state of the art in SV learning was recently taken at the annual Neural Information Processing Systems conference [Schölkopf et al. 1999]. SV learning has now evolved into an active area of research. Moreover, it is in the process of entering the ....
....not unlike [Chen et al. 1995] for this purpose. Finally, the focus of this review was on methods and theory rather than on applications. This was done to limit the size of the exposition. State of the art, or even record performance was reported in [Müller et al. 1997, Drucker et al. 1997, Stitson et al. 1999, Mattera and Haykin, 1999]. In many cases, it may be possible to achieve similar performance with neural network methods, however, only if many parameters are optimally tuned, thus depending largely on the skill of the experimenter. In other words one should not consider SV machines as a silver ....
M. Stitson, A. Gammerman, V. Vapnik, V. Vovk, C. Watkins, and J. Weston. Support vector regression with ANOVA decomposition kernels. In B. Schölkopf, C.J.C. Burges, and A.J. Smola, editors, Advances in Kernel Methods --- Support Vector Learning, pages 285--292, Cambridge, MA, 1999.
.... $\sum_{s=1} \ldots \, K_{p-s}(x, y) \, K^{s}(x, y)$. For the purposes of this paper, when using kernels produced by ANOVA decomposition, only the order p is considered: $K(x, y) = K_p(x, y)$. An alternative method of using ANOVA decomposition would be to consider order p and all lower orders (as in Stitson [7]), i.e. $K(x, y) = \sum_{i=1}^{p} K_i(x, y)$. 6 EXPERIMENTAL RESULTS: Experiments were conducted on the Boston Housing data set. This is a well known data set for testing non linear regression methods; see, e.g. Breiman [1] and Saunders [6]. The data set consists of 506 cases in which 12 ....
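The two ways of using ANOVA decomposition mentioned in this context (order p only, versus order p plus all lower orders) can be illustrated as follows. The sketch assumes that the order-p ANOVA kernel is the p-th elementary symmetric polynomial of the per-coordinate base kernel values, and it computes all orders with Newton's identities; the opening formula of the quoted passage is truncated, so treating it as this recursion is an assumption. The function names are hypothetical, and `base` stands for the list of one-dimensional kernel values, whose exact form the passage does not give.

```python
def anova_orders(base, p):
    """Return [K_0, K_1, ..., K_p] for per-coordinate kernel values `base`.

    Uses Newton's identities:
        K_n = (1/n) * sum_{s=1..n} (-1)^(s+1) * K_{n-s} * P_s,
    where P_s is the sum of the s-th powers of the base values and K_0 = 1.
    """
    power_sums = [sum(b ** s for b in base) for s in range(p + 1)]
    orders = [1.0]
    for n in range(1, p + 1):
        total = sum(
            (-1) ** (s + 1) * orders[n - s] * power_sums[s]
            for s in range(1, n + 1)
        )
        orders.append(total / n)
    return orders


def anova_single(base, p):
    """Kernel using only order p: K = K_p."""
    return anova_orders(base, p)[p]


def anova_cumulative(base, p):
    """Kernel using order p and all lower orders: K = K_1 + ... + K_p."""
    return sum(anova_orders(base, p)[1:])
```

For base values 1, 2 and 3, the orders are 6, 11 and 6, so the single-order kernel of order 2 is 11 while the cumulative kernel up to order 2 is 17.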
....and the value of coefficient a) was selected which gave the smallest error on the validation set, and then the error on the test set was measured. This experiment was then repeated using a support vector machine (SVM) with the same kernels and exactly the same 100 training files (see Stitson [7] for full details). As an illustration of the number of parameters which were considered by the Ridge Regression Algorithm (and the SVM), consider the polynomial kernel which was outlined earlier, using a degree of 5. This maps the input vectors into a high dimensional feature space which is ....
[Article contains additional citation context not shown here]
M. O. Stitson, A. Gammerman, V. N. Vapnik, V. Vovk, C. Watkins, and J. Weston. Support Vector regression with ANOVA decomposition kernels. Technical report, Royal Holloway, University of London, 1997.
No context found.
M. Stitson, A. Gammerman, V.N. Vapnik, V. Vovk, C. Watkins, and J. Weston. Support vector regression with ANOVA decomposition kernels. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods -- Support Vector Learning. MIT Press, Cambridge, 1999.