  where we use E_{X,Y} to denote the expectation with respect to the true underlying distribution

by Tong Zhang
http://mlg.anu.edu.au/~raetsch/ps/consistency.ps

Abstract:

We study how closely the optimal Bayes error rate can be approximately reached using a classification algorithm that computes a classifier by minimizing a convex upper bound of the classification error function. The measurement of closeness is characterized by the loss function used in the estimation. We show that such a classification scheme can be generally regarded as a (non-maximum-likelihood) conditional in-class probability estimate, and we use this analysis to compare various convex loss functions that have appeared in the literature. Furthermore, the theoretical insight allows us to design good loss functions with desirable properties. Another aspect of our analysis is to demonstrate the consistency of certain classification methods using convex risk minimization. This study sheds light on the good performance of some recently proposed linear classification methods, including boosting and support vector machines. It also shows their limitations and suggests possible improvements.
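The two ideas in the abstract, a convex upper bound on the classification error and its reading as a conditional in-class probability estimate, can be illustrated with a small numerical sketch. The code below is not taken from the paper. The function names, the grid search, and the base-2 scaling of the logistic loss (chosen here so that it upper-bounds the 0-1 error) are all assumptions made for illustration. It writes the margin as v = f(x)y with y in {-1, +1}, and eta for the conditional probability P(y = 1 | x).

```python
import math

def zero_one(v):
    """Classification error as a function of the margin v = f(x) * y."""
    return 1.0 if v <= 0 else 0.0

# Convex surrogate losses phi(v); names and scaling are illustrative choices.
LOSSES = {
    "exponential": lambda v: math.exp(-v),
    "logistic": lambda v: math.log2(1.0 + math.exp(-v)),  # base 2: equals 1 at v = 0
    "hinge": lambda v: max(0.0, 1.0 - v),
    "least_squares": lambda v: (1.0 - v) ** 2,
}

def conditional_risk(phi, eta, f):
    """Expected surrogate loss at a point x where P(y = 1 | x) = eta."""
    return eta * phi(f) + (1.0 - eta) * phi(-f)

def minimize_on_grid(phi, eta, lo=-5.0, hi=5.0, steps=10001):
    """Crude grid search for the prediction f minimizing the conditional risk."""
    best_f, best_q = lo, conditional_risk(phi, eta, lo)
    for i in range(1, steps):
        f = lo + (hi - lo) * i / (steps - 1)
        q = conditional_risk(phi, eta, f)
        if q < best_q:
            best_f, best_q = f, q
    return best_f

def eta_from_exponential(f):
    """Invert f = 0.5 * ln(eta / (1 - eta)), the exponential-loss minimizer."""
    return 1.0 / (1.0 + math.exp(-2.0 * f))

def eta_from_least_squares(f):
    """Invert f = 2 * eta - 1, the least-squares minimizer, clipped to [0, 1]."""
    return min(1.0, max(0.0, (f + 1.0) / 2.0))

if __name__ == "__main__":
    eta = 0.8
    for name, phi in LOSSES.items():
        f_star = minimize_on_grid(phi, eta)
        print(f"{name}: minimizer of conditional risk at eta={eta} is f={f_star:.3f}")
```

Under these assumptions, the exponential and least-squares minimizers can be mapped back to eta, which is the sense in which the fitted f acts as a probability estimate. The hinge-loss minimizer only recovers the sign of 2*eta - 1, that is, the Bayes decision rather than the probability itself.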
