  Robustness of regularized linear classification methods in text categorization (2003) [11 citations — 3 self]

by Jian Zhang, Yiming Yang
In Proceedings of SIGIR 2003: The Twenty-Sixth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
http://www.cs.cmu.edu/~yiming/papers.yy/jzhang-sigir03.ps.gz

Abstract:

Real-world applications often require classifying documents when only a small number of features is available, some documents are mislabeled, and positive examples are rare. This paper investigates the robustness of three regularized linear classification methods (SVM, ridge regression and logistic regression) under these conditions. We compare these methods in terms of their loss functions and score distributions, and establish the connection between their optimization problems and generalization error bounds. Several sets of controlled experiments on the Reuters-21578 corpus are conducted to investigate the robustness of these methods. Our results show that ridge regression seems to be the most promising candidate for rare class problems.
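
As an illustration of the loss-function comparison mentioned in the abstract, the sketch below evaluates the standard textbook losses usually associated with the three methods, written as functions of the margin m = y * f(x) with labels y in {-1, +1}: hinge loss for SVM, squared loss for ridge regression, and logistic loss for logistic regression. The function names are hypothetical and chosen for this example; the abstract does not give the paper's exact formulation, so this is a minimal sketch under those assumptions, not the authors' code.

```python
import math

# Standard textbook losses as functions of the margin m = y * f(x), y in {-1, +1}.
# Function names are illustrative assumptions, not taken from the paper.

def hinge_loss(margin):
    """SVM hinge loss: zero once the margin reaches 1."""
    return max(0.0, 1.0 - margin)

def squared_loss(margin):
    """Ridge regression squared loss: (y - f(x))^2 equals (1 - m)^2 when y is +1 or -1."""
    return (1.0 - margin) ** 2

def logistic_loss(margin):
    """Logistic loss log(1 + exp(-m)), computed in a numerically stable way."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def compare_losses(margins):
    """Return (margin, hinge, squared, logistic) tuples for each margin."""
    return [(m, hinge_loss(m), squared_loss(m), logistic_loss(m)) for m in margins]

if __name__ == "__main__":
    for m, h, s, l in compare_losses([-2.0, 0.0, 1.0, 3.0]):
        print(f"margin={m:+.1f} hinge={h:.3f} squared={s:.3f} logistic={l:.3f}")
```

One visible difference in this sketch is that the squared loss is nonzero for margins larger than 1, while the hinge loss is exactly zero there and the logistic loss decays toward zero.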

Citations

4514 Statistical Learning Theory – Vapnik - 1998
961 Text Categorization with Support Vector Machines – Joachims - 1997
512 A comparative study on feature selection in text categorization – Yang, Pedersen - 1997
425 Solution of Ill-posed Problems – Tikhonov, Arsenin - 1977
416 A re-examination of text categorization methods – Yang, Liu - 1999
319 Inductive Learning Algorithms and Representations for Text Categorization – Dumais, Platt, et al. - 1998
211 A comparison of two learning algorithms for text categorization – Lewis, Ringuette - 1994
200 The Elements of Statistical Learning (Data Mining, Inference and Prediction) – Hastie, Tibshirani, et al. - 2001
77 Solution of incorrectly formulated problems and the regularization method – Tikhonov - 1963
76 An example-based mapping method for text categorization and retrieval – Yang, Chute - 1994
6 A comparison of classifiers and document representations for the routing problem – Schutze, Hull, et al. - 1995
1 Optimization by Vector Space Methods – Luenberger - 1989