by Jian Zhang, Yiming Yang
In Proceedings of SIGIR 2003: The Twenty-Sixth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
http://www.cs.cmu.edu/~yiming/papers.yy/jzhang-sigir03.ps.gz
Abstract:
Real-world applications often require the classification of documents under conditions of a small number of features, mislabeled documents and rare positive examples. This paper investigates the robustness of three regularized linear classification methods (SVM, ridge regression and logistic regression) under these conditions. We compare these methods in terms of their loss functions and score distributions, and establish the connection between their optimization problems and generalization error bounds. Several sets of controlled experiments on the Reuters-21578 corpus are conducted to investigate the robustness of these methods. Our results show that ridge regression seems to be the most promising candidate for rare class problems.