Evaluating Cost-Sensitive Unsolicited Bulk Email Categorization (2002)
Citations: 14 (0 self)
BibTeX
@MISC{Hidalgo02evaluatingcost-sensitive,
author = {Jose María Gómez Hidalgo},
title = {Evaluating Cost-Sensitive Unsolicited Bulk Email Categorization},
year = {2002}
}
Abstract
In this paper, we discuss cost-sensitive Text Categorization methods for Unsolicited Bulk Email (UBE) filtering. Specifically, we have evaluated a range of Machine Learning methods for the task (C4.5, Naive Bayes, PART, Support Vector Machines and Rocchio), made cost-sensitive through several methods (Threshold optimization, Weighting, and MetaCost). For the evaluation, we have used the Receiver Operating Characteristic Convex Hull method, which best suits classification problems in which the target conditions are not known, as is the case here. Our results show neither a dominant algorithm nor a dominant method for making algorithms cost-sensitive, but they are the best reported on the test collection used, and they approach the accuracy of real-world manual classifiers.
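
The Receiver Operating Characteristic Convex Hull idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`roc_convex_hull`, `best_operating_point`), the parameters (`p_spam`, `cost_fp`, `cost_fn`) and the example ROC points are hypothetical and chosen only for illustration. The sketch assumes spam is the positive class, computes the upper convex hull of a set of (false positive rate, true positive rate) points, and then picks the hull vertex with the lowest expected cost once a class distribution and misclassification costs are fixed.

```python
from typing import List, Tuple

# A ROC point: (false positive rate, true positive rate).
Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b (positive means a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Return the upper convex hull of ROC points, from (0, 0) to (1, 1).

    The trivial classifiers (0, 0) and (1, 1) are always included.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last vertex while it lies on or below the line
        # joining the vertex before it to the new point p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def best_operating_point(hull: List[Point], p_spam: float,
                         cost_fp: float, cost_fn: float) -> Point:
    """Pick the hull vertex that minimizes expected misclassification cost.

    Assumes spam is the positive class: a false positive is a legitimate
    message classified as spam, a false negative is a missed spam message.
    """
    def expected_cost(pt: Point) -> float:
        fpr, tpr = pt
        return p_spam * (1.0 - tpr) * cost_fn + (1.0 - p_spam) * fpr * cost_fp

    return min(hull, key=expected_cost)


if __name__ == "__main__":
    # Hypothetical ROC points for illustration; not results from the paper.
    example = [(0.1, 0.6), (0.3, 0.5), (0.4, 0.9)]
    hull = roc_convex_hull(example)
    print(hull)
    print(best_operating_point(hull, p_spam=0.5, cost_fp=2.0, cost_fn=1.0))
```

Points under the hull, such as (0.3, 0.5) in the example, are never optimal for any combination of class distribution and costs, which is why the hull is convenient when the target conditions are not known in advance.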







