eMailSift: Adapting Graph Mining Techniques for Email Classification (2004)
BibTeX
@MISC{Aery04emailsift:adapting,
author = {Manu Aery and Sharma Chakravarthy},
title = {eMailSift: Adapting Graph Mining Techniques for Email Classification},
year = {2004}
}
Abstract
Text classification is the problem of assigning pre-defined class labels to incoming, unclassified documents. The class labels are defined from a set of pre-classified example documents, which serves as a training corpus. A number of approaches have been proposed for text classification, including Support Vector Machines, decision trees, k-nearest-neighbor classification, Linear Least Squares Fit, and Bayesian classification, among others. The need to handle and classify large amounts of personal email has prompted the use of text classification approaches for email classification. Email classification is trickier than text classification because it is based on personal preferences; consequently, it uses disparate criteria that are difficult to quantify. Also, documents are richer in content than emails, whose content can vary dramatically from folder to folder; hence conventional approaches may not be well suited. In addition, as opposed to the static corpus typically used for training in text classification, the mail environment is constantly changing, which calls for adaptive and incremental re-training.







