
Classifying unseen cases with many missing values (1999)

by Zijian Zheng, Boon Toh Low
In Proceedings of the Third Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD99), volume 1574 of LNAI
http://www3.cm.deakin.edu.au/~zijian/Papers/pakdd99-zl-missing.ps.gz

Abstract:

Handling missing attribute values is an important issue for classifier learning, since missing attribute values in either training data or test (unseen) data affect the prediction accuracy of learned classifiers. In many real KDD applications, attributes with missing values are very common. This paper studies the robustness of four recently developed committee learning techniques (Boosting, Bagging, Sasc, and SascMB) relative to C4.5 in tolerating missing values in test data. In terms of average error over a representative collection of natural domains, Boosting is found to be about as robust as C4.5 to missing values in test data. Bagging performs slightly better than Boosting, while Sasc and SascMB perform better than both, with SascMB performing best.
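The kind of experiment the abstract describes (removing attribute values from test cases and comparing a single classifier with a committee) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the synthetic data, the `MISSING` marker, the one-attribute "stump" classifiers, and the abstaining majority vote are hypothetical stand-ins, not C4.5, Boosting, Bagging, Sasc, or SascMB, and not the paper's experimental protocol.

```python
import random
from collections import Counter

MISSING = None  # assumed marker for an unknown attribute value


def inject_missing(instances, rate, rng):
    """Copy the instances, replacing each value by MISSING with probability `rate`."""
    return [[MISSING if rng.random() < rate else v for v in row] for row in instances]


def stump_predict(row, attr, default):
    """Toy single classifier: use one attribute as the class, or `default` if it is missing."""
    value = row[attr]
    return default if value is MISSING else value


def committee_predict(row, attrs, default):
    """Majority vote over one stump per attribute; members whose attribute is missing abstain."""
    votes = [row[a] for a in attrs if row[a] is not MISSING]
    if not votes:
        return default
    return Counter(votes).most_common(1)[0][0]


def error_rate(predictions, labels):
    """Fraction of predictions that differ from the true labels."""
    return sum(p != y for p, y in zip(predictions, labels)) / len(labels)


def run_demo(rate, seed=0, n=200, n_attrs=5):
    """Return (single-stump error, committee error) on synthetic test data."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n)]
    # Synthetic assumption: every attribute is a perfect copy of the class label.
    clean = [[y] * n_attrs for y in labels]
    test = inject_missing(clean, rate, rng)
    default = 0  # fallback class when nothing is known
    single = [stump_predict(row, 0, default) for row in test]
    committee = [committee_predict(row, range(n_attrs), default) for row in test]
    return error_rate(single, labels), error_rate(committee, labels)


if __name__ == "__main__":
    for rate in (0.0, 0.2, 0.5, 0.8):
        single_err, committee_err = run_demo(rate)
        print(f"missing rate {rate:.1f}: single={single_err:.3f} committee={committee_err:.3f}")
```

In this toy setting the committee can only fail when every attribute of a test case is missing, so its error never exceeds that of the single stump. Real committees over decision trees behave less cleanly, which is why the paper measures the effect empirically.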
