Cross-Validated C4.5: Using Error Estimation for Automatic Parameter Selection (1994)

by George H. John
ftp://elib.stanford.edu/pub/reports/cs/tn/94/12/CS-TN-94-12.ps

Abstract:

Machine learning algorithms for supervised learning are in wide use. An important issue in using these algorithms is how to set their parameters. While the default parameter values may be appropriate for a wide variety of tasks, they are not necessarily optimal for a given task. In this paper, we investigate the use of cross-validation to select parameters for the C4.5 decision tree learning algorithm. Experimental results on five datasets show that when cross-validation is used to select an important parameter for C4.5, the accuracy of the induced trees on independent test sets is generally higher than the accuracy obtained with the default parameter value.
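
The general idea of selecting a parameter by cross-validation can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not state which C4.5 parameter was tuned, how many folds were used, or how candidate values were chosen. The function names (`k_fold_indices`, `cv_accuracy`, `select_param`), the 10-fold setting, and the toy 1-D nearest-neighbour learner standing in for C4.5 are all assumptions made for illustration.

```python
import random
from collections import Counter

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_accuracy(train_fn, param, X, y, k=10, seed=0):
    """Estimate the accuracy of train_fn(param, ...) by k-fold cross-validation."""
    correct = 0
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        predict = train_fn(param, train_X, train_y)
        # Score only on the fold that was withheld from training.
        correct += sum(predict(X[i]) == y[i] for i in held_out)
    return correct / len(X)

def select_param(train_fn, candidates, X, y, k=10, seed=0):
    """Return the candidate with the highest cross-validated accuracy, plus all scores."""
    scores = {p: cv_accuracy(train_fn, p, X, y, k, seed) for p in candidates}
    best = max(candidates, key=lambda p: scores[p])  # ties go to the earliest candidate
    return best, scores

def knn_train(n_neighbors, train_X, train_y):
    """Hypothetical stand-in learner (NOT C4.5): 1-D nearest-neighbour majority vote."""
    def predict(x):
        order = sorted(range(len(train_X)), key=lambda i: abs(train_X[i] - x))
        votes = Counter(train_y[i] for i in order[:n_neighbors])
        return votes.most_common(1)[0][0]
    return predict

# Toy data: one numeric feature, class changes at x = 20.
X = [float(i) for i in range(40)]
y = [0 if x < 20 else 1 for x in X]
best, scores = select_param(knn_train, [1, 3, 39], X, y, k=10)
```

Only the training data is used to choose `best`. A fair comparison against a default parameter value, as in the abstract, would then train once with each value on the full training set and measure accuracy on a separate test set that played no part in the selection.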
