Cross-validated C4.5: Using error estimation for automatic parameter selection (1994) [5 citations, 1 self]
Abstract:
Machine learning algorithms for supervised learning are in wide use. An important issue in using these algorithms is how to set their parameters. While the default parameter values may be appropriate for a wide variety of tasks, they are not necessarily optimal for a given task. In this paper, we investigate the use of cross-validation to select parameters for the C4.5 decision tree learning algorithm. Experimental results on five datasets show that when cross-validation is used to select an important parameter for C4.5, the accuracy of the induced trees on independent test sets is generally higher than when using the default parameter value.
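The selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: C4.5 is not used here, and a toy one-dimensional nearest-neighbour learner (`fit_knn`, hypothetical) stands in for it so the example stays self-contained. The function names, the candidate values, the fold count, and the synthetic data are all assumptions made for illustration. The idea is only that each candidate parameter value is scored by k-fold cross-validated accuracy on the training data, and the best-scoring value is kept.

```python
import random
from collections import Counter

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_accuracy(fit, X, y, param, k=5, seed=0):
    """Mean held-out accuracy of fit(X_train, y_train, param) over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train], param)
        correct = sum(model(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)

def select_parameter(fit, X, y, candidates, k=5, seed=0):
    """Return the candidate with the highest cross-validated accuracy."""
    return max(candidates, key=lambda p: cv_accuracy(fit, X, y, p, k, seed))

def fit_knn(X_train, y_train, n_neighbors):
    """Toy stand-in learner (NOT C4.5): majority vote of nearest 1-D points."""
    def predict(x):
        order = sorted(range(len(X_train)), key=lambda i: abs(X_train[i] - x))
        votes = Counter(y_train[i] for i in order[:n_neighbors])
        return votes.most_common(1)[0][0]
    return predict

# Synthetic data (assumption): class 1 for x >= 20, with about 10% label noise.
noise = random.Random(1)
X = list(range(40))
y = [(1 if x >= 20 else 0) ^ (noise.random() < 0.1) for x in X]

candidates = (1, 3, 5)
best = select_parameter(fit_knn, X, y, candidates)
print("selected n_neighbors:", best)
```

Only the training data is used for the selection, so an independent test set remains available for the final accuracy estimate, in line with the evaluation the abstract reports.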
Citations
| 655 | UCI Repository of Machine Learning Databases [machine-readable data repository] – Murphy, Aha - 1992 |
| 61 | Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy, Statistical Science – Efron, Tibshirani - 1986 |
| 27 | Classification and Regression Trees, Chapman – Breiman, Friedman, et al. - 1993 |
| 19 | Estimating the accuracy of learned concepts – Bailey, Elkan - 1993 |
| 9 | THAID: a sequential analysis program for the analysis of nominal scale dependent variables – Morgan - 1973 |

