@MISC{Friedl_cross-validation, author = {Herwig Friedl and Erwin Stampfer}, title = {Cross-Validation}, year = {} }

Abstract

Introduction

Cross-validation is a resampling technique that is often used for the assessment of statistical models, as well as selection amongst competing model alternatives. Basically, it is a method to estimate the prediction error of statistical predictor functions. This technique can be very useful in data problems involving minimal distributional assumptions. It has found many applications ranging from linear regression, partial least squares, ridge regression, classification and discrimination, to smoothing and neural networks, in univariate as well as in multivariate settings. Cross-validation is rooted in the well-known phenomenon that estimating prediction error on the same data used for model building tends to give downward-biased estimates. The reason for this is that the parameter estimates are optimized to reflect the peculiarities of the data set. When new data arrive, the model usually performs worse than expected on the grounds of assessment measurements on the trainin
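
The downward bias described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the simulated data, the straight-line model, the choice of leave-one-out as the cross-validation scheme, and the helper names `fit_line`, `mse`, and `loo_cv_mse`.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def mse(a, b, xs, ys):
    """Mean squared prediction error of the line (a, b) on (xs, ys)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def loo_cv_mse(xs, ys):
    """Leave-one-out CV: refit without point i, then predict point i."""
    total = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total / len(xs)


# Hypothetical simulated data: a linear trend plus Gaussian noise.
rng = random.Random(0)
xs = [i / 2 for i in range(20)]
ys = [1.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

# Error measured on the same data that was used to fit the model.
a, b = fit_line(xs, ys)
train_mse = mse(a, b, xs, ys)

# Error measured on points that were held out of each fit.
cv_mse = loo_cv_mse(xs, ys)

print(f"in-sample MSE:     {train_mse:.3f}")
print(f"leave-one-out MSE: {cv_mse:.3f}")
```

For a least-squares fit such as this one, the leave-one-out estimate is never smaller than the in-sample error, because each held-out point is predicted by a model that was not tuned to it.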