@MISC{Hyndman14measuringforecast,
  author = {Rob J Hyndman},
  title  = {Measuring forecast accuracy},
  year   = {2014}
}

Everyone wants to know how accurate their forecasts are. Does your forecasting method give good forecasts? Are they better than those of competing methods? There are many ways of measuring the accuracy of forecasts, and the answers to these questions depend on what is being forecast, what accuracy measure is used, and what data set is used for computing the accuracy measure. In this chapter, I will summarize the most important and useful approaches.¹

1 Training and test sets

It is important to evaluate forecast accuracy using genuine forecasts. That is, it is invalid to look only at how well a model fits the historical data; the accuracy of forecasts can only be determined by considering how well a model performs on new data that were not used when estimating the model. When choosing models, it is common to use a portion of the available data for testing, and to use the rest of the data for estimating (or "training") the model. The testing data can then be used to measure how well the model is likely to forecast on new data.

[Figure 1: A time series is often divided into training data (used to estimate the model) and test data (used to evaluate the forecasts).]

The size of the test data set is typically about 20% of the total sample, although this value depends on how long the sample is and how far ahead you want to forecast. The size of the test set should ideally be at least as large as the maximum forecast horizon required.

¹ This chapter is based on Section 2.5 of Forecasting: principles and practice by Rob J Hyndman and George Athanasopoulos, available online at www.otexts.org/fpp/2/5, and used with permission.

The following points should be noted.

• A model which fits the data well does not necessarily forecast well.
• A perfect fit can always be obtained by using a model with enough parameters.
• Over-fitting a model to data is as bad as failing to identify the systematic pattern in the data.

Some references describe the test data as the "hold-out set" because these data are "held out" of the data used for fitting. Other references call the training data the "in-sample data" and the test data the "out-of-sample data".
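The split described above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the chapter: the function names, the toy series, and the choice of mean absolute error as the accuracy measure are my own. It holds out the last 20% of a series as test data (respecting time order, so the test set is always the most recent observations) and scores a naive last-value forecast against it:

```python
def train_test_split(series, test_frac=0.2):
    """Split a time series, keeping the LAST portion as test data.

    Unlike cross-sectional data, a time series must be split in time
    order: the model is trained on the early observations and tested
    on the most recent ones, mimicking genuine forecasting.
    """
    n_test = max(1, int(round(len(series) * test_frac)))
    return series[:-n_test], series[-n_test:]

def mae(actual, predicted):
    """Mean absolute error: one common forecast accuracy measure."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

series = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
train, test = train_test_split(series)      # 8 training points, 2 test points

# A naive forecast repeats the last observed training value; it is a
# useful baseline, since the training fit tells us nothing about this.
naive_forecast = [train[-1]] * len(test)
print(mae(test, naive_forecast))            # accuracy on genuinely new data
```

Note that the accuracy is computed only on the held-out test data; a model with enough parameters could fit the training portion perfectly and still do badly here.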

