In this post we will review the statistical background for time series analysis and forecasting.

Forecast Accuracy

The Mean Absolute Percentage Error (MAPE) measures the absolute difference between forecast and actual values and divides it by the actual observation value. The crucial point is that MAPE puts much more weight on extreme values and on positive errors, which makes MASE (Mean Absolute Scaled Error) a favoured alternative. But the big benefit of MAPE is the fact that it is scale-independent: that means we can use MAPE to compare a model across different datasets.
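For reference, MAPE is the mean of the absolute percentage errors over the n forecast points:

\[
\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
\]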
Quite often the Akaike Information Criterion (AIC) is also used. As an example, we start with a random time series. We divide the series into a training and a test set using the window() function. With the window() function we can easily extract a time frame; in this case we take the part of the data starting in 1818 and ending in 1988. It is a handy function for splitting time series. The training data is used to fit the model, and the test set is used to see how well the model performs.
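A minimal sketch of this workflow (the random series myts and the benchmark model meanf() are illustrative assumptions; mytstest matches the name used below):

```r
library(forecast)

# Illustrative random time series, one observation per year from 1818 onward
set.seed(1)
myts <- ts(rnorm(183), start = 1818)

# window() extracts a time frame: training set 1818-1988, test set 1989 onward
mytstrain <- window(myts, start = 1818, end = 1988)
mytstest  <- window(myts, start = 1989)

# Fit a simple benchmark model on the training data and forecast the test period
fit <- meanf(mytstrain, h = length(mytstest))

# accuracy() reports error measures for both the training and the test set
accuracy(fit, mytstest)
```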
In the results we can see the errors for both the training and the test set. The difference between the actual values from mytstest and the forecasted values from the model is also calculated.

The Importance of Residuals in Time Series Analysis
As we can see from the graph above, we have a peak at around zero and the tails on both ends look fairly symmetric, which is what we would expect if the residuals are normally distributed. With the acf() function (autocorrelation function) we can test for autocorrelation: if several of the vertical bars rise above the threshold level, we have autocorrelation.

Autocorrelation

The acf() plot is used to identify the moving average part of an ARIMA model, while pacf() identifies the values for the autoregressive part.
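A sketch of these residual checks, reusing the fit object from the example above:

```r
res <- residuals(fit)

# Histogram of the residuals: a peak around zero with roughly symmetric
# tails suggests approximately normally distributed residuals
hist(res, main = "Residuals")

# ACF and PACF plots; bars crossing the dashed 95% bounds indicate
# remaining autocorrelation in the residuals
acf(res)
pacf(res)
```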
In the example above, several bars range outside the 95% confidence intervals (we omit the first bar because it is the autocorrelation of the series against itself at lag 0). By contrast, pacf() starts at lag 1.

Determining the Forecasting Method

Univariate Seasonal Time Series

In the following example we will use the AirPassengers dataset to explain seasonal decomposition.
The graph shows a seasonal pattern along with a trend. It is also quite evident that the seasonality increases considerably towards the end of the plot.

Exponential Smoothing with ETS

We can use the additive decomposition method, which adds up the error, trend and seasonality components, or the multiplicative decomposition method, which multiplies these components. In general, if the seasonal component stays constant over several cycles, it is best to use the additive decomposition method.
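A sketch of fitting an ETS model; the built-in monthly temperature series nottem is an assumed stand-in for the temperature data discussed below:

```r
library(forecast)

# Fit an exponential smoothing state space model; ets() selects the
# error/trend/seasonality combination automatically
fit_ets <- ets(nottem)

# The first line of the summary names the model type, e.g. ETS(A,N,A)
summary(fit_ets)
```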
The first piece of information we get from the results is the model type itself: ETS(A,N,A), that is, additive error, no trend and additive seasonality, which is quite what we would expect from temperature-based data. Let's see how the model looks compared to the original dataset and the forecast.
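A sketch of such a comparison, reusing fit_ets from above (original data in black, fitted model in red); the forced ETS(M,N,M) model is an assumed stand-in for the multiplicative method discussed next:

```r
# Forecast two years ahead and plot it together with the original series
plot(forecast(fit_ets, h = 24))

# Overlay the fitted values in red to compare model and data
lines(fitted(fit_ets), col = "red")

# For comparison, force a multiplicative-seasonality model
fit_ets_mult <- ets(nottem, model = "MNM")
summary(fit_ets_mult)  # compare AIC/AICc/BIC with the additive model
```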
Now, with the multiplicative method the information criteria (AIC, AICc, BIC) are higher, which means the model is not as good as the one we fitted initially. In fact, if we look at the graph comparison above, the multiplicative method does not catch the peaks as well as the initial one: there is a big distance between the black peaks and the red peaks.

ARIMA: Autoregressive Integrated Moving Average

Crucially, the model makes the time series stationary if it is not, and only then can the parameters p and q be identified. The whole ARIMA model is based on the summation of lags (the autoregressive part) and the summation of the forecasting errors (the moving average part). For example, ARIMA(2,0,0) is a second-order autoregressive model AR(2): we are interested in the first and the second lag.
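A sketch of this in R, reusing AirPassengers; auto.arima() handles the differencing and order selection automatically:

```r
library(forecast)

# auto.arima() differences the series as needed (the d in ARIMA(p,d,q))
# and then searches for p and q, comparing candidates by information criteria
fit_arima <- auto.arima(AirPassengers)
summary(fit_arima)  # prints the selected orders and coefficients

# A specific model can also be fitted directly, e.g. AR(2) = ARIMA(2,0,0)
fit_ar2 <- Arima(AirPassengers, order = c(2, 0, 0))
```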
For a first-order autoregressive model AR(1), or ARIMA(1,0,0), we have the formulation

\[
Y_t = c + \phi_1 Y_{t-1} + e_t
\]

The observed value Y_t at time point t consists of the constant c, plus the value of the previous time point (t-1) multiplied by a coefficient ϕ, plus the error term e_t of time point t. For forecasting, the Kalman filter would be applied in order to estimate the error term. ARIMA(0,1,0) means the time series is not stationary (a dataset with no constant mean), and stationarity is required for a forecast, so the series has to be differenced. The differencing is

\[
Y_t - Y_{t-1} = c + e_t
\]

The formula above says that the present observation Y_t minus the previous observation Y_{t-1} is equal to a constant plus an error term. The backshift operator B, defined by B Y_t = Y_{t-1}, is commonly used to denote the differencing of time series, so the difference can be written as (1 - B) Y_t.

What is the difference between the actual value of the time series and the forecasted value called?

A forecast "error" is the difference between an observed value and its forecast.
How is time series forecast accuracy measured?

The mean absolute percentage error (MAPE) is one of the most popular error metrics in time series forecasting. It is calculated by taking the average (mean) of the absolute difference between actual and predicted values, divided by the actuals.
Why is MAPE used in time series?

Mean absolute percentage error is a relative error measure that uses absolute values to keep the positive and negative errors from cancelling one another out, and uses relative errors to enable you to compare forecast accuracy between time-series models.
Which forecasting performance measure should you use if you want to penalize large errors?

Root Mean Squared Error (RMSE). Like MSE, this metric penalizes larger errors more. It is also always positive, and lower values are better. One advantage of this calculation is that the RMSE value is in the same unit as the forecasted value, which makes it easier to understand than MSE.
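For reference, RMSE is defined as:

\[
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2}
\]

Squaring the errors before averaging is what penalizes large errors more heavily, and the final square root brings the result back to the unit of the data.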