This study is based on the following goals:
- Forecasting the R software built-in "Nile" data using an appropriate time series model by:
  - the Box-Jenkins approach (Part A)
  - Simple Exponential Smoothing (Part B)
  - Holt's linear trend method (Part C)
- Comparing the methods using their respective accuracy measures and mean square errors (MSE) (Part D)
About the Data Set
The R software built-in "Nile" dataset contains a time series of the annual flow volume of the Nile River for 100 years, from 1871 to 1970. Since this study will compare different approaches to forecasting the flow volume, i.e. find the best approach for forecasting the "Nile" series, the series is split into two parts, named the "train" and "test" series. The "train" series contains the flow volume for the 90 years from 1871 to 1960, and the "test" series contains the flow volume for the remaining 10 years, from 1961 to 1970.
For each approach, the "train" series will be used to forecast 10 years ahead; the forecasting methods will then be compared by estimating accuracy measures and the mean square error of the forecasts against the "test" series.
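The split described above can be reproduced in base R with window(); a minimal sketch (the object names train and test follow the text):

```r
# Sketch: splitting the built-in Nile series into training and test windows
data(Nile)

train <- window(Nile, start = 1871, end = 1960)  # first 90 years
test  <- window(Nile, start = 1961, end = 1970)  # last 10 years

length(train)  # 90
length(test)   # 10
```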
Box-Jenkins Approach (ARIMA Model)
If we combine differencing with an autoregressive and a moving average model, we obtain a non-seasonal ARIMA (Autoregressive Integrated Moving Average) model. The full model can be written as follows:

y'_{t} = c + ϕ_{1}y'_{t-1} + ⋯ + ϕ_{p}y'_{t-p} + θ_{1}e_{t-1} + ⋯ + θ_{q}e_{t-q} + e_{t}

where y'_{t} is the differenced series (it may have been differenced more than once). The "predictors" on the right-hand side include both lagged values of y'_{t} and lagged errors. We call this an ARIMA(p, d, q) model, where
p = order of the autoregressive part;
d = degree of differencing involved;
q = order of the moving average part.
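As a concrete illustration of the (p, d, q) notation, the forecast package's Arima() takes these three orders directly. A sketch on a toy series (the series y here is hypothetical, not the Nile data):

```r
# Sketch: fitting an ARIMA(p, d, q) with the forecast package
library(forecast)

set.seed(1)
y <- ts(cumsum(rnorm(100)))          # toy non-stationary series

fit <- Arima(y, order = c(1, 1, 1))  # p = 1, d = 1, q = 1
coef(fit)                            # one AR and one MA coefficient
```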
The first step of any time series modelling is to plot the data and identify any unusual observations.
Figure 1: Time series plot of Train Data
From Figure 1, it appears that the time series is neither variance- nor mean-stationary. To confirm this, the Augmented Dickey-Fuller (ADF) test and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test can be used to test whether the series is stationary.
Test for Stationarity
The Box-Jenkins ARIMA model assumes that the underlying time series is stationary. To test the stationarity of the data, both informal and formal tests have been carried out.
Informal approach (Correlogram)
It is clear from the constructed correlogram that the autocorrelation function decays gradually, which indicates that the series contains a higher-order moving average term, and the cut-off after the first spike of the partial autocorrelation function indicates that there may be one autoregressive coefficient in the series. The correlogram therefore suggests that the original series is non-stationary.
Figure 2: ACF and Partial ACF of the train data
Formal Approach
 ADF Test
The hypotheses of the Augmented Dickey-Fuller test are:
H_{0}: the series has a unit root, i.e. it is non-stationary.
H_{1}: the series has no unit root, i.e. it is stationary.
Table 1: Augmented Dickey-Fuller (ADF) Test Outcomes
Augmented Dickey-Fuller Test
data: train
Dickey-Fuller = -3.2347, Lag order = 4, p-value = 0.08716
alternative hypothesis: stationary

From this test we observe that, at the level, the series has a unit root (p-value = 0.087), which means the series is not stationary. To make the series stationary we take the first-order difference, which makes the series stationary (p-value < 0.01).
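Both results can be reproduced with tseries::adf.test; a sketch (note that adf.test truncates printed p-values at 0.01):

```r
# Sketch: ADF test on the level and the first-differenced training series
library(tseries)

data(Nile)
train <- window(Nile, start = 1871, end = 1960)

adf.test(train)        # p-value ~ 0.087: unit root not rejected at the 5% level
adf.test(diff(train))  # p-value <= 0.01: stationary after first differencing
```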
 KPSS Test
The hypotheses of the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test are:
H0 : The series is stationary.
H1 : The series is not stationary.
Table 2: KPSS Test Outcomes
KPSS Test for Level Stationarity
data: train
KPSS Level = 1.1459, Truncation lag parameter = 3, p-value = 0.01
According to the KPSS test in [Table 2], the series does not appear to be stationary, since the p-value ≤ 0.01 is significant.
Since the series is not stationary, a transformation of the series is necessary. To choose a possible transformation using the Box-Cox formula, a decision rule can be made by finding a confidence interval for the maximum likelihood estimate of the parameter λ, where λ = 1 indicates no transformation and λ = 0 indicates a log transformation.
Figure 3: 95% confidence limit of loglikelihood function for λ
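The profile likelihood for λ in Figure 3 can be obtained with MASS::boxcox; a sketch (plotit = FALSE returns the λ grid instead of drawing the plot):

```r
# Sketch: Box-Cox profile log-likelihood for lambda on the training series
library(MASS)

data(Nile)
train <- window(Nile, start = 1871, end = 1960)

bc <- boxcox(train ~ 1, plotit = FALSE)  # lambda grid (x) and log-likelihood (y)
bc$x[which.max(bc$y)]                    # lambda with the highest likelihood
```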
Here the Box-Cox transformation is applied to make the variance stable, i.e. variance-stationary. But from Figure 1 it is clear that the data is not mean-stationary. A possible method of making a series mean-stationary is to take the first difference of the series.
Figure 4: Time series plot of diff(train) data
This plot shows a stationary pattern in the differenced series. The ADF test also confirms that the process is stationary: the p-value is 0.01, which indicates stationarity.
Table 3: Augmented Dickey-Fuller (ADF) test outcomes
Augmented Dickey-Fuller Test
data: diff(train)
Dickey-Fuller = -6.246, Lag order = 4, p-value = 0.01
alternative hypothesis: stationary

Using the first difference, the ADF test concludes that the series is stationary (p-value ≤ 0.01).
Model Selection Criterion
Figure 5: ACF and Partial ACF of the diff(train) data

The sample ACF and partial ACF are used to identify a model. From [Figure 5], both the ACF and the partial ACF cut off after the first lag (so q = 1 and p = 1), and since the first difference has been used to make the series mean-stationary, the probable time series model is an autoregressive, first-differenced, moving average model of order (1, 1, 1), i.e. an ARIMA(1,1,1) model. To find the best-fitted ARIMA model with the appropriate orders of lags p and q, the corrected Akaike Information Criterion (AICc, a version of AIC for small sample sizes) and the Bayesian Information Criterion (BIC) are used.
Table 4: Model Selection using AICc and BIC
Model  AICc  BIC 
(0,1,2)  1140.74  1147.93 
(0,1,1)  1141.81  1146.65 
(1,1,1)  1140.03  1147.22 
(1,1,2)  1142.06  1151.54 
Considering these tentative ARIMA models, we see from the above table that ARIMA(1,1,1) attains the minimum AICc (1140.03); although ARIMA(0,1,1) has a slightly lower BIC, ARIMA(1,1,1) is taken as the best-fitted forecasting model.
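The AICc column of Table 4 can be regenerated by fitting each candidate order in turn; a sketch using forecast::Arima:

```r
# Sketch: comparing candidate ARIMA orders on the training series by AICc
library(forecast)

data(Nile)
train <- window(Nile, start = 1871, end = 1960)

orders <- list(c(0, 1, 2), c(0, 1, 1), c(1, 1, 1), c(1, 1, 2))
aicc   <- sapply(orders, function(o) Arima(train, order = o)$aicc)
names(aicc) <- sapply(orders, paste, collapse = ",")

round(aicc, 2)          # candidate AICc values
names(which.min(aicc))  # order with the smallest AICc
```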
Model Estimation and Diagnostic Checking
The suggested ARIMA(1,1,1) model may be represented mathematically as follows:

y'_{t} = ϕ_{1}y'_{t-1} + θ_{1}e_{t-1} + e_{t},  where y'_{t} = y_{t} − y_{t-1},

where ϕ_{1} is the parameter of the AR(1) process, θ_{1} is the parameter of the MA(1) process, and e_{t} is random noise.
The time series models that are tested and found satisfactory at all stages of the model-fitting process are used for estimation of the time series coefficients. These models are then used for forecasting, which is the ultimate goal of univariate time series analysis.
Table 5: Estimated parameters of ARIMA (1,1,1) model
From the above table, the estimated parameters of the ARIMA(1,1,1) model are ϕ_{1} = 0.254371 and θ_{1} = -0.874136, respectively. The parameters of both the AR(1) and MA(1) processes are significant at the 5% level.
Model Diagnosis
An important assumption of the Box-Jenkins approach is that the error term e_{t} follows the assumptions of a stationary univariate process: the residuals should be white noise (i.e. uncorrelated), drawn from a fixed distribution with constant mean and variance, and should satisfy the normality assumption. The correlogram constructed from the residuals of the fitted model indicates that the series is free from autocorrelation problems, since all the spikes lie within the permissible limit lines. From the residual plot it is clear that the residuals of the ARIMA(1,1,1) model form a zero-mean stationary series, and from the histogram it is clear that the residuals are approximately normally distributed.
Figure 6: Model diagnostics for the residual of ARIMA(1,1,1)
Now, by using a portmanteau test such as the Ljung (pronounced "Young")-Box test, we can find out whether the residuals are uncorrelated at the given lags. The hypotheses of this test are:
H_{0} : The residuals are uncorrelated and hence follow a white noise process.
H_{1} : The residuals are correlated and hence do not follow a white noise process.
Table 6: Box-Ljung test outcomes
Box-Ljung test
data: e
X-squared = 0.082212, df = 1, p-value = 0.7743
The portmanteau test returns a large p-value (0.7743), also suggesting that the residuals are white noise. Therefore, the ARIMA(1,1,1) model can be used to forecast the series some lags ahead.
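The test in Table 6 can be reproduced on the fitted model's residuals; a sketch (Box.test defaults to lag 1, matching df = 1 above):

```r
# Sketch: Ljung-Box test on the ARIMA(1,1,1) residuals
library(forecast)

data(Nile)
train <- window(Nile, start = 1871, end = 1960)
fit   <- Arima(train, order = c(1, 1, 1))

Box.test(resid(fit), type = "Ljung")  # large p-value: residuals look like white noise
```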
Forecasting
Therefore, using the ARIMA(1,1,1) model, the forecasts from the train data set for the years 1961 to 1970 are shown as the blue line in the given figure:
Figure 7: Time series Forecasting plot by using ARIMA (1,1,1)
By comparing the forecasted values (1961-1970) with the observed test values (1961-1970), we can comment on the forecasting performance of the Box-Jenkins approach, i.e. the ARIMA(1,1,1) model. This is the main reason for splitting the dataset into training and test data.
Year  Test (observed value)  Forecasted value 
1961  1020  860.1976 
1962  906  871.9006 
1963  901  874.9309 
1964  1170  875.7156 
1965  912  875.9187 
1966  746  875.9713 
1967  919  875.9850 
1968  718  875.9885 
1969  714  875.9894 
1970  740  875.9896 
MSE = 20372.13 
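As a check, the test-set MSE can be computed as the average of the squared forecast errors (a sketch; fit_mean holds the 10 point forecasts compared against the observed test values):

```r
# Sketch: test-set mean square error of the ARIMA(1,1,1) forecasts
library(forecast)

data(Nile)
train <- window(Nile, start = 1871, end = 1960)
test  <- window(Nile, start = 1961, end = 1970)

fit      <- Arima(train, order = c(1, 1, 1))
fit_mean <- forecast(fit, h = 10)$mean  # point forecasts for 1961-1970

mse <- mean((test - fit_mean)^2)  # average of the squared forecast errors
mse
```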
Accuracy measures of forecasting the "Nile" series by the Box-Jenkins approach are estimated in the following table:
Table 7: Estimated accuracy measures by the ARIMA(1,1,1) method
Accuracy measures  Estimates 
ME  15.36 
RMSE  139.67 
MAE  109.15 
MPE  3.9 
MAPE  12.69 
MASE  0.83 
ACF1  0.03 
R Codes (ARIMA Model) 
library(forecast)
library(ggplot2)

data(Nile)
View(Nile)
plot(Nile)

## Split into training (1871-1960) and test (1961-1970) series
train <- window(Nile, start = 1871, end = 1960)
autoplot(train)
test <- window(Nile, start = 1961, end = 1970)

## Correlogram of the original series
par(mfrow = c(2, 1))
acf(train)
pacf(train)

## Formal stationarity tests
library(tseries)
adf.test(train)
kpss.test(train)

## Box-Cox check for a variance-stabilising transformation
library(MASS)
boxcox(train ~ 1)

## First difference to achieve mean stationarity
train_dif <- diff(train)
plot.ts(train_dif, xlab = "Year", ylab = "First difference")
adf.test(diff(train))

### Model selection
par(mfrow = c(2, 1))
acf(train_dif)   ### cuts off after 1 lag, i.e. q = 1
pacf(train_dif)  ### cuts off after 1 lag, i.e. p = 1
auto.arima(train)
Arima(train, order = c(0, 1, 2))$aicc
Arima(train, order = c(0, 1, 1))$aicc
Arima(train, order = c(1, 1, 1))$aicc
Arima(train, order = c(1, 1, 2))$aicc

### Model fitting
Fit <- Arima(train, order = c(1, 1, 1))
library(lmtest)
coeftest(Fit)
summary(Fit)

### Model diagnosis
checkresiduals(Fit)  ### residual plot, residual ACF and histogram
fitted_residual <- resid(Fit)
Box.test(fitted_residual, type = "Ljung")

### Forecasting
autoplot(forecast(Fit)) + xlab("Year") + ylab("Nile")
f.fit <- forecast(Fit, h = 10)
round(accuracy(f.fit), 2)
fit_mean <- f.fit$mean
fit_mean
MSE <- mean((test - fit_mean)^2)  ### mean of squared errors, not the squared sum of errors
MSE