Comparison of Box-Jenkins models predicting Iraq's population growth rate

Predicting the urban population growth rate gives an estimate of the expected future change in that rate, based on socio-economic development and the geography of the population. Time series analysis is one of the most important statistical methods for studying annual, monthly and daily data: it predicts the future values of random phenomena from what happened in the past, which helps in making plans for economic development. The goal of most statistical population studies is to provide an approximate forecast of the future population. In this paper, Box-Jenkins models are compared for predicting the urban population growth rate in Iraq up to the year 2033, and the best prediction model is chosen by the mean absolute percentage error (MAPE) criterion.


Introduction
A prediction of the urban population growth rate is an approximation for a later period and gives the expected change in the future population, based on socioeconomic development and the geography of the population. The goal of most statistical population studies is to provide an approximate forecast of the future population. Population data are an important field of interest to governments and international organizations because of the importance of the human element in planning and implementing economic and social development.
Developing countries tend to be less aware of their populations and their needs, whereas developed countries pay closer attention to the current and future distribution of their populations. No country can follow a sound path of scientific and technological development without a properly conducted population census. Studies that predict the future population with statistical methods provide data and figures on the population, which help in addressing economic, social and health problems. Many of these problems can be anticipated and prepared for by developing rational economic policies that keep their negative effects in the future as small as possible.
The Box-Jenkins methodology is one of the most important methods used to predict time series. It does not assume any particular pattern in the historical data of the series being predicted. To select the appropriate model, the sample autocorrelation coefficients of the time series are compared with the theoretical autocorrelation patterns of the different models. The selected model is adequate if the residuals (the differences between the estimated values and the historical data) are small, normally distributed, and independent of each other.
Many researchers have addressed the Box-Jenkins methodology, among them (Zakria & Muhammad, 2009). The main focus of this paper is to forecast the urban population growth rate for the coming period (up to 2033) on the basis of previous trends, through model fitting and forecasting.

The methodology of Box-Jenkins
The Box-Jenkins methodology was applied to time series by George Box and Gwilym Jenkins in 1970. It is distinguished from other methods by its ability to model and predict random phenomena without assuming any prior model, and by its comprehensive treatment of all stages of time series analysis: initial selection of the appropriate model, estimation of the model parameters, diagnostic checking, and finally prediction of future observations. The methodology is accompanied by statistical tests that make it possible to identify the appropriate model for the data and to validate the model that is built.
Among the most prominent prediction methods are the Autoregressive Integrated Moving Average (ARIMA) models. This methodology combines the autoregressive model and the moving average model; the "integrated" part refers to differencing the series to make it stationary before the combined model is fitted.

Autoregressive model: AR (p)
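The current value of the time series ($Y_t$) is written as a weighted sum of the previous values of the time series ($Y_{t-1}, Y_{t-2}, \dots$) plus the current error value ($b_t$).
In the usual Box-Jenkins notation, the AR (p) model is written as follows:
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + b_t \qquad (1)$$
Where: $\phi_1, \dots, \phi_p$ are the autoregressive parameters (the weights), $p$ is the order of the model, and $b_t$ is the current error value.

Moving average model: MA (q)
In the usual Box-Jenkins notation, the MA (q) model is written as follows:
$$Y_t = b_t - \theta_1 b_{t-1} - \theta_2 b_{t-2} - \dots - \theta_q b_{t-q} \qquad (2)$$
Where: $\theta_1, \dots, \theta_q$ are the moving average parameters, $q$ is the order of the model, and $b_{t-1}, \dots, b_{t-q}$ are the previous error values.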

Mixed model
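In this step the two forms above, AR (p) and MA (q), are combined for the analysis of time series data.
In the usual Box-Jenkins notation, the ARMA (p, q) model is written as follows:
$$Y_t = \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + b_t - \theta_1 b_{t-1} - \dots - \theta_q b_{t-q} \qquad (3)$$
Where: $\phi_1, \dots, \phi_p$ are the autoregressive parameters, $\theta_1, \dots, \theta_q$ are the moving average parameters, and $b_t$ is the error value at time $t$.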
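As a minimal sketch of how the mixed model combines the two parts, the following Python function computes one value of an ARMA (p, q) series from its past values and past errors. The function name `arma_value`, its argument names, and the numbers in the example are hypothetical choices made for illustration; the sketch assumes the Box-Jenkins sign convention (moving average terms subtracted) and no constant term.

```python
def arma_value(past_values, past_errors, phi, theta, current_error):
    """Return Y_t for an ARMA(p, q) model without a constant term.

    past_values[0] is Y_{t-1}, past_values[1] is Y_{t-2}, and so on;
    past_errors[0] is b_{t-1}, past_errors[1] is b_{t-2}, and so on.
    """
    if len(past_values) < len(phi) or len(past_errors) < len(theta):
        raise ValueError("not enough history for the requested orders")
    # Autoregressive part: weighted sum of the p most recent values.
    ar_part = sum(f * y for f, y in zip(phi, past_values))
    # Moving average part: weighted sum of the q most recent errors.
    ma_part = sum(th * e for th, e in zip(theta, past_errors))
    # Assumed Box-Jenkins sign convention: the MA terms are subtracted.
    return ar_part + current_error - ma_part


# Hypothetical ARMA(1, 1) example with phi_1 = 0.5 and theta_1 = 0.25.
y_t = arma_value(past_values=[2.0], past_errors=[0.4],
                 phi=[0.5], theta=[0.25], current_error=0.1)
print(y_t)
```

Passing an empty `theta` list reduces the function to a pure AR (p) model, and passing an empty `phi` list reduces it to a pure MA (q) model.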

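Comparison criterion
In this paper, the mean absolute percentage error (MAPE) is used as the criterion for comparing the models and deciding which is more accurate in prediction. The formula is as follows:
$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{Y_t - \hat{Y}_t}{Y_t} \right| \qquad (4)$$
Where: $Y_t$ is the actual value at time $t$, $\hat{Y}_t$ is the predicted value, and $n$ is the number of observations.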
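The MAPE criterion can be illustrated with a short Python sketch. The function below implements the standard definition of MAPE (in percent); the observed values and the two sets of fitted values are hypothetical numbers chosen for illustration, not the data or models of this study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical growth rates (annual %) and two hypothetical sets of fitted values.
observed = [4.0, 5.0, 2.0, 2.5]
model_a = [3.0, 5.5, 2.0, 2.0]
model_b = [4.4, 5.5, 2.2, 2.75]

scores = {"A": mape(observed, model_a), "B": mape(observed, model_b)}
best = min(scores, key=scores.get)  # the model with the lowest MAPE is preferred
print(scores, best)
```

Because MAPE divides by the actual values, it is undefined when an observation equals zero, which is why the sketch rejects such input.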

Empirical data analysis
The data were collected from the World Bank website: urban population growth (annual %) in Iraq for the period 1960-2018. The time series plot of the data from 1960 to 2018, shown in Figure (1), indicates that population growth in urban areas was strongly affected during these years: it declined and was unstable. The series has fluctuated for the reasons mentioned previously. After plotting the autocorrelation function (ACF) and the partial autocorrelation function (PACF), the best model was selected from a set of proposed ARIMA models on the basis of the mean absolute percentage error. The comparison showed that the ARIMA (1, 1, 0) model is preferable, because its value of the comparison criterion was lower than that of the other models. The remaining models are not reported because their parameters were not significant. Table (3) shows the estimates of the model parameters; the value sig = 0.000, which is less than 0.05, indicates that the parameter estimates are significant. The estimated model equation is as follows:
$$Y_t = 2.283 + 0.638\, Y_{t-1} \qquad (5)$$
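To illustrate what an ARIMA (1, 1, 0) model does, the following Python sketch differences a series once, fits a first-order autoregression to the differences by ordinary least squares, and then accumulates the forecast differences back into levels. This is a simplified illustration under stated assumptions: statistical packages typically estimate ARIMA models by maximum likelihood, so the sketch is not necessarily the estimation procedure used in this study, and the function names and the input series are hypothetical.

```python
def difference(series):
    """First difference: w_t = Y_t - Y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(w):
    """Least-squares estimates (c, phi) for w_t = c + phi * w_{t-1} + error."""
    x, y = w[:-1], w[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def forecast_arima_110(series, steps):
    """Forecast `steps` values ahead with a least-squares ARIMA(1, 1, 0)."""
    w = difference(series)
    c, phi = fit_ar1(w)
    level, last_diff, out = series[-1], w[-1], []
    for _ in range(steps):
        last_diff = c + phi * last_diff  # forecast the next difference
        level += last_diff               # undo the differencing
        out.append(level)
    return out


# Hypothetical annual growth rates (%), NOT the World Bank series.
rates = [5.9, 5.6, 5.4, 5.0, 4.9, 4.5, 4.4, 4.0, 3.8, 3.6, 3.2, 3.1]
forecasts = forecast_arima_110(rates, steps=3)
print(forecasts)
```

The middle order of the model (d = 1) corresponds to the single differencing step, and the first order (p = 1) to the single lagged difference used in the regression.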

Model Fit Statistics
If this hypothesis is supported, the sample values are independent and the correlation between them is zero, which indicates the stability of the time series. The residual autocorrelation function (ACF) and the residual partial autocorrelation function (PACF) of the prediction errors were plotted. The residuals follow the pattern of a white noise series, which means that they are independent and normally distributed with an arithmetic mean of 0 and a variance of $\sigma^2$.
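One common way to check whether residuals behave like white noise is the Ljung-Box statistic, which summarizes the residual autocorrelations up to a chosen lag. The Python sketch below is offered as an illustration only: the source does not state which test was used, the function names are hypothetical, and the residuals are made-up numbers rather than the residuals of the fitted model.

```python
def autocorrelations(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(residuals)
    r = autocorrelations(residuals, max_lag)
    return n * (n + 2) * sum(rk ** 2 / (n - k)
                             for k, rk in enumerate(r, start=1))


# Hypothetical residuals, not the residuals of the fitted model.
residuals = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1]
q = ljung_box_q(residuals, max_lag=3)
print(q)
```

Under the null hypothesis of independent residuals, Q is compared with a chi-square distribution whose degrees of freedom equal the number of lags minus the number of estimated ARMA parameters; a small Q is consistent with white noise.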

Conclusions
A model was built to predict the urban population growth rate in Iraq through 2033. The urban population growth rate in Iraq during the studied period is an unstable (non-stationary) time series, as shown by the plot of the autocorrelation function; this is a result of the significant decrease in the rate of population growth. The best model among the proposed models for predicting the urban population growth rate is ARIMA (1, 1, 0). According to this model, a decrease in the urban population growth rate is expected in the coming years.