A comparative forecasting model of COVID-19 cases in Indonesia

COVID-19 has been a disaster in Indonesia, and a study is needed to analyze the trend of cases. The objectives of this study were (1) to propose models for predicting COVID-19 cases using exponential smoothing, autoregressive integrated moving average (ARIMA), a neural network, and fuzzy time series, and (2) to compare the performance of each model using the root mean square error (RMSE) as the evaluation tool. The data were split into training and test sets at a 3:1 ratio. The results show that the neural network has the smallest error, with an RMSE of 772.46, which means that it performs better than the other forecasting models. In addition, the characteristics of the data have a large impact on building a forecasting model, whether classical or modern.


Introduction
COVID-19 is the abbreviation of coronavirus disease 2019 [1]. It first spread in Wuhan, China, and is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The WHO declared COVID-19 a global pandemic, which by March 2020 had reached 200,000 confirmed cases and more than 8,000 deaths across more than 160 countries [2]. According to Setiati and Azwar [3], by the end of March 2020 Indonesia had been widely affected by COVID-19, with a case fatality rate of 8.9%. As of June 14, 2020, Indonesia had 37,420 confirmed cases and 2,091 deaths. According to Laing [4], COVID-19 as a global pandemic has not only caused infections and deaths but has also caused chaos in the global economy on the scale of the Great Depression. COVID-19 has the power to devastate personal lives, finances, markets, industries and whole economies. A survey reported that Indonesia's GDP had lost 0.2%, and 1.9 million Indonesians have lost their jobs due to the pandemic [5].
Numerous researchers have built forecasting models for COVID-19. A logistic model and machine learning have been used to forecast the trend of the COVID-19 pandemic in Brazil, Russia, India, Peru and Indonesia [6]. The Boltzmann function has been used to forecast the rate of COVID-19 cases in Brazil [7]. The COVID-19 pandemic in Egypt has also been forecast [8]. The death impact of COVID-19 in Jakarta, Indonesia has been analysed using ARIMA [9]. However, a comparative study of forecasting models for COVID-19 cases in Indonesia is still lacking. To fill this gap, this paper discusses forecasting with four methods. Exponential smoothing and ARIMA are statistical models, while fuzzy time series and the neural network are artificial intelligence models. The forecasting results from the statistical and artificial intelligence models are compared using RMSE as the evaluation tool.

Data management
The novel coronavirus (COVID-19) data were obtained from https://ourworldindata.org/coronavirus-source-data. The dataset contains 80 daily observations from March 27, 2020 to June 14, 2020. Figure 1 shows the trend of COVID-19 cases. In this study, the data are divided into two groups, a training set and a test set. Based on Septiarini and Musikasuwan [10], the optimal ratio is 3:1 for training and testing, respectively, which gives 60 training and 20 test observations. The data are made stationary through differencing. Stationarity is a primary assumption of classical forecasting models (exponential smoothing, moving average, ARIMA, etc.). A stationary series has constant statistical properties, so it can be predicted more easily [11]. For a stationary series, the ACF plot drops quickly towards zero. However, full stationarity was difficult to achieve, and the differenced series is not exactly stationary. As shown in Figure 2, the ACF plot for Indonesia presents the data after differencing, and several lags still extend beyond the horizontal bounds. Figure 3 shows the time series plot after differencing.
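The data preparation described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' R code: the function names, the toy series, and the use of first-order differencing are assumptions, and the values are placeholders rather than the Our World in Data figures.

```python
from typing import List, Tuple


def split_series(series: List[float], train_ratio: float = 0.75) -> Tuple[List[float], List[float]]:
    """Split a time series chronologically (no shuffling) into train and test parts."""
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]


def difference(series: List[float], lag: int = 1) -> List[float]:
    """Return the lag-differenced series y[t] - y[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def acf(series: List[float], max_lag: int) -> List[float]:
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Toy placeholder series with 80 points (NOT the real case counts).
cases = [float(10 * t + (t % 7)) for t in range(80)]

train, test = split_series(cases)      # 3:1 ratio -> 60 training, 20 test points
diffed = difference(train)             # first-order differencing (assumed order)
rho = acf(diffed, max_lag=10)

# Approximate 95% bounds usually drawn as horizontal lines on an ACF plot.
bound = 1.96 / len(diffed) ** 0.5
outside = [k for k in range(1, len(rho)) if abs(rho[k]) > bound]
print(len(train), len(test), outside)
```

Lags listed in `outside` correspond to the spikes that cross the horizontal bounds on an ACF plot; a series close to stationarity would have few of them.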

Fuzzy Singh's model
The fuzzy time series procedure [12] used to forecast confirmed COVID-19 cases in Indonesia is as follows:
1. Divide the universal set into intervals of equal length
2. Define the fuzzy sets
3. Fuzzify the data and construct the fuzzy rules
4. Forecasting
5. Diagnostic checking

After the forecasting step, the residuals are checked for autocorrelation using the ACF plot, and for normality with zero mean and constant variance using the time plot and histogram of the residuals, as shown in Figure 3.
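The first steps of the procedure (equal-length partition, fuzzification, and rule construction) can be sketched as below. This is an illustrative sketch only: the function names, the number of intervals, and the toy values are assumptions. The forecasting rule shown is a simple average of interval midpoints in the style of Chen's method, swapped in for brevity; it is not Singh's computational rule from [12].

```python
from collections import defaultdict


def make_intervals(lo, hi, n):
    """Step 1: divide the universe [lo, hi] into n intervals of equal length."""
    width = (hi - lo) / n
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n)]


def fuzzify(value, intervals):
    """Steps 2-3: map a value to the index i of the fuzzy set A_i it belongs to most."""
    if value < intervals[0][0]:
        return 0
    for i, (a, b) in enumerate(intervals):
        if a <= value < b:
            return i
    return len(intervals) - 1  # value at or above the upper bound


def build_rules(labels):
    """Step 3: collect fuzzy logical relationships A_i -> A_j from consecutive labels."""
    rules = defaultdict(list)
    for cur, nxt in zip(labels, labels[1:]):
        if nxt not in rules[cur]:
            rules[cur].append(nxt)
    return rules


def forecast_next(label, rules, intervals):
    """Step 4 (simplified, Chen-style): average the midpoints of the consequent sets."""
    targets = rules.get(label) or [label]  # no rule: stay in the current set
    mids = [(intervals[j][0] + intervals[j][1]) / 2 for j in targets]
    return sum(mids) / len(mids)


# Toy placeholder values (NOT the real case counts).
series = [100, 130, 160, 150, 190, 240, 230, 280]
intervals = make_intervals(100, 300, 4)
labels = [fuzzify(x, intervals) for x in series]
rules = build_rules(labels)
prediction = forecast_next(labels[-1], rules, intervals)
print(labels, dict(rules), prediction)
```

The number of intervals controls the granularity of the forecast, so it would normally be chosen with the range and variability of the training data in mind.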

Autoregressive Integrated Moving Average (ARIMA)
According to Box and Jenkins [13], there are three basic steps in building an ARIMA model, as follows:

Identification
In this step, the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots are examined.

Parameter estimation
The parameters of the time series model are estimated in R. The fitted model is ARIMA(0,0,1).
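For reference, an ARIMA(0,0,1) model is a first-order moving average, MA(1). The form below uses the sign convention adopted by R; the symbols are introduced here for illustration and do not come from the source, and y_t is assumed to be the series the model was fitted to (presumably the differenced series described under Data management).

```latex
\begin{align}
  y_t &= \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1},
      \qquad \varepsilon_t \sim \mathrm{WN}(0, \sigma^2) \\
  \hat{y}_{T+1 \mid T} &= \mu + \theta_1 \hat{\varepsilon}_T,
      \qquad \hat{y}_{T+h \mid T} = \mu \quad (h \ge 2)
\end{align}
```

Because the model has only one moving-average term, its forecasts revert to the mean after a single step, which is worth keeping in mind when it is compared with the other models over a 20-day horizon.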

Diagnostic checking
Diagnostic checking covers the 20 forecasted values. As shown in Figure 4, the residuals do not appear to be correlated, and they are normally distributed with constant variance and zero mean. The constructed ARIMA model can therefore be regarded as an adequate forecasting model.

Exponential smoothing
Simple exponential smoothing is a conventional forecasting method that, as its name suggests, is simple to apply. It requires estimating a single smoothing parameter (α) from the training data (75% of the series). The estimation was carried out in R, using RMSE as the criterion. Figure 5 shows whether the residuals are correlated, using the ACF plot, and whether they are normally distributed with constant variance and zero mean, using the time series graph of the forecast errors.
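A minimal sketch of simple exponential smoothing with α chosen to minimise the in-sample RMSE is given below. The grid search, the initialisation of the level at the first observation, and the toy data are assumptions made for illustration; the R routine used in the study may optimise α and the initial level differently.

```python
import math


def ses_fitted(series, alpha):
    """Return one-step-ahead SES forecasts and the final level.

    The level is initialised at the first observation (an assumption).
    """
    level = series[0]
    fitted = []
    for y in series:
        fitted.append(level)                      # forecast made before observing y
        level = alpha * y + (1 - alpha) * level   # update the level with y
    return fitted, level


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def fit_alpha(series, grid_size=100):
    """Pick alpha in (0, 1) by grid search on the in-sample one-step RMSE."""
    best_alpha, best_err = None, float("inf")
    for i in range(1, grid_size):
        alpha = i / grid_size
        fitted, _ = ses_fitted(series, alpha)
        err = rmse(series[1:], fitted[1:])        # skip the trivial first forecast
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha, best_err


# Toy placeholder training data (NOT the real case counts).
train = [12.0, 15.0, 14.0, 18.0, 21.0, 19.0, 24.0, 26.0]
alpha, err = fit_alpha(train)
_, last_level = ses_fitted(train, alpha)
forecast = [last_level] * 3                       # SES forecasts are flat over the horizon
print(alpha, err, forecast)
```

Note that the forecast is constant over the whole horizon, so on a series with a strong upward trend the test RMSE of this method tends to grow with the forecast length.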

Neural network
A neural network is a kind of machine learning model. There are eight procedures in constructing a neural network for forecasting [14], as follows:
1. Variable selection
2. Data collection
3. Data preprocessing
4. Training, testing, and validation sets
5. Neural network parameters
6. Evaluation criteria
7. Neural network training
8. Implementation
RStudio was used to run the neural network in this study. As shown in Figure 6, the residuals do not appear to be correlated, and they are normally distributed with constant variance and zero mean. The constructed neural network can therefore be regarded as an adequate forecasting model.
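The preprocessing part of this procedure can be illustrated with a short sketch that scales a series to [0, 1] and turns it into lagged input/target pairs, which is a common way to feed a time series to a feed-forward network. The function names, the number of lags, and the toy values are assumptions; the source does not state the network architecture or the preprocessing actually used.

```python
def min_max_scale(values):
    """Scale values to [0, 1]; also return the bounds needed to undo the scaling."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("cannot scale a constant series")
    return [(v - lo) / span for v in values], lo, hi


def make_lagged(values, n_lags):
    """Build (inputs, targets): each target y[t] is paired with its n_lags predecessors."""
    inputs, targets = [], []
    for t in range(n_lags, len(values)):
        inputs.append(values[t - n_lags:t])
        targets.append(values[t])
    return inputs, targets


# Toy placeholder series (NOT the real case counts); n_lags = 2 is an assumed setting.
series = [10.0, 20.0, 30.0, 25.0, 40.0]
scaled, lo, hi = min_max_scale(series)
X, y = make_lagged(scaled, n_lags=2)
print(X, y)
```

Scaling bounds should be computed on the training set only and then reused for the test set, so that no information from the test period leaks into training.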

Results and Discussion
Four models were implemented and evaluated in this study to forecast confirmed COVID-19 cases in Indonesia. The forecasts cover May 26, 2020 to June 14, 2020 and are evaluated against the test data set using RMSE. Figure 7 shows the time series plot of the results. Table 1 lists the RMSE values on the test data set for all models. The neural network has the smallest RMSE, which means that it is the most accurate of the four models for forecasting confirmed COVID-19 cases in Indonesia. According to Musikasuwan and Septiarini [15], the fuzzy time series model can be optimized by applying a weighted fuzzy time series to forecast confirmed COVID-19 cases. In addition, extending this research on COVID-19 will require more data, since the number of confirmed cases is still rising.
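The comparison step can be sketched as follows: compute the RMSE of each model's forecasts against the test data and select the smallest. The model names and all numbers below are hypothetical placeholders chosen for illustration; they are not the values reported in Table 1.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: sqrt(mean((actual - predicted)^2))."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical test observations and forecasts (NOT the study's data).
actual = [100.0, 110.0, 120.0, 130.0]
forecasts = {
    "model_a": [98.0, 112.0, 118.0, 133.0],
    "model_b": [90.0, 100.0, 110.0, 120.0],
}

scores = {name: rmse(actual, pred) for name, pred in forecasts.items()}
best = min(scores, key=scores.get)   # model with the smallest test RMSE
print(scores, best)
```

Because RMSE is expressed in the same unit as the data (cases per day here), it is only comparable across models evaluated on the same test period.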