The forecasting model of discharge at Brantas sub-basin using autoregressive integrated moving average (ARIMA) and decomposition methods

A watershed is a combination of several rivers and tributaries with defined boundaries that drains rainwater into a lake or the sea. One type of hydrological data recorded in a watershed is discharge data. If the discharge record is incomplete, it must be extended based on historical data. ARIMA and decomposition are methods that can forecast time series data. The purposes of this research are to determine the historical discharge pattern of Brantas Sub-basin, to identify its discharge forecasting model, to obtain the discharge forecasts, and to compare the accuracy of the ARIMA and decomposition methods. Accuracy is measured by the MSE and RMSE values, and the best method is the one with the smallest MSE and RMSE. The results show that the Brantas Sub-basin discharge data for 2007-2017 have a seasonal pattern. The best ARIMA model is ARIMA (0,0,3)(1,0,1)12, while the best decomposition model is the additive decomposition model. The decomposition method has better accuracy than the ARIMA method in predicting the discharge of Brantas Sub-basin.


Introduction
Rivers and their tributaries form a watershed, which has defined boundaries on land and sea and acts as a rainwater container that drains into a lake or the sea [1, 4]. A watershed can consist of one or more sub-basins that receive rainwater and drain it through tributaries to the main river [5]. One type of hydrological data recorded in a watershed is discharge data [2]. According to Limantara, discharge is the volume of water that passes through a river cross-section per unit of time [3]. Forecasting of discharge data can be done with a method suited to the patterns in the data [6]. These patterns include seasonal, cyclical, trend, and irregular patterns.
Autoregressive Integrated Moving Average (ARIMA) is a method that can be used to extend a data record using time series data [7]. According to Juwono in Pramujo, the time series data used in the ARIMA model must be stationary [8]. Because ARIMA includes a differencing step that makes a non-stationary series stationary, its forecasts are better than those of the ARMA model for such data. The decomposition method can also be used for forecasting. According to Meilano, the components of the decomposition method are the trend factor, seasonal factor, cycle factor, and random factor [9]. The advantage of this method is that each component can be estimated separately, which can give more accurate forecasts.
Discharge data is an example of time series data in hydrology, so it can be extended using stochastic methods. Several previous studies report that the ARIMA method can predict discharge data with good results [10]. On this basis, this research compares the accuracy of the ARIMA method with that of the decomposition method. Both methods are used to forecast discharge data with the help of Minitab 18 software.
The research was conducted at Brantas Sub-basin to predict the future availability of water resources, because the communities around the site need large amounts of water for industry, food crop agriculture, plantations, and tourism. This is in accordance with the Decree of the Minister of Settlement and Regional Infrastructure Number 360/KPTS/M/2004 of 2004, which states that river discharge forecasting is needed to estimate water availability. Forecast results are used in planning, designing, building, operating, and maintaining water resources. The data commonly used are annual, seasonal, monthly, semi-monthly, and ten-daily discharge data. Discharge forecasting can also be used to predict flooding.
The purpose of this research is to conduct a comparative analysis of the ARIMA and decomposition methods for extending discharge data. The results are expected to provide information on the reliability of both methods in predicting the extension of discharge data.

Research method
Discharge forecasting for the Sumber Brantas sub-watershed uses the ARIMA and decomposition methods, modeled with Minitab 18 software. The historical data are discharge records of Brantas Sub-basin from the discharge gauge at Sengkaling Dam. The data span 2007 to 2018 and are divided into two parts: in-sample data (used to determine the model) and out-of-sample data (used to calibrate the forecast results for the next period). The 2007-2017 discharge data serve as in-sample data, and the 2018 discharge data serve as out-of-sample data.

Deletion of outlier data and pattern analysis
Outlier data are values that lie too far from the rest of the data. They need to be deleted so that the forecast results have good accuracy [11]. The pattern of the historical data must be analyzed before time series forecasting is carried out, because the data pattern determines which forecasting method is appropriate.
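The text does not state which rule was used to flag outliers. As a minimal sketch only, the following assumes the common interquartile-range (IQR) rule with a multiplier of 1.5; the function name `remove_outliers_iqr`, the multiplier, and the sample values are illustrative assumptions, not the procedure or data of this research.

```python
from statistics import quantiles

def remove_outliers_iqr(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles of the sample
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Hypothetical monthly discharge values; 19.6 is far from the rest.
discharge = [4.1, 3.8, 4.5, 5.0, 4.7, 3.9, 4.2, 19.6, 4.4, 4.0]
cleaned = remove_outliers_iqr(discharge)
```

Deleting a value from a time series leaves a gap in the record, so in practice the gap would have to be handled before fitting a model.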

Forecasting with ARIMA method
The ARIMA method combines several processes, each with its own model form:

Moving average (MA) process
A moving average (MA) process of order q is written as ARIMA (0,0,q) or MA (q).
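For reference, the standard textbook form of an MA(q) process is given here as a sketch. The symbols are assumptions chosen for this illustration: X_t is the value in period t, \mu is the series mean, e_t is the white-noise error, and \theta_1, ..., \theta_q are the moving average parameters. The minus signs follow the Box-Jenkins convention; some software reports the parameters with the opposite sign.

```latex
X_t = \mu + e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \dots - \theta_q e_{t-q}
```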

Forecasting steps
The stages of forecasting with the ARIMA method are:
• Input the in-sample data into Minitab 18 software.
• Test the stationarity of the data in variance and in mean.
• Determine candidate ARIMA models based on ACF and PACF patterns.
• Test the model parameters, the white noise assumption, and residual normality.
• Determine the best ARIMA model.
• Forecast based on the best ARIMA model.
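The model identification step above relies on the autocorrelation function (ACF). The research uses Minitab 18 for this; the pure-Python sketch below is only a stand-in showing how sample autocorrelations reveal a 12-month seasonal cycle. The synthetic sine series, the function name `acf`, and the rough limit of 2/sqrt(n) are assumptions for illustration, and the PACF is omitted.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squared deviations
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]

# Synthetic monthly series with a 12-month cycle (not Brantas data).
series = [10 + 5 * math.sin(2 * math.pi * t / 12) for t in range(120)]
r = acf(series, max_lag=24)

# Rough significance limit often drawn on ACF plots (approximation).
bound = 2 / math.sqrt(len(series))
seasonal_lags = [k for k in (12, 24) if abs(r[k]) > bound]
```

Large autocorrelations at lags 12 and 24 suggest a seasonal term with period 12, which is consistent with the seasonal part of a model such as ARIMA (p,d,q)(P,D,Q)12.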

Forecasting with decomposition method
Decomposition is a method for analyzing time series data by calculating each component separately [12]. The forecast is obtained by combining the estimates of each component. The method depends on four components:
• Trend factor: the data increase, decrease, or stay constant over a long period of time. Trends can be represented by straight lines, exponential curves, S curves, and other long-term patterns.
• Cycle factor: the data move from large values to small values and back to large values over a period longer than that of the seasonal factor.
• Seasonal factor: the data rise and fall in a regular way over a fixed period, such as monthly, quarterly, or yearly.
• Random factor: the difference between the combined effect of the three other decomposition factors and the actual data.

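Decomposition additive model
The additive model of the decomposition method is given by
$$X_t = (I_t + T_t + C_t) + E_t \qquad (4)$$
where X_t is the actual value in period t, I_t is the seasonal factor, T_t is the trend factor, C_t is the cycle factor, and E_t is the random factor.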

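Decomposition multiplicative model
The multiplicative model of the decomposition method is given by
$$X_t = (I_t \times T_t \times C_t) \times E_t \qquad (5)$$
where X_t is the actual value in period t, I_t is the seasonal factor, T_t is the trend factor, C_t is the cycle factor, and E_t is the random factor.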

Forecasting steps
The stages of forecasting with the decomposition method are:
• Input the in-sample data into Minitab 18 software.
• Analyze data using additive and multiplicative decomposition models.
• Determine the best decomposition model.
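To make the additive model concrete, the sketch below fits a simplified additive decomposition in pure Python. It is not claimed to reproduce the internal procedure of Minitab 18. It assumes an even seasonal period, a linear trend, no separate cycle component, and at least two full years of data. The names `centered_ma`, `fit_additive`, and `forecast_additive` and the synthetic series are illustrative assumptions.

```python
def centered_ma(series, period):
    """Centered moving average for an even period; None where undefined."""
    half = period // 2
    out = [None] * len(series)
    for t in range(half, len(series) - half):
        w = series[t - half:t + half + 1]
        # Half weight on both end points, full weight in between.
        out[t] = (0.5 * w[0] + sum(w[1:-1]) + 0.5 * w[-1]) / period
    return out

def fit_additive(series, period=12):
    """Return (intercept, slope, seasonal indices) of an additive model."""
    ma = centered_ma(series, period)
    buckets = [[] for _ in range(period)]
    for t, (x, m) in enumerate(zip(series, ma)):
        if m is not None:
            buckets[t % period].append(x - m)  # raw seasonal effect
    raw = [sum(b) / len(b) for b in buckets]
    shift = sum(raw) / period
    seasonal = [s - shift for s in raw]  # indices sum to zero
    # Least-squares trend line on the seasonally adjusted data.
    adjusted = [x - seasonal[t % period] for t, x in enumerate(series)]
    n = len(adjusted)
    t_mean = (n - 1) / 2
    a_mean = sum(adjusted) / n
    sxy = sum((t - t_mean) * (a - a_mean) for t, a in enumerate(adjusted))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = a_mean - slope * t_mean
    return intercept, slope, seasonal

def forecast_additive(model, start, steps, period=12):
    """Forecast = trend + seasonal index for each future period."""
    intercept, slope, seasonal = model
    return [intercept + slope * t + seasonal[t % period]
            for t in range(start, start + steps)]

# Synthetic 11-year monthly series (not Brantas data).
pattern = [3, 5, 4, 2, 0, -2, -4, -5, -3, -1, 0, 1]
series = [20 + 0.05 * t + pattern[t % 12] for t in range(132)]
model = fit_additive(series)
next_year = forecast_additive(model, start=len(series), steps=12)
```

Here 132 in-sample months play the role of the 2007-2017 data and the 12 forecast values play the role of the 2018 out-of-sample year. A multiplicative variant would divide by the moving average and multiply the components instead of subtracting and adding.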

Fitting test
Data accuracy testing must be done in every forecasting process. This can be done by minimizing the error rate, which is the difference between the actual value and the forecast value [13]. There are several ways that can be used to test the accuracy of forecasting results, including MSE and RMSE [14].
Mean Square Error (MSE) is the sum of the squared errors divided by the number of data points [15]. The MSE is calculated as
$$\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n} (X_t - F_t)^2 \qquad (6)$$
Root Mean Square Error (RMSE) is the square root of the MSE value [16]. The RMSE is calculated as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} (X_t - F_t)^2} \qquad (7)$$
where X_t is the actual value in period t, F_t is the forecast value in period t, and n is the number of forecast periods.
To keep the research aligned with the problem formulation and its purpose, a flow diagram is used, as shown in Figure 1. The diagram is compiled from the data obtained from the relevant institutions or from surveys of the research location.
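The MSE and RMSE definitions in this section can be checked with a short calculation. The following is a minimal sketch; the function names and the four sample values are illustrative assumptions, not results from this research.

```python
import math

def mse(actual, forecast):
    """Mean of the squared differences between actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - f) ** 2 for x, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Square root of the MSE, in the same unit as the data."""
    return math.sqrt(mse(actual, forecast))

# Hypothetical values: errors are 1, 0, -2, 0, so squared errors sum to 5.
actual = [4.0, 5.0, 6.0, 5.0]
forecast = [3.0, 5.0, 8.0, 5.0]
error_mse = mse(actual, forecast)    # 5 / 4 = 1.25
error_rmse = rmse(actual, forecast)  # sqrt(1.25)
```

Because RMSE is a monotonic function of MSE, both measures always rank competing models in the same order; RMSE is simply easier to read because it is in the unit of discharge.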

Results and discussion
The pattern of the Brantas Sub-basin discharge data for 2007-2017 is shown in Figure 2 [11]. Stationarity in the mean is checked by inspecting the trend pattern of the data. The results of the stationarity tests on the variance and on the mean are shown in Figure 3.
The same tests are carried out on the other candidate ARIMA models, and the MSE and RMSE values of each model are then calculated [13]. The significance tests and the MSE and RMSE results are summarized in Table 1.
The results of the additive model analysis include a time series plot, a component analysis graph, and a seasonal analysis graph. The time series plot of the additive decomposition model is shown in Figure 6.
Figure 6. Comparison of observed discharge with the results of the decomposition model
The same analysis is carried out on the multiplicative model, and the MSE and RMSE values of each model are then calculated. The MSE and RMSE results are summarized in Table 2. The accuracy of the ARIMA and decomposition methods is compared in Table 3.

Conclusions
The discharge data of Brantas Sub-basin for 2007-2017 have a seasonal pattern, as shown by the monthly seasonal factor that affects the data. Forecasting can therefore be carried out using the ARIMA and decomposition methods. The discharge forecasting models of Brantas Sub-basin are the ARIMA (0,0,3)(1,0,1)12 model and the additive decomposition model. Both were chosen because they have the best accuracy among the candidate ARIMA and decomposition models, respectively. For the 2018 forecast, the decomposition method gives smaller MSE and RMSE values than the ARIMA method. This shows that the decomposition forecasts have smaller errors and a variance closer to that of the actual data. On the out-of-sample criterion, the decomposition method is therefore more accurate than the ARIMA method in predicting the discharge of Brantas Sub-basin.