Rainfall estimation with TFR model using Ensemble Kalman filter

Rainfall fluctuation can affect other environmental conditions that are correlated with economic activity and public health. The increase in global average temperature is driven by the increase of CO2 in the atmosphere, which causes climate change. Meanwhile, forests act as carbon sinks that help maintain the carbon cycle and mitigate climate change. Deviations in rainfall intensity caused by climate change can affect the economy of a region, and even of a country. This motivates research on rainfall in relation to forest area. In this study, the mathematical model used describes global temperature, forest cover, and seasonal rainfall, and is called the TFR (temperature, forest cover, and rainfall) model. The model is first discretized and then estimated with the Ensemble Kalman Filter (EnKF) method. The results show that the more ensemble members are used in the estimation, the better the result. The accuracy of the simulation is also influenced by the choice of measurement variables: a state variable is estimated better when it is used as measurement data.


Introduction
Rainfall is one of the most important elements of the weather in Indonesia. Because the country lies on the equator, its temperature does not fluctuate much. The duration of the rainy season and the intensity of drought during the dry season, however, vary greatly in certain years. The trajectories of the monsoons change from one year to the next, leading to widespread year-to-year variation in rainfall across the archipelago [1].
The climate in Indonesia consists of one rainy season and one dry season each year. The rainy season is characterized by at least 200 mm of rainfall per month. This threshold is based on the minimum required for rice production and takes into account evaporation and seepage through the soil [2]. The rainy season and the dry season fall in different months across the provinces of Indonesia.
In Indonesia in the 1990s, district-level deviations in rainfall had a positive effect on district-level deviations in rice production: years with very high rainfall produced very high rice production, and years with very low rainfall produced very low rice production [3]. A secondary report also shows that the higher the rainfall, the higher the agricultural productivity in Indonesia [2]. Rice production in the rainy season and the dry season depends on the timing of the monsoon winds. Drought in the dry season is clearly driven by the El Niño weather phenomenon. For example, there were ten long or short droughts (large-scale crop failures) between 1921 and 1954. The planting time during the rainy season is based on a threshold of accumulated rainfall. Generally, when the planting time is delayed, the harvest is reduced. A delay in the start of the monsoon winds in the rainy season, together with ongoing drought during the season, leads to a reduction in the rainy-season harvest. Secondary crop yields may also decrease during the following dry season because of harvest delays in the previous monsoon season (e.g., the 1997-1998 El Niño episode). On the other hand, in a La Niña year (when rainfall is very high), the planting season can start earlier and produce an above-average crop. Because of the importance of the rainy-season cycle, food insecurity also tends to differ from season to season. It tends to be highest at the end of the dry season and the beginning of the next rainy season, when the food supply from the previous rainy season is low and demand is high as planting begins [4]. Thus, drought makes food scarcity worse.
Rainfall affects not only agricultural production and income, but also child growth and health. Unusual rainfall can affect the spread of disease in the environment, and disease directly affects the absorption of nutrients, especially in the early years of a child's growth, when children are weakened by illness. Fluctuations in rainfall can also affect other environmental conditions connected with economic activity and public health, such as the rates of forest wildfires, floods, and landslides, the availability of clean water, and agricultural pest control [5]. Some of these relationships imply negative effects of rainfall that offset its positive effect through increased yields.
Most of the increase in global average temperature is caused by an increase of certain gases in the atmosphere. The atmosphere is made up of a wide variety of naturally occurring gases, and gases are also produced by human activities. If some of these gases are produced in excess, they can change natural processes in ways that eventually lead to climate change [6].
Some atmospheric gases are capable of trapping or absorbing heat from the sun and the earth and keeping it in the lowest parts of the atmosphere, closest to the earth. There are many greenhouse gases in the atmosphere, but the most important one is carbon dioxide (CO2). This gas is produced when carbon combines with oxygen in the air. The increase in atmospheric CO2 is the biggest cause of climate change.
Carbon takes part in three processes. It can be 1) absorbed from the air as part of carbon dioxide by plants and trees, then used as energy and food for growth; 2) released back into the air as part of CO2 by plants, trees, animals, and humans through breathing; and 3) stored in tree trunks, animal bodies, human bodies, rocks, and other inanimate objects. Different types of land store different amounts of carbon. Forests with many trees can store large amounts of carbon, while grasslands and farms store less.
The main reason the climate is changing is that human activities interfere with the processes and cycles that control the Earth's climate, such as the greenhouse effect and the carbon cycle. The more CO2 emissions from human activities alter the balance of the Earth's natural processes, the more they lead to global warming and climate change.
Forests and natural areas play a very important role in maintaining natural processes. Forests are one of the largest carbon sinks: they help keep the carbon cycle and other natural processes running well and help mitigate climate change. However, forests can also be one of the largest sources of CO2 emissions. Because forests and other plants also absorb CO2 from the atmosphere, this dual role makes forests increasingly important. Earlier studies estimate that human-caused forest destruction accounts for 12-17% of all CO2 emissions worldwide [7].
It can be concluded that climate change, by causing deviations in rainfall intensity, can affect the economy of a region and even of a country. Climate change is in turn affected by forests and crop areas. Results obtained from a mathematical model of global temperature, forest area, and seasonal rainfall show that the higher the forest cover, the smaller the fluctuation between rainy-season and summer rainfall. Moreover, growth in forest cover also correlates with an increase in summer rainfall [8].
Therefore, it is important to estimate the behavior of rainfall in Indonesia. Several studies on rainfall prediction use the Kalman Filter method. The Kalman Filter is an algorithm for estimating the state variables of a linear stochastic dynamical system. These studies include an application of Kalman Filter analysis to modeling and forecasting rainfall in Semarang using annual rainfall data from 2005-2012 [9], and a model built on the relationship between rainfall and climate anomaly indicators to forecast future rainfall conditions in support of agricultural planning [10].
There are many modifications of the Kalman Filter algorithm, motivated by the need for faster computation, smaller error, or a better algorithm. One of them is the Ensemble Kalman Filter, which has been applied in several studies. One is the estimation of groundwater pollution in Surabaya, where groundwater quality is influenced by industrial pollution [11]. The Ensemble Kalman Filter has also been used to estimate the position of an autonomous underwater vehicle (AUV) based on the dynamical system of AUV motion [12].
In this work we use the Ensemble Kalman Filter to estimate rainfall behavior with a dynamical model that relates temperature, forest area, and the amount of rainfall. The paper is organized as follows: the mathematical model of seasonal rainfall is studied in Section 2, the discretization of the model is described in Section 3, the implementation of the Ensemble Kalman Filter for the model is given in Section 4, the simulation is described in Section 5, and Section 6 concludes the study.

Seasonal rainfall model
Changes in global temperature, forest area, and seasonal rainfall are related. The state variables are the global temperature at time t, the forest cover area at time t, the amount of rainfall at time t, and the rate of change of rainfall. Our work studies a mathematical model that represents the relationship between global temperature, forest area, and the amount of seasonal rainfall.
Global temperature and forest cover are represented by logistic equations rather than exponential equations. Global temperature increases exponentially, while forest area decreases exponentially [13]. However, neither can increase or decrease forever: exponential growth is unbounded, whereas logistic growth is bounded. We therefore assume that global temperature and forest area follow logistic equations, parameterized by the rate of increase of global temperature and the rate of decrease of forest area, by the minimal global temperature and minimal forest area, and by the gaps between the two equilibrium points of each equation.
The amount of rainfall in Indonesia is seasonal and periodic (see Figure 1 and Figure 2), according to data from Badan Pusat Statistik for 2000-2010 [14] and 2011-2015 [15] and from Badan Meteorologi, Klimatologi, dan Geofisika [16]. Rainfall is therefore described by a second-order differential equation whose solution is periodic. Because the remaining state variables are represented by first-order differential equations, the second-order equation is reduced to two first-order equations by introducing the rate of change of rainfall as an additional state variable.
However, global temperature and the amount of seasonal rainfall are correlated with the other variables, so the model should show that the rate of change of each variable is affected by the others. Under that assumption, we add prey-predator terms to the model to express these relationships. In the resulting model, two parameters represent, respectively, the rate of change of forest area caused by rainfall absorption and the rate of increase of rainfall caused by the water remaining in the forest.
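As an illustration of the two building blocks described above, the following Python sketch shows a logistic rate law with two equilibrium points and a second-order periodic equation written as two first-order equations. The functional forms, the harmonic-oscillator choice, and all names (`logistic_rate`, `oscillator_rates`, `omega`, `rain_mean`) are assumptions made for illustration; they are not the model equations of this paper.

```python
def logistic_rate(x, rate, x_min, gap):
    """Rate of change for an assumed logistic law.

    The two equilibrium points are x_min and x_min + gap.
    A positive rate drives x up toward x_min + gap (bounded growth);
    a negative rate drives x down toward x_min (bounded decay).
    """
    shifted = x - x_min
    return rate * shifted * (1.0 - shifted / gap)


def oscillator_rates(rain, rain_rate, omega, rain_mean):
    """Assumed periodic rainfall law R'' = -omega^2 (R - rain_mean),
    written as two first-order equations in (rain, rain_rate)."""
    d_rain = rain_rate
    d_rain_rate = -omega ** 2 * (rain - rain_mean)
    return d_rain, d_rain_rate
```

With a positive `rate` the first function could stand in for a bounded temperature increase, and with a negative `rate` for a bounded forest decline; the second function shows why a fourth state variable (the rate of change of rainfall) appears once the second-order equation is reduced.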
The first coupling term represents the proportion of the amount of water in the forest; its coefficients indicate the growth rate of forest area and the decreasing rate of rainfall caused by that proportion, respectively. The second term represents the proportion of the change in the rainfall difference in the forest; both of its coefficients describe the rates of change of the amount of rainfall caused by the change of rainfall pattern in the forest. The third term represents the proportion of forest area at each temperature level; its coefficients indicate the rates of change of forest area and of rainfall at that temperature level, respectively. The fourth term represents the proportion of rainfall at each temperature level, so both of its coefficients describe the decrease or increase of the amount of rainfall caused by that proportion. The fifth term represents the proportion of the rate of change of rainfall at each temperature level; both of its coefficients indicate the rates of change of the amount of rainfall caused by the change of rainfall pattern at each temperature level. The last term represents the correlation between the amount of rainfall and the rate of change of rainfall; its two coefficients describe, respectively, the rate of change of the amount of rainfall and the rate of change of the rainfall difference caused by the relationship between the change in pattern and the amount of rainfall.

Discretization of the model
The Ensemble Kalman Filter algorithm can be implemented only for discrete systems, so the continuous model (5)-(8) must first be discretized. We use the finite difference method for this purpose. After discretization with the forward finite difference method, the model can be written as the discrete system (9).
The model in equations (5)-(8) is used to understand the effect of global temperature and forest cover on seasonal rainfall. In practice, the observed temperature, forest cover, and amount of seasonal rainfall do not match the model exactly; the mismatch is called noise. Noise turns the deterministic model into a stochastic model, so equation (9) can be written in the form (10), whose nonlinear function is the one defined in equation (9). The system noise and the measurement noise are generated by computer and are taken to be normally distributed with zero mean, each with its own variance. Both variances may in general depend on time, but here their values are assumed to be constant.
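The forward finite difference scheme described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `f` stands for any right-hand side returning the rates of change of the state, and the names `euler_step`, `stochastic_step`, `dt`, and `q_std` are hypothetical, not taken from the paper.

```python
import random


def euler_step(f, x, dt):
    """Forward difference: (x_next - x) / dt = f(x), so x_next = x + dt * f(x)."""
    return [xi + dt * fi for xi, fi in zip(x, f(x))]


def stochastic_step(f, x, dt, q_std, rng):
    """Same step with zero-mean Gaussian system noise added to every state."""
    return [xi + rng.gauss(0.0, q_std) for xi in euler_step(f, x, dt)]


# Hypothetical one-state example: dx/dt = -x, starting from x = 1.
decay = lambda x: [-x[0]]
x_next = euler_step(decay, [1.0], 0.1)
```

Setting `q_std` to zero recovers the deterministic discrete model, which makes it easy to check the discretization separately from the noise.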
The system noise covariance matrix is square, with the same size as the estimation error covariance matrix, while the measurement noise covariance matrix is square with size equal to the number of rows of the measurement vector.
The measurement vector is determined by which state variables are used as measurement variables. In this case, two of the state variables are measured, and the measurement vector consists of those two components.
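To make the shapes concrete, the sketch below builds a measurement matrix that picks the measured components out of the state vector, together with diagonal noise covariance matrices of matching size. The state ordering (temperature, forest area, rainfall, rate of change of rainfall), the choice of indices 0 and 2 as measured variables, and the variance values are assumptions for illustration only; the helper names are hypothetical.

```python
def measurement_matrix(n_states, measured):
    """One row per measured variable, with a 1 in that variable's column."""
    return [[1.0 if j == k else 0.0 for j in range(n_states)] for k in measured]


def measure(H, x):
    """Noise-free measurement z = H x."""
    return [sum(h * xi for h, xi in zip(row, x)) for row in H]


def diagonal_covariance(size, variance):
    """Square covariance matrix with a constant variance on the diagonal."""
    return [[variance if i == j else 0.0 for j in range(size)] for i in range(size)]


# Assumed layout: 4 states, of which states 0 and 2 are measured.
H = measurement_matrix(4, [0, 2])
Q = diagonal_covariance(4, 0.01)       # system noise: n_states x n_states
R = diagonal_covariance(len(H), 0.01)  # measurement noise: one row per measurement
```

Here `Q` is 4 by 4 and `R` is 2 by 2, matching the statement that the measurement noise covariance follows the number of rows of the measurement vector.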

Assimilation method
There are various assimilation methods. In this work, we use the Ensemble Kalman filter, an estimation method for nonlinear stochastic dynamical systems. The method uses an ensemble of values as the initial estimate of the state variables, together with measurement data based on real data. The EnKF steps are as follows [17]:

First Step: Initialization
The first ensemble is generated from the initial value of each state by adding system noise.
The next step is to compute the mean of each state over the generated ensemble.
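The initialization step can be sketched in Python as follows. The function names, the noise standard deviation, and the zero initial value for the fourth state (the rate of change of rainfall) are assumptions for illustration; the other three initial values and the ensemble size of 100 are the ones quoted in the simulation section.

```python
import random


def init_ensemble(x0, n_members, noise_std, rng):
    """Generate the first ensemble by adding Gaussian noise to the initial state."""
    return [[xi + rng.gauss(0.0, noise_std) for xi in x0]
            for _ in range(n_members)]


def ensemble_mean(ensemble):
    """Mean of each state variable over all ensemble members."""
    n = len(ensemble)
    return [sum(column) / n for column in zip(*ensemble)]


rng = random.Random(0)                # fixed seed so the sketch is repeatable
x0 = [25.5, 1104.9466, 369.0, 0.0]    # last entry is a placeholder assumption
ensemble = init_ensemble(x0, 100, 0.1, rng)
x_hat0 = ensemble_mean(ensemble)      # initial estimate
```

Because the noise has zero mean, the ensemble mean stays close to the initial value, and the spread of the members encodes the initial uncertainty.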

Second Step: Prediction
The predicted value of each ensemble member is obtained from the estimate of the previous step by propagating it through the model and adding system noise. The mean of the predicted ensemble is then taken as the predicted estimate.

Simulation
Global temperature is measured in degrees Celsius, and its maximum value is set to the maximum temperature that a normal human body can tolerate [18]. Forest cover is measured in square kilometers, and its maximum is set to the total area of the region, Kabupaten Malang, 3535 km² [19]. Rainfall is measured in millimeters, and the maximum possible monthly rainfall in the region is taken as 700 mm, according to 2014 data from Badan Meteorologi, Klimatologi, dan Geofisika Malang (see Figure 2) [16].
For this simulation, the initial values are those of January 2014, the initial time. We use monthly temperature and rainfall data from Karangkates, Malang. The temperature of Karangkates is 25.5 °C, the forest area of Malang is 110,494.66 ha, or 1104.9466 km², according to data from Dinas Kehutanan Provinsi Jawa Timur [20], and the amount of rainfall is 369 mm. A fixed time step is used, and the estimation runs for 100 iterations. The rainfall estimates are computed with three ensemble sizes: 100, 200, and 500.
The estimate obtained with the Ensemble Kalman filter method shows the forest area increasing from 1106.2 km² to 1107.2 km². The gap between the real value and the EnKF estimate is fairly large (about 1 km²), which is reflected in a large RMSE value. The error differs between state variables although the given noise value is the same. This is because, in this simulation, two of the state variables are not used as measurement data.
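One filter cycle and the RMSE used to judge it can be sketched in Python as follows. The prediction function follows the description of the prediction step; the correction function is the standard stochastic (perturbed-observation) EnKF update, included here as an assumption because that step is not given in the text above. All names, and the use of a single scalar noise level for every state and measurement, are hypothetical simplifications.

```python
import math
import random


def predict(ensemble, step, q_std, rng):
    """Prediction: push every member through the model and add system noise."""
    return [[v + rng.gauss(0.0, q_std) for v in step(m)] for m in ensemble]


def covariance(ensemble):
    """Sample covariance matrix of the ensemble (divides by N - 1)."""
    n = len(ensemble)
    mean = [sum(col) / n for col in zip(*ensemble)]
    dev = [[v - mv for v, mv in zip(m, mean)] for m in ensemble]
    dim = len(mean)
    return [[sum(d[i] * d[j] for d in dev) / (n - 1) for j in range(dim)]
            for i in range(dim)]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def correct(ensemble, z, measured, r_std, rng):
    """Correction: x_i <- x_i + K (z + v_i - H x_i), K = P H^T (H P H^T + R)^-1.

    `measured` lists the indices of the state variables that are observed.
    """
    p = covariance(ensemble)
    s = [[p[i][j] + (r_std ** 2 if i == j else 0.0) for j in measured]
         for i in measured]                       # H P H^T + R
    updated = []
    for m in ensemble:
        innov = [zk + rng.gauss(0.0, r_std) - m[k] for zk, k in zip(z, measured)]
        w = solve(s, innov)                       # (H P H^T + R)^-1 innovation
        updated.append([v + sum(p[i][k] * wk for k, wk in zip(measured, w))
                        for i, v in enumerate(m)])
    return updated


def rmse(estimates, truth):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth))
                     / len(truth))
```

In this update an unmeasured state is adjusted only through its sample covariance with the measured states, which is one plausible reading of why state variables that are not used as measurement data can show a larger RMSE under the same noise level.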

Conclusion and discussion
In this work, the TFR model is estimated with the Ensemble Kalman filter. The investigation leads to the following conclusions:
1. Simulations with more ensemble members give better results than simulations with fewer.
2. The accuracy of the simulation is influenced by the choice of measurement variables. A state variable is estimated better when it is used as measurement data, as shown by the small gap between the real and estimated values in the plots, or by a low RMSE value.