A Comparison of The Fuzzy Time Series Methods of Chen, Cheng and Markov Chain in Predicting Rainfall in Medan

Medan has highly variable rainfall. Rainfall affects welfare in fields such as health, the economy, agriculture, industry, transportation and tourism. To track changes in rainfall patterns, rainfall is forecast so that the patterns likely to form in the future can be examined and analyzed. Forecasting is the art and science of predicting future events by taking historical data and projecting it into the future with some form of mathematical model. One method used for this purpose is the fuzzy time series, a concept for forecasting problems in which the historical data are expressed as linguistic values while the resulting forecasts are real numbers. The purpose of this research is to apply the fuzzy time series method to forecast rainfall in Medan by comparing several of its developments, namely the Chen, Markov Chain and Cheng fuzzy time series methods. The intervals are determined with the average-based rule in order to get the best results. The result of the study is the MAPE value of each method: Chen's method gives MAPE = 8.002%, the Markov Chain method gives MAPE = 30.12% and Cheng's method gives MAPE = 34.5%. The best of the three methods for forecasting rainfall in Medan is therefore Chen's method.


Introduction
Weather information has become an inseparable part of daily activities such as agriculture and transportation, and rainfall forecasting is an important part of it. Weather conditions always change over time, which makes it difficult to determine rainfall patterns completely. Rainfall patterns are generally influenced by the geographical conditions of a region. North Sumatra Province, and Medan City in particular, has the equatorial rainfall type. An equatorial-type area has a monthly rainfall distribution with two peaks, that is, two maximum rainy seasons, and is in the rainy season for most of the year. The equatorial pattern is bimodal (two rainfall peaks), with the peaks usually occurring around March and October, near the equinoxes, and an average rainfall of 2000-3000 mm per year. High-intensity rain often causes flooding in the city [1].
Forecasting is the activity of predicting future circumstances based on past information collected over a relatively long period of time. Rainfall forecasting is one case of time series forecasting. Many time series forecasting methods are in common use, one of which is the fuzzy time series. Fuzzy Time Series (FTS), introduced by Song and Chissom (1993), is a concept for forecasting problems in which the actual data are expressed as linguistic values [2]. The method has several advantages: its calculation does not require complex systems such as genetic algorithms or artificial neural networks, so it is easier to develop, and it can handle forecasting problems in which the historical data are linguistic values [3].
The fuzzy time series method has developed rapidly. Many variants modify the forecasting steps in order to produce accurate predictions with the smallest possible error. Variants in use today include the Chen method, the Cheng method and the Markov chain method, each of which forecasts a value through different steps. The Chen fuzzy time series model was developed because the Song and Chissom FTS model gave results with low accuracy [4]. According to [5], FTS forecasting with the Chen method is more accurate on smaller data samples than on larger ones. However, Chen's method also has weaknesses: it gives little consideration to how the universe of discourse and the interval length are determined, and it ignores the pattern of changes in the trend of previous data [6]. In [7], forecasting with the Markov Chain FTS was more accurate, with an MSE of 0.216, than with Chen's FTS method, with an MSE of 0.656. In [8], forecasting with the Cheng method, which uses adaptive forecasting, had a smaller forecasting error than the Chen method, with a MAPE of 2.1779%. This study therefore compares the performance of three fuzzy time series methods, namely the Chen, Cheng and Markov Chain methods, to determine which is most accurate in forecasting rainfall in the city of Medan.

Time Series
Time series data are a collection of data arranged by period, that is, in chronological order. The order may be by days, weeks, months, years, and so on. Time series data are therefore statistical data recorded and observed at certain time intervals, for example trade, prices, supply, labor production, exchange rates and stock prices.

Fuzzy Time Series
Fuzzy time series is a concept that can be used to forecast problems or cases in which the historical data are expressed as linguistic values; that is, the previous data in a fuzzy time series are linguistic data, while the resulting forecasts are real numbers [5].

Fuzzy Time Series Chen
According to [5] and [9], forecasting with Chen's fuzzy time series (FTS) model proceeds in the following steps.

[Step 1] Form the universe of discourse (U):
$$U = [X_{\min} - D_1,\; X_{\max} + D_2] \tag{1}$$
where $D_1$ and $D_2$ are constants chosen by the researcher, $X_{\min}$ is the smallest value in the historical data and $X_{\max}$ is the largest.

[Step 2] Form the intervals. Divide the universe of discourse into several intervals of equal length. The number of intervals can be found with the Sturges formula:
$$n = 1 + 3.322 \log N \tag{2}$$
where $N$ is the number of historical observations. Once the number of intervals is known, the interval length is
$$l = \frac{(X_{\max} + D_2) - (X_{\min} - D_1)}{n} \tag{3}$$
This yields a set of linguistic values, each representing a fuzzy set on the intervals of the universe of discourse (U). For a fuzzy logical relationship group $A_i \to A_{j1}, A_{j2}, \ldots, A_{jp}$, the final prediction value is
$$F(t) = \frac{m_{j1} + m_{j2} + \cdots + m_{jp}}{p} \tag{5}$$
where $F(t)$ is the defuzzified forecast and $m_{jk}$ is the midpoint of interval $u_{jk}$.
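The steps above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the function names, the toy series, the constants $D_1 = 2$ and $D_2 = 9$, and the rounding of the Sturges count to the nearest integer are all assumptions made for the example. The forecast averages the interval midpoints of the next states in the group of the current state, which is the usual reading of Chen's defuzzification rule.

```python
import math

def sturges_intervals(n_obs):
    # Sturges' formula; rounding to the nearest integer is an assumption.
    return round(1 + 3.322 * math.log10(n_obs))

def make_intervals(x_min, x_max, d1, d2, n):
    # Universe U = [x_min - d1, x_max + d2], cut into n equal intervals.
    lo, hi = x_min - d1, x_max + d2
    length = (hi - lo) / n
    return [(lo + i * length, lo + (i + 1) * length) for i in range(n)]

def fuzzify(value, intervals):
    # Index of the interval holding value (value is assumed to lie in U);
    # the upper edge of U is assigned to the last interval.
    for i, (a, b) in enumerate(intervals):
        if a <= value < b:
            return i
    return len(intervals) - 1

def chen_groups(states):
    # FLRG: each current state maps to the set of distinct next states.
    groups = {}
    for cur, nxt in zip(states, states[1:]):
        groups.setdefault(cur, set()).add(nxt)
    return groups

def chen_forecast(state, groups, intervals):
    # Average the midpoints of the next states; a state with no group
    # falls back to its own midpoint (an assumption of this sketch).
    mids = [(a + b) / 2 for a, b in intervals]
    targets = groups.get(state) or {state}
    return sum(mids[j] for j in targets) / len(targets)

series = [12, 30, 55, 28, 61, 35]            # hypothetical rainfall values (mm)
intervals = make_intervals(12, 61, 2, 9, 3)  # U = [10, 70], three intervals
states = [fuzzify(x, intervals) for x in series]
groups = chen_groups(states)
print(chen_forecast(states[-1], groups, intervals))  # forecast for the next period
```

Because the forecast depends only on the group of the current state and on fixed midpoints, every period that falls in the same state receives the same forecast.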

Fuzzy Time Series Markov Chain
According to [10], [11] and [12], forecasting with the fuzzy time series Markov chain method proceeds in the following steps:
1. Determine the universe of discourse U as in Step 1 of the Chen method.
2. Divide U into several intervals of equal length as in Step 2 of the Chen method.
3. Define the fuzzy sets on U and fuzzify the observed historical data.
4. Determine the fuzzy logical relationships (FLR) from the historical data.
5. Group the fuzzy logical relationships into fuzzy logical relationship groups (FLRG).
6. Determine the transition probability matrix P from the FLRG of the previous step. The Markov transition probability matrix has dimension p x p, where p is the number of fuzzy sets. The state transition probability is
$$P_{ij} = \frac{M_{ij}}{M_i}, \quad i, j = 1, 2, \ldots, p \tag{6}$$
where $P_{ij}$ is the transition probability from state $A_i$ to state $A_j$, $M_{ij}$ is the number of transitions from state $A_i$ to state $A_j$, and $M_i$ is the number of data in state $A_i$.
7. Calculate the prediction value.

Law 1
If a fuzzy set has no fuzzy logical relationship, for example $A_i \to \emptyset$, and the datum of period t-1 is in state $A_i$, then the prediction value is
$$F(t) = m_i \tag{7}$$
where $m_i$ is the midpoint of interval $u_i$ in the fuzzy logical relationship group formed by the datum at t-1.

Law 2
If the fuzzy logical relationship group is one to one (for example $A_i \to A_k$, where $P_{ik} = 1$ and $P_{ij} = 0$ for $j \neq k$) and the datum $Y(t-1)$ collected at time t-1 is in state $A_i$, then the prediction value is
$$F(t) = m_k P_{ik} = m_k \tag{8}$$
where $m_k$ is the midpoint of $u_k$ in the fuzzy logical relationship group formed by the datum at t-1.

8. Adjust the trend of the prediction value.

Law 3
If state $A_i$ is related to $A_i$, starting from state $A_i$ at time t-1 as $F(t-1) = A_i$, and makes a downward transition to state $A_j$ at time t, where $i > j$, then the trend adjustment value is defined as
$$D_{t1} = -\frac{l}{2} \tag{9}$$
If the transition starts from state $A_i$ at time t-1 as $F(t-1) = A_i$ and jumps forward to state $A_{i+s}$ at time t, where $1 \le s \le p - i$, then the trend adjustment value is defined as
$$D_{t2} = \frac{l}{2}\, s \tag{10}$$
where $l$ is the interval length and $s$ is the number of forward jumps.

Law 4
If the process is in state $A_i$ at time t-1 as $F(t-1) = A_i$ and jumps backward to state $A_{i-v}$ at time t, where $1 \le v \le i - 1$, then the trend adjustment value is defined as
$$D_{t2} = -\frac{l}{2}\, v \tag{11}$$
where $v$ is the number of backward jumps.
9. Determine the final forecast by adjusting the prediction value. If the fuzzy logical relationship group is one to many and state $A_{i+1}$ can be accessed from state $A_i$, where state $A_i$ is related to $A_i$, the forecast becomes
$$F'(t) = F(t) + D_{t1} + D_{t2} = F(t) + \frac{l}{2} + \frac{l}{2} \tag{12}$$
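A short Python sketch may help make the transition matrix and the simplest laws concrete. It is illustrative only: the function names and the toy state sequence are hypothetical, states are 0-based, and $M_i$ is taken as the number of transitions leaving state $A_i$ so that each non-empty row sums to 1, which is an assumption of this sketch. Only Law 1, Law 2 and the jump adjustments of equations (10) and (11) are covered; one-to-many groups are left out.

```python
def transition_matrix(states, p):
    # counts[i][j] = number of observed transitions from state i to state j
    counts = [[0] * p for _ in range(p)]
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)  # assumption: M_i is the row total of transitions
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

def simple_forecast(state, matrix, mids):
    # Law 1: no relation -> midpoint of the state's own interval.
    # Law 2: one-to-one relation -> midpoint of the single target interval.
    targets = [k for k, prob in enumerate(matrix[state]) if prob > 0]
    if not targets:
        return mids[state]
    if len(targets) == 1:
        return mids[targets[0]]
    return None  # one-to-many groups are outside this sketch

def jump_adjustment(i, j, length):
    # A forward jump of s = j - i states gives +(length / 2) * s;
    # a backward jump of v = i - j states gives -(length / 2) * v.
    return (length / 2) * (j - i)

states = [0, 1, 2, 0, 2, 1]        # hypothetical fuzzified series
mids = [20.0, 40.0, 60.0, 80.0]    # midpoints of four hypothetical intervals
P = transition_matrix(states, 4)
for row in P:
    print(row)
```

State 3 never occurs in the toy series, so its row is all zeros and Law 1 applies to it; state 1 is only ever followed by state 2, so Law 2 applies.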

Fuzzy Time Series Cheng
According to [13] and [14], forecasting with Cheng's fuzzy time series (FTS) model proceeds in the following steps:
1. Define the universe of discourse and divide it into several intervals of equal length. If the number of data in an interval is greater than the average number of data per interval, that interval is subdivided into two smaller intervals of equal length.
2. Define the fuzzy sets on the universe of discourse and fuzzify the observed historical data.
3. Establish the fuzzy logical relationships (FLR) from the historical data. In the fuzzified data, two consecutive fuzzy sets $A_i(t-1)$ and $A_j(t)$ are expressed as the FLR $A_i \to A_j$.
4. Group the FLRs that share the same current state into fuzzy logical relationship groups (FLRG).
5. Assign weights to the FLRs. An FLR that recurs is given an increasing weight: weight 1 at its first occurrence, weight 2 at its next occurrence (t=3), weight 3 at the next (t=4) and weight 4 at the next (t=5), where t denotes time.
6. Transfer the weights to the normalized weighting matrix:
$$W_n(t) = [W'_1, W'_2, \ldots, W'_k] = \left[\frac{W_1}{\sum_{h=1}^{k} W_h}, \frac{W_2}{\sum_{h=1}^{k} W_h}, \ldots, \frac{W_k}{\sum_{h=1}^{k} W_h}\right] \tag{13}$$
7. Calculate the forecast. The normalized weighting matrix $W_n(t-1)$ is multiplied by the defuzzification matrix $L_{df}(t-1) = [m_1, m_2, \ldots, m_k]$, where $m_k$ is the midpoint of each interval. The forecast is
$$F(t) = L_{df}(t-1) \cdot W_n(t-1)^{\mathsf{T}} \tag{14}$$
8. Modify the forecast by adaptive forecasting:
$$\text{Adaptive forecast}(t) = P(t-1) + h\,\bigl(F(t) - P(t-1)\bigr) \tag{15}$$
where $P(t-1)$ is the actual value at time t-1 and $h$ is the weighting parameter.
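The weighting and adaptive steps can be sketched as follows. This is a hedged illustration, not the authors' code: the function names and the toy sequence are hypothetical, the weight of each relation is taken as the number of times it recurs, and the adaptive step is assumed to have the commonly cited form $P(t-1) + h\,(F(t) - P(t-1))$ with $P(t-1)$ the previous actual value.

```python
from collections import Counter, defaultdict

def weighted_groups(states):
    # For each current state, count how often each next state follows it.
    # The count plays the role of the (unnormalized) weight.
    groups = defaultdict(Counter)
    for cur, nxt in zip(states, states[1:]):
        groups[cur][nxt] += 1
    return groups

def cheng_forecast(state, groups, mids):
    # Normalize the weights and take the weighted sum of interval midpoints.
    weights = groups.get(state)
    if not weights:
        return mids[state]  # assumption: fall back to the state's own midpoint
    total = sum(weights.values())
    return sum(mids[j] * w / total for j, w in weights.items())

def adaptive_forecast(prev_actual, forecast, h):
    # Pull the forecast toward the previous actual value; h lies in [0, 1].
    return prev_actual + h * (forecast - prev_actual)

states = [0, 0, 1, 0, 0, 0]   # hypothetical fuzzified series, 0-based
mids = [20.0, 40.0]           # midpoints of two hypothetical intervals
groups = weighted_groups(states)
f = cheng_forecast(0, groups, mids)
print(f, adaptive_forecast(30.0, f, 0.9))
```

With h close to 1 the adaptive forecast stays near the weighted forecast; with h close to 0 it stays near the previous observation.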

Research Data
The data used in this study are rainfall data for Medan for the period January 2009 to June 2019. The data are monthly rainfall in millimeters (mm), with 84 observations used as training data and 42 as testing data. The data were obtained from the site www.bmkg.go.id.
The first step in this fuzzy time series forecasting is to plot the rainfall time series for Medan for the period January 2009 to June 2015, as shown in Figure 1. The time series plot in Figure 1 shows no particular pattern in the rainfall data. This may be caused by factors such as wind direction, humidity and air pressure, which were not observed in this study. Figure 1 shows that the lowest rainfall, 10.5 mm, occurred in July 2014 and the highest, 509 mm, occurred in March 2009.

Fuzzy Time Series

Universe of Discourse (U)
After sorting the historical rainfall data, the minimum and maximum values obtained from the data are $X_{\min}$ = 10.5 mm and $X_{\max}$ = 509 mm.

Interval length with Average Based
The process of forming intervals is the same in the Chen and Markov Chain methods, so the number and the length of the intervals are the same in both. Based on the average-based calculation for the Chen and Markov Chain models, the effective interval length is 65. The universe of discourse is U = [10, 510]. Dividing U into intervals of length 65 gives 8 intervals. In the Cheng method the initial process of forming intervals is the same, but an interval whose frequency exceeds the average frequency is halved. Thus, where the Chen and Markov Chain methods form 8 intervals, the Cheng method forms 13. This happens because there are two intervals that have frequencies exceeding the middle value, so the interval is broken into two parts.
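The interval construction can be illustrated with a small Python sketch. It assumes the common form of the average-based rule, in which the interval length is half the mean absolute difference between consecutive observations; any rounding of that length to a base value is omitted. The function names and toy data are hypothetical. The last function shows the Cheng-style halving of intervals whose frequency is above the average.

```python
import math

def average_based_length(series):
    # Half of the mean absolute first difference of the series.
    diffs = [abs(b - a) for a, b in zip(series, series[1:])]
    return (sum(diffs) / len(diffs)) / 2

def interval_count(lo, hi, length):
    # Number of intervals of the given length needed to cover [lo, hi].
    return math.ceil((hi - lo) / length)

def split_dense_intervals(intervals, series):
    # Halve every interval that holds more observations than the average.
    # Intervals are treated as half-open [a, b).
    counts = [sum(1 for x in series if a <= x < b) for a, b in intervals]
    mean = sum(counts) / len(counts)
    result = []
    for (a, b), c in zip(intervals, counts):
        if c > mean:
            mid = (a + b) / 2
            result.extend([(a, mid), (mid, b)])
        else:
            result.append((a, b))
    return result

print(average_based_length([10, 40, 20, 60]))  # toy series
print(interval_count(10, 510, 65))             # U = [10, 510], length 65
print(split_dense_intervals([(0, 10), (10, 20), (20, 30)], [1, 2, 3, 4, 15, 25]))
```

With the values reported in the text, U = [10, 510] and a length of 65, the count comes out at 8 intervals, matching the number used for the Chen and Markov Chain methods.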

Fuzzification
In the fuzzification stage, linguistic values are assigned according to the number of intervals formed from the effective interval length. The fuzzification results for the Chen, Markov Chain and Cheng models, written as linguistic values, can be seen in Table 3.

Fuzzy Logical Relationship Group (FLRG)
The FLRG is formed by grouping the fuzzy sets that have the same current state and collecting their next states into one group, as shown in Table 4, Table 5 and Table 6.

Current state | Next state | Total
… | A2, A5, A6, 2A12 | 5
A11 | A2, A6, A9 | 3
A12 | A1, A4, A5, A7, A9 | 5
A13 | A3, A7 | 3

Each method has a different way of forming the fuzzy logical relationship group (FLRG). The Chen method forms the FLRG from the distinct linguistic values that appear, whereas the Markov Chain method forms the FLRG from all linguistic values that occur, including repeated ones.
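The difference between the two grouping styles can be sketched in Python. The function names and the toy state sequence are hypothetical; the formatting helper mimics the notation used above, in which a repeated next state is written with its count in front (for example 2A12).

```python
from collections import Counter, defaultdict

def flrg_distinct(states):
    # Chen style: keep each next state once, however often it recurs.
    groups = defaultdict(set)
    for cur, nxt in zip(states, states[1:]):
        groups[cur].add(nxt)
    return {cur: sorted(nxts) for cur, nxts in groups.items()}

def flrg_counted(states):
    # Markov chain style: keep every occurrence, so repeats are counted.
    groups = defaultdict(Counter)
    for cur, nxt in zip(states, states[1:]):
        groups[cur][nxt] += 1
    return groups

def format_counted(group):
    # Render a counted group, prefixing repeated states with their count.
    parts = []
    for nxt in sorted(group):
        n = group[nxt]
        parts.append(f"{n}A{nxt}" if n > 1 else f"A{nxt}")
    return ", ".join(parts)

states = [1, 2, 1, 2, 1, 3]  # hypothetical fuzzified series A1, A2, A1, ...
print(flrg_distinct(states))
for cur, group in sorted(flrg_counted(states).items()):
    print(f"A{cur} -> {format_counted(group)}")
```

The counted groups carry the information needed for transition probabilities and weights, which the distinct groups discard.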

Defuzzification
Defuzzifying the forecast value takes two stages: first, finding the midpoint of each interval, and then calculating the forecast value with one of three defuzzification rules. The rule differs for each method, according to equations (5), (12) and (15). The defuzzification results from the FLRG are given in Table 7. In Figure 2, the actual data can be compared with the forecasts of the Chen, Markov Chain and Cheng models. For the Chen method, the forecast in each period does not differ much from the actual data. For the Markov Chain and Cheng methods, however, the difference between the actual and forecast values is fairly large. A possible cause is that the Chen method forms relatively few intervals, so the linguistic values vary little. In addition, the Chen forecast is not influenced by the previous data; it is determined only by the interval midpoints.
This differs from the Markov Chain and Cheng methods, where the forecast is influenced by the previous observations. Although the Markov Chain method has the same number of intervals as the Chen method, its forecast includes a correction factor in the form of an adjustment value. In the Cheng method, moreover, there are more intervals, so the linguistic values vary more. The forecast is therefore strongly influenced by the linguistic values that emerge and by the values observed in previous periods. Besides that, the Cheng method has a weighting value called the adaptive value, whose magnitude is between 0 and 1. In this study an adaptive value of 0.9 is used because it gives fairly good forecasts.

Measurement of the Accuracy of Forecasting Results
According to [8], a model performs very well if its MAPE is below 10% and performs well if its MAPE is between 10% and 20%. In this study the MAPE of the Chen model is 8.002%, while the MAPE values of the Markov Chain and Cheng models are 30.12% and 34.5%, respectively. The forecasting performance of the Chen FTS is therefore very good.
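For reference, the MAPE and the two performance bands quoted above can be computed as in the following sketch. The function names and sample numbers are hypothetical, and MAPE values above 20% are simply reported as outside the two bands, since the text does not name a category for them.

```python
def mape(actual, forecast):
    # Mean absolute percentage error in percent; actual values must be nonzero.
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)

def rate(mape_value):
    # Bands as quoted from [8]: below 10% very good, 10% to 20% good.
    if mape_value < 10:
        return "very good"
    if mape_value <= 20:
        return "good"
    return "outside the two bands"

print(mape([100, 200], [90, 220]))  # toy actual and forecast values
for value in (8.002, 30.12, 34.5):  # MAPE values reported in this study
    print(value, rate(value))
```

Because each error is divided by the actual value, months with very low rainfall can dominate the MAPE.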

Conclusion
In this study Chen's method gives MAPE = 8.002%, the Markov Chain method gives MAPE = 30.12% and Cheng's method gives MAPE = 34.5%. The best of the three methods for forecasting rainfall in Medan is therefore Chen's method. The three methods differ in the interval formation stage and in the stage where the forecast value is calculated.