Rainfall prediction over Ambon Meteorological Station using Multi-Physics Ensemble WRF-ARW

One way to produce a good forecast with the WRF-ARW model is to tune its parameterization. However, this method cannot provide the probability of a rainfall event. Current research has shown that the model is able to simulate and forecast several weather parameters, but verification showed that some of them still had low accuracy. Because of this low accuracy, the authors were interested in applying post-processing methods to forecasts of extreme weather at Pattimura Ambon Meteorological Station. In this study, we employed a multi-physics ensemble prediction system (MEPS) that combines 20 WRF-ARW parameterization schemas, which were processed to obtain the ensemble mean, ensemble spread, and basic probability, giving the uncertainty of each weather parameter. Verification was done using the spread and skill method and ROC curves. The MEPS products have better skill than the forecast control: their correlation is larger and their error is the lowest. In addition, the ROC curves show that the MEPS is able to predict the weather condition for the cloudy and extreme rain categories.

A previous study [1] assessed the accuracy of TAFOR at Pattimura Ambon Meteorological Station in April and June 2012. The accuracy of most weather parameters in TAFOR was sufficient to meet the verification standards of Instruksi Met/No.009/Verifikasi Prakiraan/1/88. However, two weather parameters, wind direction and weather condition, were still below the standard. The difficulty of forecasting these two parameters is an obstacle for forecasters at Pattimura Ambon Meteorological Station.
To meet public needs for weather forecasts, experts have developed many forecasting models through various approaches, one of which is the numerical weather model Weather Research and Forecasting (WRF) [2].
This model is expected to provide good forecasts of weather conditions. Because the model is highly sensitive to the initial conditions, accurate initial conditions are also needed to obtain a good forecast. However, the atmosphere does not behave linearly: as Edward Lorenz proposed in the 1960s, the Earth's atmosphere is chaotic. Data assimilation is therefore employed to improve the initial conditions, although improved initial conditions do not necessarily give a good forecast [3].
Lorenz, who first described the chaotic behaviour of the atmosphere in the 1960s, showed that uncertainty is present at every step of the forecast process. A complete forecast should therefore be accompanied by a description of its probability distribution and forecast uncertainty [4]. Accordingly, methods are needed that provide the probability of an event, one of which is the ensemble prediction system [5].
In 1992, the ensemble prediction model was introduced for the first time by the European Centre for Medium-Range Weather Forecasts (ECMWF) in Europe and by the National Centers for Environmental Prediction (NCEP) in the United States [6]. This kind of prediction model is now widely used operationally at meteorological centres, for example by the South African Weather Service (SAWS) in South Africa. The method used there has been adopted from the NCEP EFS since 2000, with several years of development. Because of its consistently beneficial results, the method is still in use for operational weather prediction [7].
A previous study [8] applied the ensemble method to the single prediction systems ANFIS, Wavelet ANFIS, Wavelet ARIMA, and ARIMA. The ensemble prediction of total monthly rainfall in Indramayu District gave more consistent results than the single prediction systems.
An ensemble prediction system (EPS) is a numerical weather prediction (NWP) system that allows us to forecast the possibilities and the uncertainty in a weather forecast. Some EPSs use more than one model, while others use the same model with different combinations of parameterization schemas. An EPS is designed to find out how likely it is that a particular result will occur [9].
Based on the issues above, this research was conducted to obtain better accuracy by using an ensemble prediction model. The ensemble is formed by combining several single prediction systems that result from different WRF-ARW parameterizations, called a multi-physics ensemble. The prediction system was built by combining the results of 20 WRF-ARW parameterization schemas to predict the probability of extreme weather conditions at Pattimura Ambon Meteorological Station. The results were then processed statistically to produce uncertainty values that can be used for operational needs. This probabilistic information can serve as the basis for early warnings of extreme weather conditions.

Data
The materials used in this study are:
a. Synoptic Data The data were taken from the Class II Pattimura Ambon Meteorological Station, located at 3.706° S and 128.089° E, in the form of surface (synoptic) observations from 14-19 July 2012, 31 July-5 August 2012, and 23 July-2 August 2013. These data were considered representative of Ambon Island.
b. GFS Data The Global Forecast System (GFS) data used in this study were run at 3 hours per cycle with a spatial resolution of 0.5° × 0.5°, and were downloaded from http://nomads.ncdc.noaa.gov/data/gfs for 14 July 2012, 31 July 2012, 23 July 2013, and 28 July 2013, each with a prediction length of 120 hours.

Domain
In this study, the Ambon Island region (Figure 1) was used as the research location, representing Pattimura Ambon Meteorological Station at 3.706° S and 128.089° E.
This was calculated as the standard deviation of the model output variable, which provides the level of uncertainty in the parameter.
c. Basic Probabilistic This displays the probability of an event or of a parameter value, derived from the ensemble members, at a grid point or any particular location.
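As a minimal sketch of how the three ensemble products could be computed for one weather parameter at one location, the snippet below assumes the member forecasts are already stacked in an array of shape (members, time steps). The function name `ensemble_products`, the array layout, and the event threshold are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np


def ensemble_products(members, threshold):
    """Return (mean, spread, probability) for each time step.

    members   : array-like, shape (n_members, n_times), one row per ensemble member
    threshold : event threshold (hypothetical, chosen by the user)
    """
    members = np.asarray(members, dtype=float)
    mean = members.mean(axis=0)    # ensemble mean
    spread = members.std(axis=0)   # ensemble spread: standard deviation across members
    # basic probability: fraction of members that reach or exceed the threshold
    probability = (members >= threshold).mean(axis=0)
    return mean, spread, probability


# Four hypothetical members and three time steps (illustrative numbers only)
demo = [[0.0, 2.0, 10.0],
        [1.0, 4.0, 20.0],
        [2.0, 6.0, 0.0],
        [1.0, 8.0, 30.0]]
mean, spread, probability = ensemble_products(demo, threshold=5.0)
print(mean, spread, probability)
```

With 20 members, a probability computed this way can only change in steps of 5%.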
3. The stage of verifying the spread, skill and ROC curve.
a. Spread and Skill The spread values of the anomaly correlation coefficient (ACC) and the root mean square error (RMSE) were calculated to see how the ensemble mean and ensemble spread vary relative to the forecast control. The skill values of the ACC and the RMSE were then calculated to find out the capability of the forecast control, ensemble mean, and ensemble spread relative to the observed values [11].
b. ROC Curve The Relative Operating Characteristics (ROC) curve was obtained by plotting the hit rate against the false alarm rate. The hit rate and false alarm rate were computed for a set of reference probability values used to decide whether or not an early warning is issued, based on the probability of occurrence. If the curve lies above the 1:1 line, the weather prediction system has skill or reliability in predicting; if the curve lies on or below the 1:1 line, the system does not have skill or reliability in predicting.
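The ROC construction described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function name `roc_points`, the input layout (one forecast probability and one observed yes/no event per time step), and the rule that a warning is issued when the probability reaches the threshold are all assumptions.

```python
import numpy as np


def roc_points(probabilities, observed, thresholds):
    """Return one (false_alarm_rate, hit_rate) pair per probability threshold."""
    p = np.asarray(probabilities, dtype=float)
    o = np.asarray(observed, dtype=bool)
    points = []
    for t in thresholds:
        warned = p >= t  # a warning is issued when the probability reaches t
        hits = int(np.sum(warned & o))
        misses = int(np.sum(~warned & o))
        false_alarms = int(np.sum(warned & ~o))
        correct_negatives = int(np.sum(~warned & ~o))
        n_events = hits + misses
        n_non_events = false_alarms + correct_negatives
        hit_rate = hits / n_events if n_events else float("nan")
        false_alarm_rate = false_alarms / n_non_events if n_non_events else float("nan")
        points.append((false_alarm_rate, hit_rate))
    return points


# Hypothetical probabilities and observed events (illustrative only)
print(roc_points([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0], thresholds=[0.2, 0.5]))
```

A point whose hit rate exceeds its false alarm rate lies above the 1:1 line.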

Result
After running WRF-ARW with 21 different schemas, the output was produced in .ctl form. The .ctl output was processed with GrADS to extract the weather parameter values, which were then processed further to determine the forecast control and the multi-physics ensemble prediction system (MEPS).

Determination of forecast control
Building the MEPS requires a forecast control, which is used to see whether perturbations of the parameterization give a forecast closer to the true value. The forecast control is the best schema configuration for Pattimura Ambon Meteorological Station. It was determined by testing 21 schemas, combining three cumulus schemas with seven microphysics schemas, on four extreme rain events. The test was verified with Taylor diagrams of air surface pressure (QFE), air surface temperature, air surface humidity, wind speed, and rainfall per three hours.
and V, and R schema has the least error value. Therefore, we can conclude that the R schema is the best in predicting air surface pressure (QFE).
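The three quantities read from a Taylor diagram (standard deviation, correlation, and RMS error) can be computed for one schema as in the sketch below. The function name `taylor_statistics` and the input layout are assumptions for illustration. The RMS value here is the centered RMS difference normally plotted on a Taylor diagram; the study does not state which RMSE form it used.

```python
import numpy as np


def taylor_statistics(forecast, observation):
    """Standard deviations, correlation, and centered RMS difference of two series."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observation, dtype=float)
    std_f = f.std()
    std_o = o.std()
    correlation = np.corrcoef(f, o)[0, 1]
    # centered RMS difference: the means are removed before differencing
    crmsd = np.sqrt(np.mean(((f - f.mean()) - (o - o.mean())) ** 2))
    return {
        "std_forecast": float(std_f),
        "std_observation": float(std_o),
        "correlation": float(correlation),
        "centered_rmsd": float(crmsd),
    }


# Hypothetical series (illustrative only)
stats = taylor_statistics([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(stats)
```

The three statistics are linked by crmsd² = std_f² + std_o² − 2·std_f·std_o·correlation, which is what allows them to share a single diagram.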

Air Surface Temperature
For air surface temperature, many schemas have a standard deviation quite far from the observed standard deviation, as seen in Figure 2.a. The observation has a value of about 1.25 to 1.3, and the schemas with the same standard deviation as the observation are L and O. Schemas I, J, K, L, M, N, and O have values between 1.2 and 1.35. The correlation values are low, between 0.4 and 0.6; schemas J, K, M, and N have correlations of 0.575 to 0.6, with N the highest. The RMSE values lie between 1.1 and 1.3. The F schema has the least error, 1.1, but a correlation of 0.55 and a standard deviation of 1. For the schemas with the best standard deviation and correlation (J, K, L, M, N, and O), the RMSE is between 1.15 and 1.2. Based on these three analyses, we can conclude that the best schema is the one with the standard deviation closest to the observed value, the largest correlation, and an RMSE of 1.15.
For air surface humidity, we can conclude that the D schema has the standard deviation closest to the observation and the least RMSE, although its correlation is not large, at around 0.375.
For wind speed at a height of 10 meters, we can conclude that the best schema is D, which has a standard deviation close to the observed standard deviation, a correlation of 0.425, and an RMSE of 3.

Rainfall
In Figure 2.d, the observed standard deviation of rainfall is 29, and all schemas have a standard deviation below 18. The F schema has the value closest to the observed standard deviation, at 25. One schema, the T schema, has a negative correlation, with a value of about -0. Therefore, based on the analysis above, we can conclude that the N schema has the greatest correlation and the least RMSE, although its standard deviation of 9 differs by 20 from the observed standard deviation.

The discussion of the best schema
The best schema to use as the forecast control was chosen by counting which schema configuration appears most frequently in the analysis results of the weather elements.
Based on the data presented in Table 4.3, every verification index gives different results for the weather elements. From the count, we can conclude that the N schema appears most often, both across the weather elements and across the verification indices. Meanwhile, the contingency table for rainfall showed that the best configuration is F, which is supported by the standard deviation index of the Taylor diagram.
The N configuration is therefore the best configuration for forecasting the weather elements in the area of Pattimura Ambon Meteorological Station, and it was used as the forecast control in building the MEPS.
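The frequency count used to pick the forecast control can be sketched as below. The function name `most_frequent_schema` and the labels X, Y, and Z are hypothetical placeholders; they are deliberately not the schema letters or the table values of the study.

```python
from collections import Counter


def most_frequent_schema(best_by_index):
    """best_by_index maps (weather element, verification index) to the best schema label."""
    counts = Counter(best_by_index.values())
    label, count = counts.most_common(1)[0]
    return label, count, counts


# Hypothetical winners, NOT the values of the study's table
example = {
    ("pressure", "standard deviation"): "X",
    ("pressure", "correlation"): "Y",
    ("temperature", "correlation"): "Y",
    ("temperature", "rmse"): "Y",
    ("rainfall", "standard deviation"): "Z",
}
label, count, counts = most_frequent_schema(example)
print(label, count)  # prints: Y 3
```

A simple count like this treats every weather element and verification index as equally important, which is why a schema that is best for rainfall alone can still lose the overall vote.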

Ensemble prediction
Prediction was done using the ensemble mean, ensemble spread, and basic probabilistic methods. The calculation used 20 ensemble members with the configurations stated in Table 3. The weather elements processed were air surface pressure (QFE), air surface temperature, air surface humidity, wind speed at 10 meters, and the weather condition per three hours.
Figure 3.a shows that on 1 August 2012 the spread was wider than on the following days. This indicates greater variation among the ensemble members. The mean of the standard deviation was 0.26 hPa. The ensemble mean did not predict the existence of low pressure; it only predicted that, for the five days ahead, the air pressure at Pattimura Ambon Meteorological Station would be under 1011.0 hPa.

Air Surface Temperature
In the prediction of air surface temperature, the ensemble mean and spread showed a decrease in temperature from 31 July to 1 August 2012 and an increase from 2 to 5 August 2012. As seen in Figure 3.b, the spread was larger than that of air surface pressure (QFE), with a mean standard deviation of 0.45 °C.

Air Surface Humidity
Air surface humidity is shown in Figure 3.c, in which the ensemble mean and spread show humidity rising above 90% from 31 July to 1 August 2012. The ensemble spread is larger for this parameter, with an average of 2.75%.

Wind Speed
Wind speed at a height of 10 meters is shown in Figure 3.d. The ensemble spread varies considerably, with an average of 1.90 knots. From 31 July to 1 August 2012 the standard deviation was 2 to 4 knots, after which it decreased. In addition, the ensemble mean showed an increase in wind speed from 31 July to 1 August 2012, and from 2 to 5 August 2012 the wind speed decreased to below 10 knots.

Weather Condition
To specify the weather condition, the probability of rainfall per three hours was calculated from the results of the 20 ensemble members. The rainfall intensity was divided into five categories.
The probabilistic value can be used to predict the weather condition by showing how likely each condition is to occur. The results of the probabilistic calculation show a potential for extreme rain on 1 August 2012, with a probability of 20% to 30%, as shown in Figure 4. From 2 to 4 August 2012, the probability of cloudy weather was over 50%.
The verification was done in two ways: the spread and skill method and ROC curves. The spread and skill method shows how the ensemble mean and the spread are distributed relative to the forecast control, and identifies the capability of the forecast control, ensemble mean, and ensemble spread. The ROC curves were used to identify the reliability of the probabilistic prediction of the weather condition.

Spread and skill
a) Air Surface Pressure Based on the spread values (Figure 5), the ensemble mean, ensemble spread max, and ensemble spread min have good ACC values and a fairly small RMSE. The skill values also show that the forecast control, ensemble mean, ensemble spread max, and ensemble spread min have a strong ACC, even though it becomes weaker as time goes by. Meanwhile, the RMSE shows an error of ± 2.5 hPa. The forecast control is close to the ensemble spread max.
b) Air Surface Temperature For air surface temperature (Figure 6), the spread values of the ensemble mean, ensemble spread max, and ensemble spread min show a strong ACC, although it decreased on the third day. Meanwhile, the RMSE shows an error of ± 2 °C. The skill values of the forecast control, ensemble mean, ensemble spread max, and ensemble spread min show a strong ACC, although it weakened on the second day. Based on the RMSE, the skill shows an error of ± 2.5 °C. The forecast control has a pattern quite close to that of the ensemble spread max.
c) Air Surface Humidity For air surface humidity (Figure 7), the spread values of the ensemble mean, ensemble spread max, and ensemble spread min show a moderate ACC with a fairly varied RMSE, and the ensemble spread min has a stable error of ± 6%. The skill values of the forecast control, ensemble mean, ensemble spread max, and ensemble spread min show a moderate to weak ACC. Meanwhile, the RMSE shows a fairly large error reaching 12%, which later decreased to below 8%. For this parameter, the forecast control has a pattern quite different from the ensemble mean, ensemble spread max, and ensemble spread min.
d) Wind Speed For wind speed at a height of 10 meters (Figure 8), the spread ACC of the ensemble mean, ensemble spread max, and ensemble spread min varies greatly, and it is difficult to determine the tendency of the correlation. Meanwhile, the RMSE shows a large error on the first two days, reaching 8 knots, and is stable under 4 knots on the fourth day. For skill, the forecast control, ensemble mean, ensemble spread max, and ensemble spread min have a weak ACC, and the RMSE has an error under 6 knots. The forecast control has a pattern almost the same as the ensemble spread max.
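The two verification scores used in this section can be sketched as follows. This is a hedged illustration: the function names `rmse` and `anomaly_correlation`, the need for a user-supplied climatology, and the uncentered form of the ACC are assumptions, since the text does not state the exact formulation. In the terminology of this paper, passing the forecast control as `reference` would give the spread values, and passing the observations would give the skill values.

```python
import numpy as np


def rmse(forecast, reference):
    """Root mean square error between a forecast series and a reference series."""
    f = np.asarray(forecast, dtype=float)
    r = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((f - r) ** 2)))


def anomaly_correlation(forecast, reference, climatology):
    """Uncentered anomaly correlation coefficient relative to a climatology."""
    c = np.asarray(climatology, dtype=float)
    f_anom = np.asarray(forecast, dtype=float) - c   # forecast anomaly
    r_anom = np.asarray(reference, dtype=float) - c  # reference anomaly
    denom = np.sqrt(np.sum(f_anom ** 2) * np.sum(r_anom ** 2))
    if denom == 0.0:
        return float("nan")  # undefined when either anomaly series is all zero
    return float(np.sum(f_anom * r_anom) / denom)


# Hypothetical series (illustrative only)
print(rmse([1.0, 2.0, 3.0], [2.0, 2.0, 5.0]))
print(anomaly_correlation([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```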

ROC Curves of Weather Condition
The weather condition was determined from the rainfall per three hours. Figure 9 shows that the basic probability prediction of weather condition is most reliable for the rainy condition with extreme rain and for the cloudy condition, compared with the other weather conditions. This is shown by the ROC curves, in which the curves for these conditions lie on the skill line.

Conclusions and discussion
Based on the results of this research, the verification with the Taylor diagram shows that the best configuration for predicting the weather parameters at Pattimura Ambon Meteorological Station is the Betts-Miller-Janjic cumulus schema with the WDM-5 Class microphysics schema, which was used as the forecast control.
MEPS products indicate that each weather parameter has its own variability pattern. Air pressure has a consistent pattern, with peaks at around 01:00 UTC and 13:00 UTC. Air temperature peaks at around 05:00 UTC, humidity at 23:00 UTC, and wind speed at around 05:00 UTC. The peaks of these three parameters depend on the weather conditions: when there is a significant weather event, the diurnal pattern is not visible. As shown in Figure 3.b, 3.c, and 3.d, the air temperature, humidity, and wind speed then tend to settle at a value with a large spread, whereas the air pressure fluctuates with its diurnal pattern.
The MEPS products have better skill than the forecast control: the correlation of the ensemble products is larger and their error is the lowest. Based on the spread and skill verification, each weather element has different ACC and RMS error values. For skill, air surface pressure has an average ACC between 0.5 and 1 with an RMS error of less than 3 hPa. Air temperature has an ACC that tends to weaken, with varied values under 0.5, and an RMS error of less than 3 °C. Air humidity also has a weak ACC, with varied values below 0.5, and an RMS error of less than 15%. Wind speed has a weak ACC and an RMS error of less than 8 knots.
For spread, air surface pressure has a strong ACC, close to one, and an RMS error of less than 2 hPa. Air temperature has an average ACC between 0.5 and 1 and an RMS error of less than 2 °C. Humidity has an ACC between 0.5 and 1 and an RMS error of less than 10%. Wind speed tends to have a weaker ACC, with varied values below 0.5, and an RMS error of less than 8 knots. The best ensemble products are the ensemble spread min and ensemble spread max, and the forecast control has values closest to the ensemble spread min.
The ROC curves show that the MEPS method is able to predict the weather condition: the curves for the cloudy and extreme rain conditions lie on the skill line.