Performance of high-resolution satellite rainfall datasets in developing rainfall-duration threshold for landslide incidents over Badung Regency

Satellite-based rainfall datasets provide high-resolution rainfall information worldwide, which has potential use in identifying the rainfall conditions that trigger landslides. Landslides can be forecast with rainfall thresholds, which are used in early warning systems. The threshold model needs to be validated to determine its accuracy in forecasting landslide occurrences triggered by rainfall events. The objective of this study is to evaluate the ability of three high-resolution satellite-based rainfall datasets (IMERG, GSMaP, and PERSIANN) to develop a rainfall threshold model for landslide occurrences in Badung Regency. The study used cumulative rainfall over 1, 3, 5, 7, 10, 15, 21, and 30 days leading up to the landslide incidents. Rainfall threshold values were determined from statistical location measures, namely the first (Q1), second (Q2), and third (Q3) quartiles. The thresholds were validated using receiver operating characteristic (ROC) curves and the area under the curve (AUC). The results show that the first quartile (Q1) had the best accuracy and gives a good estimate of landslide occurrence. Moreover, among all cumulative rainfall periods, the 15-day cumulative rainfall has the highest AUC value (> 0.75), implying a greater likelihood of triggering landslide events over Badung Regency.


Introduction
Landslides are natural disasters that often occur in tropical countries, including Indonesia [1]. One of the triggers for landslides is very high rainfall [2], which can make soil conditions unstable and cause slope collapse [3]. Landslides often occur in areas with complex topography, one of which is Bali Province. This is supported by records of landslide disasters covering an area of 3,354.4 ha; of the total disaster cases that occurred in Bali Province, 250 occurred in Badung Regency [4]. Landslide-prone areas in Badung include the Petang, Abiansemal, Mengwi, North Kuta, and South Kuta sub-districts [5]. In terms of topography, Badung Regency has steep slopes, and most of its soil is classified as Inceptisols derived from intermediate volcanic ash and tuff. This weathered soil is easily eroded when it lies above impermeable rock on hills or ridges with moderate to steep slopes. These conditions can trigger landslides during periods of abundant rainfall throughout the wet season. Landslides cause loss of life, material losses, and environmental damage, so mitigation through an early warning system is needed to reduce the loss of life [6]. One such approach is applying rainfall thresholds in the early warning system.
Rainfall that triggers landslides can be forecast with rain threshold models incorporated into early warning systems. Rain threshold modeling assesses both rain intensity and duration, using satellite-derived rainfall data. Each threshold model differs in how precisely it anticipates landslide occurrences, so evaluating these models is standard practice, commonly with ROC (Receiver Operating Characteristic) analysis. The method uses statistical indices and ROC curves as indicators of the precision of the rain threshold model [7]. ROC analysis has the advantage of giving more precise outcomes by integrating a contingency table and computing the Area Under Curve (AUC) [3], [7]. In this study, ROC analysis is used to determine the accuracy of the rain threshold model in predicting rainfall-induced landslide events.
Satellite-derived rainfall data is an alternative with the potential to yield more precise and contextually relevant information [8]. Satellite rainfall datasets frequently used for landslide event analysis and prediction include the Tropical Rainfall Measuring Mission (TRMM), Global Satellite Mapping of Precipitation (GSMaP), Global Precipitation Measurement - Integrated Multi-satellitE Retrievals for GPM (GPM-IMERG), Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS), Climate Prediction Center Morphing Method (CMORPH), and Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN) [1], [9]-[14]. Considerable research has been conducted on satellite-derived rainfall data such as IMERG. IMERG correlates strongly with observational data on a monthly scale, but only moderately on a daily scale. An evaluation of IMERG in Surabaya indicates a strong correlation during the rainy and dry seasons, which weakens during the transitional periods between them [15]. In 14 Indonesian cities, GSMaP testing shows a solid correlation for monthly rainfall patterns, with values between 0.82 and 0.92 and errors below 100 mm/month [10], [11]. Moreover, validation of GSMaP-MVK (Moving Vector with Kalman filter) across several Indonesian islands yields a strong correlation coefficient during the rainy season, with Kalimantan Island showing the most promising Probability of Detection (POD) and False Alarm Ratio (FAR) values [16]. Research on the application of satellite rainfall products within Bali Province is scant; it covers TRMM, CMORPH, PERSIANN, IMERG, GSMaP, and CHIRPS [17]-[20]. One of these studies reports that IMERG and PERSIANN perform better than GSMaP in measuring average rainfall intensity. Cumulative rainfall estimated by PERSIANN correlates robustly with rainfall stations, whereas IMERG and GSMaP show instances of overestimation and underestimation. Notably, IMERG shows the smallest deviation among the datasets when gauging the accumulated rainfall that triggers landslide events [21]. Further analysis shows that IMERG performs well across daily, decadal, and seasonal time scales, while GSMaP displays a negative bias across all observed scales (daily, decadal, monthly, and seasonal) [17]. Collectively, prior studies show that the performance of satellite-based rainfall products varies with the conditions prevailing in each region.
Rain threshold-based early warning systems are widely used. An important component of these systems is the availability of rainfall forecasts [22]. Most slope collapses and landslides are triggered by rainfall. A number of researchers have tried to establish rainfall thresholds that accurately predict slope collapse or landslides using the parameters of average rainfall, duration of the rainfall event, a ratio of rainfall to daily rainfall, previous rainfall to annual average rainfall, and daily rainfall to maximum previous rainfall ratio [2], [23]-[28]. The use of satellite rainfall products to determine the rainfall threshold for landslide occurrence is still rare, especially in Bali Province [21]. Previous researchers have analyzed many landslide-triggering rainfall events to determine rainfall threshold values using daily, dasarian (10-day), and monthly rainfall data [6]. However, previous studies have not analyzed rainfall thresholds based on variations in rainfall accumulation or on different statistical location measures. Therefore, the current study determines rainfall thresholds using variations in rainfall accumulation as well as variations in statistical location measures. In addition, research on rainfall thresholds that trigger landslides has never been conducted in Badung Regency. This study is expected to clarify the performance of high-resolution satellite rainfall datasets so that they can be used as an alternative in analyzing the rainfall threshold that triggers landslides and can be applied in establishing an early warning system in Badung Regency.

Study Area
This research was conducted in Badung Regency, Bali. Badung Regency has an area of 418.52 km², or about 7.43% of the total area of Bali Province (Figure 1). The geology of Badung Regency consists mostly of young volcanic products: volcanic breccia, passive tuff, and lava deposits. Most of the soils in Badung Regency are classified as Inceptisols derived from intermediate volcanic ash and tuff. In terms of topography, the slope of Badung Regency is grouped into 7 (seven) classes: slope 0-3% is a flat area, slope > 3-5% is a gentle area, slope > 5-10% is an undulating hilly area, slope > 10-15% is a slightly sloping area, slope > 15-30% is a sloping area, and slope > 30-70% is a very steep area. The slope increases toward the north [29].
1311 (2024) 012060 IOP Publishing doi:10.1088/1755-1315/1311/1/012060

Data
Landslide Events
The data used in this research are landslide events for the period 2015-2022. The landslide event data include the location of the event, the date of the event, the coordinates of the event location, the area affected, and the level of loss. The data were obtained from the report of the Regional Disaster Management Agency (BPBD) of Badung Regency.

Figure 2. Number of landslide events per district
Figure 2 shows that Petang district has the highest number of landslide events (180), because Petang, located in the northernmost part of Badung Regency, has areas with slopes above 45% (very steep). It is followed by Mengwi district with 64 landslide events and Abiansemal district with 51 events. The other districts, especially Kuta, North Kuta, and South Kuta, are dominated by gently sloping areas (0-8% slope), and the incidence of landslides there is the lowest in the regency.

Rainfall Data
This research used 3 (three) satellite rainfall products, namely GSMaP, IMERG, and PERSIANN, to determine rainfall conditions before landslide occurrences. These products were chosen for their high spatial and temporal resolution. GSMaP provides rainfall data across Indonesia with a spatial resolution of 0.1° x 0.1°, equivalent to approximately 11.06 x 11.06 km [30]. Hourly GSMaP data can be accessed from the JAXA website at https://sharaku.eorc.jaxa.jp/GSMaP/index.htm. IMERG data have a spatial resolution of 0.1° x 0.1° and a temporal resolution of half an hour, and can be downloaded from https://giovanni.gsfc.nasa.gov/giovanni/ [31]. This study uses the PERSIANN-Cloud Classification System (CCS), which estimates global rainfall with a spatial resolution of 0.04° (approximately 4 x 4 km). The PERSIANN satellite rainfall data were obtained from the PERSIANN website http://chrsdata.eng.uci.edu [32].

Method
Determination of Rainfall Thresholds
Thresholds are used to determine both the critical rainfall amount at which landslides take place and the minimum duration that initiates them. In this study, the relationship between cumulative rainfall at various time scales and landslide events is examined with scatter plots. Cumulative rainfall was calculated over periods of 1, 3, 5, 7, 10, 15, 21, and 30 days before the time of landslide occurrence. To determine the rainfall threshold values, this study uses statistical location measures, namely the first (Q1), second (Q2), and third (Q3) quartiles.
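The threshold procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice to include the landslide day in each accumulation window, and the "inclusive" quartile interpolation are all assumptions, since the text does not specify them. The input is assumed to be a list of daily rainfall totals (mm) for one grid cell and the list positions of the landslide days.

```python
from statistics import quantiles

# Accumulation windows (days) listed in the text.
WINDOWS = (1, 3, 5, 7, 10, 15, 21, 30)


def cumulative_rainfall(daily_mm, event_index, window):
    """Sum the `window` daily totals ending on the landslide day.

    Assumption: the landslide day itself is counted in the window.
    """
    start = event_index - window + 1
    if start < 0:
        raise ValueError("not enough rainfall history before the event")
    return sum(daily_mm[start:event_index + 1])


def quartile_thresholds(values):
    """Return (Q1, Q2, Q3) of the cumulative rainfall values.

    Assumption: 'inclusive' interpolation; the source does not name a method.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


def thresholds_by_window(daily_mm, event_indices, windows=WINDOWS):
    """Map each window length to its (Q1, Q2, Q3) thresholds in mm."""
    result = {}
    for window in windows:
        totals = [cumulative_rainfall(daily_mm, i, window) for i in event_indices]
        result[window] = quartile_thresholds(totals)
    return result
```

Calling `thresholds_by_window(daily_mm, event_indices)` for each satellite product would give one set of Q1, Q2, and Q3 candidates per accumulation window, which can then be compared in the validation step.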

Performance Analysis of Rainfall Thresholds
This study uses ROC analysis to determine how accurately the rainfall threshold model distinguishes rainfall events that trigger landslides from those that do not. The ROC curve is built from two indices, the true positive rate and the false positive rate (Table 1 and Table 2). The area beneath the curve expresses the accuracy of the empirical model and is computed with the Area Under Curve (AUC) method. The AUC is a portion of the unit square, so its value always lies between 0 and 1 [14]. The classification of AUC levels can be seen in Table 3 [14]. Threshold performance is calculated from a confusion matrix that compares actual landslide events with predicted landslide events, which results in four possible conditions (Table 1). A true positive occurs when rainfall triggers a landslide in both the actual event and the predicted event (1,1). A true negative occurs when rainfall does not trigger a landslide in either the actual or the predicted event (0,0). A false positive occurs when rainfall does not trigger a landslide in the actual event but the prediction indicates that it does (0,1). A false negative occurs when rainfall triggers a landslide in the actual event but the prediction indicates that it does not (1,0) [14].

AUC value            Description
0.5 < AUC ≤ 0.6      Limited differentiation
0.6 < AUC ≤ 0.7      Satisfactory differentiation
0.7 < AUC ≤ 0.8      Superior differentiation
0.9 < AUC            Exceptional differentiation
Source: [14]
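The four confusion-matrix conditions and the two ROC indices can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's own code: the function names are hypothetical, a rainfall event is assumed to be predicted as landslide-triggering when its cumulative rainfall is at or above the threshold, and the AUC is taken as the area under the two-segment ROC polyline through the single operating point of one threshold, which is one common convention; the text does not state how the AUC was computed.

```python
def confusion_counts(rain_mm, landslide, threshold_mm):
    """Count (TP, FP, FN, TN) for the rule 'landslide if rain >= threshold'.

    rain_mm   : cumulative rainfall of each rainfall event (mm)
    landslide : 1 if the event actually triggered a landslide, else 0
    Assumption: '>=' (not '>') defines a predicted landslide.
    """
    tp = fp = fn = tn = 0
    for rain, actual in zip(rain_mm, landslide):
        predicted = rain >= threshold_mm
        if actual and predicted:
            tp += 1      # (1,1)
        elif not actual and predicted:
            fp += 1      # (0,1)
        elif actual and not predicted:
            fn += 1      # (1,0)
        else:
            tn += 1      # (0,0)
    return tp, fp, fn, tn


def rates(tp, fp, fn, tn):
    """Return (true positive rate, false positive rate)."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


def single_point_auc(tpr, fpr):
    """Trapezoidal area under the polyline (0,0) -> (fpr,tpr) -> (1,1)."""
    return (1.0 + tpr - fpr) / 2.0
```

Under this convention, a threshold whose point lies on the diagonal (TPR equal to FPR) has an AUC of 0.5, and points closer to the upper left corner of the ROC plot give values closer to 1.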

Rainfall Threshold Results
The relationship between landslide occurrences and cumulative rainfall conditions varies considerably. Quartiles are useful here because they effectively describe the variability of scattered data, regardless of the number of data points [33]. The rainfall thresholds for landslide occurrences were determined using the first quartile (Q1), second quartile (Q2), and third quartile (Q3). The results show that the rainfall threshold values increase across the cumulative rainfall periods (1, 3, 5, 7, 10, 15, 21, and 30 days) for all three satellite rainfall products (PERSIANN, IMERG, and GSMaP). The highest threshold values were observed for the third quartile (Q3), followed by the second quartile (Q2) and then the first quartile (Q1). The Q3 thresholds were 437.00 mm, 413.50 mm, and 406.25 mm for PERSIANN, IMERG, and GSMaP, respectively. The Q2 thresholds were 285.50 mm, 284.71 mm, and 275.26 mm for PERSIANN, IMERG, and GSMaP, respectively. Finally, the lowest thresholds, represented by Q1, were 205.00 mm, 208.31 mm, and 208.06 mm for PERSIANN, IMERG, and GSMaP, respectively. Table 4 provides a summary of the rainfall threshold values for each quartile under various cumulative rainfall conditions.
Landslide thresholds obtained from the cumulative rainfall range from 4 mm to more than 400 mm. This emphasizes that the determination of landslide thresholds is significantly influenced by the geographical location, the prevailing climate, and the methodology used to establish the threshold criteria [8]. Elevated terrain with steep inclines and lowlands with relatively gentle gradients require different levels of rainfall intensity to trigger landslides, which leads to different rainfall thresholds. Furthermore, when determining the landslide threshold for a particular area, it is necessary to account for differences in season, climate pattern, land cover, and soil conditions, especially when making comparisons with other locations. Consequently, even if the areas under examination are identical, these varying conditions give rise to distinct threshold values.

Threshold Performance Analysis
Based on 316 landslide events spread across Badung Regency, the numbers of true positives (TP, rainfall events that caused landslides), true negatives (TN, rainfall events with no landslide), and false positives (FP) were obtained. The ROC curve shows that the accuracy of the various cumulative rainfall periods (1, 3, 5, 7, 10, 15, 21, and 30 days) from the three satellite rainfall products (IMERG, GSMaP, and PERSIANN) is quite good, because the results lie above the diagonal line (Figure 3). Figure 3 illustrates that the Q1 approach has the best performance in determining the rainfall threshold for landslide occurrences over Badung Regency at the various cumulative rainfall levels. This is evident from the position of Q1 on the ROC curve, which is much closer to the upper left corner.
The AUC of the rainfall threshold indicates the level of accuracy in distinguishing landslide-triggering from non-landslide-triggering rainfall events. Among the cumulative rainfall periods of 1, 3, 5, 7, 10, 15, 21, and 30 days for the three satellite rainfall products (IMERG, GSMaP, and PERSIANN), the 15-day rainfall gives the best performance. Based on the AUC obtained from the ROC curve, the rainfall threshold has fairly good accuracy, where the results obtained for each satellite rainfall product are

Table 3. AUC value classification

Table 4. Results of Rainfall Threshold Value