Determination of landslide hazardous map using logistic regression in Lima Puluh Kota Regency

Areas prone to landslides can be delineated from their causal factors using approaches that range from simple (heuristic) through statistical to deterministic. Statistical approaches themselves comprise various methods, one of which is logistic regression. This study uses logistic regression to determine the landslide hazard areas in Lima Puluh Kota Regency, with landslide events as the dependent variable and landslide occurrence factors as independent variables. The study identified areas of high, medium, and low hazard. The resulting map can be used as a guide at a scale of 1:50,000.


Introduction
Landslides are disasters that result from disturbances of equilibrium that cause soil mass movements. Factors causing landslides include land use, geological structure, lithology, soil type, rainfall, and seismicity. Lima Puluh Kota Regency has a high potential for disasters because of its geological and geographical conditions. Faults cause ground movement in the regency. Like the West Sumatra region in general, the regency lies under the influence of the Great Fault system, namely the Semangko Fault system. Landslides generally occur in hilly areas with cliffs and along river banks, especially in cut-slope areas such as areas with a slope of more than 15%, undulating hills, steep hills, and foothills. Such terrain covers most of Lima Puluh Kota Regency (BPBD, 2021).
Landslides commonly disrupt public infrastructure, such as roads flanked on both sides by community plantations. On the Payakumbuh-Suliki-Koto Tinggi route, the landslide-prone area is in the gorge of a lot of fish and the jorong of the Mangkirai river in Kenagarian Pandam Gadang. On the Payakumbuh-Suliki-Baruah Gunuang route, the landslide-prone stretch of road is at Jorong Lancaran. A lack of community understanding and awareness of how vulnerable the area is to landslides leaves the community less prepared to anticipate disasters, so that the impact is greater when a landslide occurs.
A landslide is the movement of soil or rock masses, or a mixture of both, down or out of a slope due to disruption of its stability. Landslide hazard depends strongly on slope: steep slopes have a higher potential for landslides than gentler ones.
Areas prone to landslides can be delineated from their causal factors using approaches that range from simple (heuristic) through statistical to deterministic. Applying statistics and GIS introduces a new dimension to landslide research. The approaches available include heuristic, probabilistic, quantitative, deterministic, and multi-criteria approaches, but not all of them are effective for landslide disaster assessment. The statistical approach is better suited to large-scale landslide research (Soeters & Westen, 1984).
This study used a statistical approach in the form of logistic regression, the most significant method for determining the relationship between landslides and the geographic data on their causes (Mancini, Ceppi, & Ritrovato, 2010). Determining all the factors that cause landslides is an important task when building a linear regression model that is transferred to a GIS environment (raster value domain). The regression coefficient is then derived for each category using SPSS (Mandal & Mondal, 2018). This approach has been applied successfully in several studies, such as Dai & Fan Lee (2002), Lee & Pradhan (2007), and Guzzetti, Carrara, Cardinali, & Reichenbach (1995).
This study uses logistic regression to determine the hazardous landslide area in Lima Puluh Kota Regency with landslide events as the dependent variable and landslide occurrence factors as independent variables.

Rainfall
Rainfall and snowmelt patterns, storm intensity and duration, and soil moisture replenishment during the rainy season directly affect the occurrence of landslides. Strong winds can also increase the load on trees and contribute to slope failure. On the other hand, higher temperatures increase wind speed, and lower relative humidity leads to soil drying and increased slope stability (Forbes et al., 2012). Lima Puluh Kota Regency is a disaster-prone area, particularly when rainfall is high (Annisa, Sutikno, & Rinaldi, 2015).

Geology
Factors causing landslides are classified into two groups: driving and triggering (Cruden & Varnes, 1996). Geology is one of the factors that can act as a trigger for landslides, depending on the constituent lithology and the geological structure of the region.

Soil
Soil structure contributes to landslides: where the soil grains are coarse and poorly bound, water cannot drain vertically but instead enters through the gaps between the grains, which creates a high potential for landslides (Sobirin, Sitanala, & Ramadhan, 2017).

Geomorphology
The shape of the earth's surface, which controls the flow source, flow direction, and concentration of soil moisture, is an essential factor in the occurrence of landslides. The geomorphological information used is elevation, slope, aspect, and curvature. Slope gradient is the main topographic factor in landslide failure. Steep slopes are generally prone to landslides, although geological and climatic factors can make slopes more susceptible to collapse. The steeper the slope, the higher the potential for landslides in an area, and vice versa (Hermansyah, Supriatna, & Wibowo, 2015). Slope direction (aspect) is also a factor that triggers landslides; for example, slopes facing a certain direction can experience more severe storms, making them more susceptible to ground movement.

Landuse and Distance from Road
Human interventions on slopes, such as settlement construction, road building, and removal of forest cover, can alter the steepness of the slopes and the groundwater conditions, leading to slope instability (Swanson & Dyrness, 1975).

Data Processing
The landslide occurrence points were split into 20% test data and 80% training data; in addition, random points equal in number to the test points were generated. The training and random data were then overlaid on the landslide causal factors using Extract Multi Values to Points and subsequently processed by logistic regression in SPSS. For nominal data types, the values were converted to ratio data using the landslide density:

$$\mathrm{Density}_i = \frac{B_i / A_i}{\sum_{i=1}^{N} \left( B_i / A_i \right)}$$

where $B_i$ is the landslide occurrence in the $i$-th class of a given parameter, $A_i$ is the area of the $i$-th class of that parameter, and $N$ is the number of classes in that parameter (Zhu & Huang, 2006).
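The density conversion described above can be illustrated with a minimal Python sketch. The function name `landslide_density` and the counts and areas are hypothetical values chosen for illustration; they are not data from this study.

```python
def landslide_density(landslide_counts, class_areas):
    """Return the normalised density (B_i / A_i) / sum(B_i / A_i) for each class."""
    if len(landslide_counts) != len(class_areas):
        raise ValueError("counts and areas must have the same length")
    # Raw landslide frequency per unit area for every class of the parameter.
    ratios = [b / a for b, a in zip(landslide_counts, class_areas)]
    total = sum(ratios)
    if total == 0:
        raise ValueError("no landslides recorded in any class")
    # Dividing by the total makes the densities of one parameter sum to 1.
    return [r / total for r in ratios]


# Hypothetical example: three classes of one nominal parameter.
counts = [12, 3, 5]               # B_i, landslide occurrences per class (assumed)
areas_km2 = [40.0, 60.0, 100.0]   # A_i, class areas (assumed)
densities = landslide_density(counts, areas_km2)
print(densities)
```

Because the densities are normalised within each parameter, classes of different parameters become comparable ratio values that can enter the regression directly.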
The values in column B of the Variables in the Equation table are entered as the coefficients of the logistic regression model, which is evaluated using the Raster Calculator tool. Logistic regression works with all types of data to estimate the probability of landslide hazard. The resulting probability ranges from 0 to 1 and is then classified using thresholds (Naufal & Susetyo, 2020). The classes are low (0-25%), medium (50-75%), and high (75-100%). The predicted landslide susceptibility was then tested against the test data to determine the accuracy of the prediction (Zhu & Huang, 2006).
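How a fitted logistic model turns factor values into a probability and a hazard class can be sketched as follows. This is a minimal illustration for a single raster cell, not the study's Raster Calculator expression: the intercept, the coefficients, the factor values, and the class breaks (0.25 and 0.75) are all assumed values introduced here.

```python
import math


def landslide_probability(intercept, coefficients, factor_values):
    """Logistic model: p = 1 / (1 + exp(-z)), with z = b0 + sum(b_k * x_k)."""
    z = intercept + sum(coefficients[name] * factor_values[name]
                        for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


def hazard_class(probability, low_max=0.25, high_min=0.75):
    """Classify a probability; the two class breaks are assumed defaults."""
    if probability < low_max:
        return "low"
    if probability >= high_min:
        return "high"
    return "medium"


# Hypothetical coefficients (column B) and factor values for one cell.
coefficients = {"slope": 2.0, "geology": 1.5}
cell = {"slope": 0.5, "geology": 0.4}
p = landslide_probability(-1.6, coefficients, cell)
print(p, hazard_class(p))
```

In a GIS workflow the same expression is evaluated once per raster cell, with each factor raster supplying the corresponding value.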

Results and Discussion
At the data processing stage, logistic regression was run stepwise: significant variables were added to the model and non-significant ones were removed. The SPSS output provides the following information. The model is acceptable because the significance value of the Hosmer-Lemeshow test, 0.753 in this study, is greater than 0.05.
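The reported significance can be reproduced from the test statistic. The sketch below assumes the Hosmer-Lemeshow statistic follows a chi-square distribution with the stated degrees of freedom and uses the closed-form upper-tail probability that holds for even degrees of freedom; the function name `chi2_sf_even_df` is introduced here for illustration.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with even df.

    For df = 2k the tail is exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0    # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Chi-square = 5.047 with df = 8, as reported for step 6.
p_value = chi2_sf_even_df(5.047, 8)
print(round(p_value, 3))
```

The result is approximately 0.75, consistent with the reported significance of .753 and well above the 0.05 threshold.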

Table 1. Hosmer and Lemeshow Test
Step    Chi-square    df    Sig.
6       5.047         8     .753

The Cox and Snell R Square value is 0.406, which means that 40.6 percent of landslide events in Lima Puluh Kota Regency can be explained by the independent variables used, while the Nagelkerke value of 0.541 means that 54.1 percent can be explained by them. The logistic regression shows that the variables affecting the occurrence of landslides are geology, land use, soil type, slope direction, slope, and distance from the road. A logistic regression model was then generated and used to predict landslide susceptibility from the processed data in ArcGIS software.

Applying the spatial model with the Raster Calculator divides landslide susceptibility in Lima Puluh Kota Regency into three classes: low, medium, and high. Landslides mostly occur on slopes of 0-30% and on the geological units Qpt (tuff) and Qal. This is in line with the results of the vulnerability modeling, which shows that landslides often occur on similar slopes (0-30%) and on the Qpt geological unit. The modeling also shows that the dominant susceptibility classes in the regency are low vulnerability (87.3%) and moderate vulnerability (12.07%). The sub-districts with highly disaster-prone areas are Situjau Limo Nagari and Luak.
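As a consistency check on the two pseudo-R² values reported in this section, the Nagelkerke value can be derived from the Cox and Snell value by dividing by its maximum attainable value. The sketch below assumes a balanced sample, that is, equal numbers of landslide and non-landslide points; this balance is an assumption made here for illustration and is not stated in these terms in the study.

```python
def nagelkerke_from_cox_snell(cox_snell_r2, event_fraction):
    """Rescale Cox and Snell R2 by its maximum, 1 - L0^(2/n).

    For an intercept-only model with event fraction p, the null likelihood
    satisfies L0^(2/n) = p^(2p) * (1 - p)^(2(1 - p)).
    """
    p = event_fraction
    if not 0.0 < p < 1.0:
        raise ValueError("event_fraction must lie strictly between 0 and 1")
    null_term = (p ** (2 * p)) * ((1 - p) ** (2 * (1 - p)))
    max_r2 = 1.0 - null_term
    return cox_snell_r2 / max_r2


# Cox and Snell R2 = 0.406 as reported; event fraction 0.5 is assumed.
nagelkerke = nagelkerke_from_cox_snell(0.406, 0.5)
print(round(nagelkerke, 3))
```

Under the balanced-sample assumption the maximum Cox and Snell value is 0.75, and the rescaled result agrees with the reported Nagelkerke value of 0.541.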

Figure 11. Landslide Hazardous Map
An AUC of 0.975 for the ROC curve indicates that the model is highly valid, meaning that the accuracy of the prediction is very high. This landslide susceptibility map can therefore be used at a scale of 1:50,000.
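The AUC can be read as the probability that a randomly chosen landslide point receives a higher predicted probability than a randomly chosen non-landslide point. A minimal sketch of that pairwise definition is given below; the labels and scores are hypothetical and do not reproduce the study's test data.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical test points: 1 = landslide, 0 = random non-landslide point.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value close to 1 indicates that landslide points are almost always ranked above non-landslide points.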

Conclusion
The hazard map generated with the logistic regression method shows that most of Lima Puluh Kota Regency falls in the low hazard category. However, areas in the medium class, which are generally located in built-up areas, should still be monitored. Likewise, areas with a high level of hazard lie mostly in built-up areas. Mitigation measures are therefore necessary, especially in high and medium hazard areas, and should involve all stakeholders so that landslide events can be anticipated early.

Figure 1. Research Area Map

Table 2. R Square

Table 3. Variables in the Equation