Spatio-Temporal Mapping and Monitoring of Mangrove Forests Changes from 1991 to 2021 in Panay Island, Philippines using Machine Learning

Mangrove forests play a crucial role in social, economic, and ecological activities. Despite this importance, they are constantly threatened by reclamation, deforestation, and climate change. To advance conservation and restoration efforts, accurate and cost-effective mangrove mapping and monitoring are needed. This paper explores the use of a supervised learning algorithm called Random Forest (RF) in mapping mangrove extent in Panay Island, Philippines from 1991 to 2021. Using land cover data from Landsat, maps of the mangrove extent from 1991 to 2021 were developed. Results revealed an 8% decline from 1991 to 1996; a 24% decline from 1996 to 2001; a 6% increase from 2001 to 2006; a 21% decline from 2006 to 2011; a 17% increase from 2011 to 2016; and a 16% increase from 2016 to 2021. Over the past three (3) decades, the island has lost 20% of its mangrove forests, from 31713 ha in 1991 to only 25313 ha in 2021. The model was evaluated using a confusion matrix and showed a specificity, sensitivity, and AUC (Area Under the ROC Curve) above 70%. This suggests that machine learning, when integrated with remote sensing, can provide an effective yet low-cost approach to mapping mangrove extent at a large scale.


1. Introduction
Mangroves are salt-tolerant trees found in the intertidal zones of tropical and subtropical coastlines [1]. These forests play a crucial role for many coastal communities, serving as breeding grounds for fish, reducing wave action during typhoons, and ultimately combating climate change by storing blue carbon [2]. In the Philippines, mangrove cover fell from half a million ha in 1918 to only about 120,000 ha in 1994, and mangrove conservation in the country faces many institutional issues such as the promotion of development through aquaculture, low economic valuation of mangroves, bureaucracy, corruption, and lack of political will [3] [4]. Although data on mangrove extent exist for some areas of Panay Island, updated and accurate information on large-scale mangrove cover change over time is critically lacking [5]. To address this urgent need to understand the status of mangroves and the drivers of their forest loss and gain, rapid and accurate extent mapping must be done [6]. By harnessing satellite technology as well as artificial intelligence, these ecosystems can be managed more effectively. Remote sensing and GIS provide a way to quantify the historical distribution of mangroves, helping researchers and policymakers identify key areas for rehabilitation [7].

Panay Island (Figure 1) is the sixth-largest island in the country, with a total land area of 1,239,400 ha [8]. The island is divided into four (4) provinces: Aklan, Antique, Capiz, and Iloilo, all part of Western Visayas. It has a coastline length of 884.01 kilometers (549.30 miles). The province of Capiz, the northern part of Iloilo, Aklan, and the northern portion of Antique were severely hit by Typhoon Haiyan in 2013, which destroyed homes and livelihoods worth millions [9].

Data Pre-processing
The study spans data from 1991 to 2021 and utilized datasets from Landsat 5, 7, and 8 retrieved from the United States Geological Survey (USGS) website.
After setting up the area of interest, the Landsat data were masked to filter out clouds. This step is essential for classifying mangroves in cloudy regions. Landsat contains a 'pixel_qa' band which can be used to mask clouds. The spectral indices NDVI, NDMI, MNDWI, SR, Ratio 54 and Ratio 35, and GCVI were added to quantify vegetation, vegetation water content, water information, the simple vegetation ratio, water features, and green leaf biomass, respectively [10] [11].
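The masking and index steps above can be sketched per pixel as follows. This is a minimal illustration, not the authors' processing script: the helper names are hypothetical, the 'pixel_qa' bit positions (3 for cloud shadow, 5 for cloud) are assumed from the Landsat Collection 1 surface reflectance convention, and Ratio 54 and Ratio 35 are assumed to use Landsat 5 band numbering (band 5 = SWIR1, band 4 = NIR, band 3 = red).

```python
# Per-pixel sketch of cloud masking and spectral indices (hypothetical helpers).

CLOUD_SHADOW_BIT = 3  # assumption: 'pixel_qa' bit flagging cloud shadow
CLOUD_BIT = 5         # assumption: 'pixel_qa' bit flagging cloud


def is_clear(pixel_qa: int) -> bool:
    """Return True when neither the cloud nor the cloud-shadow bit is set."""
    flagged = (1 << CLOUD_SHADOW_BIT) | (1 << CLOUD_BIT)
    return (pixel_qa & flagged) == 0


def safe_ratio(a: float, b: float) -> float:
    """a / b, returning 0.0 when b is zero to avoid a division error."""
    return a / b if b != 0 else 0.0


def normalized_difference(a: float, b: float) -> float:
    """(a - b) / (a + b), returning 0.0 when the denominator is zero."""
    return safe_ratio(a - b, a + b)


def spectral_indices(green: float, red: float, nir: float, swir1: float) -> dict:
    """Compute the indices named in the text from surface reflectance values."""
    return {
        "NDVI": normalized_difference(nir, red),      # vegetation
        "NDMI": normalized_difference(nir, swir1),    # vegetation water content
        "MNDWI": normalized_difference(green, swir1), # water information
        "SR": safe_ratio(nir, red),                   # simple ratio
        "GCVI": safe_ratio(nir, green) - 1.0,         # green leaf biomass
        "RATIO54": safe_ratio(swir1, nir),            # assumed TM band 5 / band 4
        "RATIO35": safe_ratio(red, swir1),            # assumed TM band 3 / band 5
    }


# Illustrative reflectance values for a densely vegetated pixel (made up).
example = spectral_indices(green=0.08, red=0.05, nir=0.40, swir1=0.15)
```

In practice the same arithmetic would be applied to whole image bands rather than single pixels, and pixels for which `is_clear` returns False would be excluded before compositing.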
To produce the false color composite images, bands 5, 6, and 4 were used for Landsat 8, while bands 4, 5, and 3 were used for Landsat 5 and 7. By using these band combinations,

Data Training
Training is an important step because the training data serve as the reference point of the entire machine learning model. In this study, the researcher used RF classification because it runs efficiently on large datasets. It can also handle thousands of input variables without deleting any of them. RF has been described as having "unexcelled" accuracy among algorithms [13]. The formula for the RF prediction is:

$\hat{f}(x') = \frac{1}{B} \sum_{b=1}^{B} f_b(x')$

In the equation, $\hat{f}$ stands for the final prediction; "B" represents the total number of trees; "b" is the current tree, with $f_b$ its prediction; and x′ is the sample being predicted. As the number of trees increases, the accuracy generally increases as well. However, more trees require more computation time [14].
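The averaging step in the RF formula can be illustrated with a small sketch. The three "trees" below are hand-written threshold rules on made-up features, standing in for trained decision trees; they are hypothetical and are not the model used in the study.

```python
from typing import Callable, List, Sequence

# A "tree" here is any function mapping a feature vector to a prediction.
Tree = Callable[[Sequence[float]], float]


def forest_predict(trees: List[Tree], x: Sequence[float]) -> float:
    """Average the B individual tree predictions f_b(x) for one sample x."""
    if not trees:
        raise ValueError("at least one tree is required")
    return sum(tree(x) for tree in trees) / len(trees)


# Hypothetical stumps: x[0] plays the role of NDVI, x[1] of MNDWI.
toy_trees: List[Tree] = [
    lambda x: 1.0 if x[0] > 0.3 else 0.0,
    lambda x: 1.0 if x[1] > 0.1 else 0.0,
    lambda x: 1.0 if x[0] > 0.6 else 0.0,
]

score = forest_predict(toy_trees, [0.5, 0.2])  # two of three trees vote 1.0
label = 1 if score >= 0.5 else 0               # majority vote as a class label
```

For classification, averaging 0/1 votes and thresholding at 0.5 is equivalent to a majority vote, which is why adding trees tends to stabilize the prediction at the cost of more computation.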

Accuracy Assessment and Model Performance
The standard approach to validating a classification algorithm like RF is a confusion matrix. It works by comparing the observed classes of the data samples with the predicted output [15]. The k-fold cross-validation method (k = 5) was used to split the data into training (80%) and testing (20%) sets using the kfold function from the dismo package in R [16]. This procedure was repeated 5 times, following the 5-fold cross-validation. The performance scores obtained were averaged over the different random sets to evaluate the predictive performance of the model [17]. The model performance was evaluated by calculating the Area Under the Receiver Operating Characteristic Curve (AUC). The specificity and sensitivity indices were calculated from the confusion matrix of the predicted and observed values using the PresenceAbsence package v1.1.9 in R [18]. The sensitivity and specificity indices are threshold dependent.
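The fold assignment and the AUC described above can be sketched in a few lines. This is an illustrative Python analogue, not the dismo or PresenceAbsence implementation: the function names are hypothetical, and the AUC is computed from its pairwise definition (the probability that a presence scores higher than an absence, with ties counted as one half).

```python
import random
from typing import List, Sequence, Tuple


def kfold_assign(n: int, k: int = 5, seed: int = 0) -> List[int]:
    """Assign each of n samples to one of k folds of near-equal size."""
    folds = [i % k for i in range(n)]
    random.Random(seed).shuffle(folds)
    return folds


def split_for_fold(folds: Sequence[int], test_fold: int) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) when test_fold is held out."""
    train = [i for i, f in enumerate(folds) if f != test_fold]
    test = [i for i, f in enumerate(folds) if f == test_fold]
    return train, test


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Fraction of (presence, absence) pairs ranked correctly; ties count 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


folds = kfold_assign(100, k=5)
train_idx, test_idx = split_for_fold(folds, test_fold=0)  # 80 train, 20 test
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.1])       # 8 of 9 pairs correct
```

Unlike sensitivity and specificity, the AUC does not depend on a chosen threshold, which is one reason to report it alongside the confusion-matrix indices.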

2.5. Time-Series Analysis
To better understand the historical distribution of mangroves, a time-series analysis was conducted using data points collected from 1991, 1996, 2001, 2006, 2011, 2016, and 2021.

2.6. Colonization Rate
Using the formula in a study by Albano [19], the colonization rate of mangroves was computed:

$CR = \frac{MC_b - MC_a}{b - a}$

where $CR$ represents the colonization rate of mangroves (ha/yr), $MC_a$ is the mangrove cover for the earliest year "a", and $MC_b$ is the mangrove cover for the latest year "b".
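As a worked example of the colonization rate, the sketch below applies the change-in-cover-over-elapsed-years calculation to the 1991 and 2021 extents reported in the abstract (31713 ha and 25313 ha). The function name is hypothetical, and the sign convention (negative for net loss) is assumed from the negative rate reported for 1991-1996.

```python
def colonization_rate(cover_a_ha: float, cover_b_ha: float,
                      year_a: int, year_b: int) -> float:
    """Change in mangrove cover per year (ha/yr) between year_a and year_b."""
    if year_b <= year_a:
        raise ValueError("year_b must be later than year_a")
    return (cover_b_ha - cover_a_ha) / (year_b - year_a)


# 1991 and 2021 extents from the abstract; a negative rate means net loss.
rate_1991_2021 = colonization_rate(31713, 25313, 1991, 2021)  # -6400 ha / 30 yr
```

Over the full 30-year period this gives a net loss of roughly 213 ha per year, even though individual five-year intervals show gains.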

2.7. Loss and Gain of Mangroves
Loss and gain of mangroves were computed by extracting the mangrove presences and their associated locations (latitude and longitude) from the TIFF file derived from the RF model in Google Earth Engine (GEE). The data were further analyzed in R and converted to a 1/12-degree resolution (approximately 10 km). Mangrove loss and gain were calculated as the difference between the number of mangrove presences in the current year and in the previous year. Spatial maps of the loss and gain and their corresponding intensity were provided.

The trend shows a steep decline in mangrove cover from 1991 to 2001. In 2006, there was a slight increase in mangrove cover, but it decreased again by 21% in 2011. The graph shows that the mangrove extent increased steadily from 2011 to 2021. The highest mangrove extent was recorded in 1991, at around 31713 ha, while the lowest was recorded in 2011, at only 18651 ha.
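The gridded loss and gain calculation can be sketched as follows. This is a minimal illustration under stated assumptions rather than the R workflow used in the study: the function names and sample coordinates are hypothetical, presences are binned into 1/12-degree cells by flooring, and change is taken as current minus previous so that positive values mean gain.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

CELLS_PER_DEGREE = 12  # 1/12-degree grid, roughly 10 km near the equator

Point = Tuple[float, float]  # (latitude, longitude) of one mangrove presence
Cell = Tuple[int, int]       # integer grid indices of a 1/12-degree cell


def cell_of(lat: float, lon: float) -> Cell:
    """Map a coordinate to the index of the grid cell that contains it."""
    return (math.floor(lat * CELLS_PER_DEGREE), math.floor(lon * CELLS_PER_DEGREE))


def presence_counts(points: Iterable[Point]) -> Counter:
    """Count mangrove presences falling in each grid cell."""
    return Counter(cell_of(lat, lon) for lat, lon in points)


def loss_gain(previous: Iterable[Point], current: Iterable[Point]) -> Dict[Cell, int]:
    """Per-cell change in presences (assumed sign: positive = gain)."""
    prev_counts = presence_counts(previous)
    curr_counts = presence_counts(current)
    cells = set(prev_counts) | set(curr_counts)
    # Counter returns 0 for cells absent from one of the two years.
    return {cell: curr_counts[cell] - prev_counts[cell] for cell in cells}


# Made-up presence points for two consecutive mapping years.
points_prev = [(11.02, 122.52), (11.05, 122.55), (11.30, 122.90)]
points_curr = [(11.02, 122.52), (11.30, 122.90), (11.31, 122.91)]
change = loss_gain(points_prev, points_curr)
```

Mapping the per-cell values then highlights where losses and gains are concentrated, with the magnitude of each value serving as the intensity.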

3.2. Colonization Rate of Mangroves
Figure 6 shows that the colonization rate of mangroves on the island fluctuates. In 1991-1996, the colonization rate was -498 ha/yr.

Classification and Accuracy Assessment
The confusion matrix was calculated based on the training data and predicted data. Using the formula for accuracy, Acc = (TP + TN)/(TP + TN + FP + FN), the model showed 99% accuracy. This suggests that the model predicts the classes correctly 99% of the time. Using the precision formula, P = TP/(TP + FP), the model also showed 99% precision. Precision describes the quality of the positive predictions made by the model. The sensitivity of the model was 99%, suggesting that the model correctly predicted the actual positives 99% of the time. The specificity of the model was also recorded to be 99%, which suggests that the actual negatives (non-mangroves) were predicted to be negatives most of the time. The F-1 score, on the other hand, is the harmonic mean of precision and recall. The perfect F-1 score is 1, and this study recorded an F-1 score of 0.99, which suggests that the model was able to classify the observations into the correct class (Table 1).
IOP Publishing doi:10.1088/1755-1315/1199/1/012039
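The confusion-matrix metrics discussed above can be computed together as in the sketch below. The counts are hypothetical and chosen only so that every metric equals 0.99; they are not the values behind Table 1.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, sensitivity, specificity, and F-1 from raw counts."""
    if min(tp + fp, tp + fn, tn + fp) == 0:
        raise ValueError("counts leave at least one metric undefined")
    precision = tp / (tp + fp)    # quality of positive predictions
    sensitivity = tp / (tp + fn)  # recall: actual positives found
    specificity = tn / (tn + fp)  # actual negatives found
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        # F-1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts for mangrove (positive) vs non-mangrove (negative).
metrics = classification_metrics(tp=990, tn=990, fp=10, fn=10)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a high F-1 score requires both precision and recall to be high.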

Conclusion and Recommendations
This study has generated the first mangrove extent map in Panay Island, Philippines. It has also shown that remote sensing, combined with machine learning algorithms, can effectively assist in the creation of mangrove ecosystem maps. Because this study used open-source data (Landsat) and software (QGIS and GEE), it can be easily replicated for mangrove mapping in other areas of the Philippines. Hotspots of mangrove loss are recommended as areas suitable for mangrove restoration. This helps ensure that mangroves are planted in their natural habitat.
Results from this study are recommended to be shared with the different stakeholders involved in mangrove conservation and restoration, advancing data-driven decision-making. The researcher suggests that future studies utilize ground-truth data in order to make the machine learning model more robust. Mapping the drivers of mangrove cover change can also be considered in future research. Finally, utilizing other ML models would also be beneficial to determine the most appropriate algorithm for the area. This study encourages future mangrove conservation programs to take into consideration the following recommendations:
a. Push forth information, education, and communication programs for the protection and conservation of mangroves. Increasing awareness of the importance of mangroves is highly beneficial in crafting a successful mangrove conservation program.
b. Utilize local data on mangrove distribution history to better identify areas suitable for mangrove restoration. This paper argues that abandoned aquaculture farms along the coastline are well suited for mangrove planting since these areas were originally mangrove habitats.
c. Collaborate with grassroots communities to better understand the drivers of mangrove loss and gain.

5.