An approach to the analysis of deforestation in Colombia, applications of physical tools

Deforestation, a global challenge with significant environmental and social impacts, raises pressing concerns for the sustainability of the planet, and particularly for Colombia, where biodiversity and national identity are intrinsically linked. Colombian forests have undergone a dramatic transformation in recent decades due to the expansion of anthropogenic activities. This article reviews the current state of the problem, discusses the efforts undertaken by the Office of the Procuraduria General de la Nación of Colombia, and proposes an approach that integrates statistical tools, the formalism of statistical mechanics, and geographical features. Using quarterly data issued by the Instituto de Hidrología, Meteorología y Estudios Ambientales (IDEAM), namely early deforestation alerts, the study outlines a methodology to discern patterns and behaviors.


Introduction
Deforestation is a globally pervasive phenomenon with significant environmental and social consequences, including biodiversity loss, climate change, disruption of the hydrological cycle, soil degradation, impact on local communities, increased greenhouse gas emissions, and the loss of ecosystem services. It stands as one of the most pressing challenges for the sustainability of the planet. In the Colombian context, where biodiversity is intricately woven into national identity, deforestation assumes an even more pronounced importance. The vast tropical forests that blanket Colombian territory, home to a unique array of flora and fauna, have undergone a dramatic transformation in recent decades. This transformation, marked by the accelerated expansion of human activities, poses crucial questions regarding underlying causes, intricate interactions among diverse stakeholders, and long-term implications for the environment and local communities. Deforestation in Colombia is therefore more than a reduction in forest cover; it is a multifaceted phenomenon.
Recognizing this issue, various international organizations have launched initiatives to mitigate it. The Paris Agreement, developed within the United Nations Framework Convention on Climate Change (UNFCCC) during COP21 in 2015, encourages adopting parties to take action to reduce emissions associated with deforestation and forest degradation, as outlined in Article 5, Section 2 [1]. The REDD+ program (Reducing Emissions from Deforestation and Forest Degradation) constitutes a fundamental part of global efforts to mitigate climate change, with the Food and Agriculture Organization of the United Nations (FAO) providing financial incentives to countries implementing effective measures against deforestation [2]. The Tropical Forest Alliance (TFA) is a global pact that engages businesses, the public sector, and civil society to reduce the production of goods associated with deforestation; in Colombia it focuses on productive sectors including oil palm cultivation, meat, dairy, cocoa, and coffee [3]. Within Colombia, various initiatives have been developed to prevent deforestation, including Payments for Environmental Services (PSA), which aim to compensate landowners for providing services such as forest conservation in strategic sectors [4]. The National Ecological Restoration Plan (PNR), issued by the Ministry of Environment in 2015, serves as a specific tool for restoring forests and reversing the effects of deforestation [5].
Similarly, motivated by these international and national commitments, several research initiatives have sought a more accurate understanding of this phenomenon. Noteworthy among them is the study conducted by P. Rivadeneyra et al. [6], which proposes a forest evaluation for deforestation analysis by contrasting different information sources and methodologies for calculating deforestation. The study compares data from the European Space Agency (ESA), the Instituto de Hidrología, Meteorología y Estudios Ambientales (IDEAM), which is the official source for national data, and the Global Forest Change Data (GFCD) from the University of Maryland. It finds a significant advantage in the calculation performed by IDEAM due to its field validation, but notes shortcomings in periodicity and information access. Kay Khaing Lwin et al. [7] create comprehensive forest cover maps for Myanmar (Burma) by processing national images made available by GFCD, producing different maps with thresholds ranging from 10% to 90% and evaluating them on ecological and national scales. Addressing more semantic concerns, C. Bovolo et al. [8] question the definition of "forest" and its basic units for accurate preservation and analysis in remote sensing models, the most widely used in the field. L. Fergusson et al. [9] identify difficulties in using satellite imagery for deforestation detection based on the GFCD database, primarily because large oil palm plantations are classified as forests. This work urges caution in using satellite imagery as a resource for monitoring deforestation in Colombia. E. D. Ponte et al. [10] present a regional analysis of tropical forest regions for the case of deforestation in Paraguay, examining information sources, existing studies, and methodologies for data analysis. As these studies show, different academic sectors have focused on deforestation, primarily using satellite imagery because of the administrative and strategic challenges of ground monitoring, a scenario made more complex in Colombia by its armed conflict. Additionally, various methodologies have been implemented to address diverse needs: forest cover change, deforestation drivers, expanse, and mobility.
The Office of the Procuraduria General de la Nación, Colombia, as the head of the public ministry, includes the Delegada para Asuntos Ambientales, Minero-Energéticos y Agrarios (PDAAMEA), whose functions motivate this study. PDAAMEA has taken several actions, among which Directive 006 of 2022 stands out. It focuses on the control and surveillance of extensive livestock activities in areas of the National Natural Parks System (SPNN) and Regional Natural Parks. These activities are prohibited by law but have become one of the main causes of deforestation in these strategic areas due to the absence of control by different State entities. Directive 007 of 2022 was issued as an "Alert for the effective processing of environmental sanctioning processes related to infractions associated with deforestation in the Amazon region." Through recommendations and exhortations, this directive aims to make the processing of those sanctioning processes effective and efficient, thereby helping to counteract deforestation. More than 10 workshops were conducted with communities and institutions on the effectiveness and impact of projects and initiatives for protecting strategic ecosystems in the Colombian Amazon. This work allowed the Procuraduria General de la Nación, Colombia, to identify various opportunities for improvement in these initiatives and to generate recommendations in this area. As a result of these workshops, Circular No. 7 was issued on March 15, 2023, setting out a series of recommendations for the implementation of projects in the Colombian Amazon.
This article presents a proposal for geo-statistical analysis using quarterly official data issued by IDEAM. The first part explains the model and mathematical formalism using early deforestation alert points as input [11]. The second part provides brief conclusions and proposes a working path for implementation.

Mathematical and analytical proposal
IDEAM provides information on early deforestation alerts in the form of a KML file containing a large set of points for each quarter, from the first quarter of 2020 to the first quarter of 2023 (information reviewed up to November 2023) [11]. Each point reported for the quarter of interest is georeferenced and carries basic information such as municipality, department, and surveillance jurisdiction, among others. What matters for the present study is the points themselves, their quantities, and their positions. For the first quarter, 7969 alerts were reported; for the second, 3664; for the third, 7738; and for the fourth, 8339.
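As an illustration of how the alert points could be read, the following minimal sketch extracts longitude and latitude pairs from a KML document using only the Python standard library. It assumes each alert is stored as a standard KML `Placemark` containing a `Point` with a `coordinates` element; the actual layout of the IDEAM files has not been checked here, and the function name `read_alert_points` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard KML 2.2 namespace; assumed to match the IDEAM files.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def read_alert_points(kml_text):
    """Return a list of (longitude, latitude) tuples, one per Placemark point."""
    root = ET.fromstring(kml_text)
    points = []
    path = ".//kml:Placemark/kml:Point/kml:coordinates"
    for coords in root.iterfind(path, KML_NS):
        # KML stores coordinates as "lon,lat[,alt]".
        lon, lat = coords.text.strip().split(",")[:2]
        points.append((float(lon), float(lat)))
    return points


# Hypothetical two-alert sample standing in for a quarterly file.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark><name>alert-1</name>
      <Point><coordinates>-72.65,2.57,0</coordinates></Point>
    </Placemark>
    <Placemark><name>alert-2</name>
      <Point><coordinates>-74.10,1.20,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""

alerts = read_alert_points(SAMPLE_KML)
print(len(alerts), "alert points read")
```

The number of alerts per quarter is then simply the length of the returned list, and the tuples serve as input to the clustering step.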
The accumulation of alert points in particular zones is presumed to reflect the deforestation drivers described in [12], which are the main causes of their expansion and development. Thus, a clustering system of points is proposed to analyze their behavior and evolution over time. To determine the best possible point clustering, the silhouette method is proposed. P. Rousseeuw introduces the methodology in [13], and Shahapure et al. [14] validate its implementation satisfactorily using k-means algorithms from Scikit-learn in Python. In this methodology, the points are divided into k clusters, and the cohesion a(i) of the i-th point is determined by Equation (1) [12][13][14].
where d(i,j) represents the distance between points i and j within the same cluster. The separation b(i) of the i-th point is then determined; it measures the dissimilarity between the point and the points of the neighboring clusters to which it does not belong (see Equation (2)) [12][13][14].
for l different from i, where d(i,l) is the distance between the study point and the points of a neighboring cluster l to which it does not belong. The silhouette S(i) of the i-th point is determined with Equation (3) [12][13][14].
The average silhouette value for the entire set of points is calculated with Equation (4) [12][13][14].
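For reference, the standard textbook definitions of these four quantities, following Rousseeuw [13], can be written as below. This is a reconstruction in common notation, not necessarily the exact form or indexing of Equations (1) to (4) in this article; here C_I denotes the cluster containing point i, C_J any other cluster, and N the total number of points.

```latex
\begin{align}
a(i) &= \frac{1}{|C_I| - 1} \sum_{\substack{j \in C_I \\ j \neq i}} d(i,j), \\
b(i) &= \min_{J \neq I} \; \frac{1}{|C_J|} \sum_{l \in C_J} d(i,l), \\
S(i) &= \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \\
S_k  &= \frac{1}{N} \sum_{i=1}^{N} S(i).
\end{align}
```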
Thus, the silhouette value S_k for a division of the data into k clusters is determined; it can range between -1 and 1. When S_k → 1, the points are well grouped with respect to the feature of interest, in this case distance. When S_k → -1, there are points that were poorly grouped, or the feature is not appropriate for connecting the points. The value of k that yields the highest S_k is taken as the appropriate number of clusters.
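The silhouette computation can be sketched in a few lines of dependency-free Python. This is only an illustration of the standard definition applied to a given labeling, not the study's implementation: in practice the labels would come from a clustering algorithm such as `sklearn.cluster.KMeans`, and `sklearn.metrics.silhouette_score` returns the same average. The sample points and both labelings are hypothetical.

```python
import math


def silhouette_values(points, labels):
    """Silhouette S(i) of every point; requires at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    values = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            values.append(0.0)  # usual convention for singleton clusters
            continue
        # a(i): mean distance to the other members of the same cluster
        # (the distance of p to itself contributes zero to the sum).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        largest = max(a, b)
        values.append((b - a) / largest if largest > 0 else 0.0)
    return values


def mean_silhouette(points, labels):
    """Average silhouette S_k for one division of the data."""
    values = silhouette_values(points, labels)
    return sum(values) / len(values)


# Hypothetical data: two compact, well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_score = mean_silhouette(points, [0, 0, 0, 1, 1, 1])
bad_score = mean_silhouette(points, [0, 1, 0, 1, 0, 1])
print(f"matching labels: {good_score:.2f}, mixed labels: {bad_score:.2f}")
```

The labeling that follows the two spatial groups scores close to 1, while the mixed labeling scores near or below 0, which is the behavior used to select k.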
Once the appropriate number of clusters k is determined, the centroid of each cluster can be found. When many points are grouped into subsets, it is useful to determine which statistical point concentrates the most information from each cluster. To this end, the x position of the centroid C(x) is taken as the x position that minimizes Equation (5) and lies within the same cluster [12][13][14].
where n represents the number of points in that cluster. The y position is obtained analogously, so the coordinates of the centroid are described by Equation (6) [12][13][14].
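The centroid search can be sketched as follows. Because the expression of Equation (5) is not reproduced in this text, the sketch assumes that the quantity being minimized is the mean squared deviation (1/n) Σ (x_i − x)², with the candidate x restricted to coordinates that occur in the cluster. Both that cost function and the function names are assumptions made for illustration.

```python
def coordinate_centroid(values):
    """Pick the cluster coordinate minimizing the assumed mean squared deviation."""
    n = len(values)

    def cost(candidate):
        # Assumed stand-in for Equation (5): (1/n) * sum (x_i - candidate)^2
        return sum((v - candidate) ** 2 for v in values) / n

    # Candidates are restricted to coordinates present in the cluster.
    return min(values, key=cost)


def cluster_centroid(points):
    """Centroid (C(x), C(y)), with x and y treated independently."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return coordinate_centroid(xs), coordinate_centroid(ys)


# Hypothetical cluster of four alert positions.
cluster = [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0), (10.0, 1.0)]
centroid = cluster_centroid(cluster)
print(centroid)
```

Under this assumption each coordinate is the cluster value closest to the arithmetic mean. Since x and y are treated independently, the resulting pair need not coincide with one of the original alert points.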
Now, from statistical mechanics, in the canonical ensemble, if conditions such as a constant number of particles, constant area, and constant temperature are satisfied, it is possible to determine the partition function of each system, from which its average energy can be found, following the formalism presented in [12][13][14][15]. The partition function is given by Equation (7).
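For reference, the standard canonical-ensemble expressions for a system with discrete energy levels E_i are shown below. They are the textbook forms and may differ in notation from Equation (7) of this article; k_B denotes the Boltzmann constant and T the temperature.

```latex
\begin{align}
Z(\beta) &= \sum_{i} e^{-\beta E_i}, \qquad \beta = \frac{1}{k_B T}, \\
\langle E \rangle &= -\frac{\partial \ln Z}{\partial \beta}
                   = \frac{1}{Z} \sum_{i} E_i \, e^{-\beta E_i}.
\end{align}
```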
So, if each cluster is associated with an energy that is a function of the longitude x, as shown in Equation (8) [12][13][14],
where a is a specific constant for each cluster, representing relationships such as the total affected area per cluster or the quantity of contained alerts, among others, then it would be possible to determine Z(T)(x). Here β = 1/(k_B T), where k_B is the Boltzmann constant (not to be confused with the number of clusters k) and T is the constant temperature of the system. If the analysis is extrapolated to the continuous case, the result is given by Equation (9) [12][13][14].
where it is evident that the pseudo-energy associated with the system relates representativity, a constant per cluster, and variance as a statistical measure of dispersion.
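A minimal numerical sketch of the discrete partition function and average energy is given below. The canonical formulas are standard, but the quadratic pseudo-energy E_i = a (x_i − C(x))² is only a hypothetical form chosen so that the energy depends on a per-cluster constant a and on the spread of the points; it is not taken from Equation (8). All names and sample values are illustrative.

```python
import math


def partition_function(energies, beta):
    """Canonical partition function Z = sum_i exp(-beta * E_i)."""
    return sum(math.exp(-beta * e) for e in energies)


def average_energy(energies, beta):
    """Average energy <E> = (1/Z) * sum_i E_i * exp(-beta * E_i)."""
    z = partition_function(energies, beta)
    return sum(e * math.exp(-beta * e) for e in energies) / z


def quadratic_pseudo_energy(xs, center, a):
    """Hypothetical pseudo-energy E_i = a * (x_i - center)^2 (an assumption)."""
    return [a * (x - center) ** 2 for x in xs]


# Hypothetical longitudes of one cluster, its centroid, and its constant a.
longitudes = [-72.9, -72.7, -72.6, -72.4]
energies = quadratic_pseudo_energy(longitudes, center=-72.6, a=1.5)

for beta in (0.0, 1.0, 10.0):
    z = partition_function(energies, beta)
    e_avg = average_energy(energies, beta)
    print(f"beta={beta:5.1f}  Z={z:.4f}  <E>={e_avg:.4f}")
```

At beta = 0 every point is weighted equally, so the average energy reduces to a times the mean squared deviation from the centroid; larger beta weights the points nearest the centroid more heavily.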

Conclusions
The analysis of early deforestation alerts through the proposed clustering system can offer valuable insights into the spatial and quantitative aspects of deforestation trends. Using the silhouette method to select the clustering can support a fuller understanding of the behavior and evolution of the points over time. Determining appropriate clusters may facilitate the creation of meaningful histograms, shedding light on the patterns of deforestation alert accumulation in different quarters. Furthermore, applying principles of statistical mechanics to associate a pseudo-energy with each cluster, through the constant "a", introduces a novel perspective. This quantitative representation of the "weight" of each cluster, which considers factors such as the total affected area and the quantity of alerts, can provide a deeper understanding of its significance. Extrapolating the analysis to the continuous case, through the partition function and the average energy, links the principles of statistical mechanics to real-world environmental phenomena.