Integrating satellite-based forest disturbance alerts improves detection timeliness and confidence

Satellite-based near-real-time forest disturbance alerting systems have been widely used to support law enforcement actions against illegal and unsustainable human activities in tropical forests. The availability of multiple optical and radar-based forest disturbance alerts, each with varying detection capabilities depending mainly on the satellite sensor used, poses a challenge for users in selecting the most suitable system for their monitoring needs and workflow. Integrating multiple alerts holds the potential to address the limitations of individual systems. We integrated radar-based RAdar for Detecting Deforestation (RADD) (Sentinel-1), and optical-based Global Land Analysis and Discovery Sentinel-2 (GLAD-S2) and GLAD-Landsat alerts using two confidence rulesets at ten 1° sites across the Amazon Basin. Alert integration resulted in faster detection of new disturbances by days to months, and also shortened the delay to increased confidence. An increased detection rate to an average of 97% when combining alerts highlights the complementary capabilities of the optical and cloud-penetrating radar sensors in detecting largely varying drivers and environmental conditions, such as fires, selective logging, and cloudy circumstances. The most improvement was observed when integrating RADD and GLAD-S2, capitalizing on the high temporal observation density and spatially detailed 10 m Sentinel-1 and 2 data. We introduced the highest confidence class as an addition to the low and high confidence classes of the individual systems, and showed that this displayed no false detection. Considering spatial neighborhood during alert integration enhanced the overall labeled alert confidence level, as nearby alerts mutually reinforced their confidence, but it also led to an increased rate of false detections. We discuss implications of this study for the integration of multiple alert systems. 
We demonstrate that alert integration is an important data preparation step that makes the use of multiple alerts more user-friendly, providing stakeholders with reliable and consistent information on new forest disturbances in a timely manner. Google Earth Engine code to integrate various alert datasets is made openly available.


Introduction
Averting tropical deforestation is key to mitigating climate change (Busch et al 2019). The successful implementation of local, national, and international climate initiatives and laws targeting the reduction of deforestation, such as the European regulation on deforestation-free commodities, necessitates consistent and timely deforestation information across tropical regions (Nabuurs et al 2022a).
Over the past decade, satellite-based monitoring and alerting systems (Diniz et al 2015, Hansen et al 2016, Watanabe et al 2018, Reiche et al 2021) have emerged as primary tools to provide near-real-time information on new forest disturbances in a cost-effective manner. The increasingly high spatial detail and observation frequency of new optical and radar satellites, combined with open data policies, have enabled the tracking of forest disturbances with subweekly updates and at a spatial detail of 10-50 m.
The value of satellite-based forest monitoring and alerting systems in support of law enforcement actions against illegal human activities, sustainable land management, and zero-deforestation pledges has been widely recognized by governments, non-governmental organizations, the private sector, and communities (Lynch et al 2013, Pratihast et al 2016, Finer et al 2018, Weisse et al 2019, Tabor and Holland 2020, Coelho-Junior et al 2022, Nabuurs et al 2022b). Several studies have shown that the use of satellite-based forest disturbance alerts and related law enforcement actions can help to decrease deforestation and related carbon emissions at both community and national levels (Moffette et al 2021, Slough et al 2021, Nabuurs et al 2022b). The open distribution of forest disturbance alert products through nationally hosted web portals, as well as open platforms like the World Resources Institute's Global Forest Watch (https://globalforestwatch.org), the Nusantara Atlas (https://nusantara-atlas.org) and the Brazilian MapBiomas platform (https://brasil.mapbiomas.org), has resulted in improved transparency and data accessibility globally. These improvements benefit less-technical users, communities, and civil society, playing a crucial role in raising public awareness regarding ongoing forest changes (Finer et al 2018, Tabor and Holland 2020).
Operational forest disturbance alerting systems primarily utilize freely distributed medium-resolution (10-50 m) imagery from sources such as the Landsat and Sentinel-1 and -2 sensors. The Global Land Analysis and Discovery Landsat (GLAD-L) alerts, first introduced in 2016, pioneered an openly available system for detecting forest disturbances. The GLAD-L alerts rely on 30 m Landsat data to provide alerts across the pan-tropics (Hansen et al 2016, https://glad-forest-alert.appspot.com). The GLAD-L alerts employ a trained model to classify each pixel for presence or absence of tree cover loss, utilizing subsequent observations to enhance confidence in identified changes. In 2021, 10 m GLAD Sentinel-2 alerts (GLAD-S2) were introduced for the Amazon basin as an extension of the methods of Hansen et al (2016) (https://glad-forest-alert.appspot.com). The higher spatial detail and observations every 5 d allow for improved mapping of fine-scale changes, including selective logging. In regions with dense cloud cover, typical of portions of the humid tropics such as parts of the Congo Basin or the Guyana Shield, the limited availability of cloud-free Landsat and Sentinel-2 observations reduces the ability to track change events consistently in near-real-time (Sannier et al 2014, Moffette et al 2021, Flores-Anderson et al 2023).
Radar satellite signals can penetrate through clouds and smoke while being sensitive to changes in the physical structure of forests. This capability provides an opportunity to complement optical-based forest monitoring (Joshi et al 2016, Reiche et al 2016). With Sentinel-1, global temporally dense C-band radar data at a high spatial resolution of 10 m are freely available, with observations every 6-12 d in the tropics. The RAdar for Detecting Deforestation (RADD) alerts, introduced in 2021 (Reiche et al 2021), harness Sentinel-1 data to provide forest disturbance alerts for most of the pantropics. The RADD alerts employ a simple probabilistic change detection approach to calculate the deforestation probability of each observation, utilizing Bayesian updating (Reiche et al 2015) and subsequent observations to detect new changes and increase detection confidence. Another notable pantropical system is the JJ-FAST (JICA-JAXA Forest Early Warning System in the Tropics) alert, which utilizes ALOS-2 PALSAR-2 ScanSAR L-band radar data at a spatial scale of 50 m (Watanabe et al 2021). It offers event-based forest disturbance alerts, updated every 1.5 months. The system's ability to detect small-scale changes is constrained by a minimum event area size of 1 ha (version 4.1, January 2024).
In addition to these openly available systems covering larger geographies, numerous regional and national systems have been developed using various optical and radar satellite data streams (Diniz et al 2015, Vargas et al 2019, Ballère et al 2021, Doblas et al 2022, Doblas Prieto et al 2023). These include the Peruvian Landsat-based Geobosques system (Vargas et al 2019), the Brazilian Real-Time System for Detection of Deforestation (DETER), which utilizes data from the Advanced Wide Field Sensor onboard the Indian Remote Sensing satellites to provide monthly forest disturbance information at a 56 m spatial scale (Diniz et al 2015), and the Sentinel-1-based Brazilian DETER-R system (Doblas et al 2022).
While sourcing data from diverse satellite sensors and covering various geographical areas, most of the current operational alerting systems employ similar forest definitions and detection methods to monitor new forest disturbances in near-real-time. First, forest disturbances are generally defined as either complete or partial loss of tree cover, often mapped only within the boundaries of a forest baseline map. This includes both human-induced and natural causes, without differentiation. Secondly, comparable processing steps are applied. These include generating historical satellite image metrics to characterize previous forest conditions, pre-processing each newly acquired image, applying a forest disturbance algorithm, and building confidence with subsequent observations. Alerts are triggered based on a single observation from the most recent image (low confidence alert). Subsequent observations are then used to increase confidence and transition to a high confidence alert or to dismiss a change. The alert's date is set to the image date that initially triggered the alert. Thirdly, alert systems are typically tuned to be conservative in detecting forest disturbances to minimize false alarms. A trade-off between alert confidence and detection timeliness is an inherent aspect of near-real-time forest disturbance monitoring. A detection based on a single image is the most immediate but comes with a lower confidence level. Conversely, considering multiple subsequent observations heightens confidence but entails a waiting period (Diniz et al 2015, Hansen et al 2016, Reiche et al 2021, Watanabe et al 2021).
The variation in detection capability among the different alerting systems considered here (GLAD-L, GLAD-S2 and RADD) arises primarily from the physical attributes of the satellite sensor (e.g. measured wavelength and spatial resolution) rather than the employed methods (e.g. applied detection method and minimum mapping unit), and may vary by geography and season. The enhanced spatial detail of 10 m in Sentinel-1 and -2 data, for instance, enables the tracking of subtle forest disturbances linked to selective logging activities that are challenging to detect using 30 m Landsat data. Sentinel-2's 5 d observation frequency enables highly timely tracking of changes during the cloud-free dry season. On the other hand, in the rainy season or in cloudy tropical forests where months often pass between cloud-free observations, alert systems leveraging radar data can offer more consistent monitoring without temporal gaps. While optical satellite data-based alert systems excel in detecting large-scale clearings and fires due to their ability to track defoliation, radar-based systems face challenges in detecting such events when structural elements or debris remain (Balling et al 2021, Doblas Prieto et al 2023). Additionally, local inaccuracies in global forest baseline maps can result in false detections in radar-based systems in wetlands and agricultural areas (Verhelst et al 2021). Likewise, residual cloud and cloud shadow after imprecise masking cause an abrupt change in the signal and can lead to false detections in optical-based alert systems.
Studies that introduced alert systems (Hansen et al 2016, Ballère et al 2021, Reiche et al 2021, Watanabe et al 2021, Doblas et al 2022) typically report a single accuracy metric summarizing the detection accuracy and timeliness across the entire study area, often at a country or continental scale. This approach falls short in providing a nuanced understanding of the distinct advantages and limitations of the systems associated with different change types (e.g. fine-scale detections or wildfires) and regions characterized by persistent cloud cover, among other factors, which can vary significantly at the local level (as described above).
The availability of multiple forest disturbance alerts, each with varying detection capabilities, poses a challenge for users in selecting the most suitable system for their monitoring needs and workflow (Berger et al 2022). Integrating multiple forest disturbance alert systems holds the potential to address the limitations of individual systems and offer forest disturbance alert information that is more timely, confident and user-friendly. When a forest disturbance is indicated at a specific location by multiple systems utilizing different sensors and algorithms, it increases confidence in the disturbance being genuine and can provide higher confidence at an early stage. A simulated integration (Doblas Prieto et al 2023) of various radar-based alert systems with GLAD-S2 alerts demonstrated improved spatial and temporal aspects, particularly when combining GLAD-S2 and RADD.
A first operational integration of alert systems was implemented on Global Forest Watch in 2022 with the introduction of the integrated deforestation alert layer. This layer combined GLAD-L, GLAD-S2, and RADD alerts onto a common grid and included a highest confidence level for alerts indicated by multiple alert systems (Berger et al 2022). Another integrated alert using RADD and GLAD-L was introduced as part of the Nusantara Atlas, utilizing the higher confidence level (low or high) of either of the alerts (Gaveau 2023). The MapBiomas Alert integrates GLAD-L alerts with a number of local and national alert systems and refines them with high resolution satellite data (MapBiomas Alerta n.d.). Current integration efforts have been limited to pixel-based integration and have not considered other parameters or rulesets based on, for example, spatial neighborhood and temporal alignment.
With the emergence of integrated alert products and the release of additional operational alert systems anticipated in the near future, it is important to assess how the integration of multiple alert systems under given rulesets affects detection capabilities for different change types and environmental factors like cloud cover.
In this study, we aim to evaluate how integrating RADD, GLAD-S2, and GLAD-L forest disturbance alerts enhances the timeliness and accuracy of detecting forest disturbances across the Amazon Biome. Specifically, we assess (i) alert integration effects for key change types and environmental conditions; (ii) the validity of the confidence ruleset employed by the integrated deforestation alert layer on Global Forest Watch, compared against a more conservative ruleset; and (iii) the potential performance enhancement by considering spatial neighborhoods beyond pixel-based alert integration.

Study area and forest baseline
We selected ten 1° sites across the Amazon Basin, each spanning approximately 12 000 km² (figure 1). These sites were selected to include different change types, cloud cover frequencies, and environmental characteristics, including wetlands and mountainous areas (table 1). The key change types (as observed in high-resolution 3.7 m PlanetScope data) include large-scale forest fires (sites 1 and 5), large-scale (sites 4, 6 and 9) and small-scale (sites 2, 3, 8, 9 and 10) clearings for agriculture, selective logging (sites 2 and 3), small-scale mining (sites 2, 3, and 10) and large blowdowns (site 10). Those changes are considered the most frequent drivers of tree cover loss in the tropics (Curtis et al 2018, Tyukavina et al 2018). The sites located in the Guyana Shield (sites 2 and 3) experience notably high cloud cover that persists over most of the wet season. The southern Amazon Basin typically experiences lower cloud cover, resulting in more frequent cloud-free observations. Site 2 encompasses a significant wetland area, and site 7 features mountainous terrain.
We restricted all analysis to be within the boundaries of a humid tropical forest mask.We used a primary humid tropical forest mask for 2001 from (Turubanova et al 2018) and removed 2001-2020 forest loss (Hansen et al 2013) and mangrove forest (Bunting et al 2018).

Satellite alert products
We processed RADD, GLAD-L, and GLAD-S2 alerts for the ten sites throughout the year 2021. Each update was recorded, allowing for the calculation of time differences between different confidence levels. This level of detailed analysis is not possible using the operational products unless a user downloads the maps each day, as only the date of the initial alert appearance is provided.
All products utilized a consistent definition for forest disturbance. Forest disturbance was defined as the complete or partial removal of tree cover within a 10 m pixel (Sentinel-1 for RADD, Sentinel-2 for GLAD-S2) or a 30 m pixel (Landsat for GLAD-L). Complete tree cover removal indicates a stand-replacement disturbance at the pixel scale, while partial removal typically represents disturbances related to boundary pixels and smaller-scale events such as selective logging (Hansen et al 2016, Pickens et al 2021, Reiche et al 2021).

RADD alerts
RADD alerts map forest disturbances in near-real-time across most of the tropics (50 countries, as of January 2023), using Sentinel-1 Ground Range Detected (GRD) data with a spatial resolution of 20 × 22 m (10 m pixel spacing). Each new Sentinel-1 GRD image is pre-processed to improve image quality and remove artefacts (Mullissa et al 2021, Reiche et al 2021). Using a probabilistic change detection approach (Reiche et al 2018), conditional probabilities of forest disturbance are derived based on per-pixel forest and non-forest probability metrics. Subsequent observations iteratively update the forest disturbance probability, enhancing confidence and confirming or rejecting the alert (false detection). Low confidence alerts are provided for forest disturbance probabilities above 85%, and high confidence alerts when the forest disturbance probability surpasses 97.5%. Alerts that remain unconfirmed (low confidence) after 90 d are removed from the data set. The product (version 1) maintains a minimum mapping unit of 0.1 ha (equivalent to 10 8-connected pixels). The data is accessed and processed on Google Earth Engine (Gorelick et al 2017) and results are updated weekly. For more information on methodology, see Reiche et al (2021) and http://radd-alert.wur.nl.
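The iterative update and the two probability thresholds described above can be illustrated with a minimal Python sketch. The update formula is one common form of Bayesian updating and is an assumption here, not necessarily the exact RADD implementation; the function names `bayes_update` and `radd_confidence` are hypothetical. Only the 85% and 97.5% thresholds are taken from the text.

```python
# Thresholds stated in the text for low and high confidence RADD alerts.
LOW_THRESHOLD = 0.85
HIGH_THRESHOLD = 0.975


def bayes_update(prior, p_obs):
    """Combine the prior disturbance probability with the conditional
    disturbance probability of a new observation (assumed update form)."""
    numerator = prior * p_obs
    return numerator / (numerator + (1.0 - prior) * (1.0 - p_obs))


def radd_confidence(prob):
    """Map a disturbance probability to an alert confidence label."""
    if prob > HIGH_THRESHOLD:
        return "high"
    if prob > LOW_THRESHOLD:
        return "low"
    return None  # no alert


# Two consecutive observations, each suggesting disturbance with p = 0.9,
# starting from an uninformative prior of 0.5 (illustrative values).
p1 = bayes_update(0.5, 0.9)
p2 = bayes_update(p1, 0.9)
print(radd_confidence(p1), radd_confidence(p2))
```

In this illustration a single observation is enough to flag a low confidence alert, and a second consistent observation pushes the probability above the high confidence threshold.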

GLAD-L alerts
The GLAD-L alerts map tree cover loss in near-real-time across the tropics (30° N to 30° S) using 30 m Landsat imagery, initially Landsat 7 and 8 and now Landsat 8 and 9 (https://glad-forest-alert.appspot.com). Each new image is assessed for cloud cover or poor data quality, radiometrically normalized, and compared to Landsat-derived metrics of previous years (including ranks, means, and regressions of red, near infrared, and shortwave infrared bands, and ranks based on Normalised Difference Vegetation Index, Normalised Burn Ratio, and thermal). The metrics and the latest Landsat image are run through a set of decision trees to identify likely tree cover loss. Alerts remain unconfirmed (low confidence) until a second observation is labeled loss within the next four cloud-free observations and within 180 d. Alerts that remain unconfirmed (low confidence) after subsequent observations or 180 d are removed from the data set. The data is accessed and processed on Google Earth Engine (Gorelick et al 2017) and the results are updated daily. For more information on methodology, see Hansen et al (2016).
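The confirmation rule described above (a second loss detection within the next four cloud-free observations and within 180 d) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: days are plain integer day counts, and the function name `glad_l_status` and its arguments are hypothetical rather than part of the GLAD-L processing chain.

```python
def glad_l_status(first_day, later_obs, today, max_obs=4, max_days=180):
    """Return 'high', 'low' or 'removed' for an alert first flagged on first_day.

    later_obs: chronological list of (day, is_loss) tuples for cloud-free
    observations acquired after the initial detection.
    today: the current day, used to decide whether the 180 d window expired.
    """
    # Keep only observations inside the time window, then at most four of them.
    in_window = [(d, loss) for d, loss in later_obs
                 if first_day < d <= first_day + max_days]
    window = in_window[:max_obs]

    if any(loss for _, loss in window):
        return "high"      # a second loss detection confirms the alert
    if len(window) >= max_obs or today - first_day > max_days:
        return "removed"   # observation or time budget used up
    return "low"           # still waiting for further observations


print(glad_l_status(10, [(18, False), (26, True)], today=30))
```

The same structure would apply to a different observation or day limit by changing `max_obs` and `max_days`.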

GLAD-S2 alerts
The GLAD-S2 alerts are an extension of the methodology of Hansen et al (2016) to 10 m Sentinel-2 data within the Amazon region of South America (https://glad.earthengine.app/view/s2-forest-alerts). The system also evaluates each cloud-masked image for tree cover loss with a set of decision trees employing 10 and 20 m reflectance data (bands 2, 3, 4, 8, 11, 12) from the current image and from baseline Sentinel-2 metrics, and updates the confidence based on subsequent images. However, it reports tree cover loss only within the boundary of the primary forest mask of Turubanova et al (2018), with 2001-2018 forest loss from Hansen et al (2013) removed. The confidence is split into four levels (single detection, low, medium, and high confidence) based on the number of loss detections within the next four observations. If there is no second detection of loss within a maximum of four observations or 180 d, the alert is removed. Each day, all new Sentinel-2 data is downloaded and processed locally, and the updated results are uploaded to Earth Engine and made available online.

Data preparation
We standardized the alert products by resampling the RADD and GLAD-L alerts to match the 10 m GLAD-S2 pixel grid (0.0001° × 0.0001°), and aligning various date formats to the day of the year format. We reclassified the four GLAD-S2 alert confidence levels into low confidence (= single detection) and high confidence (= low, medium and high confidence) in order to align them with the low and high confidence levels of RADD and GLAD-L. Additionally, we computed the high confidence date (i.e. the date the alert was marked as high confidence), in addition to the low confidence date (i.e. the date the alert was first detected) provided by the original alert products.
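The reclassification and date alignment described above can be sketched in a few lines of Python. The string labels used for the four GLAD-S2 levels are hypothetical placeholders; the operational product encodes them differently, and the function names are illustrative only.

```python
from datetime import date

# Hypothetical labels for the four GLAD-S2 confidence levels, mapped to the
# two-level scheme shared with RADD and GLAD-L as described in the text.
GLAD_S2_TO_TWO_LEVEL = {
    "single_detection": "low",
    "low": "high",
    "medium": "high",
    "high": "high",
}


def reclassify_glad_s2(level):
    """Collapse a four-level GLAD-S2 confidence label to 'low' or 'high'."""
    return GLAD_S2_TO_TWO_LEVEL[level]


def day_of_year(d):
    """Convert a calendar date to the day of the year (1 = 1 January)."""
    return d.timetuple().tm_yday


print(reclassify_glad_s2("medium"), day_of_year(date(2021, 3, 1)))
```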

Pixel-based
We integrated alerts at the pixel level using two separate rulesets to attribute confidence. In addition to the low and high confidence levels provided by the original alert products, we introduced the highest confidence level for the integrated alerts. It is important to emphasize that the terms 'low confidence' and 'high confidence' are more appropriate than the terms 'unconfirmed' and 'confirmed' as used in previous research (Hansen et  Satellite-based alerts offer varying levels of confidence regarding the accuracy of a detected disturbance rather than confirming it. Confirmation implies an independent assessment, such as ground truthing or visual evaluation of very high-resolution data, to validate that the change is indeed true.
Ruleset-1 assigns the highest confidence to disturbances detected by at least two alert systems at the same location, regardless of the original alert's confidence level. Even if two alert products exhibit low confidence, the integrated alert is marked with the highest confidence. Disturbances detected by a single alert system retain their original confidence level (either low or high). Ruleset-1 is applied to generate the Global Forest Watch integrated alert (Berger et al 2022).
Ruleset-2 represents a more conservative approach to reaching highest confidence. In this case, the highest confidence level is assigned to forest disturbances detected by three alert systems (with at least low confidence) or by two alert systems with high confidence. When two systems detect a disturbance with either two low confidence alerts or one low and one high confidence alert, the integrated alert is labeled high confidence, rather than highest confidence as it would be under Ruleset-1.
There is no difference in the low and high confidence level for Ruleset-1 and -2. Differences are only evident in the highest confidence level.
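The two rulesets can be expressed as a small per-pixel decision function. The following Python sketch follows the rules as worded above; the function name `integrate_confidence` and the list-based input are illustrative assumptions and do not reproduce the published Google Earth Engine code.

```python
def integrate_confidence(levels, ruleset):
    """Integrated confidence for one pixel.

    levels: confidence labels ('low' or 'high') of the alert systems that
    flagged the pixel; systems without an alert are simply left out.
    ruleset: 1 (Global Forest Watch rule) or 2 (more conservative rule).
    """
    if ruleset not in (1, 2):
        raise ValueError("ruleset must be 1 or 2")
    n_systems = len(levels)
    n_high = levels.count("high")

    if n_systems == 0:
        return None           # no alert at this pixel
    if n_systems == 1:
        return levels[0]      # single system keeps its original confidence
    if ruleset == 1:
        return "highest"      # any two systems agreeing
    # Ruleset-2: three systems, or two systems both at high confidence.
    if n_systems >= 3 or n_high >= 2:
        return "highest"
    return "high"


for combo in (["low"], ["low", "low"], ["low", "high"], ["high", "high"]):
    print(combo, integrate_confidence(combo, 1), integrate_confidence(combo, 2))
```

Printing the outcome for each combination makes the difference explicit: the rulesets only diverge for two-system detections in which at least one alert is still at low confidence.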

Pixel-based with a spatial neighborhood rule
In addition to the pixel-based integration, we also investigated the impact of considering spatial neighborhood (figure 2). For alert pixels that do not overlap but are in close proximity (e.g. within 50 m), we hypothesized that they are related to the same event. This consideration of proximity allows adjacent alerts to reinforce and boost each other's confidence without directly overlapping. We expected that this would lead to higher confidence levels, possibly within a shorter time frame. We set a maximum limit of 180 d for this neighborhood influence, aiming to prevent alerts from different events, potentially years apart, from influencing each other without any meaningful relationship.
We evaluated the impact of the spatial neighborhood rule to enhance alert confidence for increasing distances from 20 to 100 m.This assessment was compared with the pixel-based approach that does not consider spatial neighborhood (distance = 0 m).
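A minimal Python sketch of the neighborhood check is given below. It assumes alerts are available as dictionaries with projected coordinates in metres (`x`, `y`), an alert `day`, and the originating `system`; these field names, the function name, and the requirement that the supporting alert comes from a different system are assumptions made for illustration. The 180 d limit is taken from the text, and the distance threshold is the parameter varied between 20 and 100 m.

```python
from math import hypot


def has_supporting_neighbor(alert, others, max_dist_m=50.0, max_days=180):
    """True if another system has an alert near `alert` in space and time."""
    for other in others:
        if other["system"] == alert["system"]:
            continue  # assumed: only alerts from a different system reinforce
        distance = hypot(other["x"] - alert["x"], other["y"] - alert["y"])
        day_gap = abs(other["day"] - alert["day"])
        if distance <= max_dist_m and day_gap <= max_days:
            return True
    return False


radd = {"system": "RADD", "x": 0.0, "y": 0.0, "day": 100}
near = {"system": "GLAD-S2", "x": 30.0, "y": 0.0, "day": 130}
far = {"system": "GLAD-S2", "x": 120.0, "y": 0.0, "day": 130}
print(has_supporting_neighbor(radd, [near]), has_supporting_neighbor(radd, [far]))
```

Setting `max_dist_m=0.0` would require the alerts to share the same location, which corresponds to the purely pixel-based case used as the comparison.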

Evaluation
We evaluated the timeliness and accuracy of detection for each of the individual alert systems and integrated alerts separately at each of the 10 sites. Alert integration was performed for all combinations of alerts, including RADD + GLAD-L, RADD + GLAD-S2, GLAD-L + GLAD-S2, and all three alert systems. These integrated alerts were then compared against the individual alert systems. Ruleset-1 and -2 are only compared for the highest confidence level as there is no difference in the low and high confidence level.

Detection timeliness
We identified the earliest per-pixel low confidence date among the three individual alert systems (RADD, GLAD-S2, and GLAD-L) for each alert pixel, establishing it as the baseline date. For each site, we computed the detection delay in relation to this baseline date for the different alert confidence levels (low, high, and highest) and their respective dates. Additionally, we calculated the duration it took to reach high or highest confidence after the initial detection, represented by the low confidence date. To mitigate early detection of potential false positives, which is more probable in low confidence alerts solely detected by a single system, we narrowed our analysis of the detection timeliness to only alerts that reached high confidence.
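The baseline date and the derived delays can be illustrated for a single pixel with the following Python sketch. Dates are given as day of the year, and the function names and dictionary layout are hypothetical choices for this example.

```python
def detection_delays(low_dates):
    """Baseline date and per-system detection delay for one pixel.

    low_dates: dict mapping system name to its low confidence date
    (day of year), or None if that system raised no alert.
    """
    detected = {s: d for s, d in low_dates.items() if d is not None}
    baseline = min(detected.values())  # earliest detection by any system
    delays = {s: d - baseline for s, d in detected.items()}
    return baseline, delays


def days_to_confidence(low_date, confident_date):
    """Days between the initial detection and reaching a higher confidence."""
    return confident_date - low_date


baseline, delays = detection_delays({"RADD": 120, "GLAD-S2": 100, "GLAD-L": None})
print(baseline, delays, days_to_confidence(100, 126))
```

Site-level figures would then follow from averaging these per-pixel values over all alert pixels that reached high confidence.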

Detection accuracy
We calculated area-adjusted user's and producer's accuracies of forest disturbance alerts for the last alert update of 2021 for all the alert confidence levels: low [and greater] (all alerts), high [and greater] (high and highest confidence alerts), and highest. This was done because, in most cases, users of alert systems employ a minimum confidence level (e.g. low or high) to select alerts. Validating, for instance, the producer's accuracy of only low confidence alerts without also considering higher confidence levels is not meaningful.
We used high-resolution 3.7 m PlanetScope data (Planet 2017) to validate the forest disturbance alerts. In some cases, persistent cloud coverage hindered visual interpretation using PlanetScope data alone, and additional reference data were consulted, including Sentinel-1 and -2, and Landsat-7 and -8 imagery.
We used probability sampling (Stehman 2014) with five strata with 50 samples each, comprising a total of 250 sample points per site. These strata were: 'forest disturbance (three alert products)', 'forest disturbance (two alert products)', 'forest disturbance (single alert product)', 'No disturbance within a 200 m buffer zone', and 'No disturbance outside the buffer zone'. Specifically, the strata defined by only one or two alert products identifying a forest disturbance helped evaluate commission errors. A precise estimation of the false detection rate (commission error) is crucial, especially when evaluating near-real-time systems. The stratum 'No disturbance within a 200 m buffer zone' was implemented to account for omission errors, which are more likely in proximity to existing forest disturbances (Olofsson et al 2020). The sampling unit corresponded to a single 10 m Sentinel-2 pixel (0.0001°, ∼0.01 ha). We accounted for unequal inclusion probabilities among strata by calculating sample inclusion probabilities based on strata areas and the number of sample points. These inclusion probabilities were then used to calculate area-adjusted user's accuracy (Stehman et al 2003, Stehman 2014). To accommodate the difference in acquisition geometry between Sentinel-1 (side-looking) and Landsat and Sentinel-2 (nadir-looking), as well as shifts due to the reprojection, we considered a 30 m tolerance buffer (∼1 Landsat pixel) around sample points. This approach aligns with the objective of alerting systems: timely detection of new events rather than unbiased area estimation (Tang et al 2019).
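The area adjustment can be illustrated with a short Python sketch in which each sample point is weighted by the inverse of its inclusion probability (stratum size divided by the number of samples drawn from that stratum). This is a generic stratified ratio estimator written under that assumption; the function name, the tuple layout of the samples, and the toy numbers are hypothetical and are not taken from the study.

```python
def area_adjusted_accuracies(samples, stratum_size, stratum_n):
    """Estimate user's and producer's accuracy from a stratified sample.

    samples: list of (stratum, alert_flag, reference_flag) tuples, where the
        flags are True when the alert map / the reference data show disturbance.
    stratum_size: dict stratum -> number of sampling units (pixels) in stratum.
    stratum_n: dict stratum -> number of sample points drawn from the stratum.
    """
    agree = mapped = reference = 0.0
    for stratum, alert_flag, reference_flag in samples:
        # Inverse inclusion probability: units represented by this sample.
        weight = stratum_size[stratum] / stratum_n[stratum]
        if alert_flag and reference_flag:
            agree += weight
        if alert_flag:
            mapped += weight
        if reference_flag:
            reference += weight
    users = agree / mapped         # 1 - commission error
    producers = agree / reference  # 1 - omission error
    return users, producers


# Toy example with two strata of very different size.
samples = [("alert", True, True), ("alert", True, False),
           ("no_alert", False, False), ("no_alert", False, True)]
ua, pa = area_adjusted_accuracies(
    samples, {"alert": 100, "no_alert": 1000}, {"alert": 2, "no_alert": 2})
print(round(ua, 3), round(pa, 3))
```

The toy example shows why the weighting matters: a single omitted disturbance in a large no-alert stratum lowers the producer's accuracy far more than an unweighted count of sample points would suggest.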
Our benchmark forest map relied on a Landsat-based annual tree cover loss product (Hansen et al 2013) to exclude forest disturbance events that occurred in 2020 and earlier. However, it is important to note that certain prior disturbance events went undetected by the Landsat-based algorithm, often due to a lack of cloud-free Landsat data towards the end of 2020 (Verhelst et al 2021). We documented detected disturbances originally predating 2021, as observed in PlanetScope time series and other reference data, but these were excluded when calculating user's and producer's accuracy (Reiche et al 2021, Balling et al 2023).

Pixel-based integration

Detection timeliness
Comparing the mean detection delay of new disturbances of individual alert systems, RADD exhibited the earliest detection on average at three sites (sites 2, 3, and 8), while GLAD-S2 showed the earliest detection at the remaining sites. GLAD-L displayed longer detection delays for new disturbances across most sites (figure 3). While the mean detection delays of RADD and GLAD-S2 were similar for certain sites (sites 2, 7, 8, and 9 within ±5 d), substantial site-specific differences were observed. The largest detection delays for RADD were noted at sites dominated by large-scale clearings for agriculture (sites 4 and 6), where disturbances were detected on average 23 and 49 d later compared to GLAD-S2. This delay can be attributed to the presence of remaining debris in these large-scale clearings, causing the Sentinel-1 C-band radar backscatter to appear similar to stable forest backscatter levels, often resulting in omissions and delayed detections (Balling et al 2021, 2023, Doblas Prieto et al 2023). In contrast, GLAD-S2 showed an average 27 d delay in detection compared to RADD at site 3. This site is dominated by small-scale clearings for agriculture and mining activities, and has frequent cloud cover.
While average detection delays are reported, substantial variability at the pixel level was observed, as evidenced by a high standard deviation at the site level. For example, although RADD detected new disturbances on average 4 d earlier than GLAD-S2 at site 2, individual events exhibited variations of over two months (both earlier and later detections) between the two systems, emphasizing the complementary potential of the two alert systems.
Achieving high confidence after the initial detection took a similar amount of time for all three alert systems. On average (mean of site means), RADD took 34 d, GLAD-S2 took 26 d, and GLAD-L took 29 d to attain high confidence (calculated as the difference between the high confidence date and low confidence date) (figure 4).
Alert integration demonstrated major improvements in detection timing (figure 4). The detection delay of individual systems reveals improvements in detection timing achieved through the integration of all three alert systems. When integrating all three alert systems, the detection of new disturbances improved by on average (mean of site means) 16 d (compared to RADD), 9 d (compared to GLAD-S2), and 38 d (compared to GLAD-L). At the site level, average improvements reached up to 53 d (compared to RADD at site 6), 33 d (compared to GLAD-S2 at site 3), and 70 d (compared to GLAD-L at site 2).
Improvements were smaller when integrating two alert systems. Compared to using all three systems, the integration of only RADD and GLAD-L was on average 12 d slower in initial detection, the integration of GLAD-S2 and GLAD-L was 6 d slower, and the integration of RADD and GLAD-S2 was only 1 d slower.
For integrated alerts, the additional days required to attain high confidence (calculated as the difference between the high confidence date and low confidence date) were either equal to (for GLAD-S2) or slightly fewer than those for individual alerts (for RADD and GLAD-L). On average (mean of site means), it took 26 d to reach high confidence when integrating all three alert systems. Furthermore, it took on average 45 d to achieve the highest confidence using ruleset-1 and 73 d using ruleset-2 (calculated as the difference between the highest confidence date and low confidence date). Similar durations to reach highest confidence after initial detection were observed for combining two alert systems.
[Figure caption, beginning missing] ... displayed for each of the 10 sites. The system with the lowest mean detection delay is the one that, on average, detects new forest disturbances earliest at the respective site. Refer to table 1 for a detailed site description and appendix 1 for the complete dataset.

Detection accuracy
The producer's accuracy (detection rate; 100% − omission error) for individual alert systems exhibits large site-specific differences (figure 5). Among the individual alert systems and considering all alerts (class low and greater), GLAD-S2 had the highest average (mean across sites) producer's accuracy of 85.7% (53.3%-100%). The lowest producer's accuracy for GLAD-S2 was found at site 5 (large-scale fires) and site 10 (blowdown, small-scale clearings and mining, and frequent cloud cover). RADD followed with the second-highest producer's accuracy, averaging 60.2% (10.8%-84.6%). The lowest producer's accuracy for RADD was observed at sites dominated by large-scale fires (sites 1 and 5). Remaining debris and forest structures at fire sites caused relatively little change in the Sentinel-1 C-band backscatter signal and thus omission errors in RADD (Balling et al 2021, 2023, Doblas Prieto et al 2023) (figure 6). GLAD-L consistently displayed the lowest producer's accuracy across all sites, averaging 38.3% (14.2%-91.2%). For GLAD-L, the lowest producer's accuracy was mainly found at sites with frequent cloud cover that were dominated by small-scale changes, such as selective logging (sites 2, 3, 7 and 10), where the coarser 30 m spatial resolution of Landsat limited the detection ability.
Combining the three alert systems increased the producer's accuracy compared to any single detection system. Considering all alerts (class low [and greater]) yielded an average producer's accuracy of 97%, with a reduced spread across sites of 70.3%-100% when compared to the individual systems. When only considering high and highest confidence alerts (class high [and greater]), the producer's accuracy was marginally lower, with an average of 92.2% (65%-100%). Highest confidence alerts had an average producer's accuracy of 57.1% (18.6%-94.6%) for ruleset-1, while the more conservative ruleset-2 showed a lower average producer's accuracy of 50.3% (12.8%-100%). Lower producer's accuracies were observed when combining two systems, with RADD and GLAD-S2 showing the highest producer's accuracy among the two-system combinations. Some omission errors were due to undetected events that occurred near the end of 2021, which would likely be detected in early 2022 but were not included in the 2021 alerts assessed here.
The user's accuracy (100% − commission error) was found to be high for all individual alert systems (i.e. low false detection rate). When considering all alerts (class low [and greater]), the user's accuracy was on average 98.8% for RADD, 96.6% for GLAD-L, and 89.9% for GLAD-S2. For high confidence alerts (class high [and greater]), the user's accuracy increased to on average 98.9% for RADD (92.3%-100%), 100% for GLAD-L, and 96.9% for GLAD-S2 (91.4%-100%).
The user's accuracy decreased when combining alert systems due to the aggregation of the false detections of the individual systems. The average user's accuracy for combining three alerts was 89.5% (70.6%-99.2%) when considering all alerts (class low [and greater]) and 96.6% (93.8%-100%) when considering high and highest confidence alerts (class high [and greater]). Combining only two alert systems resulted in fewer false detections, i.e. a smaller decrease in user's accuracy. For combining two or three alert systems, the highest confidence class achieved a user's accuracy of 100% for both ruleset-1 and ruleset-2, effectively eliminating the false detections. This implies that ruleset-1 does not lead to less reliable highest confidence alerts than the more conservative ruleset-2.
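The relationship between the two accuracy measures and alert combination can be illustrated with a small sketch. It computes producer's accuracy (100% − omission error) and user's accuracy (100% − commission error) from simple counts over reference samples. The function name and the toy sample labels are assumptions for illustration, and the unweighted counts used here are a simplification that may differ from the estimator applied in the study.

```python
import numpy as np

def accuracies(alert, reference):
    """Return (producer's accuracy, user's accuracy) in percent.

    alert, reference: boolean sequences over the same samples, True where an
    alert was raised / where a disturbance truly occurred. Unweighted counts
    are an assumption of this sketch.
    """
    alert = np.asarray(alert, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = int(np.sum(alert & reference))    # correctly detected disturbances
    fn = int(np.sum(~alert & reference))   # omitted disturbances
    fp = int(np.sum(alert & ~reference))   # false detections
    producers = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
    users = 100.0 * tp / (tp + fp) if (tp + fp) else float("nan")
    return producers, users

# Toy example: eight samples, two hypothetical alert systems.
reference = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)
system_a = np.array([1, 1, 0, 0, 0, 0, 0, 0], dtype=bool)
system_b = np.array([0, 1, 1, 0, 1, 0, 0, 0], dtype=bool)

acc_union = accuracies(system_a | system_b, reference)    # all alerts
acc_overlap = accuracies(system_a & system_b, reference)  # detected by both
```

In this toy case the union of both systems raises the producer's accuracy from 50% to 75% but inherits the false detection of the second system (user's accuracy of 75%), whereas the overlap of both systems has no false detection (user's accuracy of 100%) at a producer's accuracy of only 25%. This mirrors the direction of the trade-off reported above, not its magnitude.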

Pixel-based integration with spatial neighborhood rule
To assess the effect of applying the spatial neighborhood rule when integrating all three alert systems, we employed only ruleset-1, as it was found to reach highest confidence more quickly than ruleset-2 with equal reductions in false detections. The effect of the spatial neighborhood of alerts from other systems is evident only in the highest confidence class, since a neighboring alert, when treated the same as an overlapping alert, causes the alert to reach highest confidence immediately.
Applying the spatial neighborhood rule (figure 7) led to an increased producer's accuracy (detection rate) and a slightly reduced delay in reaching higher confidence after first detection, while the user's accuracy decreased (i.e. increased false detections) (table 2).
With increasing spatial neighborhood distance, more alerts are included in the highest confidence class, as reflected by an increasing producer's accuracy. The producer's accuracy increased from an average (mean across sites) of 57.1% for pixel-based integration without considering spatial neighborhood (distance = 0 m) to 68.9% when considering a 100 m spatial neighborhood distance. At the same time, false detections increased with increasing neighborhood distance. The user's accuracy, which was 100% (no false detections) at all sites when using pixel-based integration without considering spatial neighborhood (distance = 0 m), decreased to 98.4% for a 100 m spatial neighborhood distance, ranging between 93.8% and 100% at the site level.
Increasing spatial neighborhood distance slightly decreased the time to reach highest confidence after initial detection by on average (mean of site means) up to 5 d (distance = 100 m), relative to an average of 45 d it takes without considering spatial neighborhood (distance = 0 m).
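The spatial neighborhood rule can be sketched as follows: an alert pixel of one system is promoted to highest confidence if any other system has an alert at the same pixel or within a given distance. The sketch is a simplified illustration under stated assumptions. It uses a square window measured in pixels, so the conversion from metres depends on the pixel size of the alert product, it ignores alert dates, and the function names are hypothetical. The neighborhood shape used in the study may differ.

```python
import numpy as np

def dilate(mask, radius):
    """Binary dilation with a square window of (2 * radius + 1) pixels per side."""
    height, width = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + height, dx:dx + width]
    return out

def highest_confidence(alerts, radius_px=0):
    """Pixels reaching highest confidence under a ruleset-1-like rule.

    alerts: list of 2-D boolean arrays, one per alert system.
    radius_px: neighborhood distance in pixels (0 = pixel-based overlap only).
    """
    alerts = [np.asarray(a, dtype=bool) for a in alerts]
    highest = np.zeros_like(alerts[0])
    for i, own in enumerate(alerts):
        others = np.zeros_like(own)
        for j, other in enumerate(alerts):
            if j != i:
                others |= other
        # An own alert is confirmed by any other system's alert nearby.
        highest |= own & dilate(others, radius_px)
    return highest

# Toy example: one overlapping pixel and one pair of alerts two pixels apart.
sys_a = np.zeros((7, 7), dtype=bool)
sys_b = np.zeros((7, 7), dtype=bool)
sys_a[6, 6] = sys_b[6, 6] = True
sys_a[1, 1] = True
sys_b[1, 3] = True

n_highest = [int(highest_confidence([sys_a, sys_b], r).sum()) for r in (0, 1, 2)]
```

Here only the overlapping pixel reaches highest confidence for radii of 0 and 1 pixel, while a radius of 2 pixels also promotes the two nearby alerts. More alerts enter the highest confidence class as the distance grows, which is also why unrelated false alerts can be promoted.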

Discussion and conclusions
Here we demonstrated that integration of operational satellite-based forest disturbance alerts results in faster and more comprehensive detection of new disturbances. We integrated radar-based RADD, and optical-based GLAD-S2 and GLAD-L alerts in different combinations of two and all three alert systems using two confidence rulesets. We assessed their synergies in the timely detection of a wide range of change types and diverse environmental conditions at ten 1° sites across the Amazon Basin.

Alert integration improves detection timeliness and confidence
Alert integration improved the detection speed of new disturbances by days to months when compared to the earliest detection of any of the three systems (figure 3), and effectively shortened the delay to increased confidence, a process that for individual alerts otherwise requires additional satellite passes from the same sensor and can take additional weeks or months (figure 4).
The increased detection rate to an average of 97% when combining alerts highlights the complementary capabilities of the optical and cloud-penetrating radar sensors in detecting various disturbance types and environmental conditions found across the ten sites, such as fires, selective logging, and cloudy circumstances. The most improvement was observed when integrating RADD and GLAD-S2, capitalizing on the high temporal observation density and spatially detailed 10 m Sentinel-1 and 2 data. This highlights the necessity to extend the use of high resolution optical and radar alerts to other tropical regions and areas outside of the humid tropics.
The highest confidence class was introduced for the integrated alert as an addition to the low and high confidence classes of the individual systems. Highest confidence was applied to forest disturbances detected in multiple alert systems (ruleset-1) and is employed in Global Forest Watch's integrated deforestation alert layer (Berger et al 2022). Importantly, this highest confidence class displayed no false detection, and had a higher level of confidence in comparison to the original high confidence class of the individual alert systems. While achieving 100% user's accuracy may not be realistic in all places or times, this indicates that the likelihood of false detections is low. Alert pixels triggered by data artifacts, such as residual cloud and cloud shadows after imprecise masking in optical images, or rapid changes in wetlands in radar images, are unlikely to occur simultaneously in different systems, both spatially and temporally. This implies that combining low confidence alerts from two different systems can in most cases be assumed to result in highest confidence already (as applied in ruleset-1). Our results suggest that a more conservative ruleset might not yield a more reliable highest confidence class. However, requiring at least three low confidence alerts or one high and one low confidence alert to reach highest confidence (ruleset-2) would address occasional situations involving two overlapping false low confidence alerts.
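The two rulesets, as summarized in this paragraph, can be expressed as a short per-pixel sketch. The integer coding of confidence levels, the function name, and the rule that the integrated alert is labeled high whenever any single system reports high confidence are assumptions made for illustration. Only the conditions for reaching highest confidence are taken from the description above.

```python
import numpy as np

NONE, LOW, HIGH, HIGHEST = 0, 1, 2, 3  # hypothetical coding of confidence levels

def integrate_alerts(levels, ruleset=1):
    """Integrate per-system confidence levels into one confidence layer.

    levels: array of shape (n_systems, ...) holding NONE, LOW or HIGH.
    ruleset 1: highest confidence if at least two systems raised an alert.
    ruleset 2: highest confidence if at least three systems raised an alert,
               or at least two did and one of them is at high confidence.
    """
    levels = np.asarray(levels)
    n_detecting = (levels > NONE).sum(axis=0)
    any_high = (levels == HIGH).any(axis=0)
    # Assumption: below highest, the integrated alert takes the maximum
    # confidence reported by any single system.
    integrated = np.where(any_high, HIGH, np.where(n_detecting > 0, LOW, NONE))
    if ruleset == 1:
        highest = n_detecting >= 2
    elif ruleset == 2:
        highest = (n_detecting >= 3) | ((n_detecting >= 2) & any_high)
    else:
        raise ValueError("ruleset must be 1 or 2")
    return np.where(highest, HIGHEST, integrated)

# Toy example: five pixels, three systems (rows: RADD, GLAD-S2, GLAD-L).
toy_levels = np.array([
    [LOW, LOW, HIGH, NONE, LOW],
    [NONE, LOW, LOW, NONE, LOW],
    [NONE, NONE, NONE, HIGH, LOW],
])
result_r1 = integrate_alerts(toy_levels, ruleset=1).tolist()
result_r2 = integrate_alerts(toy_levels, ruleset=2).tolist()
```

The two rulesets differ only at the second pixel, where two systems report low confidence: ruleset-1 labels it highest, whereas ruleset-2 keeps it at low until a third alert or a high confidence alert arrives. This is consistent with ruleset-2 taking longer to reach highest confidence.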
Considering spatial neighborhood during alert integration enhanced the overall labeled confidence level of alerts, as nearby alerts mutually reinforced their confidence. However, this approach has two main drawbacks. First, applying the spatial neighborhood rule leads to an increased rate of false detections in the high and highest confidence classes. This trade-off challenges the usefulness of considering spatial neighborhood, particularly as it counteracts the idea of enhancing confidence in the highest confidence class. Second, applying spatial neighborhood strongly increases the computational complexity.
Our results emphasize large site-specific differences in the detection capabilities of individual alert systems. Observed producer's and user's accuracies, when compared at sites with similar disturbance types and environmental conditions, were comparable to those found in previous studies when considering equivalent assessment criteria (e.g. sampling unit) (Hansen et al 2016, Reiche et al 2021). For instance, results for RADD in the Congo Basin (83% detection rate and 2% false detection rate for high confidence alerts, Reiche et al 2021) are similar to those obtained from sites in Suriname (site 2) and Guyana (site 3), which are also characterized by small-scale changes such as selective logging, mining and smallholder agriculture.
Due to the choice of narrowing the assessment of spatial accuracy to the last alert update for 2021, we did not investigate the case of overlapping low confidence alerts that became highest confidence for a time but then dropped off (disappeared from the database). This dynamic has been observed for optical-based alerting systems in selective logging areas where the forest change signal does not endure long enough for additional cloud-free observations (rapid canopy closure). Conversely, false low confidence alerts can temporarily trigger highest confidence but are eventually removed. These cases represent unmeasured benefits and drawbacks of integration when alerts are used in a near-real-time context, as intended.

Alert integration improves user experience
Alert integration effectively reduces the trade-off between timely detection and confident identification in forest disturbance monitoring. This is particularly beneficial for forest monitors relying on the highest possible confidence to promptly address deforestation, especially those with constrained resources for on-the-ground law enforcement activities (Cappello et al 2022). Having three distinct confidence classes (low, high, highest) available provides users with the flexibility to tailor their selection based on specific user needs, particularly considering their tolerance for false detections (Reiche et al 2021). Users can make informed decisions about which alert confidence classes align with their specific objectives and acceptable levels of false positives, enhancing the utility and applicability of the alerting system to diverse monitoring and law enforcement requirements.
In addition to enhancing detection speed and confidence, alert integration improves user accessibility and reliability of forest disturbance information. Combining multiple individual alert sources into a single stream simplifies alert use. Integrating alerts from various sensors also helps mitigate interruptions due to sensor failures (e.g. Sentinel-1B in 2021; ESA 2022) or processing pipeline issues (e.g. Landsat's switch to Collection 2 in 2022), ensuring a more consistent stream of alerts. Moreover, integrating alerts can help to expand overall coverage compared to individual alerts that often focus on specific regions.

Implications for operational alert integration
Our integration involved conservative alert systems, minimizing false detections (the user's accuracy of high confidence alerts is 96.6%, and when considering low confidence alerts, it is 89.5%). It is crucial to avoid integrating alert systems with high false detection rates. The efficacy of the different alert systems integrated here primarily relies on the physical detection capabilities of the sensor used rather than the method. Therefore, integrating multiple alert systems based on the same satellite sensors may not yield substantial benefits and could even be counterproductive. The results emphasise the importance of having a good understanding of the uncertainty levels of alert systems to be integrated, especially considering the potential aggregation of false detections from individual alerts during integration. An improved understanding is particularly important as many countries are in the process of implementing their own national alert systems. As methods for forest disturbance alerting continually advance (Balling et al 2023, Mullissa et al 2023, Slagter et al 2023, Zhang et al 2023) and with the forthcoming availability of new satellite datasets, such as temporally dense L-band radar data from NISAR (Rosen et al 2016), it is likely that additional alerting systems will be developed.
Alert integration, performed at the product level, offers major advantages by leveraging the strengths of various optical and radar sensors without introducing new artifacts. In data-level integration, as proposed in prior studies (Reiche et al 2018, Tang et al 2023), optical and radar observations might conflict rather than reinforce each other due to differences in what they detect within forest disturbances (e.g. defoliation vs structural damage); the product-level integration of alert systems described here avoids such issues (Balling et al 2021).
The integrated alerts show a large increase in detected disturbances with only a low rate of additional false detections when compared to the individual alerts, and the high confidence alert level is reached sooner. While forest disturbance alerts are primarily a law enforcement tool and reporting of changes should follow a sample-based approach, the more balanced user's and producer's accuracy increases opportunities for more frequent (subannual) reporting of forest disturbance trends and associated carbon emissions (Csillik et al 2022).
The impact of forest disturbance alerts depends on their institutional use to support forest management and law enforcement actions against illegal and unsustainable activities (Tabor and Holland 2020, Coelho-Junior et al 2022). Reducing institutional barriers to using the alerts, whether they are structural, cultural or political, is crucial (Tabor and Holland 2020). Establishing comprehensive guidelines on leveraging forest disturbance alerts and encouraging knowledge exchange can further accelerate their use. Alert integration is an important data preparation step that simplifies the use of multiple alerts, providing reliable and consistent information on new forest disturbances to stakeholders in a timely manner.

Figure 1. (A) Site footprints within the Amazon Basin, and humid tropical forest mask. Total number of observations for 2021 for (B) Sentinel-1 radar, (C) Sentinel-2, and (D) Landsat-7 & 8. Sentinel-2 and Landsat data were filtered to include only cloud-free land observations, and Sentinel-1 data to remove the overlap in observations from different relative orbit paths at the scene edges (separately for ascending and descending orbits). (E)-(I) Examples of typical forest disturbances across the Amazon Basin: (E) selective logging at site 2, (F) mining at site 3, (G) forest fires at site 5, (H) large-scale clearings for agriculture at site 6, and (I) small-scale clearings for agriculture at site 8.

Figure 2. Example of pixel-based alert integration without (A) and with (B) considering the spatial neighborhood rule. Ruleset-1 is applied for the example to integrate two mock-up alert products.

Figure 3. Mean detection delay ± standard deviation (in days) of the low confidence date (initial detection date) of RADD, GLAD-S2, and GLAD-L relative to the earliest per pixel detection date among the three systems (baseline date), displayed for each of the 10 sites. The system with the lowest mean detection delay is the one that, on average, detects new forest disturbances earliest at the respective site. Refer to table 1 for a detailed site description and appendix 1 for the complete dataset.

Figure 4. Mean (solid point) and range (dashed line) of the 10 site-specific mean detection delays [days] of low, high and highest alert confidence dates relative to the earliest per pixel detection date among the three systems (baseline date), separately for individual and integrated alerts. The site-specific mean detection delay of the 10 sites is provided in addition (open points). Refer to table 1 for a detailed site description and appendix 1 for the complete dataset.

Figure 5. Mean and range (dashed line) of the 10 site-specific producer's accuracy (detection rate; 100% − omission error) (A) and user's accuracy (100% − commission error) (B) of low [and greater] (all alerts), high [and greater] (high and highest confidence alerts) and highest confidence level, separately for individual and integrated alert systems. The site-specific accuracies of the 10 sites are provided in addition (open points). Refer to table 1 for a detailed site description and appendix 2 for the complete dataset.

Figure 6. Disturbance alerts at two sample locations for each of the individual alert systems and for the integrated alert based on all 3 systems for ruleset-1 and ruleset-2. The sample location in the left panel shows selective logging (within site 2), and the yellow box highlights forest disturbances detected by RADD and not by GLAD-S2. The sample location in the right panel shows large-scale clearing for agriculture (within site 6), and the yellow box highlights alerts detected by GLAD-S2 but omitted by RADD. The underlying base map is a post-monitoring Planet monthly image composite.

Figure 7. Disturbance alerts at two sample locations based on all 3 systems (ruleset-1) without (distance = 0 m) and with considering the spatial neighborhood rule (distance = 40 and 100 m).The sample location in the left panel shows selective logging (within site 2), the sample location on the right panel shows large-scale clearing for agriculture (within site 6).The underlying base map is a post-monitoring Planet monthly image composite.
Appendix 1. Site-specific detection timeliness of pixel-based results. Mean ± standard deviation of site-specific detection delay [days] of the different alert confidence levels relative to the earliest low detection date among the three systems (baseline date), separately for individual alert systems and integrated alerts. R-1 refers to ruleset-1 and R-2 to ruleset-2. At the bottom, mean and range (minimum-maximum) of site-specific mean detection delay [days] are provided.

Table 1. Study site information, including the site location (latitude/longitude coordinates of lower right corner), a short description of key forest disturbance types and environmental conditions, and the average number of available observations for Sentinel-1, Sentinel-2 and Landsat-7 & 8, where Sentinel-2 and Landsat data were filtered to include only cloud-free land observations.

Table 2. Mean (minimum-maximum of site means) of site-specific producer's accuracy, user's accuracy, and decrease in delay [days] to reach highest confidence after initial detection, given for increasing spatial neighborhood distance.