Mapping global forest regeneration–an untapped potential to mitigate climate change and biodiversity loss

Forest regeneration can be a low-cost solution to mitigate climate change, and mapping its extent can support global goals such as the Bonn Challenge, which aims to bring 350 million hectares of degraded forests and landscapes into restoration by 2030. Our study combined multiple remote sensing datasets and expert surveys, identifying 55.7 ± 6.2 million hectares of likely regenerated forests between 2000 and 2015 across areas that were not forested before 2000 and have remained forested from 2015 to 2018. The identified forest regeneration could potentially represent 22–25 billion young trees and a total biomass of about 3.2 billion tonnes. In every country, forest regeneration took place in sites with a lower opportunity cost for agriculture, but in more developed regions it took place in sites with higher suitability for cultivation. Expert feedback identified agricultural land use transitions and the establishment of protected areas, coupled with effective management and local support, as the key factors leading to successful forest regeneration. The publicly available results can facilitate discussions and help identify strategic locations to foster forest regeneration to achieve the global goals of mitigating climate change and restoring biodiversity.


Introduction
Mapping and estimating forest regeneration at the global level can help quantify progress towards the global goals of sequestering carbon for climate change mitigation and improving habitat for biodiversity, as forest regeneration is one of the most cost-effective ways to achieve both of those goals [1][2][3][4]. While there are local and regional estimates of regenerated forest areas [5][6][7], global estimates are mostly derived from the Global Forest Resources Assessment (FRA) of the Food and Agriculture Organization (FAO) of the United Nations, which quantifies forest area gained. Based on national land use reports, the FRA estimated 140 million hectares of forest gained between 2000 and 2015 [8]. However, the estimate is not spatially explicit. The currently available spatially explicit global forest gain data [9] has limitations and does not represent the full picture of forest regeneration: the dataset flags a pixel only when its forest cover is over 50%, regardless of the previous land cover [9], whether the forest remains [10], or whether the gain came from commercial forest plantation [11]. In addition, at a spatial resolution of 30 m, a pixel likely represents only a few trees in tropical regions. As a result, Global Forest Watch, a widely used open-source tool for monitoring forest change that relies on the same dataset, uses the term 'tree cover' rather than 'forest cover' to describe the change detected [12].
Remotely sensed data represents a viable and practical first proxy for identifying areas where forest regeneration may have occurred [13]. However, its use for this purpose remains more complex than the detection of forest loss [14], as forest regeneration is a long and slow process. Forest regeneration can take many forms: 1. tree planting through industrial plantation (herein referred to as 'plantation'), 2. active forest regeneration (including agroforestry, seedling, site preparation, etc.), and 3. natural forest regeneration (NFR) [12,15]. NFR delivers great benefits for biodiversity and other environmental services, such as regulating the local hydrological cycle, mitigating heat waves, and preventing landslides [16,17]. However, it is almost impossible to distinguish NFR from the other two forms of forest regeneration using remote sensing alone, let alone from other phenomena that can result in an increase in biomass or greenness.
Given these challenges, we used multiple remote sensing datasets and products coupled with relevant local ancillary data and context information to create a global forest regeneration map. Using multiple remote sensing datasets and products from various satellite sources can help improve the mapping and monitoring of land cover change, as each of them can complement the shortcomings of one another [18,19]. Ancillary information provided by local experts and practitioners on the ground around the world can help calibrate remote sensing results, identify the type of forest regeneration, exclude plantation areas that are only known locally, and provide local socioeconomic or policy context for the drivers of observed forest change.
To gain an understanding of the biophysical and socioeconomic contexts of forest regeneration and where it has occurred, this study 1. developed a method that combined available forest cover products, remote sensing data, and survey information to map global forest regeneration; 2. collected local expert feedback via a survey to help validate the map and distinguish between different forms of forest regeneration; and 3. acquired a preliminary understanding of why the regeneration has occurred based on the feedback. We aimed to provide a global view of where forest regeneration has likely occurred and its contributing factors. In this study, forest regeneration is defined specifically as forest cover gain between 2000 and 2015 in areas that were not forested for the 10 years prior and where the regrown forest persisted until 2018. This definition is intended to help exclude regrowth due to plantations and temporary disturbance events such as fires or windthrow.

Materials and methods
Multiple remote sensing datasets and products were used to create an intermediate global forest regeneration map, which was sent to forest experts for feedback through a standardised survey. The feedback was used to validate the intermediate map and understand the local contextual factors that enabled forest regeneration, and was incorporated to create the final global forest regeneration map. A 25% canopy cover threshold was used to define forest in the remote sensing analyses; this threshold was chosen based on discussions with forest experts around the world (appendix section 1).

Remote sensing analysis
We used multiple sources of remote sensing datasets and products to create a global forest regeneration map. The advantage of using multiple sources of data is particularly relevant at the global scale with diverse forest types and locations such that no single dataset or classification method works well enough. For instance, optical sensors such as Landsat have difficulties in monitoring tropical areas with persistent cloud cover, but they work well in boreal regions [20,21]; whereas active sensor data such as synthetic aperture radar data can help overcome the issues related to cloud cover [19], but have difficulties with noise resulting from snow cover in boreal regions [22].
We used 23 land use and land cover remote sensing datasets and products to identify areas of potential forest regeneration using the following criteria (table 1): 1. classified as non-forest for 10 years before 2000; 2. a forest stand age younger than 4 years in 2000, or a statistically significant monotonically increasing trend in the biomass/greenness estimates; and 3. classified as forest from 2015 to 2018. For the second criterion, the trend was assessed on yearly time series of various vegetation indices between 2000 and 2015, and required both a significant Mann-Kendall test (p < 0.05) [37,38] and a Sen's non-parametric slope estimate [39] greater than 0.01 on the normalised data. We flagged areas where at least two remote sensing datasets supported each of the three criteria as potential forest regeneration (figure 1). The threshold values for the p-value, the slope, and the number of datasets required for each criterion were selected based on discussions with global forest experts (appendix sections 2 and 3). Given that this study had a global scope, each remote sensing dataset and product was resampled to 250 m resolution using the nearest neighbour resampling method.
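The trend criterion and the two-dataset rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the processing chain used in the study: the function names are hypothetical, the min-max normalisation is an assumption (the text does not say how the data were normalised), and the Mann-Kendall test uses the normal approximation without a correction for ties.

```python
import math
from itertools import combinations
from statistics import median


def mann_kendall(series):
    """Two-sided Mann-Kendall test; returns (S, p). No tie correction (assumption)."""
    n = len(series)
    # S counts later-minus-earlier sign agreements over all ordered pairs.
    s = sum((b > a) - (b < a) for a, b in combinations(series, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the standard normal
    return s, p


def sen_slope(series):
    """Sen's estimator: median of all pairwise slopes, assuming yearly (unit) steps."""
    slopes = [(series[j] - series[i]) / (j - i)
              for i, j in combinations(range(len(series)), 2)]
    return median(slopes)


def normalise(series):
    """Min-max scaling to [0, 1]; the scaling choice is an assumption."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [(v - lo) / (hi - lo) for v in series]


def increasing_trend(series, alpha=0.05, min_slope=0.01):
    """True if the normalised series rises significantly and steeply enough."""
    norm = normalise(series)
    s, p = mann_kendall(norm)
    return s > 0 and p < alpha and sen_slope(norm) > min_slope


def criterion_met(dataset_flags, min_support=2):
    """A criterion holds for a pixel when at least two datasets support it."""
    return sum(dataset_flags) >= min_support
```

For example, `increasing_trend` applied to 16 steadily rising yearly values (2000 to 2015) returns `True`, and `criterion_met([True, True, False])` returns `True` because two datasets agree.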

Expert feedback
The intermediate map was calibrated with feedback from local experts. A web platform was developed to allow local experts to provide feedback on the intermediate map generated by remote sensing. Local experts were able to draw polygons around areas they were familiar with. A list of questions associated with each polygon allowed the experts to verify the map and provide contextual local information on what enabled forest regeneration to occur (appendix section 4). Qualitative analyses were used to 1. determine for each biome [40] whether sufficient feedback responses were received and whether they were representative; 2. detect the patterns of error in the intermediate map; and 3. correct the remote sensing estimates (appendix section 4.1). Regeneration areas labelled by the experts as due to plantations or natural disturbances were removed, and missed forest regeneration areas were added after confirmation with high resolution remote sensing time series images. Based on the conclusions from the qualitative analyses of the feedback, plantation areas were further masked out using local land use information, the Spatial Database of Planted Trees [41], and the Global Oil Palm Plantation Map [42] to create the final forest regeneration map.

Accuracy assessment
Accuracy assessments were conducted for the final regeneration map at three levels: 1. global, 2. selected hexagons, each with at least 3.1% of its area mapped as forest regeneration, and 3. the mapped regeneration areas with their 1 km buffer. For the global level, 2141 out of 2500 sample points were used. Because the total mapped forest regeneration area is only about 0.4% of the land area, we created 10 000 km² hexagonal grid cells covering all land areas to gain control over the sample points allocated to the mapped regeneration area. Hexagons with 5000 or more pixels of mapped forest regeneration were selected, which is equivalent to at least 3.1% of the area of a hexagon. We used 2145 out of the 2700 stratified random sampling points distributed across those selected hexagons. The buffer-level assessment was used to calculate the 95% confidence intervals of the area estimates [43]; for this level, 2020 out of the 2200 stratified random sampling points were used. For all three levels, the sampling points were distributed across all land areas excluding Antarctica. We visually assessed whether the 250 m × 250 m area surrounding each sample point satisfied all three criteria of forest regeneration. High spatial and temporal resolution images from 1990 to 2018 available on Google Earth Pro were used as reference land cover. Sample points were eliminated where the high spatial resolution time series images necessary for validation were not available. The accuracy assessment at each level was summarised in a confusion matrix [44] (appendix section 5).
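The hexagon selection threshold and the accuracy measures read from a confusion matrix can be checked with a short Python sketch. The function names and the example counts are hypothetical, and the metrics here are plain sample-count proportions; the study's area estimates follow [43], which may apply area weighting that this sketch does not reproduce.

```python
PIXEL_SIZE_M = 250          # resampled pixel size used in the study
HEXAGON_AREA_KM2 = 10_000   # area of one hexagonal grid cell


def regeneration_fraction(n_pixels):
    """Share of a hexagon covered by n_pixels of mapped regeneration."""
    pixel_area_km2 = (PIXEL_SIZE_M / 1000) ** 2
    return n_pixels * pixel_area_km2 / HEXAGON_AREA_KM2


def accuracy_metrics(matrix, target):
    """matrix[mapped_class][reference_class] holds sample counts."""
    classes = list(matrix)
    total = sum(matrix[m][r] for m in classes for r in classes)
    correct = sum(matrix[c][c] for c in classes)
    mapped_total = sum(matrix[target][r] for r in classes)
    reference_total = sum(matrix[m][target] for m in classes)
    return {
        "overall": correct / total,
        # User's accuracy: share of mapped target points confirmed by reference.
        "users": matrix[target][target] / mapped_total,
        # Producer's accuracy: share of reference target points that were mapped.
        "producers": matrix[target][target] / reference_total,
    }


# Hypothetical counts, for illustration only (not the study's confusion matrix).
example = {
    "regeneration": {"regeneration": 80, "other": 20},
    "other": {"regeneration": 5, "other": 95},
}
metrics = accuracy_metrics(example, "regeneration")
```

With 5000 pixels, `regeneration_fraction` gives 312.5 km² out of 10 000 km², or about 3.1% of a hexagon, which matches the threshold stated above.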

Additional analyses
The final global forest regeneration map was combined with tree density predictions from the ecoregion-level and biome-level models [45] to estimate the total number of trees regenerated. The resolution of the tree density maps is 897.3 m. Each pixel was partitioned to ensure that the correct number of trees was counted in each resampled 250 m resolution pixel (appendix section 8). We estimated the regenerated forest biomass by combining the map with the biomass density estimates of forests younger than 20 years [46]. Calculations were performed for a hypothetical scenario in which 100% of the regenerated forest was naturally regenerated (appendix section 8). A map with a finer definition of ecoregions produced by the Nature Conservancy [47] was used to estimate the encroachment of forest into non-forest ecoregions (refer to appendix section 10 and dataset S5 for the classification of the forest and non-forest ecoregions). To understand the dynamics of different types of forest regeneration with agriculture, the final map was overlaid with global slope values calculated from the Global Multi-resolution Terrain Elevation Data 2010 [48] by ESRI [49] and the Agricultural Suitability of Global Soil [50] data for the major regions in the world. The Agricultural Suitability of Global Soil data was developed based on an index derived from climate and soil data that represents the suitability level for cultivation in a particular location [51] and provides the fraction of each grid cell suitable for cultivation [50] (appendix section 9). The analysis was done separately for areas that were identified by forest experts as NFR, areas identified as active regeneration, and areas without any identification. The map of annual gross economic rents of the world's crop and grazing lands of 2005 [52] was used to compare the opportunity cost of agriculture where forest regeneration had taken place and where it had not (appendix section 11).
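One simple way to picture the pixel partitioning step is area-proportional allocation: a 250 m pixel receives the share of the trees in its 897.3 m parent pixel that corresponds to its share of the area. The sketch below assumes that the tree density map stores a tree count per coarse pixel and that trees are spread evenly within it; both are assumptions for illustration, the function names are hypothetical, and the actual procedure is the one described in appendix section 8.

```python
COARSE_RES_M = 897.3   # resolution of the tree density maps
FINE_RES_M = 250.0     # resolution of the regeneration map


def trees_per_fine_pixel(trees_in_coarse_pixel):
    """Allocate trees to one 250 m pixel in proportion to its area share."""
    area_share = (FINE_RES_M / COARSE_RES_M) ** 2
    return trees_in_coarse_pixel * area_share


def total_regenerated_trees(parent_tree_counts):
    """Sum the allocation over all regenerated 250 m pixels.

    parent_tree_counts holds, for each regenerated fine pixel, the tree
    count of the coarse pixel that contains it.
    """
    return sum(trees_per_fine_pixel(t) for t in parent_tree_counts)
```

Under these assumptions each 250 m pixel receives roughly 7.8% of the trees in its parent pixel, so counting the coarse value once per fine pixel without partitioning would overstate the total by about a factor of 13.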

Global forest regeneration map and accuracy assessments
Our results show an estimated 55.7 ± 6.2 million ha where forests appear to have regrown between 2000 and 2015 in areas that were not forested from 1990 to 1999 and remained forested until 2018. Based on the spatial model results of tree density at maturity [45], these regenerated forests could represent between 22.0 and 25.1 billion trees. The total aboveground biomass (dry mass) recovered would be 3.2 billion tonnes, or 5.9 Gt CO2e, if all of the forest were naturally regenerated. About 1% of the total mapped area was located in the tundra ecoregion and about 3.6% in grassland/savanna ecoregions (appendix section 10).
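The step from dry biomass to CO2 equivalent can be reproduced with a short calculation. The carbon fraction of 0.5 is an assumption (the text does not state the value used), chosen here because it reproduces the reported figure; 44/12 is the molar mass ratio of CO2 to carbon.

```python
CARBON_FRACTION = 0.5      # assumed share of carbon in dry biomass
CO2_PER_C = 44.0 / 12.0    # molar mass ratio of CO2 to carbon


def biomass_to_co2e(biomass_gt):
    """Convert dry biomass (Gt) to Gt CO2e under the assumptions above."""
    return biomass_gt * CARBON_FRACTION * CO2_PER_C


co2e = biomass_to_co2e(3.2)  # about 5.87, which rounds to 5.9 Gt CO2e
```

A lower carbon fraction, such as 0.47, would give a somewhat smaller value, so the result is sensitive to this choice.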
Remote sensing results showed clustering of apparent regeneration in the boreal region (including Russia, Canada, and northern Europe, especially Sweden), northern Mongolia, central China, the lower Mekong region, Europe around the Caspian Sea and the Mediterranean Sea, west Africa, northern Congo Basin, southeastern Brazil, Cuba, Central America, and southeastern United States (figure 2, figure 3). Large countries such as Russia, China, and Canada showed the largest areas of mapped regenerating forest (table 2). The ten countries with the largest mapped area accounted for 78% of the total. When the mapped regeneration area is considered relative to the land area of a country, smaller countries in the Caribbean, the lower Mekong, and eastern Europe topped the list, with the top ten of these countries accounting for 23.1% of the total (table 3) (total and relative regeneration areas for all countries are available in dataset S1). The area estimates of forest regeneration in the tropical regions (between 25° N and 25° S) of our study were compared with those from Fagan et al (2022) for a similar time period [7] (table 4).
Accuracy assessment at the global level showed an overall accuracy of 86.8 ± 1.5% (henceforth, uncertainty is expressed as the estimate ± 95% confidence interval). For the regeneration class, the producer's accuracy was estimated at 94.8% and the user's accuracy at 72.5%. At the hexagon level, the overall accuracy was 90.6 ± 1.4%, and the producer's and user's accuracies of the regeneration class were 94.6% and 76.3% respectively. See appendix section 5 for the confusion matrices and detailed explanations of the validation procedures. Global-level validation sample points were also partitioned by biome. The accuracy levels for nearly all biomes were above 84%, except for 'Tropical and Subtropical Grasslands, Savannas, and Shrublands', a non-forest biome, with an accuracy of about 77.5% (appendix section 6).

Local expert feedback
Expert feedback was collected from 123 sites in 29 countries and 13 biomes, covering 141.8 million ha. Qualitative analyses of the feedback showed that additional feedback is needed for the Tropical and Subtropical Coniferous Forests biome, because its share of the area that received feedback was smaller than its share of the forest regeneration mapped using remote sensing analyses (table 5). Regarding the underlying factors thought to enable forest regeneration, 42% of the responses reported NFR and 19% reported active regeneration such as assisted regeneration and agroforestry (dataset S2). Enabling factors for NFR could be categorised into three main types: 1. natural dynamics (17%); 2. establishment and impact of protected areas (33%); and 3. agriculture or pasture land abandonment (42%). For natural dynamics, forest regeneration was reported to have occurred on newly available lands due to naturally occurring phenomena such as meandering rivers. Protected areas were interpreted as fostering NFR by reducing grazing, illegal logging, and fires, which helps ensure the survival of seedlings. Lastly, NFR was also reported to occur in abandoned or fallow agricultural and pastoral lands. The abandonment was reportedly due to increases in labour cost, migration, and mechanisation.

Interaction with agriculture
The forest regeneration identified by local experts as NFR tended to occur in areas with a steeper slope, compared with those identified as active regeneration or those unidentified (figure S1). The only exception is eastern Asia, where the mean slope for NFR (mean = 8.65°) was significantly less than that of both unidentified (mean = 10.21°) and active (mean = 9.60°) regeneration areas, according to the unpaired two-samples Wilcoxon test (p < 0.0001). In general, regeneration occurred in areas with a slope of less than 10°, except in eastern and southern Asia.
Forest regeneration occurred over a wide range of levels of soil suitability for cultivation (figure S2). The forest regeneration map showed that most regeneration in the boreal region took place where the soil was the least suitable for agriculture. The locations include northern Europe, northern America (Canada and the United States), and Asiatic Russia, where most regeneration occurred in areas with less than 14% soil suitability for cultivation.
For some non-boreal regions, regeneration also tended to occur in areas with low soil suitability for cultivation, such as areas in southeastern Asia, South America, and Sub-Saharan Africa. Regeneration in developed regions such as Europe and Australia/New Zealand occurred in places with a wide range of soil suitability levels, and the mean and median values were at the higher end of suitability (over 53% of the areas were suitable for cultivation). In contrast, in central America (including Mexico) and the Caribbean, the majority of regeneration occurred in areas with greater than 80% soil suitability for cultivation. However, areas reported as NFR were located on land with soil less suitable for cultivation than areas reported as agroforestry/active regeneration, except in Europe.
For each country, the agriculture opportunity cost of the sites where forest regeneration happened was less than where it did not occur. The average difference was about $126/ha. Distribution of the difference in the agriculture opportunity cost for all the countries is available in appendix section 11.

On the scale of forest regeneration and carbon sequestration
In comparison with the FRA estimates, where 150 million ha of forest were reported as recovered between 2000 and 2015 [8], our estimate of 55.7 million ha is a much lower figure. However, our study results are complementary to the estimate of the FAO and not suitable for direct comparison, as the FAO estimate was based on national land use reports [8] while our estimate was based on land cover data. The recovered forest that countries reported could be too young to be detected by remote sensing and the exclusion of the areas that were forested from 1990 to 1999 likely resulted in the lower estimate in our study. Similarly, our estimates were smaller than those of Fagan et al [7], but they fell within the range of uncertainties. The smaller values are likely due to our study having a longer time frame, a higher canopy cover threshold in defining forest (Fagan et al 2022 used 10% as the threshold [7]), and more restrictions on the prior land cover. Our estimate could be considered as conservative because for an area to be identified, at least two remote sensing datasets were needed to support each and all of the criteria stated above. Such an approach was used to balance the need of minimising false positives, while providing information that can be used for large-scale planning.
Estimating the proportion of NFR within the total area of identified forest regeneration remains highly uncertain, as the total area of feedback received was limited. Collecting feedback from around the world remains challenging. If we assume the survey feedback represented a comprehensive and an unbiased sample of the total mapped regeneration area, then out of the 55.7 million ha of potentially regenerated forests, 42% could be attributed to NFR. However, neither remote sensing nor experts can accurately differentiate between active regeneration and NFR at the global scale, because in reality, spontaneous forest regeneration can happen in between intensely managed forests [53]. Additionally, NFR could have been missed in the boreal region due to the slow growing nature of secondary forests or dry forest regions due to low vegetation signal-to-noise ratios [54].

Forest encroachment to other ecoregions
Forest encroachment into non-forest ecoregions has more negative than positive impacts on the environment. About 4.6% of the mapped forest regrowth occurred in non-forest biomes. Such forest encroachment in non-forest biomes, such as tundra [55,56], grassland/savanna [55], or other previously non-forested areas at high altitude [57], could be part of the complex interactions and dynamics between the changing climate and biomes [55]. Forest encroachment in those non-forest biomes not only causes a positive feedback that exacerbates global warming, because albedo decreases as darker tree canopy replaces lighter surfaces such as snow [58] and grass, but also results in the loss of biodiversity [59]. In areas vulnerable to fire, burning of the encroached forests releases more carbon than burning of grasses [60].

Conditions driving forest regeneration
The conditions driving forest regeneration are quite different when comparing boreal and non-boreal regions. For the boreal region, forest regeneration is mostly associated with events caused by global climate change, such as post-fire or post-disturbance regrowth [61,62]. Global climate change also disturbs the fire and hydrological cycles, which could lead to forest loss [63,64] and species composition change [63] in the boreal region. For example, local experts from Mongolia confirmed changes in species composition in the northern part of the country. Species composition change can be detected from remote sensing as greening [65].
In contrast, forest regeneration in non-boreal regions was more closely related to agricultural land use dynamics, as depicted in the forest transition theory [66]: forest loss during the early stages of economic development is expected to be followed by recovery at later stages in areas where the dependence on agriculture declines, or when countries depend more on agricultural imports [15,66]. The distribution of forest regeneration across soil suitability levels supports this theory (figure S2): it was more evenly distributed in the more economically developed geographical regions, where most of the countries have reached post-transition status (e.g. Europe, Australia/New Zealand), than in regions where most countries are dependent on the agricultural sector [67]. Another region with an even distribution was eastern Asia, where the majority of the mapped regeneration came from China. Its government policies for nation-wide afforestation/reforestation programmes were the primary factor contributing to forest regeneration [68]. For regions that are highly dependent on the agricultural sector, such as South America and sub-Saharan Africa, forest regeneration took place in areas with lower levels of soil suitability, suggesting that agricultural abandonment occurred on marginal lands. In southeastern Asia, the detected forest regeneration was likely due to regrowth after land was left fallow under shifting agriculture [69][70][71], an observation confirmed by feedback from various experts. Similar feedback was received from experts for areas in southern Mexico.

Implications for forest restoration policies and looking forward
A sustainable project would be located in an area where biophysical or socioeconomic risk is minimised. Forests that regenerated in areas highly suitable for agriculture tend to be more productive and resilient to biophysical risks such as fires or drought [5]. Since many of the regenerated forests identified in this study were found on land with poorer soil suitability, they might be more vulnerable to biophysical risks. Socioeconomic risk cannot be generalised at the global level. However, a successful restoration project has to be locally based, inclusive, socially desirable, and satisfying to the interests of various stakeholders at both the local and national level [72]. Establishing protected areas, such as national parks or indigenous territories, if they are well managed, allows for better fire management and decreases pressure from grazing or other land uses, which could ensure the survival of young trees [73]. An important note is that the effectiveness of protected areas depends on the interests of local stakeholders. In all reported cases where the protected area successfully enhanced NFR, it was part of larger land use planning that benefitted the communities, such as mitigating landslides, increasing agricultural yield as the result of improving habitats, or generating economic benefits from tourism. For long term sustainability of a forest restoration project, the interests of the local stakeholders must be taken into account.
We further invite additional feedback from experts about the socioeconomic, policy, and historical context of the mapped regenerated areas. The goal is to form a baseline understanding of the conditions (2000-2018) that resulted in successful forest regeneration so that it can be promoted in places where there appear to be low barriers to success. Remotely sensed data, when coupled with periodic local surveys, can provide information for land managers to meet the different needs of various stakeholders [13]. Additional expert feedback is still needed for the feedback to be representative of the distribution of the regenerated forest, particularly for the following biomes: Boreal Forests/Taiga, Temperate Coniferous Forests, Temperate Grasslands, Savannas and Shrublands, Tropical and Subtropical Coniferous Forests, and Tropical and Subtropical Dry Broadleaf Forests (see dataset S3 for details). The results of this study show that NFR has been occurring at a significant scale and that the enabling conditions appear to differ substantially across continents, biomes, and countries. Therefore, to better achieve global goals in carbon reduction and habitat improvement, such as the Bonn Challenge, we suggest prioritising areas for in-depth analyses based on local relevance and demand.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).