Evaluating permafrost definitions for global permafrost area estimates in CMIP6 climate models

Global permafrost regions are undergoing significant changes due to global warming, and assessments of these changes often rely on permafrost extent estimates derived from climate model simulations. These assessments employ a range of definitions for the presence of permafrost, leading to inconsistencies in the calculation of permafrost area. Here, we present permafrost area calculations using 10 different definitions for detecting permafrost presence based on either ground thermodynamics, soil hydrology, or air–ground coupling from an ensemble of 32 Earth system models. We find that variations between permafrost-presence definitions result in substantial differences of up to 18 million km², where any given model could either over- or underestimate the present-day permafrost area. Ground-thermodynamic-based definitions are, on average, comparable with observations but are subject to a large inter-model spread. The associated uncertainty of permafrost area estimates is reduced in definitions based on ground–air coupling. However, their representation of permafrost area strongly depends on how each model represents the ground–air coupling processes. The definition-based spread in permafrost area can affect estimates of permafrost-related impacts and feedbacks, such as quantifying permafrost carbon changes. For instance, the definition spread in permafrost area estimates can lead to differences in simulated permafrost-area soil carbon changes of up to 28%. We therefore emphasize the importance of consistent and well-justified permafrost-presence definitions for robust projections and accurate assessments of permafrost from climate model outputs.


Inconsistencies in model-based permafrost estimates
Permafrost, generally defined as ground that is frozen for at least two consecutive years [1], is currently subject to change due to global warming [2]. Vast areas of permafrost are found in the Northern Hemisphere high latitudes and in high-altitude environments on the Tibetan Plateau [3]. These permafrost regions are spatially heterogeneous, with distribution zones characterized by the fraction of the ground surface that is underlain by permafrost, typically denoted continuous, discontinuous, sporadic, and isolated permafrost zones [4,5]. Because of Arctic amplification, these regions are exposed to warming of up to four times the global average [6] and are undergoing large changes, making permafrost a vulnerable component of the climate system. Permafrost thaw can occur over decades to millennia, causing a positive carbon-climate feedback [7] and local to regional impacts on ecosystems [8] and society [9].
Monitoring these changes in permafrost temperature and extent is commonly based on sparse and inhomogeneously distributed site-specific observations, often aggregated to provide information about the large-scale status of permafrost. The Global Terrestrial Network for Permafrost coordinates terrestrial permafrost monitoring with ground temperature observations from over 1000 boreholes [10]. However, these monitoring sites are unevenly distributed, and their thermal properties are inhomogeneous, which hinders permafrost data extrapolation to larger regions and results in large unsampled areas [5]. Thus, despite the importance of permafrost for the climate system, high-resolution detection of permafrost presence is limited at the global scale [3,11].
Therefore, Earth system model (ESM) simulations are commonly used under the assumption of various shared socioeconomic pathways (SSPs) to explore the response of permafrost (herein referring to near-surface permafrost within the first few meters of the ground) to future changes in climate and to determine its spatial extent. Resulting permafrost masks are commonly used for bulk estimates of model output variables describing carbon-related emissions, hydrology, and vegetation in permafrost regions [12-15]. However, the confidence in model-derived estimates for future permafrost conditions is often based on a comparison with the observation-based estimates under past and current climate conditions. Evaluating different generations of ESMs has revealed large uncertainties compared to observational estimates, as well as between models, even in the latest generation of models [16,17]. In addition to these inter-model differences, numerical estimates of permafrost area are also subject to uncertainty from variations in the methods used to define permafrost presence, which can provide a range of results for the same model. Commonly, individual studies use a single method to define the presence of permafrost. However, some studies have provided insights into the uncertainty between methods with an ad hoc comparison of permafrost presence based on only a few definitions of permafrost extent [16-18]. Typically, they find substantial variations in the derived global permafrost area, indicating a considerable impact of permafrost definitions on estimates of the spatial extent of permafrost. Hence, using different definitions can introduce biases when comparing and integrating data from different studies, hindering interdisciplinary collaboration and the synthesis of current knowledge on permafrost dynamics.
Here, we present a systematic assessment of the impact of the definitions of permafrost presence for climate model-based estimates. We provide permafrost area estimates following common definitions used in recent literature derived from 32 ESMs from the Coupled Model Intercomparison Project Phase 6 (CMIP6). We discuss the potential causes of differences between models and intra-model variations caused by different permafrost-presence definitions on estimates of permafrost area in the Northern Hemisphere. Further, we demonstrate the impact of definition-caused variations in the permafrost presence on soil carbon changes in the permafrost region over the 21st century and how the propagation of uncertainties may introduce systematic biases in projections of the evolution of the permafrost region in the future.

Permafrost-presence definitions
We use three different types of definitions to diagnose permafrost presence from ESMs: derived from 1) soil temperature (Ground thermodynamic), 2) air temperature with an assumed air-ground coupling (Air-Ground coupling), and 3) frozen soil moisture (Ground hydrology). Based on these categories, we calculate permafrost presence using a total of 10 different definitions (table 1).
Ground thermodynamic definitions are based on ground temperatures (table 1: SLT, ALT, ZAA, TTOP). These definitions assume permafrost presence to be predominantly determined by the propagation of surface temperatures into the ground, though they are influenced by the depth of the model bottom boundary condition [35,36] and soil thermal diffusivity [37,38]. Definitions of this type differ in the depth at which ground temperatures below 0 °C are considered to determine the presence of permafrost (fixed for SLT and variable for ALT, ZAA, and TTOP).
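As a minimal sketch of a fixed-depth ground-thermodynamic check, the following assumes that permafrost is present when the monthly soil temperature at one chosen depth stays at or below 0 °C for at least 24 consecutive months, following the general two-year definition given in the introduction. The function name, the 24-month window, and the sample values are illustrative assumptions and are not the exact criteria of table 1.

```python
from typing import Sequence


def frozen_for_two_years(monthly_temp_c: Sequence[float], window: int = 24) -> bool:
    """Return True if temperatures stay <= 0 degC for `window` consecutive months.

    Assumption: one value per month at a single fixed depth.
    """
    run = 0  # length of the current run of frozen months
    for temp in monthly_temp_c:
        run = run + 1 if temp <= 0.0 else 0
        if run >= window:
            return True
    return False


# Hypothetical monthly soil temperatures (degC) at one fixed depth.
cold_site = [-3.0, -2.5, -1.0, -0.5, -0.2, -0.1] * 4      # 24 months, never thaws
marginal_site = [-3.0, -2.5, -1.0, 0.4, -0.2, -0.1] * 4   # thaws briefly every 6 months

print(frozen_for_two_years(cold_site), frozen_for_two_years(marginal_site))
```

A variable-depth definition would apply the same test to a depth that is itself diagnosed from the temperature profile, which is where model depth and vertical discretization enter.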
Definitions based on air-ground coupling generally infer permafrost presence from air temperature data. Such definitions are based on a predetermined relationship between ground and air temperatures in permafrost regions (table 1: PROB, FNA, FNG). The relationships between ground and air temperatures used in definitions of this type are governed by land surface characteristics, such as snow or vegetation cover. Thus, for these definitions, differences in the representation of ground thermodynamics among climate models are not considered. This implies a possibility for a better cross-model agreement in the permafrost area estimates for models that project a similar climate and its variability.
Permafrost-presence definitions based on soil hydrology are mainly used through geophysical modeling in hydrological/hydrogeological studies. They rely on soil ice content or saturation, derived from soil ice fraction or the presence of frozen soil water below the freezing point (table 1: SIC). The choice of defining permafrost this way is primarily due to the fact that soil ice saturation of about 75% by volume often results in practically impermeable soils [39].
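The SIC criterion can be sketched in a few lines, using the 0.5 kg m⁻² threshold quoted for SIC in table 1. The per-layer input format and the function name are assumptions made for illustration; when a column total is already available, it can be passed as a single-element list.

```python
from typing import Sequence

SIC_THRESHOLD_KG_M2 = 0.5  # threshold quoted for SIC in table 1


def sic_permafrost(frozen_water_kg_m2: Sequence[float]) -> bool:
    """Return True if the column-total frozen soil water exceeds the threshold.

    frozen_water_kg_m2: frozen soil water per layer (kg m^-2) for one grid cell
    (hypothetical input layout).
    """
    return sum(frozen_water_kg_m2) > SIC_THRESHOLD_KG_M2


# Hypothetical columns: three layers each.
icy_column = [0.1, 0.3, 0.4]     # total 0.8 kg m^-2
nearly_dry_column = [0.0, 0.1, 0.2]  # total 0.3 kg m^-2

print(sic_permafrost(icy_column), sic_permafrost(nearly_dry_column))
```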
To evaluate permafrost presence based on the 10 different definitions, we use subsurface and surface temperature, air temperature and soil moisture (table 1) from the monthly output of 32 CMIP6 models on native model grids for the historical and SSP5-85 scenario periods. Our selection of models was determined by data availability on the CMIP6 data server at the time of access. Subsequently, permafrost presence maps are regridded with bilinear interpolation to the common horizontal resolution of CESM2 (0.9° × 1.25° grid; 288 × 192 longitude/latitude).

Table 1 (rows recovered here). Fields per row: definition; abbreviation; description; CMIP6 variable (a); type; example studies (b).
Frost number from air temperature; FNA5, FNA6; the frost number (FN) is calculated from DDT and DDF, where DDT (DDF) are the degree days of thawing (freezing). These are calculated from monthly data and are the sum of the degree days above (below) freezing in any given year. FN has a defined threshold, typically 0.5 (FNA5), with values above that threshold considered as permafrost. Here we also discuss the value of 0.6 (FNA6); tas; Air-ground coupling; [16,29,30].
Frost number from ground surface temperature; FNG; frost number is defined as for FNA5/FNA6 above but using monthly soil temperatures at 20 cm depth; tsl; Air-ground coupling; [16,30,31].
Soil ice amount; SIC; there is more than 0.5 kg m⁻² of frozen soil water in the soil column summed over all soil layers, averaged over the grid cell area fraction (i.e. the total mass of frozen water contained in the soil column divided by the grid cell land area); mrfso; Soil hydrology; [32-34].
(a) Universal name of variables in CMIP6 data used for the calculations herein. (b) Examples of studies using the respective definition.
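The frost-number definitions can be illustrated with a short sketch. The formula itself is not reproduced in the table text above, so the code assumes the widely used form FN = sqrt(DDF) / (sqrt(DDF) + sqrt(DDT)); the days-per-month weighting, the function names, and the sample climatologies are likewise assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # leap years ignored


def degree_days(monthly_temp_c: Sequence[float]) -> Tuple[float, float]:
    """Return (DDT, DDF) in degC days from 12 monthly mean temperatures."""
    ddt = sum(t * d for t, d in zip(monthly_temp_c, DAYS_IN_MONTH) if t > 0.0)
    ddf = sum(-t * d for t, d in zip(monthly_temp_c, DAYS_IN_MONTH) if t < 0.0)
    return ddt, ddf


def frost_number(monthly_temp_c: Sequence[float]) -> float:
    """Frost number in the assumed form sqrt(DDF) / (sqrt(DDF) + sqrt(DDT))."""
    ddt, ddf = degree_days(monthly_temp_c)
    if ddt == 0.0 and ddf == 0.0:
        return math.nan  # degenerate case: every month exactly at 0 degC
    return math.sqrt(ddf) / (math.sqrt(ddf) + math.sqrt(ddt))


# Hypothetical monthly mean air temperatures (degC) for two grid cells.
cold_cell = [-25.0, -23.0, -18.0, -8.0, 1.0, 8.0, 12.0, 9.0, 3.0, -7.0, -17.0, -22.0]
marginal_cell = [-15.0, -13.0, -8.0, -1.0, 5.0, 11.0, 14.0, 12.0, 6.0, -1.0, -8.0, -13.0]

for name, temps in (("cold", cold_cell), ("marginal", marginal_cell)):
    fn = frost_number(temps)
    print(name, round(fn, 3), "FNA5:", fn > 0.5, "FNA6:", fn > 0.6)
```

In this example the marginal cell counts as permafrost under the 0.5 threshold but not under 0.6, which is the kind of cell that separates FNA5 from FNA6.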

Evaluation of permafrost area in CMIP6 models
The estimates of the northern high-latitude permafrost area show a large spread of results among the 10 permafrost-presence definitions and across individual models (figure 1(A)). Although some models, such as GFDL, ACCESS and NorESM2, show reduced sensitivity to the definitions, most models have a large definition-related spread of 15-25 Mio. km² in the calculated permafrost area. A few models, such as GFDL, GISS-E2-2-G and ACCESS-CM2, show a much larger permafrost area than the present-day estimate of 13.9 (10.1-19.6) Mio. km² [4], regardless of the definition. The other models typically have the majority of their definitions in agreement with the observational estimates. Many models show clustering of multiple definition-based estimates around the same value, although some outliers result in a larger spread of results, for example, in CanESM-based models. Overall, model performance lacks consistency, with no single model excelling across all permafrost-presence definitions, consistent with findings from Burke et al [17]. MIROC6 performs the best, given that most of its permafrost area estimates fall into the uncertainty range of observations and show a relatively small spread. Figure 1(A) is sorted by increasing model depth, and the deeper models appear to show a smaller spread across permafrost-presence definitions. Additionally, the FNA5 outlier tends to be less extreme as land model depth increases.

Figure 1. Observation-based estimates from Obu et al [40] are shown as gray dashed lines, with their uncertainty range as a gray shaded area. Note that marker colors for the permafrost definitions also apply to the lines in the right-hand side panels. (E) Model ensemble-averaged permafrost area for all 10 definitions in the SSP5-85 scenario. (F) and (G) Absolute spread contributions [Mio. km²] from models (averaged over all definitions) and from definitions (averaged over all models) for different percentiles of the distribution, respectively. Note that (F) and (G) have a common y-axis with (E). (H) Same as (F) and (G) but as the relative contribution of model and definition spreads [%] for the 10th-90th percentiles (p10-90, from black line in panels (F) and (G)). For the calculation of permafrost area, the native horizontal and vertical model grids were used. All data were then regridded to a common grid of CESM2 and aggregated spatially.
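The spatial aggregation mentioned in the caption can be sketched as an area-weighted sum over a regular latitude-longitude grid. This is a simplified illustration, not the processing chain used here: it assumes a spherical Earth with a radius of 6371 km, a binary presence mask, and it ignores land fraction; all function names are hypothetical.

```python
import math
from typing import Sequence

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def cell_area_km2(lat_south: float, lat_north: float, dlon_deg: float) -> float:
    """Area of a latitude-longitude cell on a sphere, in km^2."""
    return (EARTH_RADIUS_KM ** 2 * math.radians(dlon_deg)
            * (math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))))


def permafrost_area_km2(mask: Sequence[Sequence[int]],
                        lat_edges: Sequence[float],
                        dlon_deg: float) -> float:
    """Sum the area of all cells flagged as permafrost.

    mask[i][j] is truthy where permafrost is present in latitude band i
    (bounded by lat_edges[i] and lat_edges[i + 1]) and longitude column j.
    """
    total = 0.0
    for i, row in enumerate(mask):
        area = cell_area_km2(lat_edges[i], lat_edges[i + 1], dlon_deg)
        total += area * sum(1 for present in row if present)
    return total


# Hypothetical 2 x 4 mask covering 60-80 degN in two bands, 90 degrees of longitude per cell.
example_mask = [[0, 1, 1, 0],
                [1, 1, 1, 1]]
print(permafrost_area_km2(example_mask, [60.0, 70.0, 80.0], 90.0))
```

Because cell area shrinks toward the pole, counting grid cells without this weighting would overstate high-latitude contributions.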
Apart from the large model spread in the present-day permafrost area estimates, differences in the density distribution (across models) can be found between different definition types (figures 1(B)-(D)). The ground-thermodynamic-based definitions (SLT2, SLT3, ALT, ZAA, and TTOP; figure 1(B)) have their distribution peak close to the present-day reference permafrost area but show a wide distribution of estimates. This results from large inter-model differences in the representation of the subsurface thermal regime. In contrast, ground-air coupling-based definitions (figure 1(C)) show a decreased model spread, with PROB's density peak being close to the observation-based present-day permafrost area estimate. As PROB considers an observation-based relationship between air and ground temperatures, it better accounts for factors influencing thermal coupling, such as snow and vegetation cover, and soil thermal properties. FNA5-based estimates, on the other hand, show a significantly larger permafrost area for all models based on the standard frost-number threshold of 0.5. Increasing the threshold to 0.6 (FNA6) produces area estimates comparable to PROB, which utilizes more advanced ground-air coupling. The larger disparity in permafrost area estimates resulting from the soil-temperature-based definitions compared to the air-temperature-based ones suggests that the influence of land model features on the overall range of results is more pronounced than that of surface climate conditions. This is also illustrated by the fact that frost-number-based estimates based on ground temperatures (FNG) have a wider distribution, mirroring the results from the ground-thermodynamic-based definitions. Permafrost area calculated by SIC is, on average, close to present-day estimates but also has a wide model distribution. Hydrology-based definitions (figure 1(D)) for permafrost area estimates exhibit a larger spread than those based on temperatures, as not all current-generation climate models represent the presence of unfrozen water in the soil for temperatures below 0 °C. Furthermore, there is a much larger model spread in the representation of soil hydrology processes in permafrost regions than for temperature [41].
Significant differences are reflected by multi-model mean results for each definition (figure 1(E)). The model-average permafrost area estimates for the different definitions range between 15 and 21 Mio. km² for the pre-industrial baseline state, except for FNA5, which deviates by at least 9 Mio. km² from the other definitions. Throughout the historical period and following the SSP5-85 scenario, permafrost area decreases gradually with surface warming for all the definitions, and differences in estimates between models also tend to decrease. Hence, permafrost-presence definitions overestimating present-day permafrost area experience a larger degradation, while the opposite applies to definitions yielding smaller permafrost area. The relatively homogeneous decline in permafrost area among the definitions suggests that estimates of permafrost have similar sensitivities to climate from the perspective of the multi-model mean. During the last three decades of the 21st century, SIC, ZAA, and TTOP exhibit a slower loss of model-average permafrost area. Hence, the definition spread is still large at the end of the 21st century.
The contributions of the model and definition spreads differ considerably throughout the simulations (figures 1(F) and (G)). The definition spread of calculated permafrost area is larger than the model spread for the interquartile range (p25-75) and up to the 5th-95th percentiles (p5-95), while for the absolute spread (p0-100) the model spread shows the larger range. This leaves the definition spread fraction at approximately 68% of the total variation in results across models and permafrost-presence definitions for the 10th-90th percentile (figure 1(H)). Hence, the selection of permafrost-presence definitions employed herein produces even more uncertainty in permafrost area estimates than the differences across CMIP6 models, and this persists over various percentile ranges. This is substantial, given that even the current-generation models still have a large simulation spread, as supported by figure 1(A) and Burke et al [17]. Additionally, differences in the simulation of Arctic amplification likely increase the model spread [42], which leads to the conclusion that the relative definition spread can be even larger than the effective uncertainties from numerical differences among ESMs. During the 21st century, the definition spread decreases while the model spread increases by only 4%. This minimal decrease means that under warming conditions, when significant biogeochemical and biogeophysical changes in the permafrost region can be expected, the differences between definitions still induce a large fraction of uncertainty.
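The split into model and definition spread can be sketched as follows. This is one plausible reading of the procedure, stated as an assumption: the model spread is the percentile range across models computed per definition and then averaged over definitions, and the definition spread is the percentile range across definitions computed per model and then averaged over models. The percentile interpolation, function names, and sample numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile (q in 0..100) of a non-empty sequence."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def percentile_range(values: Sequence[float], lo_q: float = 10.0, hi_q: float = 90.0) -> float:
    """Width of the lo_q..hi_q percentile range (p10-90 by default)."""
    return percentile(values, hi_q) - percentile(values, lo_q)


def spread_shares(area: List[List[float]]) -> Tuple[float, float]:
    """Return (definition share, model share) of the summed spread.

    area[m][d] is the permafrost area for model m and definition d.
    """
    n_models, n_defs = len(area), len(area[0])
    model_spread = sum(percentile_range([area[m][d] for m in range(n_models)])
                       for d in range(n_defs)) / n_defs
    definition_spread = sum(percentile_range(area[m]) for m in range(n_models)) / n_models
    total = model_spread + definition_spread
    return definition_spread / total, model_spread / total


# Hypothetical areas (Mio. km^2): 2 models x 3 definitions.
example = [[10.0, 20.0, 30.0],
           [12.0, 22.0, 32.0]]
print(spread_shares(example))
```

For orientation, a definition share of 68% corresponds to a definition spread of roughly 0.68 / 0.32 ≈ 2.1 times the model spread.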
Permafrost area estimates are illustrated spatially in figure 2. As expected from figure 1(A), the model agreement is highest in the ground-air coupling-based permafrost-presence definitions (PROB, FNA, and to some extent FNG). FNA5 strongly overestimates the permafrost domain by yielding permafrost too far south compared to the observation-derived permafrost area [40] (figure 2; orange contour). PROB, instead, matches the observation-derived estimates quite well, with almost all regions being in agreement for all the models. However, for PROB, all models exceed the southern edge of the observation-derived area near the Ural mountains, while in Alaska and South-East Siberia, fewer models are in agreement. Notably, for the ground-air coupling-based permafrost-presence definitions, models agree on similar permafrost areas in North America and Eurasia. However, in Southern Yakutia, fewer models agree on the presence of permafrost for all definition groups. More consistent permafrost area agreement with observational estimates can be found with most ground-thermodynamic-based definitions, namely ALT, ZAA, TTOP, and the ground-hydrology-based SIC. However, model agreement among those is generally lower, with about 75% of models widely consistent and only TTOP showing model agreement of 90%-100% in sporadic patches. The ground-thermodynamic-based definitions agree well on the observed southern boundary of the permafrost area in Eurasia but show more disagreement in North America, with a rapid southward decrease of model agreement. In most instances across all definitions, at least half of the models predict permafrost presence beyond the southern edge of the observation-based permafrost area across the Arctic, extending permafrost area predominantly into Central Siberia and North-West Russia. Up to 25% of the models across the ground-thermodynamic-based definitions simulate permafrost presence as far south as 50 °N. However, this southward expansion is massively reduced for PROB, supported by its smaller model spread seen in figure 1(A).
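Model agreement of the kind discussed above can be expressed as the fraction of models that flag permafrost in each grid cell. The sketch below assumes binary presence masks that already share a common grid; the function name and the toy masks are hypothetical.

```python
from typing import List, Sequence


def model_agreement(masks: Sequence[Sequence[Sequence[int]]]) -> List[List[float]]:
    """Return, per grid cell, the fraction of models with permafrost present.

    masks[m][i][j] is truthy where model m has permafrost in cell (i, j).
    """
    n_models = len(masks)
    n_rows, n_cols = len(masks[0]), len(masks[0][0])
    return [[sum(1 for mask in masks if mask[i][j]) / n_models
             for j in range(n_cols)]
            for i in range(n_rows)]


# Hypothetical masks from three models on a 1 x 3 grid.
toy_masks = [
    [[1, 0, 0]],
    [[1, 1, 0]],
    [[1, 0, 0]],
]
agreement = model_agreement(toy_masks)
majority = [[fraction >= 0.5 for fraction in row] for row in agreement]
print(agreement, majority)
```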

Causes for inter-model variations
Many ESMs produce different surface climates due to variations in their global response to greenhouse gas forcing [42]. Under warming, permafrost becomes more vulnerable to increasing summer air temperatures, and the direction and magnitude of winter snow depth changes determine the degree of insulation by the land surface cover. This effect is particularly relevant at the southern edge of the permafrost regions [43]. The regional climate response is more complex, predominantly related to Arctic amplification, which can cause differences in the exposure of permafrost regions to warming among ESMs. Additionally, permafrost area differences between ESMs can be expected due to differences in the structure and parameterization of the land surface model components, recognized as one of the primary sources of uncertainty in Arctic climate change projections [44,45]. Differences in the representation of processes such as snow insulation, the thermal and hydrological discretization, the depth of the soil column, the definition of thermal properties due to water phase changes, and the physical characteristics of the organic layer near the ground surface limit the ability of models to represent subsurface processes that are particularly relevant for cold regions [17,21,25,36,37,46].

Figure 2. Permafrost is counted as present in grid cells in which more than 50% of the area fraction is considered to exhibit permafrost presence. The orange contour denotes a probability of 50% of permafrost occurrence from observation-based estimates from Obu et al [40]. For the calculation of permafrost area, the native horizontal and vertical model grids were used. All data were then regridded to a common grid of CESM2.

Figure 3. Effect of land model depth on simulated permafrost area: (A) example of permafrost area comparison between two definitions (TTOP vs ZAA) shown for CMIP6 sub-ensembles of models with shallow (S; 2.89-14.00 m) and deep (D; 42.10-90.00 m) land models for the period 1997-2014. Note that this figure allows ignoring the effect of differences in equilibrium climate sensitivity (ECS) between models by considering only the distance of markers to the diagonal reference line, whereas ECS controls where each model sits along that line. (B) Simulated differences δ in permafrost area for all 45 unique pairwise comparisons of definitions for both model-depth CMIP6 sub-ensembles. The difference is defined as $\delta_z = \frac{1}{n_z} \sum_{m=1}^{n_z} \left[ A_1(m) - A_2(m) \right]$, with z being either S for the shallow model group or D for the deep model group, $n_z$ the number of models in the respective group, m the model index, and $A_1$ and $A_2$ the permafrost area calculated by two definitions. The red marker denotes the result for the example of (A). The dominance of points below the diagonal reference line indicates a worse performance of shallow models compared to deeper models.
Many models employ a zero-flux bottom boundary condition that distorts the representation of subsurface temperatures by impacting thermal heat diffusion [36,37,47]. The effect of model depth is investigated in figure 3. We divide the CMIP6 models into two sub-ensembles, separated by model depth. Shallow models cover depths of 2.89-14.00 m and deep models depths between 42.10 and 90.00 m (also see figure 1(A)). As an example, the permafrost areas simulated by TTOP and ZAA show almost perfect agreement for deep models but agree less well for many shallow models (although some show good agreement, too) (figure 3(A)). The advantage of this analysis is that the influence of equilibrium climate sensitivity (ECS) can be neglected by considering the distance of markers to the diagonal reference line, while ECS controls the position of markers along that line. This allows evaluating the average distance of markers for the model groups across all 45 possible unique comparisons between definitions, for which the results are shown in figure 3(B). Overall, the shallow model group shows larger deviations than the deep model group across all definition combinations, indicating a better performance of models with increased land model depth for simulating ground permafrost conditions. Thus, the results illustrate a clear effect of land model depth on estimates of permafrost area in the CMIP6 ensemble. We note that 9 of 12 models from the deep model group are based on different versions of the Community Land Model, which means that their results could be influenced by the performance of that specific land model. Nonetheless, its model depth is significantly larger than that of the shallow model group, and results can therefore be attributed to the relatively better representation of its subsurface thermal regime.
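The pairwise comparison can be sketched with the signed mean difference δ given in the caption of figure 3. The list of definition labels follows the abbreviations used in the text; the data layout, function names, and sample areas are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple

# The 10 definition labels used in the text.
DEFINITIONS = ["SLT2", "SLT3", "ALT", "ZAA", "TTOP",
               "PROB", "FNA5", "FNA6", "FNG", "SIC"]


def mean_difference(area_1: Sequence[float], area_2: Sequence[float]) -> float:
    """delta_z: mean over the models of one depth group of A1(m) - A2(m)."""
    return sum(a1 - a2 for a1, a2 in zip(area_1, area_2)) / len(area_1)


def pairwise_deltas(areas: Dict[str, Sequence[float]]) -> Dict[Tuple[str, str], float]:
    """delta_z for every unique pair of definitions within one depth group.

    areas[name] holds one permafrost area per model, in the same model order.
    """
    return {(d1, d2): mean_difference(areas[d1], areas[d2])
            for d1, d2 in combinations(sorted(areas), 2)}


# Ten definitions give 10 * 9 / 2 = 45 unique pairs.
n_pairs = len(list(combinations(DEFINITIONS, 2)))

# Hypothetical areas (Mio. km^2) for a two-model group and three definitions.
shallow_group = {"TTOP": [15.0, 17.0], "ZAA": [14.0, 16.0], "SIC": [13.0, 18.0]}
print(n_pairs, pairwise_deltas(shallow_group))
```

Running the same function on the shallow and the deep group and plotting one against the other gives the kind of comparison shown in figure 3(B).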
A ground column that is too shallow may influence soil temperatures by providing too small a heat sink [36,48]. Accordingly, shallow models are subject to larger annual temperature fluctuations, leading to intra-annual freeze-thaw cycles in marginal regions of permafrost extent otherwise classified as being permanently thawed in deeper models. During the thawing process, temperature fluctuations are subdued because energy is consumed by the phase change between ice and water. Hence, temporal variability in calculated permafrost area can be dampened in those models that consider water phase changes, due to the zero-curtain effect.
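To give a sense of scale for the depths involved, the following sketch uses the textbook solution for a periodic surface temperature wave diffusing into a homogeneous half-space, where the amplitude decays as exp(-z/d) with damping depth d = sqrt(2κ/ω). This is an idealization and not a reproduction of any land model: it ignores latent heat, layering, snow, and the bottom boundary itself, and the diffusivity of 1e-6 m² s⁻¹ is an assumed typical value.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def damping_depth_m(diffusivity_m2_s: float, period_s: float = SECONDS_PER_YEAR) -> float:
    """Depth over which a periodic temperature wave decays by a factor of e."""
    omega = 2.0 * math.pi / period_s  # angular frequency of the surface wave
    return math.sqrt(2.0 * diffusivity_m2_s / omega)


def amplitude_fraction(depth_m: float, diffusivity_m2_s: float = 1.0e-6) -> float:
    """Fraction of the annual surface amplitude remaining at depth_m."""
    return math.exp(-depth_m / damping_depth_m(diffusivity_m2_s))


# Depths bounding the shallow and deep sub-ensembles discussed above.
for depth in (2.89, 14.00, 42.10, 90.00):
    print(depth, amplitude_fraction(depth))
```

Under these assumptions, a substantial part of the annual cycle still reaches the base of the shallowest columns, whereas it has effectively vanished at the base of the deep ones.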
Differences in the simulated snow insulation capability may leave soils at different temperatures with equal snow depth. As the snow cover in high-latitude regions persists for a significant part of the year, variations in the simulated snowpack are associated with substantial differences in both surface and subsurface temperatures [49-51]. Excessive snow mass at the hemispheric scale is found to be a feature of many CMIP6 models. However, inconsistent timing of the snow onset season and spring snow-melt leads to surface cover differences between ESMs, mainly influencing the exchange of heat between air and ground [50]. In some models, snow is represented rather simply with static snow properties (e.g. snow density, snow conductivity), the absence of liquid water in snow, and single or composite snow layer schemes [51]. The lack of structural complexity and snow-specific dynamic processes makes the soil more susceptible to air temperature variations, leading to biased seasonal snow insulation and a systematic misrepresentation of soil temperatures [51], affecting the simulated presence of permafrost.
Permafrost integrity and snow cover may also be influenced by high-latitude vegetation [52]. On the one hand, vegetation serves as a snow trap and protects the snow from wind erosion. On the other hand, dark shrub branches absorb sunlight, warming the snow and altering its properties [53]. Further, the vegetation buffers underlying permafrost from climatic conditions through mechanisms such as shading [54], the modulation of surface turbulent fluxes [55], hydrological interaction [41,56] and the accumulation of organic layers. The latter constitute another insulating surface layer consisting of plant litter, moss, and lichen, which plays a crucial role in controlling the depth of thawing [57,58]. Heterogeneity in microtopography, permafrost characteristics and ground hydrological conditions makes it challenging for climate models to accurately simulate permafrost-climate interactions [17].
Furthermore, models rely on data inputs, organic soil representation, and pedotransfer functions, including ground thermal properties such as thermal conductivity and specific heat capacity [46,59-61]. In particular, profiles of soil and bedrock thermal properties can significantly impact simulated soil temperatures and hydrology [37]. Obtaining accurate and representative measurements of these properties across large areas of permafrost regions is challenging, and the values used vary widely among models [38]. Variability in these properties can introduce uncertainties into the model outputs on ground temperatures, ultimately impacting the assessment of permafrost distribution.

Effect on permafrost-based model diagnostics
The spatial distribution of permafrost presence is commonly used to assess different modeled quantities, such as permafrost-area soil carbon [62] or burned area, and their evolution under future scenarios [63]. Given the large differences in calculated permafrost area from permafrost-presence definitions in figures 1 and 2, we quantify how these uncertainties propagate to dependent diagnostic variables, using the example of permafrost-area soil carbon. The absolute definition spread S_A (absolute differences between the definition-based estimates of permafrost area) relative to the permafrost-presence definition median varies between the models from 127 (−82-45) to 280 (−175-105) Pg C (figure 4(A)). This represents the substantial discrepancies in permafrost-area soil carbon estimates derived from the different permafrost-presence definitions, making the choice of definition a substantial factor in the assessment of potential greenhouse gas release in permafrost areas. However, this spread only represents an offset in absolute values and, therefore, becomes irrelevant when considering relative changes in permafrost-area soil carbon. Hence, the relative definition spread S_R (relative increase of definition-based permafrost-area spread over time with reference to 2015) gives more insight into differences in definition-based estimates of permafrost area and the associated over- or underestimation of permafrost-area evolution under a given climate scenario (figure 4(B)). Under the SSP5-85 scenario, permafrost-area derived soil carbon changes are very different among a selection of seven CMIP6 models, ranging from about 110 Pg C of carbon uptake to about 110 Pg C of carbon loss. Note that only seven models provided the required soil carbon output for our scenario selection. For the extreme cases, the permafrost-presence definition spread maximum makes up between 25 and 40 Pg C. Generally, the larger the soil carbon change during the simulation, the larger the effect of the definition spread.
Therefore, S_A is 2-6.5 times larger than the absolute change in permafrost-area soil carbon under the SSP5-85 scenario (figure 4(C)). Apart from MIROC-ES2L, which shows a relative definition spread S_R of below 5%, most models agree on S_R making up more than 30% of their respective absolute soil carbon change (figure 4(D)), despite showing very different soil carbon change trajectories (figure 4(B)). Both S_A and S_R consistently decrease when narrower percentile ranges are considered. However, for the maximum range, the overall model mean S_R is 28%, which illustrates that estimates of future permafrost soil carbon could be significantly over- or underestimated by choosing different definitions of permafrost area.
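The two spread measures can be sketched as follows, under one plausible reading that is stated here as an assumption: the absolute spread is the full range of the definition-based estimates, reported together with the lowest and highest offsets from the definition median, and the relative spread is the range of the definition-based changes divided by the magnitude of the median change. Function names and sample values are hypothetical.

```python
from statistics import median
from typing import Sequence, Tuple


def spread_about_median(estimates: Sequence[float]) -> Tuple[float, float, float]:
    """Return (total spread, lowest offset, highest offset) relative to the median."""
    mid = median(estimates)
    low = min(estimates) - mid
    high = max(estimates) - mid
    return high - low, low, high


def relative_spread(changes: Sequence[float]) -> float:
    """Range of per-definition changes as a fraction of the median change magnitude.

    Assumes the median change is non-zero.
    """
    spread, _, _ = spread_about_median(changes)
    return spread / abs(median(changes))


# Hypothetical soil carbon stocks (Pg C) from three definitions for one model.
stocks = [918.0, 1000.0, 1045.0]
print(spread_about_median(stocks))

# Hypothetical soil carbon changes since 2015 (Pg C) for the same definitions.
changes = [-90.0, -100.0, -120.0]
print(relative_spread(changes))
```

In this toy case the stocks give a spread of 127 with offsets of −82 and 45, mirroring the way the values are reported in the text, where the two offsets add up to the total spread.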

Summary and conclusions
Permafrost area estimates are highly dependent on the method used to define permafrost presence, whether from ground temperatures, ground ice content, or air temperatures. Generally, CMIP6 models do not accurately simulate permafrost distribution, although some models may exhibit satisfactory performance compared to observation-based present-day estimates. Variances in the representation stem from various factors: (1) dissimilarities in simulated surface climate, (2) inconsistencies in the representation of air-ground thermal coupling, and (3) differing capabilities of the underlying land surface models to simulate the ground thermal regime and hydrothermodynamic exchanges. We find that the latter likely exerts a larger influence, which is consistent with findings from previous literature [16,17].
Defining permafrost presence by ground thermodynamic-based methods (SLT, ALT, ZAA, TTOP) gives the largest spread in calculated permafrost area, presumably because of large model differences in the numerical representation of subsurface thermodynamic processes. This is mainly caused by variations in land-model depth, soil parameters such as soil thermal conductivity, and surface processes such as snow insulation. All ground thermodynamic-based definitions are subject to these uncertainties. In contrast, ground-air coupling-based definitions give better model agreement (PROB, FNA), as they are not subject to the limitations of the thermodynamic assessments. However, the frost number definition with the common threshold of 0.5 (FNA5) greatly overestimates the present-day permafrost area. Adjusting the threshold value to 0.6 (FNA6) gives better results. The only method based on soil hydrology also shows a large model spread, presumably because of differences in soil water freezing parameterizations and the interaction with variations in ground thermodynamics among models. The variation in permafrost area across the CMIP6 models, as derived from soil-thermodynamic-based definitions, is influenced by both climate and land model differences. In contrast, permafrost area from land surface climate-based definitions is driven purely by climate differences. The disparity between the two approaches highlights the substantial role of the land surface model in assessing permafrost in ESMs.
The spread in permafrost areas from different definitions is more than twice as large as the spread between models within a given definition, and it is only slightly reduced under future warming scenarios. Consequently, calculated permafrost area differences can result in substantial discrepancies in, for example, soil carbon estimates, with absolute differences up to 6 times larger, and relative differences up to 28% larger, than the absolute soil carbon changes observed on average among CMIP6 models under SSP5-85.
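The comparison between definition spread and model spread can be sketched as follows. This is a minimal illustration, assuming the spread is measured as a percentile range (here the 10th to 90th) across one dimension and then averaged over the other, as described for figure 1. The helper names and the small area table are hypothetical; the numbers are invented and are not results from the CMIP6 ensemble.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def spread(values, p_lo=10, p_hi=90):
    """Width of the p_lo to p_hi percentile range."""
    return percentile(values, p_hi) - percentile(values, p_lo)


def spread_contributions(area, p_lo=10, p_hi=90):
    """area[m][d]: permafrost area of model m under definition d.

    Returns (model_spread, definition_spread):
    - model_spread: spread across models, averaged over definitions
    - definition_spread: spread across definitions, averaged over models
    """
    n_models, n_defs = len(area), len(area[0])
    definition_spread = sum(spread(row, p_lo, p_hi) for row in area) / n_models
    model_spread = sum(
        spread([row[d] for row in area], p_lo, p_hi) for d in range(n_defs)
    ) / n_defs
    return model_spread, definition_spread


# Invented areas (million km2) for 3 models x 3 definitions, illustration only.
example_area = [
    [10.0, 15.0, 20.0],
    [12.0, 17.0, 22.0],
    [11.0, 16.0, 21.0],
]
model_spread, definition_spread = spread_contributions(example_area)
print(model_spread, definition_spread)
```

Averaging the percentile range over the other dimension keeps the two contributions in the same units, so their ratio can be read directly as the relative importance of definition choice versus model choice.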
Based on our findings, and given the limitations of land models to represent the subsurface thermal regime, we recommend defining permafrost presence based on air-ground coupling methods that properly take into account surface cover information for a more realistic air-ground heat transfer when comparing model output based on permafrost area. Of the definitions presented here, these are PROB and FNA6. For ground-based definitions, TTOP, ZAA and SIC perform best, but all are subject to relatively large model disagreements.
Permafrost-presence definitions are crucial for evaluating and validating climate models against observational data. Consistency in defining permafrost presence allows for a more accurate assessment of model performance and the identification of model biases and uncertainties, as the choice of permafrost-presence definition can influence projections of permafrost response to climate change. Further, it is important to consider the specific definitions used in individual studies when comparing and synthesizing global permafrost area estimates, specifically when considering model intercomparisons such as CMIP6. Our results highlight the importance of consistent and well-justified permafrost-presence definitions in climate modeling to ensure robust and reliable projections of permafrost dynamics and their feedbacks in the Earth system. With this, we call for greater care in defining and reporting permafrost area estimates in model assessments to improve the accuracy of projections and avoid substantial over- or underestimation of future permafrost-area-dependent model diagnostics.

Figure 1. Northern Hemisphere permafrost area estimates based on the definitions in table 1: (A) definition-based distribution of historical (1997-2014) permafrost area estimates for each model in the CMIP6 ensemble. Models are sorted by their land model depth, increasing from left to right. (B)-(D) Density distributions of simulated permafrost area for each permafrost-presence definition for all models shown on the left and for the three types of permafrost-presence definitions in table 1. Observation-based estimates from Obu et al [40] are shown as gray dashed lines, with their uncertainty range as a gray shaded area. Note that marker colors for the permafrost definitions also apply to the lines in the right-hand side panels. (E) Model ensemble-averaged permafrost area for all 10 definitions in the SSP5-85 scenario. (F) and (G) Absolute spread contributions [million km2] from models (averaged over all definitions) and from definitions (averaged over all models) for different percentiles of the distribution, respectively. Note that (F) and (G) have a common y-axis with (E). (H) Same as (F) and (G) but as the relative contribution of model and definition spreads [%] for the 10th-90th percentiles (p10-90, from black line in panels (F) and (G)). For the calculation of permafrost area, the native horizontal and vertical model grids were used. All data were then regridded to a common grid of CESM2 and aggregated spatially.

Figure 2. Permafrost area model agreement: number of models agreeing on the spatial extent and location of simulated permafrost between 1997 and 2014 based on the 10 definitions in table 1, for grid cells in which more than 50% of the area fraction is considered to be underlain by permafrost. The orange contour denotes a probability of 50% of permafrost occurrence from the observation-based estimates of Obu et al [40]. For the calculation of permafrost area, the native horizontal and vertical model grids were used. All data were then regridded to a common grid of CESM2.
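The agreement count described in this caption can be sketched as below. This is a minimal illustration, assuming each model provides a permafrost area fraction per grid cell on a common grid, and that a cell counts toward agreement when its fraction exceeds 50%. The function name, the model labels, and the tiny 2 x 2 grids are hypothetical and invented for illustration.

```python
def model_agreement(fractions_by_model, threshold=0.5):
    """Count, per grid cell, how many models exceed the area-fraction threshold.

    fractions_by_model: dict mapping a model label to a 2-D list of
    permafrost area fractions (0..1), all on the same common grid.
    """
    grids = list(fractions_by_model.values())
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    counts = [[0] * n_cols for _ in range(n_rows)]
    for grid in grids:
        for i, row in enumerate(grid):
            for j, fraction in enumerate(row):
                if fraction > threshold:
                    counts[i][j] += 1
    return counts


# Invented fractions for three hypothetical models on a 2 x 2 grid.
example_fractions = {
    "model_a": [[0.9, 0.6], [0.4, 0.0]],
    "model_b": [[0.8, 0.3], [0.7, 0.0]],
    "model_c": [[1.0, 0.55], [0.2, 0.1]],
}
agreement = model_agreement(example_fractions)
print(agreement)
```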

Figure 4. Permafrost-presence definition-based soil carbon change: (A) absolute spread SA of total soil carbon within permafrost area in 2015 relative to the median value of permafrost-presence definitions for a selection of seven CMIP6 ESMs. (B) Evolution of permafrost-area soil carbon change and relative definition spread SR with respect to their values at 2015. This allows the relative change of the definition-based spread to be determined when considering permafrost-area soil carbon changes relative to a reference date. Shaded areas are determined by the spread in soil-carbon change in reference to the definition-based differences of calculated permafrost area for different percentile ranges. (C) Absolute definition spread SA and (D) relative definition spread SR for permafrost-area-based soil carbon estimates relative to the absolute soil-carbon change at 2100. Note that for (C) and (D) the various percentile ranges are overlaid on each other (rather than plotted cumulatively). The cross-model averages for the percentile ranges are shown as gray lines.

Table 1. Description of the 10 different definitions of permafrost presence used for the calculation of permafrost area.