Historical evaluation and future projections of compound heatwave and drought extremes over the conterminous United States in CMIP6

Independently, both droughts and heatwaves can induce severe impacts on human and natural systems. However, when these two climate extremes occur concurrently in a given region, their compound impacts are often more pronounced. As global climate models (GCMs) improve in both spatiotemporal resolution and representation of complex climate processes, they are increasingly used to study future changes in these extremes and associated regional impacts. However, GCM selection for such impact assessments is generally based on historical performance and/or future mean changes, without considering individual or compound extremes. In contrast, this study evaluates historical performance and projected changes in heatwaves, droughts, and compound heatwave-droughts using an ensemble of GCMs from the latest Phase 6 of the Coupled Model Intercomparison Project at a regional scale across the conterminous United States. Additionally, we explore the inter-model differences in the projected changes that are associated with various characteristics of extremes and the choice of drought indices. Our analysis reveals considerable variation among the GCMs, as well as substantial differences in the projected changes based on the choice of drought indices and region of interest. For example, the projected increases in both the frequency and intensity of drought and associated compound extreme days based on the standardized precipitation evapotranspiration index far exceed those derived from the standardized precipitation index. Further, the largest changes in the frequency of compound extremes are projected over the Southwest, South Central, and parts of the Southeast, while the smallest changes are projected over the Northeast. Overall, this study provides important insights for the interpretation and selection of GCMs for future assessment studies that are crucial for the development of regional adaptation strategies in the face of climate change.


Introduction
Heatwaves and droughts individually put enormous strain on various components of human and natural systems [1][2][3][4][5]. However, when these extreme events occur concurrently, their severity and associated impacts can compound due to feedback mechanisms that intensify the magnitudes of each extreme. For instance, drought-induced moisture deficits increase the sensible-to-latent heat ratio by limiting land evaporation, which in turn enhances local warming. Similarly, heatwave-induced high temperatures further intensify and prolong droughts by increasing evaporative demand [6]. Concurrent hot and dry conditions also increase the risk of other hazards such as wildfires, which result in poor air quality, impact human health, and have lasting ecosystem effects [7]. Additionally, these extremes result in compounding impacts on infrastructure. For instance, prolonged drought reduces electricity production (e.g. by decreasing hydroelectric power supply and reducing cooling water for thermal electricity generation) while heatwaves increase electricity demand, straining the power grid when occurring in tandem [8][9][10]. Therefore, a comprehensive evaluation of projected compound heatwave-drought (hereafter CHD) events and associated risks is crucial.
Numerous studies have evaluated the characteristics of individually occurring heatwave or drought events [11][12][13][14][15][16][17]. Recent decades have witnessed a marked increase in the frequency and intensity of heatwaves over the United States, and these trends are projected to intensify with global atmospheric warming during the 21st century [13][14][15]. However, trends in droughts are more regionally heterogeneous. For example, from 1925 to 2003 the southwestern United States experienced an upward trend in drought frequency and severity, while the remainder of the United States experienced decreasing trends [18,19]. During the last decade, there has been growing interest in the investigation of CHD events. Recent studies reveal a surge in the occurrence of CHD events over the past several decades [7,20] and project continued increases during the 21st century [21].
As the spatial resolution of global climate models (GCMs) increases and their representation of complex processes improves, these models are increasingly used for climate change impact assessment, either directly or after downscaling to a higher resolution. The selection of GCMs for these impact assessment studies is generally based on their performance in simulating historical climate mean and extremes, and/or the extent of future changes in mean climate [22,23]. However, the extent to which these models project changes in extremes and concurrently occurring extremes has not generally been used as a criterion. Additionally, previous studies have generally focused on continental or global scales, whereas in-depth regional scale evaluations are still limited [24,25]. A thorough extremes-based evaluation at a regional scale can aid in the selection of GCMs for applications that require insight regarding both individual and CHD extremes for relevant adaptation studies.
Therefore, in this study, we perform a comprehensive evaluation of heatwave, drought, and CHD extremes using an ensemble of Coupled Model Intercomparison Project Phase 6 (CMIP6) GCMs, the latest suite of Intergovernmental Panel on Climate Change models, with a focus on regional scales over the conterminous United States (CONUS). Specifically, we assess the capability of CMIP6 GCMs in simulating the characteristics of these extremes in the historical period and evaluate the projected future trends under the Shared Socioeconomic Pathway 585 (SSP585) scenario. Additionally, we investigate regional differences in the projected future changes of extreme event characteristics that arise from the choice of GCMs and drought metrics over the CONUS. Our analysis also demonstrates substantial inter-model differences that may serve to guide the selection of GCMs for future regional assessment studies.

Data
We use daily maximum temperature (Tmax) to identify heatwaves and calculate their characteristics, and monthly averaged 2 m temperature and precipitation (P) to calculate drought indices. To evaluate the performance of the GCMs for the 1981-2014 period, we use an observational dataset and a reanalysis dataset: the parameter-elevation regressions on independent slopes model (PRISM) observations at 4 km horizontal resolution [26] and the European Centre for Medium-Range Weather Forecasts reanalysis (ERA5) Land dataset at 9 km horizontal resolution [27], respectively. We utilize 23 CMIP6 GCMs in historical (1981-2014) and future periods (2015-2100) under the SSP585 scenario (table S1). The GCMs are chosen based on the availability of all the variables that are needed for heatwave and drought identification. We also obtain latent and sensible heat fluxes from the GCMs to calculate evaporative fraction (EF). Prior to analysis, all the GCM, PRISM, and ERA5 datasets are remapped to a common 1° latitude-longitude grid using bilinear interpolation. The future changes are calculated for two 40 year periods: 2021-2060 (mid-century) and 2061-2100 (end-century) with respect to 1981-2020. The projected changes are calculated for the CONUS and seven United States Geological Survey climate adaptation science center regions (hereafter regions) shown in figure 1(a).

Methodology

Heatwaves
We identify heatwaves as a period of at least three consecutive days with Tmax above the 95th percentile of Tmax for all the summer months (June, July, and August) from 1981 to 2014. The 95th percentile Tmax thresholds are calculated separately for each grid cell and each dataset. To accurately compare heatwaves across different time periods, we use the historical Tmax thresholds to identify the heatwaves in both historical and future periods. Heatwave intensity is calculated as the mean of Tmax during the heatwave days. We acknowledge that this particular definition could result in classifying longer heatwaves with a mix of high Tmax and moderate Tmax days as having a lower overall heatwave intensity compared to shorter heatwaves composed only of high Tmax days.
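As a rough illustration of this definition, the sketch below flags heatwave days in the daily Tmax series of a single grid cell and computes the mean-Tmax intensity. It assumes the 95th percentile threshold has already been computed from the historical summer record; the function names and the toy values are illustrative assumptions and are not taken from the study's code.

```python
from typing import List, Optional


def heatwave_mask(tmax: List[float], threshold: float, min_len: int = 3) -> List[bool]:
    """Flag days belonging to a run of at least `min_len` consecutive days above `threshold`."""
    mask = [False] * len(tmax)
    run_start = None
    # A trailing sentinel below any threshold closes a run that reaches the end of the series.
    for i, t in enumerate(list(tmax) + [float("-inf")]):
        if t > threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_len:
                for j in range(run_start, i):
                    mask[j] = True
            run_start = None
    return mask


def heatwave_intensity(tmax: List[float], mask: List[bool]) -> Optional[float]:
    """Mean Tmax over the flagged heatwave days; None if there are no heatwave days."""
    hot = [t for t, flagged in zip(tmax, mask) if flagged]
    return sum(hot) / len(hot) if hot else None


# Toy example (made-up values in degrees C, hypothetical threshold of 35).
tmax_series = [30.0, 36.0, 37.0, 38.0, 31.0, 36.0, 36.0, 30.0]
days = heatwave_mask(tmax_series, threshold=35.0)
intensity = heatwave_intensity(tmax_series, days)
```

In the toy series only the three-day run qualifies; the later two-day exceedance is too short to count as a heatwave. Because the same historical threshold is passed for every period, the comparison between historical and future heatwaves stays on a common footing.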

Droughts
We use two different drought indices in this study: (1) the standardized precipitation index (SPI) to identify precipitation deficits [28], and (2) the standardized precipitation-evapotranspiration index (SPEI) to capture the combined effects of precipitation and evaporative demand on regional water availability [29]. Our analysis uses SPI and SPEI values calculated at 6 month timescales. In this study, we analyze only extreme droughts, which we define as any contiguous period of 6 month SPI/SPEI < −1.64 (i.e. 5th percentile) that contains at least one value < −2.054 (i.e. 2nd percentile) [9,10]. For simplicity, extreme droughts are hereafter referred to as droughts. Additionally, if a given month is under drought, all days during that month are considered to be under drought. Drought intensity is calculated as the mean of the SPI/SPEI values during the days under drought. Further details on calculating drought indices are provided in the supplementary material.
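The drought definition can be sketched in the same spirit. The example below takes a monthly series of 6 month SPI or SPEI values for one grid cell and flags contiguous runs below −1.64 that contain at least one value below −2.054. The function names and the sample values are illustrative assumptions; the intensity here is a plain mean over flagged months, which only approximates the day-weighted mean described above.

```python
from typing import List, Optional


def extreme_drought_mask(index: List[float], onset: float = -1.64, peak: float = -2.054) -> List[bool]:
    """Flag months in contiguous runs with index < onset that include at least one value < peak."""
    mask = [False] * len(index)
    run: List[int] = []
    # A trailing sentinel of 0.0 (above the onset threshold) closes a run at the end of the series.
    for i, value in enumerate(list(index) + [0.0]):
        if value < onset:
            run.append(i)
        else:
            if run and any(index[j] < peak for j in run):
                for j in run:
                    mask[j] = True
            run = []
    return mask


def drought_intensity(index: List[float], mask: List[bool]) -> Optional[float]:
    """Mean index value over flagged months (more negative means more severe)."""
    dry = [v for v, flagged in zip(index, mask) if flagged]
    return sum(dry) / len(dry) if dry else None


# Toy example (made-up index values).
spi6 = [-0.5, -1.7, -2.1, -1.8, 0.2, -1.7, -1.9, -1.0]
drought_months = extreme_drought_mask(spi6)
mean_severity = drought_intensity(spi6, drought_months)
```

The second dry spell in the toy series stays below −1.64 but never reaches −2.054, so it is not counted as an extreme drought.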

Compound events
On a given day, a grid cell is classified as being under: (1) CHD extremes, if both a heatwave and drought are occurring simultaneously, (2) heatwave only, if a heatwave is occurring but no drought, (3) drought only, if a drought is occurring but no heatwave, (4) heatwave, if a heatwave is occurring (irrespective of presence or absence of drought), and (5) drought, if a drought is occurring (irrespective of presence or absence of heatwave).
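A minimal sketch of this classification for one grid cell and one day is given below, together with a helper for the percentage of one type of extreme day that occurs as a CHD day. The label strings and function names are hypothetical choices made for illustration.

```python
from typing import List, Optional, Set


def classify_day(heatwave: bool, drought: bool) -> Set[str]:
    """Return every category from the list above that applies to a single day."""
    labels: Set[str] = set()
    if heatwave:
        labels.add("heatwave")          # category (4): irrespective of drought
    if drought:
        labels.add("drought")           # category (5): irrespective of heatwave
    if heatwave and drought:
        labels.add("CHD")               # category (1)
    elif heatwave:
        labels.add("heatwave_only")     # category (2)
    elif drought:
        labels.add("drought_only")      # category (3)
    return labels


def percent_as_chd(flags_a: List[bool], flags_b: List[bool]) -> Optional[float]:
    """Percentage of days flagged in `flags_a` that are also flagged in `flags_b`."""
    n_a = sum(1 for a in flags_a if a)
    if n_a == 0:
        return None
    n_both = sum(1 for a, b in zip(flags_a, flags_b) if a and b)
    return 100.0 * n_both / n_a


# Toy example (made-up daily flags for one grid cell).
heat = [True, True, False, True]
dry = [True, False, True, True]
share_of_heatwave_days = percent_as_chd(heat, dry)
```

Categories (1)-(3) are mutually exclusive, whereas (4) and (5) overlap with them, which is why the function returns a set rather than a single label.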

Historical evaluation of heatwave-drought characteristics in CMIP6 GCMs
The biases in the characteristics of extremes in CMIP6 GCMs with respect to PRISM are shown in figure 1. The mean bias averaged across all 23 CMIP6 GCMs (hereafter mean bias) in the frequency of heatwave days is positive over most of the CONUS except over parts of the Northeast (figure 1(a)). The mean bias in heatwave intensity during SPI-based CHD days is primarily positive over the majority of the Midwest and South Central, whereas mixed signs of biases are seen over the rest of the regions (figure 1(d)). In contrast, mean biases in the frequency of SPI drought days and intensity of SPI drought during CHD days are less regionally defined. Both the biases in the frequency and intensity of SPI drought days show high variability within regions (figures 1(b) and (e)). However, SPEI drought day frequency and SPEI drought severity during CHD days are underestimated over the majority of the CONUS except over certain parts (figures 1(c) and (f)). For instance, SPEI drought intensity is overestimated over the Southwest whereas SPEI drought day frequency is overestimated over parts of the Southwest and Southeast. We note that more negative values for drought intensity bias (shown in red) represent more severe drought in GCMs and vice versa. The percentages of heatwave and drought days that occur as CHD days are generally underestimated (figures 1(g)-(i)). For instance, the percentage of heatwave days that occur as CHD days is underestimated over most of the South Central, parts of the Midwest, North Central, Southeast, and Northeast (figure 1(g)). Similarly, the percentage of drought days that occur as CHD days is underestimated over most of the South Central, Midwest, parts of Northeast, and Southeast for both SPI- and SPEI-derived droughts (figures 1(h) and (i)).
These mean biases could mask important differences across individual CMIP6 GCMs. Therefore, we investigate the inter-model variability by comparing the CONUS-scale mean bias in extreme event characteristics between all 23 CMIP6 models used in this analysis (figure 1(j)). We find that, with a few exceptions, the majority of CMIP6 GCMs agree on the sign of biases in the heatwave and drought characteristics over the CONUS. For instance, out of the 23 analyzed GCMs, 15 overestimate heatwave intensity, 19 underestimate SPI drought intensity, and all 23 underestimate SPEI drought intensity during CHD days as compared to the PRISM data. The frequencies of heatwave and SPI drought days are overestimated whereas those of SPEI drought days are underestimated by the majority of GCMs. The percentages of heatwave and drought days occurring as CHD days are underestimated by most of the GCMs in the case of both SPI and SPEI droughts. Overall, while CMIP6 GCMs show substantial biases in the simulation of characteristics of heatwaves and droughts as well as in the frequency of CHD days with respect to PRISM, they generally show similar signs of biases in the simulation of these extreme characteristics.
To assess the observational uncertainty, we repeat the analysis to calculate the bias in the GCMs from ERA5 (figure S1). We find the most noticeable differences in the mean bias calculated relative to ERA5 and PRISM are in the frequency of extremes (figures S1(a), (b) and (j)). For instance, the mean bias in heatwave days is negative over the majority of Southwest and South Central regions for ERA5 whereas it is positive for PRISM (figures 1(a) and S1(a)). Similarly, the mean bias in SPI drought days is more negative over parts of Northwest, North Central, and Midwest for ERA5 than PRISM (figures 1(b) and S1(b)). The mean bias in the percentage of extreme days occurring as CHD days is also more negative when calculated based on ERA5 than PRISM. Consequently, the average CONUS bias shows marked differences in the frequency of extreme days with a more negative bias relative to ERA5 than PRISM. This also results in changes in relative bias among the GCMs (figures 1(j) and S1(j)).

Future changes in the extremes as projected by CMIP6 multi-model mean
We analyze the projected changes in the characteristics of extremes. Specifically, we evaluate the changes in the percentage of the area impacted by extremes (figure 2) and the intensity and frequency of extremes over the CONUS and across the seven regions (figures 3 and S2-S4).
During the 21st century, the percentage of the CONUS area under CHD days derived from both SPI- and SPEI-based drought is projected to increase in the CMIP6 multi-model mean. The area under heatwaves is projected to increase substantially. However, in the case of drought, while the area under SPI-based drought shows comparatively small changes, it shifts from drought only toward the CHD days (figure 2(a)). In contrast, the area under SPEI-based drought days increases as well as shifts towards CHD days. Regionally, there are marked differences in the projected changes in the area under CHD days. For instance, in the case of SPI-based extremes, there is an increase in drought as well as heatwave days over Southwest and South Central regions. However, over the other regions, while there is an evident increase in area under heatwave-only, drought-only days generally shift towards the CHD days and show no evident change (figure S2). For SPEI-based extremes, areas under all kinds of extremes are projected to increase across all the regions. However, Southwest and South Central regions show a larger shift from heatwave only to CHD days than other regions (figure S3).
Spatially, the largest changes in the frequency of both SPI- and SPEI-based drought days are projected over Southwest, South Central, and parts of Southeast. While SPEI drought days are projected to increase over the remaining parts of the CONUS, SPI drought days show a decrease or no change over the Northeast, Midwest, and parts of Southeast (figures 3(b) and (c)). In contrast, the number of heatwave days is projected to increase fairly uniformly across all the regions (figure 3(a)). Consequently, the CHD days based on both SPI and SPEI are projected to increase across the entire CONUS with the largest increase over the Southwest, South Central, and parts of the Southeast following the increase in drought days (figures 3(d) and (e)). Further, the intensity of heatwaves during CHD days is projected to magnify over most of the CONUS with the largest increase over the North Central region (figure 3(f)). In contrast, the largest increase in the intensity of both SPI and SPEI CHD days is projected over Southwest, South Central, and parts of the Southeast US. Additionally, while SPEI intensity exhibits a projected increase throughout the US, SPI intensity is projected to decrease in parts of the Northeast, Midwest, and Southeast (figures 3(g) and (h)). Further, the projected changes in intensity and frequency of extreme days exhibit a similar tendency of change in the mid and end century with an amplified signal in the far future (figure S4).

Key differences in projected future changes
We investigate the key differences in projected future changes in the characteristics of extremes. Particularly, we evaluate the inter-model differences among the GCMs, the regional variations within each individual GCM's projections, and the differences that arise from the choice of drought indices over the CONUS and across the seven regions (figures 4, 5, S5, and S6). We present these differences in bar plots and arrange the GCMs from left to right based on the magnitude of projected changes in the frequency of SPI- (figures 4 and S5) and SPEI-based (figures 5 and S6) CHD days over the CONUS. Across all the GCMs, we find a robust increase in the frequencies of both SPI- and SPEI-based CHD days over the CONUS and across all the regions in mid- and end-century periods (figures 4(a)-(h), 5(a)-(h), S5(a)-(h), and S6(a)-(h)). However, the magnitude of change differs substantially depending on the region, GCM, and drought metric.
For SPI-based CHD days, the projected changes in frequency vary from an increase of less than 1 day in the FGOALS-g3 GCM to more than 7 days in the CNRM-CM6-1-HR GCM with a multi-model mean change (i.e. mean change across all 23 GCMs) of ∼3 days per summer season over the CONUS in the end-century (figure 4(a)). At the regional scale, the Southwest exhibits the largest (∼5 days) and the Northeast exhibits the smallest (∼1 day) increase in spatially averaged multi-model mean change in SPI-based CHD days (figures 4(b) and (h)). Moreover, with a few exceptions, GCMs that project comparably larger average changes over the CONUS also tend to project large changes over all the other regions. We also find that the projected changes in the frequency of SPI-based CHD days are positively correlated with projected changes in the frequency of SPI-based drought days across all GCMs (figures 4(i)-(p)). There is also a large inter-model variability in the projected changes in the SPI drought days over the CONUS with a few GCMs projecting decreases (e.g. FGOALS-g3, UKESM1-0-LL) and others projecting increases (e.g. MPI-ESM1-2-LR, CNRM-CM6-1-HR). Projected changes in SPI-based drought intensity during CHD days also exhibit large variability: droughts are projected to become less severe for a few GCMs and over a few regions (e.g. Northwest, Midwest, and Northeast) and more severe over the rest (figures 4(y)-(af)). The changes in the frequency of SPI-based CHD days are also associated with those in drought intensity for the CONUS and over a few regions (e.g. Southwest, South Central, North Central, and Midwest) but are not generally associated with the projected changes in frequency or intensity of heatwaves (figures 4(q)-(an)). This implies that drought is the primary determinant of the characteristics of SPI-based CHD days given that intensity and frequency of heatwaves are projected to increase across all the GCMs and regions. Additionally, the projected multi-model mean changes in intensity and frequency of heatwaves show less regional variability compared to droughts (figures 4(q)-(x) and (ag)-(an)). For instance, multi-model mean changes in heatwave frequency vary from ∼35 days over the Midwest to ∼39 days over the Southwest (figures 4(r) and (x)) and changes in heatwave intensity vary from 1.5 °C over South Central to 2.6 °C over North Central (figures 4(a)-(i) and (an)). Nevertheless, we still find large inter-model differences in projected heatwave characteristics. For example, the CONUS projected changes in heatwave days range from ∼15 days in the FGOALS-g3 to ∼54 days in the UKESM1-0-LL and changes in heatwave intensity during CHD days range from 0.76 °C in FGOALS-g3 to 3.65 °C in the HadGEM3-GC31-LL (figures 4(q)-(x) and (ag)-(an)).
In the case of SPEI-based droughts, we find that the multi-model mean change in CHD days is ∼22 days over the CONUS in the end-century. The projected changes in the frequency of SPEI-based CHD days range from ∼6 days in the FGOALS-g3 to ∼35 days in HadGEM3-GC31-MM over the CONUS (figure 5(a)). The FGOALS-g3 and HadGEM3-GC31-MM also project comparably lower and higher changes in the frequency of SPEI-based CHD days over other regions, respectively (figures 5(a)-(h)). The projected changes in the frequency of SPEI-based CHD days are much higher than SPI-based CHD days (figures 5(a)-(h) and figures 4(a)-(h)). For instance, while both SPEI- and SPI-based CHD days project the largest change over the Southwest and the smallest change over the Northeast, the magnitude of change using SPEI-based droughts is much higher over both the regions (figures 4(b), (h), 5(b), and (h)). Moreover, we also find variations in the relative magnitudes of change among GCMs depending on the choice of drought indices. For example, GCMs that project larger changes in the frequency of SPI-based CHD days do not necessarily project a similar relative level of change in SPEI-based CHD days, and vice versa. For instance, CNRM-CM6-1-HR projects the largest CONUS scale change for SPI whereas it projects moderate level change for SPEI-based extremes. Similarly, the HadGEM3-GC31-MM model projects the largest CONUS scale change for SPEI but it projects only moderate levels of change for SPI. Unsurprisingly, for SPEI-based CHD days, the projected changes in frequencies are closely associated with changes in frequencies and intensities of both drought and heatwaves (figures 5 and S6).
Additionally, to understand the role of land-atmosphere feedbacks during the CHD days, we plot the projected changes in temperature against projected changes in EF, with EF defined as the ratio of latent heat flux to total energy flux (latent + sensible flux) (figures 6 and S7). For both SPI and SPEI, we find a correlation coefficient of >0.5 between Tmax and EF among the GCMs for three, and between 0.4 and 0.5 for two, out of seven regions. Generally, we find that models with larger increases in EF (i.e. higher partitioning towards latent heat flux) tend to exhibit comparatively smaller increases in Tmax during the CHD days. For instance, in the far-future period, we find that the HadGEM3-GC31-LL, which projects the largest increase in heatwave intensity during CHD days over the Northeast, North Central, and Midwest (figures 4 and 5) also projects the largest decrease in EF for both SPI- and SPEI-based CHD days over these same regions (figures 6(f)-(h) and 6(n)-(p)). Moreover, in the case of SPI-based CHD days, the HadGEM3-GC31-LL exhibits only small increases in temperature and positive changes in EF over Southwest, South Central, and Southeast regions (figures 6(b)-(d)). In some cases, however, this association does not hold, e.g. for UKESM1-0-LL over Southwest and North Central regions and the IPSL-CM6A-LR over the South Central region (figures 6(b), (g), (j), and (o)). Overall, for SPI-based extremes, we find a weak association between temperature and EF over the South Central and Southeast regions. For SPEI-based extremes, we find weak associations over the CONUS, Southwest, and South Central regions. The results for the mid-century period (2021-2060) are generally similar except for a weaker correlation over the Northwest and a comparatively stronger correlation over the North Central and Southeast (figure S7).
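To make the EF definition and the inter-model association concrete, the sketch below computes EF from latent and sensible heat fluxes and a Pearson correlation between two lists of per-model changes. The function names are illustrative assumptions, the flux values are made up, and the text does not state which correlation estimator was used, so Pearson is an assumption here.

```python
import math
from typing import List


def evaporative_fraction(latent: float, sensible: float) -> float:
    """EF = latent / (latent + sensible), with both fluxes in the same units (e.g. W m-2)."""
    total = latent + sensible
    if total == 0:
        raise ValueError("total turbulent flux is zero; EF is undefined")
    return latent / total


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Toy example (made-up fluxes): 60 W m-2 latent and 40 W m-2 sensible.
ef = evaporative_fraction(60.0, 40.0)

# Toy per-model changes (made-up): larger EF decreases paired with larger warming.
d_ef = [-0.06, -0.03, 0.00, 0.02]
d_tmax = [3.4, 2.6, 2.1, 1.5]
r = pearson(d_ef, d_tmax)
```

A negative coefficient in this toy example mirrors the tendency described above, where models that shift partitioning towards sensible heat warm more during CHD days.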
Furthermore, our analysis indicates that GCMs with lower biases in historical extreme metrics, denoted by relatively smaller circles, tend to display reduced variability in both temperature and EF changes in both mid- and end-century periods over some regions (figures 6 and S7). We also find differences in the mean and standard deviation of temperature and EF changes based on the selection of GCMs according to their biases. Some regions exhibit a greater spread in changes when using high-bias GCMs, while in other regions, a larger spread is seen with low-bias GCMs. For instance, for the end-century period, the standard deviations in change tend to be higher for GCMs with higher bias over five (three) out of the seven regions for Tmax and four (four) out of seven regions for EF for SPEI (SPI). Similarly, in the mid-century, the standard deviation is larger for five (four) and three (three) regions in Tmax and EF changes, respectively, for SPEI (SPI). However, it is worth noting that these differences are generally not statistically significant (table S2). Furthermore, in certain regions, the majority of GCMs with lower bias tend to cluster within the same quadrant as the multi-model mean. For instance, for end-century changes associated with SPI, 9 out of 10 models over the Northwest, and 8 out of 10 models over the Northeast, each with the lowest bias, lie in the same quadrant as the multi-model mean (figures 6(e) and (h)). Similarly, for far-future changes associated with SPEI, 9 out of 10 models fall in the same quadrant as the multi-model mean over both the Southwest and North Central regions (figures 6(j) and (o)).
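The quadrant comparison can be illustrated with a short sketch that counts how many models share the signs of the reference change in both temperature and EF. The function name and the sample values are illustrative assumptions; by default the reference is the mean of the models passed in, whereas the multi-model mean across all 23 GCMs could be supplied explicitly through the optional argument.

```python
from typing import List, Optional, Tuple

Change = Tuple[float, float]  # (change in Tmax, change in EF) for one model


def sign(value: float) -> int:
    """Return 1, 0, or -1 according to the sign of `value`."""
    return (value > 0) - (value < 0)


def same_quadrant_count(changes: List[Change], reference: Optional[Change] = None) -> int:
    """Count models whose (Tmax, EF) changes fall in the same quadrant as the reference change."""
    if reference is None:
        n = len(changes)
        reference = (sum(c[0] for c in changes) / n, sum(c[1] for c in changes) / n)
    ref_quadrant = (sign(reference[0]), sign(reference[1]))
    return sum(1 for dt, de in changes if (sign(dt), sign(de)) == ref_quadrant)


# Toy example (made-up changes for four hypothetical low-bias models).
low_bias_changes = [(2.0, -0.05), (1.5, -0.02), (3.0, 0.01), (2.5, -0.03)]
agreeing = same_quadrant_count(low_bias_changes)
```

In this toy case the mean change is warming with decreasing EF, and three of the four models fall in that quadrant.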

Summary and discussion
The CMIP6 GCMs exhibit substantial biases in simulated characteristics of both individual heatwave and drought events as well as CHD events compared to PRISM and ERA5 reference datasets. While the sign of the bias is generally consistent, with a few exceptions, there is substantial variability in the magnitude of the bias among the GCMs. Additionally, we find noticeable differences in model biases depending on the choice of reference datasets. Particularly, the bias in the frequency of heatwave days, SPI drought days, and CHD days exhibits marked differences when calculated with respect to the two reference datasets. Thus, observational uncertainty is a limiting factor for fully understanding the historical performance of GCMs. While utilizing bias-corrected datasets can mitigate observational bias relative to the reference data used for correction, there is a necessity for further investigation of the potential consequences of biases in future projections. This emphasizes the significance of incorporating multiple reference datasets when evaluating the performance of GCMs and conducting bias correction.
In the future period, the area under the CHD days and the intensity and frequency of heatwave and drought during CHD days are projected to increase for both SPI- and SPEI-based extremes over the majority of the CONUS. However, there are noticeable differences in the projected changes that are associated with the choice of drought indices. The projected increases in the frequency and intensity of SPEI-based drought and CHD days are substantially larger than those derived from SPI. Further, there is an increase in the frequency and intensity of SPEI-based drought days across the CONUS whereas the projected changes in the frequency and intensity of SPI-based drought days show large variability with an increase over the Southwest, South Central, and parts of the Southeast and a decrease over the Northeast. The frequency and intensity of heatwave days are also projected to increase across the CONUS. Overall, this results in an increase in frequency for both SPI- and SPEI-based CHD days across the CONUS. Even though the changes in temperature are accounted for in the heatwave metric, CHD extremes derived from the SPI exhibit a lower increase in comparison to those derived from SPEI. This suggests that the projected changes in CHD days are predominantly driven by the choice of the drought index. As opposed to SPI, which solely accounts for precipitation deficits, SPEI considers both precipitation and potential evapotranspiration (PET) and accounts for the impact of increasing temperature on evaporative demand and consequently on drought severity. We note that in our analysis SPEI does not directly use evaporation. Instead, it relies on estimates of PET, calculated as a function of temperature, following the Thornthwaite method. Using these estimates to construct the SPEI metric provides a means to represent the temperature impacts on soil/plant water availability. This representation provides a better indication of agricultural and ecological drought stress, a dynamic that SPI, which primarily reflects meteorological droughts, does not capture. Therefore, while the use of SPI may be more suitable for short-term planning, the use of SPEI can help discern how the observed droughts are influenced by temperature increases, which is essential for understanding the long-term impacts of climate change. Overall, the use of both SPI and SPEI enhances the robustness of analysis by considering both precipitation and temperature effects on drought and consequently on the compound extremes. These results are also consistent with the conclusions drawn in the working group 1 summary for policymakers of the Sixth Assessment Report by the Intergovernmental Panel on Climate Change (IPCC), which indicate a more pronounced increase in ecological and agricultural drought when compared to hydrological drought [30].
We also find substantial inter-model differences in the projected changes among the GCMs for heatwaves as well as SPI- and SPEI-based drought and CHD days. Additionally, we find the changes in temperature during the CHD days to be closely associated with the changes in the partitioning of energy fluxes: GCMs that project a decrease in EF (i.e. higher partitioning towards sensible heat) tend to predict more substantial increases in heatwave intensity, and vice versa, although there are some exceptions. Furthermore, we note considerable variability in these changes across the GCMs. However, when we sub-select GCMs with lower historical bias, we find reduced variability in the projected changes related to both temperature and EF over some regions. Overall, this suggests that a more accurate simulation of these processes in the models can yield more robust projections of extremes across the GCMs, which are crucial for accurate impact assessment.
This study evaluates historical biases in the characteristics of extremes and the key differences in the projected future changes, particularly inter-model variations among the GCMs, regional variations within each individual GCM's projections, and differences from drought metrics over the CONUS and seven regions. Overall, these results can serve as a guide for GCM interpretation and selection for regional-scale impact assessment where the projected changes in extreme events are crucial. The study also highlights the value of further reducing observational uncertainty to aid in the evaluation, selection, and improvement of GCMs over time.

Figure 1. Spatial plots showing biases in CMIP6 multi-model mean with respect to the PRISM dataset calculated over the historical period (1981-2014). Shown are biases in the number of (a) heatwave days/decade, (b) SPI drought days/decade, and (c) SPEI drought days/decade. (d) The average daily maximum temperature (heatwave intensity) and (e) drought intensity during SPI-based CHD days, and (f) drought intensity for SPEI-based CHD days. Also shown are biases in the percentage of (g) heatwave days, (h) SPI drought days, and (i) SPEI drought days occurring as CHD days. (j) Heatmap showing the area average CONUS scale biases in various extreme characteristics for each of the CMIP6 GCMs. The values in the heatmap are normalized using the multi-model mean bias for the sake of presentation. GCMs are arranged from left to right based on the lowest to highest mean absolute bias across the metrics. Color bar 1 corresponds to all the metrics except for average SPI and SPEI drought severity for which color bar 2 is used. Red (blue) color on spatial maps shows higher (lower) frequency and severity of events in CMIP6 relative to PRISM. Negative drought intensity denotes more severe droughts. Boxes marked with asterisks show models with bias smaller than multi-model mean bias.

Figure 2. Stacked plots showing mean percentage of the CONUS area under only heatwaves (red), only drought (orange), and compound extremes (purple) during all summer days (June-August) from 1980 to 2100 based on (a) SPI and (b) SPEI drought indices.

Figure 3. Spatial maps showing projected end-century changes in the multi-model mean for the (a) number of heatwave days, (b) number of SPI-based drought days, (c) number of SPEI-based drought days, (d) number of SPI-based CHD days, (e) number of SPEI-based CHD days, (f) heatwave intensity for SPI-based, (g) drought intensity for SPI-based, and (h) drought intensity for SPEI-based CHD days. The end-century changes are calculated as the average differences during the summer months between 2061-2100 and the 1981-2020 period.

Figure 4. Bar plots showing the end-century projected changes in the (a)-(h) number of CHD days, (i)-(p) number of drought days, (q)-(x) number of heatwave days, (y)-(af) drought intensity during CHD days, and (ag)-(an) heatwave intensity during CHD days. Changes show the average differences between the end-century (2061-2100) and historical periods (1981-2020) using 23 CMIP6 GCMs (and the multi-model mean) and averaged over the CONUS and (ao) the seven regions. Drought characteristics are calculated using the SPI index.

Figure 5. Same as figure 4 but for SPEI drought index.

Figure 6. Comparison between changes in temperature and changes in evaporative fraction during (a)-(h) SPI- and (i)-(p) SPEI-based CHD days averaged over the CONUS and seven regions. End-century changes (2061-2100) are calculated relative to the historical period (1981-2020). The size of the circle represents the normalized mean bias across all the metrics presented in figure 1(j). The smallest circle corresponds to the GCM with the least bias, while the largest circle represents the GCM with the highest bias.