Assessment of precipitation and near-surface temperature simulation by CMIP6 models in South America

This study evaluated the performance of 50 global climate models (GCMs) from the Coupled Model Intercomparison Project Phase 6 (CMIP6) in simulating the statistical features of precipitation and air temperature in five subdomains of South America during the historical period (1995–2014). Monthly precipitation and temperature simulations were validated with data from the Climate Prediction Center Merged Analysis of Precipitation, the Global Precipitation Climatology Project, and the ERA5 reanalysis. The models’ performance was evaluated using a ranking analysis with statistical metrics such as mean, standard deviation, Pearson’s spatial correlation, annual cycle amplitude, and linear trend. The analyses considered the representation of precipitation and air temperature separately for each subdomain, the representation for all five regions together, and the joint representation of precipitation and air temperature for all five subdomains. In the Brazilian Amazon, the best-performing models were EC-Earth3-Veg, INM-CM4-8, and INMCM5-0 (precipitation), and IPSL-CM6A-LR, MPI-ESM2-0, and IITM-ESM (temperature). In the La Plata Basin, KACE-1-0-G, ACCESS-CM2, and IPSL-CM6A-LR (precipitation), and GFDL-ESM4, TaiESM1, and EC-Earth3-Veg (temperature) yielded the best simulations. In Northeast Brazil, SAM0-UNICON, CESM2, and MCM-UA-1-0 (precipitation), and BCC-CSM2-MR, KACE-1-0-G, and CESM2 (temperature) showed the best results. In Argentine Patagonia, the GCMs ACCESS-CM2, ACCESS-ESM1-5, and EC-Earth3-Veg-LR (precipitation), and CAMS-CSM1-0, CMCC-CM2-HR4, and GFDL-ESM4 (temperature) performed best. Finally, for Southeast Brazil, the models MPI-ESM1-2-HR, EC-Earth3-CC, and AWI-ESM-1-1-LR (precipitation), and TaiESM1, CMCC-CM2-HR4, and CMCC-ESM2 (temperature) yielded the best simulations. The joint evaluation of the regions and variables indicated that the best models are CESM2, TaiESM1, CMCC-CM2-HR4, FIO-ESM-2-0, and MRI-ESM2-0.


Introduction
Global climate models (GCMs) are crucial for investigating natural climate variability and climate change induced by increases in greenhouse gas concentrations [1]. In this sense, the World Climate Research Program (WCRP) created the Coupled Model Intercomparison Project (CMIP) in 1995 to systematize the outputs of coupled ocean-atmosphere models from various global climate centers [2,3]. CMIP is currently in its sixth phase (CMIP6), and its outputs provide essential information for the Intergovernmental Panel on Climate Change (IPCC), as well as for various climate modeling studies [2,3].
Each new generation of CMIP is based on the premise that the new GCMs will improve on their previous versions, given the progressive evolution in computational efficiency, resolution, and representation of physical processes [1,4]. However, even with significant advances, GCMs still have limitations due to their horizontal resolution and representation of sub-grid processes, which restrict their application in studies of climate change impacts at regional and local scales. Given this, downscaling techniques are helpful for the spatial refinement of climate projections and enable a more reliable analysis of mitigation and adaptation to climate change at regional and local scales. In this context, a sub-selection of GCMs from the various CMIP generations may be convenient before downscaling is applied, both to reduce the computational cost and to choose the models that best represent the climate processes relevant to the region of interest [1,5].
Several studies have evaluated the performance of CMIP6-GCMs in different regions of the globe, such as North America [1,6,7], Central America [7,8], Africa [9-12], Asia [13-17], Europe [18,19], and Oceania [20]. For South America (SA), such analyses are even more challenging and necessary, given that the continent has a complex climate resulting from its great latitudinal extension and topographic heterogeneity [21,22]. In addition, SA has global relevance, given that it concentrates some of the most important biodiversity areas in the world, harboring many endemic species threatened by anthropogenic activities [23]. Many regions of the continent are vulnerable to reduced water availability, increased flooding and overflows, decreased food production, and rising incidence of vector-borne diseases [24]. Over the last four decades, various parts of SA have experienced a reduction in rainfall volumes, indicating an expansion of dry subtropical zones and a greater frequency of drought events over these regions [25], directly impacting millions of people and fundamental socio-economic activities such as agriculture and electricity generation.
Realistically representing the complex features of the South American climate and topography is still challenging for GCMs. Despite their good ability to describe the average characteristics of precipitation, temperature, and atmospheric circulation, many models still present systematic errors in precipitation magnitudes over regions such as the Andes and the La Plata Basin, as well as limitations in simulating mesoscale processes associated with the Andes and the genesis of mesoscale convective complexes in southeastern SA [7]. Regarding the latest generation of CMIP6 models, studies have evaluated their ability to represent the main characteristics of the South American climate in recent decades and the climate projections. Rivera and Arnould [25] assessed the performance of 14 CMIP6 models in simulating the temporal and spatial patterns of rainfall over southwestern SA for 1995-2014. Although most models captured the main characteristics of regional rainfall, they overestimated precipitation along the Andes and southern Chile. In addition, the models were adept at representing the decline in precipitation during the 20th century but showed a spatial deviation of the rainfall maximum.
Dias and Reboita [3] evaluated the performance of 46 CMIP6 GCMs in representing the spatial and temporal variability of precipitation and air temperature over tropical SA, considering the historical period 1996-2014. The authors concluded that an ensemble of the best models yields results closer to the present-climate validation data than the ensemble of all 46 models. Similarly, Oliveira et al [26] analyzed the performance of 50 GCMs from CMIP6 in simulating precipitation for the historical period (1995-2014) only in southeastern Brazil and southern Amazonia. They concluded that in the Amazon region, the models performed best during MAM and JJA, while in the Southeast, the best performance occurred in DJF. Likewise, Firpo et al [27] investigated the performance of 35 CMIP6-GCMs in simulating the temperature and precipitation of the historical period in Brazil. They observed that the models perform better for winter rainfall than for summer rainfall, and that the La Plata Basin (Brazilian Northeast) showed the best (worst) results.
Almazroui et al [4] assessed the performance of 38 CMIP6 models in simulating the climate of South America over a recent period (1995-2014) and their climate projections for the periods 2040-2059 and 2080-2099 under different emission scenarios (SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5). Although the models successfully represented the continental climate, many showed varying performance when simulating the spatial and temporal distribution of rainfall over SA's highest latitudes and altitudes. Regarding climate projections, the models indicated decreased rainfall in Amazonia and in the northern and southern sectors of the Chilean Andes. On the other hand, the increase in temperature proved robust even under the least extreme scenario, SSP1-2.6. In general, the authors identified that these projections agree with those obtained by previous versions of CMIP3 and CMIP5.
Ortega et al [28] analyzed the seasonal simulation of precipitation and temperature in SA by 49 CMIP5 and 33 CMIP6 models. Although the models consistently characterized the variables spatially and temporally, most showed systematic errors over the oceans. The authors identified greater accuracy of the models from December to May, possibly associated with a better ability to represent the Intertropical Convergence Zone (ITCZ) in this period. In addition, CMIP6 models reduced the biases compared to CMIP5 and improved the representation of climate patterns in the Brazilian Midwest and Bolivian Chaco regions. At the same time, sectors such as the Andes, central Chile, and the Guyanas continue to show less satisfactory results. In general, not all CMIP6 models agree on the projected signal of reduced rainfall over most of the continent. In contrast, there is greater consensus on temperature, indicating persistent warming over the historical period and an average increase of up to 6 °C by the end of the 21st century under the most severe scenario.
Arias et al [24] used 49 CMIP5 and 33 CMIP6 models to investigate their ability to simulate seasonal rainfall and temperature patterns in northern SA. Although CMIP6 presents corrections and improvements compared to CMIP5, there are still systematic errors in the representation of the ITCZ and topography-dependent processes, factors that strongly influence the variability of precipitation and temperature in the region. In addition, the authors found that CMIP6 shows larger temperature deviations over the Andes than CMIP5. Concerning climate projections, the models indicate increased temperature and mixed signals in rainfall, with declines over the Orinoco and Colombian Amazonia and an increase over the eastern equatorial Pacific.
There are different methods for evaluating GCMs, such as institutional democracy and weighting approaches. Through institutional democracy, a GCM is selected from each modeling institute, assuming that each simulation of the model presents an equally valid and probable future scenario [29-31]. Although this is an efficient way of dealing with model dependence, it may not continue to be effective in future studies as institutes increasingly copy or collaboratively develop their models and components [32]. The weighting approach assesses the model's performance in simulating past climate and assigns a weight to this performance in future projections [29,33]. Although a good simulation of historical climate does not determine more accurate climate projections, if a model fails to simulate aspects of past climate, it will probably produce less reliable projections of future climate [29,34].
Given this framework, this study aims to evaluate CMIP6 models in terms of their ability to recreate historical climate statistics (1995-2014) for five South American subdomains. In this context, we point out that previous studies have also evaluated the performance of CMIP6 models in representing the statistical characteristics of the present climate in SA [3,4,26,27,35], but with different purposes and subdomains analyzed. Here, we evaluate critical regions for energy studies in SA, as we are interested in the renewable resources present in these sectors. Thus, we do not follow the areas defined by the IPCC since they are broad and mix climate characteristics from different sectors (a discussion of the representativeness of these areas is carried out in the WCRP Regional LAC Concept Paper Challenges for Climate Change Adaptation in Latin America and the Caribbean Region). In addition, our study differs in the number of models evaluated since the works cited analyzed a smaller number of GCMs.
Furthermore, here we revisit the ranking methodology consolidated in the literature [29], which has recently regained relative notoriety with a study in Australia [36]. This methodology is applied because it is objective, ranks and synthesizes much information, and has not yet been used for SA. Lastly, we highlight that this study is part of the R&D project 'Hydro, wind, and solar energy in South America: Changes projected by CMIP6 climate models' [37-39], so one of the objectives is to evaluate the best models to represent the South American climate and their use in statistical and dynamical downscaling techniques on the continent. Therefore, our goals include providing the final information by applying the ranking method and contributing to the South American energy sector since this study meets a demand from the energy industry.

Study area
Given the territorial extension and climate heterogeneity of SA, we addressed the analyses for five regions of the continent (figure 1): southern Amazonia and portions of the Brazilian Midwest (AMZ, R1; 5 [...] W-52° W), the La Plata Basin (LPB, R2), Northeast Brazil (NEB, R3), Argentine Patagonia (PAT, R4), and Southeast Brazil (SEB, R5). These sectors followed Ferreira and Reboita [22], who conducted a spatial clustering analysis with monthly rainfall data for SA. Thus, the subdomains analyzed comprise regions with homogeneous precipitation patterns, unlike the areas defined by the IPCC, which are broad and encompass the climate characteristics of different sectors. In this way, it is possible to examine the models' regional performance and optimize the selection of GCMs in various regions of the continent.

CMIP6-GCMs
Monthly data of precipitation and 2 m air temperature from 50 GCMs (table 1) were obtained from the Earth System Grid Federation platform (available at https://esgf-data.dkrz.de/projects/cmip6-dkrz/) for the historical period from January 1995 to December 2014. Although the historical CMIP6 simulations cover the period 1850-2014, in this study, we analyzed climate simulations for 1995-2014, as also established by the AR6 Working Group [40] and used in several studies [1,3,4,7,13,27].

Reference data
We evaluated the GCMs' performance by comparing them with the ERA5 reanalysis and precipitation data from the Climate Prediction Center Merged Analysis of Precipitation (CMAP) [41] and the Global Precipitation Climatology Project (GPCP) [42]. Hourly data of 2 m air temperature from ERA5 and daily precipitation from the CMAP and GPCP over 1995-2014 were employed. For precipitation, we obtained rainfall in continental areas by averaging the two datasets (CMAP and GPCP) and using only the GPCP data for oceanic regions. For validation, we interpolated the CMIP6 historical simulations to the same spatial resolution as the reference data (0.5°) using the bilinear interpolation technique [43].
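The two preprocessing steps described above (blending CMAP and GPCP over land, and bilinear regridding) can be illustrated with a minimal pure-Python sketch. It is not the processing chain used in the study: the function names `blend_reference`, `bilinear`, and `regrid`, the nested-list grid layout, and the land mask are all hypothetical, and the sketch assumes a regular grid with strictly increasing coordinates, ignoring longitude wrap-around and missing values.

```python
from bisect import bisect_right


def blend_reference(cmap, gpcp, is_land):
    """Mean of CMAP and GPCP over land cells, GPCP alone over ocean cells."""
    return [
        [0.5 * (c + g) if land else g for c, g, land in zip(c_row, g_row, m_row)]
        for c_row, g_row, m_row in zip(cmap, gpcp, is_land)
    ]


def bilinear(field, lats, lons, lat, lon):
    """Bilinearly interpolate field[lat_index][lon_index] to one target point.

    lats and lons must be strictly increasing and the target must lie
    inside the grid (assumption of this sketch).
    """
    # Lower-left corner of the grid cell that contains the target point.
    i = min(max(bisect_right(lats, lat) - 1, 0), len(lats) - 2)
    j = min(max(bisect_right(lons, lon) - 1, 0), len(lons) - 2)
    # Fractional position of the target inside that cell (0 to 1).
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    # Interpolate along longitude on both latitude rows, then along latitude.
    south = (1 - tx) * field[i][j] + tx * field[i][j + 1]
    north = (1 - tx) * field[i + 1][j] + tx * field[i + 1][j + 1]
    return (1 - ty) * south + ty * north


def regrid(field, lats, lons, new_lats, new_lons):
    """Interpolate a whole field to a new (for example 0.5 degree) grid."""
    return [[bilinear(field, lats, lons, la, lo) for lo in new_lons] for la in new_lats]
```

In practice, gridded climate data are usually regridded with dedicated tools; the sketch only makes the arithmetic behind the two steps explicit.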
Assessing precipitation patterns in SA is a complex task due to the continent's low density of meteorological stations. Thus, using products from satellites, radar, reanalysis, and numerical simulations is one way to overcome these constraints associated with inherent observational uncertainty. In this context, da Rocha et al [44] compared the performance of six gridded datasets with local stations, concluding that, although most datasets did not capture more refined aspects of precipitation (such as the phase and amplitude of the diurnal cycle and intensity of heavy events), they satisfactorily represented other characteristics like the seasonal mean and subseasonal variability. Furthermore, the CMAP and GPCP datasets used here strongly agree over terrestrial regions [45], mainly over tropical South American areas [46-48]. Additionally, GPCP shows greater accuracy over oceanic areas [45,47,48]. We emphasize that the dataset used as a reference for analyzing the precipitation in each SA subdomain refers only to the CMAP/GPCP averaged ensemble over continental areas. Nevertheless, precipitation from GPCP over oceanic regions is also relevant to visually assess the models' ability to simulate systems such as the ITCZ and the South Atlantic Convergence Zone.
On the other hand, we stress that the selection of a single dataset for temperature evaluation stems from the good performance of ERA5, reiterated in several studies that demonstrate its ability to represent the diurnal [49], monthly [50], and annual [51] cycles of land surface temperature in SA.

GCMs evaluation and model ranking by overall performance
We compared the monthly precipitation and air temperature at 2 m from the 50 CMIP6-GCMs (table 1) to the reference data. Initially, we calculated seasonal averages (DJF, MAM, JJA, and SON) and annual averages of each variable, model, and reference dataset. Subsequently, the models' biases were calculated and presented in maps. For brevity, only the DJF and JJA maps are shown here. Regarding precipitation, for a given model i and observation obs, relative bias (RB) is calculated according to equation (1), where biases are represented in percentage values:
\[ \mathrm{RB}_i = \frac{P_i - P_{\mathrm{obs}}}{P_{\mathrm{obs}}} \times 100 \qquad (1) \]
where $P_i$ and $P_{\mathrm{obs}}$ are the simulated and reference precipitation, respectively.
In addition, a regional analysis was carried out, calculating regional and seasonal averages and other statistical parameters (standard deviation, spatial correlation, annual cycle amplitude, and linear trend), which are presented in heatmaps. The statistical metrics include:
(a) Mean and standard deviation: the seasonal mean and standard deviation were calculated for each year (1995-2014);
(b) Spatial correlation: used to assess the similarity of the spatial patterns between the model outputs and the reference dataset. It was calculated for each season and year of the historical period using Pearson's coefficient;
(c) Annual cycle amplitude: the average amplitude of the seasonal cycle is defined as the difference between the hottest and coldest months for temperature and between the wettest and driest months for precipitation;
(d) Linear trend: for the entire time series (not divided into seasons), the linear trend was calculated using the least squares method, and the angular coefficient (slope of the line) was selected, which indicates whether the trend is positive or negative.
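As an illustration of the metrics listed above, the following pure-Python sketch computes a relative bias, a seasonal mean, a Pearson correlation, an annual cycle amplitude, and a least-squares trend. All function names are hypothetical, and the sketch makes simplifying assumptions that the text does not confirm: the correlation is unweighted (no area weighting of grid cells), and the seasonal mean is taken from a 12-month climatology, so the DJF year boundary is ignored.

```python
from math import sqrt

# Calendar months (1-12) that make up each season.
SEASONS = {"DJF": (12, 1, 2), "MAM": (3, 4, 5), "JJA": (6, 7, 8), "SON": (9, 10, 11)}


def mean(values):
    return sum(values) / len(values)


def relative_bias(model, obs):
    """Relative bias of a simulated value with respect to the reference, in percent."""
    return 100.0 * (model - obs) / obs


def seasonal_mean(monthly, season):
    """Mean over the months of a season; monthly maps month number to a value."""
    return mean([monthly[m] for m in SEASONS[season]])


def pearson(x, y):
    """Pearson correlation between two flattened fields of equal length."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def annual_cycle_amplitude(climatology):
    """Difference between the largest and smallest monthly climatological values."""
    return max(climatology) - min(climatology)


def linear_trend(series):
    """Least-squares slope of the series per time step."""
    t = range(len(series))
    mt, ms = mean(t), mean(series)
    num = sum((a - mt) * (b - ms) for a, b in zip(t, series))
    return num / sum((a - mt) ** 2 for a in t)
```

For instance, a simulated seasonal total of 120 mm against a reference of 100 mm gives `relative_bias(120, 100)` equal to 20, that is, a 20% wet bias.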
Ranking GCMs according to performance is not trivial since various statistical metrics and seasons are analyzed. To simplify and compile all the information, we standardized the metrics (assigning equal weight or importance to each metric), applying the methodology of Rupp et al [29], and ranked the models according to their performance. For a given model i and metric j, the bias $E_{i,j}$ is calculated:
\[ E_{i,j} = \left| x_{i,j} - x_{\mathrm{obs},j} \right| \qquad (2) \]
where $E_{i,j}$ is the absolute error (absolute value of the bias), and $x_{i,j}$ and $x_{\mathrm{obs},j}$ are the simulated and observed metrics, respectively. The next step is to calculate the relative error $E^{*}_{i,j}$, which can be described as a standardized bias time series:
\[ E^{*}_{i,j} = \frac{E_{i,j} - \min(E_{i,j})}{\max(E_{i,j}) - \min(E_{i,j})} \qquad (3) \]
where $\min(E_{i,j})$ and $\max(E_{i,j})$ are functions used to select the minimum and maximum values of a time series. If the metric is a correlation, each $\min(E_{i,j})$ or $\max(E_{i,j})$ is reversed. In this metric, the absolute error $E_{i,j}$ is divided by the amplitude of the error; $E^{*}_{i,j} = 0$ means that the model performs perfectly, while $E^{*}_{i,j} = 1$ indicates poor performance. Then, according to Rupp et al [29], the relative errors associated with each statistical metric j of a model i are added together, providing the total relative error:
\[ E_{\mathrm{tot},i} = \sum_{j} E^{*}_{i,j} \qquad (4) \]
The final step is to rank the models according to their total relative error. We divided the total relative error by its maximum value to express the error on a scale of 0-1. This information is shown in the 'overall' column of the heatmaps.
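The ranking steps described above can be sketched in a few lines of Python. This is only one possible reading of the procedure, not the authors' code: the function names `standardize` and `rank_models` and the dictionary layout are hypothetical, the minimum and maximum are taken across models for each metric, correlations are standardized with the minimum and maximum swapped so that the highest correlation receives a relative error of 0, and the summed error is divided by its largest value across models.

```python
def standardize(values, reverse=False):
    """Min-max standardize one metric across models (0 = best, 1 = worst).

    values maps model name to an absolute error, or to a correlation when
    reverse=True, in which case min and max are swapped.
    """
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:  # all models identical for this metric
        return {m: 0.0 for m in values}
    if reverse:
        lo, hi = hi, lo
    return {m: (v - lo) / (hi - lo) for m, v in values.items()}


def rank_models(simulated, observed, correlations=None):
    """Return (model, overall score) pairs sorted from best to worst.

    simulated:    metric name -> {model name: simulated value}
    observed:     metric name -> reference value
    correlations: label -> {model name: correlation with the reference}
    """
    models = list(next(iter(simulated.values())))
    total = dict.fromkeys(models, 0.0)
    for metric, values in simulated.items():
        # Absolute error of each model, then relative error for this metric.
        abs_err = {m: abs(values[m] - observed[metric]) for m in models}
        for m, e in standardize(abs_err).items():
            total[m] += e
    for values in (correlations or {}).values():
        for m, e in standardize(values, reverse=True).items():
            total[m] += e
    # Rescale the summed relative error to the range 0-1.
    worst = max(total.values())
    overall = {m: (t / worst if worst > 0 else 0.0) for m, t in total.items()}
    return sorted(overall.items(), key=lambda item: item[1])
```

With three hypothetical models A, B, and C, calling `rank_models({"mean": {"A": 5.0, "B": 6.0, "C": 9.0}}, {"mean": 5.0})` places A first with a score of 0 and C last with a score of 1.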

Results and discussion
Other studies also reported the drier bias of CMIP5 [100] and CMIP6 [4,24,27,28] models in sectors such as northern SA and northern Brazil. These systematic model errors are generally associated with a less satisfactory representation of the ITCZ, which is attributed to the models' oversensitivity to sea surface temperature (SST) and failure to simulate surface wind convergence in the equatorial zone [100]. According to Ortega et al [28], the CMIP6 models improved the representation of rainfall volumes over SA compared to the CMIP5 models. However, they are still deficient in simulating the position and intensity of the ITCZ, which partially justifies the negative rainfall biases over the northern sector of the continent. A similar pattern occurs in Amazonia, where models tend to underestimate rainfall due to insufficient representation of the different processes that occur in the biome, such as cumulus convection, biosphere-atmosphere interactions, surface processes, and soil moisture. The low coverage of rainfall stations in the region also limits the analysis of the magnitude and location of rainfall in the area [101]. In contrast, the GCMs overestimate rainfall over the Andean region of Chile, Bolivia, Peru, and Ecuador due to excess modeled convective rainfall and deficiencies in topographic representation [4,25]. In this context, validating climate simulations in these regions has many uncertainties due to the scarcity of rainfall stations in mountainous areas [25].
Considering the austral winter (JJA, figure 3), most models maintain the underestimation of rainfall north of SA and amplify the dry bias over practically the entire continent. While some models, such as CESM2-WACCM (figure 3(j)), CESM2 (figure 3(k)), and FGOALS-f3-L (figure 3(x)), have a primarily dry behavior, other models, such as ACCESS-CM2 (figure 3(a)) and KACE-1-0-G (figure 3(kk)), overestimate rainfall over most of SA. However, even with a wetter simulation pattern, these models underestimate rainfall [...]
The ensemble mean maintains the warmer behavior of the models in sectors such as the Andean region and north-central Argentina, as well as the colder pattern in extreme southern Argentina and northern SA. Other studies have also found a colder bias of climate models in SA [27,102-106] due to the limitation of the models in simulating the complex topography of the continent, especially over regions such as the Andes and the La Plata Basin [107], as well as the initial soil moisture conditions since this variable modifies the surface temperature amplitude [108].
For the austral winter (JJA, figure 5), the warm bias of models such as MIROC6 (figure 5(f)), GISS-E2-1-G (figure 5(dd)), and GISS-E2-1-H (figure 5(ee)) persists. Still, most models present underestimation errors, indicating that they tend to simulate colder temperatures than those observed during the winter. The ensemble mean reduces the error magnitude, with a cold bias of up to 1.5 °C over most of Brazil. In contrast, the warm bias on the west coast of South America is less intense than in the individual GCM simulations (figure 5(yy)). In general, the biases found for temperature are reasonable, given that the acceptable error for this variable is within 2 °C [109]. In addition, the presence of uncertainties in the reference datasets, resulting from the information used for quality control, filling in missing data, and interpolation methods, must be considered [110]. In this context, the ERA5 temperature dataset represented southeastern SA's maximum and minimum temperatures well [111]. Furthermore, the relationship between temperature and precipitation biases is not evident since systematic temperature errors are also related to other processes, such as heat advection, surface interactions, and parameterizations [27]. Despite this, the spatial patterns obtained here are similar to those found by Bazzanela et al [35], who found a predominance of cold (warm) bias in the models in regions where precipitation is overestimated (underestimated), such as in Northeast Brazil (northern SA). According to the authors, this pattern may be due to the relationship between model cloudiness and incident shortwave radiation. On the other hand, the warm bias may result from errors in modeling the hydrological cycle, such as reduced evapotranspiration due to decreased precipitation [35]. Furthermore, models from different CMIP generations can perform better in different regions, with CMIP3-GCMs more accurately representing precipitation extremes in Northeast Brazil, while CMIP5 performs better for the Midwest and CMIP6 for the other regions of the country [112].

Ranking analysis
In the previous section, we presented the precipitation and air temperature biases through spatial seasonal fields. Other evaluation metrics are also required to identify whether a model performs well in representing regional and temporal climate variability. In this sense, the statistical metrics described in section 2 were applied.

Precipitation
Figure 6 shows the overall performance of each CMIP6 GCM through the ranking analysis from Rupp et al [29] for the five selected subdomains of SA, based on precipitation and air temperature separately for each region. Likewise, figure 7 illustrates the overall performance of each model based on rainfall and air temperature separately for all five areas (figures 7(a) and (b)) and based on the two variables together for all five subdomains (figure 7(c)). Additionally, figures S1-S10 in the supplementary material present the seasonal performance of each GCM per region, statistical metric, and the overall result for precipitation and air temperature.
Considering rainfall in AMZ (R1, figure S1), the models' seasonal biases are more intense in the DJF and SON seasons, and there is a better representation of the spatial variability of rainfall in MAM and JJA. The models perform moderately well in representing the annual rainfall amplitude but better in describing the temporal yearly trend. Of the 50 GCMs evaluated, 17 (34%) provide an overall ranking metric ('Overall' column of figure S1) above 0.8, indicating a less satisfactory performance (with the FGOALS-f3-L model showing the worst performance). Figure 6 presents the same analysis as the 'Overall' column, organized in ascending order, indicating that the EC-Earth3-Veg, INM-CM4-8, and INMCM5-0 (IPSL-CM6A-LR, MPI-ESM2-0, and IITM-ESM) models are the three best GCMs for simulating precipitation (temperature) in AMZ.
In LPB (R2, figure S2), the models show unsatisfactory performance in representing the spatial distribution of rainfall practically all year round, except during JJA, when most GCMs tend to present better simulations. Similarly, the models show moderate skill in representing the annual amplitude of rainfall in the region but indicate good performance in simulating temporal trends. In addition, 13 models (26%) resulted in an overall ranking above 0.8, showing poor performance (with the GISS-E2-2-H model providing the worst performance). Figure 6 reveals that the three best precipitation (temperature) simulation models for LPB are KACE-1-0-G, ACCESS-CM2, and IPSL-CM6A-LR (GFDL-ESM4, TaiESM1, and EC-Earth3-Veg).
In NEB (R3, figure S3), the models indicate larger rainfall biases in the DJF and SON seasons but still clearly represent rainfall's spatial distribution over the entire year. Despite moderately simulating the annual rainfall amplitude in the region, the models show less skill in representing temporal trends, unlike the other sectors described. In addition, 28 models (56% of the set) resulted in a poor overall ranking (with the IPSL-CM6A-LR-INCA and NESM3 models showing the worst performance). In general, the three models with the best performance in simulating precipitation (temperature) in the region are SAM0-UNICON, CESM2, and MCM-UA-1-0 (BCC-CSM2-MR, KACE-1-0-G, and CESM2).
Considering rainfall in PAT (R4, figure S4), the GCMs show better bias performance throughout the year than in the other regions described. Still, they are less adept at representing rainfall's spatial distribution, except in JJA. Similarly, the models provide a moderate simulation of the annual amplitude and a satisfactory representation of the temporal rainfall trends in the region. However, this skill is not reflected in the overall ranking since 74% of the GCMs result in poor performance (with the AWI-CM-1-1-MR model providing the worst performance). For this sector, the three best models for simulating precipitation (temperature) are ACCESS-CM2, ACCESS-ESM1-5, and EC-Earth3-Veg-LR (CAMS-CSM1-0, CMCC-CM2-HR4, and GFDL-ESM4).
In SEB (R5, figure S5), the models indicate moderate performance concerning seasonal biases and representation of the precipitation's spatial distribution, with better results during JJA. Although the GCMs perform well in simulating the annual amplitude and temporal trend, the overall ranking results show that 32% of the models perform poorly (with the GISS-E2-2-H model providing the worst performance). The three best models obtained for simulating precipitation (temperature) in this region are MPI-ESM1-2-HR, EC-Earth3-CC, and AWI-ESM-1-1-LR (TaiESM1, CMCC-CM2-HR4, and CMCC-ESM2).
The better performance of GCMs during the austral winter compared to the summer is due to various factors inherent to the models and the climate system, and these results corroborate previous analyses [28,35]. Most of SA has its rainy season in DJF [21,22], so accurate simulation of this season depends strongly on the models' cumulus parameterizations for representing convective processes. Furthermore, during the summer, CMIP6 models show a double-ITCZ pattern, associated with an inadequate representation of ocean-atmosphere coupling, especially over the Tropical Pacific Ocean [28,35]. Despite improvements in CMIP6 relative to previous generations, this pattern persists [113].
Furthermore, the models' biases also stem from their difficulty in representing the effects of topography on precipitation regimes. For example, the Andes orography favors precipitation to the west of the mountains by promoting upward movements and consequent adiabatic cooling of moisture transported by the westerly winds on the southern flank of the South Pacific Subtropical Anticyclone [21,114]. Similarly, during the summer, moisture transported from Amazonia to the eastern Andes promotes convection through the orographic effect of the mountains [21,22,115]. These phenomena are not yet satisfactorily represented by GCMs, which need finer spatial resolution to resolve these processes adequately. On the other hand, during the austral winter (JJA), the atmosphere is more stable and, therefore, less dependent on convection parameterization, which leads to better results during this season.
Figure 7 illustrates the models' performance, considering all the precipitation and temperature evaluation metrics for all the regions separately and together. Generally, the models closest to the left side (ranking tending to zero) perform best. These results are expected to contribute to climate studies in SA, given the scarcity of studies evaluating the performance of historical simulations of a large set of CMIP6 models for the continent. This information is helpful for climate research in the region since it presents the performance of various CMIP6 climate models in a simplified and direct way, saving analysis time and computational resources for those interested.

Air temperature
Figures S6-S10 in the supplementary material show the models' performance through the ranking analysis for temperature simulation in the five SA subdomains. In AMZ (figure S6), the models generally show better simulation performance, with satisfactory results in representing the spatial distribution of the variable during the DJF and JJA seasons. Despite showing moderate performance in simulating the annual temperature range in the region, the GCMs show good performance in representing the temporal trend of the variable. In terms of overall ranking, the results are more satisfactory than those for precipitation, with only eight GCMs resulting in a ranking >0.8 (with the MCM-UA-1-0 model showing the worst performance). As described above, the IPSL-CM6A-LR, MPI-ESM2-0, and IITM-ESM models perform best when simulating temperature in the region.

In LPB (figure S7), the models have larger biases in the DJF season but show good skill in representing the temperature's spatial distribution during summer and winter. The simulation of the annual temperature amplitude is in an intermediate range for almost all the models, but the representation of temporal trends is more satisfactory for all GCMs. Similarly to AMZ, 34% of the models have an overall score >0.8 (with the NorESM2-LM model providing the worst performance), while the other GCMs have a moderate performance range. The best models obtained for simulating temperature in the sector are GFDL-ESM4, TaiESM1, and EC-Earth3-Veg.
In NEB (figure S8), the models show excellent performance in representing the temperature's spatial distribution during JJA and SON. Additionally, they demonstrate a good (moderate) ability to simulate the temporal trend (annual amplitude) in the sector, with the GCM AWI-ESM-1-1-LR showing the worst performance. The models with the best performance in simulating temperature are BCC-CSM2-MR, KACE-1-0-G, and CESM2.
Similarly, in PAT (figure S9), the models perform excellently in representing the spatial distribution throughout the year and can simulate the temporal trend.However, 78% of the models perform poorly, with the NorCPM1 showing the worst performance.In this region, the GCMs with the best temperature simulation performance are CAMS-CSM1-0, CMCC-CM2-HR4, CMCC-CM2-SR5, and GFDL-ESM4.
Finally, in SEB (figure S10), most models can represent the temperature's spatial distribution throughout the year, with a few GCMs indicating unsatisfactory performance during DJF.Similarly, most GCMs perform well in portraying the region's annual amplitude and temporal temperature trend, reflected in the overall ranking results.Considering this metric, only 10% of the models perform poorly, with the GISS-E2-1-H model having the worst performance.The best performance models in this sector are TaiESM1, CMCC-CM2-HR4, CMCC-ESM2, IPSL-CM6A-LR-INCA, and MRI-ESM2-0.It is also worth mentioning that some of the GCMs classified here with the best performance have also shown promising results for SA in other studies, such as the CAMS-CSM1-0, EC-Earth3-Veg, IITM-ESM, IPSL-CM6A-LR, MRI-ESM2-0, and TaiESM1 models [3,35].

General analysis
The results of the previous sections showed that the models' performance differs considerably among the regions and variables analyzed. In this way, this study can be a valuable tool to help researchers, as it presents the selection of the most suitable models for different sectors of SA simply and straightforwardly, saving time and computational resources for research. Generally, the best models for the five South American subdomains analyzed include EC-Earth3-Veg, KACE-1-0-G, and TaiESM1. In addition, we note that among the best-performing models for simulating precipitation and temperature in the five SA regions, three model families stand out with seven best-performing GCMs (EC-Earth3-Veg, EC-Earth3-Veg-LR, EC-Earth3-CC, CMCC-CM2-HR4, CMCC-ESM2, ACCESS-CM2, and ACCESS-ESM1-5).
Individually understanding each model is beyond the scope of this study, as this would require a singular analysis of all 50 GCMs, including a detailed assessment of the configuration of each model and its sensitivity to different aspects, such as dynamics and physical parameterizations. Differences among the GCMs' results are associated with various factors, such as the parameterizations of key physical processes like convection and the models' configuration, including numerical techniques and horizontal and vertical resolutions. However, a satisfactory simulation of the past climate yields more reliable future climate projections and justifies the model selection, given that skill in historical simulation can translate into future projections that are also consistent [34]. Furthermore, studies indicate that the ensemble mean composed of the best-performing GCMs produces results closer to observations and improves the quality of climate simulations compared to any individual model [3,116]. In this sense, our study, which responds to a demand from the energy sector, streamlines research by indicating the most appropriate GCMs for regional studies and saving computational resources. Nevertheless, we stress that biases persist even in the ensemble, reinforcing the need to apply a bias correction method before using it in any impact study.
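As an illustration of the ensemble argument, the sketch below builds an ensemble mean from the k best-ranked models and compares its bias with the biases of its members. It is a toy example under stated assumptions: the model names, ranking scores, and temperatures are hypothetical, and a single regional-mean value stands in for what would normally be a grid-cell-by-grid-cell average.

```python
from statistics import mean

# Illustrative regional-mean temperatures (degrees C); not values from the study.
observed = 25.0
simulated = {"MODEL-A": 26.5, "MODEL-B": 24.0, "MODEL-C": 25.5, "MODEL-D": 21.0}

# Hypothetical overall ranking scores (lower is better).
ranking = {"MODEL-A": 0.30, "MODEL-B": 0.20, "MODEL-C": 0.10, "MODEL-D": 0.90}


def best_models(ranking, k):
    """Return the k model names with the lowest ranking score."""
    return sorted(ranking, key=ranking.get)[:k]


def ensemble_mean(simulated, members):
    """Unweighted mean of the selected members' simulated values."""
    return mean(simulated[m] for m in members)


members = best_models(ranking, k=3)
ens = ensemble_mean(simulated, members)

ens_bias = ens - observed
member_biases = {m: simulated[m] - observed for m in members}

print(members, round(ens_bias, 3), member_biases)
```

In this toy case the member biases partly cancel, so the ensemble bias is smaller in magnitude than that of any member, yet it is not zero, which is the situation that motivates a bias correction step before impact studies.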

Summary and conclusions
This study evaluated the performance of 50 CMIP6 GCMs in simulating the statistical characteristics of the historical (1995–2014) simulations of temperature and precipitation for SA. To this end, climate simulations were spatially evaluated concerning seasonal and annual climatologies, considering the objective analysis of Rupp et al [29], which examines statistical metrics such as mean, standard deviation, spatial correlation, the annual cycle's amplitude, and linear trend, in addition to the overall performance of the models. Table 2 shows the ranking analysis results with the best-performing models in simulating precipitation and air temperature for five SA subdomains. Considering the austral summer, most models underestimate rainfall in the Amazon region, northern Brazil, and northern SA and overestimate rainfall in Brazil's northeastern and central sectors and the west coast of SA, corroborating previous studies [4,24,28,35]. The models indicate heterogeneity in systematic biases for summer temperatures, with a more pronounced warm bias in the Andean region and north-central Argentina and a cold bias in the extreme south of Argentina and northern SA.
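For readers who want to reproduce metrics of the kind listed above, a minimal sketch is given below. It is not the study's implementation: the annual cycle amplitude is assumed to be the difference between the largest and smallest monthly climatological values, the trend is assumed to be an ordinary least-squares slope, and all input values are made up for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def annual_cycle_amplitude(monthly_climatology):
    """Assumed definition: maximum minus minimum of the 12 monthly means."""
    return max(monthly_climatology) - min(monthly_climatology)


def linear_trend(values):
    """Ordinary least-squares slope per time step."""
    t = range(len(values))
    mt, mv = mean(t), mean(values)
    num = sum((ti - mt) * (vi - mv) for ti, vi in zip(t, values))
    den = sum((ti - mt) ** 2 for ti in t)
    return num / den


# Illustrative values only (not data from the study).
obs_field = [2.0, 4.0, 6.0, 8.0]      # observed climatology at four grid cells
model_field = [1.0, 2.0, 3.0, 4.0]    # simulated climatology at the same cells
monthly_clim = [9.0, 8.0, 7.0, 5.0, 3.0, 2.0, 1.0, 2.0, 4.0, 6.0, 7.0, 8.0]
annual_series = [24.0, 24.5, 25.0, 25.5]  # one value per year

metrics = {
    "mean_bias": mean(model_field) - mean(obs_field),
    "std_ratio": pstdev(model_field) / pstdev(obs_field),
    "spatial_corr": pearson(model_field, obs_field),
    "amplitude": annual_cycle_amplitude(monthly_clim),
    "trend": linear_trend(annual_series),
}
print(metrics)
```

The toy fields are deliberately chosen so that the spatial pattern is perfectly correlated while the mean and variability are both too low, showing that the metrics capture different aspects of model error.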
Regarding the austral winter, the models mostly underestimate rainfall in northern SA. At the same time, even though some GCMs (such as ACCESS-CM2 and KACE-1-0-G) overestimate rainfall over almost the entire continent, they still underestimate it in the South American northern sector. This failure to simulate rainfall in the equatorial region in winter is associated with a weakened or broken representation of the ITCZ over the continent between the Pacific and Atlantic oceans [35].
For winter temperature, most models simulate colder temperatures over much of the continent and a persistent warm bias on the west coast of SA. This pattern of biases has also been found in studies with different GCMs [3,102–105] and stems from the limitation of the models in representing the complex topography of the continent [107,108]. Generally, the observed temperature biases fall within acceptable limits, within the 2 °C range [109]. Despite the absence of a clear correlation between temperature and precipitation biases, the spatial patterns identified align with those observed by Bazzanela et al [35], who noted a prevalence of cold (warm) bias in regions where precipitation is overestimated (underestimated), exemplified in Northeast Brazil (northern SA). We mention that GCMs exhibiting optimal performance in this study, including CAMS-CSM1-0, EC-Earth3-Veg, IITM-ESM, IPSL-CM6A-LR, MRI-ESM2-0, and TaiESM1, have also demonstrated promising outcomes in prior research on SA [3,35].
GCMs serve as essential tools for understanding and predicting climate patterns on a global scale. However, their efficacy in simulating regional climate phenomena remains scrutinized, particularly in SA. Topographic complexities within the continent, such as the presence of the Andes mountain range, significantly influence regional climate patterns.
The failure of some GCMs to adequately resolve these intricate geographical features can lead to erroneous temperature simulations, particularly concerning the elevation-dependent lapse rates. Additionally, land-use changes and alterations in surface properties are pivotal in shaping local climate conditions, yet their integration into GCMs often lacks the requisite precision. These deficiencies underscore the need for enhanced model parameterizations and finer spatial resolutions to improve the fidelity of climate simulations over SA and address the persistent cold biases observed in current GCM outputs for the continent.
Some limitations of this study should be noted. The investigation primarily relies on statistical analyses of rainfall and temperature, omitting an evaluation of dynamic and thermodynamic processes, such as examining winds at high and low atmospheric levels. Therefore, while the insights gained from the statistical analyses are valuable within their defined scope, we recommend interpreting them with an awareness of their drawbacks. However, by understanding the models' capacities and limitations in different subdomains, decision-makers can better assess the reliability of climate projections and plan more appropriate adaptation measures for each sector. For example, in the AMZ, most models have shown promising results in representing precipitation and temperature, indicating their potential applicability in this region, vulnerable to reduced rainfall and increased intensity and frequency of heat extremes and droughts [117–121]. Similarly, the models' satisfactory performance in simulating temperature in the NEB corroborates their use for studies in this sector, which has experienced an increase in mean temperature and the frequency and intensity of droughts and heat extremes [37,112,118–123]. Other regions, such as the LPB and SEB, also obtained good temperature simulation results from most models, demonstrating their potential usefulness in these areas, which are vulnerable to increases in mean temperature and the intensity and frequency of heat extremes [110,120,121,124].
In this way, we expect that the information provided here can contribute to a better assessment of the regional impacts of climate change on different socio-economic sectors potentially affected, such as energy generation in the NEB [122,125] and SEB [126]. On the other hand, regions where the models have shown lower accuracy in simulating precipitation highlight the associated uncertainties and the need for further research. In this context, we emphasize the need to improve the current network of observations on SA, which would give greater reliability to GCM simulations and projections.
This study aspires to assist fellow researchers in selecting a refined subset of CMIP6 models proficient in replicating SA's average precipitation and temperature patterns, albeit acknowledging certain systematic biases. It is imperative to underscore that while the models identified in this study demonstrate reasonable proficiency in capturing statistical features of atmospheric variables, such efficacy does not automatically extend to the accurate representation of all atmospheric processes. Consequently, we advise a comprehensive assessment of the climatology of interest within the reference climate. This precautionary step ensures a better understanding of the models' performance and enhances the reliability of their application in diverse research contexts. Lastly, these results may help climate researchers involved with statistical or dynamical downscaling studies in different regions, saving the computational resources needed for research and giving greater confidence to the analyses.

Figure 1. Study area with topography (m) and location of the subdomains used to analyze the performance of the CMIP6 GCMs. Source: United States Geological Survey-Earth Resources Observation System (EROS) Center.

Figure 2. Relative precipitation bias (%) of each GCM (a-xx) and ensemble mean (yy) for the DJF season. The observed data's average rainfall (mm day⁻¹) is shown in (zz).

Figure 3. Similar to figure 2, except for the JJA season.
4 were calculated and standardized to examine model performance. The ranking analysis is conducted in three ways: (a) first, model selection is based on precipitation and air temperature separately for each region; (b) second, model selection is based on precipitation and air temperature separately for all five regions; and (c) third, model selection is based on all variables for all five regions.
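One way to standardize metric errors into a single ranking score is sketched below. The exact formula is not given in this section, so the approach shown is an assumption: each metric's absolute error is min-max normalized across models, the normalized errors are summed per model, and the sum is rescaled so that the best model scores 0 and the worst scores 1. The metric names, model names, and error values are hypothetical.

```python
def normalized_errors(errors):
    """Min-max normalize one metric's absolute errors across models."""
    lo, hi = min(errors.values()), max(errors.values())
    if hi == lo:
        # All models are equally good on this metric.
        return {model: 0.0 for model in errors}
    return {model: (err - lo) / (hi - lo) for model, err in errors.items()}


def overall_score(errors_by_metric):
    """Sum the normalized errors over metrics, then rescale to [0, 1]."""
    normalized = [normalized_errors(errs) for errs in errors_by_metric.values()]
    models = list(normalized[0])
    totals = {m: sum(n[m] for n in normalized) for m in models}
    return normalized_errors(totals)


# Illustrative absolute errors per metric and model (not from the study).
errors_by_metric = {
    "mean": {"MODEL-A": 0.2, "MODEL-B": 1.0, "MODEL-C": 0.6},
    "trend": {"MODEL-A": 0.1, "MODEL-B": 0.5, "MODEL-C": 0.2},
}

scores = overall_score(errors_by_metric)
ranked = sorted(scores, key=scores.get)  # best (closest to zero) first
print(ranked, scores)
```

Because every metric is rescaled to the same range before summing, metrics with large physical units do not dominate the final score; the cost is that the score is relative to the particular set of models being compared.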

Figure 4. Temperature bias (°C) of each GCM (a-xx) and ensemble mean (yy) for the DJF season. The observed data's average temperature (°C) is shown in (zz).

Figure 5. Similar to figure 4, except for the JJA season.

Figure 6. Overall CMIP6 GCM performance for precipitation (left side) and air temperature (right side) for the South American subdomains.

Figure 7. Overall CMIP6 GCM performance considering the five regions for precipitation (a) and air temperature (b) and precipitation and air temperature together in the analysis (c).

Table 2. [...] with the best performance for each South American region under analysis.