Difference in boreal winter predictability between two dynamical cores of Community Atmosphere Model 5

This study investigates the sensitivity of the boreal winter prediction skill of Community Atmosphere Model 5 to the choice of the dynamical core. Both finite volume (FV) and spectral element (SE) dynamical cores are tested. An additional FV experiment with the SE topography (FVSE) is also conducted to isolate the possible influence of the topography. The three dynamical core experiments, which cover the winters from 2001/2002 to 2017/2018, are validated using Japanese 55 year reanalysis data. The SE (−4.27 °C) has a smaller cold bias in boreal-winter surface air temperature (SAT) than the FV (−5.17 °C) and FVSE (−5.29 °C), particularly in North America, East Asia, and Southern Europe/Northern Africa. Significant North Atlantic Oscillation-like biases are also identified in the mid-troposphere. These biases affect seasonal prediction skills. Although the overall prediction skills of boreal-winter SAT, quantified by the anomaly correlation coefficient (ACC) and root-mean-square error (RMSE), are reasonably good (ACC = 0.40 and RMSE = 0.47 in the mean values of SE, FV, and FVSE), they differ significantly from one region to another, depending on the choice of dynamical cores. For North America and Southern Europe/Northern Africa, SE shows better skills than FVSE and FV. Conversely, in East Asia, FV and FVSE outperform SE. These results suggest that the appropriate choice of the dynamical cores and the bottom boundary conditions could improve the boreal-winter seasonal prediction on a regional scale.


Introduction
While seasonal prediction of surface air temperature (SAT) is a longstanding issue, it has recently received special attention due to the frequent occurrence of extreme cold winters in the Northern Hemisphere mid-latitudes. Numerous studies have investigated the potential sources of predictability for winter SAT in specific regions at week-to-seasonal time scales (Lee et al 2013, Lim and Kim 2013, Scaife et al 2014, 2017, Wang et al 2017, Zhang et al 2020, 2023, Schuhen et al 2022). Various empirical predictors or explanatory factors have been identified, including low-frequency teleconnection patterns (e.g. North Atlantic Oscillation (NAO), East Atlantic/Western Russia, and Scandinavian patterns), Arctic Oscillation, El Niño-Southern Oscillation, Rossby wave dynamics (e.g. subtropical and extratropical jet streams) and Arctic sea ice.
However, accurately translating potential predictors into predictability is challenging due to the highly nonlinear characteristics of the climate system and their imperfect numerical representation in climate models (Lorenz 1963, Wang et al 2014). Previous studies have endeavored to enhance predictability via improvements in physical processes (Phillips et al 2004, Yhang and Hong 2008, Doblas-Reyes et al 2013), ensemble methods (Palmer et al 2004, Ahn and Lee 2016), and initialization processes (Polkova et al 2014, Kim and Ahn 2015).
Even with improvements in climate models over the past few decades (Alizadeh 2022), climate models continue to face many challenges. Among others, the importance of the dynamical core has recently been highlighted (Staniforth and Wood 2008, Thuburn 2011, Hagos et al 2015, Jun et al 2018). A dynamical core is the most fundamental part of a climate model that simulates the behavior of the atmosphere by solving the governing equations of atmospheric motion (Trenberth 1992, Rood 2011). It solves a set of partial differential equations for mass, momentum, and energy in the atmosphere (Williamson 2007). Various numerical methods, including finite difference, spectral, finite volume (FV), and finite element, have been used for dynamical cores (Fox-Rabinovitz et al 1997, Dennis et al 2012, Guerra and Ullrich 2016, Natale et al 2016, Melvin et al 2018). Different dynamical cores, with varying grid structures, have different computational efficiencies, and efforts to enhance climate model accuracy and efficiency are being made by improving dynamical cores and developing advanced numerical techniques (Wedi et al 2015, Kühnlein et al 2019).
A series of studies, including those of the dynamical model intercomparison project (Lauritzen et al 2010, Hall et al 2016, Ullrich et al 2017), have provided compelling evidence that the choice of dynamical cores can have a substantial impact on simulation results (Choi and Hong 2016, Jun et al 2018, Sato et al 2018, Gupta et al 2021). Jun et al (2018), for instance, showed that model simulations of the Arctic winter climate and the associated teleconnections at mid-latitudes are sensitive to the choice of dynamical cores. They reported that different dynamical cores yield different SAT biases in boreal winter, and suggested that inter-model spread in the surface climate among climate model simulations is partly attributable to different dynamical cores.
Despite the importance of dynamical cores in climate simulation, a comprehensive evaluation of how different dynamical cores affect boreal winter climate predictions has yet to be conducted. It also remains unclear whether the differences in boreal winter climates are exclusively attributable to the choice of dynamical cores. These considerations call for addressing potential inconsistencies between model simulations and prediction systems. They also emphasize the necessity of investigating the influence of different dynamical cores with identical physics schemes on the predictability of boreal winter climates within prediction systems. This investigation can help identify the impact of dynamical cores on the boreal winter prediction system and may lead to improvements in the prediction system through enhancements in dynamical cores.
To this end, the present study quantifies the performance of long-term seasonal prediction and its sensitivity to the choice of dynamical cores in the Community Atmosphere Model version 5 (CAM5) (Neale et al 2010). Two dynamical cores are considered: an FV dynamical core that uses an equal-distance longitude-latitude grid (Lauritzen et al 2011b) and a spectral element (SE) dynamical core that uses a quasi-uniform polygonal grid (Dennis et al 2012). To examine the prediction performance of boreal winter climates accurately, it is also necessary to isolate uncertainties arising from model components other than the dynamical core. To address this issue, we have designed sensitivity experiments that incorporate different topography sets in addition to dynamical core differences. By evaluating the boreal-winter prediction skills of the hindcasts initialized every December 1st from 2001/2002 to 2017/2018 (17 winters) against the Japanese 55 year reanalysis (JRA-55; Harada et al 2016), the impacts of dynamical cores on the boreal-winter seasonal predictions are quantitatively assessed.
This paper is structured as follows. Section 2 describes the model, reanalysis data, and validation methods. Section 3 presents the results of 17 year hindcasts with different dynamical cores and evaluates their prediction skill compared to the JRA-55 product. In section 4, conclusions and discussion are presented.

Dynamical cores
CAM5, the atmospheric model component of the Community Earth System Model (CESM) version 1.2.1, was used to make seasonal predictions (table 1). In all experiments, CAM5 was configured with the same CAM4 physics package and parameter settings (Gent et al 2011). We prepared two otherwise nearly identical seasonal prediction configurations that differ in the dynamical core. The first uses the FV on a latitude-longitude grid system with 91 latitudinal and 144 longitudinal grid points ('fv19') (Jun et al 2018). The FV is the current default dynamical core in CESM and was used for the CESM's contributions to the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (Lauritzen et al 2011a). The other uses the SE on a cubed-sphere grid system with a horizontal resolution of 16 × 16 elements on one face and four collocation points on one element edge ('ne16np4', approximately 2° at the equator) (Williamson 2007, Dennis et al 2012). The FV treats the atmosphere as a collection of finite volumes or grid cells, focusing on the conservation of mass, momentum, and energy within each volume. It can handle complex geometries and physical processes. In contrast, the SE combines aspects of the spectral and finite-element methods. It uses high-order polynomials to approximate atmospheric variables within each element and is known for its accuracy and efficiency in handling complex geometries (Mirin and Sawyer 2005, Putman and Lin 2007). Because the SE uses a cubed-sphere grid (a quasi-uniform mesh), it avoids the polar singularities of a longitude-latitude system and exhibits better scalability and parallel efficiency (Jablonowski and Williamson 2011, Jun et al 2018).
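The 'ne16np4' label can be turned into grid-size figures with a little arithmetic. The sketch below is illustrative only: the function name `se_grid_summary` is hypothetical, and it assumes the usual cubed-sphere bookkeeping in which neighbouring elements share their edge points, so each element contributes (np − 1)² unique points.

```python
def se_grid_summary(ne: int, npts: int) -> dict:
    """Size figures for a cubed-sphere spectral element grid.

    ne   : number of elements along one edge of a cube face
    npts : number of collocation points along one element edge
    """
    # Six cube faces, each tiled by ne x ne elements.
    n_elements = 6 * ne * ne
    # Edge points are shared between neighbouring elements, so each
    # element adds (npts - 1)^2 unique points; +2 closes the sphere.
    n_columns = n_elements * (npts - 1) ** 2 + 2
    # The equator crosses four faces, each spanning ne * (npts - 1) intervals.
    intervals_on_equator = 4 * ne * (npts - 1)
    # Mean spacing only: collocation points are not evenly spaced in an element.
    mean_spacing_deg = 360.0 / intervals_on_equator
    return {
        "elements": n_elements,
        "columns": n_columns,
        "equatorial_spacing_deg": mean_spacing_deg,
    }


summary = se_grid_summary(ne=16, npts=4)
print(summary)
```

For ne = 16 and np = 4 this gives 1536 elements, 13826 unique columns, and a mean equatorial spacing of 1.875°, in line with the "approximately 2°" quoted above.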

Topography
To map the topography in a climate model and avoid the numerical instabilities and inaccuracies that can arise when representing complex terrain features, smoothing must be applied to the topography (Gao et al 2006). Technically, different dynamical cores require different levels of smoothing of elevation data. The highest wavenumbers are removed during mapping to a latitude-longitude grid in the FV, whereas the surface geopotential is smoothed by multiple applications of a smoothing operator in the SE (e.g. a Laplace operator combined with a bound-preserving limiter and optimization-based mesh-improvement methods). The mapping method applied to the SE in particular uses a high-order interpolation technique to improve the quality of the topography while retaining the integrity of the original surface approximation, resulting in a smoother topography compared with that in the FV (Dennis et al 2012, Choi and Hong 2016). Figure 1 depicts the topography in (a) the SE dynamical core and (b) the FV dynamical core, and (c) their difference. Considerable differences are visible in the height of the mountains; the minimum to maximum range of the topography is 0.33 m to 4966.64 m for the SE and −9.42 m to 5016.16 m for the FV. The latter has a sharper topography, and the difference is more pronounced near the terrain edges and borders. This indicates that the different terrain in the FV and SE can act as a source of uncertainty.
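The idea of repeated Laplacian smoothing with a bound-preserving step can be illustrated on a one-dimensional periodic height profile. This is a toy sketch under stated assumptions, not the CESM topography tool: the function name `smooth_topography`, the number of passes, and the diffusion coefficient are all hypothetical choices, and a simple clip stands in for a proper limiter.

```python
import numpy as np


def smooth_topography(height, n_passes=4, coeff=0.25):
    """Apply a discrete Laplacian smoother several times on a periodic 1-D profile."""
    h = np.asarray(height, dtype=float).copy()
    lower, upper = h.min(), h.max()  # bounds of the unsmoothed terrain
    for _ in range(n_passes):
        # Second difference with periodic neighbours.
        laplacian = np.roll(h, 1) - 2.0 * h + np.roll(h, -1)
        h = h + coeff * laplacian
        # Stand-in for a bound-preserving limiter: never leave the original range.
        h = np.clip(h, lower, upper)
    return h


# A single sharp 5000 m peak on an otherwise flat profile.
ridge = np.zeros(32)
ridge[16] = 5000.0
smoothed = smooth_topography(ridge)
print(ridge.max(), smoothed.max())
```

Each pass lowers the peak and spreads it over neighbouring cells while the mean height is unchanged, which mirrors why the more heavily smoothed SE topography has a narrower height range than the FV topography.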
To address this issue, we devised sensitivity experiments that use different topography sets on top of the dynamical core differences. The first experiment is the default 'SE' experiment, which uses an SE dynamical core on a cubed-sphere grid system with a generic SE topography distribution. The second experiment is the default 'FV' experiment, based on an FV dynamical core built on a longitude-latitude grid with a generic FV topography distribution. The third experiment is the 'FVSE' experiment, which differs from FV only in that the default FV topography was replaced by the default SE topography distribution. The details of the seasonal prediction experiments are summarized in table 1, with a brief description of the specific settings in table 2.

Model experiment
The initial conditions for seasonal predictions were prepared using the JRA-55 product. The reanalysis variables, including the 2 m air temperature (T2M), winds, radiation, specific humidity, precipitation, evaporation, and other climate parameters, were interpolated onto the model's horizontal and vertical grids to initialize the corresponding model variables. The hindcasts were made for 17 winter seasons (DJF) from 2001/2002 to 2017/2018. All three experiments, i.e. the SE, FV, and FVSE experiments, were conducted for these 17 winters, each with an ensemble of 15 members. For each year, the three months (i.e. December to February) were averaged to define the winter prediction with a 0 month lead time. The ensemble mean of the 15 ensemble members was used to evaluate the prediction skill. T2M and upper-level climate parameters (e.g. geopotential height and wind vector at 500 hPa), which are closely associated with changes in boreal winter climate, were used for validation. Oceanic regions were masked out to focus on the validation of land areas.
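The post-processing chain described above (DJF average, ensemble mean, ocean mask) can be sketched with NumPy. The array layout, the function name `winter_ensemble_mean`, and the unweighted three-month mean are assumptions made for illustration; the synthetic data only exercise the code.

```python
import numpy as np


def winter_ensemble_mean(monthly, land_mask):
    """Reduce one winter of hindcasts to a masked ensemble-mean DJF field.

    monthly   : array (member, month, lat, lon) holding Dec, Jan, Feb means
    land_mask : boolean array (lat, lon), True over land
    """
    djf = monthly.mean(axis=1)        # average the three months per member
    ensemble_mean = djf.mean(axis=0)  # average over ensemble members
    # Ocean cells become NaN so that later statistics use land points only.
    return np.where(land_mask, ensemble_mean, np.nan)


rng = np.random.default_rng(0)
monthly = rng.normal(size=(15, 3, 4, 5))  # 15 members, 3 months, tiny grid
land = np.zeros((4, 5), dtype=bool)
land[:, :2] = True                        # pretend the two western columns are land
field = winter_ensemble_mean(monthly, land)
print(field.shape)
```

A month-length-weighted mean could replace the plain average over December, January, and February if that level of detail matters.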

Validation data and method
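The prediction skill scores were calculated using the ensemble mean instead of individual ensemble members to provide a more accurate estimate of the model's prediction skill. Two deterministic verification metrics, i.e. anomaly correlation coefficient (ACC) and root-mean-square error (RMSE) (Murphy and Epstein 1989), were utilized. The ACC evaluates the linear association between the reanalysis data and a set of hindcast runs,

$$\mathrm{ACC}_{\tau} = \frac{\sum_{j=1}^{n} (H_{j\tau} - \bar{H}_{\tau})(R_{j\tau} - \bar{R}_{\tau})}{\sqrt{\sum_{j=1}^{n} (H_{j\tau} - \bar{H}_{\tau})^{2} \, \sum_{j=1}^{n} (R_{j\tau} - \bar{R}_{\tau})^{2}}} \qquad (1)$$

Here $H_{j\tau}$ represents hindcasts and $R_{j\tau}$ is reanalysis. The subscript $j$ is the initialization year ($n = 17$), and $\tau$ is the forecast lead-time. The climatological averages of reanalysis ($\bar{R}_{\tau}$) and hindcasts ($\bar{H}_{\tau}$) are calculated by:

$$\bar{R}_{\tau} = \frac{1}{n} \sum_{j=1}^{n} R_{j\tau}, \qquad \bar{H}_{\tau} = \frac{1}{n} \sum_{j=1}^{n} H_{j\tau} \qquad (2)$$

To estimate the uncertainties in ensemble predictions, model error is also examined by computing the RMSE of the ensemble-mean hindcast relative to the reanalysis data. The RMSE is based on the mean-squared error (MSE) (Murphy 1988):

$$\mathrm{MSE}_{\tau} = \frac{1}{n} \sum_{j=1}^{n} \left[ (H_{j\tau} - \bar{H}_{\tau}) - (R_{j\tau} - \bar{R}_{\tau}) \right]^{2} \qquad (3)$$

and

$$\mathrm{RMSE}_{\tau} = \sqrt{\mathrm{MSE}_{\tau}} \qquad (4)$$

For an ACC close to 1, RMSE is proportional to the absolute difference between the predicted and reanalysis standard deviations.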
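The two metrics can be computed in a few lines for a single grid point or area-averaged series. This is a minimal sketch, assuming that both scores are evaluated on anomalies relative to each dataset's own climatology, which is the reading consistent with the statement that RMSE reduces to the difference of standard deviations when the ACC approaches 1. The function name `acc_rmse` and the sample numbers are hypothetical.

```python
import numpy as np


def acc_rmse(hindcast, reanalysis):
    """Anomaly correlation and anomaly RMSE over the initialization years."""
    h = np.asarray(hindcast, dtype=float)
    r = np.asarray(reanalysis, dtype=float)
    h_anom = h - h.mean()  # remove the hindcast climatology
    r_anom = r - r.mean()  # remove the reanalysis climatology
    acc = np.sum(h_anom * r_anom) / np.sqrt(np.sum(h_anom**2) * np.sum(r_anom**2))
    rmse = np.sqrt(np.mean((h_anom - r_anom) ** 2))
    return acc, rmse


# A hindcast that is perfectly correlated with the reanalysis but has
# twice the variability and a constant offset.
reanalysis = np.array([1.0, 2.0, 3.0, 4.0])
hindcast = 2.0 * reanalysis + 10.0
acc, rmse = acc_rmse(hindcast, reanalysis)
print(acc, rmse, abs(hindcast.std() - reanalysis.std()))
```

In this example the ACC is 1 and the RMSE equals the absolute difference of the two standard deviations; the constant offset of 10 does not enter because the climatologies are removed first.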

2 m air temperature
The upper panels in figure 2 depict the mean bias of (a) SE, (b) FV, and (c) FVSE predictions for DJF-mean T2M. The predictions from the three dynamical core experiments exhibited a cold bias (underestimation) over continental regions compared to JRA-55. The cold bias was more pronounced in high-terrain regions (figure 1), such as the Tibetan Plateau, which showed a cold bias of approximately 20 °C or more across the three dynamical core experiments. These cold biases over the Tibetan Plateau and other mountainous regions are likely linked to the uneven spatial distribution of meteorological stations. This non-uniform distribution may introduce some uncertainties in the interpolated observations in these regions (Reeves Eyre and Zeng 2017).
In terms of the mean bias area-averaged over the Northern Hemisphere continents, the SE (−4.27 °C) exhibited a smaller cold bias than FVSE (−5.17 °C) and FV (−5.29 °C). Note that the mean bias in figures 2(a)-(c) cannot be considered an absolute bias, because the comparison product, JRA-55, also contains topographical and systematic errors arising from the reanalysis model itself. Thus, these mean biases provide a relative estimate of the bias in the predicted results compared to the reanalysis product.
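An area-averaged bias over land can be computed as a weighted mean of the bias map. The sketch below assumes a regular latitude-longitude grid with cosine-of-latitude weights and NaN over masked ocean cells; the weighting scheme and the function name `area_mean` are assumptions, since the text does not state how the average was formed.

```python
import numpy as np


def area_mean(field, lat_deg):
    """Cosine-latitude weighted mean of a (lat, lon) field, ignoring NaN cells."""
    field = np.asarray(field, dtype=float)
    # One weight per latitude row, broadcast across longitudes.
    weights = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones_like(field)
    valid = np.isfinite(field)
    weighted_sum = np.sum(np.where(valid, field * weights, 0.0))
    return weighted_sum / np.sum(weights[valid])


lat = np.array([0.0, 60.0])
bias = np.array([[1.0, 1.0],
                 [4.0, 4.0]])
masked_bias = np.array([[-5.0, np.nan],
                        [-5.0, -5.0]])
print(area_mean(bias, lat), area_mean(masked_bias, lat))
```

With weights of 1 at the equator and 0.5 at 60°, the first field averages to 2.0 rather than the unweighted 2.5, and a uniform −5 field stays at −5 regardless of which cells are masked.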
In figure 2(d), the SE predictions for boreal winter T2M are substantially warmer than the FV predictions, particularly in North America, Greenland, Southern Europe/Northern Africa, and Eurasia. Over Greenland and Eurasia, particularly the Tibetan Plateau, T2M differences are influenced by topography (as shown in figure 1(c)). Figure 2(e) illustrates the mean difference between SE and FVSE predictions for boreal winter T2M. Since FVSE shares both physical and topographical characteristics with SE, any differences between them can be attributed to the different dynamical cores. Comparing figures 2(d) and (e), it is plausible that the T2M differences over Greenland and Eurasia are primarily attributable to topography rather than the dynamical core itself. The warm differences in the SE prediction that persist in figure 2(e) extend across North America, Russia, and Southern Europe/Northern Africa, emphasizing the substantial influence of the dynamical core over these regions.
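The attribution argument can be written as a simple identity. Writing $\bar{T}_{X}$ for the DJF-mean T2M of experiment $X$ (a notation introduced here for illustration), the SE minus FV difference splits into two parts:

$$\bar{T}_{\mathrm{SE}} - \bar{T}_{\mathrm{FV}} = \underbrace{\left(\bar{T}_{\mathrm{SE}} - \bar{T}_{\mathrm{FVSE}}\right)}_{\text{dynamical core, figure 2(e)}} + \underbrace{\left(\bar{T}_{\mathrm{FVSE}} - \bar{T}_{\mathrm{FV}}\right)}_{\text{topography}}$$

The left-hand side is the map in figure 2(d). The first term on the right compares two runs with the same topography, and the second compares two runs with the same dynamical core. Reading the terms this way assumes that the two effects combine additively and that any interaction between dynamical core and topography is small.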

Geopotential height at 500 hPa
The upper panels in figure 3 illustrate the mean bias of (a) SE, (b) FV, and (c) FVSE predictions for DJF-mean geopotential height and wind vectors at 500 hPa (Z500 and UV500) compared to JRA-55. In figure 3(a), the SE exhibits positive biases (indicating higher pressures than in JRA-55) in the North Pacific, Greenland, Eurasia, and Southern Europe/Northern Africa. Conversely, negative biases (indicating lower pressure than in JRA-55) are observed in North America, the North Atlantic, and the Central Pacific Ocean. Notably, a pronounced low-pressure bias emerges in the North Atlantic, accompanied by a clear pattern of westerly advection infiltrating the continental region, alongside the robust high-pressure bias observed in Southern Europe/Northern Africa. In figure 3(b), FV displays a strong negative bias in the North Atlantic Ocean and Pacific-Japan regions as well as a positive bias with fewer differences within the same region as SE. FVSE exhibits negative and positive biases similar to FV but with a slightly weaker negative bias (figure 3(c)). Both FV and FVSE exhibit significantly more negative mean Z500 biases than SE in the North Atlantic Ocean during boreal winter. Notably, the mean bias pattern of Z500 in SE, FV, and FVSE relative to JRA-55 closely resembles the positive phase of the NAO. In comparisons of the mean Z500 field among the three dynamical core experiments, SE shows a substantial positive difference over Greenland and a negative difference over the North Atlantic Ocean relative to FV (figure 3(d)). This difference shows that SE exhibits a negative NAO-like pattern compared to FV. This negative NAO-like pattern is also observed in the comparison between SE and FVSE (figure 3(e)), albeit with a smaller magnitude than in figure 3(d).
As a result, the mean difference in atmospheric circulation patterns centered over Greenland can be attributed to the choice of the dynamical core. In comparison to the FV dynamical core, the SE dynamical core predicted an intensified sinking motion in the troposphere over high latitudes, possibly leading to a warmer lower troposphere through adiabatic warming. This NAO-like pattern appears to influence surface temperature by inducing warm advection and moist conditions, particularly in North America and Southern Europe/Northern Africa (figure S1).

Prediction skills
Figures 4(a)-(c) show the spatial distribution of the ACC for winter T2M from SE, FV, and FVSE, respectively. Across the Northern Hemisphere continents, all three dynamical core experiments produced positive ACC distributions, indicating comparable prediction skills (FV = 0.41, FVSE = 0.40, and SE = 0.37 in area-averaged ACC scores of the Northern Hemisphere). In North America (red box), the SE displayed the highest prediction skill (figure 4(a)), surpassing the FV (figure 4(b)) and FVSE (figure 4(c)) (SE = 0.52, FVSE = 0.47, and FV = 0.41). The SE also showed a higher ACC score than the FV in Southern Europe/Northern Africa (green box), whereas SE and FVSE did not differ substantially there (SE = 0.46, FVSE = 0.45, and FV = 0.38). In East Asia (blue box), FV exhibited a higher ACC score than SE and FVSE; the ACC distributions were positive but lacked statistical significance in this region for all three dynamical core experiments (FV = 0.38, FVSE = 0.37, and SE = 0.27).
Figures 4(d)-(f) show the spatial distribution of the RMSE for winter T2M as a measure of prediction accuracy. In all three dynamical core experiments, across the Northern Hemisphere continents, the midlatitudes were associated with relatively lower RMSE scores, while the high latitudes tended to exhibit larger RMSE scores (FV = 1.47, SE = 1.48, and FVSE = 1.48 in area-averaged RMSE scores of the Northern Hemisphere). This suggests that predicting T2M accurately in high-latitude regions is more challenging. Large RMSE scores were observed not only in North America but also in Russia, which had significant ACC scores (figures 4(a)-(c)). This indicates that, although the predictions closely followed the variability of the reanalysis data, their errors remained substantial. In North America (red box), the SE (figure 4(d)) exhibited a lower RMSE score than FV (figure 4(e)) and FVSE (figure 4(f)) (SE = 1.56, FVSE = 1.74, and FV = 1.85). For Southern Europe/Northern Africa (green box), SE and FVSE had similar RMSE distributions, both slightly lower than FV but with no significant differences (SE = 1.09, FVSE = 1.10, and FV = 1.14). In East Asia (blue box), the FV produced a lower RMSE score than the SE and FVSE (FV = 0.94, FVSE = 0.96, and SE = 1.01).
Area-averaged time series of the predicted winter T2M in the specific regions were also analyzed (figure S2.1). The area-averaged time series in all three dynamical core experiments were subjected to statistical significance tests at the 95% confidence level (figure S2.2). The SE was found to be statistically significant in North America and Southern Europe/Northern Africa. The statistical significance of FVSE in North America becomes evident when the trend is removed. However, FV did not attain statistical significance in any of the three regions. The detrended prediction skills for boreal winter T2M in the area-averaged time series were also examined (figure S2.1). The detrended prediction skills in the three dynamical core experiments did not differ significantly from those obtained with the trend retained. Overall, the prediction skills of the area-averaged T2M time series in the three dynamical core experiments align with the spatial distribution results. In specific regions, the RMSE scores are closely related to the ACC scores: the dynamical core experiment with a low RMSE score shows relatively high ACC scores during boreal winter. The SE showed the highest prediction skills for winter T2M compared to FV and FVSE, specifically in North America and Southern Europe/Northern Africa. These findings highlight the critical role of dynamical cores in determining predictability.
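Detrending and a significance check for a 17-winter series can be sketched as follows. The text does not state which test was used, so this example assumes a least-squares linear detrend and a two-sided Student's t test on the correlation with n − 2 degrees of freedom. The function names and the synthetic series are hypothetical.

```python
import numpy as np

# Two-sided 95% critical value of Student's t for 15 degrees of freedom
# (n - 2 with n = 17 winters); it must be changed for other sample sizes.
T_CRIT_95_DF15 = 2.131


def detrend(series):
    """Remove the least-squares linear trend from a time series."""
    y = np.asarray(series, dtype=float)
    x = np.arange(y.size)
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


def correlation_test(hindcast, reanalysis, t_crit=T_CRIT_95_DF15):
    """Return the correlation and whether it passes the t test."""
    h = np.asarray(hindcast, dtype=float)
    r = np.asarray(reanalysis, dtype=float)
    corr = np.corrcoef(h, r)[0, 1]
    t_stat = corr * np.sqrt((h.size - 2) / (1.0 - corr**2))
    return corr, bool(abs(t_stat) > t_crit)


# Synthetic 17-winter series sharing a signal and a warming trend.
years = np.arange(17)
reanalysis = np.sin(years) + 0.05 * years
hindcast = np.sin(years) + 0.1 * np.cos(3 * years) + 0.05 * years
corr_raw, sig_raw = correlation_test(hindcast, reanalysis)
corr_det, sig_det = correlation_test(detrend(hindcast), detrend(reanalysis))
print(corr_raw, sig_raw, corr_det, sig_det)
```

Comparing the raw and detrended scores shows how much of the apparent skill comes from a shared trend rather than from year-to-year variability.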

Conclusions and discussion
This study explored the impact of FV and SE dynamical cores on wintertime seasonal prediction. Three seasonal prediction experiments were conducted with two different dynamical cores: the SE and FV experiments, each with its respective dynamical core and generic topography, and the FVSE experiment with the FV dynamical core and SE topography. With these three configurations, 15-member ensemble predictions, initialized on December 1st every winter from 2001/2002 to 2017/2018, were conducted and their mean biases and prediction skills were quantified with respect to JRA-55.
The surface temperature predictions showed overall cold biases in all three dynamical core experiments compared to JRA-55 (SE = −4.27 °C, FV = −5.17 °C, and FVSE = −5.29 °C). The SE prediction shows relatively small cold biases, approximately 1 °C-2 °C warmer than the FV prediction, consistent with Jun et al (2018). The SE prediction was also consistently less cold, particularly in North America, Southern Europe/Northern Africa, and East Asia, than the FVSE prediction, in which the SE topography was adopted in the FV configuration (warmer than FVSE by 1 °C-2 °C, 1 °C, and <1 °C, respectively). This indicates that the difference between the SE and FV predictions is to a large extent caused by the dynamical core itself. Numerous studies in climate modeling, including this study, have consistently reported cold bias characteristics in surface temperature across different seasons and regions (Chen et al 2017, Jun et al 2018, Fan et al 2020, Li et al 2022). This issue remains a subject of ongoing investigation, as also highlighted by Wang et al (2023), who noted that systematic cold biases in surface temperature simulation across various models have been a long-standing concern, with the underlying causes still unclear. Significant biases are also identified in atmospheric circulations. In the mean bias of Z500, the FV, SE, and FVSE predictions display a positive NAO-like pattern compared to JRA-55. In comparison with the FV and FVSE, the SE showed a negative NAO-like pattern in the mid-troposphere, exhibiting an anticyclonic geopotential height difference over Greenland, an easterly flow across North America, and a northerly flow at high latitudes. These differences are consistent with the warm and humid conditions of the SE, particularly in North America and Southern Europe/Northern Africa, compared to the FV and FVSE (see supporting information S1). This result again reveals the importance of the dynamical core.
All three predictions showed reasonably high prediction skills across the Northern Hemisphere continents. The FV, FVSE, and SE predictions over the Northern Hemisphere continents showed ACC scores of 0.41, 0.40, and 0.37, respectively. The RMSE scores are 0.41, 0.47, and 0.52, respectively. However, the prediction skills vary from one region to another, depending on the model configurations. In North America, the SE prediction showed the highest ACC (i.e. SE = 0.52, FVSE = 0.47, and FV = 0.41) and the lowest RMSE (i.e. SE = 1.56, FVSE = 1.74, and FV = 1.85). This contrasts with the prediction skills in East Asia, where the FV and FVSE predictions achieved superior skills compared with the SE prediction, with ACC scores of FV = 0.38, FVSE = 0.37, and SE = 0.27 and RMSE scores of FV = 0.94, FVSE = 0.96, and SE = 1.01. In Southern Europe/Northern Africa, SE and FVSE showed comparable seasonal prediction skill, i.e. SE = 0.46, FVSE = 0.45, and FV = 0.38 for ACC and SE = 1.09, FVSE = 1.10, and FV = 1.14 for RMSE. We also analyzed the prediction skill scores in area-averaged T2M time series for four specific regions (figure S2.1). For both ACC and RMSE, the three dynamical core experiments lead to the same conclusions as the spatial distribution results.
The above findings demonstrate that the choice of a dynamical core can significantly impact seasonal predictions in the Northern Hemisphere mid-latitudes during boreal winter, and that it produces evident regional differences in mid-latitude surface temperature predictability. Hence, the careful selection of a dynamical core is an essential step toward reducing inherent uncertainties in the model system and can offer valuable insights for enhancing the prediction performance of the boreal winter climate.
Future work should extend the analysis of uncertainty presented here. The analysis framework employed in this study assesses predictability specifically for boreal winter climate without extending the evaluation to a global scale, including the Southern Hemisphere, or to all seasons. Prediction skill estimates based on seasonal hindcasts are subject to various sampling uncertainties, for example, due to the hindcast length or the finite size of the ensemble. Note that prediction systems typically utilize at least a 1 month lead time for seasonal predictions (Jung et al 2015, MacLachlan et al 2015). Although we only showed 0 month lead forecasts for DJF, we also conducted 1 month lead forecasts for JF to check the lead time sensitivity of the results. While prediction skill scores are reduced compared to the 0 month lead DJF forecasts, consistent regional differences in prediction performance between dynamical cores are evident, suggesting that the dynamical core differences are robust features (see figure S3 in supporting information). This study can be extended to explore predictability differences in dynamical cores associated with different initialization approaches. Furthermore, future investigations could delve into techniques for correcting the mean biases that arise from differences in dynamical cores. As it stands, this study serves as a pilot study utilizing an atmospheric model with prescribed surface boundary conditions. Expanding this work to operational seasonal prediction models that consider interactions between the atmosphere, ocean, and land processes is a necessary step for a comprehensive understanding.
All data that support the findings of this study are included within the article (and any supplementary files).

Figure 1. Spatial distribution of topography for the spectral element (SE) dynamical core and the finite volume (FV) dynamical core. (a) SE topography, (b) FV topography, and (c) the difference between (a) and (b).
Dynamical core, horizontal resolution, and physics settings in CAM5 configuration (when using configure script)
Finite volume (FV) dynamical core: -dyn fv -hgrid 1.9x2.5 -phys cam4 -ocn docn
Spectral element (SE) dynamical core: -dyn se -hgrid ne16np4 -phys cam4 -ocn docn

Related case and resolution settings in the CESM1 configuration (when using create_newcase script)
Finite volume (FV) dynamical core: -case F -res f19_g16
Spectral element (SE) dynamical core: -case F -res ne16_g16

Figure 2. Spatial distribution of hindcast mean bias and difference for 2 m air temperature (T2M, °C) during wintertime (DJF) from 2001/2002 to 2017/2018. (a) Mean bias of the SE (SE minus JRA-55), (b) mean bias of the FV (FV minus JRA-55), (c) mean bias of the FVSE (FVSE minus JRA-55), (d) mean difference between the SE and the FV, and (e) mean difference between the SE and the FVSE. A mask has been applied to the ocean area for a clear distinction of T2M in the land area.

Table 1. Configurations of seasonal prediction models used in this study.

Table 2. Summary of flag sets for the two dynamical cores used in this study.

Table 3. Ensemble strategy, resolution, initial and boundary conditions used in this study.