Separating internal and forced contributions to near term SST predictability in the CESM2-LE

An open question in the study of climate prediction is whether internal variability will continue to contribute to prediction skill in the coming decades, or whether predictable signals will be overwhelmed by rising temperatures driven by anthropogenic forcing. We design a neural network that is interpretable such that its predictions can be decomposed to examine the relative contributions of external forcing and internal variability to future regional sea surface temperature (SST) trend predictions in the near-term climate (2020–2050). We show that there is additional prediction skill to be garnered from internal variability in the Community Earth System Model version 2 Large Ensemble, even in a relatively high forcing future scenario. This predictability is especially apparent in the North Atlantic, North Pacific and Tropical Pacific Oceans as well as in the Southern Ocean. We further investigate how prediction skill covaries across the ocean and find three regions with distinct coherent prediction skill driven by internal variability. SST trend predictability is found to be associated with consistent patterns of decadal variability for the grid points within each region.


Introduction
Skillfully predicting both global and regional climate change on interannual to decadal timescales is an outstanding problem from both a scientific and societal standpoint (Findell et al 2023). Trustworthy forecasts on this timescale can provide actionable information about the future climate for various sectors affected by climate change (Kushnir et al 2019, Solaraju-Murali et al 2021, Dunstone et al 2022). However, making skillful forecasts on decadal timescales is a challenge because it requires both predicting the climate response to anthropogenic forcings like greenhouse gases and aerosols, as well as skillfully forecasting low frequency internal climate variability (Meehl et al 2021). For example, it has been argued that modulations in the rate of global mean surface temperature increase are associated with changes in the internal low frequency variability in the Pacific Ocean so that even in the global mean, skillful temperature predictions require comprehensive understanding about internal variability (Trenberth and Fasullo 2013, Labe and Barnes 2022). On regional scales, the amplitude of internal variability can be much larger relative to the forced response than on global scales, providing an even larger source of uncertainty in the range of variability (Lehner et al 2020, Lehner and Deser 2023). Identifying and understanding predictable internal variability on regional scales is therefore an opportunity to provide stakeholders with improved estimations of the future range of variability.
Internal variability on interannual to decadal timescales can be predictable both in model simulations of the pre-industrial climate (Branstator et al 2012, Gordon et al 2021, Gordon and Barnes 2022), and in hindcast simulations of the historical era (Meehl et al 2016, Smith et al 2019, Delgado-Torres et al 2022). It has been suggested that future, global scale variability may too be predictable for certain initial states (Labe and Barnes 2022). This predictability is generally associated with low frequency modes of variability, namely the Pacific decadal oscillation (PDO; Mantua et al 1997, Newman et al 2016) and Atlantic multidecadal variability (Enfield et al 2001). However, prediction skill associated with internal variability is often sparse and limited to ocean heat predictability in mid-latitude ocean basins (Yeager et al 2018). Even with these limitations, it has been demonstrated that skillful sea surface temperature (SST) prediction can constrain predictions of surface climate evolution, for example temperature and precipitation over Europe (Simpson et al 2019, Borchert et al 2021a). Therefore, studies have increasingly suggested studying 'windows of opportunity' or state-dependent predictability to better leverage skill in decadal predictions (Borchert et al 2018, Brune et al 2018, Mariotti et al 2020, Merryfield et al 2020, Gordon and Barnes 2022). This framework focuses on identifying initial states or 'windows' where the climate is more predictable, inherently acknowledging that predictability depends on the initial state of the system, and these 'windows' provide the best opportunity to make skillful decadal predictions.
In the next 30 years, it is expected that forced warming from anthropogenic greenhouse gases will continue to increase. This raises the question of whether sources of predictability identified in the historical climate will still contribute to skillful decadal predictions, as the large signal of forced warming could overwhelm predictable signals from internal variability. Here, we aim to investigate whether, and to what extent, internal variability contributes to near term decadal prediction skill in a relatively high future forcing scenario. We develop a novel neural network architecture that separately ingests information about the forced response and internal variability to predict the SST trend over the next 10 years. This architecture is then used to diagnose the contribution of internal variability to prediction accuracy, in order to identify regions where internal variability is a significant source of predictability in a near-future climate. We further address the question of where and when we can attribute prediction skill to internal variability in the presence of high anthropogenic forcing.

(Fasullo et al 2022). The ensemble members are also designed to sample the phase of the Atlantic Meridional Overturning Circulation (AMOC) by splitting the full ensemble into five groups based on their initialized AMOC. We account for both the biomass forcing and AMOC initialization in our experiment design (see next section). We use SST output bilinearly regridded to 5° × 5° resolution, and coarsened to annual means at each grid point.

Community Earth System Model version 2 Large Ensemble, CESM2-LE
The target predictions in this work are classifications of future decadal SST trends, specifically whether a future trend in a particular region will fall in the lower, middle or upper tercile of the 2020-2050 distribution in that region (figure 1(b)). We therefore calculate sliding (i.e. starting each year) 10 year linear least-squares trends over the years 2020-2050 in 10° × 10° boxes in the ocean for each ensemble member. Figure 1(a) shows the annual mean time series for a single ensemble member (blue curve) and the forced response (black curve) in a 10° × 10° box over 1960-2100. SST trends for consecutive 10-year periods (green, orange and red lines) show that even though forced SST increases throughout the 21st century, regional internal variability can still contribute to reduced warming or even cooling in the later part of the 21st century. The distribution of decadal SST trends over 2020-2050 further demonstrates this point (figure 1(b)), with trends in this particular grid box ranging from −1 °C/decade to over 1.5 °C/decade.
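The trend and tercile calculation described above can be illustrated with a minimal sketch. It assumes a single annual mean SST series per box and uses only the Python standard library; the function names (`linear_trend`, `sliding_trends`, `tercile_labels`) are hypothetical and are not taken from the study's code.

```python
from statistics import quantiles

YEARS_PER_DECADE = 10


def linear_trend(values):
    """Least-squares slope of values against 0, 1, ..., n-1 (units per year)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    cov = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    var = sum((i - x_mean) ** 2 for i in range(n))
    return cov / var


def sliding_trends(series, window=10):
    """Trend (per decade) of every window-year segment, starting each year."""
    return [YEARS_PER_DECADE * linear_trend(series[i:i + window])
            for i in range(len(series) - window + 1)]


def tercile_labels(trends):
    """Label each trend 0 (lower), 1 (middle) or 2 (upper) tercile.

    The cut-offs come from whatever list is passed in; for the study's
    targets this would be the pooled 2020-2050 trends of all members.
    """
    lower_cut, upper_cut = quantiles(trends, n=3)
    return [0 if t <= lower_cut else 1 if t <= upper_cut else 2 for t in trends]
```

In practice the trends from all ensemble members in a box would be pooled before calling `tercile_labels`, so that each class holds roughly one third of the samples.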

Artificial neural networks
Artificial neural networks (ANNs) are used here to both make predictions of future decadal SST trend and also to investigate sources of predictability. We use ANNs because they are an established method of identifying sources of predictability on decadal timescales (Gordon and Barnes 2022) and for predicting future trends in SST (Labe and Barnes 2022). In this application, neural networks can be considered a non-linear data driven model, taking information about the current and past state of the climate (the state of global SSTs) to make a prediction about a future quantity (regional decadal SST trends). Unlike some previous studies that use ANNs with post-hoc evaluation methods to examine their predictions and predictability (e.g. Gordon and Barnes 2022, Labe and Barnes 2022), we separate the prediction problem into an internal variability component and an external forcing component by designing two separate ANNs and only coupling them at the final prediction step of the network. This design allows for a direct investigation of the relative contributions of internal variability and external forcing to the ANN's prediction. The external forcing component is defined as the ensemble mean across all ensemble members at each grid point, and the internal variability for a member is defined at each grid point as the full member minus the ensemble mean. Note any changes in internal variability (e.g. amplitude, period) due to forcing are therefore reflected in the relationships learned by the internal variability network. However, we still consider any sources of predictability identified by this network as attributable to internal variability because the network still must learn to identify where and when this internal variability is predictable.
More specifically, we design two neural networks for the prediction task, named the EF_Network and the IV_Network (figure 1(c)), which make predictions using only information about external forcing (EF) and internal variability (IV), respectively. The inputs to the neural networks are maps of global SST at 5° × 5° resolution. For the IV_Network, we input two time-lagged maps of internal variability, the first averaged over the 10 years prior to the prediction (τ = −1 to −10) and the second averaged over 10 years and lagged by five years (τ = −5 to −14) (figure 1(c)). These inputs provide the ANN with information about the current state of internal SST variability, and an earlier state, which gives information about the time evolution of internal variability before a prediction. The input to the EF_Network is simply a map of the forced response (i.e. ensemble mean at each grid point) averaged over the 10 years prior to a prediction (τ = −1 to −10) (figure 1(c)).
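As an illustration of these definitions, the sketch below splits an ensemble into forced response and internal variability and builds the lagged averages. It works on a single grid point for brevity, whereas the actual network inputs are global maps; `split_forced_internal` and `lagged_mean` are hypothetical names introduced here.

```python
def split_forced_internal(ensemble):
    """Split ensemble[m][t] into the forced response and internal variability.

    forced[t] is the ensemble mean; internal[m][t] is member m minus that mean.
    """
    n_members = len(ensemble)
    n_years = len(ensemble[0])
    forced = [sum(member[t] for member in ensemble) / n_members
              for t in range(n_years)]
    internal = [[member[t] - forced[t] for t in range(n_years)]
                for member in ensemble]
    return forced, internal


def lagged_mean(series, t0, first_lag, last_lag):
    """Mean of series over years t0 - last_lag through t0 - first_lag inclusive.

    t0 is the index of the first year of the prediction window, so
    (first_lag, last_lag) = (1, 10) gives tau = -1 to -10 and
    (5, 14) gives tau = -5 to -14.
    """
    if t0 - last_lag < 0:
        raise ValueError("not enough years before t0 for the requested lag")
    window = series[t0 - last_lag:t0 - first_lag + 1]
    return sum(window) / len(window)
```

For one member, the two IV_Network inputs at a grid point would then be `lagged_mean(internal[m], t0, 1, 10)` and `lagged_mean(internal[m], t0, 5, 14)`, and the EF_Network input would be `lagged_mean(forced, t0, 1, 10)`.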
Each of the neural networks outputs a classification of the future SST trend (i.e. lower, middle or upper tercile) at a particular grid point, and these values are summed pair-wise (with no activation, weight or bias) to make the Combined Network prediction. This is demonstrated schematically in figure 1(c). The upper box shows the EF_Network, with a single map of the externally forced SST input into an ANN which outputs three values, EFa, EFb and EFc. The lower box is the IV_Network, with two time lagged internal variability maps input into an ANN which also outputs three values, IVa, IVb and IVc. The outputs of the individual models are summed pairwise (EFa+IVa, EFb+IVb, EFc+IVc) to make the final predictions of the future SST trend over the next 10 years. We name this full system the Combined Network. The Combined Network's prediction is taken to be the class with the highest value prediction in the final layer after the softmax activation function is applied. The softmax activation function converts the raw ANN outputs to probability-like values. The inclusion of the softmax activation means that higher value predictions correspond to higher ANN confidence in the prediction such that predictions can then be ranked by their value. For ease of comprehension, specific examples of predictions by the Combined Network with contributions from the IV_Network and EF_Network are provided schematically in figure S1.
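The coupling step can be written out in a few lines. The sketch below assumes the two sub-networks have already produced their three raw outputs; it only shows the pairwise sum, the softmax, and the choice of the winning class, and the function names are hypothetical.

```python
import math


def softmax(values):
    """Convert raw outputs to probability-like values that sum to one."""
    shift = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def combined_prediction(ef_out, iv_out):
    """Sum EF and IV outputs pairwise, apply softmax, return (class, confidence, probs).

    ef_out = (EFa, EFb, EFc) and iv_out = (IVa, IVb, IVc); no weight,
    bias or activation is applied before the sum.
    """
    summed = [ef + iv for ef, iv in zip(ef_out, iv_out)]
    probs = softmax(summed)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted, probs[predicted], probs
```

Because the sum happens before the softmax, the contribution of each sub-network to the winning class can be read directly from `ef_out` and `iv_out`, which is what makes the decomposition possible.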
We subset the 100 ensemble members into 60 members for training the ANNs, 20 members for validation (used to select the best performing networks), and 20 members for testing, which are 'unseen' by the ANNs and used exclusively for performance evaluation and final analysis. The training, validation and testing data sets are created by grouping individual ensemble members such that each data set equally samples across the five AMOC initializations, and the two different biomass forcings in the individual ensemble members. We use model years 1960-2100 for training the ANNs, but only validate and test on 2020-2050. We use a wider span of model years for training so the neural networks can 'see' more possible internal variability states. All results presented are from the testing set. We train ten combined networks for each 10° × 10° grid box in the ocean and present results from the best network at each location, but results do not qualitatively change if we instead use an average of networks. We define 'best' as the network that achieves the lowest loss on the validation data (not shown). Further, detailed information about the hyperparameters and training process is provided in the supporting information.
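One way to build such a split is to stratify by AMOC group and biomass forcing, as sketched below. This is an assumption about how the grouping could be done, not the procedure used in the study: the member metadata tuples, the seed and the function name `stratified_split` are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(members, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split members into train/validation/test within each stratum.

    members is a list of (member_id, amoc_group, biomass_flag) tuples.
    Each (amoc_group, biomass_flag) stratum is shuffled and divided using
    the same fractions, so every data set samples all strata equally.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member_id, amoc_group, biomass_flag in members:
        strata[(amoc_group, biomass_flag)].append(member_id)

    train, val, test = [], [], []
    for key in sorted(strata):
        ids = strata[key][:]
        rng.shuffle(ids)
        n_train = round(fractions[0] * len(ids))
        n_val = round(fractions[1] * len(ids))
        train += ids[:n_train]
        val += ids[n_train:n_train + n_val]
        test += ids[n_train + n_val:]  # remainder goes to the testing set
    return train, val, test
```

With 100 members spread evenly over five AMOC groups and two biomass forcings, each stratum holds ten members and contributes six, two and two members to the three sets.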

Identifying contributions to prediction skill
The Combined Networks skillfully predict SST trends over 2020-2050 (figure 2(a)), with the accuracy outperforming random chance (by design 33%) over most of the globe. Furthermore, the highest neural network accuracy corresponds to regions that are considered to be more predictable on decadal timescales: the North Pacific, North Atlantic and Southern Indian Ocean (Meehl et al 2021). Recent literature has emphasized the importance of identifying so-called 'windows of opportunity' for prediction skill on decadal timescales because this provides an indication of when variability may be at its most predictable (Mariotti et al 2020, Gordon and Barnes 2022). We therefore examine the existence of windows of opportunity in near-future decadal predictions, adopting a method similar to that used by Mayer and Barnes (2021) and Gordon and Barnes (2022) by designating windows of opportunity as the 20% of samples that the Combined Networks assigned the highest confidence at each grid point (see methods). Note that other cutoffs could also be used because neural network accuracy increases with increasing confidence in prediction (see supplement figure S2); however, we choose 20% as this provides the clearest signal of accuracy increase. The 20% most confident samples as designated by the neural networks generally have higher accuracy than all predictions (figure 2(b)), demonstrating that the neural networks have learned inputs that are more likely to lead to a correct prediction, and hence initial states that are more predictable. Skill improvements are especially evident in the North Pacific PDO region, the North Atlantic Ocean and broadly across the Southern Ocean, which aligns with previous work that points to windows of opportunity for enhanced prediction skill existing in these regions (Gordon and Barnes 2022).
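The windows-of-opportunity selection amounts to ranking predictions by confidence and scoring only the top fraction. A minimal sketch, assuming the confidence is the softmax value of the predicted class and using hypothetical function names:

```python
def accuracy(predicted, truth):
    """Fraction of predictions that match the true class."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def most_confident_accuracy(predicted, truth, confidence, fraction=0.2):
    """Accuracy over the given fraction of samples with the highest confidence."""
    n_keep = max(1, round(fraction * len(truth)))
    order = sorted(range(len(truth)), key=lambda i: confidence[i], reverse=True)
    keep = order[:n_keep]
    return accuracy([predicted[i] for i in keep], [truth[i] for i in keep])
```

Comparing `most_confident_accuracy(...)` with `accuracy(...)` at each grid point gives the kind of contrast shown between figures 2(a) and 2(b).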
We examine the contribution of internal variability to the Combined Network's skill by using permutation importance testing (Breiman 2001, McGovern et al 2019) in figure 2(c). Permutation importance measures a deep learning model's dependence on a certain predictor by scrambling that predictor while holding the others fixed and examining how the output is affected by the corrupted input data. Here, we test the sensitivity of the Combined Network's skill to the internal variability input and assume a null hypothesis that internal variability does not increase prediction skill. We scramble the internal variability input pixel-wise (randomly drawing each pixel individually from its distribution in the testing set) to create a corrupt testing set and calculate the accuracy of the neural network on this data. This process is repeated 500 times for each grid box and we plot the mean accuracy of each neural network on the scrambled data (figure 2(c)). The neural networks still perform well in some regions (Indian Ocean, Subtropical North Atlantic) when the internal variability input is scrambled, implying that network skill is largely derived from the forced response in these regions. The difference between the total accuracy (figure 2(a)) and the accuracy from the permutation importance testing (figure 2(c)) provides the contribution of internal variability to the neural network's skill (figure 2(e)). We consider internal variability to significantly increase a network's skill (reject the null hypothesis) if the network's accuracy on the true testing set is greater than the 90th percentile of the accuracy on the scrambled data. Internal variability significantly contributes to the Combined Networks' skill in the northern and eastern edge of the North Pacific and extending into the tropical Pacific (i.e. the PDO 'horseshoe'). There is also enhanced prediction skill from internal variability in the subpolar North Atlantic. Internal variability in these regions has previously been shown to be predictable in studies of pre-industrial and historical climate (Meehl et al 2021, Gordon and Barnes 2022), but these results further imply that internal variability in these regions can provide predictability in the presence of relatively high anthropogenic forcing.
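The permutation test can be sketched as follows. The `predict` callable is a hypothetical stand-in for a trained Combined Network with its external forcing input held fixed; drawing each pixel with replacement and using a nearest-rank percentile are assumptions made here for simplicity.

```python
import math
import random


def accuracy(predicted, truth):
    """Fraction of predictions that match the true class."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def scramble_pixelwise(samples, rng):
    """Redraw every pixel independently from that pixel's values in the set."""
    columns = list(zip(*samples))  # columns[p] holds pixel p across all samples
    return [[rng.choice(col) for col in columns] for _ in samples]


def permutation_test(predict, iv_samples, truth,
                     n_repeats=500, percentile=90, seed=0):
    """Compare true accuracy with accuracy on scrambled internal variability."""
    rng = random.Random(seed)
    true_acc = accuracy(predict(iv_samples), truth)
    scrambled = sorted(
        accuracy(predict(scramble_pixelwise(iv_samples, rng)), truth)
        for _ in range(n_repeats)
    )
    # Nearest-rank percentile of the scrambled accuracies.
    rank = max(1, math.ceil(percentile / 100 * n_repeats))
    threshold = scrambled[rank - 1]
    mean_scrambled = sum(scrambled) / n_repeats
    return {
        "true_accuracy": true_acc,
        "mean_scrambled_accuracy": mean_scrambled,
        "iv_contribution": true_acc - mean_scrambled,
        "significant": true_acc > threshold,  # reject the null hypothesis
    }
```

The `iv_contribution` entry corresponds to the difference mapped in figure 2(e), and `significant` to the un-stippled grid points.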
In conjunction with identifying windows of opportunity for improved prediction skill, studies have underlined the difficulty in attributing prediction skill during windows of opportunity to either internal variability, or time varying changes in anthropogenic forcing in the historical period (Borchert et al 2021b). We therefore use permutation importance to decipher to what extent internal variability contributes to prediction skill during windows of opportunity. First we calculate the skill of the 20% most confident predictions in the scrambled internal variability data (figure 2(d)). We find skill increases in the Southern Indian and Atlantic Oceans even with scrambled internal variability input, implying that internal variability did not contribute to the enhanced skill during windows of opportunity in these regions. This increased skill is hence likely derived solely from the external forcing input. Notably, skill enhancements in the North Pacific, Tropical Pacific and some of the North Atlantic Ocean during windows of opportunity can likely be attributed to the networks learning predictable internal variability (figure 2(f)).
Furthermore, in regions where internal variability contributes substantially to prediction skill, enhancements during windows of opportunity are even larger than across all predictions. For example, the general accuracy increase provided by internal variability in the Tropical Pacific is approximately 5-8 percentage points (figure 2(e)) but accuracy increases to up to 15 percentage points during windows of opportunity (figure 2(f)).

Internal sources of predictability over 2020-2050
Having identified regions where internal variability provides enhanced skill for SST trend prediction, we now examine how large-scale phenomena can lead to regional prediction skill in a future climate. We first isolate grid points where internal variability contributes significantly to prediction skill in 2020-2050 (i.e. regions un-stippled in figure 2(e)). We then classify whether the IV_Network at each grid point was correct or incorrect for each test sample prediction. Four example IV_Network accuracy timeseries are demonstrated schematically in figure 3(a). We use K-means clustering to cluster these prediction outcome lists, resulting in clusters of grid points with correlated prediction skill, that is, clusters within which skill is more likely to be associated with the same input pattern of variability. We choose six clusters as this number appeared a reasonable choice for this data (figure S3) and plot the assigned clusters in figure 3(b). Three distinct spatial regions emerge: a Tropical Pacific cluster, a North Pacific cluster and a North Atlantic cluster (clusters 4, 5 and 6 respectively). These grid points are also well described by their respective centroids (figures S3(b) and (c)). The remaining three clusters (clusters 1, 2 and 3) are not spatially distinct, and investigation into their sources of predictability is beyond the scope of this study. However, the underlying variability that leads to similar prediction skill within these clusters remains an intriguing avenue for future work. We choose to focus the remainder of this study on clusters 4, 5 and 6.
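To make the clustering step concrete, the sketch below runs a plain Lloyd's-algorithm K-means on binary correct/incorrect lists, one list per grid point. It is a self-contained illustration rather than the study's implementation: the deterministic farthest-point initialization and the function names are assumptions, and a library routine such as scikit-learn's `KMeans` would normally be used instead.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100):
    """Cluster points (one outcome list per grid point) into k groups."""
    # Farthest-point initialization: start from the first point, then keep
    # adding the point farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points,
                       key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Each centroid is then the fraction of grid points in the cluster that were correct for each test sample, which is why grid points that succeed and fail on the same samples end up grouped together.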
To investigate patterns of internal variability that correspond to common predictability within each cluster, we first compute the IV_Network accuracy within a cluster for each testing sample (i.e. percentage of grid points in the cluster that were correctly predicted) and isolate samples with cluster accuracy greater than 50% (i.e. more than 50% of grid cells within the cluster are correctly predicted for that sample). For example, figures 4(a) and (d) show the composite of input samples for which cluster accuracy in the North Pacific cluster is greater than 50%. We additionally separate the composites based on the predicted tercile within a cluster so that the opposing trend signals are not mixed in the composite. This distinction assumes that, due to the spatial proximity of grid points within each cluster, the correct target prediction will generally be the same.
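The compositing procedure can be sketched as below. It assumes each sample already carries a single predicted tercile for the cluster, which is a simplification introduced here, and that input maps are flattened to lists; the function names are hypothetical.

```python
def cluster_accuracy(correct_by_grid):
    """Fraction of a cluster's grid points predicted correctly, per sample.

    correct_by_grid[g][s] is 1 if grid point g was correct for sample s.
    """
    return [sum(column) / len(column) for column in zip(*correct_by_grid)]


def composite(inputs, cluster_acc, predicted_tercile, tercile, threshold=0.5):
    """Average the input maps of samples that pass the accuracy threshold.

    Only samples with cluster accuracy above the threshold and with the
    requested predicted tercile are kept. Returns (mean_map, n_samples),
    or (None, 0) when no sample qualifies.
    """
    selected = [inputs[s] for s in range(len(inputs))
                if cluster_acc[s] > threshold and predicted_tercile[s] == tercile]
    if not selected:
        return None, 0
    mean_map = [sum(column) / len(selected) for column in zip(*selected)]
    return mean_map, len(selected)
```

Calling `composite` once per tercile and once per lagged input map would give the kind of panels shown in each column of figure 4.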
In the North Pacific cluster (cluster #5), predictable upper tercile SST trends follow slightly positive SSTs in the PDO horseshoe 5-14 years before the prediction (figure 4(a)) which appear to become more negative in the 10 years directly preceding the prediction (figure 4(d)). This SST evolution likely leads to predictable upper tercile trends because the leading mode of decadal variability in the North Pacific, the PDO, acts on approximately a 10-20 year timescale (Newman et al 2016). Increasingly negative SSTs in the horseshoe region over a 15 year period are thus likely followed by warming SSTs in the next decade. Supplement figure S4(a) shows the composite annual mean PDO index for the samples in this cluster, demonstrating the evolution of decreasing PDO index transitioning back to increasing PDO index over the input and prediction output window. We also suggest the opposite mechanism for predictions of lower tercile SST trends in the North Pacific cluster (figures 4(g) and (j)), with strengthening positive SSTs in the horseshoe region over the input period leading to more predictable lower tercile SST trends. See figure S4(d) for composite PDO index over this period. These results provide evidence that some SST trends that are lower than that of the forced response may be predictable in the near future, and predictable trends are associated with the decadal evolution of the PDO.
In the North Atlantic cluster, precursors to predictable positive SST trends appear to be a strengthening SST dipole between the North Atlantic subpolar gyre and subtropical North Atlantic, with subpolar gyre SST anomalies becoming more negative, and the subtropical SST anomalies becoming more positive (figures 4(b) and (e)). This SST pattern is likely driven by a similar mechanism to that identified by Borchert et al (2018), with strengthening positive SST anomalies in the subtropical Atlantic forming a predictable state for positive SST anomalies in the subpolar gyre. Supplement figure S4(b) shows the composite SST evolution for the North Atlantic subpolar gyre for these predictions, supporting this theory. Note we choose to express this in terms of an SST index rather than the full AMO index (Trenberth and Shea 2006), as the AMO encompasses the full meridional extent of the North Atlantic and hence may not fully capture heat transport between the subtropics and the midlatitude ocean. Predictable lower tercile SST trends in the subpolar gyre are preceded by strengthening positive SST anomalies in the 15 years before a prediction (figures 4(h) and (k)). This positive anomaly is likely followed by a negative anomaly within the 10 year prediction window (figure S4(e)), resulting in net negative SST trends following these initial states. Here we find that there are more samples in the upper tercile trend composite (102) than the lower tercile trend composite (26), implying that a warming trend in the subpolar gyre region may be more predictable than a cooling trend, which aligns with previous findings (Borchert et al 2018, Gordon and Barnes 2022). Furthermore, though these mechanisms leading to predictability in the North Atlantic have been studied previously, here we provide evidence that they may continue to be a source of predictability in the presence of relatively high anthropogenic forcing in the near future.
Lastly, the initial state that corresponds to correct predictions of upper tercile trends in the Tropical Pacific cluster (figures 4(c) and (f)) appears to coincide with a strengthening El Niño-like pattern in the central Pacific Ocean for the 15 year period preceding the prediction of positive trend, as equatorial Pacific SST anomalies strengthen from prediction lead years 5-14 (figure 4(c)) to lead years 1-10 (figure 4(f)). We hypothesize that the ANNs are forecasting a shift to La Niña in the early part of the 10 year prediction window, as large El Niño events are often followed by a rebound to a La Niña within both observations and CESM2 (Planton et al 2018, Capotondi et al 2020). This La Niña hence results in large negative SST anomalies in the tropical Pacific in the early part of the prediction window. The SSTs will then likely follow the approximate timescale of the El Niño Southern Oscillation (ENSO) cycle by growing into neutral conditions, and then likely another El Niño within the later part of the future 10 year period, resulting in a net positive trend over the forecast period. Supplement figure S4(c) demonstrates the composite annual mean Niño3.4 index for the 10 years preceding and following the upper tercile predictions for samples in figures 4(c) and (f), showing a robust Niño3.4 decrease early in the prediction window, followed by a positive trend for the next decade. Conversely, a common initial state for predictable lower tercile SST trends for the ENSO cluster (figures 4(i) and (l)) shows a strengthening La Niña-like cooling in the central Pacific over the preceding 15 years. Similar to the positive trend prediction, we hypothesize that the ANNs forecast a substantial El Niño early in the prediction window which dramatically increases tropical SSTs. These positive SST anomalies then decay to neutral conditions and a further La Niña event within the 10 year window (supplement figure S4(f), like figure S4(c), shows the composite annual mean Niño3.4 index for the 10 years preceding and following the predictions in the composite). This tropical SST evolution therefore results in a net negative trend prediction for grid points in the Tropical Pacific cluster (El Niño to neutral to La Niña).

Conclusion
This study demonstrates that internal variability is a source of predictability in the years 2020-2050 in the CESM2-LE, even with the relatively high anthropogenic climate forcing in the SSP3-7.0 scenario (O'Neill et al 2016). This result has interesting implications for future projections of regional climate change, as it implies that there may be some periods where SST trend predictions can be more skillful than just predicting the forced response. SST patterns like ENSO and PDO are associated with atmospheric teleconnections which affect temperature and precipitation over land, so any improved predictions of these patterns can potentially couple to improved future estimates of land surface processes (Mankin et al 2020). Furthermore, it is becoming clear that identifying windows of opportunity for improved prediction skill will continue to be a crucial method for making skillful near-term climate forecasts (Mariotti et al 2020). Our results further suggest that windows of opportunity for increased prediction skill will likely exist in a future with increased anthropogenic greenhouse gas forcing. Interestingly, in places where internal variability is a significant source of skill, correctly forecasting internal variability becomes even more important during windows of opportunity. The importance of internal variability during windows of opportunity reinforces the need to investigate predictable states of internal variability in the climate system as it likely provides the best opportunity for skillful decadal predictions.
We have investigated variability that contributes to predictability, and found it appeared to be broadly attributable to large scale patterns of internal variability. Here, the clusters of predictability could be considered to be the six centroids that explain the most prediction skill in the ocean, but notably this does not account for interactions between grid point clusters. For example, there is evidence that decadal interactions between the Pacific and Atlantic Oceans may lead to predictability following this lead-lag relationship (Meehl et al 2021, Sun et al 2021, Gordon and Barnes 2022). Furthermore, decadal variability in the Tropical Pacific appears to provide some measure of decadal prediction skill of decadal ENSO variability, though there is some question of to what extent this predictability arises from internal variability, or external forcing in the observational era (Boer et al 2019, Klavans et al 2021, Power et al 2021). There may hence be different, or even coupled, predictable signals in the clusters beyond those investigated here. We suggest that future work focus on methods for deciphering how SST predictability varies (and co-varies) around the globe, especially in light of our other finding that this may provide better ability to predict future climate.
Our findings may potentially aid in the communication of climate change and its impacts since internal variability modulates the forced climate change signal, particularly on regional scales. For example, there is still potential for continued warming and extreme events due to internal variability even after aggressive climate change mitigation efforts (Diffenbaugh et al 2023). Much of the public perceives climate change based on short-term, regional trends (Shao et al 2016) so continued warming could harm continued mitigation efforts if previous efforts are perceived to have failed. With improved understanding of predictable internal variability, we can better attribute whether future variability stems from either successful mobilization against climate change, or irreducible internal variability.
to David Wallerstein. The authors thank James W Hurrell and Maria Rugenstein (CSU) for their informative comments.

Figure 1. (a) CESM2-LE annual mean SST time series area averaged over 240°E-250°E, 40°S-50°S. The blue line is a single ensemble member (member 70) with gray lines indicating all other members. The black line is the forced response, defined as the ensemble mean. The green, orange and red lines are individual 10 year trends in SST in ensemble member 70, plotted every 10 years. (b) Distribution of SST trends (°C/decade) between 2020 and 2050 for all ensemble members at 240°E-250°E, 40°S-50°S. Color coding indicates the tercile cut-offs: green are in the lower third (or tercile) of the 2020-2050 distribution, orange in the middle third, and red in the upper third. (c) Schematic of the neural network architecture demonstrating the internal variability network (IV_Network) and external forcing network (EF_Network), and their summation to the Combined Network output.

Figure 2. (a) Neural network accuracy at each grid point for testing samples in the years 2020-2050. (b) Neural network accuracy on 20% of testing samples with highest confidence. (c) Neural network accuracy when internal variability input is scrambled. (d) Accuracy for 20% most confident predictions when internal variability is scrambled. (e) Difference in accuracy between total accuracy and accuracy on scrambled internal variability (i.e. panel (a) minus panel (c)). Differences that are not significant at 90% are stippled. (f) Difference in accuracy between accuracy of confident predictions and accuracy of confident predictions when internal variability is scrambled (i.e. panel (b) minus panel (d)). Differences not significant at 90% are stippled.

Figure 3. (a) Example lists of prediction outcomes from internal variability at four grid points. (b) K-means clusters of accuracy from internal variability. Black regions were not included in clustering.

Figure 4. Initial states of internal variability that lead to predictable SST trends in the three clusters. The left column is the North Pacific cluster, the middle column is the North Atlantic cluster, and the right column is the Tropical Pacific cluster. The top two rows are initial state composites for upper tercile trend predictability, and the bottom two rows are for lower tercile trend predictability. All plots show SST anomalies from the ensemble mean in °C. Grid points included in each cluster are illustrated with boxes in each plot.
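
A composite of this kind can be sketched as an average of input anomaly maps over selected samples. The selection rule used here (samples where the target tercile was both predicted and observed) and the function name are assumptions for illustration; the caption does not state the exact selection criteria.

```python
import numpy as np

def composite_initial_state(anomaly_maps, predicted, truth, target_class):
    """Mean anomaly map over samples correctly predicted as `target_class`.

    anomaly_maps: (n_samples, n_lat, n_lon) SST anomalies from the ensemble mean.
    predicted, truth: (n_samples,) integer tercile classes.
    Returns the composite map and the number of samples that went into it.
    """
    selected = (predicted == target_class) & (truth == target_class)
    if not selected.any():
        raise ValueError("no correctly predicted samples for this class")
    return anomaly_maps[selected].mean(axis=0), int(selected.sum())
```

Calling the function once with the upper tercile class and once with the lower tercile class gives the two families of maps described in the caption.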
This study uses output from the Community Earth System Model Version 2 Large Ensemble (CESM2-LE) (Danabasoglu et al 2020, Rodgers et al 2021). The CESM2-LE is a collection of 100 ensemble members of CESM2 run under the specified historical forcing for the Coupled Model Intercomparison Project Phase 6 (CMIP6; Eyring et al 2016) for years 1850-2014, and the SSP3-7.0 future radiative forcing for 2015-2100 (O'Neill et al 2016). Note that ensemble members 1-50 use the biomass burning specified for CMIP6, whereas members 51-100 use a smoothed version, which affects end-of-century warming

Earth Syst. Dyn. 11 491-508
Mankin J S, Lehner F, Coats S and McKinnon K A 2020 The value of initial condition large ensembles to robust adaptation decision-making Earths Future 8 e2012EF001610
Mantua N J, Hare S R, Zhang Y, Wallace J M and Francis R C 1997 A Pacific interdecadal climate oscillation with impacts on salmon production Bull. Am. Meteorol. Soc. 78 1069-80
Mariotti A et al 2020 Windows of opportunity for skillful forecasts subseasonal to seasonal and beyond Bull. Am. Meteorol. Soc. 101 E608-25
Mayer K J and Barnes E A 2021 Subseasonal forecasts of opportunity identified by an explainable neural network Geophys. Res. Lett. 48 e2020GL092092
McGovern A, Lagerquist R, Gagne D J, Eli Jergensen G, Elmore K L, Homeyer C R and Smith T 2019 Making the black box more transparent: understanding the physical implications of machine learning Bull. Am. Meteorol. Soc. 100 2175-99
Meehl G A et al 2021 Initialized Earth system prediction from subseasonal to decadal timescales Nat. Rev. Earth Environ. 2 340-57
Meehl G A, Hu A and Teng H 2016 Initialized decadal prediction for transition to positive phase of the interdecadal Pacific oscillation Nat. Commun. 7 11718
Merryfield W J et al 2020 Current and emerging developments in subseasonal to decadal prediction Bull. Am. Meteorol. Soc. 101 E869-96
Newman M et al 2016 The Pacific decadal oscillation, revisited J. Clim. 29 4399-427
O'Neill B C et al 2016 The scenario model intercomparison project (ScenarioMIP) for CMIP6 Geosci. Model Dev. 9 3461-82
Planton Y, Vialard J, Guilyardi E, Lengaigne M and Izumo T 2018 Western Pacific oceanic heat content: a better predictor of La Niña than of El Niño Geophys. Res. Lett. 45 9824-33
Power S et al 2021 Decadal climate variability in the tropical Pacific: characteristics, causes, predictability and prospects Science 374 eaay9165
Rodgers K B et al 2021 Ubiquity of human-induced changes in climate variability Earth Syst. Dyn. 12 1393-411
Shao W, Garand J C, Keim B D and Hamilton L C 2016 Science, scientists and local weather: understanding mass perceptions of global warming Soc. Sci. Q. 97 1023-57
Simpson I R, Yeager S G, McKinnon K A and Deser C 2019 Decadal predictability of late winter precipitation in Western Europe through an ocean-jet stream connection Nat. Geosci. 12 613-9
Smith D M et al 2019 Robust skill of decadal climate predictions npj Clim. Atmos. Sci. 2 1-10
Solaraju-Murali B, Gonzalez-Reviriego N, Caron L-P, Ceglar A, Toreti A, Zampieri M, Bretonnière P-A, Samsó Cabré M and Doblas-Reyes F J 2021 Multi-annual prediction of drought and heat stress to support decision making in the wheat sector npj Clim. Atmos. Sci. 4 1-9
Sun C, Liu Y, Xue J, Kucharski F, Li J and Li X 2021 The importance of inter-basin atmosp otprint of Atlantic multidecadal oscillation over Western Pacific Clim. Dyn. 57 239-52
Trenberth K E and Fasullo J T 2013 An apparent hiatus in global warming? Earths Future 1 19-32
Trenberth K E and Shea D J 2006 Atlantic hurricanes and natural variability in 2005 Geophys. Res. Lett. 33 L12704
Yeager S G et al 2018 Predicting near-term changes in the Earth system: a large ensemble of initialized decadal prediction simulations using the Community Earth System Model Bull. Am. Meteorol. Soc. 99 1867-86