High predictability of terrestrial carbon fluxes from an initialized decadal prediction system

Interannual variations in the flux of carbon dioxide (CO2) between the land surface and the atmosphere are the dominant component of interannual variations in the atmospheric CO2 growth rate. Here, we investigate the potential to predict variations in these terrestrial carbon fluxes 1–10 years in advance using a novel set of retrospective decadal forecasts of an Earth system model. We demonstrate that globally-integrated net ecosystem production (NEP) exhibits high potential predictability for 2 years following forecast initialization. This predictability exceeds that from a persistence or uninitialized forecast conducted with the same Earth system model. The potential predictability in NEP derives mainly from high predictability in ecosystem respiration, which itself is driven by vegetation carbon and soil moisture initialization. Our findings unlock the potential to forecast the terrestrial ecosystem in a changing environment.


Introduction
Excess carbon dioxide (CO 2 ) in the atmosphere is derived primarily from fossil fuel sources (Le Quéré et al 2018a), and is the largest contributor to anthropogenic global warming to date (Myhre and Shindell 2013).The governments participating in the 2015 Paris Climate Agreement pledged to prevent dangerous anthropogenic interference with the climate system by collectively reducing greenhouse gas pollution (United Nations Framework Convention on Climate Change 2015).However, verification of these emission reductions from atmospheric CO 2 observations is hampered by our inability to accurately quantify and predict carbon absorption by the land and ocean (Le Quéré et al 2009).
The growth rate in atmospheric CO 2 is highly changeable from year to year and is dominated by variations in the flux of CO 2 between the terrestrial biosphere and the atmosphere (Ciais andSabine 2013, Le Quéré et al 2018a).The net CO 2 flux from atmosphere to land is comprised of a large positive flux due to plant photosynthesis (gross primary production, GPP), a large negative flux due to plant and soil respiration (ecosystem respiration, ER), and smaller negative fluxes due to land use and disturbance (Ciais and Sabine 2013).Both GPP and ER are highly sensitive to changes in the Earth system, such as internal variability imposed by the El Niño-Southern Oscillation (ENSO; Rayner et al 1999, Jones et al 2001) or external variability imposed by volcanic eruptions (Frölicher et al 2011), and increasing atmospheric CO 2 concentration (Zhu et al 2016).
Accurate predictions of near-term terrestrial carbon fluxes are of interest to those forecasting the growth of atmospheric CO 2 for budgeting or emissions management purposes (e.g.Le Quéré et al 2018a).Seasonal forecasts of atmospheric CO 2 concentration and land-atmosphere carbon fluxes show predictive skill at 6-9 month lead times, driven by predictability in the state of ENSO (Zeng et al 2008, Betts et al 2016).Indeed, previous iterations of the Global Carbon Budget (e.g.Le Quéré et al 2018b) generated seasonal atmospheric CO 2 growth rate projections based on the expected state of ENSO.
While the intrinsic predictability of terrestrial ecosystems on seasonal timescales is relatively high, predictability on interannual timescales is less wellknown (Luo et al 2015).Recently, Séférian et al (2018) analyzed interannual predictability in land carbon uptake using a perfect model framework; their results suggest a 2 year prediction horizon with highest predictability in the northern hemisphere extratropics.However, their investigation focused on the potential predictability of ocean versus land carbon fluxes, and did not include a detailed quantification of the drivers of terrestrial carbon predictability.Their perfect model framework also excluded the role of external forcing in generating terrestrial carbon flux predictability.Further studies on this topic are thus warranted.
Here, we investigate the potential to predict nearterm (annual to decadal) variations in land carbon uptake by analyzing retrospective decadal forecasts generated by an Earth system model.We report on the potential predictability in terrestrial biosphere carbon fluxes, which represents the theoretical limit of predictive skill in optimum conditions (e.g.initial conditions for all state variables are known at every global location; (Meehl et al 2014)).Due to the way our prediction system was initialized, we are unable to assess the true skill of our predictions as compared to historical, real world observations.To overcome this, we compare the statistics of our modeled ecosystem state variables to those generated by a simulation of our land model with historically observed forcing.
Our study makes three primary contributions.First, we demonstrate high near-term predictability in net ecosystem production (NEP), the sum of the two large terrestrial carbon fluxes GPP and ER, and the dominant source of variability in the atmospheric CO 2 growth rate.Second, we quantify the role of persistence and external forcing in this predictability.Finally, we demonstrate that initialization of vegetation carbon and soil moisture is key to the successful prediction of land carbon uptake on interannual to decadal timescales.

Methods
We make use of decadal forecasts generated by the Community Earth System Model (CESM) version 1, a state-of-the-art coupled climate model consisting of atmosphere, ocean, land, and sea ice component models (Hurrell et al 2013).The CESM Decadal Prediction Large Ensemble (CESM-DPLE; Yeager et al 2018) consists of a set of initialized, coupled integrations of CESM1 that build on previous CESM decadal prediction efforts (Yeager et al 2012(Yeager et al , 2015)), and the CESM Large Ensemble effort (CESM-LE).CESM-LE simulates the period 1920-2100 40 times under historical and RCP8.5 external forcing (Kay et al 2015).Each CESM-LE ensemble member is exposed to an identical emission scenario but starts from a slightly different atmospheric state, so that the ensemblemean cleanly captures the response of the modeled Earth system to external forcing and averages across various representations of internal variability (Deser et al 2012).
CESM-DPLE advances previous decadal prediction efforts at NCAR with the novel addition of prognostic land and ocean biogeochemistry (Yeager et al 2018).The land component of CESM1 is the Community Land Model version 4 (CLM4) that includes vegetation physiology, phenology, and active carbon and nitrogen dynamics that allow the simulation of global pools and fluxes of carbon (e.g.productivity, decomposition, etc; Lawrence et al 2012).CESM-DPLE comprises 40 decade-long forecasts of the Earth system each year from 1954 to 2017 with ensemble spread generated by round-off level (order 10 −14 ) differences in initial air temperature (figure 1); the start date for each forecast ensemble is 1 November.The CESM-DPLE initialization procedure ensures that each forecast ensemble member exhibits different internal variability, while exposing all forecast ensemble members to identical external forcing (e.g.greenhouse gas concentrations from 1954 to 2017).
Initial conditions for the land and atmosphere components of CESM-DPLE are obtained from a single ensemble member (specifically, ensemble member #34; figure 1) of the CESM-LE.Ocean and sea ice initial conditions for the CESM-DPLE are generated from a forced ocean-sea ice reconstruction of CESM over 1948-2017 (Yeager et al 2018).This full-field initialization procedure generates drift in the coupled CESM over the course of each decadal forecast that requires a drift-correction to be applied to the model forecasts before predictability may be analyzed (Meehl et al 2014).We remove the drift by transforming to anomalies from a drifting climatology, as in Yeager et al (2018) and Lovenduski et al (2019).For a given forecast, X(L, M, S), where L is the forecast length, M is the ensemble member, and S is the start year of the forecast, the drift-corrected forecast anomaly, X′(L, M, S) is defined as where is the average rate of drift over all forecasts.The large number of ensemble members in each forecast ensures a statistically robust calculation of the drifting climatology (Kirtman et al 2013).The drift-corrected forecast anomalies are added to the climatological mean value from CESM-LE member #34 for display in figure 1.
We quantify predictability as the correlation coefficient of a particular variable (e.g.NEP) between two annual-mean anomaly time-series: (1) CESM-LE member #34, and (2) the CESM-DPLE forecast ensemble mean for a given lead year (e.g.see figure S1 is available online at stacks.iop.org/ERL/14/124074/mmedia).This correlation coefficient is a deterministic metric to gauge the role of initialization in the forecast; it provides a measure of the relative association of the average forecast with the simulation from which it was initialized (Goddard et al 2013).Predictability is a distinct concept from prediction skill, which is the ability of the forecast ensemble mean to predict the observed evolution of the system.The CESM-DPLE does not permit an evaluation of prediction skill for terrestrial carbon fluxes, as the land component is initialized from a CESM-LE ensemble member with a chronological sequence of internal variability that differs from the observational record.
To assess the impact of external forcing on predictability, we first calculate the predictability of the uninitialized forecast as the correlation coefficient between anomalies from (1) CESM-LE member #34, and (2) the CESM-LE ensemble mean over ensemble members #1-33.The CESM-LE ensemble mean provides a distribution of terrestrial carbon fluxes produced under the same external forcing as CESM-DPLE from 1954 to 2017.We then compare the predictability of the initialized CESM-DPLE with that of the uninitialized forecast to assess the gain in predictability due to initialization (quantified as the difference in correlation coefficients).
The persistence forecast is generated by calculating the lagged autocorrelation of the annual-mean timeseries from CESM-LE member #34.
Statistical significance of predictability is determined two ways: (1) predictability is statistically different from zero if the anomaly correlation coefficient passes a two-sided Student's t test at the 95% level while accounting for autocorrelation in the sample, and (2) initialized predictability is statistically different from persistence or uninitialized predictability if the z test statistic exceeds the value for a 95% confidence interval.
We use the mean absolute error (MAE) statistic to quantify the average of absolute errors between the prediction anomaly time series ( ¢ y t ) and the CESM-LE member #34 anomaly time series ( ¢ where a MAE of 0 represents a perfect forecast and MAE increases as accuracy decreases. We evaluate the realism of our initial conditions (i.e.CESM-LE member #34) by comparing their statistical properties with a historical reconstruction.This reconstruction is a land-only simulation of the CLM4 that has been forced with the Global Soil Wetness Project (GWSP) version 3 reanalysis over 1850-2014 (Bonan et al 2019).Hereafter, we refer to this simulation as the 'GSWP reconstruction'.

Results and discussion
The predictability in globally-integrated NEP from the CESM-DPLE initialized forecast is high and statistically significant in forecast lead years 1 and 2 (r=0.69,r=0.56)and lower for longer prediction lead times as the impact of initialization is lost (figure 2(a), pink line).Predictability is even higher when averaged over forecast years 1-3 (r=0.75,not shown).In forecast lead year 1, the CESM-DPLE is able to predict nearly half of the variance in globallyintegrated NEP, with a MAE of 0.5 Pg C yr −1 .This high correlation and low error is remarkable, given the highly variable nature of the NEP time series (figure   predictability showcase key regions where reliable forecasts could be made up to 3 years in advance. We contextualize the initialized predictability of NEP by comparing to two other types of forecasts: (1) a simple persistence forecast, which gives an indication of how NEP over from one year to the next, and (2) an uninitialized forecast, which has identical external forcing, but is not initialized from CESM-LE member #34.As our persistence forecast for globallyintegrated NEP is based on the autocorrelation of a highly variable time-series (figure S1), it exhibits generally low predictability for all lead years (figure 2(a)).The persistence forecast for lead year 1 of the globally integrated NEP captures less than 4% of the variance (figure 2(a)) and its MAE is 0.95 Pg C yr −1 (not shown), nearly twice that of the initialized forecast.For the global integral and in most locations across the global land surface, we find that the initialized forecast exhibits higher predictability than the persistence forecast, and this gain is generally maintained with increasing forecast lead time (figures 2(a); 3(d)-(f)).The initialized forecast predictability exceeds that of the persistence forecast for all lead years, with the exception of forecast lead year 4, and is statistically separable from the persistence forecast for the first two forecast lead years (figure 2(a)).This gives us confidence that the initialized forecast system is performing better than a persistence forecast.
We observe a long-term change in the globallyintegrated NEP from 1955 to 2014 in both CESM-LE member #34 and the forecast ensemble mean (figure S1).This trend is likely driven by external factors, such as rising atmospheric CO 2 concentration and increasing temperature.Thus, it is of interest to know how much of our predictability is driven by this external forcing and how much additional predictability is gained by initialization.We account for the role of external forcing in our estimates of NEP predictability by analyzing the ensemble mean of CESM-LE members #1-33.Figure 2(a) demonstrates that the uninitialized forecast of globally-integrated NEP has a rather low correlation with that of member #34 (r=0.38).The initialized predictability surpasses that of the uninitialized forecast for all lead years with the exception of lead year 4, and is statistically separable from the uninitialized forecast in lead year 1 (figure 2(a)).We find a large gain in predictability most of the global land surface in the initialized forecast relative to the uninitialized forecast that decays with prediction lead time (figures 3(g)-(i)).In general, our results suggest that external forcing plays only a minor role in the predictability of NEP.
If not persistence or external forcing, what is driving the high predictability in NEP?As NEP is the sum of GPP and ER, two large and opposing fluxes, we next investigate the predictability in each of these quantities.We note, however, that both GPP and ER are highly sensitive to external forcing; the former is strongly affected by long-term increases in atmospheric CO 2 and changes in temperature and precipitation, while the latter is impacted by long-term increases in surface temperature and changes in soil moisture.To account for this, we show in figure 2(b) the gain in predictability from the initialized forecasts over the uninitialized forecasts of GPP and ER.The gain from initialization is an order of magnitude smaller for GPP and ER than for NEP, likely due to the important role of external forcing for GPP and ER and the opposing influences of these two large fluxes on NEP.The benefit of forecast initialization is nearly twice as large for ER as for GPP for the global integral (figure 2(b)), with large gains in ER predictability on nearly every continent (figure S2).The gain in predictability for ER is maintained for three forecast lead years (figure 2(b)).These findings suggest a large role of ER in determining the high initialized predictability in NEP.
As the predictability of ER and thus of NEP is enhanced by initialization, we next explore the key processes controlling ER in our model.ER is the sum of autotrophic respiration (AR) from plants and heterotrophic respiration (HR) from soil. Figure 2(b) shows that globally-integrated AR and HR both exhibit long-lasting gains in predictability due to initialization, with larger gains for AR than HR.While AR and HR are dependent upon multiple variables in our model, we focus our analysis on the variability in vegetation carbon and surface air temperature for AR, and soil carbon, soil temperature, and soil moisture for HR.We aim to identify the most important variables to initialize to provide reliable, long-lasting prediction of ER/NEP.We quantify the standard deviation in these quantities (e.g.vegetation carbon, σ vegc ) from one November to the next in CESM-LE member #34, as this is the month when the forecasts are initialized.So as to assess the importance of this variability for respiration, we scale this standard deviation by the sensitivity of respiration to the variable being considered s = ´ ¶ ¶vegc scaled variation AR , 3 where airt is air temperature.We find a much larger scaled variation in November vegetation carbon than for air temperature (figure 4(a)); and a much larger scaled variation in November soil moisture than for soil carbon or soil temperature (figure 4(b)).Our results suggest that the initialization of vegetation carbon and soil moisture in our forecasts is key to generating high predictability in ER, and thus NEP.Air temperature, soil carbon, and soil temperature exhibit low variability from one November to the next, and contribute little to the predictability in ER.Further, the relative magnitude of variation these quantities in CESM-LE member #34 is similar to a reconstruction of these quantities over 1955-2014 using Global Soil Wetness Project reanalysis forcing (figure 4), giving us confidence that vegetation carbon contributes more variance than air temperature to AR and that soil moisture contributes more variance than soil carbon or temperature to HR.Thus, results from the variation analysis suggest that the key to successful prediction of ER/NEP in our prediction system is the initialization of vegetation carbon and soil moisture.While promising, our results come with several caveats.Strictly speaking, the net land-atmosphere CO 2 flux includes not only GPP and ER, but also smaller contributions from disturbance and land use.While these latter two processes may provide some predictability in the real world, we are unable to explore their predictability in our prediction system.Certain regions have been shown to exhibit high predictability of disturbance on seasonal timescales (e.g. the high likelihood of fire in Indonesia during El Niño events), suggesting that disturbance may also be predictable on interannual timescales.Future studies should explore this aspect of predictability in more detail.Further, while the results of our study suggest a high potential to predict NEP, our prediction system is not configured to evaluate the true skill of these predictions in the real world.Prediction skill could be assessed in a future prediction study if the NEP forecasts are initialized from, e.g. the GWSP reconstruction.Future studies should also investigate the sensitivity of terrestrial predictability to the month/ season in which the forecast is initialized.Finally, no terrestrial carbon cycle model is perfect, and thus the results of our study need to be interpreted with caution.Other terrestrial models, and perhaps even various versions of the same model, may represent terrestrial processes differently.The Community Land Model, for example, exhibits different sensitivity to meteorological forcings among versions 4 (used in this study) and versions 4.5 and 5 (Bonan et al 2019).
In spite of these caveats, the results of our predictability study suggest high interannual predictability in terrestrial carbon fluxes that is significantly enhanced by forecast initialization of vegetation carbon and soil moisture.A perfect model study using a different Earth system model suggests that land carbon uptake has a 2 year predictability horizon (Séférian et al 2018), which is very comparable to the high predictability of NEP for lead years 1 and 2 that we find here.We find only a small role for external forcing in near-term predictability of NEP, in agreement with many other modeling studies performed on century timescales (e.g.Friedlingstein et al 2006, Arora et al 2013, Jones et al 2013, Hoffman et al 2014, Hewitt et al 2016, Lovenduski and Bonan 2017).Our assessment of the drivers of terrestrial carbon flux predictability indicate that ER is more predictable than GPP, and further indicate an important role for vegetation carbon and soil moisture in predictability.Vegetation carbon is regularly measured via forest inventory analysis (e.g.Goodale et al 2002) and can be estimated from satellite greenness (e.g.(Myneni et al 2001)).Soil moisture 'memory' has previously been indicated in many modeling studies as a source of predictability in the physical climate system at subseasonal to seasonal timescales (National Academies of Sciences, Engineering, and Medicine 2016), and at least one study suggests that this memory can persist for 1-2 years in the ecosystem (Schimel et al 2005).Our findings suggest that consistent observations of vegetation carbon and soil moisture could go a long way toward improving our prediction capabilities of NEP in the real world.Finally, ours is one of the first studies to generate and investigate near-term forecasts of the terrestrial biosphere ecosystem using an Earth system model.The oceanography community is deeply considering the utility of such forecasts for marine ecosystem management (Bonan and Doney 2018); the terrestrial biogeochemistry community could soon follow suit.

Figure 1 .
Figure 1.Annual-mean, globally integrated net ecosystem production (Pg C yr −1 ) for the (black) CESM-LE member #34, and (pink) CESM-DPLE decadal forecasts initiated in 1960, 1980, and 2000 (other forecasts omitted for visual clarity).The thick magenta line represents the ensemble mean forecast; open circles show the ensemble mean in forecast year 1.Positive fluxes denote land uptake.Forecasts have been drift-corrected and adjusted to match the climatological mean value from CESM-LE member #34 for ease of visual comparison.

Figure 2 .
Figure 2. (a) Predictability of globally-integrated net ecosystem production as a function of lead time, as indicated by the correlation of anomalies from the (pink) CESM-DPLE initialized forecast and the (blue) CESM-LE uninitialized forecast (ensemble mean of members 1-33) with CESM-LE member #34.The black dashed line indicates the correlation of the persistence forecast as a function of lead time.Blue dots (black circles) on the initialized forecast indicate predictability that is statistically different from the uninitialized (persistence) forecast at the 95% level using a z test.(b) Gain in predictability of initialized (green) gross primary production, (orange) ecosystem respiration, (red) autotrophic respiration, and (yellow) heterotrophic respiration forecast over the corresponding uninitialized forecast, as indicated by the difference in the anomaly correlation coefficients as a function of lead time.

Figure 3 .
Figure 3. (First row) Predictability of net ecosystem production, as indicated by the correlation coefficient of the CESM-DPLE initialized forecast with member #34.(Second row) The gain in predictability (difference in correlation coefficients) from the initialized forecast relative to the persistence forecast.(Third row) The gain in predictability (difference of correlation coefficients) from the initialized forecast relative to the uninitialized forecast.First column corresponds to forecast lead year 1, second column to forecast lead year 2, and third column to forecast lead year 3.
AR is determined via regression of annualmean, detrended time series from CESM-LE member #34.We then normalize the resulting scaled variations to compare across AR and HR, which have different