Diagnosing atmosphere–land feedbacks in CMIP5 climate models

Human-made transformations to the environment, and in particular the land surface, are having a large impact on the distribution (in both time and space) of rainfall, upon which all life is reliant. Focusing on precipitation, soil moisture and near-surface temperature, we compare data from Phase 5 of the Coupled Model Intercomparison Project (CMIP5), as well as blended observational–satellite data, to see how the interaction between rainfall and the land surface differs (or agrees) between the models and reality, at daily timescales. As expected, the results suggest a strong positive relationship between precipitation and soil moisture when precipitation leads and is concurrent with soil moisture estimates, for the tropics as a whole. Conversely, a negative relationship is shown when soil moisture leads rainfall by a day or more. A weak positive relationship between precipitation and temperature is shown when either leads by one day, whereas a weak negative relationship is shown over the same time period between soil moisture and temperature. Temporally, in terms of lag and lead relationships, the models appear to be in agreement on the overall patterns of correlation between rainfall and soil moisture. However, in terms of spatial patterns, a comparison of these relationships across all available models reveals considerable variability in the ability of the models to reproduce the correlations between precipitation and soil moisture. There is also a difference in the timings of the correlations, with some models showing the highest positive correlations when precipitation leads soil moisture by one day. Finally, the results suggest that there are ‘hotspots’ of high linear gradients between precipitation and soil moisture, corresponding to regions experiencing heavy rainfall. These results point to an inability of the CMIP5 models to simulate a positive feedback between soil moisture and precipitation at daily timescales.
Longer timescale comparisons and experiments at higher spatial resolutions, where the impact of the spatial heterogeneity of rainfall on the initiation of convection and supply of moisture is included, would be expected to improve process understanding further.


Introduction
Improving our knowledge of the coupling between the land surface and atmosphere is important to further our understanding of the influence of land use and land cover changes on the climate, and to understand any related feedback processes that may amplify or dampen the effects of future climate change. From global changes in the composition of the atmosphere, through the emission of greenhouse gases and aerosols, to more localized land use and land cover changes due to an expanding population with an increasing ecological footprint, human activity has a considerable impact on the processes controlling rainfall. A number of modelling studies have suggested that a strong positive feedback mechanism between the land surface and the atmosphere drove sudden climatic shifts over the Saharan region during the middle Holocene period (Patricola and Cook 2008, Liu et al 2010, Krinner et al 2012). During the present day climate, observations over monsoon regions have revealed a feedback between moisture and advection that dominates the seasonal heat balance and might act as a positive feedback mechanism, leading to abrupt regional climate changes in response to relatively weak external perturbations (Leverman et al 2009).
Much of the past research on land-atmosphere feedbacks has focused on the relationship between precipitation and soil moisture. Specifically, it has been shown in a model environment that under certain conditions wet (dry) soil enhances (suppresses) the generation of precipitation by influencing the amount of evapotranspiration, which in turn impacts both local-scale convection and the large-scale atmospheric circulation (Wang et al 2007, and references therein). This positive feedback is maintained by the increased precipitation, which further enhances the original wet soil anomaly. Furthermore other modelling studies, such as Koster et al (2004), have shown that the strength of the coupling between precipitation and the land surface demonstrates a strong spatial dependency, with the strongest coupling in transition zones between either arid to semi-arid conditions or semi-humid forest to grassland conditions, such as the central Great Plains of North America, the African Sahel and equatorial Africa and India according to Koster et al (2004), and central Eurasia, southwest China, southeast Asia, the Sahel and equatorial West Africa according to Zhang et al (2008).
However, some findings have also suggested the existence of a negative feedback between precipitation and soil moisture, whereby dry soils cause an increase in boundary layer clouds, resulting in an increase in precipitation received at the surface (Zhang et al 2008, Zhang and. Alternatively, Pan et al (1996) suggested that increases in soil moisture might reduce precipitation even when the atmosphere is sufficiently humid but lacks enough thermal forcing to initiate deep convection. This may be exacerbated by suppression of surface heating by the additional soil moisture (as excess energy is used in evaporation rather than heating).
The experiments undertaken as part of Phase 5 of the Coupled Model Intercomparison Project (CMIP5, see Taylor et al (2009, 2012) for full details) present a unique opportunity to study the simulation of the land-atmospheric coupling across a range of models, their ensemble members and varying emission scenarios. In particular in this letter we explore the lag-lead relationship between soil moisture and precipitation as represented by the multi-model outputs at daily timescales for the tropics. The focus on the tropics is due to the presence of the transition zones identified above, where the strongest couplings are noted to exist. The concentration on daily timescales allows a comparison of model behaviour in terms of how the models simulate the localized effect of soil moisture on rainfall before the large-scale circulation changes can develop. Rather than isolating the precise feedback or coupling mechanisms, as in Koster et al (2004) or Taylor et al (2011), here we seek to explore the basic relationships between key variables in models and observations. At its very simplest level, model simulations of the impact of rainfall on soil moisture are dependent on the treatment of runoff and infiltration in the surface component of the various climate models and the land cover parameterization of the impact of vegetation on evapotranspiration. Conversely, model simulations of the impact of soil moisture on rainfall are related to how the processes involving fluxes of latent and sensible heat are represented and their relationship to the initiation of convection.

Observational and model data
Widespread observational data on soil moisture are notoriously difficult to gather. Data on observed precipitation patterns are also fraught with uncertainties, due to poor global coverage of rain gauges over many parts of the tropics (Washington et al 2006). A number of satellite-based rainfall datasets exist at a range of temporal scales, although at daily timescales many of these are either based on indirect measures of rainfall such as cloud top properties (in the case of infrared-based datasets) or comprised of infrequent observations (as is the case of active microwave-based datasets). For the purposes of this comparison, observational rainfall data come from the Global Precipitation Climatology Project (GPCP) 1DD v1.1, which is a merged product containing both satellite rainfall estimates (from microwave and infrared measurements) and rain gauge observations, producing daily data on a 1° spatial grid (Huffman et al 2001). Likewise, 'observational' soil moisture data are taken from ERA-Interim, the latest global atmospheric reanalysis product from the European Centre for Medium-Range Weather Forecasts (ECMWF), which includes both simulated and assimilated data as input; we utilize output provided on a 1.5° spatial grid (Dee et al 2011).
However, strictly speaking neither of these datasets can be considered as direct observations of the different parameters and are only used in a comparative rather than validation role; ERA-Interim, in particular, should be considered as a model that is constrained by the assimilation of observations to varying degrees, depending upon the variable that is considered (for example atmospheric temperature and humidity profiles are constrained by satellite retrievals, spatially scattered radiosonde profiles and ground-based meteorological measurements). For soil moisture, this is influenced by screen-level observations of both temperature and humidity (Dee et al 2011). In contrast, precipitation is more dependent on the model convection scheme which, according to Dee et al (2011), has been improved to give a more realistic representation relative to previous versions of the reanalysis.
In this study, 10 individual CMIP5 models were used, plus ERA-Interim. Although more models are included in CMIP5, some did not supply the relevant data or were not yet available. Table 1 shows the full list of the models used in this study, along with their corresponding institutes and spatial resolutions.

Methodology
Methodologically the paper uses a simple diagnostic of the relationship between daily rainfall (in mm/day), soil moisture (in kg m⁻²) and near-surface air temperature (in K), as represented by the significant (at the 95% level) correlation and regression coefficients of both cotemporal and lagged/leading rainfall to soil moisture, rainfall to temperature and temperature to soil moisture. These three variable sets are herein referred to as dP/dM, dP/dT and dM/dT, respectively. For ERA-Interim, we separated out convective and total rainfall and conducted the correlations on each individually, to investigate whether the link between precipitation and soil moisture is stronger with locally-generated convection compared to large-scale dynamic rainfall. Although the CMIP5 experiments require soil moisture to be provided for the upper 0.1 m of soil, it should be noted that the definition of soil moisture varies slightly between models, referred to as 'soil moisture', 'volumetric soil water layer 1', 'moisture in upper portion of soil column' and 'moisture in upper 0.1 m of soil column'.
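As an illustration only, the screening of correlations at the 95% level could be sketched as follows. The text does not state which significance test was applied, so the Fisher z-transform with a two-sided normal critical value of 1.96 is an assumption made here, as is the function name `significant_correlation`. Daily series are usually autocorrelated, so the effective sample size may be smaller than the number of days and this simple test is likely to be optimistic.

```python
import math

import numpy as np


def significant_correlation(x, y, z_crit=1.96):
    """Return (r, is_significant) for two equal-length 1-D series.

    Assumption: significance is judged with the Fisher z-transform,
    treating all samples as independent (not stated in the text).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    r = float(np.corrcoef(x, y)[0, 1])
    # Clip to keep atanh finite when the correlation is exactly +/-1.
    r_clipped = min(max(r, -0.999999), 0.999999)
    # Under the null hypothesis, atanh(r) * sqrt(n - 3) is roughly N(0, 1).
    z = math.atanh(r_clipped) * math.sqrt(n - 3)
    return r, bool(abs(z) > z_crit)
```

Applied at each grid point, only coefficients flagged as significant would be retained for mapping.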
At the daily timescale, the relationships were based on anomalies calculated by subtracting a 10-day running mean from the daily values. To distil the information for the purposes of this letter, for the lagged/leading relationships the variables were first compared at each grid point and at each lag-lead combination (from +10 to −10 days where, for example, at day +5 rainfall would lead soil moisture by 5 days), then these correlations were averaged over the tropics and plotted. For the regression coefficients, the gradients from each model were averaged (at a common spatial resolution) to show spatial maps of the model means, with stippling to show certain levels of agreement between models.
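The anomaly and lag-lead steps described above can be sketched for a single grid point as follows. This is a minimal sketch under stated assumptions: the running mean is taken as a centred window that shrinks at the ends of the series (the text does not specify the window alignment or edge treatment), and the function names are hypothetical. The sign convention follows the text: a positive lag means rainfall leads soil moisture.

```python
import numpy as np


def running_mean(x, window=10):
    """Centred running mean; the window shrinks at the series edges (assumption)."""
    kernel = np.ones(window)
    total = np.convolve(x, kernel, mode="same")
    count = np.convolve(np.ones_like(x), kernel, mode="same")
    return total / count


def anomalies(x, window=10):
    """Daily anomalies: daily values minus the running mean."""
    x = np.asarray(x, dtype=float)
    return x - running_mean(x, window)


def lag_lead_correlation(p, m, max_lag=10):
    """Correlate p(t) with m(t + lag) for lag = -max_lag .. +max_lag.

    Positive lag: p (rainfall) leads m (soil moisture) by `lag` days.
    Negative lag: m leads p by |lag| days.
    """
    p = np.asarray(p, dtype=float)
    m = np.asarray(m, dtype=float)
    n = p.size
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = p[: n - lag], m[lag:]
        else:
            a, b = p[-lag:], m[: n + lag]
        result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result
```

Repeating this at every grid point and averaging the resulting coefficients over the tropics would give one curve per model of the kind plotted against lag.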

Cotemporal and lag-lead correlations
An example of the daily cotemporal significant (at the 95% level) correlations between dP/dM can be seen in figure 1, for a selected season from two models, namely ERA-Interim and CanESM2. These two models were chosen to show the two ends of the spectrum in how the models reproduce the magnitude and spatial extent of this positive relationship. There is clearly a large difference between ERA-Interim, which shows high positive significant correlations everywhere in the tropics, and CanESM2 where this pattern, although simulated, is much more restricted in spatial extent (figures 1(a) and (b), respectively). For CanESM2, this may be because of two reasons: (i) the relationships may well be weaker in this model because of its own particular setup and parameterizations; and/or (ii) the model may have less variability than the others meaning the correlations are less easy to distinguish. This is beyond the scope of the current letter, but merits further investigation. All of the other models fall somewhere in between these two extremes, and this large spread in terms of the models' ability to reproduce the spatial extent of the relationships is true for all other seasons and for the other two variable sets. In general, all models show weak negative correlations between dP/dT over equatorial (and in particular oceanic) regions, and stronger negative correlations everywhere between dM/dT (not shown).
To provide information on the cause and effect relationships between variables, the daily lag-lead correlations, averaged over the tropics, for the three variable sets are shown in figure 2, along with lag-lead correlations of precipitation against itself. For this latter plot, it is clear that all the models and the observational data (including both convective and total rainfall) agree on a negative correlation from day +10 onwards, becoming (as expected) a perfect positive correlation at day 0 (figure 2(a)). For the other three variable sets, several points are noteworthy. Firstly, the bulk of the models show a similar shape in their lag-lead relationships; a positive relationship when rainfall, be it convective or total, leads (as expected) i.e. increased rainfall leading to increased soil moisture, and a negative relationship when soil moisture leads i.e. increased soil moisture (after the rainfall occurs) reducing surface heating and leading to reduced rainfall 2 days later (figure 2(b)). As an example of this, figure 3 shows spatial maps of lag-lead correlations from one model (HadGEM2-ES) at days +1 and −2 (i.e. P leads M by 1 day, and M leads P by 2 days, respectively). Positive (negative) correlations exist everywhere when P leads M (M leads P), and this is representative of the majority of the other models.
Similar shapes between the models in the lag-lead relationships of dP/dT and dM/dT are also shown, with nearly all models agreeing on a weak positive relationship at day 0 and +1 for dP/dT and a weak negative relationship at day +1, 0 and −1 for dM/dT (figures 2(c) and (d), respectively). The exception to this is for the ERA-Interim convective dP/dT relationship which, unlike total rainfall, shows a weak negative relationship at day 0 and +1 (figure 2(c)). INMCM produces a slightly different lag-lead relationship for dM/dT to the other models (negative coupling is maximum before day 0 indicating a lag between increased soil moisture and reduced temperatures over the next 2 days). It should be noted that for these plots, one of the models (BCC-CSM1-1) has been omitted due to corrupted temperature data. These results suggest that the models are successfully simulating the effect of increased temperatures on stimulating some convection, while increasing evapotranspiration and thus reducing soil moisture. However, they do not provide any modelled evidence for 'local' atmospheric moisture (supplied by soil moisture) initiating convection, at a daily timescale, at the current global climate model spatial scale. In order to remove the effect of averaging over such a large area, the lag-lead correlations were also calculated at a regional level at six zones across the tropics; however, the lag-lead relationships between the three variable sets were almost identical to the tropics-wide relationships (not shown). Nevertheless, the apparent tropics-wide relationships disguise a more heterogeneous picture regionally, discussed further in section 3.2.
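For illustration, a tropics-wide or regional average of a correlation map could be formed as in the sketch below. Several details are assumptions rather than statements from the text: the cosine-of-latitude area weighting, the default latitude band of 30°S to 30°N, and the function name `area_weighted_mean`. Grid points with missing (non-significant) correlations are assumed to be stored as NaN and are skipped.

```python
import numpy as np


def area_weighted_mean(field, lats, lat_min=-30.0, lat_max=30.0):
    """Average a (lat, lon) map over a latitude band with cos(latitude) weights.

    The band limits and the weighting are illustrative assumptions;
    NaN entries are excluded from the average.
    """
    field = np.asarray(field, dtype=float)
    lats = np.asarray(lats, dtype=float)
    band = (lats >= lat_min) & (lats <= lat_max)
    sub = field[band]
    # One weight per latitude row, repeated across all longitudes.
    weights = np.cos(np.deg2rad(lats[band]))[:, None] * np.ones(field.shape[1])
    valid = np.isfinite(sub)
    weighted_sum = np.sum(np.where(valid, sub, 0.0) * weights)
    return float(weighted_sum / np.sum(weights * valid))
```

A regional zone could be handled in the same way by first slicing the longitude range of interest and passing narrower latitude limits.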
Secondly, despite this similarity between models, there are nevertheless several important differences, particularly in the timings of the strongest dP/dM relationships (figure 2(b)). Here, the majority of models (including 'observational' data) show the strongest correlations at day +1 (figure 3(a)), i.e. when rainfall leads soil moisture by one day, which is to be expected given the slight delay in soil moisture response to rainfall (e.g. if rain falls at the end of the day, soil moisture for the early part of that day may be low). However, two models (namely CNRM-CM5 and, interestingly, ERA-Interim) do not identify this response, showing the maximum correlation at day 0 i.e. cotemporal correlations. This might relate to the diurnal timing of convection, in which maximum soil moisture is shown on the same day when convection occurs early in the day, but is shown the following day when convection occurs late in the day. Further, as well as showing a difference in the timings of the maximum correlations, CNRM-CM5 also shows much weaker correlations than all the other models. A possible reason for this is discussed below.
Also of note in figures 2(b) and 3(b) is the widespread weak negative relationship in most models between M and P at day −2 (M leads P), as well as a very weak negative relationship at day +5 (figure 2(b)). This is suggestive of negative feedbacks operating on timescales longer than 1 day (increased rainfall increases soil moisture over the next 1-2 days, which suppresses temperature over the following days, leading to reduced rainfall after 5 days). Although the correlations are generally not statistically significant, relationships on these longer timescales merit further examination.
Lastly, when 'observational' data are used in the form of GPCP rainfall versus ERA-Interim soil moisture, the relationship is much weaker than most of the other models (figure 2(b)). Although the same temporal evolution is shown, i.e. positive when rainfall leads and negative when soil moisture leads, the correlation coefficients are much weaker, even when compared to ERA-Interim rainfall and soil moisture. This is likely explained by inconsistencies between P from GPCP and ERA-Interim M (due to uncorrelated errors in both). Although model and ERA-Interim rainfall and soil moisture cannot be considered 'realistic' they are at least self-consistent with each other (e.g. model rainfall directly influences model soil moisture through physical parameterizations).

Cotemporal regressions
Figures 4-6 show the model mean gradients of linear fit between dP/dM, dP/dT and dM/dT respectively, for December-February (DJF) and June-August (JJA) seasons. Relationships between variables are quantified as the gradients (b) using the linear regression models: M = bP + a, P = bT + a and M = bT + a respectively. In these plots, the colours show the model mean gradients whereas stippling shows where the inter-model standard deviations are less than 90% of the model mean. It should be noted that in all of these plots, one of the models (namely CNRM-CM5) was excluded from the analysis. This was because, on closer inspection, this model contained approximately 10 times less soil moisture than the others (due to an effectively smaller soil layer), which was skewing both the model means and standard deviations. This might also explain why CNRM-CM5 shows much weaker lag-lead correlations than the other models as described above, when its lower level of soil moisture is compared to rainfall (a thinner surface soil layer will respond more rapidly to changes in rainfall).
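The gradient maps and stippling described above could be computed along the following lines. This is a hedged sketch, not the procedure actually used: the function names are hypothetical, the ordinary least-squares slope is computed independently at each grid point, grid points where the predictor never varies are returned as NaN, and the stippling criterion is interpreted here as the inter-model standard deviation being below 90% of the absolute value of the model mean.

```python
import numpy as np


def gradient_map(x, y):
    """Least-squares slope b of y = b*x + a at every grid point.

    x and y have shape (time, lat, lon); e.g. x = P and y = M for M = bP + a.
    Points where x has zero variance are set to NaN.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xa = x - x.mean(axis=0)
    ya = y - y.mean(axis=0)
    var = (xa ** 2).sum(axis=0)
    cov = (xa * ya).sum(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(var > 0, cov / var, np.nan)


def model_mean_and_stippling(gradients, fraction=0.9):
    """Model-mean gradient and agreement mask from a (model, lat, lon) stack.

    The mask is True where the inter-model standard deviation is less than
    `fraction` times the absolute model mean (an interpretation of the text).
    """
    gradients = np.asarray(gradients, dtype=float)
    mean = gradients.mean(axis=0)
    spread = gradients.std(axis=0)
    return mean, spread < fraction * np.abs(mean)
```

The stack passed to `model_mean_and_stippling` would hold one gradient map per model after interpolation to a common grid, with any model excluded from the mean simply left out of the stack.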
Focusing on the regressions between dP/dM, certain 'hotspots' of high linear gradients are shown which correspond to regions experiencing heavy rainfall. Some, but not all, of these hotspots agree with those found by Koster et al (2004), such as over the Sahel and equatorial Africa during JJA (figure 4(b)). During DJF the hotspots follow the seasonal cycle with, for example, the strongest gradients occurring where the rainfall is strongest over southern Africa and Brazil (figure 4(a)). Conversely, with the progression of the ITCZ further north during JJA, the strongest gradients occur where the heaviest rainfall is found over equatorial Africa (figure 4(b)). The models generally agree well on the sign of the gradients (not shown), and it is also in many of these high rainfall regions that the lowest amount of variability between the models is shown. Elsewhere, such as over northeastern Africa, the inter-model spread is high. The exception to this is over northern Brazil during DJF, where high model mean gradients can be seen but also high variability, i.e. a lack of agreement between models (figure 4(a)). Possible causes of the spread between model gradients are: (i) differences in the positions of the wet ITCZ-influenced regions; and (ii) differences in physical processes operating in the models. This is beyond the scope of the present study but certainly merits further analysis.
For the dP/dT regressions, in general the model means show weak negative gradients throughout the majority of the tropics and particularly ocean regions, consistent with the weak negative correlations discussed above (figure 5). This is at odds with the weak, positive mean correlation at t = 0 for total precipitation but consistent with the convective-only precipitation response (which dominates tropical rainfall totals) in ERA-Interim (figure 2(c)). The lag-lead dP/dT relationship for ERA-Interim convective precipitation in figure 2(c) is suggestive of warmer temperatures in the previous few days leading convective precipitation while convective precipitation events are followed by cooler temperatures the following day; however, these relationships are not statistically significant. Over stratocumulus regions weak positive correlations are shown (possibly indicating warmer SST causing less cloud and precipitation, resulting in increased solar heating and increased temperatures), suggesting a positive feedback. Interestingly these daily relationships are of opposite sign to work by Trenberth and Shea (2005), who found positive relationships between dP/dT throughout tropical ocean regions at the monthly timescale; this demonstrates the importance of timescale on the considered relationships and physical processes operating. Here, the exceptions to this are over northern Africa during DJF where very large positive model mean gradients are shown (figure 5(a)), and hotspots of positive (though weaker) gradients over southern Africa and central Brazil during JJA (figure 5(b)). The small regions of strongly negative gradients scattered across the Sahara during DJF (figure 5(a)) represent noise in the timeseries, due to the lack of precipitation in this region creating lots of zeros and skewing the calculation of the regression coefficients.
Very few regions of the tropics show any agreement between models, with the lowest levels of inter-model spread being confined to oceanic regions at higher latitudes, in particular in the winter season (figure 5). This also suggests that the tropics-wide positive relationship between P and T diagnosed in figure 2(c) does not represent a simple feedback process. This lack of model agreement is also evident in the dM/dT regressions, where although negative gradients are shown throughout the tropics there is also high variability everywhere except small pockets of model agreement such as over Australia in DJF (figure 6(a)) or China in JJA (figure 6(b)). The negative dM/dT relationships appear most coherent in subtropical summer, indicating that increased soil moisture results in reduced surface heating and therefore lower temperatures (more energy is used evaporating moisture rather than heating the ground).

Discussion and conclusions
To understand the influence of land use on present day climate, and to understand any possible feedbacks that may amplify or dampen the effects of future climate change, our knowledge of the coupling between the land surface and atmosphere must be improved. Recently, an unprecedented amount of GCM data has become available via experiments undertaken as part of CMIP5, allowing a unique opportunity to study the simulation of the land-atmospheric coupling across a range of the latest generation state-of-the-art climate models. Using these data, in this letter we have characterized some of the fundamental basic relationships between atmospheric variables and the land surface, by looking at the correlation and regression coefficients of both cotemporal and lagged/leading daily rainfall, soil moisture and near-surface air temperature across the tropics.
Results suggest a positive relationship between rainfall and soil moisture at daily timescales, as might be expected, and weaker negative relationships between rainfall-temperature and temperature-soil moisture. The results from the lag-lead correlations suggest the following mechanism: increased rainfall leads to increased soil moisture over the next 1-3 days (figure 2(b)), this increased soil moisture leads to lower temperatures the following day (figure 2(d)), and these lower temperatures lead to a reduction in rainfall the following day (figure 2(c)). The above process results in the weak negative relationship between rainfall and soil moisture, shown up to the following 5 days (figure 2(b)). Important, however, is the large variability in the models' ability to reproduce these relationships, with some models showing widespread strong correlations but others failing to capture either the magnitude or spatial extent of these patterns. Additionally, the tropics-wide relationships disguise a heterogeneous picture regionally, in particular for coupling between P and T. An analysis of the model mean gradients between rainfall and soil moisture shows that while some models identify 'hotspots' of high gradients corresponding to high rainfall regions, there is high inter-model variability such that in many areas the models do not agree on the magnitude or spatial patterns of these gradients. While some of these differences relate to errors in simulating the position of the ITCZ, it is also likely that differences in model parameterizations lead to differences in the physical processes operating, which are important for representing land surface feedbacks.
Of equal importance are the results on the timings of these relationships. Whilst most of the models generally agree on the temporal evolution of the rainfall-soil moisture relationship (i.e. positive correlations when rainfall leads soil moisture, which then become negative when soil moisture leads), there is a difference in timings: most models suggest the strongest correlations occur at day +1, but two models suggest they occur at day 0.
There may be several reasons for this lack of agreement between some models. Firstly, it may be that even these state-of-the-art climate models do not possess a spatial resolution high enough to reproduce the atmosphere-land surface coupling. In this study, for the lag-lead correlations, the models were analysed at their own spatial resolutions (see table 1) rather than being interpolated to a common grid, so their varying resolutions may account for some of the differences between them. None of the models possess resolutions comparable to numerical weather prediction (NWP) models, which is perhaps required to adequately reproduce the atmosphere-land surface coupling, with even ERA-Interim (which runs with a spatial resolution of approximately 0.7°, but with the output used here available at 1.5°), not reaching the level of an NWP model. The importance of spatial resolution has been discussed by other work, such as in a study by Taylor et al (2011) where satellite observations of surface temperature and cloud cover over West Africa were used to demonstrate the strong influence of spatial scale on the rainfall-soil moisture feedback. They showed that variations in soil moisture on scales of approximately 10-40 km were a strong controlling factor for storm initiation, and that convection was twice as likely to occur over regions with large soil moisture gradients than those without (Taylor et al 2011). They correctly note that these spatial scales are not represented in current climate models, even though this small-scale control will have larger-scale consequences (Taylor et al 2011).
In summary, in this study we have attempted to improve our understanding of the atmosphere-land surface coupling at daily timescales, using a simple diagnostic of the relationships between precipitation, soil moisture and surface temperature from a range of the latest generation climate models. This letter, and the above work in progress, forms part of a wider study to look at atmosphere-land surface interactions, in both present day climate and under future climate change.