Groundwater shapes North American river floods

The importance of soil moisture in triggering river floods is increasingly recognized. However, soil moisture represents only a fraction of the water stored in the unsaturated zone. In contrast, groundwater from the deeper, saturated zone may contribute a significant proportion of river flow, but its effects on flooding are poorly understood. Here we analyze hydroclimatic records of thousands of North American watersheds spanning 1981–2018 to show that baseflow (i.e. groundwater-sustained river flows) affects the magnitude of annual flooding at time scales from days to decades. Annual floods almost always arise through the co-occurrence of high precipitation (rainfall + snowmelt) and baseflow. Flood magnitudes are often more strongly related to variations in antecedent baseflow than to antecedent soil moisture and short-term (⩽3-day) extreme precipitation. In addition, multi-decadal trends in flood magnitude and decadal flood variations tend to better align with groundwater storage and baseflow trends than with changing precipitation extremes and soil moisture. These results reveal the importance of groundwater in shaping North American river floods, an influence that often decouples the spatial patterns of flood trends from those of shifting precipitation extremes and soil moisture.


Introduction
Previous studies have reported that fluvial flooding is on the rise globally (Hirabayashi et al 2013, Alfieri et al 2015, Arnell and Gosling 2016) because warmer atmospheres can hold more moisture, resulting in an increased likelihood of extreme rainfall under climate change (Min et al 2011, Fowler et al 2021). In addition, warming-induced shifts in rain-on-snow events are projected to increase flooding regionally (Musselman et al 2018). However, these increases are not clearly observed in flood records of recent decades (Hirsch and Ryberg 2012, Kundzewicz et al 2014, Berghuijs et al 2017, Do et al 2017, Slater et al 2021, Zhang et al 2022). Across North America, flood trends appear spatially fragmented, whereby systematic causes of change remain largely unresolved (Cunderlik and Ouarda 2009, Archfield et al 2016, Slater and Villarini 2016, Hodgkins et al 2019, Dethier et al 2020, Zadeh et al 2020), and the Intergovernmental Panel on Climate Change (IPCC) emphasizes the low confidence in regional patterns of change (IPCC 2021). In addition to long-term trends, regionally, multi-year oscillations can exist with unusually many or few flood occurrences, referred to as flood-rich or flood-poor periods (Blöschl et al 2020, Lun et al 2020). Unraveling the causes of flood trends and variability is pivotal as floods cause billion-dollar damages annually, and these damages are quickly growing (Winsemius et al 2016).
While climate-change-induced shifts in rainfall and snowmelt extremes are important drivers of flooding, natural and artificial changes in landscape conditions (e.g. landcover, damming, soil moisture) can also affect flooding (e.g. Blöschl et al 2007, 2015, Berghuijs et al 2019). A major cause for the disconnection between flood trends and precipitation extremes is the critical role of antecedent soil moisture (e.g. Ivancic and Shaw 2015, Slater and Villarini 2016, Sharma et al 2018, Wasko et al 2021, Zhang et al 2022), which reflects the water stored in the surface layers of the landscape. Many floods tend to occur when heavy precipitation and high antecedent soil moisture coincide (Ivancic and Shaw 2015, Berghuijs et al 2016, Sharma et al 2018, Wasko et al 2021, Zhang et al 2022). However, as soil moisture conditions may shift with climate change (Seneviratne et al 2010, Samaniego et al 2018, Zhou et al 2021), increases in precipitation extremes are often not reflected in floods (Sharma et al 2018, Zhang et al 2022).
The importance of antecedent conditions may extend deeper into the subsurface. Globally, groundwater constitutes orders of magnitude more water than soil moisture (Dorigo et al 2021), but groundwater often remains unconsidered in flood analyses as its role in hydrological extremes is often assumed to be minimal. However, groundwater plays a crucial role in river flow across the world. For example, experimental work shows that while streams respond promptly to rainstorms, storm runoff often contains relatively little water from current rainfall (Sklash and Farvolden 1979, Neal and Rosier 1990, Kirchner 2003). In addition, most water contributing to river flows tends to be older than approximately three months (Jasechko et al 2016). Such findings indicate that groundwater can play a central role in river flow, even under storm conditions. Other data also substantiate the vital role of groundwater contributions to river flow, such as high base flow indices (the estimated fraction of river flow sustained by groundwater) across streamflow datasets (e.g. Beck et al 2013), and global recharge data (Berghuijs et al 2022). The potentially key role of groundwater in driving flood hazards is important to understand because groundwater trends and variations may differ from those of soil moisture. The timescales over which groundwater varies tend to be much longer than those of soil moisture (Blöschl and Sivapalan 1995, Skøien et al 2003, van Loon 2015). Therefore, variations of groundwater may have longer-lasting and opposing effects on flooding relative to soil moisture. Groundwater-well observations are abundant in space (e.g. Jasechko and Perrone 2021, Jasechko et al 2021), but well levels may not reflect watershed-scale groundwater storage, and time series tend to be sparse. Instead, we explore the role of groundwater via baseflow.
Baseflow (Singh 1968), the mostly groundwater-fed river flow sustained between precipitation events, remains mostly unconsidered in large-scale empirical flood hazard assessments. Baseflow provides a measure of drainage speed and has been reported to significantly influence the flood frequency curve (Spellman and Webster 2020). Baseflow can shape flooding because wetter landscapes tend to disproportionately discharge rainfall and snowmelt to rivers (Kendall et al 1999), leading to larger flow rates than those same landscapes in drier conditions with lower baseflows. In addition, streams with higher flow rates already carry more water, further enhancing flood peaks. Trends in baseflows and groundwater storage can occur via climate change and direct human interventions in the water cycle (Ficklin et al 2016, Tan et al 2020) but remain mostly unexplored as a cause of change in flood hazards (see Spellman and Webster 2020).
Here, we show that variations and trends in baseflow can outweigh the effects of soil moisture, extreme rainfall and snowmelt as drivers of flooding, thereby explaining both event-scale variations and long-term trends in the magnitudes of floods across North America. First, we quantify the soil moisture, baseflow, and precipitation (snowmelt and rainfall) conditions under which floods occur, and we quantify how these drivers affect flood magnitudes. Subsequently, we test the extent to which changes in precipitation extremes and antecedent baseflows explain decadal variations and long-term trends in the magnitudes of floods across North America.

Hydrometeorological data
We use daily streamflow, snow storage (snow water equivalent, SWE), temperature, and precipitation data from the Hydrometeorological Sandbox-Ecole de Technologie Supérieure (HYSETS) database (Arsenault et al 2020). HYSETS contains hydrometeorological data for 14 425 watersheds across North America, located in Mexico, the United States, and Canada. These watersheds range in size from ∼1 to 10⁶ km². Arsenault et al (2020) visually screened these hydrographs to check if they appeared realistic. The available daily data include filtered discharge time series to remove strongly regulated stations (Arsenault et al 2020). We selected the 4915 watersheds with at least 15 years of data (containing >350 days of streamflow data for each year) over 1981–2018. These watersheds have, on average, 29.9 years of data over this period (10th and 90th percentile = 17.6 and 38.0 years).
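The record-length criterion above can be illustrated with a short sketch. This is not the authors' code: the function names are hypothetical, missing values are assumed to be stored as `None`, and ">350 days" is read as at least 351 valid daily values in a calendar year.

```python
from collections import Counter
from datetime import date, timedelta


def qualifying_years(dates, flows, min_days=351, first_year=1981, last_year=2018):
    """Years in [first_year, last_year] with at least `min_days` non-missing flows."""
    counts = Counter(
        d.year
        for d, q in zip(dates, flows)
        if q is not None and first_year <= d.year <= last_year
    )
    return sorted(year for year, n in counts.items() if n >= min_days)


def is_selected(dates, flows, min_years=15):
    """True if the record holds at least `min_years` qualifying years."""
    return len(qualifying_years(dates, flows)) >= min_years


# Synthetic example (assumption): a gap-free daily record for 1990-2005.
start = date(1990, 1, 1)
n_days = (date(2006, 1, 1) - start).days
dates = [start + timedelta(days=i) for i in range(n_days)]
flows = [1.0] * n_days
print(is_selected(dates, flows))
```

Counting valid days per calendar year keeps the criterion independent of where gaps fall within a year; a water-year definition would need a different grouping key.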
We use the HYSETS station-based daily precipitation data, which have been derived from the Environment and Climate Change Canada weather stations for Canada, the Global Historical Climate Network Daily station database for the United States and Mexico, and the station-based serially complete dataset for North America. In addition, Arsenault et al (2020) have derived watershed-averaged daily SWE estimates using ERA5-Land (Hersbach et al 2020). The HYSETS database covers 1950–2018, but snow data are only available from 1981 onwards. All data are publicly available via the HYSETS database (Arsenault et al 2020). Because these watersheds can have rainfall and snowmelt as drivers of flooding, we calculate the sum of these fluxes for each day, whereby snowmelt (mm d⁻¹) is the daily difference of daily SWE estimates (mm). For part of the analysis where we consider the relative effects of soil moisture, baseflow,
We calculate trends in flooding and meteorological variables by normalizing each station's annual maximum daily discharge values by the station's mean annual maximum daily discharge (computed over the entire period). Annual maximum flows are not always overbank flows with damaging effects because such events tend to happen on a return period longer than one year (e.g. Sampson et al 2015, Wing et al 2017). However, annual maximum flows capture the largest floods, including the most damaging event(s) at a location over the measurement period. In addition, they are a commonly studied metric of flood change in observational datasets (e.g. Blöschl et al 2017, Wasko et al 2021). We present the trends based on robust linear regressions, which we fitted using iteratively reweighted least squares with MATLAB's default bisquare weighting function. The mean date of annual maximum flooding and 30-day precipitation is calculated using circular statistics, which quantify the average day on which maximum annual flows occur (e.g. Burn 1997, Villarini 2016, Blöschl et al 2017, Berghuijs et al 2019). In addition, these circular statistics can also indicate how variable that date is between years using the consistency of the date.
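As an illustration of the circular statistics mentioned above, the sketch below converts each day of year to an angle, averages the unit vectors, and converts the mean angle back to a date. The function name and the default year length of 365.25 days are assumptions made for this example; the source does not specify those details.

```python
import math


def circular_date_stats(days_of_year, year_length=365.25):
    """Return (mean day of year, consistency) for a list of event dates.

    Consistency is the length of the mean unit vector: 1 means the event
    falls on the same day every year, values near 0 mean no preferred season.
    """
    angles = [2 * math.pi * d / year_length for d in days_of_year]
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    mean_angle = math.atan2(y, x) % (2 * math.pi)  # map to [0, 2*pi]
    mean_day = mean_angle * year_length / (2 * math.pi)
    consistency = math.hypot(x, y)
    return mean_day, consistency


# Floods in late December and early January average to around New Year,
# not to mid-year as an arithmetic mean of day numbers would suggest.
print(circular_date_stats([360, 5], year_length=365))
```

The arithmetic mean of days 360 and 5 would be about day 182, which is why dates near the turn of the year need the circular treatment.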

Baseflow estimates
Baseflow is estimated using a recursive digital filter (Lyne and Hollick 1979, Arnold and Allen 1999):
$$q_t = \beta\, q_{t-1} + \frac{1+\beta}{2}\,(Q_t - Q_{t-1}),$$
where $q_t$ is the filtered storm runoff (quick response) on day $t$, $Q_t$ is the original streamflow, and $\beta$ is the filter parameter (set at 0.925; after, e.g. Carrillo et al 2011, Zhang et al 2017, Gnann et al 2019). Baseflow on day $t$ is the remainder, $Q_t - q_t$. We pass this filter over the streamflow data two times (forward and backward) to filter off flood peaks from the original streamflow time series (figure 1).
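A minimal sketch of such a two-pass filter is given below, assuming the commonly used Lyne and Hollick form, quick flow q_t = β q_{t-1} + (1 + β)/2 (Q_t − Q_{t-1}), with baseflow as the remainder. Several details are assumptions for illustration rather than the authors' implementation: the function names, the zero initial quick flow, the clamping of negative quick flow to zero, and applying the backward pass to the output of the forward pass.

```python
def lyne_hollick_pass(flow, beta=0.925):
    """One filter pass over a gap-free, non-negative daily flow series.

    Returns the baseflow series (flow minus filtered quick flow).
    """
    base = [flow[0]]  # assumption: no quick flow on the first day
    quick_prev = 0.0
    for t in range(1, len(flow)):
        quick = beta * quick_prev + 0.5 * (1 + beta) * (flow[t] - flow[t - 1])
        quick = max(quick, 0.0)  # quick flow cannot be negative
        base.append(flow[t] - quick)
        quick_prev = quick
    return base


def estimate_baseflow(flow, beta=0.925):
    """Forward pass, then a backward pass over the forward-pass baseflow."""
    forward = lyne_hollick_pass(flow, beta)
    backward = lyne_hollick_pass(forward[::-1], beta)[::-1]
    return backward


# Synthetic hydrograph (assumption): steady flow with one storm peak.
flow = [1.0, 1.0, 1.0, 10.0, 4.0, 2.0, 1.0, 1.0, 1.0]
print(estimate_baseflow(flow))
```

With quick flow clamped at zero, the estimated baseflow never exceeds the observed flow, and a constant hydrograph is returned unchanged.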

Controls on floods
Soil moisture is known to modulate the extent to which extreme precipitation leads to river floods (e.g. Ivancic and Shaw 2015, Slater and Villarini 2016, Sharma et al 2018, Wasko et al 2021), but the effects of baseflow are stronger and longer lasting than the effects of soil moisture (figure 2). We assessed the association between floods and baseflow/soil moisture based on their instantaneous value on a given number of days before the flood, whereas we assessed the association between floods and precipitation (snowmelt plus rain) based on the accumulated amount of precipitation from a specific day to the day of the flooding. Correlations (Pearson's r) between interannual variations in the magnitude of flood peaks and variations in possible drivers reveal a stronger association with baseflow than with soil moisture (figure 2(a)) (similar results apply for Spearman's ρ). Baseflow thus seems to have a stronger impact on flood magnitudes than both shallow (0–7 cm) and deeper (100–289 cm) soil moisture, and this finding holds across short and longer antecedent periods (figure 2). For short antecedent periods of less than 3–4 days, antecedent baseflow displays a stronger association with flooding than the magnitude of precipitation (i.e. rainfall + snowmelt) does. At longer timescales, the total sum of precipitation fallen in the antecedent period tends to outweigh the initial baseflow. The relatively high correlation of precipitation amounts (∼0.5) at antecedent periods of longer than a week probably arises because precipitation also contributes to rising soil moisture and groundwater levels, rather than merely driving the event runoff that leads to a flood peak.
Overall, variations in flood magnitudes at short time scales are often more strongly related to variations in baseflow than the commonly studied factors of precipitation and soil moisture. Thus, baseflow seems to exert a more substantial and longer-lasting influence on flooding than commonly considered near-surface soil moisture conditions. This hierarchy of importance remains identical using the relative importance of predictors metric based on MATLAB's fitctree classification function (figure 2(b)), which estimates the importance of each predictor independently from the effect of the other two predictors. It is important to note that ERA5-Land soil moisture estimates may not always be representative of the catchments' moisture conditions, as ERA5-Land soil moisture estimates often deviate from in-situ measurements (Joaquín Muñoz-Sabater 2021). Therefore, we might underestimate the relative importance of soil moisture.
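The two ways of sampling drivers described above (an instantaneous state some days before the flood, versus precipitation accumulated up to the flood day) can be sketched as follows. The helper names and the toy numbers are hypothetical, and the window convention (the flood day plus the z − 1 days before it) is an assumption of this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def antecedent_state(series, peak_index, z):
    """Instantaneous value (baseflow or soil moisture) z days before the peak."""
    return series[peak_index - z]


def accumulated_precip(precip, peak_index, z):
    """Rain plus snowmelt summed over the flood day and the z - 1 days before it."""
    return sum(precip[peak_index - z + 1 : peak_index + 1])


# Toy example (assumption): one annual flood per year, z = 3 days.
peaks = [12.0, 20.0, 8.0, 15.0]        # annual maximum flows
base_before = [2.0, 3.5, 1.0, 2.5]     # baseflow 3 days before each peak
print(pearson_r(peaks, base_before))
```

Both helpers assume `peak_index >= z`; floods in the first few days of a record would need the preceding data or should be skipped.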
Higher baseflow rates are associated with higher maximum flow rates (figures 2(a) and (b)). This can occur because baseflow regionally contributes more to streams and because higher baseflow rates reflect wetter conditions that discharge more incoming precipitation and accumulated soil moisture to streams via rapid runoff mechanisms.
The effects of baseflow on flood peaks vary regionally. Across almost all watersheds, higher baseflow rates are associated with higher flow rates (figure 2(a)), which implies an overall greater streamflow rate, at least partly because baseflow contributes more water to these streams. However, the degree to which this occurs varies regionally (figure 2(c)). Flood magnitudes scale positively with baseflow rates, with especially strong associations found in the Great Plains and Colorado Plateau. Although the Colorado Plateau is considered a desert region, baseflow is an important component of its streamflow (Santhi et al 2008, Miller et al 2016). Some regions have a weak negative contribution of total baseflow, such as parts of the Appalachian Plateau, which may merit further investigation. We investigate the relative contribution of baseflow to peak flows by quantifying the fraction of the peak flow originating from baseflow. For 88% of watersheds, peak flow rates correlate negatively with the baseflow index (i.e. baseflow divided by total flow rate at the flooding date) (figure 2(d)). This shows that as floods increase, while baseflow contributes more flow to streams (figure 2(c)), its relative contribution to the event magnitude tends to shrink (i.e. a higher fraction of flood discharge comes from surface runoff). The decreasing relative contribution of baseflow indicates that the effects of baseflow tend to be indirect; flood magnitudes increase principally because wetter landscapes with higher baseflow rates and groundwater storage will quickly discharge more of the incoming rainfall and snowmelt to rivers. These 'discharging' effects outweigh the influence of absolute increases in antecedent baseflow (figure 2(c)) on the flood magnitude.
Across North America (i.e. all 4915 watersheds), floods generally occur when there has been relatively high antecedent baseflow and substantial recent precipitation, indicating that both factors are important for flooding. We display the frequency of conditions in the lead-up to flooding for specific intervals z (with z = 3 and 5 days) before the annual flood (figures 3(a) and (b)). The values reflect the precipitation on the day of the flood peak and up to z − 1 days before. Baseflow conditions reflect the flow rate z days before the flood peak. The importance of antecedent flow conditions versus rainfall scales with the interval z. For example, at a z = 5 days timescale, both precipitation and baseflow percentiles tend to be substantially higher than average conditions (medians 84th and 69th percentile). The dependence of flooding on the antecedent baseflow strengthens at shorter time scales but does not strengthen for precipitation (i.e. figures 2(a) and (b)) (precipitation and baseflow percentiles: medians 85th and 75th).
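One way to express the pre-flood conditions as percentiles is sketched below. The source does not state how the percentiles were computed, so the empirical definition used here (share of values less than or equal to the event value), the function names, and the toy series are all assumptions.

```python
from bisect import bisect_right


def empirical_percentile(sample, value):
    """Percentage of values in `sample` that are less than or equal to `value`."""
    ordered = sorted(sample)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


def rolling_sums(series, z):
    """z-day sums ending on each day from index z - 1 onwards."""
    return [sum(series[i - z + 1 : i + 1]) for i in range(z - 1, len(series))]


def pre_flood_percentiles(precip, baseflow, peak_index, z):
    """Percentiles of z-day precipitation (ending on the flood day) and of the
    baseflow z days before the flood, each relative to its own full record."""
    sums = rolling_sums(precip, z)
    event_sum = sums[peak_index - (z - 1)]
    precip_pct = empirical_percentile(sums, event_sum)
    baseflow_pct = empirical_percentile(baseflow, baseflow[peak_index - z])
    return precip_pct, baseflow_pct


# Toy record (assumption): flood peak on day index 5, z = 3 days.
precip = [0, 0, 0, 0, 5, 10, 0, 0]
baseflow = [1, 1, 2, 3, 2, 2, 1, 1]
print(pre_flood_percentiles(precip, baseflow, peak_index=5, z=3))
```

Taking the median of these percentile pairs over all annual floods of all watersheds would give summary values of the kind quoted above.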
Although higher baseflows lead to larger floods, this effect does not entirely decouple flooding from rainfall because increased rainfall over longer periods will grow baseflows and thus flooding. However, the seasonality of floods tends to differ from the seasonality of extreme short-term and longer-term rainfall. This seasonal decoupling occurs because high summer temperatures drive evaporation, depleting terrestrial water storage, thereby reducing runoff (Berghuijs et al 2016, Villarini 2016, Blöschl et al 2017). As a result, maximum annual flows almost always occur during a different time of the year than precipitation maxima. We illustrate this imprint of temperature and water storage seasonality by contrasting the pattern of flood seasonality with that of the 30-day precipitation (figure 4). Distinct differences exist in the timing of annual flow maxima and annual 30-day precipitation maxima. Similar disconnects have also been found with the timing of shorter rainfall extremes (e.g. Berghuijs et al 2016, Villarini 2016).

Controls on flood trends
Over the period 1981–2018, magnitudes of annual flooding have primarily decreased in the southwestern United States but increased across most of the rest of the continent (figure 5(a)). The most consistent increases in flood magnitudes occurred in the Prairie Pothole Region, an area with widespread shallow wetlands (often with ∼15% increase in flood magnitudes per decade), and noteworthy trends of up to ∼10% increase per decade also occurred in other watersheds across the continent. In contrast, the southwestern United States experienced large-scale decreases in annual flood magnitudes from 1981 to 2018, with many watershed flood magnitudes shrinking by ∼10% per decade.
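To make the "% per decade" unit concrete, the sketch below normalizes annual maxima by their mean and converts a fitted slope to percent per decade. Ordinary least squares is used here purely as a simple stand-in; the trends in this study come from robust regression with bisquare weights, which this sketch does not reproduce. The function name and example data are hypothetical.

```python
def percent_change_per_decade(years, annual_maxima):
    """Linear trend of mean-normalized annual maxima, in percent per decade.

    Uses ordinary least squares (a stand-in for robust regression).
    """
    mean_max = sum(annual_maxima) / len(annual_maxima)
    normalized = [q / mean_max for q in annual_maxima]
    mean_year = sum(years) / len(years)
    mean_norm = sum(normalized) / len(normalized)
    sxy = sum((y - mean_year) * (v - mean_norm) for y, v in zip(years, normalized))
    sxx = sum((y - mean_year) ** 2 for y in years)
    slope_per_year = sxy / sxx  # fraction of the mean annual maximum per year
    return 100.0 * 10.0 * slope_per_year


# Synthetic series (assumption): annual maxima rising by 1 unit per year
# around a mean of 100, i.e. 1% of the mean per year.
years = list(range(2000, 2011))
maxima = [100.0 + (y - 2005) for y in years]
print(percent_change_per_decade(years, maxima))
```

Because each station is divided by its own mean annual maximum, trends from small and large rivers are directly comparable.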
Baseflow can shape these changes in flooding over multiple years and decades, as long-term trends in baseflow and the associated groundwater storage may dampen or amplify the river flow generated by incoming precipitation. The flood magnitude trends (figure 5(a)) are strongly correlated (p-value < 0.05) with trends in annual mean baseflow conditions (Spearman ρ = 0.50 across all watersheds) but are only weakly related to trends in annual maximum daily precipitation (Spearman ρ = 0.15) and trends in annual mean soil moisture (Spearman ρ = 0.09) (figure 5(b)). In most cases, antecedent baseflow trends and flood trends exhibit a similar sign (68%), whereas precipitation and soil moisture trends are less consistent in sign with flood trends (54% and 46% respectively). These results suggest that multi-decadal baseflow trends have largely shaped trends in flood hazards, in many cases curbing or overruling the effects of shifting short-term precipitation extremes and longer-term soil moisture trends.
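The two comparison metrics used above, a rank correlation between per-watershed trends and the share of watersheds where two trends have the same sign, can be sketched as follows. The function names and the toy trend values are hypothetical, and dropping watersheds with an exactly zero trend from the sign comparison is an assumption of this sketch.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def sign_agreement(trend_a, trend_b):
    """Percentage of watersheds where both (non-zero) trends share a sign."""
    pairs = [(a, b) for a, b in zip(trend_a, trend_b) if a != 0 and b != 0]
    return 100.0 * sum((a > 0) == (b > 0) for a, b in pairs) / len(pairs)


# Toy per-watershed trends (assumption), in fraction per decade.
flood_trend = [0.10, -0.20, 0.30, -0.05]
baseflow_trend = [0.20, -0.10, 0.10, 0.02]
print(spearman_rho(flood_trend, baseflow_trend))
print(sign_agreement(flood_trend, baseflow_trend))
```

A rank correlation is a natural choice here because trend magnitudes across thousands of watersheds are unlikely to be normally distributed.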

GRACE groundwater and changing floods
Baseflow reflects the rate at which groundwater drains from a watershed and therefore indicates storage variations but does not measure them directly. In contrast, GRACE-based estimates of groundwater variations provide such storage variations. We use GRACE satellite observations to illustrate multi-year trends of terrestrial water storage change from 2002 onward (Rodell et al 2018). Regional trends in GRACE-based terrestrial water storage are broadly similar to those in flooding (figure 6). Several factors may weaken the match between GRACE water storage trends and observed flood change. Not all GRACE estimates are consistent (Scanlon et al 2016); watershed areas can be smaller than the spatial resolution of GRACE, and GRACE also captures trends of deep groundwater depletion and frozen water storage. Yet, despite such confounding factors, trends in flood magnitudes and trends in GRACE-derived water storage change are still related (Spearman's ρ = 0.22, p-value < 0.01). For example, the wetting trend across the northern Great Plains corresponds to a sequence of wet years and flooding that followed the drought of the early 2000s (Rodell et al 2018). In contrast, drying regions with less terrestrial water storage in the southwestern U.S. also have decreasing flood magnitudes and baseflows. Across regions such as the California Central Valley, the Ogallala aquifer, and the Colorado River basin, a substantial part of the water storage trends is human-induced by anthropogenic groundwater abstractions (Castle et al 2014, Richey et al 2015). The role of these human abstractions on flood hazards needs to be further investigated. Irrespective of the driving factor and confounding factors, decadal storage changes can have a significant imprint on flood trends at the continental scale (figure 5).
This finding provides an independent measure that groundwater and baseflow shape flood hazards and suggests there is potential in using past and future satellite-measured variations of the gravity field of Earth (Landerer et al 2020) not just to lengthen the lead-time of flood predictions, but also to help to unravel historical flood trends across continents.

Implications
The IPCC has indicated that the fraction of the global land area affected by floods is likely to grow with global warming due to more frequent heavy rainfall but has reported low confidence in regional patterns of flood change (IPCC 2021). In recent years, the critical role of soil moisture in controlling flood change has been highlighted (e.g. Ivancic and Shaw 2015, Berghuijs et al 2016, Slater and Villarini 2016, Sharma et al 2018, Wasko et al 2021, Zhang et al 2022). Variations in groundwater may have longer-lasting and opposing effects on flooding compared with soil moisture, but groundwater effects have remained mostly unquantified. Our analysis highlights the role of baseflow and groundwater in driving river floods. It suggests that the extent to which future flooding may follow increases in heavy rainfall will largely depend on the pre-existing baseflow and groundwater conditions during these extreme rainfall events (which may evolve under, e.g. hotter, drier summers or wetter fall months). The critical role of groundwater in flooding uncovered here is consistent with decades-old knowledge of runoff generation (Sklash and Farvolden 1979). It builds on earlier evidence that baseflow exerts significant influence over the flood frequency curve (Spellman and Webster 2020). Still, baseflows and groundwater storage are often overlooked in large-scale empirical analyses of flood hazards and temporal trends. In addition, many hydrological and land-surface models capture short-term storage and baseflow variations, but they underestimate longer-term storage trends compared to GRACE satellite data (Scanlon et al 2018). Current knowledge of how climate change may affect baseflow remains limited (Price 2011, Ficklin et al 2016). Accurately integrating short-term variations and long-term storage and baseflow trends is likely crucial for predicting the evolving future of one of Earth's most damaging natural hazards.

Data availability statement
No new data were created or analyzed in this study.

Funding
W R B is supported by the VU Starting Grant 2021. L J S is supported by UKRI (MR/V022008/1) and NERC (NE/S015728/1).