Storm surge changes around the UK under a weakened Atlantic meridional overturning circulation

Climate model projections of future North Atlantic storm track changes under global warming are very uncertain, with models showing a variety of responses. Atmospheric storms force storm surges which are a major contributor to coastal flooding hazard in the UK, and so it is important to know how this process might be influenced by climate change—not only what future is probable, but what is possible? As a contribution to answering that question, we drive a simplified model of the north-west European coastal shelf waters with atmospheric forcing taken from climate simulations with HadGEM3-GC3-MM (1/4 degree ocean, approx. 60 km atmosphere in mid-latitudes) which exhibit a substantial weakening of the Atlantic Meridional Overturning Circulation (AMOC). The first is a ‘hosing’ simulation in which a rapid shut-down of the AMOC is induced by modelling the addition of freshwater to the North Atlantic. The second is the HadGEM3 GC3.05 perturbed parameter ensemble simulation under Representative Concentration Pathway 8.5 (RCP 8.5) which was used to inform the UK Climate Projections 2018 (UKCP18). This model has a high climate sensitivity and exhibits substantial weakening of the AMOC. We find substantial simulated increases at some sites: up to about 25% increase in the expected annual maximum meteorological component of the storm surge. In both the hosing simulation and the ensemble simulation, the greatest projected increases are seen at some west coast sites, consistent with strengthening of the strongest westerly winds. On the south-east coast, projected changes are smaller in the hosing simulation and generally negative in the ensemble simulation. The ensemble simulation shows a decrease in the strongest northerly winds as well as the growth in the westerlies. 
Overall, these low-likelihood increases over the 21st century associated with storminess are smaller than the likely contribution from mean sea-level rise over the same period, but, importantly, larger than the so-called “high-end” changes associated with storminess that were reported in UKCP18.


Introduction
The expected annual damage in England and Wales alone due to coastal flooding is £0.5 billion, and this figure is predicted to at least double over the 21st century in the absence of any adaptation (Hall et al 2006). Regional mean sea level (MSL) rise is expected to be the primary driver of the increased risk (Perks et al 2023, Howard et al 2019, Menéndez and Woodworth 2010). There could also be a secondary contribution from an increase in the level of the storm surges, relative to the mean sea level.
Such an increase could be driven by a change in the North Atlantic storm track. This has previously been assessed by driving a coastal shelf model (more typically used for short-range storm surge forecasting) with projections of wind and atmospheric pressure from climate model simulations of the 21st century with prescribed pathways of radiatively-active gases and aerosols. For example, Lowe et al (2001) used a single climate model simulation, Sterl et al (2009) used a perturbed initial-condition ensemble, Lowe et al (2009) used a perturbed model parameter ensemble, and Palmer et al (2018) used an "ensemble of opportunity": a subset of the Coupled Model Intercomparison Project Phase 5 (CMIP5, Taylor et al 2012) simulations.
Several opposing physical processes influence the responses of storm tracks to anthropogenic climate change (Shaw et al 2016) and consequently model projections show a wide range of possible outcomes (Shepherd 2014). Whilst ensemble approaches provide some guidance on the range of likely outcomes, this range is limited by potential uncertainties that the model simulations do not yet sample (Palmer et al 2018), such as systematic changes in the AMOC and atmospheric storm tracks associated with additional fresh water input due to melting from the Greenland ice sheet (Jackson et al 2015), and by structural errors that are common to many of the different models (Shepherd et al 2018).
Thus, from a coastal engineering perspective, the approaches described above may fall short of addressing the decision-makers' "second question" (Sutton 2019): "How bad could it be?", and regional coastal impact studies relying on these approaches may underestimate the possibility of large damages and adaptation needs (Le Cozannet et al 2017).
The Atlantic Meridional Overturning Circulation (AMOC) transports upper waters northwards in the Atlantic and deeper water southwards. This results in a transport of heat northwards in the North Atlantic and affects both regional sea surface height and the pattern of sea surface temperature (SST). A weakening of the AMOC is considered to be very likely in the future as a response to global warming (IPCC 2021). This has direct implications for sea level extremes (through the effect of the AMOC on the mean sea surface height). A weakening AMOC has been shown to impact storm tracks through changing Atlantic SST gradients, hence this is likely also to impact storm surge: an indirect effect on sea level extremes in addition to the direct effect described above. Here we consider only the indirect effect. Woollings et al (2012) showed that there is a strong relationship between the AMOC and storm-track responses in phase 3 of the Coupled Model Intercomparison Project Atmosphere-Ocean General Circulation Models. Jackson et al (2015) found a strengthening and eastwards extension of the North Atlantic winter storm track, similar to that identified by Brayshaw et al (2009) in an experiment with a weakened AMOC. Brayshaw et al (2009) associate this with a strengthening of the SST gradients over the northern Atlantic, from which the storm track derives some of its energy. There is substantial model uncertainty in how much the AMOC is projected to weaken in the future, and this is therefore a source of uncertainty in future projections of storm surges.
This study aims to explore the impact of significant AMOC weakening on storm surge around the UK using numerical model simulations. We first drive a coastal shelf model with projections of wind and atmospheric pressure from a climate model that has undergone a collapse of the AMOC as the result of strong freshwater forcing in the North Atlantic. Whilst the strength of the freshwater hosing is unrealistically high, the likelihood of an AMOC collapse in the real world remains poorly known (Fox-Kemper et al 2022), so these simulations provide an opportunity to ask how storm surges around the UK coast might change if we saw a substantial weakening of the AMOC in the future.
To put the results in context, we compare with a more conventional approach, using the HadGEM3-GC3.05 perturbed parameter ensemble (Sexton et al 2021, Yamazaki et al 2021) forced by RCP 8.5. Although this is subject to some of the limitations of ensemble modelling mentioned above, HadGEM3-GC3.1, a closely related model, has a relatively high climate sensitivity (Andrews et al 2019) of 5.4 K (effective climate sensitivity to a doubling of CO2) which lies outside the very likely range of 2 to 5 K assessed by Forster et al (2021). However, Forster et al (2021) also assessed that it is not possible to rule out values above 5 K and the probability of low-likelihood high-impact outcomes, such as AMOC collapse, increases with higher global warming levels (IPCC 2021). Consistent with that, many ensemble members show a significant weakening of the AMOC through the 21st century (Yamazaki et al 2021) and so, like the hosing simulation, we anticipated that some members of this ensemble would exhibit substantial storm track changes.

Surge modelling
To simulate the response of the northwest European coastal waters to our atmospheric forcing we use the CS3 shelf sea model (Flather 1976, 2000, Flather et al 1998, Flather and Williams 2000). The CS3 shelf sea model is a regional 2D (depth-averaged) barotropic model with coverage over the Northwest European coastal waters. It extends from 48 to 63 degrees North at a resolution of 1/9 degree and 12 degrees West to 13 degrees East at a resolution of 1/6 degree making 135 × 150 grid cells each approximately 25 km square. An image of the central part of the model grid, which gives an indication of the resolution of the UK mainland coast (and the locations of the tide gauge sites which are referred to later) is available in the public domain: figure 3.1 of the UKCP18 Marine Report (Palmer et al 2018). We used a time-step of 45 seconds. Combined with low-cost explicit time-stepping, this makes CS3 a fast-running model, very suited to long and/or ensemble simulations. CS3 has been extensively validated against observations (Furner et al 2016, Horsburgh et al 2008) and until very recently was the operational storm surge model for the UK. Horsburgh et al (2008) show that the model performs well during extreme storm surges in the southern North Sea, providing forecasts of surge to within 10 cm in the Thames estuary when forced by re-analysed meteorology.
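As a quick consistency check on the grid description above, the following sketch recomputes the number of grid cells from the quoted extents and spacings, and the number of time steps implied by the 45-second step. The names (`cell_count`, `steps_per_day`) are illustrative only and are not taken from the CS3 code.

```python
# Grid extents, spacing and time step as quoted in the text for CS3.
LAT_SOUTH, LAT_NORTH = 48.0, 63.0   # degrees North
LON_WEST, LON_EAST = -12.0, 13.0    # degrees East (negative = West)
DLAT, DLON = 1.0 / 9.0, 1.0 / 6.0   # grid spacing in degrees
TIME_STEP_S = 45                    # model time step in seconds


def cell_count(start, stop, step):
    """Number of cells of width `step` needed to span [start, stop]."""
    return round((stop - start) / step)


n_lat = cell_count(LAT_SOUTH, LAT_NORTH, DLAT)   # rows in latitude
n_lon = cell_count(LON_WEST, LON_EAST, DLON)     # columns in longitude
steps_per_day = 86400 // TIME_STEP_S
# Steps in one 30-year time-slice, ignoring leap days for simplicity.
steps_per_30_years = steps_per_day * 365 * 30

print(n_lat, n_lon, steps_per_day, steps_per_30_years)
```

The counts reproduce the 135 × 150 grid quoted above; the step count gives a feel for why a cheap explicit scheme matters for century-scale or ensemble runs.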
In this work we take the step of performing simulations in surge-only mode. This mode is not suited to operational forecasting, when the timing of the surge relative to the tide is of vital importance, so the surge-and-tide mode is normally used. That mode has the advantage of more effectively simulating the surge-tide interaction, particularly the modifications to the bottom friction caused by the tide, and the timing of the surge relative to the tide. It would be possible to perform our century-scale integrations in the surge-and-tide mode, and indeed several previous experiments have used that approach (e.g. Sterl et al 2009, Lowe and Gregory 2005, Palmer et al 2018). However, we do not know what the surge-tide phase relationship of a simulated future event will be, and these previous experiments have generally left this to chance. This means that a potentially significant surge event may be missed simply because it happens to coincide with a low tide. One option is to repeat the whole integration multiple times with different phase relationships (for example by staggering the start time of the meteorological forcing relative to the tidal forcing), but this represents a substantial increase in computational and data storage costs. For example, even staggering in one-hour steps through one M2 tidal cycle multiplies the cost by 12, and still does not address the spring-neap cycle. An advantage of the surge-only mode is that it sidesteps that problem. Furthermore, the surge-only mode removes any ambiguity about what metric of surge impact to use (e.g. total water level, non-tidal residual, or skew surge). This benefit comes with the cost of unrealistic bottom friction and hence potentially unrealistic surge magnitudes, and we acknowledge the caveat that our projected changes in surge must be seen in the light of this shortcoming. There are two counter arguments: (1) if the argument of tide/skew-surge independence (Williams et al 2016) was extrapolated, it would seem to suggest that omitting the tide should have minimal impact on the surge, and (2) we anticipate that the percentage change (the change as a percentage of the control value) will be less affected by this shortcoming. Our choice to use a surge-only simulation is discussed further in appendix B.
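The factor of 12 quoted for the staggered surge-and-tide alternative follows from the length of the M2 cycle. The sketch below makes the arithmetic explicit; it assumes an M2 period of about 12.42 hours, and the function name `stagger_offsets` is purely illustrative.

```python
# Assumed period of the principal lunar semidiurnal (M2) tide, in hours.
M2_PERIOD_H = 12.42


def stagger_offsets(period_h, step_h):
    """Start-time offsets (hours) that sample one tidal cycle at `step_h` spacing."""
    n_runs = int(period_h // step_h)
    return [i * step_h for i in range(n_runs)]


offsets = stagger_offsets(M2_PERIOD_H, 1.0)
cost_multiplier = len(offsets)  # each offset is one full repeat of the integration

print(offsets)
print("cost multiplier:", cost_multiplier)
```

Each offset is a complete re-run of the integration, and the spring-neap cycle would need a further dimension of sampling on top of this.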

Climate simulations
First we force our shelf sea model with 10-metre wind and sea-level atmospheric pressure fields which are output from one of the "hosing" experiments described by Jackson et al (2022). The experiment consists of a global coupled climate model (HadGEM3-GC3-1MM, Williams et al 2018) simulation in which a very large additional surface freshwater flux (equivalent to 0.3 Sverdrups) is applied to the North Atlantic from 50 degrees North to the Bering Strait. Apart from this freshwater forcing, the simulation is based on the preindustrial control experiments (where external forcings are fixed at 1850s conditions) which were conducted as part of the core CMIP6 experiments, described in Eyring et al (2016).
The additional freshwater prompts the inhibition of deep convection and shuts down or reduces the Atlantic Meridional Overturning Circulation (AMOC, figure 1). Many of the changes induced by additional freshwater forcing that have been seen in previous similar experiments can be considered robust features under an AMOC collapse (Jackson et al 2015). With particular relevance to our purpose, these features include stronger westerly winds over north-west Europe in winter and a strengthened winter storm track. To assess the changes we compare a 30-year time-slice beginning 70 years after the start of the hosing with the same period in the parallel control simulation. In this time period the AMOC has substantially reduced to 23% of its control value (table 1). The model is spun up from climatology over 36 years to initialise the control run, and the experiment is initialised after 42 years of the control run.
Secondly we force our shelf sea model with 10-metre wind and sea-level atmospheric pressure fields which are output from a perturbed parameter ensemble (PPE, Sexton et al 2021) consisting of variants of a configuration (named HadGEM3-GC3.05) of the HadGEM3 (Hewitt et al 2011) climate model at N216 (about 60 km) atmosphere resolution and 1/4 degree ocean resolution. This model has an improved representation of the storm tracks compared to the previous generation of Hadley Centre models (Williams et al 2015, Senior et al 2016). The PPE simulation covers the years 1900-2100, employing CMIP5 historical and RCP8.5 emissions. It exhibits a significant weakening of the AMOC in many of its members (Yamazaki et al 2021). Following Yamazaki et al (2021), we explore the full range of ensemble response, because the simulations are intended to be used as individual examples of plausible outcomes as opposed to samples from a distribution.
We began with the 15 ensemble members selected for use in UKCP18, but we excluded two members (known as 2305 and 2335) because those two exhibit a substantial reduction in the AMOC even in the parallel control simulation, which calls into question the validity of the flux adjustments used in those two members.
Their exclusion does not materially change our conclusions and their inclusion would complicate comparison with the control simulation. A list of the PPE members used is given in table 1.
Details of the selection of parameter combinations and the global performance and future changes exhibited by the PPE are given by Sexton et al (2021) and Yamazaki et al (2021), respectively. McDonald et al (in prep) present a detailed evaluation of the simulation of extra-tropical storms in the ensemble. To assess the changes, for each ensemble member we compare a 30-year time-slice (years 2070 to 2099 inclusive) from the forced simulation with the same period in the parallel control simulation. Time series of AMOC strength are shown in figure 1. AMOC strength is measured as the maximum in depth at 26 degrees north for the PPE and at 26.5 degrees north for the hosing simulation.
Figure 1. Time series of annual mean AMOC strength, for the 13 members of the PPE, and for the hosing simulation. Vertical lines show the limits of our 30-year time-slices. The Y-axis (AMOC strength) and X-axis (year) are common to all panels (tick labels have been omitted from all but one panel for brevity). However, the datum of years in the hosing simulation is arbitrary, so has been adjusted in this plot such that the time-slice of the hosing simulation matches that of the PPE for ease of visual comparison.
Palmer et al (2018) advocated fitting a linear trend to a single long period of time stretching from a time representative of the present day into the future period of interest, as a way of identifying the forced change whilst minimising the "noise" due to the unforced climate variability. However, that approach is not appropriate to the hosing simulation, in which we attempt to identify differences associated with a regime change from the control AMOC (15.8 Sv) state to the weakened AMOC (3.6 Sv, see table 1) state. In order to evaluate robust metrics without this trend fitting approach, we combine 30 years of simulated data from the end of the 21st century (2070-2099 inclusive) into each metric. Thus, to make a metric of extreme surges for any given site, we take the mean of thirty consecutive annual maxima. In other words, we take the maximum value each year for 30 consecutive years and then take a mean of these 30 values.
To assess changes (or "anomalies") relative to the control simulation, we take the difference in the mean of thirty consecutive annual maxima in the forced simulation and the unforced control simulation.For consistency, we apply the same approach to the perturbed parameter ensemble.
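The metric and anomaly described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the processing code used for the paper: the function names, the daily sampling, and the 10% amplification applied to the "forced" series are all assumptions made for the example.

```python
import numpy as np


def mean_annual_maximum(surge, years):
    """Mean over years of the maximum surge value found in each year."""
    surge = np.asarray(surge, dtype=float)
    years = np.asarray(years)
    annual_max = [surge[years == y].max() for y in np.unique(years)]
    return float(np.mean(annual_max))


def anomaly(forced, control, years):
    """Absolute and percentage change in mean annual maximum, forced minus control."""
    m_forced = mean_annual_maximum(forced, years)
    m_control = mean_annual_maximum(control, years)
    absolute = m_forced - m_control
    percent = 100.0 * absolute / m_control
    return absolute, percent


# Synthetic demonstration: 30 years (2070-2099) of daily surge values in metres.
rng = np.random.default_rng(0)
years = np.repeat(np.arange(2070, 2100), 365)
control = rng.gumbel(loc=0.2, scale=0.1, size=years.size)
forced = 1.1 * control  # hypothetical forced run: every surge 10% larger

abs_change, pct_change = anomaly(forced, control, years)
print(f"change: {abs_change:.3f} m ({pct_change:.1f}%)")
```

Dividing by the control value, as in the second return value, is what the text refers to as the percentage change.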

Interpretation of the mean annual maximum
Following Howard and Williams (2021) (among others) we define the N-year return level as the level which is expected to be exceeded once in N years on average (where the average is to be taken over a period including many such exceedances). Thus the one-year return level is the level which is exceeded on average once per year over a period of many years, in other words exceeded on average M times in M years, where M is large. We do not make any assumption about the probability distribution of the annual maxima. However, suppose that we were to assume that the annual maxima at a given site follow a generalised extreme value distribution (see e.g. Coles 2001) with constant scale and shape parameters, then a change in the mean annual maximum is identical to a change in the one-year return level. Even allowing for variation in the scale parameter, we find (evidence not shown here) that the change in the one-year return level is dominated by changes in the mean annual maximum for our data.
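The equivalence stated above for a generalised extreme value (GEV) distribution with fixed scale and shape can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the authors' derivation: it takes the standard GEV form for the annual maxima and assumes that the mean number of exceedances per year of a level z is -ln F(z), where F is the distribution function of the annual maximum. The parameter values are hypothetical.

```python
import math


def gev_return_level(mu, sigma, xi, n_years):
    """Level exceeded on average once per n_years, assuming rate = -ln F(z)."""
    return mu + sigma / xi * (n_years ** xi - 1.0)


def gev_mean(mu, sigma, xi):
    """Mean of a GEV distribution (valid for xi < 1 and xi != 0)."""
    return mu + sigma * (math.gamma(1.0 - xi) - 1.0) / xi


# Hypothetical parameters: only the location changes between control and forced.
sigma, xi = 0.15, -0.1
mu_control, mu_forced = 1.00, 1.20

d_mean = gev_mean(mu_forced, sigma, xi) - gev_mean(mu_control, sigma, xi)
d_rl1 = (gev_return_level(mu_forced, sigma, xi, 1.0)
         - gev_return_level(mu_control, sigma, xi, 1.0))
d_rl50 = (gev_return_level(mu_forced, sigma, xi, 50.0)
          - gev_return_level(mu_control, sigma, xi, 50.0))

print(d_mean, d_rl1, d_rl50)
```

With scale and shape held fixed, the change is a pure shift in location, so the mean and every return level move by the same amount. Under the assumed exceedance-rate relation, the one-year return level coincides with the location parameter itself.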
Whilst it would be mathematically correct to extrapolate a small change in the scale parameter to a large, attention-grabbing change in the 10,000 year return level, our 30-year-long samples are not long enough to robustly quantify a change in such rare extremes. Quantification of change in the more frequent extremes is a realistic aspiration. Incidentally, if the annual maxima at a given site are assumed to be Gumbel-distributed (a reasonable assumption for most sites, see Woodworth et al 2021) then the mean annual maximum is approximately the 1.8-year return level, i.e. we would expect ten exceedances of the mean annual maximum in an eighteen year period.
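The 1.8-year figure for the Gumbel case can be recovered with a short calculation. This sketch again assumes that the mean number of exceedances per year of a level z is -ln F(z), with F the Gumbel distribution function of the annual maximum; the location and scale values are arbitrary, since they cancel.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def exceedance_rate_at_mean(mu, beta):
    """Mean exceedances per year of the Gumbel mean, assuming rate = -ln F(z)."""
    mean = mu + EULER_GAMMA * beta                    # mean of a Gumbel(mu, beta)
    cdf = math.exp(-math.exp(-(mean - mu) / beta))    # Gumbel CDF at the mean
    return -math.log(cdf)


rate = exceedance_rate_at_mean(mu=1.0, beta=0.2)  # arbitrary illustrative values
return_period = 1.0 / rate
expected_in_18_years = 18.0 * rate

print(f"return period of the mean annual maximum: {return_period:.2f} years")
print(f"expected exceedances in 18 years: {expected_in_18_years:.1f}")
```

The rate reduces to exp(-γ), independent of the location and scale, which gives a return period of about 1.78 years and roughly ten exceedances in eighteen years.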

Limitations
Our experimental design focuses on changes in atmospheric storminess and the coastal extreme water levels that arise from storm surges. In order to make a tractable study, we exclude several processes from the scope. For example, a collapse of the AMOC would affect the mean dynamic sea level on the NW European coastal shelf, potentially causing mean sea level increases of the order of half a metre for the UK (Levermann et al 2005). We do not consider that process here, but we note that any storm surge increase would compound such a change.
We have taken a relatively crude approach to analysing the changes in the extreme values, directly comparing 30-year means of annual maxima from two time slices. See section 5.
Since we exclude any tidal dependence, our results are most closely aligned with changes in skew surge. However, Haigh et al (2016) found that most extreme sea level events are generated by moderate skew surges, combined with spring astronomical high tides, rather than extreme skew surges. This implies that our findings regarding extreme surges are relevant to even less frequent (i.e. even more extreme) still water levels (tide plus surge). This is no bad thing, since the change in more extreme events is of interest, but it does mean that our findings may not be applicable to more frequently-occurring still water levels.
Our use of the surge model in surge-only mode (discussed in section 2) may lead to unrealistic bottom friction, which in turn may impact the magnitude of surge.
We test for changes in the mean of the distribution of annual maxima, but not the spread (discussed in section 2).
Changes in waves (e.g. Bricheno and Wolf 2018) are outside the scope of this study.
Interactions between components of change (Arns et al 2017), such as the effect of mean sea level change on the propagation of tide, surge, or waves, are outside the scope of this study.

Wind and surge changes in the hosing simulation
The change in mean annual maximum surge (evaluated as the difference between the experiment and the parallel control simulation in the average of 30 consecutive annual maxima: the last 30 years of the simulation) in response to the freshwater "hosing" is shown in figure 2. The change in the most extreme westerly and northerly winds (again evaluated as the difference between the experiment and the parallel control simulation in the average of 30 consecutive annual maxima) is also shown (panels (b) and (d) respectively). The strengthening of the winter storm track and mean winter wind speed identified by Jackson et al (2015) in an earlier hosing simulation is seen again here in the increase in strength of the most extreme westerly winds and in the general increase in the surge extremes around almost all of the UK coastline. The bathymetry, fetch and coastline all affect the magnitude of storm surges, and these effects are much stronger in some locations than others. For example, large surges are observed in the German Bight (e.g. Woodworth et al 2007). This area dominates the surge changes and saturates the colour scale in panel (a). To avoid this domination, it is also useful to consider the percentage change (i.e. the change in the mean annual maximum, as a percentage of the mean annual maximum of the control simulation). This is shown in panel (c). Both panels (a) and (c) show that substantial increases in surge occur around much of the UK mainland coastline in the hosing simulation.

Wind and surge changes in the perturbed parameter ensemble simulation
Like the hosing simulation, the perturbed parameter ensemble exhibits some significant changes in storm surge, as shown in figure 3 (absolute change in metres) and figure 4 (change as a percentage of the control). Here the large-scale changes are not driven by an artificial freshwater hosing. Instead, they are driven by a range of CO2 profiles derived from RCP 8.5 emissions accounting for carbon cycle uncertainties along with the standard CMIP5 RCP 8.5 scenario for aerosol emissions and non-CO2 greenhouse gases (Yamazaki et al 2021). Nevertheless, the ensemble members do exhibit a substantial reduction in the AMOC (figure 1) and show some of the same changes (e.g. strengthening of the westerly winds) as the hosing simulation.
It can be seen that, as in the hosing simulation, some members exhibit substantial increases in annual maximum storm surge along large parts of the UK and NW European coastlines, and this can be assumed to be associated with a strengthening of the westerlies as shown in figure 5. The response is different in the different ensemble members and hence depends on the parameter perturbations. Investigation of the links between parameter differences and response is beyond the scope of this work. However, the relationship between AMOC change, northern Atlantic surface temperature change, and surge change is investigated in section 3.3. The change in northerlies is shown in figure 6.
McDonald et al (in prep) evaluated the projected changes in storminess in the future in the PPE. They found more of the most intense North Atlantic storms in winter and an increase in the intensity of the storms over the UK and into Europe in winter. They used a different time-slice (2061-2080) which overlaps with ours (2070-2099) to represent the future period, but despite this difference our findings are broadly consistent with theirs. They also find some relationship between UK/North Sea atmospheric storminess changes and the temperature change in the North Atlantic (cf. our section 3).
The PPE does not exhibit the annual maximum surge increases on the UK south-east coast which are seen in the hosing simulation. This is most readily seen in figure 7, which compares the surge changes in hosing and PPE simulations at the sites of UK tide gauges. The sequence in panels (a) and (c) proceeds clockwise around the mainland coast starting from Newlyn in the extreme south-west. The locations of all of the tide gauges listed can be seen in figure 3.1 of Palmer et al (2018). Figure 7 emphasizes the result that, in contrast to the hosing simulation, the PPE generally shows a reduction in the annual maximum surge on the UK south-east coast. This reduction is associated with a reduction in the northerly winds, as shown by figure 6. Consistent with this finding, Haigh et al (2016) note that storm surges on the east coast of the UK (their category 4) depend on northerly winds over the North Sea, in contrast to surges in their other categories. In appendix A, we show using data from a simulation that those storm seasons which generate intense surge events on the east coast of the UK are largely independent of storm seasons which generate intense surge events elsewhere. For reference, figure 7 also shows (in green) the 21st-century change in the RCP 8.5, GFDL-ESM2M-driven surge simulation which was presented by UKCP18 (Palmer et al 2018) as their illustrative high-end surge change result. Our simulations exhibit slightly larger percentage increases than the UKCP18 high-end change. This is to be expected, given that the PPE generally exhibits stronger storm-track changes than the CMIP5 ensemble (see figure 5.10 of the UKCP18 Land Report (Murphy et al 2018) and figure 6.2.2 of the UKCP18 Marine Report (Palmer et al 2018)). We have interpreted UKCP18's "high-end" rate of change as a change over 100 years, and evaluated the percentage change by dividing by the mean annual maximum skew surge at each location (taken from a 483-year pre-industrial control simulation described by Howard and Williams 2021) to give a more like-for-like comparison between our results and the UKCP18 "high-end". Even so, it is possible that their alternative method (fitting a trend to all years of the simulation, as opposed to our time-slice approach) contributes to the smaller change which they find.
The magnitude of the simulated surges is affected at some sites by the absence of the tide. This is discussed in appendix B. We anticipate that the percentage change will be less affected by this issue. Typical sizes of increase around the UK shown in figure 7 (of the order of 0.2 metres) are smaller than the projected mean sea level increase over the same period under RCP 8.5, but are not trivial, being of similar size to typical local contributions from glacial isostatic adjustment, for example. Of course, that contribution is much more certain than our possible storminess change.
The storm surge changes are statistically significant. In other words, the forced changes are outside the envelope of unforced changes: maximum unforced changes are typically around half the maximum forced changes. The unforced changes are assessed as follows. For each member, instead of comparing the mean of 30 annual maxima between the forced simulation and the corresponding control simulation, we compare the mean of 30 annual maxima between two periods (separated by one hundred years: 1970-1999 and 2070-2099) of the control simulation. The range of these unforced changes is shown by the orange bars in figure 7 panels (a) and (b). Thus, although the forced changes are statistically significant, there is nevertheless the potential for a substantial unforced contribution.
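The comparison of forced and unforced changes described above can be sketched as follows for a single site. This is an illustration on synthetic annual maxima, not the analysis code: the array layout (members × years), the function name, and the numbers used to generate the data are assumptions.

```python
import numpy as np


def forced_and_unforced_changes(forced_future, control_future, control_past):
    """Per-member changes in the mean of 30 annual maxima.

    Each input has shape (n_members, n_years). The forced change compares the
    forced run with the control over the same period; the unforced change
    compares two periods of the control run.
    """
    forced = forced_future.mean(axis=1) - control_future.mean(axis=1)
    unforced = control_future.mean(axis=1) - control_past.mean(axis=1)
    return forced, unforced


# Synthetic annual maxima (metres) for 13 members and 30 years per time-slice.
rng = np.random.default_rng(1)
shape = (13, 30)
control_past = rng.gumbel(1.0, 0.2, size=shape)      # stands in for 1970-1999
control_future = rng.gumbel(1.0, 0.2, size=shape)    # stands in for 2070-2099
forced_future = rng.gumbel(1.2, 0.2, size=shape)     # hypothetical forced shift

forced, unforced = forced_and_unforced_changes(
    forced_future, control_future, control_past)
envelope = (unforced.min(), unforced.max())  # range of unforced changes

print("largest forced change:", forced.max())
print("unforced envelope:", envelope)
```

A forced change lying outside the envelope is the sense in which the text calls the change significant; the width of the envelope indicates how much an unforced contribution could add or subtract.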
As part of a wider study which considers projected change in mean sea level and atmospheric conditions (as an indicator of waves in addition to surge), Perks et al (2023) assess projected future changes in coastal flooding according to the PPE using a weather-typing approach. Their emphasis is somewhat different to ours in that they are concerned with the ensemble signal of change (typically represented by the behaviour of the ensemble median), whereas we are more concerned with the largest change. They note an increase in a weather pattern (their pattern number 23) associated with strong westerly winds, which is consistent with our results.

Relationship between AMOC, North Atlantic surface temperatures, and surge
As mentioned previously, Brayshaw et al (2009) associated storm track strengthening with a strengthening of the sea-surface temperature gradients over the northern Atlantic. The PPE ensemble mean winter (October to March inclusive) surface air temperature anomaly (RCP8.5 2070-2099 minus control 2070-2099) over the northern Atlantic is shown in figure 8(a). The crosses show the location of two grid points which we use in the evaluation of a simple metric of baroclinicity. The widespread global warming is offset by the AMOC reduction in the "pool" of reduced warming around the northern cross. As a simple metric of baroclinicity we use the surface air temperature difference between the locations of the two crosses. We refer to this metric as "dT". We evaluate the difference as south minus north, so that dT is positive, and an increase in dT represents an increase in baroclinicity. The anomalies (RCP8.5 2070-2099 minus control 2070-2099) in dT by ensemble member are shown against the corresponding AMOC slowdown (i.e. anomalies (RCP8.5 2070-2099 minus control 2070-2099) in negative-AMOC) in panel (b). Both mean changes are unequivocal (T statistics: 13.5 and 9.3 for AMOC and dT respectively) and we can see that there is a proportionality between the AMOC changes and dT changes. The Pearson product-moment correlation, r, is 0.84 (P value less than 0.1%).
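The statistics quoted above can be computed along the following lines. This is a hedged sketch: the per-member values are invented for illustration, and the T statistic is assumed here to be a one-sample statistic testing the ensemble-mean change against zero, which the text does not state explicitly.

```python
import numpy as np


def d_t(temp_south, temp_north):
    """Baroclinicity metric dT: surface air temperature, south minus north."""
    return np.asarray(temp_south, dtype=float) - np.asarray(temp_north, dtype=float)


def one_sample_t(values):
    """T statistic for the mean of `values` against zero (assumed test)."""
    values = np.asarray(values, dtype=float)
    return float(values.mean() / (values.std(ddof=1) / np.sqrt(values.size)))


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])


# Hypothetical per-member anomalies (forced minus control), 13 members.
rng = np.random.default_rng(2)
amoc_slowdown = rng.normal(6.0, 1.5, size=13)               # Sv, invented values
dt_anomaly = 0.4 * amoc_slowdown + rng.normal(0, 0.3, 13)   # K, invented relation

print("T (AMOC slowdown):", one_sample_t(amoc_slowdown))
print("T (dT):", one_sample_t(dt_anomaly))
print("r:", pearson_r(amoc_slowdown, dt_anomaly))
```

With only thirteen members, a single outlying member can move r noticeably, which is one reason to read the scatter plot alongside the coefficient.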
As a simple metric of the forced surge changes, we choose the anomaly (RCP8.5 minus control) in the mean annual maximum simulated surge at Millport (approx. 55.8 degrees north, 4.9 degrees west) in the Firth of Clyde, south-west Scotland.
Figure 7 (partial caption). ... for the same sites. These panels also show the "high-end" change reported by UKCP18 Marine, interpreted as a percentage change over 100 years as described in the main text. ("Mainland" is nominal: for example, Millport is included in the mainland sites.)
Figure 9 shows the evolution of the ensemble mean forced change in our surge metric and in the AMOC slowdown. Panel (b) illustrates the large variability in mean annual maximum simulated surge at Millport, even when averaged over thirteen ensemble members. As shown in appendix C, this large variability is possibly the reason that the inter-member variations in the forced response are not well-correlated (appendix C, figure C1). Both AMOC slowdown and ensemble-mean annual maximum surge at Millport grow approximately linearly between 1990 and the end of the simulation, resulting in a strong proportional relationship between the ensemble mean changes. The ordinary least-squares regression of Y on X (panel c) has a gradient of about 0.9 cm/Sv. As can be expected, this is broadly consistent with the ensemble mean value of 1 cm/Sv which we deduce in appendix C and is consistent with the conjecture that an AMOC collapse would drive more severe surges through increased baroclinicity, for large parts of the UK coastline, exacerbating the likely increase in mean sea level associated with an AMOC collapse.
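The block averaging and regression behind figure 9 can be sketched as below. The series are synthetic, with an imposed gradient of 0.9 cm/Sv plus noise, and the function names are illustrative; the sketch shows the technique, not the paper's data.

```python
import numpy as np


def block_means(values, block=7):
    """Means of consecutive non-overlapping blocks; any incomplete tail is dropped."""
    values = np.asarray(values, dtype=float)
    n_blocks = values.size // block
    return values[: n_blocks * block].reshape(n_blocks, block).mean(axis=1)


def ols_gradient(x, y):
    """Gradient of the ordinary least-squares regression of y on x."""
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)


# Synthetic ensemble-mean series: AMOC slowdown (Sv) growing linearly in time,
# and surge (cm) responding at an imposed 0.9 cm/Sv plus year-to-year noise.
rng = np.random.default_rng(3)
slowdown = np.linspace(0.0, 6.0, 112)
surge_cm = 0.9 * slowdown + rng.normal(0.0, 3.0, size=slowdown.size)

print("all-years gradient:", ols_gradient(slowdown, surge_cm))
print("block-mean gradient:",
      ols_gradient(block_means(slowdown), block_means(surge_cm)))
print("all-years r:", np.corrcoef(slowdown, surge_cm)[0, 1])
print("block-mean r:",
      np.corrcoef(block_means(slowdown), block_means(surge_cm))[0, 1])
```

Block averaging suppresses year-to-year noise, so the correlation typically rises while the fitted gradient changes little, which is the pattern reported for the all-years and 7-yr block-mean values.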
Summary of section 3.3
We find an increase in the baroclinicity of the north Atlantic, which is correlated with AMOC slowdown across the PPE. This is consistent with the findings of Brayshaw et al (2009).
The ensemble shows an increase in annual maximum storm surge at Millport which is consistent with the increase in baroclinicity and the AMOC reduction.
The time evolution of the ensemble-mean change in annual maximum storm surge at Millport is correlated with the time evolution of the ensemble-mean AMOC slowdown, because both grow approximately linearly over the 21st century (figure 9). Despite this, the inter-member differences in surge change are not correlated with the inter-member differences in AMOC slowdown, perhaps because the latter are small compared to the forced change, whilst the former are subject to large unforced variability (appendix C).

Conclusions
This paper is intended as a contribution to answering the question of how large a contribution storminess changes could make to the change in extreme sea levels around the UK. We use a storm surge model forced by the atmosphere of two models which show significant AMOC reductions. Such reductions are robustly associated with strengthening of the north Atlantic storm track and the winter westerlies over the shelf seas around the UK (Woollings et al 2012, Jackson et al 2015), through an increase in the north-south sea surface temperature contrast in the north Atlantic (Brayshaw et al 2009).
We find that a storm surge model forced by the atmosphere of the perturbed parameter ensemble that was developed for UKCP18 exhibits some substantial increases in extreme surges, including changes which exceed the illustrative "high-end" changes documented in the UKCP18 Marine Report, which used models from the CMIP5 stable. Consistent with this, the perturbed parameter ensemble exhibits a strengthening of westerly wind extremes over the domain of the surge model. However, members of the perturbed parameter ensemble also exhibit a weakening of the northerly wind extremes over the North Sea. It is these northerly winds which generate the worst surges on the south east coast of the UK and, consistent with this, we find that the surges on the south east coast of the UK are largely unchanged or even moderated in the PPE-driven simulations.
In contrast, a storm surge model forced by the atmosphere of a hosing simulation (in which the AMOC is forcefully shut down by a very large freshwater input to the north Atlantic) shows no such weakening of the northerly wind extremes over the North Sea (although the strengthening of the winter westerlies is seen) and consequently the increase in extreme surges extends over the south east coast of the UK in that simulation.

Suggestions for further work
This article takes a "first look" at the effect of AMOC reductions on storm surge via atmospheric changes. We have taken a relatively crude approach to analysing the extreme values, directly comparing 30-year means of annual maxima from two time slices. In this we were guided by the maxim "do the simplest thing first": our approach simplifies the analysis and increases the clarity and readability of the results. The use of 30-year time slices is commonplace in climate change modelling studies. Furthermore, almost any reasonable statistic (like the mean of 30 annual maxima) is legitimate for making a comparison between an experiment and a control in order to decide whether a significant change can be seen. However, it is apparent from figure 1 that the AMOC is not stationary over the time slice of the experiment, and thus it is unlikely that the surge annual maxima are stationary. This suggests that a more sophisticated approach to statistical modelling of extreme values is desirable in order to better quantify the dependence of surge extremes on the AMOC in terms of more conventional extreme value parameters. For example, one could use the 5 largest independent surges each year with time as a covariate, following UKCP18 (Palmer et al 2018). An appealing alternative would be to use the AMOC (rather than time) as a covariate in the extreme value fitting.
Another desirable development, beyond the scope of the present study, is the extension of the hosing simulation until the "hosed" climate comes closer to equilibrium. This might solve the problem of nonstationarity in the second time slice. However, we note that, from the point of view of coastal-change preparedness, changes on the time scale of the transient simulation (as identified here) are also important.
It is possible that interactions between these sources also contribute. For example, differences in member parameters likely contribute to the different ways in which the members respond to the forcing.
To study the impact of parameter change whilst minimising the impact of forcing, beginning with the dataset in panel (a), we subtract a within-period mean-over-members from the 13 data points of that period (informally: keep the size and shape of each coloured cloud, but slide it so its centre is at the origin. The grey bars will change). This gives the data shown in panel (b). Departures from zero in this plot represent differences between a member and the multi-member mean. The correlation (r = 0.15) is not significant: either the effect of the parameters on the surge is not associated with the effect of the parameters on the AMOC, or if it is, the association is lost in the noise. Investigation of the inter-member differences in surge is beyond the scope of this work.
Conversely, to study the impact of a century of RCP8.5 forcing (i.e. the effect of climate change) whilst minimising the impact of parameter change, we follow an exactly analogous procedure. Beginning with the dataset in panel (a), we subtract a within-member mean-over-periods from the 2 data points of that member (informally: keep the size and shape of each grey bar, but slide it so its centre is at the origin. The coloured clouds will change). This gives the data shown in panel (c). Departures from zero in this plot represent differences between a period and the two-period mean. The mean-over-members for each period are shown by the colour-coded crosshairs. The spread of the x means in panel (c) (the x difference between the crosshairs) is greater than the inter-member x spread in panel (b): the signal of forced AMOC change is greater than the inter-member AMOC variability. The opposite is true for the surge changes, i.e. the spread of the y means in panel (c) (the y difference between the crosshairs) is less than the inter-member y spread in panel (b): the signal of forced surge change is less than the inter-member surge variability. This may be in part due to the inherently noisy nature of block maxima such as the annual maximum, even after averaging over 30 years. Taking either period in isolation (i.e. either the blue cloud or the orange cloud), we can see that within that period, the correlation between members is not significant: either the effect of the parameters on the surge change is not associated with the effect of the parameters on the AMOC change, or if it is, the association is lost in the noise. We return to this in section C.1. However, considering the pool of both periods, there is a strong correlation (r = 0.77). All of this strength comes from the mean changes (shown by crosshairs), which can be thought of as common to all members, with internal variability and/or parameter differences adding variation superimposed on this mean change. By construction, the number of degrees of freedom in this data is only 13, since the 1970 to 1999 data is the negative of the 2070 to 2099 data. Taking the degrees of freedom into account, the P value of the correlation is less than 1%. The ordinary least squares regression of y on x (panel (c)) has a slope of about 1 centimetre per Sverdrup. As pointed out by Gregory (pers. comm., Jonathan Gregory, by email, 2023/05/22), in this situation where the inter-member variations are uninformative, but the change in the x means is unequivocal,
• The significance of the association between x and y can also be assessed by a T test on the change in the y means, and
• We can also estimate the slope as mean(y)/mean(x).
T for the change in the y means is about 4.4 (P value less than 1%, again taking into account the degrees of freedom), in agreement with the significance indicated by the correlation. The ratio mean(y)/mean(x) is about 1 centimetre per Sverdrup, in agreement with the ordinary least squares regression result.
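The two centring procedures behind panels (b) and (c), together with the T statistic and the ratio-of-means slope, can be sketched as follows. This is a minimal sketch under stated assumptions, not the analysis code of this study: the data are synthetic, the function names are hypothetical, and the T statistic is assumed to be a one-sample test on the per-member changes (the text does not specify the exact form of the test).

```python
import numpy as np


def remove_period_mean(data):
    """Panel (b) style: subtract, for each period, the mean over members.

    data has shape (members, periods).
    """
    return data - data.mean(axis=0, keepdims=True)


def remove_member_mean(data):
    """Panel (c) style: subtract, for each member, the mean over periods."""
    return data - data.mean(axis=1, keepdims=True)


def forced_change_stats(x, y):
    """T statistic for the mean change in y, and the slope mean(dy)/mean(dx).

    x, y have shape (members, 2): column 0 is the early period, column 1
    the late period. Assumption: one-sample t on the per-member changes.
    """
    dx = x[:, 1] - x[:, 0]
    dy = y[:, 1] - y[:, 0]
    t_stat = dy.mean() / (dy.std(ddof=1) / np.sqrt(dy.size))
    return float(t_stat), float(dy.mean() / dx.mean())


# Synthetic stand-in data for 13 members and 2 periods (NOT the PPE output).
rng = np.random.default_rng(1)
n_members = 13
x_early = rng.normal(15.0, 0.5, n_members)            # negative-AMOC, Sv
x = np.column_stack([x_early, x_early + 5.0 + rng.normal(0.0, 0.5, n_members)])
y_early = rng.normal(60.0, 4.0, n_members)            # surge, cm
y = np.column_stack([y_early, y_early + 5.0 + rng.normal(0.0, 4.0, n_members)])

xc, yc = remove_member_mean(x), remove_member_mean(y)
# With two periods, the early anomalies are the negative of the late ones,
# so pooling both periods adds no degrees of freedom beyond the 13 members.
print("pooled r:", np.corrcoef(xc.ravel(), yc.ravel())[0, 1])
print("t, slope:", forced_change_stats(x, y))
```

Because the early-period anomalies mirror the late-period ones exactly, any significance test on the pooled panel (c) data has to use the number of members, not the number of plotted points.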
Summarising the message of panel (c): in the ensemble, there is a significant AMOC slowdown. There is also a significant increase in surge at Millport. In that way, the two things are associated. However, the inter-member variations in the slowdown have no value in explaining the inter-member variations in the surge increase. Of course, many other variables will change significantly in response to the forcing, so many other variables could similarly be shown to be associated with the surge changes. In the case of the AMOC changes we have a plausible physical argument for a connection. The hosing simulation values are shown by grey stars in panel (c). These represent with/without hosing, rather than a 21st-century change. They are shown for ease of reference only, and are not included in any of the statistical analysis of figure C1.
• There is no evidence here to choose between Model 1 and Model 2 (consistent with the poor correlation within the orange cloud in panel (c), or within the blue cloud in panel (c)).
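The comparison of residuals for the three candidate models of panel (d) (zero surge change, a constant surge change across members, and a surge change proportional to the negative-AMOC change) can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, and fitting the Model 2 slope by least squares through the origin is an assumption, since the text does not say how the slope is chosen.

```python
import numpy as np


def rms(values):
    """Root-mean-square of an array."""
    return float(np.sqrt(np.mean(np.square(values))))


def model_rms_residuals(x, y):
    """RMS residuals of three models for surge change y given AMOC change x.

    Model 0: y = 0 + residuals
    Model 1: y = mean(y) + residuals
    Model 2: y = x * slope + residuals
             (assumption: slope from least squares through the origin)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = np.dot(x, y) / np.dot(x, x)
    return {
        "model0": rms(y),
        "model1": rms(y - y.mean()),
        "model2": rms(y - slope * x),
    }


# Synthetic stand-in changes for 13 members (NOT the PPE output): a small
# spread in AMOC change and a noisy surge change around 1 cm/Sv.
rng = np.random.default_rng(2)
d_amoc = 5.0 + rng.normal(0.0, 0.5, 13)           # Sv
d_surge = 1.0 * d_amoc + rng.normal(0.0, 4.0, 13)  # cm
print(model_rms_residuals(d_amoc, d_surge))
```

When the spread in x is small relative to its mean, Models 1 and 2 predict nearly the same value for every member, so their residuals are similar and the data cannot discriminate between them.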
We turn now to the right-hand panels. These show an identical treatment for two periods from the unforced control simulation. Like the RCP8.5 data, these periods are separated by a century. Panels (e) and (g), compared with (a) and (c) respectively, visually confirm the significance of the effect of the forcing: it is clear that the separation of the blue and orange clouds in panel (c), in either x or y, is not consistent with internal variability (and/or model drift) such as is expressed in panel (g). Comparison of panel (g) with (c) suggests that any model drift is insignificant compared to the forced change. Panels (d) and (h) have a common Y-axis, so we can see that the residuals of the two plausible models (1 and 2) in the forced simulation are no bigger than the changes in the unforced simulation. This suggests that these residuals are unforced "noise" and are unlikely to be explained by further investigation of this dataset. It also shows that the following interpretation of the data in panel (c) is plausible: "all members show a forced response of one centimetre per Sverdrup, with variations around this response consistent with unforced variability". The hosing simulation shows a response of about 1.7 centimetres per Sverdrup. This is within the range of the ensemble.
C.1. Why can't any relationship be seen within the blue (or orange) cloud in panel (c)?
Let dS be the surge anomaly (the 21st century change in mean annual maximum surge at Millport). Let dA be the AMOC anomaly (the 21st century change in negative-AMOC, in other words the AMOC slowdown).
Panel (c) shows that inter-member variations in dA are substantially less than the ensemble mean dA: std(dA) << mean(dA). Panel (g) shows that there is unforced noise in dS, comparable to the forced mean(dS) seen in panel (c): std(dS) ∼ mean(dS). The hypothesised underlying proportionality between dA and dS is detectable when
1. we reduce the noise by averaging over members, and
2. there is a large mean AMOC change (as seen in panel c).
It is not detectable in the large inter-member dS differences in response to the small inter-member differences in dA (the former are dominated by unforced noise).

Figure 1. Time series of annual mean AMOC strength, for the 13 members of the PPE, and for the hosing simulation. Vertical lines show the limits of our 30-year time-slices. Y-axis (AMOC strength) and X-axis (year) are common to all panels (tick labels have been omitted from all but one panel for brevity). However, the datum of years in the hosing simulation is arbitrary, so has been adjusted in this plot such that the time-slice of the hosing simulation matches that of the PPE for ease of visual comparison.

Figure 2. Surge and wind component response in the hosing simulation (difference in mean of 30 consecutive annual maxima with/without hosing). The percentage change in surge is also shown. Panel (a): change (difference between forced experiment and unforced control) in 30-year mean of annual maximum surge. Panel (c): as (a) but as a percentage of the control. Panel (b): change (difference between forced experiment and unforced control) in 30-year mean of annual maximum westerly wind component. Panel (d): change (difference between forced experiment and unforced control) in 30-year mean of annual maximum northerly wind component.
McDonald et al (in prep) note that some of the same signals of an increase in atmospheric storminess over the UK and into Europe appear in the CMIP5 (Zappa et al 2013) and CMIP6 multi-model ensembles (Harvey et al 2020).

Figure 3. Surge response in the PPE simulation (difference between RCP 8.5-forced experiment and unforced control simulation in the mean of 30 consecutive annual maxima: 2070-2099 inclusive). The numbers are the last four digits of the realization numbers as used in Sexton et al (2021) and Yamazaki et al (2021) and are used to identify the ensemble members.

Figure 5. Westerly wind response (predominantly positive) in the PPE simulation characterised as the difference (metres per second) between RCP 8.5-forced experiment and unforced control simulation in the mean of 30 consecutive annual maxima (2070-2099 inclusive) of the westerly component of wind.

Figure 4. Change in mean of 30 consecutive annual maxima (as figure 3) but here shown as a percentage change.

Figure 7. Storm surge changes: UK coastline summary plot. The Y-axis in panels (a) and (b) is the difference (metres) between the forced experiment and unforced control simulation in the mean of 30 consecutive annual maximum surges, as mapped in figure 3. Panel (a): UK mainland coast. Panel (b): various off-mainland sites. Blue vertical lines show the range of the PPE simulation. Red dots show the hosing simulation. Orange bars show the range of the unforced changes (see main text). Panels (c) and (d) show the percentage change (as mapped in figure 4) for the same sites. These panels also show the "high-end" change reported by UKCP18 Marine, interpreted as a percentage change over 100 years as described in the main text. ("Mainland" is nominal: for example Millport is included in the mainland sites.)

Figure 8. (a) Ensemble mean winter surface air temperature anomaly over the northern Atlantic. (b) Northern Atlantic baroclinicity increase vs AMOC slowdown for the 13 members.

Figure 9. (a) Time evolution of forced PPE ensemble-mean negative-AMOC. Grey points show the ensemble-mean for each year. Blue points show 7-yr block means of that data (made from non-overlapping blocks). (b) Time evolution of forced PPE ensemble-mean annual maximum surge at Millport. Grey points show the ensemble-mean for each year. Blue points show 7-yr block means of that data (made from non-overlapping blocks). (c) PPE ensemble-mean annual maximum surge at Millport (Y axis) against PPE ensemble-mean negative-AMOC (X axis). Grey points show all years. Blue points show 7-yr block means (made from non-overlapping blocks). The all-years Pearson product-moment correlation coefficient is 0.42 (P value << 0.1%). The 7-yr block means Pearson product-moment correlation coefficient is 0.87 (P value << 0.1%). The all-years ordinary least-squares regression of Y on X has a gradient of 0.89 cm/Sv (0.91 cm/Sv for the 7-yr block means).

Figure B2. Pearson's r for correlation between 483 annual maximum skew surges from a surge-and-tide simulation and 483 annual maximum surges from a surge-only simulation at UK coastal sites. The worst correlation is at Immingham; the best is at Lerwick (both shown in figure B1).

Figure B1. Scatter plots of 483 annual maximum skew surges from a surge-and-tide simulation (X-axis) vs. 483 annual maximum surges from a surge-only simulation (Y-axis) at the UK coastal locations where the correlation between these two is best (Lerwick) and worst (Immingham). See the main text for details.

Figure C1. Relationship between AMOC and mean annual maximum surge at Millport. Left-hand panels show data from the RCP8.5 simulation. Right-hand panels show data from the control simulation. Orange points show data from the period 1970 to 1999. Blue points show data from the period 2070 to 2099. Faint grey lines connect pairs of points from the same ensemble member. Y-axes refer to surge at Millport and X-axes refer to negative-AMOC (except in the bar charts). Apart from the bar charts, the same X axis scale is used for all panels (except for an offset in panels (a) and (e)). Similarly, apart from the bar charts, the same Y axis scale is used for all panels (except for an offset in panels (a) and (e)). Inter-member variations in surge are not correlated with those of AMOC (panel b). Forced responses are correlated, but only by virtue of the mean response (panel c). The residual surge response not explained by the AMOC response is no bigger than the unforced variability (panels d and h). A detailed explanation is in this appendix.
Panel (d) of figure C1 shows the RMS residuals associated with three different models of the surge changes shown in panel (c):
• Model 0: surge change is zero: y = 0 + residuals
• Model 1: surge change is a constant across members: y = mean(y) + residuals
• Model 2: surge change is proportional to negative-AMOC change: y = x * slope + residuals
Panel (d) shows that
• Model 0 is invalid for the forced simulation, as confirmed by the T test.

Table 1. Mean AMOC strength (Sverdrups) over our 30-year time slice in the unforced control and the forced experiment, with change in Sv and change as a percentage of control.
We are particularly interested in any changes in extreme surges. Some previous work (e.g. Lowe et al 2009,