Arctic warming in response to regional aerosol emissions reductions

This study examines the Arctic surface air temperature response to regional aerosol emissions reductions using three fully coupled chemistry–climate models: National Center for Atmospheric Research-Community Earth System Model version 1, Geophysical Fluid Dynamics Laboratory-Coupled Climate Model version 3 (GFDL-CM3) and Goddard Institute for Space Studies-ModelE version 2. Each of these models was used to perform a series of aerosol perturbation experiments, in which emissions of different aerosol types (sulfate, black carbon (BC), and organic carbon) in different northern mid-latitude source regions, and of biomass burning aerosol over South America and Africa, were substantially reduced or eliminated. We find that the Arctic warms in nearly every experiment, the only exceptions being the U.S. and Europe BC experiments in GFDL-CM3 in which there is a weak and insignificant cooling. The Arctic warming is generally larger than the global mean warming (i.e. Arctic amplification occurs), particularly during non-summer months. The models agree that changes in the poleward atmospheric moisture transport are the most important factor explaining the spread in Arctic warming across experiments: the largest warming tends to coincide with the largest increases in moisture transport into the Arctic. In contrast, there is an inconsistent relationship (correlation) across experiments between the local radiative forcing over the Arctic and the simulated Arctic warming, with this relationship being positive in one model (GFDL-CM3) and negative in the other two. Our results thus highlight the prominent role of poleward energy transport in driving Arctic warming and amplification, and suggest that the relative importance of poleward energy transport and local forcing/feedbacks is likely to be model dependent.


Introduction
The Arctic has experienced rapid warming over recent decades (Gulev et al 2021), with this warming occurring about four times faster than the global mean warming since 1979 (Chylek et al 2022, Rantanen et al 2022. While the dominant role of anthropogenic greenhouse gases (GHGs) in driving the recent Arctic warming is well established (e.g. Najafi et al 2015, Polvani et al 2020, England 2021, Aizawa et al 2022, other anthropogenic forcings-notably, aerosols-may have contributed as well. England (2021) found that anthropogenic aerosol forcing warmed the Arctic at a rate of 0.18 • C/decade during 1975-2019 (compared with a warming of 0.59 • C/decade during this period from GHGs). This aerosol-induced warming is a result of reductions in aerosol (and aerosol precursor) emissions from the United States and Europe (Acosta Navarro et al 2016, Wang et al 2018. In contrast, increases in aerosol emissions from Asia in recent decades have likely contributed to Arctic cooling (Yang et al 2014, Wang et al 2018, thus opposing the observed warming. These increases in Asian emissions are not expected to continue, however, as efforts to improve air quality are likely to result in decreases in anthropogenic aerosol emissions from Asia over the next several decades (Lund et al 2019). These decreases in Asian emissions could contribute significantly to Arctic warming and sea-ice loss (Wang et al 2018, Merikanto et al 2021, thus enhancing the effects of rising GHGs. Motivated by these potential impacts of anthropogenic aerosol forcing on Arctic climate, we here examine the Arctic climate response to several different regional aerosol emissions reductions using three coupled chemistry-climate models: the Community Earth System Model version 1 (CESM1) which was run at the National Center for Atmospheric Research (NCAR-CESM1), the Geophysical Fluid Dynamics Laboratory (GFDL) Coupled Climate Model version 3 (GFDL-CM3), and the Goddard Institute for Space Studies (GISS) ModelE version 2 (GISS-E2-R). Within each of these models, emissions of sulfur dioxide (SO 2 , precursor to sulfate aerosol) and/or carbonaceous aerosol from six world regions (the United States, Europe, China, India, South America, and Africa) were instantaneously reduced or eliminated (see section 2.1 for additional details). The climate response to these emissions reductions was then computed as the difference relative to a long control simulation with fixed year 2000 or 2005 emissions.
This same set of model experiments was recently used by Westervelt et al (2020) to assess the global and regional surface temperature response to regional aerosol emissions reductions. While the Arctic was shown to warm in most experiments, the physical mechanisms underlying this warming response were not explored. Here, we aim to provide a mechanistic understanding of the Arctic warming response to regional aerosol emissions reductions. By analyzing the perturbation atmospheric energy budget, we show that enhanced moisture transport into the Arctic due to aerosol forcing is a leading-order factor explaining the Arctic warming response in all models. Less robust across models are the relationships between local radiative forcing and feedbacks and Arctic warming, which can be of opposite sign in different models.

Model experiments
As noted above, we employ simulations with perturbed regional aerosol emissions from three different models: NCAR-CESM1 (Neale et al 2012), GFDL-CM3 (Donner et al 2011), and GISS-E2-R (Schmidt et al 2014). Each of these models includes active atmosphere, ocean, sea ice, and land surface components, and has fully interactive chemistry of aerosols and trace gases. All models represent the aerosol direct and first indirect (cloud albedo) effects, while NCAR-CESM1 and GFDL-CM3 additionally represent the second indirect (cloud lifetime) effect. The model configurations used here are very similar to those that were used for Coupled Model Intercomparison Project phase 5 (CMIP5). For additional model description and model evaluation, we refer readers to Westervelt et al (2017) and Naik et al (2013).
For each model, we completed a long (up to 400 years) control simulation with fixed year 2000 (2005 for NCAR-CESM1) boundary conditions (including emissions of aerosols and aerosol precursors). From this control simulation, we then launched several additional simulations in which regional aerosol emissions were perturbed in various ways. These perturbation simulations are summarized in table 1. Briefly, we instantaneously reduced or eliminated the emissions of SO 2 , black carbon (BC), and organic carbon (OC) from six world regions (the United States, Europe, China, India, South America, and Africa). In some experiments (e.g. USO2), a single aerosol (or precursor) species was perturbed, whereas in other experiments (e.g. UALL) multiple species were perturbed simultaneously. All 14 experiments listed in table 1 were completed with the GFDL-CM3. For NCAR-CESM1 (GISS-E2-R), 12 (8) of these experiments were completed.
Each aerosol perturbation experiment includes two companion simulations: a fully coupled atmosphere-ocean simulation of 160-240 years in length, and a fixed sea surface temperature (SST) simulation of 40-80 years in length. In the fixed SST simulations (which include a year 2000 (or 2005 for NCAR-CESM1) control), only the atmosphere and land surface components of the models are active, with SSTs and sea-ice concentrations prescribed to follow a repeating climatological seasonal cycle (but with no other time variation). We compute the climate response to a given aerosol perturbation as the difference between the perturbation simulation and the control simulation. More specifically, we difference each time in the perturbation simulation from the corresponding time in the control simulation, and then average over all times, which removes the effects of any drift in the (coupled) control simulation. Finally, it is worth noting that the same set of model experiments described here was previously used by Westervelt et al (2020) to assess the global and regional surface temperature response to regional aerosol emissions reductions (as mentioned above), and by Westervelt et al (2018) to assess the precipitation response to these emissions reductions. Additionally, the surface temperature and precipitation responses in just the USO2 experiment (see table 1) were previously assessed by Conley et al (2018) and Westervelt et al (2017), respectively.

Energy budget analysis
We analyze the perturbation atmospheric energy budget in order to understand the Arctic annual-mean surface air temperature (SAT) response to regional aerosol emissions reductions. The perturbation energy budget can be expressed as: where R is the net downward top-of-atmosphere (TOA) radiation, OHU is the ocean heat uptake (i.e. net downward surface energy flux), AHT is the convergence of the atmospheric heat transport, and ∆ is the response to aerosol forcing (i.e. the difference between the perturbation and control simulations). Note that since we are focused here on annual time scales, the atmospheric energy storage has been ignored (i.e. assumed to be zero). The TOA radiation response ∆R can be decomposed into a radiative forcing ∆F and radiative feedbacks, with the latter assumed proportional to the SAT response: where λ is referred to as the feedback parameter. Substituting (2) into (1) and solving for ∆SAT yields: We take ∆F to be the effective radiative forcing (ERF), which we estimate as the net TOA radiation response in the fixed SST simulations (e.g. Forster et al 2016. The feedback parameter λ is estimated as the slope of the regression line obtained by regressing annual-mean ∆R against ∆SAT, using output from the coupled atmosphere-ocean simulations (Gregory et al 2004). With these estimates of ∆F and λ, and using model output ∆SAT and ∆OHU (from the coupled simulations), we then estimate ∆AHT as a residual in equation (3).
The total AHT response ∆AHT can be decomposed into responses of the dry static energy (DSE) flux convergence ∆AHT dse and moisture flux convergence ∆AHT q : Substituting (4) into (3) yields: We estimate ∆AHT q as a residual in the atmospheric moisture budget: where P and E are the surface precipitation and evaporation, and atmospheric moisture storage has been ignored. Finally, ∆AHT dse is computed as the difference between ∆AHT and ∆AHT q . Equation (5) represents our working form of the perturbation atmospheric energy budget. In section 3.2, we evaluate the different terms on the right-hand side of this equation in order to understand the SAT response in our model experiments. Terms in equation (5) are generally averaged annually and over the Arctic region (60 • N-90 • N), although we also briefly consider in our analysis the SAT response averaged globally, and the ERF averaged globally and over Northern Hemisphere (NH) midlatitudes (30 • N-60 • N). Note that when averaged over the Arctic, ∆AHT dse and ∆AHT q are equivalent to the changes in the poleward fluxes of DSE and moisture across 60 • N.

Quantification of uncertainty
We quantify the uncertainty in the mean SAT response by computing the 95% confidence interval (CI) for this mean response: where N is the sample size, t N−1 is the critical value from the t distribution for N−1 degrees of freedom, and ∆SAT std is the standard deviation of the SAT response. When the mean SAT response is an average over time in a particular model experiment, we account for autocorrelation in the model time series by computing an effective sample size N eff (N eff ⩽ N; e.g. Zwiers and von Storch 1995, Santer et al 2000: where r 1 is the lag-1 autocorrelation. In this case, N eff is used in equation (7) both in the computation of the standard error (∆SAT std / √ N eff ), and in the indexing of the critical t value. Note that when the mean SAT response is an average over time in a particular model experiment, ∆SAT std in equation (7) is the standard deviation of the annual-mean ∆SAT time series for that experiment. In this case, CI ∆SAT is an estimate of the uncertainty in the mean (forced) SAT response in that experiment due to internal climate variability.
In section 3.2, we use least-squares linear regression to quantify the relationship between the SAT response and the various terms in the perturbation atmospheric energy budget (equation (5)) across the model experiments. The slope of the regression line m has a 95% CI given by: where y −ŷ are the regression residuals, an overbar indicates a mean (in this case, over all experiments), and t N−2 is the critical value from the t distribution for N−2 degrees of freedom. In our analysis, y in equation (9) will typically be defined as the model-simulated SAT response (with corresponding least-squares estimateŷ), and x will typically be defined as a given term in the perturbation energy budget. The sample size N in this case is the number of experiments in a given model (see table 1).

Arctic SAT response
We begin by considering the annual-mean SAT response averaged over the Arctic region (figure 1). In nearly all model experiments, the Arctic warms in response to regional aerosol emissions reductions. This warming is statistically significant (i.e. error bars in figure 1 (bracketing the 95% CI) do not overlap zero) in about half of the experiments in NCAR-CESM1 and GFDL-CM3 (7 out of 12 experiments and 6 out of 14 experiments, respectively), and nearly all (7 out of 8) of the experiments in GISS-E2-R. Averaged over all experiments, we find a similar Arctic warming response in the three models: 0.22 ± 0.06 K in NCAR-CESM1, 0.20 ± 0.09 K Figure 1. Arctic annual-mean SAT response in the model experiments. Error bars show the 95% CI (equation (7)), accounting for autocorrelation in the model time series (equation (8)).
in GFDL-CM3, and 0.16 ± 0.04 K in GISS-E2-R (mean response ± 95% CI). However, although the mean responses are similar, the variation in the Arctic warming response across the experiments is not well correlated between different pairs of models, with a correlation coefficient r of 0.27 between NCAR-CESM1 and GFDL-CM3, 0.16 between NCAR-CESM1 and GISS-E2-R, and −0.08 between GFDL-CM3 and GISS-E2-R (all correlations are insignificant). Despite this lack of correlation, we will show in section 3.2 that a common physical mechanism (atmospheric moisture transport) largely explains the spread in Arctic warming response in all three models. We next examine the spatial patterns of annual-mean SAT response over the Arctic, as shown in figures 2-4 for the eight experiments common to all three models, and in figures S1 and S2 for the remaining experiments. In some cases, different experiments in a given model can yield broadly similar spatial patterns of SAT response, for instance the ISO2 and IBC experiments in NCAR-CESM1 (figures 2(e) and (f)). More commonly, however, there are notable differences in the SAT response patterns, both within a given model, and for the same experiment in different models. As an example of the latter, we consider the ISO2 experiment that is common to all models. In NCAR-CESM1 (figure 2(e)), this experiment produces an area of statistically significant warming (red shading) over Western Russia and the Kara Sea, with a smaller area of significant cooling (blue shading) evident over Hudson Bay. In GFDL-CM3 (figure 3(e)), significant warming in ISO2 occurs further to the north and west over Svalbard, Franz Josef Land, the Greenland Sea and portions of the Arctic Ocean, with an absence of any significant cooling. Finally, in GISS-E2-R (figure 4(e)), statistically significant SAT responses in ISO2 are mostly limited to a small area of weak warming occurring over northern Hudson Bay and Baffin Island.
We also briefly consider another experiment that is common to all models: CSO2. The Arctic mean warming in this experiment is quite similar in the three models ( figure 1), yet there are clear differences in the spatial patterns of SAT response. All models simulate a warming dipole across the Arctic Ocean in CSO2, but the strength of this dipole varies significantly, being strongest in NCAR-CESM1 (figure 2(d)) and weakest in GFDL-CM3 (figure 3(d)), with GISS-E2-R (figure 4(d)) somewhere in the middle. These examples thus serve to illustrate that even for the same aerosol emissions perturbation, different models can sometimes produce quite different climate responses over the Arctic.
SAT responses by month of year are shown in figure 5 for the eight experiments common to all three models, and in figure S3 for the remaining experiments. In both figures, Arctic mean SAT responses are depicted as colored lines, and global mean SAT responses as colored crosses. It is clear that Arctic SAT responses exhibit greater seasonality than global responses, with the former tending to maximize in the Annual-mean SAT response in NCAR-CESM1 for the eight experiments that are common to all three models. The response is plotted using a contour interval of 0.1 K with negative contours dashed and the zero contour excluded. Shading indicates that the response is statistically significant (i.e. 95% CI does not overlap zero). The 95% CI of the mean response is computed using equation (7), accounting for autocorrelation (equation (8)).
non-summer months and the latter fairly constant throughout the year. Arctic warming responses tend to be larger than global warming responses, indicating that Arctic amplification (AA) is occurring in these simulations. This is particularly true during the non-summer months, which fits with the expected seasonality of AA (Previdi et al 2021).

Energy budget analysis
We next analyze the perturbation atmospheric energy budget (see section 2.2) in order to understand the Arctic annual-mean SAT response in our model experiments. We focus on two aspects of the Arctic SAT response in each model: the mean SAT response averaged over all experiments, and the spread in the SAT response across the experiments. As noted in section 3.1, the mean Arctic SAT response averaged over all experiments is similar in the three models: 0.22 ± 0.06 K in NCAR-CESM1, 0.20 ± 0.09 K in GFDL-CM3, and 0.16 ± 0.04 K in GISS-E2-R. In order to identify the processes responsible for producing this mean response, we average each term in the perturbation energy budget (equation (5)) over all model experiments (figure 6). A positive (negative) value for a given energy budget term signifies that, on average, that term contributes to a gain (loss) of energy for the Arctic atmosphere, and thus contributes to (opposes) the mean warming response. To  begin, we note that the contribution of local radiative forcing over the Arctic (∆F) to the mean warming response is of opposite sign in different models, being positive in NCAR-CESM1 and GFDL-CM3 and negative in GISS-E2-R. In all models, however, this contribution of ∆F is offset by a similar magnitude but opposite-signed response of the OHU (∆OHU). The contribution of local radiative feedbacks over the Arctic (λ∆SAT) to the mean warming response is also of opposite sign in different models, being positive in GFDL-CM3 and negative in NCAR-CESM1 and GISS-E2-R. This is consistent with a previous multi-model analysis (Block et al 2020) of the CMIP5 abrupt 4 × CO 2 experiment, in which it was shown that the total radiative feedback in the Arctic (i.e. λ) can be positive or negative depending on the model. Lastly, all three models in our analysis agree on the sign of ∆AHT dse and ∆AHT q . The former is negative, indicating decreased DSE transport into the Arctic, which is expected from reductions in the meridional temperature  (5)) plotted against the SAT response. All quantities are Arctic annual means. Each point in a given panel represents a model experiment. Black lines show the least-squares linear fit to the data. The slope of this least-squares fit m (with 95% CI; see equation (9)) and the correlation coefficient r are given at the bottom of each panel. A single (double) asterisk indicates a correlation that is significant at the 95% (99%) level. gradient due to polar-amplified warming (see figures 5 and S3; Langen and Alexeev 2007, Hwang et al 2011, Skific and Francis 2013, Graversen and Burtu 2016, Audette et al 2021. The latter is positive, indicating increased moisture transport into the Arctic. The physical reason for this increased moisture transport will be discussed below. Here, we simply note that ∆AHT q is the largest contributor to the mean Arctic warming response in both GFDL-CM3 and in GISS-E2-R. In NCAR-CESM1, this term is the second largest contributor to the mean warming response behind ∆F. Before proceeding to discuss the spread in Arctic warming across the experiments, we note that the mean energy flux responses shown in figure 6 are generally consistent in sign with the majority of the individual experiments. The only instance where this is not the case is for ∆AHT dse in GISS-E2-R, where equal numbers of the individual experiments exhibit positive and negative responses. For ∆AHT q , 8 out of 12 experiments in NCAR-CESM1, 12 out of 14 experiments in GFDL-CM3, and 6 out of 8 experiments in GISS-E2-R exhibit positive responses, thus consistent with the experiment mean. In two of the individual experiments (SABB in NCAR-CESM1 and IBC in GISS-E2-R), the Arctic warms significantly (see figure 1) yet ∆AHT q is negative, indicating that processes other than enhanced poleward moisture transport must be driving the Arctic warming in these experiments. In general though, the processes that we have identified as driving the mean Arctic warming (including enhanced poleward moisture transport) also contribute to warming in the individual experiments.
We now turn our attention to the spread in the Arctic SAT response across the model experiments. Figure 7 shows the SAT response plotted versus each term in the perturbation energy budget (equation (5)) for the three models. In all models, there is a statistically significant positive correlation between the Arctic SAT response and the moisture transport response (figures 7(e), (j) and (o)). In other words, model experiments that exhibit the greatest amount of Arctic warming tend to be the experiments with the largest increases in moisture transport into the Arctic. Correlations between the Arctic SAT response and the other terms in the energy budget are generally not statistically significant, with the exceptions of a significant positive correlation between ∆SAT and ∆F (figure 7(f)) and a significant negative correlation between ∆SAT and ∆AHT dse (figure 7(i)) in GFDL-CM3.
Given the importance of the moisture transport response for both the mean Arctic warming (figure 6) and the warming spread (figure 7), we wish to better understand the factors controlling this response. Increased poleward moisture transport in the extratropics with warming is a direct consequence of warming-induced increases in lower-tropospheric water vapor (Held and Soden 2006). Because of the Figure 8. Global mean SAT response (∆SAT glb ) plotted against the Arctic moisture transport response (∆AHTq; panels (a), (c), and (e)) and the global mean ERF (∆F glb ; panels (b), (d), and (f)). All quantities are annual means. Each point in a given panel represents a model experiment. Black lines show the least-squares linear fit to the data. The slope of this least-squares fit m (with 95% CI; see equation (9)) and the correlation coefficient r are given in the top left of each panel. A single (double) asterisk indicates a correlation that is significant at the 95% (99%) level. nonlinear dependence of the saturation vapor pressure on temperature-as given by the Clausius-Clapeyron relationship-these water vapor increases are larger at warmer low latitudes than at colder high latitudes (even despite polar-amplified warming), which strengthens the meridional humidity gradient and drives the increased poleward moisture transport (e.g. Burtu 2016, Previdi et al 2021). Thus, the mean increase in moisture transport into the Arctic in our model experiments (figure 6) is an expected response to the simulated mean warming together with the nonlinearity of the Clausius-Clapeyron relationship. One might also expect that ∆AHT q would increase across the model experiments in proportion to the increase in average warming (see also Held and Soden 2006). Indeed, we find a strong and statistically significant positive correlation between ∆AHT q and the globally-averaged SAT response (∆SAT glb ) in all three models (figures 8(a), (c) and (e)). Perhaps not surprisingly, most of the variance in ∆SAT glb across the experiments can be explained by differences in the globally-averaged ERF (figures 8(b), (d) and (f); see also Westervelt et al 2020).

Discussion and conclusions
In this study, we have analyzed the Arctic temperature response to regional aerosol emissions reductions using three fully-coupled chemistry-climate models. We found that the Arctic warms in most model experiments (figure 1), and that this warming is generally amplified relative to the global mean warming, particularly during non-summer months (figures 5 and S3). The perturbation atmospheric energy budget was analyzed in order to better understand both the mean Arctic warming (i.e. the response averaged over all experiments) and the warming spread across the experiments. In both cases, changes in atmospheric moisture transport into the Arctic are important. There is a mean increase in moisture transport into the Arctic in all three models (figure 6), which is the largest contributor to the mean Arctic warming in two of the models (GFDL-CM3 and GISS-E2-R). Increases in moisture transport into the Arctic tend to be largest in the model experiments where the Arctic warms the most (figures 7(e), (j) and (o)). Our results therefore support several previous studies (e.g. Flannery 1984, Cai 2005, Cai and Lu 2007, Langen and Alexeev 2007, Graversen and Burtu 2016, Yoshimori et al 2017, Merlis and Henry 2018, Graversen and Langen 2019) that have suggested that enhanced poleward moisture transport contributes significantly to Arctic warming and AA. This enhanced moisture transport warms the polar region by strengthening the greenhouse effect over the Arctic, both directly and by increasing Arctic cloudiness (Previdi et al 2021).
Increased poleward moisture transport into the Arctic (and in the extratropics more generally) is a robust response of the hydrological cycle to global warming (Held and Soden 2006). Accordingly, the moisture transport response (∆AHT q ) in our model experiments increases in proportion to the global mean SAT response (∆SAT glb ; see figures 8(a), (c) and (e)). This being said, not all of the variance in ∆AHT q across the experiments can be explained by differences in ∆SAT glb . In NCAR-CESM1 ( figure 8(a)), most of the variance in ∆AHT q cannot be explained by ∆SAT glb . In this model, we find that a better predictor of ∆AHT q is the ERF averaged over NH midlatitudes (30 • N-60 • N), with these two variables significantly correlated (r = 0.83) across the experiments. A positive correlation between ∆AHT q and the NH midlatitude ERF also exists in GFDL-CM3 and GISS-E2-R (r = 0.71 and r = 0.46, respectively), but it is weaker and statistically insignificant in GISS-E2-R. In the latter two models, ∆AHT q is more strongly correlated with the global mean ERF (r = 0.79 in GFDL-CM3 and r = 0.73 in GISS-E2-R) and SAT response (figures 8(c) and (e)). (The correlation between ∆AHT q and the global mean ERF in NCAR-CESM1 is relatively weak and insignificant, r = 0.35.) Thus, the most important factors controlling the moisture transport response to regional aerosol emissions reductions differ somewhat between models.
Other aspects of the climate response to these emissions reductions similarly exhibit model dependence. As noted in section 3.1, there is no correlation in the Arctic SAT response across the experiments between different models. This lack of correlation persists even when the Arctic SAT response is normalized by the global mean ERF (not shown). Additionally, in some cases, the same aerosol perturbation can lead to quite different spatial patterns of Arctic SAT response in different models (figures 2-4 and S1, S2). Finally, our analysis of the perturbation atmospheric energy budget has revealed that certain energy budget terms-specifically, the local radiative forcing and feedbacks and the OHU response-make opposite-signed contributions to the mean Arctic SAT response in different models ( figure 6).
Comparing the models across such a diverse set of aerosol perturbation experiments makes it difficult to identify the causes of these model differences. Such differences may arise due to different model sensitivities to emissions perturbations from a particular region(s), or of a particular aerosol type(s), or some combination of the two (and possibly other reasons as well). In this regard, it may be informative to compare the models across a subset of more similar experiments. To test this, we computed the correlation in the Arctic SAT response across just the SO 2 perturbation experiments in different pairs of models. For this subset, we find correlations of r = 0.92 between NCAR-CESM1 and GFDL-CM3, r = 0.45 between NCAR-CESM1 and GISS-E2-R, and r = 0.11 between GFDL-CM3 and GISS-E2-R. The correlation between NCAR-CESM1 and GFDL-CM3 is significant at the 90% level even despite the very small sample size (N = 4). This suggests that the lack of correlation between these models across the full set of experiments (r = 0.27; see section 3.1) likely cannot be explained by different model sensitivities to regional SO 2 emissions. Given the difficulty of establishing statistically robust results using only a subset of the model experiments, we do not pursue this further. This example nevertheless serves to illustrate the potential value of comparing the models across a more similar set of aerosol perturbations in order to better understand the causes of the model differences described above.
In closing, our results underscore the importance of adopting a multi-model framework when assessing the climate response to aerosol forcing, so as to be able to distinguish between robust and model-dependent aspects of the response. Along these lines, the ongoing Regional Aerosol Model Intercomparison Project (RAMIP; Wilcox et al 2022) is developing a new set of Earth System Model simulations (using CMIP6 models) in which the emissions of anthropogenic aerosols and their precursors are perturbed in several world regions (East Asia, South Asia, Africa, and the Middle East). These emissions perturbations are realistic transient perturbations based on different assumed Shared Socioeconomic Pathway future scenarios (Rao et al 2017, Riahi et al 2017, in contrast to the idealized instantaneous emissions perturbations considered in the present study. Analysis of RAMIP results will therefore allow for continued exploration of the Arctic climate response to regional aerosol emissions changes-importantly, within a framework that is relevant for near-term Arctic climate change over the next few decades.

Data availability statements
The data cannot be made publicly available upon publication because the cost of preparing, depositing and hosting the data would be prohibitive within the terms of this research project. The data that support the findings of this study are available upon reasonable request from the authors.