Identifying the regional emergence of climate patterns in the ARISE-SAI-1.5 simulations

Stratospheric aerosol injection is a proposed form of solar climate intervention (SCI) that could potentially reduce the amount of future warming from externally-forced climate change. However, more research is needed, as there are significant uncertainties surrounding the possible impacts of SCI, including unforeseen effects on regional climate patterns. In this study, we consider a climate model simulation of the deployment of stratospheric aerosols to maintain the global mean surface temperature at 1.5 °C above pre-industrial levels (ARISE-SAI-1.5). Leveraging two different machine learning methods, we evaluate when the effects of SCI would be detectable at regional scales. Specifically, we train a logistic regression model to classify whether an annual mean map of near-surface temperature or total precipitation is from future climate change under the influence of SCI or not. We then design an artificial neural network to predict how many years it has been since the deployment of SCI by inputting the regional maps from the climate intervention scenario. In both detection methods, we use feature attribution methods to spatially understand the forced climate patterns that are important for the machine learning model predictions. The differences in regional temperature signals are detectable in under a decade for most regions in the SCI scenario compared to greenhouse gas warming. However, the influence of SCI on regional precipitation patterns is more difficult to distinguish due to the presence of internal climate variability.


Introduction
All components of the Earth system are experiencing rapid change due to human-driven activities, such as the emission of greenhouse gases (IPCC et al 2021). In fact, the primary global mean surface temperature (GMST) monitoring datasets all agree that the last seven years (2015-2021) are the seven warmest on record (Ades et al 2022). The GMST is now consistently more than 1.1 °C above the 1850-1900 pre-industrial reference period and is therefore quickly approaching the critical warming levels of 1.5 °C and 2 °C, which are associated with even more consequential global climate change impacts (IPCC 2018, McKay et al 2022). The effects of human activities (i.e. the forced response) have already been detected outside the range of internal climate variability (Sippel et al 2021), such as through changes to the regional hydrological cycle (e.g. Marvel et al 2019, Madakumbura et al 2021), modulation of the seasonality of tropospheric temperatures (Santer et al 2022), cooling and contraction of the stratosphere (Pisoft et al 2021), increases in some extreme weather events (e.g. Clarke et al 2022), rising global sea levels and deep ocean heat content (e.g. Hsu and Velicogna 2017, Cheng et al 2022), and through the loss of ice mass in the global cryosphere (Slater et al 2021).
Given the continued high levels of global carbon emissions (Liu et al 2022), it is still uncertain whether countries' long-term pledges and commitments for net-zero emissions are enough to prevent overshooting Paris Agreement targets within the next few decades (e.g. UNFCCC 2015, Dvorak et al 2022, Matthews and Wynes 2022, Meinshausen et al 2022). In addition to exploring technologies for a net-zero energy system (Davis et al 2018), large-scale carbon capture and storage (de Kleijne et al 2022), and other mitigation strategies, the deployment of solar climate intervention (SCI) technology has been discussed as a possible alternative for reducing the most adverse impacts of climate change (Kravitz and MacMartin 2020). However, there are numerous ethical and political concerns, issues of feasibility, uncertainties in the Earth system response, and the potential for unforeseen consequences surrounding the use of SCI methods (Burns et al 2016, Irvine et al 2016, Carlson and Trisos 2018, Mahajan et al 2019, Abatayo et al 2020). To better constrain the costs, risks, and benefits of SCI strategies, the National Academies of Science, Engineering and Medicine (NASEM) outlined a series of recommendations for conducting and supporting more research on this topic, including the impacts of SCI on regional patterns and extremes relative to climate change and natural variability (NASEM 2021).
One often studied form of SCI is through the potential deployment of stratospheric aerosols, otherwise known as stratospheric aerosol injection (Robock et al 2008). By deliberately releasing sulfates, calcium carbonate, or other materials into the atmosphere, a small amount of incoming sunlight would be reflected back into space. Thus, this mechanism would act to cool Earth's climate in a manner that is analogous to the climate effects of an explosive volcanic eruption (Robock 2000). Although coordinated modeling efforts, such as through the Geoengineering Model Intercomparison Project (GeoMIP) (Kravitz et al 2011, 2015), have attempted to simulate the range of climate impacts from SCI, these attempts have made some unrealistic simplifications to future scenario choices and rarely considered the role of internal climate variability (MacMartin et al 2022, Visioni and Robock 2022). [...] (Riahi et al 2011, Burgess et al 2020). Barnes et al (2022) recently examined the emergence of SCI impacts on climate extremes in GLENS and found that a simple machine learning method could detect whether a global map of extreme precipitation or extreme temperature came from a world under the influence of SCI or RCP8.5 alone in less than two decades. However, the magnitude of the forced climate responses within GLENS may be unrealistic due to the excessive amount of aerosols needed by the end of the 21st century to offset warming under RCP8.5. Other works investigating the detectability of SCI impacts (e.g. Ricke et al 2010, Kravitz et al 2014, Bürger and Cubasch 2015, Lo et al 2016, MacMartin et al 2019) have related the magnitude of the mean change from SCI to interannual variability (e.g. signal-to-noise or timing of emergence metrics). As one example, MacMartin et al (2019) investigated the timing of SCI impacts by comparing the magnitude of change relative to natural variability using GLENS output scaled to represent a lower emission scenario.
In this study, we address the detection of climate signals by considering a new experiment called the Assessing Responses and Impacts of SCI on the Earth system with Stratospheric Aerosol Injection (ARISE-SAI-1.5; [...]), which is simulated with the Shared Socioeconomic Pathway (SSP) 2-4.5 emissions scenario. We then extend the framework of Barnes et al (2022) by first designing a machine learning method to consider when we can detect differences in regional climate impacts between the SCI scenario in ARISE-SAI-1.5 and the parallel SSP2-4.5 projection. We then investigate impacts due to SCI alone by designing a machine learning method to predict how long it has been since the initial aerosol injection. Importantly, we focus our analysis on different geographic locations, which range from global land areas to much smaller key climate regions, such as the Amazon basin. One advantage of using this data-driven approach is that we can identify differences in time-evolving and/or consistent spatial patterns of climate impacts (and changes in the underlying climate noise) using explainable machine learning methods, rather than only quantifying point-by-point summary statistics like signal-to-noise ratios.

Data
The ARISE-SAI-1.5 experiment is a new SCI simulation conducted using NCAR's CESM2 (Danabasoglu et al 2020) and its high-top atmospheric model component WACCM6 (Gettelman et al 2019). This climate model is further described in text S1. While the specific design details of ARISE-SAI-1.5 are documented within [...], we briefly summarize its implementation here. Two 10-member ensembles were performed using CESM2(WACCM6) to assess the effect of SCI. First, a control simulation was conducted using the SSP2-4.5 scenario (O'Neill et al 2016), which is a medium future greenhouse gas emissions pathway that is in better agreement with recent cumulative emission trends. This simulation, which we refer to as 'SSP2-4.5' in our analysis, covers 2015 to 2069.
We compare the SSP2-4.5 simulation with an SCI perturbation experiment, which we refer to as 'SAI-1.5' in the results section. Similar to the control run, the SAI-1.5 simulation uses the SSP2-4.5 future emissions scenario for each ensemble member, but begins climate intervention in the year 2035 by continuously injecting stratospheric aerosols to maintain the GMST anomaly at 1.5 °C above pre-industrial levels. In addition to limiting the GMST from rising, the controller algorithm monitors and maintains the meridional temperature gradient and equator-to-pole temperature, and then adjusts the aerosol injection accordingly for each year in the experiment (MacMartin et al 2014, Kravitz et al 2017). As shown in figure 1 of [...], most of the sulfur dioxide is injected at 15°S latitude and 21 km altitude. The sensitivity of using different climate models for SCI strategies is more thoroughly considered in Fasullo and Richter (2023).
For both simulations (SSP2-4.5 and SAI-1.5), we calculate annual means using gridded monthly CESM2(WACCM6) output. We focus our analysis over land areas using two common climate variables: near-surface air temperature (variable name TREFHT; figure 1(a)) and total precipitation (variable name PRECT; figure 1(b)).

Methods
Before evaluating the detectability of climate signals over different regions, we first compare our machine learning results for global maps of temperature and precipitation. We then consider the Northern Hemisphere (0°-90°N and 180°W-180°E) and the Southern Hemisphere (90°S-0° and 180°W-180°E), along with six smaller geographic regions. These regions are outlined in figure 1 and include the Arctic, Antarctic, Tropics, Southeast Asia, Central Africa, and Amazon. They cover a wide range in climatological mean states and patterns of interannual variability, as shown for the latter portion of the SAI-1.5 simulations in figures 1 and s1. Finally, in addition to evaluating climate signals over the entire 2035 to 2069 time series, we also compare two shorter periods (2035 to 2044 and 2045 to 2069) in order to account for at least 10 years of transition to a quasi-equilibrium state as the controller begins to converge after the initial injection of stratospheric aerosols.
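As a rough illustration of how a regional input could be cut from a global map, the sketch below selects the grid cells inside a latitude-longitude box and flattens them into a vector. The evenly spaced 96 by 144 grid, the helper names (`region_indices`, `flatten_region`), and the omission of a land mask are assumptions made for this example; they are not taken from the ARISE-SAI-1.5 processing code.

```python
# Hypothetical regular grid with 96 latitudes and 144 longitudes (an assumption;
# the real model grid coordinates should be read from the model output files).
N_LAT, N_LON = 96, 144
lats = [-90.0 + i * 180.0 / (N_LAT - 1) for i in range(N_LAT)]
lons = [-180.0 + j * 360.0 / N_LON for j in range(N_LON)]


def region_indices(lat_min, lat_max, lon_min, lon_max):
    """Return (row, col) index pairs of grid cells inside a lat/lon box."""
    return [
        (i, j)
        for i, lat in enumerate(lats)
        if lat_min <= lat <= lat_max
        for j, lon in enumerate(lons)
        if lon_min <= lon <= lon_max
    ]


def flatten_region(field, indices):
    """Flatten a 2-D map (list of rows) into a 1-D input vector for one region."""
    return [field[i][j] for i, j in indices]


# Bounds for the two largest domains, following the boxes quoted in the text.
global_domain = region_indices(-90.0, 90.0, -180.0, 180.0)
northern_hemisphere = region_indices(0.0, 90.0, -180.0, 180.0)

print(len(global_domain), len(northern_hemisphere))
```

The length of each flattened vector depends on the box, so a model trained on one region cannot be reused directly on another without retraining.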

Logistic regression
To first evaluate the timing of detectability for impacts on regional climate, we apply a logistic regression model to predict whether an annual mean map of temperature or precipitation is produced from either the SSP2-4.5 or SAI-1.5 simulation (figure 2(a)). In other words, this is a binary classification problem. Our logistic regression model architecture (sometimes referred to as softmax regression) consists of an input layer and an output layer with two class nodes (i.e. SSP2-4.5 or SAI-1.5). A softmax activation function is applied to the output layer, which transforms the values into class probabilities that sum to one. This probability is referred to as the logistic regression model confidence. As one example, for the global analysis, our logistic regression model receives an input vector of 13 824 units, which is a flattened map of 96 latitude by 144 longitude points. Note that the size of this input vector will be different for each geographic region. The output layer then returns the confidence that this map was from the SAI-1.5 or SSP2-4.5 climate model simulation. The class that is ultimately predicted is defined by a confidence value greater than 0.5.
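The forward pass of such a classifier can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the weights are untrained random placeholders, the input is a synthetic standardized map, and the function names (`softmax`, `predict`) are introduced only for this example.

```python
import math
import random


def softmax(logits):
    """Convert raw class scores into probabilities that sum to one."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, weights, biases):
    """Single linear layer plus softmax: returns (class index, confidences)."""
    logits = [
        sum(w_k * x_k for w_k, x_k in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    confidences = softmax(logits)
    predicted = max(range(len(confidences)), key=confidences.__getitem__)
    return predicted, confidences


CLASS_NAMES = ["SSP2-4.5", "SAI-1.5"]
N_INPUTS = 96 * 144  # 13 824 grid points for the global map

rng = random.Random(0)
# Placeholder (untrained) parameters and a synthetic input map (assumptions).
weights = [[rng.gauss(0.0, 0.01) for _ in range(N_INPUTS)] for _ in CLASS_NAMES]
biases = [0.0, 0.0]
x = [rng.gauss(0.0, 1.0) for _ in range(N_INPUTS)]

label, conf = predict(x, weights, biases)
print(CLASS_NAMES[label], round(conf[label], 3))
```

With only two output nodes, picking the class with the larger probability is equivalent to the rule that the predicted class has a confidence greater than 0.5.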
For both SSP2-4.5 and SAI-1.5, we train on seven ensemble members (70% of the dataset), validate on two ensemble members, and test on one ensemble member. By focusing on one ensemble member for testing data, we treat this as analogous to observations in which we also only have one realization of internal climate variability. In other words, we can address when climate impacts are detectable between SSP2-4.5 and SAI-1.5 in a single climate realization, like in the observational record. Note that the sensitivity of the results to architectures, random initialization seeds, and combinations of training data are explored within the supplementary section (text s2 and figures s2 to s3).
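The 7/2/1 split by ensemble member can be sketched as follows. The random shuffling, the seed handling, and the idea of reusing the same member ids for both simulations are assumptions for illustration; the text only states the number of members in each subset.

```python
import random

N_MEMBERS = 10
N_TRAIN, N_VAL, N_TEST = 7, 2, 1


def split_members(seed):
    """Shuffle ensemble member ids (1 to 10) and split them 7/2/1."""
    members = list(range(1, N_MEMBERS + 1))
    random.Random(seed).shuffle(members)
    train = members[:N_TRAIN]
    val = members[N_TRAIN:N_TRAIN + N_VAL]
    test = members[N_TRAIN + N_VAL:]
    return train, val, test


# Every annual map from a given member goes to exactly one subset, so the
# internal variability of the testing member is never seen during training.
train, val, test = split_members(seed=0)
print("train:", sorted(train), "validate:", sorted(val), "test:", test)
```

Splitting by member rather than by year is the design choice that makes the testing member a stand-in for a single observed realization.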

Artificial neural network
Next, we use an artificial neural network (ANN) to address a potentially more difficult prediction task. For this problem, we take maps of annual mean temperature and precipitation from the SAI-1.5 simulation and train an ANN to predict how many years it has been since SCI was initiated (i.e. the year 2035) (figure 2(b)). While there is no explicit temporal information given to the ANN (i.e. only inputs of annual mean maps), the ANN still needs to learn patterns of forced climate signals in an SCI world, which can evolve through time, so that it can correctly order the number of years since 2035. By design, this prediction task is similar to recent studies which showed that ANNs can spatially leverage regional climate information to predict the year of a climate map (e.g. Barnes et al 2019, Labe and Barnes 2021, Madakumbura et al 2021, Rader et al 2022). More details on the ANN architecture can be found in text s3 and in figures s4 to s7.

Explainable machine learning
[Figure 2 caption: (a) Schematic of the logistic regression model used for classifying whether an annual mean map of temperature or precipitation is from the SAI-1.5 scenario or SSP2-4.5 scenario. The logistic regression consists of a single linear layer and a softmax activation function in the output with two nodes (binary classification). (b) Schematic of the regression artificial neural network (ANN) architecture used to predict how many years it has been since the deployment of SAI-1.5 in ARISE-SAI-1.5. The ANN consists of two hidden layers with 10 nodes each.]
We are interested in not only the detection prediction itself, but also in identifying the relevant climate patterns used by the machine learning models. To reveal these regions, we consider a method of feature attribution for each of the logistic regression and ANN models. Attribution describes the contribution of the input features to the overall output. Despite an increasing number of explainable machine learning methods adopted for various climate science applications (e.g. Toms et al 2020, Molina et al 2021, Sonnewald and Lguensat 2021, Labe and Barnes 2022), we focus on two conceptually simple methods that we refer to as contribution maps. For identifying the significant regions to determine whether a climate map is from the SSP2-4.5 or SAI-1.5 simulation, we consider contribution maps by multiplying the logistic regression model weights by the input values for every location on the map. Positive contributions here can be interpreted as relevant areas that helped to push the logistic regression models ultimately toward the predicted class, whereas negative contributions are regions that tried to push the logistic regression model to the opposite binary class. As a corresponding approach, we evaluate contribution maps for the ANNs by using the input * gradient method (Shrikumar et al 2016). Input * gradient is calculated from the local gradient (i.e. the gradient of the output with respect to the input features) multiplied by the input map itself, which Mamalakis et al (2022) found to perform well against other explainability methods on a benchmark climate dataset for a similar kind of problem. In this example, positive contributions are locations in the input maps that contributed positively to the prediction, while negative contributions are vice versa. For both explainability methods, regions with near zero contributions (white shading) were of low importance for the machine learning predictions.
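Both contribution-map calculations can be sketched on a toy input. The example below is an illustration under stated assumptions: the ReLU activation, the random placeholder parameters, the tiny 8-point input, and the helper names (`make_params`, `ann_forward`, `input_times_gradient`) are not from the source, which specifies only two hidden layers of 10 nodes for the ANN.

```python
import random


def matvec(W, v):
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def relu(v):
    return [max(0.0, a) for a in v]


def logistic_contributions(x, class_weights):
    """Input x weights: one contribution per grid point for the chosen class."""
    return [w * a for w, a in zip(class_weights, x)]


def make_params(n_inputs, n_hidden=10, seed=0, bias_scale=0.1):
    """Random placeholder parameters for a two-hidden-layer regression ANN."""
    rng = random.Random(seed)
    mat = lambda r, c: [[rng.gauss(0.0, 0.5) for _ in range(c)] for _ in range(r)]
    vec = lambda n, s: [s * rng.gauss(0.0, 1.0) for _ in range(n)]
    return {"W1": mat(n_hidden, n_inputs), "b1": vec(n_hidden, bias_scale),
            "W2": mat(n_hidden, n_hidden), "b2": vec(n_hidden, bias_scale),
            "w3": vec(n_hidden, 0.5), "b3": bias_scale * rng.gauss(0.0, 1.0)}


def ann_forward(x, p):
    """Two ReLU hidden layers (assumed activation) and one linear output node."""
    z1 = [a + b for a, b in zip(matvec(p["W1"], x), p["b1"])]
    z2 = [a + b for a, b in zip(matvec(p["W2"], relu(z1)), p["b2"])]
    y = sum(w * a for w, a in zip(p["w3"], relu(z2))) + p["b3"]
    return y, z1, z2


def input_times_gradient(x, p):
    """Backpropagate d(output)/d(input) by hand, then multiply by the input."""
    _, z1, z2 = ann_forward(x, p)
    g2 = [w if z > 0.0 else 0.0 for w, z in zip(p["w3"], z2)]  # dy/dz2
    g1 = [sum(p["W2"][j][i] * g2[j] for j in range(len(g2))) if z1[i] > 0.0 else 0.0
          for i in range(len(z1))]                              # dy/dz1
    grad = [sum(p["W1"][i][k] * g1[i] for i in range(len(g1)))
            for k in range(len(x))]                             # dy/dx
    return [g * a for g, a in zip(grad, x)]


rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(8)]              # toy flattened map
class_weights = [rng.gauss(0.0, 0.5) for _ in range(8)]  # placeholder weights
lr_map = logistic_contributions(x, class_weights)
ann_map = input_times_gradient(x, make_params(n_inputs=8))
print([round(c, 3) for c in lr_map])
print([round(c, 3) for c in ann_map])
```

In practice the gradient would come from the automatic differentiation of a deep learning framework; the hand-written backward pass here only keeps the example dependency-free.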

Regional emergence of climate patterns
The area-averaged time series of temperature is shown in figure s8 for each region in the SAI-1.5 and SSP2-4.5 simulations. Due to the dominant influence of external forcing from increasing anthropogenic greenhouse gases, warming is evident in all nine regions in the SSP2-4.5 scenario. The largest warming is found in the polar regions (figures s8(d) and (e)), although there is also greater ensemble spread. In comparison, after the injection of stratospheric aerosols in 2035 in the SAI-1.5 simulation, the ensemble mean temperature exhibits little to no forced trend in all regions.
Although there are differences in the ensemble mean trends of temperature between SAI-1.5 and SSP2-4.5, the ensemble member spread overlaps in all regions for at least the first 10 years after the start of SCI. This suggests that internal variability alone could inhibit determining whether a region is observing an SAI-1.5 or SSP2-4.5 world (NASEM 2021, Keys et al 2022). To investigate this question, we utilize our first machine learning method. As described earlier, we input a single annual mean map of temperature for each region and output whether it is from an SAI-1.5 or SSP2-4.5 world. The results of the logistic regression predictions are shown in figure 3 for global land areas, and each year from 2035 to 2069 is denoted with a shaded circle. The transparency of each circle is determined by the logistic regression model confidence. The logistic regression model achieves an accuracy of 92.6% on the testing ensemble member predictions of temperature. Even more striking, the logistic regression model achieves perfect accuracy after about the first 5 years of SCI. In other words, the logistic regression model is able to distinguish whether a global map of temperature is under the influence of SCI within the first decade, despite the influence of internal climate variability (figure s8(a)).
Similarly, the annual mean precipitation is displayed in figure s9 for each of the nine regions, and the logistic regression testing predictions are displayed in figure 3. Unlike the ensemble mean trends for temperature, we find notably smaller forced changes in precipitation in both the SAI-1.5 and SSP2-4.5 simulations (figure s9). Although the ensemble mean is usually slightly wetter in SSP2-4.5 (figures s9(a) to (f)), the spread of annual mean precipitation across ensemble members overlaps in all regions through 2069. Despite this, the logistic regression model is once again able to distinguish global maps of precipitation of SAI-1.5 from SSP2-4.5 after the first 5 years of SCI (figure 3). The overall accuracy for precipitation is 91.4%, but the model confidence is at times lower for a few individual years (e.g. 2044 for SSP2-4.5), which we attribute to interannual variability (figure s9(a)). The robustness of these global map results across all years is further shown in figure s10 using different logistic regression models (i.e. combinations of different random initialization seeds and training, testing, and validation ensemble members).
To understand how the logistic regression model is making accurate predictions, we turn to the explainability method using contribution maps (input×weights). Figure 4(a) shows the contribution map composite for temperature predictions in the SAI-1.5 simulation, which are averaged over years 2045 to 2069. As discussed in section 3.3, positive contributions in figure 4(a) can be interpreted as regions that pushed the logistic regression model to make its classification. We find that areas in Greenland, southern South America, eastern Africa, and eastern Australia are all important regions for driving the logistic regression model to determine that a global land map is from a world under the influence of SCI. Next, we compare this contribution map with signal-to-noise ratios in figure 4(b), which are calculated as the SAI-1.5 ensemble mean trend over 2045 to 2069 (forced response) divided by the standard deviation across the individual ensemble member trends (internal variability). We find strikingly similar spatial patterns of higher signal-to-noise between many of the same regions with positive contributions for the logistic regression model. In agreement with Barnes et al (2022), this suggests that the logistic regression model is learning patterns of temperature signals to detect the influence of SCI. However, we note that not all areas of higher positive contributions are associated with higher signal-to-noise, such as for positive contributions across Mexico and the southern United States. Comparing the areas of higher contributions with the differences in the ensemble means of SAI-1.5 and SSP2-4.5 (figure s13) shows some common areas (e.g. positive contributions (figure 4(a)) in South America with greater cooling (figure s13(a))), but generally we find that differences in the ensemble mean do not fully explain the results from our method. This again suggests that the logistic regression model is leveraging spatial temperature signals across each map, rather than only learning point-by-point statistics.
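The signal-to-noise ratio defined above (ensemble mean trend divided by the standard deviation of the individual member trends) can be sketched for a single grid point as follows. The synthetic series, the use of the sample standard deviation, and the function names are assumptions for illustration only.

```python
import random
import statistics


def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    y_bar = statistics.fmean(years)
    v_bar = statistics.fmean(values)
    num = sum((y - y_bar) * (v - v_bar) for y, v in zip(years, values))
    den = sum((y - y_bar) ** 2 for y in years)
    return num / den


def signal_to_noise(years, members):
    """members: one time series per ensemble member for a single grid point."""
    ensemble_mean = [statistics.fmean(vals) for vals in zip(*members)]
    signal = linear_trend(years, ensemble_mean)          # forced response
    member_trends = [linear_trend(years, m) for m in members]
    noise = statistics.stdev(member_trends)              # internal variability
    return signal / noise


# Synthetic 10-member example over 2045 to 2069 (values are made up).
years = list(range(2045, 2070))
rng = random.Random(0)
members = [
    [0.01 * (y - 2045) + rng.gauss(0.0, 0.2) for y in years]
    for _ in range(10)
]
print(round(signal_to_noise(years, members), 2))
```

Applying the same function to every grid point gives a map that can be compared directly with the contribution maps.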
The contribution maps composited over the entire 2035 to 2069 period are in figure s11 for temperature (a, c) and precipitation (b, d). The temperature contributions for both SAI-1.5 and SSP2-4.5 predictions are similar to the one displayed in figure 4(a), which reinforces the importance of those regions as reliable indicators for detecting SCI within the ARISE-SAI-1.5 experiment. For maps of precipitation, positive contributions for detecting SAI-1.5 (figure s11(b)) are particularly prominent for areas in northern Canada, southern Greenland, northern South America, and south-central Africa. In addition, parts of eastern Siberia, central Asia, and west-central North America are locations of higher positive contributions for pushing the logistic regression model to predict maps from SSP2-4.5. We also find these explainability results are consistent, particularly for temperature, across composites of contribution maps from the 20 different logistic regression models (figure s12).
[Figure 3 caption, partial: Predictions of temperature are denoted with a red circle for SSP2-4.5 and a blue circle for SAI-1.5. Predictions of precipitation are denoted with a green circle for SSP2-4.5 and a brown circle for SAI-1.5. The color transparency indicates the logistic regression model confidence for each prediction, which is scaled from 0 (light shading) to 1 (darkest shading). The total accuracy score for the testing ensemble members is indicated on the right label for temperature and precipitation, respectively.]
Finally, we repeat this exercise by separately training logistic regression models for the other eight regions using temperature and precipitation. For brevity, we only show the explainability composites using the global contribution maps as displayed above. Similar to the global predictions in figure 3, a single circle is again displayed for each annual mean map of either temperature or precipitation in figure 5. Accurate predictions of detecting whether a temperature map is from SAI-1.5 or SSP2-4.5 are made within the first decade for each hemisphere and across the Tropics. However, for smaller spatial regions (i.e. Southeast Asia, Amazon, and Central Africa) or those areas with higher interannual variability (i.e. Arctic and Antarctic) (figure s1), a greater range in the timing of accurate predictions is evident. There is also lower model confidence in the predictions for the Antarctic, Southeast Asia, and the Amazon until about the last 5-10 years of the ARISE-SAI-1.5 experiment. In summary, we conclude that differences between the climate signals in SAI-1.5 and SSP2-4.5 scenarios are detectable for regional temperature using the logistic regression model, but areas with greater variability and smaller spatial scales can lead to occasional misclassifications, especially prior to about 2060.
There is less overall skill for logistic regression predictions using precipitation. Indeed, the model confidence is especially low (i.e. closer to 0.5) for its precipitation predictions using only input maps of Southeast Asia, Central Africa, and the Amazon. In contrast, we find higher skill for logistic regression predictions in the Northern Hemisphere (e.g. perfect accuracy after 2039) and for the Tropics. Looking more closely at the time series of the regional annual mean precipitation in figure s9, we find the detectability of SCI is higher in the logistic regression predictions than might be inferred given the similarities in the SAI-1.5 and SSP2-4.5 ensemble member spreads, like in the Tropics (figure s9(f)). This suggests that for precipitation, which has a much weaker response to external forcing, there are some regional patterns of climate indicators that the logistic regression model is learning in order to make accurate predictions for either the SAI-1.5 or SSP2-4.5 scenarios.

Time-evolving climate signals in SAI-1.5
So far, we have shown that differences in climate signals between SAI-1.5 and SSP2-4.5 are detectable from single global maps of annual mean temperature and precipitation. This result is also found for some geographical regions. Given these findings, we now ask whether a machine learning model can determine when SCI was first initiated by considering only the time-evolving climate patterns in an SCI world. To address this more difficult prediction task, we use an ANN, which is a method that can account for possible nonlinearities in the evolution of climate signals. We focus on global and regional data from only the SAI-1.5 ensemble members for this problem. Since the ANN is not explicitly given any temporal information in its input, it must therefore learn the timing of climate indicators for this regression task (i.e. the number of years since 2035).
First, we evaluate the spatial variability of the SAI-1.5 ensemble mean trends of temperature and precipitation in figures s14 and s15, respectively. For completeness, we also include the ensemble mean trends for SSP2-4.5. Although cooling is found in most regions of the SAI-1.5 simulation within the first decade since aerosol injection (figure s14(a)), there is also spatial variability. This includes warming in parts of the extratropical Northern Hemisphere. Rather than suggesting that this is a robust, forced response to SCI, it is more likely that this is simply a reflection of internal variability, which is discussed in more detail in Keys et al (2022) and is also demonstrated by the large spread of ensemble member trends in figure s16. Furthermore, there is large variability in precipitation trends in the first decade since SCI initiation (figure s15(a)), but weaker trends in the longer 2045 to 2069 period (figure s15(d)).
Now we turn to the ANN prediction problem. The results for a single testing ensemble member are shown for temperature in figure 6 (dashed blue line) compared to the 1:1 solid black line (or 'perfect prediction'). The corresponding training ensemble member predictions are displayed in figure s17.  Overall, we find a positive slope for the SAI-1.5 predictions using all regional input maps, except for the Antarctic (figure 6(e)) where the ANN does not learn from even the training data (figure s17(e)). Although there is a wide range in mean absolute error (MAE) scores, the ANN is still able to learn a time-evolving signal for temperature, as reflected by the slope of the prediction lines and higher correlation coefficient (figure s6). We find predictions close to the 1:1 line even for some smaller input regions, including Southeast Asia (figure 6(g)) and the Tropics (figure 6(f)).
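The evaluation metrics mentioned above (mean absolute error, slope of the prediction line, and correlation) can be computed as in the sketch below. The predicted values are synthetic placeholders rather than ANN output, and the function names are introduced only for this example.

```python
import math
import random
import statistics


def mean_absolute_error(actual, predicted):
    return statistics.fmean(abs(a - p) for a, p in zip(actual, predicted))


def slope_and_correlation(actual, predicted):
    """Least-squares slope of predicted vs. actual and their Pearson correlation."""
    a_bar = statistics.fmean(actual)
    p_bar = statistics.fmean(predicted)
    cov = sum((a - a_bar) * (p - p_bar) for a, p in zip(actual, predicted))
    var_a = sum((a - a_bar) ** 2 for a in actual)
    var_p = sum((p - p_bar) ** 2 for p in predicted)
    return cov / var_a, cov / math.sqrt(var_a * var_p)


# Regression target: years since SCI began in 2035, for 2035 to 2069.
actual = [year - 2035 for year in range(2035, 2070)]

# Placeholder predictions (made up): a damped, noisy version of the truth.
rng = random.Random(0)
predicted = [0.8 * a + 3.0 + rng.gauss(0.0, 2.0) for a in actual]

mae = mean_absolute_error(actual, predicted)
slope, r = slope_and_correlation(actual, predicted)
print(f"MAE = {mae:.2f} years, slope = {slope:.2f}, r = {r:.2f}")
```

A slope near one with a small MAE corresponds to predictions that follow the 1:1 line, whereas a slope near zero indicates that the model has not learned a time-evolving signal.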
To understand where the ANN is looking to make the temperature predictions, we evaluate contribution maps using the input * gradient method for global land maps (figure 7(a)) and for inputs using only the Tropics (figure 7(c)). The respective ANN contribution maps are composited over all years from 2035 to 2069. Areas of positive contributions are evident across much of the Antarctic, South America, northern Africa, and northwestern North America. Notably, these regions differ from the climate patterns leveraged by the logistic regression model predictions (e.g. figures s11(a) and (c)), but this could be a result of comparing different machine learning methods, different prediction tasks, differences in the years composited, and greater variability in prediction error by the ANN. The composited contribution maps for the input fields of temperature in the Tropics are noisier and more difficult to interpret in figure 7(c), but some areas of positive contribution are seen across islands in Indonesia that correspond to locations of higher signal-to-noise ratios (figure 7(e)).
Finally, we utilize this ANN framework for inputs of precipitation. The predictions for the testing ensemble member are shown in figure s18, and the corresponding training ensemble predictions are included in figure s19. Unlike for temperature, the ANN is unable to learn patterns of reliable precipitation signals to correctly predict the order of the years since the deployment of SCI in all regions. The ANN also suffers from overfitting on the training data. For completeness, we still show the testing ensemble member contribution maps for inputs of global maps in figure 7(b) and for maps of the Tropics in figure 7(d), but the robustness of the higher positive contributions across areas like northern South America and central Africa is unclear given the overall lower prediction skill by the ANN. We also cannot determine whether the lack of prediction skill is from just the limited training data (only 7 ensemble members), which could prevent the ANN from filtering the relevant spatial signals from the background noise of interannual variability.

Discussion and conclusions
A key recommendation from NASEM (2021) was that more research is needed to better understand the detection and attribution of climate-related impacts from SCI. While it is likely that satellite remote-sensing observations would be able to quickly detect changes in aerosol optical depth (Li et al 2022), it is still uncertain whether responses in the climate system would be distinguishable from internal variability. This is an important question at smaller regional scales, given the potential societal impacts from even small changes to temperature extremes or the hydrological cycle. In this study, we begin to assess these questions by employing several machine learning methods to evaluate whether the regional effects on temperature and precipitation would be detectable under a plausible SCI scenario compared to a future SSP2-4.5 scenario.
Despite a much weaker external forcing scenario than was considered in Barnes et al (2022), we find similar results for the detection of temperature and precipitation impacts over global land areas by the logistic regression model. This occurs within approximately the first decade of SCI initiation. We also find utility in training an ANN to identify when SCI was started simply by inputting annual mean maps of temperature. Using the contribution maps as explainability tools for the machine learning methods, we show that the logistic regression and ANN models are leveraging combinations of climate signals across the maps in order to make correct predictions. While these patterns are sometimes associated with areas of higher signal-to-noise or differences in the simulation ensemble means, this is not always the case, especially for the more complex ANN approach. In fact, one advantage of this data-driven approach is that we are not restricted to linear grid point level statistics.
There is a much wider range of skill for predicting the emergence in smaller geographic regions, especially for precipitation. For example, we do not find any skill in detecting whether a map of precipitation is from either SSP2-4.5 or SAI-1.5 over Southeast Asia. This result is not surprising given the challenges in disentangling the influences of anthropogenic aerosols, greenhouse gases, and internal variability on the forced response of regional precipitation (Lin et al 2016, Deser et al 2020, Ha et al 2020). As shown in Keys et al (2022), internal variability can modulate or altogether mask the influences of SCI in the ARISE-SAI-1.5 simulation. Nonetheless, we cannot rule out higher prediction skill with more available training data. Here we are only using 7 ensemble members (n = 10) from each of the SAI-1.5 and SSP2-4.5 simulations to train and validate the logistic regression and ANN models. This may not be enough ensemble members to disentangle the signal from the noise (Milinski et al 2020), and consequently it can limit the amount of information for the machine learning models to learn the combined regional climate change patterns amidst the background noise. Finally, we note that our detection and explainability results are restricted to one climate model (CESM2(WACCM6)) and only one scenario of SCI (SAI-1.5) and future climate change (SSP2-4.5). As future large ensembles are developed for comparing other plausible SCI scenarios (Visioni and Robock 2022), it will be important to extend these data-driven approaches for identifying the range of forced climate signals and how they compare with the results presented here for ARISE-SAI-1.5.