Seasonal predictability of summer North African subtropical high in operational climate prediction models

The seasonal predictability of the summer North African subtropical high (NASH) is investigated using hindcast data from four operational climate prediction models: BCC_CSM1.1(m), NCEP CFSv2, ECMWF System 4, and JMA CPSv2. Based on reconstructed indices describing variations in the intensity, area, eastern boundary and ridge line of the NASH, the intensity and area indices show high prediction skill, whereas the position indices show relatively low skill. The multi-model ensemble (MME) mean, calculated as the arithmetic average of the four models, shows relatively higher and more stable skill than the individual models. Further investigation indicates that the prediction skill of the NASH relies largely on the models' ability to reproduce the relationship between the NASH indices and the tropical-to-subtropical sea surface temperature (SST) anomalies associated with the El Niño/Southern Oscillation (ENSO). The pattern of atmospheric circulation anomalies over North Africa in response to ENSO is well captured by the models, which suggests that ENSO is the dominant source of predictability of the NASH.


Introduction
The subtropical high refers to the warm high-pressure system in the middle and lower troposphere of the subtropics. Under the thermal and topographic effects of the Tibetan Plateau, the subtropical high belt breaks over the plateau, forming two centers, with the western Pacific subtropical high (WPSH) to its east and the North African subtropical high (NASH) to its west (figure S1). Because the WPSH is a crucial factor affecting the climate of East Asia, its formation, impact, and prediction have been widely studied (e.g. Wu and Zhou 2008, Wu et al 2009, Liu et al 2012, Ren et al 2013, Wang et al 2013, Yang et al 2017, Chen et al 2019a, Guan et al 2019, Zhou et al 2020). However, few studies have focused on the NASH.
In summer, the NASH is mainly located over the Sahara Desert of North Africa and can extend eastward to the Iranian Plateau, where it persists for a long time. Variations in the north-south position of the NASH can exert a significant impact on precipitation over western China (Jia et al 2002) and arid central Asia (Zhao et al 2018, Chen et al 2019b, Lu and Zhao 2022). More importantly, as a dominant anticyclonic circulation resulting from the Hadley cell, the NASH varies in close association with the regional climate (Zhang and Cook 2014) and extremes (Hoang et al 2016, Bowerman et al 2017) in North Africa and southern Europe, by affecting the African easterly jet and by connecting the El Niño/Southern Oscillation (ENSO) and the African monsoon (Thorncroft et al 2011, Dogar et al 2017, 2018, Dogar and Sato 2019).
The prediction of the subtropical high is of great importance for regional climate prediction and disaster prevention, as has been fully demonstrated for the WPSH (e.g. Mao and Wu 2006, Mao et al 2010, Ren et al 2013, Wang and He 2015, Wang et al 2016, Cheng et al 2019c). [...] (Takaya et al 2018), and can routinely release 6-12-month predictions (Weisheimer et al 2009, Ren et al 2017, Dunstone et al 2020). Based on the hindcasts from four operational climate prediction models, this study focuses on examining the performance of seasonal prediction of the summer NASH and investigating the relevant source of predictability, both of which are rarely covered in previous studies.

Data and methodology
In this study, the hindcast data for seasonal prediction are generated by four fully coupled operational climate prediction models: BCC_CSM1.1m, NCEP CFSv2, ECMWF System 4, and JMA CPSv2. These hindcasts are also used to construct the multi-model ensemble (MME) mean through an arithmetic average. Brief information on the four models is given in table 1. The monthly mean atmospheric observations are obtained from the National Centers for Environmental Prediction-Department of Energy (NCEP-DOE) reanalysis 2 (Kanamitsu et al 2002). The monthly mean sea surface temperature (SST) data are from the Hadley Centre Sea Ice and Sea Surface Temperature dataset version 1 (HadISST) (Rayner et al 2003). All analyses cover the period 1991-2020. Summer refers to the June-July-August (JJA) average. To focus on the seasonal anomalies, the climatological mean and the long-term linear trend for 1991-2020 are removed.
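As an illustration of this preprocessing, the following minimal Python sketch forms a JJA mean and removes the mean and linear trend from a yearly series. The function names `jja_mean` and `detrended_anomaly` are hypothetical and are not taken from the study. Subtracting an ordinary least-squares fit is assumed here as one common way to remove a linear trend.

```python
def jja_mean(monthly):
    """Average June, July and August from 12 monthly values (index 0 = January)."""
    return sum(monthly[5:8]) / 3.0


def detrended_anomaly(series):
    """Subtract the least-squares linear fit from a yearly series.

    Removing the fitted line removes both the long-term mean and the
    linear trend, leaving the interannual anomalies.
    """
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]
```

Applied to the 30 JJA values for 1991-2020, `detrended_anomaly` would return 30 anomalies that sum to zero and are uncorrelated with time.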
In particular, 'lead month' refers to how many months before the target JJA season the prediction is initialized. For example, the 1-month lead prediction uses the initial conditions from May to predict the average of June, July, and August. The remaining lead times are defined analogously.
Normally, the subtropical high is represented on the weather chart as the region enclosed by the 588 dagpm contour line at the 500-hPa level (figure S1). In this study, the prediction capacity of the NASH is evaluated based on a set of objectively reconstructed indices that quantitatively describe its intensity, area, eastern boundary and ridge line, following the definitions of the WPSH indices (Liu et al 2012). Details are given in table 2. The prediction skills are assessed using three metrics: the temporal correlation coefficient (TCC), which measures the similarity of the temporal evolution between the prediction and observation; the root mean squared error (RMSE), which measures the deviation between the prediction and observation; and the pattern correlation coefficient (PCC), which measures the similarity of the spatial patterns between the prediction and observation. The statistical significance is assessed using the two-tailed Student's t-test.
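The three skill metrics can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the code used in the study: the function names are hypothetical, and the PCC here is a centered, unweighted correlation over all grid points, whereas an operational implementation might weight grid points by area.

```python
import math


def _pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def tcc(pred, obs):
    """Temporal correlation between predicted and observed yearly series."""
    return _pearson(pred, obs)


def rmse(pred, obs):
    """Root mean squared error between predicted and observed yearly series."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


def pcc(pred_map, obs_map):
    """Pattern correlation between two 2-D fields given as nested lists.

    Assumption: unweighted correlation over all grid points.
    """
    flat_pred = [v for row in pred_map for v in row]
    flat_obs = [v for row in obs_map for v in row]
    return _pearson(flat_pred, flat_obs)
```

A TCC near 1 together with a small RMSE indicates that a predicted index follows the observed one in both phase and amplitude, while the PCC compares two maps at a single time or two correlation maps.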

Results
Climatology of the summer NASH
Firstly, we compare the patterns of the JJA NASH climatology, in terms of the geopotential height at the 500-hPa isobaric level (H500), among the different models (figure 1). As a reference, the observational result is shown in figure 1(a), where the region enclosed by the 588 dagpm contour line occupies the whole of northern Africa and the Arabian Peninsula, and can extend eastward to the Iranian Plateau. Its northernmost edge can reach the Mediterranean Sea. The 592 dagpm contour line mainly occupies the subtropical region of northwestern Africa. The four operational climate prediction models can overall capture the spatial distribution of the NASH climatology, but biases still exist. The intensity and area of the NASH climatology in the NCEP CFSv2, ECMWF System 4, and JMA CPSv2 models are generally smaller than those in the observation, but much greater in the BCC_CSM1.1m model. The MME mean shows the greatest similarity with the observation, and the shapes of its 588 dagpm and 592 dagpm contour lines are highly consistent with the observed ones.

Prediction skills of the summer NASH indices
TCC and RMSE skills of the summer NASH indices are shown in figure 2. Overall, the four operational climate prediction models show high abilities in predicting the intensity and area of the NASH. The TCC skills decay slowly with increasing lead month, and exceed the significance threshold at the 99% confidence level for all lead months. High TCC skills correspond to relatively small RMSEs, representing high temporal similarity and low deviation between the predicted and observed indices. Note that the prediction skill of the BCC_CSM1.1m model for area is clearly lower than that for intensity. The MME mean stabilizes the prediction performance better than individual models for both the intensity and area indices. The prediction skills of the eastern boundary and ridge line indices are much lower. There is also great discrepancy among models, which is reflected in both TCCs and RMSEs. This suggests that the zonal and meridional position variations of the NASH are difficult to predict correctly. Comparing the individual models, the ECMWF System 4 and JMA CPSv2 models are slightly more stable and more skillful than the other two, with TCC skills exceeding the significance threshold at the 99% confidence level for the eastern boundary index. The BCC_CSM1.1m model has the lowest skill for the eastern boundary index, while the NCEP CFSv2 model has the lowest skill for the ridge line index.

Table 2 (partially recovered). Definitions of the NASH indices.
Eastern boundary: the longitude of the easternmost grid point on the 588 dagpm contour line.
Ridge line: the averaged zero line of ∂gh/∂y.
A(i) and gh(i) denote the area and geopotential height of grid i at the 500 hPa isobaric level, respectively. Each index is calculated in the region of [...]
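The two position definitions in table 2 can be illustrated with a small Python sketch. It is a simplified stand-in, not the study's implementation. The function names are hypothetical. Grid points with gh >= 588 dagpm are assumed to lie inside the contour. The zero line of ∂gh/∂y is approximated by the latitude of the interior meridional maximum at each longitude. Longitudes are assumed to increase eastward without crossing the date line.

```python
def eastern_boundary(gh, lons, threshold=588.0):
    """Longitude of the easternmost grid point with gh >= threshold (dagpm).

    gh[j][i] is the H500 value at latitude j and longitude i.
    Returns None when no grid point reaches the threshold.
    """
    inside = [lons[i] for row in gh for i, value in enumerate(row) if value >= threshold]
    return max(inside) if inside else None


def ridge_line(gh, lats, threshold=588.0):
    """Mean latitude of the meridional maximum of gh inside the high.

    At each longitude the latitude of the largest gh is taken as a discrete
    approximation of the point where d(gh)/dy changes sign. Only interior
    maxima that reach the threshold are averaged.
    """
    ridge_lats = []
    for i in range(len(gh[0])):
        column = [gh[j][i] for j in range(len(lats))]
        j_max = max(range(len(column)), key=column.__getitem__)
        if column[j_max] >= threshold and 0 < j_max < len(lats) - 1:
            ridge_lats.append(lats[j_max])
    return sum(ridge_lats) / len(ridge_lats) if ridge_lats else None


# Illustrative 3 x 4 field (values in dagpm, not from any dataset).
lats = [10.0, 20.0, 30.0]
lons = [0.0, 10.0, 20.0, 30.0]
gh = [
    [586.0, 587.0, 586.0, 585.0],
    [589.0, 590.0, 588.0, 586.0],
    [587.0, 588.0, 587.0, 585.0],
]
east = eastern_boundary(gh, lons)
ridge = ridge_line(gh, lats)
```

Because both quantities depend on where a single contour or extremum falls on the grid, small errors in the predicted field can shift them by a full grid step, which is one plausible reason position indices are harder to predict than area-integrated ones.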
However, the MME mean does not markedly improve the prediction performance, although it offers more stable skills than individual models, especially for the eastern boundary and ridge line indices. Since each individual model has its own strengths and weaknesses, averaging the predictions from multiple models helps to reduce the impact of any single model's weaknesses. The MME mean can also help to offset biases in individual models. For example, if one model always predicts a stronger NASH than the other models (figure 1(c)), averaging it with the other models partly compensates for this bias (figure 1(b)). The MME mean is typically more robust to errors than individual models.
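The arithmetic-average MME can be written as a short Python sketch. The function name `mme_mean` and the numbers below are purely illustrative and are not hindcast values. The example is constructed so that the two models have errors of opposite sign, which is the situation in which an equal-weight mean cancels bias most effectively.

```python
def mme_mean(predictions):
    """Equal-weight multi-model ensemble mean.

    predictions: dict mapping a model name to its list of yearly index values.
    Returns the year-by-year arithmetic average across models.
    """
    series = list(predictions.values())
    if not series:
        raise ValueError("at least one model is required")
    n_years = len(series[0])
    if any(len(s) != n_years for s in series):
        raise ValueError("all models must cover the same years")
    return [sum(values) / len(series) for values in zip(*series)]


# Illustrative numbers only: two models with opposite-signed errors
# around an anomaly of 1.0 in every year.
example = {
    "model_high": [1.5, 1.4, 1.6],
    "model_low": [0.5, 0.6, 0.4],
}
ensemble = mme_mean(example)
```

If all models shared a bias of the same sign, the equal-weight mean would retain it, so the benefit depends on the diversity of model errors.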

Attribution of predictability source
Since the operational climate prediction models show evident skill in predicting the NASH, our next focus is on which source of predictability contributes to the skillful prediction. Prior research has demonstrated that the seasonal-to-interannual predictability of monsoon circulation systems primarily stems from the oceans via atmospheric teleconnections excited by SST anomalies (Wang et al 2009, Zhou et al 2020, Wang et al 2023). We further examine the correlations between the JJA NASH indices and the SST anomalies in both the observation and the model predictions, as shown in figure 3. The significantly positive correlations between SST anomalies and the intensity and area of the NASH mainly occur over the tropical-to-subtropical Pacific and Indian Ocean, representing the El Niño and Indian Ocean Basin mode SST anomaly patterns. The correlation patterns for the NASH eastern boundary and ridge line are reasonably similar to those for the NASH intensity and area, but with weaker significant correlations.
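A correlation map of this kind, together with a two-tailed Student's t-test mask, can be sketched as follows. The function names are hypothetical. The default critical value assumes 30 yearly samples (1991-2020), that is, 28 degrees of freedom, and is taken from standard t tables. Any other sample size requires passing the matching critical value explicitly.

```python
import math

# Two-tailed Student's t critical value at the 95% confidence level for
# 28 degrees of freedom (30 years minus 2), from standard t tables.
T_CRIT_95_DF28 = 2.048


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if a variance is zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def correlation_map(index, sst, t_crit=T_CRIT_95_DF28):
    """Correlate a yearly index with the SST anomaly series at every grid point.

    index : list of n yearly index anomalies
    sst   : sst[j][i] is the list of n yearly SST anomalies at grid point (j, i)
    t_crit: critical t value for n - 2 degrees of freedom (default assumes n = 30)

    Returns (r_map, significant), two nested lists shaped like the grid.
    """
    n = len(index)
    # Convert the critical t value into a critical |r| for this sample size.
    r_crit = t_crit / math.sqrt(n - 2 + t_crit ** 2)
    r_map = [[pearson(index, series) for series in row] for row in sst]
    # Comparisons with NaN are False, so constant series are never flagged.
    significant = [[abs(r) >= r_crit for r in row] for row in r_map]
    return r_map, significant
```

With 30 years of data, the 95% threshold corresponds to a correlation magnitude of about 0.36, so grid points with weaker correlations would be left unmarked.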
The operational climate prediction models exhibit biases in capturing the NASH-SST relationship, which seem to be responsible for the prediction skills of the NASH indices presented above. For example, although the NASH intensity-SST relationship reproduced by the models is not fully consistent with the observed one, all four models capture the significant positive correlation with SST in the tropical eastern Pacific, and all of them show high prediction skill for the NASH intensity index. The prediction of the NASH area index is similar to that of the intensity index. Except for the BCC_CSM1.1m model, all models capture the significant positive correlation with SST in the tropical eastern Pacific despite biases over other regions, which is consistent with the relatively lower prediction skill of the BCC_CSM1.1m model for the NASH area index.
The relationship between SST and the eastern boundary index or the ridge line index of the NASH is relatively weak, which may explain why the prediction skills of these two indices are generally lower. However, a significant area can still be found in the tropical eastern Pacific. For the NASH eastern boundary index, only the JMA CPSv2 model captures the significant positive correlation, although it seems to be overstated, and this model has a higher prediction skill for the eastern boundary index than the other three models. The same is also true for the ridge line prediction. The pattern reproduced by the ECMWF System 4 model is slightly weaker than that in the other models, which leads to a lower prediction skill than the other three models at 1-month lead. In particular, the MME mean correctly represents a broad area of SST correlations over the tropical-to-subtropical oceans, and thus stabilizes the prediction performance better than individual models for the four NASH indices.
Figure 3 (caption fragment). (e)-(t) are for the individual climate prediction models and (u)-(x) for the MME mean at 1-month lead. The black meshed areas denote correlation coefficients significant above the 95% confidence level based on the two-tailed Student's t-test.
Besides, the evolving correlation patterns between the JJA mean NASH intensity and area indices and the preceding February to July SST anomalies (figure S2) imply the decline of El Niño, indicating that ENSO may play a crucial role in the NASH intensity and area. Moreover, during the summers of strong El Niño decaying years, such as 1998, 2002, and 2010, and of strong La Niña decaying years, such as 2008, the predicted indices show high consistency with the observed indices (figure S3). We can infer that the NASH tends to be accurately predicted during the summers following strong ENSO forcing. That is, whether an El Niño or a La Niña event occurs during the preceding winter, the prediction of the NASH indices will be largely reliable if the close connection between the NASH and ENSO is well captured by the models.
To detect whether the prediction skill of the NASH is related to the ability of the models to capture the close relationship between the NASH and ENSO, the PCCs between the observed and predicted NASH-SST correlation patterns are plotted against the TCC skills of the NASH indices in figure 4. The figure illustrates good linear correspondence for the four NASH indices, with fairly high correlation coefficients between the PCCs and the TCC skills: 0.85 for the intensity index, 0.73 for the area index, 0.61 for the eastern boundary index, and 0.67 for the ridge line index, all far beyond the significance threshold at the 99% confidence level. The value of the correlation coefficient also reflects the predictability of the different NASH indices. The linear trends in both the PCC and the TCC contribute to the high correlation. The high consistency between the PCC and the TCC across all the models indicates that the two are physically related. That is to say, the operational climate prediction models that reproduce a more realistic relationship between the tropical-to-subtropical SST and the NASH tend to better capture variations in the NASH. Similar results are found when the PCC is compared with the RMSE skill (figure S4).
Since ENSO events seem to be an important source of predictability of the NASH, the ability of the models to predict the NASH indices may largely depend on the association between ENSO and the prediction performance for the geopotential height over North Africa. Scatter plots of the PCC skills of geopotential height over [0°-50°N, 0°-70°E] for the models at 1-month lead versus the ENSO intensity, represented by the absolute value of the preceding winter (December to February) mean Niño3.4 index, are given in figure 5. Significant positive correlations are found between the ENSO amplitude and the PCCs, and the MME mean has the highest correlation coefficient. This suggests that in the summer following strong ENSO forcing, the prediction performance of the models is evidently enhanced. We can conclude that ENSO is the dominant source of predictability for the models in predicting circulations over North Africa. The improved prediction skills for the NASH primarily originate from the accurate capture of ENSO impacts.
Furthermore, the spatial pattern of the TCC skills of H500 for the MME is examined to understand the contribution of ENSO to the prediction skill of the NASH, by comparing it with the correlation pattern between the predicted JJA H500 and the observed DJF Niño3.4 index (figure S5). We find that the North African region with high TCC skill also shows largely high correlation with ENSO. In some other regions, such as the Mediterranean Sea to the north of the NASH and the Iranian Plateau to its east, the TCC skills are relatively low, and the correlations with ENSO are also insignificant. This may indicate that ENSO is the dominant source of the high predictability of the NASH, especially of its intensity and area, but provides limited predictability for the NASH eastern boundary and ridge line.
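The ENSO-amplitude diagnostic can be illustrated with a short Python sketch. The function names and the numbers are hypothetical and are used only to show the calculation: the amplitude is the absolute value of the mean Niño3.4 index over the preceding December, January and February, and it is then correlated with the yearly PCC skill.

```python
import math


def enso_amplitude(dec_prev, jan, feb):
    """Absolute value of the preceding-winter (DJF) mean Niño3.4 index."""
    return abs((dec_prev + jan + feb) / 3.0)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def amplitude_skill_correlation(djf_nino34, pcc_by_year):
    """Correlate yearly ENSO amplitude with yearly PCC skill.

    djf_nino34 : list of (dec_prev, jan, feb) Niño3.4 values, one tuple per year
    pcc_by_year: list of PCC skills of the predicted JJA field, one per year
    """
    amplitudes = [enso_amplitude(*months) for months in djf_nino34]
    return pearson(amplitudes, pcc_by_year)


# Illustrative values only: a La Niña winter, a neutral winter, an El Niño winter.
winters = [(-1.5, -1.6, -1.4), (0.2, 0.1, 0.0), (2.0, 2.2, 2.1)]
skills = [0.6, 0.2, 0.8]
r = amplitude_skill_correlation(winters, skills)
```

Taking the absolute value treats El Niño and La Niña winters symmetrically, so a positive correlation means that skill rises with the strength of the ENSO forcing regardless of its sign.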

Summary and discussion
This study investigated the seasonal predictability of the summer NASH using hindcast data from four operational climate prediction models, and revealed its possible connection with ENSO. Based on reconstructed indices describing variations in the intensity, area, eastern boundary and ridge line of the NASH, it is found that the intensity and area indices show relatively high skills in the models, while the position indices show relatively low skills. The MME mean, calculated as the arithmetic average of the four models, offers relatively higher and more stable skills. The prediction performance for the NASH largely relies on the ability of the models to reproduce the relationship between the NASH indices and the tropical-to-subtropical SST anomalies connected with ENSO. The pattern of atmospheric circulation anomalies over North Africa in response to ENSO is well captured by the models, suggesting that the ENSO signal is the dominant source of predictability for the summer NASH.
Figure 5 (caption fragment). [...] figure 1) (x-axis) and the absolute value of the former December-January-February (DJF) mean Niño3.4 index (y-axis) for the BCC_CSM1.1(m) (brown), NCEP CFSv2 (green), ECMWF System 4 (blue), JMA CPSv2 (dark yellow) models and the MME mean (red) at 1-month lead. Double (single) asterisk denotes a correlation coefficient statistically significant above the 99% (95%) confidence level.
Generally, coupled dynamical models perform reliably in predicting the summer NASH. However, biases still exist and need to be addressed, especially in the prediction of position, which in the case of the WPSH has been shown to be more critical than the intensity and area in affecting the rain belt location and summer monsoon onset (Chang et al 1999, Mao et al 2010, Ren et al 2013, Yang et al 2017, Guan et al 2019). The position of the NASH may be more sensitive to the internal variability of the climate system than its intensity and area. A variety of factors, including the strength and location of the subtropical jet stream, the monsoon circulation, and the land-sea thermal contrast, may all significantly affect the position of the NASH. These factors are all subject to internal variability, which can lead to unpredictable changes in the position of the NASH. Additionally, although there are linear relationships between the prediction skills and both the ENSO teleconnection (figure 4) and the ENSO intensity (figure 5), the spread cannot be ignored, even setting aside model deficiencies and errors. It may arise from the general characteristics of climate variability over the extratropical lands and oceans, indicating that forcings relatively independent of ENSO, such as land surface processes, atmospheric internal processes, and air-sea-ice interactions at mid-to-high latitudes, may play crucial roles in predicting climate variability in North Africa (Hu et al 2017, Kosaka et al 2012, He et al 2016). Further investigation is needed to reveal the detailed mechanisms in future studies.