Variability modes of September Arctic sea ice: drivers and their contributions to sea ice trend and extremes

The variability of September Arctic sea ice at interannual to multidecadal time scales in the midst of anthropogenically forced sea ice decline is not fully understood. Understanding Arctic sea ice variability at different time scales is crucial for better predicting future sea ice conditions and separating the externally forced signal from internal variability. Here, we study modes of variability, extreme events and trend in September Arctic sea ice in 100–150 year datasets by using time-frequency analysis. We extract the non-linear trend for sea ice area and provide an estimate for the sea ice loss driven by anthropogenic warming with a rate of ∼−0.25 million km2 per decade in the 1980s and accelerating to ∼−0.47 million km2 per decade in 2010s. Assuming the same accelerating rate for sea ice loss in the future and excluding the contributions of internal variability and feedbacks, a September ice-free Arctic could occur around 2060. Results also show that changes in sea ice due to internal variability can be almost as large as forced changes. We find dominant modes of sea ice variability with approximated periods of around 3, 6, 18, 27 and 55 years and show their contributions to sea ice variability and extremes. The main atmospheric and oceanic drivers of sea ice modes include the Arctic Oscillation and Arctic dipole anomaly for the 3 year mode, variability of sea surface temperature (SST) in the Gulf Stream region for the 6-year mode, decadal SST variability in the northern North Atlantic Ocean for the 18-year mode, Pacific Decadal Oscillation for the 27 year mode, and Atlantic Multidecadal Oscillation for the 55 year mode. Finally, our analysis suggests that over 70% of the sea ice area loss between the two extreme cases of 1996 (extreme high) and 2007 (extreme low) is caused by internal variability, with half of this variability being related to interdecadal modes.


Introduction
Understanding the causes of Arctic sea ice variability and change is crucial to improve projections of sea ice conditions in the future. Over the past four decades, September sea ice area and thickness in the Arctic have significantly declined due to anthropogenic warming (IPCC 2022). This downward trend in sea ice loss has not been constant and is modulated by internal variability (e.g. Swart et al 2015). For instance, the apparent acceleration after the mid-1990s could be partly attributed to internal climate variability such as the phase change from cold to warm in the Atlantic Multidecadal Oscillation (AMO; Day et al 2012) and/or internal atmospheric variability originating from the tropical Pacific (Ding et al 2017, Baxter et al 2019. Many studies investigated the relative importance of internal variability of the Arctic compared to anthropogenic warming based on satellite observations, which are available since 1979 (e.g. Stroeve et al 2008, Swart et al 2015, Ding et al 2017, Desmarais and Tremblay 2021. However, it remains a challenge to assess the contribution of decadal to multidecadal variability with such limited length of sea ice observations. Moreover, most previous studies did not separate the signals of different modes of internal variability and did not explore the difference in the rate of sea ice change between interannual, decadal and multidecadal 2. Methods

Datasets
We use data from ERA-20C , a global atmospheric reanalysis for the period of 1900-2010 at a horizontal resolution of T159 (approximately 125 km). ERA-20C was produced with the Integrated Forecast System (IFS version Cy38r1) atmospheric model, using SST and ice cover from 1 • horizontal resolution HadISST2.1 (Hadley Centre Sea Ice and Sea Surface Temperature dataset; Titchner and Rayner 2014) as surface boundary conditions, and assimilating observations of surface pressure and marine winds. SST and sea ice data provided in ERA-20C are thus from HadISST2.1. ERA-20C has been used to explore the variability of Arctic climate and sea ice, and it has shown acceptable performance (Belleflamme et al 2015, Wegmann et al 2018, Schweiger et al 2019. It agrees well with ERA-Interim and MERRA regarding the annual cycle of downwelling radiation, air temperature, and wind speed in the Arctic in the overlapping time periods. Belleflamme et al (2015) showed that ERA-20C could capture the circulation anomalies accounting for the 1920-1930 summertime Greenland buoyancy loss and lateral buoyancy exchange warming. Thus, we are confident that ERA-20C data rather realistically simulate earlier periods of 20th century as well. In addition, we also analyze Walsh et al (2019; hereafter Walsh19) and satellite observations (OSI-450;OSI SAF 2017) in section 3 to check the consistency of our analysis between different datasets. These three datasets have a significant correlation of more than 0.9 for the period of 1980-2010 for their September sea ice. Walsh19 dataset is a reconstruction of Arctic sea ice coverage from 1850 to 2017 based on a variety of historical observations of sea ice that have been combined into a database.

Extracting modes of variability and non-linear de-trending
We use EEMD-a method suitable for non-linear and non-stationary data-to extract the modes of variability and detrend the data (Wu and Huang 2009). A full description of the method is given in the appendix and the source code is provided in the data and source codes section.
Similar to harmonic analysis, EEMD decomposes timeseries into their fundamental components but uses an adaptive (data-driven) method instead of using a priori determined basis function as in Fourier transform. Moreover, this method accounts for non-linearity to estimate the trend in data and, unlike other regression fitting methods, it does not use a predetermined function to find the trend. It is also a powerful tool to find dominant modes of variability and to capture changes in their amplitudes and contributions to occurrence of extreme events. It is a common and accepted technique in signal processing. In climate-related research, several studies benefited from the EEMD method to study variability and trends in the atmosphere (Franzke 2009, Shi et al 2012, Ji et al 2014, Huang et al 2017, ocean (Kenigson and Han 2014) and sea ice .
EEMD decomposes a timeseries x(t) into a finite number of intrinsic mode functions (IMFs) which are the fundamental components of x(t). These modes include oscillatory (periodic) components C j (t) (j = 1, 2, …, n − 1) and a trend mode R n (t), which is a monotonic (with no more than one minima or maxima) and non-periodic function of time (Ji et al 2014): The optimal number of dominant IMFs (n) is dependent on the length of data as well as data characteristics. For a dataset with N time steps, the number of IMFs is normally between log 2 (N) − 1 and log 2 (N) depending on the dataset (Huang et al 2020). These IMFs are organized from the highest to the lowest frequency as individual timeseries and have the same unit as the original signal. For ERA-20C (1900ERA-20C ( -2010, we extract 6 IMFs for a given variable akin to the century-long study by Ji et al (2014). For the reconstructed sea ice data of Walsh19 (167 years) and satellite observations of OSI-450 (36 years), we extract 7 and 5 IMFs, respectively. To de-trend the timeseries, we remove the last IMF from the original signal. Note that for 2D variables, we apply EEMD to the timeseries at every grid point.

Sea ice area
Sea ice area is calculated by summing the product of sea ice concentration (larger than 15%) and grid cell area over the Northern Hemisphere. Figure 1(a) shows the September sea ice area from HadISST2.1 (hereafter ERA-20C), Walsh19 and OSI-450. In ERA-20C, the constant and large sea ice area in the 1940s is not consistent with the reported warm Arctic (Johannessen et al 2004, IPCC 2022 and sea ice area by Walsh19 for the same period. Since the real Arctic sea ice area for the 40 s remains unclear, we first replaced . Note that the large values in the 40s were removed in the ERA-20C data, as explained in section 2.3. The difference between the trend mode of datasets is partly because they cover different time spans. In Walsh19 with a longer timeseries, more variability modes were extracted from the original signal in order to obtain the trend mode (see section 2.2). The trend mode extracted for the sea ice area of Walsh19 was proposed to be attributed to anthropogenic warming. the 1940s sea ice values with the averaged values of the late 1930s and early 1950s. The high and low sea ice years are defined as those years with de-trended September sea ice area exceeding ±1 standard deviation (SD; table 1; the light red and blue shaded areas in figure 2(b)). Table 1. Years with high (above +1 SD) and low (below −1 SD) September sea ice area for ERA-20C after non-linearly detrending. Years which are exceeding ±1.5 SD are indicated in bold. The SD value is 0.4.

Trend mode in September sea ice area and its relation to anthropogenic climate warming
The last IMF (trend mode) shows different trend curves for ERA-20C compared to Walsh19 and OSI-450 (figures 1(a) and (b)). This is obviously related to difference between the datasets (e.g. having different mean states) but also because the trend mode-as the remaining non-periodic part of the signal after extracting all the variability modes-might contain variability modes with periods equal or longer than the length of the original signal that cannot be extracted. In Walsh19 timeseries which is longer than ERA-20C, a centennial mode could also be extracted as an individual IMF (see figure A1(a)), meaning that more modes were extracted from the original signal to get the trend mode. If we add IMF6 to the trend, the resulting curve will have about the same slope as in ERA-20C (figure A1(b)). This highlights the importance of using long timeseries as well as non-linear de-trending of climatological data. When linearly de-trending and/or using short timeseries, the trend still contains information about multidecadal and centennial signals (see Ding et al 2019).
It is widely agreed that anthropogenic climate warming is the main driver for sea ice loss in recent decades (IPCC 2022). The negative slope in the trend mode of sea ice area is in agreement with this assertion ( figure 1(b)). In ERA-20C, the trend mode should be related to the anthropogenic signal plus any centennial-scale variability. In the trend curve extracted for Walsh19 data, however, the trend mode excludes the centennial mode as mentioned before, and shows steady values until about 1920 and a continuous reduction afterwards, which follows well the global average temperature change (figure 1.2 in Allen et al 2018) and anthropogenic CO 2 emissions (Notz and Stroeve 2016). Taking this trend curve, the sea ice loss can be estimated to ∼2.2 million km 2 between 1850 to 2017, of which 1.1 million km 2 happens between 1980-2010. The total sea ice loss in the EEMD trend mode of ERA-20C and OSI-450 sea ice area between 1980 and 2010 is 1.8 million km 2 and 2.2 million km 2 , respectively, and the difference between datasets was explained earlier in this section. Notz and Stroeve (2016) suggested a linear relationship between cumulative carbon emissions and Arctic sea ice (i.e. 3 ± 0.3 million km 2 of sea ice melt for the emission of 1000 gigatonne of CO 2 ). For the 746.5 gigatonne of emitted CO 2 between 1980 and 2010 (Le Quéré et al 2018b), the estimated amount of sea ice melt corresponds to 2.2 ± 0.2 million km 2 . This value is close to our estimate for OSI-450 but higher than the ones based on ERA-20C and Walsh19. Part of the difference could be because Notz and Stroeve (2016) used sea ice extent in their method, while our analysis is for sea ice area. On the other hand, sea ice melt could be overestimated in Notz and Stroeve (2016) as was pointed out in Ding et al (2019). Using the method from Ding et al (2019), which subtracts the contribution of internal variability, yields 1.35 million km 2 of sea ice loss due to anthropogenic forcing between 1980 and 2010, closer to our estimate of 1.1 million km 2 based on Walsh19.
The rate of sea ice change per decade (first derivative with respect to time multiplied by 10) in the trend curve for Walsh19 is about −0.25 million km 2 per decade in 1980 which accelerates to −0.47 million km 2 per decade in 2010 and −0.53 million km 2 per decade in 2017 ( figure A2). This acceleration is presumably a sign of self-accelerating loss of sea ice due to the ice-albedo feedback (Notz and Marotzke 2012) and other contributors to Arctic amplification. Using Taylor expansion for the trend curve, the rate of sea ice loss (in million km 2 per decade) can be approximated as: −(0.53 + 0.0092 × ∆t + 0.16 × 10 −4 × ∆t 2 ) relative to 2017 where ∆t refers to the number of years. Following this formula into the future would suggest a September ice-free Arctic around 2060. This is considering only anthropogenic warming without the contribution of internal variability, excluding any feedbacks for thin ice cover and assuming sea ice loss would continue to accelerate in the future until September ice-free Arctic is reached. Although our approximation for the timing of a September ice-free Arctic is questionable, it is close to the estimate of the CMIP6 multi-model ensemble mean for the high emission scenario (Notz and SIMIP Community 2020).

Modes of variability in September sea ice area
The IMFs 1-5 (figure 2(a)) are the oscillatory modes and have periods of around 3, 6, 18, 27 and 55 years, respectively (period = N (number of local maxima−1) ). These periods have values close to the peaks (above 90% confidence curve) at 3, 5, 18, 25 and 55 years in the spectral analysis of the original timeseries (figure A3). (b) the de-trended September sea ice area (sum of all the 5 modes): red and blue shaded colors refer to years when anomalies are exceeding +1 and −1 standard deviation, respectively (high and low sea ice years), pink and purple shaded colors correspond to positive and negative anomalies, respectively.
Following the statistical significant test suggested by Wu and Huang (2005) and Coughlin and Tung (2005), the IMFs are statistically significant above the 95% level (see appendix A3 for details). The sensitivity of IMFs to parameterizations used in the method is also tested to show how robust these modes are. All the modes except IMF5 are found to be insensitive to the parameter choices of the EEMD method. The uncertainty in probably because the quality of these datasets improve throughout the century as the number of observations increases.
The total sea ice area anomaly (the de-trended signal) is the sum of IMF 1-5, which is equal to the original timeseries minus the last IMF (figure 2(b)). The well-known years of 1996 and 2007, which are high and low sea ice years, respectively, are captured well (Johannessen et al 2004, Kauker et al 2009. The low sea ice years in 1957-1961, 1968 and 1971, and the high ones in 1967, in 1980s, 1994 and 2001

Sea ice area
Previous studies suggested that the rate of sea ice loss accelerated in the late 1990s (Day et al 2012, Ding et al 2017. Recalling that the total sea ice area includes the trend mode plus the variability modes, we investigate the contribution of internal variability to sea ice area loss for ERA-20C. In section 3.1, we mentioned that the rate of sea ice loss in the trend mode has been increasing. This obviously results in more sea ice loss after 1995 but not as significant. For sea ice variability, however, we see a change in mid 1990s (figure 2(b)). The integrated rate of sea ice change is +0.015 million km 2 per year between 1980-1995, and −0.06 million km 2 per year for 1995-2010, resulting in about +0.25 million km 2 and −0.9 million km 2 change in sea ice area, respectively. This shows the acceleration of Arctic sea ice loss after 1995. Thus, internal variability can slow down or accelerate the background anthropogenic-driven sea ice loss in agreement with the finding of Swart et al (2015) and Ding et al (2017).
For the two extreme cases of 1996 (extreme high) and 2007 (extreme low), the total sea ice difference amounts to about −3.4 million km 2 . Out of this total loss, −2.5 million km 2 (73%) comes from the difference between 1996 and 2007 anomalies and can be attributed to internal variability (figure 2(b)). Further, by summing up the interdecadal modes 3, 4, and 5, we find that half of this internal variability-induced loss is related to these modes (figure 2(a)). This indicates the importance of variability modes in rapid sea ice decline in the Arctic (Auclair and Tremblay 2018).

Sea ice concentration
To quantify the spatial distribution of sea ice change in the trend compared to internal variability, we regressed the trend mode and the de-trended signal of sea ice concentration against time, respectively. The difference between the two periods of 1980-1995 and 1995-2010 (figure 3) shows that sea ice loss in the trend mode is only slightly larger after 1995, but it substantially increases in the internal variability part (more regions with negative anomalies). The large area covered with positive sea ice anomaly in figure 3(c) indicates that internal variability hindered the rate of sea ice loss before 1995 in those regions. Comparing figures 3(b) and (d) implies that the contribution of internal variability is larger than the trend mode in Kara and East Siberian Seas. Our results highlight the non-uniform changes and regional variations in Arctic sea ice trend and variability (see England and Polvani 2019).

Contribution of each mode to the September sea ice variability and high/low sea ice events
To understand the contribution of each sea ice mode to the total de-trended sea ice anomaly throughout the Arctic, we calculate the variance ratio of the sea ice modes relative to the total variance of de-trended sea ice concentration (σ i ⁄σ total ; figure 4). The largest variance ratio, as expected, is for mode 1 (period = 3 years) and is between 20%-60%, although other modes show contributions to the variability in different sub-basins of the Arctic. Mode 2 (period = 6 years) contributes 10%-20% in most regions while mode 3 (period = 18 years) and mode 4 (period = 27 years) mainly play a role over the eastern and western Arctic, respectively, and mode 5 (period = 55 years) is dominant over the north of Chukchi and East Siberian Seas.
Similarly to figures 4 and 5 shows the contribution of each mode to high/low sea ice events (larger than ±1 SD). We first calculate composites of high minus low sea ice events for the total as well as every mode of sea ice concentration, and then calculate the ratio of each mode compared to the total composite. Mode 1 is again the main contributor to the high/low sea ice events but synchronization of other modes together without mode 1 can also result in such events (e.g. low sea ice year in 2008). Half of the difference between the two extreme cases of 1996 and 2007 (−2.2 million km 2 ) is related to the interdecadal modes (modes 3, 4 and 5). The interdecadal modes provide a background state for interannual variability, and they make the anomaly stronger (weaker) if they (de-)synchronize. Without the interdecadal modes, sea ice area would have been high in 2008 and 2009. The contribution of each mode to the sea ice change in different sub-basins is: mode 2 and 3 in the eastern Arctic and Barents Sea, mode 4 in the Barents Sea and the Baffin Bay, and mode 5 in Chukchi Sea and Central Arctic. After applying EEMD to the timeseries at every grid point of September sea ice concentration to find their dominant modes of variability, the contribution of each mode of sea ice variability is calculated as their variance ratio relative to the total variance of de-trended September sea ice concentration (σ i ⁄σ total ): (a) total variance σ total of the de-trended sea ice concentration (1900-2010); (b)-(f) variance ratio (σ i ⁄σ total ) of the sea ice modes (mode 1−5) relative to the total variance in percentage. Data for sea ice concentration was obtained from the ERA-20C dataset.

Drivers of September sea ice variability
Below, we discuss composites of different variables based on high minus low sea ice years for each mode of sea ice variability. Our analysis of the composite maps (not shown)of sea ice concentration for most of the modes show that their anomalies start to appear mostly in July consistent with Blanchard-Wrigglesworth et al (2011). For atmospheric variables, we focus on June-July-August (JJA) anomalies and show JJA averages of SLP, surface air temperature at 2 m (SAT) and geopotential height at 500 hPa (GPH500) because the atmosphere leads September sea ice from JJA (Blanchard-Wrigglesworth et al 2011). For SST, we use yearly-mean values as oceans have a longer memory in response to internal variability. Below, we also discuss the relation between different modes of sea ice variability and main modes of internal climate variability in the Northern Hemisphere: AMO, PDO, AO, ENSO and QBO. In the appendix, calculations of these modes and their timeseries based on ERA-20C are shown. Note that we will only discuss variables which show significant composite values for the corresponding sea ice mode.

Mode 1 (period = 3 years)
The sea ice variability in mode 1 is associated with changes in atmospheric circulation. The high minus low sea ice composite map of SLP in summer (JJA; figure 6(a)) shows that most of the Arctic is covered by a low-pressure anomaly while there is a high-pressure anomaly along the coast from the Laptev Sea to northern Scandinavia. Such SLP pattern results in a meridional wind anomaly opposing the Trans-polar Drift Stream. Following the SLP isobars, sea ice convergence is expected in coastal areas of the East Siberian and Laptev Seas. This is also accompanied by a reduced sea ice export through the Fram Strait (Koenigk et al 2006). The SLP pattern resembles a combination of the positive AO (EOF1 of SLP) and the negative AD (EOF 2 of SLP) patterns (figure A4). AO and AD are the main factors controlling the large-scale atmospheric circulation over the Arctic in summer (Cai et al 2018). We find that mode 1 has correlations of +0.3 and −0.3 (95% significance) with the AO and AD indices, respectively. Most of the high sea ice events for this mode occur when AO is in the positive and AD in the negative phase, and the opposite is true for the low sea ice years. In fact, regressing the AO minus AD index shows the largest sea ice anomaly (not shown). The composite for SAT shows cooling over the Arctic, particularly the East Siberian Arctic (figure 6(b)) as would be expected from the atmospheric circulation anomaly pattern. At 700 hPa and the upper levels, the cold anomalies are even larger and covering almost the entire Arctic Ocean (figure 6(c)). This could be due to weakened solar irradiance and/or less subsidence of air from upper atmospheric levels consistent with the SLP pattern. Previous studies found that Arctic temperature changes at the surface due to the air descending from the upper atmosphere (Graversen et al 2008, Screen et al 2012, Ding et al 2017 and adiabatic warming makes the surface less cold compared to upper levels. We find westerly QBO for the high sea ice minus low sea ice composite of zonal winds at 50 hPa (figure 6(d)), consistent with previous studies (Garfinkel et al 2012 and references therein). However, we do not find a significant correlation between mode 1 and the QBO index despite their similar periodicity. An interesting feature of mode 1 is that the amplitude varies on a decadal time scale (figure 2(a)). This will need further investigation in the future.

Mode 2 (period = 6 years)
For mode 2, we see cold SST anomalies over the Barents Sea ( figure 7). The composite SST maps also show significant anomalies over the western North Atlantic. Previous studies showed the important role of North Atlantic SST for the Barents-Kara Sea (Nakanowatari et al 2014), and in particular the Gulf Stream SST being an important contributor to the ocean and/or atmospheric heat transport to the Barents Sea (Yamagami et al 2022). In our SST composite map, we see a SST pattern in the Atlantic Ocean that can be associated to the Gulf Stream variability (see Seo et al 2017). In fact, if we plot the latitude-time cross section of SST anomaly averaged over the longitudes 60 • W-30 W • ( figure 7(b)), we see dipole-like anomalies between 40 • and 60 • N with a periodicity of about 6 years that is very likely related to the Gulf Stream variability. Gangopadhyay et al (2016) also found variability of 4-10 years for the path of Gulf Stream (d) zonal wind velocity at 50 hPa (ms −1 ). The dotted areas in each map indicate regions where the differences between high and low sea ice years are statistically significant at the 90% level. All data were obtain from ERA-20C.
(with 7-10 years on the western side and 4-5 years on the eastern side of the path), and Muilwijk et al (2018) found a variability of 5-10 years in their simulations for the Atlantic Ocean heat transport towards the Arctic, which is close to the variability in mode 2. In addition to affecting heat transport to the Arctic, the variability of Gulf Stream SST can drive a teleconnection between the Gulf Stream region and the Arctic, which is demonstrated in our GPH500 composite map ( figure 7(c)). This connection can ultimately impact Arctic sea ice . We, therefore, suggest that mode 2 is partly linked to the variability in Gulf Stream SST.
In the Pacific Ocean, the composite map of SST shows a pattern resembling El Niño. We see, however, no significant correlation between the Nino index and Arctic sea ice, and rather weak teleconnection between the tropical Pacific SST and the Arctic in our composite map of GPH500 when using the entire timeseries from 1900-2010. This is because the strength of teleconnection between ENSO and mode 2 varies through time. In figure 7(d), the scatter-plot of high sea ice years of mode 2 projected into the PDO and ENSO phase space shows that high sea ice events are primarily associated with compounding effect of El Niño and positive PDO, or in general when ENSO and the PDO are having the same sign. For instance, by correlating GPH500 with the ENSO index (not shown), we find negative correlation values over the Arctic for the period between 1980-2010 when the PDO is mainly positive, and positive values for 1936-1966 when the PDO is mainly negative. We therefore propose that there is a relation between mode 2 and ENSO which is modulated by the phase of PDO. This is consistent with the previously suggested idea of a combined impact of tropical and extratropical Pacific SSTs on the Arctic (Svendsen et al 2021). Furthermore, the Gulf Stream displacement we discussed above could itself be generated through ENSO and PDO (Taylor et al 1998, Schemm et al 2016).

Mode 3 (period = 18 years)
The SST and SAT composites show cooling anomalies over the Arctic, mainly over the Barents Sea (figure 8). Over the North Atlantic, our SST composite pattern is similar to the decadal variability mode of Atlantic SST with a dominant time scale of 13-18 years in Årthun et al (2021; their figure 2). In an earlier study, Årthun and Eldevik (2016) found a variability of around 14 years and related decadal-scale sea ice variability in observed and modeled SST of the Norwegian Sea during the last century. The source of these SST anomalies is not fully understood but Årthun et al (2021) suggest large-scale atmosphere-ocean interaction as the driving force. We therefore suggest that mode 3 of sea ice can be linked to the decadal mode of atmosphere-ocean variability in the North Atlantic.

Mode 4 (period = 27 years)
The JJA SLP composite map ( figure 9(a)) has a negative anomaly over most of the Arctic causing an anomalous wind circulation that favors more sea ice in the Arctic and reduces the export of sea ice through Fram Strait. It also has a strong negative anomaly over Greenland and Baffin Bay, and in fact it is the only mode with such an anomaly. This, next to a cold SAT anomaly over the same region ( figure 9(b)), is consistent with the dominant role of this mode in controlling the sea ice in the Baffin Bay mentioned in section 3.4. Our SLP composite is similar to the central Arctic pattern in Wu et al (2012). They suggest that high September sea ice is associated with a negative SLP over the central Arctic with an interdecadal variability that changed phase in late 1990s from an anomalous cyclone to an anticyclone. This is consistent with our results where we find that mode 4 has a downward trend from late 1990s (IMF 4 in figure 2(a)).   The SST composite (figure 9(c)) resembles a PDO-like pattern in the Pacific Ocean agreeable with mode 4 having a period in a similar range as our calculated PDO index (25-28 year). We find that the PDO index leads mode 4 by about 6 years with a correlation of 0.65, and high (low) sea ice events follow after positive (negative) PDO. The SST composite also shows significant cold anomalies in the North Atlantic and sub-polar regions which can decrease ocean heat transport into the Arctic and leads consequently to increased Arctic sea ice area (Schlichtholz 2011, Årthun et al 2012, Koenigk and Brodeau 2014, Docquier et al 2019. From the composite of GPH500, there is an inter-basin connection between the Atlantic and Pacific Oceans which establishes an anomalous background circulation driving the SLP and SAT composites (figure 9(d)). A lead-lag-analysis shows that this anomalous circulation is present from two years prior to the sea ice events (not shown). This is in agreement with the suggested hypothesis for the extreme sea ice event in September 2007 that the shift in the summer wind pattern of the Arctic had already started in 2005 (e.g. Wang et al 2009, Overland and Wang 2010.

Mode 5 (period = 55 years)
The JJA SLP composite map ( figure 10(a)) has a negative anomaly over most of the Arctic similar to mode 4, but does not extend over Greenland and Baffin Bay. The JJA SAT composite shows cold temperatures not only over the Arctic but also over the Atlantic Ocean and most of the Northern Hemisphere continents ( figure 10(b)). The SLP and SAT composites are similar to the pattern of correlation between SLP/SAT and AMO index (not shown). Moreover, the SST composite resembles an AMO-like pattern in the Atlantic Ocean and we also find a correlation of −0.6 between AMO and mode 5 ( figure 10(c)). This implies that cold/warm phases of SSTs during AMO influences the Arctic through changes in oceanic heat transport (Docquier et al 2021) as well as establishing an anomalous atmospheric circulation (Castruccio et al 2019).

Summary and discussion
It is crucial to understand the causes of Arctic sea ice change and different modes of sea ice variability for better predictions of sea ice conditions in the future and to be able to better separate the externally forced signal in observations and future projections from internal variability. In this study we explored Arctic sea ice modes of variability, their driving sources and relations to large-scale modes of internal climate variability. We explored the 111 year long (1900-2010) reanalysis data from ERA-20C in order to study interannual to multidecadal modes of sea ice variability. To separately study each individual mode of variability, we de-trended and decomposed the analyzed data by applying the non-linear EEMD method. This method provides main modes of variability and the trend of timeseries and is thus well suited to provide estimates of the relative importance of internal variability of the Arctic compared to anthropogenic warming. To further support our analysis, we also performed EEMD on reconstructed sea ice data of Walsh19 and satellite data of OSI-450.
Based on our analysis, we proposed that the EEMD-extracted trend for sea ice area of Walsh19 could be linked to anthropogenic warming. We estimated a sea ice loss of 2.2 million km 2 between 1850 and 2017, and 1.1 million km 2 between 1980-2010. The rate of sea ice change was found to accelerate from about −0.25 million km 2 per decade in 1980 to −0.47 million km 2 per decade in 2010, and to −0.53 million km 2 per decade in 2017, probably a consequence of self-accelerating loss of sea ice due to the ice-albedo feedback (Notz and Marotzke 2012). Following this sea ice loss curve to the future, which is related only to the anthropogenic warming, we estimated a September ice-free Arctic around 2060. For ERA-20C, the sea ice loss in the trend mode was amounting for 1.8 million km 2 between 1980 and 2010. The difference between the two datasets is because in Walsh19 with a longer timeseries, more variability modes were extracted from the original signal in order to obtain the trend mode. This indicates the importance of de-trending method and using of long timeseries for better understanding climatic trends. Furthermore, the results showed that changes in sea ice due to internal variability at decadal to multidecadal time scales are as large as the anthropogenic-related rate and could thus substantially contribute to the observed sea ice trends at these time-scales. For instance, our analysis demonstrated that more than 70% of sea ice area loss between the two extreme cases of 1996 (extreme high) and 2007 (extreme low) was due to internal variability, of which half was related to interdecadal modes of variability.
It is important to understand how the interannual to multidecadal modes of sea ice variability and their underlying processes contribute to intense sea ice changes. This has so far received less attention. We discussed five main modes of variability for September sea ice area with approximated periods of around 3, 6, 18, 27 and 55 years, and showed regional sea ice variations for each of these modes. For every mode, we briefly explored the main sources of variability and their connections to climate modes of variability. We showed that mode 1 was driven by changes in the atmospheric circulation over the Arctic. The high (low) sea ice events in mode 1 were found to be linked to a combination of positive (negative) AO and negative (positive) AD. Interestingly, there was also a periodic behavior with an interdecadal cycle in the variability amplitude of mode 1. Mode 2, on the other hand, was found to be driven mainly by ocean SST variability. We suggested changes in the ocean/atmosphere heat transport associated with Gulf Stream variability/ displacements to be one of the main driver of this mode. We also showed that mode 2 was the only mode that could be linked to ENSO. Based on our analysis, the correlation between ENSO and Arctic sea ice was depending on the phase of the PDO, and high (low) sea ice events were primarily associated with compounding effect of ENSO and PDO being in the same (opposite) phase. We proposed that mode 3 could be related to the decadal mode of SST variability in the Atlantic Ocean. For mode 4, the PDO was suggested as a related source of variability by establishing an anomalous atmospheric circulation over the Arctic and influencing sea ice with a lag of 6 years. Regarding mode 5, results indicated a strong connection with the AMO.
The dominant modes of sea ice variability could be used to develop a probabilistic model for predicting the occurrence of extremes and rapid sea ice changes. Here, we will only provide a brief discussion of this topic, rather than covering it extensively. Our results suggest that high and low sea ice events occur when several variability modes are in the same phase (as shown in figure 2). While these modes are not precisely sinusoidal functions, they can be approximated as such, and their common period can be used to predict intervals when two or more modes are likely to be in the same phase. For modes 1-4, it is likely that the intervals during which two or more modes are in the same phase will be around 6, 18, 27, and 54 years, with some uncertainties. Therefore, high/low sea ice events are more likely to appear at these intervals. These findings are consistent with those in figure 2 and table 2. Extreme sea ice events (exceeding 1.5 × SD) mostly occur on multidecadal time scales (27 and 54 years), as do rapid sea ice change events where interdecadal modes are the primary contributors (section 3.3). Additionally, it is also possible to configure a predictive function based on lead-lag relations between sea ice modes and their driving sources. For example, mode 4 can be predicted years in advance using the PDO-related SSTs.
ERA-20C has been used to explore the variability of Arctic climate and sea ice and has an acceptable performance (Belleflamme et al 2015, Wegmann et al 2018, Schweiger et al 2019. Though, it has uncertainties due to limitations of available observational data used for its production. The accuracy of ERA-20C is dependent on the availability of observational data, and it improves over time as the number of observations increases throughout the century. According to Poli et al (2013Poli et al ( , 2015, ERA-20C can fairly reproduce the daily evolution of tropospheric meteorology at mid-and high latitudes, and reasonably represents known extreme events. However, it is recommended to evaluate uncertainties in different regions and during different periods when using this dataset. During the period 1979-2010, ERA-20C agrees well with other reanalysis products such as ERA-Interim, ERA-40, and NCEP-NCAR in the representation of synoptic variability (Dell'Aquila et al 2016). Although ERA-20C is ∼1 • C colder than ERAInterim over Greenland during the period 1980-2010, it could capture the circulation anomalies and warm period over Greenland during 1920-1930(Belleflamme et al 2015, Fettweis et al 2017. ERA-20C produces a QBO but is not in phase with observations (Anstey et al 2022). As for ENSO-related variability in ERA-20C, the main climatic features have been shown to be well-represented in the tropical Pacific and Atlantic Oceans (Pinheiro et al 2021). Over the Arctic, ERA-20C agrees well with ERA-Interim and MERRA regarding the annual cycle of downwelling radiation, air temperature, and wind speed during the overlapping time periods and shows considerable skill in capturing daily to monthly variability of temperature and wind for the pre-satellite period (Schweiger et al 2019). Analyzing ERA-20C data from the upper troposphere and above, and from the Southern Hemisphere should be done with caution due to the lack of observational constraints particularly when going back in time.
To assess uncertainties associated with the ERA-20C dataset, particularly during the first half of the 20th century, we conducted additional composite analyses using only the second half of the data . However, we found no significant differences compared to our previous results (not shown). To further test the consistency of our obtained sea ice modes of variability, we compared sea ice area in the HadISST2.1 (boundary forcing for ERA-20C) with that in the HadISST1.1 (Rayner et al 2003). The detrended signals from both datasets exhibit a correlation of 0.8 and agree on roughly 50% of the high and low sea ice events.
Our analysis of the ERA-20C dataset and the Walsh19 sea ice reconstruction provides valuable insight into the variability and trend of sea ice during the 20th century. However, some of the results we obtained require further evaluation by climate models. We will incorporate these findings into future research to better understand the complex mechanisms driving sea ice variability in the Arctic.

Data availability statements
The information about ERA-20C data and how to download them are given in here: www.ecmwf.int/en/ forecasts/datasets/reanalysis-datasets/era-20c.
The reconstructed sea ice data of Walsh et al (2019) A4). To define the PDO and AMO indices (figure A5), we calculate the EOFs of SST in the Northern Hemisphere (north of 10 • N). EOF1 resembles the PDO (period = 25-28 years) and the EOF2 resembles the AMO (period = 55-60 years). This is a slightly different approach than other studies but results are similar (with correlations of more than 0.9) when we use definitions from the NCAR team (https://climatedataguide.ucar.edu/climate-data/).

A2. Time-frequency analysis and non-linear de-trending
In signal processing, filtering of non-stationary and non-linear data in either the time-domain or frequency-domain comes with some limitations. Thus, to obtain different modes that compose a non-stationary and non-linear signal, a joint time-frequency transform where a signal is simultaneously analyzed in both time and frequency domains is more appropriate. This is particularly important when dealing with data that have non-linear trend next to oscillatory behavior such as the variability and long-term decline in the Arctic sea ice. Understanding the time-varying frequency content of the signal is a very helpful step in understanding the occurrence of extreme events. Empirical mode decomposition (EMD; Huang et al 1998) is one of the time-frequency analysis methods that is widely used in signal analysis. It uses an adaptive (data-driven) method in contrast to Fourier or wavelet transforms where a priori determined basis function is defined. Thus, it is a multi-purpose method to confine any event in time and investigate frequency of variability in data (Salisbury and Wimbush 2002). To add robustness and prevent possible signal distortion and mode mixing phenomenon in EMD, the EEMD was developed (Wu and Huang 2009), in which an ensemble of white noise is added to the original signal. The amplitude of the noise which is added is in the range of 0.1-0.4 time the standard deviation of the original signal. EEMD decomposes timeseries into a finite number of IMFs through the sifting process. This is done by: identifying the local maxima and minima in the signal; connecting (interpolating) these maxima and minima points with cubic spline lines to form an upper and a lower envelope; taking the mean of the two envelopes and subtract them from the original data (Wu et al 2007). This process will be iterated until the mean of the two envelopes equals zero, and the resulting function is oscillatory. Once this is achieved, the first IMF (IMF1), which has the highest frequency, is obtained. By subtracting the resulting IMF from the original signal and repeating the process, the next IMFs are computed until no more IMF can be extracted. The last IMF is the trend mode which is a monotonic function and is the difference between the primary signal and the summation of the oscillatory IMFs (Wu and Huang 2009). The instantaneous frequency of data can be obtained after applying Hilbert transform to IMFs and gives a realistic representation of the intrinsic oscillation of data as suggested by Huang et al (1998).

A3. Significance of IMFs
We used the statistical significance test for IMFs following the method suggested by Wu and Huang (2005) and Coughlin and Tung (2005) to see if they are true signal and not a random noise. The variance of IMFs in the original data are tested against the variance of IMFs in a random data (red noise). By using the bootstrapping method, we created 1000 random timeseries with the same autocorrelation as the de-trended sea ice area timeseries. We then carried out EEMD on the 1000 replicate timeseries to estimate variance distribution for IMFs and calculate the confidence intervals. The IMF components which have their variance outside the confidence interval are significant (contain signal information). In addition, we also used spectral analysis for every IMF component to check whether they are significant at the corresponding period of variability. Figure A1. (a) Six IMFs obtained by using EEMD representing dominant modes of variability of the September sea ice area from Walsh19 data. More variability modes were extracted in Walsh19 with a longer timeseries compared to ERA-20C and OSI-450. (b) Similar to figure 1(b) with the last IMF for ERA-20C (black) and satellite observation OSI-450 (red) but IMF6 (in panel (a)) is added to the last IMF of Walsh19 (blue). Figure A2. Rate of September sea ice area change per year for the last IMF (trend mode) of three different datasets used in this study: ERA-20C, Walsh19, and satellite observation OSI-450. The rate was calculated by taking the first derivative of the curves in figure 1(b) with respect to time. As explained in the text, trend mode extracted for sea ice area of Walsh19 and thus, its rate of sea ice area change (blue curve in this plot) is proposed to be attributed to anthropogenic warming. Figure A3. Obtained spectrum for the total variability (de-trended signal) of September sea ice area shown in figure 2(b). Peaks (above 90% confidence curve) can be seen at 3, 5, 18, 25, 35 and 55 years and are highlighted with grey stripes. Figure A4. Empirical orthogonal functions (EOFs; top panels) and their corresponding principal components (PCs; bottom panels) for sea-level pressure (SLP in Pa) in June-July-August (JJA) north of 20 • N with the variance explained by each EOF indicated in the upper-right corner of each panel (%). EOF1 and EOF2 patterns resemble Arctic Oscillation (AO) and Arctic dipole (AD), respectively, and their PCs were used in this study to define AO and AD indices. The SLP data used in this figure was obtained from the ERA-20C dataset. Figure A5. Empirical orthogonal functions (EOFs; top panels) and their corresponding principal components (PCs; bottom panels) for annually-averaged sea surface temperature (SST in oC) to the north of 10 • N with the variance explained by each EOF indicated in the upper-right corner of each panel (%). EOF1 and EOF2 patterns resemble Pacific Decadal Oscillation (PDO) and Atlantic Multidecadal Oscillation (AMO), respectively, and their PCs were used in this study to define PDO and AMO indices. The SST data used in this figure was obtained from the ERA-20C dataset.