Solar Cycle 25 Prediction Using N-BEATS

Solar activity varies with an 11 yr periodicity. This periodic variation affects space weather and heliophysics research, so it is important to accurately predict solar cycle variations. In this paper, we predict the ongoing Solar Cycle 25 using the neural basis expansion analysis for interpretable time series deep learning method. The 13-month smoothed monthly total sunspot numbers provided by the Sunspot Index and Long-term Solar Observations are selected to train and evaluate our model. We use root mean square error (RMSE) and mean absolute time lag (MATL) to evaluate model performance: RMSE and MATL measure the differences between the predicted and actual values along the Y- and X-axes, respectively. The RMSE value is 26.62 ± 1.56 and the MATL value is 1.34 ± 0.35, demonstrating that our model predicts sunspot number variation well. Finally, we predict the variation of the sunspot numbers for Solar Cycle 25 using the model. The sunspot number of Solar Cycle 25 will peak around 2024 February with an amplitude of 133.9 ± 7.2, meaning that Solar Cycle 25 will be slightly more intense than Solar Cycle 24.


Introduction
Solar activities show a periodic variation with an 11 yr cycle. Periods of intense activity often cause auroras, geomagnetic storms, and ionospheric disturbances on Earth. The sunspot number (SSN) has been used as the main indicator of solar activity. Accurate solar cycle prediction provides an effective estimate for heliophysics research and long-term space weather planning. Predictions have been ongoing for decades, and their difficulty, importance, and potential applications keep this research field active. At present, with the end of Solar Cycle 24, the prediction of the ongoing Solar Cycle 25 naturally becomes critical.
Solar cycle prediction methods usually fall into three categories: precursor, model-based, and extrapolation methods (Petrovay 2020). Precursor methods rely on measured values of solar activity or the magnetic field at a specified time to predict cycle peaks, so finding a good precursor is crucial. For example, Pishkalo (2008) used a parametric calibration method to predict the maximum SSN and concluded that it would reach 112.3 ± 33.4 in 2023 May or June. Hawkes & Berger (2018) noticed the potential of the global dipole, proposed the helicity flux as a precursor, and found that Solar Cycle 25 would be similar to Solar Cycle 24. Model-based methods use physical underlying theories and are mainly categorized into surface flux transport models and consistent generator models. These have only been proposed in recent years, but new ideas and studies on them are proliferating. For example, Upton & Hathaway (2018) proposed the advective flux transport model and predicted that Solar Cycle 25 would be similar to or slightly weaker than the previous cycle. Pesnell & Schatten (2018) used a solar dynamo model to predict a maximum SSN between 110 and 160, occurring between 2023 September and 2026 July. Extrapolation methods consider the physical processes behind SSNs to be statistically homogeneous and treatable by time series analysis. They use only a time series of SSNs (or other indicators of solar activity) and rely on previous values to extrapolate future trends, so extrapolation methods are also known as time series analysis or regression. They include spectral, linear regression, and nonlinear regression methods; nonlinear regression methods usually include phase space reconstruction and deep learning methods. Rigozo et al. (2011) outlined a spectral component analysis method and predicted the maximum SSN to be 132.1 in 2023 April. However, Sello (2019) used a nonlinear dynamical approach and found the maximum SSN to be 107 ± 10 between 2022 July and 2023 July. Pala & Atici (2019) employed long short-term memory (LSTM) neural networks to predict the maximum SSN to be 167.3 around 2022 July. Benson et al. (2020) employed a combination of WaveNet and LSTM neural networks to predict the maximum SSN to be 106 ± 19.75 around 2025 May. Okoh et al. (2018) used a hybrid regression neural network to predict the maximum SSN to be between 103.9 and 124.3, occurring between 2024 July and 2025 July.
However, the increased frequency of prediction updates and the uncertainty of the predicted results show the inadequacy of existing prediction methods (Petrovay 2020). Methods using precursors, particularly polar measurements or geomagnetic components, do not have a specific mathematical model; the predictions rely only on empirical correlations. The polar measurement data set is very short, so the predicted peaks are only an approximate range and are usually delayed by 5 or 6 yr (Schatten et al. 1996; Makarov et al. 1989). Lu et al. (2022) also pointed out that whatever precursor is used would produce significant errors in the predictions for a few individual cycles. Similarly, the solar dynamo model is very sensitive to changes in its parameters; even small random variations of the parameters can lead to unpredictable changes in the predicted results. The chaotic nature of the nonlinear dynamo also severely limits the possibility of prediction (Bushby & Tobias 2007). There are also problems with extrapolation methods. Setting the attractor and embedding dimensions for the attractor phase space reconstruction method is usually a challenge, because different values may cause large variations. Deep learning methods have powerful learning capabilities and can be trained with massive data to fit any complex nonlinear relationship. However, the structure of the neural network has a great impact on the prediction results. Also, for longer time series, many deep learning methods such as LSTM exhibit less satisfactory predictive capability (Zhou et al. 2020).
Nevertheless, deep learning methods for time series prediction have proliferated in recent years, and many of them have demonstrated higher accuracy and reliability than classical time series methods (e.g., Bai et al. 2018; Siami-Namini & Namin 2018; Pala & Atici 2019; Salinas et al. 2020; Zhou et al. 2020; Yoo & Kang 2021). With the emphasis on longer time series prediction, research on modeling long-term time dependence is also advancing. Here, we used the neural basis expansion analysis for interpretable time series (N-BEATS; Oreshkin et al. 2019) deep learning method, which exhibits better modeling capability, for solar cycle prediction. The model is used to predict the SSN variation of Solar Cycle 25. Moreover, we performed comparative experiments with different parameter settings and different improvements of the model to determine the best structure for predicting solar cycles.

Data
We selected the SSNs provided by the World Data Center for Sunspot Index and Long-term Solar Observations (WDC-SILSO) at the Royal Observatory of Belgium; the SSNs can be downloaded from https://www.sidc.be/silso/datafiles. The data consist of the 13-month smoothed monthly total SSNs from 1749 July to 2022 March (3273 months in total). Figure 1 shows the selected SSNs.
If the SSNs are regarded as a univariate time series, the prediction task is univariate multistep prediction: a historical time series [x_1, ..., x_t] is input into the model to generate output for the next n time steps [x_{t+1}, ..., x_{t+n}]. Here, we examine the autocorrelation between the SSNs and their lagged terms to determine the prediction horizon n. The optimal lag is about order 125 with an autocorrelation coefficient of 0.66, while order 120 also presents a fairly high autocorrelation coefficient of 0.64. Setting the prediction horizon to 120 is therefore reasonable. Moreover, solar cycles exhibit an average length of about 11 yr, so choosing the next decade as the prediction horizon is sufficient to reflect the intensity of Solar Cycle 25.
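This horizon choice can be reproduced in outline: scan the autocorrelation of the series over candidate lags and pick a lag with a high coefficient. A minimal sketch, using a synthetic 132-month periodic series as a stand-in for the real SILSO data (the function and variable names are illustrative):

```python
import numpy as np

def autocorr(x, lag):
    """Correlation between the series and a copy of itself shifted by `lag`."""
    x = np.asarray(x, dtype=float)
    return float(np.corrcoef(x[:-lag], x[lag:])[0, 1])

# Synthetic stand-in for the SSN series: an 11 yr (132-month) oscillation.
months = np.arange(3273)
ssn = 80.0 * (1.0 + np.sin(2.0 * np.pi * months / 132.0))

# Scan candidate lags and pick the one with the highest autocorrelation.
coeffs = {lag: autocorr(ssn, lag) for lag in range(60, 200)}
best_lag = max(coeffs, key=coeffs.get)  # peaks at the 132-month period here
```

On the real SSNs the peak is broader and sits near order 125, which is why a 120-month horizon is an acceptable compromise with the decade-scale cycle length.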
According to the extrapolation mode mentioned earlier, we used the sliding window method to divide the SSNs into input and output horizon pairs. The sliding window is a fixed-length window spanning the total length of the input and output horizons. The window is slid one time step at a time over all of the SSNs to sample all input and output horizon pairs. Furthermore, these samples need to be divided into training and testing sets, but many classical division methods are not suitable for time series prediction. For example, cross-validation is a common method for dividing data sets and validating performance in statistics. However, the chronological nature of a time series requires that the time index of the testing set be higher than that of the training set. Here, we divided the training and testing sets using a time series division method called TimeSeriesSplit, which is widely used in the machine-learning library scikit-learn (Pedregosa et al. 2011). Unlike standard cross-validation, TimeSeriesSplit selects only the data before the testing set as the training set. The sampling and division mentioned above are described in detail in Section 4.
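The sampling and chronological splitting can be sketched as follows; `time_series_split` here mirrors the behavior of scikit-learn's TimeSeriesSplit (the training set is always strictly earlier in time than the test fold), and the series is a placeholder rather than the real data:

```python
import numpy as np

def sliding_window_pairs(series, input_len, output_len):
    """Stride-1 sliding window producing (input, output) horizon pairs."""
    window = input_len + output_len
    return [(series[i:i + input_len], series[i + input_len:i + window])
            for i in range(len(series) - window + 1)]

def time_series_split(n_samples, n_splits, test_size):
    """Chronological splits in the spirit of sklearn's TimeSeriesSplit:
    each test fold follows all of its training data in time."""
    splits = []
    for k in range(1, n_splits + 1):
        test_start = n_samples - (n_splits - k + 1) * test_size
        splits.append((np.arange(0, test_start),
                       np.arange(test_start, test_start + test_size)))
    return splits

series = np.arange(600, dtype=float)  # placeholder for the SSN series
pairs = sliding_window_pairs(series, input_len=240, output_len=120)
splits = time_series_split(len(pairs), n_splits=3, test_size=60)
```

With the full SSN series the same procedure yields the 2880 pairs and 7 slices described in Section 4.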

Method
The N-BEATS model is a typical univariate multistep prediction structure with an input window and an output window. Figure 2 shows our model structure and an example of the input and output horizon pairs. The input window length is limited to m times the output one. For our model, the input and output window lengths are set to 240 and 120, i.e., m is 2.
N-BEATS has a three-layer nested structure. The outermost layer is a sequential structure with multiple stacks connected end to end, as shown in the light red box in Figure 2. The SSNs are input to the model and analyzed stack by stack. Each stack analyzes its input and presents a partial prediction, then the predictions are eventually integrated into the output by forward residuals. Normalization can reduce training time and improve prediction accuracy, so we employed the reversible instance normalization (RevIN; Kim et al. 2021) method to process our input and output. The mechanism removes and restores statistical information from the SSNs, which improves the prediction significantly.
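A minimal sketch of the idea behind RevIN (omitting the learnable affine parameters of the full method): each input window is standardized by its own mean and standard deviation, and the same statistics are restored on the model output.

```python
import numpy as np

class RevIN:
    """Reversible instance normalization, simplified: per-window statistics
    are removed before the model and restored afterward."""
    def __init__(self, eps=1e-8):
        self.eps = eps

    def normalize(self, x):
        self.mean = float(np.mean(x))
        self.std = float(np.std(x)) + self.eps
        return (x - self.mean) / self.std

    def denormalize(self, y):
        return y * self.std + self.mean

revin = RevIN()
window = np.array([50.0, 80.0, 110.0, 140.0])
normalized = revin.normalize(window)      # zero mean, unit variance
restored = revin.denormalize(normalized)  # round-trips back to the input
```

In the full method the statistics removed from the input window are restored on the forecast, so the model learns shift- and scale-invariant structure.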
Each stack is internally composed of multiple blocks; the blue box in Figure 2 shows the structure of each stack. Each block has two outputs, called backcast and forecast, indicated by the purple and green arrows, respectively. Multiple blocks are interconnected within each stack by means of bidirectional residuals and layer normalization (LN; Ba et al. 2016). In a stack, the backcasts of all blocks constitute the sequential processing of the input: each block generates a backcast, subtracts it from its own input via the backward residual, and passes the remainder to the next block. The input of the successor block is therefore simplified. Meanwhile, each block outputs a partial prediction (i.e., forecast). These partial predictions are aggregated by forward residuals, first at the stack layer and then at the outermost layer, to generate the model output. Furthermore, we proposed adding the LN mechanism behind the residuals, which significantly improves the predictive capability and generalization of the model. We designed various LN strategies and discuss them in detail in Section 5.
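The backward/forward residual flow can be illustrated with toy blocks. A real N-BEATS block is the fully connected network described below; the toy block here merely backcasts half of its input and forecasts a constant, purely to show the bookkeeping:

```python
import numpy as np

INPUT_LEN, OUTPUT_LEN = 240, 120

def toy_block(residual):
    """Illustrative stand-in for an N-BEATS block."""
    backcast = 0.5 * residual                        # part of the input it explains
    forecast = np.full(OUTPUT_LEN, residual.mean())  # its partial prediction
    return backcast, forecast

def run_stack(x, blocks):
    """Doubly residual stacking: subtract each backcast from the running
    input, sum each forecast into the stack output."""
    residual = x.copy()
    forecast = np.zeros(OUTPUT_LEN)
    for block in blocks:
        b, f = block(residual)
        residual = residual - b   # backward residual simplifies the input
        forecast = forecast + f   # forward residual aggregates predictions
    return residual, forecast

x = np.full(INPUT_LEN, 100.0)
residual, forecast = run_stack(x, [toy_block] * 3)
# residual: 100 -> 50 -> 25 -> 12.5; forecast: 100 + 50 + 25 = 175
```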
In the innermost layer, an interpretable structure is designed inside the block to generate the backcast and forecast, shown in the orange box of Figure 2. Deep learning methods can be regarded as universal function fitting tools, so both backcast and forecast essentially fit time series. The backcast fits the part of the input time series that the block extracts, while the forecast fits a partial prediction of the output time series. If the time series is considered as an unknown function curve, then the mapping functions for the backcast and forecast can be generated inside the block to fit the curve. The mapping functions are defined in decomposed form inside the block: θ^b and θ^f are the expansion coefficients of the decomposition terms of the function families g^b and g^f. This means that as long as the function families and their expansion coefficients are determined, the model can generate the unique mapping functions for backcast and forecast, respectively. Therefore, each block is designed in two parts. The first part is a four-layer fully connected neural network followed by two linear layers that generate the expansion coefficients θ^b and θ^f:

h_l = FC_4(FC_3(FC_2(FC_1(x_l)))),  θ_l^b = Linear^b(h_l),  θ_l^f = Linear^f(h_l),   (1)

where l is the block number, x_l denotes the input of the block, and FC is a nonlinear projection layer in the fully connected network; the superscripts b and f indicate backcast and forecast, respectively. The second part is the interpretable structure that generates the backcast and forecast. For convenience, x̂ and ŷ are used to represent backcast and forecast, respectively. There are two types of structures for the function families g^b and g^f: the generic DL model and the trend-seasonality model. The generic DL model is an ordinary black-box neural network, where g^b and g^f are designed as linear layers whose expansion coefficients directly generate the backcast and forecast by linear mapping; therefore, the generic DL model exhibits poor interpretability. Compared with the generic DL model, the trend-seasonality model can be divided into the trend and seasonality models, which correspond to the trend and seasonal components in time series decomposition. The trend model needs to fit the slowly changing trend component of the SSNs, so g^b and g^f are designed as p-order polynomials:

x̂_s = Σ_{i=0}^{p} θ_{s,i}^b t^i,  ŷ_s = Σ_{i=0}^{p} θ_{s,i}^f t^i,   (2)

where t = [0, 1, 2, ..., H−1]/H is defined as an H-dimensional time vector. Note that H stands for horizon here: H represents the input horizon in x̂ and the output horizon in ŷ. The subscript s indicates the stack number, and θ^b and θ^f are the coefficients of the p-order polynomials. As long as p is relatively low, the p-order polynomial fits the trend of the SSNs. To accommodate the periodicity of the seasonal component of the SSNs, the g^b and g^f of the seasonality model are designed as Fourier series with q decomposition terms:

x̂_s = Σ_{i=1}^{q} [θ_{s,2i−1}^b cos(2πit) + θ_{s,2i}^b sin(2πit)],  ŷ_s = Σ_{i=1}^{q} [θ_{s,2i−1}^f cos(2πit) + θ_{s,2i}^f sin(2πit)],   (3)

where θ^b and θ^f are used as the expansion coefficients of the Fourier series. More complex seasonal variation requires more decomposition terms. In Equations (2) and (3), both p and q represent the numbers of decomposition terms, which also correspond to the dimensions of θ. For either the trend or the seasonality model, once the expansion coefficients are determined, the corresponding backcast and forecast can be mapped. In other words, the model just needs to analyze massive data to obtain appropriate expansion coefficients so that it can give the prediction of the successor SSNs for any input.
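Given a set of expansion coefficients, the trend and seasonality function families map θ to concrete curves. A sketch with made-up coefficients (the coefficient values are illustrative, not learned):

```python
import numpy as np

H = 120                  # output horizon
t = np.arange(H) / H     # t = [0, 1, ..., H-1] / H

def trend(theta):
    """p-order polynomial basis of Equation (2): sum_i theta_i * t**i."""
    return sum(c * t**i for i, c in enumerate(theta))

def seasonality(theta_cos, theta_sin):
    """Truncated Fourier basis of Equation (3) with q harmonics."""
    return sum(theta_cos[i - 1] * np.cos(2 * np.pi * i * t) +
               theta_sin[i - 1] * np.sin(2 * np.pi * i * t)
               for i in range(1, len(theta_cos) + 1))

# Forecast = slowly varying trend plus a periodic seasonal component.
forecast = trend([100.0, 30.0, -20.0]) + seasonality([10.0, 5.0], [8.0, 3.0])
```

In the model, the block's linear layers output θ and these deterministic bases turn θ into the backcast and forecast, which is what makes the stacks interpretable.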

Experiment and Results
We set the parameters of the model according to the steps listed in Section 3. The model is designed as a three-layer stack structure, with the trend, seasonality, and generic architecture stacks from top to bottom, and the numbers of blocks inside the stacks set to 8, 3, and 6, respectively. Among these stacks, we used the trend and seasonality stacks to fit the trend and seasonal variations of the SSNs and left the remainder to the generic architecture stack. The blocks in the three stacks have θ dimensions of 2, 8, and 3, respectively. The fully connected neural network inside each block has 1024 hidden units in each layer. The output window length is set to 120 according to the autocorrelation analysis described in Section 2. We further set the input window length to 240 (the effects of different input and output window lengths are discussed in detail in Section 5), so the sliding window length mentioned in Section 2 is 360. We selected 3240 data points from the entire SSN series and sampled them using the sliding window, dividing the data into 2880 input and output horizon pairs. Afterward, TimeSeriesSplit divides the samples into 7 slices. In order for the model to learn sufficient time dependence within the SSNs, we set the minimum length of the training set and the length of the testing set to 1320 and 120, respectively. The divided slices are listed in the left half of Table 1.
All experiments are conducted on a device with an NVIDIA RTX 3060 Ti GPU and an Intel i5-11400 CPU. The model is built under the PyTorch deep learning framework. We chose the MAE loss function and the Adam optimizer (Kingma & Ba 2014) to train and optimize the model. The batch size is set to 10 and the iteration number is set to 50 epochs. The initial learning rate is 0.0005, and the learning rate decays to 0.75 times its last value at each epoch. Root mean square error (RMSE) and mean absolute time lag (MATL) are used to evaluate the performance of the model. MATL denotes the global temporal deviation between the predicted and actual values, measured by the mean of the absolute value of the time lag at which the correlation between the two achieves its maximum. We used MATL as part of the evaluation metrics, thereby taking into account the differences between the predicted and actual values along both the Y- and X-axes. We used the training set of each slice to train a model and verified its predictive performance on the testing set.
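The two metrics can be sketched as follows; the lag search range `max_lag` is an assumption of this sketch, since the paper does not state the range used:

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error between two equal-length series."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((a - p) ** 2)))

def time_lag(actual, predicted, max_lag=24):
    """Signed lag (in time steps) at which the correlation between the two
    series peaks; MATL is the mean of its absolute value over test slices."""
    best_lag, best_corr = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, p = actual[lag:], predicted[:len(predicted) - lag]
        else:
            a, p = actual[:lag], predicted[-lag:]
        corr = np.corrcoef(a, p)[0, 1]
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return best_lag

# A prediction delayed by 3 steps should have |time lag| = 3.
steps = np.arange(200)
actual = np.sin(2 * np.pi * steps / 50)
delayed = np.sin(2 * np.pi * (steps - 3) / 50)
```

RMSE alone cannot distinguish a prediction that is numerically close but temporally shifted, which is why the two metrics are reported together.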
Figure 3 shows the first, middle, and last predicted results of each slice, where the blue and red lines indicate the actual and predicted values, respectively. The figure exhibits good predictions in almost all slices, except for a few subplots, such as slices 6-60 and 6-119, that are less than satisfactory. As mentioned in Section 1, predictions of extrapolation methods rely heavily on the historical time series. Therefore, we consider that this is caused by the unusually long trailing of Solar Cycle 23, with rising and falling phases of 3.5 and 8.6 yr, respectively. This unusually long trailing can be found in the tail of slice 6-0, which exhibits an early entry into the rising phase of the next solar cycle in the prediction curve. This long trailing also causes a general overestimation of subsequent predicted values. Table 1 summarizes the performance of our model on each slice. The RMSE value over all slices is 26.62 ± 1.56, and the MATL value is 1.34 ± 0.35. Moreover, Table 1 shows that the RMSE and MATL evaluations are not always consistent. For example, the MATL value in slice 1 is relatively high while the RMSE value is quite low, but the opposite is true in slice 6. In previous studies, RMSE was taken as the only criterion to judge model performance, while we suggest that numerical and temporal deviations should be considered simultaneously. Furthermore, Table 2 and Figure 4 present the predicted and evaluated results for Solar Cycle 13 to Solar Cycle 24. The RMSE and MATL values over all these solar cycles are 25.62 and 0.42, respectively. Our model achieves rather good predictions of both the amplitudes and timing of these solar cycles, except for Solar Cycle 19 and Solar Cycle 24, which exhibit relatively high numerical deviations with RMSE values of 56.67 and 44.51.
Finally, we predicted the ongoing Solar Cycle 25 using the model built on the last slice. The inputs of the model are the 240 SSNs before the end of Solar Cycle 24 in the data set, and the prediction horizon is the next decade. Figure 5 shows the prediction of Solar Cycle 25, where the blue and red lines are the actual and predicted values, respectively. The prediction exhibits a cycle amplitude of 133.9 ± 7.2, occurring around 2024 February. Our prediction is consistent with the early variation of Solar Cycle 25. The prediction also conforms to the asymmetry of the solar cycle profile, which holds that the cycle profile has an obvious right-skewed distribution (Waldmeier 1936; Vaquero et al. 2006).

Discussion and Conclusion
Figure 4 shows that overestimates or underestimates of cycle intensity occur in individual cycles. It is worth discussing whether these misestimated cycles influence subsequent cycle predictions, and in particular the impact of the overestimated Solar Cycle 24 on the Solar Cycle 25 prediction. With our method, a well-trained model is equivalent to an approximate function mapped from the input to the output. The input is the observed values over the previous 240 months and the output is the predicted values over the next 120 months. We compared the deviation between the real values and the predicted values to evaluate the performance. For predictions of two adjacent cycles, there is at least one cycle time step between their inputs, so the inputs of the two adjacent cycles are quite different. Therefore, the prediction of a solar cycle is a relatively independent process. Furthermore, the input is always observed values, so there is no error accumulation or propagation between cycles. For example, slice 3-119 and Solar Cycle 19 are separated by only about half a cycle, their inputs are relatively similar, and their prediction horizons partially overlap, so the underestimation in slice 3-119 reflects the underestimation of Solar Cycle 19's intensity. However, for the Solar Cycle 24 prediction, we input the previous 240 observed values and got an overestimated prediction. In the prediction of Solar Cycle 25, the previous 240 observations are input in the same way. The inputs of the two cycles are quite different because they are one cycle apart. Therefore, the overestimation of Solar Cycle 24 did not affect the Solar Cycle 25 prediction.
In order to compare the effects of different input window lengths, we selected window lengths of 120, 240, 360, and 480. RMSE and MATL are again used to evaluate the performance of the different models. The results are listed in the left half of Table 3. The models with window lengths of 120 and 240 hold relatively low RMSE values, 25.61 and 26.62, respectively. However, the MATL value of the model with a window length of 240 is 1.34, meaning that this setting yields fewer temporal deviations on each slice. The lower standard deviations also support that the model with this setting has a more stable performance. Moreover, three model structures, i.e., no-LN, stack-LN, and block-LN, are designed using different LN strategies: no LN layers, LN layers added between stacks, and LN layers added between blocks, respectively. Their results are listed in the right half of Table 3. For the no-LN, stack-LN, and block-LN structures, the RMSE values are 30.38, 28.52, and 26.62, and the MATL values are 4.20, 2.41, and 1.34, respectively. This shows that model performance improves after adding LN layers, and it is optimal when LN layers are added between blocks.
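The block-LN strategy can be sketched as normalizing the running residual after each backcast subtraction. This is a toy illustration (no learnable affine parameters, stand-in block), not the exact trained configuration:

```python
import numpy as np

def layer_norm(x, eps=1e-8):
    """Layer normalization over a 1-D vector, without learnable affine."""
    return (x - x.mean()) / (x.std() + eps)

def toy_block(residual):
    """Illustrative block: backcasts half its input, forecasts its mean."""
    return 0.5 * residual, np.full(len(residual), residual.mean())

def run_stack_block_ln(x, blocks, out_len):
    """Doubly residual stack with LN applied between blocks (block-LN)."""
    residual = x.copy()
    forecast = np.zeros(out_len)
    for block in blocks:
        backcast, partial = block(residual)
        residual = layer_norm(residual - backcast)  # LN between blocks
        forecast = forecast + partial
    return residual, forecast

r, f = run_stack_block_ln(np.array([10.0, 20.0, 30.0, 40.0]),
                          [toy_block] * 2, out_len=4)
```

Renormalizing between blocks keeps the residual on a stable scale, which is a plausible reason the block-LN variant trains more reliably than the no-LN one.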
Different input lengths sometimes lead to different predictions for a specific cycle. For example, Dikpati & Gilman (2006) used a flux transport dynamo model to predict Solar Cycle 24. They applied the polar field strength of the previous three cycles as the input and estimated a stronger Solar Cycle 24 than the previous cycle. However, Choudhuri et al. (2007) used a surface flux transport dynamo model combined with the polar field strength of the previous cycle and found that Solar Cycle 24 would be weaker. We also compared the effects of different input lengths on our predicted results. Figure 6 shows the prediction results for cycles 21-24 using input lengths of 120, 240, 360, and 480. The blue line is the actual values, and the orange, green, purple, and red lines represent the predicted results with input lengths of 120, 240, 360, and 480, respectively. Although the predictions of the models with different input lengths differ slightly within the same solar cycle, the variation trend is basically the same. There is no significant difference in the predictions due to different input lengths, further proving that our model is stable.
Furthermore, we discussed the setting of different output lengths (i.e., prediction horizons). We compared the prediction performance of the model under output lengths of 120, 130, and 140, and explored how the model prediction changes with the output length. Table 4 lists the evaluations of the three models for each solar cycle. For output lengths of 120, 130, and 140, the RMSE values are 25.62, 28.09, and 31.30, and the MATL values are 0.42, 1.17, and 3.00, respectively. The RMSE and MATL values rise as the output length extends. This is in line with the expectation that the larger the output length, the greater the difficulty of prediction.
Figure 7 shows what these models predict for solar cycles 21-24. The blue lines represent the actual values, while the red, green, and purple lines represent the predicted values for output lengths of 120, 130, and 140, respectively. Although the performance of the model declines as the output length extends, it can still capture the variation in the solar cycles.
Table 5 lists the predictions of Solar Cycle 25 from our model and previous methods. Some studies predicted that Solar Cycle 25 would have a larger amplitude and be more intense than Solar Cycle 24 (e.g., Rigozo et al. 2011; Sarp et al. 2018; Pala & Atici 2019; Dang et al. 2022; Lu et al. 2022; Prasad et al. 2022). In contrast, other studies suggested that Solar Cycle 25 would have a smaller amplitude and be weaker (e.g., Pishkalo 2008; Okoh et al. 2018; Pesnell & Schatten 2018; Sello 2019; Benson et al. 2020; Lee 2020). Our result indicates that Solar Cycle 25 will have a slightly larger amplitude than the previous cycle, by about 15%.
In this paper, we used the N-BEATS method to predict Solar Cycle 25. The predictions for known solar cycles achieve RMSE and MATL values of 26.62 ± 1.56 and 1.34 ± 0.35, respectively, demonstrating that our model performs well and achieves fairly accurate predictions in both numerical and temporal terms. For Solar Cycle 25, the SSNs will reach 133.9 ± 7.2 around 2024 February, which would be slightly more intense than Solar Cycle 24.

Figure 2 .
Figure 2. Structure of the N-BEATS model.

Figure 3 .
Figure 3. Predictions for all slices, showing the first, middle, and last results of each slice.

Figure 6 .
Figure 6. Predictions of the solar cycles from 21-24 under different input lengths.

Figure 7 .
Figure 7. Predictions of the solar cycles from 21-24 under different prediction horizons.

Table 1
Divided Slices and Evaluations

Table 2
Evaluations of the Solar Cycles from 13-24

Table 3
Model Evaluations under Different Structures

Table 4
Evaluations of the Solar Cycle Predictions under Different Prediction Horizons

Table 5
Comparison of Predictions of Solar Cycle 25