Multicomponent Activity Cycles using Hilbert-Huang Analysis

The temporal analysis of stellar activity evolution is usually dominated by a complex trade-off between model complexity and interpretability, often by neglecting the non-stationary nature of the process. Recent studies appear to indicate that the presence of multiple coexisting cycles in a single star is more common than previously thought. The correct identification of physically meaningful cyclic components in spectroscopic time series is therefore a crucial task, which cannot overlook local behaviors. Here we propose a decomposition technique which adaptively recovers amplitude- and frequency-varying components. We present our results for the solar activity as measured both by the sunspot number and the $K$-line emission index, and we consistently recover the Schwabe and Gleissberg cycles as well as the Gnevyshev-Ohl pattern probably related to the Hale cycle. We also recover the known 8-year cycle for 61 Cygni A, in addition to evidence of a three-cycles long pattern reminiscent of the Gnevyshev-Ohl rule. This is particularly interesting as we cannot discard the possibility of a relationship between the measured field polarity reversals and this Hale-like periodicity.


INTRODUCTION
The number of spots that emerge on the solar surface is known to vary in a regular 11-year cycle (Schwabe 1844).Polarimetric observations of the Sun have revealed the relationship between these structures and the solar magnetic field (Hale 1908), with the interesting observation that the magnetic polarity of sunspots switches sign every consecutive cycle.This polarity reversal means that a complete magnetic cycle spans 22years, twice the length of the Schwabe (amplitude) cycle.
A stellar monitoring survey was conducted at Mount Wilson Observatory for over 30 years and led the way towards detecting similar starspot cycles in the chromospheric activity of solar-type stars, as measured by the Ca II H+K flux (Wilson 1978).For the many main-sequence stars in which such cyclic behavior was observed, Brandenburg et al. (1998) identified two relationships between cycle length, rotation period and mean activity level, known as the active and inactive branches.However, many stars since then have been found to present a coexistence of long and short activity cycles, some of these falling on both branches simulta-neously, and some between them (e.g., Boro Saikia et al. 2018).
For a few stars, the large-scale magnetic field has been reconstructed using Zeeman Doppler imaging (ZDI) spanning several years, allowing the detection of field topology variations.On ξ Boo A, ϵ Eri and HN Peg, such variations were found to be rapid and localized in time (Morgenthaler et al. 2012;Jeffers et al. 2014;Boro Saikia et al. 2015).Although τ Boo was originally thought to reverse its polarity yearly, significantly longer than its 120-day S-index cycle, it was found later that this discrepancy had arisen from poor time sampling.Jeffers et al. (2018) discovered τ Boo's 240-day magnetic cycle, the fastest ever observed, which coincided with two complete consecutive chromospheric cycles.61 Cyg A was the first cool star beyond the Sun where a magnetic cycle was detected in phase with its chromospheric cycle (Boro Saikia et al. 2016); similar results were recently also confirmed in HD 75332 (Brown et al. 2021) and 18 Sco.
In order to better trace the magnetic evolution of solar-type stars, the Mount Wilson dataset has been recently revisited on several occasions, either by extending the data using multiple other surveys while sticking to traditional periodogram methods (e.g., Boro Saikia et al. 2018), or by introducing new methods to take into account the non-sinusoidal and locally harmonic nature of the cycles (e.g., Olspert et al. 2018).Indeed, as is well known for the Sun, stellar activity cycles are not perfectly harmonic, or even strictly periodic; variations in both cycle length and amplitude are common and can make it challenging to interpret the results of methods in the Fourier domain.
As shown by Oláh et al. (2016), time-frequency analysis can be used to reveal multiple and changing cycles in the Mount Wilson survey dataset.In this letter, we propose to extend this idea by using fully adaptive timefrequency methods with the specific purpose of disentangling multiple physically meaningful components from the data.
The Hilbert-Huang Transform (HHT), based on the Empirical Mode Decomposition (EMD; Huang et al. 1998), is a data-driven technique that addresses this exact issue.The goal of Hilbert-Huang analysis is to decompose a signal into a finite sum of amplitudemodulated frequency-modulated (AM-FM) oscillations plus some monotonic trend.Such oscillations, known as Intrinsic Mode Functions (IMFs), present well-defined instantaneous values of amplitude and frequency.These can be derived from their analytic signal using the Hilbert transform.This method of obtaining a time-frequency representation heavily contrasts with other traditional methods from the literature.Linear methods based on inner products with a pre-determined family of signals, such as the Short-Term Fourier Transform (STFT) or the Wavelet Transform, have their resolution limited by the Heisenberg Uncertainty Principle.On the other hand, the HHT is able to precisely track single values of instantaneous frequencies for each component, while also being agnostic to the specific signal shape.
Many improvements to the original EMD algorithm have been proposed to address different issues arising from the original formulation.For a more in-depth review of the HHT and EMD, see Huang & Shen (2005) and Huang & Attoh-Okine (2005).In this letter, we use a combination of the Complete Ensemble EMD with Adaptive Noise (CEEMDAN; Colominas et al. 2014) and the post-processing steps proposed by Wu & Huang (2009).The instantaneous frequency is then calculated on the resulting modes using the Normalized Hilbert Transform (NHT; Huang et al. 2009).
Hilbert-Huang analysis has been able to extract and characterize periodicities in the solar cycles from many activity indicators, such as the sunspot number (Barn-hart & Eichinger 2011; Gao 2017), the 10.7 cm radio flux, and helioseismic frequency shifts (Kolotkov et al. 2015).
The remaining parts of this letter are organized as follows: in Section 2 we define our activity time series datasets both for the Sun and for 61 Cyg A; next, Section 3 details our data-driven multicomponent analysis method as well as the preprocessing steps needed; Section 4 presents the results of our experiments, and we conclude our discussion in Section 5.

OBSERVATIONS
For the Sun, the first observational dataset used in our analysis as a proxy for magnetic activity is the daily total sunspot number series obtained from the World Data Center for Solar Index and Long-term Solar Observations (WDC-SILSO) at the Royal Observatory of Belgium (http://www.sidc.be/silso/datafiles).The dataset spans January 1818 to the present.The second dataset is the monthly-averaged Ca II K emission index composite; derived from the Kodaikanal, Sacramento Peak, and SOLIS/ISS observations (Bertello et al. 2016(Bertello et al. , 2017;;Egeland et al. 2017).
The S-index time series data for K dwarf 61 Cyg A is composed of the long-term surveys from the NSO Mount Wilson project (MW; Duncan et al. 1991;Baliunas et al. 1995) and the solar and stellar activity program from the Lowell observatory (SSS; Hall et al. 2007;Lockwood et al. 2007).We also used data published by Boro Saikia et al. ( 2016) using the NARVAL high-resolution spectropolarimeter and the data presented in the radial velocity catalog from the California Planet Search (CPS; Howard et al. 2010;Rosenthal et al. 2021).We additionally determined the S-index from the ESPaDOnS and NARVAL spectropolarimeter data from 2015 onwards following the methodology described in Marsden et al. (2014).These measurements span more than 50 years of observations, extending from mid-1967 through early 2018.

HILBERT-HUANG ANALYSIS
The goal of our method is to find a finite set of oscillatory components that add up to the original data, while also describing each individual component by its instantaneous values of amplitude and frequency.To this end, our pipeline consists of a preprocessing step to correct sampling issues, followed by the actual decomposition of the signal, and a post-processing step to ensure the components meet certain conditions needed for the Hilbert analysis to, in the final step, extract their time-frequency representations.

Preprocessing
Although the EMD is a fully data-driven algorithm (meaning that it makes no assumptions a priori on specific signal models), it still performs poorly if the input signal is unevenly sampled (see e.g., Barnhart et al. 2011).Therefore, we follow a similar preprocessing approach to the one used by Kolláth & Oláh (2009), in which the data is first downsampled to uniform bins, and later upsampled with a cubic smoothing spline.
We bin each time series at 90-day intervals using a triangular window to compute the averages and to estimate the variances.The exact value of the binsize has no strong impact on the resulting signal, as long as it's within a reasonable distance from the periods of interest; for stars with wildly different rotation rates, this value should vary accordingly.
The binned data is then interpolated at 10-day intervals by a regularized cubic spline with weights inversely proportional to the bin variances.The resulting smoothed series preserves the local time-frequency content of the original data to a high extent, as demonstrated by Kolláth & Oláh (2009).Since there are no significant gaps in the solar observations, the amount of added information to fill in missing data is negligible.Even for the 61 Cyg A data, which has a few minor gaps, the added information is limited.

Decomposition
The EMD consists of an iterative algorithm in which each mode is successively extracted through a sifting process.This sifting iteratively updates the current mode by subtracting an estimate of the local mean, corresponding to the average of the upper and lower envelopes, which are taken as the spline interpolations of the local maxima and minima.Convergence is assessed through a stopping criterion that checks if the resulting IMFs have the following properties: 1. the number of local extrema and the number of zero crossings are equal (or differ by one); 2. the local mean is (close to) zero (almost) everywhere.
In other words, all local maxima must be positive, all local minima must be negative, and their envelopes must be symmetrical.After enough IMFs have been subtracted from the original signal x(t) for the remaining data to be completely monotonic, we can write the resulting decomposition as the finite sum: where each C k (t) is one IMF and r(t) is the monotonic trend.
The local nature of EMD may result in mode-mixing, i.e., similar oscillations being separated in different modes or vice-versa.The improved algorithm used in this work is the Complete Ensemble EMD with Adaptive Noise (CEEMDAN; Colominas et al. 2014).Its main idea is to apply the EMD to an ensemble of degraded copies of the original signal, each arising from a different realization of additive white Gaussian noise.The EMD then behaves more closely to a filter bank and the resulting modes are more regular; the decomposition of spurious modes is also avoided.
Specifically, let E k (•) be an operator that returns the k-th IMF via EMD, then define M(•) as the local mean operator: (2) Also, let w (i) be the i-th white noise realization (i = 1, . . ., I).Then, in each CEEMDAN iteration, we calculate the ensemble average of the local mean: with q 0 = x and where β k are coefficients controlling the relative energy of the introduced noise.The k-th pseudo-IMF D k is then found by the difference between consecutive averages:

Post-processing
In the above analysis, we referred to the output of the CEEMDAN algorithm as "pseudo-IMFs".This is due to the fact that a sum of two different IMFs does not in general preserve the necessary properties that define an IMF; in particular, the ensemble averages computed during CEEMDAN cause the resulting D k to lose such features.
Since these properties are important guarantees for the assumptions that each component is a single AM-FM oscillation with well-defined amplitudes and frequencies, a post-processing step is needed to transform the pseudo-IMFs into proper IMFs C k .
In this letter, we formalize the generic steps proposed by Wu & Huang (2009) in the following post-processing algorithm: with r 0 = 0. Since consecutive modes present oscillations on similar timescales, this algorithm extracts a proper IMF from each consecutive pair of pseudo-IMFs, resulting in the same number K of components.The final monotonic trend is given by r(t) = r K (t) + q K (t), the sum of the final step's trends.

The marginal Hilbert spectrum
The main purpose of our analysis is not only to separate each individual component of the time-series signal, but also to identify their underlying periodicities.Since the extracted IMFs are not constrained to harmonic tones, but are, by construction, single-mode signals, the instantaneous frequency is a better measure for their time-varying periodicities.
We start by assuming each IMF can be written as an AM-FM oscillation with an arbitrary, slow-varying amplitude and phase functions a k (t) and ϕ k (t): The demodulation of the AM and FM subcomponents can be done with stable results by using the Local Mean Decomposition (LMD; Smith 2005).This method is similar in nature to the EMD, but it decomposes a signal into a sum of Product Functions (PFs): where A j and F j correspond to the AM and FM factors of each PF and m is the local mean residue.Due to the nature of the stopping criteria for the EMD, equation 7 is only true as an approximation, while equation 8 holds whenever the LMD converges; however, the LMD can diverge when processing arbitrarily complex signals, which is why it is only applied here to the already simplified IMFs.Other examples of this joint EMD-LMD technique can be found in Yue et al. (2020).
A single PF can thus very closely approximate each IMF, and we can write: where µ k (t) is very close to zero everywhere, and can locally account for some edge effects of the EMD.We therefore associate a k = A k and cos(2πϕ k ) = F k and derive the time-varying instantaneous frequency f k = ϕ ′ k by Hilbert-transforming the FM component (Huang et al. 2009).The Hilbert spectra of each mode can hence be defined as the distribution of the instantaneous frequencies weighted by the instantaneous squared amplitudes.

Solar Cycles
To validate our methods, we tested the extent of our ability to recover multiple, distinct cycles on solar data using two different observational proxies, as detailed in Section 2: the daily total sunspot number series from 1818 to 2021, and the composite K-line emission index from 1907 to 2017 (Bertello et al. 2016;Egeland et al. 2017), hereafter referred to as the K-index.
Figure 1 presents the result of our decomposition of solar activity data.The K-index is shown to correlate strongly with the sunspot number counts, and for both series our analysis identifies three main modes: the first one at ≈11 yr, another one at twice that period, and a longer ≈90-year cycle.Although the K-index series is too short to resolve what appears to be the Gleissberg cycle, its trend component can be seen to correlate very well with the latter half of the final sunspot IMF, which indicates that this method is able to identify long-term trends and potential analogues of the solar Gleissberg cycle within the relatively short spectroscopic time series currently available for other stars.
It is reassuring to notice that the amplitude of the primary cycle seems to be modulated in a shape very similar to the Gleissberg component, which is to be expected if these components truly represent the underlying physical processes.Also, the secondary cycle appears to capture the empirical Gnevyshev-Ohl rule, which dictates that odd solar cycles tend to be stronger than even ones (Gnevyshev & Ohl 1948;Hathaway 2015).The underlying cause of this observed behavior is thought to be related to the ≈22-year Hale magnetic cycle, but the connection is not yet entirely understood.

61 Cygni A
We also applied our decomposition technique to the much shorter time series of 61 Cyg A S-index measurements.The composition of observations from different instruments is shown in the top left panel of Figure 2 together with the smooth interpolation resulting from our preprocessing steps.The remaining panels of Figure 2 present the three best defined cyclic components found by the application of CEEMDAN and their corresponding Hilbert spectra.Interestingly, besides the well-known ≈8-years long chromospheric activity cycle (e.g., Baliunas et al. 1995, ; cycle 1), a component with a period about three times as long is also well separated (cycle 2).This component can be visually identified in the original data as a variation in activity maxima repeating itself in a pattern every three consecutive cycles.Such behavior is reminiscent of the solar Gnevyshvev-Ohl rule, except with a period ≈3 times rather than twice that of the activity cycle.
This star had its large-scale magnetic field detected via ZDI in six different epochs ranging from 2007 to 2015 (Boro Saikia et al. 2016) and six more epochs from 2015 to 2018 (Boro Saikia et al. 2018).Its magnetic field polarity was found to flip twice between the first and last observation, indicating the presence of a Halelike magnetic polarity cycle in this star.Those polarity measurements are shown in Figure 2 as vertical lines, colored according approximately to the average field polarity and intensity at these epochs.An inversion can be seen to occur between the timeframes of two chromospheric activity minima, in agreement with the hypothesis of a ≈ 15-years long magnetic cycle.However, it is also important to note that a very short baseline was used to draw this inference, and the few datapoints we have cannot exclude the possibility that the Hale-like polarity cycle is actually linked to the observed threecycle Gnevyshev-Ohl pattern, with a somewhat longer inversion period of ≈ 11 years.The predictions of both these hypotheses are shown in Figure 2 as the colored red and blue backgrounds in the two cyclic components.We introduced a fully data-driven approach to separate different coexisting activity cycle components using the Hilbert-Huang transform that is immune to the nonstationary variations in cycle lengths and amplitudes known to exist.
This methodology was shown to be capable of not only recovering the mean periods of known solar cycles at different timescales, but also the time-domain characteristics of these components, such as amplitude variations, in agreement with expectations.
For the K-dwarf 61 Cygni A, our method confirms its Schwabe-like 8-year cycle and also presents evidence of a potential three-cycle long Gnevyshev-Ohl-like behavior, shown to be possibly tied to a polarity reversal.A confirmation of this result would be of some importance in the sense that (i) it strengthens the relationship between the solar Gnevyshev-Ohl rule and the Hale cycle, and (ii) it shows that we are able to detect the activity signature of magnetic polarity changes in stellar spectroscopic calcium data.

APPENDIX
A. ON THE DEPENDENCE OF THE RESULTS ON THE CHOICE OF SMOOTHING While describing our proposed pipeline and its main advantages, we highlighted the fact that it was a fully adaptive and data-driven method and that it doesn't rely on previous knowledge surrounding the dataset.We also claimed that, even though there apparently is need of a choice surrounding the smoothing procedures during the preprocessing steps, there was no strong effect on the results as long as the bin sizes were chosen within some reasonable interval.This short appendix intends to justify such a claim.
The main parameter that controls how much the input data is smoothed out is the size of the bins used for averaging.Figure 3 shows an example when the bins are almost twice as spaced out as what was used in the paper (see Figure

Figure 1 .
Figure 1.Result of CEEMDAN for the Sun.The sunspot number is shown in blue and the K-index in orange.The top-left panel shows the preprocessed input data.The right-side panels show the Hilbert spectra corresponding to their respective time-domain components.

Figure 2 .
Figure2.Result of our analysis for 61 Cyg A. The colored vertical lines indicate the epochs for which its large scale magnetic geometry was reconstructed using ZDI(Boro Saikia et al. 2016, 2018).The background colors indicate predicted polarities a hypothesis of a complete magnetic reversal taking three times cycle 1 (shown in the Cycle 2 panel) or two times cycle 1 (as in the Sun, shown in the Cycle 1 panel).