Time series synchronization in cross-recurrence networks: uncovering a homomorphic law across diverse complex systems

Exploring the synchronicity between time series, especially the similar patterns during extreme events, has been a focal point of research in academia. This is due to the fact that such special dependence occurring between pairs of time series often plays a crucial role in triggering emergent behaviors in the underlying systems and is closely related to systemic risks. In this paper, we investigate the relationship between the synchronicity of time series and the corresponding topological properties of the cross-recurrence network (CRN). We discover a positive linear relationship between the probability of pairwise time series event synchronicity and the corresponding CRN’s clustering coefficient. We first provide theoretical proof, then demonstrate this relationship through simulation experiments by coupled map lattices. Finally, we empirically analyze three instances from financial systems, Earth’s ecological systems, and human interactive behavioral systems to validate that this regularity is a homomorphic law in different complex systems. The discovered regularity holds significant potential for applications in monitoring financial system risks, extreme weather events, and more.


Introduction
Time series data mining has long been a widely researched topic [1][2][3][4].One of its subfields focuses on evaluating the synchronization and dependence between time series, assessing the degree to which a given time series is similar or correlated with another.Considering individual time series as elements in a system, the dependence between elements often leads to emergent behaviors in the system [5][6][7][8].Particularly, the dependence or synchronous patterns between the extreme values in time series deserve much more attention as it is often associated with revealing systemic risks [9][10][11].Traditional methods for analyzing the relationship between two time series typically employ linear measures such as Euclidean distance and Pearson correlation coefficient.However, time series in real complex systems often exhibit nonlinear similarity and dependence, and the complex dynamical mechanisms underlying time series are difficult to capture solely based on values of a single observation.Therefore, there is a compelling demand for investigating more comprehensive nonlinear measures for time series in order to unveil the synchronicity between pairwise time series, particularly during occurrences of extreme events.Traditional linear measures exhibit notable constraints in this regard, whereas techniques such as recurrence plots (RPs) and complex networks serve as potent nonlinear approaches for elucidating the intricate dynamical mechanisms inherent in time series.The former enables the extraction of underlying dynamical properties in short or non-stationary time series, while the latter facilitates the mapping of time series onto networks and the representation of original sequence attributes using network statistics.
Eckmann et al first introduced the concept of RPs, highlighting the recurrence of states as a fundamental attribute of every dynamic system [12].Based on this attribute, RP was developed as a two-dimensional visualization tool.Over the past two decades, RP has evolved into a nonlinear method for describing complex dynamics [13].RP represents a graphical representation of a binary symmetric matrix, encoding the times when two states are very close (i.e.neighbors in phase space).The core idea is to reconstruct the attractor from the time series using the delay embedding method [14].If the metric distance between two points on the reconstructed attractor is less than or equal to a threshold, it is considered a state reoccurred.All recurrence of trajectory points can be represented by a two-dimensional graph, with each point indicating whether the corresponding trajectory point has a recurrence.RP provides abundant information about the underlying dynamical properties of the system.By analyzing the recurrence matrix, information about the system's dynamics can be extracted and quantified using techniques like recurrence quantification analysis (RQA) [15].Additionally, RP-based techniques can analyze short and non-stationary data, making them highly applicable to studying real-world data [16].Although there is not a precise definition in academia for the length of so-called 'short' time series, recurrence-based methods have found extensive application in fields facing data acquisition challenges like geology, paleoclimatology, physiology, among others.Research in these areas, employing simulation and empirical evidence, has demonstrated the effectiveness and robustness of recurrence-based methods when analyzing short time series with lengths greater than 100 but not exceeding 500 [12,15,[17][18][19].Therefore, this paper will also employ empirical studies using sample sequences within this range of lengths.
In recent years, significant progress has been made in developing methods for complex system analysis based on RP.However, since RP is typically used to analyze the recurrence patterns of a single time series, further extensions are required to analyze the similar patterns between pairwise time series.Marwan et al proposed the concept of cross-RPs (CRP) [13], which simultaneously embeds two time series into phase space and compares their dynamic behaviors [20].The cross-RP (CRP) displays all times when the state of a dynamic system occurs simultaneously in the second dynamic system, providing a two-dimensional cross-recurrence matrix.In other words, CRP shows all the times when the phase space trajectory of the first system is roughly the same as the trajectory in the phase space of the second system.Similarly, quantification tools based on CRP, such as cross-RQA (CRQA), can measure how and to what extent two time series exhibit similar patterns.This analysis framework was initially developed and widely used in natural sciences, such as heart rate variability, seismology, and chemical fluctuations, among other fields [13,20].In psychology, it has found extensive applications in the field of motor control [21][22][23].Richardson et al provided a broader context for the method in the domains of dynamical systems and psychology [24].It has also been applied to capture dynamic patterns between individuals, for example, uncovering interaction behaviors during goal-oriented tasks [25] or conversations [26].In these applications, the time series of interest can be factorial data describing body sway or eye movement states, as well as numerical data such as heart rate [21,26,27].
Due to the ability of complex networks to capture both local and global properties, they have become instrumental in understanding the complex relationships and information flow among different components in extended systems [16].Complex networks have gained considerable popularity in analyzing complex, particularly spatially extended systems [28,29].Additionally, because the CRP plot provides an adjacency matrix, it can serve as a basis for constructing a complex network.Therefore, by utilizing the cross-RP as an intermediary, it becomes possible to map pairwise nonlinear time series onto a network, enabling the use of network analysis tools instead of traditional methods for time series analysis.Donner et al demonstrated the fundamental relationship between the topological properties of the recurrence network and the statistical properties of attractor densities in the underlying dynamical system [30].This complements the existing RQA by incorporating network descriptions as new quantitative features of the dynamic complexity of time series.Recurrence networks and related statistical measurements have also become important tools for analyzing time series data [31].However, to the best of our knowledge, statistical metrics of cross-recurrence networks (CRNs) have received little attention and have not been widely utilized to assist in the analysis of the original time series.As we are interested in capturing certain types of cross-correlation or dependence between pairwise time series, which are often not accurately captured by traditional linear analysis tools such as Pearson correlation coefficients, we construct CRNs based on the original sequences.We establish a connection between the statistical features of the CRN and the original time series, aiming to reveal the coupling relationships of the original sequences using statistical metrics derived from the established network.
Inspired by Wallot et al [27], we observe the statistical metrics of CRNs established from pairwise time series and the properties of the time series themselves.Through our theoretical derivations, we discover a significant positive linear correlation between the clustering coefficient of the CRN and the synchronicity of the pairwise time series.To validate this regularity, we consider multicomponent dynamical systems and employ the coupled map lattices (CML) technique to simulate multidimensional time series with coupling relationships.By varying the coupling strength between the sequences, we find that there exists a strong linear relationship between the synchronicity of pairwise sequences and the clustering coefficient of their corresponding CRNs.The simulation results confirm our findings.Furthermore, we conduct empirical analyses in financial systems, Earth systems, and human interactive systems.We find strong universality of this regularity in time series from different complex systems, indicating it as a homomorphic law.Due to the relatively stable topological structure of recurrence networks and their applicability to short time series, the regularity uncovered in this study reveals the relationship between statistical metrics of complex networks derived from time series mapping and event synchronicity.Thus, it holds significant practical implications for real-time monitoring of future financial risks, natural crises, or human behavior using CRNs, particularly when faced with limited data.
The remaining sections are organized as follows: section 2 provides an introduction to the fundamental methods of RPs and cross-RPs (CRP), along with their corresponding statistical metrics.Additionally, it discusses the network metrics derived from RP and CRP.In section 3, a theoretical derivation is presented to establish the mathematical relationship between the clustering coefficient of the CRN and the probability of synchronicity.Section 4 validates this relationship by employing the CML technique to simulate coupled multidimensional time series.Section 5 presents empirical evidence using real-world financial time series, rainfall time series, and eye-tracking sequences during human interactions.These empirical analyses aim to demonstrate the homomorphic nature of the discovered regularity across diverse complex systems.Finally, section 6 provides a summary and conclusion of the paper, highlighting the key findings and contributions.

Method
A RP is a graphical representation of the recursive states of a dynamical system in its m-dimensional phase space.For all phase space vectors − → x i , a pairwise measurement based on distance is performed: where Θ (•) is the Heaviside function, ε represents the threshold of closeness, and d is the measure of closeness.Different measures can be used to quantify closeness, such as spatial distance, string metrics, or local rank orders [13,32].In most cases, spatial distance is considered using the Euclidean distance, where In the binary recurrence matrix R, if the distance ∥ − → x i − − → x j ∥ is less than ε, the corresponding R i,j is set to 1.The phase space trajectories can be reconstructed from the time series {u i } N i =1 using time-delay embedding [33]: where m represents the embedding dimension and τ represents the time delay.And {u i } N i =1 represents the observed values of the time series variable of interest, derived from our collected data samples.Optimal values for m and τ can be determined by calculating the average mutual information (AMI) function and the false nearest neighbors (FNN) function, ensuring the coverage of all free parameters and avoiding autocorrelation effects [34].Specifically, FNN can be computed for different embedding dimensions, and the optimal embedding dimension is chosen as the first local minimum or the dimension that corresponds to a smooth curve.Similarly, for time delay, AMI is calculated for different time lags, and the optimal time lag is selected as the first local minimum value of AMI.Let us illustrate the meaning of equation (2) with a simple example.Suppose the series { − → x i } N i =1 represents the actual phase space vector series of a variable.Considering {u i } N i =1 as observations from one dimension, unable to fully reflect the variable's true dynamical characteristics, we employ delaying and embedding for {u i } N i =1 for phase space reconstruction.Assuming the optimal embedding dimension m = 3 and optimal delay τ = 1, we aim to obtain the first-order lagged sequence of , as observations for the second dimension, and subsequently, the first-order lagged sequence of , as observations for the third dimension.After this embedding process, { − → x i } N i =1 becomes a three-dimensional sequence, where each vector − → x i is obtained from observations in three dimensions: (u i , u i +1 , u i +2 ) for i = 3, . . ., N. Due to the lagged operation, the number of points with observations in all three dimensions reduces from N to N − (m − 1).Thus, d represents the distance between points i and j in this three-dimensional space.If the reconstruction of phase space is not performed, i.e. m = 1, d signifies the absolute difference between u i and u j in the sequence.We use a simple schematic diagram in figure 1 to illustrate the above process.However, our study primarily focuses on identifying cross-recurrence patterns among pairwise time series variables in multivariate systems, necessitating equal lengths.Due to variations in the optimal embedding dimensions and delays for each time series, ensuring equal lengths for any pairwise sequences after phase space reconstruction is challenging.Therefore, for simplicity, we do not perform embedding of the original series {u i } N i =1 by defaulting to m = 1.Consequently, the equation ( 2) can be written as − → x i = u i in our case, and the process shown in figure 1 is not involved.This practice aligns with common approaches in studying multivariate systems where multivariate analysis itself offers a better description of a system's evolutionary features compared to univariate analysis [35].
Clearly, in RP, there is an evident diagonal line representing the recurrence of each point with itself.If spatial distance is used as the criterion for recurrence, RP is symmetric.Small-scale features in RP can be observed through diagonal and vertical lines, and the morphology of these special lines reflects the dynamics of the system.Following a heuristic approach, Zbilut and Webber [15] introduced a quantitative description of RP based on these line structures, known as RQA.RQA defines measures such as diagonal line length, recurrence rate (RR), determinism (DET), average length of diagonal structures, and entropy to characterize the diagonal segments in the RP.Table 1 presents these measures and their definitions.
The RP is extended to the CRP, which compares the dynamic behavior of two time series simultaneously embedded in phase space [13].Specifically, for each point in the first trajectory − → x i (i = 1, . . ., N, − → x i ∈ R m ), and each point in the second trajectory ⃗ y (j = 1, . . ., N, − → y j ∈ R m ), the distance measure d ( − → x i , − → y j ) is calculated, resulting in an N × N matrix indicating their closeness: CRP is the two-dimensional plot generated from the binary cross-recurrence matrix CR.In CRP, long diagonal line structures reveal synchronization between the two time series in phase space.Corresponding RQA measures are redefined, where RR, DET, and mean diagonal line length (MEAN_DL) are functions of the distance from the main diagonal [13].This indicates that CRP can explore the dynamical similarity of the two sequences in the presence of time delays.For example, in CRP, the diagonal-wise RR focuses only on the recurrence along the diagonal and is defined as: where k represents the time delay between the second trajectory ⃗ y and the first trajectory − → x i , and k can be positive or negative.In practical research, the optimal time delay, k, which maximizes RR k , indicates the best delay for exhibiting the highest similarity between the two sequences.The graphical properties of RP and CRP reflect the dynamic patterns of the sequences.When transformed into complex networks, the adjacency matrices derived from the recurrence matrix R and the cross-recurrence matrix CR can further reveal the self-similarity and mutual similarity patterns of the sequences.Various metrics commonly used to measure the topological structure of complex networks are listed in table 2.
These metrics provide insights into the properties of recurrence networks, capturing characteristics such as clustering, connectivity, average degree, average path length, and centrality measures.They help quantify the structure and importance of vertices within the recurrence network.Donner et al [30] established a connection between the properties of recurrence networks and the phase space topology of dynamic systems

Measurement Definition Explanation
The number of recurrence points (RP_N) The number of recurrent points in the RP plot Recurrence rate (RR) The proportion of recurrent points in the RP plot It represents the ratio of recurrent points to all points, indicating the probability of specific recurrent patterns.

Determinism (DET)
The percentage of recurrent points forming diagonal lines in the RP plot, given a minimum diagonal line length threshold It reflects the extent to which deterministic behavior dominates the system, as random behavior leads to shorter diagonal lines while deterministic behavior results in longer diagonal lines.Thus, the ratio of recurrent points forming diagonal structures to all recurrent points serves as an indicator of the determinism of a system.

Mean diagonal line length (MEAN_DL)
The  represented by RPs.However, their work primarily focused on demonstrating the fundamental relationship between the statistical properties of the underlying dynamical system and the topological properties of the corresponding recurrence network.In contrast, this paper emphasizes the relationship between the statistical metrics of CRNs and the probability of pairwise time series experiencing synchronized 'events' .The objective is to utilize the network's topological structure to reveal the likelihood of synchronization risks or states occurring between two time series.This approach offers an alternative perspective for understanding and mitigating synchronization risks among multiple entities.

Theoretical derivation
In our study, we observe the common statistical metrics of CRN, as listed in table 2, in relation to the proportion of synchronized states between pairwise time series.We find a deterministic linear relationship between the clustering coefficient and the occurrence of synchronized events (i.e.relatively extreme states).We will present further details and characteristics of the CRN in this section, and then derive the theoretical basis for the pattern we have discovered.To simplify the process of determining the threshold for recurrence when constructing the recurrence network, we transformed the numerical time series into categorical time series based on the relative magnitudes of their observed values.Specifically, assuming equal probability of the occurrence of each state, by increasing percentile levels at fixed increments from 0% to 100%, we can obtain n + 1 percentiles from the pseudo-sample distribution of the numerical time series of length N.
Consequently, this partition yields n ranges, where the sample values falling within these ranges are classified into n states.These n states can be ordered according to the relative sizes of their corresponding ranges.Hence, the final state space can be designated as (a 1 , a 2 , . . ., a n ) (1 ⩽ n ⩽ N), facilitating the identification of so-called extreme states due to its inherent orderability.The choice of n should neither be too small nor too large; if it is too small, the probability of occurrences of a 1 or a n will be significantly higher than that of extreme events.Conversely, if it is too large, each state will occur with lower frequency, resulting in very few edges in the CRN and making it challenging to extract useful information.Considering the application of recurrence-based methods on shorter time series, an empirical range for n could be between [10,20].Consider a two-dimensional time series {X t , Y t ; t ⩾ 0}, where the components have the same state space (a 1 , a 2 , . . ., a n ) (1 ⩽ n ⩽ N).With a set of observation samples of length N, (x 1 , y 1 ) , . . ., (x N , y N ), we can construct a CRN for these two time series variables.Specifically, we designate the N moments as network vertices, denoted as V = (1, 2, . . ., N).The cross-recurrence matrix R = { A ij } N×N is constructed based on whether the respective states at two moments, one from each time series, are the same or not, defined as follows: Clearly, this matrix is not symmetric; generally, A ij ̸ = A ji .As defined above, the determination of states is done from the perspective of X t , where the state of each moment in X t is compared sequentially to that of every moment in Y t .If the matrix is constructed from the perspective of Y t , the comparison is done similarly, and the resulting cross-recurrence matrix is denoted as R * = { B ji } N×N , where: Hence, we have A ij = B ji , establishing the relationship between R and R * as transpose matrices, i.e., R T = R * .This property is demonstrated graphically in figure 2.
When establishing the CRNs, since our primary interest lies in determining whether there is a connection between nodes rather than focusing on the directionality or weight of these connections, we obtain symmetric adjacency matrices, R ′ and R * ′ , based on R and R * .And then the CRNs are constructed by these symmetric adjacency matrices.Consequently, the resulted network is undirected and unweighted.Moreover, this network includes self-loops, denoted as Within the constructed CRN, we observed a particular property: the clustering coefficient of this network is positively correlated with the probability of synchronous states between the two time series, i.e. the synchronization probability.This relationship is expressed as: where C represents the global clustering coefficient of the CRN, and (X = Y) signifies the simultaneous occurrence of identical states in both time series.The proof of this relationship is as follows: Let v be any vertex in the CRN, and C ν be the local clustering coefficient of that vertex.Since the adjacency matrix corresponding to the CRN is symmetric, when analyzing the actual meaning of the network's edges e ij (i, j = 1, 2, . . ., N), it is reasonable to consider either A ij = 1 or A ji = 1.Therefore, the definition formula for the local clustering coefficient can be written as: This equation demonstrates that the local clustering coefficient of CRN is equal to the conditional probability of synchronization between two time series.Particularly, when the states of the two-dimensional time series at different moments are mutually independent, we have: Under this condition, the local clustering coefficient equals the unconditional synchronization probability of the two time series.However, assuming the independence of the states of the two-dimensional time series at different moments is a strong assumption, indicating that this two-dimensional time series is independently and identically distributed (i.i.d.) sequences.In reality, when the auto-correlation and cross-correlation functions of the two time series decay exponentially, i.e. there's only short-term correlation within and between the time series, we can derive an approximate relationship: C ν ≈ P (X = Y).
It can be observed that the local clustering coefficient of any vertex in the network is determined by a same term.Consequently, the global clustering coefficient C can be obtained as the average of all local clustering coefficients: C = P (X = Y) holds if the strong independence assumption is met, and C ≈ P (X = Y) when the correlations are weak or decay rapidly over time.Under the strong independence assumption, if we are more concerned about the synchronization probability of extreme events, that is, the probability of the co-occurrence of the extreme states in the two series, the above equation can be further decomposed as: where P 0 = n−1 ∑ k=1 P (x i = y i = a k ), and P 1 = P (x i = y i = a n ).P 1 represents the probability of extreme event synchronization, while P 0 denotes the probability of the synchronization of other states.Clearly, while maintaining P 0 constant, the variation in P 1 is directly proportional to C. This derivation demonstrates two key points: firstly, the clustering coefficient of a CRN established between two time series is equal to the synchronization probability when the states of the two-dimensional time series at different moments are independent; secondly, the clustering coefficient is linearly correlated to the synchronization probability of extreme event, whether the strong independence assumption is satisfied or not.From a physical perspective, the synchronization of two time series is determined by the coupling strength between their respective attractors.In general, if the coupling strength is increased, the synchronization between the time series is enhanced, leading to a higher clustering coefficient in the CRN of the two time series.Although Chen et al [36] also emphasized that the relationship between coupling strength and synchronization can be complex and nonlinear, and it might depend on the specific characteristics of the coupled systems.The coupling strength remains a crucial variable that plays a significant role in altering the synchronization between time series.In the next section, we will employ simulation experiments to further demonstrate the findings mentioned above.

Simulation experiments
From a physical perspective, the synchronization of two time series can be understood as the effect of coupling between their respective attractors, manifested as the dependence between the states of the time series.In this paper, we consider CMLs, which are simple spatiotemporal chaotic models widely used for modeling complex spatiotemporal dynamics.Generally, their form is given by: where x [κ] t can be understood as the value of the κ th sequence at time t (κ = 1,2, . . ., M, M denotes the dimension of the multivariate time series), and h t represents the interaction of the elements from other time series at time t on the elements of the κ th sequence.The first term on the right side represents the internal chaotic dynamics determined by the nonlinear mapping function f(x), while the second term represents the mutual coupling effect generated by the coupling parameter ζ (0 < ζ < 1).Generally, CMLs have three types of coupling: local, direct-neighbor coupling, where h t ); and intermediate-range coupling, where h ).In this experiment, following the approach of Lacasa et al [37] and Eroglu et al [35], we consider the M-dimensional time series as M points on a ring model.With this approach, the dynamic evolution of each point x [κ] is determined by the internal chaotic evolution and the average coupling effect between neighboring points, resembling a form of direct-neighbor coupling: This coupling mechanism ensures that the simulated sequences only exhibit short-range correlations, as the single simulated value only couple with its neighbors.As ζ varies, the system is trapped in different attractors, leading to different degrees of synchronization and dynamical phases.Although ζ does not directly reflect the synchronization between time series, it acts as a coupling strength parameter that alters the synchronization between time series.Thus, we can observe the relationship between the synchronization probability of each pair of time series and the clustering coefficient of the corresponding CRN under different values of ζ in the system.The classic logistic mapping function f (x) = 4x (1 − x) is adopted as the internal chaotic dynamics mechanism to represent the underlying dynamics of the system and the system is assumed to be in a five-dimensional phase space (M = 5).The coupling parameter ζ is set within the range [0, 0.4], with an increment of △ ζ = 0.005 to investigate the effect of different coupling strengths.Due to the sensitivity of chaotic systems to initial conditions, small variations in the initial values can lead to significantly different system structures.To capture the prominent avalanche effect, we iterate the system 15 000 times and use the last 10 000 simulation values for analysis.Although ample previous research confirms the robustness of recurrence-based methods in the presence of noise [19,31,38], to demonstrate the stability of the identified homomorphic regularity across different signal-to-noise ratios (SNRs), we will add varying levels of noise to the above simulated series.During each experiment, we introduce white noise (following a Gaussian distribution with a standard deviation of 1 and mean of 0) into the five simulated sequences within the CML system, setting SNRs at 2, 5, 10, and 20, representing 50%, 20%, 10%, and 5% noise addition, respectively.Then we factorize each time series into ten states based on the relative magnitudes of their values.We establish CRNs for the updated series and compute the clustering coefficients and synchronization rates.With multiple sample results under the coupling parameter ζ, we perform linear fitting on the sample series of clustering coefficients and synchronization probabilities, yielding parameter estimates (slope and intercept) for first-order linear regression.Through 100 iterations of the aforementioned setup, we derive pseudo-sample distributions of slopes and intercepts for different random initial values.Based on these pseudo-distributions, we establish 95% confidence intervals for the slope and intercept estimates.The criterion for assessing the strong linear correlation between CRN clustering coefficients and synchronization probabilities relies on whether the slope 1 and intercept 0 fall within the respective confidence intervals.It is worth noting that under certain ζ values, the simulation sequences may evolve into a few fixed values, leading to a reduced number of distinct states, possibly fewer than ten.In such cases, when computing the synchronization probability and constructing the CRN for any two time series, we adopt the number of states with the least partition from both time series as the reference.This ensures an equal number of distinct states in each series of a pair.
Figure 3 illustrates the relationship between CRN clustering coefficients and synchronization probabilities at various SNRs, including a noise-free scenario for comparison.Remarkably, even at a SNR of 2, representing a 50% noise addition, the robust linear relationship between CRN clustering coefficients and synchronization probabilities persists.The fitted slope tends closer to 1 when noise is minimal or absent.Meanwhile, figure 4 presents the pseudo-sample distributions of parameter estimates from the fitted linear models at different SNRs.As noise increases, the pseudo-sample distribution of slopes gradually skews right; however, a significant portion of slope estimates remains within proximity to 1. Similarly, the distribution pattern of the intercept did not exhibit systematic changes with alterations in the noise.Table 3   95% confidence intervals for slope and intercept estimates across different scenarios, showcasing that the slopes 1 and intercepts 0 fall within the corresponding intervals irrespective of the noise levels.This observation suggests the insensitivity of the relationship between CRN clustering coefficients and synchronization probabilities to some extent, regardless of the noise present.Therefore, we have valid reasons to believe that the observed regularity remains effective even in empirical series with varying degrees of noise.

Empirical analysis
In the empirical study, we will focus on verifying the homogeneity of the linear relationship revealed by equation (11) in different systems, as event synchronization is often a crucial factor in triggering systemic risks.We will also examine the performance of the equivalence between the clustering coefficient in CRN and the unconditional synchronization probability in actual data.We select price time series from the financial system, rainfall data from the Earth's ecosystem, and eye-tracking sequences from human interactive behavior as empirical objects.For each system, we establish CRNs for all pairwise time series and calculate their global clustering coefficients.Subsequently, we obtain the occurrence ratio of synchronous extreme events for each pair of sequences, aiming to verify whether there exists a significant positive linear correlation between the global clustering coefficient and the ratio of event synchronization.

Financial system 5.1.1. Data
The stock data from the two major trading markets in China, namely, the Shanghai Stock Exchange and the Shenzhen Stock Exchange is considered empirical objects.We select constituent stocks from the Shanghai Composite Index (SH000001) and the Shenzhen Component Index (SZ399001) based on their market capitalization weights arranged in descending order.Our focus is on choosing stocks with complete trading data to ensure data integrity and reliability.Ultimately, we identify 56 stocks from the Shanghai Composite Index and 34 stocks from the Shenzhen Component Index that have complete trading data from 4 January 2015, to 30 October 2020.The data is divided into two parts: the training set (from 2015 to 2019) and the test set (2020).The data length of each time series is denoted as N, resulting in an N × N recurrence matrix when establishing the CRNs or CRPs.It is worth noting that a large N is not suitable for constructing CRNs or CRPs due to the computational complexity.Considering that financial returns often exhibit yearly cyclical changes, we will use the daily return data from 2015 to 2019 to build CRNs for each year separately.And the length of the sample data of each year falls within the range of 200-240.Subsequently, we will calculate the ratio of daily coexceedances of pair stocks within a year to verify the robustness of the relationship between the topological structure of CRNs and the ratio of coexceedances of daily returns across different years.
Exceedances here refer to positive returns exceeding the 95th percentile and negative returns falling below the 5th percentile.And coexceedances specifically refer to exceedances in the pair stocks that occur at the same moment or at the fixed lead-lag moments.Table A1 and A2 present the specific stock codes for the selected stocks in the SH000001 and SZ399001, respectively.

Parameter setting
As described in section 3, establishing CRNs based on numerical data involves the selection of the radius parameter.However, determining a suitable fixed radius is challenging, and there is currently no reliable literature supporting a specific choice.In this context, our primary interest lies in the frequency of occurrences of joint extreme returns between pairs of stocks Extreme values represent absolute values of significantly positive and negative returns.We then discretize returns into ten equidistant states, ensuring a uniform distribution of returns across these states.By converting the numerical variable into a factor variable, recurrence is only considered to have occurred when the states are entirely consistent (radius = 0), thereby avoiding the need for optimal parameter selection.Consequently, CRNs are established based on factor variables.Since we consider the data of different constituent stocks as descriptions of different dimensions within the financial system, we will no longer perform the reconstruction of the phase space in the empirical analysis.Therefore, the embedding dimension is set to 1, and the time-lag value does not affect the results when the embedding dimension equals 1.

Optimal delay between time series
In cross-RQA (CRQA), we can construct CRPs for all possible lags between pairs of sequences and extract corresponding indicators, such as diagonal-wise RR, to explore the maximum ratio of state recurrence between the two sequences at different time lags.This helps us investigate the delayed similarity patterns between pairs of stock return sequences within the same financial system.If there exists a time delay that maximizes the RR between two return sequences, we can reasonably assume that there are statistically significant differences in their reaction speeds to the same financial events in the market.This information can also guide us in establishing rules for identifying coexceedances, which means detecting exceedances in two sequences either with a time delay or simultaneously.
Based on the data from each year in the training set, with a maximum delay of 10 (k max = 10), we construct all possible CRPs for any pair of factorized return series and observe the optimal delay that maximizes the diagonal-wise RR.The results show that the optimal delay for any pair of sequences is 0, indicating that the highest RR occurs when there is no lead-lag relationship in the pairwise daily return series.Taking the daily returns of sh601398 and sh601288 in the Shanghai market and sz000027 and sz000400 in the Shenzhen market in 2015 as examples, figure 5 shows the variation of diagonal-wise RR RR k within the range of {k = ±i} 10 i =1 .The RR k curves for any other pair of sequences exhibit a similar pattern to that in figure 5, where RR k reaches its highest relative value at k = 0.In other words, when there is no delay between pairwise stock return series, there are more occurrences of the same state on the same day.Based on this observation, when determining coexceedances, if exceedances occur in different assets on the same day, we consider it as a case of coexceedance.

Relationship between CRP and CRN indicators and the ratio of coexceedances
In the absence of delay in the pairwise series, we construct the CRP and CRN separately and obtain the commonly analyzed indicators in tables 1 and 2. Taking the results of 2015 as an example, we show the scatter plots of any pair of stocks in the Shanghai market corresponding to the CRP indicators versus the ratio of coexceedances in figure 6, and the CRN indicators versus the ratio of coexceedances in figure 7. The corresponding plots figures A1 and A2 for the Shenzhen market are shown in the appendix.
In figure 6, there is no clear linear or nonlinear relationship observed between the CRP indicators and the proportion of coexceedances.However, in figure 7, it can be observed that there is a positive linear relationship between the average shortest path in the complex network established based on pairwise returns  and the proportion of coexceedances between the returns of the two stocks.Additionally, there is also a positive linear relationship between the network's clustering coefficient and the ratio of coexceedances.
We validate these two relationships using the 5 year training set data separately.As shown in figure 8, the linear relationship between the average shortest path and the ratio of coexceedances in individual stocks' CRN networks in the Shanghai Composite Index is not consistently significant, such as in 2016 and 2017.However, for each year's data, there is a clear positive linear relationship between the clustering coefficient of stocks' CRN and the ratio of coexceedances. Figure 9 illustrates the corresponding relationships in the Shenzhen Composite Index.It can also be observed that the linear relationship between the average shortest path and the coexceedances is weaker than the linear relationship between the clustering coefficient and the coexceedances.To further investigate these linear relationships, we conduct Pearson correlation tests and use the Wilcoxon rank-sum test to verify if the clustering coefficient and the ratio of coexceedances follow the same distribution.The results are presented in table 4. As indicated in the table, for constituent stocks in the Shanghai market, the clustering coefficient of CRN from 2015 to 2019 is significantly and positively correlated with the corresponding ratio of coexceedances.This strong correlation is evident as all the linear  correlation coefficients for each year are greater than 0.7.For constituent stocks in the Shenzhen market, there is also a significant positive linear relationship between the CRN clustering coefficient and the ratio of coexceedances from 2015 to 2019, but the strength of the relationship is weaker than that in the Shanghai market.The correlation coefficients between the average shortest path of CRN and the ratio of coexceedances in 2017 and 2019 are lower than 0.5.
Based on these findings, we conclude that only the clustering coefficient of the CRN for pairwise daily returns shows a robust and strong positive linear relationship with the ratio of coexceedances.The Wilcoxon rank-sum test results suggest that both in the Shanghai and Shenzhen markets, and for all years, the clustering coefficient and the average shortest path of individual stocks' CRN do not belong to the same distribution as the corresponding proportion of coexceedances.
As mentioned earlier, mapping time series onto complex networks allows us to discover not only correlations but also richer information and specific patterns.Besides extreme values in financial systems, the relationships between extreme values in time series of other complex systems have been of interest to researchers.To emphasize the universality of this pattern, in the following section, we will conduct separate analyses on rainfall data from different regions in the climate system and eye-tracking data from human interaction behavior.
Financial returns exhibit several stylized facts, one of which is long memory, characterized by long-range autocorrelation in the absolute values of returns.This characteristic may interfere with the equivalence of C and P (X = Y).Taking the example of data from the Shenzhen market, we segment the return series into 10  states and calculate the synchronization probability of pairwise state sequences.Then, using the clustering coefficient series obtained above, we plot the relationship between the clustering coefficient of CRN and the synchronization probability of states of returns for each year from 2015 to 2019 in figure 10.
From figure 10, it is evident that the fitted curve does not align perfectly with the straight line of slope 1 and intercept 0. However, the deviation is small and it also suggests a strong linear correlation between the CRN clustering coefficient and the synchronization probability.This indicates that the long memory of returns has minimal impact on the conclusion of C ≈ P (X = Y).This could be attributed to the fact that the length of the data is just one year, comprising just over two hundred values, where the long memory of returns is not notably pronounced in such relatively short time series.

Earth's ecosystems
In this study, we obtain the China's ground climate daily data set (V3.0) from the National Tibetan Plateau Data Center (https://data.tpdc.ac.cn/home).The original data provides daily observations from various weather stations.We first interpolate the original data into grid data, covering 500 × 500 grids over China, with each grid size of 0.123 1924 (longitude degrees) × 0.099 4549 (latitude degrees).By regionally averaging, we calculate the daily average precipitation for 34 provinces in China for the period from 2015 to 2019.The length of the sample data of each year is about 365.The unit of precipitation is 0.1 mm, and the total length of each province's precipitation sequence is 1826 d.For analysis, we consider extreme rainfall events as those exceeding the 95th percentile of daily precipitation in each year.The occurrence of extreme rainfall in two provinces on the same day is defined as synchronous rainfall.We calculate the probability of synchronous extreme rainfall events between pairs of provinces.Then, for each pair of provinces, we construct a CRN based on their respective rainfall sequences, following similar procedures and parameter choices as described in section 5.1.2.The clustering coefficient of each recurrence network is calculated.
Figure 11 presents the relationship between the clustering coefficient of the CRNs and the probability of synchronous extreme rainfall events between pairs of provinces from 2015 to 2019.It tells that this relationship is positively linear and exhibits a certain level of robustness.The results from the correlation tests in table 5 also confirm the statistical significance of this linear relationship.Similarly, even though these two variables show a linear correlation, the results from the rank sum test indicate that they do not come from the same distribution.
We are also interested in investigating whether the equivalence between C and P (X = Y) approximately holds in rainfall data samples.Similarly, we partition rainfall sequences into 10 states based on their respective percentiles and compute the synchronization probability between pairwise rainfall sequences.Figure 12 depicts the relationship between the clustering coefficient of CRN and the synchronization rate for each year from 2015 to 2019.It can be seen that the fitted curve of C and P (X = Y) is closely aligned with the straight line of slope 1 and intercept 0, indicating a more obvious equivalence between C and P (X = Y) in  this case.The differences between figures 10 and 12 also indicate that rainfall sequences exhibit stronger independence compared to return sequences.

Human interaction behavior system
Ho et al [39] conducted a study to investigate the intercorrelation between gaze and speech during social interaction games.Specifically, they organized 20 pairs of participants in small groups, where each pair played two social guessing games, and their eye movements were tracked during the interaction.The length of each pair of sample sequences is around 100.Through cross-correlation analysis of the gaze and speech signals between the two participants, they found that speakers often end their turn by directly gazing at the listener, signaling the listener to respond, resulting in a lagged synchronization of the listener's speech state with the speaker's direct gaze behavior.In other words, during social interaction, the direct gaze state of participant A exhibits a certain level of synchrony with the speech state of participant B, even though the latter's speech state consistently lags behind the former's gaze state.We process the data from one of these social games and explore the relationship between the synchronization rate of one participant's gaze state with the other participant's speech state and the clustering coefficient of the corresponding CRN established from their respective behavioral sequences.In this experimental data, both the gaze time series and speech time series are binary sequences, where 1 represents the presence of gazing or speaking, and 0 indicates looking away or the end of speech.Treating both gaze and speech as events, the event synchronization rate between pairs of participants during the interaction activities shows a positive linear relationship with the clustering coefficient of their respective  CRNs, as depicted in figure 13.And the p-value of the corresponding Pearson's correlation test is 0.03 (<0.05), which indicates that the linear relationship is significant at 5% confidence level.
Similarly, figure 14 illustrates the relationship between the CRN clustering coefficient and the state synchronization probability between pairs of participants in this scenario.The fitted curve of C and P (X = Y) shows a slope approaching 1, and while the intercept is not equal to 0, the confidence interval of the fitted curve still contains the line with a slope of 1 and an intercept of 0. Therefore, this also indicates an approximate equivalence between C and P (X = Y).
In conclusion, there exists a significant positive linear relationship between the synchronization rate of events occurring in pairs of objects and the clustering coefficient of the CRNs established based on their respective time series.This pattern is consistently observed across various complex systems, including but not limited to financial systems, Earth's ecological systems, and human interactive behaviors, demonstrating strong universality.Moreover, data from these three different systems all support the approximate equivalence between the clustering coefficient of CRN and the unconditional synchronization probability, as long as the weak independence assumption is satisfied.

Discussion and conclusion
This paper focuses on the similarity patterns or synchronicity of extreme events occurring in pairs of time series.By leveraging cross-recurrence analysis to capture the complex dynamics of time series and the relative stability of network topology, we map the two time series into a complex network based on cross-recurrence, aiming to establish a relationship between the network's statistical metrics and the probability of synchronized events occurring in the two time series.Our analysis reveals a positive linear correlation between the clustering coefficient of the CRN and the event synchronization rate of the two time series.We conduct simulation experiments by CMLs and observe that the synchronization probability and the clustering coefficient of CRNs tend to become approximately equivalent.We also conduct empirical analyses in financial systems, Earth's ecological systems, and human interactive systems, and find that this pattern is universally applicable across different complex systems.However, when dealing with limited data, there are inherent limitations in observing synchronized events between two time series.In contrast, the CRP is particularly suitable for non-stationary and short time series, with relatively stable statistical features of the CRN.This context provides a practical scenario for the discovered regularity.For instance, in financial systems, systemic risk is often represented by index volatility.By identifying stocks that exhibit similar patterns to the index, i.e. with higher event synchronization rates, we may monitor or even predict systemic risk using this kind of constituent stock.When only short historical data is available and observing extremes is rare, it becomes challenging to compute event synchronization rates to identify stocks with high synchronization with the index.However, due to the positive linear relationship between the clustering coefficient of the CRN and event synchronicity, we can achieve this goal by constructing CRNs for short time series.
Similarly, this method can be applied to predict critical events such as extreme rainfall and earthquakes in specific regions.Therefore, the discovered homomorphic regularity in this study bears significant practical implications in revealing systemic risk.In the future, we plan to implement and expand the aforementioned applications based on this regularity.

Figure 1 .
Figure 1.Schematic diagram of delay and embed.
average length of diagonal lines in the RP plot Diagonal structures in the RP plot indicate segments of trajectories that are close to each other at different time points.The length of diagonal line represents the duration of their closeness.Entropy of diagonal line length distribution (ENT_DL) The entropy of the distribution of diagonal line lengths It measures the Shannon entropy of the histogram of diagonal line lengths, reflecting the complexity of deterministic structures in the system.Laminarity (LAM) The percentage of recurrent points forming vertical lines in the RP plot, given a minimum vertical line length threshold It measures the extent of deterministic behavior in the system by evaluating the ratio of recurrent points forming vertical structures to all recurrent points.Trapping time (TT) The average length of vertical lines in the RP plot Vertical line structures indicate segments of trajectories that stay close to specific points in another trajectory.The trapping time represents the duration of their closeness.

Figure 3 .
Figure 3.The relationship between synchronization probabilities and cluster coefficients of cross-recurrence network in one of the simulation experiments (the initial values of five simulated series in CML system are 0.083, 0.4088, 0.5153, 0.3969, 0.2227 in this experiment) in the presence of noise.

Figure 4 .
Figure 4. Distribution of the results obtained from the univariate regression model examining the relationship between the synchronization probability of two sequences and the clustering coefficient of the cross-recurrence network in the presence of noise.The results are based on 100 simulation experiments.

Figure 5 .
Figure 5. Diagonal-recurrence plots of the pairwise daily return series.(a) shows the variation of the diagonal-recurrence values for the daily returns of sh601398 and sh601288 in 2015 with different time delays.(b) presents the variation of the diagonal-recurrence values for the daily return sequences of sz000027 and sz000400 in 2015 with different time delays.

Figure 6 .
Figure 6.Relationship between the CRP indicators and the ratio of coexceedances of the constituent stocks in the Shanghai stock market in 2015.

Figure 7 .
Figure 7. Relationship between the CRN indicators and the ratio of coexceedances of the constituent stocks in the Shanghai stock market in 2015.

Figure 8 .
Figure 8. Relationships between the CRN clustering coefficient, average shortest path, and the ratio of coexceedances in the Shanghai market for different years' data.

Figure 9 .
Figure 9. Relationships between the CRN clustering coefficient, average shortest path, and the ratio of coexceedances in the Shenzhen market for different years' data.

Figure 10 .
Figure 10.The relationship between synchronization probabilities and clustering coefficients of cross-recurrence networks in the Shenzhen market based on different years' data.

Figure 11 .
Figure 11.Relationship between the clustering coefficient of CRNs for paired provincial rainfall data and the probability of synchronous extreme rainfall events in different years.

Figure 12 .
Figure 12.The relationship between synchronization probabilities and clustering coefficients of cross-recurrence networks based on different years' rainfall data.

Figure 13 .
Figure 13.Relationship between the clustering coefficient of the CRNs based on gaze and speech sequences during dyadic interaction and the state synchronization rate.

Figure 14 .
Figure 14.The relationship between synchronization probabilities and clustering coefficients of cross-recurrence networks based on the data of eye-movement experiment.

Figure A1 .
Figure A1.Relationship between the CRP indicators and the ratio of coexceedances of the constituent stocks in the Shenzhen stock market in 2015.

Figure A2 .
Figure A2.Relationship between the CRN indicators and the ratio of coexceedances of the constituent stocks in the Shenzhen stock market in 2015.

Table 1 .
Main qualification analysis metrics of recurrence plots.

Table 2 .
Main metrics of recurrence networks.

Table 3 .
The limits of 95% confidence intervals for slope and intercept estimates across different scenarios.

Table 4 .
Results of correlation tests between the CRN indicators and the proportion of coexceedances.Note: the values corresponding to the Pearson's correlation test are correlation coefficients, and the values corresponding to the Wilcoxon rank sum test are test statistics.* in parentheses indicate the significance level of the corresponding P-value.* , * * , and * * * represent statistical significance levels of 5%, 1%, and 0.1%, respectively.

Table 5 .
Results of correlation tests between the precipitation CRNs clustering coefficients and the proportion of synchronized rainfall events.Note: the values corresponding to the Pearson's correlation test are correlation coefficients, and the values corresponding to the Wilcoxon rank sum test are test statistics.* in parentheses indicate the significance level of the corresponding P-value.

Table A1 .
List of selected stocks in the Shanghai stock market.

Table A2 .
List of selected stocks in the Shenzhen stock market.