
A simplex path integral and a simplex renormalization group for high-order interactions*


Published 30 July 2024. © 2024 The Author(s). Published by IOP Publishing Ltd
Citation: Aohua Cheng et al 2024 Rep. Prog. Phys. 87 087601
DOI: 10.1088/1361-6633/ad5c99

0034-4885/87/8/087601

Abstract

Modern theories of phase transitions and scale invariance are rooted in the path integral formulation and renormalization groups (RGs). Despite the applicability of these approaches in simple systems with only pairwise interactions, they are less effective in complex systems with undecomposable high-order interactions (i.e. interactions among arbitrary sets of units). To precisely characterize the universality of high-order interacting systems, we propose a simplex path integral and a simplex RG (SRG) as the generalizations of classic approaches to arbitrary high-order and heterogeneous interactions. We first formalize the trajectories of units governed by high-order interactions to define path integrals on corresponding simplices based on a high-order propagator. Then, we develop a method to integrate out short-range high-order interactions in the momentum space, accompanied by a coarse graining procedure functioning on the simplex structure generated by high-order interactions. The proposed SRG, equipped with a divide-and-conquer framework, can deal with the absence of ergodicity arising from the sparse distribution of high-order interactions and can renormalize a system with intertwined high-order interactions at the p-order according to its properties at the q-order (p ≤ q). The associated scaling relation and its corollaries provide support to differentiate among scale-invariant, weakly scale-invariant, and scale-dependent systems across different orders. We validate our theory in multi-order scale-invariance verification, topological invariance discovery, organizational structure identification, and information bottleneck analysis. These experiments demonstrate the capability of our theory to identify intrinsic statistical and topological properties of high-order interacting systems during system reduction.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

1.1. Unknowns about high-order interactions

In recent decades, studies of phase transition phenomena, especially non-equilibrium ones, in different interacting systems have made substantial progress [1, 2]. This progress should be credited to the development of path integral [3–5] and renormalization group (RG) [6, 7] theories, which significantly deepen our understanding of system dynamics and provide a precise formulation of scaling and criticality.

However, a theoretical gap can be found in the existing path integral and RG approaches if we subdivide interactions into pairwise and high-order categories. As the name suggests, a pairwise interaction only involves a pair of units and does not require the participation of any other unit. These kinds of interactions can be represented by edges in networks [8, 9] and are implicitly used in the derivations of classic path integrals and RG theories (e.g. see [10]). A high-order interaction, in contrast, is the mutual coupling among more than two units [11–13]. While some high-order interactions can be decomposed into a group of pairwise interactions, most high-order interactions are undecomposable and not equivalent to the direct sum of the pairwise interactions [9, 14]. For instance, the triplet collaboration among three agents is not equivalent to the trivial sum of three pairs of individual collaborations. These undecomposable high-order interactions have intricate effects on system dynamics [13, 14] and cannot be trivially represented by ordinary networks [12]. Although notable efforts have been devoted to studying the optimal characterization of high-order interactions (e.g. using simplicial complexes [15–18] or hypergraphs [19–21]), fewer works focus on the development of specialized path integrals and RG theories for complex systems with high-order interactions [17, 22]. The difficulty in proposing these specialized frameworks arises from the intrinsic dependence on pairwise interactions when researchers constrain the discrete replicas of interacting systems to be conventional networks. Once this constraint is relaxed, the intertwined effects of newly included high-order interactions make the analysis of emerging dynamics highly non-trivial and can even lead to novel results that cannot be directly derived by classic theories (e.g. see instances in diffusion and random walks [16, 20, 23], social dynamics [24–26], and neural dynamics [27, 28]).

1.2. Related works and remaining challenges

To suggest a possible direction for developing path integrals and RG theories for high-order interactions, we summarize the accomplishments and limitations of previous works.

To date, most of the progress in the computational implementation of path integrals and RGs has been achieved in real systems with pairwise interactions [10, 29–33]. Although box-covering methods [34–37] may be the most straightforward methods for coarse graining while preserving the organizational properties of a system represented by a network, they depend on the assumption of internal fractal properties and convey no information about the system dynamics. In contrast to box-covering, the spectral method [38] pays more attention to inherent dynamical properties (e.g. random walks) during coarse graining but may struggle to find macro-units because of small-world effects [10, 33, 39]. With the aim of resolving the coexisting and correlated scales, a geometric RG has been developed to identify the potential geometric scaling rather than path-distance scaling [33]. Nevertheless, this approach essentially relies on a priori knowledge of the hidden metric space of networks (e.g. the hyperbolic space [40]). In pursuit of metric-free coarse graining, whose iterability is not limited by small-world effects, a Laplacian renormalization group (LRG) [10] has been derived from the diffusion on networks, a continuum counterpart of the block-spin transformation [41], to deal with the network topology, dynamics, and geometry encoded by the graph Laplacian operator [42]. This approach has a natural connection with path integrals as well [43]. Although these properties make the LRG a promising foundation for in-depth explorations, this framework suffers from several non-negligible limitations. First, there is no apparent correspondence between the LRG and an appropriate approach for analyzing undecomposable high-order interactions because the Laplacian operator cannot characterize polyadic relations directly [9, 11–14]. Second, the LRG intrinsically depends on the ergodic properties of the system dynamics reflected by network connectivity. This dependence limits its applicability to high-order interactions, which are more sparsely distributed in the system than pairwise interactions and do not necessarily ensure ergodicity.

Fewer pioneering works are devoted to developing RG theories for high-order interactions, among which a notable framework is the real space RG proposed by [17, 22]. This approach is initially defined on the Laplacian operators of the network skeletons of the Apollonian [44] and the pseudo-fractal [45] simplicial complexes to calculate spectral dimensions [22]. Then, it is generalized to the normalized up-Laplacian operator [16, 23] of simplicial complexes to deal with high-order spectra. Despite the effectiveness of this approach, its dependence on non-trivial manual derivations using the Gaussian model [46] and its specialized application scope of spectral dimensions [17, 22] make it less applicable to real scenarios, where an automated, programmatic implementation is needed to renormalize complex systems generated from empirical data.

In summary, while representative works such as the LRG [10] have suggested a promising way to renormalize complex systems, the appropriate path integrals and RG theories for high-order interactions remain clouded by numerous theoretical unknowns and technical difficulties.

1.3. Our frameworks and contributions

To fill these gaps, we generalize classic path integrals and RG theories from dyadic to polyadic relations. To realize such generalizations, we are required to derive the appropriate definitions of the canonical density operators (i.e. the functional over fields) [10, 49, 50] and the coarse graining processes in real and momentum spaces [6, 7] for high-order interactions.

Our work suggests a way to define these concepts on simplices, which can join any subset of units to characterize undecomposable high-order interactions among them (see figure 1(a) for instances and section I in the supplementary materials for definitions). We propose two kinds of high-order propagators as the generalized path integral formulation of the abstract diffusion processes driven by multi-order interactions, which further creates a system mode description equivalent to the momentum space. The coarse graining in the momentum space is implemented by progressively integrating out short-range system modes corresponding to certain orders of interactions. Parallel to this process, we propose a real-space coarse graining procedure to reduce the simplex structures associated with short-range high-order interactions, where we suggest a way to overcome the dependence on ergodicity. The correspondence between the momentum and real spaces enables our framework to implement system coarse graining according to both organizational structure and internal dynamics. Taken together, these definitions generate our simplex path integral and simplex renormalization group (SRG) theories, which are applicable to arbitrary orders of interactions.


Figure 1. System evolution governed by high-order interactions. (a) The simplicial complexes of different orders are presented. An n-order simplex manifests itself as a clique with n + 1 units in the network and represents the n-order interactions among these n + 1 units (i.e. each n-order interaction is formed by the simultaneous participation of n + 1 units). As an instance, the 1-order simplex is a pairwise interaction. (b) An instance of the interacting system with high-order interactions is presented, where there are 200 units whose pairwise interactions follow a Barabási–Albert network (the number of edges between a new unit and existing units is c = 6, see the definitions in section I in the supplementary materials) [47, 48]. The units engaged with the first four orders of interactions are extracted using simplicial complexes. (c) The associated network sketches of the system shown in (b) are presented at each order, where two units share an edge in the network sketch on the n-order only if they are engaged with an n-order interaction in the system. (d)–(e) The corresponding multi-order Laplacian (d) and high-order path Laplacian (e) operators of the system shown in (b) are illustrated at every order.


The SRG can be used to uncover the potential scale invariance of high-order interactions, whose non-trivial fixed point may suggest the existence of critical phenomena. In contrast to conventional RGs, the SRG is designed to renormalize p-order interactions (each p-order interaction requires the simultaneous participation of p + 1 units) in the system according to the structure and dynamics associated with q-order interactions (p ≤ q). Therefore, we can compare the renormalization flow of p-order interactions guided by p-order interactions with that guided by arbitrary higher-order interactions. This feature allows us to investigate the impacts of higher-order interactions on lower-order interactions and study the differences between lower-order and higher-order interactions in terms of scaling properties. In the special case with p = q = 1 (i.e. pairwise interactions), the two kinds of high-order propagators developed in our work both reduce to the classic counterpart used in the LRG [10], and our proposed SRG can reproduce the outputs of the LRG in the ergodic case.

To comprehensively demonstrate the capability of our approaches to identify the latent characteristics of diverse real complex systems, we validate our framework from the perspectives of multi-order scale-invariance detection, topological invariance discovery, latent organizational structure identification, and information bottleneck optimization, respectively. An efficient code implementation of this framework is provided in [51]. One can also see a detailed explanation about the programmatic implementation and usage of the SRG in section XIII in the supplementary materials.

2. Undecomposable high-order interactions

The Laplacian operator is a natural choice for characterizing interactions and their spreading among units [52–54]. In a special case with pairwise interactions, the Laplacian has a classic expression, $L_{ij} = \delta_{ij} \sum_{k \in V} A_{ik} - A_{ij}$, where δ denotes the Kronecker delta function and A is a weighted adjacency matrix that describes the non-negative coupling strengths among all units in unit set V (here, the non-negativity is required by the Laplacian). Under the autonomous conditions, the evolution of the system dynamics given an initial state x0 is defined by $x_\tau = e^{-\tau L} x_0$, where τ > 0 denotes a timescale [55]. However, in more general cases with undecomposable high-order interactions, the classic Laplacian operator is no longer applicable. To fill the gap, previous studies have explored the generalization of the Laplacian operator on uniform hypergraphs [56, 57] and simplicial complexes [9, 23, 58, 59], where diverse variants of the Laplacian (e.g. the combinatorial Laplacian [60], the Hodge Laplacian [23, 61], and the multi-order Laplacian [9]) have been proposed to characterize high-order interactions.
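For the pairwise case, a minimal numerical sketch may help fix ideas. It assumes the standard combinatorial form (degree matrix minus adjacency matrix) and the diffusion solution in which the state at timescale τ is exp(−τL) applied to the initial state. The function names and the toy path graph are hypothetical and serve only as an illustration.

```python
import numpy as np

def classic_laplacian(A):
    """L_ij = delta_ij * sum_k A_ik - A_ij for a non-negative adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def evolve(L, x0, tau):
    """Return exp(-tau * L) @ x0 using the eigendecomposition of a symmetric L."""
    eigenvalues, eigenvectors = np.linalg.eigh(L)
    propagator = (eigenvectors * np.exp(-tau * eigenvalues)) @ eigenvectors.T
    return propagator @ np.asarray(x0, dtype=float)

# Hypothetical toy system: a path of three units, 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
L = classic_laplacian(A)
x0 = np.array([1.0, 0.0, 0.0])   # all mass starts on unit 0
x_tau = evolve(L, x0, tau=1.0)   # mass spreads along the path
```

Because every row of the Laplacian sums to zero, the total mass in the state vector is conserved while it spreads toward the uniform distribution.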

In our work, we first derive a multi-order Laplacian following the spirit of [9]. Then, we develop a new operator, referred to as the high-order path Laplacian, as an alternative description of high-order interactions from a different perspective. As shown in figure 1, these two operators are defined on clique complexes, a commonly used kind of simplicial complex in persistent homology analysis [60, 62] (see section I in the supplementary materials for clique complexes). The derivations of these two operators are elaborated in section II in the supplementary materials, where we summarize the similarities and differences between them. Although these two operators have different physical meanings and derivation processes, they can be reformulated into similar mathematical forms, as presented in sections 2.1 and 2.2. Thus, the theoretical differences between these two operators will not impede their unified numerical calculations and programmatic implementations.

2.1. High-order interactions through faces

Let us consider the n-order interactions that manifest as the simultaneous participation of n + 1 related units. These interactions are undecomposable because the couplings among the n + 1 units are required to occur simultaneously and none of them is dispensable. For instance, a triplet collaboration requires the participation of all three agents together. As discussed in [9], these interactions can be represented by the n-faces of the n-simplex on which the n + 1 units are placed. For the sake of clarity, we consider an arbitrary set of n + 1 units in which unit i is included, together with the set of all permutations (i.e. rearrangements) of that set. The multi-order Laplacian associated with n-simplices is defined as

In equation (1), the degree-like term counts the number of n-simplices that contain unit i

Meanwhile, the adjacency-like term counts the number of n-simplices that contain both units i and j

where the set in question is an arbitrary set of n + 1 units in which units i and j are both included. The general indicator term used in equations (2) and (3) is given as a product of a series of matrices for any permutation

where the matrix in the product is the classic adjacency matrix of units. This term serves as an indicator function, which equals 1 if the units in the given set form an n-simplex and equals 0 otherwise.

See section II in the supplementary materials for the full derivations of the multi-order Laplacian. In equation (2), the vector of per-unit simplex counts is similar to the degree vector, while the matrix of shared simplex counts can be understood as the adjacency matrix. The coefficient n in equation (2) arises from the fact that each row of this matrix sums to n times the corresponding per-unit simplex count (i.e. each simplex is repeatedly counted n times). In a special case where n = 1, equation (2) is equivalent to the classic Laplacian for pairwise interactions. In the more general cases where n > 1, high-order interactions can be characterized by equation (2). See figure 1(d) for illustrations of the multi-order Laplacian. Although the definitions of equations (2)–(4) are different from the approach proposed by [9], they are mathematically consistent with [9].
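As a concrete illustration of the counting described above, the following sketch builds a multi-order Laplacian on the clique complex of a small unweighted network. It assumes the form $L_{ij} = n\,\delta_{ij} K_i - C_{ij}$, where $K_i$ (number of n-simplices containing unit i) and $C_{ij}$ (number of n-simplices containing both i and j) are names introduced here for illustration. It is a brute-force sketch for tiny systems, not the implementation provided in [51].

```python
from itertools import combinations

def n_simplices(adjacency, n):
    """List all n-simplices of the clique complex, i.e. cliques of n + 1 units."""
    units = range(len(adjacency))
    return [s for s in combinations(units, n + 1)
            if all(adjacency[u][v] for u, v in combinations(s, 2))]

def multi_order_laplacian(adjacency, n):
    """Return L with L[i][j] = n * delta_ij * K[i] - C[i][j] (assumed notation)."""
    size = len(adjacency)
    K = [0] * size                          # n-simplices containing unit i
    C = [[0] * size for _ in range(size)]   # n-simplices containing units i and j
    for simplex in n_simplices(adjacency, n):
        for i in simplex:
            K[i] += 1
        for i, j in combinations(simplex, 2):
            C[i][j] += 1
            C[j][i] += 1
    return [[(n * K[i] if i == j else 0) - C[i][j] for j in range(size)]
            for i in range(size)]

# Hypothetical toy system: a triangle 0-1-2 plus a pendant unit 3 attached to unit 2.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
L1 = multi_order_laplacian(A, 1)   # coincides with the classic graph Laplacian
L2 = multi_order_laplacian(A, 2)   # only the triangle contributes
```

In this toy system the pendant unit 3 takes part in no 2-order interaction, so its row and column in `L2` vanish, which illustrates how sparsely high-order interactions can be distributed.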

2.2. High-order interactions along paths

Then, we turn to another type of n-order interactions, which manifest as the sequential actions of n + 1 related units after all of them have joined the interaction (i.e. sequential participation). The undecomposability of these interactions arises from the fact that the mutual coupling effects emerge only after the n + 1 units have progressively engaged in the interplay. For example, a pipeline-like collaboration may require all agents to communicate with each other and behave in order. Each previous action is globally known to all agents and affects all subsequent actions. This property can be characterized by the paths (i.e. sequences of 1-simplices) that successively pass through the n + 1 units placed on the n-simplex. Consequently, to model the n-order interactions arising from sequential participation, we need to analyze both the faces of the simplices (i.e. to ensure that all related units are indispensable) and the paths defined on these faces (i.e. to describe the sequential participation).

To describe the interactions propagating along the paths in n-simplices, we develop an operator named the high-order path Laplacian

In equation (5), the leading coefficient denotes the normalization of the propagation speed of interactions along paths. The degree-like term counts the number of paths that traverse all n + 1 units on the n-simplex without repetition and terminate at unit i

where the factor n! counts the possibilities for the remaining n units of the simplex to form unique sequences after we fix unit vi as the endpoint of these sequences. The adjacency-like term counts the number of paths that traverse all n + 1 units on the n-simplex without repetition, initiate at unit j, and terminate at unit i

where the coefficient (n − 1)! is derived from the fact that the remaining n − 1 units of the simplex can form (n − 1)! unique sequences if we fix units vi and vj as the endpoints.

The construction process of the high-order path Laplacian is elaborated in section II in the supplementary materials. See figure 1(e) for several instances of the high-order path Laplacian. In a special case with n = 1, equations (6) and (7) reduce to the classic degree vector and adjacency matrix, respectively. Therefore, equation (5) denotes the classic Laplacian when interactions are pairwise. In more general cases with n > 1, the n-order interactions formed by sequential participation are not equivalent to those caused by simultaneous participation (see section II in the supplementary materials for details).
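The path counting described above can be checked by brute force on a small system. The sketch below enumerates every ordering of the units of each n-simplex as a path. It assumes the operator has the form of a degree-like term minus an adjacency-like term, multiplied by a normalization coefficient. Since the value of that coefficient is not reproduced here, it is exposed as a hypothetical `speed` parameter with a placeholder default of 1.

```python
from itertools import combinations, permutations

def path_counts(adjacency, n):
    """Count paths that visit all n + 1 units of an n-simplex exactly once."""
    size = len(adjacency)
    end_at = [0] * size                          # paths terminating at unit i
    from_to = [[0] * size for _ in range(size)]  # from_to[i][j]: start at j, end at i
    for simplex in combinations(range(size), n + 1):
        if not all(adjacency[u][v] for u, v in combinations(simplex, 2)):
            continue                             # not a clique, hence not an n-simplex
        for path in permutations(simplex):       # each ordering is one path
            end_at[path[-1]] += 1
            from_to[path[-1]][path[0]] += 1
    return end_at, from_to

def path_laplacian(adjacency, n, speed=1.0):
    """Assumed form: speed * (delta_ij * end_at[i] - from_to[i][j])."""
    end_at, from_to = path_counts(adjacency, n)
    size = len(adjacency)
    return [[speed * ((end_at[i] if i == j else 0) - from_to[i][j])
             for j in range(size)] for i in range(size)]

# Hypothetical toy system: a triangle 0-1-2 plus a pendant unit 3 attached to unit 2.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
end_at_2, from_to_2 = path_counts(A, 2)
P1 = path_laplacian(A, 1)   # reduces to the classic graph Laplacian
P2 = path_laplacian(A, 2)
```

For the single triangle, each unit is the endpoint of 2! = 2 paths and each ordered pair of distinct units is joined by (2 − 1)! = 1 path, in line with the factorial counts in the text.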

In general, one can choose between the multi-order Laplacian and the high-order path Laplacian according to application demands. As we have suggested above, while the multi-order Laplacian conveys information about the high-order interactions that units are simultaneously engaged with, the high-order path Laplacian is more applicable to the cases where a high-order interaction is formed by an irreducible sequence of the participation of related units. In section II in the supplementary materials, we explain how these two operators differ from each other in governing the dynamics of system evolution (e.g. with different decay rates at different orders). Specifically, the decay rate of the system evolution under simultaneous participation (i.e. governed by the multi-order Laplacian) is the same as that under sequential participation (i.e. governed by the high-order path Laplacian) when interactions are pairwise (i.e. n = 1). When units exhibit two-order interactions, sequential participation implies slower decay rates of the system evolution compared with simultaneous participation. When interactions occur at higher orders (i.e. n > 2), sequential participation leads to a larger decay rate. These differences highlight the dissimilar physical meanings and characteristics of the two operators.

Based on the unified mathematical forms presented in sections 2.1 and 2.2, we have developed efficient code implementations of these two kinds of operators, which can be seen in [51]. Meanwhile, in section III in the supplementary materials, we explain the similarities and differences between our considered Laplacian operators and other Laplacian operators used in topology theories (e.g. the combinatorial Laplacian [60] and the Hodge Laplacian [23, 61] in persistent homology analysis [62–67]), which helps us understand the theoretical significance of designing an RG on these two operators. For convenience, we refer to both of them uniformly as the n-order Laplacian in our subsequent derivations. One can specify its actual definition in the application.

3. Simplex path integral

Given the high-order interactions described by the n-order Laplacian, we turn to explore the path integral formulation of the trajectories of units governed by high-order interactions. As first suggested by [42] and then discussed in [10, 43, 68–70], the Laplacian can be understood as a Hamiltonian if we consider the system dynamics governed by it. For high-order interactions, we can consider the eigendecomposition of the n-order Laplacian, written in bra-ket notation, and define a time evolution operator

As suggested by equation (9), this operator characterizes the probability for each unit to evolve from one state to another within a given timescale τ. The characterization is realized by considering the distributions of units generated by all the possible ways of evolution during a time period of τ. See figure 2(a) for illustrations. As indicated by equation (10), the probability amplitude is intrinsically determined by the exponential decay rates of diffusion modes in the momentum space. Short-range interactions associated with fast decay (i.e. large eigenvalues) in the momentum space have small impacts on units.
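The relation between the eigendecomposition and the series expansion of the time evolution operator (compare the first four terms shown in figure 2(a)) can be sketched numerically. The code assumes the operator is the matrix exponential of minus τ times a symmetric Laplacian. The function names and the toy Laplacian are hypothetical.

```python
import numpy as np

def time_evolution_operator(L, tau):
    """exp(-tau * L) assembled from eigenmodes: sum_k exp(-tau * lam_k) |k><k|."""
    lam, vec = np.linalg.eigh(L)
    return (vec * np.exp(-tau * lam)) @ vec.T

def series_terms(L, tau, order):
    """First `order` terms (-tau * L)^k / k! of the series expansion, k = 0, 1, ..."""
    terms = [np.eye(len(L))]
    for k in range(1, order):
        terms.append(terms[-1] @ (-tau * L) / k)
    return terms

# Hypothetical toy Laplacian: a triangle 0-1-2 plus a pendant unit 3 attached to unit 2.
L = np.array([[ 2, -1, -1,  0],
              [-1,  2, -1,  0],
              [-1, -1,  3, -1],
              [ 0,  0, -1,  1]], dtype=float)
U = time_evolution_operator(L, tau=1.0)
first_four = sum(series_terms(L, 1.0, 4))    # truncated expansion
converged = sum(series_terms(L, 1.0, 40))    # agrees with U to numerical precision
```

Modes with large eigenvalues are suppressed by the exponential factor, which is the numerical counterpart of the statement that fast-decaying short-range interactions have small impacts on units.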


Figure 2. Illustrations of the simplex path integral. (a) An interacting system of 50 units is generated, where the pairwise interactions of units follow a Barabási–Albert network (c = 4). This system is used to show the derivation of the time evolution operators of the first three orders, where we use the multi-order Laplacian (MOL) for calculation and set τ = 1. For each order of interactions, we illustrate the first four terms of the series expansion of the time evolution operator in equation (9). (b) For the system defined in (a), we show the density operators calculated based on the MOL or the high-order path Laplacian (HOPL) of each order. (c) We show that values in the density operator can be related to the evolution paths of system states. The blue path leads from one state to another, while the red path starts from and ends in the same state.


At equilibrium, the Gibbs state of the system associated with n-order interactions is

where Tr(·) denotes the trace and τ serves as the inverse of a finite temperature. See figure 2(b) for instances. The Boltzmann constant is assumed to be unity for simplicity. The partition function (i.e. the trace in the denominator) is proportional to the average return probability of random walks within a timescale τ [42]. See figures 2(b) and (c) for illustrations.
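A minimal sketch of the Gibbs state follows, assuming it is the time evolution operator divided by its trace and that the Laplacian is symmetric. The function name and the toy Laplacian are hypothetical.

```python
import numpy as np

def density_operator(L, tau):
    """Return (rho, Z) with rho = exp(-tau * L) / Z and Z = Tr exp(-tau * L)."""
    lam, vec = np.linalg.eigh(L)
    weights = np.exp(-tau * lam)          # Boltzmann-like weight of each mode
    Z = weights.sum()                     # partition function
    rho = (vec * weights) @ vec.T / Z
    return rho, Z

# Hypothetical toy Laplacian: a path of three units, 0 - 1 - 2.
L = np.array([[ 1, -1,  0],
              [-1,  2, -1],
              [ 0, -1,  1]], dtype=float)
rho, Z = density_operator(L, tau=1.0)
average_return = Z / len(L)   # mean diagonal element of exp(-tau * L)
```

The quantity `average_return` is the mean of the diagonal of the unnormalized operator, which is how the partition function connects to the average return probability of random walks.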

Equation (11) defines a density operator for the system evolution characterized by high-order interactions

which naturally relates to the path integral formulation. Let us consider a minimum time step ε such that Nε = τ. At the limit N → ∞ (or ε → 0), the numerator of the density operator (i.e. the time evolution operator) implies a power form of the corresponding high-order propagator. On the one hand, this power form expression directly leads to a path integral in conventional quantum field theory

where we mark . On the other hand, this power form expression can also be reformulated using the imaginary time and equations (8) and (9)

where ħ denotes the reduced Planck constant [3, 4]. Equation (14) is closely related to the imaginary time evolution governed by the n-order Laplacian in the infinite-dimensional path space. Specifically, the probability amplitude for the system to evolve from one state to another during a period of τ conveyed by the propagator in equation (13) can be measured by summing over all possible evolution paths connecting these two states (i.e. the discrete counterpart of path integrals) in equation (14). This is why the n-order network topology conveyed by the n-order Laplacian intrinsically shapes path integrals. See figure 2(c) for illustrations.
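The discrete counterpart of the sum over paths can be made explicit on a tiny system. The sketch below splits the timescale τ into a number of equal steps, multiplies the short-time propagators along every possible sequence of intermediate states, and sums the resulting weights. It assumes the propagator is the matrix exponential of a symmetric Laplacian. The function names and the toy system are hypothetical, and the brute-force enumeration is feasible only for very small systems.

```python
import itertools
import numpy as np

def propagator(L, t):
    """exp(-t * L) for a symmetric Laplacian L, via its eigendecomposition."""
    lam, vec = np.linalg.eigh(L)
    return (vec * np.exp(-t * lam)) @ vec.T

def amplitude_by_path_sum(L, tau, steps, start, end):
    """Sum the weights of every discrete path start -> ... -> end with `steps` hops."""
    K = propagator(L, tau / steps)        # short-time propagator for one step
    total = 0.0
    for inner in itertools.product(range(len(L)), repeat=steps - 1):
        path = (start, *inner, end)
        weight = 1.0
        for a, b in zip(path[:-1], path[1:]):
            weight *= K[b, a]             # amplitude of the hop a -> b
        total += weight
    return total

# Hypothetical toy system: a path of three units, 0 - 1 - 2.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
L = np.diag(A.sum(axis=1)) - A
direct = propagator(L, 1.0)[2, 0]
summed = amplitude_by_path_sum(L, tau=1.0, steps=4, start=0, end=2)
```

Here `summed` agrees with `direct` because the power of the short-time propagator expands into exactly this sum over intermediate states.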

Moreover, the denominator of the density operator (i.e. the partition function) can be interpreted by considering the expectation

where Γ is an arbitrary closed curve that starts from and ends at x0. Equation (16) suggests why the partition function is proportional to the average return probability of random walks within a timescale τ from the perspective of the path integral formulation. See figure 2(c) for illustrations.
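One way to see this proportionality at the level of matrices is the short chain of identities below. The shorthand is assumed here for illustration: $L$ stands for the n-order Laplacian, $N$ for the number of units, $Z(\tau)$ for the partition function, and $P_\tau(i \to i)$ for the probability that a walk started at unit $i$ is found at unit $i$ after a timescale $\tau$, read off from the diagonal of $e^{-\tau L}$.

```latex
Z(\tau) = \operatorname{Tr}\, e^{-\tau L}
        = \sum_{i=1}^{N} \langle i \,|\, e^{-\tau L} \,|\, i \rangle
        = N \cdot \frac{1}{N} \sum_{i=1}^{N} P_\tau(i \to i)
```

The last factor is the return probability averaged over all starting units, so the partition function equals this average multiplied by the number of units.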

The path integrals in equations (13)–(16) are fully characterized by the n-order Laplacian, which describes latent simplex structures of the system. They are integrals over all the possible evolution paths of a system governed by high-order interactions. Consequently, they are referred to as simplex path integrals in our work. Although our framework is presented using the terminologies of quantum field theory [3, 4] for convenience, it is generally applicable to diverse classical systems (e.g. the brain in neuroscience). In fact, we can naturally find the Maxwell–Boltzmann statistics, a common description of classical systems, in equations (11) and (12).

4. SRG

The connection between equations (8)–(16) and the momentum space naturally inspires us to consider the generalization of RG theories. The degrees of freedom corresponding to short-range interactions (e.g. fast diffusion within the clusters of strongly correlated units) can be safely coarse grained without significant information loss, which is consistent with our interpretation of equation (10). Here, we develop a framework to generalize the RG to high-order interactions, where we also propose possible ways to overcome the limitations of the LRG [10].

4.1. Renormalization procedure

To offer a clear picture, we first introduce our approach, referred to as the SRG, in an ergodic case, as assumed by [10]. The SRG renormalizes a system at the p-order according to the properties of q-order interactions (p ≤ q). Specifically, the renormalization proceeds as follows:

  • (1)  
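    In the kth iteration, we have two iterative Laplacian operators, one for p-order interactions and one for q-order interactions, and their associated high-order network sketches, which are defined on the same set of units (there is an edge connecting two units in the p-order or q-order sketch only if they share a p-order or q-order interaction, respectively).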
  • (2)  
    In the real space, we first search for targets to coarse grain. Specifically, we search for every cluster of units sharing short-range high-order interactions within the q-order timescale, such that we can coarse grain these units into a macro-unit. Hence, we generate a reference network as follows. We first initialize the reference network as a null network (i.e. with no edge). We follow equation (11) to calculate the density operator of the q-order Laplacian associated with this timescale. In this density operator, the main diagonal elements, related to self-interactions of units, are frequently greater than most of the off-diagonal elements (e.g. see figure 2(b)). If the (i, j)th element is greater than the (i, i)th or the (j, j)th element, then the q-order interactions between units vi and vj are sufficiently quick such that their accumulations within the timescale are faster than those of self-interactions. In this case, we add an edge between units vi and vj in the reference network. By progressively implementing the edge-adding procedure for every pair of units, we can form clusters in the reference network. Each cluster contains a set of strongly correlated units with short-range high-order interactions. Basically, these clusters can be understood as Kadanoff blocks (i.e. the quasi-independent blocks, whose mean size serves as the correlation length of the system) in the real space renormalization procedure [71–73].
  • (3)  
    After we find the targets for coarse graining in the real space, we implement a renormalization procedure. Specifically, in each of the p-order and q-order network sketches, every set of units that belongs to the same cluster in the reference network is aggregated into a macro-unit, which generates a new network sketch of the same order. In a new sketch, two macro-units, vi and vj, are connected if at least one unit aggregated into vi is connected with one unit aggregated into vj in the original sketch. If a unit or an interaction does not participate in any coarse graining according to the reference network, it is adopted into the new sketch without modification. Because both sketches follow the same aggregation rules, they contain the same number of macro-units, which defines the number of units in the next iteration.
  • (4)  
    In the momentum space, we first search for modes to reduce. Specifically, we find the eigenvalues of the q-order Laplacian that are smaller than the cutoff determined by the timescale for q-order simplicial complexes (see section 4.2 for the selection of this timescale). These eigenvalues correspond to long-range high-order interactions because the contributions of their eigenvectors have slow decay rates in equation (10). We count the number of these eigenvalues. Other eigenvalues correspond to short-range high-order interactions and can be coarse grained. Moreover, according to equation (11), the totality of long-range high-order interactions (i.e. corresponding to smaller eigenvalues) is generally greater than that of short-range ones (i.e. associated with larger eigenvalues). Therefore, integrating out short-range high-order interactions does not damage the effective information contained in the system significantly. The only problem is that the eigenvalue spectrum cutoff defined above may overestimate the number of short-range high-order interactions and make the number of retained modes too small. This is because the eigenvalue spectrum of the q-order Laplacian does not reflect the possibility that the short-range q-order interactions between units vi and vj can be slower than their self-interactions. For instance, although the interactions between units vi and vj are classified as short-range, the (i, j)th element of the density operator can be smaller than both the (i, i)th and the (j, j)th elements. In this case, the timescale may be too short for the correlations between vi and vj to accumulate to a non-negligible level. Compared with self-interactions, the interactions between units vi and vj are less important in determining the behaviors of vi and vj since these interactions are not strong enough. To avoid the possible overestimation, we use the number of Kadanoff blocks in the reference network as a bound. In the case with overestimation, the number of long-range interactions is underestimated (i.e. it falls below the number of Kadanoff blocks) and we correct it by setting it equal to the number of Kadanoff blocks. The resulting value defines the number of modes to keep during renormalization.
  • (5)  
    After we find the modes to reduce in the moment space, we exclude these modes with fast decay rates from the system such that modes remain. Specifically, the current q-order Laplacian, , is reduced to the rescaled contributions of the eigenvectors associated with long-range high-order interactions
    where is the set of the smallest eigenvalues of (i.e. the first eigenvalues after we sort the eigenvalue spectrum in increasing order), and is the indicator function defined on an arbitrary set A. Meanwhile, the current p-order Laplacian, , is reduced to the contributions of the smallest eigenvalues (i.e. long-range p-order interactions)
    where is the set of the smallest eigenvalues of . Note that the eigenvalues in are not necessarily smaller than .
  • (6)  
    The correspondence between momentum space and real space for q-order interactions is ensured by a Laplacian . For vi and vj , two macro-units in , we define
    if vi and vj are connected in . We set if vi and vj are disconnected in . Here, is a -dimensional ket, where unitary components correspond to all the units in that are aggregated into vi and zero components correspond to all other units in that are not covered by νi (i.e. the value of is the sum of all if vi and vj are connected, where x goes through all the units in that are aggregated into vi and y goes through all the units in that are aggregated into vj ). After defining all off-diagonal elements, the diagonal elements of are defined by following
    Mathematically, this procedure actually defines a similarity transformation T between and
    where 0 is a -dimensional all-zero square matrix. It is clear that the similarity transformation is given as , where each ket ηj is selected such that the columns of T define a group of orthonormal bases. Similarly, we can derive the Laplacian for p-order interactions based on . The derived and are used in the th iteration.
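The real-space aggregation described in step (6) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (see section XIII in the supplementary materials for that). The function name `coarse_grain_laplacian`, the list-of-lists matrix layout, and the choice of fixing each diagonal element so that every row sums to zero are assumptions made here for illustration only.

```python
def coarse_grain_laplacian(L, blocks):
    """Aggregate a Laplacian over macro-units (hypothetical helper).

    L      : square matrix as a list of lists (the current Laplacian).
    blocks : list of lists; blocks[i] holds the indices of the units that
             are aggregated into macro-unit i (each unit in exactly one block).
    """
    n = len(blocks)
    coarse = [[0.0] * n for _ in range(n)]
    # Off-diagonal entry (i, j): the sum of L[x][y] over all x in block i and
    # all y in block j, i.e. <i|L|j> with indicator kets. For an ordinary
    # graph Laplacian, disconnected blocks automatically give a zero sum.
    for i in range(n):
        for j in range(n):
            if i != j:
                coarse[i][j] = float(
                    sum(L[x][y] for x in blocks[i] for y in blocks[j])
                )
    # ASSUMPTION: each diagonal entry is set so that its row sums to zero.
    for i in range(n):
        coarse[i][i] = -sum(coarse[i][j] for j in range(n) if j != i)
    return coarse


# Path graph 0-1-2-3, coarse grained into the blocks {0, 1} and {2, 3}.
path_laplacian = [
    [1, -1, 0, 0],
    [-1, 2, -1, 0],
    [0, -1, 2, -1],
    [0, 0, -1, 1],
]
reduced = coarse_grain_laplacian(path_laplacian, [[0, 1], [2, 3]])
```

In this toy case the two macro-units are joined by the single edge between units 1 and 2, so the reduced matrix is the Laplacian of a two-unit path.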

The renormalization of p-order interactions in both the real space (see steps (2) and (3)) and the momentum space (see steps (4) and (5)) is realized under the guidance of q-order interactions. After renormalizing by repeating steps (1)–(6), the SRG progressively drives the system toward an intrinsic scale of p-order interactions that exceeds the microscopic scales. More specifically, the SRG creates a q-order-guided iterative scale transformation that, irrespective of the local scale from which it starts, always leads to the latent critical point associated with p-order interactions (if one exists). It provides opportunities to verify whether the concerned thermodynamic functions are scale-invariant or not. See figure 3(a) for a summary of key steps in our approach. Note that the use of a reference network in step (2) is not necessary for theoretical derivations, yet it is favorable for computer programming.


Figure 3. Illustrations of the SRG. (a) The conceptual illustration of the SRG is presented. For legibility, the simplicial complexes of different orders are not elaborated and are abstractly represented by dashed lines with different colors. Given a system where multi-order interactions coexist, the SRG can be applied to renormalize the system at the p-order based on the structure and dynamics associated with q-order interactions (). (b) Conceptual illustrations of our divide-and-conquer procedure for dealing with the dependence of renormalization flows on the ergodicity. (c) A system with 500 units is defined, where pairwise interactions follow a Barabási–Albert network (c = 2). We use this system to illustrate the selection of the timescale based on the specific heat (q = 2).


In a non-ergodic case, the presented procedure cannot be directly adopted. This is because our proposed renormalization approach is rooted in the relations among the time evolution operators defined by equations (8)–(10), the exponential decay of diffusion modes in the momentum space, and the evolution paths in the path space, which implicitly depend on the assumption of the ergodicity of system states. In the non-ergodic case where a set of states can never evolve to or be transformed from another set of states (i.e. the network connectivity is absent), path integrals between these state sets are ill-posed and the diffusion over the network can be subdivided into several mutually unrelated sub-processes (i.e. the diffusion processes defined on two disconnected clusters are independent of each other). Dealing with these properties is crucial for our work since the SRG is oriented toward analyzing high-order interactions, which are usually sparsely distributed and may lack ergodicity.

Here, we develop a divide-and-conquer approach. Our idea arises from the fact that we can treat the diffusion from one cluster (i.e. a connected component) to another cluster on any order as being decelerated to a condition with zero velocity (i.e. being infinitely slow). This infinitely long-term diffusion process contains no short-range high-order interactions and, therefore, should not be integrated out during renormalization. Consequently, the SRG does not need to deal with the diffusion across any pair of clusters and can focus on the diffusion within each cluster separately. In the kth iteration, we consider the following two cases:

  • (A)  
    If is connected (i.e. the ergodicity holds), we directly apply steps (1)–(6) on to guide the renormalization of p-order interactions and derive the inputs of the -iteration.
  • (B)  
    If is disconnected (i.e. the ergodicity does not hold) and has r clusters denoted by , we apply steps (1)–(6) to each cluster of separately to guide the renormalization of p-order interactions formed among the units that belong to this cluster. Specifically, we treat each cluster Ci as a network and input it into steps (1)–(6) with . Note that the above procedure does not affect if the selected cluster Ci is trivial (i.e. contains only one unit). After dealing with all r clusters, the obtained results are used for the -iteration.

In figure 3(b), we conceptually illustrate our approach, which enables us to relax the constraint on system ergodicity while implementing the SRG.
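The divide-and-conquer logic of cases (A) and (B) can be sketched as follows. This is only an illustration under stated assumptions: the network is given as a hypothetical adjacency dictionary, `renormalize_cluster` is a placeholder standing in for steps (1)–(6), and neither function name comes from the original implementation.

```python
from collections import deque


def connected_clusters(adjacency):
    """Return the connected components of an undirected graph.

    adjacency : dict mapping each unit to the set of its neighbours.
    """
    seen, clusters = set(), []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue, cluster = deque([start]), []
        while queue:  # breadth-first search within one cluster
            unit = queue.popleft()
            cluster.append(unit)
            for neighbour in adjacency[unit]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        clusters.append(sorted(cluster))
    return clusters


def renormalize_per_cluster(adjacency, renormalize_cluster):
    """Apply `renormalize_cluster` to every non-trivial cluster separately.

    `renormalize_cluster` is a placeholder for steps (1)-(6); it receives
    the sorted list of units of one cluster. Trivial clusters (one unit)
    are left untouched. A connected network yields a single cluster, so
    case (A) is covered by the same code path as case (B).
    """
    results = {}
    for cluster in connected_clusters(adjacency):
        if len(cluster) > 1:
            results[tuple(cluster)] = renormalize_cluster(cluster)
    return results


# Two clusters {0, 1, 2} and {3, 4}, plus an isolated unit 5.
graph = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}, 5: set()}
clusters = connected_clusters(graph)
sizes = renormalize_per_cluster(graph, len)  # `len` stands in for the SRG steps
```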

In section III in the supplementary materials, we explain how the SRG is related to the persistent homology theory [62, 74–76] and suggest its potential applicability to analyzing the lifetime of topological properties. In section IV in the supplementary materials, we discuss the SRG in the form of a conventional RG, suggesting the benefits of including high-order interactions into RG analysis, and indicate the relations between the SRG and other frameworks [31, 77–79]. Specifically, we find that the ideas underlying the SRG can be clearly interpreted by relating the SRG with a PCA-like RG theory [31].

4.2. Critical timescale

After developing the renormalization procedure, we attempt to present a detailed analysis of the timescale and its associated scaling relation.

In the ergodic case, the setting of in the SRG (as we have mentioned in step (1)) is completed in the first iteration and is required to maximize the specific heat, an indicator of the transitions of diffusion processes (see similar ideas in [10]). In section IV in the supplementary materials, we explain why it is valid to define renormalization flows based on the specific heat. Specifically, we consider the q-order spectral entropy in the first iteration of the SRG (see [42] for the definition of spectral entropy)

where each is an eigenvalue of , and is the density operator derived from following equation (11). Note that we have . The q-order specific heat is defined according to the first derivative of the q-order spectral entropy

See section V in the supplementary materials for detailed derivations. The value of is selected to ensure , which indicates an infinite deceleration condition of diffusion processes. See figure 3(c) for illustrations. Apart from the infinite deceleration condition, determination of the timescale value used in step (2) of the SRG also requires us to consider the high-order properties of diffusion. For convenience, we denote as the value of under the infinite deceleration condition (i.e. there is ). The optimal timescale for the SRG is set as

which is the average subdivided timescale for each 1-simplex in an n-simplex (i.e. the average number of adjacent 1-simplices between arbitrary units i and j is ). Note that holds for pairwise interactions. The value of ε is used to correct the errors of timescale calculation when system units share ultra-dense interactions, which arise from the numerical bias of diffusion rate estimation in ultra-dense systems. The definition of ε can be seen in section VI in the supplementary materials.

The above procedure can be directly generalized to the non-ergodic case, where we treat each cluster Ci as a network and derive a for it if it has not been assigned with a timescale yet. The timescale of a cluster does not change after it is assigned. See figure 3(c) for illustrations.
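The selection of the timescale from the specific heat can be sketched numerically. The code below is an illustration under assumptions that are not confirmed by the text above: the density operator is taken to have eigenvalues proportional to exp(-tau * lambda), the spectral entropy is taken as the unnormalized von Neumann entropy of that operator, and the specific heat is taken as the negative derivative of the entropy with respect to the logarithm of the timescale. The subdivision of the timescale over 1-simplices and the correction ε are not reproduced. All function names are hypothetical.

```python
import math


def gibbs_weights(eigenvalues, tau):
    """Eigenvalues of the assumed density operator exp(-tau L) / Tr exp(-tau L)."""
    boltzmann = [math.exp(-tau * lam) for lam in eigenvalues]
    z = sum(boltzmann)
    return [w / z for w in boltzmann]


def spectral_entropy(eigenvalues, tau):
    """ASSUMED form: S = -sum_i mu_i ln mu_i (unnormalized)."""
    weights = gibbs_weights(eigenvalues, tau)
    return -sum(mu * math.log(mu) for mu in weights if mu > 0.0)


def specific_heat(eigenvalues, tau):
    """ASSUMED form: C = -dS/d(ln tau).

    With the weights above this equals tau^2 times the variance of the
    Laplacian eigenvalues under the weights mu_i.
    """
    mu = gibbs_weights(eigenvalues, tau)
    mean = sum(m * lam for m, lam in zip(mu, eigenvalues))
    second = sum(m * lam * lam for m, lam in zip(mu, eigenvalues))
    return tau * tau * (second - mean * mean)


def peak_timescale(eigenvalues, taus):
    """Grid point at which the specific heat is largest."""
    return max(taus, key=lambda tau: specific_heat(eigenvalues, tau))


# Laplacian spectrum of a path graph with 20 units: 2 - 2 cos(pi k / n).
n = 20
spectrum = [2.0 - 2.0 * math.cos(math.pi * k / n) for k in range(n)]
tau_grid = [10 ** (-2 + 4 * i / 200) for i in range(201)]  # 1e-2 ... 1e2
tau_peak = peak_timescale(spectrum, tau_grid)
```

A grid search is used here only for simplicity; any one-dimensional maximizer could replace it, and the resolution of the grid limits how precisely the peak is located.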

4.3. Renormalization flow

After determining the timescale, the SRG proposed in section 4.1 can be applied to renormalize interacting systems in a non-parametric manner. See section XIII in the supplementary materials for the programmatic implementation of the SRG.

In figure 4(a), we explain how the SRG iterates. For an input system, we define p and q to realize the renormalization guided by different orders of interactions. Given each pair of , we first search p- and q-simplices to represent the system of the p- and q-order (i.e. generate the high-order network sketches, and , as shown in figure 1). Meanwhile, we derive the critical timescale, , on following section 4.2. After the critical timescale is calculated, we iteratively realize steps (1)–(6) introduced in section 4.1 to renormalize the system once, twice, and so on. For instance, in the first iteration of the SRG, we follow steps (2)–(3) to find the Kadanoff blocks of the q-order (i.e. clusters in ) and coarse grain the units in each Kadanoff block into a macro-unit on the p and q-order to derive and . Then, we follow steps (4)–(6) to calculate the associated high-order Laplacian operators and . The obtained , , , and serve as inputs for the SRG to run the second iteration to generate , , , and . In this manner, the iteration continues and a renormalization flow is gradually generated.


Figure 4. Illustrations of the iteration process and coarse-graining procedure in the SRG. (a) A summary of the key processing pipelines of the SRG proposed in sections 4.1 and 4.2. Specifically, given an input system and a selected combination of , we follow step (1) to characterize the system of the p- and q-orders to generate its representations in the real (i.e. and ) and the momentum (i.e. and ) spaces. Then, we follow section 4.2 to calculate the timescale . If the ergodicity does not hold in , we repeat the timescale calculation on each cluster of . After obtaining , we follow steps (2)–(3) to find the Kadanoff blocks and implement coarse graining in the real space to generate and . We also follow steps (4)–(5) to find the modes corresponding to short-range high-order interactions and reduce the system in the momentum space into the contributions of long-range high-order interactions. Finally, we calculate and according to the correspondence between momentum space and real space. The generated , , , and are used for the next iteration of the SRG. (b)–(d) We use a system whose pairwise interactions follow the Watts–Strogatz network to show the real space renormalization flows on 1-order when the SRG is guided by , respectively. In these instances, the SRG runs two iterations and we assign each macro-unit in a unique color. If a unit is aggregated into a certain macro-unit, it adopts the color from this macro-unit. By recursively repeating this procedure, we can assign units in each iteration with appropriate colors to indicate which macro-unit in contains them. Meanwhile, to highlight the effects of different values of q on renormalization flows, we assign each unit in an index to trace the units grouped into the Kadanoff blocks in step (2) of the SRG framework. For instance (see (c)), the units 1, 2, 3, 28, 29, and 30 in are aggregated into a macro-unit when q = 2, whose index is assigned as 1 in . Meanwhile, the units 10, 11, 12, 13, and 14 in are coarse grained into a macro-unit in , which is indexed as 10. Moreover, the units 17, 18, 20, and 15 in are grouped into the macro-unit 17 in . Therefore, the system size is reduced from 30 to 18 after this iteration.


In figures 4(b)–(d), we mainly illustrate how the real space coarse graining in the SRG occurs at different orders. We define p = 1 and to generate renormalization flows and visualize the changes caused by coarse graining the units within each Kadanoff block. Two important messages are conveyed by figure 4. First, whenever a Kadanoff block of m units is found in , the size of is reduced by m − 1 (this property also holds for ). If there is no Kadanoff block in (i.e. no unit to coarse grain), the SRG arrives at its fixed point and the system remains unchanged during subsequent renormalization. Second, as shown in the comparison across different values of q, the system reduction during renormalization is always smaller given a larger value of q, which mainly arises from the sparser distribution of high-order interactions.
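As a worked check of the first observation, consider the example described in the caption of figure 4(c), where Kadanoff blocks of 6, 5, and 4 units are found in a system of 30 units. Writing N and N' for the system sizes before and after the iteration and m_b for the size of block b (labels introduced here only for illustration), the reduction reads

```latex
N' = N - \sum_{b} (m_b - 1)
   = 30 - \bigl[(6 - 1) + (5 - 1) + (4 - 1)\bigr]
   = 30 - 12
   = 18
```

which matches the reduction from 30 to 18 units reported in the caption.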

To demonstrate the meaning of distinguishing between p and q during renormalization, we compare the cases with p ≠ q (i.e. renormalize the system of an order according to its properties in another order) and p = q (i.e. equivalent to the situation where we do not distinguish between p and q) in section VII in the supplementary materials. In Sfigure 2 in the supplementary materials, we observe significant differences in both real space (i.e. ) and momentum space (i.e. ) between p = q and p ≠ q, demonstrating the effectiveness of supporting p ≠ q during renormalization. Therefore, by distinguishing between p and q, the SRG can analyze more phenomena than classic frameworks with . Meanwhile, these results qualitatively suggest the effects of higher-order interactions on lower-order interactions during renormalization (i.e. in the case with p ≠ q), whose quantification is explored in our subsequent analysis.

In figure 5, we present more instances of the renormalization flow generated by the SRG. We apply the SRG to synthetic interacting systems whose pairwise interactions follow the random tree (RT), the square lattice (SL), the triangular lattice (TL), the Watts–Strogatz networks (WS), and the Erdos–Renyi networks (ER). Among these random networks, the SL and TL can be defined with or without the periodic boundary conditions (PBCs). See section I in the supplementary materials for the definitions of these random networks. As shown in figure 5, the key structural properties of the RT, SL, and TL are principally preserved during renormalization (i.e. we can see qualitative similarities in across different iterations) while the SRG implemented on the WS and ER networks gradually reveals specific latent structures that are dissimilar to the original ones. Among these instances, the lattice systems with PBCs are observed to make the renormalization ineffective. This is a phenomenon caused by the identical properties of all units under the PBCs (i.e. all units are exactly the same as each other at any q-order). Because the renormalization procedure in the SRG begins by distinguishing between short- and long-range interactions, the SRG can effectively reduce a system only when units and interactions are not identical. Otherwise, it is impossible to tell which interactions are long-range and which are short-range. In figures 5(c) and (e), the SRG does not coarse grain systems because it discovers no short-range interaction. This phenomenon does not cause conflicts in our work because a lattice system with identical units and interactions can be treated as an extreme case of self-similarity to a certain extent (i.e. each sub-system is identical to the whole system) and can be invariant for coarse graining. To avoid this trivial phenomenon in our analysis, all the lattice systems implemented in our subsequent experiments (e.g. figures 6 and 7) are defined without the PBCs.


Figure 5. Large-scale instances of the renormalization flows of the SRG. The first three iterations of renormalization in the systems whose pairwise interactions follow the random tree (a), the square lattice (b), the square lattice with a periodic boundary condition (PBC) (c), the triangular lattice (d), the triangular lattice with a PBC (e), the Watts–Strogatz networks (f)–(h), and the Erdos–Renyi networks (i)–(j) are presented, where we set , , and for (a)–(c), (f), (i), (d), (e), (g), (j), and (h), respectively. See section I in the supplementary materials for the definitions of these random networks. The absence of the PBC implies that the definitions of different lattices (i.e. square and triangular) only hold within boundaries.


Figure 6. The scaling relation of the SRG. (a)–(e) We verify the scaling relation on the systems whose pairwise interactions follow a Barabási–Albert network (BA, c = 3), a Watts–Strogatz network (WS, each unit initially has eight neighbors and the edges are rewired according to a probability of 0.3), and an Erdos–Renyi network (ER, two units share an edge with a probability of 0.08), the triangular lattice (TL, with the interactions of the first two orders), and the random tree (RT, with only pairwise interactions), respectively. Scatters denote the mean values of averaged across all replicas on each order and the error bars denote standard deviations. See the definitions of random networks in section I in the supplementary materials. (f) The standard deviations compared across different systems. (g)–(i) The high-order effects measured on the replicas in (a)–(c), where each color area denotes the distribution of high-order effects in the k-iteration of the SRG ().


Figure 7. The topological invariance explored by the SRG. (a)–(d) The m-order Betti number () is calculated on in each kth iteration of the SRG (), where the multi-order Laplacian (MOL) is used. (e)–(h) The same analysis is repeated using an SRG defined with the high-order path Laplacian (HOPL). Each line in (a)–(h) corresponds to an instance of the evolution of the m-order Betti number during renormalization. Note that we define the y-axis as to meet the demands of a log-scale axis.


4.4. Scaling relation and high-order effects

Given the SRG, a crucial task is to analyze the potential scaling relation and quantify the effects of high-order interactions on it.

In the ergodic case, we denote and as the specific heat and density operator associated with . We can insert these variables into equation (25) to derive the following relation

where and the constant µ is obtained when we solve equation (25) as a differential equation. If the Laplacian eigenvalue spectrum (i.e. the density of the eigenvalues of ) follows a general power-law form, , we can obtain a scaling relation

whose derivations are presented in section VIII in the supplementary materials. The parameter ν is a specific constant. For a system with scale-invariance properties of the q-order, the power-law form of the Laplacian eigenvalue spectrum holds and should be invariant under the transformation of the SRG. Therefore, we expect to see an approximately constant in a certain timescale interval (i.e. the infinite deceleration condition), accompanied by a fixed during renormalization. For a system without scale invariance, the scaling relation does not hold true.

To generalize our analysis to the non-ergodic case, we respectively derive and for each cluster Ci of by following equations (27) and (28). These definitions enable us to obtain

and

where denotes weighted averaging across all clusters of (i.e. weighted according to cluster size). In the non-ergodic case, variables and refer to the global specific heat and the timescale measured on under the infinite deceleration conditions. Equations (27) and (28) actually serve as special cases of equations (29) and (30), where only one cluster exists. See section IX in the supplementary materials for derivations of these equations. Moreover, although the scaling relations in equations (27)–(30) are derived in a case with k = 1, we can also explore these scaling relations in for k > 1 (i.e. we directly replace with during the calculation of equations (27)–(30)). In summary, we can verify if a system obeys scaling relations and further explore whether the SRG preserves scaling relations.
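The cluster-size-weighted averaging used in the non-ergodic case can be illustrated with a short sketch. The function name and the sample numbers are hypothetical, and the weights are assumed to be the numbers of units in each cluster.

```python
def size_weighted_average(values, sizes):
    """Average per-cluster quantities, weighting each cluster by its size.

    values : per-cluster measurements (e.g. a specific heat per cluster).
    sizes  : number of units in each cluster (ASSUMED to be the weights).
    """
    total = sum(sizes)
    return sum(v * s for v, s in zip(values, sizes)) / total


# Hypothetical example: two clusters with 30 and 10 units.
global_value = size_weighted_average([0.8, 0.5], [30, 10])
```

With a single cluster the weighted average reduces to the value measured on that cluster, consistent with the ergodic case being a special case of the non-ergodic one.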

Equations (27)–(30) are derived for q-order interactions during renormalization and may deviate from the original scaling relation of p-order interactions. These p-order interactions, initially corresponding to and , are renormalized according to and in the SRG. In the case with p < q, we can measure the effects of q-order interactions on the scaling relation of p-order interactions as

which is applicable to both ergodic and non-ergodic cases. Meanwhile, similar to the scaling relation analysis, the measurement of high-order effects can be implemented in any iteration of the SRG.

Overall, the proposed SRG offers an opportunity to verify the potential existence of scale invariance at different orders and classify interacting systems according to their high-order scaling properties.

5. Application of the SRG

To this point, we have elaborated on the SRG framework and its properties. Next, we validate its applicability to analyzing systems with high-order interactions.

5.1. Verification of the multi-order scale invariance

We aim to verify the multi-order scale invariance (i.e. being invariant under the SRG transformation at different orders). Being scale-invariant at an order requires the system to be located at a certain fixed point of the renormalization flow (i.e. the system behaves similarly across all scales), leading to scale-independent characteristics of the system described by power-law scaling. Specifically, if we describe system states via a set of observables, their measurements are expected to maintain invariance across all the iterations of the SRG. Due to the internal complexity of the system's different orders, we may see a rich variety of fixed points in empirical data. In the opposite case, where the system lacks a fixed point, its behavior changes significantly across scales.

In section X in the supplementary materials, we implement our verification according to the behaviors of the Laplacian eigenvalue spectrum (the behavior of degree distributions is also presented as auxiliary information). Based on the existence of invariant power-law forms of Laplacian eigenvalue spectra under the SRG transformation, we can classify systems into scale-invariant, weakly scale-invariant, and scale-dependent types of different orders.

Apart from the above approach, there is a more direct way to realize the verification. Because a scale-invariant system of the q-order should invariably follow the scaling relation, being bound to satisfy equation (29) is one of the necessary conditions for being scale-invariant. Consequently, we can first measure the deviations from the scaling relation of the concerned systems before renormalization. Given significantly large deviations, the system cannot be scale-invariant, regardless of how it behaves during renormalization. In figure 6, the measurement is conducted on multiple types of systems in each kth iteration () of the SRG (see section I in the supplementary materials for random network definitions, and see section XI in the supplementary materials for experimental details). As shown in figures 6(a)–(e), although the mean observations of averaged across all replicas collapse onto the scaling relation at every order and in each kth iteration, the departures from the scaling relation measured by the standard deviations of and in some systems can be non-negligible. Specifically, as suggested in figure 6(f), the systems whose pairwise interactions follow known scale-invariant structures, such as the TL or the RT, do ensure that all their replicas follow the scaling relation with vanishing deviations in each kth iteration. The replicas of the systems whose pairwise interactions follow the Barabási–Albert network, a structure with weak scale-invariant properties (i.e. the Barabási–Albert network can be renormalized within a specific set of scales, but is not strictly scale-invariant [10]), have relatively small deviations at the first three orders as well. The replicas of other kinds of systems whose pairwise interactions follow the WS or the ER networks [80] exhibit larger deviations from the scaling relation at most orders, suggesting the absence of multi-order scale invariance.

In summary, the method reported in figure 6 can offer convenient detection of scale-dependent systems in practice because scale-invariant systems are expected to satisfy equation (29) before and during renormalization. To seek a more precise verification, one should apply the approach in section X in the supplementary materials.

Finally, we measure the high-order effects on the scaling relation by applying equation (31). As shown in figures 6(g)–(i), these high-order effects are non-negligible and generally increase with the difference between p and q. These results explain how the SRG differs from conventional RGs that focus on a single order.

5.2. Exploration of the topological invariance

In section III in the supplementary materials, we have explained how the SRG is theoretically related to the persistent homology theory and other concepts, where we suggest the potential applicability of the SRG to analyzing persistent homology. Below, we validate this applicability by using the SRG to explore the topological invariance.

In brief, the persistent homology theory can be used to study the lifetimes of different topological properties in a filtration process, during which we progressively add or remove simplices from the system [62, 74–76] (i.e. this is similar to the network evolution process in complex network studies [81]). Although the SRG is not exactly the same as the classic filtration approach because it modifies the system in a more complicated manner, it does offer an opportunity to study the evolution of topological properties during renormalization (i.e. we can measure different topological properties in for each ). Similar to the analysis of scale invariance, a topological property can be treated as invariant if it is persistent across all the iterations of the SRG (i.e. referred to as topological invariance). In our analysis, we mainly focus on the Betti numbers of different orders, a family of important topological properties used in biology [82–84] and physics [85–87] studies. In general, an m-order Betti number, , counts the number of m-dimensional holes in the system, which is a key topological characteristic of high-order structures and defines the dimension of mth homology [74]. In the terminology of graph theory, a 0-order Betti number counts the number of connected components and a 1-order Betti number is the number of cycles. Computationally, the m-order Betti number can be calculated using the toolbox offered by [88].
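The graph-theoretic reading of the two lowest Betti numbers can be made concrete with a small sketch. It is not the toolbox of [88]: it treats the system as a plain graph (a 1-dimensional complex with no filled triangles or higher simplices), in which case the number of independent cycles equals the number of edges minus the number of units plus the number of connected components. The function name is hypothetical.

```python
def graph_betti_numbers(num_units, edges):
    """Return (beta_0, beta_1) of a graph viewed as a 1-dimensional complex.

    beta_0 counts connected components; beta_1 = E - V + beta_0 counts
    independent cycles. Filled triangles or higher simplices are NOT handled.
    """
    parent = list(range(num_units))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Drop self-loops and duplicate edges before counting.
    distinct_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    components = num_units
    for edge in distinct_edges:
        a, b = tuple(edge)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1
    beta_0 = components
    beta_1 = len(distinct_edges) - num_units + beta_0
    return beta_0, beta_1


triangle = graph_betti_numbers(3, [(0, 1), (1, 2), (0, 2)])
tree = graph_betti_numbers(4, [(0, 1), (1, 2), (2, 3)])
```

Tracking these two numbers across iterations of a coarse-graining procedure would give the simplest version of the invariance test described above; higher-order Betti numbers require a simplicial homology library.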

In figure 7, we explore the potential existence of topological invariance in different random network models, including the Barabási–Albert network (c = 4), the WS network (each unit initially has 10 neighbors and edges are rewired according to a probability of 0.3), the ER network (two units share an edge with a probability of 0.1), the TL, and the RT. See section I in the supplementary materials for their definitions. We generate 100 replicas for each random network model, where every replica consists of 200 units. In the experiment, we design an SRG with a certain combination of and let it function on each random network to generate a sequence of for . Given each , we measure the m-order Betti number, , on it (). If an m-order Betti number is generally constant across all iterations, its corresponding topological properties can be treated as invariant during renormalization.

As shown in figure 7, the Betti numbers of scale-dependent systems (e.g. the WS and the ER networks) exhibit non-negligible changes on most orders when the SRG transforms systems across scales. Compared with these systems, weakly scale-invariant systems (e.g. the Barabási–Albert network) have less variable Betti numbers during renormalization. When systems become scale-invariant (e.g. the TL and the RT), the Betti numbers at all orders are generally invariant, except that some fluctuations may occur in certain replicas due to random effects. This finding is interesting because, to the best of our knowledge, there is no theoretical guarantee that the multi-order scale invariance is correlated with the topological invariance yet. According to our observations, we have the following speculation: specific topological invariance properties may exist near the critical point that makes the SRG, a framework dealing with high-order topological structures, have fewer effects on the information contained in the system. As a result, the behavior of the system seems to be approximately invariant under the scale transformation of the SRG, leading to the inference that the system is born at a fixed point of the renormalization flow. Exploring this speculation may deepen our understanding of the relation between the persistent homology [62, 74–76] and the criticality studied by the RG theory [6, 7], which remains a valuable direction for future work.

5.3. Identification of organizational structure

Next, we use the SRG to explore the roles of different orders of interactions in preserving or defining the organizational structures (e.g. latent communities) of complex systems, where we also validate the applicability of the SRG in organizational structure identification.

When organizational structures are given, we explore the conditions under which these structures can be maintained by the SRG. In section XII in the supplementary materials, we implement the verification using protein–protein interactions (e.g. orthologous and genetic interactions) in Caenorhabditis elegans [89]. We first extract the initial community structures of protein–protein interactions by applying the Louvain community detection algorithm [90] (see section I in the supplementary materials for an introduction). Then, we use the SRG to renormalize protein–protein interactions on a high order (i.e. with ) or a low order (i.e. with ). We define latent community structures according to unit aggregation during renormalization (see section XII in the supplementary materials for details). Consistent with [38], our results in section XII in the supplementary materials highlight that renormalization can never be treated as a trivial counterpart of clustering. Meanwhile, the SRG guided by high-order interactions (e.g. q = 2) can generally preserve the properties of initial community structures while the SRG guided by pairwise interactions does not. Based on these results, we speculate that high-order interactions are essential in forming and characterizing organizational structures. Communities cannot be treated as trivial collections of pairwise interactions, and it may be inappropriate to partition a system into sub-systems based only on pairwise relations. Given this observation, we naturally wonder if high-order interactions alone are sufficient to define ideal community structures (i.e. do not consider pairwise relations while determining system partition).

To tackle this question, we explore whether an SRG defined by different combinations of (p, q) can be applied to identify latent communities when the system partition is unknown. In figures 8(a)–(c), we define communities according to the renormalization flows of the SRG and compare them with the Louvain communities (i.e. the communities detected by the Louvain algorithm [90], see section I in the supplementary materials). Specifically, we classify units into the same SRG community if they are aggregated into the same macro-unit during renormalization (e.g. we generate SRG communities after the first iteration of renormalization flows in figures 8(a)–(c)).
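The grouping rule described above can be sketched in a few lines. This is a minimal illustration, assuming that one renormalization step returns a mapping from each unit to its macro-unit; the name `unit_to_macro` and the labels are hypothetical and are not taken from the authors' library.

```python
from collections import defaultdict


def communities_from_aggregation(unit_to_macro):
    """Group units that were aggregated into the same macro-unit."""
    groups = defaultdict(set)
    for unit, macro in unit_to_macro.items():
        groups[macro].add(unit)
    # Sort by smallest member so that the output order is deterministic.
    return sorted(groups.values(), key=min)


# Hypothetical result of one renormalization step: unit -> macro-unit label.
unit_to_macro = {0: "A", 1: "A", 2: "B", 3: "A", 4: "C", 5: "B"}
srg_communities = communities_from_aggregation(unit_to_macro)
print(srg_communities)  # [{0, 1, 3}, {2, 5}, {4}]
```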

Figure 8. The organizational structure discovered by renormalization flows. (a)–(c) The Louvain communities underlying the Genetics (Ge), WI2007 (WI), and Inflamome (In) data sets [89] are extracted using the Louvain community detection algorithm [90]. The units of different communities are distinguished by colors. For comparison, the SRG is applied to renormalize these systems with and . Units are determined as belonging to the same SRG community if they are aggregated into the same macro-unit. Meanwhile, the corrected SRG communities are presented. (d)–(f) The correctness of communities and disconnectivity between communities are shown. (g)–(i) The adjusted mutual information, completeness, and homogeneity are measured in SRG communities. 'Ge (1,1)' denotes the SRG communities with (p, q) = (1, 1) in the Genetics data set. 'Ge Co' denotes the corrected SRG communities in the Genetics data set. Other labels can be understood in a similar way. See section I in the supplementary materials for the definitions of all the measures used in (d)–(i).

To quantitatively compare the SRG communities with the Louvain communities, we analyze the correctness of communities (also referred to as the performance) [91] and the disconnectivity between communities (also referred to as the coverage) [91]. Note that both concepts reflect the properties of communities at the 1-order because they are defined on pairwise interactions [91] (see section I in the supplementary materials for precise definitions). As shown in figures 8(d)–(f), the behavior of the SRG differs from that of the Louvain community detection algorithm, which is designed to maximize the modularity of communities [90, 92, 93]. When the SRG is guided by pairwise interactions, the generated SRG communities tend to be disconnected from each other (i.e. lack pairwise interactions across communities) and have low accuracy (i.e. units in the same community may lack pairwise interactions because many short-range pairwise interactions are reduced). When the SRG is guided by high-order interactions, the generated SRG communities tend to be precise (i.e. units in the same community are always involved in high-order interactions) and non-isolated (i.e. across-community pairwise interactions exist since the SRG mainly reduces high-order interactions). This contrast motivates us to generate SRG communities that combine the properties of these two cases. We suggest using pairwise interactions to correct the SRG communities generated by the SRG guided by high-order interactions. Specifically, for two SRG communities generated with , if at least half of the units in one community have pairwise interactions with the units in another community, then all units in these two SRG communities are merged into the same community. The corrected SRG communities are presented in figures 8(d)–(f), exhibiting a better balance between the correctness and the disconnectivity.
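The correction rule and the two quality measures can be illustrated with a short sketch. It assumes the standard definitions of coverage (fraction of edges inside communities) and performance (fraction of unit pairs that are correctly classified as intra-community edges or inter-community non-edges), which may differ in detail from those in the supplementary materials. The single-pass, transitive merging strategy and all function names are assumptions made for illustration.

```python
from itertools import combinations


def coverage_and_performance(n_units, edges, communities):
    """Return (coverage, performance) of a partition of a simple graph."""
    label = {u: i for i, c in enumerate(communities) for u in c}
    edge_set = {frozenset(e) for e in edges}
    intra_edges = sum(1 for e in edge_set if len({label[u] for u in e}) == 1)
    total_pairs = n_units * (n_units - 1) // 2
    intra_pairs = sum(len(c) * (len(c) - 1) // 2 for c in communities)
    inter_edges = len(edge_set) - intra_edges
    inter_non_edges = (total_pairs - intra_pairs) - inter_edges
    coverage = intra_edges / len(edge_set)
    performance = (intra_edges + inter_non_edges) / total_pairs
    return coverage, performance


def merge_communities(edges, communities):
    """Merge two communities if at least half of the units in one of them
    have pairwise interactions with units in the other (assumed single pass)."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    def linked_fraction(a, b):
        return sum(1 for u in a if neighbors.get(u, set()) & b) / len(a)

    parent = list(range(len(communities)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(communities)), 2):
        a, b = communities[i], communities[j]
        if linked_fraction(a, b) >= 0.5 or linked_fraction(b, a) >= 0.5:
            parent[find(i)] = find(j)

    merged = {}
    for i, c in enumerate(communities):
        merged.setdefault(find(i), set()).update(c)
    return sorted(merged.values(), key=min)


# Hypothetical toy system with seven units and three initial communities.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (2, 3), (1, 4), (5, 6)]
initial = [{0, 1, 2}, {3, 4}, {5, 6}]
corrected = merge_communities(edges, initial)
print(corrected)  # [{0, 1, 2, 3, 4}, {5, 6}]
cov_before, perf_before = coverage_and_performance(7, edges, initial)
cov_after, perf_after = coverage_and_performance(7, edges, corrected)
```

In this toy case, two of the three units in the first community interact with the second community, so the two are merged, while the third community stays separate because no pairwise interaction links it to the others.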

From another perspective, we quantify the consistency between the SRG communities and the Louvain communities in defining system partition in figures 8(g)–(i). As measured by the adjusted mutual information, completeness, and homogeneity (see section I in the supplementary materials for the definitions of these metrics, whose larger values indicate higher consistency), the partitions defined by the corrected SRG communities are generally consistent with, but not exactly equivalent to, those formed by the Louvain communities in all data sets. On the other hand, the SRG communities defined based only on high-order (i.e. with ) or low-order (i.e. with ) interactions do not necessarily maintain consistency with the Louvain communities in these data sets. Because the Louvain communities are defined by maximizing the modularity [90, 92, 93], we suggest that the optimization of modular structures cannot be realized based only on pairwise or only on high-order interactions. Instead, modular structures depend on multiple orders of interactions.
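As an illustration of two of these consistency measures, the sketch below computes homogeneity and completeness from their standard entropy-based definitions, treating one partition as the reference (e.g. the Louvain communities) and the other as the prediction (e.g. the SRG communities). The definitions in the supplementary materials may differ in detail, and the label vectors are hypothetical. The adjusted mutual information is omitted because its chance-correction term is longer to write out; scikit-learn provides all three as `adjusted_mutual_info_score`, `homogeneity_score`, and `completeness_score`.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a label vector."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(labels, given):
    """H(labels | given) in nats."""
    n = len(labels)
    joint = Counter(zip(given, labels))
    marginal = Counter(given)
    return -sum((c / n) * log(c / marginal[g]) for (g, _), c in joint.items())


def homogeneity(reference, predicted):
    """1 if every predicted community contains units of one reference class."""
    h_ref = entropy(reference)
    if h_ref == 0:
        return 1.0
    return 1.0 - conditional_entropy(reference, predicted) / h_ref


def completeness(reference, predicted):
    """1 if every reference class falls inside a single predicted community."""
    return homogeneity(predicted, reference)


# Hypothetical community labels of six units.
louvain_labels = [0, 0, 0, 1, 1, 1]
srg_labels = [0, 0, 1, 2, 2, 2]
h = homogeneity(louvain_labels, srg_labels)
c = completeness(louvain_labels, srg_labels)
print(h, c)
```

Here each SRG community lies within one Louvain community, so the homogeneity is maximal, whereas the first Louvain community is split across two SRG communities, which lowers the completeness.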

Taken together, high-order interactions are more effective than pairwise interactions in preserving the principal characteristics of organizational structures. However, high-order interactions alone are not sufficient to define ideal system partitions with optimal modular structures and clear separability among communities. For real systems, their organizational structures are defined by the intricate coexistence of various orders of interactions, exhibiting diverse behaviors across different orders. Although the SRG does not behave in a manner equivalent to the optimized algorithms for community detection (e.g. the Louvain algorithm [90]), it offers a flexible way to analyze the roles of the interactions of every order in characterizing organizational structures since we can freely select different combinations of (p, q) in the analysis. Moreover, after considering the interactions of multiple orders, the corrected SRG communities have the potential to offer a system partition scheme that is competitive with optimized algorithms (see figures 8(d)–(i)), which suggests the applicability of the SRG to identifying latent community structures of real systems.

5.4. Optimization of information bottleneck

Finally, we analyze the SRG in terms of informational properties. In previous studies, the analysis using information theory has suggested that RG frameworks essentially maximize the mutual information between relevant features and the environment [94, 95] or reduce the mutual information among irrelevant features [96]. In this work, we suggest exploring the SRG from the perspective of an information bottleneck [97] because an RG, similar to dimensionality reduction or representation learning approaches in machine learning [98–100], essentially deals with information encoding during data compression [101].

To realize this analysis, we follow one of our earlier works [102] to represent the associated high-order network sketch in the kth iteration of the renormalization flow, , by a Gaussian variable

where J is an all-one matrix and measures the number of units in . The derived Gaussian variable offers an optimal representation of with network-topology-dependent smoothness and maximum entropy properties, and enables us to compare across different iterations (see [102] for explanations). If necessary, one can further add a scaled identity matrix, (a > 0), to the covariance matrix of to ensure its positive semi-definiteness in the non-ergodic case (i.e. where is disconnected).

We suggest that one considers the following information bottleneck

where i = k − 1 if k > 1 and i = k if k = 1. The parameter ψ defines the regularization strength. We denote as the mutual information (see section I in the supplementary materials for its definition), which can be estimated by the mutual information estimator with local non-uniformity corrections (we set the correction strength as 10⁻³ such that corrections only occur when necessary) [103]. This approach has been included in the non-parametric entropy estimation toolbox [104]. Note that equation (33) is different from the information bottleneck considered in [101] because the question we address differs from that of [101].

Intuitively, the objective function in equation (33) evaluates whether the renormalized system in the kth iteration, , preserves the information of the original system, , by maximizing and reduces the trivial dependence on its previous state, , by minimizing . For a valid RG, being able to preserve some information of the original system is an essential capability. Because a trivial solution of maximizing is to make for any k > 1 (i.e. there is no renormalization at all), we need to include the complexity term in equation (33) to avoid this trivial result.

Meanwhile, the possibility of maximizing depends on the properties of the system as well. If the system is scale-invariant, the preserved information does not rapidly vanish when we reduce the complexity since the information of holds across scales (note that still reduces slightly because the decreasing dimensionality of affects the numerical behavior of the estimator). If the system is scale-dependent, the preserved information crucially relies on the complexity and vanishes unless the complexity is high. Therefore, we can divide by to measure the complexity required by a single bit of preserved information in the kth iteration. A high value of complexity per information bit suggests that the system is scale-dependent because we cannot find a low-complexity coarse-grained system to represent it efficiently.
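The two quantities discussed above can be sketched as follows. This is a hedged illustration: the difference form of the objective (preserved information minus ψ times complexity) is an assumption inferred from the description of equation (33), the function names are hypothetical, and the mutual information values are illustrative placeholders rather than results from the paper. In practice, these values would come from a mutual information estimator such as the one cited in [103].

```python
def bottleneck_objective(preserved_bits, complexity_bits, psi):
    """Assumed objective: reward preserved information, penalize complexity."""
    return preserved_bits - psi * complexity_bits


def complexity_per_bit(preserved_bits, complexity_bits):
    """Complexity required per preserved information bit.

    Returns infinity if no information about the original system is preserved.
    """
    if preserved_bits <= 0:
        return float("inf")
    return complexity_bits / preserved_bits


# Hypothetical mutual information values (in bits) over three iterations:
# information shared with the original system and with the previous state.
preserved = [4.0, 4.0, 2.0]
complexity = [4.0, 2.0, 0.5]

ratios = [complexity_per_bit(p, c) for p, c in zip(preserved, complexity)]
objectives = [bottleneck_objective(p, c, psi=0.5) for p, c in zip(preserved, complexity)]
print(ratios)      # [1.0, 0.5, 0.25]
print(objectives)  # [2.0, 3.0, 1.75]
```

A ratio that decreases along the renormalization flow, as in this toy sequence, corresponds to the complexity-saving behavior described for scale-invariant systems, whereas a persistently high ratio corresponds to the scale-dependent case.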

In figure 9(a), we show the information curves of the renormalization flows generated in four kinds of interacting systems. As suggested by our results, the renormalization flows of weakly scale-invariant (e.g. Barabási–Albert) and scale-invariant (e.g. RT) systems lead to sufficiently small complexity values and numerous preserved information bits, while the renormalization flows of scale-dependent systems (e.g. ER and WS) do not always achieve this (although the WS network realizes complexity reductions during renormalization, it is still not competitive with the Barabási–Albert network and RTs). The information bits of scale-dependent systems are preserved at the cost of maintaining high complexity, while scale-invariant and weakly scale-invariant systems progressively become more complexity-saving in preserving information during renormalization. These results can be quantitatively validated based on the complexity per information bit measured in figure 9(b). These findings suggest the possibility of verifying scale invariance from an informational perspective. Moreover, we observe that the SRGs guided by pairwise interactions (i.e. q = 1) frequently achieve more significant complexity reductions while preserving information. This phenomenon may arise from the fact that high-order interactions are distributed in systems in a sparser manner.

Figure 9. The information curve of the SRG. (a) The SRG is applied to four synthetic interacting systems, whose pairwise interactions follow the Erdős–Rényi network (ER, each pair of units shares an edge with a probability of 0.02), the Watts–Strogatz network (WS, each unit initially has five neighbors and edges are rewired according to a probability of 0.05), the Barabási–Albert network (BA, c = 4), and the random tree (RT), respectively. Each kind of system consists of 500 units and has 5 replicas. The SRG is defined with the multi-order Laplacian and for the RT, while is set for all other systems. The generated information flows are used to derive information curves. (b) The complexity per preserved information bit is shown after averaging across all replicas.

6. Discussion

Various real interacting systems share universal characteristics that govern their dynamics and phase transition behavior [105]. These characteristics remain elusive because fundamental physics tools, such as path integrals and RG theories, are not well-established when a priori knowledge about system mechanisms is absent [10]. This gap may account for diverse controversies concerning critical phenomena in biological or social systems (e.g. the controversies about brain criticality [106]) and has given rise to a booming field devoted to developing statistical physics approaches for empirical data generated by unknown mechanisms [10]. Notable progress has been made in the computational implementations of path integral and RG theories in real systems with pairwise interactions [10, 29–33], supporting remarkable applications in the analysis of neural dynamics [107] and swarm behavior [108]. Nevertheless, these pioneering works have no apparent equivalent for analyzing undecomposable high-order interactions because classic network or geometric representations are invalid in characterizing polyadic relations [9, 11–14].

In this work, we contribute to the field by developing the natural generalizations of path integrals and RGs on undecomposable high-order interactions. Our main contributions are summarized below:

  • (1)  
    Our theoretical derivations lead us to an intriguing perspective that the system evolution governed by high-order interactions can be fully formalized by path integrals on simplicial complexes (e.g. the return probability of an arbitrary system state is proportional to the path integral of all closed curves that start from and end with this state in high-order networks), which suggests the possibility of studying high-order interactions by applying the tools of quantum field theory [3, 4].
  • (2)  
    We suggest that the contributions of microscopic fluctuations to macroscopic states in the momentum space are precisely characterized by a function of the multi-order Laplacian [9] or our proposed high-order path Laplacian. Consequently, an RG, the SRG, can be directly developed based on the diffusion on simplicial complexes, which does not require any assumption of a latent metric for mapping units into target spaces (e.g. the hyperbolic space [30, 33]). This property ensures the general applicability of our theory to real complex systems, where a priori knowledge is rare and any assumption can be unreliable. In contrast to classic RGs, the SRG can renormalize p-order interactions in a system based on the structure and dynamics associated with q-order interactions (), enabling us to analyze the effects of q-order interactions on the p-order interactions and study the scaling properties across different orders.
  • (3)  
    We propose a divide-and-conquer approach to deal with non-ergodic cases, where some system states are never reachable by the evolution started from other system states (i.e. the system lacks global connectivity). The developed framework improves the validity of our theory for high-order interactions, which usually feature sparse distributions that break system ergodicity at a high order.
  • (4)  
    We seek a comprehensive analysis of the scaling relations in both ergodic and non-ergodic cases. Our theoretical derivations and computational experiments suggest that the scaling relation can help differentiate among scale-invariant, weakly scale-invariant, and scale-dependent systems of an arbitrary order (consistent with verification using the behavior of the Laplacian eigenvalue spectrum). Meanwhile, the effects of high-order interactions on the scaling relation can be measured as well.
  • (5)  
    We propose a way to use the SRG to verify the existence of topological invariance, suggesting the possibility of relating the SRG to the persistent homology analysis [62, 74–76]. As empirically demonstrated in our results, the SRG can serve as a tool similar to the filtration process [62, 74–76] during the study of the lifetimes of different topological properties. In our experiments, we observe a correlation between the multi-order scale invariance and the topological invariance (i.e. multi-order scale-invariant systems are more likely to satisfy the topological invariance), which suggests the potential role of topology in shaping the statistical physics properties of complex systems.
  • (6)  
    Apart from validating the essential difference between the SRG and clustering [38], our experiments also suggest the applicability of the SRG to the identification of organizational structures (e.g. latent communities) of real complex systems and the analysis of the crucial roles of the interactions of different orders in characterizing these structures.
  • (7)  
    We suggest that the SRG can be analyzed from the perspective of an information bottleneck, which quantifies how the renormalized system preserves the information of its original properties while reducing complexity. We discover that the complexity required for maintaining one information bit can distinguish between scale-invariant and scale-dependent systems from an informational perspective.

In summary, by extending classic path integrals and RGs to simplicial complexes, our research reveals a novel route to studying the universality classes of complex systems with intertwined high-order interactions. The proposed simplex path integral and SRG can serve as precise tools for characterizing system dynamics, discovering intrinsic scales, and verifying potential scale invariance at different orders. The information revealed by our theory may elucidate the intricate effects of the interplay among multi-order interactions on the dynamical properties of systems (e.g. phase transitions or scale invariance). To ensure the capacity of our theory to analyze real data sets that are governed by unknown mechanisms or lack clear network structures, we suggest that one consider an a-priori-knowledge-free framework in future studies. This framework begins by applying specific non-negative non-parametric metrics (e.g. distance correlation [109] or co-mutual information [110]) to evaluate the coherence between units and define the adjacency matrix of pairwise interactions. Then, the high-order representation and the SRG can be progressively calculated by following our theory.
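The first step of such a framework can be sketched with the sample distance correlation, computed from double-centered pairwise distance matrices. This is a minimal illustration for scalar observations per unit; the optional `threshold` parameter, the function names, and the toy series are assumptions introduced here and are not part of the proposed framework.

```python
from math import sqrt


def _centered_distances(x):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    row = [sum(r) / n for r in d]  # equals the column means by symmetry
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation between two equally long series."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(v * v for r in a for v in r) / n**2
    dvar_y = sum(v * v for r in b for v in r) / n**2
    denom = sqrt(dvar_x * dvar_y)
    # max() guards against tiny negative values caused by rounding.
    return sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0


def adjacency_from_series(series, threshold=0.0):
    """Weighted adjacency matrix; weights below the assumed threshold are dropped."""
    n = len(series)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = distance_correlation(series[i], series[j])
            if w >= threshold:
                adj[i][j] = adj[j][i] = w
    return adj


# Hypothetical observations of three units.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [2.0 * v + 1.0 for v in x]  # linear function of x
z = [v * v for v in x]          # nonlinear function of x
adj = adjacency_from_series([x, y, z], threshold=0.6)
```

With the assumed threshold of 0.6, only the linearly related pair keeps an edge in this toy example; setting the threshold to zero retains every non-negative weight and yields a fully weighted adjacency matrix.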

Acknowledgments

This work is a part of the Topophy program. Y T develops the theory, designs computational experiments, writes the manuscript, and leads the Topophy program. A H C contributes to theoretical derivations, programmatic implementation, and computational experiment realization. Y H X contributes to theoretical derivations and proofreading. P S supervises the research project, proofreads and revises the manuscript, contributes to conceptualization, and offers technical support. This project is supported by the Artificial and General Intelligence Research Program of Guo Qiang Research Institute at Tsinghua University (2020GQG1017) and the Huawei Innovation Research Program (TC20221109044). The authors are grateful to Dr Hedong Hou, who studies at the Institut de Mathématiques d'Orsay, for his discussions. The authors thank Mr Kangyu Weng, who studies at Tsinghua University, for his help with proofreading. The authors also thank the anonymous reviewers for their inspiring suggestions, especially those about the topological invariance analysis, the SRG in a form of conventional renormalization group theories, and the nature of high-order interactions.

Data availability statement

The open source Python library of the proposed simplex renormalization group and its tutorials are provided in [51].
