Paper The following article is Open access

Network processes on clique-networks with high average degree: the limited effect of higher-order structure

Clara Stegehuis and Thomas Peron

Published 22 November 2021 © 2021 The Author(s). Published by IOP Publishing Ltd
Citation: Clara Stegehuis and Thomas Peron 2021 J. Phys. Complex. 2 045011, DOI 10.1088/2632-072X/ac35b7

2632-072X/2/4/045011

Abstract

In this paper, we investigate the effect of local structures on network processes. We investigate a random graph model that incorporates local clique structures, and thus deviates from the locally tree-like behavior of most standard random graph models. For the process of bond percolation, we derive analytical approximations for large percolation probabilities and the critical percolation value. Interestingly, these derivations show that when the average degree of a vertex is large, the influence of the deviations from the locally tree-like structure is small. In our simulations, this insensitivity to local clique structures often already kicks in for networks with average degrees as low as 6. Furthermore, we show that the different behavior of bond percolation on clustered networks compared to tree-like networks that was found in previous works can be almost completely attributed to differences in degree sequences rather than differences in clustering structures. We finally show that these results also extend to completely different types of dynamics, by deriving similar conclusions and simulations for the Kuramoto model on the same types of clustered and non-clustered networks.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

One of the problems that has motivated much of the research in network science is assessing how structural characteristics of real-world networks determine the performance of dynamical processes that take place on them [26]. Most analytical approaches to this problem use networks constructed via configuration models [5] as the substrate for the dynamics. In such models, one specifies the fraction $p_k$ of vertices with k neighbors. A sequence of vertex degrees $\{k_1,\dots,k_N\}$ is then drawn independently from $p_k$, and the network is assembled by choosing pairs of 'half-edges' (or stubs) from this sequence uniformly at random and joining them to form complete edges. While this method can generate networks with any prescribed degree distribution and offers great analytic tractability, it has the shortcoming that the generated networks are locally tree-like. That is, the density of cycles vanishes asymptotically as the network size increases. This contrasts markedly with the rich topological structure of real-world networks, which often exhibit short cycles, degree correlations and clustering (i.e., the tendency of groups of three vertices to form triangles). Clustering is common to a variety of systems, but it is especially important in social networks, where the average probability that two neighbors of a vertex are also neighbors themselves (also referred to as the clustering coefficient) often reaches values of tens of percent [26]. Other classes of systems known to be highly clustered in this sense include biological and information networks [26]. Hence, the inclusion of triangles and other types of subgraphs in random network models appears to be a crucial step towards modeling dynamical processes on networks accurately.

A practical method to create analytically tractable random networks with a more realistic clustering structure is to extend the standard configuration model so that it explicitly generates motifs that yield clustering. The first model of this kind was proposed independently by Newman [27] and Miller [24]. This model sets two degree sequences drawn from a joint degree distribution: the first sequence prescribes how many edges each vertex is incident to, exactly as in the standard configuration model; and the second degree sequence defines the number of triangles to which each vertex is attached. As the model then matches these stubs accordingly into edges and triangles, it generates networks with non-vanishing clustering even in the limit of large sizes [24, 27]. This strategy can be adapted to produce networks not only with triangles, but also with distributions of cliques of larger size [9], different types of subgraphs [19], or edge-multiplicities [38].

A number of previous authors have investigated the impact of added clustering on several types of network dynamics by employing such extensions of the standard configuration model. For instance, using a model that creates networks with arbitrary distributions of cliques [9], Gleeson et al [10] showed that clustered networks exhibit higher bond percolation thresholds than locally tree-like structures with the same degree distributions and correlation properties. Very recently, Mann et al [21] studied the percolation properties of the model by Karrer and Newman [19] under different combinations of cycles and cliques as building blocks for the networks. The authors confirmed that the increased clustering created by cliques leads to higher percolation thresholds [21]. On the other hand, the dynamics of networks containing only cycles were shown to approach the result obtained for the configuration model when the length of these cycles increases, as the model then becomes more locally tree-like. A different method to add clique structures to standard configuration models is to use household models, where every vertex of the configuration model is exploded into a clique of a specified size [1]. In this model, clustering was found to increase the percolation threshold [2, 6]. However, when clustered subgraphs other than cliques are included, the percolation threshold may either increase or decrease compared to a locally tree-like model [18, 34].

Network processes on configuration models with higher-order clustering find their widest application in mathematical epidemiology, because of the natural importance of modeling of outbreaks in real-world scenarios and the close analogy between disease spreading and percolation processes. Indeed, many results uncovered in the context of percolation have counterparts in disease spreading. For instance, the presence of triangles has been found to increase the epidemic threshold while decreasing the outbreak size [24]. Likewise, networks composed of cycles have been shown to yield epidemic dynamics similar to those of tree-like networks as the length of these cycles increases [32]. Examples of other dynamics investigated with higher-order configuration models include cascade propagation [13, 14], the Ising model [16], and synchronization of coupled oscillators [30].

In this paper we reveal an effect that seems to have remained unnoticed in many previous works; namely, we show that the influence of higher-order subgraphs on network dynamics is negligible when the average degree is large. This phenomenon has also been observed in several real-world networks [23, 25]. We show this insignificance of clustering analytically for a popular random graph model that includes clustering in the standard configuration model [27]: in this model, the percolation dynamics of clustered networks for large outbreaks, as well as the critical percolation value, converge to those expected for locally tree-like networks. We focus on the most clustered subgraphs possible: cliques of different sizes. While our analytical results are for the large average degree limit, in our simulations this convergence already appears for average degrees as small as 6 for several degree distributions. We also show that these conclusions hold for the synchronization transition of phase oscillators modeled by the Kuramoto model [33], indicating that the insensitivity to local network structures may hold for a wide range of network processes. While the effect of added clustering is non-trivial and can either decrease or increase the percolation threshold in any finite network, as was shown in a large body of previous research [2, 6, 10, 18, 21, 34], we show that these increases or decreases are small, and vanish when the average vertex degree becomes large.

Organization of the paper. We first describe the random graph model with subgraphs in section 2. We then focus on the setting where the network is formed by k-cliques of one given size. In section 3, we show that in such networks, the size of the largest component under percolation becomes independent of the clique structures for large percolation probabilities. We then turn to small percolation probabilities in section 3.1, where we show that the critical percolation threshold can also be approximated by a k-independent value when the average degree of the network is large. In section 4, we investigate a setting where different clique sizes are present. It has been reported that the degree correlations this may introduce can affect the size of the largest component under percolation; we show that even in this setting, when the average degree grows, the giant component depends only on the degree distribution of the network, not on the specific clique sizes. Finally, in section 5, we use analytical approximations as well as simulations to show that for a very different network process, the Kuramoto model, this insensitivity to local clustered network structures also appears for networks of large average degree.

2. Random graph model with clique subgraphs

We employ the random graph model with clustering developed in [19, 27]. This model is a general framework that extends the configuration model to create networks with specified densities of arbitrary specified subgraphs. Including clustered subgraphs in the set of specified subgraphs makes it possible to overcome the locally tree-like property of the standard configuration model.

In this manuscript we focus on the most clustered sets of subgraphs, cliques. That is, every vertex has a joint clique-degree vector $(s^{(1)},\dots,s^{(m)})$, where m + 1 denotes the largest clique size. Here ${s}_{i}^{(1)}$ denotes the edge-degree of vertex i, and ${s}_{i}^{(j)}$ denotes the clique-degree of size j + 1 of vertex i. The clique-degree of a vertex describes the prescribed involvement of a vertex in non-overlapping maximal cliques of a specified size. Thus, a vertex of clique-degree ${s}_{i}^{(2)}=3$ is prescribed to participate in 3 cliques of size 3. Note that this clique-degree only counts the cliques of a given size that are prescribed by the clique-degree vector. A vertex of k-clique-degree l may in fact be incident to more than l cliques of size k + 1. For example, a vertex with clique-degree vector (2, 1, 1) (see figure 1) takes part in one 3-clique and one 4-clique, but this 4-clique also contains three 3-cliques incident to the same vertex, which are not counted in the clique-degree vector. Thus, this particular vertex is in fact part of four 3-cliques. We denote the probability that a vertex has clique-degrees $s^{(1)},\dots,s^{(m)}$ by ${q}_{{s}^{(1)},\dots ,{s}^{(m)}}$. The degree, or total number of connections, of vertex i is then ${\sum }_{j=1}^{m} j{s}_{i}^{(j)}$, because every clique of size j + 1 adds j connections to the vertex. We denote the degree distribution of a vertex by $p_k$, so that

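$$p_k=\sum_{s^{(1)},\dots,s^{(m)}} q_{s^{(1)},\dots,s^{(m)}}\,\mathbb{1}\left\{\sum_{j=1}^{m} j\,s^{(j)}=k\right\}\qquad (1)$$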

where $\mathbb{1}$ is the indicator function. After sampling a joint clique-degree for every vertex, the network is then formed by selecting j uniformly chosen clique-edges of size j, and pairing the corresponding vertices into a clique. This process is continued for all j until all clique-edges have been paired into a clique. This is an extension of the standard configuration model, where the network is formed by pairing two uniformly chosen half-edges until all half-edges have been paired. Such pairing schemes can form multiple edges or self-loops, creating overlapping cliques, but the probability of such multiple edges or self-loops becomes small in the large-graph limit [19, 24]. The structure of a network constructed via the model described above is illustrated in figure 1.
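The pairing procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `build_clique_network` and `prescribed_degree` are hypothetical, leftover clique-stubs that do not fill a complete clique are simply dropped, and multiple edges and self-loops are collapsed rather than rewired.

```python
import random
from itertools import combinations


def prescribed_degree(s):
    """Degree implied by a clique-degree vector s = (s^(1), ..., s^(m)):
    every clique of size j + 1 contributes j connections."""
    return sum(j * s_j for j, s_j in enumerate(s, start=1))


def build_clique_network(clique_degrees, seed=0):
    """clique_degrees[i][j-1] = s_i^(j), the number of (j+1)-cliques of vertex i.
    Returns a set of undirected edges (u, v) with u < v."""
    rng = random.Random(seed)
    n_types = len(clique_degrees[0])
    edges = set()
    for j in range(1, n_types + 1):
        size = j + 1  # a clique-stub of type j belongs to a clique on j + 1 vertices
        stubs = [v for v, s in enumerate(clique_degrees) for _ in range(s[j - 1])]
        rng.shuffle(stubs)  # uniform random matching of clique-stubs
        # Group consecutive stubs into cliques; leftover stubs are dropped (assumption).
        for start in range(0, len(stubs) - size + 1, size):
            members = set(stubs[start:start + size])  # collapses repeated vertices
            for u, v in combinations(sorted(members), 2):
                edges.add((u, v))  # the set discards multiple edges
    return edges
```

For the vertex a of figure 1, `prescribed_degree((2, 1, 1, 0, 0))` evaluates to 2·1 + 1·2 + 1·3 = 7.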

Figure 1. Illustration of a small network constructed according to the model defined in section 2, with largest clique size m + 1 = 6. Vertex a has clique-degree vector $({s}_{a}^{(1)},{s}_{a}^{(2)},\dots ,{s}_{a}^{(5)})=(2,1,1,0,0)$, i.e. it takes part in two single-edges, one 3-clique (triangle) and one 4-clique, meaning that it is attached to four 3-cliques in total. Vertex b, in turn, has clique-degree vector $({s}_{b}^{(1)},{s}_{b}^{(2)},\dots ,{s}_{b}^{(5)})=(0,0,1,1,0)$ (one 4-clique and one 5-clique).

3. Bond percolation with general cliques

We now investigate the behavior of this network model under bond percolation, where every edge is removed independently with probability 1 − π. We first focus on the case where every vertex is part of only k-cliques. Let qi denote the probability that a randomly chosen vertex is part of i k-cliques. Define the generating functions

$$g(x)=\sum_{i\geq 0} q_i x^i,\qquad g_p(x)=\frac{1}{\langle s\rangle}\sum_{i\geq 1} i\,q_i x^{i-1}\qquad (2)$$

where ⟨s⟩ denotes the average number of k-cliques a vertex is part of. Let u denote the probability that a randomly chosen clique-edge is not connected to the giant component. We are interested in the fraction of vertices in the largest component after percolation, S, which can be obtained by [27],

Equation (3)

where h(k, j, π) is the probability that a given vertex of a k-clique is still connected to j other vertices of the clique after percolation with probability π. These implicit equations are in general difficult to solve [19, 22], so that it is difficult to make general observations on the solution of these equations. Therefore, we here focus on an approximation of S, first for large component sizes (π large), and then for small ones (approximating the critical value where S becomes larger than zero). In these approximations, we will assume that the number of connections of a vertex is large.

When the degree of a typical vertex is large, the probability that a randomly chosen clique-edge is not connected to the giant component becomes small, as it is likely to lead to a large-degree vertex. As u denotes the probability that a randomly chosen clique-edge does not lead to the giant component, we expand (3) with a first-order Taylor expansion around u = 0. This yields

Equation (4)

For a clique vertex to be disconnected from the rest of the clique, k − 1 edges need to be absent, so that $h(k,0,\pi)=(1-\pi)^{k-1}$. For a clique vertex to be connected to only one other clique vertex, there are k − 1 choices for this other vertex, and then the connections from these two vertices to the other k − 2 vertices need to be absent, giving $h(k,1,\pi)=(k-1)\pi(1-\pi)^{2(k-2)}$. Filling in these yields

Equation (5)

This results in

Equation (6)

Using a first order Taylor expansion, (3) then yields for S,

Equation (7)

where ⟨s⟩ again denotes the average number of cliques a vertex is part of.
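The closed forms for h(k, 0, π) and h(k, 1, π) used in the derivation above can be checked by brute force for small cliques. The sketch below enumerates every subset of edges of a k-clique and records how many other vertices remain connected to a fixed vertex; the function name `h_bruteforce` is hypothetical, and the enumeration is exponential in the number of clique edges, so it is only practical for small k.

```python
from itertools import combinations, product


def h_bruteforce(k, pi):
    """Return [h(k, 0, pi), ..., h(k, k-1, pi)] by enumerating all edge
    subsets of the complete graph on k vertices (exponential in k)."""
    edges = list(combinations(range(k), 2))
    h = [0.0] * k
    for kept in product([False, True], repeat=len(edges)):
        n_kept = sum(kept)
        prob = pi ** n_kept * (1 - pi) ** (len(edges) - n_kept)
        # Adjacency lists of the percolated clique.
        adj = {v: [] for v in range(k)}
        for (u, v), keep in zip(edges, kept):
            if keep:
                adj[u].append(v)
                adj[v].append(u)
        # Depth-first search from vertex 0.
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        h[len(seen) - 1] += prob  # vertex 0 is connected to len(seen) - 1 others
    return h


k, pi = 4, 0.3
h = h_bruteforce(k, pi)
print(h[0], (1 - pi) ** (k - 1))                       # h(k, 0, pi) vs closed form
print(h[1], (k - 1) * pi * (1 - pi) ** (2 * (k - 2)))  # h(k, 1, pi) vs closed form
```

Each printed pair should agree up to floating-point rounding, and the entries of `h` sum to one.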

Now $g((1-\pi)^{k-1})=g_D(1-\pi)$, where $g_D(x)=\sum_k p_k x^k$ is the generating function of the vertex degrees from (1). This means that for a given degree distribution D, the leading order term of the approximation of the largest component size does not depend on the size of the cliques into which the vertex degrees are split. We show in appendix B that the numerator of the second term also depends only on the degree distribution, not on the clique structure. Furthermore, ${g}_{p}^{\prime }({(1-\pi )}^{k-1})$ decreases when the network degrees increase. Thus, large giant components become asymptotically independent of the clique structures in the networks.
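The identity between the clique-degree and vertex-degree generating functions follows because a vertex in i k-cliques has degree i(k − 1). A small numerical check, using a made-up clique-degree distribution `q` (the values are purely illustrative and not taken from the paper):

```python
def g(q, x):
    """Generating function of the clique degrees: sum_i q_i x^i."""
    return sum(q_i * x ** i for i, q_i in q.items())


def degree_distribution(q, k):
    """A vertex in i k-cliques has degree i * (k - 1)."""
    return {i * (k - 1): q_i for i, q_i in q.items()}


def g_D(p, x):
    """Generating function of the vertex degrees: sum_d p_d x^d."""
    return sum(p_d * x ** d for d, p_d in p.items())


q = {1: 0.2, 2: 0.5, 3: 0.3}  # hypothetical clique-degree distribution
k, pi = 4, 0.7
lhs = g(q, (1 - pi) ** (k - 1))
rhs = g_D(degree_distribution(q, k), 1 - pi)
print(lhs, rhs)  # the two values agree up to rounding
```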

Example: regular degrees. We now apply our approximations to several frequently used degree distributions. In regular networks, every vertex is part of s k-cliques. Then, $g(x)=x^s$ and $g_p(x)=x^{s-1}$, so that (7) becomes

Equation (8)

Now s(k − 1) is the degree of a vertex. Equation (8) therefore shows that fixing the degree of a vertex, and changing k (by decreasing or increasing s) does not influence the leading term for the giant component size S. Furthermore, the larger s, so the larger the average degree of a vertex, the more dominant the first term becomes. Thus, the larger the degree of a vertex, the smaller the influence of the clique structure of the network on percolation processes.

In particular, fixing the degree of a vertex at s(k − 1) = d and investigating the difference between choosing cliques of size k = i or k = j yields

Equation (9)

Thus, by making d larger, it is always possible to get ${S}_{{K}_{i}}-{S}_{{K}_{j}}$ arbitrarily small. This indeed shows that when the average degree of a network is large, the influence of the clique structure of the model becomes irrelevant.

Figure 2 shows the behavior of the approximation of (8) for three networks: one consisting only of edges (the standard configuration model), one only of triangle-edges, and one only of $K_4$-edges. We see that the approximation of (8) works well for all networks when S is large. Furthermore, for small average degree (figure 2(a)), the size of the largest component under percolation differs more between $K_3$ and $K_4$ than between $K_2$ and $K_3$, while these differences have washed away for higher average degree (figure 2(b)). In the simulations, the double edges, overlapping cliques and self-loops that the configuration-model-like construction of the network can create have been removed. However, these events are sufficiently rare not to affect the limiting degree distribution when the network size tends to infinity [17].
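A simulation of the kind described above can be sketched with a union-find structure: every edge is kept independently with probability π, and the size of the largest resulting component is recorded. This is a minimal sketch and not the authors' code; the function name `largest_component_fraction` and its signature are assumptions made for illustration.

```python
import random


def largest_component_fraction(n, edges, pi, seed=0):
    """Keep every edge independently with probability pi and return the
    fraction of the n vertices in the largest connected component."""
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(v):
        # Find the root of v, compressing the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if rng.random() < pi:  # the edge survives percolation
            ru, rv = find(u), find(v)
            if ru != rv:
                if size[ru] < size[rv]:
                    ru, rv = rv, ru
                parent[rv] = ru  # attach the smaller tree to the larger one
                size[ru] += size[rv]
    return max(size[find(v)] for v in range(n)) / n
```

On a ring of 10 vertices, for example, π = 1 keeps every edge and returns 1.0, while π = 0 removes every edge and returns 0.1.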

Figure 2. Size of the largest component after percolation on networks with only clique-edges of given size. The solid line presents the analytical value of S obtained from solving (3), the dashed line is its approximation from (8), circles are obtained by simulations on N = 10 000 vertices, and the cross indicates the approximation of the critical percolation value from (16).

Example: Poisson degrees. Under a Poisson degree distribution where every vertex is part of on average s k-cliques, the generating functions of (3) become $g_p(x)=g(x)=e^{s(x-1)}$. Then, (7) becomes

Equation (10)

Figure 3 shows the behavior of the approximation of (10) for four networks: one consisting only of edges, one only of triangle-edges, one only of $K_4$-edges and one only of $K_5$-edges. We see that for these Poisson degree distributions, the differences between the large component sizes are well approximated by (10), but that these final sizes still differ quite a bit even for large average degrees. This is caused by the fact that for Poisson clique-degrees, the degree distributions of the different clique sizes are not the same.

Figure 3. Size of the largest component after percolation on Poisson networks with only clique-edges of one specified size. The solid line presents the analytical value of S obtained from solving (3), the dashed line is its approximation from (10), circles are obtained by simulations on N = 10 000 vertices, and the cross indicates the approximation of the critical percolation value from (17).

Indeed, if we focus on the 2-clique case, a vertex can have degree 0, 1, 2, ... when its degree is sampled from a Poisson degree distribution. However, a vertex that is part of triangles, can only have degrees 0, 2, 4, 6, ..., when the number of triangles is sampled from a Poisson distribution. In general, a vertex that is only part of k-cliques can only have degrees 0, k − 1, 2(k − 1), .... Even when the average values of the Poisson distributions are tuned as λ/(k − 1) to make sure that, on average, all vertices have the same average number of connections, the degree distributions are not the same. This makes the leading order term in (10) different for different clique sizes. In particular, the probability of having zero connections increases, which makes the final component size smaller when the clique size increases.
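The mismatch between the degree distributions can be made concrete with a short calculation. The sketch below evaluates the vertex-degree distribution when the number of k-cliques per vertex is Poisson with mean λ/(k − 1); the function name `degree_pmf` and the value λ = 6 are chosen here for illustration only.

```python
import math


def degree_pmf(d, lam, k):
    """P(vertex degree = d) when the number of k-cliques of a vertex is
    Poisson with mean lam / (k - 1), so that the mean degree is lam."""
    if d % (k - 1) != 0:
        return 0.0  # only multiples of k - 1 are possible degrees
    i = d // (k - 1)
    mean = lam / (k - 1)
    return math.exp(-mean) * mean ** i / math.factorial(i)


lam = 6  # illustrative mean degree
for k in (2, 3, 4, 5):
    # Probability of an isolated vertex, exp(-lam / (k - 1)), grows with k.
    print(k, degree_pmf(0, lam, k))
```

All four networks have the same mean degree, yet the probability of an isolated vertex grows with the clique size, in line with the argument above.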

To overcome this problem, we now generate networks with different clique sizes with the same degree distribution. We do this by generating the K4 network by sampling a Poisson random variable for each vertex, which we multiply by 2. This is the K4 degree for each vertex. For the K3 network, we sample a Poisson random variable with the same mean for each vertex, which we multiply by 3. This is the K3 degree for each vertex. For the edge-network we again sample a Poisson random variable with the same mean for each vertex, which we multiply by 6. This is the edge-degree for each vertex. Now, in all three networks, vertices can only have degrees 0, 6, 12, ..., and the degree distribution across the three networks is the same. Figure 4 shows the results on percolation on these types of networks. We see that in this case, the percolation curves of these Poisson networks of different clique sizes completely overlap, even while the average degree in this setting is only 6. Thus, the difference between networks of different clique structures under Poisson degree distributions reported in figure 3, but also in [10, 19, 27], does in fact not seem to be caused by the clique structure of the network, but by the fact that the degree distributions of the networks are different, changing the leading order term in (7).
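The matching construction just described can be sketched as follows. The helper names `sample_poisson` and `matched_clique_degrees` are hypothetical, and the Poisson sampler is Knuth's multiplication method, swapped in here only to keep the example free of external libraries.

```python
import math
import random


def sample_poisson(rng, lam):
    """Knuth's multiplication method; adequate for small lam."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def matched_clique_degrees(n, lam, k, seed=0):
    """Clique degrees for a network of k-cliques, k in {2, 3, 4}, scaled so
    that every vertex degree s_i * (k - 1) is a multiple of 6."""
    multiplier = 6 // (k - 1)  # 6 for edges, 3 for triangles, 2 for K4
    rng = random.Random(seed)
    return [multiplier * sample_poisson(rng, lam) for _ in range(n)]


for k in (2, 3, 4):
    s = matched_clique_degrees(10_000, 1.0, k)
    degrees = [(k - 1) * s_i for s_i in s]
    print(k, sum(degrees) / len(degrees))  # mean degree close to 6 for every k
```

Reusing the same seed for all three values of k gives identical degree sequences, which is stricter than the text requires: independent samples with the same mean already give the same degree distribution.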

Figure 4. Size of the largest component after percolation on networks with a Poisson degree distribution with λ = 1, adjusted so that every network has the same degree distribution and average degree 6, where every vertex is only part of clique-edges of specified size. The solid line presents the analytical value of S obtained from solving (3), the dashed line is its approximation from (7), circles are obtained by simulations on N = 10 000 vertices, and the cross indicates the approximation of the critical percolation value from (14).

For these Poisson networks, figure 4 shows that the percolation threshold and percolation behavior almost perfectly overlap for cliques of size 2, 3 and 4, in contrast with the results on the regular graph of figure 2(a), where a higher average degree was necessary for convergence. This faster convergence in the Poisson setting can be ascribed to the fact that for the Poisson distribution, the excess degree distribution when arriving at a vertex through an edge is the same as the degree distribution. For general distributions this is not the case. Arriving at a vertex through a clique of size k already uses up k − 1 edges. Thus, for general distributions, the larger k, the lower the typical number of new cliques that will be reached from this vertex. This effect is particularly large when k is large and the average degree is small. For Poisson distributions, on the other hand, this effect vanishes because the excess distribution is the same as the original degree distribution, so that the overlap already appears for lower average degrees.

Example: power-law degrees. For networks with power-law degrees, we can follow the same approach as for the Poisson networks. We generate power-law random variables, multiply them by 2 for the $K_4$ network, by 3 for the triangle-network, and by 6 for the edge-network to ensure that all networks have the same degree distribution. Using that $g(x)=\mathrm{Li}_{\tau}(x)/\zeta(\tau)$ is the generating function of a power-law random variable with exponent τ, we can again find the approximation of the largest component size under percolation for large π from (7). Figure 5 shows that also for power-law random networks, the component sizes of the different networks are similar.

Figure 5. Size of the largest component after percolation on networks with a power-law degree distribution with exponent τ = 3.5 and average degree 7.1 where every vertex is only part of clique-edges of a specified size. The solid line presents the analytical value of S obtained from solving (3), the dashed line is its approximation from (7), circles are obtained by simulations on N = 10 000 vertices, and the cross indicates the approximation of the critical percolation value from (14).

3.1. Approximation of πc for general clique-degree distributions

We now turn to investigating the similarity of small component sizes. In particular, we approximate the critical percolation value πc. The critical value πc is obtained when the average number of neighbors of a vertex reached by following a randomly chosen edge after percolation equals one. Thus,

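$$\frac{\langle s^2\rangle-\langle s\rangle}{\langle s\rangle}\sum_{j=1}^{k-1} j\,h(k,j,\pi_{\mathrm{c}})=1\qquad (11)$$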

where $\frac{\langle {s}^{2}\rangle -\langle s\rangle }{\langle s\rangle }$ equals the average number of k-cliques connected to a vertex reached from an arbitrary k-clique.
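For small cliques, the critical value can also be found numerically without any truncation in π. The sketch below assumes that the criticality condition described in words above reads $\frac{\langle s^2\rangle-\langle s\rangle}{\langle s\rangle}\sum_j j\,h(k,j,\pi_{\mathrm{c}})=1$, evaluates the sum by brute-force enumeration of the percolated clique, and solves for $\pi_{\mathrm{c}}$ by bisection. The function names `mean_reached` and `critical_pi` are hypothetical.

```python
from itertools import combinations, product


def mean_reached(k, pi):
    """Expected number of other clique members still connected to a given
    vertex of a k-clique after bond percolation: sum_j j * h(k, j, pi)."""
    edges = list(combinations(range(k), 2))
    total = 0.0
    for kept in product([False, True], repeat=len(edges)):
        n_kept = sum(kept)
        prob = pi ** n_kept * (1 - pi) ** (len(edges) - n_kept)
        seen, stack = {0}, [0]
        while stack:  # depth-first search from vertex 0 over kept edges
            u = stack.pop()
            for (a, b), keep in zip(edges, kept):
                if keep and u in (a, b):
                    w = b if a == u else a
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
        total += prob * (len(seen) - 1)
    return total


def critical_pi(k, branching, tol=1e-10):
    """Bisection for branching * mean_reached(k, pi) = 1, where
    branching = (<s^2> - <s>) / <s>. Assumes branching * (k - 1) > 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if branching * mean_reached(k, mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Regular networks of degree 6: s = 6 edges or s = 3 triangles, branching = s - 1.
print(critical_pi(2, 5), critical_pi(3, 2))
```

For k = 2 the sum reduces to π, so the first value recovers the familiar configuration-model threshold 1/5; the triangle network of the same degree has a slightly larger threshold.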

For large average degrees, the critical percolation value is achieved at small π. Therefore, we only keep terms of order $\pi^2$ or less. The only terms in the summation above with terms of order $\pi^2$ or less are h(k, 1, π) and h(k, 2, π), as reaching 3 or more other vertices in a clique requires at least 3 edges to be present, giving a contribution of at least $\pi^3$. By filling in $h(k,1,\pi)=(k-1)\pi(1-\pi)^{2(k-2)}$ and $h(k,2,\pi)=(k-1)(k-2)\left(3\pi^2(1-\pi)^{3(k-3)+1}+\pi^3(1-\pi)^{3(k-3)}\right)$, we approximate

Equation (12)

Keeping only the terms of order ${\pi }_{\mathrm{c}}^{2}$ or less gives

Equation (13)

This is a quadratic equation that has its positive solution at

Equation (14)

When the average degree, and therefore also $\frac{\langle {s}^{2}\rangle -\langle s\rangle }{\langle s\rangle }(k-1)$, becomes large, we use a first order Taylor expansion of $\sqrt{1+1/x}$ for large x. Then, πc can be approximated by

Equation (15)

The term in the denominator describes the average number of vertices reached by coming from a randomly chosen clique-edge, without percolation. When we compare two networks with different clique structures but with the same degree distribution, $\frac{\langle {s}^{2}\rangle }{\langle s\rangle }(k-1)$ is the same for the different networks. Furthermore, this quantity is increasing in the average degree. Thus, in the large average degree-regime πc converges to a value that is independent of the clique structure of the network.

Regular networks. In networks where every vertex is connected to s k-cliques, we can reduce (14) in the following way. Using that the degree of a vertex d = s(k − 1), (14) becomes

Equation (16)

Thus, when d increases, $\pi_{\mathrm{c}}$ approaches the same value for all clique sizes k. Furthermore, the larger k, the larger the change in $\pi_{\mathrm{c}}$ when k is increased by one. Figure 2 shows the approximated value of $\pi_{\mathrm{c}}$ from equation (16) versus the analytical values of the giant component sizes. We see that already for an average degree of 6 this approximation is quite good, and that for a larger average degree of 12, the values of $\pi_{\mathrm{c}}$ for the networks of triangles and $K_4$ cliques indeed almost overlap.

Poisson networks. In Poisson networks where the average vertex is part of s k-cliques with fixed d = s(k − 1), $\frac{\langle {s}^{2}\rangle -\langle s\rangle }{\langle s\rangle }=s$. Then (14) becomes

Equation (17)

Thus, when d gets large, $\pi_{\mathrm{c}}$ again approaches the same value for all clique sizes k. Figure 3 shows that this is a good approximation of the critical percolation value $\pi_{\mathrm{c}}$, and that for an average degree of 12, these values become very close for different clique sizes.

4. Mixed clique sizes

We now investigate networks where cliques of different sizes are present. By introducing different clique sizes, it is possible to create degree–degree correlations that have often been said to influence the largest component size after percolation [3, 4, 12]. Thus, we now investigate to what extent the introduction of mixed clique sizes influences the size of the largest component.

Under bond percolation of networks where every vertex is part of s1 cliques of size k1, and s2 cliques of size k2 with probability ${q}_{{s}_{1},{s}_{2}}$, the generating-function methodology gives the following results. Let $g(x,y)={\sum }_{{s}_{1},{s}_{2} > 0}{\enspace p}_{{s}_{1},{s}_{2}}{x}^{{s}_{1}}{y}^{{s}_{2}}$ be the generating function of the clique degrees. Furthermore, let

Equation (18)

Equation (19)

with

being the generating functions of the number of cliques that are reached by following a randomly chosen clique-edge. Let u denote the probability that a randomly chosen k1-clique-edge is not connected to the giant component. Similarly, let v denote the probability that following a randomly chosen k2-clique edge does not lead to the largest component.

We show in appendix A that u and v can be approximated by

Equation (20)

Furthermore, the giant component size S is then approximated as

Equation (21)

where

and where gD (x) is the generating function of the total vertex degrees. Thus, the leading order term of the giant component size does not depend on the distribution of the clique degrees k1 and k2, but only on the total vertex degree. Furthermore, the numerators of the second order terms also only depend on the degree distribution, and not on the clique sizes, similarly to the one clique-size case.

It is not difficult to extend this analysis to include more than two different clique sizes, where (21) contains terms for all size biased generating functions of the clique sizes ${g}_{{k}_{i}}$, instead of only gp and gq in (21). Therefore, even in the presence of multiple clique sizes that can generate degree–degree correlations, large component sizes are clique-structure independent for large average degrees.

Example: assortative mixing. In several previous works, degree–degree correlations were found to be important for the behavior of percolation processes (see, e.g., [3, 4, 12]). Furthermore, the clustering assortativity, describing the tendency of high-degree vertices to be more clustered than low-degree vertices or vice versa, has also been ascribed strong importance for the behavior of a network under percolation [21]. However, (21) shows that large component sizes only depend on the degree distribution, so that they are independent of any clique correlations in the large degree limit. Figure 6 shows that the influence of mixed clique sizes on the giant component is indeed small, especially in the large average degree regime.

Figure 6. Size of the largest component after percolation on networks with clique sizes of 2 and 4, mixed assortatively or disassortatively, with average degree 7.5 or 15. Here $p_{i,j}$ denotes the probability of having i cliques of size 2 and j cliques of size 4. For the assortative network $p_{6,0}=0.5$, $p_{3,2}=0.25$ and $p_{0,3}=0.25$, while for the disassortative network $p_{3,1}=0.5$ and $p_{3,2}=0.5$. In the high-degree regime, all degrees are doubled. Dashed lines are the approximations from (21), and the cross denotes the approximation of $\pi_{\mathrm{c}}$ from (22).

4.1. Approximation of πc for mixed clique networks

In appendix C we show that πc can be approximated by

Equation (22)

where

Equation (23)

For large average degrees, this value can be approximated by

Equation (24)

In assortative networks, where cliques of a given size are typically also connected to many cliques of the same size, $\left(\frac{\langle {s}_{1}^{2}\rangle }{\langle {s}_{1}\rangle }-1\right)({k}_{1}-1)$ and $\left(\frac{\langle {s}_{2}^{2}\rangle }{\langle {s}_{2}\rangle }-1\right)({k}_{2}-1)$ are large, so that we expect πc to be small. In disassortative networks, where the different clique sizes are more mixed, $\left(\frac{\langle {s}_{1}^{2}\rangle }{\langle {s}_{1}\rangle }-1\right)({k}_{1}-1)$ and $\left(\frac{\langle {s}_{2}^{2}\rangle }{\langle {s}_{2}\rangle }-1\right)({k}_{2}-1)$ are smaller. Thus, the degree–degree correlations that are created by the different clique sizes play a role in the critical percolation value πc, whereas the giant component size of (21) is asymptotically independent of such degree correlations. However, figure 6 shows that these correlations still vanish in the large-degree regime.

5. Phase oscillators coupled on clique-networks

In this section we illustrate the limited effect of higher-order structure on more complex dynamic network processes than bond percolation. In particular, we focus on the dynamics of coupled oscillators. For this purpose, we employ the paradigmatic Kuramoto model [33] that can describe synchronization phenomena on complex networks. In the Kuramoto model, the oscillator of vertex i is characterized by a phase variable θi , and the dynamics on a heterogeneous network is dictated by the following equations [33]:

Equation (25)

where ωi is the natural frequency of oscillation of oscillator i, and Aij is the network adjacency matrix. If there is an edge connecting i and j, Aij = 1 (0 otherwise), and the interaction between the vertices is weighted by the coupling K. If K is lower than a critical value Kc, the oscillators rotate incoherently, each one at its own rhythm set by the natural frequency ωi. For K > Kc, the incoherent state loses stability: a cluster of oscillators is formed around an average phase value, and these units begin to rotate locked at the same frequency [7, 35]. This transition from asynchrony to a partially synchronized state is measured by the Kuramoto order parameter R given by [7, 35]

Equation (26)

where R quantifies the level of synchrony achieved by the oscillators, and ψ is their average phase. While one can monitor the synchronization transition of a heterogeneous network with equation (26), it is not possible to decouple equation (25) in terms of a global order parameter. Instead, in order to perform a self-consistent analysis and characterize the onset of synchronization analytically, we need to employ heterogeneous degree mean-field approximations [33]. This is equivalent to replacing the terms of the adjacency matrix Aij by their ensemble averages in the configuration model, which in the single-edge version is $\langle {A}_{ij}\rangle ={d}_{i}{d}_{j}/N\langle d\rangle $. In the model that generates networks with a single clique type the expression is analogous, namely, $\langle {A}_{ij}^{(c)}\rangle =(k-1){s}_{i}{s}_{j}/N\langle s\rangle $, where si is the number of cliques attached to vertex i. Replacing Aij by $\langle {A}_{ij}^{(c)}\rangle $ in equation (25) we obtain

Equation (27)

which motivates the definition of the following order parameter

Equation (28)

which in turn allows us to rewrite equation (27) as

Equation (29)

In the limit N → ∞, we assume that the assignment of cliques and natural frequencies is well described by distributions qs and g(ω); we further assume that the collections of vertices with clique number s and frequency ω form a phase density ρ(θ, t|s, ω). Thus, we rewrite equation (28) in the continuum limit as

Equation (30)

By choosing $g(\omega )={(\sqrt{2\pi })}^{-1}{\text{e}}^{-{\omega }^{2}/2}$, we can set ϕ = 0 without loss of generality. Substituting the stationary synchronous solution of equation (29)—i.e. ρ(θ|s, ω) = δ{θ − arcsin[ω/Kr(k − 1)s]}, for |ω| ⩽ Kr(k − 1)s—into equation (30), we arrive at the following implicit equation

Equation (31)

where I0(⋅) and I1(⋅) are the modified Bessel functions of the first kind. Thus, with equation (31) we can find the dependence of the order parameter r on the coupling strength K, and thereby assess the impact of different clique sizes on the onset of synchronization. Letting r → 0+, we also obtain the expression for the critical coupling

Equation (32)

Thus, again, substituting d = (k − 1)s shows that the critical coupling is independent of the clustering structure. However, an immediate problem we face is the fact that the mean field approximations behind equation (31) are accurate only for sufficiently dense networks, typically when the average degree is at least of order of a few dozen [11, 33]. This limits the analytical verification of the effect of cliques on the dynamics of networks as sparse as the ones considered in the previous sections. Nonetheless, in the appropriate regime in which the mean field approach is valid, equation (31) suggests that the conclusions drawn for bond percolation may be similar for synchronization processes: notice that in equation (31) (k − 1)s is the actual degree of a vertex; substituting d = (k − 1)s in the implicit equation for r and rewriting it in terms of the new variable, we find that the emergence of a synchronous component depends only on the final degree sequence of the network and not on the sizes of the cliques. Therefore, clustered and unclustered networks are expected to exhibit similar dynamics also in the synchronization of coupled oscillators.
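The substitution d = (k − 1)s can be checked numerically on the two mean-field adjacency expressions quoted earlier in this section. The following is a minimal sketch, assuming a hypothetical clique size k = 4 and hypothetical, randomly drawn clique counts s; it only verifies that the clique-based and degree-based ensemble averages coincide entry by entry.

```python
import numpy as np

rng = np.random.default_rng(0)

k = 4                                # hypothetical clique size
s = rng.poisson(3.0, size=200) + 1   # hypothetical clique counts per vertex (all >= 1)
d = (k - 1) * s                      # vertex degrees implied by the clique counts
N = len(s)

# Clique-based mean field: <A_ij^(c)> = (k - 1) s_i s_j / (N <s>)
A_clique = (k - 1) * np.outer(s, s) / (N * s.mean())

# Degree-based mean field of the configuration model: <A_ij> = d_i d_j / (N <d>)
A_degree = np.outer(d, d) / (N * d.mean())

print(np.allclose(A_clique, A_degree))  # True
```

Because the two matrices are identical, any quantity derived from the mean-field equations can depend on k and s only through the degree sequence d.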

In order to confirm the above result, let us first investigate how the critical point Kc changes according to the clique structure. In figure 7 we compare the predictions of Kc by equation (32) with the corresponding quantities obtained via numerical integration of the system (25) for several average degrees ⟨d⟩. We numerically detect the transition point between incoherence and partial synchronization by identifying Kc as the position of the divergent peak of the susceptibility $\chi =N({\langle {r}^{2}\rangle }_{t}-{\langle r\rangle }_{t}^{2})/{\langle r\rangle }_{t}$ [28], where ⟨⋅⟩t is a long temporal average. As can be seen in figure 7, the agreement between simulation and theoretical values is less satisfactory for low ⟨d⟩, but it progressively improves as the networks get denser. Furthermore, figure 7 indicates that the transition to synchrony tends to occur at lower coupling as the clique size, and hence the clustering, increases. The numerical values of Kc for different clique sizes become statistically equivalent at high ⟨d⟩. Yet, the solutions obtained from equation (32) in figure 7 suggest that clustering always favors network synchrony, an effect that, as seen in figure 7, asymptotically vanishes as ⟨d⟩ increases. This is in apparent contradiction with our analysis of equation (31), according to which networks with the same degree distribution, regardless of their clustering structure, ought to have identical dependence r = r(K) and critical couplings Kc. However, similarly to the experiments of figure 3, these networks of different clique sizes do not have the same degree distributions.
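The susceptibility-based detection of Kc can be sketched as follows. This is an illustration only: it assumes the variance form N(⟨r²⟩ − ⟨r⟩²)/⟨r⟩ for the susceptibility, and the helper names `susceptibility` and `estimate_kc` as well as the input layout (one time series of r per coupling value) are hypothetical.

```python
import numpy as np


def susceptibility(r_series, n_oscillators):
    """chi = N * (<r^2>_t - <r>_t^2) / <r>_t for one time series of the order parameter."""
    r = np.asarray(r_series, dtype=float)
    mean_r = r.mean()
    return n_oscillators * (np.mean(r ** 2) - mean_r ** 2) / mean_r


def estimate_kc(couplings, r_series_per_coupling, n_oscillators):
    """Return the coupling at which the susceptibility is largest."""
    chi = [susceptibility(r, n_oscillators) for r in r_series_per_coupling]
    return couplings[int(np.argmax(chi))]


# Toy input: fluctuations of r are largest at the middle coupling value.
couplings = [0.1, 0.2, 0.3]
series = [[0.1, 0.1], [0.2, 0.4], [0.8, 0.8]]
print(estimate_kc(couplings, series, n_oscillators=100))  # 0.2
```

On a finite grid of couplings the peak position is only as precise as the grid spacing, so a finer grid around the peak would be needed for an accurate estimate.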

Figure 7. Critical coupling Kc for the onset of synchronization as a function of the average degree ⟨d⟩ for the clique-networks with Poisson degree distributions. Solid lines are the solutions of equation (32). Dots correspond to the simulation results obtained by numerically integrating equation (25) using Heun's method with time step dt = 0.05. For each coupling K, the quantities are averaged over t ∈ [500, 1000]. In all numerical experiments we have $N={10}^{4}$ oscillators, whose frequencies are distributed according to $g(\omega )={(\sqrt{2\pi })}^{-1}{\text{e}}^{-{\omega }^{2}/2}$. Different symbols and colors refer to networks constructed with different clique sizes: K2 denotes networks containing only single-edges (configuration model), and K5 refers to networks built from sequences of cliques with five vertices.
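The numerical procedure described in the caption can be sketched as below. This is not the authors' code: it assumes the standard network Kuramoto form dθi/dt = ωi + K Σj Aij sin(θj − θi) for equation (25) and the usual order parameter r = |N⁻¹ Σj exp(iθj)|, and it uses a dense adjacency matrix, which is only practical for small N (a sparse representation would be needed for networks of the size used in the paper). All function names are hypothetical.

```python
import numpy as np


def kuramoto_rhs(theta, omega, A, K):
    """Assumed form of the dynamics: omega_i + K * sum_j A_ij * sin(theta_j - theta_i)."""
    diff = theta[None, :] - theta[:, None]  # diff[i, j] = theta_j - theta_i
    return omega + K * np.sum(A * np.sin(diff), axis=1)


def heun_step(theta, omega, A, K, dt):
    """One Heun (explicit trapezoidal) step: Euler predictor, averaged-slope corrector."""
    f0 = kuramoto_rhs(theta, omega, A, K)
    predictor = theta + dt * f0
    f1 = kuramoto_rhs(predictor, omega, A, K)
    return theta + 0.5 * dt * (f0 + f1)


def order_parameter(theta):
    """Modulus of the mean of exp(i * theta_j) over all oscillators."""
    return float(np.abs(np.mean(np.exp(1j * theta))))


def time_averaged_r(A, omega, K, dt=0.05, t_end=1000.0, t_start=500.0, seed=0):
    """Integrate from random initial phases and average r over t in [t_start, t_end]."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(-np.pi, np.pi, size=len(omega))
    n_steps = int(round(t_end / dt))
    first = int(round(t_start / dt))
    samples = []
    for step in range(n_steps):
        theta = heun_step(theta, omega, A, K, dt)
        if step >= first:
            samples.append(order_parameter(theta))
    return float(np.mean(samples))
```

A call such as `time_averaged_r(A, omega, K=0.5)` with `omega` drawn from a standard normal distribution would give one point of a synchronization diagram; sweeping K then traces out r(K).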


To verify whether the differences in the onset of synchronization shown in figure 7 are due to discrepancies in the degree sequences or are, on the other hand, a true effect of the clustered structure, we repeat the methodology of the experiments depicted in figure 4. That is, we simulate the oscillators on networks with different clique sizes but adjusted to have the exact same degree sequence. The result is shown in figure 8. Again, as in the percolation experiment in figure 4, the synchronization values match almost perfectly despite the difference in the clustering levels: in the examples in figure 8, the single-edge networks (K2) have transitivity coefficient [27] equal to C ≈ 0, while networks with K4 cliques exhibit C ≈ 0.18, a significant structural difference that is not reflected in the dynamics. It is also noteworthy that this phenomenon occurs at a low average degree (⟨d⟩ = 6), i.e., the regime depicted in figure 7 with the most prominent discrepancies between clustered and unclustered networks. As discussed for the percolation case in section 3, those discrepancies are actually due to the fact that the degree distributions are not identical for different clique sizes; as a consequence, the mismatches in the degree sequence end up generating different solutions for equation (31). The results in figures 4 and 8 therefore show that networks with similar degree distributions and correlations may exhibit equivalent dynamical behavior regardless of their subgraph structure and clustering levels.

Figure 8. Synchronization diagram showing the evolution of the order parameter r as a function of the coupling K. The networks have Poisson degree distribution adjusted so that every network has the same degree sequence and average degree ⟨d⟩ = 6, as in figure 4. Dots correspond to the simulation results obtained by numerically integrating equation (25) using Heun's method with time step dt = 0.05. For each coupling K, the order parameters are averaged over t ∈ [500, 1000]. In all numerical experiments we have $N={10}^{4}$ oscillators, and the frequencies are distributed according to $g(\omega )={(\sqrt{2\pi })}^{-1}{\text{e}}^{-{\omega }^{2}/2}$. Different symbols and colors refer to networks constructed with different clique sizes: K2 denotes networks containing only single-edges (configuration model), and K5 refers to networks built from sequences of cliques with five vertices.


We also note that the critical coupling Kc can be estimated via 'quenched' mean-field approximations and, for the parameters considered here, expressed in terms of the largest eigenvalue λ1 of the adjacency matrix as ${K}_{\mathrm{c}}={\lambda }_{1}^{-1}\sqrt{8/\pi }$ [28, 31]. We can complement the latter expression with the recent results of reference [29], in which the largest eigenvalue for Poisson random networks constructed with K3 cliques has been estimated as λ1 = 2⟨s⟩ + 1 + 1/(2⟨s⟩). In the limit of high average degrees, ⟨s⟩ → ∞, the third term of λ1 vanishes, and the corresponding result for tree-like Poisson random networks is recovered (λ1 = ⟨d⟩ + 1). Therefore, also in the quenched mean-field formulation, the value of Kc of clustered networks is expected to asymptotically approach the calculations for tree-like networks, in agreement with the results in figure 7.
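The convergence of the two quenched mean-field estimates can be illustrated with the expressions quoted in this paragraph. The sketch below assumes that a vertex in s triangles (K3 cliques) has degree 2s, so that ⟨d⟩ = 2⟨s⟩; the function names are hypothetical.

```python
import math


def kc_quenched(lambda_1):
    """Quenched mean-field estimate K_c = sqrt(8 / pi) / lambda_1."""
    return math.sqrt(8.0 / math.pi) / lambda_1


def lambda1_k3(mean_s):
    """Largest-eigenvalue estimate quoted for Poisson networks built from K3 cliques."""
    return 2.0 * mean_s + 1.0 + 1.0 / (2.0 * mean_s)


def lambda1_tree(mean_d):
    """Largest-eigenvalue estimate quoted for tree-like Poisson networks."""
    return mean_d + 1.0


for mean_s in (3, 10, 50):
    mean_d = 2 * mean_s  # assumed: each triangle contributes two edges to a vertex
    ratio = kc_quenched(lambda1_k3(mean_s)) / kc_quenched(lambda1_tree(mean_d))
    print(f"<d> = {mean_d:3d}: Kc(clustered) / Kc(tree-like) = {ratio:.5f}")
```

The ratio stays below one, since the extra term 1/(2⟨s⟩) slightly raises λ1 for the clustered network, and it approaches one as ⟨s⟩ grows.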

6. Conclusion

In this paper, we have investigated the influence of the presence of clustered structures in a random graph model in the form of cliques on two network processes: bond percolation and synchronization. Percolation on such clustered networks has been investigated frequently, but as the equations for the giant component size under percolation are given by several implicit equations that are difficult to analyze mathematically, the factors dominating the behavior of percolation processes on such networks are largely unknown. By approximating the size of the giant component under large percolation probabilities as well as the critical percolation value where a giant component starts to form, we have found that the degree distribution is the dominant factor in these approximations, especially when the average degree of the network is large. In particular, our approximations are independent of the amount of clustering in the network. This means that introducing clustering by locally inserting cliques or other types of subgraphs in the frequently used locally tree-like random graph models, barely influences the size of the largest component.

We also showed that differences in percolation behavior due to the introduction of cliques in the configuration model that were found in several previous works [10, 19, 21, 27, 36] can be ascribed to the fact that the degree distribution changed in those experiments as well. When keeping the degree distribution fixed while introducing more clustering, this difference becomes small. We also proved that the small remaining differences in percolation behavior vanish when the average degree becomes large.

While our approximations show that the dominant factor for the giant component under large percolation probabilities as well as the critical percolation value is the degree distribution, and not the clustering in the network, our simulations show that actually the entire percolation curve seems to become independent of the clustering in the network, once the average degree becomes large. Showing this analytically would be an interesting point for further research.

Furthermore, while we have primarily focused on the process of bond percolation, we also showed that for a different network process of oscillator synchronization, the same independence of higher-order structures is present when the average degree is large. We therefore believe that other processes such as opinion dynamics or contact processes could be independent of the clustering structure of this model as well. Importantly, while we have shown that the size of the giant component and the stationary synchronous states remain unaffected, we note that the transient properties of disease spreading processes may indeed be influenced by the clustering level [15, 37]. Investigating which types of dynamics are independent of the clique structures is therefore an interesting avenue for further research.

In this manuscript, we investigated the percolation properties of a configuration-model type random graph model with enhanced clustering in the form of cliques that do not overlap in their edges. In general, however, real-world networks also contain cliques that do have edge-overlaps, or clustered subgraphs that are not necessarily cliques. We believe that our results can be extended to such more realistic settings by using the extension of the random graph model we considered to include arbitrary subgraphs as well [19], and that our Taylor-expansion-based approach still works in this setting with more general subgraphs. In particular, we conjecture that these results also hold for such models with overlapping cliques, where the overlap forms a higher-order locally tree-like structure. Indeed, cliques that overlap at one or more specific edges can then be defined as one subgraph that is included in the model, so that it explicitly includes overlapping cliques. Identifying the exact model conditions under which clustering does not affect the percolation properties would therefore be an interesting line of further research. Furthermore, there is some evidence that clustering properties also barely affect the dynamical processes of real-world networks [11, 23]. It would be interesting to see if such model classifications can also predict for which real-world networks clustering is an important property when investigating network dynamics, and for which ones it is not.

The random graph model we considered connects cliques in a locally tree-like fashion. In other types of models, percolation on locally tree-like networks has been investigated by contracting cliques (or other subgraphs) in a specific way to one or more vertices, to create a reduced tree [6, 17]. Analyzing percolation on this reduced tree then gives the percolation behavior of the clustered network. However, this reduced tree in general has a different degree distribution, and sometimes even a different number of vertices, than the original network. As the percolation behavior of locally tree-like networks strongly depends on the degree distribution of the network, this means that this method will still not show the similarity of the process on clustered and non-clustered networks. Furthermore, in the model we investigate, contracting cliques to single vertices is more difficult, as cliques may touch at a single vertex. Still, we believe that the similarity of the two processes on clustered and non-clustered networks has to do with the higher-order locally tree-like structure that is present.

Another interesting line of research following from these results is in higher-order dynamics. We showed that single-edge dynamics on networks where a clique structure is imposed behave similarly to those on networks without the clique structure. However, when studying the network model as, for example, a simplicial complex instead, it is possible to impose simplicial dynamics on top of it, where the dynamics involve all clique vertices in the interactions. It would be interesting to see under which conditions the dynamic process on such a simplicial extension of this model depends on the clique structure, and under which conditions it does not.

Finally, while this work shows that inserting clustering in a locally tree-like model barely affects the behavior of an epidemic process under bond percolation, we believe that in different models, where clustering is introduced by the presence of geometry, or imposed by maximum-entropy based constraints [8, 20], bond percolation can behave very differently under two models of the same degree distribution. Showing general conditions on the network structure under which clustering does or does not affect the size of a giant component compared to tree-like network models would therefore also be an interesting avenue for further research.

Another interesting question is to identify the conditions on the clique sizes such that clustering does not affect the percolation processes. In this work, we investigated fixed maximal clique sizes, and showed that their effect is small when the average degree becomes large. Thus, an interesting question would be: how large can the clique sizes become compared to the average degree so that these results still hold? Identifying this threshold scaling would be an interesting avenue for further research.

Acknowledgments

CS acknowledges VENI grant 202.001, and TP acknowledges FAPESP (Grant No. 2016/23827-6). This research was carried out using the computational resources of the Center for Mathematical Sciences Applied to Industry (CeMEAI) funded by FAPESP (Grant No. 2013/07375-0).

Data availability statement

No new data were created or analysed in this study.

Appendix A.: Computations for the mixed clique sizes

After bond percolation with probability π [27],

Equation (A.1)

while $S=1-g({\sum }_{j=1}^{{k}_{1}-1}h({k}_{1},j,\pi ){u}^{j},{\sum }_{j=1}^{{k}_{2}-1}h({k}_{2},j,\pi ){v}^{j})$. Again, we expand (A.1) with a first-order Taylor expansion around u, v = 0. This yields

Equation (A.2)

Taylor expanding gp (x, y) and gq (x, y) as well and using that uv is small, we obtain

Equation (A.3)

This is a linear system of equations with solution

Equation (A.4)

where

Equation (A.5)

This can be approximated by

Equation (A.6)

This gives for the final component size

Equation (A.7)

This can be written as

Equation (A.8)

where gD (x) is the generating function of the total vertex degrees.

Appendix B.: Equality of numerator second term

We now show that the numerator of the second approximating term in (7) only depends on the degree distribution of the random graph, but not on its clique structures.

Equation (B.1)

where D* is the size-biased degree distribution. Thus,

Equation (B.2)

which is independent of the clique size k, and only depends on the degree distribution.

Appendix C.: Approximating πc for mixed clique sizes

The average number of ki vertices reached from a kj -clique vertex for i, j ∈ {1, 2}, equals

Equation (C.1)

where ${\delta }_{{k}_{i},{k}_{j}}$ is the Kronecker delta. Thus, the matrix M is a branching matrix that describes the average number of vertices of type ki attached to a randomly chosen clique-edge of type kj . The average number of vertices at generation j of the offspring distribution can be expressed in terms of Mj . Therefore, if the largest eigenvalue of M becomes larger than one, a giant component forms [19].
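The eigenvalue criterion can be illustrated numerically. The sketch below does not evaluate equation (C.1); it uses two hypothetical 2 × 2 mean-offspring matrices purely to show how the largest eigenvalue of a branching matrix decides whether the expected generation sizes, governed by powers of M, decay or grow. The function name `giant_component_forms` is hypothetical.

```python
import numpy as np


def giant_component_forms(M):
    """Return the spectral radius of M and whether it exceeds one."""
    eigenvalues = np.linalg.eigvals(np.asarray(M, dtype=float))
    rho = float(np.max(np.abs(eigenvalues)))
    return rho, rho > 1.0


# Hypothetical branching matrices (illustrative values only).
M_sub = [[0.4, 0.2], [0.3, 0.5]]    # eigenvalues 0.7 and 0.2: subcritical
M_super = [[1.2, 0.4], [0.3, 0.9]]  # largest eigenvalue above one: supercritical

for name, M in (("subcritical", M_sub), ("supercritical", M_super)):
    rho, forms = giant_component_forms(M)
    # Total expected offspring in generation 10, summed over both vertex types.
    gen10 = np.linalg.matrix_power(np.asarray(M), 10).sum()
    print(f"{name}: largest eigenvalue = {rho:.4f}, giant component: {forms}, "
          f"generation-10 mass = {gen10:.4g}")
```

In the subcritical example the generation sizes shrink geometrically, while in the supercritical one they grow, which is the branching-process picture behind the threshold condition.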

As before, we approximate the solution by a second-order polynomial in π, assuming that for large average degrees the critical percolation value is small. Therefore, similarly to the analysis in section 3.1, we only keep the terms h(ki , 1, π) and h(ki , 2, π). Then, the condition on the largest eigenvalue of M becomes [19]

Equation (C.2)

where

Equation (C.3)

Keeping only second order terms in π yields

Equation (C.4)

The positive solution of this equation is

Equation (C.5)
