The simplified self-consistent probabilities method for percolation and its application to interdependent networks

Interdependent networks, in areas ranging from infrastructure to economics, are ubiquitous in our society, and the study of their cascading behaviors using percolation theory has attracted much attention in recent years. To analyze the percolation phenomena of these systems, different mathematical frameworks have been proposed, including generating functions, eigenvalue analysis, and others. These frameworks approach phase transition behaviors from different angles and have been very successful in characterizing the quantities of interest, including the critical threshold, the size of the giant component, the order of the phase transition, and the dynamics of cascading. These methods also vary in their mathematical complexity in dealing with interdependent networks that have additional complexity in terms of the correlation among different layers of networks or links. In this work, we review a particular approach based on simple, self-consistent probability equations, and we illustrate that this approach can greatly simplify the mathematical analysis for systems ranging from a single-layer network to various interdependent networks. We give an overview of the detailed framework for studying the nature of the critical phase transition, the value of the critical threshold, and the size of the giant component for these different systems.


Introduction
Systems consisting of multiple interconnected networks with different types of links have received enormous attention in recent years [1][2][3][4][5][6][7][8][9] due to their ubiquitous applications in complex systems. Such networks appear in the literature as interdependent networks or multiplex networks. Studies have shown that interdependent networks exhibit different percolation/phase transition behaviors than single networks. In particular, an interdependent network is more vulnerable to random attacks [10]. As many real-world infrastructure networks can be classified as interdependent networks [11][12][13][14], the understanding of their robustness carries great practical significance.
In a network consisting of links and nodes, one of the most important quantities used to analyze its robustness is the size of the giant component, which is defined as the largest set of nodes that are connected to each other. When a network is under attack (i.e., a fraction 1 − p of nodes (or links) is removed), the size of the largest cluster shrinks. Usually its size remains a finite fraction of the total number of nodes in the network, unless more than a certain fraction, 1 − p_c, of nodes is removed, beyond which the giant component disappears. The analytical frameworks developed for this problem range from generating-function formulations to very complicated formulations that are not always easy to solve. To study different network constructions, some of the other studies used different methods to achieve a relatively simpler analytical framework. In particular, the works in [7,8,[19][20][21][22] used self-consistent equations of the converging probabilities to provide an alternative approach to analyzing the critical behaviors of certain types of interdependent networks. Some of these methods can be extended to other scenarios.
In this paper, we illustrate the use of one particular technique based on self-consistent probabilities [8,[19][20][21], and we demonstrate, by surveying the literature in this field, that this technique can be applied to a wide variety of different interdependent networks with minimal mathematical complexity. This method focuses on the recursive representation of two central quantities, defined as the probabilities of finding a link/node in the giant component. It is able to give a set of straightforward self-consistent equations describing the percolation behaviors without going through the cascading process [1], and it also deals with many correlated systems with simpler mathematical formulations.
First we will illustrate the framework through the example of a single-layer network. Next, we will extend it to multilayer networks without degree-degree correlations. Following that, we extend the analysis to more complicated scenarios of partially correlated networks and degree-degree correlated networks. Further complications are added when multiple dependency links per node are introduced along with correlations, and when a single network has different types of links, which is also known as a multiplex network.

Single-layer network
The classic site-percolation problem in a random network [23,24,28,29] gives rich phase transition phenomena for various network structures. In the simplest case, we consider a random network without any correlations, so its degree distribution, P(k), fully captures its structural properties. We start by introducing a key quantity, x, into the system. This x will be similarly defined throughout this work and will play a central role in the mathematical analysis. If we randomly choose a link from the network and travel along one direction of the link, there is a probability, x, that it leads to the giant component of the network, and a probability, 1 − x, that it does not. (See figure 1 for an illustration.) Suppose we randomly choose a link and find an arbitrary node, u, by following this link in an arbitrary direction. The probability that the node, u, has degree k is

k P(k)/⟨k⟩. (1)

For this node u to be part of the giant component, at least one of its other k − 1 outgoing links (other than the link we first picked) must lead to the giant component. By calculating this probability, we can write out the self-consistent equation for x:

x = Σ_k [k P(k)/⟨k⟩] [1 − (1 − x)^(k−1)], (2)

where 1 − (1 − x)^(k−1) is the probability that at least one of the other k − 1 links of node u leads to the giant component, and k P(k)/⟨k⟩ is the probability that the node u has degree k. Therefore, for a randomly chosen node u, the probability that it is in the giant component is equal to the probability that at least one of its k links leads to the giant component. Thus we have:

μ∞ = Σ_k P(k) [1 − (1 − x)^k], (3)

where (1 − x)^k is the probability that none of the k links of node u leads to the giant component, and P(k) is the probability that node u has degree k. It is worth noting that μ∞ is also the normalized size of the giant component (i.e., the fraction of nodes in the giant component). The above equations exactly equal the results obtained by Newman et al in [28].
In the network percolation problem, when a fraction of 1 − p nodes is randomly removed from the network [28,29] (i.e., a fraction p of nodes remains), we can apply the previous equations with slight modifications. Assuming that the links of the removed nodes are still present in the network, the probability that a randomly selected link leads to a node of degree k is the same as before and is given by equation (1). But since only a fraction, p, of the nodes remains in the network, by calculating the probability that the randomly chosen link does not lead to the giant component, the self-consistent equation of x in equation (2) becomes:

x = p Σ_k [k P(k)/⟨k⟩] [1 − (1 − x)^(k−1)], (4)

where 1 − (1 − x)^(k−1) is the probability that at least one of the k − 1 outgoing links of node u leads to the giant component, and k P(k)/⟨k⟩ is the probability that u has degree k, which is the same as before. The additional factor, p, in front is due to the fact that only a fraction p of nodes remains in the network after removing a fraction 1 − p. Similarly, the probability that a randomly selected node is in the giant component is:

μ∞ = p Σ_k P(k) [1 − (1 − x)^k]. (5)

It is known that in a single network, we usually only have second-order phase transitions, such that there is no giant component when p is smaller than a critical probability, p_c. Above the threshold, the giant component appears and its size increases continuously from 0 with increasing p. This means that when p → p_c^II, we have x → 0 and μ∞ → 0. When x → 0, by taking the Taylor expansion of equation (4), we obtain:

p_c^II = ⟨k⟩/(⟨k²⟩ − ⟨k⟩), (7)

which is exactly the analytical result discovered in [24]. When p = 1 (i.e., the network without attack), this is also in agreement with the results in [27]. For the Erdős–Rényi network, equation (7) yields p_c^II = 1/⟨k⟩. For a scale-free network with γ < 3, ⟨k²⟩ diverges, so we would obtain p_c = 0, which is also in agreement with the known result [25,26].
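The equations above are easy to evaluate numerically. As a minimal illustrative sketch (not from the original literature), assuming an Erdős–Rényi network, for which the excess-degree sum Σ_k [k P(k)/⟨k⟩][1 − (1 − x)^(k−1)] reduces to 1 − e^(−⟨k⟩x), equation (4) can be solved by fixed-point iteration:

```python
import math

def giant_component_er(p, k_mean, iters=5000):
    """Solve x = p * (1 - exp(-<k> x)) (eq. (4) specialized to a Poisson
    degree distribution) by fixed-point iteration, and return x together
    with the normalized giant-component size mu from eq. (5).

    For a Poisson distribution the two generating functions coincide,
    so mu = p * (1 - exp(-<k> x)) equals x itself."""
    x = 1.0
    for _ in range(iters):
        x = p * (1.0 - math.exp(-k_mean * x))
    mu = p * (1.0 - math.exp(-k_mean * x))
    return x, mu
```

For ⟨k⟩ = 4 the predicted threshold is p_c^II = 1/⟨k⟩ = 0.25: above it (e.g., p = 0.5) the iteration converges to a finite giant component, while below it (e.g., p = 0.2) the solution collapses to zero.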
The above system is based on node percolation, in which nodes are randomly removed until the giant component disintegrates. An alternative scenario is link (bond) percolation, in which links are randomly removed from the network. In this case, we still have the same definition for x, and its equation remains the same as equation (4), because a randomly selected link has the probability, p, of remaining in the network after removing a fraction 1 − p of links. The only difference is in equation (5), in which we need to remove the factor p on the right side:

μ∞ = Σ_k P(k) [1 − (1 − x)^k]. (8)

This is due to the fact that all of the nodes remain in the network in link percolation, unlike node percolation, in which only a fraction p remains. Hence we obtain the same p_c^II value for both node and link percolation, but different μ∞ values.
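The relation between equations (5) and (8) can be checked directly: the same x solves both problems, and the two giant-component sizes differ exactly by the factor p. A small sketch (my own illustration, again assuming an Erdős–Rényi network with ⟨k⟩ = 4):

```python
import math

def solve_x(p, k_mean, iters=5000):
    """Fixed-point iteration of eq. (4): x = p * (1 - exp(-<k> x))."""
    x = 1.0
    for _ in range(iters):
        x = p * (1.0 - math.exp(-k_mean * x))
    return x

p, k_mean = 0.5, 4.0
x = solve_x(p, k_mean)
mu_site = p * (1.0 - math.exp(-k_mean * x))  # node percolation, eq. (5)
mu_bond = 1.0 - math.exp(-k_mean * x)        # link percolation, eq. (8)
```

Both scenarios share the same threshold p_c^II, but the bond-percolation giant component is larger by the factor 1/p, since no nodes are removed.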
Before we proceed to interdependent networks, it is worth mentioning that other than the second-order phase transition mentioned above, there could also be a first-order phase transition, whose critical threshold is labelled p_c^I. For such phase transitions, when p < p_c^I the size of the largest cluster is 0, and it abruptly jumps to a nonzero value at p = p_c^I.
We will see more of these examples later.

Two-layer interdependent network
In the original and seminal work of Buldyrev et al [1], generating functions were used to study the phase transitions in a two-layer interdependent network. The system consists of two networks, A and B, with degree distributions P_A(k) and P_B(k), respectively. Networks A and B both have N nodes, and each node in network A is linked with exactly one node in network B by a dependency link, and vice versa. The dependency link is different from the connectivity links within each network, in that once a node on one end of the dependency link is removed, the node on the other network is also removed. This corresponds to the case where the failure of a power plant in the grid network will make the connected computer system shut down due to lack of electricity. Also, any node outside the giant cluster of its own network would fail, since it is disconnected from the majority of the other nodes. In the defined mutually connected giant component (MCGC), every node in the giant component is connected via the connectivity links in its own network, and its dependent node is in the giant component of the other network as well. Thus the MCGC is a steady state of the remaining network, such that no further cascading of failures would happen.
Here we present a simple method for studying the phase transition behaviors using the formulation extended from the previous section. Following the definition in equation (2), we define x as the probability that a randomly chosen link in network A leads to the giant component. Analogously, we define y as the probability that a randomly chosen link in network B leads to the giant component.
If a randomly chosen link in A leads to a node with degree k, the node is in the MCGC only if at least one of its other k − 1 links leads to the giant component and its dependent node in network B is also in the MCGC. Otherwise, this link will not be in the MCGC, and it will eventually be deleted, according to [1]. (See figure 2 for a detailed illustration.) Therefore, by calculating the probability that a randomly chosen link in A leads to the MCGC, we obtain:

x = [Σ_k (k P_A(k)/⟨k⟩_A) (1 − (1 − x)^(k−1))] · [Σ_k′ P_B(k′) (1 − (1 − y)^k′)], (9)

where 1 − (1 − x)^(k−1) is the probability that at least one of node u's other k − 1 connectivity links in network A leads to the MCGC, and Σ_k′ P_B(k′)(1 − (1 − y)^k′) is the probability that at least one of the k′ connectivity links of the dependent node u′ in network B leads to the MCGC. Similarly, we have the probability that a randomly chosen link in B leads to the MCGC:

y = [Σ_k (k P_B(k)/⟨k⟩_B) (1 − (1 − y)^(k−1))] · [Σ_k′ P_A(k′) (1 − (1 − x)^k′)]. (10)

Figure 2. Definition of x and y in an interdependent network. Networks A and B, both with N nodes, are connected by dependency links with one-to-one matching. Here node u is connected with u′ via an interdependent link. A link (red solid line) in network A is chosen, and a node, u (in green), is found by following the link. Of the two outgoing connectivity links (black dashed lines) of u, one leads to the giant component. Node u′ in the lower network is connected with u via the dependency link (solid brown line), and it is also connected with the giant component via a connectivity link in network B. Since both nodes u and u′ are connected to the giant component, we can be sure that the initial red link leads to the MCGC. In network A, we define the probability of finding a connectivity link leading to the MCGC as x. Similarly, in network B, we define the probability of finding a connectivity link leading to the MCGC as y.
Consequently, the probability that a randomly chosen node (either in network A or B) is in the MCGC is:

μ∞ = [Σ_k P_A(k) (1 − (1 − x)^k)] · [Σ_k′ P_B(k′) (1 − (1 − y)^k′)], (11)

which again is the normalized size of the MCGC. Note that we do not distinguish this value between the two networks, because there is a one-to-one matching between nodes in A and B, so that μ∞ is identical for both networks. When we randomly remove a fraction 1 − p of nodes from network A, only a fraction p of nodes is left in A. Hence, out of the original probability, x, that a randomly selected link leads to the MCGC, only the links attached to the surviving fraction p of nodes contribute. It is easy to write down the new expression for x as:

x = p [Σ_k (k P_A(k)/⟨k⟩_A) (1 − (1 − x)^(k−1))] · [Σ_k′ P_B(k′) (1 − (1 − y)^k′)]. (12)

Analogously, the equation for y is

y = p [Σ_k (k P_B(k)/⟨k⟩_B) (1 − (1 − y)^(k−1))] · [Σ_k′ P_A(k′) (1 − (1 − x)^k′)]. (13)

At last, we arrive at the equation for μ∞, which is the probability that a randomly selected node in network A or B is in the MCGC:

μ∞ = p [Σ_k P_A(k) (1 − (1 − x)^k)] · [Σ_k′ P_B(k′) (1 − (1 − y)^k′)], (14)

which is also the normalized size of the MCGC.
If we define f_A (f_B) as the probability that a randomly selected link in the original network A (B) is not in the MCGC, then with the substitutions x = p(1 − f_A) and y = p(1 − f_B), equations (12), (13), and (14) lead to the same equations obtained from generating functions in the seminal work [1].
In principle, equations (12) and (13) can be transformed into

x = F(p, y), (15)

y = G(p, x). (16)

If we cannot obtain these explicit formulas, numerical computation can always be used successfully. Usually, the phase transition for the above system is of first order at the critical point, p_c^I (an example is given below). Therefore, at p = p_c^I, the two curves x = F(p, y) and y = G(p, x) meet tangentially with each other:

(∂F/∂y) · (∂G/∂x) = 1. (17)

For the first-order phase transition, at the critical point p_c^I, the giant component is not 0, implying that we cannot use the Taylor expansion to simplify equations (12) and (13); rather, we have to solve the polynomial equations directly. It can be very difficult to obtain an explicit formula for p_c^I except for the simplest degree distributions, but numerical methods are always possible.

Example with random regular network
For a simple example, we assume that both networks A and B are random regular networks with P_A(3) = P_B(3) = 1. This means that every node in both networks has degree 3, and the nodes are randomly connected. The above equations (12), (13), and (14) then become:

x = p [1 − (1 − x)²] [1 − (1 − y)³], (18)

y = p [1 − (1 − y)²] [1 − (1 − x)³], (19)

μ∞ = p [1 − (1 − x)³] [1 − (1 − y)³]. (20)

Hence, the requirement for p_c^I of equation (17) can be written out explicitly from equations (18) and (19) and solved numerically. The resulting abrupt change in the size of the giant cluster corresponds to the first-order transition, in which μ∞ changes from 0 to 0.6329 abruptly at p = p_c^I. This is illustrated through simulation results in figure 4.
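To locate the transition numerically, one can exploit the symmetry of equations (18) and (19) (x = y), rewrite equation (18) as p = x/f(x), and minimize p over x; the minimum is attained exactly where the tangency condition (17) holds. The following sketch is my own illustration of this procedure, not code from the original literature:

```python
def p_of_x(x):
    """Invert the symmetric fixed-point relation x = p * f(x) for two
    degree-3 random regular layers (eqs. (18)-(19) with x = y):
    p(x) = x / ((1 - (1-x)**2) * (1 - (1-x)**3))."""
    u = 1.0 - x
    return x / ((1.0 - u ** 2) * (1.0 - u ** 3))

# p_c^I is the minimum of p(x) over 0 < x <= 1; the minimizer x_c is the
# solution that appears discontinuously at the transition.
steps = 100000
x_c = min((i / steps for i in range(1, steps + 1)), key=p_of_x)
p_c = p_of_x(x_c)
mu_jump = p_c * (1.0 - (1.0 - x_c) ** 3) ** 2  # eq. (20) with x = y = x_c
```

The values of p_c and mu_jump computed this way can be compared against the simulation results shown in figure 4.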

N-layer interdependent network
The case of N-layer interdependent networks [7,8,10,[30][31][32][33] is an extension of the two-layer scenario. Assuming that there are N networks with equal numbers of nodes, each node in a layer is mutually dependent on exactly one node in each of the layers connected to it by dependency links, with one-to-one matching. Extending from equation (9), we obtain the probability that a randomly chosen link in network i leads to the MCGC as:

x_i = [Σ_k_i (k_i P_i(k_i)/⟨k_i⟩) (1 − (1 − x_i)^(k_i−1))] · Π_j [Σ_k_j P_j(k_j) (1 − (1 − x_j)^k_j)], (24)

where the product runs over the layers j connected to layer i by dependency links; k_i P_i(k_i)/⟨k_i⟩ is the probability that a randomly chosen link in network i leads to a node u with degree k_i, 1 − (1 − x_i)^(k_i−1) is the probability that at least one of the other k_i − 1 outgoing connectivity links of node u in network i leads to the MCGC, and Σ_k_j P_j(k_j)(1 − (1 − x_j)^k_j) is the probability that at least one of the k_j connectivity links of node w in network j (node w and node u are connected by a dependency link) leads to the MCGC.
Similarly, extending equation (11), we obtain the probability that a randomly selected node in layer i is in the MCGC as:

μ∞ = [Σ_k_i P_i(k_i) (1 − (1 − x_i)^k_i)] · Π_j [Σ_k_j P_j(k_j) (1 − (1 − x_j)^k_j)]. (25)

In the percolation problem, if a fraction 1 − p of nodes is randomly removed from layer i, we can simply multiply equations (24) and (25) by p, which is the fraction of nodes remaining in the layer after the attack:

x_i = p [Σ_k_i (k_i P_i(k_i)/⟨k_i⟩) (1 − (1 − x_i)^(k_i−1))] · Π_j [Σ_k_j P_j(k_j) (1 − (1 − x_j)^k_j)], (26)

μ∞ = p [Σ_k_i P_i(k_i) (1 − (1 − x_i)^k_i)] · Π_j [Σ_k_j P_j(k_j) (1 − (1 − x_j)^k_j)]. (28)

Note that equations (26) and (28) are identical to the results of equations (29) and (30) in [10] with the substitution x_i = 1 − z_i.

Example with random regular networks
For a simple illustration, we let all of the networks have the same degree distribution, P_i(3) = 1 for every layer i (i.e., every network is a random regular network with degree 3).
Since the equations are symmetric, and there is a one-to-one matching between every node on each network, we have the relation x_i = x for all layers. By rearranging, we can transform equation (31) into

p = x / ([1 − (1 − x)²] [1 − (1 − x)³]^(N−1)). (32)

At the first-order transition point, p as a function of x attains its minimum, which provides the second condition, dp/dx = 0 (equation (34)). Solving the simultaneous equations (32) and (34) numerically, we are able to find the value of p_c^I. Note that for any integer value of N > 1, we have a solution of p_c and x_c in the range [0, 1], so we always have a first-order phase transition, but no second-order phase transition.
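As a rough numerical sketch (my own illustration), assuming the fully symmetric reduction of equation (32), in which the factor [1 − (1 − x)³] appears once for each of the other N − 1 layers, p_c^I can be found by minimizing p(x) on a grid:

```python
def p_of_x(x, n_layers):
    """Symmetric N-layer relation for degree-3 random regular layers
    (eq. (32)): p(x) = x / ((1-(1-x)**2) * (1-(1-x)**3)**(N-1))."""
    u = 1.0 - x
    return x / ((1.0 - u ** 2) * (1.0 - u ** 3) ** (n_layers - 1))

def critical_point(n_layers, steps=100000):
    """p_c^I and x_c from the minimum of p(x), i.e., the condition dp/dx = 0."""
    x_c = min((i / steps for i in range(1, steps + 1)),
              key=lambda x: p_of_x(x, n_layers))
    return p_of_x(x_c, n_layers), x_c
```

For N = 2 this reproduces the two-layer threshold, and p_c^I grows toward 1 as more layers are added, reflecting the increasing fragility of deeply interdependent systems.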

Percolation on multilayer interdependent networks with degree-degree correlations
Usually, in real-world networks, the connection through dependency links may not be random [7,8,16,17,22,[34][35][36]. In a general form, we can assume a joint probability, P(k_A, k_B), for a dependency link to connect a node, u, with degree k_A in network A and a node, u′, with degree k_B in network B. In this case, we still assume that each node is connected with one and only one node in the other network through a dependency link.
Instead of using the independent probabilities, P_A(k) and P_B(k), the joint probability P(k_A, k_B) is used. Thus equations (12), (13), and (14) become:

x = p Σ_{k_A,k_B} [k_A P(k_A, k_B)/⟨k_A⟩] [1 − (1 − x)^(k_A−1)] [1 − (1 − y)^k_B], (35)

y = p Σ_{k_A,k_B} [k_B P(k_A, k_B)/⟨k_B⟩] [1 − (1 − y)^(k_B−1)] [1 − (1 − x)^k_A], (36)

μ∞ = p Σ_{k_A,k_B} P(k_A, k_B) [1 − (1 − x)^k_A] [1 − (1 − y)^k_B]. (37)

In fact, equations (35), (36), and (37) are the more general representations of equations (12), (13), and (14), to which they reduce when the joint distribution factorizes, P(k_A, k_B) = P_A(k_A) P_B(k_B). In the case of [35], there is perfect correlation between the degrees of the two networks (i.e., P(k_A, k_B) = P(k_A) δ_{k_A,k_B}). The above equations transform into:

x = p Σ_k [k P(k)/⟨k⟩] [1 − (1 − x)^(k−1)] [1 − (1 − y)^k], (38)

y = p Σ_k [k P(k)/⟨k⟩] [1 − (1 − y)^(k−1)] [1 − (1 − x)^k], (39)

μ∞ = p Σ_k P(k) [1 − (1 − x)^k] [1 − (1 − y)^k]. (40)

A special case is when we have random regular networks for both A and B; the results were discussed in the previous section, since P(k_A, k_B) = P_A(k_A) P_B(k_B) = 1 when every node has the same degree. The more general case of correlated systems of a multiplex network with different types of links was studied in [8]. With a similar argument, in the case of correlated N-layer interdependent networks, we can write down the equations of x_i and μ∞ from equations (26) and (28), i.e., a set of coupled self-consistent equations x_i = F_i(p, x_1, …, x_N). Reference [8] provided a more general mathematical tool to solve for the critical points by using the Jacobian of these equations,

J_{ij} = ∂F_i/∂x_j, (42)

where at the first-order critical point the determinant condition

det(J − I) = 0 (44)

must hold together with the self-consistent equations themselves. Solving equations (42) and (44) gives the critical point values of p_c^I and x_c for each layer of the network system.
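The Jacobian criterion can be implemented generically with finite differences. The sketch below (my own illustration, using the symmetric two-layer random regular system of equations (18) and (19) as the test case) finds the fixed point by iteration and checks that det(J − I) shrinks toward zero as p decreases toward p_c^I:

```python
import numpy as np

def F(p, v):
    """Right-hand sides of the coupled self-consistent equations for two
    degree-3 random regular layers (eqs. (18)-(19))."""
    x, y = v
    return np.array([
        p * (1 - (1 - x) ** 2) * (1 - (1 - y) ** 3),
        p * (1 - (1 - y) ** 2) * (1 - (1 - x) ** 3),
    ])

def fixed_point(p, iters=20000):
    """Iterate from (1, 1); this converges to the largest fixed point."""
    v = np.array([1.0, 1.0])
    for _ in range(iters):
        v = F(p, v)
    return v

def det_j_minus_i(p, v, h=1e-6):
    """det(J - I), with J_ij = dF_i/dx_j estimated by central differences."""
    J = np.zeros((2, 2))
    for j in range(2):
        e = np.zeros(2)
        e[j] = h
        J[:, j] = (F(p, v + e) - F(p, v - e)) / (2 * h)
    return np.linalg.det(J - np.eye(2))
```

Scanning p downward, |det(J − I)| evaluated at the fixed point decreases and vanishes at p_c^I, which is exactly where the nonzero solution disappears.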

Percolation on two-layer partially interdependent networks
In certain interdependent networks, not every node has a dependency link. It is more realistic to assume that only a fraction of the nodes from each network have dependency links [15,17]. In such systems, both first- and second-order phase transitions may occur, depending on the details of the networks' structural properties. Let us assume two networks, A and B, with degree distributions P_A(k) and P_B(k), and let us also assume that only a fraction q of the nodes from each network is connected to nodes in the other network by dependency links. For simplicity, we let each node be connected with at most one other node through a dependency link.
In order for a randomly selected link in A to lead to the MCGC, it must satisfy two conditions. First, the node, u, that it directly attaches to must have at least one of its outgoing connectivity links leading to the MCGC; the probability is the same as in the case of full dependency links, Σ_k [k P_A(k)/⟨k⟩_A] [1 − (1 − x)^(k−1)]. Second, for the case of network B, there are two scenarios: with probability 1 − q, node u is not connected to any node in B, and then u is in the MCGC; with probability q, node u is connected to a node u′ in B, and then at least one of the connectivity links of u′ must also lead to the MCGC, the probability for which is Σ_k′ P_B(k′) [1 − (1 − y)^k′]. Therefore, the original equation (12) becomes:

x = p [Σ_k (k P_A(k)/⟨k⟩_A) (1 − (1 − x)^(k−1))] · {1 − q + q Σ_k′ P_B(k′) [1 − (1 − y)^k′]}. (45)

The parameter p on the right-hand side takes into account that, after the removal of a fraction 1 − p of the nodes in network A at the beginning of the attack, only a fraction p of the nodes remains.
It is worth noting that the calculation of y is not symmetric with x, because there is no one-to-one matching between a node in A and a node in B. For a node u′ in B, the difference is that in the case when it has a dependency node, u, in A (probability q), at least one of the connectivity links of u must lead to the MCGC (probability Σ_k P_A(k) [1 − (1 − x)^k]), and u must not have been removed (probability p). Thus, equation (13) becomes:

y = [Σ_k (k P_B(k)/⟨k⟩_B) (1 − (1 − y)^(k−1))] · {1 − q + q p Σ_k′ P_A(k′) [1 − (1 − x)^k′]}. (46)

There is no additional parameter p in front of the right-hand side, since we are not removing nodes from network B at the beginning. Again, due to the lack of symmetry in this case, the sizes of the MCGC in A and B are expressed differently:

μ∞^A = p [Σ_k P_A(k) (1 − (1 − x)^k)] · {1 − q + q Σ_k′ P_B(k′) [1 − (1 − y)^k′]}, (47)

μ∞^B = [Σ_k P_B(k) (1 − (1 − y)^k)] · {1 − q + q p Σ_k′ P_A(k′) [1 − (1 − x)^k′]}. (48)

Example with random regular networks
For a simple illustration, we use random regular networks for both A and B, with P_A(3) = P_B(3) = 1. Equations (45) and (46) simplify into:

x = p [1 − (1 − x)²] {1 − q + q [1 − (1 − y)³]}, (49)

y = [1 − (1 − y)²] {1 − q + q p [1 − (1 − x)³]}. (50)

Solving equations (49), (50), and (57) for x, y, and p, we are able to obtain the threshold value p_c^I numerically. For any given value of q, which determines the fraction of dependency between networks A and B, we have either a first-order phase transition with critical threshold p_c^I or a second-order phase transition with critical threshold p_c^II. Figure 5 shows the plot of p_c^I and p_c^II versus the change in q. The two curves of p_c^I and p_c^II intersect at q = 0.4. Hence the percolation behavior is separated into two regions: when 0 ⩽ q < 0.4, it is a second-order transition, but when 0.4 < q ⩽ 1, it is a first-order transition.
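The change of transition order with q can also be seen directly by iterating equations (49) and (50) while sweeping p. In this sketch (my own illustration), a large discontinuity in x between neighboring values of p signals a first-order transition, while a smooth decay to zero signals a second-order one:

```python
def solve_xy(p, q, iters=3000):
    """Fixed-point iteration of eqs. (49)-(50) for two partially
    interdependent degree-3 random regular networks."""
    x = y = 1.0
    for _ in range(iters):
        x = p * (1 - (1 - x) ** 2) * (1 - q + q * (1 - (1 - y) ** 3))
        y = (1 - (1 - y) ** 2) * (1 - q + q * p * (1 - (1 - x) ** 3))
    return x, y

def largest_jump(q, steps=300):
    """Scan p from 1 down to 0 and return the largest change in x between
    consecutive p values; a large jump indicates a first-order transition."""
    prev, jump = None, 0.0
    for i in range(steps, -1, -1):
        x, _ = solve_xy(i / steps, q)
        if prev is not None:
            jump = max(jump, abs(prev - x))
        prev = x
    return jump
```

Consistent with figure 5, weak coupling (q < 0.4) produces only small, continuous changes in x, while strong coupling (q > 0.4) produces an abrupt jump.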

Example with ER networks
For a more general example, we suppose the degree distributions of networks A and B are both Poisson, with average degrees ⟨k_A⟩ and ⟨k_B⟩. In this case we can use the generating function formulation [28] to simplify the expressions.
For networks A and B, the corresponding generating functions are G_A(ξ) = e^(⟨k_A⟩(ξ−1)) and G_B(ξ) = e^(⟨k_B⟩(ξ−1)), which allow the sums in the self-consistent equations to be written in closed form.

A closely related system is a single network in which nodes depend on each other in pairs. Note that in this case a second p appears on the right side of the self-consistent equation, because both the node itself and its dependent node have the probability, p, of remaining after the initial attack, as they are in the same single network. Correspondingly, we can write down the size of the giant component, which corresponds to the special case in [37], with every node in a dependency pair. For a more general case, a network could have dependency groups [38], which are certain groups of nodes that have dependency relations (as shown in figure 7), such that the removal of any one node would result in the removal of all the other nodes in the group. Given the probability distribution of the group size, g(s), where s is the number of nodes in a group, we would obtain the following equation for x:

x = p Σ_s [s g(s)/⟨s⟩] · [p Σ_k P(k) (1 − (1 − x)^k)]^(s−1) · Σ_k [k P(k)/⟨k⟩] [1 − (1 − x)^(k−1)],

where the factor raised to the power s − 1 is the probability that each of the other s − 1 members of the group survives the attack and remains in the giant component.
From here, we could study the critical phase transition behaviors by solving the equations using the techniques in the previous sections.
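For example, for a Poisson (ER) network in which every node belongs to a dependency pair (s = 2), the sums collapse, and under the pair logic described above the self-consistent equation takes the assumed form x = p² (1 − e^(−⟨k⟩x))². The sketch below (my own illustration, based on that assumed form, not code from the original literature) inverts it to p(x) and locates the first-order threshold by minimization:

```python
import math

def p_of_x(x, k_mean):
    """Invert the assumed pair-dependency relation
    x = p**2 * (1 - exp(-<k> x))**2, giving
    p(x) = sqrt(x) / (1 - exp(-<k> x))."""
    return math.sqrt(x) / (1.0 - math.exp(-k_mean * x))

def critical_point(k_mean, steps=200000):
    """p_c^I and x_c from the minimum of p(x) over 0 < x <= 1."""
    x_c = min((i / steps for i in range(1, steps + 1)),
              key=lambda x: p_of_x(x, k_mean))
    return p_of_x(x_c, k_mean), x_c
```

The transition is first order: at p_c the solution x_c appears at a finite value rather than growing continuously from zero, in line with the behavior of the interdependent systems above.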

Percolation on two layers with many correlated links
In certain networks, including brain networks, it is observed that the dependency links are not always one-to-one matchings, but could be one-to-many, and extensive correlations and dependencies between nodes could exist, as proposed by [39]. Reference [7] discovered that brain networks are wired in such a way that stability is maximized.
To examine the critical phase transition behaviors of such networks, additional parameters need to be defined to specify the structure. Assuming two networks, A and B, we denote k_in^A (k_in^B) as the connectivity degree of a node in A (B), and k_out^A (k_out^B) (⩾ 1) as its dependency degree, which counts its dependency links leading to nodes of the other network. The joint probability that a node u in A has k_in^A connectivity links and k_out^A dependency links is denoted by p(k_in^A, k_out^A). The conditional probability p(k_in^B | k_in^A) is the probability that, given a node u in A with degree k_in^A, any of its dependent nodes w in B is of degree k_in^B. Similar definitions carry over to network B for p(k_in^B, k_out^B) and p(k_in^A | k_in^B). Here we let x_A (x_B) be the probability that, on following an arbitrary connectivity link in network A (B), we reach a node leading to the giant component. For a dependency link between node u with connectivity degree k_in^A and node w in B, y_{k_in^A} is defined as the probability that this link from u leads to the giant component, and y_{k_in^B} is defined similarly for a dependency link from B to A. From these quantities we can build the self-consistent equations for x_A, x_B, y_{k_in^A}, and y_{k_in^B}, in which a factor of the form 1 − (1 − y_{k_in^A})^(k_out^A) gives the probability that at least one of the k_out^A dependency neighbors (in network B) of node u (which has degree k_in^A) is in the giant component. Finally, we can write down the probabilities that a randomly chosen node is in the giant component. In general, since the above system is too complicated to admit analytical solutions, numerical methods are usually preferred. Here we illustrate a simple yet efficient numerical method, called binary search, to detect the critical threshold.
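A binary search brackets p_c by repeatedly asking whether a nonzero solution of the self-consistent equations survives at a trial value of p. The sketch below (my own illustration) uses the symmetric two-layer random regular system of equations (18) and (19) as the test function, but the same bracketing loop applies to any of the systems above:

```python
def has_giant(p, iters=20000, eps=1e-4):
    """Iterate the symmetric two-layer relation
    x = p * (1 - (1-x)**2) * (1 - (1-x)**3) from x = 1 and report
    whether a nonzero solution survives."""
    x = 1.0
    for _ in range(iters):
        x = p * (1 - (1 - x) ** 2) * (1 - (1 - x) ** 3)
    return x > eps

def binary_search_pc(lo=0.0, hi=1.0, tol=1e-6):
    """Bisect on p: the giant component exists above p_c and not below it."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if has_giant(mid):
            hi = mid  # giant component survives; p_c is at or below mid
        else:
            lo = mid  # giant component gone; p_c is above mid
    return 0.5 * (lo + hi)
```

Each bisection step halves the interval, so about 20 steps pin down p_c to six decimal places, regardless of how complicated the underlying self-consistent equations are.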