Generating function technique in complex networks

Complex networks are a useful tool for describing the interactions between the agents of a complex system, and have attracted considerable interest recently. Because the connectivity structure of such networks is diverse and complicated, there are various approaches to studying their structure and dynamics. One widely applicable theoretical approach is the generating function technique, which is useful for studying network structure and dynamics, especially on tree-like networks. In this paper, we summarize the basic ideas of this approach and use it to explore the structure of networks. A cascading failures model that we proposed previously is also presented as an application of this approach.

For a sequence $a_0, a_1, a_2, \ldots$, the corresponding generating function is defined as

$$G(x) = \sum_{n=0}^{\infty} a_n x^n. \tag{1}$$

For many well-defined sequences, $G(x)$ can be written in closed form, and the coefficient $a_n$ can be recovered by Taylor expansion,

$$a_n = \frac{1}{n!} \left. \frac{d^n G(x)}{dx^n} \right|_{x=0}. \tag{2}$$

Consider a discrete random variable $A$ taking non-negative integer values. If $a_n$ in eq. (1) is the probability that $A$ takes the value $n$ ($= 0, 1, 2, \ldots$), $G(x)$ is also called a probability generating function. The expected value of $A$ can then be expressed as $G'(1)$. In addition, a probability generating function must satisfy $G(1) = 1$. Taking the Poisson distribution $a_n = \lambda^n e^{-\lambda}/n!$ as an example, the probability generating function is $G(x) = e^{\lambda(x-1)}$. We then have $G(1) = e^{\lambda(x-1)}|_{x=1} = 1$ and $G'(1) = \lambda e^{\lambda(x-1)}|_{x=1} = \lambda$.
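The Poisson example can be checked numerically. The following is a minimal sketch, assuming the series is truncated at 100 terms and using an illustrative value of $\lambda$; the function names are hypothetical and chosen only for this illustration.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability a_n = lam^n e^(-lam) / n!."""
    return lam**n * math.exp(-lam) / math.factorial(n)

def pgf(x, lam, n_max=100):
    """Truncated series G(x) = sum_n a_n x^n, as in eq. (1)."""
    return sum(poisson_pmf(n, lam) * x**n for n in range(n_max))

def pgf_prime(x, lam, n_max=100):
    """Term-by-term derivative G'(x) = sum_n n a_n x^(n-1)."""
    return sum(n * poisson_pmf(n, lam) * x**(n - 1) for n in range(1, n_max))

lam = 2.5                                  # illustrative value (assumption)
normalization = pgf(1.0, lam)              # should be close to 1
mean = pgf_prime(1.0, lam)                 # should be close to lam
closed_form = math.exp(lam * (0.3 - 1.0))  # e^(lam (x - 1)) at x = 0.3
series_value = pgf(0.3, lam)               # should match closed_form
```

The truncation is harmless here because the Poisson tail beyond $n = 100$ is negligible for small $\lambda$; a heavier-tailed distribution would need more terms.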

Generating function in complex networks
Next, we turn our attention to complex networks. In a network, the focus is usually on the states of nodes, which generally depend on the states of the links connected to them. For example, if none of the links of a node leads to the giant component of the network, the node cannot belong to the giant component. For convenience, let $x$ be the probability that a link is in state $X$, where $X$ depends on the problem considered. Further, we assume that the states of links are independent of each other. Then, the probability that all the links of a node with degree $k$ are in state $X$ is $x^k$. Averaging this probability over the degree distribution $p_k$ of the network, we obtain the fraction of nodes with all their links in state $X$, in other words, the probability that all the links of a randomly chosen node are in state $X$. According to the definition of the generating function (1), this fraction (or probability) can be written as

$$G_0(x) = \sum_{k=0}^{\infty} p_k x^k. \tag{3}$$

Letting $x = 1$ gives $G_0(1) = \sum_{k=0}^{\infty} p_k = 1$: when every link is certain to be in state $X$, this fraction (or probability) must be 1. Furthermore, the average degree of the network can be expressed as

$$\langle k \rangle = \sum_{k=0}^{\infty} k p_k = G_0'(1). \tag{4}$$

In this way, the average degree can be understood as the rate of change of the node state with respect to the link state. More generally, for a node with degree $k$, the probability that exactly $i$ ($\leq k$) of its links are in state $X$ can be written as

$$\binom{k}{i} x^i (1-x)^{k-i}. \tag{5}$$

Using the degree distribution of the network, the probability that a randomly chosen node is in this state (or the fraction of nodes in this state) can be written as

$$\sum_{k=i}^{\infty} p_k \binom{k}{i} x^i (1-x)^{k-i}. \tag{6}$$

This function is more general than eq. (3) for representing the state of a node. However, it must be pointed out that while eq. (5) reduces to $x^k$ for $i = k$, eq. (6) does not reduce to eq. (3), because $i$ is fixed in eq. (6) whereas the sum runs over all degrees $k \geq i$.
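As a small numerical illustration of the node-state functions, the sketch below evaluates the generating function of the degree distribution and the averaged binomial probability for a made-up degree distribution. The distribution values and all function names are assumptions introduced only for this example.

```python
from math import comb

# Hypothetical degree distribution p_k (illustrative values only).
p = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

def G0(x):
    """Probability that all links of a random node are in state X."""
    return sum(pk * x**k for k, pk in p.items())

def G0_prime(x):
    """Derivative of G0; at x = 1 it gives the average degree."""
    return sum(k * pk * x**(k - 1) for k, pk in p.items() if k >= 1)

def frac_exactly_i(i, x):
    """Fraction of nodes with exactly i links in state X
    (binomial probability averaged over the degree distribution)."""
    return sum(pk * comb(k, i) * x**i * (1 - x)**(k - i)
               for k, pk in p.items() if k >= i)

x = 0.6
mean_degree = G0_prime(1.0)
# Summed over all i, the fractions must add up to 1.
total = sum(frac_exactly_i(i, x) for i in range(max(p) + 1))
# The mean number of links in state X per node is x times the mean degree.
mean_links_in_X = sum(i * frac_exactly_i(i, x) for i in range(max(p) + 1))
```

The two consistency checks (`total` equal to 1 and `mean_links_in_X` equal to `x * mean_degree`) hold only under the independence assumption on link states.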
Functions (3) and (6) describe the state of a randomly chosen node in a network. However, the nodes considered in practical network problems are often not chosen at random but reached by following randomly chosen links, so we need to describe the state of such a node as well. Suppose the node reached by following a randomly chosen link has degree $k$. Then the probability that all the other links of this node are in state $X$ is $x^{k-1}$, which can be used to describe the state of this node. Note that the link used to reach the node must be excluded. To average this probability as in eq. (3), we need the degree distribution of the node reached by following a randomly chosen link, the so-called excess degree distribution $q_k$. First, $q_k$ depends on the degree distribution $p_k$: the larger $p_k$ is, the larger $q_k$ is. Second, a node of degree $k$ can be reached through any of its $k$ links, so a node with larger $k$ is more likely to be reached. Combining these two points, we have $q_k \propto k p_k$.
The normalized form of $q_k$ is

$$q_k = \frac{k p_k}{\sum_{k} k p_k} = \frac{k p_k}{\langle k \rangle}. \tag{8}$$

Therefore, the average probability that all the other links of a node reached by following a randomly chosen link are in state $X$ is

$$G_1(x) = \sum_{k=1}^{\infty} q_k x^{k-1} = \frac{G_0'(x)}{G_0'(1)}. \tag{9}$$

Note that the smallest degree of a node reached by following a link is 1. Eq. (9) satisfies $G_1(1) = 1$ and $G_1'(1) = (\langle k^2 \rangle - \langle k \rangle)/\langle k \rangle$. Furthermore, in analogy with eq. (6), the average probability that exactly $i$ ($\leq k-1$) of the other links of a node reached by following a randomly chosen link are in state $X$ can be written as

$$\sum_{k=i+1}^{\infty} q_k \binom{k-1}{i} x^i (1-x)^{k-1-i}. \tag{10}$$

We have now represented the common node states in a network; they are shown graphically in fig. 1. It must be pointed out that all these functions assume that the states of links are independent of each other. For networks with correlated links, these functions do not apply. In some special cases an approximate expression can be obtained by revising the functions shown in fig. 1 [8,9,10], but this point is not considered here.
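The excess degree distribution and its generating function can be illustrated in the same way. The sketch below assumes a made-up degree distribution and takes $q_k = k p_k / \langle k \rangle$ as the normalized form of $q_k \propto k p_k$; names and values are hypothetical.

```python
# Hypothetical degree distribution p_k (illustrative values only).
p = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

mean_k = sum(k * pk for k, pk in p.items())        # <k>
mean_k2 = sum(k * k * pk for k, pk in p.items())   # <k^2>

# Degree distribution of a node reached by following a random link.
q = {k: k * pk / mean_k for k, pk in p.items() if k >= 1}

def G0_prime(x):
    return sum(k * pk * x**(k - 1) for k, pk in p.items() if k >= 1)

def G1(x):
    """Probability that all other links of a node reached via a link are in state X."""
    return sum(qk * x**(k - 1) for k, qk in q.items())

def G1_prime(x):
    return sum((k - 1) * qk * x**(k - 2) for k, qk in q.items() if k >= 2)

normalization = sum(q.values())                      # should be 1
ratio_form = G0_prime(0.4) / G0_prime(1.0)           # G0'(x) / G0'(1) at x = 0.4
mean_excess_degree = (mean_k2 - mean_k) / mean_k     # expected value of G1'(1)
```

Here `G1(0.4)` agrees with `ratio_form`, and `G1_prime(1.0)` agrees with `mean_excess_degree`, which is the quantity that controls the growth of layers in a tree-like network.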

Hierarchical structure of networks
Hierarchical structure is an important feature of real networks [11]. In this section, we briefly discuss the hierarchical structure of networks using generating functions. As shown in fig. 2, we choose a node at random as the center of the network; the direct neighbors of this node form the first layer of the network, the neighbors' neighbors form the second layer, and so on. Since the central node is chosen at random, its state can be described by the generating function $G_0(x)$. Here, state $X$ refers to the connection of links. Following the links of the central node, we reach the first layer of the network. As mentioned in the last section, the generating function $G_1(x)$ represents the state of a node reached by following a link. So, replacing $x$ in $G_0(x)$ with $G_1(x)$, we obtain the state of the first layer,

$$G^{(1)}(x) = G_0(G_1(x)).$$

Similarly, the state of the second layer can be represented as $G^{(2)}(x) = G_0(G_1(G_1(x)))$. Generally, the state of the $m$-th layer can be written as

$$G^{(m)}(x) = G^{(m-1)}(G_1(x)), \qquad G^{(0)}(x) = G_0(x). \tag{12}$$

Note that the states of links must be independent of each other, i.e., loops are forbidden. In the state function $G^{(m)}(x)$ of the $m$-th layer, the integer exponent of $x$ is the number of links that connect to nodes in the $(m+1)$-th layer, since $x$ represents the states of the excess links of the $m$-th layer. Therefore, the derivative of eq. (12) with respect to $x$ at $x = 1$ equals the total excess degree of the nodes in the $m$-th layer, that is, the number of nodes $n_{m+1}$ in the $(m+1)$-th layer (see fig. 2),

$$n_{m+1} = \left. \frac{dG^{(m)}(x)}{dx} \right|_{x=1} = G_0'(1) \left[ G_1'(1) \right]^m. \tag{15}$$

From eq. (15), we can also find that

$$\frac{n_{m+1}}{n_m} = G_1'(1). \tag{16}$$

This indicates that the number of nodes grows from one layer to the next by a constant factor $G_1'(1)$. An infinite network must have an infinite number of layers, so the number of nodes per layer cannot decrease with $m$, i.e.,

$$G_1'(1) = \frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle} \geq 1,$$

where equality corresponds to the critical point. This is the existence condition of the giant component in a tree-like network [6,12]. For a network of finite size, the average number of layers $l$ must be finite. Next, we show how to obtain $l$.
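The nested composition can be checked numerically. The sketch below assumes that the state of the $m$-th layer is $G_0$ applied to $m$ nested applications of $G_1$, and compares a finite-difference derivative at $x = 1$ with the product $G_0'(1)[G_1'(1)]^m$. The degree distribution and all names are hypothetical.

```python
# Hypothetical degree distribution p_k (illustrative values only).
p = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

mean_k = sum(k * pk for k, pk in p.items())
mean_k2 = sum(k * k * pk for k, pk in p.items())
branching = (mean_k2 - mean_k) / mean_k   # G1'(1)

def G0(x):
    return sum(pk * x**k for k, pk in p.items())

def G1(x):
    return sum(k * pk * x**(k - 1) for k, pk in p.items() if k >= 1) / mean_k

def layer_state(x, m):
    """G0(G1(...G1(x)...)) with m nested applications of G1."""
    for _ in range(m):
        x = G1(x)
    return G0(x)

def next_layer_size(m, h=1e-5):
    """Central finite difference of layer_state at x = 1."""
    return (layer_state(1.0 + h, m) - layer_state(1.0 - h, m)) / (2.0 * h)

m = 3
numeric = next_layer_size(m)
analytic = mean_k * branching**m   # chain rule: G0'(1) [G1'(1)]^m
```

The agreement between `numeric` and `analytic` is simply the chain rule, using $G_1(1) = 1$ at every level of nesting.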
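Using eq. (16), the number of nodes in the $m$-th layer, $n_m$, can be written as

$$n_m = \langle k \rangle \left[ G_1'(1) \right]^{m-1}.$$

Summing over all $l$ layers and counting the central node, we obtain the network size

$$N = 1 + \sum_{m=1}^{l} n_m = 1 + \langle k \rangle \frac{[G_1'(1)]^l - 1}{G_1'(1) - 1}.$$

This yields

$$l = \frac{\ln\left[ (N-1)(G_1'(1) - 1)/\langle k \rangle + 1 \right]}{\ln G_1'(1)}.$$

For a real network, $N \gg \langle k \rangle$ and $\langle k^2 \rangle \gg \langle k \rangle$, so that $G_1'(1) \approx \langle k^2 \rangle / \langle k \rangle$ and $l$ can be approximated as

$$l \approx \frac{\ln(N/\langle k \rangle)}{\ln(\langle k^2 \rangle/\langle k \rangle)} + 1.$$

This result can be seen as an approximation of the average distance of a real network [13,14], and indicates that $l \sim \ln N$.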
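The estimate of the number of layers can be sketched as follows. This is an illustration under stated assumptions rather than the exact expressions of the paper: layer $m$ is taken to contain $\langle k \rangle [G_1'(1)]^{m-1}$ nodes, the central node is counted in $N$, and $G_1'(1) = (\langle k^2 \rangle - \langle k \rangle)/\langle k \rangle \neq 1$. The function names are hypothetical.

```python
import math

def network_size(l, mean_k, mean_k2):
    """N = 1 + sum of layer sizes for l layers (geometric series)."""
    b = (mean_k2 - mean_k) / mean_k          # growth factor G1'(1)
    return 1.0 + mean_k * (b**l - 1.0) / (b - 1.0)

def layers_exact(N, mean_k, mean_k2):
    """Invert network_size for l."""
    b = (mean_k2 - mean_k) / mean_k
    return math.log((N - 1.0) * (b - 1.0) / mean_k + 1.0) / math.log(b)

def layers_approx(N, mean_k, mean_k2):
    """Approximation for N >> <k> and <k^2> >> <k>."""
    return math.log(N / mean_k) / math.log(mean_k2 / mean_k) + 1.0

# Poisson degrees with <k> = 4 have <k^2> = <k>^2 + <k> = 20, so b = 4.
N = network_size(5, 4.0, 20.0)
l_exact = layers_exact(N, 4.0, 20.0)
l_approx = layers_approx(N, 4.0, 20.0)
```

For this Poisson example the condition $\langle k^2 \rangle \gg \langle k \rangle$ holds only weakly, so `l_approx` deviates visibly from `l_exact`; both still grow logarithmically with $N$.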

k-shell structure
As pointed out above, the hierarchical structure shown in fig. 2 has an arbitrary center. In complex networks, there is another hierarchical structure with a determined center, namely the k-shell structure [15]. For a given network, the k-shell structure is uniquely determined.
To describe the k-shell structure of networks, we first define the k-core of a network. The k-core is a subnetwork in which every node is connected to at least $k$ other nodes of the subnetwork. It can be obtained by repeatedly removing all nodes of degree less than $k$; the result, if it exists, is the k-core of the network (see fig. 3). It follows that the size $S_{k+1}$ of the $(k+1)$-core is no larger than the size $S_k$ of the k-core. The nodes that belong to the k-core but not to the $(k+1)$-core are called the k-shell of the network.

Figure 3: k-core and k-shell structure of networks. In the k-core, every node is connected to at least $k$ other nodes of the k-core. The nodes that belong to the k-core but not to the $(k+1)$-core form the k-shell of the network.
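The pruning procedure that defines the k-core can be sketched in a few lines. This is a minimal illustration, assuming an undirected graph stored as a dictionary of neighbor sets; the function names and the example graph are hypothetical.

```python
def k_core(adj, k):
    """Node set of the k-core, found by repeatedly removing
    nodes whose remaining degree is less than k.
    adj maps each node to the set of its neighbors."""
    alive = set(adj)
    degree = {v: len(adj[v]) for v in adj}
    queue = [v for v in adj if degree[v] < k]
    while queue:
        v = queue.pop()
        if v not in alive:
            continue                     # already removed earlier
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                degree[u] -= 1           # u lost a neighbor
                if degree[u] < k:
                    queue.append(u)
    return alive

def k_shell(adj, k):
    """Nodes in the k-core that are not in the (k+1)-core."""
    return k_core(adj, k) - k_core(adj, k + 1)

# Example: a triangle (0, 1, 2), a pendant node 3 attached to 2, an isolated node 4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: set()}
two_core = k_core(adj, 2)     # the triangle
one_shell = k_shell(adj, 1)   # the pendant node
```

Removing the pendant node lowers the degree of node 2 from 3 to 2, which is why the pruning has to be repeated until no node of degree less than $k$ remains.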
Next, we give the sizes of the k-core and k-shell of a network ensemble with a given degree distribution. It must be pointed out that the k-core and k-shell cannot exist in a finite tree-like network, so all the networks discussed here are infinite. A node in the k-core must have at least $k$ links that connect to other nodes in the k-core. Therefore, using the generating functions (6) and (10), we can directly write the size of the k-core (or the fraction of nodes in the k-core) as

$$S_k = \sum_{j=k}^{\infty} p_j \sum_{n=k}^{j} \binom{j}{n} R_k^n (1-R_k)^{j-n}, \tag{24}$$

$$R_k = \sum_{j=k}^{\infty} q_j \sum_{n=k-1}^{j-1} \binom{j-1}{n} R_k^n (1-R_k)^{j-1-n}. \tag{25}$$

Here, $j$ denotes the node degree and $R_k$ is the probability that the node reached by following a randomly chosen link belongs to the k-core. Note that the sum $\sum_n$ in eq. (25) begins with $k-1$, since the link used to reach the node already provides one link to the k-core. Solving these two equations gives the size of the k-core, $S_k$. The size of the k-shell then follows as

$$s_k = S_k - S_{k+1}. \tag{26}$$

For the Erdős-Rényi (ER) network [12], the degree distribution is the Poisson distribution $p_j = e^{-\langle k \rangle} \langle k \rangle^j / j!$, and eqs. (24) and (25) can be rewritten as

$$S_k = 1 - \Gamma(k, \langle k \rangle R_k), \qquad R_k = 1 - \Gamma(k-1, \langle k \rangle R_k).$$

Here, $\Gamma(n, x)$ is the regularized upper incomplete gamma function, which for integer $n \geq 1$ equals $\sum_{i=0}^{n-1} e^{-x} x^i / i!$. Figure 4 compares these theoretical results with simulations; the simulation results agree well with the theoretical analysis. When the average degree of the network is small, the k-shells are small. When $\langle k \rangle < 1$, there is no giant component in the network, so no k-shell exists. As the average degree increases, k-shells with larger $k$ grow and replace those with smaller $k$. Therefore, the size of a given k-shell first increases and then decreases with the average degree (see fig. 4).
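For the ER network the self-consistent equation for $R_k$ can be solved by simple iteration. The sketch below assumes that, for Poisson degrees, the number of links leading into the k-core is itself Poisson distributed with mean $\langle k \rangle R_k$, so that $R_k$ and $S_k$ are Poisson tail probabilities starting at $k-1$ and $k$ respectively. Fixed-point iteration from $R_k = 1$ is a choice made here for illustration, and the function names are hypothetical.

```python
import math

def poisson_tail(n_min, mu):
    """P(Poisson(mu) >= n_min)."""
    if n_min <= 0:
        return 1.0
    below = sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(n_min))
    return 1.0 - below

def k_core_size_er(k, c, iterations=2000):
    """Size S_k of the k-core of an ER network with average degree c.
    Iterates R <- P(Poisson(c R) >= k - 1) starting from R = 1,
    then returns S_k = P(Poisson(c R) >= k)."""
    R = 1.0
    for _ in range(iterations):
        R = poisson_tail(k - 1, c * R)
    return poisson_tail(k, c * R)

def k_shell_size_er(k, c):
    """Size of the k-shell: S_k - S_(k+1)."""
    return k_core_size_er(k, c) - k_core_size_er(k + 1, c)

s3_low = k_core_size_er(3, 3.0)    # below the 3-core threshold: no 3-core
s3_high = k_core_size_er(3, 4.0)   # above the threshold: finite 3-core
shell2 = k_shell_size_er(2, 4.0)   # nodes in the 2-core but not the 3-core
```

Starting the iteration from $R_k = 1$ makes it converge to the largest solution of the self-consistent equation, which is the physically relevant one when several solutions exist.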

Robustness of networks
The robustness of networks, which refers to the ability of a network to resist change without adapting its initial stable configuration, is an important topic in complex networks [4,16]. Next, we consider the cascading failures model proposed in ref. [17], which focuses on the robustness of networks in which connectivity links and dependence links overlap. Here, a connectivity link is simply an ordinary link of the network, and a dependence link represents the dependence between two nodes. Specifically, a dependence link between node $i$ and node $j$ means that if node $i$ fails, node $j$ also fails, and vice versa. We consider a network connected by connectivity links with degree distribution $p_k$, in which each node has exactly one dependence link. A network of size $N$ therefore has $N/2$ dependence links. To represent the relation between connectivity links and dependence links, we assume that a fraction $\beta$ of the dependence links overlap with connectivity links, and that the other dependence links are placed at random. This means that a fraction $\beta$ of the nodes are adjacent to their dependence partners.
The cascading failure process begins with the random removal of a fraction $1 - p$ of the nodes. We assume that all nodes that do not belong to the giant component fail. According to the model, a node with a failed dependence partner also fails. So, two dependence partners survive the cascading failures only if both belong to the giant component.
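The cascade rule described above can be sketched for a finite network. This is a minimal illustration and not the simulation code of ref. [17]: it assumes that the largest connected component stands in for the giant component, and the graph, the partner assignment and the function names are all hypothetical.

```python
def largest_component(adj, alive):
    """Node set of the largest connected component among the alive nodes."""
    best, seen = set(), set()
    for start in alive:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:                      # depth-first search
            v = stack.pop()
            for u in adj[v]:
                if u in alive and u not in comp:
                    comp.add(u)
                    stack.append(u)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best

def cascade(adj, partner, removed):
    """Surviving nodes after the cascade triggered by removing `removed`.
    partner[v] is the dependence partner of node v."""
    alive = set(adj) - set(removed)
    while True:
        comp = largest_component(adj, alive)
        # A node survives only if it and its partner are both in the component.
        survivors = {v for v in comp if partner[v] in comp}
        if survivors == alive:
            return alive
        alive = survivors

# Path 0-1-2-3-4-5 with dependence pairs (0,5), (1,2), (3,4).
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
partner = {0: 5, 5: 0, 1: 2, 2: 1, 3: 4, 4: 3}
survivors = cascade(adj, partner, removed={2})
```

In this example removing node 2 splits the path; nodes 0 and 1 fall outside the largest component, node 5 then fails because its partner 0 has failed, and only the adjacent pair 3 and 4 survives.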
To obtain the size of the giant component $S$, we need generating functions that express the state of a dependence pair rather than that of a single node. For nonadjacent dependence nodes, the dependence partner is a randomly chosen node, whose state can be described by the generating function $G_0(x)$. So, the states of nonadjacent dependence pairs can be described by the joint state of two nodes. As shown in fig. 5, the states of adjacent dependence pairs can be described by the generating functions $[G_1(x)]^2$ and $G_2(x) G_1(x)$, where $G_2(x) = G_0''(x)/G_0''(1)$. Letting $x = 1 - R$, we obtain the equations for $S$ and $R$, eqs. (30) and (31). For an arbitrary degree distribution $p_k$ and initial removal fraction $1 - p$, we can solve eq. (31) for $R$ and insert the result into eq. (30) to obtain the order parameter $S$. In fig. 6, we give the simulation results for ER networks, which agree well with the analytical results. In addition, there are two types of phase transition in this model, which can also be predicted from eq. (31) (see ref. [17] for details). For ER networks, we also find that the tricritical point is a constant, $\beta_c = 1/3$.

Conclusion
In this paper, we have summarized the generating function technique used in the study of complex networks, and explored the hierarchical and k-shell structures of networks using this technique. A cascading failures model has also been studied as an example of how to use this technique to solve dynamical models on networks. In addition to the topics covered in this paper, a number of other problems can be solved by the generating function technique, such as the spread of epidemic disease on networks [7], the observability transition [18], catastrophic cascades of failures in interdependent networks [19], and clique percolation [20]. On the other hand, as mentioned in this paper, this technique gives exact results only for tree-like random networks, and further research is needed for clustered and correlated networks.