Vertex routing models

A class of models describing the flow of information within networks via routing processes is proposed and investigated, concentrating on the effects of memory traces on the global properties. The long-term flow of information is governed by cyclic attractors, which allows us to define a measure for the information centrality of a vertex as the number of attractors passing through this vertex. In the thermodynamic limit, we find the number of vertices having a non-zero information centrality to be extensive for models with a memory trace and sub-extensive for models without one. We evaluate the distributions of the number of cycles, of the cycle length and of the maximal basins of attraction, finding a complete scaling collapse in the thermodynamic limit for the latter. Possible implications of our results for the information flow in social networks are discussed.


Introduction
The structural and statistical properties of evolving and dynamical networks have been studied intensively over the last decade [1,2,3], due to their ubiquitous importance in technology, the realms of life and complex system theory in general [4]. Transmission of physical quantities like electricity and of information are key network functionalities, both in physical networks like the power grid and the Internet, as well as in relational networks such as social networks [5]. The basic transmission process takes place between two network vertices and two constituent vertices are linked by an edge whenever direct transmission is possible.
Another key network functionality is routing. An incoming physical quantity, package or piece of information arriving at a certain vertex is forwarded by this vertex. This routing process may proceed either via static routing tables or via dynamical routing protocols. The latter is the case for the Internet, where the servers have the task of routing information packages such that they eventually find their way to the addressees specified in the package headers. Here we specify a class of deterministic vertex routing models with static routing tables, viz. with quenched routing dynamics. The routing tables are drawn randomly for every vertex, and the models are characterized by the network topology on the one hand and by the length of the memory trace along the routing path on the other.
In this study we focus on the effect of the routing memory on the long-term dynamical properties of the routing process, considering the case of information routing. For this purpose we consider fully connected networks and two kinds of trace memory. In the first case memory is absent and the package is passed on irrespective of where it came from, always along the same outgoing edge, see Fig. 1. In the second case the memory trace consists of a single time step, and the routing of incoming information depends on the vertex that routed it in the previous time step; for every incoming edge the routing table specifies a distinct outgoing edge. We then study the statistics of the resulting cyclic attractors, of the basins of attraction and of a measure for the degree of information centrality. The vertex routing models are defined in the phase space of directed links, and a given vertex is information central of degree c = 0, 1, 2, ... when it belongs to one or more intersecting attractors of the information routing dynamics.
We find that a memory trace for the routing process makes a qualitative difference. In the absence of memory only a sub-extensive number $O(N^{1/2})$ of vertices is information central in the thermodynamic limit $N \to \infty$. For the case of a one-timestep memory trace the number of information central vertices is, on the other hand, extensive, being linearly proportional to the number of vertices $N$.
The concept of topology-based centrality and its dependence on network properties has been widely studied [1,2,3]. The notion of information centrality used here is, on the other hand, based on the observation that the flux of information the members of a social network receive is important. This flux of information is maximal whenever a person is part of one or more attractors of the information routing process, as this person then has access to the entire information generated in the respective basins of attraction. Members of a social network located on the fringe of the information flow will, on the other hand, receive information only from a small number of other members.
We note that the standard characterization of real-world networks is provided in terms of network topologies [6]. We propose that there is a need to supplement the field data with information describing the dynamics of routing, which would allow one to evaluate the possible social relevance of the information flux and accumulation. The vertex routing models considered here may, in addition, be regarded as reference models, akin to the Erdős-Rényi model of graph theory [7] and to the NK-model of dynamical boolean networks [8], having well defined and controllable dynamical properties in the thermodynamic limit.

Vertex routing models
Our models are based on the idealized notion that a vertex $V_k$ receiving information from a vertex $V_j$ will transmit it to one other vertex only, say $V_i$. A vertex routing table $\hat{T}$ then corresponds to the binary tensor
$$T_{ikj} = (\hat{T})_{ikj}, \tag{1}$$
with $z_k$ denoting the degree of the routing vertex $V_k$. Considering here a fully connected graph with $N$ sites, we have $z_k \equiv N - 1$. The entries $T_{ikj} = 0, 1$ of the routing table are determined consecutively for all vertices: for a given vertex $k$ and a given incoming edge $\overrightarrow{(jk)}$, one outgoing edge $\overrightarrow{(kI)}$ is randomly chosen among the $N - 2$ potential candidates of outgoing edges. Then $T_{Ikj} = 1$ and $T_{ikj} = 0$, $\forall i \neq I$. For the two models we make the following differentiation:
• Without memory: For every directed edge the entries $T_{ikj}$ are selected randomly and are independent of the originating vertex, thus we can write $T_{ikj} \equiv T_{ikl}$, $\forall i, j, k, l$.
• With memory: Again all entries $T_{ikj}$ are drawn randomly. Routing of information depends on where it came from, but backrouting is not allowed: $T_{jkj} \equiv 0$, $\forall k, j$.
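The construction of the two routing tables can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the results reported here: the function name `draw_routing_table` and the representation of the tensor as a successor map on directed edges are our own choices. We assume that in the memoryless case the target is drawn among all N − 1 other vertices, in line with the closing probability ρ_t = t/(N − 1) used in Appendix A, and that N ≥ 3 in the case with memory.

```python
import random

def draw_routing_table(n, memory, rng):
    """Return a dict mapping each directed edge (j, k) to its successor edge (k, i).

    The entry successor[(j, k)] == (k, i) plays the role of T_ikj = 1.
    Hypothetical helper for illustration; requires n >= 3 when memory is True.
    """
    successor = {}
    for k in range(n):
        if memory:
            for j in range(n):
                if j == k:
                    continue
                # One-step memory: the target depends on the source j,
                # and backrouting to j is excluded (N - 2 candidates).
                candidates = [i for i in range(n) if i != k and i != j]
                successor[(j, k)] = (k, rng.choice(candidates))
        else:
            # No memory: one target per vertex k, shared by all incoming edges
            # (assumed here to be drawn among all N - 1 other vertices).
            target = rng.choice([i for i in range(n) if i != k])
            for j in range(n):
                if j != k:
                    successor[(j, k)] = (k, target)
    return successor
```

Calling `draw_routing_table(4, True, random.Random(1))` yields a map over the 12 directed edges of a four-vertex network; iterating this map from any starting edge generates the routing path.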
The rationale for these two models in the context of social networks is the following: In the memoryless case a new piece of information is always passed on to the best friend, irrespective of the information source. In the model with memory the information routing depends on the source. Information received from a relative might be passed on to another relative, and work-related news might be passed on predominantly to a workplace buddy. The suppression of backrouting is not important for large N but clearly makes sense; it is never a good idea to echo a joke to the person who told it in the first place.
An example of a routing table on a fully connected graph with four nodes is presented in Fig. 2. In a discrete phase space built upon the directed edges every cycle is an attractor, thus we have cyclic attractors. In this example we have three cyclic attractors (labeled with colors), each of which has a basin of attraction of volume V = 4, measured in the number of directed edges. The states $\overrightarrow{(13)}$ and $\overrightarrow{(31)}$ in the phase space of the dynamics belong to the basins of attraction of the red and the green attractors, respectively, but they do not belong to any cyclic attractor. Also, for nodes 1 and 3 the information centrality is c = 2, while for nodes 2 and 4 it is c = 3.
We also note that the vertex routing models (1) play an important role in the context of neural cognitive information processing. In this context a vertex corresponds to an object and the sequence of vertices activated by the routing process to an associative thought process [9,10]. In addition there is a close relation to random boolean networks [11,12], with the directed links constituting the boolean variables. In terms of a boolean network the routing model operates in the sparse activity limit, since the routing problem deals with the routing of individual packages, a single directed link being operative at any given time.
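The attractors, their basins and the information centrality of a given routing table can be extracted by following each directed edge until the path closes. The sketch below is illustrative only: the helper names `find_attractors` and `information_centrality` are hypothetical, and we assume that the information centrality of a vertex is the number of distinct cyclic attractors passing through it. The example uses the three-vertex network with memory, where the exclusion of backrouting leaves exactly one admissible target per incoming edge.

```python
from collections import Counter

def find_attractors(successor):
    """Return (cycles, basin_of) for a successor map on directed edges.

    cycles is a list of cyclic attractors, each a list of edges;
    basin_of maps every edge to the index of the attractor it flows into.
    """
    basin_of = {}
    cycles = []
    for start in successor:
        path, seen = [], {}
        state = start
        # Follow the routing path until it hits a labelled edge or closes on itself.
        while state not in basin_of and state not in seen:
            seen[state] = len(path)
            path.append(state)
            state = successor[state]
        if state in basin_of:
            label = basin_of[state]
        else:
            label = len(cycles)
            cycles.append(path[seen[state]:])  # the closed part of the path
        for s in path:
            basin_of[s] = label
    return cycles, basin_of

def information_centrality(cycles, n):
    """c[v] = number of distinct attractors whose cycle visits vertex v (assumed definition)."""
    c = [0] * n
    for cycle in cycles:
        for v in {edge[1] for edge in cycle}:
            c[v] += 1
    return c

# Three vertices with memory: the only admissible target of edge (j, k) is the third vertex.
n = 3
successor = {(j, k): (k, 3 - j - k) for j in range(n) for k in range(n) if j != k}
cycles, basin_of = find_attractors(successor)
basin_sizes = Counter(basin_of.values())
print(len(cycles), sorted(basin_sizes.values()), information_centrality(cycles, n))
# prints: 2 [3, 3] [2, 2, 2]
```

The two attractors are the clockwise and the counter-clockwise loop around the triangle; every vertex lies on both and therefore has c = 2 under the assumed definition.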

Memoryless model
In this case the routing tensor $T_{ikj}$ is independent of the last index and its dimension is effectively reduced to two. The probability distribution $N_l(L, N)$ of finding a cycle of length $L$ in a network with $N$ nodes is given by [see Appendix A]
$$N_l(L, N) = \frac{1}{z(N)} \binom{N}{L} \frac{L!}{L} (N-1)^{-L}, \tag{3}$$
where $z(N)$ is a normalization factor, $\binom{N}{L}$ the number of ways to select $L$ sites out of the $N$ vertices and $L!/L$ the number of possibilities to connect $L$ sites into distinct loops. The factor $(N-1)^{-L}$ in (3) is the probability that each of the $L$ sites routes to its successor on the loop.

Model with memory
We find, see Fig. 3, that the information centrality approaches a well defined limiting function $I(c) = \lim_{N \to \infty} I(c, N)$ in the thermodynamic limit. The availability of information is quite democratically distributed: only a fraction $I(0) \approx 0.34$ of the vertices is cut off completely from the long-term information flow.
In Fig. 4 the distribution $N_a(n, N)$ of the number of attractors per network is presented. The average number of attractors per network increases only slowly with the size of the network, as $\langle N_a(n, N) \rangle \sim N^{\alpha_a}$. Somewhat larger system sizes would be necessary for a reliable estimate of the scaling exponent; our best fit (given in the inset of Fig. 4) indicates $\alpha_a \approx 0.29$.
We encountered problems of undersampling of the space of all possible model realizations when evaluating the cycle-length distribution $N_l(L, N)$, presented in Fig. 5, a phenomenon well known in the field of random boolean networks [12]. For the vertex routing model the probability of finding very long cycles could not be determined accurately, due to the fat tails of $N_l(L, N)$. This problem also affects the results for the mean cycle length $\langle N_l(L, N) \rangle$, but not the median $\mu_a(N)$ of $N_l(L, N)$. We found scaling close to a square-root law for the median (inset of Fig. 5).
The number of cyclic attractors steadily increases with $N$, as shown in Fig. 4. The question is then whether there is typically a single dominating attractor, in terms of the size of the respective basins of attraction, or whether the phase-space volume is more or less equally divided between the attractors present. This information is provided by $V^{\max}_b(v, N)$, the probability that the largest basin of attraction has the relative volume $v$ in a network with $N$ vertices. Here the volume is relative to the maximal possible basin of attraction, which is equal to the phase-space volume $\Omega = N(N-1)$, and $v = 1$ occurs when only a single attractor is present.
The rescaled $\Omega V^{\max}_b(v, N)$ converges rapidly with $N$ to a limiting function, see Fig. 6. There is a divergence for $v \to 1$, due to the fact that the probability of finding an ensemble realization with a single attractor, $N_a(1, N)$, scales like $1/N$ and consequently $\Omega V^{\max}_b(1, N) \sim N$. The divergence of $\Omega V^{\max}_b(v, N)$ for $v \to 1$ seems to indicate that cycles with large volumes of attraction would have a dominating role, controlling most of the long-term information flow. In order to understand the origin of this divergence we have compared the integrated distribution of $V^{\max}_b(v, N)$ (inset of Fig. 6) with an unbiased distribution of basins of attraction, taking the case of two attractors (dashed line in the inset of Fig. 6). In this case, when only two attractors are present, one of the volumes of attraction is always equal to or larger than $\Omega/2$. The integrated distribution of the model with a memory trace roughly follows the simplified model (6) of a random distribution of two basins of attraction, albeit with a substantial suppression close to unity, indicating that the divergence of $\Omega V^{\max}_b(v, N)$ for $v \to 1$ is directly related to the statistical properties of the distribution of basins of attraction, independent of the details of the dynamics.
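The relative volume v of the largest basin of attraction can be sampled directly for the model with memory. The following sketch is a minimal illustration under our own assumptions, not the procedure used for Fig. 6: the function names `largest_basin_fraction` and `sample` are hypothetical, the system size is kept small, and no binning or rescaling by Ω is performed.

```python
import random
from collections import Counter

def largest_basin_fraction(n, rng):
    """Draw one routing table with one-step memory (n >= 3).

    Returns (v, number of attractors), with v the largest basin volume divided by Omega.
    """
    edges = [(j, k) for j in range(n) for k in range(n) if j != k]
    # Successor edge for every directed edge; backrouting to j is excluded.
    succ = {(j, k): (k, rng.choice([i for i in range(n) if i not in (j, k)]))
            for (j, k) in edges}
    basin_of, n_attractors = {}, 0
    for start in edges:
        path, on_path, state = [], set(), start
        while state not in basin_of and state not in on_path:
            on_path.add(state)
            path.append(state)
            state = succ[state]
        if state in basin_of:
            label = basin_of[state]      # path flows into a known basin
        else:
            label = n_attractors         # path closed on itself: new attractor
            n_attractors += 1
        for s in path:
            basin_of[s] = label
    sizes = Counter(basin_of.values())
    return max(sizes.values()) / len(edges), n_attractors

def sample(n, realizations, seed=0):
    """Collect (v, number of attractors) over independent realizations."""
    rng = random.Random(seed)
    return [largest_basin_fraction(n, rng) for _ in range(realizations)]
```

A histogram of the first entries of `sample(N, R)` estimates the distribution of v; by construction v = 1 occurs exactly for realizations with a single attractor, and v ≥ 1/2 whenever exactly two attractors are present.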

Discussion
We have presented and analyzed a novel class of models suitable for describing the flow of information in complex networks. In these models the flow of information is realized by information packages travelling along the edges of a communication network, whereas it is assumed, in the intensively studied information diffusion models, that information is an attribute attached to the vertices and not to the edges. The dynamics is determined in our class of models by routing tables, which specify for every vertex the routing of incoming information packages.
The two vertex routing models discussed in the present study constitute idealized reference models, not suitable for a direct modeling of field data. The key point of the present study is to analyze the non-trivial effects of a memory trace on the long-term information flow and not the dependence on the network topology, which is left for further studies. Information is conserved in the present models. This would clearly not be the case for real-world social networks, but it holds for the network of internet routers, which have the task of routing packages of information without increasing or decreasing their number.
Traditionally, the flow of information in social networks, like the random spreading of rumors, has been modeled by diffusion processes [13]. In the search for more realistic models, the special topology of social networks has been discussed intensively [14], as well as corrections to the diffusion process itself [15]. The vertex routing and the diffusion models of information flow constitute two extremes. In the first model the direction of the information flow is 100% deterministic, in the second model 100% random. The information flow occurring in real-world social nets is expected to be partially a random and partially a directed process. It will therefore be of interest for future studies to interpolate between these two reference models. We propose that field studies characterizing social networks should be supplemented by data describing the dynamics of the information flow, in addition to the standard structural and topological characterization, as the accumulation of information in the attractors of the information flow may have a substantial social impact.
The dependence of the routing dynamics on the network topology is an additional important issue for future studies. Cycles in the routing process can appear only when the underlying network topology allows for loops. Here we have been considering fully connected networks, in which loops of all lengths are present. For many classes of real-world networks there is a characteristic loop length, which generally scales sub-extensively with the network size [16]. Interesting interference phenomena between this scaling and the sub-extensive scaling of the typical attractor length (inset of Fig. 5) may then be expected.

Appendix A. The cycle length distribution for the memoryless model
Let $q_t$ be the probability that a path remains unclosed after $t$ steps. If a path is still open at time $t$, we have already visited $t + 1$ different nodes. There are $t$ ways to close the path in the next time step. The relative probability is then $\rho_t = t/(N-1)$. The probability of still having an open path after $t$ steps is
$$q_t = \prod_{i=1}^{t-1} \left(1 - \frac{i}{N-1}\right). \tag{A.1}$$
The average number of cycles of length $L$ is
$$\frac{q_{t=L-1}}{N-1}\,\frac{N}{L} = \frac{N!}{L\,(N-1)^{L}\,(N-L)!}, \tag{A.2}$$
where we used the following considerations [4]:
(i) The probability that the node visited at time $t + 1$ is identical to the starting node is $1/(N-1)$.
(ii) There are N possible starting points.
(iii) Factor 1/L corrects for the overcounting of cycles when considering the L possible starting sites of the L-cycle.
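The counting argument (i)-(iii) can be checked numerically. The sketch below is illustrative only, with function names of our own choosing. It assumes the recursion $q_{t+1} = q_t (1 - \rho_t)$ with $q_1 = 1$ for the open-path probability, and compares the resulting average number of cycles of length L with the combinatorial form $\binom{N}{L} (L!/L) (N-1)^{-L}$ quoted in the main text.

```python
from math import comb, factorial

def q_open(t, n):
    """Probability that a path is still open after t steps (memoryless model).

    Assumes q_{t+1} = q_t * (1 - rho_t) with rho_t = t / (n - 1) and q_1 = 1.
    """
    q = 1.0
    for i in range(1, t):
        q *= 1.0 - i / (n - 1)
    return q

def mean_cycles(L, n):
    """Average number of cycles of length L from considerations (i)-(iii)."""
    # n starting points, closure onto the start with probability 1/(n-1),
    # and a factor 1/L for the L equivalent starting sites of a cycle.
    return n * q_open(L - 1, n) / ((n - 1) * L)

def mean_cycles_closed_form(L, n):
    """The same quantity written as C(n, L) * (L!/L) * (n - 1)^(-L)."""
    return comb(n, L) * factorial(L - 1) / (n - 1) ** L

def cycle_length_distribution(n):
    """Normalized cycle-length distribution over L = 2, ..., n."""
    weights = {L: mean_cycles_closed_form(L, n) for L in range(2, n + 1)}
    z = sum(weights.values())  # normalization factor z(n)
    return {L: w / z for L, w in weights.items()}
```

For L = 2 both expressions reduce to N/(2(N − 1)), the number N(N − 1)/2 of vertex pairs times the probability 1/(N − 1)² that the two vertices route to each other.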
After normalization we obtain Eq. (3), the probability distribution of cycle lengths,
$$N_l(L, N) = \frac{1}{z(N)}\,\frac{N!}{L\,(N-1)^{L}\,(N-L)!},$$
where $z(N)$ is the normalization factor
$$z(N) = \sum_{L=2}^{N} \frac{N!}{L\,(N-1)^{L}\,(N-L)!}.$$

Appendix B. Connection between the average information centrality and the average cycle length
We consider an ensemble of $R$ random realizations of a routing tensor on a fully connected network with $N$ nodes. Let $n_\alpha$ be the total number of vertices which belong to at least one cyclic attractor, where $\alpha = 1, \ldots, R$. The length of a cyclic attractor is equivalent to the number of vertices that belong to the attractor. In the case of only one existing attractor, $n_\alpha = L - \sum_{r=1}^{r_{\max}} (r-1)\,Q_\alpha(r, N)$, where $L$ is the length of the attractor and $Q_\alpha(r, N)$ is the number of nodes which are repeated $r$ times during one cycle of the cyclic attractor. If we denote the number of cycles of length $L$ by $N_\alpha(L, N)$, we can write, for the case of more than one coexisting cyclic attractor, the relation (B.1), where $P_\alpha(c, N)$ is the number of nodes with information centrality $c$. It is easy to count that there are 7 distinct vertices contained in these two attractors. We see that only node 1 has information centrality $c = 2$, thus $P_\alpha(c = 1) = 6$ and $P_\alpha(c = 2) = 1$. Also, node 1 is the only one to be repeated twice during one cycle of the second attractor, thus $Q_\alpha(r = 1) = 6$ and $Q_\alpha(r = 2) = 1$. If we put these values into (B.1) we obtain $n_\alpha = 7$.
On the other hand, if we denote the distribution presented in Fig. 3 as $I(c, N)$, where $c$ is the information centrality, then, in the case of one given realization of the routing tensor, we can write $P_\alpha(c, N) = N \cdot I_\alpha(c, N)$. In the case of the memoryless model, $I(c, N)$ has non-zero values only for $c = 0, 1$, and it is not possible for a node to be repeated more than once in one cycle of a cyclic attractor. The average information centrality is thus determined by the mean cycle length $\langle L \rangle = \sum_L L\, N_l(L, N)$, which is the central result of this appendix.