Dynamics impose limits to detectability of network structure

Networks are universally considered as complex structures of interactions of large multi-component systems. To determine the role that each node has inside a complex network, several centrality measures have been developed. Such topological features are also crucial for their role in the dynamical processes occurring in networked systems. In this paper, we argue that the dynamical activity of the nodes may strongly reshape their relevance inside the network, making centrality measures, in many cases, misleading. By proposing a generalisation of the communicability function, we show that when the dynamics taking place at the local level of the node is slower than the global one between the nodes, the system may lose track of the structural features. On the contrary, hidden global properties such as the shortest path distances can be recovered only in the limit where network-level dynamics are negligible compared to node-level dynamics. From the perspective of network inference, this constitutes an uncertainty condition, in the sense that it limits the extraction of multi-resolution information about the structure, particularly in the presence of noise. For illustration purposes, we show that for networks with different time-scale structures, such as strong modularity, the existence of fast global dynamics can imply that precise inference of the community structure is impossible.


Introduction
Networks constitute a paradigm of complexity in real life systems by assembling the structure of the interactions of their elementary constituents [1][2][3][4]. They are found at every level of biological organisation, from genes inside the cells [5] to the trophic relations between species in large ecosystems [6]. Complex networks have historically been decisive in modelling social phenomena as well, starting with the renowned Milgram's experiment of six degrees of separation [7] to the impact of social media in our day [8] and the relevance of social network analysis in crime fighting [9][10][11]. In the last 20 years, science has been impacted by a huge development in the understanding of the way such complex interactions originate, looking for universal patterns in their structure, and investigating the consequences that such discrete topologies have on the dynamics of the systems defined on top of complex networks. Nowadays, with the enormous development of data science, there is a broad interest related to network inference, namely detecting the interacting structure from external measurements or observations. For example, reconstructing the structure of brain networks from the activity of neuronal patches has been a major goal in computational neuroscience [12]. The dynamics that takes place on networked systems can, in some cases, strongly influence the perception that we have regarding local topological features such as the degree [13] or global ones such as network non-normality [14]. Recently, it has been shown that control methods which are based on the structural properties of networks [15] are insufficient to correctly affect the behaviour of the system in the absence of insight at the dynamical level [16].
In this paper, we focus specifically on the problem of measuring network centralities from the dynamical point of view. We show that the inference of networks' structural properties depends heavily on the competition between the node-based dynamics on the one hand and the interactions between the nodes on the other. In particular, we illustrate such a phenomenon by generalising the concept of the communicability centrality [17,18], considered as a reliable measure for dynamical inference [19]. We show that when the local intra-node dynamics is sufficiently slower than the inter-node one, the ranking of the nodes becomes inadequate. In fact, most if not all of the ranking methods are based on structural properties of networks only, which are usually represented by linear operators [1,2], and this makes it impossible to describe the dynamical observables in a strongly nonlinear regime. From this perspective, understanding the conditions under which it is possible to infer the structural properties of a network by observing the dynamical process on it stands as an essential task that has not been studied so far. The failure of network centralities for the static case (when only the structure is considered, without any consideration of the dynamical process on it) has been previously studied, with alternative approaches such as the HITS algorithm [20] or the nonbacktracking matrix [21] being suggested. In contrast, here we focus on the influence that the observation of the dynamical variables in different parameter regimes has on the distinguishability of the nodes from each other on one side and the recovery of the hidden structures on the other [22,23].

Dynamical processes in networked systems: the SI model case
We start by considering a general formulation of a dynamical process in a networked system $G = (V, E)$ of $N$ nodes [2,24]:

$$\dot{x}_i = \alpha f(x_i) + (1 - \alpha) \sum_{j=1}^{N} A_{ij}\, g(x_i, x_j), \qquad i = 1, \dots, N, \qquad (1)$$

where $x_i$ represents the multivariable state vector of node $i$ and $f(\cdot)$, $g(\cdot, \cdot)$ are the respective intra-node and inter-node dynamics. The interactions are given by the adjacency matrix $A$, whose entries are $A_{ij} = A_{ji} = 1$ if there is an undirected link between the nodes $i$ and $j$ and 0 otherwise. Notice also that, to parametrise the two effects in the dynamical system, we have introduced the coefficient $\alpha$, which can be tuned to control which part of the dynamics is more relevant. From the dynamical point of view, the control parameter $\alpha$ can also be considered as a time-scaling factor, $\alpha^{-1} = \tau$, for the local and global dynamics in the network.
To illustrate our analysis, we will consider the SI model for epidemic spreading in a metapopulation network [22,25]. This analysis can be considered in the more general framework of metaplexes [26], where the interior of the nodes can be either a continuous or a discrete space. The model consists of two interacting species, the susceptible $S$ and the infected $I$. The individuals at a given instant of time $t$ are confined inside given compartments, where a susceptible but still healthy individual in contact with an infected one can, in turn, be infected according to a given probability $r$. At the same time, the single individuals of both species are free to 'jump' with a given probability $D$ from a compartment $i$ to another one $j$ if there is a link bridging the two. It is clear from this description that if there is at least a single infected individual in the system, the probability that everyone gets infected asymptotically in time equals one. Although such models were initially developed to study the evolution of epidemics in a compartmental spatial domain, they have also been useful in understanding other types of dynamics. Such a formulation of spreading processes has been employed to model, for example, the propagation of misfolded proteins in neurodegenerative diseases [27,28]. We assume that inside any node (cell) $i$ the susceptible (e.g., regular proteins) and infected individuals (e.g., misfolded proteins) interact according to $S_i + I_i \xrightarrow{r} 2I_i$, where $r$ is the infection rate, and that each individual of each species migrates between nodes, $S_i \xrightarrow{D} S_j$ and $I_i \xrightarrow{D} I_j$, with a diffusion constant $D$ [22,25]. As a consequence, the mean-field dynamics reads

$$\dot{S}_i = -r S_i I_i + D \sum_{j=1}^{N} L_{ij} S_j, \qquad \dot{I}_i = r S_i I_i + D \sum_{j=1}^{N} L_{ij} I_j,$$

where $S_i$, $I_i$ are the concentrations, respectively, of the susceptible and the infected individuals in node $i$, $r$ is the infection rate, $D$ the diffusion constant, and $L$ is the Laplacian matrix defined as $L_{ij} = A_{ij} - k_i \delta_{ij}$, where $k_i$ is the degree of node $i$ [2].
To be compatible with the notation of equation (1) we have imposed $r = \alpha$ and $r + D = 1$. A similar formulation has been introduced in reference [22] to infer hidden structures such as the shortest distances from the spreading dynamics. However, at odds with previous results [22,23], we will show that, because the node-level local dynamics biases the global dynamics, such inference is not reliably possible in general. To do so, we first select the most central node of the graph (e.g., the one with the highest betweenness) as the observation node, and then take the time needed for the infection to reach this node as the dynamical observable. More precisely, we initialise the system by infecting a single node of the graph in turn, and then we measure the time needed for the infection to reach a given level of concentration $I_0$ on the observation node. We denote this observable by $RT_i$ and refer to it as the reaching time for the starting node $i$. It is also important to emphasise that the threshold $I_0$ is, in fact, a realistic consideration, commonly known as the tolerance of the measurement instrument. In our case, this means that it would not be possible to distinguish two nodes that have a difference in their infection level smaller than $I_0$.
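The measurement protocol described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for the figures: it assumes the mean-field form $\dot{S}_i = -r S_i I_i + D \sum_j L_{ij} S_j$, $\dot{I}_i = r S_i I_i + D \sum_j L_{ij} I_j$ with $r = \alpha$ and $D = 1 - \alpha$, a forward-Euler integrator, and NumPy being available. The function name `reaching_time`, the initial condition ($S_i = 1$ everywhere, $I = 1$ on the seeded node), the step `dt` and the cutoff `t_max` are hypothetical choices.

```python
import numpy as np

def reaching_time(A, seed, obs, alpha, I0=1e-3, dt=1e-3, t_max=200.0):
    """Time at which the infected concentration on node `obs` first reaches I0
    when the infection is seeded on node `seed` (forward-Euler integration)."""
    r, D = alpha, 1.0 - alpha               # r = alpha and r + D = 1, as in the text
    L = A - np.diag(A.sum(axis=1))          # L_ij = A_ij - k_i * delta_ij
    S = np.ones(A.shape[0])                 # assumed initial susceptible concentration
    I = np.zeros(A.shape[0])
    I[seed] = 1.0                           # assumed size of the initial seed
    t = 0.0
    while t < t_max:
        if I[obs] >= I0:
            return t
        reaction = r * S * I                # local S + I -> 2I term
        S, I = (S + dt * (-reaction + D * (L @ S)),
                I + dt * (reaction + D * (L @ I)))
        t += dt
    return float("inf")                     # threshold not reached within t_max

# Example: path graph of 5 nodes, observation node 0, each node seeded in turn.
A = np.diag(np.ones(4), 1)
A = A + A.T
rt = [reaching_time(A, s, obs=0, alpha=0.5) for s in range(5)]
```

On this toy path graph the reaching time grows with the distance of the seed from the observation node; how strongly the values separate depends on `alpha`, which is the effect discussed in the following sections.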

Local vs. global dynamics in complex networks
In order to show how the global-local interaction modifies the centrality of the nodes, we will compare the reaching time $RT_i$ for each node to the inverse of the corresponding communicability. The reason for choosing the communicability measure as a representative of network centralities is that it acts as an upper bound of the SI dynamics (see supplemental information, stacks.iop.org/NJP/22/063037/mmedia). The communicability for a given couple of nodes $(i, j)$ is defined as $C_{ij} = (e^{\beta A})_{ij} = \sum_{l=0}^{\infty} \beta^{l} (A^{l})_{ij}/l!$, where $\beta$ is the inverse of the temperature following the Green's function formalism [18]. This way, in addition to the geodesic paths for calculating the centrality of node $i$ from node $j$, longer walks also contribute, with a weight $\beta^{l}/l!$ that decays with the inverse factorial of their length $l$. Communicability was initially introduced in reference [17] (with $\beta = 1$) to overcome several disadvantages of other centrality measures based on the idea of the shortest path (betweenness, closeness, etc) [1,2], and has since found many important applications [1,17,18]. An intuitive interpretation of the communicability can be obtained by considering the solution of the linear differential equation $\dot{x} = Ax$, see for instance [29]. In fact, for a given node we have $x_i(t) = \sum_{j} (e^{tA})_{ij}\, x_{0,j} = \sum_{j} (e^{tA})_{ij}$, where, without loss of generality, we considered the $N$-dimensional initial condition $x_0 = (1, 1, \dots, 1)$ and $t$ plays the role of $\beta$. This way, the latter sum, known as the total communicability [30] (of node $i$), is equivalent to the contribution of the flux of the system to that node once the initial condition is considered uniform over the network.
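As a small numerical companion to the definition above, the following Python sketch computes the communicability matrix $e^{\beta A}$ and the total communicability (its row sums). It assumes an undirected network, so that $A$ is symmetric and the matrix exponential can be obtained from the eigendecomposition of $A$; the function names are hypothetical and NumPy is assumed to be available.

```python
import numpy as np

def communicability(A, beta=1.0):
    """Return C = exp(beta * A) for a symmetric adjacency matrix A."""
    eigvals, eigvecs = np.linalg.eigh(A)             # A = V diag(w) V^T
    return (eigvecs * np.exp(beta * eigvals)) @ eigvecs.T

def total_communicability(A, beta=1.0):
    """Row sums of exp(beta * A), i.e. the solution of dx/dt = A x at time
    beta when the initial condition is x(0) = (1, ..., 1)."""
    return communicability(A, beta) @ np.ones(A.shape[0])

# Example: star graph with hub 0 and three leaves.
A = np.zeros((4, 4))
A[0, 1:] = A[1:, 0] = 1.0
tc = total_communicability(A)
```

For the star graph the hub obtains the largest total communicability, in line with the interpretation of this quantity as the flux received by a node under uniform initial conditions.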

The generalised communicability measure
Based on this interpretation, we propose a generalisation of the communicability as $\tilde{C}_{ij} = (e^{\beta F(\alpha, A)})_{ij}$, where now $F(\alpha, A)$ is the nonlinear operator acting on the state vector $x$ that represents the right-hand side of equation (1). Although such an approach should precisely capture the global dynamics, it is of limited practical utility since, in general, it would require exact knowledge of the orbits of the system (by numerical integration), which is not feasible in real scenarios. We will illustrate this by testing several modifications of the original communicability measure, depending on the level of insight that we may have regarding the nature of the process occurring on the network. In a first attempt, we substitute the adjacency matrix in the exponential function of the communicability by the Laplacian matrix, $A \to L$, so that $C^{L}_{ij} = (e^{\beta L})_{ij}$. This can be considered a good first approximation since diffusion is a typical process in networked systems [2,24]. However, as we will show in the following, the Laplacian alone is not sufficient because it does not take into account the local dynamics of the nodes. For this reason, we have extended the idea to the Jacobian matrix $J$ of the linearised system (around the starting steady state), $C^{\mathrm{Jac}}_{ij} = (e^{\beta J})_{ij}$. Note that a weak version of this has been recently proposed in [19] to understand the global dynamics of neuronal networks. Based on these ideas, in figure 1(a) we first analyse the effectiveness of the different definitions of the communicability versus the reaching time observable. It is clear that as one gets more insight into the nature of the process and also its parameters (in this case $\alpha$), the communicability centralities become more useful in understanding the dynamical observation. However, since the nonlinear terms strongly influence the dynamical observables, the reaching time will not perfectly match its corresponding (Jacobian) communicability measure.
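The three linear variants discussed above (adjacency, Laplacian and Jacobian based) can be compared with a short Python sketch. It rests on an assumption that is not taken from the paper: the Jacobian is approximated by the infected block of the SI model linearised around the disease-free state $S^* = 1$, $I^* = 0$, which gives $J = \alpha \mathbb{1} + (1 - \alpha) L$. The names `sym_expm` and `communicability_variants` are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def sym_expm(M, beta=1.0):
    """exp(beta * M) for a symmetric matrix M, via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.exp(beta * w)) @ V.T

def communicability_variants(A, alpha, beta=1.0):
    """Adjacency-, Laplacian- and Jacobian-based communicability matrices."""
    N = A.shape[0]
    L = A - np.diag(A.sum(axis=1))               # L_ij = A_ij - k_i * delta_ij
    # Assumption: infected block of the SI Jacobian at S* = 1, I* = 0.
    J = alpha * np.eye(N) + (1.0 - alpha) * L
    return {
        "adjacency": sym_expm(A, beta),
        "laplacian": sym_expm(L, beta),
        "jacobian": sym_expm(J, beta),
    }

# Example: path graph of 4 nodes.
A = np.diag(np.ones(3), 1)
A = A + A.T
C = communicability_variants(A, alpha=0.3)
```

Under this assumption the Jacobian variant equals $e^{\beta\alpha} e^{\beta(1-\alpha)L}$, so it differs from the Laplacian variant only through the effective diffusion time $\beta(1 - \alpha)$ and a global factor; the nonlinear terms of the full system are not captured by any of the three.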

Competition between the node based dynamics and the network based one
We now turn our attention to the competition between the spatial interactions and the internal dynamics of the nodes by tuning the parameter $\alpha$. For this, we will make use of the generalised communicability $\tilde{C}_{ij}$ introduced above, where $g(x, A)$ denotes the operator of the flux due to the second term on the right-hand side of equation (1). Figure 1(b) shows that when we change the ratio between these two parts of the system, we either lose track of any structure in our system when $\alpha \to \epsilon$ (where $\epsilon$ is an infinitesimal value), or we still keep some topological features in terms of shortest distances but lose the local information on the nodes when $\alpha \to 1 - \epsilon$.

Figure 1. (b) Dependence of the reaching time $RT$ on the parameter $\alpha$ (green $\alpha = 0.15$, red $\alpha = 0.33$, blue $\alpha = 0.5$, cyan $\alpha = 0.91$ and magenta $\alpha \approx 1$). By tuning the parameter $\alpha$ it is possible to emphasise one part of the dynamics over the other. In the limit $\alpha \to 0$, all the nodes would behave the same according to the asymptote (dotted horizontal black line), losing track of the spatial structure. On the other hand, when $\alpha \to 1$ the $RT$ converges to a step function (dashed black curve), which represents the distance shells from the original source of infection, similarly to a contact process. The inset shows the variance $s^2$ as a function of the parameter $\alpha$. In both panels, we used a scale-free network of 100 nodes generated by the Barabási-Albert model, where, in addition, extra links are introduced at random with probability $p = 0.05$ to introduce loops, while avoiding multiple links. The nodes are sorted in shells, each shell containing the nodes at the same shortest distance from the observation node. The infectiveness threshold is fixed in all cases at $I_0 = 0.001$.
In fact, when $\alpha$ decreases towards 0 (but without vanishing completely, otherwise the system retains only the diffusion component and no competition can take place anymore), the diffusion part dominates over the interaction between the susceptible and infected individuals, so that in the limit $t \to \infty$ the generalised communicability $\tilde{C}_{ij}$ is determined by the diffusion operator alone. In other words, the system will first quickly converge to the asymptotic state of the diffusion operator (which in this case is the homogeneous fixed point), spreading in this way the seed of infection equally over all nodes. After the system reaches this state, it is just a matter of time before all the nodes reach the desired level of infection $I_0$ (almost) simultaneously. This explains why the nodes are not distinguishable anymore, as manifested by the flatness of the reaching time vector $RT$. Let us notice that if the random walk Laplacian [2] is used instead, the ranking of the nodes is more robust against this loss of distinguishability, but the result does not change qualitatively (see the supplemental information). On the contrary, when the parameter $\alpha$ increases towards 1 (but never exactly reaching it), $\lim_{t \to \infty} \tilde{C}^{\alpha \to 1^{-}}_{ij}$ behaves as $(e^{\beta g(x, A)})_{ij}$, meaning that the node where the infection is originally seeded would be (almost) immediately fully infected while the diffusion is relatively inactive. In this case, what matters is the graph distance from the node where the infection originates. Since the internal dynamics of the nodes is negligible, the overall dynamics transforms from that of a metapopulation system to that of a contact process [31]. In this extremal case, the network can be considered as being organised in shells, and the nodes inside each shell are not distinguishable anymore. This way, we show that the possibility of inferring the shortest distance from the epidemic spreading is quite limited in a general setting, contrary to what has been argued in references [22,23].
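The first limit can be made concrete with a short Python sketch: for a connected undirected network, the diffusion propagator $e^{tL}$ of the combinatorial Laplacian converges to the uniform matrix with entries $1/N$, so a seed placed on any node ends up equally spread over the network. The function name `diffusion_propagator` and the toy path graph are hypothetical choices for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def diffusion_propagator(A, t):
    """exp(t * L) for the combinatorial Laplacian L = A - diag(k)."""
    L = A - np.diag(A.sum(axis=1))
    w, V = np.linalg.eigh(L)                 # L is symmetric for undirected graphs
    return (V * np.exp(t * w)) @ V.T

# Example: seed placed on one end of a path graph of 5 nodes.
A = np.diag(np.ones(4), 1)
A = A + A.T
x0 = np.zeros(5)
x0[0] = 1.0
for t in (0.1, 1.0, 10.0, 100.0):
    x = diffusion_propagator(A, t) @ x0
    # The spread between the most and least loaded node shrinks with t.
    print(t, x.max() - x.min())
```

Once the seed is homogeneously distributed, the local infection dynamics acts identically on every node, which is why the reaching times become flat in this regime.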
To systematically measure the distinguishability of the nodes from each other, we have also studied the sample variance $s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \mu)^2$, where $X_i$ is the sample and $\mu$ the mean value. As shown in the inset of figure 1(b), the variance increases with the parameter $\alpha$.

Dynamical inference of the network structure: the modularity question
Heretofore we have pointed out that if the range of values taken by the reaching time $RT_i$ over all nodes $i$ is small, then in the presence of noise in the experimental data (due to the stochastic nature of the process and measurements) it is not possible to distinguish the nodes anymore. To further emphasise this point, we consider a strongly modular topology [2,32]. The question of inferring the modularity of real networks from data observations is of crucial importance in modern computational neuroscience, where understanding how and why neurons or neuronal patches are organised in different communities is considered a major step forward in the comprehension of how the brain works [12]. It is well known that in modular networks there are (at least) two different time-scales embedded in the structure: the faster one of the intra-links (connections inside a module) and the slower one of the inter-links (connections between different modules) [33]. To illustrate such behaviour, we complement equation (1) with a noise term, following the Langevin formalism $\dot{\xi}_i = F(\alpha, A)\xi_i + \eta_i$, where now $\xi_i$ is the stochastic state vector and $\eta_i$ is the noise term with mean $\langle \eta_i \rangle = 0$ and correlation $\langle \eta_i(t)\eta_j(t') \rangle = \sigma \delta_{ij} \delta(t - t')$ for each node $i$. The parameter $\sigma$ gives the magnitude of the noise. In figure 2(a), we plot the reaching time variable $RT_i$ for different values of $\alpha$. Despite the presence of noise, for a large value of $\alpha$, when the local dynamics dominates over the global one, the dynamical observable is characterised by different levels or 'time slots' for each module, so the modules are clearly distinguishable from each other.
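A noisy version of the dynamics can be sketched with an Euler-Maruyama step, shown here for the SI metapopulation model. This is only an illustration under stated assumptions: additive white noise with correlation $\sigma \delta_{ij} \delta(t - t')$ acting on both species, so that the integrated noise over a step `dt` has standard deviation $\sqrt{\sigma\, dt}$, and no constraint keeping the concentrations non-negative. The function name `noisy_si_step` is hypothetical and NumPy is assumed to be available.

```python
import numpy as np

def noisy_si_step(S, I, L, alpha, sigma, dt, rng):
    """One Euler-Maruyama step of the SI metapopulation model with additive noise."""
    r, D = alpha, 1.0 - alpha                    # r = alpha and r + D = 1
    reaction = r * S * I                         # local S + I -> 2I term
    amp = np.sqrt(sigma * dt)                    # std of the noise integrated over dt
    S_new = S + dt * (-reaction + D * (L @ S)) + amp * rng.standard_normal(S.shape)
    I_new = I + dt * (reaction + D * (L @ I)) + amp * rng.standard_normal(I.shape)
    return S_new, I_new

# Example: a few noisy steps on a path graph of 4 nodes, seeded on node 0.
A = np.diag(np.ones(3), 1)
A = A + A.T
L = A - np.diag(A.sum(axis=1))
rng = np.random.default_rng(0)                   # fixed seed for reproducibility
S, I = np.ones(4), np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(100):
    S, I = noisy_si_step(S, I, L, alpha=0.5, sigma=1e-6, dt=1e-3, rng=rng)
```

Setting `sigma=0` recovers the deterministic forward-Euler update, which makes it easy to compare noisy and noiseless reaching times on the same network.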
However, as $\alpha$ gets smaller, allowing for the dominance of the network dynamics over the nodal one, the differences in the reaching time between the nodes belonging to different modules decrease too, leading to a loss of distinguishability between the modules, as shown in panel (b). We quantitatively estimate the correlation between different modules, where for two different modules $x$, $y$ we have $\mathrm{corr}_{xy} = \frac{1}{M} \sum_{i=1}^{M} \left| \{ \min RT^{y} < RT^{x}_{i} < \max RT^{y} \} \right|$, that is, the fraction of the $M$ nodes of module $x$ whose reaching time falls within the range spanned by the reaching times of module $y$. The consequences of this phenomenon are straightforward for a given inference procedure. In panels (c) and (d) of figure 2 we consider the scenario of the inevitable misinterpretation of the results during the implementation of a (hypothetical) network reconstruction method. Let us suppose that nodes belonging to two different modules $x$ and $y$ have reaching times $RT_i$ that overlap with each other, in the sense that $\mathrm{corr}_{xy}$ is beyond a prefixed threshold. In this case, it will be impossible to establish to which community the nodes in the overlapping region belong. In the reconstruction protocol (figure 2(d)), for visualisation purposes, we have decided to randomly add as many inter-links as the correlation $\mathrm{corr}_{xy}$ between two given original communities $x$ and $y$. As shown in figure 2(d), once $\alpha$ gets smaller, all the modules (except the seeded module) appear to merge and become indistinguishable, following the previous analytical prediction.
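The overlap measure between two modules can be sketched in plain Python as follows. The sketch reads the definition above as the fraction of nodes in module $x$ whose reaching time lies strictly between the minimum and maximum reaching time of module $y$; the function name `module_overlap` and the numbers in the example are made up for illustration and are not results from the paper.

```python
def module_overlap(rt_x, rt_y):
    """Fraction of reaching times of module x that fall strictly inside
    the interval (min RT^y, max RT^y) spanned by module y."""
    lo, hi = min(rt_y), max(rt_y)
    inside = sum(1 for rt in rt_x if lo < rt < hi)
    return inside / len(rt_x)

# Toy reaching times (arbitrary units, invented for this example).
separated = module_overlap([1.0, 1.2, 1.1], [3.0, 3.4, 3.2])
overlapping = module_overlap([1.0, 2.0, 3.0, 4.0], [2.5, 5.0])
```

With well separated 'time slots' the measure vanishes, while for the second pair half of the nodes of module $x$ cannot be told apart from module $y$ on the basis of their reaching time alone.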

Conclusions
In this paper, we have studied the question of structural parameters and, consequently, that of the inference of hidden structures in the regime of competition between the dynamics occurring at the node level and that at the network level. By generalising the communicability measure and considering a contagion process, we have shown that, in principle, there is no universal way to measure how central a node is relative to the others unless a better understanding of the local and global dynamics is achieved. However, even in this case, measures based on linear operators will fail to distinguish embedded structures biased by strongly nonlinear node dynamics. In this sense, our results constitute a factor of uncertainty, where inferring the structural properties of a network at the global level (e.g., modularity) means sacrificing resolution of the local dynamics of the nodes, and vice versa. Although we have considered here the SI model for illustration purposes, our results go beyond that and extend to the general setting where neither the node dynamics nor the dynamics on the network can be neglected. We believe that such results can potentially open up new scenarios of investigation in different fields where inference methods are crucial in the comprehension of the role that the different topologies of interaction have on the outcome of the dynamical process involved. In particular, we think that the modern field of computational neuroscience, where inference methods are fundamental for understanding the functional and/or structural connections in brain networks, should be immediately affected by our results [34][35][36][37]. In this sense, it should in principle be taken into account that the perception we have of the neuronal activity, based on statistical correlation methods, might be highly corrupted by the competition between the local neuron dynamics and the global dynamics of the network of neurons.
Maybe this perceptive gap, the existence of which we have proven here, can explain the substantial difference that often exists between functional and structural networks [34,37]. Further potential applications can span from the understanding of the dynamics in proteins or metabolic networks [38,39] to the spreading of fake news in online social networks [40].
In more general terms, the main message of this paper can be of crucial importance for present-day applications of data science. Our results clearly state that similarities in the behaviour of the entities of a complex system can be far from being a sign of causality. Such a conclusion makes us confident that the perspective should open to dynamical inference methods, which can really capture the relation between the dynamics and the structure of complex networks.

... $\forall i$ and $\forall t$, which means that the linear dynamical system $\dot{I}(t) = \gamma A I(t)$ is an upper bound for the original non-linear dynamical system. Consequently, a solution for the linearised problem can be written as $I(t) = e^{\gamma t A} I(0)$. It has been recently shown by Lee et al [41] that, after an appropriate renormalisation of $I_i(t)$, the solution of the SI model can be written uniquely (apart from constants) in terms of $e^{\gamma r t A}$, where $\gamma$ is a normalisation parameter. In this context, it is then clear that the communicability function [17], which is defined for a couple of nodes as $C_{ij} = (e^{\beta A})_{ij} = \sum_{l=0}^{\infty} \beta^{l} (A^{l})_{ij}/l!$, is the most suitable choice of centrality to measure the dynamics on the network. We should notice that $\beta$ is here a parameter that globalises the other parameters present in the solution of the SI model. It can be interpreted as an inverse temperature following the Green's function formalism [18].

Appendix B. The case of the random walk Laplacian
In this section we briefly discuss the scenario in which operators other than the combinatorial Laplacian used throughout the main text are considered for modelling the diffusion process. A well-known operator used extensively in network science for describing the collective dispersion dynamics of a group of individuals is the random walk Laplacian $L^{RW}$, whose entries are defined as $L^{RW}_{ij} = A_{ij}/k_j - \delta_{ij}$ [1,2]. Random walk diffusion has been used in a broad range of applications, from human mobility to brain dynamics to social media. An important fact related to the random walk operator is that it does not relax to a homogeneous equilibrium. In fact, the steady state of this operator for each node is proportional to its degree, $x_i(\infty) \propto k_i$. Nevertheless, this property should not distort our view of the qualitative behaviour predicted and described in the main text. Following the same line of analysis as above, we expect that for small values of the control parameter $\alpha$ the ranking of the nodes should gradually fade. However, as can be observed from figure B1, such behaviour starts to become evident only for very small values of $\alpha$.
The explanation behind this outcome can be found in the mathematical definition of the random walk Laplacian. In fact, using the analogy with the combinatorial Laplacian, the random walk one can be defined as $L^{RW} = LK$, where $K = \mathrm{diag}(1/k_1, 1/k_2, \dots, 1/k_N)$. In this sense, we can think of the random walk diffusion as governed by the standard combinatorial Laplacian, but where the diffusion rate $1 - \alpha$ is now weighted locally by the degree of the neighbouring nodes in the way described in the definition of $L^{RW}$. For this reason, it is expected that the ratio of the two components of the system (the local node dynamics versus the global network one) is not simply controlled by the parameter $\alpha$ but also by the degree distribution of the network under discussion. This dependence is twofold: on the one hand, it depends on how dense the network is (e.g. the mean degree $\langle k \rangle$), and on the other hand, it depends on the distribution of degrees, in the sense that the diffusion on a central hub node would be quite different from that on a peripheral leaf node. From this point of view, a random walk diffusion can be considered more robust against losing the ranking of the nodes in the dynamical observable, the reaching time $RT$ in our case. In general, a network process whose dynamics depends implicitly on the local structural properties may behave differently from one where their importance is parametrised explicitly in the system dynamics. Nevertheless, this does not change the qualitative outcome discussed in the main text: no distinguishability of the nodes is possible when the network dynamics is fast compared to the slow node dynamics.
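The relation between the two Laplacians, and the degree-proportional steady state of the random walk one, can be checked numerically with the following Python sketch. It assumes an undirected network without isolated nodes (so that all degrees are positive); the function name `random_walk_laplacian` is hypothetical and NumPy is assumed to be available.

```python
import numpy as np

def random_walk_laplacian(A):
    """L_RW = L K with L = A - diag(k) and K = diag(1/k_1, ..., 1/k_N)."""
    k = A.sum(axis=1)                      # node degrees, assumed strictly positive
    L = A - np.diag(k)                     # combinatorial Laplacian
    return L @ np.diag(1.0 / k)            # entries A_ij / k_j - delta_ij

# Example: star graph with hub 0 and three leaves.
A = np.zeros((4, 4))
A[0, 1:] = A[1:, 0] = 1.0
k = A.sum(axis=1)
L_rw = random_walk_laplacian(A)
residual = L_rw @ k                        # vanishes: the steady state is proportional to k
```

The vanishing `residual` reflects the fact that the degree vector lies in the kernel of $L^{RW}$, so the diffusive part alone drives the system towards a state proportional to the degrees rather than towards a homogeneous one.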