Digraphs are different: Why directionality matters in complex systems

Many networks describing complex systems are directed: the interactions between elements are not symmetric. Recent work has shown that these networks can display properties such as trophic coherence or non-normality, which in turn affect stability, percolation and other dynamical features. I show here that these topological properties have a common origin, in that the edges of directed networks can be aligned - or not - with a global direction. And I illustrate how this can lead to rich and unexpected dynamical behaviour even in the simplest of models.


The importance of being directed
Complex systems - be they cells, ecosystems, brains or financial markets - are invariably made up of many elements interacting in non-trivial ways. A simple yet powerful description of such systems is therefore a graph, or network: a set of vertices representing the elements (genes, species, neurons, banks) connected by edges which capture their interactions [1,2]. Much attention has been devoted to complex networks over the past two decades, and one begins to discern an opinion forming to the effect that the fundamental properties of these constructs are now well understood. This is not yet the case, however, when it comes to directed networks, or digraphs.
It is known that in many, if not perhaps most, complex systems the interactions between elements are not necessarily symmetric, so they are best described by directed networks (in which edges can be represented with arrows rather than lines). Yet while some authors have studied this characteristic and certain of its effects explicitly [3,4,5,6], it is far more common to treat directionality as an afterthought, as though direction were just a random binary number associated with each edge.
In fact, the directions of edges in a network can exhibit a degree of global order somewhat analogous to ferromagnetism in spin systems. In some networks edge directions are indeed statistically independent of each other. But in others they can be aligned to a greater or lesser degree with a global direction. And there is evidence to suggest that this kind of organisation is key to understanding many topological and dynamical features of complex systems.
At least two strands of work on directed networks have recently uncovered some of these effects. On the one hand, there is the observation that the adjacency matrices describing empirical directed networks can be highly non-normal (i.e. they do not commute with their transpose) [7]. On the other, there is the finding that directed networks exhibit trophic coherence (that is, there exists a more or less well-defined hierarchy of vertices, such as among plants, herbivores and carnivores in an ecosystem) [8,1]. As I go on to show, these two features are closely related, and affect many other topological properties, such as whether there will exist a giant strongly connected component of vertices that are mutually reachable. Trophic coherence has also been related to the prevalence of motifs such as feed-forward loops [10], and to intervality, a property associated with food webs but observed in other directed networks too [11].
Directionality can also have a crucial effect on dynamical systems. In previous work we have shown, for instance, that trophic coherence is sometimes a determining factor in ecosystem stability [8], or whether spreading processes such as epidemics will become endemic [12]. And both trophic coherence and non-normality are reflected in graph eigenspectra, which in turn can be related with the stability of dynamical systems [1,7]. I show here another example, compelling for its simplicity and richness of behaviour. I simulate a system of binary variables, updated at every time step according to the majority rule, on the neural architecture of the only fully mapped animal brain, that of the worm C. elegans [36]. Because of the worm's trophic coherence, the activity on its network is markedly different from that on a random graph, hopping between states where its random counterpart is stable.
The main conclusion is that there is a common origin to many of the distinctive features of certain directed networks: a global ordering of edge directions which leads to very different topological and dynamical properties than we might have expected from naïvely extending results for undirected networks to the directed case. However, we still have much to learn about digraphs and the effects of directionality on complex systems.

Trophic levels and coherence
Consider a directed graph with adjacency matrix $A$ (where an element $a_{ij} = 1$ means there is a directed edge from vertex $v_j$ to vertex $v_i$, whereas $a_{ij} = 0$ if not). There are $N$ vertices, $L$ edges, $B$ basal vertices (i.e. vertices with no incoming edges), and $L_B$ basal edges (edges connected to basal vertices). Each vertex $v_i$ has an in-degree $k_i^{\mathrm{in}} = \sum_j a_{ij}$, and an out-degree $k_i^{\mathrm{out}} = \sum_j a_{ji}$. The standard definition of the trophic level of vertex $v_i$ is
$$s_i = 1 + \frac{1}{k_i^{\mathrm{in}}} \sum_j a_{ij} s_j, \qquad (1)$$
unless $v_i$ is basal (i.e. $k_i^{\mathrm{in}} = 0$), in which case $s_i = 1$ by (ecological) convention [14]. Trophic coherence is the extent to which a network is well organised into trophic levels, and can be measured in the following way [8]. We assign to each edge a trophic difference, $x_{ij} = s_i - s_j$. For a given network, the distribution of differences over all edges, $p(x)$, will have mean $[x] = 1$ and variance $\sigma^2 = [x^2] - 1$, where $[\cdot] = L^{-1} \sum_{ij} a_{ij} (\cdot)$ indicates an average over edges. We define the incoherence parameter, $q$, as the standard deviation of $p(x)$, $q = \sigma$. A perfectly coherent network, in which vertices fall into clearly defined trophic levels (with integer values for these) will have $q = 0$. Larger values of $q$ indicate a departure from this well-ordered state.
Eq. (1) can be written in matrix form as
$$\Lambda s = z, \qquad (2)$$
where $s$ is the vector of trophic levels, $z_i = \max(k_i^{\mathrm{in}}, 1)$ and $\Lambda = \mathrm{diag}(z) - A$. Each vertex can be assigned a unique trophic level if and only if $\Lambda$ is invertible. Because the sum of elements of $\Lambda$ over any row corresponding to a non-basal vertex is zero, $\Lambda$ will be singular for graphs with no basal vertices (i.e. there will be a right eigenvector $u = (1, 1, \ldots, 1)^T$ with eigenvalue $\lambda = 0$). Therefore, the definition of trophic levels depends on there being at least one basal vertex.
Figure 1 shows two directed networks - the Ythan Estuary food web [25] (panels A and B) and the C. elegans neural network [36] (C and D) - each plotted in two different ways. On the left (panels A and C) the height of each vertex corresponds to its trophic level, as indicated by the vertical axes. For comparison, panels on the right (B and D) are plotted according to a standard energy-minimisation method for graph visualisation: such layouts are good at highlighting community structure, but do not give any indication about the trophic structure. These examples illustrate how the trophic level of a vertex corresponds to its position in a hierarchy. For instance, in a food web biomass usually originates in plants (basal or source vertices), flows through herbivores, then different kinds of omnivores or carnivores, and ends in apex predators (sinks). Similarly, in a neural network, information enters through the sensory neurons, is processed through various kinds of inter-neuron, and finally reaches the motor neurons. Using trophic levels to determine vertex function has long been standard in ecology, but it appears likely that this classification would be informative in a wide variety of complex systems describable as directed networks.
Figure 1. Two directed networks: the Ythan Estuary food web (A and B) and the C. elegans neural network (C and D). In panels A and C, the height of each vertex corresponds to its trophic level. In panels B and D, networks are plotted according to a standard energy-minimisation algorithm. In all cases, vertices belonging to the largest strongly connected component are represented as diamonds, while the rest appear as circles.
Note that the definition of trophic levels, and hence coherence, can be easily extended to the case of weighted networks, by considering a non-binary adjacency matrix. It is also possible to define trophic levels, and coherence, on $A^T$ instead of on $A$, with sink vertices taking on the role of basals. For simplicity I focus here only on unweighted networks and levels as defined by Eq. (1).
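
As an illustration of the definitions in this section, the following is a minimal sketch (not the code used for the paper) of how trophic levels and the incoherence parameter q might be computed for a small unweighted network. It assumes NumPy is available and that the matrix form of the trophic-level definition is the linear system Λs = z, with z and Λ as defined above; the function names `trophic_levels` and `incoherence` are hypothetical.

```python
import numpy as np

def trophic_levels(A):
    """Solve (diag(z) - A) s = z with z_i = max(k_in_i, 1).

    Convention from the text: A[i, j] = 1 means an edge from vertex j to vertex i.
    """
    A = np.asarray(A, dtype=float)
    k_in = A.sum(axis=1)                 # in-degree = row sum under this convention
    if not np.any(k_in == 0):
        raise ValueError("no basal vertex: Lambda is singular")
    z = np.maximum(k_in, 1.0)
    Lam = np.diag(z) - A
    # np.linalg.solve raises LinAlgError if Lambda is still singular,
    # e.g. when some component contains no basal vertex.
    return np.linalg.solve(Lam, z)

def incoherence(A):
    """Incoherence q: standard deviation of x_ij = s_i - s_j over all edges."""
    A = np.asarray(A, dtype=float)
    s = trophic_levels(A)
    i, j = np.nonzero(A)                 # one (i, j) pair per edge j -> i
    x = s[i] - s[j]
    return float(np.std(x))              # the mean of x is 1 by construction

# A chain 0 -> 1 -> 2 is perfectly coherent; adding 0 -> 2 makes it incoherent.
chain = np.zeros((3, 3))
chain[1, 0] = chain[2, 1] = 1
print(trophic_levels(chain), incoherence(chain))
```

For the chain the levels are 1, 2, 3 and q = 0; adding the shortcut edge from vertex 0 to vertex 2 gives the top vertex a non-integer level of 2.5 and a non-zero q.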

Graph ensembles
A fruitful approach in studying random graphs is to consider ensembles, or sets of possible networks which meet certain constraints. For instance, the Erdős-Rényi ensemble is the set of all possible undirected networks with $N$ vertices and $L$ edges [16], while the directed configuration ensemble comprises all directed networks with given in- and out-degree sequences, $k^{\mathrm{in}}$ and $k^{\mathrm{out}}$ [17].
In Ref. [1] we present results based on the coherence ensemble, which is defined as the directed configuration ensemble with the added constraint of a given trophic coherence. We also make use of the basal ensemble, which is again based on the directed configuration ensemble, with the extra requisite that all non-basal vertices receive the same proportion $L_B/L$ of incoming edges from basal vertices. The basal ensemble is equivalent to the directed configuration ensemble in the limit $N \to \infty$, with $L/N \to \infty$. Expected values of properties $y$ in these ensembles are denoted $E(y) = \bar{y}$ in the coherence ensemble, and $E(y) = \tilde{y}$ in the basal ensemble. Graph ensembles such as these not only provide a powerful mathematical tool to investigate the topological properties of large networks; they can also be used as null models for ascertaining the extent to which measurements on empirical networks are statistically significant. For example, comparing the $q$ value of a network with its basal expectation $\tilde{q}$ reveals whether it is more or less coherent than would be expected from its degree sequence if all else were random. Thus, in our data set, the food webs have a mean ratio of $q/\tilde{q} = 0.44 \pm 0.17$, while for the metabolic networks it is $q/\tilde{q} = 1.81 \pm 0.11$, which implies significant coherence and incoherence, respectively, for each class (details of each network can be found in the Supplementary Material (SM)) [1].
In the coherence ensemble, we have shown that, in expectation,
$$\overline{\mathrm{Tr}(A^\nu)} = e^{\nu \tau}, \qquad (3)$$
with
$$\tau = \ln \alpha + \frac{1}{2\tilde{q}^2} - \frac{1}{2q^2}, \qquad (4)$$
where $\tau$ is the loop exponent, and $\alpha = \langle k^{\mathrm{in}} k^{\mathrm{out}} \rangle / \langle k \rangle$ is the branching factor (the notation $\langle \cdot \rangle = N^{-1} \sum_i (\cdot)$ stands for an average over vertices). The basal-ensemble expectations for $q$ and $\alpha$ are $\tilde{q} = \sqrt{L/L_B - 1}$ and $\tilde{\alpha} = (L - L_B)/(N - B)$ [1]. From this, one can derive expectations for various topological properties as a function of trophic coherence. Moreover, we have found that several kinds of empirical network - including food webs and examples of genetic, metabolic, neural, international trade, P2P and word adjacency networks - conform closely to these coherence-ensemble expectations.
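
The quantities in this paragraph can be evaluated from a handful of counts. The sketch below is illustrative only: it takes the basal-ensemble expectations as q̃ = sqrt(L/L_B − 1) and α̃ = (L − L_B)/(N − B), and it assumes that the loop exponent has the form τ = ln α + 1/(2q̃²) − 1/(2q²) given in Ref. [1]. That form should be checked against Ref. [1] before being relied upon; the function names are hypothetical.

```python
import math

def basal_expectations(N, L, B, L_B):
    """Basal-ensemble expectations (q_tilde, alpha_tilde) from simple counts."""
    q_tilde = math.sqrt(L / L_B - 1.0)
    alpha_tilde = (L - L_B) / (N - B)
    return q_tilde, alpha_tilde

def loop_exponent(alpha, q, q_tilde):
    """ASSUMED form: tau = ln(alpha) + 1/(2 q_tilde^2) - 1/(2 q^2).

    Requires q_tilde > 0; a perfectly coherent network (q = 0) gives -infinity.
    """
    if q == 0:
        return -math.inf
    return math.log(alpha) + 0.5 / q_tilde**2 - 0.5 / q**2

q_t, a_t = basal_expectations(N=10, L=20, B=2, L_B=5)
print(q_t, a_t, loop_exponent(a_t, q=0.3, q_tilde=q_t))
```

Under this assumed form, τ reduces to ln α when q = q̃ and becomes arbitrarily negative as q → 0, which matches the two limiting cases described in the text.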
According to Gelfand's formula [18], the spectral radius of $A$ is
$$\rho(A) = \lim_{\nu \to \infty} \| A^\nu \|^{1/\nu} \qquad (5)$$
for any matrix norm $\| \cdot \|$. Taking the norm in Eq. (5) to be the trace, we can use Eq. (3) to find the expected value of the spectral radius in the coherence ensemble:
$$\bar{\rho} = e^{\tau}. \qquad (6)$$
The loop exponent given by Eq. (4) can take positive or negative values. If $\alpha > 1$ and the network in question has the trophic coherence of the basal ensemble ($q = \tilde{q}$), the loop exponent $\tau$ is positive and the spectral radius is $\bar{\rho} = \alpha$, as in the directed configuration ensemble. However, if the network is sufficiently coherent ($q \to 0$), $\tau$ will be negative. In this case, the spectral radius tends to zero: $\bar{\rho} \to 0$. We classify networks into the loopful ($\tau > 0$) and loopless ($\tau < 0$) regimes, since the two have markedly different topological properties. For instance, the number of directed cycles of length $\nu$ grows exponentially with $\nu$ in the loopful regime, while it decays exponentially to zero in the loopless one. I am referring here to directed cycles in which the same vertex can appear more than once (circuits), not to 'simple cycles' in which this is not allowed. Domínguez-García et al. found that the simple cycles in several kinds of directed network seem to be suppressed because of an 'inherent directionality' [19]. In Ref. [1] we discuss how this effect, too, can be explained by considering the coherence ensemble.
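
Gelfand's formula can be checked numerically on small examples. The following sketch, which assumes NumPy and uses the Frobenius norm as one convenient choice of matrix norm (the text takes the trace instead), compares the spectral radius obtained from the eigenvalues with the finite-ν estimate ||A^ν||^(1/ν). The function names are hypothetical.

```python
import numpy as np

def spectral_radius(A):
    """Largest eigenvalue modulus of A."""
    A = np.asarray(A, dtype=float)
    return float(np.max(np.abs(np.linalg.eigvals(A))))

def gelfand_estimate(A, nu=50):
    """Finite-nu Gelfand estimate ||A^nu||_F^(1/nu) of the spectral radius."""
    P = np.linalg.matrix_power(np.asarray(A, dtype=float), nu)
    return float(np.linalg.norm(P, "fro") ** (1.0 / nu))

# Directed 3-cycle (edges 0 -> 1 -> 2 -> 0): spectral radius 1.
cycle = np.array([[0, 0, 1],
                  [1, 0, 0],
                  [0, 1, 0]])
# Directed chain 0 -> 1 -> 2 (acyclic): A is nilpotent, so the radius is 0.
chain = np.array([[0, 0, 0],
                  [1, 0, 0],
                  [0, 1, 0]])
print(spectral_radius(cycle), gelfand_estimate(cycle))
print(spectral_radius(chain), gelfand_estimate(chain, nu=5))
```

The acyclic chain illustrates the extreme loopless case: every power of A beyond the longest path vanishes, so the estimate is exactly zero.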

Strong connectivity
A directed graph is said to be strongly connected if it is possible to reach any vertex from any other along a directed path. It is weakly connected if this is possible when edge directions are ignored. A weakly connected graph may have strongly connected subgraphs, and the largest of these is the 'strongly connected component' (SCC). Directed cycles are strongly connected subgraphs, and large strongly connected subgraphs necessarily contain long cycles. So it follows from the analysis above that in the loopless regime the SCC will be vanishingly small - whereas it will comprise a finite proportion of any network in the loopful regime.
Figure 2. Fraction of non-basal vertices in the strongly connected component, Φ, against loop exponent, τ, for several empirical networks. Inset: Φ against q for networks generated with the 'preferential preying model' [12].
In Fig. 1, vertices belonging to the SCC, in each network, are represented with diamonds, in contrast to the circles used for other vertices. In the food web (τ = −1.32) the SCC only has two vertices, while in the neural network (τ = 2.17) the SCC includes most of the vertices. Figure 2 shows the size of the SCC as a fraction of the number of non-basal vertices, Φ, against τ for several empirical networks of various kinds. We observe that Φ ≃ 0 when τ < 0, and Φ > 0 when τ > 0. Details of each network can be found in the SM. This disparity in strong connectivity could not be understood by simply extending known results for undirected networks, according to which the main factor determining Φ is the mean degree, $\langle k \rangle$ [1]. Here we see some networks, such as most of the metabolic ones (Table S3 of SM), with Φ > 0.9 and $\langle k \rangle < 3$; whereas many of the food webs (Table S1 of SM) have Φ ≃ 0 despite being much denser ($\langle k \rangle > 10$). It is only by measuring their trophic coherence, and hence τ, that the reason for this becomes clear.
Networks with specified trophic coherence can be generated computationally with the 'preferential preying model' [12], which is described below in Methods 0.1. The inset in Fig. 2 shows how Φ varies with q in networks simulated with this model. The model displays what appears to be a continuous (i.e. second order) phase transition in strong connectivity with coherence, reminiscent of the percolation transition observed in undirected, Erdős-Rényi random graphs with mean degree [1].
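
For readers who wish to measure Φ on their own data, here is a minimal sketch in pure Python. It finds the largest strongly connected component with Kosaraju's two-pass algorithm (one standard choice, not necessarily the method used for the paper) and divides its size by the number of non-basal vertices. Edges are given as hypothetical (source, target) pairs and the function names are illustrative.

```python
def largest_scc(n, edges):
    """Vertices of the largest strongly connected component (Kosaraju)."""
    out_nb = [[] for _ in range(n)]
    in_nb = [[] for _ in range(n)]
    for u, v in edges:                    # directed edge u -> v
        out_nb[u].append(v)
        in_nb[v].append(u)

    # Pass 1: iterative DFS on the graph, recording vertices by finish time.
    order, seen = [], [False] * n
    for root in range(n):
        if seen[root]:
            continue
        seen[root] = True
        stack = [(root, iter(out_nb[root]))]
        while stack:
            u, it = stack[-1]
            for v in it:
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, iter(out_nb[v])))
                    break
            else:                         # all out-neighbours explored
                order.append(u)
                stack.pop()

    # Pass 2: DFS on the reversed graph, in decreasing finish time.
    best, assigned = [], [False] * n
    for root in reversed(order):
        if assigned[root]:
            continue
        assigned[root] = True
        comp, stack = [], [root]
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in in_nb[u]:
                if not assigned[v]:
                    assigned[v] = True
                    stack.append(v)
        if len(comp) > len(best):
            best = comp
    return sorted(best)

def scc_fraction(n, edges):
    """Phi: size of the largest SCC over the number of non-basal vertices."""
    has_in = [False] * n
    for _, v in edges:
        has_in[v] = True
    non_basal = sum(has_in)
    return len(largest_scc(n, edges)) / non_basal if non_basal else 0.0

edges = [(0, 1), (1, 2), (2, 1), (2, 3)]
print(largest_scc(4, edges), scc_fraction(4, edges))
```

Note that in an acyclic graph every vertex is a trivial SCC of size one, so this sketch returns 1/(N − B) rather than exactly zero; the difference is negligible for large networks.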

Non-normality
A real N × N matrix $A$ is said to be normal if it commutes with its transpose,
$$A A^T = A^T A. \qquad (7)$$
Clearly, if $A$ is the adjacency matrix of a network, the network must be directed for $A$ to be non-normal. Intuitively, we might expect a large deviation from normality to indicate a network with a well-defined directionality. This is also what occurs in trophically coherent networks.
Asllani et al. have recently shown that a wide variety of empirical networks are highly non-normal, and they discuss the significant implications for dynamical systems with such a structure [7]. To quantify this property they use Henrici's departure from normality:
$$d_F(A) = \sqrt{\| A \|_F^2 - \sum_{i=1}^{N} |\lambda_i|^2}, \qquad (8)$$
where $\lambda_i$ are the eigenvalues of $A$ and $\| A \|_F = \sqrt{\sum_{ij} |a_{ij}|^2}$ is the Frobenius norm [20]. And, in order to compare matrices of different sizes, they use the normalised version:
$$\hat{d}_F(A) = \frac{d_F(A)}{\| A \|_F}. \qquad (9)$$
A normal matrix will have $\hat{d}_F = 0$, and $\hat{d}_F$ is closer to 1 the more $A$ departs from normality.
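
The normalised departure from normality is straightforward to evaluate numerically. The sketch below assumes NumPy and uses the standard form of Henrici's measure, the square root of ||A||_F² − Σ|λ_i|², divided by ||A||_F; the function name is hypothetical.

```python
import numpy as np

def normalised_departure(A):
    """Henrici's departure from normality divided by the Frobenius norm."""
    A = np.asarray(A, dtype=float)
    fro2 = float(np.sum(A**2))                    # ||A||_F^2 (= L if A is binary)
    eig2 = float(np.sum(np.abs(np.linalg.eigvals(A))**2))
    return float(np.sqrt(max(fro2 - eig2, 0.0) / fro2))   # clip round-off

undirected = np.array([[0, 1], [1, 0]])                  # symmetric: normal
cycle = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]])      # directed 3-cycle
chain = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]])      # directed chain
for name, M in [("undirected", undirected), ("cycle", cycle), ("chain", chain)]:
    print(name, normalised_departure(M))
```

The directed cycle is a useful caveat: its adjacency matrix is a permutation matrix and hence normal, so directedness alone does not imply non-normality. The acyclic chain, by contrast, has only zero eigenvalues and attains the maximum value of 1.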

Trophic coherence and non-normality
Using results for the coherence ensemble, it is possible to relate trophic coherence with non-normality. In particular, the following theorem holds in this ensemble:
Theorem. The expected normalised deviation from normality, $E(\hat{d}_F)$, for digraphs drawn from the coherence ensemble tends to 1 with increasing trophic coherence. That is,
$$\lim_{q \to 0} E(\hat{d}_F) = 1. \qquad (10)$$
Furthermore, for digraphs in the $\tau < 0$ regime,
$$E(\hat{d}_F) > \sqrt{1 - \frac{1}{\langle k \rangle}}, \qquad (11)$$
where $\langle k \rangle = L/N$ is the mean degree.
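Proof. For a binary (i.e. unweighted) adjacency matrix $A$ we have $\| A \|_F^2 = L$, where $L$ is the number of edges. So we can express the normalised departure from normality as
$$\hat{d}_F = \sqrt{1 - \frac{1}{L} \sum_{i=1}^{N} |\lambda_i|^2}. \qquad (12)$$
Let $\rho = \rho(A)$ be the spectral radius of $A$. Since $|\lambda_i| \le \rho$ for all $i$, and $|\lambda_j| = \rho$ for at least one $j$, we have that
$$\rho^2 \le \sum_{i=1}^{N} |\lambda_i|^2 \le N \rho^2. \qquad (13)$$
We can use these bounds on $\sum_{i=1}^{N} |\lambda_i|^2$ to define a lower bound, $d_F^L$, and an upper bound, $d_F^U$, on $\hat{d}_F$ in terms of the spectral radius $\rho$:
$$d_F^L = \sqrt{1 - \frac{N \rho^2}{L}} = \sqrt{1 - \frac{\rho^2}{\langle k \rangle}}, \qquad (14)$$
$$d_F^U = \sqrt{1 - \frac{\rho^2}{L}}. \qquad (15)$$
Inserting Eq. (6) into Eqs. (14) and (15) provides lower and upper bounds on the expected deviation from normality in the coherence ensemble:
$$\bar{d}_F^L = \sqrt{1 - \frac{e^{2\tau}}{\langle k \rangle}}, \qquad (16)$$
$$\bar{d}_F^U = \sqrt{1 - \frac{e^{2\tau}}{L}}. \qquad (17)$$
Since $\tau \to -\infty$ as $q \to 0$, Eq. (16) implies that
$$\lim_{q \to 0} E(\hat{d}_F) = 1. \qquad (18)$$
Moreover, in the loopless regime ($\tau < 0$) we have
$$E(\hat{d}_F) > \sqrt{1 - \frac{1}{\langle k \rangle}}. \qquad (19)$$
In other words, coherent networks are highly non-normal.
Figure 3 shows the normalised deviation from normality, $\hat{d}_F$, against the loop exponent, $\tau$, for the same set of empirical networks used in Fig. 2. The blue line is $\bar{d}_F^L$ as given by Eq. (16) for the case of the network with lowest mean degree in the set; while the red line is $\bar{d}_F^U$ as given by Eq. (17) for the network with the largest number of edges. The inset shows Eq. (16) for various different mean degrees. We can observe that the real networks with small or negative $\tau$ are indeed highly non-normal, and the bounds obtained for the coherence ensemble hold for these empirical cases too.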
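
The spectral-radius bounds used in the proof can be verified on any particular binary matrix. The sketch below assumes NumPy and computes the normalised departure from normality together with the lower bound sqrt(1 − Nρ²/L) and the upper bound sqrt(1 − ρ²/L), which follow from ρ² ≤ Σ|λ_i|² ≤ Nρ². The function name is hypothetical, and the lower bound is clipped to zero when Nρ² exceeds L, in which case it carries no information.

```python
import numpy as np

def departure_and_bounds(A):
    """Return (d_hat, lower, upper) for a binary adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    L = float(np.sum(A**2))                      # number of edges for binary A
    lam = np.linalg.eigvals(A)
    rho = float(np.max(np.abs(lam)))
    d_hat = np.sqrt(max(1.0 - float(np.sum(np.abs(lam)**2)) / L, 0.0))
    lower = np.sqrt(max(1.0 - N * rho**2 / L, 0.0))   # vacuous if N rho^2 > L
    upper = np.sqrt(max(1.0 - rho**2 / L, 0.0))
    return float(d_hat), float(lower), float(upper)

# Edges 0 -> 1, 1 -> 2, 2 -> 1, 2 -> 3, stored as A[target, source] = 1.
A = np.zeros((4, 4))
A[1, 0] = A[2, 1] = A[1, 2] = A[3, 2] = 1
print(departure_and_bounds(A))
```

In this example the only non-zero eigenvalues are ±1, coming from the two-vertex cycle, so the departure is sqrt(1/2), which sits between a vacuous lower bound of 0 and an upper bound of sqrt(3/4).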

Dynamical stability
We have shown in previous work that trophic coherence can have an important bearing on the dynamics of complex systems. In particular, it affects linear stability in food-web models [8], and percolation in spreading processes [12]. I go on to show another example in which trophic coherence has a remarkable effect on the stability of one of the simplest dynamical systems.
Consider a set of variables on the vertices of a network which can, at each discrete time step $t$, take either of two states, $\sigma_i(t) = \pm 1$, according to the 'majority rule' applied to the states of in-neighbours. In other words, if $h_i(t) = \sum_j a_{ij} \sigma_j(t)$, then $\sigma_i(t + 1) = +1$ if $h_i(t) > 0$, and $\sigma_i(t + 1) = -1$ if $h_i(t) < 0$. If $h_i(t) = 0$, one of the two states is chosen randomly with equal probability (this is the only source of stochasticity in the dynamics). The variables are updated in parallel at each $t$, and the overall state of the system can be measured with the mean activity, $m(t) = \langle \sigma(t) \rangle$. This dynamics is a version of the 'majority rule' model used as a simple approximation to opinion formation [21], and coincides with a Hopfield neural network model when all synaptic weights are equal, and with a zero-temperature Ising model on a directed network [22].
[...] Fig. 1. The red line is for a randomisation of this same network, achieved by repeatedly choosing two pairs of connected vertices, and swapping the out-neighbours. This randomisation preserves the in- and out-degrees of every node while destroying other structure. Activity on the randomised network is stable, with the mean activity remaining in this case close to −1 (other simulations are equally likely to adopt $m(t) \simeq +1$). However, on the empirical network the mean activity switches between positive and negative states. This instability must be caused by some topological difference between the two networks other than the degree sequences. Figure 4.B shows the same dynamics but on two networks generated with the 'preferential preying model' [12], using the same numbers of nodes, basal nodes and edges as the empirical network has. The red line is for a network with the coherence of a random graph ($q \simeq \tilde{q}$), and the cyan line for a highly coherent one ($q \simeq 0$).
The incoherent network presents stable dynamics, like the randomised version of the neural network; while the coherent one is completely unstable, with $m(t)$ changing sign every few time steps. Figure 4.C shows $m(t)$ on a network with intermediate trophic coherence, similar to the neural network's value ($q/\tilde{q} \simeq 0.4$). Now we recover the bistability of the empirical topology, which suggests that it is indeed trophic coherence which accounts for this dynamical behaviour. Figure 4.D shows average results of such simulations on preferential preying networks with different numbers of nodes and basal nodes. The proportion of time steps in which $m(t)$ changes sign, $P_{\mathrm{flip}}$, is plotted against $q$, revealing what appears to be a continuous transition between an unstable and a stable phase, at $q \simeq 1$. Intuitively, we can understand this behaviour by considering that the basal vertices always have $h_i = 0$, and so take either state +1 or −1 with a 1/2 probability at each $t$. In a maximally coherent network, this random configuration propagates up through the levels, leading to fully unstable behaviour. On the other hand, when the network is highly incoherent, with a large strongly connected component, most vertices are less susceptible to the influence of the basal vertices, and tend instead to preserve the average state of the network.
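
The majority-rule dynamics described above can be sketched in a few lines of pure Python. This is an illustrative implementation, not the C++ code used for the paper: the network is passed as a hypothetical list `in_nb`, where `in_nb[i]` holds the in-neighbours of vertex i, the initial state is drawn uniformly at random (an assumption), and `flip_fraction` estimates the proportion of time steps in which m(t) changes sign.

```python
import random

def majority_step(in_nb, state, rng):
    """One parallel update; ties (h_i = 0) are broken at random."""
    new = []
    for nbrs in in_nb:
        h = sum(state[j] for j in nbrs)      # field from in-neighbours
        if h > 0:
            new.append(1)
        elif h < 0:
            new.append(-1)
        else:
            new.append(rng.choice((-1, 1)))  # includes all basal vertices
    return new

def simulate(in_nb, steps, seed=0):
    """Return the mean activity m(t) for t = 1..steps."""
    rng = random.Random(seed)
    n = len(in_nb)
    state = [rng.choice((-1, 1)) for _ in range(n)]   # assumed random start
    m = []
    for _ in range(steps):
        state = majority_step(in_nb, state, rng)
        m.append(sum(state) / n)
    return m

def flip_fraction(m):
    """Proportion of consecutive time steps at which m(t) changes sign."""
    if len(m) < 2:
        return 0.0
    flips = sum(1 for a, b in zip(m, m[1:]) if a * b < 0)
    return flips / (len(m) - 1)

# Toy example: one basal vertex feeding a chain of two vertices.
m = simulate([[], [0], [1]], steps=200, seed=1)
print(flip_fraction(m))
```

Because basal vertices have no in-neighbours, their field is always zero and they are redrawn at random at every step, which is the only source of noise in the dynamics.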
It is possible to carry out a mean-field approximation for networks drawn from the basal ensemble, which reveals a critical $q$ separating stability from instability at $q_c = 1$ (this is done in Methods 0.2). Although this is not such a straightforward exercise for the coherence ensemble, it appears from numerical simulations that the same critical value may also apply to such networks (see Fig. 4.D).
The fact that the emergence of bistability stems from the stochasticity of the basal vertices might seem like an artefact of this toy model. But there are many networks in which these vertices may represent sources of information - e.g. sensory neurons, oligarchs, oil-producing nations - which cascades through the system. These results suggest that how the system reacts to novel information entering through the basal vertices depends strongly on trophic coherence.

Concluding Remarks
Directed networks can exhibit topological properties which are much more than trivial extensions of the features available to their undirected counterparts. Edges can be organised according to a global direction in a way analogous to the alignment of spins in a ferromagnet. Some hallmarks of this phenomenon have recently been identified, such as trophic coherence [8] and non-normality [7]. One of the results in this paper is that these two properties are closely related.
Directed networks can belong to either of two regimes [1]. In the 'loopful' regime edges are not strongly aligned with a global direction. This manifests in topologies that are trophically incoherent, with small deviations from normality, large spectral radii, large strongly connected components, and numbers of cycles which grow exponentially with length. In the 'loopless' regime, however, edges are organised according to a global direction. This translates into all the above properties being inverted: networks are highly coherent and non-normal, spectral radii and strongly connected components are vanishing, and numbers of cycles decay exponentially with length. All these properties can be related, at least in expectation, to the 'loop exponent' τ, which is a function only of trophic coherence and the in- and out-degree sequences. The sign of τ determines which regime a network belongs to.
These topological features can have an important influence on the behaviour of dynamical systems in which the interactions between elements are not symmetric. In previous work we have shown that trophic coherence increases linear stability in ecological models, and provides a possible solution to May's paradox - i.e. the fact that large ecosystems seem to be more stable than smaller ones [23,8]. Asllani et al. have also related non-normality with stability [7]. In the case of spreading processes like models of epidemics, trophic coherence can determine whether activity (e.g. an infection) dies out quickly or becomes endemic [12].
The example in this paper shows that the relationship between coherence and stability is not straightforward, and that quite rich and unexpected behaviour can emerge even in a very simple dynamical model. A system of binary elements updated according to the majority rule is stable on incoherent networks, as would also be the case on an undirected network. However, on a highly coherent network the system is completely unstable. And in the case of an intermediate coherence the mean activity hops between metastable states, with switching times that depend on coherence. Given that many real-world networks - such as the C. elegans neural network used here for illustration - have intermediate levels of coherence, this may be an important effect in a wide variety of complex, dynamical systems.
These results suggest that a complete understanding of the relationship between structure and function in complex systems will require further fundamental research, in particular as regards topological properties related to edge directionality. While it is possible to tie together several of these properties - such as trophic coherence, non-normality and strong connectivity - we have not yet considered how these might interact with other topological characteristics, like degree distributions, community structure or assortativity [1,24]. And given the highly disparate kinds of dynamical behaviour we have seen even between quite simple models, it is clear that a more exhaustive investigation is needed.
Two tools which could be of use in such an endeavour are the preferential preying model, which provides a way of generating networks with specified coherence numerically [8,12]; and the coherence ensemble, a theoretical approach which allows one to investigate the relationships between properties mathematically [1].

The preferential preying model
We can generate networks with a given trophic coherence with the model used in Ref. [12], which is a generalisation of the one first proposed in Ref. [8]. (The original version is loosely inspired by immigration of species into an ecosystem, and somewhat resembles Barabási and Albert's famous preferential attachment model [1] - hence the name.) We begin with $B$ basal vertices and proceed to introduce $N - B$ non-basal vertices sequentially. Each vertex is initially assigned a single in-neighbour, chosen randomly from among the vertices (basal and non-basal) already in the network when it arrives. At this stage vertex $v_i$ has a preliminary trophic level $s'_i$, as given by Eq. (1) (note that this simply means assigning $s'_i = s'_j + 1$ if $v_i$ is given $v_j$ as its in-neighbour). We then introduce the remaining $L - N + B$ edges needed to make up a total of $L$. For this, each pair of nodes $\{v_i, v_j\}$ such that $v_i$ is a non-basal vertex is attributed a temporary trophic distance $x'_{ij} = s'_i - s'_j$. Edges between pairs are then placed with a probability proportional to
$$\exp\left[ -\frac{(x'_{ij} - 1)^2}{2T^2} \right] \qquad (20)$$
until there are $L$ edges in the network. The 'temperature' parameter $T$ sets the network's trophic coherence, with $T = 0$ yielding maximally coherent networks ($q = 0$), and incoherence increasing monotonically with $T$. The specific choice for the edge probability is arbitrary, but the form in Eq. (20) is conducive to a Gaussian distribution of distances $x$, which we have found to be a good fit to empirical data on several kinds of networks. Unless a maximally coherent network is intended, one must then recalculate the actual trophic levels $s$ of the final networks, according to Eq. (2), and measure $q$ based on these. Although in practice $q$ will not generally be equal to $T$, one can easily obtain, for given $\{N, B, L\}$, the value of $T$ which best approximates the intended $q$, thanks to the monotonic relation between $T$ and $q$ [8,12].
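
The procedure above can be sketched as follows in pure Python. This is one possible reading of the model rather than the reference implementation: it assumes that the edge weight is the Gaussian form exp[−(x′ − 1)²/(2T²)], that T > 0, that self-loops and repeated edges are excluded, and that the extra edges are drawn one at a time without replacement with probability proportional to their weights. The function name and its return format are hypothetical.

```python
import math
import random

def preferential_preying(N, B, L, T, seed=0):
    """Return (edges, s_prelim); edges are (source, target) pairs."""
    if T <= 0:
        raise ValueError("this sketch assumes T > 0")
    if L < N - B:
        raise ValueError("need at least one in-edge per non-basal vertex")
    rng = random.Random(seed)

    # Stage 1: basal vertices 0..B-1, then each newcomer picks one in-neighbour.
    s = [1.0] * B                          # preliminary trophic levels
    edges = set()
    for i in range(B, N):
        j = rng.randrange(i)               # any vertex already present
        s.append(s[j] + 1.0)
        edges.add((j, i))

    # Stage 2: weight every remaining pair whose target is non-basal.
    cand, w = [], []
    for i in range(B, N):
        for j in range(N):
            if j != i and (j, i) not in edges:
                x = s[i] - s[j]            # temporary trophic distance
                cand.append((j, i))
                w.append(math.exp(-(x - 1.0) ** 2 / (2.0 * T * T)))

    # Draw edges without replacement until there are L in total.
    while len(edges) < L:
        if not cand or sum(w) <= 0.0:
            raise ValueError("not enough admissible pairs for this T")
        k = rng.choices(range(len(cand)), weights=w)[0]
        edges.add(cand.pop(k))
        w.pop(k)
    return sorted(edges), s

edges, s_prelim = preferential_preying(N=20, B=5, L=40, T=0.5, seed=1)
print(len(edges), max(s_prelim))
```

As the text notes, the preliminary levels are only a scaffold: the actual trophic levels and q should be recomputed on the finished network, and T tuned until the measured q matches the intended value.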

Majority rule dynamics on basal ensemble networks
Consider the majority rule dynamics described above on networks drawn from the basal ensemble [1]. In such networks all non-basal vertices have the same proportion of incoming edges from basal nodes. In a mean-field approximation, the field at any non-basal vertex $i$ is
$$h_i = k_i^b m_b + k_i^n m_n, \qquad (21)$$
where $k_i^b$ and $k_i^n$ are the mean numbers of incoming edges from basal and non-basal neighbours, respectively; and $m_b$ and $m_n$ are the mean activities of basal and non-basal vertices, respectively (time dependencies have been dropped for clarity). If a non-basal vertex $v_i$ is in the state $\sigma_i = \mathrm{sgn}(m_n)$ at time $t$, the probability that it will change state at $t + 1$ will be
$$P_{\mathrm{flip}} = \Pr\left( k_i^b |m_b| > k_i^n |m_n| \ \text{and} \ \mathrm{sgn}(m_b) \neq \mathrm{sgn}(m_n) \right) \qquad (22)$$
(where $\Pr(y)$ stands for the probability of event $y$). Because the network is drawn from the basal ensemble, we have that
$$\frac{k_i^n}{k_i^b} = \frac{L - L_B}{L_B} \equiv \lambda. \qquad (23)$$
At each time step, every basal vertex's state will be +1 or −1 with equal probability. Therefore, the number of basal vertices in the +1 state, $n_+$, will be a random draw from a binomial distribution:
$$n_+ \sim \mathrm{Bin}(B, 1/2), \qquad (24)$$
where
$$m_b = \frac{2 n_+}{B} - 1. \qquad (25)$$
Let us consider the case in which the majority of non-basal vertices are in the same state, so that $|m_n| \simeq 1$; and, without loss of generality, that $m_n < 0$. We now have
$$P_{\mathrm{flip}} = \Pr(m_b > \lambda) = \Pr\left( n_+ > \frac{B}{2}(1 + \lambda) \right). \qquad (26)$$
The CDF of the binomial distribution $\mathrm{Bin}(n, p)$ is given by the regularised incomplete Beta function: $\Pr(X \le k) = I_{1-p}(n - k, 1 + k)$. Therefore, we have
$$P_{\mathrm{flip}} = 1 - I_{1/2}(B - k^*, 1 + k^*), \qquad k^* = \left\lfloor \frac{B}{2}(1 + \lambda) \right\rfloor. \qquad (27)$$
One corollary of Eq. (27) is that $P_{\mathrm{flip}} > 0$ requires $\lambda < 1$, or $L < 2L_B$. The symmetry of the system implies that $P_{\mathrm{flip}}$ is independent of the sign of $m_n$. Therefore, $m_n$ will follow a Bernoulli process with $p = P_{\mathrm{flip}}$. The distribution of time intervals, $\Delta$, between sign changes of $m_n$ will therefore follow
$$P(\Delta) = P_{\mathrm{flip}} \, (1 - P_{\mathrm{flip}})^{\Delta - 1}. \qquad (28)$$
In the basal ensemble, we have
$$\tilde{q} = \sqrt{\frac{L}{L_B} - 1}. \qquad (29)$$
Therefore, the critical ratio $L/L_B = 2$ obtained above is equivalent to a critical coherence
$$\tilde{q}_c = 1. \qquad (30)$$
This mean-field analysis is only valid for the basal ensemble, but simulations of the preferential preying model suggest that $q_c \simeq 1$ applies to other kinds of networks too (see Fig. 4.D).
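
The mean-field flip probability can be evaluated directly from the binomial distribution, without the incomplete Beta function. The sketch below rests on the assumptions stated in this section (non-basal activity |m_n| ≃ 1, basal states independent and equiprobable) and on one reading of the flip condition, namely that the basal field must strictly exceed the non-basal one: n_+ > B(1 + λ)/2 with λ = L/L_B − 1, which is equivalent to 2 L_B n_+ > B L. The function names are hypothetical.

```python
import math

def p_flip_mean_field(B, L, L_B):
    """Pr(n_plus > B * L / (2 * L_B)) with n_plus ~ Bin(B, 1/2).

    Integer arithmetic is used for the threshold to avoid round-off.
    """
    k_min = (B * L) // (2 * L_B) + 1        # smallest n_plus that flips
    if k_min > B:
        return 0.0                          # L >= 2 L_B: no flips possible
    return sum(math.comb(B, k) for k in range(k_min, B + 1)) / 2 ** B

def mean_interval(p_flip):
    """Mean of the geometric distribution of intervals between sign changes."""
    return math.inf if p_flip == 0 else 1.0 / p_flip

p = p_flip_mean_field(B=10, L=30, L_B=20)   # lambda = 0.5
print(p, mean_interval(p))
print(p_flip_mean_field(B=10, L=40, L_B=20))  # lambda = 1: stable
```

With B = 10 and λ = 0.5 at least 8 of the 10 basal vertices must agree against the bulk, giving 56/1024; at the critical ratio L = 2L_B the probability vanishes, in line with the corollary stated above.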

Supplementary Material
Network data
The main text makes use of the same set of 62 empirical networks analysed in Ref. [1]. These include food webs, gene regulatory networks, metabolic networks, a neural network, trade networks, a P2P file sharing network, and a network of word adjacencies. All data are available online at: https://www.samuel-johnson.org/data. One can also find on this website the C++ code used for all analyses and simulations performed for the main article.
The tables below list a series of properties for each network, along with references to the original data sources. The captions also include links to other websites where the data can be found.
[...] [41], as described in Ref. [1], and is available at https://www.samuel-johnson.org/data. The other data can be found on various websites: http://www-personal.umich.edu/~mejn/netdata/ (neural network); https://snap.stanford.edu/data/p2p-Gnutella08.html (P2P network); and http://vlado.fmf.uni-lj.si/pub/networks/data/esna/metalWT.htm (trade networks). Columns as in Table S1.