Structure of triadic relations in multiplex networks

Recent advances in the study of networked systems have highlighted that our interconnected world is composed of networks that are coupled to each other through different ‘layers’ that each represent one of many possible subsystems or types of interactions. Nevertheless, it is traditional to aggregate multilayer networks into a single weighted network in order to take advantage of existing tools. This is admittedly convenient, but it is also extremely problematic, as important information can be lost as a result. It is therefore important to develop multilayer generalizations of network concepts. In this paper, we analyze triadic relations and generalize the idea of transitivity to multiplex networks. By focusing on triadic relations, which yield the simplest type of transitivity, we generalize the concept and computation of clustering coefficients to multiplex networks. We show how the layered structure of such networks introduces a new degree of freedom that has a fundamental effect on transitivity. We compute multiplex clustering coefficients for several real multiplex networks and illustrate why one must take great care when generalizing standard network concepts to multiplex networks. We also derive analytical expressions for our clustering coefficients for ensemble averages of networks in a family of random multiplex networks. Our analysis illustrates that social networks have a strong tendency to promote redundancy by closing triads at every layer and that they thereby have a different type of multiplex transitivity from transportation networks, which do not exhibit such a tendency. These insights are invisible if one only studies aggregated networks.

The quantitative study of networks is fundamental for the study of complex systems throughout the biological, social, information, engineering, and physical sciences [1][2][3]. The broad applicability of networks, and their success in providing insights into the structure and function of both natural and designed systems, has generated considerable excitement across myriad scientific disciplines. Numerous tools have been developed to study networks, and the realization that several common features arise in a diverse variety of networks has facilitated the development of theoretical tools to study them. For example, many networks constructed from empirical data have heavy-tailed degree distributions, satisfy the small-world property, and/or possess modular structures; such structural features can have important implications for information diffusion, robustness against component failure, and more.
Traditional studies of networks generally assume that nodes are connected to each other by a single type of static edge that encapsulates all connections between them. This assumption is almost always a gross oversimplification, and it can lead to misleading results and even the sheer inability to address certain problems. Most real and engineered networks are multiplex [1], as there are almost always multiple types of ties or interactions that can occur between nodes, and it is crucial to take them into account. For example, transportation systems include multiple modes of travel, biological systems include multiple signaling channels that operate in parallel, and social networks include multiple modes of communication.
In this Letter, we develop multiplex generalizations of the clustering coefficient, which is one of the most important properties to calculate in monoplex networks. In constructing such a generalization, myriad definitions are possible, and the most appropriate one to use depends on the application under study. Such considerations are crucial when developing multiplex generalizations of any monoplex network diagnostic. Using the example of clustering coefficients, our Letter illustrates how the new degrees of freedom that result from the existence of an inter-layer structure yield rich new phenomena and subtle differences in how one should compute key network diagnostics. We thereby demonstrate that, in order to understand real complex systems, it is insufficient to generalize existing diagnostics in a naive manner; one must instead construct their generalizations from first principles.
Supra-Adjacency Matrices. We represent a multiplex network by a sequence of graphs $\{G_\alpha\}_{\alpha=1}^{b} = \{(V_\alpha, E_\alpha)\}_{\alpha=1}^{b}$, where $\alpha \in \{1, \ldots, b\}$ indexes the layers of the network. For simplicity, we examine unweighted multiplex networks. We define the intra-layer supra-graph as $G(V, E)$, where the set of nodes is $V = \bigcup_\alpha \{(u, \alpha) : u \in V_\alpha\}$ and the set of edges is $E = \bigcup_\alpha \{((u, \alpha), (v, \alpha)) : (u, v) \in E_\alpha\}$. We also define the coupling supra-graph $G_C(V, E_C)$ using the same set of nodes and the set of edges $E_C = \bigcup_{\alpha \neq \beta} \{((u, \alpha), (u, \beta)) : u \in V_\alpha \cap V_\beta\}$. If $((v, \alpha), (v, \beta)) \in E_C$, we say that $(v, \alpha)$ and $(v, \beta)$ are inter-connected. We say that a multiplex network is fully interconnected if all layers share the same set of nodes (i.e., if $V_\alpha = V_\beta$ for all $\alpha$ and $\beta$). The supra-graph is $\bar G(V, \bar E)$, where $\bar E = E \cup E_C$. It is useful to define at this point what we call supra-nodes [45]. A supra-node $\tilde u$ is defined by the set $l(\tilde u) = \{(u, \alpha), (u, \beta), \ldots\}$ of inter-connected nodes (i.e., nodes connected by edges in $E_C$). Supra-nodes make up the aggregated multi-graph that results from the contraction of all edges in $E_C$.
We let $A$, $C$, and $\bar A$ denote the corresponding adjacency matrices for $G$, $G_C$, and $\bar G$, and we call them the intra-layer supra-adjacency matrix, the coupling supra-adjacency matrix, and the supra-adjacency matrix, respectively. It is easy to see that $A = \bigoplus_\alpha A_\alpha$, where $A_\alpha$ is the adjacency matrix of $G_\alpha$ and $\bigoplus$ denotes the direct sum of matrices. Thus, the supra-adjacency matrices satisfy the property $\bar A = A + C$. In this Letter, we consider undirected networks, so $A = A^T$. Additionally, $C = C^T$ follows from the definition of $E_C$.
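As a minimal sketch of the construction above, the following Python snippet builds $A$, $C$, and $\bar A = A + C$ for a small fully interconnected multiplex network. The two toy layers, the helper names `direct_sum`, `coupling_matrix`, and `mat_add`, and the convention that node u in layer a has supra-index a * n + u are illustrative assumptions rather than part of the original text.

```python
def direct_sum(blocks):
    """Block-diagonal matrix built from a list of square matrices (lists of rows)."""
    size = sum(len(block) for block in blocks)
    out = [[0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for r, row in enumerate(block):
            for c, value in enumerate(row):
                out[offset + r][offset + c] = value
        offset += len(block)
    return out


def coupling_matrix(n, b):
    """Coupling supra-adjacency matrix C for b layers that share the same n nodes.

    Assumed indexing: node u in layer a has supra-index a * n + u.
    """
    size = n * b
    C = [[0] * size for _ in range(size)]
    for a1 in range(b):
        for a2 in range(b):
            if a1 != a2:
                for u in range(n):
                    C[a1 * n + u][a2 * n + u] = 1
    return C


def mat_add(X, Y):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(row_x, row_y)] for row_x, row_y in zip(X, Y)]


# Hypothetical toy data: two layers on the same three nodes.
layer_1 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # path 0-1-2
layer_2 = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]  # single edge 0-2

A = direct_sum([layer_1, layer_2])   # intra-layer supra-adjacency matrix
C = coupling_matrix(n=3, b=2)        # coupling supra-adjacency matrix
A_bar = mat_add(A, C)                # supra-adjacency matrix
```

Because both `A` and `C` are symmetric by construction, `A_bar` is symmetric as well, which matches the undirected setting of the text.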
Intra-Layer (Monoplex) Clustering Coefficient. The local clustering coefficient $c_i$ of node $i$ in an unweighted monoplex network is the number of triangles that include node $i$ divided by the number of connected triples for which the node is central [3,29]. The local clustering coefficient is a measure of transitivity [30], and it can be interpreted as the density of the focal node's neighborhood. For our purposes, it is convenient to define the clustering coefficient $c_i$ as the number of 3-cycles $t_i$ that start and end at the focal node $i$ divided by the number of 3-cycles $d_i$ such that the second step occurs in a complete graph (i.e., assuming that the neighborhood of the focal node is as dense as possible).
Using the above definition, we calculate the local clustering coefficient by using the fact that the number of length-$n$ walks that start at node $i$ and end at node $j$ is $(\prod_{t=1}^{n} A_t)_{ij}$, where the $t$th step occurs in the graph whose adjacency matrix is $A_t$. We thus write $t_i = (A^3)_{ii}$ and $d_i = (AFA)_{ii}$, where $A$ is the adjacency matrix of the graph and $F$ is the adjacency matrix of a complete network with no self-loops (i.e., $F = J - I$, where $J$ is a square matrix whose entries are all 1 and $I$ is the identity matrix). The local clustering coefficient is then $c_i = t_i/d_i$. This is equivalent to the usual definition of the clustering coefficient, $c_i = t_i/[k_i(k_i - 1)]$, where $k_i$ is the degree of node $i$. One can obtain a single global clustering coefficient for a monoplex network either by averaging $c_i$ over all nodes or by computing $c = \sum_i t_i / \sum_i d_i$. We use the latter in the rest of this Letter.
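The matrix formulas above can be checked on a small graph. The sketch below computes $t_i = (A^3)_{ii}$ and $d_i = (AFA)_{ii}$ in plain Python for a hypothetical four-node graph (a triangle with one pendant node); the graph and the function names are illustrative assumptions, and nodes with $d_i = 0$ are given the value `None` as one possible convention.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def cycle_counts(A):
    """Return (t, d) with t_i = (A^3)_ii and d_i = (A F A)_ii, where F = J - I."""
    n = len(A)
    F = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
    A3 = matmul(matmul(A, A), A)
    AFA = matmul(matmul(A, F), A)
    return [A3[i][i] for i in range(n)], [AFA[i][i] for i in range(n)]


# Hypothetical example: triangle 0-1-2 plus a pendant node 3 attached to node 0.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

t, d = cycle_counts(A)
degrees = [sum(row) for row in A]

# Local coefficients c_i = t_i / d_i (undefined when d_i = 0).
local_c = [ti / di if di else None for ti, di in zip(t, d)]

# Global coefficient as a ratio of sums, c = sum_i t_i / sum_i d_i.
global_c = sum(t) / sum(d)
```

In this sketch, `d` coincides with `[k * (k - 1) for k in degrees]`, which is the equivalence with the degree-based normalization stated in the text.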
Cycles on Multiplex Networks. In addition to 3-cycles that take place inside a single layer, multiplex networks contain cycles that traverse additional layers but still have three intra-layer steps. Such cycles are important for both social networks and transportation networks: in the former, transitivity involves social ties across multiple media [1,31]; in the latter, there are typically several choices for how to return to one's starting location. All of these 3-cycles can be important for dynamical processes on networks, so one needs to consider them when defining a multiplex clustering coefficient. Let us define a supra-walk as a walk on the supra-graph in which, either after or before each intra-layer step, the walk can either continue on the same layer or change to an adjacent layer. We represent this choice using the matrix
$$\hat C = \beta I + \gamma C, \qquad (1)$$
where the parameter $\beta$ is the "weight" of staying in the current layer and $\gamma$ is the "weight" of stepping to another layer. Suppose, for example, that one wishes to consider only intra-layer steps or steps that change between a pair of layers before or after an intra-layer step. (In this scenario, we disallow two consecutive inter-layer steps.) The number of these cycles is
$$t_{W,i} = [(A\hat C)^3 + (\hat C A)^3]_{ii}, \qquad (2)$$
where the first term corresponds to cycles in which the inter-layer step is taken after an intra-layer one and the second term corresponds to cycles in which the inter-layer step is taken before an intra-layer one. We can simplify Eq. (2) by exploiting the fact that both $A$ and $C$ are symmetric. This yields
$$t_{W,i} = 2[(A\hat C)^3]_{ii}. \qquad (3)$$
If we relax the condition of disallowing two consecutive inter-layer steps, we can write
$$t_{SW',i} = [(\hat C' A + A \hat C')^3]_{ii}, \qquad (4)$$
$$t_{SW,i} = [(\hat C A \hat C)^3]_{ii}, \qquad (5)$$
where $\hat C' = \frac{1}{2}\beta I + \gamma C$. Unlike the matrices in Eq. (2), the matrices $\hat C A \hat C$ and $\hat C' A + A \hat C'$ are symmetric, so we can interpret them as weighted adjacency matrices of symmetric supra-graphs and thereby calculate cycles and clustering coefficients in these graphs. In the Supplementary Material (SM), we include additional discussion about these types of cycles, which include walks with two consecutive inter-layer steps.
Multiplex Clustering Coefficients Based on Counting Cycles. To define multiplex clustering coefficients, we need both the number of cycles $t_{*,i}$ and a normalization $d_{*,i}$. For the normalization, we follow the same idea as for monoplex clustering coefficients and use a complete multiplex network $F = \bigoplus_\alpha F_\alpha$, where $F_\alpha = J - I$ is the adjacency matrix for a complete graph on layer $\alpha$. We can then proceed from any definition of $t_{*,i}$ to $d_{*,i}$ by replacing the second intra-layer step with a step in the complete multiplex network. For example, we get $d_{W,i} = 2(A\hat C F \hat C A \hat C)_{ii}$ for $t_{W,i} = 2[(A\hat C)^3]_{ii}$.
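The following sketch evaluates the pair $t_{W,i} = 2[(A\hat C)^3]_{ii}$ and $d_{W,i} = 2(A\hat C F \hat C A \hat C)_{ii}$ in plain Python, with $\hat C = \beta I + \gamma C$. The two-layer toy network, the indexing of node u in layer a as a * n + u, and all function names are assumptions made for illustration; the global value is taken as the ratio of sums, by analogy with the monoplex case.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def direct_sum(blocks):
    """Block-diagonal matrix from a list of square matrices."""
    size = sum(len(block) for block in blocks)
    out = [[0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for r, row in enumerate(block):
            out[offset + r][offset:offset + len(row)] = row
        offset += len(block)
    return out


def complete(n):
    """Adjacency matrix J - I of a complete graph on n nodes."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]


def c_hat(C, beta, gamma):
    """Matrix beta * I + gamma * C that encodes the stay/switch choice."""
    n = len(C)
    return [[beta * (i == j) + gamma * C[i][j] for j in range(n)] for i in range(n)]


def cycle_counts_W(A, F, C, beta=1, gamma=1):
    """Return (t, d) with t_i = 2[(A Chat)^3]_ii and d_i = 2[A Chat F Chat A Chat]_ii."""
    Ch = c_hat(C, beta, gamma)
    AC = matmul(A, Ch)
    FC = matmul(F, Ch)
    T = matmul(matmul(AC, AC), AC)
    D = matmul(matmul(AC, FC), AC)
    n = len(A)
    return [2 * T[i][i] for i in range(n)], [2 * D[i][i] for i in range(n)]


# Hypothetical fully interconnected example: a triangle layer and a path layer.
n, b = 3, 2
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
A = direct_sum([triangle, path])
F = direct_sum([complete(n)] * b)
C = [[1 if (i % n == j % n and i != j) else 0 for j in range(n * b)]
     for i in range(n * b)]

t, d = cycle_counts_W(A, F, C, beta=1, gamma=1)
c_W_global = sum(t) / sum(d)
```

Two sanity checks follow directly from the definitions: with a single layer (so that $C = 0$ and $\beta = 1$) the counts reduce to twice the monoplex counts, and for a complete multiplex network ($A = F$) the numerator and the normalization coincide.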
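We can now define local and global clustering coefficients for multiplex graphs analogously to monoplex networks. We can also define a clustering coefficient for the supra-nodes $\tilde i$. This yields
$$c_{*,i} = \frac{t_{*,i}}{d_{*,i}}, \qquad (6)$$
$$c_{*,\tilde i} = \frac{\sum_{i \in l(\tilde i)} t_{*,i}}{\sum_{i \in l(\tilde i)} d_{*,i}}, \qquad (7)$$
$$c_{*} = \frac{\sum_i t_{*,i}}{\sum_i d_{*,i}}. \qquad (8)$$
For simplicity, we henceforth use $c(\beta, \gamma)$ to indicate that we calculate $c$ with parameter values $\beta$ and $\gamma$.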
We decompose the formula in Eq. (8) in terms of contributions from cycles that traverse exactly one, two, and three layers. Using this decomposition yields an alternative way to average over contributions from the three types of cycles, which we denote by $c_*(\omega_1, \omega_2, \omega_3)$, where $\omega = (\omega_1, \omega_2, \omega_3)$ is a vector that gives the relative weights of the different contributions. Similar formulas hold for the other two multiplex clustering coefficients of Eqs. (6) and (7).
Clustering Coefficients on Aggregated Networks. In the previous paragraph, we defined a local clustering coefficient for supra-nodes in terms of the local clustering coefficients of the nodes that belong to them. We now show another way to assign a local clustering coefficient to supra-nodes. A common way to study multiplex systems is to aggregate layers to obtain either multi-graphs or weighted networks. One can represent a weighted network using a weighted adjacency matrix $W$ whose elements are the weights of the links. The weighted adjacency matrix associated with the aggregation of a multiplex network has elements $W_{\tilde i \tilde j} = \sum_{i \in l(\tilde i), j \in l(\tilde j)} A_{ij}$. There are numerous ways to define clustering coefficients on weighted networks [32], and one can use any of these after one has aggregated a multiplex network into a weighted monoplex network. For example, one can calculate the clustering coefficients defined in Refs. [33], [34], and [35]. In these definitions, $A$ is the unweighted network corresponding to $W$, the degree of $\tilde i$ is $k_{\tilde i} = \sum_{\tilde j} A_{\tilde i \tilde j}$, the strength of $\tilde i$ is $s_{\tilde i} = \sum_{\tilde j} W_{\tilde i \tilde j}$, the quantity $w_{\max} = \max_{\tilde i, \tilde j} W_{\tilde i \tilde j}$ is the maximum weight in $W$, and $F$ is the adjacency matrix of the complete unweighted graph. We can define the global version $c_Z$ of $c_{Z,\tilde i}$ by summing over all supra-nodes in the numerator and the denominator of Eq. (13), similar to Eq. (8).
The coefficients $c_{Z,\tilde i}$ and $c_Z$ are related to some of the multiplex coefficients for fully inter-connected multiplex networks. Letting $\beta = \gamma = 1$ and summing over all layers yields $\sum_{i \in l(\tilde i)} ((A\hat C)^3)_{ii} = (W^3)_{\tilde i \tilde i}$. That is, in this special case, the weighted clustering coefficients $c_{Z,\tilde i}$ and $c_Z$ and the multiplex clustering coefficients $c_{W,\tilde i}$ and $c_W$ are equivalent up to a normalization factor: $c_{W,\tilde i} = \frac{w_{\max}}{b} c_{Z,\tilde i}$ and $c_W = \frac{w_{\max}}{b} c_Z$. The factor $w_{\max}/b$ that matches the normalizations arises because aggregation removes the information about the number of layers $b$, so the normalization must be based on the maximum weight instead of the number of layers. That is, a step in the complete weighted network is described by $w_{\max} F$ in Eq. (13) instead of by $bF$.
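The relation between the aggregated and the multiplex quantities can be verified numerically. The sketch below uses a hypothetical fully interconnected two-layer network with $\beta = \gamma = 1$ and assumes that the weighted coefficient takes the form $c_Z = \sum_{\tilde i} (W^3)_{\tilde i \tilde i} / \sum_{\tilde i} (W (w_{\max} F) W)_{\tilde i \tilde i}$, which is how the text describes the numerator and the $w_{\max} F$ step; the example layers, the supra-index convention a * n + u, and the helper names are illustrative assumptions.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def direct_sum(blocks):
    """Block-diagonal matrix from a list of square matrices."""
    size = sum(len(block) for block in blocks)
    out = [[0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for r, row in enumerate(block):
            out[offset + r][offset:offset + len(row)] = row
        offset += len(block)
    return out


def complete(n):
    """Adjacency matrix J - I of a complete graph on n nodes."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]


def diag(X):
    return [X[i][i] for i in range(len(X))]


# Hypothetical layers on the same n = 3 nodes (fully interconnected, b = 2).
n, b = 3, 2
layers = [
    [[0, 1, 1], [1, 0, 1], [1, 1, 0]],  # triangle
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]],  # path 0-1-2
]
A = direct_sum(layers)
F = direct_sum([complete(n)] * b)
# With beta = gamma = 1, Chat = I + C links every copy of a node (itself included).
Chat = [[1 if i % n == j % n else 0 for j in range(n * b)] for i in range(n * b)]

# Aggregated weighted adjacency matrix W (sum of the layers).
W = [[sum(layer[u][v] for layer in layers) for v in range(n)] for u in range(n)]

AC = matmul(A, Chat)
AC3 = matmul(matmul(AC, AC), AC)
W3 = matmul(matmul(W, W), W)

# Left side: sum of ((A Chat)^3)_ii over the copies of each supra-node.
lhs = [sum(AC3[a * n + u][a * n + u] for a in range(b)) for u in range(n)]
rhs = diag(W3)

# Global coefficients (the factor 2 in t_W and d_W cancels in the ratio).
D = matmul(matmul(AC, matmul(F, Chat)), AC)
c_W = sum(diag(AC3)) / sum(diag(D))
w_max = max(max(row) for row in W)
WFW = matmul(matmul(W, complete(n)), W)
c_Z = sum(diag(W3)) / (w_max * sum(diag(WFW)))
```

Under these assumptions, `lhs` equals `rhs` entry by entry, and `c_W` agrees with `(w_max / b) * c_Z` up to floating-point rounding.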
Multiplex Clustering Coefficients in the Literature. We now consider the clustering coefficient $c_{Be,\tilde i}$ proposed in [36], which is defined for fully interconnected multiplex networks and can be simplified by writing it in terms of the aggregated network [Eq. (17)]. The numerator of Eq. (17) is the same as the numerator of the weighted clustering coefficient $c_{Z,\tilde i}$, but the denominator is different. Because of the denominator in Eq. (17), the values of the clustering coefficient $c_{Be,\tilde i}$ do not have to lie in the interval $[0, 1]$. For example, $c_{Be} = (n - 2)b/n$ for a complete multiplex graph, where $n$ is the number of nodes in the multiplex graph, so $c_{Be} > 1$ when $b > n/(n - 2)$. Refs. [37,38] define some further multiplex clustering coefficients, but we leave them out of the analysis in this Letter because they do not reduce to the monoplex clustering coefficient for unweighted (i.e., with binary weights) and undirected networks.
Comparison of the Different Definitions. Next, we compare the different formulations of multiplex clustering coefficients. In Table I, we give the values of some of the global and mean local clustering coefficients for multiplex networks (four social networks and two transportation networks) constructed from real data. As we now discuss, multiplex clustering coefficients give insights that are impossible to infer by calculating weighted clustering coefficients for aggregated networks or even by calculating clustering coefficients separately for each layer of a multiplex network.
For each of the social networks in Table I, $c_W$ satisfies $c_*(1, 0, 0) > c_*(0, 1, 0) > c_*(0, 0, 1)$, so it takes larger values if fewer layers are involved. That is, there is more intra-layer clustering than inter-layer clustering. The opposite is the case for the London Tube network: $c_*(0, 0, 1) > c_*(0, 1, 0) > c_*(1, 0, 0)$. This reflects the fact that lines in the Tube are designed to avoid redundant connections, and the lines roughly lie on geographically straight lines. A single-layer triangle would require a line to make a full loop within three stations. Two-layer triangles, which are a bit more frequent, entail that two lines run in almost parallel directions and that one line jumps over a single station. For three-layer triangles, the geographical constraints do not matter because one can construct a triangle with three straight lines. The airline network is also a transportation network, but it is organized differently. Each layer encompasses flights from a single airline. The intra-airline clustering coefficients have small values because it is not in the interest of an airline to introduce new flights between two airports that can already be reached with the same airline through some other airport. The two-layer cycles correspond to cases in which an airline has a connection from an airport to two other airports and a second airline has a direct connection between those two airports. Completing a three-layer cycle requires using three distinct airlines, and this type of congregation of airlines in the same area is not frequent in the data.
TABLE I: We symmetrized directed networks by considering two nodes to be connected if there is at least one edge between them. The social networks above are fully inter-connected multiplex graphs, but the transport networks are not fully inter-connected. We use $\langle \cdot \rangle_{\tilde i}$ to denote the average over $\tilde i$.
It is also relevant to examine how the various weighted clustering coefficients behave on aggregated multiplex networks. Note that, for fully interconnected multiplex networks, $\frac{w_{\max}}{b} c_Z$ gives $c_W(\frac{1}{2}, \frac{1}{2})$. Both $c_{O,i}$ and $c_{Ba,i}$ are based on the unweighted clustering coefficient of the aggregated graph: $c_{Ba,i}$ uses the weights to weight the importance of the different triangles, and $c_{O,i}$ is the unweighted clustering coefficient multiplied by the average intensity of the triangles. The coefficient $c_{Be}$ is the only previously defined multiplex clustering coefficient, but it seems to be sensitive to the number of network layers, and it even takes values larger than 1 for the Krackhardt cognitive social structure network, which has 21 layers. The transport networks are free from this sensitivity to the number of layers because, although they have many layers, those layers are mostly empty. For example, most airlines use only a small subset of the total of 3108 airports.
Conclusions. We derived measurements of transitivity for multiplex networks by developing multiplex generalizations of the clustering coefficient. By using examples from empirical data in diverse settings, we showed that different notions of multiplex transitivity are important in different situations. For example, the balance between intra-layer and inter-layer clustering is different in social versus transportation networks (and even in different types of networks within each category, as we illustrated explicitly for transportation networks), reflecting the fact that transitivity arises from different mechanisms in these cases. Such differences are rooted in the new degrees of freedom that arise from inter-layer connections, and they are invisible to calculations of clustering coefficients on single-layer networks obtained via aggregation. Generalizing clustering coefficients for multiplex networks thus makes it possible to explore such phenomena and to gain deeper insights into different types of transitivity in networks. Finally, the existence of multiple types of transitivity also has important implications for multiplex network motifs and multiplex community structure. In particular, our work on multiplex clustering coefficients demonstrates that definitions of all clustering notions for multiplex networks need to be able to handle such features.

SUPPLEMENTARY MATERIAL

Other Possible Definitions of Cycles
There are many possible ways to define cycles in multiplex networks. For example, one might want to disallow the option of staying inside a layer in the first step of the second term in Eq. (2). We can then write the resulting number of cycles $t_{W',i}$ as in Eq. (18). With this restriction, cycles that traverse two edges adjacent to the focal node $i$ are counted only two times instead of four times. Similar to Eq. (3) in the main text, we can simplify Eq. (18) to obtain an expression in terms of $\hat C_{W'} = \beta I + 2\gamma C$. Table II shows the values of the resulting clustering coefficient for the same networks that we studied in the main text.

Writing Clustering Coefficients Using "Elementary" 3-Cycles
It may be useful to decompose multilayer clustering coefficients defined in terms of multilayer cycles into so-called elementary cycles by expanding the formulas. Because we are only interested in the diagonal elements of the terms and the intra-layer supra-graph and coupling supra-graph that we consider are undirected, we can transpose the terms and still write them in terms of the matrices $A$ and $C$ rather than their transposes.
We adopt the following convention: of the two versions of each elementary cycle (the term and its transpose), we select the one in which the first differing element, reading from left to right, is $A$ rather than $C$. We then express all of the cycles in a standard form with terms $AAA$, $AACAC$, $ACAAC$, $ACACA$, $ACACAC$, $CAAAC$, $CAACAC$, and $CACACAC$.
Similarly, we write the normalization formulas using the same set of terms, except that the second $A$ is replaced with $F$. In the resulting expansions, $w_*$ are scalars that correspond to the weights for each type of elementary cycle. Note that we have absorbed the parameters $\beta$ and $\gamma$ into these coefficients. We illustrate the elementary cycles in Figs. 1 and 2.
In Table III, we show the coefficient values for expansions in terms of elementary cycles. In Table IV, we show the expansions for the case $\beta = \gamma = 1$. These cycle decompositions illuminate the difference between $c_W$ and $c_{W'}$. The clustering coefficient $c_W$ gives equal weight to each elementary cycle, whereas $c_{W'}$ gives half of the weight to $AAA$ and $ACACA$ cycles; these are exactly the cycles whose counts include an implicit double-counting.
One can even express the cycles that include two consecutive inter-layer steps in a standard form for fully interconnected multiplex networks, because $CC = (b - 1)I + (b - 2)C$ in this case. Without the assumption that $\beta = \gamma = 1$, the expansion for the coefficient $c_{SW}$ is cumbersome because it includes coefficients $\beta^k \gamma^h$ with all possible combinations of $k$ and $h$ such that $k + h = 6$ and $h \neq 1$. Furthermore, it is no longer possible to infer the number of layers in which a walk traverses an intra-layer edge from the exponents of $\beta$ and $\gamma$ for $c_{SW}$ and $c_{SW'}$. For example, in $c_{SW'}$, the intra-layer elementary triangle $AAA$ includes a contribution from both $\beta^3$ (i.e., the walk stays in the original layer) and $\beta\gamma^2$ (i.e., the walk visits some other layer but then comes back to the original layer without traversing any intra-layer edges while it was gone). Moreover, all of the terms with $b$ arise from a walk moving to a new layer and then coming right back to the original layer in the next step. Because there are $b - 1$ other layers from which to choose, the influence of cycles with this type of transient layer visit is amplified by the total number of layers in the network. That is, adding more layers (even ones that do not contain any edges) changes the relative importance of different types of elementary cycles.
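The identity $CC = (b - 1)I + (b - 2)C$ for fully interconnected multiplex networks is easy to check numerically. The sketch below does so in plain Python; the helper names and the supra-index convention (node u in layer a has index a * n + u) are assumptions made for illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def coupling_matrix(n, b):
    """Coupling matrix C for b layers that share the same n nodes."""
    size = n * b
    return [[1 if (i % n == j % n and i != j) else 0 for j in range(size)]
            for i in range(size)]


def satisfies_identity(n, b):
    """Check that C C equals (b - 1) I + (b - 2) C entry by entry."""
    C = coupling_matrix(n, b)
    size = n * b
    expected = [[(b - 1) * (i == j) + (b - 2) * C[i][j] for j in range(size)]
                for i in range(size)]
    return matmul(C, C) == expected


results = {(n, b): satisfies_identity(n, b) for n, b in [(3, 2), (2, 3), (4, 5)]}
```

The diagonal term counts the $b - 1$ ways to leave a layer and return immediately, which is the source of the amplification by the number of layers that the paragraph above describes.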

Defining Multiplex Clustering Coefficients Using Auxiliary Networks
An elegant way to generalize clustering coefficients for multiplex networks is to define a new (possibly weighted) auxiliary supra-graph $G_M$ so that one can define cycles of interest as weighted 3-cycles in $G_M$. Once we have a function that produces the auxiliary supra-adjacency matrix $M = M(A, C)$, we can define the auxiliary complete supra-adjacency matrix $M_F = M(F, C)$. One can then define a local clustering coefficient for node $i$ with the formula
$$c_{M,i} = \frac{(M^3)_{ii}}{(M M_F M)_{ii}}.$$
As in the monoplex case, the denominator written in terms of the complete matrix is equivalent to the usual one written in terms of connectivity. In this case, the connectivity of a node is considered in the supra-graph induced by the matrix $M$. We refer to the matrix $M$ as the multiplex walk matrix as a reminder that it encodes the allowed types of steps in the multiplex network. However, when $M$ is, for example, $A\hat C$ or $\hat C A$, the induced supra-graph is directed, so one should differentiate between in-degree and out-degree connectivity. A clear advantage of defining clustering coefficients using an auxiliary supra-graph is that one can then use the auxiliary graph to calculate other diagnostics (e.g., degree or strength) for nodes. One can then check for correlations between clustering-coefficient values and the size of the multiplex neighborhood of a node, where the size of the neighborhood is the number of nodes that are reachable in a single step of the type defined by the matrix $M$.
The symmetric multiplex walk matrices of Eqs. (4) and (5) are $M_{SW'} = A\hat C' + \hat C' A$ and $M_{SW} = \hat C A \hat C$. To avoid double-counting intra-layer steps in the definition of $M_{SW'}$, we need to rescale either the intra-layer weight parameter $\beta$ (i.e., we can write $\hat C' = \beta' I + \gamma C = \frac{1}{2}\beta I + \gamma C$) or the inter-layer weight parameter $\gamma$ (i.e., we can write $\hat C' = \beta I + \gamma' C = \beta I + 2\gamma C$ and rescale $M_{SW'}$ accordingly).
Let us consider the supra-graphs that are induced by multiplex walk matrices. The difference between the matrices $M_{SW}$ and $M_{SW'}$ is that $M_{SW}$ also includes terms of the form $CAC$ that take into account walks that have an inter-layer step ($C$) followed by an intra-layer step ($A$) followed by another inter-layer step ($C$). Therefore, in the supra-graph that is induced by $M_{SW}$, two nodes in the same layer that are not connected in that layer can be connected if nodes in the same supra-nodes are connected in another layer.
When $\beta = \gamma = 1$, note that the matrix $\hat C$ sums the contributions of all nodes that belong to the same supra-node. In other words, if we associate with each node $i$ a vector $e_i$ of the canonical basis, then applying $\hat C$ to $e_i$ produces a vector whose entries are equal to 1 for the nodes that belong to the same supra-node as $i$ and 0 otherwise. Consequently, $M_{SW}$ is of particular interest, as it is related to the weight matrix of the aggregated graph for $\beta = \gamma = 1$. One can also write the multiplex clustering coefficient induced by Eq. (2) in terms of an auxiliary supra-adjacency matrix by considering Eq. (3), which is a simplified version of the equation that counts cycles only in one direction: $M_W = A\hat C$. The matrix $M_W$ is not symmetric, which implies that the corresponding graph is a directed supra-graph. Nevertheless, the clustering coefficient that is induced by $M_W$ is the same as that induced by its transpose $M_W^\dagger$. Different clustering coefficients depend differently on the level of overlap (i.e., repeated edges among layers) in a multiplex network. The one that should change the most is the one based on $\hat C A \hat C$, because it counts as triangles attached to a node all of the triangles attached to other nodes in the same supra-node.
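To make the auxiliary-matrix formulation concrete, the following sketch computes local clustering coefficients from a walk matrix $M$ and its complete counterpart $M_F$ as $(M^3)_{ii}/(M M_F M)_{ii}$, and it compares $M_W = A\hat C$ with its transpose $\hat C A$. The two-layer example with $\beta = \gamma = 1$, the function names, and the use of `None` for nodes with a vanishing denominator are assumptions made for illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def direct_sum(blocks):
    """Block-diagonal matrix from a list of square matrices."""
    size = sum(len(block) for block in blocks)
    out = [[0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for r, row in enumerate(block):
            out[offset + r][offset:offset + len(row)] = row
        offset += len(block)
    return out


def local_clustering(M, M_F):
    """Local coefficients (M^3)_ii / (M M_F M)_ii; None where the denominator is 0."""
    M3 = matmul(matmul(M, M), M)
    N = matmul(matmul(M, M_F), M)
    return [M3[i][i] / N[i][i] if N[i][i] else None for i in range(len(M))]


# Hypothetical fully interconnected example with n = 3 nodes and b = 2 layers.
n, b = 3, 2
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
full = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
A = direct_sum([triangle, path])
F = direct_sum([full] * b)
# Chat = I + C for beta = gamma = 1 (integer entries keep the comparison exact).
Chat = [[1 if i % n == j % n else 0 for j in range(n * b)] for i in range(n * b)]

M_W = matmul(A, Chat)        # walk matrix A Chat
M_W_F = matmul(F, Chat)      # same map applied to the complete network
c_from_M_W = local_clustering(M_W, M_W_F)
c_from_transpose = local_clustering(transpose(M_W), transpose(M_W_F))
```

Here `transpose(M_W)` equals $\hat C A$ and `transpose(M_W_F)` equals $\hat C F$, so the second call corresponds to the map $M(A, C) = \hat C A$. The two lists of coefficients coincide even though `M_W` itself is not symmetric.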

Main properties of the clustering coefficients defined in the main text
In Table V, we summarize the main properties of the global clustering coefficients obtained by averaging the local multiplex clustering coefficients over all nodes (see the main text for details). For example, all of the clustering coefficients except $c_{Be}$ are (properly) normalized to give 1 for a complete network. For all choices of relative weightings $(\omega_1, \omega_2, \omega_3)$ and in the limit as the number of supra-nodes $n \to \infty$, the coefficients $c_{W'}$ and $c_W$ have the value $p$ for a fully inter-connected multiplex network that consists of an independent Erdős-Rényi (ER) graph with edge probability $p$ for each layer.
TABLE IV: Coefficients of the elementary multiplex 3-cycle terms for different multiplex clustering coefficients when $\beta = \gamma = 1$. For $c_{SW'}$ and $c_{SW}$, we calculate the expansions only for fully inter-connected multiplex networks.

A Simple Example
We now use a simple example (that of Fig. 1, panel a) to illustrate the differences between the different notions of a multiplex clustering coefficient. Consider a two-layer multiplex network with three nodes in layer 1 and two nodes in layer 2. The three nodes in layer 1 form a connected triple, and the two exterior nodes of this triple are connected to the two nodes in layer 2, which are connected to each other.
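With nodes 1, 2, and 3 in layer 1 (where node 1 is the center of the triple) and nodes 4 and 5 in layer 2, the adjacency matrix $A$ for the intra-layer graph is
$$A = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \end{pmatrix},$$
and the adjacency matrix $C$ of the coupling supra-graph is
$$C = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix}.$$
Thus, the supra-adjacency matrix is
$$\bar A = A + C = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 & 0 \end{pmatrix}.$$
The multiplex walk matrix $M_W = A\hat C$ is not symmetric. For example, node 4 is reachable from node 1, but node 1 is not reachable from node 4. The edge (1, 4) in this graph represents the walk {1, 2, 4} in the multiplex network. The symmetric walk matrix $M_{SW'}$ is the sum of $M_W$ and $M_W^\dagger$ with rescaled diagonal blocks in order to not double-count the edges (1, 2) and (1, 3). Finally, the matrix $M_{SW}$ differs from $M_{SW'}$ in the fact that nodes 2 and 3 are connected through the multiplex walk {2, 4, 5, 3}.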
The adjacency matrix of the aggregated graph is
$$W = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}.$$
That is, it is a complete graph without self-loops. We now calculate $c_{*,i}$ using the different definitions of a multiplex clustering coefficient. To calculate $c_{W,i}$, we need to compute $F\hat C$. The multiplex network $F\hat C$ has a complete graph in each layer, zero inter-layer entries corresponding to node 1 in layer 2 (because node 1 has no coupling to that layer), and interchanged inter-layer couplings for the remaining nodes because each inter-layer step is preceded by an intra-layer step in the complete graph.
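The structural statements about this example can be reproduced with a few lines of plain Python. The sketch below assumes $\beta = \gamma = 1$, uses zero-based indices (so node 1 of the text is index 0), and takes the supra-nodes to be {1}, {2, 4}, and {3, 5}, as the walks quoted in the text imply; all variable and function names are illustrative.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def mat_add(X, Y):
    return [[x + y for x, y in zip(row_x, row_y)] for row_x, row_y in zip(X, Y)]


def stay_or_switch(C, beta, gamma):
    """Matrix beta * I + gamma * C."""
    n = len(C)
    return [[beta * (i == j) + gamma * C[i][j] for j in range(n)] for i in range(n)]


# Intra-layer edges: (1,2), (1,3) in layer 1 and (4,5) in layer 2.
A = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
# Coupling edges: (2,4) and (3,5).
C = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
]

beta, gamma = 1, 1                                  # assumed parameter values
Chat = stay_or_switch(C, beta, gamma)               # beta I + gamma C
Chat_prime = stay_or_switch(C, beta / 2, gamma)     # (1/2) beta I + gamma C

M_W = matmul(A, Chat)                                          # A Chat
M_SW_prime = mat_add(matmul(Chat_prime, A), matmul(A, Chat_prime))
M_SW = matmul(matmul(Chat, A), Chat)                           # Chat A Chat

# Aggregated network over the supra-nodes {1}, {2, 4}, {3, 5}.
supra_nodes = [[0], [1, 3], [2, 4]]
W = [[sum(A[i][j] for i in su for j in sv) for sv in supra_nodes]
     for su in supra_nodes]
```

In this sketch, `M_W[0][3]` is nonzero while `M_W[3][0]` is zero (node 4 is reachable from node 1 but not vice versa), `M_SW[1][2]` is nonzero while `M_SW_prime[1][2]` is zero (the walk {2, 4, 5, 3}), and `W` is the adjacency matrix of a complete graph on three supra-nodes.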
Node 1 is attached to two triangles that are walkable along the directions of the edges, and its clustering coefficient $c_{W,1}$ is equal to the clustering coefficient of the remaining nodes.
To calculate $c_{SW',i}$, we need to compute $(F\hat C + \hat C F)$. In the supra-graph $F\hat C + \hat C F$, all nodes are connected to all other nodes except those that belong to the same supra-node. For this coefficient, node 1 is attached to 6 triangles and supra-node 2 is attached to one triangle. To calculate $c_{SW,i}$, we need to compute $\hat C F \hat C$. The only difference between the graphs $\hat C F \hat C$ and $(F\hat C + \hat C F)$ is the weight of the edges in $\hat C F \hat C$, which takes into account the fact that edges might be repeated in the two layers.
For $c_{SW,i}$, node 1 is attached to 8 triangles and node 2 is attached to 4 triangles. Because we weight edges by their multiplexity (i.e., the number of times that an edge between two nodes is repeated in different layers among nodes in the same supra-nodes) in the normalization, none of the nodes has a clustering coefficient equal to 1, although they all have the same value. In the aggregated network, where information about the layers is lost, all nodes have a clustering coefficient equal to 1 (independent of the definition of the clustering coefficient).

TABLE III: Coefficients of elementary multiplex 3-cycle terms for different multiplex clustering coefficients. For $c_{SW'}$ and $c_{SW}$, we calculate the expansions only for fully interconnected multiplex networks.

TABLE V: Summary of the properties of the different multiplex clustering coefficients. The notation $c_{*(,\tilde i)}$ means that the property holds for both the global version and the supra-node version of the clustering coefficient. (1) The value of the clustering coefficient reduces to the value of the associated monoplex clustering coefficient for a single-layer graph. (2) The value of the clustering coefficient is properly normalized; that is, it is less than or equal to 1 for all networks. (3) The clustering coefficient has a value of $p$ in a large (i.e., number of supra-nodes $n \to \infty$) fully interconnected multiplex network in which each layer is an independent Erdős-Rényi network with an edge probability of $p$. We tested this property numerically. (4) Suppose that we construct a multiplex network by replicating the same given monoplex network in each layer. The clustering coefficient for the multiplex network has the same value as for the monoplex network. (5) The clustering coefficient can be defined for each node in the supra-graph, and $c_i = c_{\tilde i}$.