Maximum-entropy large-scale structures of Boolean networks optimized for criticality

We construct statistical ensembles of modular Boolean networks that are constrained to lie at the critical line between the frozen and chaotic dynamical regimes. The ensembles are maximally random given the imposed constraints, and thus represent null models of critical networks. By varying the network density and the entropic cost associated with biased Boolean functions, the ensembles undergo several phase transitions. The observed structures range from fully random to several ordered ones, including a prominent core-periphery-like structure, and an 'attenuated' two-group structure, where the network is divided into two groups of nodes, and one of them has Boolean functions with very low sensitivity. This shows that such simple large-scale structures are the most likely to occur when optimizing for criticality, in the absence of any other constraint or competing optimization criteria.


Introduction
Boolean networks are often used as generic models for the dynamics of complex systems of interacting entities, such as social and economic networks, neural networks, and gene or protein interaction networks [1,2]. Whenever the states of a system can be reduced to being either 'on' or 'off' without loss of important information, a Boolean approximation captures many features of the dynamics of real networks [3]. One such feature is the transition between two dynamical regimes: a 'frozen' phase, where small perturbations of the dynamics vanish after some time, and a 'chaotic' phase, where localized perturbations grow exponentially fast and disturb the entire system [2]. It is often posited that many real systems, such as gene regulatory networks [4] or the brain [5], possess features similar to networks which are at the critical line between these two phases, and hence share features inherent to both of them. A central question has been how such systems are capable of self-organizing into this critical state [6-8].
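The two regimes can be illustrated with a direct microscopic simulation, which is not part of the analysis in this paper (the paper works in the annealed approximation instead); the sketch below, with function names of our own choosing, builds a random Boolean network with fixed in-degree K and bias p, flips a few nodes, and measures the Hamming distance between the perturbed and unperturbed copies after T synchronous steps:

```python
import random

def random_boolean_network(N, K, p, rng):
    # each node gets K random inputs and a random truth table with bias p
    inputs = [[rng.randrange(N) for _ in range(K)] for _ in range(N)]
    tables = [[1 if rng.random() < p else 0 for _ in range(2 ** K)]
              for _ in range(N)]
    return inputs, tables

def step(state, inputs, tables):
    # synchronous update: each node applies its Boolean function to
    # the current states of its inputs
    new = []
    for i in range(len(state)):
        idx = 0
        for j in inputs[i]:
            idx = (idx << 1) | state[j]
        new.append(tables[i][idx])
    return new

def hamming_after(N, K, p, T, n_flips, seed):
    # Hamming distance between a perturbed and an unperturbed copy
    # of the same network after T synchronous steps
    rng = random.Random(seed)
    inputs, tables = random_boolean_network(N, K, p, rng)
    a = [rng.randrange(2) for _ in range(N)]
    b = list(a)
    for i in rng.sample(range(N), n_flips):
        b[i] = 1 - b[i]
    for _ in range(T):
        a = step(a, inputs, tables)
        b = step(b, inputs, tables)
    return sum(x != y for x, y in zip(a, b)) / N
```

With p = 1/2 the average sensitivity per node is K/2, so K = 1 lies deep in the frozen phase and K = 4 in the chaotic one: the perturbation dies out in the first case and spreads to a finite fraction of the network in the second.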
In this work, we tackle this question from a different point of view. Instead of describing how a specific dynamics or evolutionary process can drive the structure of the system towards criticality, we focus on the minimal ingredients necessary for a system to be critical under general constraints. In particular, we consider null models of critical Boolean networks that possess the necessary topological and functional characteristics for criticality, but are otherwise maximally random. We are interested both in the large-scale structure of such networks, as well as the choice of Boolean functions. With this in mind, we parametrize general ensembles of functional networks using the stochastic block model [9], and obtain configurations which maximize its entropy [10] under the constraint that the dynamics lie at the critical line. By varying the density of the network, and the entropic cost of choosing Boolean functions with specific sensitivities, we observe topological phase transitions from a fully random configuration to many structured ones. The most prominent structured topological phases we find are: (1) a core-periphery-like structure [11,12], which achieves criticality by restricting the regulation of most of the network by a few 'core' nodes, and (2) a two-group structure where one of the node groups possesses Boolean functions with very low sensitivity. We also show that topologies formed by only two distinct groups of nodes are sufficient to achieve criticality given the general constraints considered. This suggests that more elaborate large-scale structures and choices of Boolean functions do not necessarily arise directly out of an optimization for criticality alone.

The Model
Our objective is to investigate the large-scale topology of networks which are forced to be critical under global constraints. We parametrize the possible structures as a general directed stochastic block model [9,10], where the nodes are divided into B groups, a given group r has size n_r, and e_rs is the number of edges randomly placed from group s to group r. We also ascribe to each node in the network a Boolean function, chosen randomly from the set of all possible functions with a given bias p_r (i.e. the fraction of input combinations that correspond to output 1)³. On a given realization of this network ensemble, we consider a synchronous Boolean dynamics, where at each discrete time step the Boolean values {σ_i} of all nodes are updated as

    σ_i(t+1) = f_i({σ_j(t)}),    (1)

where f_i is the Boolean function of node i, and {σ_j(t)} is the set of inputs of i. We are interested in describing the time evolution of perturbations of the dynamics, where a single Boolean value is flipped, σ_i → 1 − σ_i, and the following cascade of flips is measured. More precisely, we are interested in the Hamming distance h(t) between two identical copies of the same network, where one of them is unperturbed, written as h(t) = ∑_r h_r(t), where h_r(t) is the contribution to h(t) from the nodes belonging to group r. Instead of investigating the microscopic dynamics of equation (1) directly, here we make use of the annealed approximation [13], and consider that at each time step the edges of the network are re-sampled from the same ensemble. By restricting ourselves to the early times after the perturbation, such that h_r(t) ≪ 1, we can neglect the probability that more than one input flips simultaneously, which is of order O(h_r(t)²). This allows us to write the time evolution of the individual h_r(t) values as

    h_r(t+1) = ∑_s M_rs h_s(t),    with    M_rs = 2 p_r (1 − p_r) e_rs / n_s,

where 2 p_r (1 − p_r) is the probability that if an input of a node belonging to group r flips, its output will also flip [14-16]. The general solution of this linear system is

    h⃗(t) = M^t h⃗(0).

The matrix M is non-negative and non-symmetric, and hence
it has at least one purely real non-negative leading eigenvalue λ, with a non-negative eigenvector x⃗. Henceforth we further assume that M is strictly positive⁴, so that this eigenvalue is positive and unique, and then for a sufficiently large t we can write

    h⃗(t) ∝ λ^t x⃗.    (7)

³ For nodes which do not receive any input, we impose that they receive constant functions, with a randomly chosen output value.
⁴ In some of the following results we will observe values M_rs → 0. These should simply be interpreted as asymptotic limits where the values become arbitrarily small but always strictly positive, and hence do not invalidate equation (7).
From this we have that λ < 1 corresponds to the 'frozen' phase, since the magnitude of the perturbation will decrease exponentially, and conversely λ > 1 corresponds to the 'chaotic' phase, since it will increase exponentially. Hence the special value λ = 1 marks the critical line between the two phases. Therefore, given some parameter choice for {n_r}, {e_rs} and {p_r}, we can decide in which phase the corresponding dynamics will lie by computing the largest eigenvalue of the B × B matrix M. Note that from this criterion we recover trivially the critical value for biased random Boolean networks, 2p(1 − p)⟨k⟩ = 1 [2].
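As a concrete illustration, the criterion can be implemented in a few lines of pure Python; this is a sketch of ours (the per-group normalization M_rs = 2 p_r (1 − p_r) e_rs / n_s is our reconstruction of the annealed matrix), using power iteration, which converges for a strictly positive M:

```python
def sensitivity(p):
    # probability that flipping one input of a maximally random
    # Boolean function with bias p flips its output
    return 2 * p * (1 - p)

def annealed_matrix(n, e, p):
    # M[r][s] = 2 p_r (1 - p_r) * e[r][s] / n[s]
    B = len(n)
    return [[sensitivity(p[r]) * e[r][s] / n[s] for s in range(B)]
            for r in range(B)]

def leading_eigenvalue(M, iters=2000):
    # power iteration; the leading eigenvalue of a strictly positive
    # matrix is real, positive and unique (Perron-Frobenius)
    B = len(M)
    x = [1.0] * B
    lam = 0.0
    for _ in range(iters):
        y = [sum(M[r][s] * x[s] for s in range(B)) for r in range(B)]
        lam = max(abs(v) for v in y)
        x = [v / lam for v in y]
    return lam
```

For B = 1 this reduces to λ = 2p(1 − p)⟨k⟩; for example, n = [1000], e = [[2000]] and p = [0.5] give ⟨k⟩ = 2 and λ = 1, i.e. exactly critical.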

Optimized ensembles
We are interested in generating ensembles of Boolean networks which lie at the critical line λ = 1. More specifically, we want null models of critical networks where, in addition to being critical, the topology and choice of functions are maximally random within the imposed constraints. We achieve this by maximizing the entropy of the stochastic block model ensemble [10], S_S = ln Ω, with Ω({n_r}, {e_rs}) being the total number of network realizations, as well as the entropy S_F of the distribution of Boolean functions, which we describe below. The maximization is performed in a constrained fashion, by imposing a critical sensitivity λ = 1, and by adjusting the relative entropic cost of modifying the structure of the network (the parameters {n_r}, {e_rs}) and the Boolean functions (the parameters {p_r}). This is achieved by finding the saddle point of the Lagrangian function

    Λ = (1 − μ) S_S + μ S_F − c(λ − 1),    (9)

where c is a Lagrange multiplier which imposes λ = 1, and μ ∈ [0, 1] controls the relative entropic cost between the edge placements and the choice of functions.
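In the simplest B = 1 case the constraint λ = 2p(1 − p)⟨k⟩ = 1 fixes the bias in closed form, which is a useful sanity check for the numerics; a small sketch (the function name is ours):

```python
import math

def critical_bias(k_mean):
    # smaller root of 2 p (1 - p) <k> = 1; a real solution with
    # 0 < p <= 1/2 exists only for <k> >= 2
    if k_mean < 2:
        raise ValueError("a biased critical point requires <k> >= 2")
    return 0.5 * (1.0 - math.sqrt(1.0 - 2.0 / k_mean))
```

For example, critical_bias(2) returns 1/2, and for denser networks the critical bias shrinks, since p(1 − p) must equal 1/(2⟨k⟩).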
For the structural entropy S_S({n_r}, {e_rs}) we have simply [10]

    S_S = E − ∑_rs e_rs ln( e_rs / (n_r n_s) ),    (10)

where E = ∑_rs e_rs is the total number of edges, and the limit of sparse networks was assumed, i.e. e_rs ≪ n_r n_s. For the functional entropy we have considered different choices. A first option would be to enumerate all possible Boolean functions of k inputs with a given bias p, Ω(p, k), and compute S_F = ∑_r n_r ∑_k P_r(k) ln Ω(p_r, k), with P_r(k) being the in-degree distribution of group r, which is a Poisson with average ⟨k_r⟩ = ∑_s e_rs / n_r. For each choice of k there are 2^k possible input combinations, p 2^k of which will have output 1 and (1 − p) 2^k of which will have output zero. The total number of Boolean functions is therefore

    Ω(p, k) = C(2^k, p 2^k),    (11)

where C(·,·) is the binomial coefficient, from which we obtain, using Stirling's approximation for 2^k ≫ 1,

    ln Ω(p, k) ≈ 2^k H_b(p),    (12)

where H_b(p) = −p ln p − (1 − p) ln(1 − p) is the binary entropy function. As equation (12) shows, this choice results in an entropy function which grows exponentially with the number of inputs k. This means that, for any choice of μ > 0 in equation (9) above, a trivial maximization of the overall entropy can be achieved by sufficiently increasing the average number of inputs of a vanishingly small fraction of the nodes, since the remaining entropy term of equation (10) will only depend log-linearly on the size and density of the groups. In other words, the functional entropy will exponentially dominate the structural one for many parameter choices.
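The counting in equations (11) and (12) is easy to check numerically; a small sketch (function names are ours) comparing the exact binomial count with the Stirling estimate:

```python
import math

def binary_entropy(p):
    # H_b(p) = -p ln p - (1 - p) ln(1 - p), in nats
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def count_biased(k, p):
    # exact count: choose which p*2^k of the 2^k truth-table rows
    # output 1
    m = 2 ** k
    return math.comb(m, round(p * m))

def log_count_stirling(k, p):
    # Stirling estimate: ln Omega(p, k) ~ 2^k H_b(p)
    return 2 ** k * binary_entropy(p)
```

For k = 4 and p = 1/4 the exact count is C(16, 4) = 1820; the relative error of the estimate vanishes as 2^k grows, and either way the count grows exponentially in k, which is the behavior discussed next.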
Hence, the outcome would always correspond to fully random networks, with a uniform bias p selected to enforce λ = 1. However, the choice of equation (12) corresponds to an unrealistic situation where all entries of the truth table of the Boolean functions with the same bias are equally accessible evolutionarily. As equation (12) itself shows, the number of such functions increases comparably fast to the number of all Boolean functions with k inputs, 2^(2^k). It is known, however, that the vast majority of Boolean functions are not realizable biochemically, and those that are empirically observed often fall within very narrow classes, such as nested canalizing functions [17], which scale in number as ∼k!2^k [18]; far slower than the biased functions above. Furthermore, since we are only considering networks that are forced to lie at the critical line, the vast majority of input combinations of the biased functions considered above are not dynamically accessible. Hence, for the purposes of the actual dynamics, the enumeration done in equation (11) is largely immaterial. In view of this, here we consider instead the modified situation where the input combinations not present during typical dynamical trajectories of equation (1) are not wastefully encoded in the truth table (which can be imagined to contain constant output values instead). We compute the functional entropy approximately via the sensitivity to perturbations itself, 2p(1 − p), which corresponds to the probability that the output of the function changes if one input is perturbed. With this we can write

    S_F = ∑_r n_r ⟨k_r⟩ H_b( 2 p_r (1 − p_r) ),    (13)

which is better behaved than equation (12)⁵.
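Both entropy terms are then simple functions of the ensemble parameters. The sketch below assumes the sparse-limit form S_S = E − ∑_rs e_rs ln(e_rs/(n_r n_s)) and the sensitivity-based S_F = ∑_rs e_rs H_b(2 p_r (1 − p_r)), which are our reconstructions of equations (10) and (13); function names are ours:

```python
import math

def H_b(x):
    # binary entropy in nats; H_b(0) = H_b(1) = 0 by continuity
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def structural_entropy(n, e):
    # sparse-limit SBM entropy: S_S = E - sum_rs e_rs ln(e_rs / (n_r n_s))
    E = sum(sum(row) for row in e)
    return E - sum(ers * math.log(ers / (n[r] * n[s]))
                   for r, row in enumerate(e)
                   for s, ers in enumerate(row) if ers > 0)

def functional_entropy(n, e, p):
    # one H_b(2 p_r (1 - p_r)) term per input; group r receives
    # n_r <k_r> = sum_s e_rs inputs in total
    return sum(sum(e[r]) * H_b(2 * p[r] * (1 - p[r]))
               for r in range(len(n)))
```

Note that S_F grows only linearly with the number of edges, so it can no longer trivially dominate the structural term.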
Although we can compute both S_S({n_r}, {e_rs}) and S_F({n_r}, {e_rs}, {p_r}) analytically, the value of λ can only be obtained analytically for a very small number of groups B. Therefore, for larger values of B we are forced to perform the maximization numerically. For details of the numerical methods we refer to the appendix.

Numerical results
We obtained the values of {n_r}, {e_rs} and {p_r}, characterizing the network structure and Boolean functions, which maximize Λ, subject to the constraint that the dynamics lies exactly on the critical line, λ = 1. We obtained results for different values of the average in-degree per node ⟨k⟩ = E/N, as well as different values of μ, which regulates the relative trade-off between the structural and functional entropies. We consider only the thermodynamic limit with N ≫ 1. We have investigated ensembles with different numbers of groups B; however, we found that in all cases the obtained structures could be fully equivalently formulated as a B = 2 structure, where one or more groups could be merged together, resulting in the exact same network ensemble, with the same structural and functional entropies. Hence we concluded that B = 2 groups are sufficient to describe the obtained topologies for all parameter choices, similarly to previous work focusing on optimization against noise [19] and structural stability [20]. Thus, we focus on the B = 2 case from now on.
Varying both ⟨k⟩ and μ we obtain a variety of structural phases, characterized by distinct large-scale structures, as well as several phase transitions between them, as can be seen in the phase diagram of figure 1. We have identified the phases by searching for discontinuities and abrupt changes in S_S, S_F, {n_r}, {e_rs} and {p_r}, which correspond to qualitatively distinct structural patterns. An example of this can be found in figure 2, which shows the values of both entropies in the phase diagram, where the phase boundaries can be identified.
Overall, for a functional entropy bias μ that is sufficiently small, the observed topology is a fully random one with B = 1, and a functional bias p chosen so that λ = 2p(1 − p)⟨k⟩ = 1. These correspond to the fully random biased Boolean networks often considered in the literature [2]. As soon as μ increases sufficiently, and hence also the entropic cost of choosing biased functions, the network rewires itself into specific configurations, depending on how dense it is. For ⟨k⟩ approaching 2, the network becomes fully random again, since a fully random Boolean network with p = 1/2 is critical precisely at ⟨k⟩ = 2. For increasing values of ⟨k⟩ a fully random network would be super-critical, and hence other structures arise in order to achieve criticality. For a region around 2 < ⟨k⟩ ≲ 3 the system may find itself divided into two groups of identical sizes and functional biases (n_1 = n_2, p_1 = p_2), but which are asymmetrically connected, such that one group receives more links than the other. We observe also a very narrow phase for smaller μ values ('other'), for which our numerical precision did not allow an accurate characterization, but it most likely corresponds to an extension of the n_1 = n_2 phase, as figure 2 seems to imply. For even larger values of ⟨k⟩ the system moves towards two other phases, the first one being a core-periphery-like structure (e_12 e_21 = 0), where one of the groups (the periphery) is predominantly regulated by the other group (the core), which regulates itself. Because of this, the core is also more densely connected than the periphery; however, it also has a smaller bias. For even larger ⟨k⟩, the system transitions to yet
⁵ A similar alternative would be to use directly the bias p_r instead of 2p_r(1 − p_r) in equation (13). We investigated this variant as well, and obtained results qualitatively equivalent to those presented here using equation (13).
another phase, where the number of edges between the two groups is the same (e_12 = e_21), but the groups have different sizes, and hence different in-degrees. The group with the higher in-degree has a much smaller bias p_r, such that its sensitivity to perturbations is significantly lower than that of the other group. Overall, the core-periphery-like structure is preferred for high μ and large ⟨k⟩, in comparison to the other phases.

On the nature of the phase transitions
In order to understand the nature of the transitions between the phases we have described, it is useful to consider the Pareto front of the joint optimization of both entropy functions S_S and S_F. The Pareto front is a line in the (S_S, S_F) plane where one could not increase one of the values without decreasing the other. Since we are linearly combining both values as Λ = (1 − μ) S_S + μ S_F, each line with slope −(1 − μ)/μ in this plane will have the same value of Λ. The maximization procedure consists in, for a given value of μ, finding the intersection of the line with slope −(1 − μ)/μ with the Pareto front which is furthest away from the origin (see figure 4). Note that not all points in the Pareto front will correspond to this intersection, only those which belong to its convex hull. Since these segments of the Pareto front are often disconnected, a jump from one solution set to the other will correspond to a (first-order) phase transition. The actual value of μ where such a transition occurs depends on ⟨k⟩. The only places where second-order phase transitions are observed are at the boundaries between more than two phases, such as between the e_12 = e_21 and B = 1 phases, and at the onset of the n_1 = n_2 phase at ⟨k⟩ = 2.
We also found a relatively simple explanation for the phase transition, and its approximate position, between the fully random B = 1 phase and the other structured phases. For any given value of B, both entropy values will be bounded within some interval, i.e. S_F^min ≤ S_F ≤ S_F^max, and likewise for S_S. Using this, we can rewrite our objective function, up to an unimportant constant, as

    Λ = (1 − μ)(S_S − S_S^min) + μ(S_F − S_F^min).

For small values of μ the value of Λ is dominated by the structural entropy S_S, and for larger values of μ by the functional entropy S_F. A plausible hypothesis for the position μ_c where the transition occurs is when both terms have an equal contribution, so that

    (1 − μ_c)(S_S^max − S_S^min) = μ_c (S_F^max − S_F^min).    (16)

The bounds themselves can be obtained with simple arguments: the maximal entropy is obtained for the fully random case with B = 1, and the minimum value we assume to correspond to the core-periphery-like topology. As shown in figure 5, the value of μ_c obtained in this manner corresponds well to the numerical results obtained.
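The selection mechanism behind the first-order jumps can be mimicked in a few lines: given candidate (S_S, S_F) pairs for a few phases (the numbers below are purely illustrative, not taken from the actual optimization), the maximizer of Λ = (1 − μ)S_S + μS_F switches abruptly as μ crosses the values where the objective lines cross, and the equal-contribution estimate of equation (16) follows directly from the entropy bounds:

```python
def best_phase(points, mu):
    # pick the phase maximizing Lambda = (1 - mu) S_S + mu S_F
    return max(points, key=lambda name: (1 - mu) * points[name][0]
                                        + mu * points[name][1])

def mu_c(dS_S, dS_F):
    # equal-contribution estimate: (1 - mu_c) dS_S = mu_c dS_F,
    # with dS = S^max - S^min for each entropy
    return dS_S / (dS_S + dS_F)

# purely illustrative (S_S, S_F) values for three hypothetical phases
points = {
    "random (B=1)": (10.0, 0.0),
    "two-group": (7.0, 4.0),
    "core-periphery": (2.0, 9.0),
}
sweep = [best_phase(points, mu / 100.0) for mu in range(101)]
# 'sweep' changes value abruptly exactly where the objective lines
# cross, i.e. at the first-order transitions
```

Only phases on the convex hull of the point set are ever selected, which is the geometric statement made above about the Pareto front.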

Conclusion
We have shown that maximum-entropy ensembles of modular Boolean networks poised at the critical line exhibit structural phase transitions from a fully random topology to several structured ones. The phases occur according to globally imposed constraints, such as the average in-degree and the relative entropic cost of modifying Boolean functions versus the network topology itself. In the limit where non-random Boolean functions possess a very high entropic cost, the emerging topology is a two-group core-periphery-like structure, where a small fraction of the nodes is responsible for the regulation of the entire system, which is formed predominantly of a non-regulating majority. If the entropic cost of adapting the Boolean functions diminishes, this core-periphery-like structure is replaced by a tiered one, where the network is divided into two groups of comparable size with bidirectional connections, and one of them has a sensitivity much smaller than the other. Finally, for even smaller functional entropic cost, the emerging topology is a fully random network, with uniformly sampled Boolean functions with the necessary critical sensitivity.
The emerging core-periphery-like topology is very similar to the one arising out of optimization against stochastic fluctuations [19] (as well as structural robustness against failure [20]), and is qualitatively similar to what is observed in real gene regulatory networks, where only a minority of genes (transcription factors) are in fact responsible for regulation. Hence our results suggest not only a possible explanation for this property, but also that such a core-periphery-like organization may provide fitness according to multiple criteria.
The ensembles considered in this work are optimized according to a single criterion, and are subject to no constraints other than the total number of inputs per node. Furthermore, our approach does not take into account topological features which arise out of non-equilibrium processes such as gene duplication [21-23] or frozen accidents, or even other specific parametrizations of Boolean functions such as canalizing [24,25] or nested canalizing functions [26,27]. Nevertheless, this type of analysis can provide insight into the topological features which are necessary outcomes of a given optimization procedure, and allows one to rule out others. For instance, although it is known that gene duplication can result in broad degree distributions [21-23] and assortative modular structure [28], these features are not present in our ensembles, although they are not forbidden. This strongly suggests that these features are not strictly necessary ingredients of robust systems, and are simply the byproduct of non-equilibrium growth processes.
Although we have forced the networks in the ensemble to lie at the critical line, there are associated properties of critical networks, such as the scaling of the 'frozen core' [29] and the number of attractors [30], that were not a priori imposed. It remains to be seen how the obtained topologies affect these characteristics, which we leave to future work.

Figure 1 .
Figure 1. Phase diagram for the critical networks as a function of the average in-degree ⟨k⟩ and functional entropy bias μ. The different phases are color-coded according to the legend, and described in more detail in figure 3.

Figure 2 .
Figure 2. Relative distance of the structural entropy S_S (left) and functional entropy S_F (right) from their maximal values, as a function of ⟨k⟩ and μ. The abrupt changes (dotted lines) mark the phase boundaries in figure 1.

Figure 3 .
Figure 3. Sample block model parametrizations belonging to the different phases in figure 1. At the top are graphical representations of the matrix e_rs (with line thickness corresponding to the magnitude), followed by the group sizes {n_r}, function biases {p_r}, and average in- and out-degree per group, ⟨k_r^in⟩ and ⟨k_r^out⟩, respectively.


Figure 4 .
Figure 4. Qualitative example of a Pareto front of the joint optimization of S_S and S_F (black). The line segments highlighted in red are the convex hull of the Pareto front, which intersects the line with slope −(1 − μ)/μ furthest away from the origin (dotted lines).

Figure 5 .
Figure 5. Critical point μ_c marking the transition between the fully random B = 1 phase and one of the structured phases in figure 1. Blue: analytic values corresponding to equation (16). Green: values obtained from the numerical optimization.