Insights from number theory into the critical Kauffman model with connectivity one

The Kauffman model of genetic computation highlights the importance of criticality at the border of order and chaos. The model with connectivity one is of special interest because it is exactly solvable. But our understanding of its behavior is incomplete, and much of what we do know relies on heuristic arguments. Here, we show that the key quantities in the model are intimately related to aspects of number theory. Using these links, we derive improved bounds for the number of attractors as well as the mean attractor length, which is harder to compute. Our work suggests that number theory is the natural language for deducing many properties of the critical Kauffman model with connectivity one, and opens the door to further insight into this deceptively simple model.


Introduction
The Kauffman model is the simplest and most-studied model of genetic computation.Originally introduced as a toy model of cell type [1,2], it has since garnered interest from many statistical physicists as an archetypal disordered system [3,4].In particular, it displays a critical regime separating frozen and chaotic dynamics, at which many physical and biological systems seem poised [5,6].In this regime, the model exhibits a dynamical phase transition [7] and a disordered landscape similar to the mean-field spin glass [8].The Kauffman model is thus a bridge between deterministic non-equilibrium systems and equilibrium statistical mechanics.But while the physics of the mean-field spin glass is well understood, a similar understanding of disordered Boolean networks has remained elusive.
The Kauffman model is a discrete dynamical system on a directed network of N nodes.The network is generated randomly such that for each node K inputs are drawn uniformly from the pool of all nodes.As a result, each node has K inputs but can have any number of outputs.A typical network for N = 35 and K = 1 is shown in Fig. 1.
The state of each node is 0 or 1 and is a Boolean function of the states of its K inputs.The Boolean functions drawn randomly from the 2 2 K functions on K inputs.The distribution over these is usually uniform for the 16 possible functions for K = 2 or the four (but sometimes just two) for K = 1.Once chosen, the network and Boolean functions do not change as the states are updated.At each time step, the states are updated simultaneously.
Because the number of states of the network is finite, the dynamics will eventually fall into a repeating sequence of states called an attractor or cycle-we use these terms interchangeably.Depending on the number of inputs K and the distribution of Boolean functions, the long-term behavior of the Kauffman model falls into two regimes [3,4].In the frozen regime, attractor lengths die out with the size of the network and small differences in an initial state do not grow.In the chaotic regime, attractor lengths grow with N and small differences in an initial state are amplified, leading initially close trajectories to diverge.These two regimes are separated by a critical boundary, where a perturbation to one node propagates to, on average, one other node.
For over half a century, scientists have sought to understand the number and length of attractors in the critical Kauffman model.In the original K = 2 model, numerical evidence led Kauffman to propose that the number of attractors grows as √ N [2], which he and Socolar later revised to faster than linear [13].Bastolla and Parisi suggested a stretched exponential based on numerical and scaling arguments [14,15].Then, in an elegant analytic study, Samuelsson and Troein [16] proved that the number of attractors grows faster than any power law.
The intense effort to solve the K = 2 model sparked interest in the model with K = 1.Here, the deceptively simple network is composed of loops with trees branching off of them (Fig. 1).Because the nodes in the trees are slaves to the loops, they do not contribute to the number or length of attractors, which are set solely by the m nodes in the loops.In the critical version of the model, all of the Boolean functions must be copy or invert [9,10].In this regime, Flyvbjerg and Kjaer found that the growth rate of the number of attractors is at least 2 0.63 √ N [9].Using a simpler calculation, Drossel et al. obtained a slightly slower growth rate of 2 0.59 √ N [10].The combination of the loop structure for K = 1 and simultaneous updates creates an unexpected connection with number theory.Here the number and length of attractors depends on the divisibility of the loops.Leveraging this allows us to translate bounds from number theory to bounds on the Kauffman model's dynamics.
This paper is organized as follows.In part 2 we describe the K = 1 Kauffman model and introduce the two key quantities that characterize it: the number of attractors and the mean attractor length.In part 3 we show that the number of attractors is at least 2 1.25 √ N .Our approach makes use of Landau's function-the maximum value of the least common multiple of the partitions of a number-and we give an exposition of it here.In part 4, we show that the mean attractor length is at least 2 ϵN , where ϵ is a small constant.Our approach makes use of a generalization of Merton's third theorem, which evaluates the product of 1 − 1/p over all primes p.In the discussion in part 5 we compare our results for the number and length of attractors with other known analytic results.We also discuss why number theory is the right language for understanding our model and consider some open questions for further research.

Dynamics of single and multiple loops
The network in the K = 1 Kauffman model has a particularly simple structure.It is composed of loops and trees branching off of loops, as shown in the top of Fig. 1.The trees that branch off of the loops will eventually fall into periodic behavior determined by the loops, so they do not contribute to the number or length of attractors, which are set by the m nodes in loops.There are four possible Boolean functions with one input: on, off, copy and invert.For a Kauffman model to be critical, a change to the value of one node must propagate to, on average, precisely one other node.So in the critical K = 1 model, all Boolean functions must be either copy or invert.In the bottom of Fig. 1 we show the various attractors generated by the 3-loop and the 5-loop in the network.
It turns our that the detailed arrangement of the copy and invert Boolean functions does not matter-only the parity of of the number of inverts influences the number and size of attractors.We call a loop with an even number of inverts an even loop, and a loop with an odd number of inverts an odd loop.An even loop behaves as if all of the Boolean functions are copy.An odd loops behaves as if all of the Boolean functions are copy apart from one invert.
An even loop of size l, indicated by {l}, has attractors of length k if k divides l.An odd loop of size l, indicated by {l}, has attractors of length 2k if and only if k divides l and l/k is odd [9,10,11,12].For example, for an even 6-loop the cycle lengths are 1, 2, 3 and 6, whereas for an odd 6-loop they are 4 and 12. Let Ax ν denote A attractors of length ν.Then we can represent the number and length of attractors in a loop by the primitive cycle polynomial: A 1 x ν 1 + A 2 x ν 2 + . .., which we call D l if the loop is even and D l if the loop is odd (we dropped the braces around l and l in D l and D l for convenience).Table 1 give some examples of primitive cycle polynomials.
A network containing multiple loops has attractor lengths that are the least common multiples of the attractor lengths in the individual loops.The cycle polynomial for multiple loops can be determined from the cycle polynomials for individuals loops by  15 , where D 3 and D 5 are given in Table 1 and are multiplied according to the rule in Eq. (1).
defining an appropriate product between them.Given A attractors of length ν and B attractors of length ξ, their product is Then the product between two cycle polynomials (primitive or otherwise) is first introduced in [11].For example, the cycle polynomial for the even 3-loop and even 5-loop in Fig. 1 is The cycle polynomial for an even 4-loop and an odd 4-loop is The cycle polynomials for other combinations of loops are given in the right side of Table 1.

Attractor number and attractor length
There are two main quantities of interest in the critical Kauffman model.One is the number of attractors c, and the other is the mean attractor length A. For the critical Kauffman model with connectivity one, these two quantities are intimately related.Let L be the set of loops in the network and m be the number of nodes in loops.Writing the cycle polynomial of the network as i A i x ν i -this being the product of the primitive cycle polynomials for all of the loops-the number of attractors is and the mean attractor length is since all 2 m states of the loop nodes belong to attractors.In other words, for a given set of even and odd loops, the number of attractors and the mean attractor length satisfy the conservation law c(L)A(L) = 2 m .Ultimately, we are interested in c(L) and A(L) averaged over the ensemble of directed networks with K = 1 and the distribution over loop parities.We denote these quantities as Insights from number theory into the critical Kauffman model with connectivity one 6

Even loop
Odd loop Multiple even loops Table 1.Cycle polynomials for single and multiple loops.The primitive cycle polynomials D l (x) and D l (x) indicate the number and length of attractors in an even and odd parity loop of size l, respectively.For example, D 3 (x) = 2x + 2x 3 reads as two cycles of length one and two cycles of length three.The cycle polynomial for two loops is given by the product of the individual primitive cycle polynomials, where the product is defined by Eq. ( 1).Note that we use the shorthand , and so on.
where P (L | N ) is the distribution over network configurations and parities for a fixed number of nodes N .It is also useful to consider c(L) and A(L) averaged over those networks with fixed number of nodes in loops m.We denotes these as where P (m | N ) is the distribution over the number of nodes in loops at fixed total number of nodes N .We will investigate these quantities using insights from number theory in the two sections that follow.

Attractor number and Landau's function
The first key quantity in the critical Kauffman model is the number of attractors.Rather than calculate it directly, we take an indirect approach.We use the conservation law A(L) = 2 m /c(L) in Eq. ( 2) to convert an upper bound on the mean attractor length A(L) into a lower bound on the the number of attractors c(L).We can bound A(L) from above by calculating the largest possible attractor length.Consider all ways of partitioning m nodes into a collection of loops L = {l 1 , l 2 , . ..},where l 1 + l 2 + . . .= m.For m = 8, some of the 22 partitions are shown in the right of Table 1.From the definition of the product in Eq. ( 1), we see that when all of the loops are even, the longest attractor length is the least common multiple of the loop sizes.When one or more of the loops is odd, the longest attractor length is double this.
What is the maximum value of the lcm of the partitions of m?This is precisely the definition of Landau's function g(m) (OEIS A000793 [18]), which we briefly review in  Landau's function g(m) is the maximum value of the least common multiple of the partitions of m.We show this for m = 1 to 10 as well as when m is the sum of the first s primes, up to s = 10.For s ≤ 8, g(m) is just the product of the first s primes.But this is not true in general, as s = 9 and s = 10 show.

A(L) < 2g(m).
Inserting the bound on g(m) from Eq. ( 9) below, we find Inserting this into Eq.( 2) and averaging at fixed m with Eq. ( 3) gives a lower bound on the number of attractors, To re-express this in terms of the total number of nodes N , we must average over the distribution of the number of nodes in loops m using Eq. ( 4).In the large N limit, the probability of m of N nodes occurring in loops is m N exp (−m 2 /(2N )) [9].Summing over this, the mean number of nodes in loops m is asymptotically π 2 N [9, 10].Since Eq. ( 6) is convex, by Jensen's inequality we can replace m with its mean, giving Taking the logarithm, the leading behavior of the number of attractors is where we have neglected constant terms and terms proportional to 4 √ N √ ln N .

Landau's function and its bound
Landau's function g(m) is the maximum value of the least common multiple of the partitions of m.It is shown in Table 2 and Fig. 2. When m is the sum of the first s ≤ 8 primes, g(m) is just the product of the primes: and so on.But this is not true in general: for s = 9, where 2 and 23 are replaced with 9 and 16.
In the early 1900s, Landau proved that ln g(m) asymptotically approaches √ m ln m.More recently, Massias [19] proved In particular, Massias calculated the running maxima of g(m)/ √ m ln m, and showed that there are precisely 378 such maxima (OEIS A103635 [18]).The first 79 of these are shown in Fig. 2 left.The position of the last maxima is m * = 1,319,766, which is where ln(g(m))/ m ln(m) reaches its global maximum, namely where the prime factorization of g(m * ) is

Attractor length and Mertons' theorems
The second key quantity in the critical Kauffman model is the mean attractor length.For a given network, the mean attractor length A has a lower bound that is proportional to the maximum attractor length [10], with a constant of proportionality that we call ζ.When all of the loops are even, the maximum attractor length is the least common multiple of the loop sizes.When one or more of the loops is odd, it is twice this.To write down A, let σ be a binary vector of length N , where σ l = 1 if at least one loop of length l occurs and σ l = 0 otherwise.Given the distribution over the σs, Dropping all but the prime loops gives us a weaker bound, but one in which we no longer have to worry about the lcm: where the product is over all primes p less than N .
The key obstacle to calculating this quantity is that the presence of loops is not independent.However, Drossel and Greil [10] argue that below a critical loop length l * = √ ϵN , where ϵ is a small constant, loops of different sizes are independent and are Poisson distributed with mean 1/l.Therefore the probability of finding at least one loop of size l < l * is 1 − e −1/l , which to second order is 1/l − 1/(2l 2 ).Using Step 1 below, which gives the expected value of the product of randomly selected primes, we can write Eq. ( 10) as Using Step 2 below, we can compute the Euler product in Eq. ( 11), which decreases slowly with l * : Substituting in l * = √ ϵN and taking the logarithm, the leading behavior of the mean attractor length is where we have neglected constant terms and terms proportional to ln ln √ ϵN .
Step 1: Primes in a jar Consider the following problem, which has a surprisingly simple solution.Put the number 1 and the first s primes in a jar, each with probability equal to the inverse of the number itself.What is the expected value of the product of the numbers?For example, for s = 3, if we put the numbers 1, 2, 3 and 5 in a jar with probability 1, 1/2, 1/3 and 1/5, the expected value of their product is 9/2.More generally, label the primes 2, 3, 5, . . .as p 1 , p 2 , p 3 , . .., and let u i be the probability that prime p i is in the jar, with v i = 1 − u i .Then the expected value z of the product can be had by summing over the 2 s configurations, namely, With u i = 1/p i , which is the probability used in our example above, To obtain a lower bound, we expand p i = 1 − e − 1 p i and keep terms to first order in 1/p i .Then Instead of considering the first s primes, we can just as well consider primes less than some cutoff l * .The number of primes less than or equal to l * is asymptotically l * / ln l * , which is a lower bound when l * > 17 [20].Then Eq. ( 12) becomes where the product is over all primes p less than l * .
Step 2: Generalizing Merten's third theorem Mertens' theorems are three results in number theory that generalise familiar sums and products of integers to sums and products of primes [20].For example, the second theorem states lim where M is the Meissel-Mertens constant and the sum is over primes.This is the analogue for primes of the more familiar definition of the Euler-Mascheroni constant γ: where the sum is over integers.The reduction in the rate of growth from ln l * to ln ln l * is the result of the smaller number of terms in the sum over primes.
The Euler product at the end of Step 1 is reminiscent of Merten's third theorem [20], which states lim Here, we generalize this to lim where D(x) is a constant defined below.To do so, we start by writing lim The Meissel-Mertens constant can be expressed in two equivalent ways: Inserting this into Eq.( 13) gives lim .

Discussion
Insights from number theory have enabled us to improve on the best-known bounds on the number of attractors c and the mean attractor length A. In Fig. 3 we compare our results with the known analytic results.We express c and A in terms of both the number of nodes in loops m and the size of the network N .Each of these four functions is shown in a different panel in Fig. 3, and calculating each presents its own challenges.We show lower bounds as black and upper bounds as grey (orange online).When presenting results below, we retain terms to lowest order in the exponent, apart from when this is m, and the deviation from m is of interest.The number of attractors c(m) is shown in Fig. 3A.We find that c(m) > 2 m−1.52 √ m ln m .This is a faster growth rate than 2 0.5m in [9], 2 0.47m in [10], and 2m in [11].Notice that for our result here and the result in [11], the exponent is asymptotic to m.
The number of attractors c(N ) is shown in Fig. 3B.We can convert our bound on c(m) to bounds on c(N ) by using Jensen's inequality.Since 2 m−1.52 √ m ln m is convex, the bound holds on replacing m with its mean m.We obtain c(N ) > 2

Mean attractor length A(N)
Figure 3. Bounds on c and A. We compare our results with the known analytic results for the number of attractors c and the mean average attractor length A, as a function of both the number of nodes in loops m and the size of the network N .Lower bounds are black and upper bounds are grey (orange online).A. Our lower bound on the number of attractors c(m) exceeds those given in [9,10,11].B. The same is true for the number of attractors c(N ).C. We did not calculate a lower bound on the mean attractor length A(m), but we can translate the lower bounds on c(m) into upper bounds on A(m) using c(m)A(m) = 2 m .D. Our lower bound on the mean attractor length A(m) exceeds that given in [9], where we have taken any constants to be unity.For large enough N , the constants are immaterial.plays a special role in understanding it because it is exactly solvable [9].New approaches to solving it, like the one presented here, reveal additional insights, suggesting techniques for more complex models which cannot be solved exactly.One obvious target is the original K = 2 model.Here, the analogs of loops for K = 1 are called relevant components.The cycle polynomials for relevant components multiply in the same way described in Eq. ( 1) to give the cycle polynomial of the whole network.However, understanding the exact behavior for K = 2 relevant components is much harder than for K = 1.A stepping stone in the K = 2 direction would be to consider simple network structures only slightly more complex than loops, such as two loops with a cross-link and one loop with an additional internal link [23].Initial investigations lead us to believe that the cycle polynomials for such structures could also be concisely expressed in number theoretic terms.
While this paper was under review, we were able to use our result involving Landau's function as part of a more ambitious proof that the number of attractors grows as (2/ √ e) N .The resulting paper, which is a follow-up to this one, recently appeared in Physical Review Letters [12].

Figure 1 .
Figure 1.A Kauffman network and its attractors.Top.This typical Kauffman network with connectivity one has N = 35 nodes and m = 8 nodes in loops, namely, a 3-loop and a 5-loop.Only the nodes in loops contribute to the number and length of attractors.Bottom.When all of the Boolean functions are copy, the 2 8 = 256 states form 4 cycles of length 1, 4 cycles of length 3, 12 cycles of length 5, and 12 cycles of length 15.We denote this by the cycle polynomial: D 3 D 5 = 4x + 4x 3 + 12x 5 + 12x15 , where D 3 and D 5 are given in Table1and are multiplied according to the rule in Eq. (1).
where we note that P (L | m) = P (L | N, m).The two ensemble averages are related by c(N ) = N m=1 c(m) P (m | N ) and A(N ) = N m=1 A(m) P (m | N )

Figure 2 .
Figure 2. Bound on Landau's function.Left.The smaller points show the first 3000 values of log 2 g(m)/ √ m ln m, and the larger points (orange online) show their running maxima.Massias showed that there are only 378 such maxima, the largest occurring at m * = 1,319,766.Right.The line is 1.52 √ m ln m, and the points are the first 122 running maxima (the ones with position up to 10 4 ), as well as the last.Thus log 2 g(m) ≤ 1.52 √ m ln m, with equality at m * .

Table 2 .
Values of Landau's function and the partitions that generate them.