Processive and Distributive Non-Equilibrium Networks Discriminate in Alternate Limits

We study biochemical reaction networks capable of product discrimination inspired by biological proofreading mechanisms. At equilibrium, product discrimination, the selective formation of a"correct"product with respect to an"incorrect product", is fundamentally limited by the free energy difference between the two products. However, biological systems often far exceed this limit, by using discriminatory networks that expend free energy to maintain non-equilibrium steady states. Non-equilibrium systems are notoriously difficult to analyze and no systematic methods exist for determining parameter regimes which maximize discrimination. Here we introduce a measure that can be computed directly from the biochemical rate constants and which provides a necessary condition for proofreading. Non-equilibrium systems can discriminate using binding energy (energetic) differences or using activation energy (kinetic) differences, but in both cases, proofreading is optimal when dissipation is maximized, so long as the necessary condition we introduce is satisfied. We show that these constraints can be in opposition and that this explains why the error rate is a non-monotonic function of the irreversible drive in the original proofreading scheme of Hopfield and Ninio. Finally, we introduce mixed networks, in which one product is favored energetically and the other kinetically. In such networks, sensitive product switching can be achieved simply by spending free energy. Biologically, this corresponds to the ability to select between products by driving a single reaction without network fine tuning. This may be used to explore alternate product spaces in challenging environments.


I. INTRODUCTION
Chemical systems in isolation will evolve toward thermodynamic equilibrium, a unique steady state where the concentrations of chemical species no longer change with time, no entropy is produced, and the relative concentrations of different species are a function of their free energy differences alone. In biology, thermodynamic equilibrium is synonymous with death and biochemical systems must avoid it by continuously using energy to maintain nonequilibrium steady states. Far from equilibrium, state occupancies are no longer a function of free energies, in fact, consistent free energies cannot be assigned to out of equilibrium states, and can in principle depend on the full details of every transport process and chemical reaction rate in the system. This allows for much greater flexibility in systems far from equilibrium. This freedom comes at a cost; such systems rarely permit closed form, analytic solutions for quantities of interest, such as steady state concentrations, chemical fluxes, or entropy production (dissipation) [33].
The difference between equilibrium and non-equilibrium thermodynamics is especially salient in the problem of biological discrimination. Discrimination refers to the increase in the concentration of one "correct" product, relative to another "incorrect" product. The ability of living systems to process and transmit information reliably depends on the accuracy of its biochemical reactions; this accuracy can be quantified as the ratio of these "correct" and "incorrect" products. If a discriminatory system were at equilibrium, this ratio would be fundamentally limited by the free energy difference of the two products; discrimination beyond that would be impossible. However, as first noted by Hopfield [18] and Ninio [26], biological processes show accuracy far beyond this limit. For example, in DNA replication, error rates of ∼ 10 −9 ; are observed while the equilibrium limit is ∼ 10 −4 . Hopfield and Ninio proposed a system that they called "kinetic proofreading", which, by coupling certain reactions to an external chemical potential via the hydrolysis of ATP (e.g.), drives the system out of equilibrium, thus negating the limit and permitting enhanced discrimination.
In a simple chemical reaction, the concentration of the final product is determined by its free energy relative to the reactants, while the rate of the reaction is determined by the activation energy barrier and the systems temperature. Activation energy differences are independent of free energy differences and thus cannot contribute to discrimination at thermodynamic equilibrium. However, once a system is driven out of equilibrium, activation energy differences can also be used to discriminate [5,6]. In what follows, we will distinguish these two types of non-equilibrium discrimination, using energetic discrimination to refer to discrimination based on binding energy differences and kinetic discrimination to refer to that based on activation energy differences following Sartori and Pigolotti [31,32].
In cells, biological processes are often carried out by heterogenous, multi-component complexes. Such complexes are ubiquitous in biological information processing systems, from the protein translation system in the ribosome [34], to the gene regulatory networks that control and carry out transcription [16]. Multicomponent complex formation may be a mechanism for assuring the accuracy of biological processes, as discrimination can be enhanced in reaction schemes with many intermediate steps [23]. Reactions such as these, where many intermediate complexes are formed on the way to the final product, may proceed either processively or distributively. Processive reactions must travel a single dominant path from reactants to products, while distributive reactions can go from reactants to products in many ways. For example, the complex formation shown in Figure 1(a) shows a processive mechanism where the association between the components must occur in order, as the nested nature of the molecular shapes ensures that the binding of later components requires that earlier sub-complexes have already formed. Contrast this to distributive complex formation, as shown in Figure 1(b), where the components of the complex are free to associate in almost any order. This gives many effective pathways along which complex formation can occur.
Chemical networks need not rely on unique molecular properties such as the shapes shown in Figure 1(a) to realize processive or distributive assembly. In fact, a single network, such as the ladder like network shown in Figure 1(c,d), can be either processive or distributive depending on the rate constants. For example, a complex might form around an enzyme (E) and its substrate (S), which can exist in either a modified form ( ) or an unmodified form and which associates with many complementary subunits (C n ). The modification in this example could be phosphorylation e.g., with the up and down reactions being carried out by a phosphatase and a kinase respectively. In the network in Figure 1 Thus, there is only a single dominant path to the end state, along the top of the ladder Figure   1(c, shown in green). In this context, modification events can be viewed as "catastrophes" Figure 1(c, dashed blue arrows) , requiring complete disassembly of the complex. If these catastrophes are more likely to occur for an incorrect substrate than for the correct one, such a network can form the basis of a highly selective discriminatory scheme [22].
In contrast, consider what happens if the rate of removal of the modification is greatly increased [ Fig. 1d]. In this case, the complex is free to form with either the unmodified or the modified substrate, as the network can easily move from the lower path of the ladder to the upper path at any time. A typical trajectory [ Figure 1 could be the result of many things. For example, in the case when the modification is phosphorylation, this rate could be increased by increasing the expression of the phosphatase.
In this work we introduce a measure that quantifies the degree of distributivity versus processivity in a network. This measure is a global property of the network that depends on both network topology and the reaction rates between states. We show that distributivity is required for out of equilibrium networks to discriminate based on activation energy differences, and that processivity is required to discriminate based on binding energy differences. We call this measure orthogonality because it precisely quantifies the degree to which the columns of the graph Laplacian are mutually orthogonal to one another. We use this measure to solve an outstanding question about the non-monotonic behavior of the discrimination ratio in response to increasing dissipation in a classic proofreading scheme.
In spirit of previous work [12,22,23] that sought to study how systems can exist in different non-equilibrium regimes without changing the network topology, we explore discrimination in two classes of fixed topologies, the so-called "butterfly" [35] and "ladder" [22] graphs.
We show how the rate constants of these discrimination schemes can be tuned to put the network into either processive or distributive regimes, and thus be utilized for energetic or kinetic discrimination respectively. Finally we show that we can design networks in which orthogonality is extremely sensitive to changes in a single reaction potential and demonstrate a principled way to design systems that can switch from one product forming regime to another without changing the "hard-wiring" of the system, and without extensive "finetuning" of chemical potentials throughout the system, but rather by modulating a single chemical drive.
The ability of a chemical system to change its product space simply by changing the availability of a chemical driving force, such as ATP, provides interesting ways in which biological systems might respond to environmental conditions. Living systems spend a large proportion of their energy on maintaining osmolarity and membrane potentials through the actions of ATP-driven pumps [1]. The idea that modulation of ATP availability could drive a chemical reaction network from one product space to another raises interesting possibilities for mechanisms of either improvisation or contingency in response to adversity.

II. PRELIMINARIES
We consider systems whose dynamics are described by continuous time Markov chains.
System states and transitions can be represented as a strongly connected, directed graph with n states, and have dynamics represented by a matrix differential equation known as the Master equation dp dt = Lp where L is the n × n Laplacian matrix, also known as the generator in stochastic thermodynamics. This matrix encodes the transition rates k ij = (j → i) of the network in its off-diagonal elements (i = j). The diagonal elements are chosen such that all columns sum to zero: and p is an n dimensional vector representing the dynamic occupancy of the network states.
We require these systems to be strongly-connected, meaning that any state is accessible from any other state, though not necessarily directly. Thus there are no "absorbing" states or isolated subgraphs. Such networks have a single unique steady state [21], which is the solution ρ to the equation Lρ = 0. Mathematically, this vector ρ is called the nullspace, or kernel, of the Laplacian L; it is also the eigenvector of the L corresponding to the eigenvalue of 0. Physically, ρ is of interest because it is the (possibly non-equilibrium) steady state of the network. In equilibrium thermodynamics, where detailed-balance holds, ρ can be solved for exactly, as the ratio of the steady state concentrations of any two species i and j can be computed directly from their free energy difference. Out of equilibrium, however, energies of states are not well defined [23], and the calculation of ρ in such networks does not generally permit a simple analytic solution. Consider, for example, the network with rate constants given in Figure 2(a). Based on these rate constants, no consistent free energies can be assigned to the three states as this system is not in equilibrium, and energy is dissipated in each cycle. The free energy differences (up to an independent multiplicative factor) of states A relative to B is log 2, and of B relative to C is log 1.5, however, the free energy of state C relative to A is not log 3, as would be expected from equilibrium considerations, but rather it is equal to 0. Thus, there is free energy of log(3) dissipated for each cycle around this network. For general non-equilibrium networks it is more difficult to compute the dissipation directly as we have done here, however, as long as the steady state vector ρ is known, we can calculate the dissipation from ρ and the rate constants k ij as the entropy production rate [15,33] and when we refer to "dissipation" in what follows we will be calculating it according to Equation 1.

A. Laplacian Geometry
For each directed graph, such as the one shown in the upper panel of Figure 2 [27]. The columns of L are n vectors in an n dimensional space, however, these vectors span only an n − 1 dimensional subspace, as the matrix is not full rank. This is shown in Figure 2(b) where the three vectors v A , v B , and v C span a plane. The nullspace, or kernel, of L is the vector that gives the steady steady solution ρ to the matrix differential equation dp dt = Lρ = 0. In this example, the dimension of span(L) is 2 and we can visualize the three vectors v A , v B , and v C in the plane with an appropriate coordinate transform. In general, the matrix B = {e 1 − e 2 , e 2 − e 3 , ..., e (n−1) − e n }, where e i ∈ R n is the standard basis, provides a basis for L in R (n−1) [20]. While this projection is useful for visualization, it is not part of the approximation we will introduce, nor is it required for the measure we present later. Consider the three vectors v A (green), v B (blue) and v C (red) in Figure 2(c). If we remove the ith vector v i , equivalent to removing the ith column of L, then the remaining vectors form a shape called a polytope, which in two dimensions in a parallelogram and in three, a parallelepiped. If we call L with the ith column removed L 0i , then the polytope associated with state i will be denoted P (L 0i ). The polytopes associated with B, P (L 0B ), consisting of v A and v C , and that associated with C, P (L 0C ), consisting of v A and v B are shown in Figure 2(c) in red and blue respectively. The geometric insight of this paper is that the Laplacian matrix L defines a collection of polytopes P (L 0i ) associated with the states i, and that the ratio of any two steady state concentrations is given by the ratio of the volumes of their corresponding polytopes. (A proof of Equation 2 is given in Appendix A).
In discrimination schemes, it is often the ratio of the steady state concentrations for the correct and the incorrect product that is of interest. Thus, the problem of computing this ratio reduces to a problem of computing the ratio of the volumes of the two polytopes as given by Equation 2.

B. Polytope Volumes
Geometrically, the volume of a polytope is given in general by the famous "base-height" formula, that is, the volume V ol(P (A)) is given a i ·V ol(P (A i )) where P (A i ) is the polytope formed from A with the ith column removed and (a i ) is the magnitude a i , the component of the v i perpendicular to span(A i ). In two dimensions the height is the perpendicular component of the adjacent side with respect to the chosen base. Note we are free to choose either side as the "base". In three dimensions, we have a parallelepiped comprised of three faces. The volume can be given as the area of any of these faces multiplied by the magnitude of perpendicular component of the remaining side with respect to the chosen face. Any face of a parallelepiped is itself a parallelogram, and its area can therefore also be computed using the base height formula for the remaining face. This procedure generalizes to higher dimensions were we choose one column, v i , of A and compute its height a i perpendicular to the subspace spanned by A i . We can then compute the volume of the base in the same way, in one fewer dimensions, iteratively until we reach the final one-dimensional subspace.
This gives the general formula,

C. Ratio of Polytope Volumes
In general, an n dimensional polytope will have n facets, each (n − 1) dimensional, which are the higher dimensional equivalent of faces. The polytope P (L 0i ) corresponding to state i and the polytope P (L 0j ) corresponding to state j will always share a facet which is formed from L ij which is L with both columns i and j removed. In the example in Figure 2, this the polytopes are 2-dimensional and their shared facet is simply v A , and in general, it will be an (n − 2)-dimensional polytope. In the example in Figure 3 the 4-state Laplacian defines four three-dimensional polytopes, which each share a two-dimensional facet (shown in red).
That the polytopes associated with i and with j share a facet comes from the fact that we can remove columns in any order. We have already removed column i and column j to obtain the polytopes associated with state i and state j respectively. To calculate the volumes of these polytopes using the base-height formula, we will start by using removing v j from P (L 0i ) and v i from P (L 0j ) giving the same "base" for both, namely P (L ij ), which is their shared facet. This leads to a simplification of Equation 2, Where (v i ) ⊥ is the component of v i which is perpendicular to the span(L ij ). In the example shown in Figure 2, by removing v b we get the polytope associated with B, P (L 0B ) shown in red, similarly removing v C gives the polytope associated with C shown as the parallelogram P (L 0C ) in blue. We can see that these polytopes share the facet L BC which is L with columns v B and v C removed, which is simply v A . In this example, choosing the shared base to be v A , the ratio of the areas is

D. The Discrimination Ratio in Terms of Projections
The component of v i perpendicular to the span(L ij ) can be written in terms of the subspace projection of v i onto span(L ij ), as This is analogous to decomposing a vector into parallel and perpendicular components, Equation 3 can be rewritten as In the example shown in Figure 2, the subspace L BC is 1-dimensional and projections onto it are simple to compute in terms of the vectors v i . They are given explicitly as . In this example, we can calculate the ratio of the occupancy of B to C at steady state directly, which for the values shown [ Fig 2a] give a ratio of 7/6. Similarly, ρ B ρ A = 7/10 and ρ C ρ A = 6/10. If we combine these ratios with the normalization condition that ρ = 1, ρ is given as [10/23, 7/23, 6/23]. Again, note that because this system is not in equilibrium, detailed balance does not hold, it is easy to see e.g. that ρ A k A→C = ρ C k C→A .
E. An Approximation of the Discrimination Ratio Equation 4 shows that an analytical expression for the projection onto the subspace spanned by L ij will yield an analytic expression for the discrimination ratio. However, computing such a projection requires having an ortho-normal basis for the subspace. If we have such an ortho-normal basis L ij orth , then the projection is given simply by, However, the columns of L ij will not, in general, form an ortho-normal basis. We can orthogonalize the subspace, using a procedure such as the Gram-Schmidt process or by computing a matrix inverse e.g., however the recursive nature of these procedures yields expressions that are generally not analytically tractable. However, a general solution for the discrimination ratio can be derived if these projections can be computed simply. In the special case when the columns of L ij are orthogonal, if we normalize the columns to unit length, and denote the resulting matrix L ij , and compute the ratio with the projection given in Equation 5, yielding the expression, When the columns of L ij are mutually orthogonal, this expression is exact. However, if the columns of L ij are not mutually orthogonal, then this will only be an approximation.
F. Expression for the Error in the Approximation.
The simplest way to compute whether the columns of L ij are mutually orthogonal is to compute their pairwise dot products. This is given compactly as L ij L ij , a symmetric matrix whose i, jth element is given by v i , v j . The diagonal elements will always be 1, as the columns are normalized ( v i , v i = v i 2 = 1) and the off diagonal elements v i , v j = 0 when columns are orthogonal and v i , v j > 0 otherwise. Naturally, if we subtract L ij L ij from the identity matrix I, then the diagonal elements will go to zero and when all of the columns are mutually orthogonal, the off-diagonal elements will be zero as well. Thus the Frobenius norm of this matrix will be zero when the approximation is exact. Thus we posit an expression for the error as follows, This expression is a natural measure for the degree to which the columns of L ij are mutually orthogonal, thus we call it the orthogonality of the matrix L ij and denote it with the symbol Θ(L ij ) with the convention that Θ(L ij ) = 1 − ∆(L ij ).
Finally, we prove that this expression, which quantifies the degree of orthogonality of the matrix L ij , is in fact a bound on the error in the approximation we introduced in Equation 6. Let us assume that the true ortho-normal is given as L ij orth .
A is a general projection matrix onto the column space of A, and AA is the projection matrix onto A in the case that the columns of A form an ortho-normal basis, which is easy to verify, as the term (A A) −1 = I. The second line is given by Cauchy-Schwartz, and the third equality is proven in Appendix B. In general, orthogonality is a function of both the number of nodes in the network and of the rate constants. In this work, we were mostly focused on comparing orthogonality between networks with the same number of nodes (and the same edges) when the rate parameters on those edges vary. The error bound (∆L ij ) that we calculate is actually a sort of "non-orthogonality", as it increases as the columns of the Laplacian become less orthogonal. There is a maximum "non-orthogonality" on a graph with a given number of nodes (N ) which is given by, I − 1 F where I is the identity matrix of size N and 1 is the matrix of all ones of size N. This can be given in terms of N as √ N 2 − N . Thus, the orthogonality for a graph of size N can in principle fall in the . In graphs where the nodes and edges are fixed, these values will be even more constrained, as some of the entries of L ij are forced to be 0 where edges are absent. Furthermore, the remaining non-zero entries must form a Laplacian matrix, with the diagonals set such that the column sums are 0.

III. RESULTS
In the Preliminaries section, we presented an approximation for the discrimination ratio and an error bound for this approximation. The error bound is given by the degree to which the column space of a subset of the generator matrix is mutually orthogonal, and thus we call this bound the orthogonality of the matrix (Eq. 7). Here we will present two results related to orthogonality. First, we show that orthogonality quantifies the degree to which a network is processive vs distributive, with processive networks in the low-orthogonality limit and distributive networks in the high-orthogonality limit. Second, we find that orthogonality is minimized in networks which discriminate based on binding energy differences (energetic discrimination), and is maximized for networks which discriminate based on activation energy differences (kinetic discrimination). Taken together, these results show that processive networks are required for energetic discrimination, and distributive networks are required for kinetic discrimination.

A. Orthogonality in a Processive vs. a Distributive Network
Here we introduce a simple 4-node toy model which demonstrates that orthogonality captures whether a network is distributive or processive. Consider a network [ Fig. 4a, inset] which has four nodes and in which the connections that would form a line graph (black arrows), are considered separately from the other connections (red arrows). If the reversible reactions represented by the black arrows have rate k, and those represented by the red arrows have rate l, we can, in this simple model, change the network from distributive to processive by changing the ratio r = k/l. First consider the case when r 1 (k l). In this case, there is a "dominant path" from the reactants (node 1) to the products (node 4), as reactions are much more likely to proceed along the black pathway as the rates along it are much faster than the pathways that use the red connections. In this case we would say that the network is processive. However, in the case when r ≈ 1 (k ≈ l), this is not the case, the red reactions are just as fast as the black reactions, and this opens many equally good pathways between the reactants and products. In this case, the network would be considered distributive. Thus, for this simple toy model, as the ratio r increases from r = 1 to r 1, the network changes from distributive to processive. If we look at the orthogonality of the network as we increase r, we see that it is decreasing [Fig 4a]. On advantage of this simple model is that the the orthogonality is calculated with respect to a

2-dimensional subspace spanned by two linearly independent vectors [Fig 4b, red arrows],
shown for three different values of r. In this case, the orthogonality is captured by a single value, i.e. the angle between these two vectors. If we rotate the four vectors in Figure   4(b) so that we can visualize the ange (α) between these two vectors, we see that as it approaches π/2 the orthogonality increases, as expected (Fig 4c). These results suggest that in general, a network with line topology will have lower orthogonality than one with an all-to-all connected topology if all of the rates are of roughly equal magnitude. We can compute that this holds in general for all N (Appendix C), where N is the number of nodes in the network. The increased orthogonality of the all-to-all relative to line topology captures a more general fact: orthogonality tends to decrease as connections are removed from a discrimination scheme, so long as these connections are of equal order magnitude to remaining connections, which we demonstrate computationally ( Figure S1).

B. Orthgonality in the Hopfield-Ninio Discrimination Scheme
We first demonstrate the relationship between orthogonality and discrimination in the classical Hopfield-Ninio scheme, shown graphically in Figure 5(a). Here, substrates S = {W, R} compete to form complexes with enzyme E. 'Wrong' and 'Right' products are formed from substrates W and R (respectively), at rates proportional to the steady state occupancy of the final pre-catalysis complex ρ ES . We thus define the error fraction achieved by the discrimination scheme to be Ninio and Hopfield designed this scheme to amplify differences in the binding energies of EW and ER complex formation. Reaction rates are defined below in Equations 8,9, and 10 following the Rao and Peliti [29]. with the rate constants given in Kramer's form. A pseudo free energy diagram which corresponds to these definitions of the rate constants is shown in Figure 5(a, lower).
We have for the EW reactions: where: ω, ω p set overall rates; ( − γ) is the enthalpy difference between E and EW * and ( p + γ) is the free energy difference between EW and E. The ER reactions are given by: For the 'right' reactions, is the enthalpy difference between E and ER * and p is the difference between ER and E. δ and δ p set the activation energy differences between right and wrong complexes for the first and proofreading reaction respectively.
There is no discrimination along the transitions between the intermediary and precatalysis states: Note that for both the R and W reactions cycles, the total free energy consumed in a cycle from E to ES * to ES and back to E is equal to ( + i + p ) in both cases (the γ cancels for the W side). Thus, no consistent free-energies can be assigned to the states unless this sum is equal to zero and the system is in equilibrium. However, we are free to choose the values of , i , and p , and their sum will, in general, be non-zero.
We begin by considering the relationship between error and orthogonality in the regime which is governed only by binding energy differences (γ > 0, δ = 0), termed the 'energetic regime'. The Hopfield-Ninio scheme was originally designed for discrimination in this regime.
Simulations reveal that low orthogonality is necessary, but not sufficient, for low error rates in the energetic regime [ Figure 5 In the original Hopfield scheme, it was already clear that enhanced discrimination beyond the equilibrium limit was only possible in certain parameter regimes. In the following, we show how we can use orthogonality to find these regimes. In schemes based on binding energy differences, orthogonality must be minimized and dissipation maximized for optimal discrimination. Let us start by looking at the limit, long appreciated to be one of the limits required for the Hopfield-Ninio scheme to reach its lowest error, ξ energetic → e −2γ .
Hopfield argued for the necessity of this limit (Eq. 11) by pointing out that if ω p > ωe We demonstrate that orthogonality is monotonically decreasing as this limit is approached (Appendix E) which provides an alternative explanation as to why this limit is necessary.
A less well-appreciated requirement for energetic discrimination concerns the nonequilibrium drive, generated in this case by adjusting i such that |( + i + p )| increases. Some amount of drive is crucial for the discrimination scheme to be able to achieve error rates lower than the equilibrium free energy difference of the products γ, but too much drive will destroy this enhanced discrimination [35]. We can understand this nonlinearity in terms of orthogonality ( Figure 5(c)). Energy dissipation is helpful for discrimination up until the point at which it begins to drive up orthogonality.
We next turn to the regime governed by only activation energy differences (γ = 0, δ > 0), termed the 'kinetic regime'. Simulations reveal a bound opposite to that of the energetic regime: high orthogonality is necessary (but not sufficient) for low error (Supplemental Figure S2). Analytically, we can derive the error in this regime to be The ξ kinetic is minimized when η 1 and c b. That is, when there exists high drive (ω i e i ω) and free enthalpy product differences ( p 0). We demonstrate that orthogonality is monotonically increasing as these limits are approached (Appendix E).
Differences between the two discriminatory regimes are summarized in Figure 5(d). Increasing the dissipative drive ( i ) increases orthogonality, which allows for kinetic discrimination but precludes energetic discrimination.
The ability to modulate orthogonality via driving the second reaction via i suggests a simple strategy for dissipation-driven product switching. If products EW, ER are favored by different energy types, they can be selected for by driving only the second reaction via i such that the network moves from low to high orthogonality. We achieve a four order of magnitude selection effect via this scheme ( Figure 6). Because the Hopfield-Ninio scheme only has one intermediary product, it is difficult to interpret in terms of the number of effective pathways towards the discriminatory products. In order to illustrate the connection between discrimination, effective pathways and orthogonality more clearly, we turn to a more general setting.

C. Orthogonality in a General Setting
Murugan, Huse, and Leibler recently discovered that energetic discrimination in a general network requires a discriminatory fence [23], which can be idealized as a ladder graph having two sides, each with N loops (Figure 7(a)). The sides of the ladder are symmetric about the 0 node; the network aims to discriminate between states represented by its upper corners (i.e., x s2 in Figure 7(a)). Rate constants u S , d S , S = {W, R} will differ for the 'Wrong' (W ) and 'Right' (R) sides of the network.
The ladder idealization captures the fact that a general energetic discrimination network must be processive and have a dominant 'forward' (f ) and 'reverse' (b) path which are parallel to each other and effectively one-directional. On the pathway towards the product state, there is the constant threat of 'discard' (d), after which the reaction is exposed to a one-directional pathway away from the product state (b). There is also the possibility of 'rescue' (u) from discard.
The Kramer's form rate constants for this network are And there is no discrimination (f R = f W = f ) along the forward or reverse pathways: which we approximate to be one-directional for analytical convenience, but treat as bidirectional with small reverse rates when necessary for computing dissipation.
It is clear from the Kramer's form constants that to discriminate in the energetic regime (i.e., via γ), a high discard rate (d) is required. Indeed, the error rate for an N -loop network [28] in this regime is which achieves its minimum when discards are high relative to steps toward reaction completion: Discrimination in this regime is fundamentally processive, and global: accuracy relies on sequential exposure to frequently realized discard pathways, and each reaction step contributes to discrimination via the potential for discard. Correspondingly, orthogonality monotonically decreases in the Equation 14 limit (Appendix F), and is minimized in the high discard regime (Figure 7(b), solid lines).
In contrast, we find that the kinetic regime has error fraction given by (Appendix F): where The error ξ kinetic is minimized when η → 0 and φ → ∞, which is to say that: These limits imply that network dynamics are being pushed quickly towards the final product nodes (ω f , u large, ω b small). This makes local discrimination possible; and indeed orthogonality is monotonically increasing in the Equation 16 limit (Appendix G).
Quick movement towards final product nodes is in opposition to high discard rates; we can thus summarize the difference between the energetic and kinetic regimes by observing their difference with respect to the discard rate (d, Figure 7(b) x-axis), which reveal the expected orthogonality-error relationships in the two regimes. Note that these limits correspond to the dynamical phase localization limits described in [24].
We are now in a position to understand the orthogonality of this model in terms of its effective pathways towards the final product nodes. The energetic discrimination requirement that f << d means that the network effectively contains only a single pathway to the product. Intuitively, the single pathway results from the slowness of one-directional progress towards the final product; rescue pathways cannot add additional paths to the final product because they are effectively equilibrated relative to the slow forward progress. Corresponding to this intuition, we find analytically that u, b, have essentially no effect on orthogonality in the f << d regime (Appendix G). This argument is consistent with the fact that the discrimination error in the energetic regime (Equation 13) is independent of u, b, but in the kinetic regime, which requires d << f, we find that u, b are important factors in the error expression (Equation 15) and orthogonality requirements (Equation 16).
In the energetic regime, we observe that as f becomes close to d (red tick, Figure 7(c)), orthogonality rises sharply. We understand this to result from many more effective pathways now leading to the final product. Again, the rise in orthogonality as we increase f leads to the non-monotonic behavior of the error rate.
Our understanding of orthogonality in terms of effective pathways allows us to apply thermodynamic drive in the energetic regime such that drive does not increase orthogonality.
Our arguments above state that f << d, enforces the single pathway and hence maintains orthogonality. Therefore, if we dissipate energy to drive d, we should find that the orthogonality decreases, and indeed we do [ Figure 7(d)]. Note that Figure 7(c) was generated with the same parameters as Figure 7(d); all that's changed is the reaction we choose to drive. In this parametric limit, the orthogonality and dissipation requirements are not contravening.
Finally, we note that (as in the Hopfield-Ninio regime) highly selective -seven orders of magnitude -dissipation driven product switching is possible between states which are favored by different energy types (Figure 8).

IV. DISCUSSION
We have introduced a measure, which we call orthogonality, that was derived from an error bound on an approximation for the steady state ratio of states in a general nonequilibrium network which can be described by a master equation. This of course presents some limitations, foremost, we require that the dynamics can be linearized, that is that they can be represented by a set of linear differential equations in the form dp dt = Lp. This does not limit the classes of reactions as much as it might at first seem, as many networks whose microscopic interactions are governed by non-linear differential equations may be linearized with carefully defined states and edge labels [14] or by an appropriate coarse graining [9].
For example, a linearization of the classic enzyme based catalysis scheme can be derived from the non-linear mass-action equation by including substrate concentration in an edge label. Interestingly, this recovers the classic time-scale separation assumed to derive the Michaelis-Menten equation [14].
We propose that this orthogonality quantifies the degree to which such a network is processive versus distributive, and show that processive networks, which have a single dominant pathway between reactants and products, are characterized by low-orthogonality, while distributive networks which have many realizable paths, have high orthogonality. In order to discriminate via binding energies, a processive network is required because discrimination is achieved by frequently discarding intermediates from the dominant path. For such inherently processive processes, discrimination is a global function of discards at sequential steps throughout the graph. Final product formation is rare, thus slow. In contrast, discrimination via kinetic barriers is fast. In the kinetic regime, discrimination relies on creating final products quickly, enabled by distributive networks which have many paths towards the final products. These results help to explain why "rescues" in general energetic discrimination schemes increase speed at the cost of accuracy, as increasing the rates of such reactions increases network orthogonality, which is beneficial for speed but detrimental to accuracy in energetic schemes.
Our results suggest that orthogonality is related to the degree of processivity or distributivity in a network, however, we do not have mathematical proof of this relationship. This is in part because orthogonality is the only measure we know of which quantifies this aspect of networks, and thus we have nothing to compare it to directly. While no other measures seem to capture the number of effective pathways in the same way, we can compare it to other graph theoretic measures, such as the graph sparsity and we do indeed find that orthogonality decreases as graphs become more sparsely connected ( Figure S1). It is interesting to note that activation energy differences are symmetric changes to the Laplacian, while binding energy differences are not, this may be significant to our understanding of why activation energy differences require high-orthogonality and binding energy differences require low-orthogonality. It is also interesting to note that we can view this recursive orthogonalization procedure as the source of the extreme parametric complexity in general expressions for the discrimination ratio. It is likely that for equilibrium systems, many symmetries simplify the orthogonalization and result in the simple expressions we are familiar from detailed balance, although it is beyond the scope of this work to derive those.
It is interesting to consider this result in the context of protein complex assembly [25].
Sartori and Leibler [30] have recently proposed that a significant proportion of the discrimination necessary for accurate protein complex assembly can be achieved by equilibrium energy differences in protein-protein interactions. Our results predict that non-equilibrium mechanisms which amplify these energetic differences should result in complexes being assembled sequentially, and slowly. If non-equilibrium mechanisms instead amplify kinetic differences to achieve accurate assembly, we expect a complex's component subunits to assemble in many different orders, quickly.
One potential use for this work is to provide a general procedure in which to find the parametric limits for a network which permit enhanced non-equilibrium discrimination. The parametric landscape for general networks is complex and it is difficult to optimize accuracy.
In networks with relatively few species, there regimes can be found intuitively, as was done for the Hopfield-Ninio scheme, but for larger networks, until now the only way to find the appropriate parameters is by brute force sampling. This was the approach taken in both In some cases, analytical expressions for the orthogonality in certain parametric limits may also be tractable.
Furthermore, our results clarify the role of thermodynamic drive in nonequilibrium discrimination. We find that both kinetic and energetic discrimination are enhanced by increasing dissipation, but are subject to necessary requirements on orthogonality, which itself can be modulated upwards or downwards by free energy expenditure. When dissipation and orthogonality requirements contravene one another, discrimination schemes will have error rates that are non-monotonically increasing with the dissipation. This not only explains the observation of such behavior for a well-known discrimination scheme, but also leads naturally to the idea of modulating orthogonality to select between energetically or kinetically favorable products. We show that by modulating orthogonality with energy expenditure, discriminatory networks can indeed achieve sensitive product switching. In particular, driving a single reaction type is sufficient for sharp selection between products, if the products are favored by different energy types and if the driving shifts the orthogonality of the network.
Networks which are capable of switching from processivity to distributivity may be ubiquitous in biochemical systems. The ladder topology network shown in Fig 1(c, d) is an abstraction and can be useful to describe many different cellular processes. In general, the substrate need not be a protein and the modification need not be phosphorylation, this network could equally describe, e.g., a reaction complex forming around a nucleic acid substrate with methylation as the modification. In fact, with a nucleic acid substrate, the modification could even be the nucleic acids's own self-association into a stem loop. In this case, the "removal" of the modification could be driven by the activity of a helicase and modulated by ATP availability or by helicase gene expression for example.
Biologically, this possibility may be realized in cytoplasmic ribonucleoprotein (RNP) granules [7]. These granules are composed of RNAs and proteins co-localized in liquidliquid phase separated droplets. Their components interact promiscuously and are known to be enriched for multivalent components [3], which we propose may serve to increase distributivity and thus orthogonality. RNA contributes to promiscuous granule interactions via both RNA-RNA interactions and serving as a protein scaffold [10]. RNA structure is appealing as a modulator of orthogonality because it can be modified by driving a single reaction type. It has been recently reported that ATP within granules is hydrolyzed by DEAD-box proteins, which remodel RNA by unwinding duplexes [17]. This ATP-driven unwinding of RNA has been reported responsible for the dynamic makeup of RNA inside of granules, and for granule dissolution. It is possible that driving this reaction type can tune the orthogonality of granule interaction networks, perhaps allowing for exploration of novel interactions among components. Such an ability is consistent with the apparent importance of granules in a wide variety of cellular responses to environmental cues, including stress response [8], transcriptional regulation [2], and local, activity dependent translation of mRNA at neuronal synapses [4,19]. From the theoretical side, it would be interesting to investigate how orthogonality changes in a physical model of phase separation. Experimentally, it would be exciting to engineer a discriminatory network in which we can tune the orthogonality, and measure the resulting speed, accuracy, and product space directly. [27] This is due to the graph being strongly-connected and the normalization condition.
[28] An N -loop network will strictly speaking be composed of 2N + 1 loops, N on each side of the ladder and a single reactant node. [35] Felix Wong, Ariel Amir, and Jeremy Gunawardena. Energy-speed-accuracy relation in complex networks for biological discrimination. Phys Rev E, 98(1-1):012420, July 2018.

Appendix A: The discrimination ratio as a ratio of polytope volumes
In this section we will prove that the ratio ρ i /ρ j of the ith and jth elements of the steady state vector ρ can be expressed as the ratio of the volumes of the polytopes associated with i and j.
The result follows from these equalities: where L i k represents the matrix formed by removing row k and column i from matrix L, and L 0i is formed from L by removing column i only. For matrix A, V ol(P (A)) represents the volume the parallelotope formed by the columns of A and vector v i represents the ith column of L; We proceed by proving each of the equalities. To prove the first equality, it will be useful to have the definition of the adjugate matrix at hand.
where A ji is denotes the (n − 1) × (n − 1) matrix resulting form removing row j and column i from A.
Proposition A.2 (Discrimination ratio in terms of determinants with column and row cuts). We aim to demonstrate that Proof. The proposition was proved in [21]. We include the argument here for completeness.
By the Matrix-Tree theorem, the rank of a strongly-connected N dimensional Laplacian matrix is N − 1. The nullspace is therefore one-dimensional, and can be represented by a single basis vector ρ.
It will suffice to prove that ρ i = det(L i k ). Recall the Laplace expansion for the determinant: adj(L) · L = L · adj(L) = det(L) · I = 0 n×n , where 0 n×n denotes the n by n zero matrix and the final equality follows from L not being full rank, hence det(L) = 0.
Consider that L·adj(L) = 0 implies that Lv = 0 n×1 for all v which are columns of adj(L).
That is: the columns of adj(L) are equal to ρ. This gives the result.
We now prove the second equality.

Proposition A.3 (Discrimination ratio in terms of column cuts only).
We now wish to demonstrate that the equality presented in the previous proposition does not require the removal of some row k [11]: Proof. The first equality is a common characterization of the determinant. The second result follows from a series of equalities where: the first equality is by definition of a polytope volume generated by a non-square matrix; the second equality results from applying the Cauchy-Binet formula; the third equality follows from noting that det(L i k ) = det(L i k ), ∀ k, k ∈ 1 . . . N.
We now prove the final equality in Equation A1. First, it is useful to recall the base-height formula for determinants. to the span of {v 2 , · · · , v n }. It follows that And of course we can carry out this procedure successively for V ol n−1 , V ol n−2 , . . .. This gives the desired result.
Proposition A.5 (Discriminatory ratio in terms of normalized projections). Finally, we Proof. The result follows directly from the base-height formula for determinants.
where proj L ij (v j ) denotes the projection of vector v j onto the subspace spanned by the vectors of matrix L ij , formed by deleting columns i, j from L. Notice that in the numerator, we have chosen to begin the base-height iteration with vector v j . Because L 0i already has column i removed, this procedure yields -in the numerator -a polytope base generated by the non-i, j columns in L. In the denominator, beginning the base-height iteration v i also yields a polytope base generated by the non-i, j columns. These bases cancel to give the desired result.

Appendix B: Orthogonality is equivalent to the projection approximation error
In this section, we aim to prove the following proposition.
Proposition B.1 (Projection approximation). Let S be a matrix having full rank (note that our L ij are of full rank). We have that Proof. Let S have singular value decomposition S = U ΣW .
And similarly (noting that S S is invertible because S is full rank): This gives the result.
Appendix C: Orthogonality of an equal weighted all-to-all graph is greater than that of a line graph In this Appendix we demonstrate that the orthogonality of an N node line graph is strictly less than an N node all-to-all connected graph, in the toy case where all rate constants are the same. The result follows from directly calculating the orthogonality for each topology, which we do in turn.
Proposition C.1 (Θ for a line graph). For a a line graph with bidrectional connections of equal weight (set to 1 without loss of generality), the orthogonality is given by Proof. The result follows from direct computation of i, j , ∀ i = 1, N. There are only two types of nonzero i, j . The first type is i, i + 1 ; there exist 2(N − 1) terms of this type. The second type is i, i + 2 ; there exist N − 4 entries of this type. The first type of nonzero term represents 'neighbors.' The second represents nodes separated by one node, which point at a mutual node. The two types of inner product have (squared, normalized) values: The result follows.
The all-to-all calculation is slightly more complicated.

Proposition C.2 (Θ for an all-to-all graph).
For an all-to-all connected graph with bidrectional connections of equal weight (set to 1 without loss of generality), the orthogonality is Proof. Let S be the n by n − 2 matrix formed by removing two of the columns of the Laplacian for this graph.
Because the diagonal elements (S S) ii = 1, we need only compute the off-diagonal elements of S T S. A generic such element resulting from taking the (not normalized) inner product of columns j, k is given by where the first line is a generic expression for the inner product of columns corresponding to connected nodes for matrix elements θ ij of S, and the resulting lines follow from bidirectional all-to-all connectivity with equal rate constants.
We now need to compute the normalization factor: where again we have begun with generic terms for the normalization of the inner product of columns of the Laplacian matrix, with θ ij representing the elements of S.
Putting these together yields the expression for a generic element of S T S: How many such elements exist? We know that S T S is a square n − 2 length matrix, and we know that the diagonal terms are zero. We therefore have (n − 2)(n − 3) entries each equal to 1 (N −1) 2 . The result follows.
From the two propositions we can calculate that The former (negative) term quickly approaches 1, whereas the latter (positive) term grows as O( √ N ). We conclude that the orthogonality of the all-to-all graph is greater than the line graph, and this difference is increasing for increasing N .
Appendix D: Analytic expression for orthogonality in the 4-Node toy model We will show how orthogonality changes as the graph in Figure 4(a) is modified, in support of the claims made in the main text. Because we are discriminating between the end nodes, the orthogonality of the scheme in Figure 4(a) is a function of a single (normalized) inner product: with k, l corresponding to black, red arrows in Figure 4(a), as defined in the main text.
We will first demonstrate how orthogonality changes as r = k/l grows. We then demonstrate how orthogonality changes upon removing the black (bidirectional) connection between the middle nodes.
a. Adjusting rates to favor a single path reduces orthogonality We can rewrite Equation Two such terms contribute to the orthogonality giving which is decreasing with r as O(r −2 ), as claimed in the main text.
b. Removing a link What happens to the orthogonality when we remove the black bidirectional links between the middle nodes?
The expression for v 2 , v 3 2 removed is given by When r ≈ 1 this expression is equal to Equation D1; there is no affect on orthogonality.
However, as r increases, v 2 , v 3 2 removed becomes smaller than Equation D1; deleting the connections increases orthogonality. We conclude that when r > 1, the black bidirectional links form part of the dominant path, removing them will therefore increase the orthogonality.
We now demonstrate the orthogonality-discrimination relations made in the main text.
To do so, we first compute the orthogonality in the high and low discrimination limits, in order to demonstrate that orthogonality is lower ( s 2 i,j higher) as high discrimination improves. We will then compute the degree to which orthogonality movement between the low and high discrimination limits is monotonic.
In the energetic regime, the discrimination is maximized in the limits . We must therefore consider: ω → ∞, → ∞, and ω p → 0.
a. Energetic Limit 1: ω → ∞ Note that we replace m with µ in the below. s 2 i,j is increasing with ω To demonstrate this, we will show the following.
This term is positive except for the case ω p m > ω p 2ω + e ωm + 2e ω which is only satisfied outside of the proofreading regime ωp e ω > 1. b. Energetic Limit 2: → ∞ s 2 i,j is increasing with To demonstrate this, we will show the following.
These terms are monotonically decreasing except for when −m 2 ω p 2 dominates all other (positive) terms in the square bracket, which requires m large, and ωp e ω > 1. Putting these sums back into the equation for orthogonality we can verify that orthogonality is increasing as ω p increases in the proofreading limit (σ ≈ 50)

d. The Hopfield Network in the Kinetic Regime
Derivation of the kinetic regime error rate We first derive an expression for the error rate in the kinetic regime of the Ninio-Hopfield scheme, ξ kinetic . We then determine the appropriate proofreading limits in the kinetic regime.
We compute that: = (e 2δ a + e δ+ p b + e p c)(e + i a + b + c) (e 2δ+ + i a + e δ b + c)(a + e p b + e p c) where we have let a = ωω i , b = ωω p , c = ω i ω p e i .
When the total dissipation i + p + is high, the terms e + i in Equation E1 will dominate.
We therefore have that ξ kinetic ≈ e 2δ a + e δ+ p b + e p c e 2δ a + e 2δ+ p b + e 2δ+ p c from which it is clear that proofreading requires that e p /a be very large. Moreover, the error fraction is minimized when c/b is very large. Note that proofreading can still occur when b/c >> 1, but the error fraction is not minimized in this regime. Translating these conditions into Kramer's form parameters gives the necessary limits for maximum discrimination As in the energetic regime, we take m = µ = ω i e i , and write the limits as: We now investigate orthogonality in these discriminatory limits.

p)
Which are both strictly negative. We conclude that s 2 i,j is a monotonically decreasing function of µ, thus orthogonality is monotonically increasing.
which: includes every vertex of G and has no cycles (when edge directions ignored). A spanning tree is said to be rooted at node i if i is the only vertex of the subgraph without any outgoing edges.
The MTT provides an expression for the steady states of node i in terms of the sum of the product of the rates of each spanning tree rooted at i. That is: where S i (G) is the set of all spanning trees of graph G rooted at i.
We will exploit the structure of our ladder network in order to simplify this expression.
Our ladder consists of two subgraphs (call them G R , G W , corresponding to right, wrong products, respectively). These two subgraphs are joined at a single node, 0. Because the subgraphs share a single node, the kernel element corresponding to node i in subgraph R is This gives for the error We know that ρ S (G S ) ({S = W, R}) in Equation F1 represents the node at the top corner of the graph. Similarly, ρ 0 (G S ) represents the node 0.
We therefore need only determine analytical expressions for the sums of (products of rate constants of) spanning trees rooted at the top corner and 0 nodes.
Let's count the trees rooted at ρ 0 (G S ) first. In order for the tree to be rooted at 0, there are a number of essential arrows: without any of which it is impossible to produce a spanning tree rooted at 0. The necessity of these arrows comes from the unidirectionality of the f, b.
What other arrows are necessary for a spanning tree? Consider the diagram It is necessary and sufficient for a spanning tree rooted at 0 to contain all of the red arrows, and exactly one of the green arrows and one of the blue arrows. This holds in general; each loop in a ladder must contribute either a factor of f or d to a spanning tree rooted at 0.
We can thus count the number of possible spanning trees, and the product of their rate where the second line follows from the Binomial theorem, and where we have set the number of square loops in the ladder portion of the graph to be α.
We can now repeat this procedure with spanning trees rooted in the upper corner, with red, blue, and green as before: Which gives us: Note that in comparison to the last expression, we have merely made the substitutions: Returning to our expression for the error gives where we have denoted variables coming from the 'right' and 'wrong' sides of the ladder with subscripts R and W, respectively. We can do some cancellation (b = b R = b W and f = f R = f W ) to arrive at: In the energetic regime we have that u R = u W , and that d W = d R e γ : In the kinetic regime, we have that d R = d W e δ , u R = u W e δ , giving

Appendix G: Orthogonality and Error in the Ladder Graph
We first derive the discriminatory limit in the energetic regime. Recall that 1. Energetic regime from which read off that proofreading requires η to be large. This corresponds to the intuition that the rate of discards must be large with respect the reaction speed.
We must now demonstrate that orthogonality is decreasing as η becomes large.
As in the Ninio-Hopfield case, we will use the notation s 2 i,j to denote the squared, normalized inner product between columns i, j in Matrix L a,b formed by deleting the columns corresponding to the discriminatory nodes a, b from the full Laplacian for this graph.
which take values 0, 1/4, 0, and 1/12 in the limit η → 0 and 1/4, 0, 1/4, and 1/12 in the limit η → ∞ For N loops, we will have N terms of the first and third type, and N − 1 terms of the second type. The last term is unchanged in these limits. This gives the desired result, Finally, we consider the case when φ → ∞. Note that x si , x s(i+1) 2 terms are not functions of φ. The two remaining terms to consider are, which in the φ → ∞ limit become, x si , y s(i+1) 2 = 1 4η 2 + 4η + 4 x si , y si 2 = 4η 2 + 4η + 1 4η 2 + 4η + 4 Combining these term yields, We can directly compute that s 2 i,j is monotonically increasing in both the φ → 0 and φ → ∞ limits.
a. φ does not affect orthogonality in the f d limit Before turning to the kinetic regime, we demonstrate that φ does not affect orthogonality in the energetic discrimination limit.
Now we must evaluate these in the limits of φ → 0 and φ → ∞, the first term goes from 1/2 to 0 as φ → ∞. The second term goes from 1/2 to 1 and the third term goes from 1/2 to 0. Because each loop consists of two of the second type term and one each of the first and third type term, the sum is the same in each limit.
In the full expression for orthogonality, we do observe a small non-constant dependence on φ, but this is marginal and strictly decreases the orthogonality, thereby reinforcing our notion that φ cannot be used to increased realizable pathways in the low f regime.

Kinetic regime
We now need to demonstrate that Define η = d/f, φ = u/b as before.
which attains its minimum of e −αδ in the limit φ → ∞, η → 0. The previous sections demonstrated that orthogonality is increasing in these limits.
(a) (c) formation. In (a) reactions must occur sequentially resulting in a processive network, with a single path for assembly. Here, nested molecular shape confers the processivity by dictating an order in which the molecules must assemble. In contrast, in the distributive network shown in (b), the subunits are free to associate in almost any order allowing many effective paths for assembly.
Processivity and distributivity need not be conferred by unique molecular properties such as shape, in fact, reactions which use the same components may be processive or distributive based on the reaction rates. For example, consider a ladder topology network. This network can be changed from a processive network (c) to a distributive network (d) by increasing the rate of a single reaction, the "up" reaction (red arrow). In this example, the top path differs from the bottom path by the addition of a modification ( ) to (S). In (c), the modification is very rarely removed, and the modified complex cannot complete the reaction; any modification will be a "catastrophe" (c, blue path, dashed) requiring the complex to disassemble and start over before finally completing (c, blue path, solid). Thus, all successful reactions must follow a single pathway from reactants (R) to products (P) (c, green path). However, if the rate of the red reaction is greatly increased, the modification can be removed easily at any step. Thus the complex can form either along the top path or along the bottom path and can switch at any time like in the example trajectory (d, blue path), giving many effective paths (d, green paths).  (blue) and the polytope associated with P (L 0B ) is the parallelogram with sides v A and v C . These polytopes share the facet P (L BC ) which is the vecotr v A in this case. The ratios of the elements of ρ, the steady state solution to Lρ = 0 is given by the ratio of the areas of these two polytopes, and in this example, rho B ρ C is given as the area of the red parallelogram divided by the ratio of the blue parallelogram. The area of P (L 0B ) = √ 147 = 7 √ 3 and the area of P (L 0C ) = √ 108 = 6 √ 3, giving a ratio ρ B ρ C = 7/6. This can also be calculated using the base height formula, choosing the shared facet P (L BC ) = v A as the base. The ratio can then be computed simply as the ratio of the perpendicular components. Here we construct a simple example of a 4-species network in which we can tune the orthogonality.
If we construct a linear network with rates k, ((a) inset, black arrows), and l ((a) inset, black arrows), the black pathways represents a single path between nodes species 1 and 4, while the red pathways represent "alternative pathways". If we increase the ratio (r = k/l) of rates of the black with respect to the red reactions (r 1), a single pathway will dominate. However, in the limit where all the rates are equal (r = 1) there are many effective pathways between 1 and 4.
Orthogonality gives a measure of these effective pathways in the network, where more orthogonal networks are more distributed. This example shows directly the meaning of orthogonality. The ratio of steady states can be computed using the projection onto the subspace spanned by the red vectors in (b) corresponding to the highlighted points (blue, orange, yellow) in (a), and becomes exact when these vectors form an orthogonal basis. By rotating the vectors shown in (b), we can see that as the network becomes more processive by increasing r, these basis vectors become less orthogonal (c).
(a) Orthogonality bounds minimum error rate in the energetic regime (γ = 1, δ, δ p = 0). The log of the error rate (log(ξ)) as a function of the orthogonality (Θ) is plotted for simulated data (parameter selection in Methods). Heatmap coloration represents relative dissipation ∆S i [33]; for a given orthogonality, the error rate decreases as dissipation increases. (c) In the energetic regime, minimum error (red line, ξ = e −2γ ) is achieved by simultaneously minimizing orthogonality (Θ, green) and maximizing dissipation (black). Excess dissipation drives orthogonality upwards, approaching the binding energy difference (ξ = e −γ ) asymptotically. (d) Orthogonality as a function of drive. In the energetic regime (solid curves), error rate (ξ) is minimized in the limit of low orthogonality (Θ).
In the kinetic regime, (dashed curves), error rate is minimized in the limit of high orthogonality.
For this scheme, the orthogonality is bounded by (1,-3.47).  6. A Hopfield-Ninio style network designed to tune product selectivity by modulating dissipation (black). One product ρ γ has a lower binding energy and is favored in the energetic regime, while the other ρ δ is has a lower activation energy and is favored in the kinetic regime. The log of the ratio between the products (ρ γ /ρ δ , blue), can be shifted from 2 (ρ γ favored) to -2 (ρ δ favored) by driving across a single reaction. This is due to orthogonality (green line) increasing, which shifts the network from the energetic to the kinetic regime. k on k off y s0 binding energies favor the product (ρ γ ) on one side of the ladder while activation energies favor the other product (ρ δ ). Dissipation is used to drive u , increasing the ratio of rescues to discards u S /d S , thereby shifting the network from low orthogonality (ρ γ favored) to high orthogonality (ρ δ favored).

Appendix H: Supplemental Information
FIG. S1. As a graph becomes more connected, orthogonality increases. Orthogonality is plotted against varying connectivities of a 16 node graph, generated as described in the main text of this section. Zero connectivity corresponds to a 16 × 16 grid graph, by convention.
FIG. S2. Orthogonality is required to achieve the minimum error rate in the kinetic regime (γ = 0, δ = 1). The log of the error rate (ln(ξ)) as a function of the orthogonality (Θ) is plotted for simulations of the triangle graph (Hopfield-Ninio) with Kramer's form rate constants for 1,000 randomly chosen values of ω i , ω p , i , and p . Other parameters were fixed (ω = 1, = 10). Color shows the dissipation ∆S i at steady state. Figure S1 demonstrates that orthogonality tends to increase as we add connections of equal order of magnitude to a graph. The figure was generated by first creating an allto-all connected graph having 16 nodes. Rate constants were chosen randomly from the distribution Exp[N (0, 1 3 )]. For each of the c = 'connectivity fractions' in Figure S1, a random set of 1 − c * (16 2 − 16)/2 connections was then chosen for deletion and removed bidirectionally. These random deletion sets were chosen 1000 times for each connectivity fraction considered. The mean and standard deviation of these 1000 samples is plotted.
Graph sparsity 0 corresponds to a 16 × 16 grid graph. Figure S2 shows the relationship between orthogonality and error for the Ninio-Hopfield model in the kinetic regime (γ = 0, δ = δ p = 1). High orthogonality and high dissipation are necessary for low error. Table I gives the values of the rate constants in the irreversible style ladder graph model, and used to generate the plots in Figure 7 Table II   indicate that this parameter was used as an independent variable for plotting.

Tables of parameter values
form rate constants for the reversible ladder graph and to generate the plots shown in Figure   7. Table III gives the values used to derive Kramer's form rate constants for the Hopfield-Ninio model and to generate the plots shown in Figure 5.  indicate that this parameter was used as a dependent variable for plotting.