Positional information from oscillatory phase shifts : insights from in silico evolution

M Beaupeux; P François

doi:10.1088/1478-3975/13/3/036009

Introduction

Signalling information can be efficiently encoded in temporal dynamics of regulatory pathways [1, 2]. In development as well, positional information appears now much more dynamical than first proposed in the classical 'French Flag model' paradigm [3]. Dynamical aspects of Hox genes patterning have been known for a long time [4, 5] and are intertwined with complex regulations of cell movements [6]. Other recent examples include 'time-integration' of Shh signalling for neural progenitor [7, 8], or Turing-like patterning in growing tissues [9]. Dynamical positional information is most strikingly observed during vertebrate segmentation (or somitogenesis), where multiple signalling pathways oscillate both temporally and spatially [10, 11] to define position, size and polarity of the vertebrae precursors, called somites [12–14].

An important role for oscillations had first been predicted by the classical clock and wavefront (CW) model by Cooke and Zeeman [15]. A front is proposed to sweep across an embryo considered as a linear array of oscillators, locally synchronized. As the front passes over cells, ticking information of the clock is used to define somite boundaries. Heuristically, this turns a temporal oscillation into a spatial one defining metameric structures (somites). Historically, the CW model was proposed to ensure vertebrae scaling with respect to total embryonic size. Dimensional analysis combining a clock (characterised by period T) to the regression of a hypothetical wavefront (speed v) naturally predicts a length scale vT (corresponding to somite size), independent from any local property such as diffusion constant or number of cells (see e.g. [16] for an explicit computation of this length scale).

Different variants of the original CW model have been proposed e.g. the original model by Julian Lewis in appendix of [12], and subsequent works have suggested that patterning is mediated with a diverging clock period [17]. More explicit models account for different molecular controls [18], while other models assume bistability as a crucial effector of somite definition [19] or later polarity [20]. Existence of segmentation oscillators has now been observed in multiple vertebrates, and is most probably homologous from snakes to mammals [21]. Recently, oscillations controlling insect segmentation were also discovered [22–24], suggesting that the ancestral mode of segmentation of chordates might be based on similar principles.

Recent experiments in mouse have challenged the classical CW view [25], and especially the existence of a discrete, well-defined 'wavefront' independent of the clock. In particular, it was shown that scaling of somites with respect to pre-somitic mesoderm size (PSM) occurs in culture even without growth, which cannot be easily accounted by a CW model. Such scaling would require some unknown long range mechanism for PSM size sensing in the CW framework. However the scaling property can also be observed on the phase gradient of kinematic waves of Notch oscillations in culture, without a full-size PSM, and way before the appearance of any morphological somite boundary [25]. This strongly suggests that PSM-size is not actively 'measured' by the system, and excludes any signal propagating from the differentiated anterior.

A parsimonious way to explain these data is to assume that individual cells act as autonomous clocks, slowing down the phase of their Notch oscillation with respect to some reference phase, corresponding experimentally to the phase in the tail bud [25]. The differentiation front indeed appears to be well-defined by a phase difference of Notch oscillations with the tail bud, which suggests that positional information on somite boundary is defined directly by the oscillations. This is reminiscent of a 2-oscillator model proposed by Goodwin and Cohen [26], with an added feedback to account for the slowing down of one of the somitic oscillators. An important difference is that the Goodwin Cohen model necessitates very fast oscillations compared to timing of patterning, while in somitogenesis, the period of oscillation is roughly the period of pattern formation so that phase computation has to be done in real time and can not be averaged out over many oscillations.

These observations prompted us to propose a simple phenomenological '2-oscillator' model driving somitogenesis [25] for single cells in PSM after their tail bud exit. The first oscillator (phase ϕ₀) is synchronized in the embryo with fixed frequency while the second one (assumed to be Notch, phase ϕ₁) slows down with respect to ϕ₀ following

$\begin{eqnarray}&&{\dot{\phi }}_{0}=1,\end{eqnarray} \tag{ 1 }$

$\begin{eqnarray}&&{\dot{\phi }}_{1}=1+\alpha ({\phi }_{1}-{\phi }_{0}),\end{eqnarray} \tag{ 2 }$

where time has been rescaled so that the intrinsic period of segment formation is 2π. To avoid confusion, we will refer to ${\rm{\Delta }}\phi (x,t)={\phi }_{1}(x,t)-{\phi }_{0}(x,t)$ as the local phase difference (between oscillator 1 and 0), and to ${\rm{\Delta }}{\phi }_{1}={\phi }_{1}(x,t)-{\phi }_{1}(0,t)$ as the phase shift between local and tail bud Notch phase (which is the quantity described in [25]). These phenomenological equations strikingly reproduce many features of somitogenesis scaling [25]. Biophysical interpretation of equations (1) and (2) is straightforward: if α < 0 Notch oscillator is slaved to the reference oscillator, while if α > 0, Notch oscillator phase is 'repelled' by reference oscillator.

Biologically, dephasing of Notch oscillations occurs as cells move more and more anteriorly in PSM after their tail bud exit. Phenomenologically, segmentation occurs for some well-defined phase shift ${\rm{\Delta }}{\phi }_{1}^{* }$ between local and tail bud phase for Notch [25], and thus is done in continuity with the autonomous slowing down of the clock described by equations (1) and (2). This suggests that the relative phase of the two oscillators is the crucial biological variable accounting for positional information, contrary to the classical the clock and the wavefront model where CW are essentially two independent entities until they interact.

Phase equations similar to (1) and (2) are widely used to model coupling of similar oscillators at different positions [27]. Here we consider what happens if two different oscillators coexist and are coupled within the same cell, assuming phase difference between the two clocks is used to define temporal and positional information. Thus the natural question: what kind of signalling network can actively 'compute' a phase difference between two oscillators within the same cell, to eventually define a phase shift map in the embryo? In this manuscript, we turn to computational evolution to answer this question [28]. We use a mutual information based fitness to select for networks actively sensing the phase difference between two oscillating pathways. Characteristic network topologies evolve, and their behaviour is studied. Finally, improvements of these networks and implications for somitogenesis are discussed.

Method

Our goal is to evolve in silico networks that are capable of measuring the phase difference between two genetic oscillators through time, using computational evolution [28]. This method consists in simulating evolution of a population of gene networks, through a sequence of growth, selection and random mutation (see other examples of similar methods in [29–31]). The algorithm used here is similar to the one reviewed in [32]. We summarise in the following the most important features of these simulations.

We limit ourselves to purely transcriptional networks, where variables are transcription factor (TF) concentrations. To each network corresponds a set of ODEs encoding for network dynamics. Activations and repressions are modelled with Hill functions, in a similar way to [20], and degradation is assumed to be linear. As an illustration of the formalism used, consider a protein P, with degradation δ, transcriptionally activated by two proteins A and B and repressed by R . In that case, given the two activating hill functions ${\rho }_{A}=\tfrac{{A}^{n}}{{A}^{n}+{A}_{0}^{n}}$ and ${\rho }_{B}=\tfrac{{B}^{m}}{{B}^{m}+{B}_{0}^{m}}$ (n, m Hill coefficients, A₀, B₀ Hill thresholds), we assume a 'OR' like combinatorics, in the sense that the actual production rate is the maximum of these two Hill activities ${\rho }_{{\rm{max}}}={\rho }_{P}\mathrm{max}({\rho }_{A},{\rho }_{B})$ with ${\rho }_{P}$ a parameter defining maximum possible transcriptional rate. Repressors are assumed to act multiplicatively, i.e. if R represses the same gene, the production rate of this network is with same notations, $\rho ={\rho }_{{\rm{max}}}\tfrac{{R}_{0}^{l}}{{R}^{l}+{R}_{0}^{l}}$ , so that in the end, the evolution equation for P is

$\begin{eqnarray*}\begin{array}{rcl}\displaystyle \frac{{\rm{d}}P}{{\rm{d}}{t}} & = & {\rho }_{P}\mathrm{max}\left(\displaystyle \frac{{A}^{n}}{{A}^{n}+{A}_{0}^{n}},\displaystyle \frac{{B}^{m}}{{B}^{m}+{B}_{0}^{m}}\right)\\ & & \times \displaystyle \frac{{R}_{0}^{l}}{{R}^{l}+{R}_{0}^{l}}-\delta P\mathrm{.}\end{array}\end{eqnarray*}$

For representation purpose, in the following, genes are indexed by integers and are called S_i, altogether with the protein they encode.

The algorithm acts on a population of networks (typically 40). All networks in the population are initialised at generation 0 with two input genes (corresponding to our two oscillators, proteins are subsequently called S₀ and S₁), and one output gene (corresponding protein is subsequently called D; the nature of D can be randomly changed by the algorithm and is thus under influence of selection). We present here results for networks evolved over 550 generations. At each generation, differential equations corresponding to these networks are integrated. Then a fitness function (see below) quantifies the performance of these networks. Selection is made based on this fitness: the worst half of the networks is discarded, while the best half is kept, duplicated and then mutated. Practically we also add a very small random number to the fitness (of order 0.001), to ensure that the best network at generation N is not necessarily the direct copy of the best network at generation N − 1 in the absence of mutations.

There are three kinds of possible mutations:

Random changes of kinetics. In this case, we randomly draw new values of the kinetic parameter.
Addition/removal of transcriptional activations. When adding interactions, we randomly choose one TF and one target, and randomly draw the kinetic parameters associated to the interaction.
Addition/removal of a TF. New TFs are assumed to be initially produced with random constant rate, and are degraded linearly.

When creating or changing an interaction, all rates are randomly drawn in a predefined range so that all concentrations stay of order 1 in arbitrary units. (see supplement for more details on the range of parameters).

Mutations are drawn randomly with predefined rates, for the simulations presented in the paper, the probability to change parameters, add/remove a gene or an interaction are identical. Mutation rates are adjusted dynamically to have on average one mutation per network per generation: this ensures incremental evolution and implements a computational equivalent of the Drake rule [33]. Like in our previous work, we tested some variations in these parameters, and the results do not depend much on those mutation rates in the sense that the dynamics of 'working' networks are qualitatively the same even when mutation rates are changed (but success rate or speed of convergence change).

The goal of the algorithm is to select for networks computing phase difference between two oscillators. A priori, a gene network downstream of two oscillators could encode much information about each of the two oscillators, but the question of how this information is later decoded is too generic as is. We therefore impose some constraints: we assume there is a single output gene D (for decoder) containing all 'information' about the phase difference ${\rm{\Delta }}\phi$ between S₁ and S₀. To compute its information content, we use a heuristic approach similar to the ones in [34, 35]: we construct pseudo-probability density functions on both D and ${\rm{\Delta }}\phi$ , then compute mutual information on these pseudo PDF. More precisely, at each generation, differential equations corresponding to each network are integrated, with initial conditions 0 for each gene. Dynamics of S₀ and S₁ are imposed to be sinusoidal. S₀ is oscillating at constant predefined frequency ${S}_{0}(t)\propto 1+\mathrm{cos}({\omega }_{0}t)$ while S₁ is oscillating with a slightly different random frequency and different initial phase ${S}_{1}(t)\propto 1+\mathrm{cos}({\omega }_{1}t+{\phi }_{1}^{0})$ imposing a time-evolving phase shift of the two oscillators (see details of sampling for ${\omega }_{1}$ and ${\phi }_{1}^{0}$ in supplement). Numerical integration is reproduced many times with randomized values of ${\phi }_{1}^{0}$ and ${\omega }_{1}$ to ensure representative sampling of ${\rm{\Delta }}\phi$ and of its dynamics.

For each time point, we store the couple of values $(D(t),{\rm{\Delta }}\phi (t))$ . These values are then binned to reconstruct an effective probability distribution $P(D,{\rm{\Delta }}\phi )$ (we considered 64 bins). Such binning has a natural biological interpretation: realistically there will be some intrinsic noise preventing a perfect measure, and finer binning means less intrinsic noise.

We can define mutual information between D and ${\rm{\Delta }}\phi$ as

$\begin{eqnarray}&&F=I({\rm{\Delta }}\phi \;:D)=S({\rm{\Delta }}\phi )-S({\rm{\Delta }}\phi | D),\end{eqnarray} \tag{ 3 }$

where S is the standard Shannon entropy $S(X)=-{\sum }_{x}P(X=x)\;{\mathrm{log}}_{2}\;P(X=x)$ . We use F as the fitness to optimise for each network. Figure 1 summarises how the fitness is computed. Note that since we are computing the mutual information for the entire time-course starting at 0 initial condition, there could be some influence of a transient behaviour, but with many values of ${\phi }_{1}$ and ${\omega }_{1}$ , rapid convergence of D towards a value faithfully tracking the phase shift should be ensured. Indeed we typically observe very fast convergence so that the transient behaviour is irrelevant, see e.g. simulated time courses on figure 3. We can thus consider that D reaches some stationary behaviour very quickly.

Finally, it should be noted how the relation between D and ${\rm{\Delta }}\phi$ on the example of figure 1 looks qualitatively similar to actual biological input/output relationship in developmental context (e.g. the Bcd/Hb relationship in fly from [36]). This is not a coincidence: as detailed by Gregor and co-worker, such probabilistic input/output relationship gives quantification of the positional information transmitted by gap genes network [37]. The problem we consider here is very similar, except information is encoded in a dynamical way (phase difference ${\rm{\Delta }}\phi$ ) in contrast to a static morphogen such as Bcd. Furthermore, the dynamical nature of this problem imposes more specific constrains: for instance, given the periodicity of the input signals, the behaviour of D is expected to be periodic in ${\rm{\Delta }}\phi$ . This is indeed what we observe, and this is the reason why we display ${\rm{\Delta }}\phi$ only over a 2π interval on all our figures. D should also be continuous, which means, because of periodicity, any value of D should correspond to at least two different values of Δϕ (except for extrema), a problem computational evolution circumvented as described below.

Results

A sample of networks evolved are displayed on figure 2.

Networks found by the procedure can be essentially cast into two groups:

A first group of networks has the property to relate D to two values of the phase difference, Δϕ and its symmetric with respect to π, 2π − Δϕ. Thus those networks are not able to discriminate which oscillator is ahead of the other. We call them symmetric. An example of such network can be found on figure 2(C) (F = 2.66 bits) and figure 3(A) (F = 1.93 bits). Note that both inputs (S₀ and S₁) are equivalent topologically in the graphical representation of network on figure 3(A).
A second group is able to discriminate between a phase difference Δϕ and 2π − Δϕ, therefore is able to discriminate which oscillator is ahead of the other. We call them asymmetric. Examples of such networks can be found on figures 2(A), (B) and 3(D). (Respective fitnesses F = 1.93, 2.62 and 2.31 bits.)

Figure 3 displays the simplest networks found (in terms of gene numbers) by our algorithm for each category.

Our discrimination between asymmetric and symmetric is most visible on the reconstructed distribution P(D, Δϕ), panels (C) and (F) on figure 3. As said above, because of the periodic behaviour, there are always two values of Δϕ corresponding to the same value of D. However, the binning of both Δϕ and D 'compresses' many values of D within a small region of Δϕ for asymmetric network, resulting in an apparent discontinuity or 'shock' in the relationship between D and Δϕ. A limit case of asymmetric network would be a perfect shock, where D suddenly changes its value for a well defined value Δϕ*. This value of Δϕ* would essentially have very small measure given the binning we impose, so that asymmetric networks essentially realise one-to-one mapping between D and Δϕ.

In principle, we would expect asymmetric networks to have more information content (up to 1 extra bit) and to systematically dominate symmetric networks; practically D still oscillates a bit for a given value of Δϕ, which effectively decreases the fitness. This explains why, depending on parameters, some asymmetric networks like on figure 2(A) have lower fitness than some symmetric networks with very dampened oscillations like the one on figure 2(C). Nevertheless, the simplest asymmetric network we found contains 0.4 extra bit of information compared to the simplest symmetric network of figure 3. Our simulations tend to show that asymmetric networks are more difficult to evolve than symmetric networks (less than 30% of networks that we obtained can be qualified as asymmetric), so that the evolution algorithm often leads to symmetric networks and optimise them by reducing their residual oscillations.

Thus computational evolution is able to identify working networks, however their dynamics is not trivial, nor the correspondence between topology and behaviour. Clearly, from the examples displayed here on figure 2, there is a relatively broad family of networks able to compute phase difference. One advantage of computational evolution is that it can generate simple efficient networks with coarse-grained interactions, which can help extracting simple working principles (see e.g. [38] for a discussion in developmental context). To get some more general principles for phase computation, that could potentially be recognised in actual networks, we focus on the simplest networks obtained, and since asymmetric networks are closer to an ideal perfect mapping, on network of figure 3(D). Interestingly, the principles and topologies identified in the following can be identified on the more complex networks and thus appear to be potentially generic.

Networks compute phase difference, irrespective of oscillators drift and periods

Our selective pressure imposes values of D as a function of Δϕ for randomized input profiles with slowly drifting phases. However, it is important to check that the networks are still able to compute phase differences for different conditions. For instance, it is known that somitogenesis period itself slows down during development, or can be modified by changes of temperature. So we checked that the evolved networks perform the same computation irrespective of the periods of the input oscillators (faster and slower). This is illustrated on figure 4, for period 20% faster and slower: indeed, the PDF of D as a function of Δϕ hardly varies for different periods. The only slight difference is observed close to the maximum value of D: for shorter period, the remaining oscillation on D has a smaller amplitude and thus D is defined more precisely. This corresponds to an effective better time-averaging: if the time-scale of oscillations is smaller, keeping the same kinetic parameters, we expect the higher frequency to be more efficiently filtered out (because longer time-scale can not 'follow' shorter one). It is nevertheless quite remarkable that the phase shift information and shock persist despite this, and indeed fitness is maximum for shorter period.

**Figure 4.** Different PDFs of the joined variable (D, Δϕ) for different periods, for network of figure 3(D). The upper row shows the dynamics of network (redrawn in (A)) for a phase shift varying linearly between 0 and 4π if gene S₀ has a period of 8 (B), 10 (C) and 12 (D). The lower row shows the PDFs of network (A) as a function of the period of gene S₀. Note that the dynamics and PDFs are essentially similar, though the influence of the period is visible on the value of the fitness, (B) having a fitness of 2.42, (C) 2.31 and (D) 2.04.
Download figure:
Standard image High-resolution image

Methods for phase computation: dissection of networks

As an illustration of the general principles underlying phase computation, we proceed in a very similar way to 'classical' genetics by successively removing genes and interactions on network of figure 3(D), and then checking the consequence on the network dynamics.

Positive feedback creates asymmetry

Positive auto regulatory feedback loops are systematically observed in the upstream part of asymmetric networks, on gene S₃ and S₂ in network of figure 3(D), but also on 2 genes on network of figure 2(A) and 4 genes on network of figure 2(B). On the contrary, in symmetric networks, there is no such positive feedback loop (figures 2(C) and 3(A)).

Asymmetry due to positive feedback can be well understood theoretically. If the combinatorics of activation is akin to an 'OR' gate (as we assume here), a gene can activate another self-activating gene, inducing complex history dependency in the dynamics of this gene (e.g. transition from oscillation to bistability in [20]) and, here, asymmetry. This is illustrated on figure 5 using a toy model for the subnetwork made of genes S₀, S₁ and S₃. In this model, which is fully analytic, Hill functions for S₀ and S₁ are replaced by step functions (i.e. taking the limit of infinite Hill coefficients). In that case the value of the production rate is then completely determined by the relative concentrations of all activators and repressors with respect to their threshold.

**Figure 5.** Positive regulatory feedback loop builds asymmetry. (A) Module studied. Interactions are simplified to heaviside functions. (B) Time evolution of S₃ when there is no auto activation, with Δϕ = π/2. (C) Time evolution of S₃ when there is no auto activation, with Δϕ = 3π/2. (D) Time evolution of S₃ with auto activation, and Δϕ = π/2. (E) Time evolution of S₃ with auto activation, and Δϕ = 3π/2.
Download figure:
Standard image High-resolution image

Without this feedback loop (figures 5(B) and (C)), there is no strong difference in the behaviour of S₃ for Δϕ or 2π − Δϕ. S₁ activates S₃ above threshold for activation (which represents a periodic window W₁) and S₀ does not repress S₃ below threshold for repression (which represents another periodic window W₀), so that S₃ is periodically active only over the intersection I = W₀ ∩ W₁. Given the value of thresholds evolved, W₀ and W₁ are approximately of the same size, so that S₃ is fully activated only when W₀ = W₁, i.e. for Δϕ ∼ π. In this simplified model, where activations and repressions are step functions, the average value of S₃ is a simple function of the size of its activation window I. If the phase difference Δϕ is changing from π, the size of I decreases, and so does the value of S₃. But the size of I is independent whether S₁ or S₀ are advanced or delayed with respect to each other, so that amplitude of S₃ is always symmetrical with respect to Δϕ ∼ π.

The situation changes when the positive feedback loop is present (figures 5(D) and (E)). Crucially here, S₃ can be on if either S₃ or S₁ is active. If I occurs at the end of window W₀, there is no difference with the previous situation, in the sense that S₃ is only activated over the overlap window I. In particular, at the end of window W₀, S₃ is fully repressed by S₀ (see figure 5(D)). On the contrary, if I occurs at the end of W₁ and at the beginning of W₀, once W₁ is over, auto activation by S₃ can sustain S₃ production for a longer time, resulting in a window of activation longer than I, until W₀ ends (red rectangle on figure 5(E)). As a consequence, there is a higher overall amplitude for S₃ and an asymmetry between Δϕ and 2π − Δϕ.

The importance of this auto activation is also visible if the activator combinatorics is changed from an 'OR' logic to an 'AND' logic where activators act multiplicatively, just as repressors here. If 'AND' logic is used, there can not be any boost of production due to direct self-activation, and asymmetry might only evolve through more complex combinatorics. To check this we ran evolutionary simulations with 'AND' logic for activations, and as expected they led to a significantly lower number of asymmetric networks (around 5%). An example of such asymmetric network evolved with the 'AND' logic is presented in supplement (figure S1). In this example, positive feedback is effectively replaced by purely feedforward interactions, via one extra gene, leading to similar history dependency and asymmetry in the amplitude of a downstream gene. Functional correspondence between feedback and feedforward interactions has been observed in other contexts [35, 39].

Layers of feedforward interactions reinforce asymmetry

Many evolved networks present several feedforward interactions starting with the input genes. The role of specific feedforward interactions is not clear a priori, but in general feedforward loops are used to sense non trivial temporal dynamics (from persistence detection [40] to active sensing of kinetic parameters [35]).

Focusing on network of figure 3(D), asymmetry in S₃ gets reinforced downstream by a similar process, where S₂ itself auto activates and is in turn repressed by the asymmetric S₃. In addition, gene S₂ regulation by S₀ and S₁ is inverted compared to S₃, i.e. S₀ activates S₂ (while it represses S₃) while S₁ represses S₂ (while it activates S₃). This creates two interlocked coherent feedforward loops reinforcing the asymmetry effect, and allowing for better phase shift detection.

To see this, we plot on figure 6 the maximum values of S₂ and S₃ as a function of phase shift, for different simulated 'mutants'. To focus on the feedforward part of the network, we further remove S₄ feedback on S₂. Figure 6(A) shows the maximum level of S₃ for the normal network: it clearly saturates in an interval around Δϕ = 3π/2, so there would be no good phase shift detection in this region if we were only using S₃. If S₂ is not repressed by S₃, it is sensitive to phase shift only in a limited region, symmetric to the region where S₃ detects well the phase shift (figure 6(B)). Gene S₂ encodes a good phase detection for the whole phase range only when repression from S₃ on S₂ is present as finally illustrated on figure 6(C).

Dampening, amplification and shock refinement are due to negative feedback

In many networks we observed a clear 'amplification module', based on negative feedbacks, where the output decoder gene represses an upstream activator. This is apparent on both networks 3(A) and (D), where respective decoder S₂ and S₄ are embedded into negative feedbacks with respectively S₃ and S₂ that appear to amplify them.

Focusing again on the network from figure 3(D), there is a dual role for this module. First, the step of activation of S₄ by S₂ both dampens oscillations and amplify S₄ with respect to S₂. This effect is mathematically detailed in the present context in supplement. Second, the negative feedback from S₄ on S₂ refines the behaviour of the network close to the shock. This is illustrated on figure 6(D), where we have removed the negative interaction from S₄ to S₂: S₄ then clearly displays a more symmetrical profile (black curve), and the shock is less pronounced compared to normal behaviour (red curve). The shock corresponds to a region where maximum production rate of S₂ changes rapidly, so that the faster convergence of the oscillating system towards its average value, the more refined the shock is. It is well known that negative feedback indeed helps fast convergence towards steady state in transcriptional networks [41], and here this effect still exists for slowly varying dynamics of S₂ amplitude. More details on this feedback are included in supplement.

Evolutionary pathway

As an alternative to the a posteriori network study above, it is informative here to study the evolutionary pathway leading to the network of figure 3(D). As we will see, evolutionary transitions correspond almost perfectly to the apparition of functional modules described in the previous section, which validates their adaptive value.

Supplementary video S1 shows the time evolution of the best network (i.e. the one with the highest fitness) at each generation with its corresponding pseudo PDF and fitness (the opposite of the fitness is actually represented, so that a better network corresponds to a lower value on this graph).

Very quickly in evolution, the output gene is connected to the input genes. Until generation 120, only symmetric networks are observed, with a typical topology that contains three to four genes, with many possible interactions which are essentially neutral (figure 7(A)).
A transition occurs from generation 120 to generation 121, when the first asymmetric network appears as the best network, with the positive feedback on S₃ studied in previous section. Interestingly, as can be seen on supplementary video S1, the qualitative change of PDF at generation 121 is not continuous: there is no smooth change of PDF that leads from the first symmetric networks to the asymmetric network of generation 121 (figure 7(B)).
The generation 121 sets the core topology of the network: S₃ and S₂ appear with their interactions that are then adapted. Subsequent parameter and interaction addition leads to better fitness. S₂ gets its auto activation at generation 145. S₄ is progressively fixed as the output gene, so that the adaptation starting at generation 121 continuously deforms the best network's PDF to make it span this entire allowed output interval. Figure 7(C) shows the generation 338's best network and PDF before the next qualitative jump in network and PDF structure occurs.
The final qualitative jump appears at generation 339 and progressively establishes as the dominant topology. At this generation, a negative feedback appears from S₄ to gene S₂. First this mutation is neutral, but quickly, positive selection changes parameters to render it functional and the mutation is then fixed. The initial effect of this negative feedback, shown on figure 7(D) (and discussed above) is to reduce the amplitude of the PDF. This negative effect on the fitness is counterbalanced by the positive effect that is the reduction of the noise. Further adaptation contributes to extend the amplitude of the PDF while keeping a low noise level. The best network'(s) topology is slightly modified but the structure is essentially the same. Further slow evolution of the parameters will lead to the final network displayed on figure 3(D).

**Figure 7.** Evolutionary pathway. (A) Best network with its corresponding PDF at generation 51. (B) Best network at generation 121. (C) Best network at generation 338. (D) Best network at generation 339.
Download figure:
Standard image High-resolution image

Summing up, the sequential evolution of the network itself recapitulates the functional roles described in the previous section, a feature already pointed out in several recent works more focused on dynamics of evolution itself, e.g. [32, 42].

Reconstructing somitogenesis phenomenology using evolved networks

Using computational evolution, we have found gene networks setting a particular gene expression according to the value of the phase difference between two oscillators. Our asymmetric networks are essentially realizing the operation $D=G({\rm{\Delta }}\phi )$ , where G is an invertible decreasing function on [0, 2π] (if we neglect the weak oscillations of the output species D and the finite width of the shock). In this section, we check that such evolved networks can be indeed embedded into a more general model of somitogenesis, and in particular can help recover the phenomenological properties of mouse somitogenesis observed in [25].

As derived in appendix of [25], from equations (1) and (2), a phenomenological equation describing oscillations of individual cell is

$\begin{eqnarray}&&\dot{{\rm{\Delta }}{\phi }_{1}}(x,t)=\alpha {\rm{\Delta }}{\phi }_{1}(x,t)\end{eqnarray} \tag{ 4 }$

with ${\rm{\Delta }}{\phi }_{1}(x,t)={\phi }_{1}(x,t)-{\phi }_{1}(0,t)$ . Figure 3(F) shows that D itself is a linear function of ${\rm{\Delta }}\phi ={\phi }_{1}-{\phi }_{0}$ . As ${\rm{\Delta }}{\phi }_{1}$ and Δϕ only differ by a constant (see supplement for details), it is natural within our framework to replace Δϕ by some linear function of D, effectively implementing feedback of oscillators on themselves. Biologically, this could be done if D modifies the angular velocity of the Notch oscillator.

Computational evolution also predicts the existence of a shock in D when the two oscillators are in phase. So it is tantalizing to assume that our predicted shock might be detected by cells to trigger genetic expression giving rise to differentiation into somites. This would biologically correspond, for instance, to the expression of gene Mesp2 [43, 44].

Such process can be easily modelled: the shock observed for Δϕ* = 0 in our simulation can be used as an input of an adaptive network [45], the output being Mesp2. The rationale is that an adaptive network would not detect the slow increase of D's average value, but would be very sensitive to the shock itself when oscillators are in phase. Then we finally assume than once Mesp2 has been activated, some process turns on after some short delay to switch off oscillators. Details and a MATLAB implementation of this mechanism are given in the supplement.

Figure 8 recapitulates this model, including feedback of D on oscillator ϕ₁ and adaptive reading of the shock, as well as its behaviour. Figure 8(D) presents a simulation of the experiments in figure 4 from [25]. The two oscillators are initialised close to phase opposition, with a spatially linear phase gradient (and corresponding gradient of D measuring the phase difference). Those initial conditions mimic a process where the cells are initially similar to tail bud but slightly phase shifted, close to their stationary limit cycle, and, most importantly recapitulate the observations from [25]. At time t = 0 in this simulation, we turn on the feedback by D and then cells start to slow down, except for the posterior bottom cells, which are assumed to oscillate forever at same frequency for the two oscillators. This simulates the tail bud-like cells similarly to simulations/assumptions from [25], and more recently described in [46].

Quite strikingly, even though the relation between D and Δϕ is not perfectly linear, and despite the small remaining oscillations of D itself, this model reproduces extremely well typical kymographs of the scaling experiments from [25]. In particular, we check on figures 8(B) and (C) that the speed of propagating wave of oscillation of S₁ is an exponential function of time, as observed experimentally. Finally, there is a clear band of simulated Mesp2 gene (blue region on figure 8(D)). Two aspects are interesting: first, the width of the simulated Mesp2 stripes scale with the field of oscillations, which is consistent with the overall scaling of the experimental system (including somite) with PSM size. Second, the simulated band of Mesp2 can be visually separated in almost discrete blocks, suggestive of discrete somites. This block structure indicates that the moment at which Mesp2 is expressed in this model does not purely depends on the phase difference Δϕ.

This is further illustrated on figure 9. Starting with different initial phases at t = 0 for S₀, we integrate the network equations, imposing as before that S₁, initially out of phase ( ${\phi }_{1}-{\phi }_{0}=\pi$ ), slows down. After some time, the phase difference between the two oscillators reaches the critical value Δϕ* = 0, they are in phase (corresponding phase of S₀ is called ϕ_p). In a realistic situation, the moment when the decoder gene drops, thus triggering Mesp2 expression and defining the front, occurs shortly after this. In figure 9(A), we define Δt as the corresponding time difference between front definition and ϕ_p, and ϕ_f as the corresponding value of the phase of reference oscillator at the front. We possibly have different values for Δt and ϕ_f depending on initial phase of S₀. Two extreme cases could arise, illustrated on figures 9(B) and (C):

(1)
The front is a pure function of the phase of the reference oscillator, i.e. ϕ_f coincides with the time when phase ϕ₀ reaches a specific value. In that case Δt decreases linearly with ϕ_p. This situation is represented on figure 9(B), and the corresponding kymograph would exhibit discrete segments corresponding to phases ϕ₀ modulo 2π.
(2)
The front is a pure function of the phase difference, i.e. appears for a constant Δt. In that case ϕ_f is linear in ϕ_p. This is represented on figure 9(C). In this case, the kymograph would exhibit a smooth expression pattern for Mesp2, with no visible segments.

**Figure 9.** Models and results for front definition. (A) Sketches how the quantities on (B)–(D) are defined using schematic dynamics of the inputs and decoder genes. ϕ⁰ is the initial phase of oscillator S₀, ϕ_p is the phase of oscillator S₀ when ${\phi }_{1}(t)-{\phi }_{0}(t)=0$ , ϕ_f is the phase of oscillator S₀ at the front and Δt is the time difference between the moment when the front appears and the moment when Δϕ = 0. We show two examples of what these values could be for different values of ϕ⁰. (B) If the front appears for a given value of the phase of oscillator S₀, Δt decreases linearly if ϕ_p increases. Because the behaviour is periodic in ϕ_p, there is a discontinuity for Δt that results in discrete segments. (C) If the front appears at a given value of the phase difference, then Δt does not depend on ϕ_p and ϕ_f (modulo 2π) increases linearly if ϕ_p increases. (D) Simulation of network on figure 8(D). The behaviour is closer to situation of panel (C) as expected, but one clearly sees a slight oscillation of Δt responsible for a more discrete expression pattern for *Mesp*2 on figure 8(D).
Download figure:
Standard image High-resolution image

As illustrated on figure 9(D), our network is closer to situation of figure 9(C), which is not a surprise since decoder gene has been selected to be a function of Δϕ. Nevertheless, there is a non negligible oscillation of the phase difference on figure 9(D) that modulates the front signal as a function of ϕ_p. Because of this dependency on the absolute phase of oscillators, the system displays more discrete blocks of expression like in figure 8(D). Thus a decoder variable sensing phase difference of two oscillators can naturally implement two contradictory aspects: a continuously regressing shock combined with a mechanism for discrete somite formations. Further rostro-caudal patterning of somites would necessitate another mechanism downstream of the clock, e.g. the slowing down of the clock can provide somite polarity [47].

Discussion

Recent experiments on somitogenesis have challenged the classical CW paradigm [25]. The behaviour of Lfng waves in mouse explants [25] suggests a phenomenological model where two genetic oscillators might be crucial for positional information leading to vertebrae formation. In this manuscript, we have reformulated this idea using the concept of mutual information, and used in silico evolution to generate nonlinear gene networks encoding phase difference with a decoder morphogen. Our evolutionary computation gives clear reproducible network structures, which raises new experimental questions.

The first natural question is the nature of the oscillators implicated. There are many candidates given the plethora of oscillating genes [10]. The oscillatory models described here should be interpreted as coarse-grained descriptions, with one phase variable possibly describing oscillations of complete pathways. From [25], the natural candidate for phase ϕ₁ is the Notch pathway, with Lfng as a natural representative. A prediction from the 2-oscillator model from [25] is that reference oscillator ϕ₀ should be synchronized in the whole PSM, so that there would be no clear wave of genetic expression like what is observed for Lfng. Rather the whole PSM would simply turn 'on/off' in response to this oscillator. Dynamical reporters would be necessary to identify such first oscillator. Microarrays have clustered two groups of oscillating genes: Notch/FGF pathways and Wnt pathways [10]. So Wnt might be a priori a good candidate for ϕ₀ [13], but only detailed spatial and temporal resolution of its oscillations would answer this question. One can not exclude for instance that other components of Notch or FGF, while in phase in tail-bud, present different patterns than Lfng as PSM cells move more anteriorly.

Our study identifies the important combinatorics of positive and negative feedbacks necessary to compute phase difference. An upstream positive feedback creates asymmetry in the decoder profile, while downstream negative feedback plays a role in both dampening of the oscillations and amplifying the final signal. However, as can be seen from the variety of networks evolved in figure 2, those feedbacks can be potentially implemented in many different ways. Nevertheless, these properties are rather generic, and therefore constitute experimental predictions on coarse-grained network organisation.

The feedbacks inside the somitogenesis network appear themselves rather complex and convoluted (see e.g. the multiple delta feedbacks in zebrafish [48]), and it is clear that there are many redundancies in the system since one can knock out entire pathways and still observe oscillations [10]. Phase/amplitude information would be important to disentangle the coarse-grained feedbacks predicted here, but to date there are only few reporters allowing for such fine measurements: Notch and Wnt in mouse [13, 25], and Hes reporters in fish [14, 49]. Nevertheless, screening of oscillating pathways have identified oscillating genes common to Notch and Wnt such as Bcl9L and Nkd1 [10], that could thus be part of the feedforward pathways downstream of ϕ₀ and ϕ₁.

The decoder gene D in our model couples the two oscillators and helps defining the differentiation front. FGF pathway controls somite formation [50], and thus is expected to be related to D in our framework. Interestingly, many members of FGF pathways oscillate as well [10]. Functional roles of FGF pathway oscillations are not clear in a picture where FGF gradient sets a threshold of differentiation [51], but on the contrary, they are expected if oscillations control somite formation like in the two-oscillator framework assumed here. Interestingly, FGF indeed regulates Hes7 oscillations [52] and the whole FGF cluster appears in phase in Notch [10], thus suggesting a feedback between differentiation and oscillation itself.

It should also be pointed out that behaviours qualitatively similar to decoder D here are actually observed experimentally on reporters of core oscillating genes: there is a significant increase of both average level and oscillation amplitude, before a complete collapse in expression at the front (that would correspond to the shock). Examples include Lfng in mouse [25], or Hes in zebrafish [14]. So such genes, while belonging to the core Notch signalling pathway, might actually be part of the actual decoder, feeding back on the core oscillator. If so, their average level (or their signalling intensity) should directly control their period, which might be testable experimentally.

A last phenotypic prediction of our approach is the existence of a shock. This shock appears as a 'phenotypic spandrel' (i.e. a phenotype appearing as a consequence of another phenotype [53]) related to decoding of phase differences by a linear variable. We showed on a simple toy-model how such shock could actually define what is usually considered as the'wavefront', upstream of genes such as Mesp2, that can nevertheless display more discrete patterns of expression corresponding to somites. While the idea of a shock emerging from phase equation has been recently explored in a model bearing resemblance to the Burger's equation [54], the shock there is due to cell-to-cell coupling, while we explore here a different framework based on decoding the phase difference between two oscillators that is purely local and can be implemented in a single cell.

A major and surprising evolutionary insight from comparison of somitogenesis networks is that homolog genes might have different roles in different species [11]. The principles described here are general enough to be shared by many oscillatory pathways during somitogenesis: phase computation can be performed by combining small positive and negative feedback loops like in the asymmetric network from figure 3(D). The associated shock in genetic expression could be shared by many components and be used to define differentiation front, giving evolutionary plasticity. This suggests again that focusing on 'phenotypic' description of developmental networks might be key to functional understanding, as already suggested in [38].

Acknowledgments

We thank Alexander Aulehla for useful comments discussions, and an anonymous referee for suggesting to explore more explicitly the role of feedforward loops. This project has been partially funded by the Natural Science and Engineering Research Council of Canada (NSERC), Discovery Grant Program, McGill University Physics Department and a Simons Investigator Award for Mathematical Modelling of Biological Systems to PF.

Positional information from oscillatory phase shifts : insights from in silico evolution

Article metrics

Submit

Author e-mails

Author affiliations

Author notes

Dates

Peer review information

Abstract

Introduction

Method

Results

Networks compute phase difference, irrespective of oscillators drift and periods

Methods for phase computation: dissection of networks

Positive feedback creates asymmetry

Layers of feedforward interactions reinforce asymmetry

Dampening, amplification and shock refinement are due to negative feedback

Evolutionary pathway

Reconstructing somitogenesis phenomenology using evolved networks

Discussion

Acknowledgments

Positional information from oscillatory phase shifts : insights from in silico evolution

Article metrics

Submit

Share this article

Author e-mails

Author affiliations

Author notes

Dates

Peer review information

Abstract

Introduction

Method

Results

Networks compute phase difference, irrespective of oscillators drift and periods

Methods for phase computation: dissection of networks

Positive feedback creates asymmetry

Layers of feedforward interactions reinforce asymmetry

Dampening, amplification and shock refinement are due to negative feedback

Evolutionary pathway

Reconstructing somitogenesis phenomenology using evolved networks

Discussion

Acknowledgments