A simple cognitive model explains movement decisions during schooling in zebrafish

While moving, animals must frequently make decisions about their future travel direction, whether they are alone or in a group. Here we investigate this process for zebrafish (Danio rerio), which naturally move in cohesive groups. Employing state-of-the-art virtual reality, we study how real fish follow one or several moving, virtual conspecifics. These data are used to inform, and test, a model of social response that includes a process of explicit decision-making, whereby the fish can decide which of the virtual conspecifics to follow, or to follow some average direction. This approach is in contrast with previous models where the direction of motion was based on a continuous computation, such as directional averaging. Building upon a simplified version of this model [Sridhar et al. 2021], which was limited to a one-dimensional projection of the fish motion, we present here a model that describes the motion of the real fish as it swims freely in two-dimensions. Motivated by experimental observations, the swim speed of the fish in this model uses a burst-and-coast swimming pattern, with the burst frequency being dependent on the distance of the fish from the followed conspecific(s). We demonstrate that this model is able to explain the observed spatial distribution of the real fish behind the virtual conspecifics in the experiments. In particular, the model naturally explains the observed critical bifurcations for a freely swimming fish, whenever the fish makes a decision to follow only one of the virtual conspecifics, instead of following them as an averaged group. This model can provide the foundation for modeling a cohesive shoal of swimming fish, while explicitly describing their directional decision-making process at the individual level


I. INTRODUCTION
Understanding how animals move together in groups is a long-standing puzzle [1,2].Current theoretical models of this process involve the description of individual agents that move according to simple rules, which embody the social influence exerted by their conspecifics, and external influences such as physical barriers and predators [2][3][4][5][6][7][8].Within these models, the influences of the neighbors in the group, representing the sensory information, are integrated, and the resulting direction of motion of each agent is calculated.However, during this calculation, the individuals in such model frameworks do not explicitly perform a decision-making process.In other words, the direction of motion of each agent is given by some unique and continuous function of the positions and velocities of the detected (usually neighboring) conspecifics [3,4,[9][10][11][12][13].Given two identical conspecifics, the agent in these modelas has no mechanism to spontaneously decide to follow only one of them (at a given time), in contradiction to the observations [14].
We propose here a theoretical model for how animals respond to social cues, that includes an explicit decisionmaking process [14].Our current work is based on our theoretical model for how animal groups, and individual animals, make decisions regarding their desired direction of motion while moving towards and choosing between different targets [14,15].This assumption is supported by neurobiological studies conducted in a range of animals [16,17].Within this model, the animal's brain is assumed to have neuronal representations of the vectors aimed at the different targets.The neurons (or neuronal groups) that represent each target are treated as "spins", in the terminology of statistical physics, with each spin representing either an "on" (1) or "off" (0) state.The spin state corresponds to the firing state of the neuronal group that it represents.This spin model is a simplification of (and can be mapped to [14]) widely used models of neuronal representation of spatial targets [18,19].The interactions between the spins drives a geometry-based phase transition (from averaging among options to deciding on one), which manifest as bifurcations in the resulting trajectories.This model thus predicts that the animal's direction of travel would tend to be in a compromise (average) direction between the different targets, when they are far, and the relative angle between them is small.However, above a critical relative angle, such as when the animal approaches the targets, the trajectories bifurcate and point toward one target, in the case of only 2 options, and a subset of targets for a larger number of options.This prediction was verified for several types of insects (desert locusts and fruit flies), moving toward stationary targets in a virtual-reality arena [14].
The model was also employed to consider how zebrafish (Danio rerio) respond to conspecifics.Again using immersive virtual reality [14], allowing virtual fish to be projected in specific configurations while on the move, it was also found that individuals would average when the angle subtended by virtual fish was below a critical value, and a decision to follow one among the remaining options occurred when this angle exceeded a critical value.In the comparison with the theoretical model in this previous work [14], it was assumed that the real fish (RF) can only move along one dimension, perpendicular to the direction of motion of the virtual fish (VF) (Fig. 1).This way, the RF is assumed to be moving at a fixed distance behind the line of VF, and only its lateral motion sideways was modeled.Note that in the experiments, the RF also maintains a very similar depth to the VF that it is chasing [14], thus schools of real juvenile zebrafish tend to be quasi-planar.Despite the considerable simplification in the theoretical model, the experimental and simulated distributions of the RF, when projected to one dimension, were in very good agreement for both two and three VF [14].The essence of the simplification obtained by the one-dimensional projection of the fish motion is that it eliminates the need to calculate the speed of the fish, as it responds to the VF.
In the experiments, however, the RF moves freely in (predominantly) two dimensions (due to the nature of fish to swim at the same depth as each other) when schooling with the VF, and its distance to the VF constantly changes during the chase.Here we take this freedom of movement into account and develop a theoretical framework that implements the spin model for directional decision-making, together with a calculation of the fish speed, as the RF interacts freely with an arbitrary number and spatial configurations of moving conspecifics (which can be thought of as mobile targets).This model allows us for the first time to compare the spin model for directional decision-making to the full two-dimensional experimental trajectories of the RF [14].We emphasize that our model does not include any explicit alignment interaction of the Vicsek-type [2,20].Moreover, the main theoretical novelty of our work is the spin-based directional decision making model, with its spontaneous symmetry-breaking property, as opposed to previous models using vectorial averaging and simple distance-based interaction rules.The calculation of the speed, using burst-and-coast dynamics, is more standard [21,22].

II. THE FISH MODEL
Motivated by our experimental observations on zebrafish (Danio rerio) [14], we will limit the motion of the real fish (RF) in our model to a two-dimensional plane on which fish interact (i.e. they are assumed to be swimming at the same depth in the water column) (Fig. 1).In our experiments the VF move with the burst-and-coast (or, burst-and-glide) motions as do real fish (Fig. 2A).In these experiments the virtual fish move in straight trajectories, and make U-turns when they near the edge of the experimental arena.Our model allows us to calculate the full two dimensional motion of the RF, when it interacts with several VF (which move as in the experiments), as illustrated in Fig. 1.We clearly denote in all the figures the experimental data versus the theoretical results, to allow a direct visual comparison to be made.

A. Direction of motion: Spin dynamics
Within our model the direction in which the RF is moving is dictated by the neuronal firing, which is represented by the "on" spins in the Ising-like model [14].Each target is described by a group of spins; each one can be either "on" (1) or "off" (0), such that the net unit vector of the RF's desired direction of motion is given by where, n i refers to the number of "on" spins in subgroup i, ni is the unit vector pointing from the RF towards target i (Fig. 1), and m is the number of VF (and spin groups).
Each neural group vector ni is directed, on average, at each of the different VF, but we consider that there is angular noise affecting the angular direction of this vector according to the following Langevin equation where x i , y i are the coordinates of target i with respect to the location of the RF, Γ is the amplitude of the Gaussian noise (with a standard deviation of σ = π 3 and limits of −b < Γ < b, where b = π), and γ is the rate at which the angle of the internal vector re-adjusts to the new perspective angle of target i.
The dynamics of the spins in each group are calculated using the Ising model, based on the following Hamiltonian [14,15] where σ i are the spin variables, N is the total number of spins in all subgroups, and J ij are the interactions between the spins, given as the dot products of the preferred directions of spins i and j, such that: , where J 0 is the interaction strength energy, and θ * ij is the modified form of the relative angle according to: where θ ij is the Euclidean relative angle (Fig. 1) and the tuning parameter ν describes the non-Euclidean transformation of the relative angle, thereby determining the interactions between the spins (Fig. S1).Comparisons with experiments indicate that a suitable tuning parameter is ν = 0.5 [14], across taxa, which is the value we used in our simulations.The dynamics of the spins follow from the Hamiltonian (Eq.3).They can be calculated for discrete, stochastic spins, or using a continuum description in terms of the following Langevin equation for the dynamics of the neural firing where B(n i ) is the noise amplitude, m is the number of neuronal groups, and ω is a normally distributed white Gaussian noise (with a standard deviation of 1).This dynamic is applied to each spin group that represents a target.The noise amplitude B(n i ) is given by [15] (see SI section S2) where the spin-flip rates (per spin) are given by where the sum goes over all spin groups except i, k 0 = 1[s −1 ] is a spin-flip rate constant, and T stands for the dimensionless ratio between the noise amplitude and the spin-spin interaction strength, which determines the dynamics of the spins flips.These spin flips represent the dynamics of neuronal switching between firing and non-firing states.Integrating Eq. 5 over time gives us the dynamics of the spins that are "on", which determines the direction of the RF motion (Eq.1), as a function of time.
The contribution of each spin group to the desired direction in Eq. 1 is modified if there are several targets that are very close to each other.Previous work suggests that targets that are within a small relative angle with respect to the animal, are considered as a single target [23].To account for this effect, we incorporate an 'overlap' function in our model [14].This function reduces the weight of spins encoding a target in Eq. 1, if there are other targets in directional proximity (see SI section S3).This effect becomes crucially important for the case of 3VF, as described below (Fig. 4).
Note that the distance to the target does not influence an individual's direction in the current model: spins that represent proximal targets do not have a larger weight in Eq. 1 compared to spins that represent more distant targets.We can add such modifications in the future, as needed.

LRD
FIG. 1: Illustration of a 2D system with 1RF (black) following 2VF (red): θ12 is the instantaneous relative angle between the two targets, θi is the angle from the RF to V Fi.Note that we use the same terminology to describe our model as employed in our experiments; thus, the RF in the model represents the focal individual whose cognitive process is being simulated whereas the VF represent the VF used in the virtual-reality experiments, moving in the same direction as each other and at a fixed distance apart.The instantaneous direction vectors to the targets are n1 and n2, with the corresponding distances r1,r2.The lateral distance (LRD) denotes the distance between nearest neighbor VF, when the VF are moving along a line that is perpendicular to their direction of motion (modified from [14]).The VF velocity vectors were taken to be identical, so that the VF move in synchrony, and describes a series of burst-and-coast dynamics.

B. Speed of motion: Burst-and-coast dynamics
In this work, we will consider two possible models for the amplitude of the velocity (V ), and in both models, this amplitude depends on the distance (r) between the RF and the chased VF.Recent experiments demonstrated that the effective interactions between leader-follower have a spring-like property [3,11,12], whereby the follower increases its speed as its distance to the leader increases, up to a critical distance where it declines.This behavior was also measured for the RF following the VF (see Fig. S4).Note that the precise implementation of this tendency, which keeps the RF from losing the VF that it is chasing, is not the main focus of our model.The novel feature that we introduce relates to directional decision making (as described in the previous section).Therefore, there can be different implementations of the RF speed, as we discuss below.
In the SI section S5, we give the results for a model where the speed of the RF (V RF ) is described as a continuous function of the distance between the RF and VF.However, we wish to focus here on a more realistic description of the RF speed, as arising from burst-and-coast dynamics [24,25], in which the RF's velocity will be given by bursts of force from the tail-beat events, occurring in a stochastic manner.The differential form of the velocity can be written as where η is the friction coefficient with the water, F is the stochastic force parameter, which is either "on" (F = 1) or "off" (F = 0), and f is the force amplitude of each burst (in units of force/mass).We assume that the rate at which fish exhibit bursts of tail beats (which account for the rapid acceleration in the burst-phase of their motion), k s,on , depends solely on the distance of the RF from its targets (r = | r|, Fig. 1), and therefore we use the following expression where k is an effective spring constant (k = 250[m • s] −1 ), and r d is the length-scale beyond which the fish loses its target (we fit it to be r d ∼ 0.2[m]).The linear dependence of the burst rate on the distance to the VF, for small values of r (Eq.9), is motivated by the interactions found between fish [11], and allows the RF to successfully pursue the VF without losing it.We tested this relation by statistical analysis of the experimental RF's dynamics, which was not conclusive (SI section S7, Fig. S6).
We use this rate in a Gillespie method (see SI section S5) within our Langevin simulations, and it determines when the next burst of tail beats occurs.Following each burst, F is turned "off" after a fixed duration, t of f = 0.15[s], which for simplicity was chosen to be constant in all the simulations.When the burst of tail beats stops (F : 1 → 0), the coasting phase starts, and the velocity of the RF decreases exponentially with time (Eq.8), until the next burst event.
The heading direction of the simulated RF is updated at the onset of each tail burst, since in experiments it was found that the orientation of the RF stays approximately constant for the coasting duration (and is given by Eq. 1) [21,24].
There are two additional ingredients that we found to be important to implement in our model: In order to better resemble the velocity pattern of the experimental RF, we had to define a velocity threshold (V threshold ), whereby tail bursting occurs only when the simulated RF's velocity is lower than this threshold (SI section S8).In Fig. S7, we compare the experimental and model RF's velocity dynamic with and without the velocity threshold, of V threshold = 0.04[m/s].The velocity threshold was estimated by analyzing statistical data from the experimental velocity pattern (Fig. S8).
In addition, we added some noise to the amplitude of the burst force f , as this noise is observed in the experiments (Fig. S9A-D).Hence, we used in our calculations (Eq.8) force amplitudes (f ) that are drawn from a Gaussian distribution, centered at f 0 , with a constant variance ψ = 0.2[m][s −2 ].Note that the experimental RF exerts stronger and faster tail bursts as it is chasing a faster VF (Fig. S9E,F).However, for simplicity the simulated RF in our model is only increasing its burst rate in order to chase a faster VF.As we noted before, our main aim was to explore here the implementation of our spin model for directional decision making, so for the modelling of the RF speed, we merely wanted to implement a reasonably realistic model which can be further improved in the future.
The only parameter that we needed to change as a function of the number of VF that the RF is chasing, in order to obtain reasonable agreements with the experimental observations, was found to be f 0 : It is possible that the RF exerts stronger average tail burst forces when chasing a smaller group of fish since it feels more vulnerable and exposed.Weaker interactions between fish were indeed observed in experiments with larger groups [11].By reasonable agreement we mean that the simulated and experimental distributions of different quantities that characterize the motion of the RF have the same shape and qualitative features, as well as close average values (see for example the case of 1VF in Fig. 2B-D).
When the RF is following several VF, we need to define how the distance r that enters Eq. 9 is calculated.The problem can be framed in terms of which VF the RF pays attention to, at each instant.We define a threshold (τ ) of fractions of "on" spins (representing the strength of the corresponding neural firing) to determine which targets contribute to the calculation of the distance to targets [14].On each iteration, the level of neural firing that represents each VF (n i ) is monitored to determine if it is above the threshold.If so, the distance to the relevant VF is included in the calculation of r as follows where m * ≤ m is the number of neural groups which fire above the threshold, and | r i | is the distance of the RF to the i'th VF (Fig. 1).We used a threshold value of τ = 10%, however the results were found not to be very sensitive to the threshold value (Fig. S10).
The model parameters are summarized in table I. Their values are in close agreement with those used in other burst-coast models of this fish motion [22].

III. RESULTS
We compare our model with the experiments in which a RF can school with either one, two, or three VF swimming at different average speeds and spatial configurations.We start with calibrating the model for the case of one VF, and then keep all the model parameters fixed when modeling the behavior of RF following two or three VF, except for slightly modulating the mean burst force f 0 .

RF following one VF
We begin with the simplest system of the RF following one VF.A typical time-series of the RF velocity, from the experiment and the simulation, is shown in Fig. 2A.The distributions of the RF speed (V RF , Fig. 2B), maximal speed after each burst (V peak , Fig. 2C), and the time interval between bursts (t peak , Fig. 2D), are all in reasonably good agreement between the experiment and simulations.A typical trajectory of the RF chasing the VF is shown in Fig. 2E,F, in the VF and lab reference frames, respectively (see Supplementary Movie M1).
The accumulated spatial distributions of the RF behind the VF from our simulations are compared to the experimental observations in Fig. 2G-J.The position distributions are shown normalized either over the whole 2D space (Fig. 2G,H), or along individual x-axis sections (Fig. 2I,J).We conclude that our model reasonably fits the experiments, for the different mean speeds of the VF.Further comparisons are shown in Fig. S11.(each row), in the VF frame of reference (the VF is at the origin, red dot).For each velocity, we normalized the heat maps over the whole 2D space (G and H), or over individual x-sections (I and J).(G) and (I) show the experimental heat maps, (H) and (J) the simulated results.For each VF velocity, we ran 100 simulations (in which the RF initial position was random, within a certain distance of the VFs) each for 5, 000[s] (500,000 iteration steps).We used the following parameters:

RF following two VF
In Fig. 3 we plot the accumulated spatial distribution of the RF chasing two VF that are moving at different average speeds, and at different lateral separations (left-right distance, LRD, Fig. 1).In Fig. 3A-F we plot the results for a small value of LRD (= 0.06[m]), where the experimental data show the RF to be mainly positioned behind, and in-between the 2VF, and no clear bifurcation is observable in the RF distributions (Fig. 3A,C).At larger LRD values (Fig. 3G-L), we clearly see that when the RF is close to the two VF, it follows either one of them, while when it is further away it is found more in the compromise position between them (Fig. 3G,I).Further comparisons are shown in Fig. S12.
In our model, the motion of the RF between the two VF is termed the "compromise" regime, where the spin dynamics do not break the symmetry.When the symmetry is broken, one group dominates, and the other is inhibited.Such a bifurcation, termed "decision", arises in our model when the relative angle between the targets, with respect to the RF, is larger than a critical value (of ∼ 90 o for our choice of model parameters) [14].
Comparing to the simulations, we find that for the small LRD value and low VF speeds (V V F = 0.04, 0.05, 0.06[m/s]), the RF distribution is already bifurcated, i.e. the RF follows behind each of the VF, and not in the compromise position between them (Fig. 3B,D, and Supplementary Movie M2).This behavior is not observed in the experiments, and arises from the fact that the experimental trajectories of the VF and the RF are bounded by the container walls, thereby limiting the experimental data to short trajectories (of ∼ 6 − 8s duration) that are interrupted by U-turns, while the simulated trajectories are long and uninterrupted (around 5, 000[s]).We demonstrate this in Fig. 3E,K, by comparing the RF's spatial distributions obtained from many short simulated trajectories (of 7.5[s] duration), that match better the experimental distributions.During short trajectories the spatial distribution of the simulated RF does not reach its steady-state distribution, as obtained from simulating long trajectories.
At the larger LRD values, both the experimental and the simulated distributions exhibit clear bifurcation of the RF dynamics, which follows either one of the two VF (Fig. 3G-L, and Supplementary Movie M3).The projections of the RF's positions on the y-axis (Fig. 3F,L), show that there exists good agreement between the model and the experiments.Similarly, the distributions of the speed, velocity peaks, and intervals between consecutive velocity peaks are in reasonable agreement between the model and the experiments (Fig. S12).
Note that the overlap function did not significantly affect the RF distribution for the simulated two VF system, except at the fastest velocity of the VF (0.07[m/s]).This is expected, as the overlap affects the dynamics when the RF is far behind the two VF, so that their relative angle is smaller, which occurs more for the fastest moving VF.At this velocity, without the overlap function, the bifurcation event occurred too close to the targets.

RF following three VF
The experiments and the simulation results for three VF are displayed in Fig. 4. We find an overall good agreement between the accumulated spatial distributions from the experiments and the corresponding simulations (Fig. 4A-E top two lines).The bifurcations are clearly observed in the model and in the experiments for the larger LRD value (LRD = 0.10[m]), while for LRD = 0.05 [m] we see the bifurcations clearly only in the simulations but not in the experiments.This is likely due to the larger experimental noise, and to the shorter trajectories in the experiments compared to the long trajectories used in these simulations (as in the 2VF case, see Fig. 3E,K).Further comparisons are shown in Fig. S13, and in Supplementary Movies M4-M5.The left panels show the trajectory relative to the 3VF (red dots), and the right panels show the trajectory in the lab frame (trajectories of the 3VF are given by the horizontal lines).Time is given by the color bar.For 3VF cases, we used the same parameters as in 2VF (Fig. 3), except for the burst force amplitude which was changed to f0 = 0.95[m/s 2 ]. we ran 100 simulations (in which the RF initial position was random) each for 5000[s] (500,000 iteration steps).
The overlap function makes a crucial difference in the dynamics of the RF in the system with three VF, as we demonstrate by plotting the simulated distribution when this effect is not taken into account (Fig. 4A-E bottom line).Without the overlap function, the RF predominantly follows the middle target due to geometrical reasons.Following the middle target appears to be a steady-state solution in the RF decision-making process, in the absence of the overlap normalization, as demonstrated by the sample trajectory (Fig. 4G).In the presence of the overlap function, we observed that the RF chases behind each of the three VF and switches between them (Fig. 4F), and we could also distinguish the bifurcation events on the density maps (Fig. 4A-D, LRD = 0.1[m]), in agreement with the experimental observations.For small LRD values, the effects of the overlap function are not visible (Fig. S13).

2VF in shifted configuration
Next, we investigated a configuration of 2VF which are not aligned along the y-axis, i.e. along a line that is perpendicular to their direction of motion.One VF was shifted in both the front-back and the left-right directions with respect to the other VF, by a distance of δ.The accumulated spatial distributions from both the experiments and simulations are compared in Fig. 5A-D, for different values of the displacement.Overall the model results match the experimental distributions.For example, even in the 'bifurcation' regime (δ = 0.06[m] in Fig. 5A-D), both the experiment and model show that the RF moves between both VF.Typical simulated trajectories demonstrate the motion of the RF between the two VF in Fig. 5E.Further comparisons are shown in Fig. S14, and in Supplementary Movie M6.
This agreement demonstrates that our model is not limited to describing a fish responding to a single line of leaders, but can also account for the fish motion within a moving group with complex geometry.Note that the one-dimensional version of the model, used in [14], cannot explain the trajectories in this configuration.

IV. DISCUSSION
Our research provides a novel approach to modeling the motion of fish when responding socially to other fish, by including an explicit model for the directional decision-making processes of the individual fish.This is implemented using a spin-based model [14], which gives rise to spontaneous symmetry breaking, describing the cognitive process of choosing which of the targets to follow.Previous models of animals following others moving in a group usually rely on alignment interactions or vectorial averaging, which do not contain a mechanism for choice.While the parameters used in our model simulations are specific to the zebrafish (Danio rerio) that were examined in the experiments, the proposed directional decision-making model may apply more generally to other animals that move in groups.
The Ising spin model that we used for the directional decision-making, is shown to provide an appropriate framework for representing the fish's directional decisions while it is chasing moving targets.While previously applied under the constraints of projecting it to a 1D motion [14], in this work we have extended it to describe the free movement of the fish in 2D when following other moving conspecifics.The model simulations recovered the main dynamical features of the experimentally observed RF motion.Most importantly, the model captures the observed bifurcation transition exhibited by real fish, as a function of the separation between the VF: when the VF are close to each other, the RF follows them along an averaged (compromise) position, while when the VF are far apart, the RF pursues each of them individually.Moreover, we used identical parameters for VF of different speeds and geometries, indicating that the model is rather general and robust.
Our model may form a first step towards implementing individual directional decision-making in models of animal collective motion.As shown in Fig. S15, we can explore the dynamics of many RF interacting using our model, following a single leader, or in its absence.These results indicate that the model can describe the formation of cohesive shoal ("swarm") behavior, without any spontaneous alignment of the fish along a particular direction [26] (see Supplementary Movie M10).This suggests that for the maintenance of collective persistent motion along a particular direction ("schooling phase" [7,13]), the spontaneous emergence of leaders is needed [27][28][29], such as leadership by indifference (whereby less socially-responsive individuals spontaneously become leaders [27]), which is not included in the current model.Note that our model ignores the finite size of the RF, which means that we do not describe the short-range maneuvers that are necessary for collision avoidance.Similarly, in a real fish school some individuals may have more influence than others [4,30].In addition, our model explicitly involves many-body interactions, that are not simply the superposition of pairwise interactions.Significant interactions beyond pairwise have indeed been measured inside fish schools [11].
The model can be elaborated and developed to include more detailed aspects of the interactions between the fish [3,4,8,13,30,31], while the explicit directional decision-making model may be incorporated into existing models of collective animal motion [32,33].It remains for future exploration how this individual decision-making process can give rise to the emergence of polarized fish schools and (possibly transient) leaders.
where θ ij is the original measured conflict angle (as shown in Fig. 1), θ * ij is the new modified angle and the tuning parameter ν describes the shape of the interaction between neural circuits, corresponding to the wiring or filtering activity of the brain's neurons (Fig. S1).Hence the interaction between the spins was modified to: From previous experiments, a suitable tuning parameter to represent animal movement was found to be ν≈0.5 [1], therefore we will use this value in our simulations.In Fig. S1 we plot the affect of ν on the conflict angle in which excitation (J ij > 0) changes to inhibition (J ij < 0).

FIG. S1:
The tuning parameter affect the conflict angle between targets (θ ij ) in which we get inhibition (J ij < 0) or excitation (J ij > 0).
As the tuning parameter is smaller, the conflict angle needed to inhibition is smaller.We used ν = 0.5 (in orange) in our fish model.

S2. CALCULATING THE NEURONAL NOISE TERM IN THE CONTINUUM MODEL
To calculate the neural noise which we added to the continuum approach, for each subgroup of neurons, we first defined r − and r + in each group to be: where, m is the number of neural groups (and also the number of VF), and n i is the amount of "on" neurons in group i.By substituting the Glauber rates as defined in Eq.7 in Eq.S3, we get: We will focus on one event change at most in δn i : here P + and P − are the probabilities of the neurons to turn "on" or "off, and the variance in neural firing is: Next, we calculate the neural noise amplitude, B(n i ): and by substituting δn 2 i inside the previous equation, we get: We will substitute Eq.S4 in Eq.S8 and this will be calculated each time step to find the neural noise amplitude which depend on N , T , n i , n j , and θ ij : We used Eq.S9 as the amplitude of the neural noise as follows: B(n i ) is the noise amplitude, and ω is a normally distributed white Gaussian noise (with a standard deviation of 1 in our model).

S3. THE OVERLAP FUNCTION
The main objective for the overlap function is to represent a situation in which several VF are within very close viewing angles with respect to the RF eyes.In such cases, we want the RF to "see" them as "one" target, so the neural groups will not fire fully to both VF, but will overlap with each other.This will decrease the total neurons that can be turned "on" in each group.In this manner, the fish will not tend to follow the overlapping targets more than isolated VF, since it sees them as approximately one fish.Figure S2 illustrates such a case in which the overlapping function could affect the neural firing of groups number 1 and 2 but does not change the dynamics in group number 3. The RF sees both V F 1 and V F 2 at very similar angles, and therefore their relative neural groups will overlap and will be reduced.

FIG. S2:
The Overlap function in cases of overlapping targets -The RF observes V F 1 and V F 2 at very similar angles, yet it sees V F 3 at a distinct angle.Therefore, the overlap function for the RF has been employed to avoid bias to the direction of the V F 1 and V F 2 .This decreases the total amount of neural firing in n 1 and n 2 depending on how the related angles overlap (how close is θ 1 to θ 2 ).The n 3 group may also be affected, but to a much lesser degree.
Each set of spins (n i ) is aimed at angle θ i which points towards target i, but for simplicity we assume it has a Gaussian form around this angle: Where σ θ is the width of the Gaussian function, and A is a normalization factor such that: The normalization of the neuronal strength due to the overlap function is calculated in the following manner: Here b = 3 √ σ θ , n i,0 is the original neural firing before including the overlapping, f i (θ i , θ) is from Eq.S11 and f j (θ j , θ) is the same just for other VF, in the direction of θ j .The sum in the denominator goes over all the subgroups (to each VF), including j = i.At every angle θ i the fraction of the spins of group i that are counted is not f i (θ i , θ)dθ, but decreased due to the overlap with the other Gaussian functions around the θ j , as given by the normalizing factor fi(θi,θ) j fj (θj ,θ) .In 2VF, when the VF's angles overlap (θ 1 − θ 2 − → 0), the function goes to 1 2 , and we do not double-count the spins that point at the same target.When the targets do not overlap, each recieves its full share of n i,0 spins.In 3VF case, there can be symmetry around the middle target (θ 1 ).When the three targets overlap, the overlap function of each of them goes to 1/3 so that we do not triple-count the spins that point at the same target.The central target is affected by a stronger overlap since it is affected by both edge targets.
We kept the parameter σ θ equal to 0.00002π on our simulations.We confirmed that the overlap contributes to our model in Fig. S13.Before appending the overlap function, the RF only chases behind the middle target of the 3VF (Fig. S13).However, when the overlap is included, the RF chases behind all three VF, which better fits the experiment.The reason is that the overlap causes the middle VF to have lower amplitude due to overlapping with the nearby VF from both sides.Subsequently the neural firing of the group directed at the middle target decreases.

S4. EFFECTIVE INTERACTION BETWEEN THE RF AND VF, EXPERIMENTS AND MODEL
The relative acceleration of a fish following other fish, or another VF, was measured in experiments (Fig. S4A-D).A similar analysis of our simulations is shown in Fig. S4E,F.

S5. SPRING-LIKE MODEL FOR THE RF SPEED
Before applying the tail burst events in our model, we tried a simpler approach, where the velocity amplitude is directly calculated in a continuous manner from the distance of the RF to its targets (depending on the attention of the RF).The RF velocity amplitude is calculated in the following way: where r = r rf − r vf is the absolute distance of the RF from the VF, V 0 is the speed value, r d is a constant that gives the range beyond which the interactions between the fish decay, and k is the spring constant.We added a finite time scale that adjusts the speed of the RF in order for it to reach closer to the VF as in experiments.Together with a noise in the velocity of the RF, such that the velocity distribution is wider and fit better.Both were added by applying the following equation: where β is the frequency to adjust the velocity, and ξ is a white Gaussian noise term.We apply this spring like method on the the two dimensional system for one, two and three targets (Fig. S5) This model resulted in a reasonably good agreement with experimental heat maps (Fig. S5A-E).However, the RF's velocity dynamics did not match the velocity pattern of the RF in experiments (Fig. S5F).Therefore, we focused in the main text on a model of realistic burst-and-coast behavior of the RF.The spring-like model did not give realistic spatial distributions when the 2VF where in a shifted geometry (Fig. 5).and (D,E) 3VF.The right panels give the speed, and projected y-position distributions.The blue/purple line show the experimental/simulated distribution.The bottom plot show the RF dynamics in experiment (blue), and the red line the RF velocity from the spring-like simulation.For good statistics we conducted for each heat map and histogram 500 simulations, with random initial position of the RF, each for 5, 000[s] which equal to 500,000 iteration steps.

S6. BURST EVENTS
In order to simulate the tail bursts we used a Gillespie method to determine whether a tail burst just occurred or not, depending on the rate of the tail to exert a burst (Eq.9).The Gillespie simulation determines in each iteration whether a tail burst just happen (F = 1) or stopped (F = 0).The first step is to determine the time of the next event, a number r is drawn from a uniform distribution and the next time step is calculated by: where k s,on the rate to burst, as given in Eq.9.We defined the time step between iterations to be ∆t = 0.01[s], so that during each iteration we check whether dt < ∆t and also if V RF < V threshold (see section S8 below).If both of these conditions are satisfied, a tail burst occurs and F = 1, which is then inserted in Eq.8, and affects the RF's velocity.

S7. THE BURST RATE VS THE DISTANCE TO TARGET
In order to examine if the experimental tail bursts are more frequent as the distance to the target fish is larger, as assumed in our model (Eq.9), we collected all the bursts events (the minimums of the V RF ) and binned them as function of the distances (| r|) in which they occur (Fig. S6, P (burst)).The rate to give a burst is calculated as follows: we first computed the probability to be at each location (by excluding all the locations just after the bursts and until the maximum of the velocity), and then divided the probability to burst at each distance (P (burst)) by the probability to be at each distance (P (r)), and get the rate of the tail to burst at different distances: where ∆t = 0.01[s] is the time step and is the same in both the experiments and the simulations.To check whether the frequency is indeed higher at greater distances, we plot the rate (f burst ) for different distances, | r| (Fig. S6), and we concluded that the frequency to exert a burst is higher for greater distances.However, the statistics from the experiments are pretty low, due to the short duration of the experimental trajectories.Also, the probability to be at edge positions, very far from the target (above 0.07[m]) or very close to the target (below 0.02[m]), is low, and still, we may get a burst event in those positions, resulting in large fluctuations in the calculated burst rate (dividing by a small number in Eq.S7).For the fastest VF, the linear relation is clear in the experimental data, while it is not very clear for the slower moving VF.First row of plots show the probability to be at each distance, P (r), and the probability to burst at each distance, P (burst) (for the experimental data).For calculating the P (burst) we collect distances where we have a burst, then bin the distances to 100 bins, and for each of them calculate how many bursts occurred.Then we calculate P (r) -the probability that the RF spend time in each of our bins, and we excluded from it the positions during the bursts (the time between burst and next peak of speed).The middle row shows the calculated burst rate, using Eq.S7, in experiments (blue) and the model simulations (orange).The last row of plots present a linear fit to the experimental rate to burst vs the distance to the target.

S8. VELOCITY THRESHOLD
In order for the simulated tail bursts to occur in a realistic manner, we found it crucial to define the velocity threshold.Only when the RF's velocity is below the value of the threshold (0.04[m/s]), a tail burst could occur (meaning we randomize a number -the tail burst event, as described in the previous section).An example of the tail waiving model without such a threshold is presented in Fig. S7, in which it is clear that the tail bursts could happen too quickly from the previous peak without this threshold (see the purple line).We demonstrated why we chose a velocity threshold of 0.04[m/s] in Fig. S8, where are shown histograms of the experimental RF speed values at the burst events.The values of minima of the speeds (where the burst events occur) are mostly below our chosen threshold.When we tried a higher value the RF velocity dynamics didn't match the experiments. V

S9. DISTRIBUTION OF THE AMPLITUDE OF THE VELOCITY BURSTS FROM THE EXPERIMENTAL RF DYNAMICS
We extracted the difference in the experimental RF velocity between the time of burst initiation and its maximum.The average acceleration amplitude for the experimental RF (blue), simulated RF (red), and the VF (yellow).(F) The average rate of the tail bursts for the RF in experiments (blue), the simulated RF (red), and the VF (yellow).This rate is calculated as the average time duration between a burst event and the timing of the prior peak velocity.

S10. ATTENTION THRESHOLD OF THE RF
By expanding the model from 1VF to several VF, we defined a threshold (τ ) of neural firing to determine which targets contribute to the calculation of the distance to targets, Eq.9.On each iteration, the neural firing (n i ) is monitored to determine if it is above the threshold.If so, the distance to the relevant VF is included in the calculation of | r| as follows: where m equals the number of neural groups which fire above the threshold.The N is the total number of neurons in all the groups, k is the number of neural groups, and | r i | is the calculated distance of the RF to target i.
In the case of the 2VF and 3VF, we included: τ = 10%.As shown in Fig. S10 the RF's y-position behind the targets are not very sensitive to the threshold value.

S11. RF FOLLOWING 1VF
We present here extended data for the case of the RF following 1VF (complimentary to Fig. 2).For each velocity we compare the distribution of the distance between the RF and the VF along the x-axis (direction of motion of the VF, Fig. S11E), which emphasizes that the RF lags further behind the VF as the VF speed increases.In our model this trend arises, as the tail-bursts occur more rapidly as the distance to the VF increases (Eq.9), enabling the RF to chase faster VF at larger lag distances.At the fastest VF speed the model gives a lag distance that is significantly larger than the experiment.However, all other measures of the RF dynamics, as shown in Fig. S11F-H, are in reasonable agreement across all VF speeds.Although a more precise fit could be obtained by adjusting the mean burst force f 0 parameter to each V V F , we prefer to keep all the model parameters as constant, such that the model is as general as possible.E-H) show the distributions (purple for simulation and blue for the experimental data) of the RF relative position along the x-axis (E), velocity (F), peak velocity values (G) and the consecutive time between velocity peaks (H).We used the same parameters as in Fig. 2. For each VF velocity, we ran 100 simulations (in which the RF initial position was random) each for 5000[s] (500,000 iteration steps).

S12. RF FOLLOWING 2VF
We present here extended data for the case of the RF following 2VF (complimentary to Fig. 3).Note that we could postulate that the fish makes weaker burst forces (f 0 ) when the frequency of its tail bursts decreases, which could improve the agreement, but we opted to keep the model as simple as possible, and maintain that all the parameters are independent of the average VF speed.In particular, we chose a burst force amplitude that would prevent the RF from losing the fastest moving VF (0.07[m/s]), which causes it to move too close to the VF for the slower moving VF cases.).We used the same parameters as in the 1VF system (Fig. 2), except for: f0 = 1.2[m/s 2 ].When dealing with more than one VF we also use the following parameters: ν = 0.5 and σ θ = 0.00002π.For each VF velocity, we ran 100 simulations (in which the RF initial position was random) each for 5000[s] (500,000 iteration steps).

S13. RF FOLLOWING 3VF
We present here extended data for the case of the RF following 3VF (complimentary to Fig. 4).The distributions of the RF speed are in reasonable agreement between the simulations and the experiments (Fig. S13F), while the distributions of the velocity peaks and the time intervals between consecutive velocity peaks are in very good agreement (Fig. S13G,H).
FIG. S13: RF following 3VF.The LRD value for each line in (A-H) is given on the left, with the lowest two lines presenting simulations without the overlap function.Accumulated spatial RF distribution, normalized over the whole 2D space (A,B) or over the x-axis sections (C,D).(A,C) Experimental data, while (B,D) the simulation results.(E) Distributions of the RF's projected y-positions, (F) RF's speed, (G) Speed of the RF at the peaks, and (H) the time intervals between consecutive speed peaks.In (E-H) we compare the experimental data (blue) with the model (purple).The average of each distribution is denoted by the vertical line, with the corresponding color.In (A-H) the VF velocity is 0.04[m/s].We used the same parameters as in Fig. 2, except for the burst force amplitude which was changed to f0 = 0.95[m/s 2 ]. we ran 100 simulations (in which the RF initial position was random) each for 5000[s] (500,000 iteration steps).

S14. RF FOLLOWING 2VF IN SHIFTED GEOMETRY
We present here extended data for the case of the RF following 2VF in shifted geometry (complimentary to Fig. 5).This occurs at the time indicated by the fish shapes on the trajectory, and the corresponding histograms shown at this "separation" time.The RF lose the leader, form a non-polar shoal.The time duration until this separation occurs decreases with increasing number of RF, as shown in Fig. S15C (the full distributions of the separation times are given in Fig. S16).
In Fig. S15D we demonstrate that in the absence of any leader fish, our model naturally gives rise to cohesive shoal behavior, without any spontaneous alignment of the fish along a particular direction (see Supplementary Movie M10).This falls into the category of "swarm" behavior, with low global polarization of the velocity vectors [2].Note that our model ignores the finite size of the RF, which means that we do not describe the short-range maneuvers that are necessary for collision avoidance.Similarly, we ignore obstructions of one RF by another, with all the RF visible at all times.In addition, our model explicitly involves many-body interactions, that are not simply the superposition of pairwise interactions.Significant interactions beyond pairwise have indeed been measured inside fish schools [3].The average separation times (Fig. 6C) were calculated from conducting 90, 000 simulations.All simulations ran with the same parameters as in the 1VF case (Fig. 2).

S16. SUPPLEMENTARY MOVIES
The simulated RF dynamics while following 1VF, 2VF, and 3VF are shown in movies M1-M6.The many RF following 1VF target moving along a circle are shown in movies M7 (2RF), M8 (3RF), and M9 (4RF).Also, we show the many RF dynamics with no leader among them in movie M10 (5RF).All movies are in the following link: Movies • Movie M1-one RF following one VF (V V F = 0.06[ m s ]), the top plot shows the RF in the real frame of reference, and the bottom plot, the RF is simulated relative to the VF (which is in the origin).The RF is in blue and the VF is in red in both cases (parameters as in Fig. 2).
• Movie M2 -one RF (blue) following 2VF (red) in the compromise regime (LRD = 0.06[m], and V V F = 0.07[ m s ]), the right plot shows the neuronal firing in the RF brain, in which "1" represent neurons directed at the VF in y = 0.03[m], and "2" represent neurons directed at the VF in y = −0.03[m](parameters as in Fig. 3).
• Movie M3-one RF (blue) following 2VF (red) in the bifurcated regime (LRD = 0.11[m], and V V F = 0.07[ m s ]), the right plot shows the neuronal firing in the RF brain, in which "1" represent neurons directed at the VF in y = 0.055[m], and "2" represent neurons directed at the VF in y = −0.055[m](parameters as in Fig. 3).
• Movie M4-one RF (blue) following 2VF (red) in the shifted configuration (LRD = 0.03[m], V V F = 0.04[ m s ], and δ = 0.03[m]), the right plot shows the neuronal firing in the RF brain, in which "1" represent neurons directed at the VF in y = 0.03[m], and "2" represent neurons directed at the VF in y = −0.03[m](parameters as in Fig. 3).
• Movie M5-one RF (blue) following 3VF (red) in the compromise regime (LRD = 0.05[m], and V V F = 0.04[ m s ]), the right plot shows the neuronal firing in the RF brain, in which "1" represent neurons directed at the VF in y = 0.05[m], "2" represent neurons directed at the VF in y = 0[m], and "3" represent neurons directed at the VF in y = −0.05[m](parameters as in Fig. 4).
• Movie M6-one RF (blue) following 3VF (red) in the bifurcated regime (LRD = 0.10[m], and V V F = 0.04[ m s ]), the plot on the right side shows the neuronal firing in the RF brain, in which "1" represent neurons directed at the VF in y = 0.1[m], "2" represent neurons directed at the VF in y = 0[m], and "3" represent neurons directed at the VF in y = −0.1[m](parameters as in Fig. 4).
• Movie M7-2RF following 1VF (green) in a circle (of radius 0.08[m]).The neuronal activity for each RF are given in the right bar plots, where the bar color indicates a neuronal group directed at the fish with the same color (parameters as in Fig. 2).
• Movie M8-3RF following 1VF (green) in a circle (of radius 0.08[m]).The neuronal activity for each RF are given in the right bar plots, where the bar color indicates a neuronal group directed at the fish with the same color (parameters as in Fig. 2).

FIG. 2 :
FIG. 2: RF following one VF: burst-and-coast dynamics and spatial distributions.(A) (i) The RF's speed dynamics in the experiment (blue) compared with the VF velocity (purple, with an average velocity of VV F = 0.05[m/s]), and (ii) shows the simulated RF velocity (blue) compared with the VF velocity (purple).The maxima and minima of the speed are denoted by the red and pink stars respectively.(B) The speed distribution of the RF, (C) The distribution of the RF's velocity peak values, and (D) The distribution of time intervals between consecutive bursts (t peak ).In (B-D) the purple line denotes the simulation and blue line the experimental data.(E) An example of the RF trajectory behind 1VF, relative to the VF movement (located at the red dot).(F) Same as (E) but in the lab frame, where the horizontal line gives the trajectory of the VF.In both (E) and (F) time is represented by the color bar.(G-J) Accumulated distribution of the RF behind the VF, for different mean VV F (each row), in the VF frame of reference (the VF is at the origin, red dot).For each velocity, we normalized the heat maps over the whole 2D space (G and H), or over individual x-sections (I and J).(G) and (I) show the experimental heat maps, (H) and (J) the simulated results.For each VF velocity, we ran 100 simulations (in which the RF initial position was random, within a certain distance of the VFs) each for 5, 000[s] (500,000 iteration steps).We used the following parameters: T = 0.1, η = 5[s −1 ], γ = 5[s −1 ], σ = π/3[rad/s] (standard deviation of the angular noise), b = ±π[rad/s] (angular noise limits), k = 250[(m • s) −1 ], r d = 0.2[m], t of f = 0.15[s], f0 = 1.1[m/s 2 ].

FIG. 3 :
FIG. 3: Two VF -spatial distributions of the RF, as heat maps normalized over the whole 2D space (A-B, G-H) or over individual x-sections (C-E, I-K).(A,G) and (C, I) shows the experimental RF's distribution, while (B, H) and (D, E, J, K) show the corresponding simulated distributions (D,J for long simulations, while E,K for short simulations).(F, L) give the distributions of the RF's y-positions and compare the experimental data (blue) with the model (purple for the long simulation and green for the short simulations).The y-positions of the VF are denoted by the vertical Maroon lines.(A-F) show results for the compromise regime, in which LRD = 0.06m for all the VF velocities.(G-L) show the results from the bifurcated regime, in which the LRD between targets is chosen to be 0.08[m] for VV F = 0.04[m/s], 0.1[m] for VV F = 0.05[m/s], and 0.11[m] for VV F = 0.06[m/s], 0.07[m/s]).We used the identical parameters as in the 1VF system (Fig. 2), except for: f0 = 1.2[m/s 2 ].We also added the following parameters (for the neuronal tuning and the overlap function): ν = 0.5 and σ θ = 0.00002π.The long simulations statistics contain 100 simulations, each ran for 5000[s], and the short simulations statistics contains 20, 000 simulations, each ran for 7.5[s].(M) and (N) show typical trajectory of the RF (in a long simulation) in the case of the compromise regime (M), and the bifurcated regime (N), when the VF velocity is 0.06[m/s].M(i) and N(i) show the RF movement relative to the VF, and M(ii) and N(ii) show the movement in the lab frame.For the short simulation in the case of 0.07[m/s] the initial condition of xRF is randomized in the range of 0 − (−0.06)[m] behind the VF, while in all other plots it is randomized in the range of 0 − (−0.1)[m].

FIG. 4 :
FIG. 4: RF following 3VF with an average velocity of VV F = 0.04[m/s].The value for each line in (A-E) is given on the left, with the third-panel presenting simulations without the overlap function.The heat maps present the accumulated spatial RF distribution, normalized over the whole 2D space (A,B) or over the x-axis sections (C,D).(A,C) Experimental data, while (B,D) the simulation results.(E) Distributions of the RF's projected y-positions (experiments in blue and simulations in purple).The y-positions of the VF are denoted by the vertical Maroon lines.(F,G) Examples of simulated RF trajectories behind the 3VF when LRD = 0.1[m] with (F) and without (G) the overlap function.The left panels show the trajectory relative to the 3VF (red dots), and the right panels show the trajectory in the lab frame (trajectories of the 3VF are given by the horizontal lines).Time is given by the color bar.For 3VF cases, we used the same parameters as in 2VF (Fig.3), except for the burst force amplitude which was changed to f0 = 0.95[m/s 2 ]. we ran 100 simulations (in which the RF initial position was random) each for 5000[s] (500,000 iteration steps).

FIG. 5 :
FIG. 5: 2VF in a shifted geometry: one of the VF is shifted in the front-back and the left-right directions with respect to the other VF, by a distance of δ.Both of the VF swim with an average velocity of VV F = 0.04[m/s].The accumulated spatial distribution of the RF behind the 2VF, (A,B) normalized over the whole 2D space, or (C,D) normalized over x-axis sections.(A) and (C) show the results from experimental data, while (B) and (D) show results from the simulations.The VF are denoted by the red circles.The model parameters are the same as those used in Fig. 3. (E) Examples of RF trajectories relative to the 2VF (red dots), for the cases of δ = 0.03[m], and δ = 0.06[m].In both plots, the color of the line represents the time, as indicated by the colorbar, indicating that the RF can go back to follow the trailing VF.
FIG.S3: The Gaussian function of Eq.S11 representing the neuronal spread of the neurons that point to each target.The right plot (A) shows two targets configuration (2VF), and the left plot (B) shows three targets configuration (3VF).The 3VF is symmetrical around the angle θ1 directed at the central target.We chose a value of σ θ = 0.00002π for both 2VF and 3VF.The values of the normalization factor in these examples are: in 2VF -overlap function of groups 1 and 2 get 0.9805.In 3VFoverlap function of groups 1 and 2 is 0.9805 and the overlap function of group 3 equals 0.9610.
FIG. S4: The effective interactions between fish, as manifested by their acceleration as function of their position in relation to the fish they are chasing.The chasing RF is at the origin (0, 0) and the target is at different positions around it.The x-axis is the front back distance between the RF and target, y-axis is the relative left/right displacement.The turning speed and forward speed components are shown by the colormap.(A) Experimental data showing turning components of the RF's speed when chasing another RF, or a VF (B).(C) Experimental data showing forward components of the RF's speed when chasing another RF, or a VF (D).(E-F) The simulated turning (E) and forward speed (F), using the same parameters as in Fig.2.
FIG.S5: Spring Like interactions model -Heat maps of the spatial distribution of the RF chasing (A) 1VF, (B,C) 2VF and (D,E) 3VF.The right panels give the speed, and projected y-position distributions.The blue/purple line show the experimental/simulated distribution.The bottom plot show the RF dynamics in experiment (blue), and the red line the RF velocity from the spring-like simulation.For good statistics we conducted for each heat map and histogram 500 simulations, with random initial position of the RF, each for 5, 000[s] which equal to 500,000 iteration steps.
FIG. S6: Rate to give a burst as function of the distance to the VF.(A) for VV F = 0.04[m/s], (B) for VV F = 0.05[m/s], (C) for VV F = 0.06[m/s], and (D) for VV F = 0.07[m/s].First row of plots show the probability to be at each distance, P (r), and the probability to burst at each distance, P (burst) (for the experimental data).For calculating the P (burst) we collect distances where we have a burst, then bin the distances to 100 bins, and for each of them calculate how many bursts occurred.Then we calculate P (r) -the probability that the RF spend time in each of our bins, and we excluded from it the positions during the bursts (the time between burst and next peak of speed).The middle row shows the calculated burst rate, using Eq.S7, in experiments (blue) and the model simulations (orange).The last row of plots present a linear fit to the experimental rate to burst vs the distance to the target.
FIG.S7:The RF velocity: the simulated RF's velocity dynamics without the velocity threshold is shown in purple, while the RF's velocity with the threshold (0.04[m/s]) is in yellow.The experimental VF velocity is in blue.We used here the same parameters as in Fig.2(average VV F = 0.06[m/s]).
FIG. S8:The distributions of the velocity at the onset of bursts for different VF velocities.The green distributions are for simulations and the orange distributions for the experiments.The black line is the velocity threshold we chose (0.04[m/s]), below which most of the experimental bursts occur.

FIG. S11 :
FIG.S11: RF chasing 1VF.(A-D) Accumulated distribution of the RF behind the VF, for different mean VV F (each row), in the VF frame of reference (the VF is at the origin, red dot).For each velocity we normalized the heat maps over the whole 2D space (A and B), or over individual lines of constant x-sections (C and D).(A) and (C) Displays the experimental heat maps, (B) and (D) shows the simulated results.(E-H) show the distributions (purple for simulation and blue for the experimental data) of the RF relative position along the x-axis (E), velocity (F), peak velocity values (G) and the consecutive time between velocity peaks (H).We used the same parameters as in Fig.2.For each VF velocity, we ran 100 simulations (in which the RF initial position was random) each for 5000[s] (500,000 iteration steps).

FIG. S12 :
FIG. S12: Two VF -heat maps normalized over the whole 2D space (A-B and J-K) or over the x-axis sections (C-D and L-M).(A,J) and (C, L) The experimental RF's accumulated distribution, while (B, K) and (D, M) show the corresponding simulated distributions.(E, N) Distributions of the RF's x-positions relative to the 2VF, and y-positions (F, O). (G, P) Histograms of the RF speed (VRF ), (H) burst velocity peak values (P peak ), and (I) the time intervals between consecutive velocity peaks (t peak ).All panels compare the experimental data (blue) with the model (purple).The average of each distribution is given by the vertical line, with the corresponding color.The top four lines are for LRD = 0.06m, while the bottom four lines are for: LRD = 0.08[m] for VV F = 0.04[m/s], LRD = 0.1[m] for VV F = 0.05[m/s], and LRD = 0.11[m] for VV F = 0.06[m/s], 0.07[m/s]).We used the same parameters as in the 1VF system (Fig.2), except for: f0 = 1.2[m/s 2 ].When dealing with more than one VF we also use the following parameters: ν = 0.5 and σ θ = 0.00002π.For each VF velocity, we ran 100 simulations (in which the RF initial position was random) each for 5000[s] (500,000 iteration steps).
FIG. S15: Simulating groups of RF. (A, B)Examples of simulated trajectories, with (A) 2RF and (B) 3RF in the presence of a VF (green) swimming in a circle of radius 0.08[m] with a constant velocity of 0.05[m/s].The starting positions of the RF and VF are denoted by the colored circles, and the end points by the colored fish.The side panels show the spin states of each of the RF, representing the weight of the fellow RF and the VF on their direction of motion.We denote these spin states at a typical "following state" (denoted by the stars on the trajectories) where the RF were following the VF, and the "separation state" at the end of the trajectories, where the RF formed a shoal and stopped following the VF.(C) A plot of the mean time until the RF separate from the VF, as a function of the number of RF, based upon running 90, 000 simulations for each case of RF number.The mean times for 2RF,3RF and 4RF are: t(2RF ) = 9.36[s] ± 13.49[s], t(3RF ) = 6.05[s] ± 8.89[s] and t(4RF ) = 2.44[s] ± 1.96[s].(D) A typical short trajectory of five RF swimming together with no leader (colored circles denote the starting points).We used the same parameters as in Fig.2.

TABLE I :
Table of parameters used in our model.