Abstract
Biological cells estimate concentration gradients of signaling molecules with a precision that is limited not only by sensing noise, but additionally by the cell's own stochastic motion. We ask for the theoretical limits of gradient estimation in the presence of both motility and sensing noise. We introduce a minimal model of a stationary chemotactic agent in the plane subject to rotational diffusion with rotational diffusion coefficient D. The agent uses Bayesian estimation to optimally infer the gradient direction relative to itself from noisy concentration measurements. Meanwhile, this direction changes on a time-scale 1/D. We show that the optimal effective measurement time, which characterizes the time interval over which past gradient measurements should be averaged to reduce sensing noise, does not scale with the rotational diffusion time 1/D, but with the square root (rD)−1/2, where r is a rate of information gain defined as a signal-to-noise ratio normalized per unit time. This result for gradient sensing parallels a recent result by Mora et al (2019 Phys. Rev. Lett.) for sensing absolute concentration in time-varying environments.
Export citation and abstract BibTeX RIS
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
1. Introduction
Many motile biological cells navigate in concentration gradients of signaling molecules in a process termed chemotaxis [1–6]. At cellular scales, the stochastic binding of signaling molecules results in molecular shot noise and renders concentration measurements inherently noisy [7]. This sensing noise imposes physical limits on the precision of chemotaxis [7–13]. Experimental work suggests that biological cells indeed operate at these limits in shallow concentration gradients [14–18]. Temporal averaging over extended measurement intervals is a common strategy to reduce sensing noise [7, 19–21]. Yet, temporal averaging may be limited in time-varying environments [18, 22–25], or more directly by the stochastic motion of chemotactic agents themselves [26]. How should a chemotactic agent integrate previous and more recent measurements to most accurately estimate the relative direction of a concentration gradient if this direction changes stochastically in time?
Classical work by Berg and Purcell showed that the relative accuracy ⟨(δc/c)2⟩ for sensing absolute concentration c in the presence of measurement noise scales as 1/T with total measurement time T [7]. Intriguingly, Mora et al recently showed that in time-varying environments, this accuracy scales instead as (τr)−1/2, where τ is the time-scale on which the concentration changes, and r is a rate of information gain [25]. Correspondingly, the optimal effective measurement time for temporal averaging reads Teff = (τ/r)1/2 [25]. To the best of our knowledge, it is not known if similar statements hold also for the nonlinear problem of estimating the direction of a concentration gradient. Here, we develop a theory of optimal estimation of gradient direction in the presence of rotational diffusion with rotational diffusion coefficient D, and hence rotational diffusion time τ = (2D)−1. We derive results that are analogous to Mora et al, provided r is now interpreted as a signal-to-noise ratio of gradient sensing normalized by unit time.
Review of previous results. Previous work suggested that bacteria use Kalman filters to track a time-dependent concentration signal [27], providing an optimal weighting of past and recent measurements, see also [26]. Yet, Kalman filters address linear problems [28], while sensing a direction is a nonlinear problem of circular statistics, which prompts Bayesian updating of an angular likelihood distribution at each time step [29]. Bayesian estimation had been successfully applied, e.g. for monitoring time-varying environments with two states [30], estimating absolute concentration [25, 31–33], or temporal changes thereof [24, 34]. In contrast, previous theories of Bayesian sensing of direction [11, 12, 15, 35], or target position [36], did not account for motility noise as we do. In the engineering literature, estimation algorithms known as 'bearing tracking' do account for motility noise [37, 38], yet lack an analytical theory on the theoretical limits of gradient sensing.
Structure of manuscript. Our manuscript is structured as follows: we first introduce our minimal model of gradient sensing and formulate sequential Bayesian estimation, consisting of alternating update and prediction steps for a likelihood distribution of relative gradient direction. We use a suitably discretized time, and the calculus of von-Mises distributions; where von-Mises distributions can be considered the analogue of the normal distribution for direction angles. As our main result, we derive an analytic result for the ultimate precision of gradient estimation, see equation (19). We show that the estimated precision of an individual agent faithfully reflects the true accuracy of estimation errors in an ensemble of identical agents, as expected for optimal Bayesian estimation. Estimation becomes sub-optimal if the agents do not know their rotational diffusion coefficient exactly. If the agents, however, do not account for their rotational diffusion at all, gradient sensing would fail in a characteristic manner. We emphasize that our minimal model is representative of a larger class of problems, e.g. agents exploring time-dependent concentration gradients that likewise change stochastically on a characteristic time-scale τ.
2. Minimal model of spatial gradient sensing
We consider a chemotactic agent that seeks to estimate the direction of an external concentration gradient by detecting binding events of signaling molecules that diffuse to its surface.
As a minimal model, we consider a stationary agent that is positioned at the origin of a two-dimensional plane. The agent is disklike with radius a. To leading order, the external concentration gradient will be linear
Here, c0 denotes a base concentration, α = |∇c|a/c0 a normalized strength of the concentration gradient, and ec = ∇c/|∇c| is the unit vector pointing in the direction of the gradient. We introduce a material frame of the agent with orthonormal unit vectors h1 and h2, see figure 1(A). The agent seeks to estimate the direction angle ψ* of the gradient relative to the material frame, such that ec = cos ψ* h1 + sin ψ* h2. The circumference r(φ) of the agent is parameterized by r(φ) = a cos φ h1 + a sin φ h2. The rate Λ(φ) at which molecules are detected at the circumference of the agent at position r(φ) is proportional to the local concentration
where λ is a proportionality constant that depends on the diffusion constant of the signaling molecules.
The sequence of binding events at respective times θj and polar direction angles φj can be conveniently encoded in a complex signal
By definition, the expectation value of this signal equals the Fourier transform of the angle-dependent binding rate, ⟨s(t)⟩ = ∮dφ Λ(φ) exp(iφ). Generally, it is much harder to estimate a gradient direction than absolute concentration c0. Correspondingly, we focus on the problem of estimating the relative gradient direction angle ψ* and assume that the agent a priori knows c0 as well as gradient strength α.
We introduce the signal-to-noise ratio of gradient sensing for a given time interval Δt, following [39]
This signal-to-noise ratio scales linearly with Δt and defines an information rate with units of an inverse time
The information rate defines a characteristic time scale of the problem as 1/r. On times shorter than 1/r, the agent cannot gather much useful information on the gradient direction.
We now turn to a time-dependent problem and consider rotational diffusion of the agent with rotational diffusion coefficient D = Drot. The rotational diffusion coefficient defines a second time-scale τ = 1/(2D). We will show that these two time-scales, 1/r and τ, together determine the ultimate precision of gradient sensing. In the following, we consider the limit of weak rotational diffusion, D ≪ r, which is the biologically relevant case. In the opposite limit, D ≳ r, the relative gradient direction would change so fast that the agent could simply not track it.
To make analytical progress, we assume a discrete time dynamics with equidistant time points tj = jΔt. We choose the interval Δt of time discretization such that
where
Here, we introduced a time-scale Teff as the geometric mean of 1/r and τ; the time-scale Teff will later characterize the time-scale of convergence of the estimated precision of gradient sensing to its limit value (and, in fact, represent an optimal effective measurement time). Thus, the second inequality in equation (6) ensures that the time discretization is finer than the dynamics of the problem. Our choice of Δt formally corresponds to a large signal-to-noise ratio SNR = rΔt ≫ 1 for a time interval of duration Δt. At the same time, our choice ensures that the effect of rotational diffusion is small compared to sensing noise during a time interval of duration Δt, i.e. DΔt ≪ 1/SNR. Note that for a time step that is too coarse, alternating update and prediction steps would cause the estimated precision to oscillate; a small time step ensures that the amplitude of these oscillations is small.
At the times tj , the agent shall undergo discrete rotational diffusion events, see figure 1(B). Correspondingly, the relative gradient direction angle ψ* will be constant in each time interval for with , but change by a discrete amount at tj . The stochastic increments Δj are uncorrelated and satisfy ⟨Δj Δk ⟩ = 2DΔtδjk . While estimating the direction of the concentration gradient, the agent will account for its rotational diffusion, assuming that its rotational diffusion coefficient is .
During the time intervals , the agent does not rotate and integrates the stochastic signal s(t) for a time span Δt
The angle of this complex number represents a noisy measurement of the true current gradient direction angle .
3. Bayesian gradient sensing
3.1. Statistics of single measurements
What is the statistics of the measurements ? In total, the agent will detect on average ν = ∫dφ Λ(φ)Δt = λc0Δt molecules during a time interval of duration Δt. A short calculation yields for the mean and variance of
For this calculation, we considered all binding events in the time interval , enumerated by an index k with respective angles φk , where the total number νj of these events is a Poisson random variable with mean ν and variance ν. We then find , and , where we used (Δt/ν)2 ∫∫dφ dφ' Λ(φ)Λ(φ') exp i(φ − φ') = (α/2)2. Similarly, , which implies that fluctuations of in the complex plane are isotropic [40].
Our choice of Δt in equation (6) implies in particular ν ≫ 1. In this limit of large molecule counts, we can approximate the probability distribution of the measurements by a normal distribution with the same mean and the same variance Σ = ν/2 of the real and imaginary part of (diffusion approximation)
In the last step, we simply used the Pythagorean trigonometric identity. The factor of proportionality depends only on . If we interpret the last expression as a function of , it provides a first example of a von-Mises distribution with so-called measure-of-concentration ζj
. von-Mises distributions can be regarded as a generalization of the normal distribution for angular variables and are further discussed in section 3.2 and appendix
A cursory explanation for equation (12) reads , where we used equation (6) in step (*); a more rigorous derivation can be found in appendix
3.2. Estimation of directions using von-Mises distributions
The agent uses the sequence of measurements , j = 1, ..., m, to compute a likelihood distribution for the unknown gradient direction angle at time tm . In each time step, the agent performs a prediction step (accounting for rotational diffusion), and an update step (accounting for the new measurement ), see figure 1(C). For the prediction step, the agents starts with a prior that incorporates all previous measurements, with j ⩽ m − 1, and computes from it a modified likelihood distribution that accounts for the loss of knowledge due to the rotational diffusion event at tm−1. In the update step, the agent computes the new likelihood distribution using Bayes' rule, incorporating the new measurement , with the predicted likelihood as prior.
A fundamental difficulty of this estimating problem is that it concerns an angular variable, i.e. all likelihood distributions must be 2π-periodic. This sets our problem apart from other problems previously considered in the literature [25, 30].
We will approximate the different likelihood distributions by von-Mises distributions
where μ denotes the mean and κ is the so-called measure-of-concentration. The von-Mises distribution can be regarded as a generalization of the normal distribution for angular variables, where κ plays a role similar to an inverse variance. The von-Mises distribution is the maximum-entropy distribution p(ψ) that maximizes information entropy ∫dψ p(ψ)ln p(ψ) if the first complex moment ⟨exp(iψ)⟩ is specified. Importantly, the product of two von-Mises distributions is again a von-Mises distribution (up to normalization), . Here, the new mean μ and new measure-of-concentration κ satisfy κ exp(iμ) = κ1 exp(iμ1) + κ2 exp(iμ2), i.e. the new mean is a weighted circular average of the old mean values. If the old mean values are close, we have κ ≈ κ1 + κ2, which generalizes an analogous, exact statement for normal distributions. Likewise, μ ≈ (κ1/κ) μ1 + (κ2/κ) μ2. The convolution of two von-Mises distributions is only approximately a von-Mises distribution, . Here, the new mean is the sum of the old mean values, μ = μ1 + μ2, and the new measure-of-concentration is approximately the harmonic sum of the old measures-of-concentration, . The sharper the distributions are (κ1, κ2 ≫ 1), the more accurate this approximation for convolutions becomes. Appendix
We will use products and convolutions of von-Mises distributions in the update and the prediction step, respectively. The mean values of these von-Mises distribution represent the current maximum-likelihood estimates of the gradient direction angle.
3.3. Bayesian updating
We assume that just before the updating step at time tm , the agent starts with a Bayesian prior for the unknown gradient direction angle ψ*, given in terms of a von-Mises distribution with mean and measure-of-concentration κ'm−1
In light of the new measurement , the agent updates this prior according to Bayes' rule
Here, is the probability to obtain the specific measurement if the true gradient angle were ψ, i.e. the measurement model. Similarly, the normalization factor is the probability to obtain this specific measurement if the true gradient direction were distributed according to the Bayesian prior ; explicitly, .
The right-hand side of equation (15) represents a product of von-Mises distributions, which yields again a von-Mises distribution, whose measure-of-concentration shall be called κm . Our choice of time step Δt in equation (6) implies that both the mean of the first factor and the mean of the second factor are close to the true angle , provided true and assumed rotational diffusion coefficient are equal, , or at least of similar magnitude, see below. Consequently, also |μ1 − μ2| ≪ 1. The calculus of von-Mises distributions thus yields the iteration rule κm ≈ κ'm−1 + ζm , or, using equation (12)
Fluctuations of ζm cause small fluctuations of κm not considered here.
3.4. Prediction step: rotational diffusion
For rotational diffusion, the stochastic increments Δj of the relative direction angle are distributed according to a wrapped normal distribution with variance parameter 2DΔt. In the limit of weak rotational diffusion considered here, this distribution is well approximated by a von-Mises distribution with zero mean and measure-of-concentration (2DΔt)−1. The agent accounts for this stochastic dynamics of the relative direction angle in a prediction step, assuming its rotational diffusion coefficient to be (which may or may not be equal to the true value D). The prediction step thus corresponds to a convolution of two von-Mises distributions
Equation (17) is a special case of a Chapman–Kolmogorov equation, with assumed transition probability P(ψ'|ψ) from ψ to ψ' given by .
By the calculus of von-Mises distributions, we obtain again an iteration rule, now for the measure-of-concentration κ'm of the likelihood distribution after the prediction step
The prediction step does not change the maximum-likelihood angle, i.e. , since the diffusion kernel is centered at zero.
3.5. Theoretical limit of gradient sensing in the presence of rotational diffusion
Together, the nested iteration rules equations (16) and (18) define a monotonically increasing sequence of measures-of-concentration κm , which converges to a limit value κ∞ for . The measure-of-concentration κ∞ represents a theoretical limit for the estimated precision of gradient sensing at steady-state. We compute κ∞ as a fixed point of the nested iteration rules equations (16) and (18). Specifically, the limit values κ∞ = limm→∞ κm and κ'∞ = limm→∞ κ'm satisfy κ∞ = κ'∞ + SNR and ; eliminating κ'∞ yields , or . Multiplying with the common denominator and rearranging yields the quadratic equation κ∞(κ∞ − SNR) = SNR/(2DΔt), hence κ∞ ≈ [SNR/(2DΔt)]1/2 since κ∞ ≫ SNR by equation (6). We thus have
This result for κ∞ highlights the competition between information gain with rate r = SNR/Δt and information loss by rotational diffusion with rotational diffusion coefficient D. We can rewrite equation (19) as κ∞ = rTeff, where we introduced the effective measurement time Teff = (τ/r)1/2.
This theoretical limit is strict and faithful, i.e. it reflects the distribution of estimation errors in an ensemble of identical agents, as we show next.
3.6. Distribution of estimation errors
The mean of represents the maximum-likelihood estimate of the agent at time tm
. By the update step, equation (15), which expresses as a product of von-Mises distributions [and the rules for products of von-Mises distributions, see appendix
We consider the distribution p(ɛm ) of estimation errors in an ensemble of agents with . We are interested in the true accuracy of estimation, defined as the circular variance of this distribution, CV[p(ɛm )] = 1 − |⟨exp iɛm ⟩|. We have . By multiplying equation (20) with , and linearizing the imaginary part, we obtain an iteration rule for the estimation error ɛm as an affine weighted sum
Recall that is the random change of the true gradient direction angle ψ*(t) at time tm−1.
Case . We first assume that the agent knows the exact value of its own rotational diffusion coefficient, i.e. . Equation (21) represents a linear superposition of two independent random angles. In the limit of sufficiently sharp distributions, valid by equation (6), the circular variance of the superposition is a weighted sum of the circular variances of these random angles, with coefficients squared, analogous to the case of normal distributions. We thus conclude from equation (21) an iteration rule for the true accuracy
Note that is just the sensing noise of a single measurement. For agents that are aware of their rotational diffusion with , this iteration rule for the true accuracy CV[p(ɛm )] in an ensemble of identical agents, equation (22), is equivalent to the combination of update and prediction rule for the estimated precision κm of an individual agent, equations (16) and (18). Indeed, if CV[p(ɛm−1)] ≈ 1/(2κm−1) at t = tm−1, then CV[p(ɛm )] ≈ 1/(2κm ) at t = tm by a short calculation. Specifically, from the initial assumption it follows that CV[p(ɛm−1)] + DΔt ≈ 1/(2κ'm−1), hence the right-hand side of equation (22) becomes . This confirms that Bayesian estimation of gradient direction is faithful, i.e. the estimated precision of an individual agent follows, on average, the true accuracy in an ensemble of agents, as expected for optimal Bayesian estimation. As a corollary, the true accuracy converges to the same limit value as the (circular variance corresponding to the) estimated precision
Figure 2(A) shows simulation results for a case where estimated precision and true accuracy are initially different, but quickly converge to a common limit value. For details on numerical methods, see appendix
Download figure:
Standard image High-resolution imageCase . More generally, we can consider agents that assume a value for their rotational diffusion coefficient when performing the prediction step equation (18), while the true rotational diffusion coefficient is D. In this case, the estimated precision κm converges to , i.e. equation (19) with D replaced by , while the true accuracy CV[p(ɛm )] converges to a limit value CV∞ different from 1/(2κ∞).
To compute the true accuracy CV∞ in the long term limit for the case , , we note that the iteration rule for CV[p(ɛm
)], equation (22), still holds. Taking the limit m → ∞ yields a new self-consistency equation for CV∞. A short calculation yields the unique solution, see appendix
This expression attains the minimal value CV∞ = [DΔt/(2SNR)]1/2 exactly for . Equation (24) compares favorably to simulation results, see figure 3. For details on numerical methods, see appendix
Download figure:
Standard image High-resolution imageHowever, the estimation of gradient direction fails if the agent would completely neglect its rotational diffusion, corresponding to , as we show next.
Agents unaware of their rotational diffusion: D > 0, . If agents were unaware of their own rotational diffusion with , they would perform only the update step, equation (15). Correspondingly, the likelihood distribution for the gradient direction angle at time tm becomes a product of m + 1 von-Mises distributions. The calculus of von-Mises distribution yields an expression for the measure-of-concentration κm of this likelihood distribution as a circular average of all previous measurements
Here, the first term represents the contribution from the Bayesian prior, whereas the sum represents the contribution from the sequential measurements. For simplicity, we assume a flat prior with κ'0 = 0. We can rewrite equation (25) in terms of the complex measurement as . To evaluate this sum, we note the correlations between subsequent measurements, stemming from the rotational diffusion of the agent
We can now compute in a straight-forward manner, noting that the term yields a double-geometric series, see appendix
Hence, for D > 0 but , the estimated precision κm
will initially increase linearly with m, but then cross-over to the asymptotic scaling κm
∼ m1/2 after a characteristic cross-over time on the order of τ = (2D)−1. Simultaneously, CV[p(ɛm
)] → 1 for tm
≫ τ. These simple scaling relations are corroborated by numerical simulations, see figure 2(B). For details on numerical methods, see appendix
Thus, agents not aware of their own rotational diffusion will arrive at erroneous gradient estimates, because past measurements will have become partially invalidated by rotational diffusion, yet are nonetheless incorporated by the agent in its gradient estimates with full weight. Concomitantly, the estimated precisions κm that individual agents assign to their estimates of gradient direction do not reflect the true accuracy, i.e. the dispersion of estimation errors in an ensemble of agents. Individual agents are 'over-confident' of their own estimates.
Case . Finally, in the absence of rotational diffusion with , the estimated precision κm grows linearly with time as ⟨κm ⟩ ≈ mSNR = rtm , as expected. Correspondingly, the true accuracy of estimation errors in an ensemble of agents decreases inversely with total measurement time, CVm ≈ 1/(2rtm ). This is Berg-and-Purcell's classical result for the estimation of absolute concentration [7], generalized to the problem of estimating gradient direction.
4. Discussion
Summary of results. We considered a minimal model of gradient sensing in the presence of both sensing and motility noise. We analytically derived the theoretical limit of gradient sensing for a chemotactic agent that undergoes rotational diffusion. The chemotactic agent is placed in an external concentration gradient and detects stochastic binding events of signaling molecules on its circumference, while it undergoes rotational diffusion with rotational diffusion coefficient D.
We find that the accuracy by which this agent can estimate the time-varying relative direction of the gradient depends on two fundamental time-scales: the rotational diffusion time τ = 1/(2D) on which the relative gradient direction changes, and a second time-scale set by a rate of information gain r = SNR/Δt defined as a signal-to-noise ratio of gradient sensing normalized by unit time. The error ɛ between true and estimated gradient direction is distributed at steady-state with a circular variance that scales inversely with an effective measurement time given by the geometric mean of the two time-scales of the problem. Numerical simulations corroborate this scaling relation, see figure 4. For details on numerical methods, see appendix
Download figure:
Standard image High-resolution imageFor the case of optimal estimation, we assume that the agent uses sequential Bayesian estimation and has perfect knowledge of its own rotational diffusion coefficient. In this case, the precision that individual agents assign to their own gradient estimates matches the accuracy of gradient estimates in an ensemble of identical agents, i.e. gradient estimation is faithful. If the agent assumes a rotational diffusion coefficient that is either too small or too large, this is no longer the case, and the circular variance CV∞[p(ɛ)] of estimation errors will be larger than in the optimal case . If, however, the agent would fully ignore its owns rotational diffusion, corresponding to , gradient sensing will fail in a characteristic way: agents not aware of their rotational diffusion will extend their temporal averaging infinitely into the past. Concomitantly, the estimated gradient direction decorrelates from the true direction on the time-scale τ of rotational diffusion, while agents erroneously believe that their estimates become more and more precise as function of total measurement time t. We find an abnormal asymptotic scaling of the variance of gradient estimates, which decreases with time as (estimated precision), while the errors between true and estimated gradient direction converges to one CV[p(ɛ)] → 1 (true accuracy). The abnormal scaling of the estimated precision represents a signature of erroneous state estimation, and could be tested in real-world applications, e.g. of bearing tracking.
Cause of the square-root scaling of optimal measurement time. Our theory shows that the optimal measurement time for the estimation of a time-varying gradient direction scales as Teff = (τ/r)1/2, while the variance of the expected estimation error scales as 1/(rTeff) = (rτ)−1/2. This result parallels a recent result by Mora et al on the optimal estimation of an absolute concentration that varies in time [25]. In both Mora et al and our work, τ and r denote a characteristic fluctuation time-scale, and a rate of information gain of the problem, respectively.
We propose that the inverse square-root scaling of the optimal measurement time could be a general feature of Bayesian estimation of time-varying signals. The following heuristic argument highlights the common idea: if an agent averages past measurements of a time-varying signal s*(t) with an effective measurement time Teff, this agent effectively estimates , i.e. the estimate lags behind the current value of the signal. The variance σ2 of the expected estimation error between the estimated value of the signal and its true value s*(t) can be approximately decomposed into two contributions: and . The first contribution predominantly reflects sensing noise, and approximately scales as as function of effective measurement time Teff. The second contribution accounts for the stochastic change of the signal during the time Teff and approximately scales as . While the first contribution decreases with Teff, the second summand increases, thus posing an optimization problem for the optimal Teff that minimizes the total variance of the error . Solving for dσ2/dTeff = 0 yields Teff ∼ (τ/r)1/2. A detailed version of this argument can be found in appendix
Difference to bacterial chemotaxis. We emphasize that our and Mora's recent results on the optimal effective measurement time for estimation of time-varying signals are different from previous work [26]. Previously, Strong et al extended the seminal work by Berg and Purcell [7] to compute the optimal filter for bacterial chemotaxis, i.e. the optimal sensori-motor transfer function for bacteria with run-and-tumble motility in the presence of sensing and motility noise. This filter extended temporal comparison over a time span set by the rotational diffusion time, i.e. Teff ∼ D−1, (see equation (73) in loc. cit.), and additionally had an extremely long tail (see equation (70) in loc. cit.). This difference may be related to the fact that bacteria like E. coli use a chemotaxis strategy based on temporal comparison, which is different from the strategy of spatial comparison considered here: these bacteria estimate a time derivative of a concentration signal traced along their swimming path [19]. In contrast, most eukaryotic cells use spatial comparison, i.e. estimate spatial concentration gradients across the cell diameter [4, 5, 41–43], as considered here. A third chemotaxis strategy represented by marine sperm cells navigating along helical paths [3], can be mapped on the case of spatial comparison: while moving along helical paths, these cells 'visit' different sensor positions during one helical turn.
Directional memory and rotational diffusion. Biological cells performing chemotaxis possess internal memory of recent measurements of gradient direction. In eukaryotic cells with crawling cell motility, this memory manifests itself, e.g. in a persistent polarization of cell shape and the acto-myosin cytoskeleton, as well as anisotropic distributions of chemotactic effector molecules at the cell boundary [44, 45]. Stochastic remodeling of the cytoskeleton, as well as stochastic formation of pseudopods, would cause a decorrelation of cell polarization in the absence of chemotactic guidance cues. For marine sperm cells,the axis of helical swimming paths represents a consolidated memory of previous noisy steering responses [46]. In our minimal, coarse-grained model, we capture all effects that randomize direction in terms of a single effective rotational diffusion coefficient.
Biochemical implementation. Storing the likelihood distribution of estimated gradient directions, or just a proxy thereof, requires internal memory. Recent mathematical models of cellular gradient sensing such as LEGI (local excitation, global inhibition) [47], or balanced-inactivation models [48], consider distributions of chemotactic effector molecules on the cell boundary, which could serve as a proxy for a likelihood distribution of gradient direction: while the position of a concentration peak in such a distribution can represent a maximum likelihood estimate, its amplitude could encode a level of certainty. Slow surface diffusion of effector molecules would relax their distribution toward a uniform distribution in a way that is formally equivalent to a prediction step that accounts for the loss of knowledge due to rotational diffusion.
Typical parameters. Typical rotational diffusion coefficients for the bacterium E. coli are D ∼ 0.1 s−1, close to the theoretical lower limit of a passive particle of same size and shape. For ten-fold larger sperm cells, active fluctuations dominate [49], resulting in an estimate D ∼ 0.01–0.1 s−1 [40, 46]. The motility of crawling Dictyostelium cells was characterized by a persistence time of ∼5 min in the absence of chemoactractant [50], which sets an effective rotational diffusion coefficient D ∼ 0.003 s−1. For immune T cells in three-dimensional tissue, a persistence time of ∼1 min was found (displaying a characteristic speed dependence) [51]. If binding of signaling molecules to receptors on the cell surface is diffusion-limited, we can estimate the rate of binding by λc0 = 4πDc ac0 for a perfectly absorbing spherical cell of radius a, where Dc denotes the translational diffusion coefficient of signaling molecules [52]. For typical values (Dc ∼ 300 μm2 s−1, a ∼ 10 μm, c0 ∼ 1 nM), we estimate λc0 ∼ 104 s−1. Thus, for a concentration gradient of either α0 = 1%, 0.5%, or 0.1% across the diameter of a cell, we estimate a rate of information gain of r ≈ 0.5 s−1, ≈0.12 s−1 and ≈0.005 s−1, respectively, or, equivalently, a signal-to-noise ratio of gradient sensing SNR ≈ 0.5, ≈0.12, and ≈0.005 for a time interval Δt = 1 s. Assuming D ∼ 0.003 s−1 [50], our main result, equation (19), predicts for the ultimate precision of gradient sensing CV∞ ≈ 0.06, ≈0.11, and ≈0.55 for these three gradient strengths, respectively. Reversible binding of signaling molecules effectively increases sensing noise, thus decreasing the signal-to-noise ratio by a constant prefactor [8, 9, 12]. Some cells, such as sperm cells, respond chemotactically even at pico-molar concentrations [53], corresponding to even lower rates of information gain [20, 39]. For the slime mold dictyostelium, the precision of chemotactic migration is set by sensing noise at low rates of information gain (corresponding to low signal-to-noise ratios), while noise of downstream intracellular signaling [54] becomes relevant at high rates of information gain [12, 16, 17]. Our theory of Bayesian gradient estimation provides a theoretical limit for the precision of gradient sensing, even if some cells do not achieve this limit, e.g. because these cells use shorter effective measurement times.
Time-varying gradients. The minimal model of a stationary agent subject to rotational diffusion considered here represents a prototypical example for a broader class of problems: if the agent actively moves in a concentration landscape, also the absolute concentration of chemoattractant and the strength of the gradient will change as the agent moves. Thus, the agent faces two estimation problems at the same time: estimating absolute concentration, which is comparatively easy, and estimating the direction of a weak concentration gradient, which is considerably more difficult. Our minimal model decouples these two problems and studies the second, more difficult problem in isolation. Similarly, in a dynamic concentration landscape, the direction of the concentration gradient will change in time with a characteristic fluctuation time-scale τ. For an idealized case of random changes of gradient direction, this case of time-varying concentration gradients is formally equivalent to the case of rotational diffusion with rotational diffusion time τ considered here.
Data availability statement
No new data were created or analysed in this study.
Acknowledgments
MN and BMF are supported by the German National Science Foundation (DFG) through the Excellence Initiative by the German Federal and State Governments (Clusters of Excellence cfaed EXC-1056 and PoL EXC-2068), as well as DFG grant FR3429/3-1 and FR3429/4-1 (Heisenberg program) to BMF. We thank all members of the 'Biological Algorithms Group', in particular Andrea Auconi and Steffen Lange, for stimulating discussions.
Appendix A.: Numerical methods
For numerical simulations, we represented likelihood distributions using an equidistant grid for ψ with step size of Δψ = 0.01. For the case of an agent unaware of its own rotational diffusion (D > 0, ), where likelihood distributions become very sharp, we used a locally refined grid: we first identified a region of interest (ROI), where the likelihood distribution exceeded a relative threshold of , and used two equidistant grids, inside and outside of this ROI, with 3000 grid points each.
In each time step, the likelihood distributions were updated according to the update step, equation (15), and the prediction step, equation (17). We used the diffusion approximation, equation (11), to sample stochastic measurements . Drawing stochastic measurements from a Poisson distribution instead gave virtually identical results in initial simulations (same circular variance and Shannon information of measurements even for small SNR). Likewise, we confirmed that results do not change if we use a finer grid spacing for ψ.
To numerically compute the true accuracy CV∞ at steady state shown in figure 3, we simulated agents for m = 100 time steps and averaged CV[p(ɛm )] over the last 25 time steps. In all cases, we confirmed that CV[p(ɛm )] had converged and was fluctuating around a limit value.
To compute the average estimated precision in figure 4, we first determined the circular variances and of simulated likelihood distributions after update and prediction step, respectively, and then converted these circular variances to equivalent measures-of-concentration κm and κ'm by inverting CV = 1 − I1(κ)/I0(κ). Finally, is obtained as limit value of the geometric mean (determined as average over the last 25 of a total of 100 time steps, when all measures-of-concentration have converged). Similarly, to compute the true accuracy, we first determined the circular variances CV[p(ɛm )] and CV[p(ɛ'm )] of the distribution of estimation errors in an ensemble of agents after update and prediction step, respectively, converted these again to equivalent measures-of-concentration, and finally determined the limit value of their geometric mean.
Error bars determined by bootstrapping are smaller than symbol size.
Appendix B.: Expected precision of a single measurement ⟨ζj ⟩
To prove equation (12) stating , we compute the expectation value using the law of cosines. We write for the deviation of from its mean with polar representation . We thus have a triangle in the complex plane with vertices and side lengths , b = |ρ|, , where γ is the angle enclosed by the sides of length a and b. The law of cosines for this triangle reads
According to equation (11), the deviations ρ are isotropic, hence the angle γ is a uniformly distributed with p(γ) = 1/(2π). The squared magnitude of the deviation b2 = |ρ|2 follows a χ2-distribution for 2 degrees of freedom (namely, real and imaginary part of ρ, respectively); hence, p(b) = 2b/ν exp(−b2/ν).
Thus,
The first integration, , results in an elliptic integral (explicitly, , where E denotes the complete elliptic integral of the second kind, and the '+' sign applies for b < αν/2 and '−' otherwise). The second integration in equation (S2) with respect to b cannot be performed analytically. However, can be well approximated by
The second integration with respect to b can now be easily done, yielding
where the ellipses represent terms that decay exponentially fast for SNR ≫ 1 (explicitly, ).
Appendix C.: Basic properties of circular distributions
A probability distribution p(ψ) of angles should be 2π-periodic, i.e. p(ψ) = p(ψ + 2π), and normalized to one on the unit circle, i.e. . The circular variance of such a circular distribution is defined as
An important circular distribution is the wrapped normal distribution
with variance parameter σ2, whose circular variance reads . In the limit σ2 ≪ 1, CV ≈ σ2/2. The wrapped normal distribution is closely approximated by the von-Mises distribution, which is commonly used in directional statistics due to its mathematical tractability [55]
where κ is the so-called measure-of-concentration or precision, and In (κ) is the modified Bessel function of order n. The circular variance reads .
The normalized product of two von-Mises distributions, say and , is again a von-Mises distribution . Such normalized product appears, e.g. in Bayes formula, equation (15). Specifically, the mean μ and measure-of-concentration κ of the normalized product satisfy
If mean values are close, |μ1 − μ2| ≪ 1, we have the approximate sum rule κ ≈ κ1 + κ2.
In contrast, the circular convolution of two von-Mises distributions and is only approximately a von-Mises distribution [55]. To find such approximation, one can first map the two von-Mises distributions onto wrapped normal distributions of same respective mean and circular variance, and compute the convolution of these wrapped normal distributions [38]. The convolution of two wrapped normal distributions, say with variance parameters and , is again a wrapped normal distribution with new variance parameter . Finally, this new wrapped normal distribution is mapped back onto a von-Mises distribution . Thus, with μ = μ1 + μ2 and . In the limit κ1, κ2 ≫ 1, we have .
For ease of reference, we highlight the sum rules for the measure-of-concentration of either a normalized product or convolution of two von-Mises distributions
While equation (S9) is valid for κ1, κ2 ≫ 1, equation (S10) is valid for |μ1 − μ2| ≪ 1. Table C presents a summary of this section. Figure S1 graphically relates von-Mises distributions and wrapped normal distributions of equal circular variance.
Table C. Mathematical properties of normal distribution, von-Mises distribution, and wrapped normal distribution. See appendix C for details.
Normal distribution | von-Mises distribution | Wrapped normal distribution | |
---|---|---|---|
Definition | |||
Normalized | |||
product | |||
for σ1,2 ≪ 1, |μ1 − μ2|≪ 1 | |||
κ ≈ κ1 + κ2 for |μ1 − μ2| ≪ 1 | for σ1,2 ≪ 1, |μ1 − μ2|≪ 1 | ||
equations (S10) and (S8) | |||
Convolution | |||
μ = μ1 + μ2 | μ = μ1 + μ2 | μ = μ1 + μ2 | |
for κ1, κ2 ≫ 1 equation (S9) | |||
Circular | — | 1 − I1(κ)/I0(κ) ≈ 1/(2κ) for κ ≫ 1 | ≈σ2/2 for σ2 ≪ 1 |
variance CV |
Download figure:
Standard image High-resolution imageAppendix D.: Agents assuming wrong rotational diffusion coefficient
We provide details on the calculation of the true accuracy of gradient sensing in the long time limit, CV∞ = limm→∞CV[p(ɛm )], for the case that agents assume a wrong rotational diffusion coefficient .
First, we note that the iteration rule for CVm , equation (22), still holds in the case . We take the limit m → ∞ on both sides an obtain a self-consistency equation for CV∞
where we used κ'∞ = limm→∞ κ'm ≈ κ∞ − SNR. Note that κ∞, κ'∞ ≫ SNR by equation (6). In the main text, we had shown for the estimated precision κ∞ in the long time limit, , hence . Inserting this into equation (S11) gives
or after rearranging
Solving for CV∞ yields the result stated in equation (24), .
Appendix E.: Agents unaware of their rotational diffusion
We compute the second moment of the measure-of-concentration κm of the likelihood distribution for the gradient angle ψ after m measurements.
Using the calculus of von-Mises distributions, we can express κm as a circular average of all previous measurements, see equation (25), which reads . For simplicity, we assume a flat initial prior with κ'0 = 0. Using ζj = α|Mj |, we can rewrite this equation in terms of the complex measurement as
Note
and
We also note the expression for the double-geometric series with 0 < ξ < 1
Thus,
For long times, tm ≫ τ, this expression scales with m, as . This follows from exp(DΔt) ≈ 1 + DΔt and exp(−mDΔt) ≈ 0 to leading order in DΔt. For short times, tm ≪ τ, however, . Specifically, a Fourier expansion of exp(−mDΔt) ≈ 1 − mDΔt + (mDΔt)2/2 to second order yields .
Appendix F.: Optimal effective measurement time
We consider a minimal example for the estimation of a time-varying signal to illustrate the origin of the square-root scaling of the optimal effective measurement time with the fluctuation time-scale of the signal. Suppose a signal s*(t) varies stochastically with characteristic fluctuation time-scale τ, e.g. as a diffusion process with ds*/dt = ξ(t), where ξ(t) denotes Gaussian white noise with ⟨ξ(t)ξ(t')⟩ = τ−1 δ(t − t'). The agent estimates by averaging noisy measurements over a time interval Teff with measurement noise η(t) satisfying ⟨η(t)η(t')⟩ = r−1 δ(t − t'). The variance σ2 of the expected estimation error between the estimated value of the signal and its true value s*(t) can then be composed into two parts, where
An explicit calculation yields
Solving for dσ2/dTeff = 0 yields Teff = (3τ/r)1/2.
Appendix G.: List of symbols used
Symbol | Meaning |
---|---|
α | Normalized gradient strength α = |∇c|a/c 0 |
a | Sensing length scale, radius of agent |
c0 | Base concentration |
c.c. | In equations: add complex conjugate of last term |
CV | Circular variance of an angular distribution p(ψ), CV = 1 − |⟨ exp iψ⟩| |
ɛm | Estimation error of individual agent at time t m , |
Δj | Random rotation of agent at time t j |
Δt | Time step, cf equation (6) |
D | Rotational diffusion coefficient of agent (with units of inverse time) |
Value of D assumed by agent | |
ec | Gradient direction vector ec = ∇c/|∇c| |
h1, h2 | Orthonormal material frame vectors of agent |
κm | Measure-of-concentration of likelihood distribution after m measurements |
κ∞ | Limit value of κm , representing ultimate precision |
κ'm | Measure-of-concentration of likelihood distribution after prediction step |
λ | Proportionality factor between rate of molecule detection and local concentration |
Λ(φ) | Rate of molecule detection at polar angle position φ |
Likelihood distribution of estimated gradient direction at time t m after Bayesian updating | |
Likelihood distribution of estimated gradient direction at time tm before Bayesian updating | |
μ | Expectation value of measurements |
Measurement for time interval , cf equation (8) | |
ν | Expected total molecule count during time Δt |
ψ*(t) | True gradient direction angle relative to material frame of agent |
Constant value of ψ*(t) during time interval | |
P(M|ψ) | Probability for measurement given direction angle ψ |
r | Rate of information gain, r = SNR/Δt, cf equation (5) |
s(t) | Complex signal of directional binding events , cf equation (3) |
Σ | Covariance of measurement |
SNR | Signal-to-noise ratio of gradient measurements, cf equation (4) |
Δt | Time step (duration of measurement intervals) |
tj | Discrete time points tj = jΔt |
Measurement intervals | |
τ | Rotational diffusion time τ = 1/(2D) |
von-Mises distribution for variable ψ with mean μ and measure-of-concentration κ | |
Wrapped normal distribution for ψ with mean μ and variance parameter σ2 | |
ζj | Measure-of-concentration of update kernel ; note ⟨ζj ⟩ = SNR |