This site uses cookies. By continuing to use this site you agree to our use of cookies. To find out more, see our Privacy and Cookies policy.
Paper The following article is Open access

Stochasticity of infectious outbreaks and consequences for optimal interventions

, , and

Published 26 August 2022 © 2022 The Author(s). Published by IOP Publishing Ltd
, , Fundamental Approaches Towards Predictive Epidemic Modelling Citation Roberto Morán-Tovar et al 2022 J. Phys. A: Math. Theor. 55 384008 DOI 10.1088/1751-8121/ac88a6

1751-8121/55/38/384008

Abstract

Global strategies to contain a pandemic, such as social distancing and protective measures, are designed to reduce the overall transmission rate between individuals. Despite such measures, essential institutions, including hospitals, schools, and food producing plants, remain focal points of local outbreaks. Here we develop a model for the stochastic infection dynamics that predicts the statistics of local outbreaks from observables of the underlying global epidemics. Specifically, we predict two key outbreak characteristics: the probability of proliferation from a first infection in the local community, and the establishment size, which is the threshold size of local infection clusters where proliferation becomes likely. We derive these results using a contact network model of communities, and we show how the proliferation probability depends on the contact degree of the first infected individual. Based on this model, we suggest surveillance protocols by which individuals are tested proportionally to their degree in the contact network. We characterize the efficacy of contact-based protocols as a function of the epidemiological and the contact network parameters, and we show numerically that such protocols outperform random testing.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

The current SARS-CoV-2 pandemic caused millions of deaths and severely affected global health and economy [1]. Before population-wide immunity is achieved, non-pharmaceutical interventions (NPIs) are required to suppress or mitigate the transmission of the infections [24]. Some measures can easily be implemented and cause only a small burden for the population, such as the usage of protective masks or contact tracing strategies. Other measures, such as massive testing, lockdowns and social distancing, can be costly and of large impact for the population [5]. Core institutions such as hospitals, schools, and food supply and producing plants must continue functioning in order to satisfy the basic needs of communities [6]. Under these circumstances, NPIs can be combined in order to avoid local outbreaks. On the one hand, protective measures are implemented to reduce the so-called initial effective reproductive number R0, i.e., the average number of new infections caused by an infected individual during its infectious period, slowing down the initial spread of the virus. Additionally, given the high fraction of asymptomatic infected individuals reported for SARS-Cov-2 [7, 8], surveillance protocols based on regular testing and contact tracing can be performed to detect and control the silent spread of the virus. However, RT–PCR tests can be costly to perform at high coverage even in small populations. Alternative testing protocols, including pool testing [9] and the use of rapid tests, have been developed to tackle this problem. All such testing strategies respond to the stochastic dynamics of local outbreaks; this link is at the core of the present paper.

In the first part, we discuss an epidemiological model for outbreaks that relates key local characteristics to observables of the underlying global epidemic. Starting from the well-known deterministic susceptible–infected–recovered (SIR) model, we build a more realistic model by integration of three important features. First, the infection dynamics of highly contagious pathogens is often delayed by an incubation period, which is accounted for by extending the epidemiological dynamics to a so-called susceptible–exposed–infected–recovered (SEIR) model [10, 11]. Second, the infection, incubation, and recovery processes are intrinsically stochastic across different individuals; that is, the epidemiological model becomes stochastic. Third, the infection risks of individuals are highly disperse because of their contact intensity; this generates an additional stochasticity the infection dynamics that can be captured by a contact network model for local communities [12, 13]. To characterize such networks and the resulting stochastic dynamics, we study the degree distribution ${\hat{p}}_{k}$, i.e., the probability that a given individual interacts with k other individuals in the network. This approach allows for treatable analytical calculations and captures broad network features accessible prior to an outbreak. It differs from disordered out-of-equilibrium models that exploit the unidirectionality of an epidemic propagating in a contact network and have been used to infer the origin of the outbreak from posterior information [1416].

We use the fully stochastic model to describe statistical characteristics of local outbreaks that are important for monitoring and interventions. We compute the proliferation probability for outbreaks starting from a single infected individual, where proliferation is defined as the transition to deterministic growth; the complementary fate of an outbreak is extinction due to stochastic fluctuations. Our analysis extends previous results of epidemic growth in well-mixed populations [1719] and on contact networks [13, 20, 21]. Specifically, we show how the local outbreak statistics can be estimated from global epidemiological parameters, including the growth rate and the average incubation and infection periods. Our analytical results are tested by individual-based stochastic simulations of outbreaks on random contact networks [12].

In the second part, we explore surveillance protocols based on random testing of a fraction of the total asymptomatic population that can be implemented in a straightforward way. Recently, different studies have considered strategies of random testing in large and well-mixed populations [22], where the interaction network between individuals is unknown. Other studies have highlighted the importance of the underlying structure of interactions in the population for the implementation of efficient containment strategies [2325] and of optimal surveillance protocols to protect patients and healthcare workers [8, 26]. In this work, we study the consequences of infection and contact stochasticity on the optimal implementation of surveillance protocols. We propose strategies to improve such protocols, in order to maximize the probability of detection of infected individuals and to minimize the time before detection and the expected epidemic size. We present a novel network-based protocol that outperforms a protocol based on uniform random sampling of the population. We perform simulations of the surveillance protocol and test its performance numerically.

2. Epidemiology of local outbreaks

2.1. Deterministic epidemiological model

The most commonly used model in epidemiology is the classical SIR model [10, 11]. It has been used to study the behavior of epidemics because it captures most of the general features of the dynamics and requires few parameters. The SIR model consists of a population of individuals, each of which can be susceptible (S), infected (I) or recovered (R). In the following, we use an extension of the SIR model, the so-called SEIR model, which contains an additional compartment of exposed (E) individuals. The exposed compartment contains infected individuals which are not yet able to transmit the infection, because the pathogen is in an incubation or pre-contagious phase. The SEIR model has previously been used to model epidemics on contact networks in the context of the 2005 SARS-CoV outbreak [27] and to estimate the turning period of SARS-CoV-2 outbreaks [28].

The deterministic SEIR dynamics is governed by a set of differential equations,

Equation (1)

Equation (2)

Equation (3)

Equation (4)

These equations describe the infection dynamics in a well-mixed population of constant size N. Susceptible individuals get exposed at a rate proportional to the fraction of infected individuals with a transmission rate parameter β. Exposed individuals become infectious at a constant rate σ; this delay represents intra-host growth of the viral population up to a level where transmission to a next host becomes likely. Infected individuals recover at a constant rate γ (figure 1(a)). In the model tuned to SARS-CoV-2, we use estimates of the average pre-contagious period (τσ σ−1 = 4 d) and infectious period (τγ γ−1 = 6 d), and we choose the parameter β from the interval 0.075–0.75 d−1 [1, 2931]. A further, frequently used parameter to characterize epidemic growth is the so-called reproductive number, which is defined a the average number of transmissions from a single individual during its infectious period. In the well-mixed SEIR model, the reproductive number is simply related to the basic rate parameters, R0 = β/γ [17]; however, this relation is no longer valid on a contact network.

Figure 1.

Figure 1. Deterministic dynamics of the SEIR model. (a) Schematic of the infection dynamics between four states: SEIR individuals. The transition rates between these states, β, σ and γ, enter the dynamical equations (1)–(4). (b) Epidemic growth of exposed and infected populations. After an initial phase, both populations grow exponentially at the same rate λ, as given by equation (5) (lines: analytical solution in the exponential growth regime, SN; dots: numerical simulations). The corresponding SIR model at the same value of R0 is shown for comparison (dashed lines); the simulation results show the onset of saturation by depletion of susceptible individuals. Parameters: β = 0.25, γ = 1/6, σ = 1/4 (for SEIR).

Standard image High-resolution image

We are interested in the first stage of the dynamics, where the approximation SN is valid and the number of infected individuals grows exponentially with rate λ, as given by

Equation (5)

(see methods). Here we have used R0 instead of β as the independent parameter, because the relation (5) remains valid on a contact network (see below). The reproductive number delineates the regimes of deterministic growth (λ > 0 for R > 1) and decline (λ < 0 for R0 < 1).

In figure 1(b), we show the deterministic growth of an SEIR epidemic starting from a single infected individual. After a brief initial period, the infected and exposed populations grow at the same rate λ and at comparable relative size. Because incubation slows down growth, this rate is lower than for a SIR model at the same value of R0 = β/γ, which is given by λ(R0, γ) = (R0 − 1)γ (dashed lines in figure 1(b)). In other words, in the deterministic dynamics, a sizeable exposed phase has a deleterious fitness effect for the pathogen. In the stochastic theory, as we discuss below, the exposed phase also generates a beneficial effect, because it reduces fluctuations leading to early extinctions.

2.2. Stochastic infection dynamics

Given initially low numbers of exposed and infected individuals, stochasticity plays an important role in the transmission dynamics of local outbreaks. In figure 2, we plot time courses of the number of infected individuals from simulations of the SEIR model in a well-mixed population. Some of these time courses reach only small transient population sizes and then go extinct (red lines), the remainder proliferate to deterministic, initially exponential growth (blue lines). Three characteristics of these stochastic dynamics are readily recognized. First, the fraction, duration, and population size of extinction-bound time courses strongly depends on the reproductive number. Below, we calculate the proliferation probability Qp(R0), given the initial condition of a single infected individual. Second, the fraction of extinction-bound trajectories decreases with increasing population size. Beyond a threshold n*(R0), the so-called establishment size (black dashed lines in figure 2), proliferation becomes the more likely fate. The establishment size will also be calculated from a suitably extended expression of the extinction probability. Third, in outbreaks bound for proliferation, the average number of infected individuals (thick blue lines) grows initially faster than expected from the deterministic dynamics (blue dashed lines). This initial boost is a posterior effect of sampling only time courses leading to epidemic growth; it is more pronounced for values of R0 close to 1 (figure 2(b)). In this regime, typical stochastic trajectories reach the epidemic regime (black dashed lines) several days earlier than expected from the deterministic model. These stochastic initial-time effects can drastically affect the efficiency of monitoring and control strategies.

Figure 2.

Figure 2. Stochastic dynamics of the SEIR model. Time courses of the number of infected individuals are shown for outbreaks bound for extinction (red lines) and for proliferation (blue lines), together with the ensemble average of proliferation-bound outbreaks (thick blue lines) and the corresponding deterministic dynamics (dashed black lines). A threshold number of infected individuals, the so-called establishment size n*(R0) (dashed green lines), separates regimes predominantly extinction-bound and predominantly proliferation-bound time courses. The dynamics is shown for two reproductive numbers: (a) R0 = 4.0, (b) R0 = 1.2. With decreasing R0, the establishment size increases and saturation effects (S < N; see methods) cap the regime of exponential growth (dotted lines). Other simulation parameters: N = 5000, σ = 1/4, γ = 1/6, (a) β = 0.66, (b) β = 0.19.

Standard image High-resolution image

To study the stochastic dynamics of the SEIR model we assume that, at a given time, each individual j in the populations is in a state xn ∈ {s, e, i, r}. The state of an individual at time t + dt, given the state at time t, is governed by the transition probabilities

Equation (6)

Equation (7)

Equation (8)

In a well-mixed population, the transition rates are uniform across individuals and equal the deterministic rates in equations (1)–(4).

The proliferation probability of the well-mixed SEIR model can readily be computed from a branching process approximation with constant linear birth and death rates of infected individuals, which is appropriate in the exponential growth regime (SN). The well-known result for the SIR model, 1 − Qp = 1/R0 carries over to the SEIR model, because the probability of infections is independent of the incubation step [17, 18].

To compute the establishment size, we evaluate the conditional proliferation probability Qp(R0, n), given that the infected population has already reached a size n. This probability is a decreasing function of n, interpolating between the initial value Qp(R0, n) = Qp(R0) and near-certain epidemic growth, Qp(R0, n) ≃ 1, of sufficiently large infected clusters (figure 2). We obtain ${Q}_{\text{p}}({R}_{0},n)=1-{\left(1/{R}_{0}\right)}^{n}$, assuming that extinction from an initial state of n infected individuals is equivalent to n independent extinction processes, each of them starting from a single individual. The establishment size n*(R0) is then obtained from the condition Qp = 1/2, which gives n*(R0) = log 2/log R0 (black dashed lines in figures 2 and 4(b) and 6(b)).

2.3. Infection dynamics on a contact network

Multiple stochastic factors govern the initial stage of an outbreak [13, 3234]. In addition to the intrinsic stochasticity of the dynamics given by equations (6)–(8), there is heterogeneity in the initial condition of patient zero. Most studies have addressed differences in the reproductive number R0 across the population, which can be caused by a complex mixture of host, pathogen and environmental factors [32]. Here, we consider stochastic effects generated by a network of contacts that controls the interactions between individuals and translates into an effective variation of infectiousness (figure A1(b)). This (undirected) network is characterized by a broad degree distribution $\hat{p}(k)\sim {k}^{-\alpha }$ [12], which gives the probability that an individual in the network has k contacts that can become transmission channels. While the degree distribution ignores the specific topology of the network, it captures the effect of largely connected nodes or hubs on the transmission statistics. Hubs play an important role in super-spreading events, which turns out to be the most relevant network feature for the degree-based testing protocol that we discuss below.

Given that infections take place between neighboring nodes of the contacts of the contact network, the SEIR transition probability from susceptible to exposed is given by

Equation (9)

where Ij is the number of infected contacts (neighboring nodes) of node j and $\left\langle \cdots \,\right\rangle $ denotes averages over the degree distribution $\hat{p\,}(k)$. This local process replaces the infection step of equation (7) in a well-mixed population. The normalization factor ⟨k⟩ in equation (9) ensures that in the limit of large ⟨k⟩ at constant β, the network dynamics converges to the well-mixed case.

In figure 3, we compare the SEIR infection dynamics on a contact network with the dynamics in a well-mixed population at the same parameters β, γ, σ. In the parameter regime shown, the epidemic on a network grows faster; most strikingly, the regime of positive growth extends well below the growth threshold β/γ = 1 in the well-mixed case (figure 3(a)). Consistently, the proliferation probability Qp remains positive in the same regime; for larger values of β/γ, however, proliferation is less likely on a network than in a well-mixed population. This pattern indicates two opposing effects of the contact network on epidemic growth. First, the epidemic spreads preferentially on high-connectivity nodes, which enhances the effective transmission rate. Second, infections deplete susceptible individuals specifically in the neighborhood of an infected node, leading to saturation effects at larger reproductive numbers.

Figure 3.

Figure 3. SEIR epidemics on a contact network, compared to a well-mixed population. (a) Scaled growth rate, λ/γ, and (b) non-proliferation probability, 1 − Qp, on a network of contacts (purple) and in a homogeneous, fully connected system (red). Analytical results for the SEIR dynamics (lines) are compared to simulation results (symbols). Other simulation parameters: N = 2000, α = 2.68 (network), σ = 1/4, γ = 1/6, β = 0.075, 0.1, 0.12, 0.15 (network), 0.17, 0.22, 0.27, 0.32, 0.42, 0.58, 0.75 (network and well-mixed system).

Standard image High-resolution image

2.4. Contact-dependent super-spreading

Most importantly, the proliferation probability of outbreaks on a contact network is no longer uniform but depends strongly on the contacts of the first infected individual (patient zero). In figure 4(a), we plot Qp as a function of the degree of patient zero, k0, and the reproductive number. Similar to the population threshold defined before, we define a threshold degree ${k}_{0}^{\ast }({R}_{0})$, such that ${Q}_{\text{p}}({R}_{0},{k}_{0}^{\ast })=1/2$ (dashed line). As expected, highly connected seeding nodes in the network are more likely to produce epidemics than nodes with low connectivity. This is the so-called super-spreader effect: most of the epidemics are seeded by highly connected individuals. The effect is most pronounced for small values of R0. Even for β/γ ≲ 1, individuals with connectivity ≳10-fold above average are likely to seed epidemics. These rare events correspond to infections seeded in a connectivity hub in the network. Conversely, individuals that have a connectivity smaller than the average connectivity in the population have an extinction probability close to 1 in the same regime. In the next section, we will use Qp(R0, k0) as a weighting factor for testing protocols.

Figure 4.

Figure 4. Stochastic outbreaks on contact networks. (a) Degree-dependent proliferation probability Qp(R0, k0) and threshold degree ${k}_{0}^{\ast }({R}_{0})$ (dashed line); simulation outcomes for various points in parameter space (squares) are compared to analytical predictions (background coloring). (b) Conditional proliferation probability Qp(R0, n) and establishment size n*(R0) (dashed line). The top scale gives the scaled growth rate λ/γ related to R0 by equation (5). Simulation parameters: N = 2000, α = 2.68, σ = 1/4, γ = 1/6.

Standard image High-resolution image

A similar pattern is observed in the conditional proliferation probability Qp(R0, n), which depends on an intermediate outbreak size n as defined above (figure 4(a)). This similarity is intuitively clear: an individual of degree k0 generates, after one transmission step, an infected cluster of size n proportional to k0.

In the remainder of this section, we sketch the derivation of the above results from the outbreak statistics on networks (more details are given in methods; readers interested primarily in the application to contact protocols may skip this part). First, following reference [13], we define the basic building block of the epidemic dynamics on a network: the probability T that an infected individual transmits the infection to a given contact; in other words, the probability that the infection spreads along a given branch of the network. This probability determines the reproductive rate on the network,

Equation (10)

Here ⟨⋯⟩ denotes averaging over the degree distribution $\hat{p}(k)$ and 《⋯》 averaging over the nearest-neighbor degree distribution ${\hat{p}}_{2}(k)=k\hat{p}(k)/\left\langle k\right\rangle $. The resulting growth rate in the exponential regime, λ(R0, γ, σ) is then given by equation (5).

The branch transmission probability T, which has been used as an independent parameter in previous work, has no direct analogue in a well-mixed system. However, we can express T in terms of the rate parameters β and γ,

Equation (11)

where we have used equation (9) (see methods). Together with equation (10), we obtain

Equation (12)

this relation compares the reproductive number on a contact network with the corresponding well-mixed system, which has R0 = β/γ. It displays the two network effects discussed above: preferential spreading on high-degree nodes and saturation of susceptible contacts (figure 3(b)). In the well-mixed model limit, (⟨k⟩ ≫ 1 with Var(k) = o(⟨k2)), we get R0β/γ as expected.

Next, we use the branch transmission probability T to compute the probability that an outbreak proliferates, given that it originates on a node of degree k0. Again, a previous result for the SIR model [13, 20] carries over to the SEIR model, because incubation delays growth but does not affect the probability of transmission. The proliferation probability takes the form

Equation (13)

Here qb is the probability that a sub-outbreak originating on a given branch of the network does not proliferate; this probability can be obtained from a self-consistent summation procedure (reference [13], see methods). The product on the rhs of equation (13) says that transmissions along each of the k0 branches originating from patient zero are statistically independent sub-outbreaks. The relation between T and R0 is given by equation (10).

The global proliferation probabilities are then given by averages over the contact network [13, 20]. We obtain

Equation (14)

and

Equation (15)

The product on the rhs of equation (15) says that transmissions originating from each of the n initially infected individuals are statistically independent sub-outbreaks; this form is analogous to equation (13). A given individual has been infected by one of its contacts and has (k − 1) contacts left to transmit the infection further. We note that these expressions take into account the contacts of typical individuals in the outbreak rather than individual contacts of specific individuals; for this reason, the average is taken using the nearest-neighbor degree distribution ${\hat{p}}_{2}(k)$ [13]. As before, the conditional proliferation probability determines the establishment size of epidemics in a contact network, n*(R0), by the condition Qp(R0, n*) = 1/2. Below, we use this function to characterize the performance of surveillance testing protocols.

2.5. Inference of outbreak statistics from empirical data

The structures of the epidemiological model and of the contact network have important implications for the analysis of empirical data, in particular, for predicting the likelihood of future local outbreaks.

First, the reproductive number R0 is often inferred from the observed growth rate λ. In this procedure, we have to take into account infection and incubation periods, as given by the SEIR relation in equation (5). Using the corresponding SIR relation, which neglects incubation, the reproductive number can be drastically underestimated (figure 5(a)).

Figure 5.

Figure 5. Inferring reproductive number and proliferation probability. Inference procedures can be based on three observed parameters: growth rate, λ, and average incubation and infectivity periods, τγ and τσ . (a) At given parameters, the SEIR model has a higher R0 than the SIR model. (b) At given parameters, the network model has a lower proliferation probability than the well-mixed model.

Standard image High-resolution image

Second, given a correctly inferred value of R0, the contact network shapes the corresponding proliferation probability of local outbreaks, as given by equations (13)–(15). Using the naive expression for a well-mixed system, Qp = 1/R0, proliferation can be substantially overestimated (figure 5(b)); the same applies to the conditional probabilities Qp(R0, k0) and Qp(R0, n). The correct inference of R0 and of proliferation probabilities is a crucial step in the surveillance and containment protocols discussed below.

3. Surveillance of local outbreaks

At early stages of an outbreak, the number of infected individuals can grow substantially before a first individual, if any, develops symptoms. For this reason, a surveillance strategy based only on symptomaticity will be suboptimal. Here we propose a surveillance strategy based on a daily random testing protocol of a fraction of the total asymptomatic population to detect infected individuals and prevent epidemics.

We start by considering a simple protocol that consists of a daily uniform sampling of m random asymptomatic individuals from the total population. Sampled individuals are removed from the testing pool for the next τσ days (figure 6(a)). We choose the time period of no replacement because if an individual is tested negative and gets infected on the next day, that individual will not test positive for the next τσ days on average. After this period of time, tested individuals are included in the testing pool again.

Figure 6.

Figure 6. Surveillance by random testing. (a) Daily random testing protocol. A constant number of individuals are chosen randomly everyday for testing. In addition, individuals that are selected, are not replaced in the sampling pool for the next τσ days (light blue dots). The outbreak starts with one infected individual (red dots) at day one. (b) Performance of the random testing protocol in terms of the infectious cluster size at the time of a first detection. Black horizontal lines show the establishment threshold n*.

Standard image High-resolution image

To evaluate the performance of the monitoring protocol, we focus on the average cluster size of infected individuals at the moment of detection and compared with the cluster size establishment limit defined above. Figure 6(b) shows that the initial reproductive number is a crucial parameter if we want to detect outbreaks before they are likely to proliferate. For an initial reproductive number close to 1, even for the smallest sampling size considered here (1.8%), the average outbreak size at the first detection is smaller than the establishment size. However, for values of R0 ≈ 2.0, a sampling size larger than 5% is required to achieve containment. For larger values of R0, where the establishment size is approximately 1 individual (for R0 > 2.0), other criteria need to be considered to quantify the performance of the surveillance protocol. For example, the time of first detection is crucial to minimize the epidemic size.

3.1. Contact-dependent monitoring protocol

As mentioned above, the degree of the seeding node has a big impact on the probability of having an epidemic. Using Bayes' rule, we calculate the probability $\hat{p}({k}_{0}\vert \mathrm{p}\mathrm{r}\mathrm{o}\mathrm{l}\mathrm{i}\mathrm{f}\mathrm{e}\mathrm{r}\mathrm{a}\mathrm{t}\mathrm{i}\mathrm{o}\mathrm{n})$ that an epidemic originates from a node of degree k0,

Equation (16)

We upgraded the testing surveillance by using the weight function $\omega ({k}_{0})\equiv \hat{p}({k}_{0}\vert \mathrm{p}\mathrm{r}\mathrm{o}\mathrm{l}\mathrm{i}\mathrm{f}\mathrm{e}\mathrm{r}\mathrm{a}\mathrm{t}\mathrm{i}\mathrm{o}\mathrm{n})$ for the random sampling of the protocol. With this weighting, we test highly connected individuals more often than poorly connected ones (figure A2 in methods).

We find an appreciable improvement of the degree-based testing protocol compared to the uniform testing protocol. The performance is better both in terms of the cluster size of the outbreak (figure 7) and in the time of detection (see figure A3 in methods). Consistently, the improvement is smaller for larger values of R0, where the effects of the network are less significant.

Figure 7.

Figure 7. Degree-based random testing protocol. Performance of the degree-based random testing protocol relative to the uniform protocol in terms of infected cluster size n at the time of detection; we show the case of individuals interacting (a) under a network of contacts and (b) in a well-mixed population as a control.

Standard image High-resolution image

Figure 7 shows the ratio between the cluster size at the moment when the first infected individual is detected in the degree-based sampling protocol, nk , and in the uniform sampling protocol, nU. Figure 7(b) shows that when the population is well-mixed, there is no significant difference between the two protocols. However, when the network of contacts regulates the interactions between individuals, as shown in figure 7(a), the expected cluster size is reduced in all the range of parameters that we explore. This reduction is greater for smaller values of R0, reaching values larger than 20%. We also measure a greater improvement for smaller sample sizes in the protocol, consistent with the fact that with larger uniform random samples it is more likely to detect infectious hubs that drive the proliferation.

4. Discussion

To prevent large epidemics of infectious diseases, it is important to properly understand the dynamics at early stages of infectious outbreaks. When the average pre-infectious and incubation periods are similar to the average infectious period, it is important to consider a SEIR model rather than an SIR model. The early stage of the outbreak, when SN, requires an appropriate stochastic description of the dynamics, where extinction and establishment events arise as novel features compared to a deterministic description. The fact that an outbreak can go extinct can be used to set up NPIs under limited resources or logistic capacities, like the testing protocol presented here. Specifically, as we focus our attention in the regime of few exposed or infected individuals emerging from a single new infection within a small population, the establishment size of an outbreak, as defined above, is a good estimate for the maximum number of infected individuals that can be tolerated before detection of the outbreak. For different regimes of an epidemic, some of our approximations are no longer valid. For instance, we do not consider the case when the number of infections grows to the same order as the size of the population. In this case, the proliferation probability becomes large and the surveillance testing protocol is no longer a suitable strategy for containment.

We have shown that the social connectivity structure is important for efficient monitoring of infectious outbreaks. The variability of the effective reproductive number introduced by a contact network in the population considerably impacts on the outbreak dynamics. More specifically, a heavy-tailed degree distribution creates a substantial super-spreader effect, by which highly connected individuals are more likely to start outbreaks that reach establishment than poorly connected individuals. The super-spreader effect has been extensively studied in the literature [31, 32, 3436]. Here we establish quantitative relations between degree distribution of the contact network and proliferation probabilities of outbreaks that are relevant for pre-emptive monitoring. Specifically, we obtain a connectivity regimes for patient zero where an outbreak is likely to proliferate or not.

Based on the contact structure of the population, we show that a degree-based random testing protocol outperforms an uninformed uniform testing protocol (but we do not claim this protocol is optimal in an absolute sense). The difference in performance depends on the specific parameters of the pandemic spread, which also has important consequences for the global control of pandemics. In practice, the implementation of a degree-based random testing protocol requires a proper estimation of the connectivity of individuals, which can be done by low-cost methods in terms of resources and logistics [3740]. Moreover, the posterior collection of combined data of contact networks and the reconstruction of infectious outbreaks in the corresponding structured population [1416] could be used to validate our sampling protocol. Together, this could help to better understand the sources of stochasticity in local outbreaks and to develop better specific NPIs.

Acknowledgments

This work has been supported by the Deutsche Forschungsgemeinschaft (DFG) through the Collaborative Research Center 1310, Predictability in Evolution. We thank M Meijers and D Trimcev for a careful reading of the manuscript and R S McGee for making Seirsplus available.

Data availability statement

The data that support the findings of this study are openly available at the following URL/DOI: https://github.com/betoto008/epidemics_testing.

Appendix A.: Methods

A.1. Deterministic growth

We use a classical epidemiological model consisting of susceptible (S), exposed (E), infected (I) and recovered (R) individuals. Exposed individuals are infected individuals that do not test positive and cannot infect other individuals yet. The total population size is N. Here we consider the deterministic dynamics of the system described by the ODE system of equations (1)–(4). We are interested in the first stage of the dynamics, where the approximation SN is valid. With this approximation, and assuming the initial conditions S(t) = N − 1, E(t) = R(t) = 0 and I(t) = 1, equations (2) and (3) can be written as

Equation (A.1)

Equation (A.2)

The solution of the system of equation (A.2) is given by

Equation (A.3)

where the constant C is determined by the initial condition and λ+ and λ are the eigenvalues of the matrix

Equation (A.4)

that are given by

Equation (A.5)

As soon as the outbreak is seeded, the time evolution of E and I is lead by the first term in equation (A.3) because λ < 0. With the initial conditions E(0) = 0 and I(0) = 1, equation (A.3) can be written as

Equation (A.6)

As mentioned above, we decided to include the exposed compartment of the population because it affects dramatically the dynamics of the infected individuals (figure 1(b)). More concretely, for SARS-Cov-2 the estimated average time that takes an exposed individual to become infected τσ is in the same order of magnitude of other relevant time-scale, namely the average infectious period τγ [1]. Additionally, under these circumstances the classical relation between the exponential growth rate and the parameter R0, the initial reproductive number, defined as the quotient between the infectious rate and the recovery rate β/γ, need to be revisited.

The approximation SN is only valid when few individuals are infected and breaks down when the infection approaches its expected peak. From equations (1)–(4), it is possible to derive an expression for the number of infected individuals in the peak. Using the condition $\dot{I}=0$ and the constrain of constant total population size N = S + E + I + R, we have that

Equation (A.7)

We expect then deterministic exponential grow when I(t) < Ipeak. In figure 2(a), for the case of β/γ = 4.0, the peak is reached at Ipeak ≈ 1210 individuals, and for that reason we see that the average epidemic trajectory grows exponentially for the whole range shown. On the other hand, in figure 2(b) for the case β/γ = 1.3, the peak is reached at Ipeak ≈ 86. Therefore, an exponential growth is observed only for a shorter period and the approximation made before is invalid for the later part.

A.2. Proliferation on a contact network

We generate the random network of contacts by following a preferential attachment method [12]. Such a method produces graphs with heavy-tailed degree distribution $\hat{p}(k)\sim {k}^{-\alpha }$, like the one shown in figure A1(a). Under these circumstances, most of the individuals have few number of contacts close to the mean degree and a non-negligible amount of individuals have a large number of contacts. The degree of the highly connected individuals can reach values up to two orders of magnitude larger than the mean degree. In such case, a finite variance is guaranteed by a cutoff in the degree distribution at large degree values as a consequence of the finite size of the network.

Figure A1.

Figure A1. Scale-free network (a) degree distribution of a random network generated by the preferential attachement Barabasi–Albert method. It is characterized by its power-law behavior. (b) Visualization of a random network together with its adjacency matrix. N = 100.

Standard image High-resolution image

In this context, a new variable Tnm has been defined in the literature [13] to characterize the epidemiological dynamics on the random network, which accounts for the probability that the infection is transmitted through an edge in the network between nodes n and m. The average value, T, is given in terms of the typical effective time of duration of an infection τ, which is a exponentially distributed stochastic variable with mean value τγ = γ−1, and the rate of infectious encounters between individuals in the network ρ. Given that in our model ρ is proportional to β (see equation (9)), we derived that for the case of a SEIR dynamics, T is given by

Equation (A.8)

where the second term correspond to the average probability across the network that no such transmission event occurs.

Moreover, a critical value ${T}_{\text{c}}=\left\langle k\right\rangle /(\left\langle {k}^{2}\right\rangle -\left\langle k\right\rangle )$ has been derived for an uncorrelated network previously, which sets the minimum value of T required for an outbreak to have the chance to turn into an epidemic. The value of Tc corresponds to the correction factor in equation (10) that we derived for the renormalized basic reproductive number in the network of contacts.

As it has been derived before [13], we start the derivation with a Dyson-like self-consistent equation for the probability qb that an individual at the end of a randomly selected interacting edge will not produce an epidemic,

Equation (A.9)

where

Equation (A.10)

and 1 − T is the probability that the disease does not spread through the interacting edge, Tqb is the probability that it spreads through the edge but it does not produce an epidemic and ${\hat{p}}_{2}(k)=k\hat{p}(k)/\left\langle k\right\rangle $ is the distribution of interacting edges connected to an individual at the end of a randomly selected interacting edge. This nearest-neighbors degree distribution is important because by following a randomly chosen interacting edge, is more likely to reach highly connected hubs than by randomly selecting single nodes. We solved equation (A.9) numerically for a fixed value of T. Here, we are able to write the proliferation probability equations in terms of the epidemiological parameters that define the dynamics of equations (1)–(4) and (6)–(8) for the case of the SEIR model or by the epidemiological measurable values λ, τγ and τσ .

Near the critical value R0 = 1, some properties of the system show well defined scaling behavior [21, 41]. In particular, the proliferation probability ${Q}_{\text{p}}\sim {(T-{T}_{\text{c}})}^{-{\beta }_{\text{p}}}$ with βc = 1/(3 − α) when 2 < α < 3, which is the case for the networks generated by the preferential attachment method and that corresponds to the most common networks occurring in nature [41]. In our case, some approximations may become invalid due to large fluctuations. Given the relation that we derive between T and R0, it might be possible to derive new scaling laws for R0 approaching its corresponding critical value. However, this is beyond the scope of this work.

Figure A2.

Figure A2. Degree-based sampling weight log-ratio of probability that a degree k is drawn from the network given that an outbreak seeded on it produced an epidemic and the degree distribution. Lines show analytical result and symbols simulation outcomes. Parameters γ = 1/6, σ = 1/4, α = 2.68.

Standard image High-resolution image

A.3. Random testing protocols and the likelihood function

The function ω(k) defined in the main text is proportional to the likelihood function L(k) = p(proliferation|k) = Qp(R0, k). It should be reduced to a constant value when the network of contacts is turned off because then Qp(R0, k) = Qp(R0)

Figure A3(a) shows the outcomes of the simulations for the average time of first detection to complement the evaluation of the performance of the degree-based sampling testing protocol. Similar to the cluster size shown in figure 7(a), there is a significant improvement in the time at which the first infected individual is detected. For small values of R0 the reduction of such time can be large than 20%.

Figure A3.

Figure A3. Degree-based random testing protocol. Performance of the degree-based random testing protocol relative to the uniform protocol in terms of time of detection; we show the case of individuals interacting (a) under a network of contacts and (b) in a well-mixed population as a control.

Standard image High-resolution image

To test that the random testing protocol has an effect on a structured population characterized by a network of contacts, we simulate the degree-based sampling protocol turning off the network of interaction as a negative control but still using the likelihood function L described above for the sampling of individuals. Figures 7(b) and A3(b) show the performance of the control protocol in terms of cluster size and time of the first detection, respectively. In both cases the performance of the degree-based protocol is the same as the uniform sampling protocol.

A.4. Numerical simulations

The simulations were performed with a modified version of the individual-based SEIR python code seirsplus publicly available in https://github.com/ryansmcgee/seirsplus. The simulations consist of a Gillespie algorithm for independent Poisson processes.

Please wait… references are loading.