Large-deviation properties of SIR model incorporating protective measures

We simulate spreads of diseases for the susceptible–infected–recovered (SIR) model on contact networks with a particular focus on incorporated protective measures such as face masks. We consider the small-world network model. By using a large-deviation approach, in particular the 1/t Wang–Landau algorithm, we obtained the full probability density function (pdf) of the cumulative number C of infected people over the full range of its support. In this way we are able to reach probabilities as small as 10−50. We obtain distinct characteristics in the pdf such as non-analyticities induced by the onset of the protective measures. Still, the results indicate that the mathematical large-deviation principle also holds for this extended SIR model, meaning that the size-dependence enters in P(C) in a simple fashion and the distribution is determined by the so-called rate function. We observe different phases in the pdf, which we investigate by analyzing the corresponding infection courses, i.e. time series, and and their correlations to the observed values of C.

One of the most influential models, the susceptible-infected-recovered (SIR) model, was first introduced by Kermack and McKendrick [12]. The first mathematical models based on the original SIR model often worked under the assumption that the population they describe is fully-mixed [13,14], i.e. each individual has the same likelihood to spread the disease to any other individual. These fully-mixed models fail, however, to capture the importance of the topology on the disease-spread dynamics. Thus many studies focused on incorporating the contact-network-topology into the differential equations [15][16][17][18].
Traditionally, these stochastic models are investigated with respect to the typical behavior of an ensemble of networks, i.e. concerning those events which occur with rather high probabilities. This allows one to calculate, e.g. the typical outbreak size or the epidemic threshold, i.e. the value of a parameter like the infection rate beyond which a pandemic occurs.
If one wants to investigate the model in its full extend, one needs, e.g. to obtain the underlying probability distribution of the quantity of interest over a large range of its support. This involves reaching the large-deviation tails where the probabilities are as small as 10 −50 . In a previous work, some of us have numerically studied the large-deviation tails [19] of the pure SIR model for different infection probabilities λ. This allowed us not only to obtain corresponding probability distributions, but also to identify different phases which are present for the same value of λ, however, in different regions of the distributions. Consider for example a very large infection probability λ, which corresponds to a large infection rate in continuous-time models. Here deterministic mean-field models [4] would predict a large outbreak although there is a small but finite probability that the disease dies out very early due to some stochastic events. Similarly, even with a very low infection probability, there is still a chance of the entire network becoming infected, even though the system is way below its epidemic threshold. These cases, which are neglected by mean-field theories, can be explored using this large-deviation approach.
Knowing the extremes of the probability distributions, the very mild infection courses and the very severe ones, might help to identify measures which lead to effective mitigation of diseases. This is similar to energy grids, where large-deviation studies have helped to find rules of thumb for creating network structures which lead to resilient energy grids [20,21].
In the present work, we want to go beyond the previous large-deviation study of the raw SIR model. A severe real-world disease will not proceed as an isolated process and a pandemic can pose a serious thread for modern society, since it has the potential to disrupt the economy or even claim many lives. Thus, it is important to find ways to prevent or at least limit such outbreaks. In the beginning of a pandemic, as seen with Covid-19, no pharmaceutical interventions might be possible and therefore the impact of such a pandemic has to be mitigated via non-pharmaceutical interventions (NPIs) [22].
In this study we focus on abstract NPIs that are able to reduce the transmissibility of a disease. A concrete example would be a mandatory mask wearing order by the government, as has been seen during the Corona pandemic.
We want to study the probability distribution function of the cumulative number of infected people occurring during the spread of a disease through a small-world [23] network, subject to NPIs. We want to investigate the influence of the NPIs on the distribution over its full range of the support. Since this involves probabilities as low as 10 −50 we cannot simulate the model by using typical-event sampling alone and instead have to rely on special largedeviation techniques [24][25][26] that are based on Wang-Landau (WL) [27,28] followed by entropic sampling [29].
Beyond the more practical questions we also consider a mathematical viewpoint by investigating the rate function. This allows us to verify whether the large-deviation principle [30][31][32][33] holds, i.e. whether the probability distribution falls into a standard class with respect to its large-deviation behavior.
Note that when aiming at simulating real-world systems, averaging over network ensembles is often not necessary, because usually only one or few typically rather large networks are given, sometimes rather complex ones, e.g. multiplex networks [34]. For this reason, we also focus on a given typical network here, but investigate it with respect to its typical as well as rare dynamical diseases-spreading properties.
The remainder of the paper is structured as follows. First the SIR model and the used network ensemble are defined. Next we present the algorithms used for the rare event sampling. We briefly investigate the available parameter space to decide on the actual parameter we use for the later analysis. Then we present the results of the large-deviation simulations and the investigation of the empirical rate function. We finish with a summary and outlook.

Model
To model the disease spread we use the same SIR-model as in a previous large-deviation study [19] but with some slight modification to incorporate the use of protective measures.
In principle, each of the N nodes of the graph is in one of three states: susceptible (S), infected (I) or recovered (R). Our simulations of disease spreads always start with one infected node, which is arbitrarily chosen before starting the simulations so that every simulation uses the same initial infected.
At each time step τ for each infected node we iterate over their susceptible neighbors. Each of those then becomes infected with the transmission probability λ. Then we iterate over all infected nodes, excluding those that were infected during this time step, and let each recover with the recovery probability µ.
At time step τ the current fraction of susceptible, infected and recovered nodes are denoted by s(τ ), i(τ ) and r(τ ) respectively, from which the fraction of cumulative infected nodes c(τ ) = i(τ ) + r(τ ) can be computed. One way to indicate the severity of the disease spread is the fraction C of infected nodes after the infection died out which also marks the end of the disease-spread simulation. The latter equality only holds for µ > 0.
Here we include protective measures in a simple way. Once a predefined threshold θ of simultaneously infected nodes is reached, the transmission probability λ is decreased by multiplying it with a suppression factor α ⩽ 1. Thus, the new transmission probability until the end of the disease-spread simulation is This mimics, for example, a persistent mask requirement, which was levied in many countries during the COVID-19 pandemic.

Network ensemble
We use a small-world network [23] to model contacts in a population, since many contactnetworks exhibit small-world properties [35]. For the modeling of real-world diseases, one certainly needs more sophisticated network models. But since our study is focussed on fundamental properties, in particular the influence of NPIs on the shape of the investigated probability distributions and the description of corresponding phases, the main results should be apparent for a simple network model. The model we use is defined as follows. We start with a ring of N nodes in which each node i is connected to its next neighbor i + 1 mod N and to its second next neighbor i + 2 mod N. All edges {i, j} are undirected and therefore each node has a degree of 4 in the initial network. In order to gain small-world characteristics, we iterate once over all 2N edges. For each edge {i, j} (j = i + 1 or j = i + 2) we draw an independent random number to rewire it with probability p to a random and uniformly drawn node j ′ , i.e. the edge becomes {i, j ′ }. With probability 1 − p the edge is not changed. If the resulting network is not connected it is discarded and the generation process is started anew.
The rewireing process introduces long-ranging connections while maintaining high transitivity, both of which are features of small-world networks [36]. We use a rewire probability of p = 0.1 just like in our previous study [19], although other values also lead to small-world characteristics [23].

Algorithms
Since the focus of our work is in exploring the tail of the probability density function (pdf) P(C) we need to apply special large-deviation algorithms [24]. Such algorithms exist since the 1950s [37]. In statistical physics, these techniques have become widespread, probably starting with transition path-sampling [38,39] of molecule dynamics. Later on many other variants and models were considered, e.g. distribution of alignment scores of protein sequences [25,40,41], nucleation [42], properties of random graphs [43,44], dynamics of the totally asymmetric exclusion process [45,46], calculation of partition functions [47], dynamics of model glasses [48], ground states of Ising spin glasses [49], dynamics of Ising ferromagnets [50], statistics of negative-weight percolation [51], properties of fractional Brownian motion [52], the Kardar-Parisi-Zhang equation [53], and work distributions of non-equilibrium processes [26,54].
The basic idea, as explained for the SIR model, is that the large-deviation algorithm controls the simulation of instances of courses of disease spreads. This means, the algorithm must be able to manipulate the disease spread in a controlled fashion. Here, the simulation of the disease spread with additional protective measures is carried out in a similar way as in a previous [19] study.
The main idea is as follows. For a standard disease-spread simulation one would draw random numbers distributed uniformly in [0, 1] on the fly during the simulation and compare these with the transmission probability λ or the recovery probability µ to decide whether nodes get infected or become recovered. Instead, we draw all random numbers before a run of the actual disease-spread simulation is performed. These numbers are stored in two vectors ξ λ , ξ µ , which contain for each node and for every time step τ a distinct random number. These numbers are used during the disease-spread simulation.
This means that C is now a deterministic function of the state (ξ λ , ξ µ ), i.e. we can write C(ξ λ , ξ µ ). Note that the vectors must contain enough random numbers for all possible simulation outcomes, i.e. the disease must die out before we reach the end of the stored random numbers. For a detailed explanation of the vector length estimation we refer to [19]. With this approach, the vectors are chosen long enough, so every simulation could finish and no infected node was left.
So far, this is only a technicality and will not change the outcome of the disease-spread process so why are we doing this? Let us give an analogy. Often one calculates the pdf of static quantities for a given ensemble, e.g. for an ensemble of Erdős-Rényi [55] random graphs a state drawn from the ensemble would correspond to a graph G. Each graph G has a corresponding diameter d(G) and one might be ultimately interested in calculating the underlying probability distribution P(d) [56].
For the SIR model we can apply the same techniques as we essentially have the same starting point, i.e. we have an ensemble of the state (ξ λ , ξ µ ), for which we can calculate C(ξ λ , ξ µ ), and are interested in the probability distribution P(C).
The next idea of our approach is that values contained in these random number vectors are controlled within an Markov-chain-Monte-Carlo (MCMC) approach which is wrapped around the disease-spread simulation. Thus, depending on the vectors chosen with the MCMC different infection courses, i.e. time series, will occur. The MCMC is set up in such a way that an estimate for the full pdf P(C) for a given specific network is obtained directly. Specifically, the 1/t WL algorithm [28] is used, which is a small modification to the original WL [27] algorithm to prevent error saturation [28,57].
We here only outline the basic idea of the WL approach. Initially a non-normalized probability distribution P(C) = 1 is assumed for all values of C.
µ . Both configurations of random vectors determine courses of infection and correspond to resulting cumulative numbers C ′ and C (t) which are determined by complete runs of the disease-spread simulations using the random-number vectors. The trial configuration is accepted with a Metropolis-Hastings probability min{1, and a corresponding value C (t+1) . In case of non-acceptance, the configuration at time t also becomes the one at time t + 1. Then, the estimate for the distribution is updated by a multiplicative factor f > 1 at C (t+1) , i.e. P(C (t+1) ) = fP(C (t+1) ). Since the inverse of the acceptance probability is proportional to the probability of the trial configuration, and because the estimated probability is increased for each current configuration, the WL algorithm rather automatically more or less uniformly samples the full space of C values while obtaining a better and better estimate for the distribution P(C). In principle one starts with a relatively large factor f ≈ 2, to quickly obtain a rough estimate of P(C) and then iteratively reduces f according to some schedule to adjust the distribution on a finer scale. Here in particular we follow the 1/t WL algorithm, see the literature [28] for details. The Markov moves, i.e. changes of the vectors to generate the trial configurations are identical to the related previous study [19] and are therefore not repeated here.
This overall approach enables studies in regions of low probabilities, which would not be accessible with simple sampling techniques. Utilizing this approach, one can investigate distinct features of the pdf over a large range or even the full support.
In order to increase the accuracy and to sample trajectories of the infection courses, even in the region of low probabilities 10 −50 , we additionally perform entropic sampling [29] after the WL algorithm is finished. The general principle of entropic sampling is similar to the WL algorithm, but here we already start with the estimated pdf from WL which is not updated any more, i.e. in this way detailed balance [58] is ensured. This results in a sampling of all possible trajectories which is rather uniform in the space of corresponding values of C. Note that the here obtained histogram of the observed values of C also allows for a final although small improvement of the estimate of the distribution. This approach allows us to perform the analysis of the time series as presented in section 6.1.

Results: simple sampling
In order to analyze the effect of protective measures on the spread of the disease, we first have to decide on the values of some parameters. In this section we use simple sampling, i.e. typical-event sampling, to explore the parameter space and choose these parameters such that the obtained results are informative.

Parameter selection
Since for a large range of the parameter space many global results, like the initial growth, depend mainly on the relative difference between transmission and recovery probability, one is basically free to choose any recovery probability. We chose to set the recovery probability to µ = 0.14, just like in previous studies [19].
With this choice, for the pure model without NPIs a critical transmission of λ c (N = ∞) = 0.1763 was found, at which an infinitely large small-world network switches from a nonpandemic to a pandemic phase [19]. The critical value was obtained by considering for different network sizes the transmission probabilities λ c (N), which maximize the variance of C, respectively, and extrapolating to N → ∞ using a fit to a power-law plus a constant. It was observed that for system sizes 1000 ⩽ N ⩽ 3200 these critical transmissions λ c (N) are, at least within their standard-deviations, almost equal to the limiting value [19].
Thus we present below the result for a small-world network of N = 1000 nodes to analyze the general effects of the protective measures by varying the suppression factor α.
Lastly, the threshold θ of the fraction of simultaneously infected nodes at which the protective measures take effect has to be chosen, i.e. the measures take affect once i(τ )/N > θ is reached. If this threshold is set too high θ → 1, almost no spread of a disease will reach the point at which the transmission probability would be reduced. Furthermore, a threshold θ set too high does not reflect real-world pandemic interventions, because protective measures are not introduced when, e.g. already θ = 0.5 ≡ 50% of the network is simultaneously infected.
If the threshold is set to low θ → 0 on the other hand, the protective measures take effect very early in the simulation. This would approximately correspond to a simulation that starts with the reduced transmission probability in the first place but without any protective measures. The results of these simulations are already present in [19] and thus of no interest here.
For our simulations we have fixed the threshold to θ = 0.05. As a result most of the time the disease is not yet strongly spread throughout the network when the protective measures are activated, i.e. the measures can still influence the spread significantly, however the disease still takes some time to reach the threshold and thus makes these simulations different from starting with a reduced value of λ. We have also performed simulations of other values of the threshold in the vicinity of θ = 0.05, but could not observe any change with respect to the general behavior.

Protective measure effects
For both the transmission probability λ and the suppression factor α, 100 equally spaced values in the interval [0, 1] are chosen which lead to 10 4 possible combinations. For each parameter configuration, 10 5 small-world networks are generated and the disease-spread simulation is performed. The mean fraction of cumulative infected nodes C and their standard deviation σ are calculated for these small-world networks and the results are shown in figure 1.
Without protective measures, i.e. α = 1, the position of the peak of the standard deviation λ c (∞) ≈ 0.172 corresponds to the critical transmission of [19] with deviations due to the lower resolution of the considered values of λ and the different system size. With increasing strength of the protective measures α → 0 this critical point of the epidemic phase moves to larger transmissions λ(α) → 1 until protective measures are so strong that on average the infection does not spread through the whole network anymore. This is illustrated by the valuesC ≪ 1 for α ⩽ 0.2 In addition to this critical line separating the pandemic from the non-pandemic region, another peak at λ ≈ 0.13 can be identified when looking the standard deviation. In figure 1 this is well visible for small values of α. Further investigations show that this point corresponds to infection courses where the ratio of currently infected stays just below or barely reaches the threshold θ. This explains the high variability of these infection courses, leading to a peak in the variance of C. Note that the value of λ where this happens depends also on the value of θ. As we observed, right at this point λ = 0.13 about half of the infection courses stay below θ = 0.05 for a long time and can therefore infect many nodes before the disease eventually dies out. The other half of the infection courses reaches the threshold significantly earlier.

Choice of suppression factor α
We are in particular interested in the effect of NPIs on outbreaks when it actually matters, i.e. for cases where one observes large fractions C → 1 without the protective measures. This is in particular the case if the transmission probability is well beyond the critical one of the pure case, e.g. for λ = 0.5, but the exact choice of λ is not essential.
What remains is to choose a value of α. Three cases will now be distinguished. The first case represents no protective measures, the second case moderate and third one strong protective interventions. It is hard to determine what actual values of α corresponds to given real-world interventions. Many studies about the SARS-CoV 19 pandemic show, that even the same protective measures, such as restraint of large gatherings, public obligation to wear a mask or lockdowns have different effects across countries [11,59].
Across all these studies, the transmission of the virus gets reduced by 30%-80% [11,59,60], which corresponds to a parameter of α ∈ [0.2, 0.7]. A smaller value of α, thus highly effective interventions, typically correspond to a strong contact ban. Higher values of α correspond to more moderate interventions such as a ban of large-scale events. Therefore, we chose α = 0.5 for moderate and α = 0.2 for strong protective measures.

Results: large deviations
Using large-deviation methods, it is possible to determine the pdf of interest over a large range of or even on its full support. Here we consider the distribution P(C) of the fraction C of cumulative infected nodes equation (1).
In figure 2 the pdfs for an arbitrarily drawn network with N = 1000 nodes, a transmission probability of λ = 0.5 and the three cases α ∈ {1, 0.5, 0.2} are shown.
Because of the large value of λ > λ c (α = 1), without protective measures (α = 1) the two most likely outcomes are either that the disease dies out immediately as patient zero recovers before infecting another node or that almost the entire network gets infected during the disease outbreak. This is reflected by the appearance of two peaks at C = 1/N and at C ≈ 1. The intermediate states of C ≈ 0.5 are highly unlikely, one observes values as small as P(C) ≈ 10 −40 . This is plausible, because we start the outbreak with just one infected node, so the chance of this node recovering before transmitting the disease further is non-negligible. If the disease did not die out in the first time steps though it will likely propagate throughout the entire network due to the high transmission probability.
The other two pdfs show some totally different probability distributions with a distinct kink characteristic at C = 0.075 for α = 0.5 and C = 0.06 for α = 0.2.
First of all, for small fractions C ⩽ 0.05 the curves show the same courses as for the α = 1 case which is due to the threshold θ = 0.05. For C < θ it is impossible for the infection courses to exceed the threshold and trigger the transmission reduction through the suppression factor α. So the pdfs in this region must be the same regardless of the chosen value of α.
Looking at some of the sampled trajectories i(τ ) resulting in the different C values (not shown) one can understand why the strong measures with α = 0.2 deviates from the pdf with no measures (α = 1) earlier than the moderate case with α = 0.5. The strong measures are able to reduce the spread of disease more efficiently and thus are able to push some outbreaks that reached the threshold to smaller C values, than the moderate choice of α is able to, and thus it increases the probability of observing smaller values of C. In the analyses presented below, this kink in the distribution, which is a visible identification of protective measures usage, will also become visible for other measurable properties.

Infection course disparity
During the entropic sampling we store every 100th outbreak trajectory. These curves can be binned according to their corresponding C value.
Next, we address the question whether for different values of C the disease-spread courses are rather similar to each other or very different. In order to compare the time series, especially near the occurrence of the kink, we randomly choose 800 infection courses from each bin C and compute the disparity [19] of the bins.
The disparity is defined as follows. First we consider two arbitrary time series X 1 , X 2 . X i can be, e.g. s(τ ), c(τ ) or i(τ ) as observed in the simulations but normalized. The normalization is obtained by dividing the original time series values X i (τ ) by the maximum observed value, i.e. X max i = max τ X i (τ ). This allows one to compare the general shape of the time series rather than their magnitudes. Let l 1 , l 2 be the lengths of the two time series with l max = max{l 1 , l 2 }. The values of X i beyond its given length we define as the last recorded value. With this, we define the distance Now the disparity V X (C 1 , C 2 ) for two given values of fractions C is defined as distance d averaged over all pairs of corresponding time series where a pair always consists of one time series which exhibits value C 1 and one which exhibits C 2 . Examples for such time series are shown in section 6.2.  The disparity between times series exhibiting values C 1 and C 2 is color coded accordingly. These heatmaps serve in some sense as phase diagrams in the C space, as it was used for the case without NPIs [19].
The diagonal represents the comparison of infection courses within one single bin C, i.e. it denotes the variability within a given 'class' of infections. Here we observe in general a high similarity. At C = 0.075, α = 0.5 and C = 0.06, α = 0.2 a sharp edge in the heatmaps can be recognized, corresponding to the kinks in the probability distribution.
If we compare time series with high infected fractions, e.g. C = 0.8 with the other time series, we can mainly recognize two regions. For the second region one has to distinguish two cases. For very small values of C, in particular C < θ = 0.05, the NPIs were not activated, so the time series will be like the one seen for the standard case. For larger values, the behavior changes, visible as discontinuity lines in the heat maps. Note that the actual value of C where a change is visible is larger than θ, because the NPIs are triggered by the value of the fraction i(τ ) of currently infected rather than by the cumulative fraction c(τ ). The point where the change is actually visible for α = 0.5 is C = 0.075 and for α = 0.2 we have C = 0.06, which, as noted earlier, directly corresponds to the point where the measured pdf P(C) deviates from α = 1.0. Right at the discontinuity the effect of protective measures leads to a fast decrease of infection numbers once the threshold is reached thus making the overall time series more similar to one another than when comparing to time series that barely missed triggering the NPIs. This results in a reduction of the disparity as can be seen by comparing the disparity on the diagonals for small but increasing values of C, where a jump to very small values of the disparity is visible.
When we compare the disparity on the vertical (or horizontal, its symmetric after all) C ≈ 0.8 line one can see that the disparity first decreases before a sharp rise at the mentioned discontinuity and then it decreases again. This effect is much more pronounced for α = 0.5 and barely noticeable for α = 0.2. The reason for this behavior is that for α = 0.5, the maxima of the i(τ ) time series for rather small and rather large values of C appear at very similar times, see also section 6.2 below. For α = 0.2 the position of the maximum depends stronger on the final value of C, leading to larger disparities.
The results for V C and V r (not shown) look qualitatively similar to V i , whereas the disparity V s (not shown) is not so informative.

Time series analysis
In order to understand the behavior of the infection dynamics in the different regions better, figure 4 shows some non-normalized time series for the case of medium NPIs with α = 0.5. The time series of i(τ ) are binned according to their respective C values.
In each case 100 time series are plotted with transparent colors and one arbitrary example course per chosen value of C is highlighted. Depending on the value of C one gets different characteristic courses, in principle there are three different types of behaviors. The first one corresponds to the first region in the disparity heatmap, while the other two types occur in the second region of the disparity heat map. For intermediate values like C = 0.1, the infection is quickly rising and hits the threshold value, such that the NPIs are activated. This leads to a quick decrease afterwards. When looking at P(C) shown in figure 2, one observes that for α = 0.5 this appears with a rather small probability.
Finally, for even larger values of C, the infections are also fast growing. Although the NPIs are activated, the growth continues such that many nodes get infected. The infection course looks like in a typical non-intervening exponential disease spread [19]. For this medium NPIs with α = 0.5 this happens often, see figure 2, thus many nodes are typically becoming infected.
In figure 5 we consider the case α = 0.2 and show 100 time series for different values C respectively. Again three different types can be observed. For low values C ∈ {0.04, 0.1} the same general shapes of the time series can be recognized just like in the moderate protective measure case with α = 0.5.
However, in the other two cases the time series exhibit lower peak values of the number of infected as compared to the cases with α = 0.5. This is due to the strong NPIs. This means, the infection is less severe with respect to the load on the health care system. But for a given value of C, this means, the infection will last longer, i.e. it does not decrease so quickly, as compared to the medium-NPI case. From the distribution P(C) in figure 2 one observes that this type of behavior with a medium number of total infected is now very common.
For the largest C values, the time series exhibit a bit higher peak values, but still lower that for α = 0.5 and they last also very long. But for α = 0.2 such a behavior is very rare with P(C) < 10 −10 . Thus, the NPI is strong enough to reduce its probability considerably.

Analysis of peaks for simultaneously infected nodes
The sample disease-spread courses presented in the previous section indicate that for moderate and strong NPIs, the shape of the time series may differ. Therefore we now investigate the time step τ at which the infected fraction reaches the maximum value max τ (i(τ )) and denote it by τ max . Now, for all infection courses corresponding to the bin C, the time step τ max is considered and a normalized distribution ρ(τ max |C) is calculated. The results are shown in the heatmaps of figure 6.
Here the outcomes of moderate (α = 0.5) and strong (α = 0.2) protective measures differ significantly from each other. This can be seen by considering four different intervals along the C axis, where the first one corresponds to the first region visible in the disparity heatmaps, while the other three intervals subdivide the second region of the disparity heatmaps.
1. The first interval for about C ⩽ 0.06 collects all the disease-spread courses were no protective measures are activated. Thus, they look exactly the same for α = 0.2 and α = 0.5. The distributions of τ max are moderately broad. 2. The second interval starts with a visible sharp 'edge' and very narrow distribution dominated by values τ max ≈ 10 until C ≈ 0.2. The narrowing is induced by a kind of synchronization of the processes due to the onset of the NPIs. The distributions ρ(τ max |C) for the two cases look not the same but still similar. 3. For larger values until C ≈ 0.8, the two cases of the NPIs strength differ strongly. For the moderate NPIs, the mean of τ max grows slowly and the distribution broadens. On the other hand, for strong NPIs, the distribution ρ(τ max |C) is still dominated by rather early-time peaks, but exhibits a much broader spread. Thus, stronger fluctuations appear. This makes it for real-world situations a bit more difficult to predict the development of a diseases, although, as mentioned, the peak load on the health care system will be smaller, which is good. 4. In the fourth interval C ⩾ 0.8, the behavior for α = 0.5 is similar to the third interval. For α = 0.2 only the broad part of distribution remains, which means that the time of the peak position is not that well defined. The disease-spread courses are characterized by longlasting medium-high levels of infections, as already visible in figure 5. Conditional density ρ(τmax|C) that shows the probability of τmax for any given C for α = 0.5 (top) and α = 0.2 (bottom) as measured during the entropic sampling.

Different system sizes
Finally, we turn to a more mathematical question. In large-deviation theory [30][31][32][33] there is a large class of probability distributions obeying the large-deviation principle. This means the shape follows a standard form with respect to the finite-size behavior as This means dominating contribution of the size N enters just as a linear prefactor before the so-called rate function Φ, which only depends on the density C. Other dependencies on the size are less important o(N). Processes which obey this large-deviation principle are typically better accessible to analytical methods, for example by the application of the Gärtner-Ellis theorem [30][31][32][33]. Note, however, that the calculated rate function is not convex and has a non-differentiable point which means that the Gärtner-Ellis theorem is only applicable for the convex envelope [31]. To investigate whether P(C) of the disease spread with NPIs follows the large-deviation principle, we have to simulate the system for different system sizes, resulting in separate distributions P N (C). From this we compute the empirical rate function where Φ 0 is a constant that ensures that all rate functions have their minimum at min C Φ(C, N) = 0. We do this because for the limiting rate function the minimum value has to be Φ min = 0, because otherwise P(C) would either converge to zero (Φ min > 0) or diverge (Φ min < 0) everywhere. Thus, we can safely define Φ(C, N) such that this holds also for finite values of N. Hence, we have performed large-deviation simulations for different system sizes of N ∈ [500, 3000] with λ = 0.5 and α = 0.2 from which we then calculated the corresponding rate functions. They are shown in figure 7. The results suggest that the general observations for the N ⩾ 1000 case are independent of the system size: There is a peak near C = 0.05, a minimum of Φ(C) appears in the vicinity of C = 0.3 and for C → 1 a strong growth is visible.
Note that we always consider just one specific network for each system size N. For a real mathematical extrapolation, one would also have to average over different graphs, maybe also over rare graphs, such that a kind of 'two-fold' large-deviation study would be necessary. This is beyond the scope of the current study and would require a much larger numerical effort.
We did, however, do some test simulations for different networks and only observed significant changes for very small values of C, not for larger ones. This is because the spread of a disease with a small total number of infected nodes depends much more on the local neighborhood of the first infected one as compared to infections reaching large fractions of the graph.
In general, the observed empirical rate functions are close enough such that we can safely assume that the limiting rate function exists and the large-deviation principle applies.

Summary and outlook
We studied a standard SIR model to investigated the effect of protective measures that come into effect after a threshold θ of simultaneously infected nodes is reached. We considered two different values of the protective measure suppression factor α corresponding to medium and strong measures. We evaluated the effect on the probability distribution P(C) of cumulative infected nodes C. In order to obtain the pdf on its full support we used large-deviation methods and especially the 1/t WL algorithm [28]. We combined this with subsequent entropic sampling [29]. With this approach, we can observe the probability density distribution for probability densities as small as 10 −50 . Most striking, we observed a characteristic kink in the pdf that corresponds directly to the onset of the protective measures.
Our approach allowed us to look very deep into the changes of the infection dynamics induced by using NPIs. In particular, there are broad ranges of values of C, where moderate NPIs lead to a typical behavior, whereas strong NPIs lead to rare behavior with P(C) < 10 −10 or even lower. For other values of C it is just opposite.
Furthermore, the approach allowed us to sample representative disease-spread courses, i.e. time series i(τ ), c(τ ), r(τ ) for every possible value of C, which we binned. Hence we were also able to analyze courses which appear with rather low probabilities, such that they can not be accessed with simple sampling techniques.
We performed a comparison of the courses and could identify three different characteristic behaviors of the course of infections, which can be summarized by two regions in the disparity heatmaps. We also evaluated the times τ max where the maximum of simultaneously infected nodes were reached, which represents the highest load to a health-care system. We observed that the distribution of the time step τ max changes when using different protective measure strengths. With moderate measures, the peak of i(τ ) can be frequently encountered at very similar time steps τ max . However, the peaking times for strong NPIs are much more distributed in the time domain, making the measures necessary for longer times and accurate predictions of the evolution more difficult, which might pose difficulties for the acceptance of the measures in the society.
Finally, looking at mathematical large-deviation properties of the model, the investigation of the rate function suggest that the large-deviation principle holds. This means the probability distribution follows a standard behavior with respect to the finite-size dependence of the tails and could be accessible to standard analytical methods like the application of the Gärtner-Ellis theorem.
With respect to the modeling of disease spread, in future studies we want to increase the spectrum of protective measures by implementing lockdowns directly through changes in the network structure. Removing edges between nodes would represent the core principle of lockdowns, whereby in this study we used a global reduction in the transmission probability. We furthermore plan to study the effect of vaccinations. Also it could be interesting to couple networks of different species, e.g. to model the transfer of zoonoses between animals and humans, which is often a rare process as well. Thus, large-deviation approaches should turn out to be very useful here as well.
In general, our results confirm that the application of large-deviation approaches to dynamic phenomena allows one to investigate the influence of rather arbitrary changes to the system. On the analysis side, one can identify different phases by looking at characteristic properties of the corresponding time series. Here many more dynamical or non-equilibrium models should be accessible in this way.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://doi.org/10.57782/GASWVN.