Percolation of fully penetrable disks using the three-leg cluster method

The three-leg cluster method is a relatively new approach to computing the percolation thresholds. To date, it has only been applied to lattice models. It is characterized by high versatility in choosing the shape of the system and by the universal probability 1/2 that, at the phase transition, a three-leg cluster exists that spans three segments of the system border. We applied it to the problem of continuous percolation of discs, for which we estimated the critical filling factor to be 1.128 086 7(5) . This confirms that the three-leg cluster method is applicable to continuous percolation models.


Introduction
Percolation is an essential model in statistical physics [1,2]. Starting from the set of independently occupied sites of a lattice with some global probability p and the notion of connectivity between adjacent sites, percolation is defined as a formation of an 'infinite' cluster that touches opposite borders of the system at a certain threshold occupation probability p c . In the thermodynamic limit (infinite system size), a phase transition occurs, allowing the modeling of complex phenomena. The theory of percolation has been applied in many areas [3], including * Author to whom any correspondence should be addressed.
Original Content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. porous media [4,5], conductivity of heterogeneous composites [6], rheology of polymers [7], and epidemic spreading [8].
In Monte Carlo simulations of percolation, one often investigates the convergence of the percolation probability function to the step function as the system size approaches infinity, which corresponds to a phase transition in the thermodynamic limit. The intersection points of subsequent curves in such a sequence of functions tend to a point (p c , U c ), where 0 < U c < 1 is the universal crossing probability. It is often easier to determine the value of U c either exactly or with very high precision and then use it to improve the accuracy with which p c can be determined numerically [14,22].
The value of U c depends on how exactly the 'infinite' clusters are defined. For example, in a square system, one can investigate the formation of clusters that span or wrap its opposite sides, horizontally or vertically, or exactly along one of these directions but not the other, or in both directions simultaneously [14,22]. Clusters spanning a pair of sides are 'two-leg' whereas those spanning four sides are 'four-leg' ones. This can be generalized to n-leg clusters with an arbitrary integer n ⩾ 2. Although clusters with n > 2 are not expected to have qualitatively different properties than the two-leg clusters, the case n = 3 actually has some special geometric property: for lattices for which a matching lattice exists, exactly one three-leg cluster is expected to exist either in the original or the matching lattice [23], for any p and even for finite systems. This property implies that, for self-matching lattices and systems of arbitrary shapes, U c = 1/2 [13,23,24]. Thus, for self-matching lattices, the three-leg cluster method can provide the exact value of p c = 1/2 [9] even for finite-size models, rendering the computation of the thermodynamic limit trivial. This raises the question of whether this method is beneficial for systems that are not constructed on self-matching lattices, where the values of p c are typically nontrivial. The case of a square lattice was tested numerically for a few system shapes such as semi-infinite strips [23] and triangles [24]. By studying the percolation of overlapping discs, here we explore whether this method can be used to accelerate the convergence rate to the thermodynamic limit in continuous percolation models.

Method
In the three-leg cluster method, a two-dimensional system of an essentially arbitrary shape is considered, with its border divided into three arcs (figure 1). A lattice may be imposed on it (not shown in the figure) with its nodes occupied with some probability p. The thermodynamic limit is the limit of the lattice constant going to 0. For each p, we consider the probability R(p) that a cluster forms such that it touches the three arcs of the system; hence, its name is a threeleg cluster. The central hypothesis of this method is that in the thermodynamic limit precisely at the percolation threshold p c , this probability has a universal value of irrespective of the system shape, topology of the underlying lattice, or division of the system boundary into three arcs [13,23].
Here, we consider a circular system, with its boundary divided into three arcs of equal length (figure 2), and investigate the continuous percolation of disks that are free to overlap (no lattice is imposed on the system). The thermodynamic limit is defined as the limit of the ratio of  The system border is a circle of radius 1 with arcs A, B, and C of the same length. The angle α between the x-axis and the ray from the origin to the separation point between arcs B and C has an arbitrary value between 0 and 2π/3. the disc radius (r) to the system radius going to zero. For simplicity, we set the system radius to 1. In each simulation, we generated a sequence of points that were uniformly distributed over this unit circle. These were treated as the centers of the disks. With each new disk, the unionfind algorithm [14,22] was used to monitor the current composition of the clusters formed by overlapping disks.
We investigated this process by using an infinite number of 'reference systems'. That is, we did not use a fixed set of arcs A, B, and C but considered all possible divisions of the circle boundary into three arcs of equal length. To this end, we define α as the angle between the x-axis and the ray from the origin to the separation point between arcs B and C and let it vary between 0 and 2π/3 (figure 2). For each cluster, we kept a sorted list of all angles between the x-axis and the centers of the discs cutting the system border. Once this list was changed during the union step of the union-find algorithm, we checked whether it contained at least three elements, and if so, we determined the two largest differences between its consecutive elements as well as the complement of their sum to 2π (figure 3). They are denoted as θ 1 ⩾ θ 2 and are used to calculate the probability Q that the cluster spans all three arcs (A, B, C) corres- ponding to α being drawn at random uniformly between 0 and 2π/3. It is not difficult to see that Moreover, for any angle α there exists at most one three-leg cluster spanning arcs A, B, and C. If such a cluster exists for two different angles α, it must be the same cluster. These properties can be used to simplify the simulations, because once a cluster is found with Q > 0, we are certain that it will remain the only cluster satisfying this condition. Each simulation is terminated when Q = 1. We used the 64-bit Mersenne-Twister mt19937 random number generator from the C++ standard library and, to a lesser extent, knuth_b from the same source for additional tests. All floating-point calculations were performed using a 64-bit representation (≈16 significant decimal digits) to obtain p c with at least seven significant digits. For the largest disk radius considered, r = 0.2, one can perform ≈ 4 × 10 10 simulations per day per processor, which introduces a data storage problem. To avoid dealing with terabytes of data, we stored only cumulative information for the probability that for a randomly chosen α a three-leg spanning cluster forms exactly when the nth disc has been added. This was achieved by storing the cumulative (i.e. obtained from all simulations) values of ∆Q r (n) defined as the changes in Q when the nth disc of radius r was added to the system. We denote this parameter by ∆Q tot r (n). This parameter is easy to trace because in an individual simulation ∆Q r (n) ̸ = 0 only for a few values of n (typically ≈ 3, never more than 17). It can be conveniently stored in a data structure called a map, with n and ∆Q tot r (n) used as its keys and values, respectively. This approach also enables one to stop and restart simulations. The sum of all ∆Q tot r (n) is always equal to the number of simulations carried out so far, and is an empirical approximation of the probability that a three-leg spanning cluster will appear first after n discs of radius r have been deposited in the system. We define the cumulative distribution function of q n , which represents the probability that a three-leg spanning cluster exists in a system with n discs of radius r. This microcanonical quantity is convolved with the Poisson distribution with the mean λ = ηπ /π r 2 = η/r 2 to obtain the probability R r (η) of three-leg cluster percolation in the grand canonical ensemble for the filling factor η = nr 2 [22], We estimate the critical filling factor η c r for discs of size r as the solution to the equation where the universal crossing probability U c for the three-leg cluster method is expected to be equal to 1/2 [23]. These values can be extrapolated to obtain the limit η c of η c r as r → 0, which is the critical value of η in the thermodynamic limit.

Results
We performed simulations of disc percolation inside a unit circle for 21 values of the disc radius r ranging from 0.00015 to 0.2. The number of simulations for a given r increased with r from ≈5 · 10 5 to ≈5 · 10 10 so that the uncertainties of η c r for the smallest disk sizes were below 2.5 · 10 −6 and decreased with increasing r to ≈10 −6 .
An example of the data stored on the disk, ∆Q tot r , is shown in figure 4. This indicates that the data for r = 0.001 can be stored with fewer than 10 5 floating point numbers, corresponding to 1.08 × 10 6 ⩽ n ⩽ 1.18 × 10 6 . The entire simulation for all values of r required less than 50 MB of disk storage. The shape of ∆Q tot r can be approximated with a Gaussian distribution (figure 4). However, the match is not perfect, and for a large r (r ⩾ 0.01), the large-n tail diminishes as exp(−n) rather than exp(−n 2 ) (data not shown). Thus, q r (n) is also almost normally distributed and its mean value is close to η c r /r 2 . This can be used to estimate the uncertainty of η c r as σ r r 2 / √ N r , where σ r is the standard deviation of ∆Q tot r and N r is the number of trials for r.
Several methods can be used to obtain the percolation threshold value from finite-size samples [15,25,26]. We chose the one developed specifically to address the solutions of (6), in which the values of η c r are fitted to where M + 1 is the number of the series terms, ν = 4/3 is a critical exponent, and A i are free parameters of the fit [15]. Nonlinear fitting was performed using the Levenberg-Marquardt method. We set M = 1, because higher values did not improve the uncertainty estimate of η c ∞ . The quality of the fit was monitored using the regression standard error s = √ χ 2 /dof (the square root of the chi-square statistic per degree of freedom). We obtained η c ∞ = 1.128 087(1), in agreement with the values obtained using other methods [17,22,[27][28][29]. The regression standard error for the fit was s ≈ 0.8, a value indicating a good match between the fitting function and the data.
For models with open boundary conditions, the leading term in (7), A 0 , is expected to vanish when and only when the universal crossing probability is used for U c in (6) [15]. However, by setting U c = 1/2 we found a small but clearly non-zero value of A 0 ≈ 0.0149 (3). This suggests that the universal crossing probability for the model considered here is close to, but different from 1/2. Following [15], we determined A 0 for several values of U c and found that it vanished for U * c ≈ 0.4954(2) (figure 5). Inserting this into (6) and (7), with A 0 = 0, M = 1, and r ⩽ 0.003, yields η c = 1.128 086 7(5). This is the main numerical result of this study. The data and the fit are shown in figure 6. The regression standard error of fit s = 0.7. Our estimate of η c is in good agreement with those obtained using other methods, including continuum gradient percolation η c = 1.128 059(12) [27], η c = 1.128 085(2) [28], rescale particle method, η c = 1.128 10(3) [17], wrapping cluster percolation, η c = 1.128 087 37(6) [22], and critical polynomials, η c ≡ π ρ c /4 = 1.128 087 06 (8), where ρ c is the threshold number density of the discs [29].
The slope of R r (η) at η c is supposed to scale with r as r −1/ν = r −3/4 [22]. We confirmed this behavior for a wide range of r ( figure 7). This behavior is similar to that reported for the wrapping percolation [22].
Finally, we also performed simulations with slightly modified boundary conditions, in which, rather than using rays shot to the disc centers in (2), we shot rays to the exact positions of the intersection points between the discs and system border. We performed only  small-scale computations of this type, because the preliminary results did not show any significant improvement in the accuracy of the method. Thus, if our method is applied to objects with more complex shapes, precise determination of their intersection points with the system border seems unnecessary.

Discussion
The precision of our estimate of η c , 5 · 10 −7 , exceeds that reported in [27,28] but is worse (≈ 8 times) than that reported in [22], where massive simulations equivalent to 100 years of work on a single computer were used. Our simulations were run in parallel on several computers with varying computational power and would have taken approximately 1.4 years if done only on the PC on which this paper was written (a four-core processor capable of running eight threads in parallel at 3.4 GHz) or about six years if only one thread was used. Because the measurement uncertainty is inversely proportional to the square root of the number of trials, by carrying out simulations for 100 years, we might hope to reduce the uncertainty by a factor between four and eight. Thus, the accuracy of the three-leg method for determining the percolation threshold is similar to (but not larger than) that based on wrapping clusters, even though the convergence rate of the former, ∼r 7/4 , is lower than that of the latter, ∼r 11/4 [22].
The fact that u * ̸ = 1/2 deserves special attention because the universal crossing probabilities for continuous two-dimensional percolation models, including overlapping discs, should be the same as those for lattice models [22], for which u = 1/2 [23]. We ascribe this to the boundary conditions: Points whose distance to the system edge is less than r are less likely to be covered by discs than those that lie closer to the origin. This slightly disturbs the connectivity between the discs in the vicinity of the system border. Although this phenomenon cannot change the percolation threshold, our results suggest that it can shift the crossing probability. However, this hypothesis requires further investigation.
On the one hand, the three-leg cluster method has a disadvantage in that the open boundary conditions are an inherent part of it. Not only are they responsible for the relatively slow convergence rate in (7), at most r 1+1/ν , but they also require continuous analysis of the discs at the system boundary, which introduces additional computational overhead. On the other hand, however, this method has several advantages. One of these is the freedom to choose system geometry. Here, we modeled the system as a circle, a geometry that, to the best of our knowledge, has not been used in the context of percolation. This choice allowed us to take advantage of the rotational symmetry of the system and consider a single simulation simultaneously within an infinite number of reference systems. Another benefit of the three-leg cluster method is its flexibility to use an arbitrary division of the border of the system into three segments. We did not explore this property in detail, thus restricting the current study to the most straightforward symmetric case of three identical arcs. Nevertheless, we expect that this property may be used to calculate the critical threshold with even higher accuracy (although at the cost of a further increase in the algorithmic complexity).
In our approach, r in each simulation is fixed at one of several predetermined values and n is a random variable. An alternative approach, in which n is restricted to one of several preselected values and r is a random variable, has also been used for continuous percolation [17,30]. It would be interesting to verify whether this alternative method can be used together with the transition to the grand canonical ensemble, as in (5).

Conclusions
We demonstrated that the three-leg cluster method for estimating percolation thresholds can be applied in a continuous case. By using a circle as the boundary of the system, the results of each simulation can be considered simultaneously in an infinite number of reference systems. One problematic aspect of this method is the choice of boundary conditions, because it can change the value of the universal crossing probability from the expected value of 1/2. The resulting algorithm has a satisfactory convergence rate (per computational resources applied), which is comparable to that of the wrapping cluster method.

Data availability statement
The data that support the findings of this study are available upon reasonable request from the authors. https://bitbucket.org/ismk_uwr/percolation_penetrating_discs_2023.