Adaptive mesh refinement in binary black hole simulations

We discuss refinement criteria for the Berger–Rigoutsos (block-based) refinement algorithm in our numerical relativity code GR-Athena++ in the context of binary black hole (BBH) merger simulations. We compare three different strategies: the ‘box-in-box’ approach, the ‘sphere-in-sphere’ approach, and a local criterion based on an estimate of the truncation error of the finite difference scheme. We extract gravitational waveforms obtained with the three mesh refinement methods, compare their accuracy against a calibration waveform, and demonstrate that the sphere-in-sphere approach provides the best overall strategy when considering both computational cost and waveform accuracy. Ultimately, we demonstrate the capability of each mesh refinement method in accurately simulating gravitational waves from BBH systems, a crucial aspect for their application in next-generation detectors. We quantify the mismatch achievable with the different strategies by extrapolating the gravitational wave mismatch to higher resolution.

In NR codes, refinement approaches such as adaptive mesh refinement (AMR) and static mesh refinement play an important role in the efficiency and accuracy of numerical simulations. Unlike static mesh refinement, AMR introduces the capability for dynamic changes in the resolution of the computational domain, enabling more efficient simulations. Typically, there are two AMR paradigms: patch-based AMR and block-based AMR. In patch-based AMR the computational domain is covered with overlapping subdomains, each with a different resolution. One of the most common implementations of this paradigm is the Berger-Oliger algorithm [53]. This implementation requires a complex parallelization algorithm for meshes of multiple levels to communicate with each other. Moreover, quantities at different levels of refinement can possess different values, thereby leading to different dynamics in the same region [54]. This can be problematic if the evolution is terminated due to catastrophic accumulation of error on the coarse grids, where the dynamics is severely under-resolved. An alternative to this paradigm is block-based AMR [55], where subdomains at different levels do not have regions of overlap except at shared boundaries. In GR-Athena++, we use a Berger-Rigoutsos [56] algorithm for mesh refinement, a detailed exposition of which is provided in [54,46].
In this work, we focus our attention on BBH mergers using three different criteria in the context of block-based AMR in our code. For the first type, we emulate the popular "box-in-box" approach with a hierarchy of mesh blocks (MBs) of different spatial resolutions surrounding the black hole punctures and moving with them. This requires tracking the positions of the black hole punctures throughout the simulation.
The criterion for refining or coarsening a part of the numerical grid is controlled by its distance from the puncture locations, with the maximum norm L∞ chosen to compute distances. The second criterion emulates a "sphere-in-sphere" approach, in which the norm is replaced by the Euclidean norm L2. The third criterion is based on an estimate of the local truncation error of the finite difference scheme. This approach does not require knowledge of the positions of the black holes and can be generalized to tackle other problems. We remark that Ref. [57] contrasted the cell-centered AMR criteria of GRChombo and the box-in-box approach of Lean. Our study compares box-in-box and truncation-error driven AMR in the context of the vertex-centered GR-Athena++ code. We find that by using the L2 and L∞ criteria, it might be possible to produce waveforms with mismatch of O(10−7), making these promising approaches for producing highly accurate NR waveforms for use with next-generation detectors.
The paper is structured as follows. In Section 2, we provide a concise discussion of the grid setup and the refinement algorithm employed in GR-Athena++. In Section 3, we describe the indicators within our code that trigger refinement: the norm-based refinement criteria, namely the L∞ and L2 methods, as well as the local approximation of the finite difference truncation error, the so-called TE method. The objective of Section 4 is to scrutinize the efficiency and accuracy of each AMR criterion. In Section 4.1, we conduct a comparative analysis of the efficiency of each AMR criterion and address the challenges posed by the TE method. In Section 4.2, we explore convergence tests of the waveforms generated using the three refinement methods. Section 4.3 then discusses their accuracy, where we assess the potential of employing these refinement methods for high-fidelity gravitational waves (GWs). The paper concludes in Section 5 with remarks on future directions for research. We use geometrized units G = c = 1 throughout the paper, where G is the gravitational constant and c is the speed of light.

Grid setup
In order to increase computational efficiency through parallelization, the computational grid (numerical domain) in GR-Athena++ is decomposed into a number of subdomains which are referred to as mesh blocks (MBs). Furthermore, the arrangement of MBs is internally represented as a tree data structure: a binary tree for one-dimensional problems, a quadtree in two dimensions, and an octree in three dimensions [54,46]. This representation allows us to conveniently represent relationships between "parent" and "child" MBs when mesh refinement is used.
The resolution on the root grid is controlled by specifying the number of grid points in each direction. The base grid is then subdivided into MBs, each having a predetermined number of grid points, which must divide the number of points in the root grid. The MBs can then be subdivided recursively to create the full AMR grid starting from the root grid. In particular, for a d-dimensional problem, each refinement operation increases the level of refinement by 1 and subdivides an MB into 2^d "children", subject to the fulfillment of a refinement criterion. The converse is achieved during coarsening ("de-refinement"), which depends on a coarsening criterion. We refer the reader to [54,46] for a detailed description and illustration of the tree data structure, Z-ordering, and communication.
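To make the resulting resolutions concrete, the sketch below (Python; not GR-Athena++ source, with the grid values of Table 1 assumed) computes the finest grid spacing implied by a root-grid configuration with 2:1 refinement between levels.

```python
# Illustrative helper: finest grid spacing of an octree grid with 2:1 refinement.
# Values follow Table 1 of this work; the code's exact vertex/cell bookkeeping
# may differ slightly, so treat the result as approximate.

def finest_spacing(domain_extent, n_root, n_levels):
    """domain_extent: full width of the cubic domain (in M);
    n_root: root-grid points per direction;
    n_levels: total refinement levels, counting the root grid as level 1."""
    h_root = domain_extent / n_root        # approximate root-grid spacing
    return h_root / 2 ** (n_levels - 1)    # each additional level halves h

def mbs_per_direction(n_root, mb_points):
    """Mesh blocks per direction on the root grid (MB size must divide it)."""
    assert n_root % mb_points == 0
    return n_root // mb_points

if __name__ == "__main__":
    h = finest_spacing(2 * 2576.0, 256, 12)     # domain [-2576 M, 2576 M]^3
    print(f"finest spacing ~ {h:.1e} M")        # ~ 9.8e-3 M
    print(mbs_per_direction(256, 16), "MBs per direction on the root grid")
```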

Refinement algorithm
At every time step, each MB is tagged for refinement, coarsening, or keeping it as is, subject to a refinement criterion. This typically involves computing a scalar for each MB, which is then compared against a given range provided by the user. The MBs are then refined or coarsened subject to refinement constraints which respect the fact that each MB can have neighbors whose refinement levels are at most ±1 from its own [54]. In the case of refinement, an MB (in three dimensions) is subdivided into 8 "children" MBs with edges of half the length of its "parent". A prolongation operation interpolates data from the "parent" to the "children". Similarly, sibling grids which are all tagged for coarsening are consolidated into a single MB, with data being copied to a coarser grid by a restriction operation. Refs. [54,46] provide additional description and illustration of the refinement algorithm in GR-Athena++.

Prolongation and restriction
Boundary conditions at each MB are implemented in GR-Athena++ through the use of ghost zones. Each MB can communicate with its neighbors by exchanging data between their overlapping ghost zones. For neighboring MBs which are not at the same refinement level, prolongation is performed after the data is exchanged (for coarse-to-fine communication), while restriction is performed prior to data exchange (for fine-to-coarse communication). Similarly, when an MB undergoes refinement or coarsening, these same operations are performed to store data on the resulting MBs.
To map data from a coarser MB to a finer MB, a prolongation operation is performed. For a vertex-centered variable, this is done using Lagrange polynomial interpolation. A smooth function f, sampled over an equispaced grid of 2N + 1 points in an MB, can be uniquely written as

f(x) = \sum_{i=0}^{2N} f_i L_i(x),

where L_i(x) are the Lagrange cardinal polynomials satisfying L_i(x_j) = δ_ij and f_i are the values of f at x_i. This interpolant can then be used to compute f on the finer mesh. Similarly, to map data from a finer MB to a coarser one, a restriction operation is performed. This is trivial in our case because, with a 2:1 grid refinement ratio and vertex-centered sampling of the data, the coarser grid points form a subset of the finer grid, and a direct copy operation can be employed.
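As an illustration of this prolongation/restriction pair, the following one-dimensional Python sketch (not the actual GR-Athena++ operator, which acts in three dimensions with a fixed stencil width) interpolates vertex-centered data onto a grid with half the spacing and restricts by direct copy.

```python
import numpy as np

def lagrange_basis(x, nodes, i):
    """Cardinal polynomial L_i(x) with L_i(x_j) = delta_ij."""
    L = 1.0
    for j in range(len(nodes)):
        if j != i:
            L *= (x - nodes[j]) / (nodes[i] - nodes[j])
    return L

def prolong_1d(f_coarse, x_coarse):
    """Map coarse vertex data onto a grid with half the spacing (2:1 refinement)."""
    x_fine = np.linspace(x_coarse[0], x_coarse[-1], 2 * len(x_coarse) - 1)
    f_fine = np.empty_like(x_fine)
    f_fine[::2] = f_coarse                       # shared vertices: direct copy
    for k in range(len(x_coarse) - 1):           # new midpoints: interpolate
        xm = x_fine[2 * k + 1]
        f_fine[2 * k + 1] = sum(f_coarse[i] * lagrange_basis(xm, x_coarse, i)
                                for i in range(len(x_coarse)))
    return x_fine, f_fine

def restrict_1d(f_fine):
    """Vertex-centered restriction with a 2:1 ratio is an exact copy."""
    return f_fine[::2]
```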

Tracker-based AMR
In our simulations, the position of each black hole at every time step can be determined approximately by tracking the location of the punctures [58]. This is done by solving an ordinary differential equation in addition to the Z4c equations [46]:

\frac{d x_p^i}{dt} = -\beta^i |_{x_p},

where β^i|_{x_p} is the shift vector evaluated at the position of the puncture x_p. Knowledge of the puncture locations allows us to refine in the region near the black holes, where the spatial derivatives of the gravitational fields change rapidly in this regime of strong gravity. Following the well-known box-in-box approach used in many numerical relativity codes [40,59,45,47], we design criteria where the punctures are covered by a non-overlapping arrangement of MBs with increasing refinement levels closer to the location of the punctures.
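A minimal sketch of this tracker is given below (Python, illustrative only): the puncture position is advanced by integrating the ODE above alongside the spacetime evolution. Here `shift_at` is a hypothetical helper that interpolates the shift vector β^i to an arbitrary point; the actual integrator and time stepping in the code may differ.

```python
import numpy as np

def advance_puncture(x_p, dt, shift_at):
    """One midpoint (RK2) step of dx_p/dt = -beta(x_p) for x_p of shape (3,)."""
    k1 = -shift_at(x_p)
    k2 = -shift_at(x_p + 0.5 * dt * k1)
    return x_p + dt * k2
```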
The criterion for refinement in this case is not local in nature, since MBs are tagged for refinement/derefinement not only on the basis of the values of the fields in their interior, but also on the basis of the puncture positions at every time step. Given the coordinate position of the p-th puncture x_p^i, the desired physical refinement level of an MB is calculated by computing the minimum distance between the location of the puncture and the location of the MB. This is determined by calculating the minimum distance between the puncture and the vertices of a rectangular block centered at the MB center, where the edges of the rectangular block are set to 1/4 of the length of the edges of the MB. The purpose of this specific rectangular block, with its particular edge length, is to imitate the box-in-box structure of the BAM code [46] when using the L∞ norm. The minimum distance over all puncture locations is stored in d:

d = \min_{p,\, v} \| x_p - x_v \|_k,

where the minimum is taken over all punctures with locations x_p and all vertices x_v of the rectangular block. Here k denotes the norm used to compute distances, such as the Euclidean norm L2 or the maximum norm L∞. The result is compared with the current refinement level of the MB, after which the algorithm decides to refine, coarsen, or keep the MB as is, subject to the constraint that the simulation has a maximum refinement level l_max. Additionally, MBs with d greater than half of the computational grid extent S are coarsened. In other words, refinement of MBs that reside diagonally with respect to the puncture position at the edge of the computational grid is discouraged. The detailed criterion is presented in Algorithm 1.
Algorithm 1 Static L∞/L2-norm criterion. Input: half of the computational grid extent S, the maximum refinement level l_max, and a root grid.
1: Start with the root grid;
2: for each MB with center position x_mb do
3:   Find x_vj, j = 1, ..., 8, the vertices of the rectangular block centered at x_mb;
4:   for each puncture with position x_p, compute the minimum distance d;
5:   if d > S then
6:     Coarsen;
7:   else if the refinement level implied by d exceeds the current level (and is below l_max) then
8:     Refine;
9:   else if the implied level is below the current level then
10:    Coarsen;
11:  else
12:    Keep it as is;
13:  end if
14: end for
The choice of norm determines the exact arrangement of MBs around the puncture. In our simulations we consider the L∞ and L2 norms. As seen in the left and right panels of Figure 1, using L∞ leads to a "box-in-box" arrangement of MBs around each puncture, while choosing L2 leads to a "sphere-in-sphere" arrangement. From a geometric perspective, at the same distance relative to the puncture, the L∞ method delineates an imaginary box and tags all the MBs located within this box for refinement, resulting in a "box-in-box" configuration. In contrast, the L2 method defines a sphere and tags all the MBs contained within it, resulting in a "sphere-in-sphere" arrangement. Furthermore, it is worth noting that the imaginary sphere created by the L2 method is inscribed within the box produced by the L∞ method. As we will explore in Section 4.1, this characteristic renders the L2 method more efficient compared to the L∞ method.
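The following Python sketch illustrates the tagging logic of Algorithm 1 for either norm; it is not the GR-Athena++ routine, and `level_from_distance` is a hypothetical mapping from puncture distance to the desired refinement level.

```python
import numpy as np

def mb_distance(x_mb, mb_edge, punctures, ord):
    """Minimum distance (in the chosen norm) between any puncture and the
    vertices of the small block of edge mb_edge/4 centred on the MB centre."""
    half = mb_edge / 8.0
    corners = np.array([[sx, sy, sz] for sx in (-half, half)
                                     for sy in (-half, half)
                                     for sz in (-half, half)]) + np.asarray(x_mb)
    return min(np.linalg.norm(c - p, ord=ord)
               for c in corners for p in punctures)

def tag_mb(x_mb, mb_edge, level, punctures, S, l_max,
           level_from_distance, ord=2):
    """ord=2 gives the sphere-in-sphere criterion, ord=np.inf the box-in-box one."""
    d = mb_distance(x_mb, mb_edge, punctures, ord)
    if d > S:                                   # far corners of the domain
        return "coarsen"
    target = min(level_from_distance(d), l_max)
    if target > level:
        return "refine"
    if target < level:
        return "coarsen"
    return "keep"
```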

Error-based AMR
As an alternative to tracker-based AMR, we now consider a method based on the truncation error of the finite difference derivative scheme, which we call the TE method. This method, as opposed to the methods of Section 3.1, i.e., L∞ and L2, does not require information about the puncture positions at each time step and hence, advantageously, is a local operation for each MB. In this work, we use a sixth-order accurate finite difference method for the spatial derivatives in GR-Athena++. Hence, one can estimate the local truncation error of the derivatives analytically. Consider the representation of a C^{n+1} continuous function f over n + 1 distinct grid points x_0, x_1, ..., x_n from the interval I [60],

f(x) = \sum_{i=0}^{n} f(x_i) L_i(x) + \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0}^{n} (x - x_i),

where L_i(x) are the Lagrange coefficient polynomials and ξ is some value in the interval I. The first derivative of f at x_j is then given by

f'(x_j) = \sum_{i=0}^{n} f(x_i) L_i'(x_j) + \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0,\, i \neq j}^{n} (x_j - x_i).

As such, in GR-Athena++ one can model the spatial derivative error budget as C|f^{(7)}(ξ)|h^6, for some constant C and the grid spacing h of each MB. Accordingly, we compute f^{(7)}(ξ)h^6 at each point of the MB and take its maximum as a proxy for the truncation error E in the given MB, i.e.,

E = \max_{x \in MB} |f^{(7)}(x)|\, h^6.   (6)

The field we use to calculate the error E is the conformal factor χ of the Z4c system [61,62]. This choice is motivated by the fact that in the single puncture initial data solution of Einstein's field equations with bare mass m_0, the conformal factor χ has a simple closed form in terms of the coordinate distance r from the puncture [63], and hence χ^{(7)} → ∞ as r → 0 as well as χ^{(7)} → 0 as r → ∞, where χ^{(7)} is the seventh-order derivative of χ with respect to r. It is worth noting that during the dynamical evolution, once the moving puncture gauge settles down, χ at the puncture behaves as 1/√r [64], and still χ^{(7)} → ∞ as r → 0. Additionally, we note that Eq. (6) is not formally valid at the location of the puncture, since the data is not smooth there. However, this choice ensures that E is large close to the puncture and small in the weak field regime, thereby creating a greater concentration of refinements in the strong field regime.
Having computed the error E, we need to set bounds that trigger refinement or coarsening during the dynamical evolution. These bounds are free parameters at our disposal; choosing different values yields a more or a less refined grid. In this work, we choose the bounds by demanding that the finite difference truncation error be smaller than or equal to the largest asymptotic error in our numerical scheme. In particular, we consider the accumulated error of the fourth-order Runge-Kutta method as the largest error in our scheme and hence ensure that E is not larger than O(h^4). Consequently, the lower and upper bounds are chosen as [a h^4, b h^4], where a and b are defined by the user. In practice, we chose a = 1 and b = 10 for all our simulations. A summary of this process is presented in Algorithm 2.

Fig. 2 shows a grid that is generated using the finite difference truncation error-based refinement criterion. The criterion, as desired, refines more near the black hole punctures. Comparing with Figure 1, which shows the grid structure at the same evolution time, we see that this criterion is parsimonious around the puncture locations and in the space between them; the total number of MBs created by this method is generally smaller than for the L∞ and L2 methods.
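A simplified one-dimensional stand-in for this tag is sketched below (Python; illustrative only). It builds the estimate E = max|χ^(7)| h^6 from samples of χ on an MB using a crude finite-difference seventh derivative, which may differ from the estimator used in the code, and applies the [a h^4, b h^4] bounds of Algorithm 2.

```python
import numpy as np

def truncation_error_estimate(chi, h):
    """Proxy for E on one MB from 1D samples of chi with spacing h."""
    d7 = np.diff(chi, n=7) / h**7          # rough seventh-derivative estimate
    return np.max(np.abs(d7)) * h**6

def tag_te(chi, h, a=1.0, b=10.0):
    """Truncation-error tag with the bounds used in this work (a=1, b=10)."""
    E = truncation_error_estimate(chi, h)
    if E >= b * h**4:
        return "refine"
    if E <= a * h**4:
        return "coarsen"
    return "keep"
```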

Results
In this section, we assess the usefulness of our AMR criteria by scrutinizing: the computational cost (or simply the efficiency of the criteria), the convergence order of the extracted GWs, and the capability of the code to produce high quality GWs for next-generation detectors.
For our BBH dynamical evolution tests, the initial data is produced by the TwoPunctures code [65]. We choose a mass ratio q = 1 and a total mass (at infinite separation) M = m_1 + m_2 = 1 for the non-spinning black holes (BHs). For low-eccentricity orbits, following [66], we take the binary separation d = 12M and choose the momenta of each BH as |P_x| = 4.681 × 10^{−4}/M, |P_y| = 8.507 × 10^{−2}/M, and |P_z| = 0.

Efficiency
As mentioned earlier in Section 2.1, GR-Athena++ starts from a root grid, the configuration of which is set in the parameter file, and then the refinement routines begin to check the MBs against the specified criterion and refine them and the subsequent children until the grid resolution satisfies the refinement criterion or exceeds the maximum refinement level. Note that initial data is recomputed on each refinement level if refinement occurs at the beginning of the simulation.
Since the L∞ and L2 methods are based on the puncture positions, and the initial positions are known, for a given root grid and maximum refinement level these methods can readily refine the grid structure in the vicinity of the BHs for an accurate interpolation of the initial data onto the refined grid. On the contrary, the TE method is not informed of the puncture positions a priori. When the initial data are interpolated onto the root grid, they lack the sharp features needed to guide the TE refinement criterion. Furthermore, if we start with a root grid that is properly refined around the punctures, for instance by deploying static mesh refinement, and then immediately switch on the TE method after interpolating the initial data onto the grid, we observe that the TE method creates excessive refinement, with respect to L∞ and L2, until the fields relax and the junk radiation leaves the inner region of the computational domain, resulting in a performance drop. To avoid these issues we use the L2 refinement criterion until the fields have relaxed, before switching to the TE method.
Another caveat we need to consider when using the TE method is preventing the grid from having too low a numerical resolution around the GW extraction regions. This can readily happen if the range of the controlling parameters in the TE method is chosen inappropriately; namely, large values of a and b in [a, b] h^4 result in aggressive coarsening at the extraction distances. Furthermore, when using the TE method, excessive refinement can take place at the outer boundary of the computational grid. This occurs due to the finite domain used for our simulations with Sommerfeld boundary conditions imposed [62]. Consequently, sharp and non-physical features can be created near these boundaries that trigger undesirable refinements in these areas. To avoid under- or over-resolving regions in the wave zone and near the outer boundaries, we restrict the region where the TE method is applied to be close to the punctures. In particular, we set the effective region of the TE method to be a solid sphere centered at the BBH center of mass (the coordinate system origin) whose radius is chosen as large as twice the initial coordinate separation between the BHs. The other regions, where the TE method is not enabled, are only slightly altered as a result of the early evolution with the L2 method and of satisfying the constraint of a maximum 2:1 ratio between neighboring MBs. As a consequence, there is sufficient resolution at the GW extraction regions. We remark that GRChombo avoids these issues by combining multiple refinement criteria [57]. We do not investigate these options here.
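The spatial gating described above amounts to a simple geometric test; an illustrative sketch (with d0 denoting the initial coordinate separation of the BHs) could look as follows.

```python
import numpy as np

def te_region(x_mb, d0, center=(0.0, 0.0, 0.0)):
    """True if the TE criterion should be evaluated for an MB centred at x_mb:
    inside a solid sphere of radius 2*d0 around the BBH centre of mass."""
    return np.linalg.norm(np.asarray(x_mb) - np.asarray(center)) <= 2.0 * d0
```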
Fig. 3 shows the number of MBs at every fifth time step of the evolution for the different AMR criteria. The number of points on the root grid is 256 in each direction (see Table 1). We observe that, even though both the L∞ and L2 methods are started from the same root grid, the L∞ method leads to a larger number of MBs from the very beginning. As mentioned in Section 3.1, the L2 method labels all MBs that are inside a spherical region for refinement; since this spherical region is inside the region specified by the L∞ method, the L2 method results in a smaller number of MBs. In the TE method, we used the L2 method as the initial bootstrapping approach until t ≈ 150M. We set this time such that the damped tail of the junk radiation, which arises from the choice of conformally flat initial data [63], and hence the peak amplitude of the junk radiation, has passed out of the region where the TE method is applied. We see that, after switching to the TE method, the arrangement of the MBs changes: many of the MBs are destroyed.
Finally, we see at t ≈ 1850M, when the BBH system merges and hence there is only one BH for the AMR criteria to work with, a drastic reduction in the number of MBs for all methods. Overall, the lowest number of MBs is created by the TE strategy, followed by the L2 strategy, while the L∞ approach results in the largest number of MBs. We observed the same behavior consistently for the other resolutions as well.
Apart from the total number of MBs, an important aspect for performance and load balancing is the number of created or destroyed MBs at each time step. In the right panel of Figure 3 we show the oscillations in the creation and destruction of MBs at every fiftieth time step. In Figure 3, at each instance of time, a positive number means that MBs were created and a negative number means that MBs were destroyed at that time. In comparison with the L∞ method and the TE method, we see that the oscillations are smallest for the L2 method; hence there are fewer calls to the restriction and prolongation operators as well as to the load balancing routines.

Convergence Test
We employ the Newman-Penrose scalar Ψ4 [67] to extract GWs from our simulations, where Ψ4 = ḧ+ − i ḧ×, and ḧ+ and ḧ× respectively denote the second time derivatives of the GW strains h+(t) and h×(t). In particular, we extract the ψ_{ℓm} modes at the extraction radius R_x from the multipole decomposition

Ψ4 = \sum_{ℓ, m} ψ_{ℓm}\; {}_{-2}Y_{ℓm}(ϑ, φ),   (8)

in which {}_{-2}Y_{ℓm}(ϑ, φ) stands for the spin-weighted spherical harmonic basis. We show ψ22 extracted at R_x = 80M for the highest-resolution L2 simulation in Figure 4.

Figure 4: Real (ℜ) and imaginary (ℑ) parts of the extracted ψ22 mode at R_x = 80M for the highest-resolution L2 simulation. The amplitude is shown with the dashed line. We use the maximum amplitude of the mode to determine the merger time t_mrg.

Additionally, one can write the dominant ψ22 mode as

ψ_{22} = A e^{-iϕ},

where ϕ denotes the phase of the waveform and A its amplitude. We obtain error estimates for the numerically calculated ψ22 by assuming that the error of a numerically calculated quantity of interest q can be modeled in powers of h, namely

q(h) = q_{ex} + c_n h^n + O(h^{n+1}),

where q_ex is the exact value of the quantity, n is the convergence order, and c_n is the constant coefficient of the leading-order error term. Therefore, ignoring higher-order terms of the error, the order of convergence n can be estimated by minimizing the residual of

|q(h_1) - q(h_2)| - C(n)\, |q(h_2) - q(h_3)|,   (11)

where C(n) is defined as

C(n) = \frac{h_1^n - h_2^n}{h_2^n - h_3^n},

in which the subscripts 1, 2, and 3 refer to different resolutions. Accordingly, after aligning the phases of the different resolutions at the merger time t_mrg, we plot |q(h_1) − q(h_2)| and compare it with different values of n in C(n)|q(h_2) − q(h_3)| in order to estimate which value of n minimizes Equation (11). Fig. 5 shows the result of such convergence tests for the different AMR criteria and indicates that n ≈ 5, n ≈ 4, and n ≈ 3 minimize Equation (11) for the L2 method, the L∞ method, and the TE method, respectively.
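This self-convergence estimate can be summarised in a few lines of Python. The sketch below is illustrative only and assumes the three time series share a common time axis and have already been aligned at t_mrg; it scans candidate orders n and picks the one minimising the residual of Equation (11).

```python
import numpy as np

def scale_factor(n, h1, h2, h3):
    """C(n) = (h1^n - h2^n) / (h2^n - h3^n) for a leading-order error c_n h^n."""
    return (h1**n - h2**n) / (h2**n - h3**n)

def best_order(q1, q2, q3, h1, h2, h3, orders=np.arange(1.0, 8.01, 0.1)):
    """q1, q2, q3: the quantity q (e.g. the phase) at resolutions h1 > h2 > h3."""
    d12 = np.abs(q1 - q2)
    d23 = np.abs(q2 - q3)
    residuals = [np.linalg.norm(d12 - scale_factor(n, h1, h2, h3) * d23)
                 for n in orders]
    return orders[int(np.argmin(residuals))]
```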
In our analysis, we aim to evaluate the impact of the various sources of error within our numerical scheme, taking into account the different strategies for AMR, and to determine which AMR strategy delivers faster convergence, allowing us to efficiently obtain accurate GWs. We further ask whether the observed convergence orders in Figure 5 align with theoretical expectations.
Our numerical scheme introduces four sources of error. The first pertains to the truncation error of the finite difference operator, as elaborated in Section 3.2. This error is of order O(h^6). The second is associated with the Kreiss-Oliger dissipation operator [63]. In our simulations, this operator employs four ghost zones, resulting in an error of O(h^7) [46]. Furthermore, our time integration employs the fourth-order Runge-Kutta method. Consequently, the accumulated error, particularly noticeable during extended time evolutions, scales as O(h^4). The fourth source of error arises from the prolongation operator and amounts to O(h^6). It is worth noting that in our vertex-centered grid the restriction operator is exact, as discussed in Section 2.3. As such, the convergence order should theoretically fall between O(h^6) and O(h^4).
Figure 5: Convergence of the waveform phase ϕ for the three AMR criteria. Here t_mrg specifies the merger time. The resolutions h_1, h_2, h_3 are given in Table 1. We try different convergence orders n to ascertain which line minimizes Equation (11); these are illustrated by solid lines. The top panel shows the convergence of ϕ when the L2 method is used, resulting in fifth-order convergence. The middle panel exhibits fourth-order convergence for the L∞ method. The bottom panel depicts third-order convergence when the TE criterion is chosen for the simulation.

When examining the data presented in the top panel of Figure 5, we can discern an approximate fifth-order convergence for the phase with the L2 method. The middle panel of Figure 5 suggests fourth-order convergence for the L∞ method. We speculate that this method exhibits a lower convergence order compared to the L2 method due to the increased number of calls to the prolongation operator routines. As illustrated earlier in Figure 3, the L∞ method, when compared to the L2 method, consistently creates and destroys MBs at a rate more than 400 times larger at each time instance. Even though the prolongation operator is sixth-order accurate, the frequent regridding reduces the smoothness of the numerical solution in time, and might amplify the O(h^4) error term from the time integrator. Finally, the TE method (Figure 5, bottom panel) shows third-order convergence, which may appear lower than the theoretical values. However, as explained in Section 3.2, the TE method is tuned to have a truncation error of O(h^4). The accumulation of this error over O(1/h) time steps is responsible for the overall third-order convergence.
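Written out, and under the standard assumptions that the time step is tied to the grid spacing by the CFL condition (Δt ∝ h) and that the local error accumulates roughly linearly over the evolution of duration T, this estimate reads

\epsilon_{\rm global} \sim N_{\rm steps}\, \epsilon_{\rm local} \sim \frac{T}{\Delta t}\, \mathcal{O}(h^{4}) \sim \mathcal{O}(h^{-1})\, \mathcal{O}(h^{4}) = \mathcal{O}(h^{3}),

which is consistent with the third-order convergence observed for the TE runs.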

Waveform accuracy
In this section we study the quality of the extracted GWs and their computational cost for each AMR method. Here, we focus only on the accuracy of the waveform extracted at a finite extraction radius, since this is what we can control with the AMR. For GW astronomy applications other sources of error, in particular the finite-extraction error, need to be taken into account. As noted in Section 4.1, each refinement strategy develops a different grid structure; equivalently, different AMR strategies result in distinct numbers of MBs for the same resolution. Therefore, instead of using the resolution, we use the total number of evolved grid points across all refinement levels, N, to define a more representative computational cost for our runs. Furthermore, to quantify the "quality", we use the mismatch M between two different ψ22 modes from two different runs, denoted ψ(f) and ψ_ref(f) in their frequency-space representations. We use the mismatch definition [68,69]

M[ψ, ψ_{ref}] = 1 - O[ψ, ψ_{ref}],   (13)

where O[ψ(f), ψ_ref(f)] is the overlap between the waveforms, defined as

O[ψ, ψ_{ref}] = \frac{\langle ψ, ψ_{ref} \rangle}{\sqrt{\langle ψ, ψ \rangle \langle ψ_{ref}, ψ_{ref} \rangle}},

in which the inner product is

\langle ψ_1, ψ_2 \rangle = 4\, \Re \int_{f_{min}}^{f_{max}} ψ_1(f)\, ψ_2^{\dagger}(f)\, df.   (15)

In Equation (15), the † sign denotes complex conjugation, and ψ(f) is the frequency-space representation of the (ℓ = 2, m = 2) mode of Ψ4 (Eq. (8)), obtained using the continuous Fourier transform of the time-domain ψ22 mode. In practice, we utilize a discrete Fourier transform of the numerically derived ψ mode, which is real and sampled on a discrete time line; hence, the integral limits are set by the lowest and highest frequencies of the transform.
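For concreteness, a hedged Python sketch of this mismatch computation is given below. It assumes a flat (white) noise weighting, uses a plain discrete Fourier transform as a stand-in for the continuous one, and performs no maximisation over relative time or phase shifts; any of these choices may differ from the actual post-processing used for this work.

```python
import numpy as np

def inner(a_f, b_f, df):
    """Flat-noise inner product <a, b> = 4 Re int a(f) b*(f) df (discretised)."""
    return 4.0 * np.real(np.sum(a_f * np.conj(b_f)) * df)

def mismatch(psi, psi_ref, dt):
    """psi, psi_ref: real time series (e.g. Re psi_22) on a common time axis."""
    a_f = np.fft.rfft(psi) * dt
    b_f = np.fft.rfft(psi_ref) * dt
    df = 1.0 / (len(psi) * dt)
    overlap = inner(a_f, b_f, df) / np.sqrt(inner(a_f, a_f, df)
                                            * inner(b_f, b_f, df))
    return 1.0 - overlap
```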
To measure the accuracy and efficiency of our AMR criteria in GR-Athena++, we carry out a high mesh resolution (h_4 = 6.6 × 10^{−3} M) simulation with the L2 method, since this method has a higher convergence order than the other AMR criteria. This simulation has N ≈ 7 × 10^{13}. We use the resulting data ψ_ref as the benchmark for the mismatch comparison. We evaluate the mismatch of Equation (13) between the ψ extracted from the simulations in Table 1 (the first three resolutions) and the ψ_ref waveform. Fig. 6 shows the result of this comparison. Each line corresponds to a different AMR strategy and each point shows the resolution N and mismatch for each simulation.
As can be seen from the left panel of Figure 6, the L2-criterion curve lies below both the L∞ and TE curves, meaning that it has a smaller error. Even though the left panel of Figure 6 shows the superiority of the L2 method in terms of the total number of grid points N, it is not necessarily informative of the overhead incurred by refining and coarsening the grid, i.e., interpolation and load balancing. To put this into perspective, we also plot, in the right panel of Figure 6, the mismatch against the CPU core-hours to determine how costly each AMR method is. As previously shown in Figure 3, L∞ has a larger number of MBs and a larger amplitude in the creation and destruction of MBs, so we expect L∞ to be more expensive (∼ 60% more expensive) than L2, and the right panel of Figure 6 confirms this. Notably, Figure 6 further shows that TE is also more costly than L2 (despite the fact that it has a lower number of MBs). We speculate that this extra cost comes from the additional creation and destruction of MBs with respect to the L2 method (Figure 3), and hence more calls to the interpolation and load balancing routines.
Finally, we evaluate the potential of these methods to deliver sufficiently accurate waveforms for next-generation detectors such as LISA [36], the Einstein Telescope [37], and Cosmic Explorer [38]. To avoid biases in parameter estimation for GW observations, we need to ensure that the mismatch (from Equation (13)) between the computed waveform ψ and the modeled one ψ_ref satisfies M[ψ, ψ_ref] ≤ p/(2ρ^2), where p is the number of parameters used in the GW model and ρ is the signal-to-noise ratio of the detector [70]. For next-generation detectors, signal-to-noise ratios as high as 1000 are possible and hence the required mismatch is M ≲ 10^{−7}.
Following Refs. [71,68], we model the mismatch between the waveform at resolution h and the reference one at resolution h_ref as a function M(c, p, h, h_ref), where c and p are coefficients to be fit using the data and p plays the role of an effective convergence order. Using this ansatz, we can then estimate the mismatch between a waveform at a given resolution h and the exact solution as M(c, p, h, h_ref = 0). Fig. 7 illustrates the result of this fit. We first derive the fit parameters c and p using the data points (shown with triangles in Figure 7) obtained from the simulations listed in Table 1. The L2 method yields the largest power (p ≈ 5.5), followed by the L∞ method with p ≈ 5, and finally the TE method with p ≈ 2.1. To further estimate the resolution required for a low mismatch, we assume the benchmark run has infinite resolution and hence take the limit of the fit for h_ref → 0. These limits are shown with dashed lines in Figure 7. While at low resolution the L∞ method shows the smallest mismatch, we expect that the L2 strategy will result in smaller errors at high resolution, thanks to its higher convergence order p.
Finally, we predict the minimum mesh resolution required for each AMR criterion to yield a mismatch ≲ 10^{−7}: for the L2 method h/M = 6.57 × 10^{−3}, for the L∞ method h/M = 6.31 × 10^{−3}, and for the TE method an impracticable resolution (h/M = 0.51 × 10^{−3}). Notably, the model predicts that our highest resolution of the L2 method already achieves the desired low mismatch. However, we remark that this estimate does not account for all sources of error, but only for the finite-differencing error, and its accuracy relies on a limited number of data points. As such, we favor the L2 method as an efficient criterion for producing waveforms for the next-generation detectors.
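As an illustration of this extrapolation step, the sketch below assumes that in the h_ref → 0 limit the mismatch model reduces to a single power law, M ≈ c h^p (this reduced form is our assumption, not necessarily the exact ansatz of Equation (16)); the fit is then inverted for the resolution that reaches a target mismatch.

```python
import numpy as np

def fit_power_law(h, M):
    """Least-squares fit of log M = log c + p log h; returns (c, p)."""
    p, log_c = np.polyfit(np.log(h), np.log(M), 1)
    return np.exp(log_c), p

def resolution_for_mismatch(c, p, target=1e-7):
    """Grid spacing at which the fitted model reaches the target mismatch."""
    return (target / c) ** (1.0 / p)
```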

Discussion
We have presented a comparison of three AMR strategies for BBH merger simulations in the puncture code GR-Athena++. Two of the approaches set the grid resolution based on the distance from either of the two punctures. The first uses the maximum norm, L∞, to determine the distance and produces grid structures similar to those adopted by patch-based AMR codes, such as those based on the Carpet mesh refinement infrastructure [44] in the Einstein Toolkit [43]. The second approach uses the L2 norm, which better conforms with the geometry of the fields in a BBH simulation. The third approach, here denoted as TE, is based on an estimate of the local truncation error of the scheme. This method is, in principle, agnostic to the location of the punctures. In practice, we had to impose additional constraints on the grid structure that are informed by the position of the punctures, as discussed in Sec. 4.1. While our results are specific to our code, we expect that the lessons learned will be applicable to other numerical relativity codes using similar techniques (finite differencing, puncture methods).
We have performed simulations of an equal-mass, non-spinning BBH system extending for ∼ 20 GW cycles up to merger and through merger and ringdown. We find that, for a fixed resolution on the root grid, switching from the L∞ to the L2 approach results in a ∼60% decrease in the number of mesh blocks (and hence grid points). The truncation-error based AMR further reduces the number of grid points by ∼30%. However, despite the significantly reduced number of grid points, the truncation-error based AMR simulations are found to be more expensive in terms of the actual number of core-hours than the L2 simulations. This is because this AMR criterion results in frequent regridding and load balancing operations which slow down our simulations. The "traditional" L∞ approach is the worst performer: it generates the largest number of mesh blocks and also incurs more frequent regridding than the truncation-error based AMR simulations.
We have performed simulations at a minimum of three different resolutions, spanning a factor of two in the grid spacing, for each AMR strategy to verify the convergence of our simulations. We find relatively clean fifth-order convergence for the L2 simulations. The convergence order drops to fourth and third for the L∞ and TE AMR strategies, respectively. The drop in convergence order for the TE scheme should have been expected, because we set the parameters of the scheme so that its local truncation error is O(h^4); due to error accumulation over O(1/h) time steps, the convergence order is only three. The reason for the drop in convergence order for the L∞ scheme is unclear. We speculate that it might arise from error accumulation in the time-integration scheme caused by the more frequent regridding. The difference in convergence order is reflected in a better accuracy for the waveforms computed using the L2 AMR strategy. To quantify this, we compute the mismatch between each simulation and a reference very high-resolution calculation. Both the L∞ and L2 simulations achieve mismatches M ∼ 10^{−5} with the reference simulation at h = 9.8 × 10^{−3} M on the finest grid. However, at this fixed resolution on the finest grid, and at roughly the same accuracy, the L∞ simulations are ∼ 60% more expensive than the L2 simulations. The truncation-error based AMR simulations have a mismatch with the reference simulation larger than 10^{−2} and are found not to be competitive.
We cannot exclude that, with additional tweaking of the refinement criteria, the TE AMR strategy could become competitive, e.g., following Ref. [57]. However, in our tests the L2 AMR strategy clearly outperformed the more traditional "box-in-box" strategy both in terms of cost and accuracy. Moreover, while we have only considered L∞ and L2 in this study, from a geometric standpoint we anticipate observing similar box-in-box or sphere-in-sphere characteristics when employing a general p-norm. Nevertheless, using other norms may lead to varying levels of performance and accuracy, and exploring this is left as a future research extension. Our results suggest that GR-Athena++ can produce waveforms with mismatches as small as 10^{−7}, sufficiently accurate for next-generation GW experiments, using the L2 AMR strategy. An important future extension of this work is to design AMR strategies for high mass-ratio BBH simulations.

Figure 1: The grid structure in the x-y plane for an equal-mass BBH simulation. The snapshots are taken at the same evolution time. Distinct refinement levels are illustrated using varying colors, with the lightest shade indicating the finest level. Dark blue spots represent the punctures and their immediate surroundings. Left panel: deploying L∞-norm based refinement; the arrangement of MBs reveals a distinctive box-in-box pattern. Right panel: using L2-norm based refinement; a discernible sphere-in-sphere pattern is seen.

Algorithm 2 Truncation error criterion (sixth-order finite difference). Input: the error bounds a and b in [a, b] h^4 and a root grid.
1: Start with a root grid;
2: for each MB do
3:   Compute E (Eq. (6));
4:   if E ≥ b h^4 then
5:     Refine;
6:   else if E ≤ a h^4 then
7:     Coarsen;
8:   else
9:     Keep as is;
10:  end if
11: end for

Figure 2: An orbital-plane snapshot of the grid structure for an equal-mass BBH simulation. Refinement levels are illustrated using varying colors, with the lightest shade indicating the finest level. Dark blue spots represent the punctures and their neighbors. Here the TE method is used as the refinement criterion. Comparing with Figure 1, which shows the grid structure at the same evolution time, we see that the TE method tends to create fewer MBs in the space between the BHs.

Figure 3: Left panel: total number of MBs covering the computational grid at every fifth time step of the simulation. Different AMR criteria are specified with different colors. The largest number of MBs belongs to the L∞ method (orange), followed by L2 (blue), and then the TE method (green). We use the L2 criterion as the TE method's bootstrapping until t/M ≈ 150. Right panel: number of created (positive) or destroyed (negative) MBs at every fiftieth time step for the different AMR strategies. The L∞ strategy has the largest amplitude, the next largest is the TE strategy, followed by the L2 strategy. Large oscillations of the MB number demand more load balancing and hence more computational cost.

Figure 6: Left panel: mismatch (Equation (13)) versus the total number of grid point updates for the different AMR strategies. For the mismatch, the benchmark waveform ψ_ref is computed using the GR-Athena++ mesh resolution 384 with the L2 strategy. For a given amount of work, the lowest mismatch is yielded by L2, then by L∞, and finally by the TE method. Right panel: mismatch versus the total core-hours taken by each AMR criterion. For a given mismatch, the L2 method takes less time and hence exhibits better performance than the other AMR methods.

Figure 7: Mismatch models for the different AMR criteria. The data points are shown with triangles, with a distinct color for each AMR method, and the dashed lines estimate the limit of Equation (16) at h_ref → 0, i.e., infinite resolution for the benchmark run. The fastest convergence of the mismatch is obtained by the L2 method (red), followed by L∞ (green) and then the TE method (blue).

Table 1: Grid setups considered in this study. The computational grid covers the region [−2576 M, 2576 M]^3. We use 12 refinement levels. Each MB contains 16^3 grid points. The total simulation time is t ≈ 2370 M. In GR-Athena++, the mesh resolution specifies the number of grid points on the root grid in each direction. The grid spacing, h, indicates the resolution at the finest level.