$\texttt{Hi-COLA}$: Fast, approximate simulations of structure formation in Horndeski gravity

We introduce $\texttt{Hi-COLA}$, a code designed to run fast, approximate $\textit{N}$-body simulations of non-linear structure formation in reduced Horndeski gravity. Given an input Lagrangian, $\texttt{Hi-COLA}$ dynamically constructs the appropriate field equations and consistently solves for the cosmological background, linear growth, and screened fifth force of that theory. Hence $\texttt{Hi-COLA}$ is a general, adaptable, and useful tool that allows the mildly non-linear regime of many Horndeski theories to be investigated for the first time, at low computational cost. In this work, we first describe the screening approximations and simulation setup of $\texttt{Hi-COLA}$ for theories with Vainshtein screening. We validate the code against traditional $\textit{N}$-body simulations for cubic Galileon gravity, finding $2.5\%$ agreement up to $k_{\rm max}=1.2~h/{\rm Mpc}$. To demonstrate the flexibility of $\texttt{Hi-COLA}$, we additionally run the first simulations of an extended shift-symmetric gravity theory. We use the consistency and modularity of $\texttt{Hi-COLA}$ to dissect how the modified background, linear growth, and screened fifth force all contribute to departures from $\Lambda$CDM in the non-linear matter power spectrum. $\texttt{Hi-COLA}$ can be found at https://github.com/Hi-COLACode/Hi-COLA .


Contents
1 Introduction

Motivations
Testing gravity on cosmological scales is a key aim of the upcoming generation of largescale structure surveys, such as LSST [1], Euclid [2], and DESI [3]. These surveys will give unprecedented access to information about the clustering of matter on non-linear scales. They will thus provide an ideal arena for constraining theories of gravity beyond General Relativity (GR). Such theories generically result in fifth-force enhancements of gravity and feature screening mechanisms in order to mimic GR within our Solar System. Notably, many of these deviations from GR lie beyond well-established tests of gravity in the linear regime [4][5][6][7]. However, to make use of this new data, we will need fast and accurate modelling of nonlinear structure formation. Currently, such modelling is only available for a small number of individual modified gravity (MG) theories, largely nDGP gravity [8], f (R) gravity [9,10], cubic Galileon gravity [11][12][13], and interacting dark energy [14,15]. Indeed, nDGP and f (R) gravity have become the standard workhorse theories for tests of gravity beyond the linear regime (e.g. [16][17][18][19]); but this is principally because there exist well-established codes for them 1 , and not due to strong theoretical motivations 2 . A single tool capable of modelling non-linear structure formation across a much broader section of modified gravity theory space would remove these limitations. Indeed for many theories, it would enable them to be subjected to non-linear constraints for the first time. The creation of such a tool is the motivation for this work.
Horndeski gravity [41][42][43] is a class of theories that subsumes a substantial swathe of modified gravity theory space. Any constraints placed on the general Horndeski framework can be rapidly translated into statements about individual members of the Horndeski class. As such, constraining Horndeski gravity has emerged over the past ten years as an efficient and relatively agnostic approach for performing tests of gravity with new data [44][45][46][47][48][49][50][51][52][53][54][55]. At the level of linear perturbation theory, the popular parameterisation developed by [56], sometimes known as the effective field theory of dark energy (EFTofDE), has been successfully constrained by a variety of observations, including the binary neutron star merger GW170817 [57][58][59][60][61][62]. However, obtaining constraints on Horndeski gravity from the non-linear structure formation that will be observed by upcoming galaxy surveys would require parameterisations to be extended to non-linear perturbations. Several attempts to model the mildly non-linear regime within the general framework of EFTofDE have been made [63][64][65][66][67], but these do not address the deeply non-linear regime that we focus on in this paper. The framework of [68] has been proposed to model the deeply non-linear regime for a general modified gravity parameterisation and a spherically symmetric top-hat mass configuration, and we discuss its implementation in an N-body code [69] more below. Interestingly, the relativistic effects of EFTofDE that appear at large, linear scales have been implemented in N-body simulations in [70]. In what follows, we refer to the subset of Horndeski theories with luminal gravitational wave speed as reduced Horndeski gravity.

The Hi-COLA code
The standard method for predicting the observables of non-linear structure formation is to run N-body simulations [71][72][73][74]. Typically, in order to compute the screened fifth forces that arise in MG theories, the non-linear Klein-Gordon equation must be solved for any new field(s) present in the theory. For a general framework like Horndeski gravity, this adds a considerable degree of complexity as well as a significant increase to the computational requirements of the simulation compared to ΛCDM. For these reasons, in this paper we take an alternative approach. We show that the screened fifth force in general Horndeski gravity can be estimated using an approach involving a coupling and a screening coefficient. Our approach essentially equates to solving a linearised Klein-Gordon equation that has been corrected to account for the impact of screening. This technique was first described in [75] and was implemented for nDGP and f (R) gravity in the MG-PICOLA code ( [33], see also [32]), where it led to percent-level agreement with traditional N-body codes up to k ∼ 3h/Mpc. Another method for running N-body simulations in parameterised MG without solving the full Klein-Gordon equation was investigated in [69]. We note there are some similarities in the derivations of the screened-fifth-force expressions given in [69] and that of this work, significantly that they both consider the Vainshtein mechanism in spherical symmetry. However, the screened-fifthforce expression used in [69] is ultimately a phenomenological parameterisation, whereas ours is intrinsically connected to the reduced Horndeski action.
In this work, we present a fast, approximate simulation code called Hi-COLA 3 (Horndeskiin-COLA) that implements our approach for computing screened fifth forces. Hi-COLA uses the COmoving Lagrangian Acceleration (COLA) simulation method [76], which relies on 2 nd order Lagrangian perturbation theory (2LPT) to reduce the number of simulation timesteps required to reproduce accurate large-scale clustering. Thus we can trade computational speed for accuracy at non-linear scales without sacrificing accuracy at large, linear scales. Hi-COLA has two main components. The first is a Python module that takes a Horndeski Lagrangian as input and outputs intermediate quantities necessary for estimating the screened fifth force. The second is an extension of the COLA solver within the publicly available code FML 4 that uses the intermediate quantities from our Python module to compute the screened fifth force throughout the simulation. Whilst Hi-COLA is not itself a 'full' N-body simulation due to its use of COLA, we note that other simulation codes could be adapted to read in the output of our Python module to compute the screened fifth force.
While Hi-COLA allows us to run fast, approximate simulations of non-linear structure for any reduced Horndeski Lagrangian with a Vainshtein screening mechanism, we must first validate the code against well-studied individual theories. For this reason, we first use Hi-COLA to produce simulations for cubic Galileon gravity, with which we explore the parameter space as well as validate our code by comparing against existing traditional Nbody simulations. However, we also investigate a relatively new extended shift-symmetric (ESS) theory studied in [55], creating the first simulations of non-linear structure formation for this theory. We intend to use Hi-COLA to study a more diverse array of modified gravity theories in future work.
The paper is structured as follows. In §2, we review the cosmological behaviour of Horndeski gravity and introduce the cubic Galileon and ESS theories that we focus on in later sections. In §3, we introduce the coupling-plus-screening coefficient approach for estimating a screened fifth force and derive the relevant expressions in terms of background Horndeski quantities. In §4, we briefly introduce the COLA simulation method, before discussing the implementation of the Horndeski screened fifth force in Hi-COLA. In §5, we present non-linear power spectra obtained from Hi-COLA for cubic Galileon gravity, and discuss their features. We then do the same for the new ESS models introduced in §6. We conclude in §7. Additional details regarding validation of the code and in-depth analysis of its outputs, are presented in a set of appendices.
For the busy reader, we suggest those who are principally interested in simulations focus on §2.1, §2.2, §3, and §4 to understand our simulation approach, Appendix B to understand our validation process, and §5 and §6 for demonstrations of our code. For those principally interested in MG theories, we suggest focusing on §2 and §3.1 to understand the theory behind our screened-fifth-force computation, then §5.1, §5.3, §6.1, and §6.3 for our results.

Horndeski cosmology
In this section we describe in detail the cosmology of the Horndeski family of gravity theories, including their background and perturbation equations. This is the fundamental theoretical framework encoded in Hi-COLA. We also introduce the two example Horndeski members for which we will display results in §5 and §6. We describe our procedure for choosing initial conditions and free parameters for these theories in §2.3.

Action and background evolution equations
Horndeski gravity is a generalised class of the simplest modified gravity theories called scalartensor theories. These theories feature four dimensions and second order equations of motion, but also an additional scalar field coupled to matter via the metric tensor. The theories contained within Horndeski gravity feature a variety of screening mechanisms that reduce gravity to GR in certain environments, to maintain consistency with solar system constraints [77]. In this paper, where explicit theories are considered, we will focus on members of the Horndeski class that possess the Vainshtein screening mechanism [78]; we leave a detailed study of chameleon-screened theories [79,80] to a future work.
One of the most prominent constraints on Horndeski gravity comes from the nearsimultaneous observation of the gravitational wave event GW170817 and its gamma-ray counterpart GRB170817A [81,82]. This enabled a stringent constraint to be placed on the speed of propagation of gravitational waves relative to the speed of light, bounding the fractional difference to be smaller than approximately 10 −15 . This in turn strongly constrains several terms in the original Horndeski action [57][58][59][60][61][62] (though see subtleties discussed in [83][84][85][86][87] 5 ).
We refer to the surviving, viable part of the Horndeski action as 'reduced Horndeski'. The reduced Horndeski action is where L m is the matter Lagrangian, M P is the Planck mass, andX represents the kinetic term of the scalar fieldφ:X This action describes the subset of the full Horndeski class in which gravitational waves travel at the speed of light. A member of the reduced Horndeski class is specified viaK,G 3 ,G 4 , which are free functions of the scalar field and its derivative. With respect to the original Horndeski Lagrangian, the quintic 'L 5 ' term has been eliminated, andG 4 reduced to a function ofφ only. G 4 acts as a conformal coupling to the Einstein-Hilbert term (with GR recovered in the limitG 4 = M 2 P /2),K represents a generalised kinetic term, andG 3 is a scalar self-interaction term that gives rise to the Vainshtein screening mechanism (see §3.1). Note that we have also included a cosmological constant term Λ, though its value in Eq. (2.1) should not generally be taken to match that of the ΛCDM model.
The tildes on quantities in Eq. (2.1) indicate that these are generally dimensionful. From here on it will be convenient to work with dimensionless quantities, which we denote without tildes. See Appendix A for an overview of how the dimensionless analogues are defined. In this paper we will make standard choices for all the mass scales and timescales involved in this process, such that the Horndeski scalar field φ is dynamically relevant on cosmological scales. We denote time derivatives with respect to x = ln(a) by a prime.
From the extremisation of Eq. (2.1) we obtain a set of modified Einstein equations and an equation of motion for the scalar field [88]. Evaluating these on a spatially flat Friedman-Robertson-Walker metric, the first Friedman equation can be written as a closure equation: where Ω m , Ω r , and Ω Λ all have their standard meanings (hereafter we suppress their arguments), and the energy density in the Horndeski scalar field is By appropriate substitution, Eqs. (2.4) and (2.5) can be rearranged as a system of two second-order ODEs for φ and E (the resulting expressions are too cumbersome to display here). Together with standard evolution equations for Ω m , Ω r , and Ω Λ , we can solve these as a function of redshift to determine the background cosmology of a given Horndeski model. Ω φ is then constructed from φ and E via Eq. (2.4). One can verify that the closure relation Eq. (2.3) is satisfied to numerical accuracy during this procedure (as it must be).

Horndeski perturbations
Having solved for the background quantities in a general reduced Horndeski theory, our next step will be to study the corresponding evolution of matter density perturbations. On large cosmological scales linear perturbation theory applies, and such modifications to the growth of structure have been computed in many gravity theories [53,[89][90][91][92][93][94]. However, in this work we are interested in pushing beyond this to the (mildly) non-linear regime. In this section we borrow results from the helpful computations of [88]. We work in the Newtonian gauge, expressing the perturbed metric as We also perturb the Horndeski scalar field as φ → φ(t) + δφ(t, x) and the matter density field as ρ m → ρ m (t)[1 + δ m (t, x)]. It will be convenient to use the dimensionless variable where in the last inequality we have indicated the conversion to x = ln(a) we will employ later. On scales substantially below the cosmological horizon we may apply the quasi-static approximation (QSA) [95][96][97][98], in which we ignore time derivatives in the field equations, but keep spatial derivatives. This approximation holds provided the sound speed of the scalar perturbations does not become small, i.e. c 2 s ∼ 1. Since we are working with a general Horndeski framework, this condition cannot be guaranteed to hold for all models, and must be checked on a case-by-case basis 6 .
Under the QSA, and following [88], we assume that Φ, Ψ, and Q are small, but their spatial derivatives may not be and must be kept. In this work we choose to focus on theories that screen via the Vainshtein mechanism. Hence in addition to the term ∼ ∂ 2 Ψ in the Poisson equation, we also keep terms like (∂ 2 Ψ) 2 and (∂ 2 Ψ) 3 , and equivalents for Φ and Q. We will assume that terms without derivatives like Ψ 2 remain second-order in smallness and are discarded. We note that this would not necessarily be true for theories possessing chameleon, dilaton or k-mouflage mechanisms (for example); we leave the inclusion of these terms for future work. The spherical collapse solutions of [68] give good indication that the current work should generalise to screening involving nonlinearity of zero-or first-order derivatives of potentials.
As in §2.1, by projecting out components of the perturbed gravitational field equations, we arrive at two gravitational equations 7 and an equation of motion for the perturbed scalar field. From the traceless spatial part of the gravitational field equations, and assuming matter sources of anisotropic stress are negligible, we obtain The time-time (00) component of the gravitational field equations further yields where the coefficients are (some of these will be needed further below)

16)
and where E and P are the effective energy density and pressure of the Horndeski field in the Friedmann equations, defined in [88]. (For the reduced Horndeski case presented here, A 2 is trivially related to A 1 and B 0 ; however, we maintain the notation of [88] for ease of comparison.) For clarity, the objects above are all 'background' quantities, that is, they are evaluated using the homogeneous solutions forφ and a. The perturbed equation of motion for Q can be written as For a given matter density distribution, Eqs. (2.13), (2.14), and (2.20) are sufficient to solve for the gravitational potentials and scalar field on mildly non-linear scales. Note how the importance of the new scalar field terms in equations (2.13) and (2.14) are controlled by the background objects A 0 , A 1 , etc. This is our first indication that a full solution of the modified cosmological expansion history will be relevant for an accurate description of non-linear scales. Meanwhile, the Horndeski Lagrangian function G 4 clearly plays the role of an effective Planck mass in Eqs. (2.13) and (2.14).
In general, Eqs. (2.13), (2.14), and (2.20) cannot be solved analytically, except in configurations with a high degree of symmetry. In §3.1 we will solve them in a spherically symmetric system, to compute an approximate screening coefficient for our Hi-COLA code.

Cubic Galileon gravity
Later in this paper, we will use cubic Galileon gravity [11][12][13] as a test case with which to validate our general Horndeski simulation code. Although cubic Galileon gravity has been strongly constrained observationally [99], it is one of the simplest Horndeski theories that displays the full phenomenology of Vainshtein screening. Furthermore, it has been extensively studied in [100][101][102] and implemented into N-body simulations in [103][104][105]. We emphasise that the theory we refer to hereafter as 'cubic Galileon' is really the original cubic Galileon model plus a cosmological constant in some cases; this is similar to what is done to restore viability to DGP.
The cubic Galileon theory is defined by the following choices of dimensionless Lagrangian functions (recall these are the dimensionless versions of the functions appearing in (2.1)): where X is defined in (A.3) and k 1 , g 31 ∈ R are the free model parameters of the theory. Within this parameter space, we wish to identify regions that produce a cosmological expansion history reasonably close to that of ΛCDM. One technique for finding such solutions is to demand that the cubic Galileon universe possesses a de Sitter (dS) phase at some future time. This naturally leads us to models in which the universe at z = 0 is evolving towards a dS state, without having reached it fully -much like the present-day ΛCDM model [106][107][108] (see also [109][110][111] for related ideas). A dS phase is characterised by total domination of the dark energy sector; in our case this is the sum of both cosmological constant and scalar field contributions, i.e. Ω DE := Ω Λ + Ω φ = 1 and Ω DE = 0. In order to satisfy these conditions for an extended period, a dS phase then requires 8 E (a dS ) = φ (a dS ) = 0 (where a dS denotes the function for the scale factor in the dS phase). Imposing these dS conditions upon the equations of motion from §2.1 and solving, we quickly find the following relations: where E dS = E(a dS ), φ dS = φ(a dS ) and Ω Λ0 = Ω Λ (z = 0). For later convenience, we define the quantity in rounded brackets above as (2.23) At first glance, this appears to simply shift the need to specify the model parameters k 1 and g 31 , to now instead specifying the values for the Hubble function and scalar field derivative in the dS phase, i.e. E dS and φ dS . However, we can take advantage of the rescaling symmetry of cubic Galileon gravity [106][107][108]112] to perform a change of variables; rescaling symmetry implies the equations of motion are invariant under a transformation of the kind k i → C n(i) k i provided φ → C −1 φ, where C and n are scalars and k i is a generic model parameter. The value of n depends on which parameter(s) are being rescaled, hence n = n(i). E can similarly be rescaled along with counterpart rescalings for the model parameters while leaving the equations of motion invariant. We define Substituting U and y into the equations of motion then motivates the following rescaling of the cubic Galileon model parameters: Now we can rewrite our entire system of equations from §2.1 and §2.2 in terms of the variables {U, y} with parameters {k 1dS , g 31dS } without changing any of the physics involved. Making use of Eqs. (2.22)-(2.24), we see that the model parameters for a cubic Galileon theory consistent with a future dS phase are now given by k 1dS = −6α, g 31dS = 2α. For a given choice of E dS and Ω Λ0 , α is known via Eq. (2.23) and hence the model parameters are determined. In what follows, rather than fix Ω Λ0 directly, we will make use of the ratio f φ , defined by f φ determines what fraction of the energy density of the dark energy sector is due to the cosmological constant Λ, and how much is due to the scalar field energy density. When f φ = 1, the value of dark energy at a = 1 is solely due to the scalar field. In this scenario α = 1, which then implies the model parameters {k 1dS , g 31dS } are completely specified. The determination of initial conditions for our variables, U (a = 1) = U 0 ≡ E −1 dS and y(a = 1) ≡ y 0 , remains. However, one of these can be fixed by demanding the closure equation (2.3) is satisfied at z = 0. We will generally choose this to be y 0 9 . Therefore, in the end, there remain two free parameters for a cubic Galileon model which possesses a future de Sitter phase: U 0 ≡ E −1 dS and f φ . However, we will note here that there is an alternative way to apply the shift symmetry properties of cubic Galileon gravity. We could also choose to normalise the scalar field by its present-day value, defining v = φ/φ 0 . The variables used in the system of equations are now {v, E}, which by definition have their initial conditions at z = 0 both fixed to 1. Then, analogously to the de Sitter rescaling above, one can define new model parameters k 1T and g 31T (subscript T for 'today'), which are related to the original model parameters by k 1T = k 1 (φ 0 ) 2 , g 31T = g 31 (φ 0 ) 3 . Now one of k 1T or g 31T can be fixed using the closure equation, leaving the other as a free parameter along with f φ as previously.
We stress that under correct rescaling, the solutions of the cubic Galileon equations remain identical 10 . In fact this property will be shared by any shift-symmetric Horndeski Lagrangian -this implies theories for which the Lagrangian functions only depend on X. Hence we are free to choose any such rescaling which facilitates our exploration of the parameter space of the theory. This becomes increasingly important as we explore theories with higherdimensional parameter spaces, such as the one in the next subsection. One of our priorities in this work will be to find models that display suitably ΛCDM-like expansion histories; for this purpose, we generally find the de Sitter choice of rescaling the most beneficial.

Extended shift-symmetric (ESS) gravity
As indicated in §1, the aim of this work is to build a fully flexible simulation code that can handle any theory within the reduced Horndeski family. To start putting this generality through its paces, we will further explore a less well-studied theory than cubic Galileon gravity: the extended shift-symmetric (ESS) model (also studied in [55]) defined by (2.27) Comparing with Eqs. (2.21), we see that ESS gravity is a natural extension of cubic Galileon gravity. In addition to the model parameters shared with cubic Galileon gravity, the ESS model has two extra parameters, k 2 , g 32 ∈ R.
As we did with cubic Galileon gravity, we will demand that our ESS model possesses a de Sitter phase at future time (defined by E dS = φ dS = 0). For cubic Galileon gravity this yielded two constraint equations, reducing the number of degrees of freedom by two. This is also what occurs for the ESS model: the dS constraint manifests as two constraint equations that relate one pair of model parameters to the other pair. These constraint equations are Note that, thanks to shift symmetry, we have again rescaled our variables and model parameters to their dS versions (see Eqs. (2.24)-(2.25)). With these constraints in hand, and again 10 Note that one can transform between the de Sitter and 'today' rescalings as k 1dS = k 1T using the closure equation to remove y 0 as a degree of freedom, we are left with four free parameters for the ESS model: U 0 ≡ E −1 dS , f φ , and any two of {k 1dS , k 2dS , g 31dS , g 32dS }. In §2.3, for the cubic Galileon case without a cosmological constant (f φ = α = 1), we found k 1dS = −6 and g 31dS = 2. Plugging these values into Eqs. (2.30) we obtain k 2dS = g 32dS = 0. That is, we recover the cubic Galileon model of §2.3 as a consistent limit of the ESS model. Choosing a model connected to cubic Galileon gravity gives us guidance on finding viable cosmological solutions in the enlarged parameter space of ESS -we will discuss this further in §6.1.
We note that both theories introduced in this section possess shift symmetry. We stress that shift symmetry is not in any way required for a model to be used with our Hi-COLA code; it is merely a convenient device. In future works we will study theories beyond the shift-symmetric ones presented here.

Estimating screened fifth forces
Our method for estimating screened fifth forces is derived by considering the Vainshtein mechanism in spherical symmetry, which we do in §3.1. Treating the calculation as spherically symmetric introduces a well-documented error, which we discuss in §3.2. Our two-part solution to this error involves: 1) interpolation between the spherically symmetric expression and the linear fifth force at large scales, and 2) smoothing of the density field that enters the spherically symmetric expression. We discuss these two additions in §3.2 & §3.3 respectively.

Vainshtein mechanism in spherical symmetry
Mimicking the derivation in [88] (see also [68,113,114]), we consider a spherically symmetric overdensity on a cosmological background, with metric potentials defined by Eq. (2.11). The Horndeski perturbation equations from §2.2 can be integrated once with respect to a comoving radial coordinate r, becoming (note we suppress arguments) where primes denote derivatives with respect to r, and we define the enclosed mass perturbation and dimensionless coefficients For clarity, we explicitly indicate in Eqs. (3.4)-(3.5) that the density distribution and modified gravity coefficients appearing here will change with time steps in our simulations.
Re-arranging Eqs. (3.1)-(3.3) to eliminate ∂Ψ/∂r and ∂Φ/∂r, we quickly arrive at a quadratic equation for r −1 ∂Q/∂r. This has the following solution 11 : From Eq. (3.6) we can extract the Vainshtein radius: We introduce the useful dimensionless quantity χ (where Ω m0 = Ω m [z = 0]): To reach the second equality above we have used the definition of µ r from Eq. (3.7), and also expressed the mean matter density viaρ m = δ AV is a radially-averaged matter density perturbation: Note that in the limit of a top-hat perturbation, δ AV (r ≤ R) = δ m (r). In §3.3 we will see that the density field of our Hi-COLA simulations must be smoothed for practical reasons. This smoothing also has a side-effect of bringing the non-spherically symmetric density field in the simulation closer to the spherically symmetry δ AV . For this reason, in our simulations we replace δ AV in Eq. (3.9) with the smoothed density field; our validation results in Appendix B will justify that this is acceptable. We discuss the use of the spherically symmetric approximation further in §3.2.
Returning to Eqs. (3.1) and (3.2), we can eliminate ∂Ψ/∂r and substitute in our solutions for ∂Q/∂r and r * . To make the resulting expressions as familiar as possible we define G G 4 = (16πG 4 ) −1 . This quantity is the contribution to the effective Newton's constant coming from conformal coupling alone, i.e. the first term in the Lagrangian of Eq. (2.1). If this part of the Lagrangian is unmodified from GR, then 11) or equivalently, when re-written as an expression for the modified total force in terms of χ (using Eq. (3.9)):
where F N represents the standard Newtonian force law used in simulations (see [115][116][117][118] for work on fully relativistic N-body codes). There are two key components appearing in this expression that are purely time-dependent (we remind the reader E = H/H 0 ): (3.14) Hereafter we refer to β as the coupling. Rather than χ/δ m itself, we will work in terms of the screening coefficient S defined by which is density-dependent (and therefore implicitly scale-dependent), as well as time-dependent. On large scales where density perturbations are small, χ 1 and hence the screening coefficient S → 1. On small scales χ is large, and hence S → 0 like χ − 1 2 ; the fifth force is screened away and the force law returns to that of GR, except for the potential factor of G G 4 /G N in Eq. (3.12) 12 .
Putting this all together, our modified force law including the screened fifth force can be written in the helpful schematic form: We reiterate here, for later use, the salient features of this expression: β is a purely timedependent coupling that sets the maximum strength for the fifth-force term. It is determined entirely by the solution of the background equations for H and the scalar field. S is a screening coefficient that smoothly takes the fifth force from full-strength to fully-screened as determined by the density field. It is also determined by the background solutions through the quantity χ/δ m , and the smoothed density field in the simulation. Plots of β and S for cubic Galileon gravity and ESS gravity will appear in §5 and §6.
As mentioned in §1.2, this is schematically the same as the approach described in [75]. The similarity can be seen by comparing Eq. (3.16) to Eq. (45) in [75]. A key difference is that our expression for the screened fifth force is directly connected to the functions that appear in the Jordan frame reduced Horndeski action. We can use this to compute screened fifth forces for any Vainshtein screened theory contained within that action via the Hi-COLA front-end module described in §4.1. Note also that [120] found that modified gravity COLA simulations using the screening approach described in [75] reproduced inaccurate results for some 3-point statistics in cases where the effect of screening is strong on quasi-non-linear scales, so it is possible Hi-COLA would do the same, although we restrict our study to 2-point statistics in this work.
We note in passing that the appearance of a square root in Eq. (3.16) is potentially dangerous in voids, where the underdensities could result in a negative χ (Eq. (3.9)). This issue is known to occur in Galileon models for some parameter values; its physical interpretation is poorly understood. Existing simulations simply set the scalar field to a constant value when such pathologies occur. We do not encounter this problem in the models used in §5 and §6; a pre-check for likely pathological solutions can be implemented in future work.
Finally, we remind the reader that Eq. (3.16) is specialised to the Vainshtein mechanism due to assumptions made in §2.2. We note that the linearised limit of Eq. (3.16) depends on redshift only; this implies that the linear growth rate will remain scale independent, see also §4.2 . Scale-independence of the linear growth rate will not be maintained under other screening mechanisms.

Linear-screened force interpolation
Expressions for the screened fifth force derived in the manner above are known to underestimate the linear modification to gravity at large scales when used in N-body simulations. This behaviour is explained clearly in Appendix A of [23] in the context of DGP, which, like the theories we consider here, features a Vainshtein screening mechanism. Essentially, when we assume spherical symmetry the ∂ i ∂ j φ term that is present in Eq. (2.20) becomes proportional to the ∇ 2 φ term and is thus absorbed into it. While this simplifies the system of equations and allows us to derive our expressions above, it does mean our solution is approximate compared to the exact solution where spherical symmetry is not assumed and the ∂ i ∂ j φ term remains. The key consequence is that for high densities δ m 1, our approximation underestimates the far-field effect of φ. This is because, for the exact solution, the fifth force ∇φ decreases more slowly than the 1/r 2 that we see in Eq. (3.11) for our approximate solution. This topic is discussed further in [24,[121][122][123][124][125].
Fortunately, it is easy to compute the correct far-field fifth force due to a dense, highlyscreened source using linear theory. However, this linear solution of course includes no screening, so will overestimate the fifth force on small scales.
To summarise, we have a solution for the fifth force from linear theory that is accurate at large scales but not at small scales due to the lack of screening. We also have a solution for the fifth force derived in §3 that is accurate at small scales due to the inclusion of screening, but inaccurate at large scales due to the assumption of spherical symmetry. Thus to circumvent the issues with each, we can interpolate at large scales between the linear solution for the total force F lin and the approximate solution for the total force F approx computed by Eq. (3.16) to ensure we get the correct behaviour at all scales.
One way to do this is to use a low-pass filter such as which has a single parameter k filter to calibrate. This ensures that at large scales, f → 1 and F filtered → F lin ; whereas at small scales f → 0 and F filtered → F approx . Other interpolation filters, including those where the width and midpoint can be varied independently, could be used instead.
We discuss how to calibrate the value of k filter in Appendix B.2; see Fig. 11 specifically to see how it affects the clustering of matter in cubic Galileon gravity.

Density field smoothing
The density in a particular grid-cell of a cosmological simulation is computed using the volume of that grid-cell, and therefore the resolution of the simulation. As we increase the resolution the volume of each grid-cell decreases, and thus the density in cells that contain a simulation particle becomes large. Since our screening coefficient S is dependent on density, this means our screened fifth force is very sensitive to the simulation resolution, as mentioned by [75].
To alleviate this issue, we apply a Gaussian smoothing to the density field before computing the screened fifth forces. Since a convolution in real-space is equivalent to a product in Fourier-space, the density smoothing is applied as where R smooth is the smoothing radius to be calibrated. Alternative smoothing filters, such as a top-hat or a sharp cut-off, could be used instead.
We discuss how to calibrate the value of R smooth in Appendix B.2; see Fig. 12 specifically to see how it affects the clustering of matter in cubic Galileon gravity.

Hi-COLA implementation
The approach for estimating screened fifth forces in reduced Horndeski gravity that we outline in §3 could be implemented in any N-body simulation code. In this work, we have chosen to implement it on top of the COLA solver contained within FML 13 . We call the resulting modified code Hi-COLA 14 (Horndeski-in-COLA).
In the COLA approach to simulating structure formation [76], instead of solving for the full particle trajectories, we solve for the deviations of the full trajectories relative to the trajectories predicted by 2 nd order Lagrangian perturbation theory (2LPT). The evolution of the particles on large scales will be very close to that predicted by 2LPT. Thus in this approach, we can decrease the number of simulation timesteps to trade accuracy at small scales for overall simulation speed while maintaining good accuracy at large scales.
The COLA solver in FML has been modified in three key areas to create Hi-COLA: the background expansion, 2LPT, and particle-mesh (PM) sections of the code. Conveniently, it already contains many of the necessary components we have discussed above in §3, including density-dependent screening (via the method described in [75], although for DGP gravity only), linear-screened force interpolation, and density field smoothing.
One entirely new addition in Hi-COLA is the Python module component which takes a Horndeski Lagrangian as input, solves the background for that model following the approach described in §2, then outputs β and χ/δ m as computed by Eqs. (3.13) & (3.14). Various quantities from these pre-computations are then given to the COLA solver as additional input and used as required (for example to compute the screened fifth force).
In the remainder of this section, we describe the various components of Hi-COLA in more detail.

Background computation
The Hi-COLA front-end module directly receives as inputs the functions K(φ, X), G 3 (φ, X) and G 4 (φ) appearing in the Horndeski Lagrangian, subject only to a few manipulations to factor out the dimensions of these objects (described in Appendix A). This immediate connection between action and outputs is designed to enable rapid investigation of theoretical models and their observational consequences for large-scale structure. We note this is something often obscured by parameterisations of MG at the field equation level, which usually cannot be linked to terms in the gravitational action in a straightforwards manner.
The front-end module is responsible for the pre-computation of cosmological background quantities. These are E and E (the Hubble rate and its derivative), the scalar field trajectory φ, and the energy densities Ω m , Ω r , Ω Λ and Ω φ . From the scalar field and Hubble solutions the dimensionless coefficients in Eq. (2.23) are computed, followed by the combinations B and C in Eq. (3.7). These enter the coupling function β (Eq. (3.13)) and the time-dependent object χ/δ m (Eq. (3.14)), the latter of which is needed for the screening coefficient S (Eq. (3.15)). β and χ/δ m , together with E and E , are passed into the COLA solver to compute modified forces as described in §3.
Pragmatically, the first step is for the user to specify forms for the functions K, G 3 and G 4 . Our front-end module then forms the necessary derivatives of these objects and constructs the equations of motion shown in §2, utilising the Python symbolic computation library SymPy 15 to do so. The equations of motion are manipulated so that they provide the derivatives E and φ in a solvable format. Note that although the Horndeski equations are generally quite complicated, they remain linear in E and φ . Once constructed symbolically, the resulting expressions are then rendered useable by other Python functions and solved numerically.
A number of support modules are available to visualise the cosmological background solutions, and also to perform further tasks. In particular, one module performs scans over sets of model parameters and initial conditions, locating those that result in background expansion histories consistent with given criteria. This procedure is described in more detail in §5.1 and §6.1 16 .

2LPT
Computing the particle trajectories predicted by 2LPT is a vital component of the COLA method. The LPT equations for the first and second order growth factors implemented in FML are where µ (1) & µ (2) parameterise the effect of modified gravity in the Poisson equation at first and second order respectively. For a more detailed overview of LPT in modified gravity theories, see [33,[126][127][128]. In the parameterisation above, µ (1) is simply where β is the coupling from Eq. (3.16). Note that β depends only on redshift, meaning that scale-independent growth is maintained. When the form of µ (2) in this parameterisation is derived for modified gravity theories (for example, see Appendix B of [65] for the derivation in Horndeski gravity), we expect µ (2) = µ (1) as µ (2) should include the leading order nonlinear screening behaviour. However, for simplicity in Hi-COLA, we choose to make the approximation µ (2) = µ (1) . We have tested this does not affect the P (k) output by our simulations above the 1% level for the cubic Galileon theory; we expect this to hold for ESS gravity, given the similarity of the two theories. Our use of this 2LPT approximation is further supported by the fact that, in the COLA method, the 2LPT prediction is essentially only used as a first guess that is then corrected by the PM part of the code. While these PM corrections dominate on small scales, the 2LPT does normally impact the simulation at large scales due to the low number of PM timesteps typical of COLA simulations. However, on these large scales there is no non-linear screening and thus our approximation is valid as µ (2) → µ (1) . See [32] for another work where the role of µ in LPT is simplified. We acknowledge that the above approximation may be less appropriate for other theories within Horndeski gravity, such as those that demonstrate screening on quasi-linear scales, and may introduce larger errors for quantities other than the real-space matter power spectrum, such as bispectra or halo statistics. Our 2LPT treatment could be improved by extending the approach described in [114] for cubic Galileon gravity to general Vainshtein-screened Horndeski theories.

Forces between particles
To include the effect of the screened fifth force in the PM part of Hi-COLA, we implement Eq. (3.16). Fortunately, the FML code already contains an implementation of densitydependent screening for DGP gravity following the method described in [75], which we build upon to implement Eq. (3.16). We spline the two time-dependent quantities β and χ/δ m produced as described in §4.1 for the full redshift range of the simulations, such that they can be quickly read and given to Eq. (3.16) whenever the simulation needs to compute the screened fifth force.
Additionally, the existing FML code already allows for interpolation between the linear and screened force solutions at large scales using Eq. (3.18), and smoothing of the density field δ m before it is passed to Eq. (3.16) using Eq. (3.19). We discussed the importance of these two steps in §3.2 and §3.3 respectively. These two steps introduce two new free parameters: To calibrate these two free parameters, we vary them to maximise the agreement between the output of Hi-COLA and the equivalent from N-body codes that compute the screened fifth force directly via slowly solving the Klein-Gordon equation. Ideally, this calibration should be done for each modified gravity theory for which we intend to use our approximate method. However, this presents a problem, as N-body codes that directly solve the Klein-Gordon equation only exist for a handful of individual modified gravity theories, whereas we want our approximate method to be applicable across a wide section of modified gravity theory space.
For now, we carry out the calibration using N-body simulations for cubic Galileon gravity only, and then, using these calibrated parameters, validate Hi-COLA against the same cubic Galileon N-body simulations. This validation yields an understanding of what regimes our fast, approximate Hi-COLA code produces accurate results for. We present this validation in Appendix B, and the calibration specifically in Appendix B.2. Table 1: Key base cosmological parameters used throughout; they match those used for the simulations in [104].

Cubic Galileon simulations with Hi-COLA
As noted earlier, although Hi-COLA is capable of creating simulations for any theory within the Horndeski Lagrangian, in this work we focus in-depth on two specific models. We first explore the parameter space of the well-studied cubic Galileon model that we introduced in §2. 3.
We note that throughout this section, we use the same base ΛCDM cosmological parameters as [104]; we recap the key values in Table 1. However, we do not use the same cubic Galileon parameters. We will describe the cubic Galileon parameters we use instead below. In this section, we focus on the behaviour that is essential for understanding the non-linear matter power spectrum result. We investigate the cubic Galileon phenomenology in greater detail in Appendix C.

Viability of expansion history
In §2.3 we explained that after exploiting the rescaling symmetry of cubic Galileon gravity, we are left with two free parameters for the model, E dS and f φ 17 . Additionally we have the initial conditions for {φ , Ω m , Ω r , Ω Λ } that need to be specified. For {Ω m , Ω r } we use the values in Table 1 which match those of [104], which were obtained from fitting cubic Galileon gravity to WMAP 9yr results, SNLS supernovae and BAO measurements from SDSS. The initial value of Ω Λ is then fixed by ensuring Ω Λ + Ω φ = Ω LCDM Λ for a given value of f φ . The initial condition for φ is determined through enforcement of the closure equation at z = 0.
Thus, the question that remains is how to select appropriate values of E dS and f φ . We scan over the joint parameter space of E dS and f φ in order to find cubic Galileon models with expansion histories that are broadly similar to ΛCDM, since this is what observations currently support. In practice, this is achieved by subjecting the background solutions to a set of criteria that reject models which are either: i) qualitatively different from ΛCDM, or ii) have unphysical solutions. These rejection criteria are as follows, and are checked in the order presented: 1. Ω i < − or Ω i > 1 + , or Ω i = NaN at any point, where i is one of {m, r, Λ, φ}. This identifies when energy densities become unphysical. This criterion is implemented to within a small tolerance of = 10 −6 to allow for numerical errors. In Fig. 1 these models are coloured purple-blue .
2. max [Ω DE (z > 1)] > Ω DE (z = 0). When satisfied, this indicates that there exists a value in the dark energy density at z > 1 that is greater than its value at z = 0. This criterion identifies when a model is exhibiting early dark energy, though it allows for The bottom-most light pink region corresponds to cubic Galileon models with early dark energy. The dark pink region corresponds to cubic Galileon models where the matter energy density fails to reach a critical value, meant to represent matter energy domination. Finally, the dark blue region shows cubic Galileon models that pass all criteria, with the orange stars representing the five cubic Galileon models featured in this paper. The 'fins' in the dark blue region are numerical artefacts and depend on the value of (larger values have the effect of increasing the fin sizes). For simplicity, models that satisfy multiple criteria have been coloured so as to keep the coloured regions in the plot as continuous as possible.
dark energy density to exceed values today in the redshift range 0 ≤ z < 1. In Fig. 1 these models are coloured light pink .
3. max [Ω m ] < Ω crit m . This indicates that the maximum value for the matter energy density fails to equal or exceed a critical value, Ω crit m . This criterion identifies when a model fails to produce a matter-dominated era in the universe. For the results in this paper, Ω crit m = 0.99. In Fig. 1 these models are coloured dark pink .
4. If all above criteria are passed, then the model's expansion history is considered viable. In Fig. 1 these are coloured dark blue .
For cubic Galileon gravity, we show the model space in terms of the criteria listed above in Fig. 1. We can see that for cubic Galileon gravity, there is a dark blue band of fully viable, LCDM-like models in (f φ , E dS ) parameter space. The 'fins' in this band are numerical artefacts that depend on . Larger values of tend to increase the sizes of these fins. The models that we study in this paper are picked from this viable band, and the model parameter values are listed in Table 2 and represented by orange stars in Fig. 1. They all possess the usual traits of a Hubble rate that decreases with time, a radiation-dominated phase followed  Table 2: Parameters for the five cubic Galileon models considered in §5. f φ and E dS are highlighted in bold as these are the independent parameters. k 1dS and g 31dS are fixed by these values as per Eq. (2.22). by a period of matter domination, followed by a dark energy-dominated phase at late times. A further investigation of why cubic Galileon gravity has a particular region of viability, along with analogous discussions for the ESS model, will be studied in a future work. As a reminder, one should note that in the context of this work, models are called "viable" after passing background-level criteria; constraints coming from large-scale structure data are not currently considered in the model-finding process.
Given the relative narrowness of the viable band in Fig. 1 we can consider an effective one-parameter parameterisation of cubic Galileon gravity. Excluding the fin artefacts, one can fit a function to the upper edge of the remainder of the band, and use it to determine the corresponding maximum valid value of E dS for a chosen value of f φ . In the next section we show Hi-COLA simulations for the five {f φ , E dS (f φ )} cases listed in Table 2, determined in this way.
We note that one could also use priors like positivity bounds [129,130] to reduce the parameter space being considered, particularly before a scan is performed. While it appears the sign of k 1dS is consistent with the bounds detailed in [130], as the authors explain, these bounds are flat-space calculations. It would therefore be of interest to utilise similar bounds derived in curved spacetimes like de Sitter space as priors in future parameter scans. Such an approach may be of significant aid in reducing the time taken to complete scans in the higher-dimensional ESS parameter space discussed in §6.1.

Initial conditions
A common approach for generating initial conditions (ICs) in N-body simulations is to use backscaling (referred to as 'Newtonian backscaling' in [131]). The backscaling approach is designed to account for the flaw of N-body simulations not fully including the effects of radiation during their forward evolution. This flaw manifests when such an N-body code is used with ICs generated directly at the initial simulation redshift.
To generate backscaled ICs, the linear power spectrum at a target redshift (often z = 0), generated by a Einstein-Boltzmann code including the full effects of radiation, is evolved backwards to the initial simulation redshift using the flawed linear growth factors computed by the simulation code (which do not include the full effects of radiation). This ensures that, when the simulation evolves forwards to the target redshift (effectively using those same flawed linear growth factors on linear scales), it recovers the correct linear power spectrum containing the full effects of radiation as computed by the Einstein-Boltzmann code.
In this work, we do not have access to a Einstein-Boltzmann code that has been modified to account for either of our two modified gravity cosmologies. This prevents us from not only creating appropriately backscaled ICs, but also from generating modified gravity ICs directly at the initial simulation redshift. What we chose to use instead is to use ΛCDM ICs generated directly at the initial simulation redshift, which in our case is z = 49. Specifically, to set our initial conditions (ICs), we use FML's existing functionality to read directly from the particle data in the z = 49 snapshot of [104], such that our ICs are identical to theirs.
With our approach, in addition to the error caused by the lack of backscaling, any deviation from ΛCDM at z = 49 will introduce error via the ICs. We can estimate the size of this error from the ratio of modified gravity and ΛCDM growth factors at z = 49, a quantity which we plot in Fig. 15 for our strongest cubic Galileon case where (f, E dS ) = (1, 0.921). We find the ratio of cubic Galileon and ΛCDM growth factors at z = 49 is 1.00225, or a 0.225% deviation from ΛCDM. Ignoring the effects of radiation, we can approximate the ratio of cubic Galileon and ΛCDM linear power spectra at z = 49 to be 1.00225 2 ≈ 1.00451, or a 0.451% deviation from ΛCDM. This small error in the linear power spectra at z = 49 will propagate to give a similarly-sized offset at large, linear scales in the non-linear power spectra measured from the simulation output at late times, and may also introduce transient effects at non-linear scales. We discuss the linear growth factor in cubic Galileon gravity further in Appendix C.2.

Other simulation specifics
We use a boxsize of L = 400 Mpc/h with N part = 512 3 particles per dimension and a force grid of size N mesh = 3(N part ) 1/3 = 1536. We use 40 timesteps linearly spaced in a between a ini = 0.02 and a = 1. This choice aims to balance accuracy at small scales against computational cost, and is based on both our previous experience with COLA [19,32,33,36,126] and the thorough study of the optimal number of timesteps in COLA simulations from Section 4.1 of [132]. The base cosmological parameters used are listed in Table 1, again matching the simulations from [104]. 18 As described in §4, our Hi-COLA simulation takes the time-dependent E, E /E, β, and χ/δ m functions produced by the Hi-COLA front-end module as input.
In addition to the reference ΛCDM simulation and the full cubic Galileon gravity simulations where the expansion is modified relative to ΛCDM and the screened fifth forces are computed, we run a simulation that is a hybrid between ΛCDM and full cubic Galileon gravity for each of the five cases listed in Table 2. This type of hybrid, where the expansion of the simulation matches that of the full cubic Galileon simulation but the forces are purely GR, is referred to as a QCDM simulation in [104] 19 . By taking the ratio between the outputs of the QCDM and ΛCDM simulations, we can isolate the effect of cubic Galileon gravity's modified expansion history. By taking the ratio between the outputs of the full cubic Galileon and QCDM simulations, we can isolate the effect of cubic Galileon gravity's screened fifth force. These two isolated effects are shown for the real-space matter power spectrum in the lower panels of Figs. 3 and 4 respectively. In Appendix C.3, we use these hybrid simulations and others to further investigate the isolated impacts of various components that are modified in cubic Galileon gravity relative to ΛCDM.
Thus the final requirement is two simulations for each of the five cubic Galileon cases described in Table 2, plus a single ΛCDM reference simulation; a total of 11 simulations.
The real-space non-linear matter power spectra P (k) are measured from the simulation snapshots using FML's built-in on-the-fly power spectrum measurement option. To ensure we can trust our results, we have validated Hi-COLA against traditional N-body simulations of cubic Galileon gravity [104,133] in Appendix B. This validation indicates that the simulation setup described above allows our results for boost factors (ratios of real-space matter power spectra, here specifically between different cosmologies, for example cubic Galileon gravity vs ΛCDM) 20 to be trusted within 2.5% up to k max = 1.2 h/Mpc.
At scales smaller than k max our force computation becomes too inaccurate to successfully reproduce the results of traditional N-body simulations to the level we desire. This is due to the combination of the low number of timesteps (a defining feature of COLA simulations), the force resolution of our simulations (set by the ratio N mesh /N 1/3 part ), and the approximations involved in our specific approach to estimating the screened fifth force.

Simulation results
We study the effect of cubic Galileon gravity on structure formation by taking the ratio between the P (k) from the full cubic Galileon gravity simulation and the ΛCDM simulation at z = 0, which we plot in Fig. 2.
Firstly, we see that the impacts of cubic Galileon gravity are stronger for increasing values of f φ , the fraction of effective DE contained in the scalar field. This is reassuring, given that we know we should recover ΛCDM for f → 0.
Secondly, we see the P (k) have an approximately scale-independent enhancement in cubic Galileon gravity relative to ΛCDM on large scales, and that this enhancement increases slightly with k at intermediate scales. However, this enhancement is then reduced significantly at small scales. But what specifically is responsible for this behaviour?
The first key feature of cubic Galileon gravity is that there is a period of slower-than-LCDM expansion around z ≈ 1, as can be seen in the upper panel of Fig. 3. Cosmic expansion suppresses structure formation; thus the slower-than-ΛCDM expansion rate reduces this suppression effect, leading to a net enhancement of structure formation relative to ΛCDM. However, cosmic expansion will have less of a suppressing effect on small scales where structures are locally gravitationally bound. So the impact of the different expansion history in cubic Galileon gravity is to produce an enhancement of the z = 0 power spectrum on large scales relative to ΛCDM that falls away towards small scales. This effect can be seen in isolation via the ratio of power spectra in QCDM and ΛCDM in the lower panel of 20 Accurate boost factors are still very useful, albeit in a less direct way than accurate absolute power spectra. For example, a beyond-ΛCDM vs ΛCDM 'cosmology boost factor' computed by fast, approximate code (such as Hi-COLA) can be multiplied by the ΛCDM power spectrum computed by a more accurate method (such as traditional N-body) to yield an accurate beyond-ΛCDM power spectrum. Other kinds of boost factor can also be used to shift cosmological parameters from a reference set (a 'parameter boost factor'), or even just evolve a power spectrum from a reference redshift to a target redshift (a 'redshift boost factor'). One could even design a boost factor to do multiple tasks at once, for example to transform a ΛCDM power spectrum with base cosmological parameters 'A' at z = 49 to a beyond-ΛCDM power spectrum with base cosmological parameters 'B' at z = 0. See [134] for more information, although note that in their terminology our boost factors are referred to as 'response functions'.   Fig. 3. Note that at high redshifts the enhancement is almost scale-independent since the linear regime extends to larger k at early times. The second key feature of cubic Galileon gravity is the screened fifth force. In Fig. 4, we study the two components of the screened fifth force in cubic Galileon gravity: the coupling β and the screening coefficient S, which we introduced in §3.1. The coupling is obtained from the background solutions via Eqs. (3.13). To compute the screening coefficient, we first compute the Vainshtein radius ratio χ/δ m via Eq. (3.14). We can then use χ/δ m to compute the screening coefficient S for various test values of the overdensity δ m using Eq. (3.15). Since G G 4 = G N for cubic Galileon gravity, the product βS is then the strength of the screened fifth force at each test value of δ m . In Fig. 4 we plot the evolution of β with redshift for each of the five cubic Galileon cases described above, as well as the evolution of S and the product βS for three test values of δ m = {0.1, 1, 10}. Finally, we want to understand how the screened fifth force, which is a function of redshift and density, maps onto the power spectrum, which is a function of redshift and scale. To do so, we also plot the ratio of power spectra from the full cubic Galileon simulation and that of the QCDM simulation in the lower panel. This ratio isolates the effect of the screened fifth force, as both simulations feature the cubic Galileon expansion history. Firstly, the upper left panel of Fig. 4 shows that the coupling in cubic Galileon gravity is very small at early times but grows rapidly around z = 1, leading to a significant fifth force activating at late times. This is consistent with our criteria in §5.1, which reject models where the Horndeski scalar field is significant at early times. Secondly, the lower left panel of Fig. 4 shows the screening coefficient decreases with increasing density. For low densities, S ≈ 1 meaning that screening is inactive and the full fifth force operates. For high densities, S 1 indicating that screening is active. The upper right panel displays the product of the coupling and screening coefficient, and we see, unsurprisingly, that the fifth force activates at late times but is suppressed at high densities due to screening. Finally, the lowest panel shows the effect of this screened fifth force on the power spectrum is to produce an approximately scale-dependent enhancement to the z = 0 power spectrum at large scales, which increases slightly with k at intermediate scales, before falling off at small scales as screening activates.
For the interested reader, we study and discuss the phenomenology of cubic Galileon gravity in greater detail in Appendix C. But in summary, Fig. 2 shows that the combined impact of the period of slower-than-LCDM expansion and the screened fifth force is to enhance P (k) on large linear scales, but that this enhancement falls off at small non-linear scales due to the screening and the cosmic expansion not affecting structures that are locally gravitationally bound. We highlight that including the modified expansion history of cubic Galileon gravity has a significant impact on the resulting non-linear power spectrum.
This study of cubic Galileon gravity has demonstrated the power of Hi-COLA to take us, consistently, from a Lagrangian all the way through to simulating non-linear clustering. While a ΛCDM-like background is sometimes assumed in MG studies (and particularly in MG simulations), we emphasise that our method does not make this assumption. We find that, for the models considered here, the modified expansion history has an effect that is perhaps surprising in magnitude. In the next section we repeat this demonstration for a previously un-simulated theory.

Extended shift-symmetric (ESS) simulations with Hi-COLA
In this section we will present results from the first simulations of non-linear structure in the extended shift-symmetric (ESS) gravity models described in §2.4. An extended discussion of power spectrum features can be found in Appendix D.

Viability of expansion history
The procedure for finding viable ESS models is similar to the process carried out for cubic Galileon gravity; the initial conditions for the cosmological density parameters are fixed in the same way. A difference is that there are now four model parameters to fix: E dS , f φ , k 1dS and g 31dS . This time, we scan over a four-dimensional space, subjecting the background solutions to the criteria listed in §5.1. With increased dimensionality, we face the difficulties of a much larger space to scan and a less intuitive landscape to visualise. To address the former, we first perform broad and coarse scans over the model space, which we then use to inform narrower and finer scans once regions of viability are identified. For the latter, we can only look at 2D slices and 3D volumes of the overall 4D space to understand the placement of the viable models. The three ESS models chosen for analysis in this paper are listed in Table 3. Unlike for cubic Galileon gravity, it does not appear that viable ESS models are located in a clearly identifiable and continuous band. A discussion of the structure of the viable ESS model regions will be studied further in a future work.

Simulation setup
We use the same simulation setup as described in §5.2. This includes ΛCDM initial conditions generated directly at z = 49 (i.e. without backscaling). As discussed in §5.2.1, with this approach, in addition to the error due to the lack of backscaling, any deviation from ΛCDM at z = 49 will introduce error via the ICs. We can estimate the size of this error from the ratio of ESS and ΛCDM growth factors at z = 49, a quantity which we plot in Fig. 18 for our strongest ESS case (ESS-C). We find the ratio of ESS and ΛCDM growth factors at z = 49 is approximately 0.991, or a 0.9% deviation from ΛCDM. Ignoring the effects of  radiation, we can approximate the ratio of ESS and ΛCDM linear power spectra at z = 49 to be 0.991 2 ≈ 0.982, or a 1.8% deviation from ΛCDM. Again, this error in the linear power spectra at z = 49 will propagate to give a similarly-sized offset at large, linear scales in the non-linear power spectra measured from the simulation output at late times, and may also introduce transient effects at non-linear scales. We discuss the linear growth factor in ESS gravity further in Appendix D.2. As in §5, in addition to the full ESS simulation with both the expansion modified relative to ΛCDM and screened fifth forces between particles, we run QCDM hybrid simulations. These QCDM simulations feature the ESS expansion history but GR forces 21 . In Appendix D.3, we use these hybrid simulations and others to isolate the impact of the different expansion history from the impact of the screened fifth force. Thus we require two simulations for each of the three ESS cases from Table 3, which results in a total of 6 simulations (since we re-use the reference ΛCDM simulation produced for §5).

Simulation results
We study the effect of ESS gravity on structure formation by taking the ratio between the P (k) from the full ESS gravity simulation and the ΛCDM simulation at z = 0, which we plot in Fig. 5. The behaviour of the ESS power spectrum is broadly similar to the cubic Galileon power spectrum discussed in §5.3. Instead of repeating ourselves, we will therefore focus on highlighting the differences between ESS gravity and cubic Galileon gravity in this section.
The first key difference in ESS gravity is related to the expansion history, displayed in Fig. 6. Firstly, the upper panel shows the size of the deviations from ΛCDM are around 0.2% compared to 8% in cubic Galileon gravity, so they will have a smaller effect. Secondly, in ESS gravity there is a phase where expansion is faster than in ΛCDM for z 2, in addition to the phase where expansion is slower than in ΛCDM for 0.1 z 2 which is qualitatively similar to cubic Galileon gravity. Cosmic expansion suppresses structure formation, thus faster-than-ΛCDM expansion increases this suppression, whereas slower-than-ΛCDM expansion reduces this suppression leading to a net enhancement of structure formation relative to ΛCDM.
This effect can be seen in isolation in the lower panel of Fig. 6 via the ratios of power spectra in QCDM and ΛCDM. We see that the suppression of clustering increases from z = 21.631 to z = 9.804 to z = 1.990, but then weakens between z = 1.990 and z = 0.000; this reflects the transition from faster-than-ΛCDM expansion to slower-than-ΛCDM expansion that occurs around z ∼ 2 as seen in the upper panel of Fig. 6. The combined result of the different expansion history in ESS gravity is to produce a small net suppression of the z = 0   power spectrum on large scales relative to ΛCDM that becomes stronger at intermediate scales before weakening towards small scales where structures tend to be locally gravitationally bound and thus unaffected by cosmic expansion. This suppression of clustering is in contrast to the enhancement of the power spectrum caused by the different expansion history in cubic Galileon gravity seen in Fig. 3. Looking at Fig. 7, we see that the second key difference in ESS gravity is related to the screened fifth force. While both the screening and the late time strength of the coupling are very similar in ESS and cubic Galileon gravity, the early time coupling in ESS gravity is a few orders of magnitude stronger than that of cubic Galileon gravity; this reflects the corresponding behaviours for the respective scalar fields, with the gradient of the scalar field y being larger at early times in ESS gravity than in cubic Galileon gravity. As a result, the impact of the screened fifth force on the z = 0 power spectrum in ESS gravity is slightly larger than in cubic Galileon gravity.
For the interested reader, we study and discuss the phenomenology of ESS gravity in greater detail in Appendix D. But in summary, the small amount of suppression of clustering due to the modified expansion history combines with the larger amount of enhancement due to the screened fifth force. As seen in Fig. 5, this produces an overall moderate enhancement of clustering on large scales that briefly increases with k at intermediate scales before screening activates to eliminate the enhancement at small scales.
Comparing the different ESS cases, we see that ESS-A produces the smallest deviations from ΛCDM and ESS-C the largest. The ESS-B case is very similar to ESS-A at early times, but at late times is more similar in strength to ESS-C. We intend to elaborate on the parameter dependence of ESS phenomenology in future work.
While we have focused on the differences between cubic Galileon gravity and ESS gravity in this section, the phenomenology of these two theories is broadly similar. This is not surprising, given that they share the same screening mechanism, and the rather conservative criteria we applied in §5.1 and §6.1 ensure fairly similar background solutions. We leave investigations of more complicated reduced Horndeski theories, such as those that go beyond shift symmetry or feature different screening mechanisms, for future work.

Conclusions
In this work, we have implemented reduced Horndeski gravity in a fast, approximate COLA simulation code called Hi-COLA 22 . We believe Hi-COLA is the first implementation of reduced Horndeski gravity in an N-body code, and one of only a small number of codes that have the flexibility to simulate many different gravity theories. Thus Hi-COLA has excellent potential to be used by modified gravity model builders to explore the quasi-non-linear regime up to scales of k ∼ 1 h Mpc −1 , in the same way that the modified Einstein-Boltzmann codes Hi-Class and EFTCAMB [135][136][137][138][139][140] have enabled general exploration of the linear cosmological regime 23 .
Hi-COLA computes screening effects on non-linear scales using a coupling function and a screening coefficient, both of which are determined by the cosmological background of the model. A consistent solution of both cosmological background and screening quantities with the same model parameters is automatically implemented in Hi-COLA. The approach we adopt makes the mathematical connection between background and non-linear scales easy to follow.
We have used Hi-COLA to compute non-linear structure formation in two example theories belonging to the Horndeski family -cubic Galileon gravity and the extended shiftsymmetric theories of [55] -validating the code extensively against traditional N-body simulations of the former. Our work to date has centred on shift-symmetric theories for convenience; however, we plan to next turn attention to theories without this property. This will enable the functionality needed for, e.g. classic quintessence models. Likewise, we chose to first focus on theories that screen via the Vainshtein mechanism; further density-dependent functionality needed for chameleon screening will be implemented in the future.
The key limit to Hi-COLA's accuracy beyond quasi-non-linear scales is the use of a spherically symmetric approximation when deriving the screening coefficient in §3.1. We have shown that, provided steps are taken to alleviate errors ( §3.2 & §3.3), this approximation is acceptable and is worthwhile given the rapid simulations it enables. However, in order to calibrate the free parameters in these mitigation steps, and to validate Hi-COLA more generally, we must still compare to the results of traditional N-body codes. These currently only exist for a handful of specific MG theories; we will continue to validate Hi-COLA against these for available theories. We intend to investigate methods to improve the accuracy of Hi-COLA, for example by finding faster ways to solve the full non-linear Klein-Gordon equation where necessary. This would remove the need to calibrate parameters in our current spherically symmetric approach.
There are many directions and applications to be explored with Hi-COLA, and we intend to make the code publicly available in due course. Whilst we have validated Hi-COLA's ability to produce dark matter power spectra, there are many other quantities that N-body codes are relied upon to produce, including halo and lensing quantities. Integration of Hi-COLA into such pipelines will ultimately allow for the broad span of Horndeski theories to be tested with data from upcoming galaxy and lensing surveys. In this way, we hope to open up the landscape of modified gravity theory space to non-linear constraints and continue to progress our understanding of what does, and what does not, constitute a viable theory of gravity.

Contribution Statements
B.S.W. modified the FML code, ran the Hi-COLA simulations, carried out the validation, led the analysis of simulation results, and led the manuscript writing. A.S.G. wrote the code for the front-end module, assisted with analysing the simulation results, and contributed to the manuscript. T.B. led the development of the project concept, led the analytical calculations, assisted with analysing the simulation results, contributed to the manuscript, and managed the project. G.V. helped develop an early version of the simulation code and guided the project within LSST-DESC. B.F. assisted with the analysis and validation of results for the final version of this paper.

A Mass scales
In order to avoid our numerical solvers handling very small or large numbers, Hi-COLA has been constructed to use dimensionless analogues of the quantities shown in Eq. (2.1), and dimensionless equations of motion. In this appendix we describe first the quantities, and then the equations implemented in the code. Since we work in geometric units, where c = 1, we typically express the dimensionality as a mass dimension. We remind the reader of our convention where tilded quantities have dimensions and untilded quantities are dimensionless.

A.1 Dimensionless quantities
As an example, let us considerφ, which has a mass dimension of 1. We can straightforwardly define a dimensionless analogue of the scalar field φ as where M s is the mass scale associated with the scalar field. Using φ, we can then construct a dimensionless analogue ofX, which we call X. This involves an extra step of also performing a change of variables so that we use x = ln(a) as a dimensionless coordinate for time in place of the scale factor, a. We denote derivatives with respect to time with a dot, and to x with a prime. Substituting φ and performing the change of variables (implicitly we are evaluating X on the background FRW metric) yields Our dimensionless kinetic variable is therefore where we note that E = H/H 0 . We can then follow through with these newly defined objects to construct dimensionless analogues of the Horndeski functions and their derivatives. For example, given thatG 3 has dimensions of mass, we can define the dimensionless analogue G 3 through where MG 3 is the mass scale associated toG 3 . Then derivatives ofG 3 can be expressed using the dimensionless analogues as follows: where in the last line G 3X is the dimensionless analogue ofG 3X . We construct dimensionless analogues of all the remaining Horndeski functions and their derivatives in a manner similar to this example.

A.2 Dimensionless equations of motion
Having pulled out the mass scales of all quantities used, our equations of motion will consist of dimensionless variables multiplied by ratios of masses. These are generally (or generally expected to be) within a few magnitudes of order unity, at least for MG theories which have effects on cosmological scales. As an example, below is the equation for Ω φ with the mass ratios left unspecified: Terms of the form M ab are the mass ratios, M ab := M a /M b , where M a and M b are the mass scales for quantities a and b respectively. Here a, b stand for subscripts P, s, X, K, G 3 , G 4 , indicating the Planck mass and mass scales for the scalar field, X, K, G 3 , and G 4 respectively. For the results in this paper, it is assumed that all mass ratios are 1. This implies that the Horndeski functionsK,G 3 , andG 4 have the same order of magnitude, and M PG 4 = 1 fixes this scale to be approximately M P .

B Code Validation & Calibration
To understand the regime of validity for Hi-COLA and how this depends on the simulation setup, we compare against cubic Galileon N-body simulations described in [104]. Unlike the cubic Galileon model considered elsewhere in this paper, the model in [104] is the original cubic Galileon model that does not include a cosmological constant. The screened fifth force in these simulations was computed using the slow but accurate method of solving the full non-linear scalar field equation, rather than the approximate approach we outlined in §3.1.
In addition to running simulations using the full cubic Galileon theory, [104] also produced simulations for linear cubic Galileon theory (i.e. without screening), as well as the case where the forces in the simulation are purely GR but the background expansion history is that of the cubic Galileon model (referred to as the QCDM case).
We use Hi-COLA to produce simulations for all three cosmological cases: full cubic Galileon theory, linear cubic Galileon theory, and QCDM. We use FML's existing functionality to read directly from the z = 49 N-body snapshot of [104], such that our ICs are identical. This allows us to run only a single simulation per cosmology while eliminating realisation noise at large scales. We replicate the simulation setup used to generate the N-body data as closely as possible, using a boxsize of L = 400 Mpc/h and N part = 512 3 particles. However, since the N-body data was produced using ECOSMOG, which is an adaptive mesh refinement (AMR) code, we cannot replicate its force computation approach exactly with FML's particlemesh (PM) approach; instead we choose a force mesh of size N mesh = 3(N part ) 1/3 = 1536. Also, since the key advantage of the COLA method is to allow us to use a smaller number of timesteps without losing accuracy at large scales, we again use only 40 timesteps linearly spaced in scale factor a between z ini = 49 and z = 0, which should still yield good accuracy on small scales.
To ensure the power spectra are computed consistently from both the Hi-COLA and Nbody simulations, we use the Pylians tool 24 for measuring power spectra from the simulation snapshots at a = {0.2, 0.6, 1.0}.  Figure 8: Power spectrum comparison between Hi-COLA and N-body results from [104]. We consider three cosmologies: the full cubic Galileon case, the linear cubic Galileon case (i.e. without screening), and the QCDM case with GR forces but a cubic Galileon expansion history. The top, middle, and bottom panels correspond to a = 0.6, a = 0.8, and a = 1 respectively. Dark and light grey bands indicate the ±1% and ±2.5% regions respectively.

B.1 Validation
For the validation, we compare power spectra for individual cosmologies and ratios of power spectra for different cosmologies which we refer to as boost factors. COLA simulations are well known to perform better at reproducing boost factors (ratios of power spectra in two cosmologies) than power spectra themselves [19,33,35,126]. As discussed in §5.2.2, accurate boost factors are still very useful. The boost factors (denoted B) we focus on for this validation are the ratio of power spectra between 1) the full cubic Galileon and QCDM cases; and 2) the full and linear cubic Galileon cases. The first of these negates the effect of the modified expansion history to isolate the effects of modified forces, while the latter isolates further the effect of screening. We reproduce the output at three redshifts corresponding to a = {0.6, 0.8, 1}.
As described in §3, we have some flexibility in our screening approach with two parameters to calibrate: k filter & R smooth . As part of our validation, we compared our Hi-COLA simulations to the N-body simulations for several different values of these two parameters. We direct interested readers to Appendix B.2 for the details of that calibration, but the summary is that we determine the best parameter values in terms of agreement with the N-body sims are k filter = 0.2 h/Mpc & R smooth = 3 Mpc/h. We use these parameters for our headline validation below. Figure 8 shows the power spectrum comparison between Hi-COLA and N-body results from [104]. We note that there is some offset at large scales in the power spectrum comparison for all cosmologies and redshifts. We strongly believe this is due to a small disagreement in our respective expansion histories; however, we have been unable to identify the exact source. This makes it difficult to draw reliable conditions from the power spectrum comparison.  Figure 9: Boost factor comparison between Hi-COLA and N-body results from [104]. The boost factor in this plot is the ratio of power spectra in the full cubic Galileon and QCDM cosmologies, which negates the effect of the modified expansion history to isolate the effects of modified forces. The top, middle, and bottom panels correspond to a = 0.6, a = 0.8, and a = 1 respectively. Dark and light grey bands indicate the ±1% and ±2.5% regions respectively.
Fortunately, this expansion history disagreement does not propagate to our boost factors, giving further reason to focus on these for our validation. Figures 9 & 10 show the two boost factor comparisons between Hi-COLA and N-body results from [104]. Figure 9 is for the force boost factor, while Figure 10 is for the screening boost factor. Table 4 records the k max values at which the boost factors are accurate at the 1% & 2.5% levels at each redshift. Therefore, we determine Hi-COLA is valid at the 2.5% level down to k max = 1.2 h/Mpc at z = 0. This is a measure of accuracy determined with respect to the Cubic Galileon, and strictly speaking, comparisons should be made on a per-model basis where possible. However, as detailed in §B.2, making use of Hi-COLA's calibration parameters can help ensure this accuracy is maintained across models in the reduced Horndeski framework.

B.2 Calibration
We have two calibration parameters as described in §3. The first, k filter , controls the scale at which we interpolate between the linear and screened solutions for the fifth force. The second, R smooth , controls the scale at which we smooth the density field before computing the screened fifth force.
To decide on a 'best' set of parameters, we focus on the second boost factor described above, the one that isolates the effect of the screening, since it is the screening that these two parameters affect and that we want to match in the more accurate N-body simulations. We also focus on a = 1, although we expect the best parameters to be relatively independent of redshift.    We first investigate the effect of varying k filter while fixing R smooth = 2 Mpc/h, which is seen in Fig. 11. We also compute the results for the cases with fully linear and fully screened forces, which represent the limits k filter → ∞ and k filter → 0 respectively. Simply examining the figure, we determine that k filter = 0.2 h/Mpc yields the best results.
We next investigate the effect of varying R smooth while fixing k filter = 0.2 h/Mpc, which is seen in Fig. 12. Again by simply examining the figure, we determine that R smooth = 3 Mpc/h yields the best results.
We also need to confirm that our screening setup prevents simulations from being spuriously affected by changing the force resolution, as discussed in §3.3. To do so, we run a set of lower resolution Hi-COLA simulations with N mesh = 512 instead of our standard N mesh = 1536 for various values of k filter and R smooth . We compute the screening boost factor for this lower  Figure 11: Effect of k filter on the boost factor comparison between Hi-COLA and N-body results from [104] at a = 1. The boost factor in this plot is the ratio of power spectra in the full and linear cubic Galileon cosmologies, which isolates the effects of screening. We have fixed R smooth = 2 Mpc/h where applicable. We also plot the fully linear (k filter → ∞) and fully screened (k filter → 0) results to demonstrate what varying k filter interpolates between. Dark and light grey bands indicate the ±1% and ±2.5% regions respectively.
resolution set and then compare the results against our standard higher resolution set. We plot this as a ratio in Fig. 13. Firstly, we note that changing the force resolution of a simulation will have a natural effect on clustering statistics at small scales, as can be seen for k 0.2 h/Mpc in Fig. 13. It is not this effect we are trying to avoid with our screening setup, but any 'unnatural' effects, such as those on large scales. The figure shows that if we neither interpolate between the linear and screened force solutions at large scales (i.e. we use the screened force solution at all scales) nor apply any density field smoothing before computing the screened force solution, then there is an offset between the high-and low-resolution simulations at large and intermediate scales. The figure shows that R smooth 1 Mpc/h is necessary to mitigate this effect. It also shows that, even without any density field smoothing, interpolating between the linear and screened force solutions at large scales is sufficient to eliminate the offset at large scales. However, using interpolation alone without smoothing may still lead to spurious resolution effects at intermediate scales, and Fig. 12 demonstrated that smoothing improves our agreement with N-body regardless.
Thus based on the 'screening' boost factor, we determine by examination that the approximate values of our Hi-COLA parameters that give the 'best' agreement with N-body are k filter = 0.2 h/Mpc & R smooth = 3 Mpc/h, and we use these values elsewhere in the paper.

C Detailed cubic Galileon results
In this Appendix, we discuss the results of our cubic Galileon simulations in greater detail. We aim to walk the interested reader through some of the phenomena that result in the  figures of §5.3. We break this down by first discussing the background solutions and screening coefficient, then the linear growth rate, and finally the power spectra.

C.1 Background evolution
We start by assessing the background solutions we obtain from solving the equations in §2.3 using the Hi-COLA front-end module described in §4.1. In Fig. 14, we plot the evolution of various background quantities with redshift in each of the five models listed in Table 2, and also in the corresponding ΛCDM model, which is reached in the limit f φ → 0. To highlight the departures from ΛCDM, we also plot the ratio between the f φ = 1 cubic Galileon gravity case and ΛCDM for E & E /E in the upper panel of Fig. 15. As we will describe below, these two quantities are important in determining the rate of linear growth, as seen in Eq. (4.1).
We can see that the main effect of the cubic Galileon scalar field is to produce a brief epoch around z = 1 where the expansion is temporarily slower relative to the ΛCDM case. This coincides with the period of rapid evolution of the scalar field derivative (top right panel). Unsurprisingly, we see that as f φ , and thus the proportion of effective DE coming from the cubic Galileon scalar field, increases the deviation from ΛCDM also increases. In Fig. 14 we see that Ω r and Ω m are marginally affected, whilst the sum of the two lowest panels approximately reproduces the behaviour of Ω Λ in ΛCDM (dotted line in lower left panel).
As we discussed in §5.3, once we have these background solutions, we then use them to compute the quantities we will need for estimating the screened fifth force, the coupling β and Vainshtein radius χ/δ m , using Eqs. (3.13) & (3.14). These two time-dependent functions along with E and E /E make up the set needed as input for our Hi-COLA simulations.

C.2 Linear growth rate
We can also compute the linear growth in these cubic Galileon cases using Eq. (4.1), which shows that linear growth is dependent on the expansion history through E & E /E and the linear modification to the strength of gravity µ (1) = G eff /G N = 1 + β. In Fig. 15  From the magenta dashed curve we see that the expansion in these cubic Galileon models goes through a period around z = 1 where it is slower than in ΛCDM, which causes a reduced suppression of structure formation, and thus a boost to linear growth relative to ΛCDM. We see that the effective strength of gravity is stronger than in GR (green dashed curve), and this further increases the linear growth relative to ΛCDM, although the impact on growth is smaller than that of the change to the expansion history.

C.3 P (k) breakdown
In §5.2 we mentioned that we ran hybrid QCDM simulations in addition to the full cubic Galileon and ΛCDM simulations. Here, we run an additional second type of hybrid simulation, which we refer to as 'Lin-cG', for each of the five cases listed in Table 2. These Lin-cG simulations are identical to the full cubic Galileon simulations, except that the screening has been artificially turned off, resulting in linear modifications to GR on all scales, or equivalently unscreened fifth forces between particles in all density environments. For convenience, the features of the two types of hybrid simulation are also summarised in Table 5   By taking boost factors or P (k) ratios between the various hybrid simulations described in Table 5, we can isolate the impact of different aspects of cubic Galileon gravity. Specifically, in Fig. 16 we look at four boost factors: 1. QCDM vs ΛCDM -this isolates the impact of the modified expansion history.
2. Lin-cuGal vs QCDM -this isolates the impact of the unscreened fifth force.
3. cuGal vs Lin-cuGal -this isolates the impact of the screening. 4. cuGal vs QCDM -this isolates the impact of the full fifth force, including screening.
The upper left panel of Fig. 16 demonstrates that the isolated effect of the modified expansion history in cubic Galileon gravity is to produce an approximately scale-independent enhancement of clustering on large linear scales, which then briefly increases with k at intermediate scales before disappearing at small non-linear scales. We saw in the upper panel of Fig. 15 that in cubic Galileon gravity there is a brief period where the expansion of the universe is slower than in ΛCDM. Physically, this reduction in the expansion will produce less of a suppression (i.e. a net enhancement) of structure formation on large linear scales relative to ΛCDM. This enhancement of linear growth at large scales produces an increasing enhancement on quasi-linear intermediate scales as the physics of structure formation begins to become non-linear 25 . However, the change in expansion rate will have little impact on structure formation on small non-linear scales since these structures are locally gravitationally bound and thus not affected by the expansion of the universe; this explains why the enhancement of clustering decreases at small non-linear scales in the plot.
The upper right panel of Fig. 16 demonstrates that the isolated effect of the unscreened fifth force in cubic Galileon gravity is to produce an approximately scale-independent enhancement of clustering on large linear scales, which then begins to rapidly increase with k towards small non-linear scales. Although the effect on large linear scales is similar to the impact of the modified expansion history, the behaviour on small non-linear scales is different. Physically, the unscreened fifth force in cubic Galileon gravity produces a stronger attraction between particles relative to ΛCDM. Unlike the impact of the slower expansion, the impact of the stronger attraction between particles is just as relevant on small scales as it is on large scales. Thus the enhancement of clustering does not vanish on small scales and in fact increases rapidly due to the non-linear nature of structure formation.
The lower left panel of Fig. 16 demonstrates that screening in cubic Galileon gravity has no effect on clustering on large linear scales (as expected), but suppresses structure formation with increasing effect towards small non-linear scales. Physically, this behaviour can be understood by recognising that density increases on small non-linear scales, given that we saw in Fig. 4 that screening is stronger in high-density environments.
The lower right panel of Fig. 16 demonstrates the effect of the total fifth force in cubic Galileon gravity, effectively combining the isolated impacts of the unscreened fifth force and the screening. The result is to produce an approximately scale-independent enhancement of clustering on large scales where the screening is not active, which then tends to increase with k at intermediate scales due to growing non-linearity before the screening activates on small scales and suppresses it.
The combination of the enhancement of clustering from the modified expansion history and the enhancement due to the screened fifth force gives the final result we presented in Fig. 2.

D Detailed ESS results
Analogously to Appendix C, here we discuss the results of our ESS simulations in greater detail than §6.3. However, as in §6.3, we will focus our discussion on the points of difference between ESS gravity and cubic Galileon gravity to minimise repetition.

D.1 Background evolution
As in Appendix C.1, we begin our investigation by studying the background solutions we obtain from solving the ESS equations in §2.4. In Fig. 17, we plot the evolution of various background quantities with redshift for each of the three ESS models described Table 3, as well as for the corresponding ΛCDM model (which is reached in the limit of f φ → 0). We , Ω Λ (lower left), and Ω φ (lower right). We also show ΛCDM predictions for comparison. The model parameters are found in Table 2. The equivalent of this figure for cubic Galileon gravity is Fig. 14. also plot the ratio between the ESS-C case and ΛCDM for E & E /E in the upper panel of Fig. 18. These two quantities are important in determining the rate of linear growth, as seen in Eq. (4.1). We find that for these ESS models, departures from ΛCDM-like evolution in E, Ω r and Ω m are very small -practically invisible on the scale of Fig. 17. However, deviations are present, as can be seen in the inset for one of the models in Fig. 18. Cosmological expansion in these ESS cases is faster than in ΛCDM at early times for z > 2, then becomes slower than in ΛCDM around z = 1 before approaching the ΛCDM rate as z → 0.

D.2 Linear growth rate
In Fig. 18, we study the linear growth in ESS gravity and its dependence on expansion history and the linear modification to the strength of gravity.
We see that because the expansion in ESS gravity is faster than in ΛCDM for z 2, the suppression of structure formation due to expansion is increased which causes a reduction in linear growth relative to ΛCDM (purple dashed line). We also see that the effective gravitational strength being stronger than in ΛCDM causes an increase in linear growth relative to ΛCDM (green dashed line). These two impacts compete, leading to linear growth in ESS gravity being suppressed for z 2, but enhanced for z 2 (orange solid line). Note this is qualitatively different to the cubic Galileon case, where the modified expansion history was the dominant source of deviation from ΛCDM in the growth rate, and both the modified expansion history and G eff /G N acted to enhance growth.

D.3 P (k) breakdown
As in Appendix C.3, here we run an additional type of hybrid simulation, which we refer to as 'Lin-ESS', for each of the three cases in Table 3. These Lin-ESS simulations are identical to the full ESS, except that the screening has been artificially turned off resulting in linear modifications to gravity on all scales, or equivalently unscreened ESS fifth forces in all density environments. As in Appendix C.3, we take ratios between P (k) from the various simulations to isolate the impact of different aspects of the ESS model, which we then plot in Fig. 19.
The upper left panel of Fig. 19 demonstrates that the effect of the modified expansion history in ESS gravity is to produce an approximately scale-independent suppression of clustering on large scales. This suppression then increases with k at intermediate scales before beginning to reduce again at small scales. Physically, this result matches our expectation that the faster-than-ΛCDM expansion we see for z 2 in ESS gravity in the upper panel of Fig. 18 will enhance the suppression of structure formation that occurs due to cosmic expansion. This suppression increases at quasi-linear scales as the physics of structure formation becomes non-linear. Again, cosmic expansion will have less of a suppressing effect on structure formation on scales where structures are locally gravitationally bound, which explains the decrease in the suppression of clustering we see on small scales.
The isolated impacts of the unscreened fifth force and the screening are shown in the upper right and lower left panels of Fig. 19 respectively. Unlike the impact of the modified expansion history, the impacts of these two quantities in ESS gravity are qualitatively similar to the corresponding impacts in cubic Galileon gravity. In quantitative terms, the enhancement of clustering due to the unscreened fifth force is larger at large scales in our ESS cases, but smaller at small scales. However, the suppression of clustering at small scales due to the screening is also reduced in our ESS cases. As a result, we actually find the net enhancement to clustering from the combination of these two effects is greater for our ESS cases at all scales. The suppression of clustering from the modified expansion history and the enhancement due to the screened fifth force combine to give the final result we presented in  : Relationship between the wallclock time for simulations of ESS-C and the numbers of cores used. We find that in logarithmic space, the relationship between wallclock time and core count is well described by a linear fit, in particular, T = 1710N −0.92 minutes. In other words, Hi-COLA scales well and makes effective use of the resources given to it.
Being a relatively fast simulation code, naturally one may want to know how Hi-COLA's run time scales with the number of cores used. This is shown in Fig. 20, where multiple runs of ESS-C (with the same setup as described in the main text, see §5.2 and §6.2) were conducted, only differing by the number of cores given for the simulation. We find that the relationship between the wallclock time and core count scales approximately like T ∝ N −1 , showing that Hi-COLA has good scaling and is making efficient use of resources given to it, with no obvious signs of running into overheads.