The AGORA High-resolution Galaxy Simulations Comparison Project. VI. Similarities and Differences in the Circumgalactic Medium

We analyze the circumgalactic medium (CGM) for eight commonly-used cosmological codes in the AGORA collaboration. The codes are calibrated to use identical initial conditions, cosmology, heating and cooling, and star formation thresholds, but each evolves with its own unique code architecture and stellar feedback implementation. Here, we analyze the results of these simulations in terms of the structure, composition, and phase dynamics of the CGM. We show properties such as metal distribution, ionization levels, and kinematics are effective tracers of the effects of the different code feedback and implementation methods, and as such they can be highly divergent between simulations. This is merely a fiducial set of models, against which we will in the future compare multiple feedback recipes for each code. Nevertheless, we find that the large parameter space these simulations establish can help disentangle the different variables that affect observable quantities in the CGM, e.g., showing that abundances for ions with higher ionization energy are more strongly determined by the simulation’s metallicity, while abundances for ions with lower ionization energy are more strongly determined by the gas density and temperature.


INTRODUCTION
The circumgalactic medium, or CGM, is usually defined as the baryonic matter which resides within the virial radius R vir but outside the galaxy "boundary", for which a number of different definitions exist.We will use the value 0.15 R vir , corresponding to the expected size of the galaxy disk, though this is significantly larger than other common boundary definitions like the half-mass radius (see Rohr et al. 2022, Appendix A for a discussion of the evolution of half-mass radii in simulations).This gas is essential for any meaningful understanding of the long-term growth and evolution of galaxies, because any gas which flows into or out of a visible galaxy, for use in star formation within a galaxy disk or metal pollution of the intergalactic medium (IGM), has to pass through this region (Woods et al. 2014).In transit, it is caught up in a web of dynamical forces operating in a physical regime which is quite distinct from that of the other populations of gas in the universe, such as the interstellar medium (ISM), affected by active galactic nuclei (AGN), star formation, and dynamical perturbuations due to clumps, or gas within the extremely low-density intergalactic medium (IGM), dominated by cosmological effects.A summary of the current state of the theory of CGM dynamics can be found in Faucher-Giguere & Oh (2023), and references therein.
Interest in the CGM has grown considerably in recent years, as the significance of this region has become more apparent to the galaxy formation community and more data has become available (See Tumlinson et al. 2017, and references therein, for a summary of the observational picture).Due to its low density, the CGM is very difficult to see in emission line mapping, with the exceptions being H I emission (Zhang et al. 2016;Cai et al. 2019), which is unfortunately not a very good tracer of higher temperature gas, and metal line emission which is usually only possible in very nearby galaxies at z = 0 (Howk et al. 2017;Li et al. 2017).Instead, the CGM tends to be observed in absorption * Code leaders † High School students who worked with the Collaboration through the UC Santa Cruz Science Internship Program (SIP).
against bright background sources, generally quasar spectra.
In the last decade, there has been a tremendous increase in the amount of observational data available due to the development of improved space-based and ground-based telescopes, including the groundbreaking COS-Halos survey (e.g.Tumlinson et al. 2011;Werk et al. 2013Werk et al. , 2014) ) and an expanding number of new and larger samples, e.g.KBSS (Rudie et al. 2019), CASBaH (Prochaska et al. 2019;Burchett et al. 2019), CUBS (Chen et al. 2020), and CGM 2 (Wilde et al. 2021;Tchernyshyov et al. 2022), among many others.
Because absorption line spectroscopy requires a coincidence between background sources and foreground galaxies, it is very rare to get multiple sightlines of data around any single galaxy, though this is possible, either through coincidence (e.g.Keeney et al. 2013), or through exploiting the effects of strong lensing by the foreground halo to see the same background object in multiple places (e.g.Ellison et al. 2004;Okoshi et al. 2019).It is especially challenging because separate imaging and spectroscopy tools are needed to analyze the hosting galaxy system and the quasar sightline.Together this means that there is still significant uncertainty regarding the physical state of gas in the region, and that maximal information needs to be extracted from each line of sight.
As a rule, the CGM is highly ionized, and much of the interpretation of the physical state of gas, therefore, comes from interpreting absorption lines from ionized metals, in particular their column density, Doppler broadening, and kinematic alignment with one another.Metal lines have the advantage of relatively low line confusion with the Lyman alpha forest, and they are more likely than hydrogen to be in the linear regime and not saturated.Ionized metal densities can be a very good test of the physical state and evolution of the CGM because they are very sensitive to multiple variables, all of which can vary continuously.The number density of an element X in ionization state i is where A X is the fractional abundance of element X per metallicity unit, n is the number density of gas, Z is the overall metallicity in that parcel, and finally f X i is the fraction of the element X in state i, at the parcels given temperature and density.In this work, A X is assumed to be the constant solar abundance value, e.g. the number of carbon, oxygen, etc. nuclei for each hydrogen nucleus (at Z = Z ⊙ ), which are taken from CLOUDY documentation (Ferland et al. 2013). 1his extreme sensitivity to multiple variables makes the CGM an interesting area of focus for the AGORA (Assembling Galaxies of Resolved Anatomy) code comparison project, whose earlier simulations are shown in Kim et al. (2013Kim et al. ( , 2016)), hereafter Papers I and II, respectively.This large international collaboration of leading simulation code researchers is dedicated to examining the convergence or divergence of different simulation codes when applied to the same initial conditions and holding constant as much of the physical implementation as possible.In this work, we use a number of analytic methods to examine the CGM of the Cos-moRun simulation suite (Roca-Fàbrega et al. 2021, hereafter Paper III), the relevant details of which will be elucidated in Section 2. This work is being developed concurrently with two additional AGORA papers also focusing on the Cos-moRun simulation.The first is Roca-Fábrega et al. (in prep), or Paper IV, which presents the final fiducial models for Cos-moRun including new codes and models added since Paper III, as well as merger histories of the AGORA galaxies down to z = 1.The second is Jung et al. (submitted), or Paper V, which compares the satellite populations between codes and against identical dark matter only (DMO) simulations.
While the complexity of the gas state in the CGM and dependence on so many interlocking factors make it highly unlikely that all codes will converge on the same column densities or other observational features for individual lines of sight, the carefully calibrated and specified physics and initial conditions allows profile divergences to be disentangled, or in other words to see how much each underlying variable contributes to observable quantities.This can tell us about the range of effects of modern feedback and implementation systems.For example, if significant variation takes place in metallicity distribution, this means that feedback strength and timing deliver metals from the inside to the outside of galaxies at different efficiencies.On the other hand if ion fractions are significantly different, that means that the primary effect is on cooling and heating systems causing characteristic clouds to be in a substantially different phase.We will also be looking for structure formation within the CGM, and its relationship with various ions and their kinematic distributions.
This paper is organized as follows.Section 2 describes the parameters of the codes, including initial conditions and shared physics, and gives an overview of the mechanics for each of the 8 codes participating in the study, including any existing studies of the CGM of other simulations using those codes.We also describe the analysis tools utilized in this work for creating mock observations or interpretations of the CGM.In Section 3 we analyze the growth and distribution of gas and metals in the CGM, including how far they spread, their usual phase, etc.We also perform analysis of observable parameters, such as absorption lines, kinematic alignment, and divergences and similarities between codes in column densities of medium-high ions.Finally, in Section 4 we conclude the article with remarks on the essential contribution cross-code studies like this make to the field of galaxy simulations.We discuss how different codes could currently be compatible with different plausible models of the CGM, in the interest of combining their strengths to adequately resolve and populate this region in future projects.

Initial Conditions and Cosmology
Each of the codes is designed to accept as input a common set of initial conditions (ICs), which in principle means that each of the codes should create the same zoom-in galaxy in the same location and with the same orientation.These of course will not be exactly identical, due to the stochastic elements which are built into several of the codes, but in macroscopic details they should be similar and features should be recognizable between them.The ICs are created using the software MUSIC, which uses an adaptive multi-grid Poisson solver (Hahn & Abel 2011) 2 to create a realistic distribution of dark matter and primordial gas at a starting redshift of z = 100.The zoom-in region was chosen from a large DM-only simulation such that the largest galaxy in the zoomin region will evolve to have a virial mass of ∼ 10 12 M ⊙ at z = 0, and will not have any major merger events between the redshifts of 2 and 0.3 Any outside research groups, whether interested in joining as part of the Collaboration or merely to test their own code with our ICs, can freely download the MUSIC file 1e12q on the AGORA website.4AGORA members will be happy to assist in set up and calibration of any new codes.
The cosmology used by each code is the standard ΛCDM parameters (Komatsu et al. 2011;Hinshaw et al. 2013), with an assumption of a primordial metallicity of 10 −4 Z ⊙ in each cell. 5 Each code has a different system for refining and degrading resolution according to the local conditions, either intrinsically, as is the case for particle codes, where resolution is directly carried by particles, or by automatically refining after specific threshold requirements are met in grid codes. 6The resolution refinement schema for each code is listed in section 2.2.Overall requirements for the codes set by the ICs, however, were to have a 128 3 root resolution in a (60 comoving h −1 Mpc) 3 box, with five concentric regions of increasingly high resolution centered around the target halo.At the smallest, highest resolution region, it is equivalent to a unigrid resolution of 4096 3 resolution objects, giving a minimum cell size of 163 comoving pc (around 40 physical pc at z = 3).The size of this highest-resolution region is chosen to enclose all particles which will fall within 4R vir of the target halo by z = 0.The dark matter particles in this region are of a uniform mass (m DM, IC = 2.8 × 10 5 M ⊙ ), and the gas particles, for codes for that use them, have m gas = 5.65 × 10 4 M ⊙ .For more information about this IC and other available AGORA ICs, we refer the interested readers to Section 2 of Paper I, as well as Section 2 of Paper III.

Individual codes in AGORA
The codes used in this paper are summarized in depth in Papers I -IV, each paper focusing on a different aspect of how the codes work relative to different common physics implementations.Paper I focuses on the details of the gravity implementation of each code, Paper II focuses on the hydrodynamics and fluid dynamics solvers, and Paper III discusses the creation of stars and metals within the codes.Paper IV focuses on summarizing any changes in the active simulation setup or feedback implementation since Paper III.For convenience and to stay up to date with current developments, we also list the participating codes here, with some basics about their mechanisms and information on their most recent results, including noting any papers which focused on the CGM.

ART-I
The simulation code ART-I, is an AMR-type grid code introduced in Kravtsov et al. (1997).Whenever a single cell 5 1 Z ⊙ = 0.02041 is used across all participating codes in order to follow our choice in Paper II (see Section 2 of Paper II for details).This has no effect on the physical conditions in GRACKLE, which are calibrated to this value, as the total metal production by mass remains the same, though it does affect some of the plots in this work.
6 Specifically, refinement takes place when an individual cell reaches a mass of four times the gas particle mass used in SPH codes (m gas = 5.65 × 10 4 M ⊙ ), in order to keep grid and particle codes at roughly the same resolution, though continuity requirements for refinement does vary between codes.See Section 5.1 of Paper I and section 4.3 of Paper II for more details.reaches a particle or gas overdensity of 4.0 (see Footnote 6), that cell splits in half along all three directions forming 8 subcells (codes that do this are referred to as "octree" codes).This proceeds until the best-allowed resolution of 163 comoving pc is reached, at which point cells are no longer allowed to split.Recent work using ART-I cosmological simulations includes the FIRSTLIGHT simulations (Ceverino et al. 2017) with a large number of zoom-ins at high redshift.The CGM of an ART-I suite was explored in significant detail in Roca-Fàbrega et al. (2019) and Strawn et al. (2021) for the VELA3 suite (Ceverino et al. 2014;Zolotov et al. 2015), finding that cool, inflowing streams contain mostly photoionized O VI, but are enclosed by Kelvin-Helmholtz interface layers (Mandelker et al. 2020) which contain significant quantities of collisionally ionized O VI.We will also point out that many of the computational and analytic tools used in this paper were first introduced in Strawn et al. (2021).

ENZO
The code ENZO is another AMR-type code, notable for its open-source development strategy and history (Bryan et al. 2014).It was developed alongside its native gas heating and cooling package GRACKLE (Smith et al. 2017), which has been modified for use as a shared heating and cooling implementation used by all AGORA simulations.The most significant CGM-focused work using ENZO is the development of the FOGGIE simulation (Peeples et al. 2019), as well as similar fixed-resolution halo simulations (Hummels et al. 2019), which showed that resolution has very significant effects on the survival and amount of cool and cold gas found in the CGM.

RAMSES
The RAMSES code is also an AMR-type octree (See Section 2.2.1) code, introduced in Teyssier (2002).Current cosmological simulations which demonstrate the feedback implementation used here are shown in Nuñez-Castiñeyra et al. (2021), and especially Augustin et al. (2019) which, focusing on the CGM of a similar RAMSES zoom-in simulation, found that redshift 1-2 would be a "sweet spot" for observations of the CGM in emission with new telescopes now coming online.

CHANGA-T
CHANGA is a particle SPH code, where fluid interactions are mediated between multiple "smoothed particles."It is is a redevelopment of the code GASOLINE (Menon et al. 2014;Wadsley et al. 2017) with a different architecture.This code has been recently used for the ROMULUS simulation series, summarized in (Jung et al. 2022).The CGM of several ROMULUS halos was recently analyzed and categorized a large number of different phases and dynamic modes in Saeedzadeh et al. (2023).e This value is added as heat to gas particles surrounding new star particles over a small number of timesteps, see Shimizu et al. (2019).f The fractions f T and f K are the fraction of total SN energy distributed into thermal and kinetic feedback, and depend on a number of factors according to Hopkins et al. (2018).
We have changed the name to CHANGA-T to indicate a different version from the one used in Paper III.In that paper, we ran a version of CHANGA with so-called "superbubbles," a form of feedback that superheats small regions near supernovas (see Keller et al. 2014), while the version shown here has only thermal feedback, as visible in Table 1.Both versions of CHANGA were run with the CosmoRun ICs, with a comparison between the two shown in Appendix B of Paper IV.We focus on this version here because it was more easily accessible at the time of submission of this paper and could be analyzed more straightforwardly, however further comparison between the CGM of the two versions would be an interesting topic for future work.

GADGET-3
The next SPH-type code is GADGET-3, a highly versatile code with many different offshoots, with gravity computed by the tree-particle-mesh method.GADGET3-OSAKA, referred to in this paper as GADGET-3 (Aoyama et al. 2017;Shimizu et al. 2019) is one of several offshoots of the SPH code GADGET (Generations 1 and 2 were showcased in Springel et al. 2001 andSpringel 2005, respectively).The code used in this paper uses the feedback system adapted from Shimizu et al. (2019).
Previous studies of the CGM in GADGET-3 include Oppenheimer et al. (2016), which analyzed the EAGLE simulation and found that in their codes, O VI was not necessarily connected to galaxy star formation as inferred from Tumlinson et al. (2011).Nagamine et al. (2021) also studied the distribution of neutral hydrogen in the CGM, and showed that varying treatment of feedback can cause about 30% variations in the Lyα flux decrement around galaxies.
GADGET-4 (Springel et al. 2022) is also in current use (e.g., Romano et al. 2022a,b), and has expressed interest in pursuing the AGORA project.It will be included in future papers after completion of the rigorous calibration required by CosmoRun.

GEAR
The code GEAR (Revaz & Jablonka 2012) is another SPH code.While originally based on GADGET-2, it contains a number of improvements and possess its own physical model (Revaz et al. 2016;Revaz & Jablonka 2018).GEAR uses the improved SPH formulation of Hopkins (2013) and operates with individual and adaptive time steps as described in Durier & Vecchia (2012).Star formation is modelled using a modified version of the stochastic prescription proposed by Katz (1992) and Katz et al. (1996), where stars form in unresolved regions, and which reproduces the Schmidt (1959) law.Stellar feedback includes core collapse and type Ia supernovae (Revaz et al. 2016), where energy and synthesised elements are injected into the surrounding gas particles using weights provided by the SPH kernel.To avoid instantaneous radiation of the injected energy, the delayed cooling method is used (Stinson et al. 2006).The released chemical elements are further mixed in the ISM using the smooth metallicity scheme (Wiersma et al. 2009).
The GEAR physical model has been mainly calibrated to reproduce Local Group dwarf galaxies (Revaz & Jablonka 2018;Harvey et al. 2018;Hausammann et al. 2019;Sanati et al. 2020) and ultra-faint dwarfs (Sanati et al. 2023) and in particular their chemical content.

AREPO-T
The AREPO code operates using an unstructured moving mesh, which is generated dynamically according to density and velocity, allowing it to evolve resolution naturally while still solving Euler equations on cell faces as in grid codes (Springel 2010).Major recent AREPO projects include Illustris-TNG (Pillepich et al. 2018) and Auriga (Grand et al. 2017).Analysis of the CGM of the former was given in Nelson et al. (2020), finding that magnetic fields could be essential to cold clouds surviving in the halo, and of the latter in van de Voort et al. (2021), which found in a zoom-in simulation that resolution was essential to resolving cold and cool neutral gas in the CGM.
Like CHANGA-T (Section 2.2.4), we have adopted the name AREPO-T in this paper to indicate this run uses only thermal feedback.Another version with a different feedback system has also been run on the same initial conditions by the Collaboration.That run contains a more complex schema for stellar wind propagation (see Section 2.3.2 of Pillepich et al. 2018) and is compared to the version here in Paper IV, Appendix B. In this work, we focus on the thermal-only version because it was somewhat faster to calibrate and simpler to analyze, making it more accessible at the time of publication of this paper.Direct comparison between the CGM of AREPO's thermal and IllustrisTNG-like wind models will be considered as a future project by the AGORA collaboration.

GIZMO
Finally, GIZMO is a mesh-free code based on a volume partition scheme, in which particles represent cells with smoothed boundaries.Despite being a descendant of GADGET-3, GIZMO is somewhat similar in spirit to AREPO, where the Euler equations are solved as in grid codes across effective faces shared between nearby particles.The actual scheme employed in the GIZMO runs for this comparison is the finite-mass one, in which cells are not allowed to exchange mass through the faces.
The Simba (Davé et al. 2019) and FIRE-2 (Hopkins et al. 2018) projects are examples of high-resolution zoom-in GIZMO simulations.These works found that in the CGM, cool inflows generally reached temperature equilibrium quickly and are not very sensitive to the heating implementation, while hotter gas has a cooling time longer than the dynamical time and therefore its state depends more sensitively on this implementation.

Common, Code-independent Physics
Much of the physics in the operation of the codes is fixed, and each aspect of this was thoroughly calibrated in the process described in Paper III. 7 While hydrodynamic and gravitational solvers are intrinsically tied to individual codes, gas heating and cooling parameters are fixed by the common package GRACKLE8 (Smith et al. 2017), and the details of the GRACKLE runtime parameters were shown in Section 3.1 and the process of calibration with each code was shown in Section 5.2 (Figures 4 and 5) of Paper III.
A pressure floor requires the local Jeans length to be resolved at all times, in order to prevent unphysical collapse and fragmentation, and each code was given a minimum cell size (for AMR codes) or gravitational softening length (for SPH codes).More details on these conditions can be found in Paper III, specifically sections 3.1 and 4. In this paper which focuses on the much lower-resolution CGM region, we are interested in not just the highest available resolution, but also the specific pattern of the resolution degrading as the simulation moves away from the galaxy center.
In Figure 1 we show the increase in the effective size of resolution elements as a function of distance to the galaxy center for each code.All codes were found to show a general degradation in resolution with distance, and mostly convergent with one another.Generally, all codes have a resolution of between 30-300 pc within 0.15 R vir (considered to roughly represent the "galaxy"), between 100 pc -3 kpc within 1.0 R vir (representing the "CGM"), and between 300 pc -10 kpc outside 1.0 R vir (the "IGM,"), with the outer boundary of the IGM taken to be at 4.0 R vir in order to stay within the Lagrangian region defined in the ICs.A few resolution differences between the codes persist, however, mostly as a result of their general hydrodynamical mechanism.SPH codes are not as strongly constrained by either resolution ceilings or floors, because the free motion of particles is paramount.While particle masses are chosen in order to force a certain mass resolution, if gas particles cluster together into a small region, they will effectively resolve that volume at a better resolution than the best-allowed volume resolution for AMR codes, and thus can be more detailed within the internal galaxy structure. 9The disadvantage of this free motion is that in low density regions such as the CGM the effective Figure 1.Resolution of all 8 AGORA codes at z = 3.In each shell of increasing size, color shows the mass fraction contained in "linear resolution equivalent" bins of width 0.5 dex, normalized within columns.For grid and moving mesh codes, "linear resolution equivalent" is defined as cell volume raised to the 1/3 power.For particle-type codes, it is instead defined as "effective volume" (particle mass divided by particle density) to the 1/3 power.See Section 2.3 for more details.
resolution in particle codes is worse than in grid codes, which have their resolution-degradation suppressed by the strict requirements for cell recombination.Moving mesh codes remain somewhere in between these two outcomes.Within the IGM, all types of codes have very similar outcomes.
All codes are given the same requirements to form stars, though how those requirements are implemented can vary greatly.The code groups are each asked to determine, according to their code's design and particle generation format, the stochastic or deterministic nature of this process.This takes place at a threshold number density of 1cm −3 .The mass of each star particle formed also determined by the individual processes, only requiring a minimum mass of 6.1 × 10 4 M ⊙ .Details on the requirements for star formation within the codes in CosmoRun are given in Paper III.
Unlike in Paper II, where the form of stellar feedback was specified in an idealized galaxy disk, in the CosmoRun simulation of Papers III-VI (this work), we allow each supernova's schema for injection of metals, mass, and energy into the nearby gas to be as close as possible to the version most commonly used by that code group in comparable simulations.We do require some top-level parameters to be the same.Specifically, we require each supernova event to release at least 10 51 ergs of thermal energy, 14.8 M ⊙ of gas, and 2.6 M ⊙ of metals.This change was detailed further in Paper III. Different codes add many different effects or implement feedback in different ways, as shown in Table 1.
Notably, we use the "thermal-only" models analyzed in Paper IV Appendix B for CHANGA and AREPO.In addition to the logistical reasons stated in sections 2.2.4 and 2.2.7, this is useful because it allows us to examine one example of the CGM that results from each code architecture using simple thermal-only feedback, these being ENZO (AMR), CHANGA-T (SPH), and AREPO-T (MM).

Shared Analysis Tools
The most important analysis tool for this work is the highly versatile simulation analysis code YT.This code was first developed in Turk et al. (2011), and significant improvements to YT were integrated by AGORA collaborators during the process of writing Papers I, II, and III, alongside many others.The code has reached widespread adoption in the cosmological simulation community, and engagement from that community has led to significant improvements in all aspects of the code.The most significant update since Paper III to YT is the "demeshening", where particle codes were integrated much more naturally into the architecture, which was designed primarily for use on grid codes (Turk et al., in prep).We also rely heavily on a YT-based CGM tool TRI-DENT (Hummels et al. 2016), which makes sightline generation significantly easier, implements ion fractions using a lookup table from the photoionization code CLOUDY (Ferland et al. 2013(Ferland et al. , 2017)), and has efficient functions for both generating and analyzing realistic spectra.
With these two programs powering our back-end analysis, we have developed a user-oriented frontend tool AGORA ANALYSIS, 10 which is integrated into the shared supercomputer architecture 11 to make accessing each simulation snapshot and any necessary metadata for that snapshot (center coordinates, R vir , bulk velocity vector, angular momentum vector) very straightforward for use by any collaborators or interested parties.AGORA ANALYSIS also includes scripts for creating most of the images in this text, besides the ones which use individual sightline data for which there is another package QUASARSCAN.As an important point here, by default AGORA ANALYSIS will calculate the sizes of different regions using a virial radius which is the average of all eight codes' individual virial radii generated using ROCK-STAR (Behroozi et al. 2013).At a fixed stellar mass and with a fixed environment, it was decided that to include significantly more (up to ∼ 1.5 times, at most) volume in some codes would detract from the comparison, especially when considering the number of satellites of the main halo (Paper V).So, all virial radii and derived quantities taken within the 0.15 R vir edge of the galaxy or the 1.0 R vir edge of the CGM are shared among all 8 codes, even though individual virial radii have been calculated for each.
Finally, QUASARSCAN12 is a random sightline generator and analysis tool, first introduced in Strawn et al. (2021).It creates approximately 400 sightlines through the CGM by placing sightline start points on an enclosing large sphere (∼ 6.0 R vir ) at a discrete set of polar and azimuthal coordinates, and the vector from the galaxy center to the start point is normal to a "midpoint" plane within which the distance to the galaxy center will be equal to the sightline impact parameter.A midpoint is then selected from that plane at one of a discrete set of impact parameters from 0 to 1.5 R vir .The probability of each impact parameter and polar angle is weighted so that the lines comprehensively sample the area within that radius, i.e. higher impact parameters are more likely.The sightline is then projected from the starting point through that midpoint, and ends back on the aforementioned large sphere, on the opposite side of the halo.
Each line of sight integrates a set of ions of interest (here, Si IV, C IV, O VI, and Ne VIII) to calculate a column density.Furthermore, we calculate the overall metallicity, and the mean and peak densities along the line.For a small subset of sightlines, physical spectra are also projected and saved, for analysis with TRIDENT's built-in Voigt profile fitter (Figures 10,11,and 12).

RESULTS OF CGM STUDY
Because of how sensitive the CGM's observables are to so many different variables, it is worth reiterating the design philosophy of the AGORA project.In Paper II we have already established that the implementation differences between codes in idealized conditions are minimal.So, any significant differences between codes are likely to be a result of their different choices of stellar and supernova feedback at least as much as their underlying hydrodynamical and gravity solver.For the convenience of the reader, we will continue to refer to different codes by code name, rather than by referring to the feedback mechanism explicitly, except where the feedback appears to have clear effects on the outcome.This means that other simulation groups using a code in AGORA with a different feedback implementation are cautioned to be careful when comparing their simulation to the CosmoRun results for their code.As mentioned above, these are the initial feedback models, and several codes have already run the same ICs with new feedback prescriptions, which will be added to the AGORA public data release and will be analyzed in future works.We will also comment that at the level of detail of individual particles, streams, or other features, there are inherent stochastic and numerical effects, which means that some details might not be the same even between runs of the same code.This means, however, that we should be careful interpreting very specific objects, such as lines of sight, slices, and projections, and that instead, averages, profiles, and phase diagrams will be more robust to stochastic effects.

Differences in metal distribution and gas state
The most striking feature of the different codes for their observable CGM is precisely the difference in mass and metal distribution out to R vir and beyond, which depends strongly on feedback mechanism and code architecture.In Figure 2, we show the evolution of the gas mass distribution throughout all eight models over time, both as raw masses (top) and as a fraction of the total (bottom).The four components of gas mass are as follows: 1. "GAL (GAS)": gas within 0.15 R vir 2. "GAL (STARS)": stellar mass within 0.15 R vir 3. "CGM": gas (and stars) between 0.15 and 1.0 R vir , however only a small number of star particles are present 4. "IGM": gas (and stars) between 1.0 and 4.0 R vir , however as with the previous item, only a very small number of star particles are present.
5. "TOT": Total gas and star mass in the entire 4.0 R vir enclosing sphere.
We can notice here that all eight codes agree remarkably well in the total gas (red curve) at both redshifts z = 3 and z = 1.
Note that not all codes reach redshift z = 1, meaning that the codes do not necessarily agree at their own "last" points. 13ll of the AGORA galaxies are dominated by gas in the IGM throughout cosmic time, as expected due to it containing more than 98 percent of the total analyzed volume, and due to primordial gas which continues to inflow along cosmic filaments into the "IGM" region.Within the galaxies, there is significant variation between retaining more mass in stars or gas over time, with stellar mass eventually eclipsing gas mass in ART-I, ENZO, CHANGA-T, GADGET-3, and AREPO-T.Interestingly, the CGM mass (orange) remains more consistent among codes, even though whether the CGM is overall larger or smaller than the galaxy mass is not.The CGM contains in some codes more mass than galactic stars and gas combined, even being overtaken by stars alone in ENZO, CHANGA-T, and AREPO-T, notably the three codes using thermal-only feedback.Additionally, notice that all codes have an extremely "bursty" accretion pattern into the CGM with the mergers that take place at z = 5 and less noticeably at redshift z ∼ 2.
In Figure 3 we examine the distribution and evolution of metals in the different regions with time.As in Section 2.3 we have selected to take 4.0 R vir as the outer boundary of the IGM because inclusion of any regions outside this distance creates unphysical metal distribution results.This arises because with integration of large volumes, the metallicity floor for the AMR codes results in substantial metals far from any meaningful sources, while the total in SPH codes is much lower.Within this sphere, all metal mass can be assumed to originate in local stars, either within the central galaxy or in satellites.
Overall, the total metal creation (red) is relatively consistent, though not as consistent as we would expect, given the requirements on each code and the closeness of their star formation rates.The effective metal yields are given in Table 1 of Paper III, and generally the yield is a metal mass of 0.033 M ⊙ for each 1.0 M ⊙ of stellar mass.The exception to this is GEAR, where the metal production (yield 0.015) could not be detached from the star formation prescription.This means that metal production in GEAR is consistently at least a factor of two below the other codes, though it is worth noting that it is more than a factor of two below at other times, indicating it is not only the yield which suppresses metal production.At redshifts approaching z = 1, ART-I slows this production significantly.This is likely due to an oncoming quenching period where star formation slows down in most codes, and which will be a topic of future AGORA papers.
Total metal production is within a factor of 4 at redshift z = 2 (between RAMSES and CHANGA-T), and retains about the same range at z = 1, but now between GEAR and ENZO.Overall, ENZO's consistently high star formation causes a dramatic turnaround from the slow start; in Paper III it was noted that ENZO had the lowest stellar mass of all eight codes in CosmoRun at z = 4.Here it has the highest SFR by a decent margin, with only CHANGA-T coming close, and already slowing down by z = 1.5 (Figure 2).This has a complex relationship with ENZO having the strongest purely thermal feedback of all AGORA codes, which clearly suppresses star formation at early times but which then allows additional star formation at later times.Further discussion of the star formation rates as a function of time and feedback process can be found in Paper IV.
Interestingly, there is no consistent pattern as to whether most metals within the galaxy remain locked into stars, effectively inaccessible to any kind of gas mixing (ART-I, CHANGA-T, AREPO-T, GIZMO), or whether most metals are in the ISM and thus could be subject to outflows and/or recy-cling (RAMSES and GEAR, codes both using T, DC feedback), while ENZO and GADGET-3 keep the ISM and stellar metal mass roughly equal.
Another striking feature of this plot is how some codes, in particular ART-I and RAMSES, send comparable amounts of metals into the IGM or CGM as remain inside the galaxy, including star contributions, while most codes keep the vast majority inside the galaxy.With regard to how far the average metals go, we can note that regardless of how much of the metal mass leaves the central galaxy, generally metals that do leave become roughly equally divided between the IGM and CGM, with the exception being the fast-outflowing ART-I and AREPO-T galaxies which eject metals so quickly from the ISM that they flow through the CGM and immediately leave, leading the IGM to dominate the metal distribution, though as ART-I approaches z = 1, the metals slow down and seem to return to the CGM.Metal diffusion and transportation processes depend in complex ways on code architectures, as discussed in detail in Section 3.2 of Paper III.In grid codes, diffusion over surfaces is built in with solving the Reimann problem on each cell interface, while in particle codes, diffusion is often implicit in the smoothing procedure.
Moving mesh codes can provide either explicit or implicit diffusion depending on their architecture, see Paper III for explanation of GIZMO and Paper IV, Appendix B for one of AREPO.
Finally, we will analyze a property which is common to all eight codes.Namely, in Figure 4 we show the overall metallicities of the outflowing and inflowing gas elements (cells or particles) in the left and center columns.In the outflowing gas column at z = 3, while there is an approximately two orders of magnitude difference between the highest and lowest metallicities, all codes remain approximately flat with radius outside the galaxy, declining only by a factor of 3 at most (in CHANGA-T).GEAR remains constant outside of 0.5 R vir , but declines significantly within that region, indicating that with the feedback process implemented in that code, only a small amount of ejected gas reaches the virial radius (in addition to the previously mentioned factor of 2 lower yield, see Paper III).Inflowing gas has signficantly lower metallicities overall in all codes, with a significantly stronger decline with radius.In the third column of Figure 4 we show that the ratio of inflowing to outflowing metallicity is much more closely constrained, with less than an order of magnitude difference between the codes.Figure 4 suggests that all codes have outflows and inflows interacting with similar dynamics, which causes inflows to significantly increase in metallicity as they approach the central galaxy.The similarity between codes on the ratio of outflows to inflows, combined with the very different total metallicity of each, suggests that it is indeed the feedback systems, rather than the overall code architecture (which would control inflow-outflow dynamics) which affect the distribution of metals.Previously, it was found that, in some cosmological simulations (Mandelker et al. 2020;Strawn et al. 2021), cool inflows entrained metals from the hot outflowing material, so that when they fed the galaxy they were barely more metal-poor than the hot outflows, leading newly formed stars and cool gas to be generally not "pristine."These results suggest something broadly similar here, and so gas entering the galaxy from outside is likely to be only mildly more metal-poor than the ISM itself.

Comparison Snapshot Analysis
Here we will perform a detailed analysis of a single snapshot for eight codes at redshift z = 3, and five codes at redshift z = 1.These redshifts are chosen to avoid any effects from the timing discrepancies of mergers at redshifts 4 and 2 (See Paper IV for details on the timing discrepancies).First, we analyze a projection plot at a particular viewing angle for Figures 5 and 6.The rows of this plot are metallicity, column density, temperature, and radial velocity, respectively, with columns representing different codes.Note that these plots are chosen to be axis-aligned to show shared structural features.Face-on and edge-on figures are available in Paper IV.
We also elected to use thin mass-weighted projections rather than slices to facilitate straightforward comparisons, which due to stochasticity, timing discrepancies and minor numerical effects, have features which are rarely aligned into identical planes, even if they are largely the same.A good example of both is the cool streams which are visible in the temperature projection (third row from top) in each code, which are clearly relatively similar between codes here; with slightly different image parameters these streams would only appear in some panels.As noted in Stewart et al. (2017), in many simulation codes, including several codes showcased here, angular momentum is primarily built up through these inspiraling flows which are connected to the cosmic web.
At z = 3, there are many similarities between the snapshots.The mass structure is broadly the same in each code, with the main star formation fuel -cool, dense, inflowing streams -being approximately z-axis-aligned, with N ∼ 10 21 cm −2 , and a hot outflowing bulk medium elsewhere.Average column density in the galaxy region is at 10 23 cm −2 and above in all codes except ART-I.Within the CGM, average densities outside of the streams are around N ∼ 10 20 cm −2 , with only GADGET-3 seeming to have a significant filling out to the virial radius with higher density.In temperature, there is a fairly substantial difference between the grid and particle codes, with significantly more cool gas visible in ART-I, ENZO, RAMSES and AREPO-T.RAMSES and AREPO-T have particularly strong contrasts, containing cooler highdensity clouds and a hotter low-density bulk.Moving mesh codes have behavior somewhat in between the two styles, with GIZMO more closely resembling the particle codes and AREPO-T more closely resembling grid codes.
In this axis-aligned image some important differences can be very subtle, such as that in some codes (ART-I, ENZO, RAMSES, AREPO-T), the inflowing stream from the center right merges with the other stream on the top right near the virial radius and gives the impression of a single stream entering the halo, while in others (CHANGA-T, GADGET-3, GIZMO), the three streams generally merge much closer to the galaxy, appearing as more or less separate valves for inflow.This is a highly stochastic effect, which depends sensitively on the plane chosen and timestep.The most significant difference is in fact the volume and metallicity of the outflow structure.An extremely visible effect which distinguishes particle codes from grid codes is how fast gas is ejected, as seen in the radial velocity images in Figure 5 (bottom row).While a biconical outflow structure is visible in all codes (though very faintly in GIZMO), the difference between the extremely fast speeds in the grid codes and the much slower speeds in the particle codes leads to metals being more uniformly distributed in grid codes out to large distances, as also noted in Shin et al. (2021).Examination of larger-scale plots, shown in Appendix A, demonstrates that the maximum spa- tial extent of metals in these codes is a sphere of about 4.0 R vir .
ART-I and RAMSES are by far the strongest, and send gas sometimes with supersolar metallicities at speeds on the order of 100s of km/s, with AREPO-T containing similar speeds in somewhat narrower outflow jets, (note that Figure 5 is a mass-weighted projection, so these values are significantly diluted by slow-moving or slowly infalling gas along the projection lines-of-sight).The GADGET and CHANGA-T snapshots have similarly shaped high-metallicity biconical outflows, but much slower, and GIZMO has even weaker outflows.While ENZO'S outflowing gas is as fast as the other grid codes, its much narrower structure means fewer overall metals leave the virial radius.Finally, GEAR has significantly less metals sent into the CGM than any of the other codes, due to the low yield, highly concentrated center and relatively slow outflows.
By z = 1, (Figure 6) a number of changes have taken place.The higher density gas filling the virial radius, seen before in GADGET-3 has also happened in GEAR.The grid codes, here including AREPO-T, remain largely filamentary, most visible in low-metallicity in the top row.Grid codes retain both faster inflows and outflows, and over the time from z = 3 to z = 1, we can see that both particle codes have significant metallicities only about out to the virial radius (with GEAR somewhat less than GADGET-3), while the grid codes have effectively filled the visible IGM.
Another view of the inflows and outflows from the galaxy can be seen in Figure 7.Here we analyze temperature profiles averaged over spherical annuli at different distances to the galaxy center.We analyze two populations of interest, which are the high-density inflows and metal-rich outflows, defined as gas parcels (cells or particles) which have v r < 0 km/s, n > 10 −2.5 cm −3 and v r > 0 km/s, Z > 0.1Z ⊙ , respectively.We can see that indeed these galaxy-fueling inflows are significantly cooler than the outflows.Interestingly, there are significant differences in the profiles by code type and feedback mechanism.First, grid codes (here including AREPO-T) at z = 3 have their fueling inflows heat up significantly less on the final approach to the galaxy than particle codes, reaching around 10 4.5 K to the particle codes' 10 5−5.5 K.This difference between code types might be due to slightly higher densities in the cool inflows in grid codes (Figure 5), giving them access to faster cooling, and interestingly is different from the result of Nelson et al. (2013), which did a similar study without explicit feedback.As time evolves to z = 1, several codes do not reach this threshold density of n > 10 −2.5 cm −3 in significant parts of their CGM, leaving gaps such as in ART-I at high radial distance.At the same time, only GADGET-3 can still be seen reaching the high temperatures mentioned above.The outflows in Figure 7 have even more substantial differences in temperature, at about an order of magnitude from 10 5 and 10 6 K.At z = 3, five of the eight codes follow a very similar power law, mostly codes with simple thermal feedback or weaker delayed cooling (with the exception of GADGET-3).ART-I, on the other hand, becomes much cooler past around 0.25 R vir , while RAMSES and GIZMO, after an initial decline with radius like the other codes, actually increase in temperature to 10 6 K as they approach the outer halo.This remains roughly the same at z = 1, except that the codes just mentioned did not reach this redshift and so it gives a (misleading) appearance of further convergence.
In Figure 8, we show the total probability density function of all gas at z = 3 in each of the AGORA simulations, from r = 0.15 R vir to r = 1.5 R vir , thus including the CGM and some of the IGM.In all codes, a primary cooling curve is visible at around 10 4 K.We will refer to this as the "cooling track," which follows the minimum gas temperature for which cooling is stronger than heating (see Figure 5

in Paper III).
There are a few interesting distinctions between the AMR and SPH type codes in Figure 8. AMR codes are generally more likely to fill out large clouds in phase space both above and below the cooling track, with no other really distinguishable structure.In ENZO, we can even see a significant population of cold gas.SPH codes, on the other hand, have no or negligible cold gas here.They also have a much more apparent hot cloud (yellow cloud in upper left), clearly out of pressure equilibrium due to increasing in density with increasing temperature rather than decreasing.This hot cloud follows an isentropic line, meaning gas in this phase follows the equation T n −2/3 = const., as seen in other high-temperature, low-density gas in e.g.Paper II and Shin et al. (2021).For gas which reaches these high temperatures, the only relevant cooling process is very slow bremsstrahlung radiation, so it then expands more or less without significant cooling.This means the particle codes have a much more straightforward two-phase structure: cool, high-density streams and hot bulk material, though to some extent this is because SPH codes do not have very many particles in the outer CGM.
As the codes evolve to redshift z = 1 (Figure 9), they spread out to fill more of the low-density phase space, while losing most of the cold gas below and to the right of the cooling track.This may be connected to the mass threshold for virial shocks the codes cross at around this time.This mass, around ∼ 10 12 M ⊙ , was first proposed in Birnboim & Dekel (2003); Dekel & Birnboim (2006), and has been explored further by a number of other groups (e.g.Keres et al. 2005;Kereš et al. 2009;Faucher-Giguère et al. 2011;van de Voort & Schaye 2012;Stern et al. 2020).In particular, in light of the results in Stern et al. (2021); Hafen et al. (2022), a consequence of this shock heating of the inflowing gas could be that when the  simulations reach z < 1, gas will cool more slowly and have time to virialize (i.e.relax and rotate coherently) before entering the galaxy, with the thin gas/stellar disk forming from this coherently rotating and slowly cooling gas.While in this work we do not use redshifts below z = 1, an assessment of this "outside-in" virialization scheme at lower redshifts will be pursued in future AGORA papers.Interestingly, the grid codes now form a similar isentropic hot cloud as mentioned for particle codes at z = 3 (upper left region of phase plot).This suggests that this heating effect simply takes place significantly faster in particle codes, but eventually does follow in grid codes.In all five codes, this hot phase seems to have drifted away (to lower density) from the "cooling track".
On this plot, we also show the distribution of ∼ 400 sightlines sent through the CGM, which will be examined further in Sections 3.3 and 3.4.The sightlines are here shown according to the density of their maximum-contribution element (where the contribution is defined as number density times path length for that element) and mass-weighted average temperature, thus showing cell features intersecting sightlines.Along a sightline, SPH codes are deposited in the form of line segments indistinguishable from grid-type "cells"; however, this can lead to somewhat strange behavior if a sightline is far from a direct intersection with any particular gas particle, such as the extremely low-density points in GADGET-3.The color indicates the impact parameter of each sightline, with blue being near the galaxy and red being at or near the virial radius.We will discuss the sightlines in more detail in the next section.The main result here is that at redshift z = 3, the average temperature of sightlines remains roughly constant with increased maximum density, showing that the densest (and likely coldest) cells don't dominate the overall temperature distribution, or in other words, sightlines dominated by a high-density cell go through multiple phases with a comparable total mass contribution.There is a clear impact parameter dependence, showing more distant lines of sight are significantly less likely to go through high-density cells/regions.
At z = 1, by contrast (Figure 9), there is a much more significant temperature dependence on density, in both grid and particle codes.Higher-density sightlines (which remain largely close to the galaxy) have, on average, significantly hotter gas, indicating that the denser regions that lines pass through now more effectively dominate the mass distribution along the line of sight.

Metal Ions in Mock Spectra
Our understanding of the CGM in the real Universe, rather than in simulations, is generally predicated on observing different ionization levels for astronomical metals, which probe different temperature and density regions.We expect that as the ionization state depends sensitively on multiple variables (temperature, density, metallicity), the different AGORA CGMs should be very different compared to observations.In this AGORA project we categorize how each of these variables contributes to observable results, rather than attempting to track which code or feedback mechanism is "best," though future projects could do further analysis of how well different feedback strategies fit the observations.In this section, we analyze some characteristic spectra (Figures 10,11,and 12), as well as decompose ion column densities into their constituent factors (Figure 13).We focus on four mediumhigh ions: Si IV, C IV, O VI, and Ne VIII.These were chosen because these are the most commonly observed higher ions, generally because they have very strong lines.We avoided analysis of low ions O II or Mg II because none of these codes should be able to resolve the small clouds expected to host them (Hummels et al. 2019).We then compare the radial column density profiles to a selection of observational results and present some insights as to what causes convergence or divergence from these results.It is apparent that there are dramatic differences in the visible mock spectra 14 for the selected ions in each code.In Figures 10 and 11, we examine several lines at both z = 3 and z = 1, chosen out of a sample of 31 lines to be representative of the simulation overall while containing at least some detectable absorption.Because stochastic effects would make direct comparisons of "the same" sightlines unlikely to probe exactly the same phases in different simulation instantiations, we instead randomize sightlines in each code independently and take this set to be a full and independent sample.Noiseless spectra are used here to more deeply understand the physical conditions underlying detections.In Figure 12 and associated discussion we will examine the effect of adding noise to these spectra at a given signal-to-noise ratio.Voigt profiles are identified using the built-in TRIDENT line fitting tool (Egan et al. 2014), with centroids marked with triangles.Absorption lines within 15 km/s of one another are considered part of the same "component," and components are marked with black bars.We will analyze the spectra on a code-by-code basis, also comparing the two redshifts if they are available.
• ART-I: ART-I has spectra which at z = 3 contain both deep and wide absorption lines, with many components.While there is some amount of overlap between O VI and C IV, the lines are generally not well connected, with C IV much more closely tracing Si IV.
As ART-I evolves to z = 1, there is an evolution towards higher ions.While absorption gets significantly weaker and broader in general, we also see that O VI has become the dominant line and is generally accompanied by Ne VIII, while C IV has reached a negligible level.
• ENZO: Like ART-I, ENZO shows a large number of fairly deep and wide absorption lines in C IV and O VI at z = 3, with each dominating in different components, in addition to small amounts of Ne VIII.The main components are also quite widely separated in velocity-space, so the scale is significantly wider than all other codes besides RAMSES.ENZO evolves to z = 1 by becoming weaker in general, except for growth in Ne VIII, which is mostly aligned with O VI, though some C IV/O VI alignment is still visible.
• RAMSES: RAMSES at z = 3 contains very wide O VI lines with only minimal overlap with also significant C IV lines.In some cases, cooler clouds are "bracketed" by presumably hotter clouds, like the two 14 TRIDENT's default behavior was modified to create spectra with LOS velocity rather than using cosmological redshifting along the line, see https: //github.com/trident-project/trident/pull/196Ne VIII components detected on either side of the deep C IV/Si IV component at ∼ 25km/s.This occurs regularly throughout RAMSES spectra.
• CHANGA-T: The SPH codes generally have less absorption overall in these ions.CHANGA-T has some clouds of both C IV and O VI, with the former generally being stronger.The two are often loosely aligned, but not perfectly, indicating they follow similar dynamics, but are generally not in the same clouds.Some clouds further show detectable Si IV aligned with the C IV, though that is not visible in this figure.
• GADGET-3: In GADGET-3 at z = 3, there is more significant absorption than in the other particle codes.
Larger O VI components tend to be aligned with, or almost "contain", slightly weaker C IV lines, the most significant of which also tend to contain detectable Si IV.This structure is only minimally changed as GADGET-3 approaches z = 1, with the main difference being that the strongest components, rather than containing any Si IV, now contain a small amount of Ne VIII, with extremely wide lines.
• GEAR: GEAR almost never has detectable absorption in any ions except when the sightline passes through the very innermost part of the halo or the galaxy.Nevertheless, some relatively significant and deep clouds can be seen in both Si IV and C IV. O VI is very rare.Evolution to z = 1 affects mostly what species are visible.The kinds of components which previously appeared in C IV are now visible instead in O VI.
• AREPO-T: AREPO-T has somewhat deeper absorption lines at z = 3 than particle-based codes, and interestingly has an extremely wide spread in velocity-space, with multiple very discrete clouds separated by hundreds of km/s.These mostly align C IV with Si IV, or O VI dominated clouds even further out in velocityspace.As AREPO-T evolves to z = 1, it becomes significantly weaker, with Ne VIII becoming stronger than C IV, and the lower ions fading.
• GIZMO: The GIZMO run has generally few components per sightline, though there can be significant absorption along them.In the spectrum shown in Figure 10, we again see a "bracketing" behavior, where two O VI components (which align closely enough that they give the impression of one slightly skewed line) are seen on either side of a C IV component.
Next, we examine more quantitatively the properties of the absorption lines detected by TRIDENT using the methodology described in Egan et al. (2014) in Figure 12.Here we create spectra for 31 sightlines through each simulation, and analyze each twice.Once, using the noiseless spectra of Figures 10 and 11, and then again with Gaussian noise added so that the signal-to-noise ratio (SNR) is 10, on the higher end of modern observational capacity.The noiseless results are in solid colors, and the SNR = 10 results in empty squares.Rough observational results, when available, are shown as cyan rectangles.
First, we see that, the column densities of the individual components are similar among all the codes, increasing with higher ionization energy from about 10 12.5 to about 10 13.5 cm −2 , with very little evolution over redshift.At z = 3, this roughly agrees with observations, but at z = 1, observed components are substantially larger, either due to higher noise making smaller components undetectable, or through physical divergences between the codes and observations.Similarly, the line width, or b parameter, remains fairly similar between codes (though substantially below observations) at z = 3. b increases with increasing ionization energy for most codes, besides GIZMO and AREPO-T, and noise can be seen to cause decreases in width in higher ion species.At lower redshift, this conclusion remains broadly the same, though all widths are somewhat decreased compared to z = 3, with observational values also falling to reach rough parity with the simulations.Since the CGM is getting hotter, as we saw above in comparing Figures 8 and 9, this indicates that turbulence, the other source of Doppler broadening, must be decreasing.
The covering fractions have significantly more variation.At redshift z = 3, we see that all three grid codes, and GADGET-3, have more or less uniform coverage of C IV and O VI, even though those are usually used to probe very different clouds of gas.Two of them, ART-I and RAMSES even extend this to Ne VIII, though with somewhat less coverage.Particle codes, on the other hand, have a clear peak around O VI, with the exception of GEAR which peaks at lower ionization level with C IV. Noise usually decreases covering fractions, except when they are very close to 0. Interestingly, it is the lower C IV covering fraction in the noisy spectra of particle codes that most closely aligns with the observations (Galbiati et al. 2023).Covering fractions for most ions lower as the codes evolve to z = 1, however all codes moderately increase their Ne VIII covering fraction, at least in the SNR=10 data.This shows that the CGM is generally getting hotter over time (see also Figure 7).The Ne VIII covering fraction remains noticeably higher in ART-I at both redshifts, making it the only one approaching the value in Burchett et al. (2019).ART-I also sees a general collapse in C IV and Si IV detections at this redshift.The most significant disagreement with observations here is in Si IV, which in Werk et al. (2013) was significantly more likely to be detected than in any code.This could be an artifact of the lower redshifts and smaller impact parameters used in COS-Halos (Figure 15), or it could result from the codes' resolution limitations having difficulty generating clouds for ions lower than C IV.
Completing this analysis, in the fourth row of Figure 12 we track the total number of detected components in sightlines which were covered.In other words, if an ion is detected at least once in a sightline, how many components (usually interpreted as "clouds", though see Marra et al. (2022) for a counterargument) is it found in. 15Generally there are more clouds detected with noise, as some noise patterns can make what is really a single component look like two peaks.There is a significant gap between grid and particle codes in the number of O VI components at z = 3, and a smaller one in C IV. GIZMO is an exception here, and generally shows more fragmentary components than the other particle codes.Ne VIII almost always has a small number (1-2) of components.At lower redshift, interestingly, while the coverage increases or is maintained for O VI and Ne VIII, the number of components goes down for all species in most codes, sug-gesting that clouds are getting bigger and more uniform, even while becoming less numerous.

Metal Ion Origins
While spectra can lead to useful information would be difficult to estimate with more simplistic analysis methods (see for example Hafen et al. 2023), it is also useful to disentangle the source of the differences between codes more precisely.The column density of an ion can be decomposed into the product of three factors times a constant abundance A x , as described in Equation 1. Often, absorption line systems are assumed to probe only one of these variables, sometimes leading to confusion or misleading statements.
In Figure 13, we examine this situation by separating out the three variables.Here we show a suite of ∼ 400 lines of sight passing through each galaxy's CGM.For each of the same four ions, Si IV, C IV, O VI, and Ne VIII, we have directly calculated the column density along each line of sight.For grid codes, this is the sum of ion number density (calculated with TRIDENT) times sightline path length for each cell in the sightline path.For particle codes, column density is instead calculated by dividing the path length into discrete sections defined by the smoothed gas particle field, and then integrating the ion number density of that smoothed particle times section length. 16These column densities are the y-values of the points in the scatterplots of figure 13, with the same sightlines appearing in each panel.
The column densities described above are plotted against the total hydrogen column density (left, calculated similarly to the ion number densities), average metallicity (center, calculated as the total metal column density over total hydrogen column density), and the total ion fraction along the LOS (right, calculated as the column density of the given ion divided by the total column density for the element).The diagonal lines on each image are linear relationships, so if one factor alone could explain the variation in column density, points would follow these lines in one column, and have no correlation in any other column.To guide the eye, and for comparison to available spectra, lines which were highlighted in Figures 10 and 11 are plotted as small and large stars, respectively.
What is remarkable about this factorization of column density is that there is no single variable that controls ion column densities.Instead the relationship appears to change with ionization energy, with lower ions (upper rows) following a different pattern than higher ions (lower rows).
The lower ions Si IV and C IV appear to track much more strongly with ion fractions than anything else, and the higher ions O VI and Ne VIII instead track most closely with metal- licity.There appears to be a continuous morphing of the shapes in both the center and right columns as one tracks from the top row to the bottom.The center column compresses from a wide scatter to a linear relationship along the N X i ∝ Z lines, while the rightmost column starts as a clear linear relationship for low ions and flattens out into an approximately constant f X i ≃ 0.1 for high ions.The leftmost column is less clear, because ion fraction depends sensitively on density so these values are not independent.While this image only shows four ions, this trend remains uniform to both higher (e.g.Mg X) and lower (e.g.Mg II) ionization states besides the ones shown here.
The multiple simulations with controlled conditions used in the AGORA project are useful for our interpretation of this result.For example, let us compare the distribution of sightlines by code (color).In the center column of Figure 13, metallicities are tightly grouped together on a code-by-code basis by color, especially in the bottom row.In the ion fraction graphs, however, all codes follow very similar tracks, with wide spread in ion fraction for low ions, and a very narrow range for high ions.Thus, ion fraction depends more strongly on the sightline position within the simulation (for low ions), while metallicity depends more on the parameters of the simulation itself.
If rather than a large number of calibrated simulations, we were only studying one or two implementations, it would have been very straightforward to see intra-code ion fraction variations, and much harder to see inter-code metallicity variations.If two implementations were close in metallicity, this variable would appear to have negligible impact, and if they were widely separated, it would appear to simply make the codes impossible to directly compare.Only with a large number of codes that fill in a wide phase space of possible metal diffusion patterns, as in AGORA, is the increasingly linear relationship with increasing ionization potential between metallicity and column density visible.Most uncalibrated simulation suites would struggle to disentangle confounding effects such as differences in mass and environment, whereas here the physical reason would be more likely to relate to the implementation of the ionization models within CLOUDYunder the variety of conditions caused by different systems of metal diffusion and models of feedback.

Comparison with Observations
In Figures 14 and 15 we see that there are significant differences in the radial profiles of ion column densities in the different simulations compared to observations.The connected dots represent the median column density values at that distance, and the error bars are the 16th and 84th percentiles over the same 400 sightlines used in Section 3.3, which would correspond to one standard deviation if the column densities followed a Gaussian distribution (which they gen- erally do not).We will point out that this comparison is inherently limited by the difference between the simulated and observational datasets.The observations chosen were designed to be in the CGM of similarly-sized galaxies at around the same redshift, but will inherently measure unknown phases through the CGM of distinct halos, while the simulation data is, for each snapshot, multiple sightlines through the same halo.This inherently means that while some variation naturally does reflect the random phases the sightlines may pass through, other variation will reflect the different halos and environments, which is not available in AGORA.The relatively small error bars on the simulation data show that even though the CGM is a multiphase medium, the different phases are distributed in such a way that most sightlines sample many available phases, and so different lines of sight with the same impact parameter have similar column densities.However, it is clear that these distributions can be very different between codes.Because the CGM is relatively unconverged between codes according to multiple metrics (gas temperature, density, metal distribution, and to some extent, resolution), it is not recommended to interpret these results as primarily indicating which feedback system (or which codes) agree "most closely" with observations.Rather, what is most useful about this analysis is to disentangle which metrics matter more for the ion of interest.2022) having sufficient data to filter by redshift, so here we show only points with 0.4 < z < 1.0.The former two were at lower redshift (z < 0.4) and so are only approximately comparable to the AGORA galaxies.
For example, at z = 3 (Figure 14) we can see that there is a very clear bimodality between the grid and particle type codes, which is most visible for the Si IV and C IV profiles.For Si IV and C IV, the grid codes are more or less aligned with the data in Rudie et al. (2019); Galbiati et al. (2023) where there is data outside the innermost halo,17 with still some slight underprediction for Si IV at mid-range (0.5 -0.8 R vir ).It is notable that the higher ions remain more constant with impact parameter, especially at higher distances from the CGM.This makes sense considering that higher ions are more sensitive to metallicity than gas state as shown in Figure 13, which depends more on which code is used than where N is the column density in cm −2 , EW is the component equivalent width in Å, λ 0 is the rest wavelength of the transition in Å, and f is the oscillator strength of the transition, taken from TRIDENT documentation.This equation requires the profile to be in the linear regime, meaning EW < 0.2 Å.We get relative distances by dividing Galbiati et al. (2023) impact parameters in kpc by the AGORA z = 3 virial radius, 53 kpc.
where the sightline penetrates it due to differences in metal mixing and diffusion.
As the codes evolve to z = 1 (Figure 15), there is a significant convergence in the Si IV and C IV profiles, while the O VI and Ne VIII profiles remain more spread out over three orders of magnitude.All codes drop much lower than the detectability threshold within 0.3 R vir for Si IV (Werk et al. 2013) and C IV (Chen et al. 2001),18 while only ART-I and ENZO seem to generate enough metals to match the O VI profile from Tchernyshyov et al. (2022) in the outer halo (though more codes are close in the inner part of the halo).Ne VIII has a much more significant scatter in Burchett et al. (2019), and no code really effectively resolves it; however, the scatter in the simulations remains fairly low, indicating perhaps that metal mixing is too efficient (as we can see with the high degree of homogeneity in sightlines by code in Figure 13) or that the Ne VIII dominant phase is too efficiently distributed throughout the CGM.It could also imply that the difference between different halos with different masses, environments, and other conditions, is more significant than the difference between sightlines, which would resonate with this study but unfortunately cannot be tested with the single implementations of each galaxy used in AGORA.
Finally, we showcase a relevant effect which might be causing low and high ions to respond differently to ion fraction versus metallicity, to motivate future work in this field.In Figure 16 we show z = 3 phaseplots similar to those seen in Figure 8, except now colored by metal mass rather than total mass.Each phaseplot is repeated four times vertically, and plotted over each are the 1 percent and 20 percent ion fraction contours for the four ions being analyzed in this work.As argued in Strawn et al. (2021) and Strawn et al. (2023a), in the horizontal "upper" ridge, each ion should be considered collisionally ionized (CI), and the diagonal "lower" ridge, it should be considered photoionized (PI).
As we can see here, all of these codes have their Si IV PI ion fraction peak either somewhat overlapping or at least near the general "cooling track" curve, with slightly less overlap for C IV. O VI and Ne VIII have PI peaks (and in other models with different parameters, these can be important, see for example Stern et al. 2018;Strawn et al. 2021), but they take place at densities so low that they are not occupied on these phaseplots.It is important to note that the CI ion fraction peaks are not at the same temperatures for all ions.For high ions, these are approaching the bulk of the metal mass in the hot phase, while for lower ions, the collisional peak is in the less occupied "middle" region between the cool and hot phases.Therefore, high ions are much more ubiquitously created through collisional ionization and there- fore more weakly sensitive to density.Nevertheless we note that these collisionally ionized column densities are by no means totally independent of density, as seen in the leftmost column of Figure 13.
Examining these results, we posit that the evolution in ion factorization shown in Figure 13 from low to high ions might be correlated with the switch from dual contributions of photoionization and collisional ionization for lower ions to collisional ionization dominance for higher ones, though more research on this point will be needed and in a larger parameter space than that swept out by AGORA.These results could be substantially changed with the inclusion of more physics allowing for more small, cool clouds to survive in the halo or be created there, such as magnetic fields (Nelson et al. 2020) or higher resolution (Peeples et al. 2019;Hummels et al. 2019;van de Voort et al. 2019;Ramesh & Nelson 2023).Future AGORA projects which include these improvements, as have been suggested, would be an excellent way to disentangle these effects, and possibly modify the conclusions found here.

DISCUSSION AND CONCLUSION
The AGORA project is and remains primarily a community of scientists attempting to understand whether the results of cosmological and galaxy simulations are at this time converged, and what aspects of this theoretical project are and are not well understood.Scientific programming is generally not designed to be highly scalable, or to be adopted en masse and maintained by large, professional companies.Indeed, as new techniques are developed and processing power increases, scientific codes need the flexibility of being developed by a small group to remain cutting-edge enough for original research, with new codes arising whenever their need becomes apparent.Thus, a large number of groups are developing more or less redundant codes which all attempt to answer the same question: does application of known and commonly accepted galactic astrophysics create adequately realistic galaxies?It is much more rarely asked, does the application of this shared physics always create the same results with each different implementation method?AGORA was founded to analyze this question, and to generally get the backend simulation developers in contact with one another, so their simulations could be mutually intelligible.
In Paper I and Paper II, this question was approached by development of all codes to accept common input files which standardized the presentation of initial conditions, heating/cooling functions, visualization tools, and other aspects.With CosmoRun (Paper III and IV), it was further asked whether different (commonly used) physics prescriptions change simulation results, holding everything else, even particularities like initial conditions, constant.It was necessary to expand the scope in this way because the codes were so particularized in their development that it would be impossible to effectively modify the codes to use the same "feedback" (which here is only stellar feedback, though AGORA will be developing new AGN simulations in the near term) without changing the codes so dramatically from their normal use that it no longer represented a comparison between commonly used codes.This new approach made the AGORA project much more complex, as now two variables, code implementation and feedback prescription, control the outcomes instead of only one, and these outcomes are correspondingly much more different from each other than they were for the simulations in Paper I and Paper II.
The result in Paper III was that even with these significant differences, the codes could be compatible with overall results in star formation, i.e. realistic star formation histories were compatible with many different feedback implementations.But as we show in Paper IV and here, other effects such as merger timing discrepancies and especially the quantity and state of mass and metals distributed into the CGM, and the state of that gas with respect to observable quantities, is vastly different, making direct comparisons more challenging.This more complex simulation space leads to significant benefits as well as challenges.Particularly, it allows us to examine a vast parameter space in a way that the individual implementation of each code or the multiple formation histories of different galaxies can be neglected, which could help us reach a more sophisticated understanding of the physics, either of the simulations or of their accompanying analysis tools.
The main results presented in this paper are as follows: 1.All codes retain similar total gas mass into the CGM from z = 6 and below, but send vastly different metal masses into this region.
2. All codes mix metals between inflowing and outflowing phases in similar ways, but they are mostly distinguished in how many metals are in either phase, according to the variety of feedback prescriptions used.
3. All codes have some amount of hot, metal-rich biconical outflows and cool inflowing streams.The outflows are significantly faster in grid codes and slower in particle codes, with moving mesh codes somewhere in between.
4. Spectra between medium-high ions are often kinematically distinct from each other, and in some codes O VI aligns with C IV; in others O VI with Ne VIII, and in others no alignments are found, showing that the ions visible in spectra do not always arise from the same gas temperature-density phase.
5. Low ions are more strongly determined by ion fraction, while high ions are more strongly determined by metallicity.This difference may have to do with the photoionized or collisionally ionized origins of the species at different energy levels.
6.Most codes underpredict ion column densities for most ions, with significant spread between codes.Low ion column densities generally have more impact parameter dependence than high ions, which have stronger code and feedback type dependence instead but change less steeply with radius.
Future work with the CosmoRun galaxies will involve more detailed comparisons with observations using the radiative transfer code SKIRT (Baes & Camps 2015), and possibly a final follow-up on halo evolution (Papers III and IV) down to z = 0. Other projects will include continuing analysis of ionization states in the CGM and further analysis of the satellite galaxies in a follow-up to Paper V. Additionally, new codes such as SWIFT (Schaller et al. 2023) and GADGET-4 (Springel et al. 2022) have expressed interest in joining this project.These will be added to future CosmoRun papers, though they had not finished running at the time this work was submitted.Finally, a re-run of the CosmoRun simulation with higher resolution might be executed to compare how the increased resolution changes each code, as well as allowing us to compare more detailed structures such as clumps or smaller clouds in the CGM.
Besides these, AGORA will continue to run new simulations, including simulations of an AGN interaction with the isolated disk conditions of Paper II, and technical analyses of the codes' responses to heating and cooling curves (Revaz et al., in prep.).As the simulation community continues to add newer and more efficient physics and implementations, collaborators are committed to planning new AGORA simulations to continue to dive into their effects, as simulation groups around the world try to converge on all the critical questions surrounding galaxy and cosmological evolution.

Figure 2 .
Figure2.Current distribution with redshift (evolving from right to left) of gas mass in the galaxy, CGM, and IGM, both in solar masses, top, and as a fraction of the total, bottom."GAL" (galaxy) refers to the region from 0.0 -0.15 R vir , "CGM" refers to 0.15 -1.0 R vir , and "IGM" is defined as the region between 1.0 -4.0 R vir .Starred points are added to each line at redshifts 3 and 1, to guide the eye when comparing to other plots in this work.Additionally, the shaded region down to z = 2 is shaded to indicate the epoch reached by all 8 codes.Inside the galaxy, the total mass is split between stars and gas.

Figure 3 .
Figure 3. Like Figure 2, but now tracing the total mass of metals in and around the main AGORA galaxy in each simulation.

Figure 4 .
Figure 4.Here we show the metallicities of outflowing (left) and inflowing (center) gas elements (cells or particles, depending on the code architecture), as a function of radius.Top row is redshift z = 3, bottom row is redshift z = 1.Right: outflow metallicity divided by inflow metallicity with radius.This is much more similar between codes than the individual metallicities of the two phases.

Figure 5 .
Figure 5. Mass-weighted Projection Plots at redshift z = 3 of all eight codes in four fields, out to 1.5 times the average virial radius of all codes ( R vir = 53 kpc).Inner and outer white circles represent 0.15 and 1.0 R vir , respectively.Rows (from top) are metallicity, column density, temperature, and radial velocity v r , where v r > 0 represents outflows and v r < 0 represents inflows.Projections are aligned with simulation box axes, rather than angular momentum (face-on vs edge-on) for global consistency.Cool, dense inflows are visible along the left-right axis in each code, and metal-rich outflows along the up-down axis.

Figure 6 .
Figure 6.Identical to Figure 5, but for five codes at redshift z = 1.At this redshift, R vir = 153 kpc.

Figure 7 .
Figure7.Profiles of temperature with distance to the galaxy center as a fraction of R vir .The left two panels show profiles of dense (n > 10 −2.5 cm −3 ), inflowing gas, and the right shows metal-rich (Z > 0.1Z ⊙ ), outflowing gas, at redshifts z = 3 and z = 1.

Figure 8 .
Figure 8. Phaseplot of each code at redshift z = 3, showing all gas between 0.15 and 1.5 R vir (7.6 and 76 kpc).Dots indicate the average temperature and maximum density of sightlines passing through the CGM of these halos, with color indicating the impact parameter.The blue stars are the sightlines with spectra shown in Figure 10.

Figure 9 .
Figure 9. Identical to Figure 8, but at redshift z = 1 showing all gas between 0.15 and 1.5 R vir (23 and 230 kpc).Stars show sightlines with spectra visible in Figure 11.

Figure 10 .
Figure 10.Noiseless example spectra from snapshots of each code at z = 3, here showing the strongest transition lines for medium-high ions: Si IV, C IV, O VI, and Ne VIII.Triangles indicate absorption lines as detected by TRIDENT, and black lines indicate multi-ion components, grouping together all lines found within 15 km/s of one another.Sightlines are selected by inspection to have visible components while remaining representative of 31 examined sightlines for each code.The number in square brackets indicates which line (between 0 and 30) was chosen.

Figure 12 .
Figure 12. Analysis by ion on 31 randomized spectra through each AGORA CGM.Left column is at z = 3, right column is at z = 1.Colors are the same in each graph, as well as the order of small x-offsets added for visibility.The effects of noise on spectrum detectability are visible through comparing the noiseless results (solid markers, connected) to the results with a reasonably good S/N ratio of 10 (unfilled squares in same colors).If not enough components are detected for a particular ion in a particular code, those points are not displayed Top: Column density per component; Second from Top: Average Doppler b parameter of each component; Second from Bottom: Covering fraction for this ion; Bottom: Average number of components in a sightline containing at least one component.Bright horizontal bars are estimated from observational work with arbitrary thickness for visibility (which does not represent an error bar).Specifically we show our own very rough estimates for column density per component and covering fractions, extracting data from Galbiati et al. (2023) for z = 3 Si IV and C IV, Chen et al. (2001) for z = 1 C IV, Werk et al. (2013) for z = 1 Si IV, Tchernyshyov et al. (2022) for z = 1 O VI (see Figure 15 caption), and Burchett et al. (2019) for z = 1 Ne VIII.b parameters are generally not available in these papers, so those are sourced from Galbiati et al. (2023) for z = 3 C IV, Werk et al. (2013) for z = 1 Si IV and C IV, and Werk et al. (2016) for z = 1 O VI.

Figure 13 .
Figure 13.Origin of ions along sightlines.Columns adjust whether the x-coordinate is hydrogen column density, sightline metallicity, or ion fraction.Rows have y-coordinates as column density of Si IV, C IV, O VI, or Ne VIII.These ions are sorted by increasing ionization energy from top to bottom.Colors indicate different codes from the AGORA simulation, and different shapes indicate different redshifts (z = 3, triangles, and z = 1, squares).Smaller and larger stars show the selected line in each code at z = 3 and z = 1 highlighted in Figures 10 and 11, respectively.

Figure 14 .
Figure 14.Comparison of radial column density profiles between AGORA galaxies and relevant observations at z = 3. Nondetections and saturated lines are indicated with open squares, with a downward or upward arrow, respectively.In this figure, points labeled "KBSS -Rudie19" and "MAGG -Galbiati23" are taken from Rudie et al. (2019) and Galbiati et al. (2023).

Figure 16 .
Figure 16.Map of phaseplots of all codes at z = 3, similar to Figures 8 and 9 but colored by the metal mass, rather than the total.Columns are each code, repeated four times.Overplotted are 20 percent and 1 percent contours for each ion.

Figure 17 .
Figure17.Projection Plot at redshift z = 3, identical to and aligned with Figure5, but at larger scale and not including radial velocity field.Here we show eight codes in three fields out to 6 times R vir .As before, inner and outer white circles represent 0.15 and 1.0 R vir , respectively (at this scale, the former appears point-like).Additional black circle represents the approximate simulation zoom-in region of 4.0 R vir .Rows (from top) are metallicity, number density, and temperature.

Table 1 .
Shimizu et al. (2019)) each code, including numerical runtime parameters when available.AMR = Adaptive Mesh Refinement, SPH = Smoothed Particle Hydrodynamics, MM = Moving Mesh, T = Thermal feedback, K = Kinetic feedback, RP = Radiation Pressure feedback, DC = Delayed Cooling feedback.The final two columns show the stellar mass within 0.15 R vir at z = 3 and z = 1.These feedback parameters shouldn't be numerically compared to each other, and sometimes cannot, as they are not given in the same units.Still, this remains a broad overview of the breadth of implementations used in AGORA.aNote that ART-I is not exactly the same feedback as in Paper III, see Appendix A of Paper IV. b A pressure proportional to 10 49 erg Myr −1 M −1 ⊙ is added to the pressure of cells containing or adjacent to cells with sufficiently high hydrogen column density and star particles younger than 5 Myr.See Section 2.2 ofCeverino et al. (2014)for details. c Note that CHANGA-T is not the same run of CHANGA as the one in Paper III, and instead uses only thermal feedback.See Appendix B of Paper IV and section 2.2.4.dSeeShimizu et al. (2019)for a definition of t hot .Generally this parameter ranges between 0.8 and 10 Myr.