Buzzard to Cardinal: Improved Mock Catalogs for Large Galaxy Surveys

We present the Cardinal mock galaxy catalogs, a new version of the Buzzard simulation that has been updated to support ongoing and future cosmological surveys, including the Dark Energy Survey (DES), DESI, and LSST. These catalogs are based on a one-quarter sky simulation populated with galaxies out to a redshift of z = 2.35 to a depth of m r = 27. Compared to the Buzzard mocks, the Cardinal mocks include an updated subhalo abundance matching model that considers orphan galaxies and includes mass-dependent scatter between galaxy luminosity and halo properties. This model can simultaneously fit galaxy clustering and group–galaxy cross-correlations measured in three different luminosity threshold samples. The Cardinal mocks also feature a new color assignment model that can simultaneously fit color-dependent galaxy clustering in three different luminosity bins. We have developed an algorithm that uses photometric data to further improve the color assignment model and have also developed a novel method to improve small-scale lensing below the ray-tracing resolution. These improvements enable the Cardinal mocks to accurately reproduce the abundance of galaxy clusters and the properties of lens galaxies in the DES data. As such, these simulations will be a valuable tool for future cosmological analyses based on large sky surveys.


INTRODUCTION
Over the past two decades, large galaxy surveys have systematically mapped hundreds of millions of galaxies with unprecedented precision, allowing us to establish the standard cosmological model that describes the universe's evolution over billions of years.However, analyzing these data to their full potential requires advanced theoretical models and excellent control of systematics.Achieving these requirements is challenging because most of the information lies in the scales where the theory is highly non-perturbative.Furthermore, one galaxy survey can be analyzed with multiple cosmological probes, which could share the same sources of Corresponding author: Chun-Hao To to.87@osu.edusystematics.Consistently modeling these systematics in different cosmological probes is essential to yielding unbiased cosmological constraints.Finally, developing accurate theoretical models is more challenging when considering blind analyses, in which the data is transformed to obscure actual cosmological signals during the development of the models.
Synthetic sky catalogs, also known as mock catalogs or mocks, provide a valuable tool for quantifying systematics and developing analysis techniques.They consist of plausible universes that can serve as a sandbox for researchers to test and develop methods for analyzing survey data.Accomplishing this task places several requirements on the synthetic catalogs.First, one wishes to use these catalogs to control systematics so that they are much smaller than the statistical uncertainties of the data.Therefore, the volume of the mocks has to be larger (ideally, much larger) than the volumes probed by the targeted surveys.Second, the galaxies To et al in the mocks should be realistic, although the level of realism required depends on the specific surveys and analysis techniques used.Third, fast generation of new mocks is desirable.When analyzing survey data, new techniques might be developed, and new systematics might be found.These developments might require new mocks that meet newly defined requirements.Further, fast mock generation allows the creation of various plausible realizations, allowing one to marginalize over uncertain physical processes.
Many techniques have been developed over the past two decades to generate synthetic catalogs (see e.g.Wechsler & Tinker 2018, for a review).Ideally, one would want to simulate galaxies directly from numerical solutions of coupled dark matter and baryon evolution to generate a realistic galaxy catalog.Unfortunately, while significant progress has been made over the past two decades, this method is still too computationally demanding to produce synthetic galaxy catalogs larger than the volume observed by galaxy surveys (see e.g.Vogelsberger et al. 2020, for a review).On the other hand, several practical alternative methods have been developed to simulate galaxy formation processes using phenomenological models.Ordered from least computationally demanding to most computationally demanding, these alternative methods include: 1. the halo occupation model (HOD, Berlind & Weinberg 2002a;Zheng et al. 2005), where one adopts phenomenological models to describe the statistical relations of galaxy properties and properties of the largest dark matter halos hosting these galaxies; 2. the subhalo abundance matching model (SHAM, Kravtsov et al. 2004;Conroy et al. 2006), where one relates galaxy properties to subhalo properties via simple rankings; 3. semi-analytic models (SAMs, see e.g.Baugh 2006;Somerville & Davé 2015, for reviews), where one simulates galaxy formation physics using analytical prescriptions and integrates galaxy properties through halo merger histories.
Combinations of these alternative methods have led to a blossoming of synthetic catalogs that meet the aforementioned requirements for galaxy survey cosmology analyses (e.g.Fosalba et al. 2015;Potter et al. 2017;DeRose et al. 2019;Korytov et al. 2019).The most direct predecessor of this work is the Buzzard simulations (DeRose et al. 2019), which combine subhalo abundance matching (Lehmann et al. 2017) and low-resolution large volume N-body simulations using a machine learning-based technique (Addgals, Wechsler et al. 2022).The Buzzard simulations produce realistic galaxy properties, allowing one to run target selections (such as redMaGiC galaxy selections, Rozo et al. 2016) on the simulations in the same way as survey data.Further, the simulations are relatively computationally inexpensive, making it possible to generate large numbers of realizations of survey data.Because of these features, the Buzzard simulations have facilitated end-to-end validations of cosmological analysis pipelines from main galaxy catalogs to cosmological constraints (e.g.MacCrann et al. 2018;To & Krause et al. 2021b;DeRose et al. 2022;White et al. 2022;Chen et al. 2022).
While the Buzzard simulations and other catalogs built with the Addgals 1 have facilitated analyses in multiple cosmological surveys (see Wechsler et al. 2022, and references therein), the galaxy clustering on scales less than 1 h −1 Mpc in this simulation is smaller than the SDSS measurements up to 50 percent.This suppressed galaxy clustering significantly impacts the properties of optically selected clusters, where one relies on the overdensity of red-sequence galaxies at < 1 h −1 Mpc scales to identify galaxy clusters.Specifically, the number of redMaPPer galaxy clusters at a given richness (λ) in the Buzzard simulation is a factor of three to four smaller than the observed value in the Dark Energy Survey Year 1 data (DeRose et al. 2019;Abbott et al. 2020).The lack of redMaPPer cluster problem is not unique in the Buzzard simulation.CosmoDC2 (Korytov et al. 2019), the only alternative simulation that currently has been tested with its redMaPPer cluster properties, also underpredicts the richness of optically selected clusters unless one artificially boosts red-sequence galaxies in cluster environments.While the additional boosting solves the lack of clusters problem (Korytov et al. 2019), it creates discontinuities between galaxy properties in cluster environments and in the field.
The unrealistic galaxy population in cluster environments is a critical limitation for studies that use these simulations to quantify the performance of optically selected clusters (Shin et al. 2019;Abbott et al. 2020;Myles et al. 2021b;To & Krause et al. 2021b;Wu et al. 2022;Zhang et al. 2022).This shortcoming can be partly mitigated by abundance matching: one compares the Nth richest clusters in the data to those in the simulations.The abundance matching technique makes the comparison of simulations and data sensitive only to the rank of richness instead of its absolute value, thereby reducing the problem of mismatched galaxy abundances in clusters.However, these effects cannot be calibrated reliably from simulations in which cluster richnesses are a factor of two lower than observed values.
The exact reason for this lack of cluster galaxies in the Buzzard simulation was previously unknown.DeRose et al. (2019) and Wechsler et al. (2022) hypothesize that it arises from artificial disruptions of subhalos that experience close pericentric passages (van den Bosch & Ogiya 2018).Following this line, DeRose et al. (2021) developed a new SHAM model that includes orphan galaxies to address this problem.However, while the SHAM model with orphan galaxies can fit the galaxy clustering measured in SDSS in several stellar mass bins, no model was found to fit all three stellar mass bins considered in that work simultaneously.Further, DeRose et al. ( 2021) also found that the color assignment model used in Wechsler et al. (2022) and DeRose et al. (2022) can lead to an underestimation of red galaxy clustering in the lowest stellar mass bins (logM * ∈ [9.8, 10.2] h −1 M ).Because red galaxies more likely reside in cluster environments, this reduced red galaxy clustering can also lead to a lack of cluster galaxies.
In this paper, we solve the lack of cluster galaxies problem in the Buzzard simulation by quantifying and addressing both contributing factors: (1) artificial subhalo disruption in the SHAM model and ( 2) the color assignment model.Our new model can simultaneously fit galaxy clustering and group-galaxy cross-correlations measured at three luminosity thresholds and also fit color-dependent galaxy clustering.We propagate this model through the Addgals algorithm (Wechsler et al. 2022) and generate the Cardinal simulations.Figure 1 shows the flowchart that summarizes the key steps of generating Cardinal.A list of improvements from Buzzard v2.0 to Cardinal is also presented in appendix K. Finally, we compare properties of redMaGiC galaxies and redMaP-Per clusters in Cardinal and DES-Y3 data (Sevilla-Noarbe et al. 2021) and find excellent agreement.
This paper is organized as follows.In section 2, we detail the construction of the new SHAM models that Cardinal is based on.In section 3, we detail the steps of generating Cardinal using the new SHAM model.Specifically, the improved color assignment method is presented in section 3.4.In section 3.6, we address the remaining problem in our color assignment models, including the lack of redshift evolution in training spectra and the inadequacy of summarizing colors using current SED templates.In section 4, we compare the properties of redMaGiC galaxies and redMaPPer clusters in Cardinal and DES-Y3 data.Finally, we conclude in section 5 with a discussion on future improvements.

A SHAM-BASED GALAXY-HALO CONNECTION MODEL
We use a modified subhalo abundance matching (SHAM) algorithm to construct the training data for the galaxy-halo connection model.We describe the data and simulations used to construct the SHAM model below.

Calibrating Data
We use the NYU Value-Added Galaxy Catalog (VAGC, Blanton et al. 2005a) constructed from SDSS DR7 (Abazajian et al. 2009) main galaxy catalog to constrain the parameters in the SHAM model.We consider three volume-limited galaxy samples: M r < −19, M r < −20, and M r < −21 with 0.026 < z < 0.067, 0.026 < z < 0.106, and 0.026 < z < 0.106 respectively.We limit our analysis to the north galactic cap (NGC) to avoid modeling differences in target selection between north and south galactic caps.From these three volume-limited samples, we measure the projected correlation function given by where π is the line-of-sight distance between pairs, r is the distance between pairs perpendicular to the line-of-sight, and π max is 40 h −1 Mpc.We measure ξ(r, π) using the Landy-Szalay estimator (Landy & Szalay 1993), with 12 logarithmically spaced bins between r = 0.13 h −1 Mpc to r = 32.6 h −1 Mpc and 40 linearly spaced bins in π.The smallest scale is chosen to avoid systematics caused by the size of SDSS fibers.
In addition to galaxy clustering (w gg ), we also use the cross-correlation between galaxy groups and galaxies (w cg ) to constrain the SHAM model.Berlind & Weinberg (2002b) suggest that the group multiplicity function provides complementary information of galaxy-halo connections relative to galaxy clustering (see also Sinha et al. 2018).Given the number densities of galaxies, group multiplicity functions are simply integrations of galaxy-galaxy group crosscorrelations across spatial separations.Therefore, we include the galaxy-galaxy group cross-correlations to better constrain SHAM parameters important to galaxy occupations in cluster environments.We first measure galaxy groups in VAGC catalogs using the Self-Calibrated Galaxy Group Finder (Tinker 2020(Tinker , 2021)).We restrict the measurement to galaxies with 0.026 < z < 0.067 and M r < −19 to ensure the completeness of galaxies.We remove all color information used in the Self-Calibrated Galaxy Group Finder because this information does not exist in the mock galaxy catalogs constructed from the SHAM model.This might degrade the performance of the group finders, but it allows apples-to-apples comparison between the measurements in data and mocks.With the group catalogs, we select groups with mass greater than 5 × 10 13 h −1 M , the mass range of λ > 20 redMaPPer clusters.We then cross-correlate the centers of these groups with galaxies at 0.026 < z < 0.067 and M r > −19, M r > −20, and M r > −21, using equation 1.
We measure the covariance matrix of the data using the jackknife resampling technique.We first employ a Kmeans algorithm implemented in treecorr (Jarvis 2015) on random
Abundance match galaxies to subhalos in high-resolution N-body simulations (Section 2).
Cut out survey footprints and apply observational effects, including survey depth variations, lensing, photometric uncertainties (Section 3.5).

iterations
1' Figure 1.Flowchart of the algorithm.The left column shows the observational inputs, and the right column shows the algorithm.The algorithm can be broadly categorized into five steps.First, we develop an extended Subhalo Abundance Matching (SHAM) model to populate galaxies on N-body simulations with resolved structure (section 2).Second, we measure the statistical relations of galaxies' luminosities and local density tracer R δ in the SHAM galaxy catalogs.We then paint luminosity onto particles of a lightcone simulation using the measured relation (section 3.3).Third, we use conditional abundance matching techniques to assign colors to galaxies in the mock using observed galaxy colors in SDSS (section 3.4).Fourth, we apply observational effects in the mocks, including lensing, photometric uncertainties, and varying survey depths (section 3.5).Finally, we use DES's photometric galaxy catalog to fine-tune the galaxy colors (section 3.6).
points to separate the sky into 128 uniform patches.We then perform the measurements by excluding one patch at a time.The covariance matrix is then estimated as, where d i is the i th element of the data vector, N is 128, d i,k is the i th element of the data from measurements that exclude the k th patch, and x i is the mean of the element over N = 128 patches.Jackknife estimations introduce noise in the covariance matrix, which leads to biases in the inversion of the covariance.Thus, one has to regularize the covariance matrix before inverting it.Here, we adopt an approach similar to Behroozi et al. (2019).In short, we perform an eigenvalue decomposition of the jackknife-estimated covariance matrix to obtain eigenvalues D n and the associated eigenvectors v n .Due to various possible sources of noise (such as variations of sky backgrounds, variations of fiber assignment efficiency, and systematics in galaxy photometry), we do not expect the error estimated to be better than 10 percent.We therefore rank order the eigenvalues and find the eigenvalues whose square rooted values are below 0.1 of the data projected onto the eigenspace.We then replace those eigenvalues with 0.1 of the data projected onto the eigenspace and multiply the new eigenvalues with the eigenvectors to form a regularized covariance matrix.

Simulations
To generate the training model, we use the Chinchilla-T1 dark matter simulation, which have volume (400 h −1 Mpc) 3 with particle resolution 5.9 × 10 8 h −1 M .The simulation is generated using L-GADGET2 (Springel 2005) with a ΛCDM cosmology that has Ω m = 0.286, Ω b = 0.047, σ 8 = 0.82, n s = 0.96, h = 0.7, and three massless neutrino species with N eff = 3.046.Halo finding is performed using Rockstar (Behroozi et al. 2013), and merger trees were generated using Consistent Tree (Behroozi et al. 2012).We refer to Wechsler et al. (2022) for details of this simulation.Given that our galaxy samples are selected at low redshift, we use the snapshot corresponding to z = 0 for the work described in this section.
Evidence has shown that subhalos in dark matter-only simulations are susceptible to physical and unphysical disruptions (Klypin et al. 1999;Weinberg et al. 2008;van den Bosch & Ogiya 2018).We account for this effect using a prescription similar to Behroozi et al. (2019), which adds disrupted subhalos back to the simulation.We first identify subhalos that are no longer detected by Rockstar in each snapshot.Then, we locate the host halos that contain these subhalos within their virial radius and use the semi-analytic model from Behroozi et al. (2019) to simulate the evolution of the subhalos' position, mass, and maximal circular velocity.These procedures produce a catalog containing standard halos that can be found and tracked using Rockstar and disrupted subhalos (also known as orphans) that have properties calculated semi-analytically.

Models
Using the subhalo abundance matching technique, we populate galaxies on subhalos (tracked subhalos and orphans).Here, we first describe the general concept of this technique and then describe the extensions we develop in this paper.In the most basic form, subhalo abundance matching assigns each subhalo with a luminosity by enforcing the relation, where n(L > x) represents the number density of galaxies with luminosity greater than x and n(X h > y) represents the number density of subhalos with properties X h greater than y.
Following Lehmann et al. (2017), we adopt X h as v α defined as, where v vir is the virial velocity of the halos, v max is the maximum circular velocity, and α is a free parameter.These quantities are evaluated at the epoch when the halo's mass is at the maximum to avoid complicated physics subhalos experienced when falling into a big halo.The free parameter α allows additional flexibility for the subhalo abundance matching model.In particular, v max v vir can be viewed as a proxy of halo concentration.The free parameter α controls the dependence of galaxy luminosity on halo concentrations.
The n(L > x) in equation 3 is estimated by fitting a modified double-Schechter function with a Gaussian tail to the luminosity function measured in SDSS DR7 galaxy catalogs.For details of constructing n(L > x), we refer the readers to appendix E.1. of DeRose et al. (2019).We cannot directly apply the estimated n(L > x) in equation 3 because the relations between galaxy luminosity and halo properties are stochastic.This stochasticity comes from observational uncertainties, complicated astrophysical processes that affect galaxy evolution within dark matter halos, and additional halo properties that are correlated with galaxy luminosities.One can model these complicated processes as where n(L > x) is the measured luminosity function.The intrinsic galaxy luminosity L in the above expression is determined solely by the selected halo properties X h by comparing the number density of galaxies with luminosity above

SHAM galaxy catalogs generation and tests
Table 1.Descriptions of the simulations used for this analysis, including the size of the box (L box ), the mass resolution of the particles (M part ), and the force softening length ( plummer ).
L , n(x > L ), to the number density of halos with the selected property above X h , n(x > X h ).In most of the subhalo abundance matching work (e.g.Tinker (2018) shows that the value of this scatter is a constant at the high mass end and increases at the low mass end.We therefore parametrize P(L|L ) as a log-normal distribution with scatter, where σ v , σ vs , and σ vp are free parameters, and max(A, B) represents the maximum of A and B. Further, observational constraints based on galaxy groups (Tinker 2021) and satellite kinematics (Lange et al. 2019) constrain this scatter to be less than 0.4 with 95 percent confidence over the range of luminosities considered here.We, therefore, set an additional prior on σ(L ) to be below 0.4.With a functional form of P(L|L ), we then estimate n(L > x) based on the measured n(L > x) using the algorithm presented in appendix A.
When subhalos (tracked subhalos and orphans) fall onto big halos, they might be tidally disrupted, the gas within them could be stripped, and the galaxy living inside them might be destroyed.Thus, we must allow additional flexibility in the model to capture these physical processes.Previous work has been done on mitigating these issues by applying cuts in halo properties on tracked subhalos (Reddick et al. 2013), orphans (Behroozi et al. 2019;DeRose et al. 2021), or both (Contreras et al. 2021).In this paper, we treat tracked subhalos and orphans equally.This approach has the benefit that the result depends less on the resolution of the simulations.We mitigate the issues of physical disruptions using a procedure similar to Behroozi et al. (2019) and DeRose et al. (2021).For each subhalo, we compare the maximum circular velocity at the current time (v max,now ) to the maximum circular velocity at the time when the halo mass is at the maximum (v max,M peak ).When v max,now of a subhalo is much smaller than v max,M peak , the subhalo is likely tidally stripped and is less likely to host a galaxy.We therefore set a threshold of the ratio between v max,now and v max,M peak , below which the subhalos do not host galaxies.Further, because galaxies with different luminosities will have different time scales of dynamical frictions and resilience to tidal disruptions, we allow this threshold to depend on v α , the combination of halo properties used in subhalo abundance matching.With these insights, our parametrization of the probability that a subhalo is physically disrupted reads where Θ is the Heaviside step function.In the above expression, T l and T h are free parameters, with T dis (v α ) interpolating between asymptotic behaviors at high and low v α ends, v m governs where the transition from T l to T h occurs, and σ d controls how steep the transition is.Although Equations 7 and 8 may appear complex, the underlying physical behavior is simple.For halos with large v α , T h sets the threshold of v max,now /v max,M peak to determine whether the halos are tidally disrupted.Conversely, for halos with small v α , T l determines the threshold.Equation 8 ensures a smooth transition between small and large v α values, and two additional parameters control the location and slope of the transition.
To summarize, our extended SHAM model has 8 free parameters, whose values are given in table 2. Given these parameters, we can generate a galaxy catalog using the following procedure.We first employ equation 4 to calculate v α of each subhalo in a halo catalog, including tracked subhalos and halos identified by Rockstar and orphan subhalos generated using a semi-analytic model from Behroozi et al. (2019).We then remove subhalos according to equations 7 and 8.For each of the remaining subhalos, we populate a galaxy at the position of the subhalo with luminosity determined by equation 5. We select galaxies in simulations using the same luminosity cut as the data.To account for the effect of redshift-space distortions, we first transform the coordinates of halos along the line of sight using the velocity of the halos.We then apply the periodic boundary condition on the transformed coordinates and calculate w gg (r) using the same radial binning as the data.Finally, the w gg (r) is estimated using the natural estimator, with an analytic calculation of the random-random pairs.To minimize cosmic variance, we repeat the above process by choosing line-of-sight direction along x, y, and z axes of the box and taking the average of the measurements.Further, we minimize the stochasticities due to the Monte Carlo nature of the SHAM model by repopulating the simulations 10 times with the same parameters.

Measurements in simulations
Because the simulation box size is small, cosmic variance cannot be ignored in the total error budget.We estimate this cosmic variance using the jackknife technique similar to the procedure described in section 2.1.The main difference is that there is no survey mask in the simulations, so we generate 125 jackknife subsamples by equally partitioning the 0 .00 0 .2 5 0 .5 0 0 .7 5 box.Further, the jackknife subsampling breaks the periodic boundary conditions.One has to consider this when analytically estimating the random-random terms in the natural estimator.Here, we adopt the approach described in He (2021) to estimate the random-random terms for each jackknife subsample.We estimate the covariance matrix from this cosmic variance using the best-fit parameters to the data presented in section 2.3.This covariance matrix is then added to the observed covariance matrix while performing the likelihood inferences.
Regarding the galaxy group samples, we run the Self-Calibrated Galaxy Group Finder (Tinker 2020(Tinker , 2021) ) on the simulated galaxies with the same setting as run on SDSS data.As pointed out before, in both runs on simulations and data, we remove steps that use color information to ensure an apples-to-apples comparison.

SHAM result and discussions
We fit the model described in section 2.2.2 to the data described in section 2.1 assuming a Gaussian likelihood with the priors described in table 2. The challenge is that the fitting is hugely time-consuming because each step of the likelihood inference requires populating halos 10 times and computing the correlation function 30 times.We tackle this challenge by first building an emulator of the SHAM model and using it for likelihood inferences.We detail the procedure of constructing this emulator in appendix B. Then, the likelihood inference are performed using an ensemble slicing sampling method implemented in zeus (Karamanis et al. 2021;Karamanis & Beutler 2020).
Figure 2 compares the best-fit model and the data.The minimum χ 2 is 79.4 with the degree of freedom 64, yielding a Probability-to-Exceed (PTE) 0.09.Table 2 shows the best-fit parameters.Overall, our model describes the data well, but a small difference can be seen at r < 0.3 h −1 Mpc of w cg for the faintest galaxy sample.We want to determine whether the difference we have observed is due to limitations of the N-body simulations, such as force softening or finite resolution.To do this, we repeat the entire SHAM analysis using the Small MultiDark Planck (SMDPL) N-body simulation (Klypin et al. 2016), which has a higher resolution, a different softening scale, and a different fiducial cosmology than the previous simulation.This difference persists, as shown in the blue line in figure 2. One possibility is that this difference comes from tensions between w cg and w gg .Smallscale w gg is dominated by the one-halo term, which has significant contributions from galaxies in high-mass halos.In figure 2, one can see that the small-scale w gg for the faintest galaxy samples prefers a slightly smaller one-halo term than the model, while the small-scale w cg prefers the opposite.This mild tension might be related to the findings in Hearin et al. (2013), where they find tensions between group multiplicity functions, which are an integrated version of w cg , and galaxy clustering, under the assumption of the basic SHAM model.This tension is much smaller in our extended SHAM model, and the constraining power of the data cannot distinguish it from statistical fluctuations.We therefore leave further investigations to future work.
Figure 2.2.2 shows the posteriors of SHAM parameter constraints based on Chinchilla-T1 (red) and SMDPL (blue).Most of the SHAM parameters based on Chinchilla-T1 and SMDPL are consistent, indicating the robustness of the result to details of N-body simulations, including cosmology, resolution, and force softening.The only parameter that is slightly inconsistent is σ vp .Chinchilla-T1 (the lower resolution simulations) prefers a brighter (more negative) value than SMDPL (the higher resolution simulations).One possible explanation is that SMDPL has more low-mass halos than Chinchilla-T1.For a given magnitude bin, the missing lowmass halos in Chinchilla-T1 can be compensated by a larger scatter.Thus, Chinchilla-T1 has a brighter pivot point for the scatter.In both Chinchilla-T1 and SMDPL, the inclusion of orphan galaxies is strongly preferred.For the galaxy sample considered in this work, ∼ 20 percent are orphan galaxies for Chinchilla-T1 and ∼ 15 percent for SMDPL.Another interesting result is that both Chinchilla-T1 and SMDPL prefer a varying scatter in the luminosity-v α relation at the 1σ level.The positive value of σ cs indicates that the scatter increases for lower mass halos.This is consistent with results based on group finders (Tinker 2021), satellite kinematics (Lange et al. 2019), and galaxy clustering (Xu et al. 2018).

POPULATING GALAXIES IN LOW-RESOLUTION SIMULATIONS
The SHAM model presented in section 2 allows us to create high-fidelity galaxy catalogs based on high-resolution simulations.However, the high-resolution simulations typically have a volume much smaller than the volume accessible with current and upcoming galaxy surveys, making them insufficient to validate models with the required accuracy.In this section, we describe the formalism to transfer the knowledge learned in the SHAM catalogs to populate galaxies in large simulations with low resolution.This way, one can generate multiple realizations of mock galaxy catalogs with a modest computational expense.

Simulations
We use lightcones with an area 10, 313 square degree constructed from the L1, L2, and L3 simulations detailed in table 1.These simulations are generated with the same cosmological parameters as Chinchilla-T1.Details of the lightcone construction were presented in appendix B1 of DeRose et al. (2019).

Targets
We aim to generate mock that support sciences in large galaxy surveys, such as the Dark Energy Survey (DES) and Vera Rubin Observatories' Legacy Survey of Space and Time (LSST).The various science cases in these surveys place stringent constraints on the galaxy properties in the simulations.In this paper, we mainly focus on the two main samples in DES: redMaPPer clusters (Rykoff et al. 2014(Rykoff et al. , 2016) ) and redMaGiC galaxies (Rozo et al. 2016).As shown in DeRose et al. ( 2019), these two samples place the most stringent constraints on the galaxy models using DES data.We provide a brief description of these two samples below.

redMaPPer clusters
In optical surveys, galaxy clusters appear to be spatial and redshift concentrations of red-sequence galaxies.The relatively tight color-absolute magnitude relations of redsequence galaxies enable one to identify galaxy clusters using galaxies without redshift information.The primary algorithm used by Dark Energy Survey collaboration (Rykoff et al. 2016) and the LSST Dark Energy Science Collaboration (Kovacs et al. 2022) to select clusters is redMaPPer (Rykoff et al. 2014), which employs a matched filter algorithm to select overdensities of red-sequence galaxies.Here, we briefly summarize the algorithm.First, the redMaPPer algorithm uses spectroscopic data to construct a red sequence template empirically.It then computes the redshift of each galaxy by matching its color to the template.Second, redMaPPer identifies bright and red galaxies as cluster centers and determines the probability of each galaxy being a member (p mem ) by comparing its spatial distribution, color, and luminosity to a model.Third, the algorithm removes clusters that have p mem > 0.5 of another cluster and repeats the above processes.Finally, redMaPPer assigns a richness value (λ) to each cluster, calculated by summing p mem of each member galaxy.This richness value is used as a primary cluster mass proxy (McClintock et al. 2019;Abbott et al. 2020;To & Krause et al. 2021a) in cosmological analyses given their expected tight relation to halo masses (Rozo & Rykoff 2014;Rozo et al. 2015a).redMaPPer also provides the most probable redshift of each galaxy cluster (z λ ) based on the colors of its member galaxies.

redMaGiC galaxies
The red-sequence model derived from the redMaPPer clusters can be used to select galaxy samples with excellent photometric redshift uncertainties (σ z ≈ 2 percent).To achieve this, one first constructs a color model using redMaPPer member galaxies with high membership probabilities (p mem ).The redshift of each galaxy can be estimated by maximizing the consistency of galaxy colors and color models.One can then select bright galaxies with colors consistent with the color model.The galaxy samples selected in this way are called redMaGiC (Rozo et al. 2016) and are one of the primary lens samples in the Dark Energy Survey cosmology analyses (Abbott et al. 2018(Abbott et al. , 2022)).Based on the consistency with the color model, each redMaGiC galaxy is associated with a redshift probability distribution p(z redmagic ).

Painting galaxy luminosities onto dark matter particles
We use the Addgals algorithm to populate the dark matter lightcones presented in DeRose et al. ( 2019) and Wechsler et al. ( 2022) with galaxies.The algorithm is detailed in Wechsler et al. (2022).Here, we briefly summarize the general formalism and highlight modifications.
Addgals populates galaxies in low-resolution N-body simulations based on three distributions: 1.The distribution of dark matter overdensities around galaxies given their absolute r-band magnitude M r and redshift z, P(R δ |M r < x, z).The local dark matter overdensity is estimated using R δ , distances to k nearest dark matter particles such that the enclosed dark matter mass is 1.3In this work, the galaxy luminosity function is the same as the one used to create SHAM models at z = 0.At z > 0, the Schechter function's characteristic luminosity (L * ) is shifted based on a third-order polynomial.The parameters of this polynomial are constrained based on DES-Y1 data (Abbott et al. 2018).Details of constructing this luminosity function are given in DeRose et al. (2019).
Given the redshift-dependent luminosity function, we populate galaxies on each snapshot of Chinchilla-T1 simulations using the best-fit SHAM model presented in section 2. We then use these galaxies to determine P(M r |M vir , z) for central galaxies in resolved halos, defined as halos containing more than 200 particles, and P(R δ |M r < x, z) for satellite and central galaxies in unresolved halos.
To determine the distribution of central galaxy absolute magnitudes at fixed host halo virial mass M vir and redshift, we assume that P(M r |M vir , z) is a Gaussian distribution with a mass-dependent scatter.The mass dependency of the scatter is determined by fitting a straight line to the measured M r scatter of M vir > 10 13 h −1 M halos in the SHAM galaxy catalogs.The mean of the Gaussian distribution is given by where A, a, b, c, d are free parameters determined at each snapshot of the SHAM galaxy catalogs.Using the best-fit function, we populate central galaxies on resolved halos in the low-resolution lightcone simulations.Similar to P(M r |M vir , z), we determine P(R δ |M r < x, z) in the SHAM galaxy catalogs at each snapshot.We model P(R δ |M r < x, z) as a log-normal and normal distribution sums, given by where p, µ c , µ f , σ c , σ f are free parameters and are measured in each snapshot of the SHAM galaxy catalogs at grids of magnitude thresholds from M r = −18 to M r = −24.Note that σ f has length units, and σ c is dimensionless.The additional R δ in the denominator of the first term ensures the consistency of the units.We then build a Gaussian process emulator of these parameters to enable accurate interpolations.We detail the emulator construction in appendix C.
With P(M r |M vir , z) and P(R δ |M r < x, z), we paint r-band luminosities onto dark matter particles in the lightcone simulations.For the resolved halos, we paint luminosities at the center of each halo using the learned P(M r |M vir , z).The resolved halos are defined as halos with M vir > 6 × 10 12 h −1 M for the L1 and L2 boxes and M vir > 1×10 13 h −1 M for the L3 box.These choices of halo masses were justified in Wechsler et al. (2022).For unresolved galaxies, we paint luminosities onto dark matter particles.We first generate random realizations of galaxies' redshifts by inverse transform sampling the measured redshift distributions of dark matter particles in the lightcone simulations.Next, we generate random realizations of galaxies' r-band luminosities (M r ) from the luminosity function φ(x, z) after subtracting the number densities of resolved halos.By doing so, we avoid double-painting central galaxies.We then convert the cumulative conditional probability P(R δ |M r < x, z) to P(R δ |M r , z) using finite differ-ence estimation: where M is a normalization constant and N(x) = x −∞ φ(x, z), the cumulative luminosity function.Each galaxy is randomly assigned an R δ based on its M r and z by inverse transform sampling of P(R δ |M r , z).We arrive at a list of galaxy (R δ , M r , z) values.The remaining task is to put galaxies in the correct positions.In the N-body lightcones, we measure R δ for each dark matter particle.We then divide the dark matter lightcone and galaxy samples into ∆z = 0.01 bins.For each redshift bin, we assign galaxies to dark matter particles using the closest match of R δ in orders of galaxies' brightness.
Figure 4 shows the comparison of R δ distributions of the painted galaxies (red line) onto those measured in the SHAM model (black dots).The agreement between the red line and the black dots indicates that equation 11 provides a reasonable description of the measurement and the galaxy assignment algorithm works reasonably well.Similar to Wechsler et al. (2022), the R δ distributions show double bump features for the two faintest galaxy thresholds.Central galaxies dominate one bump, and satellite galaxies dominate the other.This double bump feature of R δ distributions indicates the effectiveness of R δ on separating centrals and satellites at a given luminosity.We further compare the R δ distribution in this work with the previous version (Buzzard v2.0; Wechsler et al. 2022).We find that R δ is shifted to smaller values at a given luminosity than Buzzard v2.0, indicating a more significant satellite fraction at a given luminosity.This is likely because we include orphan satellites in the SHAM model while Buzzard v2.0 did not.

Painting colors onto galaxies
So far, we have generated a galaxy mock with a realistic spatial distribution and rest-frame r-band luminosity distribution.This section describes an algorithm for assigning colors to these galaxies.

Overall color distribution
Following Wechsler et al. (2022), we assume that galaxy SEDs can be described by the product of five coefficients and five KCORRECT spectral templates (Blanton et al. 2003;Blanton & Roweis 2007).One can then directly assign the measured KCORRECT coefficients in real data to galaxies in the simulations.This approach has a couple of benefits.First, the small number of coefficients for each galaxy reduces the requirements of memory and computational power to generate and store mock catalogs.Second, the product of coefficients and KCORRECT templates provides full SED information for each galaxy, from which one can compute the observed magnitude by applying band shifts and the observed bandpasses without regenerating galaxy SEDs.Third, the direct assignment from data guarantees reasonable matches between mocks and data.However, this approach requires a pool of measured KCORRECT coefficients representative of the targeted survey data.Generating this pool of KCOR-RECT coefficients requires representative spectroscopic redshifts down to the photometric survey depths.This is usually unachievable.
Following DeRose et al. ( 2019), we adopt a hierarchical approach to associate KCORRECT coefficients of a SDSS galaxy to a simulated galaxy.We first generate a representative sample at z < 0.2 down to DES survey depth using galaxies with z = (0.005, 0.2) in the SDSS DR7 VAGC catalog (Blanton et al. 2005b).Each galaxy has five KCORRECT coefficients according to Blanton et al. (2003) and Blanton & Roweis (2007).We then use PRIMUS (Coil et al. 2011) galaxies to quantify the redshift evolution of this coefficient pool.Specifically, we employ PRIMUS galaxies to calculate the ratio w r (M r , z) of the probability of being red at z to the probability at z < 0.2.We define galaxies as red when 0.1 g − r > 0.15 − 0.03M r .With w r (M r , z) at hand, we can generate KCORRECT coefficients for each simulated galaxy using a rejection sampling algorithm.We first bin SDSS galaxy samples and simulated galaxies using their rest-frame r-band luminosities (M r ) with widths ∆M r = 0.1(22.5 + M r ).For each simulated galaxy in the ∆M r bin, we first remove the SDSS galaxy samples in the same bin if where rand is a random number drawn from a uniform distribution from 0 to 1, p is an empirically measured red fraction in SDSS galaxy samples, and z is the redshift of the simulated galaxy.We associate the KCORRECT coefficients of a galaxy in the remaining SDSS galaxy sample to the simulated galaxy by selecting the one with the closest value of M r .If there is no SDSS galaxy in the ∆M r bin, we expand the width of the bin to ∆M r = 0.1(22.5 + M r ) 2 and repeat the rejection sampling step.By doing so, we paint KCOR-RECT coefficients to each simulated galaxy.With KCOR-RECT coefficients for each simulated galaxy, we can compute the galaxy's rest-frame g − r color by convolving the SED generated from these KCORRECT coefficients and the SDSS's bandpass.The simulated galaxies will have the same color-M r relation in the SDSS galaxy samples and the same red fraction-redshift relation as constrained by the PRIMUS galaxy samples.These steps have been validated in Wechsler et al. (2022) and DeRose et al. (2019).

Environmental dependent galaxy color distribution
Once we create mock galaxies with realistic overall color distribution, we shuffle galaxy colors to create environmental dependencies of the color.We use the conditional abundance matching technique (Masaki et al. 2013;Hearin & Watson 2013) and the rest-frame g − r color to perform the shuffling.Specifically, we assign the SED that corresponds to the restframe g − r color to a galaxy such that the following equation is satisfied, where e p is a proxy to quantify galaxy environments.DeRose et al. ( 2021) have found that R h , the distance to the nearest massive halos, provides a good proxy of galaxy environments.Using R h and the conditional abundance matching technique, they find comparable color-dependent galaxy clustering to SDSS measurements (Wechsler et al. 2022;DeRose et al. 2021).However, this proxy is inadequate for galaxies in galaxy clusters.First, more massive halos are bigger.On average, galaxies that live in a more massive halo would have larger R h than those living in small mass halos.Therefore using R h as a color proxy would make galaxies bluer in more massive halos.Second, for galaxies living in massive clusters, R h is the distance to the central galaxy.Using R h as a color indicator would create a strong radial color profile in a massive halo that is independent of cluster mass.This strong correlation breaks the self-similarity of galaxy clusters of different masses and lacks observational support.One hint of these problems shows in Wechsler et al. ( 2022)'s comparison of red galaxy clustering at r < 1 h −1 Mpc in the faintest magnitude bin (see also the grey line in figure 5).
The ratio of red galaxy clustering amplitude to all galaxies is low compared to the data.Since most of the pairs that contribute to small-scale clustering are from galaxies residing in large-mass halos, the small clustering ratio indicates that the galaxies in clusters are too blue.
Given the aforementioned shortcomings of R h , we define a new galaxy environment proxy e hc .e hc is constructed with two insights.First, assuming galaxy clusters with different masses are self-similar, the distance to a cluster should be measured in units of the cluster's radius.In practice, we use the ratio of the distance to a cluster and the cluster's virial radius (R vir ) to some power (c 0 ) as the color proxy.The additional power of the virial radius allows the possibility that more massive halos are stronger at quenching galaxies.The second insight is that the galaxy color gradient is shallower at the inner part of the clusters compared to the outskirts (Adhikari et al. 2021).Further, given the success of producing reasonable large-scale red galaxy clustering using R h (Wechsler et al. 2022), we would like the new environment proxy e hc to be similar to R h on large scales.Therefore, we design a mapping from R h to e hc such that e hc approaches R h + c when R h is infinity and becomes independent of R h when R h is zero.c is an arbitrary constant irrelevant to color assignments because only the ranks of R h are related to galaxy col-ors (equation 14).Given these two insights, the environment proxy has the following functional form, where d h is the distance to the nearest massive halo with mass greater than M ch , and M ch , c 0 , c 1 are three free parameters.c 1 controls the sharpness of transition from de hc /dx = 0 to de hc /dx = 1.
One remaining problem of equation 15 is that all central galaxies with M > M ch have e hc = 0, making them the reddest galaxies at a given magnitude.Given that M ch is usually 10 12.8 h −1 M , assigning all central galaxies with M > M ch the reddest SEDs will contradict observations (e.g.Wetzel et al. 2012).We must make some central galaxies living in low-mass halos blue.The challenge is that our lightcone has different resolutions at different redshifts, and simple mass cut might transfer this reshift-dependent resolution onto galaxy colors.Fortunately, R δ gives a nice separation of centrals and satellites that is less sensitive to the resolution of the simulations (see figure 4).Specifically, using Chinchilla-T1, we find that none of the satellite galaxies in M > 10 12.8 h −1 M halos and none of the central galaxies with M > 2 × 10 13 h −1 M have R δ greater than two.That is, galaxies with R δ > 2 are likely to be low-mass isolated centrals whose color should be preferentially blue.Built on these insights, we can increase the value of the environmental proxy e hc of a galaxy living in a low-density environment (R δ > 2) and likely to be red (e hc < 1) in the original algorithm.In this way, these galaxies will be bluer because of a larger environmental proxy value.We, therefore, increase galaxies' e hc by c 2 if R δ > 2 and e hc < 1.
Finally, we allow the possibility that color ranks (rank(g − r)) and galaxy environment proxy ranks (rank(e p )) are not perfectly correlated.Instead of implementing equation 14, we implement In the above equation, rank(e p ) is constructed such that it is an unbiased estimator of rank(g − r) and is correlated with rank(g − r) with correlation coefficient r c .We construct rank(e p ) using halotools (Hearin et al. 2017).DeRose et al. (2021) find that galaxies with different stellar mass prefer a different r c using SDSS color dependent clustering measurements (Zehavi et al. 2011).Motivated by their findings, we parametrize r c to depend on M r via where c l , c h are free parameters that govern the asymptotic behaviors at low and high M r , and c m , c σ govern the transition M r and the steepness of this transition.Equation 17 has the same functional form as equation 8. Again the underlying physical behavior is pretty simple.For bright galaxies (small M r ), r c is determined by c l .Conversely, for faint galaxies, r c is determined by c h .Equation 17 simply ensures a smooth transition from bright to faint objects, and two additional parameters control the location and slope of the transition.In summary, our color assignment model has eight parameters: 1. M ch : a parameter defines the mass threshold of halos to which we calculate distances (d h ) of each galaxy.
2. c 0 and c 1 : two free parameters control the mapping of d h to the environment proxy e hc .
3. c 2 : a parameter is added to e hc of low-mass centrals to make it bluer.
4. c l , c h , c m , c σ : parameters control luminosity dependence of the correlation between colors and environment proxies.
We use color-dependent clustering measured in Zehavi et al. (2011) to constrain these parameters.Specifically, to separate the uncertainties of the color model and the galaxy clustering model, we use the ratio of galaxy clustering of red and blue galaxies to all galaxies to constrain the color model parameters.A simple downhill simplex algorithm is used to find the best-fit parameters.At each step of the downhill simplex algorithm, we shuffle the galaxy SEDs in the mocks using equation 16, regenerate galaxy colors using SEDs and SDSS filters, select galaxy samples according to Zehavi et al. (2011), calculate galaxy clustering using Corrfunc (Sinha & Garrison 2020), and compare the measurements with the data assuming a Gaussian likelihood.Finally, our best-fit model has χ 2 = 84.9 with degree-of-freedom 58, corresponding to a PTE value of 0.01.In Figure 5, we compare the best-fit model to the measurement.We find that our best-fit model can accurately describe the data where the predicted clustering differs from that of Buzzard v2.0.The differences between our model and Buzzard v2.0 are especially pronounced for the lowest luminosity sample.

Photometric noise
So far, we have generated a quarter-sky galaxy lightcone with realistic color, luminosity, and spatial distributions.We add galaxy shapes and sizes following methods described in DeRose et al. (2019).Finally, we apply lensing effects using the ray tracing code Calclens (Becker 2013).Specifically, we apply deflection, rotation, shear, and magnification for all galaxies to alter their positions, shapes, and photometry.
We cut out two DES-Y3 regions from this lightcone and rotate them into the DES-Y3 footprint (Sevilla-Noarbe et al. 2021).We then apply several observational effects on the mock catalogs.First, we apply the survey masks on the mock using the DES-Y3 survey mask that is in the form of a healpix map with N side =4096, corresponding to a resolution of 0.73 arcmin 2 .For each healpix pixel, we randomly select galaxy samples according to the FRACGOOD value, which describes the amount of masking within the healpix pixel.Second, we add photometry noise to each galaxy, a process that will be detailed later.This step is essential because magnitude noise can drastically affect the number of galaxies near survey depth limits due to Eddington biases.For analyses that use galaxies near survey depth limits, which is the case for most weak lensing analyses, properly modeling noise is essential.Finally, we cut out galaxies with observed magnitude fainter than the survey depth.
The remaining task is to add a realistic magnitude error on each galaxy.We follow the prescription described in (DeRose et al. 2019;Wechsler et al. 2022) that uses the effective exposure time (t eff ) and 10σ limiting magnitudes (m lim ) from survey data.Here, we briefly summarize the prescriptions.First, we assume that the observed total number of photons of a galaxy comes from two contributions: photons from galaxies I gal and photons from the noise, such as sky, readout noise, etc.I sky .The I gal can simply be related to galaxy magnitude (m gal ) via, where ZP = 22.5.Following Rykoff et al. (2015), I sky can be empirically determined from data.Sevilla-Noarbe et al.
(2021) provides the 10σ limiting magnitudes (m lim ) and the effective exposure time (t eff ) in the form of healpix maps with a resolution of 0.73 arcmin 2 .Using this information, we calculate I sky for each healpix pixel by We assume the observed number of photons follows a Poisson distribution.The noisy observed flux F obs is then given by where Poisson denotes a random draw from Poisson distribution.The second term in the above equation is to mimic the process of sky background subtractions in observations.We note that this implementation is different from DeRose et al. ( 2019) and Wechsler et al. (2022), where the authors performed a random draw from a Gaussian distribution with a scatter as the square root of the mean.The Gaussian approximation is valid for high signal-to-noise galaxy samples but could lead to a bias for low signal-to-noise galaxies.The

Gravitational lensing
The matter along the line of sight will distort the light from galaxies.This distortion will modify the position of galaxies, distort their shapes, and magnify their brightness.We include these effects by performing full-sky multi-plane raytracings using Calclens (Becker 2013).This step has been detailed in appendix C of DeRose et al. (2019).Here, we provide a brief summary.First, we decompose the full-sky particle lightcones into healpix maps with n side = 8192, corresponding to a resolution of 0.46 arcmins.Next, for each pixel, we divide the particle lightcones into equally-spaced lens planes from z = 0 to z = 2.4 with separations of 25 h −1 Mpc.Then, for each of the simulated galaxies, we calculate the deflection angle and distortion matrix of the light at each lens plane.To achieve this, we first calculate the lensing potential from integrated dark matter density fields by solving a two-dimensional Poisson equation (Jain et al. 2000).Next, we calculate the deflection angle using the lens equation (Teyssier et al. 2009), which relates the deflection angles to the first derivative of the lensing potential.Then, the distortion matrix is calculated using the second derivative of the lensing potential (Jain et al. 2000;Hilbert et al. 2009;Becker 2013).Finally, these deflection angles and distortion matrices at each plane are combined to produce the final total deflections, shears, and magnifications (equations 10 and 12 of Becker 2013).
We validate the ray-tracing shears by computing the tangential shear γ t around halos with M vir > 2 × 10 14 h −1 M and z = [0.6,0.62].This tangential shear can be related to the excess surface density (∆Σ) of the matter around halos, which can be measured directly using particles in the simulation.We relate the tangential shear to ∆Σ using the estimator presented in Sheldon et al. (2004), which reads where N halos is the number of halos in the bin, N Sources is the number of galaxies around halos at distance r, γ i, j t is the tangential shear of of galaxy j around halos i.To avoid contamination, we only use galaxies with a redshift of 0.1 greater than the redshift of the halos.In the above equation Σ i, j crit is given by where D h , D S , and D hs are angular diameter distances to halos, galaxies, and between halos and galaxies.The ∆Σ can also be calculated from the matter-halo cross-correlation functions ξ hm using Figure 6 shows the comparison of ∆Σ measured via ray tracing (equation 22, orange dots) and via direct measurements of particles (equation 24, green dots).We find that the ray tracing and particle calculations agree well on large scales but deviate on small scales.This deviation has been shown in literature (Kovacs et al. 2022;Takahashi et al. 2017) and likely comes from the finite angular resolution of 0.46 arcmins when we calculate the lensing potentials for ray tracing.This resolution problem in lensing can be problematic To et al for cluster lensing analyses when most of the signal lies below 1 h −1 Mpc.Because of the resolution problem, cluster lensing studies have relied on the dark matter particle-halo cross-correlation (Wu et al. 2022;Abbott et al. 2020).However, Wu et al. (2019) points out that while the particlehalo cross-correlation produces the right mean ∆Σ, it significantly underestimates the halo-to-halo variance of ∆Σ at large scales.This is because particle-halo cross-correlations ignore the line-of-sight structure's contributions in the lensing profile2 .
We empirically correct the shears of each galaxy using the measured particle-halo cross-correlations.We first bin the halos in simulations with redshift from z = 0.18 to z = 0.67 and the virial mass M halo above 10 13 h −1 M into ∆z = 0.02 and ∆log 10 (M) = 0.15 h −1 M bins.Next, we calculate ∆Σ for halos in each bin using particle-halo cross-correlations (∆Σ p ) using equation 24 and ray-tracing derived shears using equation 22 (∆Σ γ ).We calculate the differences between the two and apply a correction on the two shear components of each galaxy γ 1 and γ 2 .Specifically, our algorithm reads for Galaxies with redshift z do for Halos h with mass M h > 10 13 h −1 M , redshift z h < z, and distance to galaxies r h < 4.6 arcmins (10 times the resolution of the ray-tracings) do In the above algorithm, D g , D h , D h,g are angular diameter distances to galaxies, halos, and between halos and galaxies.Index h goes through haloes with mass greater than 10 13 h −1 M along the line of sight with angular separations less than 4.6 arcmins of the galaxies.The two equations relating ∆γ t to ∆γ 1 and ∆γ 2 are derived assuming ∆γ × = 0.This is motivated by the fact that gravitational lensing due to the localized mass distribution does not generate γ × modes.The blue dots in figure 6 show the ∆Σ measurements using the corrected galaxy shears.We find that it is consistent with the derivation from particle-halo cross-correlations, indicating the effectiveness of our algorithm.In appendix E, we fur-ther show that this correction of galaxy shears has well below 0.1 percent impact on cosmic shear (ξ +/− ) at large scales.At small scales, ξ +/− based on corrected galaxy shears is more consistent with theoretical predictions.

Conditional abundance matching color
In this section, we first diagnose mismatches between modeled and observed color distributions.We then describe our schemes for correcting these mismatches.This correction has several moving parts, but our tests (see figures 9 and 10) show that it has the desired effect.
Figure 7 compares the total magnitude distributions and color-color distributions of galaxies in the mock (dashed lines) and the data (solid black lines).In general, the galaxies in the mock have redder colors.This trend has also been shown in figure 5 of DeRose et al. (2019), where the authors compared Buzzard to COSMOS data.As described in section 3.4.1, the overall color distribution is determined by two different factors: (a) KCORRECT spectral templates in (Blanton & Roweis 2007), and (b) SDSS KCORRECT coefficients with redshift evolution controlled by PRIMUS' red fraction measurements.To identify the exact cause of this mismatch of colors, we first test the effectiveness of KCOR-RECT spectral templates in describing colors of high redshift galaxies.We calculate KCORRECT coefficients of redMaP-Per member galaxies in clusters with a richness greater than 20 measured in the DES-Y3 data.In this calculation, we use the redshift of the host clusters to minimize the error of photometric redshifts.We then reconstruct the magnitude of galaxies using these KCORRECT coefficients.Figure 8 compares the g − r color measured using the reconstructed magnitudes to those measured with the input magnitudes.We choose to show g − r for simplicity but find that the trend is consistent between different colors.Figure 8 shows that the KCORRECT reconstructed color is unbiased for the low redshift bin.However, for the high redshift bin, while the reconstructed color of blue galaxies seems to be mostly unbiased, the reconstructed color is increasingly biased for redder galaxies.This indicates that summarizing colors with KCORRECT spectral templates is valid for low redshift galaxies but biases the colors of high redshift red galaxies.We further test this finding using 31 COSMOS SED templates (Ilbert et al. 2009), and find a similar result.Interestingly, this bias makes red galaxies bluer, which contradicts the trend in figure 7. Therefore, we conclude that the trend in figure 7 is likely due to the insufficiency of the combination of SDSS KCORRECT coefficients and PRIMUS' red fraction measurements for describing the colors of high-redshift galaxies.
In summary, for red galaxies, two competing effects affect their colors: the insufficiency of SDSS KCORRECT coefficients and PRIMUS' red fraction make them too red, and summarizing the color with KCORRECT spectral templates makes them too blue.The cancellation of these two competing effects on the colors of red galaxies can potentially explain why the mean color of red-sequence galaxies in Buzzard (DeRose et al. 2019) is consistent with data.In contrast, the overall colors of galaxies are biased red.Further, while the cancellation of these two competing effects can make the colors of red galaxies unbiased, it could boost the total number of red galaxies because blue galaxies are more abundant than red-sequence galaxies.Figure 5 of DeRose et al. (2019) shows that the number of red galaxies in Buzzard is larger than in the COSMOS data.To further understand this hypothesis, we present a toy model in appendix J, showing that our hypothesis can qualitatively reproduce Buzzard's g − r color distribution shown in figure 5 of DeRose et al. (2019).
A more comprehensive and physical solution to these two causes of color mismatches between mocks and data is important but requires further research.Here we provide an empirical solution using DES-Y3 photometric data.The DES-Y3 data constrain galaxies' overall color and apparent magni-   c d ).c m is the g − r color constucted using k-correction templates and bestfit coefficients.This calculation uses redMaPPer member galaxies with redMaPPer run on DES-Y3 data.Each column corresponds to a redshift slice using redshifts of host galaxy clusters.Top panels: dots show the median c m − c d in each color bin, while error bars show 68 percent scatter.Magenta lines show 1σ widths of red sequence reported by redMaPPer.We split the samples into two subsets: galaxies that are bright with m z < 18 (green), and galaxies that best match redMaPPer's red-sequence template (blue).Both subsets are artificially shifted horizontally to improve clarity.Bottom panels: histograms show the measured color distributions of all galaxies in the calculation.Magenta lines show the mean (solid) and width (dashed) of the red sequence reported by redMaPPer.For all samples, the reconstructed color is mostly unbiased at z = 0.2-0.3.At z = 0.4-0.5, the reconstructed color is biased for red galaxies and mostly unbiased for blue galaxies.tude distributions.Because DES-Y3 data do not have redshift information for each galaxy, we must make some assumptions to calibrate our simulations with photometric data.We make the following assumptions: 1.The ranks of galaxy colors given observed magnitude in mocks are valid.
2. The relative colors of galaxies in different redshifts are correct.
With these two assumptions, we can then employ the DES-Y3 data to tune the multi-dimensional color and magnitude distributions of galaxies in Cardinal using the conditional abundance matching technique.We first match the observed z band magnitude distributions in Cardinal and data because the z band is the reference magnitude used to select red-MaGiC galaxies and redMaPPer clusters.To achieve this, we retune the third-order polynomial parameters that govern the redshift evolution of L * by matching the luminosity functions of the mock galaxy catalogs to the data.We use a downhill simplex algorithm to minimize a Gaussian likelihood with Poisson errors.Second, we match the color distributions by enforcing the equality of the following probabilities between mocks and data, where c i = [g − r, r − i, i − z] denotes the three colors used for lens galaxies and cluster selections.We use the method implemented in halotools (Hearin et al. 2017) to perform this matching.With this second step, we can ensure the matches of overall color distributions between mocks and data.The above algorithm has one important caveat.The red galaxies live in a tight color-magnitude space known as the red sequence.Because we match overall galaxy colors and magnitude, this process will widen the color-magnitude relations of red galaxies.In appendix G, we present an algorithm that uses redMaPPer to reduce this broadening.
So far, we have obtained galaxy catalogs with realistic magnitude and color distributions.Unfortunately, this catalog likely has incorrect environmentally-dependent galaxy colors because of additional noise introduced in the various conditional abundance matching processes.Despite this problem, we can still use this catalog to generate a new color template that captures the redshift dependence of color distributions more accurately than the original SDSS catalogs.Naively, we could rebuild the SED template coefficients using the catalog.But, as shown in figure 8, we find that summarizing galaxy colors using SED templates could lead to a bias of colors for red galaxies up to 0.3 dex.We, therefore, adopt a different approach.Instead of building new SED template coefficients using the abundance-matched catalog, we build an observed color table as a function of the galaxy's observed z-band magnitude m z and redshifts.We then use this table and repeat steps starting from section 3.4.2 to build a new mock catalog.
While we use a special treatment of red-sequence galaxies in the conditional abundance matching method (detailed in appendix G), the additional noise caused by this process makes the width of the red sequence too wide compared to the data.This too-wide red sequence could lead to overlypessimistic estimations of photometric errors of red galaxies and could cause massive background contamination of optical cluster identifications.To solve this problem, we further improve the consistency of the red sequence in mocks and the data using the algorithm presented in appendix H.The resulting red sequence in Cardinal is compared with DES-Y3 data in figure 9. We find that the red sequences in simulations and data are very consistent.
After the color adjustment, we add observational noise to generate the new mock catalog using the method described in section 3.5.The new mock catalog's overall color and magnitude distributions are shown as solid lines in figure 7. The agreement between mocks and data is greatly improved.In most bands, we achieve a fractional error in galaxy luminosity functions ∼ 5 percent.

COMPARISON TO DES Y3 OBSERVATIONS
Now that we have a realistic galaxy catalog with observed magnitudes down to survey depth limits, we now proceed to select various cosmic structure tracers and compare their properties with the DES-Y3 data.

redMaPPer clusters
In this section, we compare clusters in mocks and DES-Y3 data.As described in section 3.2.1,redMaPPer is the main cluster sample in DES cluster cosmology analyses.Observationally, redMaPPer clusters are selected based on their richness (λ).For example, DES-Y1 cosmology analyses (Abbott et al. 2020, To & Krause et al. 2021a) consider λ > 20 redMaPPer clusters as cosmological samples.
The simplest way to generate redMaPPer clusters in mocks is to select dark matter halos containing more than 20 bright red-sequence galaxies or some other value to match the ob-served abundance.However, the richness value of redMaP-Per clusters in the data contains significant components from correlated large-scale structure along the line of sight due to redshift uncertainties (Costanzi et al. 2019;Sunayama 2022).Further Abbott et al. (2020), Sunayama et al. (2020), To &Krause et al. (2021b), andWu et al. (2022) show that these projected components in richness can cause biases in the twopoint correlation functions of redMaPPer clusters, including cluster lensing, cluster-galaxy cross-correlations, and cluster clustering.Therefore, counting the numbers of red-sequence galaxies within three-dimensional distance to halos in simulations is not the same as redMaPPer richness, making comparisons with observed redMaPPer clusters hard to interpret.Costanzi et al. (2019); Sunayama et al. (2020); Wu et al. (2022) present an improved way to simulate redMaPPer richness by counting the numbers of red-sequence galaxies within cylinders along the line of sight.This approach provides a numerically efficient way to simulate redMaPPer richness that does not require lightcone simulations.The problem with this approach is that redMaPPer does not weigh galaxies uniformly along the line of sight; instead, galaxies that are further away from clusters along the line of sight usually have smaller p mem .Thus, counting galaxies within a cylinder is likely to overestimate the richness of clusters.One way to remedy this problem is to measure the p mem distributions of galaxies as a function of spectroscopic redshifts.However, obtaining spectroscopic redshifts for galaxies down to 0.2L * is challenging.
Finally, one could run redMaPPer algorithm on realistic mock galaxy catalogs to ensure apples-to-apples comparison of simulated and observed clusters.However, the richness value of redMaPPer clusters depends on member galaxies' colors, magnitudes, and positions, making it hard to reproduce in simulations.Therefore, redMaPPer has only been applied to a limited number of mock catalogs (Korytov et al. 2019;DeRose et al. 2019).
Given the realistic red-sequence galaxies in Cardinal, we adopt this last approach to generate redMaPPer clusters.We select galaxy clusters using redMaPPer v0.8.43 , which includes several improvements relative to its predecessors (Rykoff et al. 2014(Rykoff et al. , 2016;;McClintock et al. 2019).This includes a complete adaptation of the code from IDL to python and improved models of red-sequence galaxies.This additional improvement will affect the richness of redMaPPer clusters; therefore, we run the same redMaPPer on both Buzzard v2.0 (DeRose et al. 2022) and publicly available DES-Y3 data (Sevilla-Noarbe et al. 2021) to ensure apples-toapples comparisons.Figure 10    measured in Cardinal and DES-Y3 data.We find that the data and Cardinal generally agree except for the lowest redshift bins.This might indicate differences between the redshift evolution of the richness-mass relations in Cardinal and the data.In comparison, the blue lines in figure 10 show the cluster abundances measured in Buzzard v2.0.Cardinal demonstrates a much better agreement with the data in all redshift bins.There are three main differences between the Cardinal model and the Buzzard model, which are 1.The training SHAM model in Cardinal includes orphan subhalos as detailed in section 2.
2. The color assignment model that produces environment-dependent galaxy color distributions is improved.We detailed this step in section 3.4.
3. We include an additional abundance matching step to address the remaining color inconsistencies due to the insufficient training spectra and SED templates.We detailed this step in section 3.6.
To better understand how these steps affect the cluster abundances, we run redMaPPer on two additional realizations.First, we only update the training SHAM model in Cardinal but fix other color assignment models the same as Buzzard v2.0.The redMaPPer clusters identified in this realization are shown as the orange line in figure 10.We can then compare this with clusters in Buzzard v2.0 to see the impact of SHAM models on the redMaPPer cluster abundances.We note that this comparison depends on redshift and richness.For the simplicity of the argument, let us focus on the λ = [30, 40] and z λ = [0.4,0.5] bin.We find that including orphan models can boost cluster abundance with richness λ > 20 by roughly 50 percent, which is insufficient to account for the factor of three differences in cluster abundances between Buzzard v2.0 and data.Second, we update the training SHAM and color assignment models for environment-dependent galaxy color distributions.The redmapper clusters in this realization are shown as the green lines in figure 10.Combining the color assignment models and the SHAM model can boost the richness by ∼ 100 percent.Finally, comparisons of the full Cardinal model and the green lines show that the additional color abundance matching step can boost the richness by another ∼ 100 percent, making the cluster abundances in Cardinal agree with the DES-Y3 data.We, therefore, conclude that all three modifications of the Buzzard v2.0 model are needed to reproduce the observed cluster abundances.In appendix F, we further compare the conditional luminosity functions of satellites and centrals of redMaPPer clusters in Cardinal, Buzzard v2.0, and DES-Y3 data.
Figure 11 further compares the stacked galaxy surface density around redMaPPer clusters in DES-Y3 data and Cardinal, calculated using the method detailed in appendix I.We find decent agreements between Cardinal and DES-Y3 data.Compared to Buzzard v2.0 (e.g.figure 8 in Wechsler et al. 2022), the one-halo regime agrees with the data much better.This is likely due to the better match of richness values for a given cluster number density in Cardinal.On the other hand, the consistency on large scales is very interesting.As shown in To & Krause et al. (2021a,b), redMaPPer-redMaGiC cross-correlations in Buzzard has a much larger large-scale bias compared to DES-Y1 data.If this excess of large-scale bias is related to galaxy cluster samples, figure 11 suggests that Cardinal will not have this problem.We leave further investigations on this aspect in future work.

redMaGiC galaxies
As described in section 3.2.2,redMaGiC galaxies are one of the main galaxy samples in the DES cosmological analyses (Abbott et al. 2018(Abbott et al. , 2022)).Given the realistic galaxy cluster properties in Cardinal, we apply the redMaGiC algorithm on Cardinal galaxies in the same way as redMaGiC run on the DES-Y3 data.Specifically, redMaGiC galaxies are selected using redshifts of redMaPPer clusters instead of the true redshifts in the simulations.This allows us to have realistic galaxy selection-related systematics in Cardinal.Following the DES-Y3 cosmology analysis, we further bin redMaGiC galaxies into five tomographic bins with edges, [0.15, 0.35, 0.5, 0.65, 0.8, 0.9], using z redmagic , the redshift that maximizes p(z redmagic ).We then Monte-Carlo sample p(z redmagic ) for each redshift bin to estimate their redshift distributions.
Figure 12 compares the redshift distributions of redMaGiC galaxies in Cardinal and the data.We find that the red-MaGiC estimated redshift distributions are very similar to those in DES-Y3 data.This agreement is non-trivial.Although the same algorithm produces both samples, this algorithm is applied to two different galaxy catalogs.This similarity in redshift distributions demonstrates the realism of red galaxy properties in Cardinal.In the bottom panel of figure 12, we compare the redshift distributions estimated by red-MaGiC and the true redshift distributions in the simulation.While the redshift distributions estimated by redMaGiC are slightly narrower compared to the true redshift distributions, the overall shapes of redshift distributions are very consistent between redMaGiC's estimations and the true distributions.
We proceed to access the clustering properties of red-MaGiC in Cardinal.We first measure the galaxy correlation functions w(θ) using the weighted version of the Landy-Szalay estimator (Landy & Szalay 1993), where i, j goes through each galaxy pair that have angular speparation θ min < θ < θ max .The w i and w j are systematic weights associated with galaxies to remove spurious correlations of galaxies and survey depths.We detail the construction of these weights in appendix D. The index R corresponds to randoms that describe the survey footprint.In this analysis, we make the number of randoms (N R ) 10 times greater than the number of galaxies in each tomographic bin to remove additional shot noise caused by the randoms.The i, R and RR in equation 26 go through each galaxy-random and random-random pairs that are separated by θ.We compare the measured w(θ) of Cardinal, Buzzard v2.0, and DES-Y3 data in figure 13.To avoid cosmic variance, we compare Cardinal and Buzzard v2.0 generated with the same dark matter simulation.In the lower two redshift bins, the redMaGiC clustering in Cardinal has a stronger one-halo term than Buzzard v2.0.This is likely due to the combination of changes in the color assignment and the inclusion of orphan galaxies.This is consistent with the finding compared to SDSS galaxies, which are galaxies at z 0.1 (figure 5).For all redshift bins, the clustering in Cardinal is somewhat smaller than that of the DES-Y3 data.To better understand the origin of this inconsistency, we compare the clustering of redMaGiC in Cardinal, and Buzzard v2.0 generated with the same dark matter simulation.We find that for the first and second redshift bins, the clustering of redMaGiC in Cardinal is consistent with Buzzard v2.0.For the third redshift bin, the deficit of clustering is likely due to the slightly worse photometric redshift performances in Cardinal compared to Buzzard v2.0.
For the fourth and fifth redshift bins, we find that the red-MaGiC samples in these redshift ranges are slightly fainter than Buzzard v2.0 and data, which could potentially explain the differences in clustering.

CONCLUSIONS AND OUTLOOK
Large simulations with realistic galaxies are increasingly essential in precision cosmological analyses based on large surveys.One of the widely-used multi-purpose simulations is the Buzzard simulation, which produces DES, DESI, and LSST-like mock catalogs out to z = 2.35 to a depth of m r = 27 (DeRose et al. 2019(DeRose et al. , 2022;;Wechsler et al. 2022).10 2 (arcmin) 10 1 10 2 (arcmin) 10 1 10 2 (arcmin) 10 1 10 2 (arcmin) This simulation played an essential role in the core cosmology analyses for DES and in planning and methodology development for several other surveys.In this paper, we introduce several model improvements to the Buzzard catalogs and generate a new set of mock catalogs, the Cardinal mocks.The main improvements are listed below, with a more detailed list in appendix K.
1. We update the subhalo abundance matching model (SHAM) used to generate the Buzzard simulation.
The new SHAM model considers orphan galaxies and a flexible disruption model and incorporates massdependent scatter between galaxy luminosity and halo properties.For the first time, the SHAM model can simultaneously fit galaxy clustering and group-galaxy cross-correlations in three different luminosity thresholds measured in SDSS (Section 2).
2. A new color assignment model is developed to produce the environmentally dependent galaxy colors accurately.For the first time, the color assignment model can simultaneously fit color-dependent galaxy clustering in three different luminosity bins measured in SDSS (Section 3.4).
3. We identify two causes of the discrepancy between the color distribution in mocks and data.These include the need for redshift evolution in the spectroscopic training data and the insufficiency of summarizing galaxy colors using current SED templates.We provide a solution that uses photometric data and conditional abundance matching techniques.Applying this conditional abundance matching scheme, the apparent magnitude and color-color distributions are much more consistent with the DES-Y3 data (Section 3.6).
4. We address the lack of lensing shear due to limited raytracing resolution.We develop a novel method that uses dark matter particle-halo cross-correlations to fix this problem.We find that the ∆Σ around massive halos is more consistent with expectations after this correction is applied (Section 3.5.2).
We incorporate these improvements into the Addgals algorithm to generate the Cardinal mock, a one-quarter sky simulation out to z = 2.35 to a depth of m r = 27.We further cut out one DES-Y3 footprint and apply realistic DES-Y3 photometric errors and sky backgrounds.The latest redMaP-Per cluster finding algorithm and redMaGiC lens galaxy algorithm are also run on the catalogs to produce realistic DESY3-like cluster and lens samples (summarized in figure 1).
We compare the Cardinal mocks with DES-Y3 and SDSS data.These comparisons include the abundance of redMaP-Per clusters, the projected galaxy density profiles around redMaPPer clusters, the redshift distribution of redMaGiC galaxies, and galaxy clustering for various samples.In the cluster abundance comparison, we find that the Cardinal clusters have a much more consistent number density to the data as a function of richness and redshift than those identified in the Buzzard simulation (Figure 10).We further make the comparison by changing one model component at a time.We find that two main factors contribute to the long-standing issue of lower cluster number densities in the Buzzard simulation, where the cluster abundance in simulations is 10-25 percent of the cluster abundance in the data at the same richness.One of these is the previously postulated lack of orphans in the SHAM model (Wechsler et al. 2022;DeRose et al. 2019), and another is the galaxy color assignment model.We further find galaxy profiles around redMaPPer] clusters in Cardinal are in excellent agreement with the data (figure 11).As for the redMaGiC galaxy comparison, we find that the Cardinal algorithm produces reasonable redMaGiC galaxy properties compared to the DES-Y3 data.These include the redshift distributions of the lens samples and galaxy clustering in each tomographic bin.Finally, we compare color-dependent galaxy clustering between Cardinal and SDSS data and find excellent consistency.

To et al
With the updated model, which includes more realistic small-scale galaxy clustering and lensing properties, we expect many applications using the Cardinal mock.We list a few potential applications below.
1. Optically-selected cluster cosmology: Cardinal is a valuable tool to quantify the selection bias in optically selected clusters, which affects observables such as weak lensing profiles because cluster selection is based on richness.This bias has been identified as the dominant systematic in optical cluster cosmology (Sunayama et al. 2020;To & Krause et al. 2021a,b;Wu et al. 2022).Cardinal is one of the very few mocks that the redMaPPer cluster finder can be run on, and it is so far the only mock that yields realistic redMaPPer clusters as a function of richness and redshift without artificially boosting cluster member galaxies.Future work will use Cardinal simulations to understand the selection functions of redMaPPer clusters and the level of selection biases and to motivate a flexible model to mitigate these biases.
2. Photometric reshifts: given the more consistent galaxy color distributions to the data (figure 7), Cardinal is a valuable tool to assess the uncertainties of photometric redshifts (Myles et al. 2021a).For example, one could use Cardinal mock to perform controlled experiments to quantify and develop programs to mitigate various photometric redshift calibration systematics (see e.g.Newman & Gruen 2022, for a review).This includes (a) how heterogeneous selections of spectroscopic redshift samples and redshift errors in deep fields of cosmological surveys can affect photometric redshift calibrations of wide-field galaxies, (b) how astrophysical systematics can affect the effectiveness of crosscorrelation redshifts, such as lensing magnifications, non-linear clustering, and bias evolutions, and (c) how the sample variances due to limited area of deep fields propagate to redshift uncertainties of wide-field galaxies.
3. Quantify small-scale lensing systematics: the new empirical correction of the ray tracing algorithm (section 3.5.2) makes Cardinal accurately produce onehalo term lensing profiles.Cardinal can then serve as a tool to study systematic effects of the small-scale galaxy-galaxy and cluster lensings.This includes (a) developing methods to quantify boost factor measurements of cluster lensing (Varga et al. 2019) using different photometric redshift estimations and (b) validating small-scale galaxy-galaxy lensing and clustering models based on halo occupation distribution (HOD) or conditional luminosity function (CLF).
4. End-to-end tests of multi-probe analyses: with realistic clusters, galaxies, and lensing properties in the same simulation, Cardinal is the ideal mock to develop a combine-probe analysis pipeline that aims to perform joint analyses of auto-and cross-correlations of these observables (MacCrann et al. 2018;To & Krause et al. 2021a,b;DeRose et al. 2022).
5. Defining optimal galaxy samples for clusters, galaxies, and lensing joint analyses: one of the limitations of forward modeling multi-probe analyses is that different cosmological probes rely on different galaxy samples.Under the framework of HOD models, different galaxy samples require a different set of galaxyhalo connection parameters, making parameter inferences harder and diluting cosmological signals.One interesting question is whether it is possible to define a common galaxy sample that serves as lens and source galaxies as well as can be used to identify galaxy clusters.Cardinal can serve as a sandbox to develop this project.
Cardinal can be further improved in the following directions.First, the current model for generating the colors and magnitudes of galaxies that are dependent on their environment only considers information measured in the same snapshot, even though the formation history of a galaxy plays an important role in determining its colors and brightness.The reason for this choice is purely computational: accurately determining the formation history of halos requires simulations with a high level of resolution, which in turn requires significant computing resources.However, recent developments in the hybrid Lagrangian perturbation theory model can help bypass this problem.For example, Modi et al. (2020); Kokron et al. (2021) show that using the local density of dark matter with proper weights based on initial conditions can model galaxy clustering of mock galaxies generated using the halo occupation model down to k = 0.7 h −1 Mpc.The success of these models in producing small-scale galaxy clustering indicates a significant amount of information about galaxies encoded in the initial conditions beyond the dark matter density measured in the same snapshot.Furthermore, Lucie-Smith et al. (2022) find that the initial conditions play an essential role in determining the inner part of the dark matter profiles.Thus, we expect that combining initial conditions and local density will significantly improve the accuracy of the color and magnitude assignment models.Moreover, incorporating initial conditions does not place additional requirements on simulation resolution, and thereby does not require significant additional computational power.Second, we find that the SED templates cannot summarize galaxy colors with uniform accuracy across colors.In Cardinal, we use DES photometric catalogs to solve this problem empirically.However, by doing this, we lose some ability to generate mock catalogs with arbitrary photometric bands.Future work will expand the SED templates using advanced stellar population synthesis models and the Dark Energy Spectroscopic Instrument (DESI) data.Third, we empirically address the lack of redshift evolution in the training spectra using the photometric data.Although this method works well in the end, it requires significant tuning to remove additional noise that broadens the red sequences.The main problem is that cluster finders require accurate red galaxy colors down to 0.2L * , which is m z = 21.8 at z = 0.6.The spectroscopic training samples must go down to the same magnitude with uniform selections.This data is unavailable even with Stage-4 spectroscopic surveys (e.g., DESI).However, the Cardinal algorithm only requires a small area coverage for these spectroscopic samples.These requirements align well with the need for training spectra for photometric redshift calibrations.We anticipate future campaigns to obtain photometric redshift training spectra will be helpful for generating more accurate simulations.
Finally, the current approach of generating simulations is bottom-up.We first obtain a decent luminosity-dependent galaxy clustering and group-galaxy cross-correlations.We employ conditional abundance matching techniques to paint colors to ensure reasonable color-dependent galaxy clusterings.We then perform several conditional abundance matching steps to fix problems in the color model when matching to photometric datasets.This approach has been successful but may not be scalable with the increasing amount of data.A top-down approach might scale better with a plethora of data coming in.With all the datasets and targeted summary statistics in hand, one can optimize a galaxy-dark matter connection model.Here, we specifically chose galaxy-dark matter connections instead of galaxy-halo connections because halo finding and halo definitions could suffer from limited resolution.For this top-down approach to perform well, work has to be done to avoid overfitting the data with an overly flexible model and to increase the interpretability of the constrained model.For the latter, one can derive conditional probabilities from the constrained model and combine that with the bottom-up approach to gain physical insights.We leave this interesting direction for future work.
Although Cardinal was primarily developed for validating cosmological analyses, the method has the potential for broader applications.The ultimate goal of the program is to learn the physics of the universe from the vast amounts of data collected by cosmological surveys and use this knowledge to create a mock universe with higher fidelity.However, to achieve this goal, it is necessary to have high resolution to resolve important physical processes and sufficient volumes to leverage the full constraining power of the data, making the program computationally demanding.One of the critical features of Cardinal is connecting galaxy properties between high-resolution, small-volume simulations and lowresolution, large-volume simulations.While the current connection in Cardinal is limited to galaxy luminosity, one can extend this to other properties, such as rest-frame galaxy colors, sizes, ellipticities, gas densities, and temperatures.Using improved connections, replacing the SHAM training model with first principle-driven models (e.g.SAMs or hydrodynamical simulations), and employing emulator techniques, one can constrain physics of the baryonic component of our universe, the expansion histories of the universe, the growth of cosmic structures, and the interplays between these models using the full power of the survey data.At the same time, one can use this learned physics to create mocks with rich information to facilitate multi-wavelength and multi-probe cosmology analyses.However, significant challenges remain in making the first principle-driven models of baryons computationally efficient, identifying relevant physics, and judiciously selecting informative observational data or summary statistics derived from them.
Finally, although we have been focusing on optical surveys, the Cardinal simulations can also inform many multiwavelength studies.The dark matter halos in Cardinal are resolved down to M 200m = 10 13 h −1 M , making the simulation sufficient to paint SZ and X-ray signals.Future work will generate realistic SZ and X-ray groups and clusters, making Cardinal useful for combined probe analyses of SZ, X-ray, and optical observations.
To et al (DFG, German Research Foundation) under Germany´s Excellence Strategy -EXC-2094 -390783311.Some of the computing for this project was performed on the Sherlock cluster.We thank Stanford University and the Stanford Research Computing Center for providing computational resources and support that contributed to these research results.Additional computations in this paper were run on the CCAPP condo of the Ruby Cluster at the Ohio Supercomputer Center.
This study made use of the SDSS DR7 Archive for which funding has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, the U.S. Department of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England.The SDSS Web Site where Poisson(x) represents a realization from the Poisson distribution with mean x, and D(L, M) represents the mapping such that N(> L) = N(> M).This mapping (D(L, M)) can be understood as assigning the brightest galaxy to the most massive halos.In equation A1, L is the luminosity without scatter, and P(L|L ) represents the relations between L and the observed luminosity of galaxies L. In most of the Subhalo Abundance Matching Models, P(L|L ) is assumed as a Gaussian distribution with a constant scatter.Under this assumption, equation A1 can be considered a convolution problem.One can then obtain N(L ) from the observed φ(L) using the Richardson-Lucy algorithm (Lucy 1974).However, much evidence has shown that the scatter in P(L|L ) varies according to halo masses (see Wechsler & Tinker 2018 for a review).Thus, we want to incorporate this modeling flexibility into our model.We adopt P(L|L ) as a Gaussian distribution with a scatter σ(L ) defined as, where A, B, and C are free parameters.With this parameterization, equation A1 remains linear.The linearity allows one to solve it using the same Richardson-Lucy algorithm, even though it cannot be phrased as a convolution problem. Figure 14 compares the performance of our implementation of the Richardson-Lucy algorithm to a widely used abundance matching code AbundanceMatching.Evidently, our implementation reaches a better accuracy than the existing code.Further, we evaluate the performance of our algorithm in the case of non-constant scatter.We find that our algorithm can achieve an accuracy of 10 percent for the best-fit model of this paper.

B. SHAM EMULATOR
We build an emulator for the SHAM model to facilitate the likelihood inference.We first generate the training data by Latin hypercube sampling of the priors described in table 2. We then generate the data vector at each sample following the procedure described in section 2.2.2.We build an emulator for each radial bin of the data vector.Specifically, given the training data and the associated data vector at a given radial bin, we construct the model using the Polynomial Chaos Expansion implemented in Chaospy (Feinberg & Langtangen 2015).In short, we first model the objective function that maps training data to the associated data vector as where c α are free parameters and m α is a set of orthogonal polynomials up to a certain degree constructed using the discretized Stieltjes procedure (Stieltjes 1884).We determine c α by minimizing the mean squared error of the input data vector.Because m α are polynomials, this minimization can be done analytically.In practice, we find that using 1000 training data and including polynomials up to the third degree are sufficient to achieve accuracy so that the uncertainties associated with the emulator are subdominant of the error budget.
To assess the accuracy of the emulator, we perform the leave-one-out test: iteratively removing one training point off the training sets, training an emulator, and calculating the differences between the removed training point and the emulator's prediction.Figure 15 shows the result of the leave-one-out test.The mean of the error is shown as white dots.Evidently, the error is much smaller than the total error budget.We, therefore, ignore the emulator errors in the likelihood inferences.
C. GAUSSIAN PROCESS OF P(R δ |M r < X, Z) MODEL Following Wechsler et al. (2022), we employ a Gaussian process model to interpolate parameters θ(M r , z) of P(R δ |M r < x, z) model determined from SHAM simulation.We first generate the training sample by measuring θ(M r , z) on 18 evenly spaced magnitude bins from M r = −22.5 to M r = −18 at each snapshot of Chinchilla-T1.We then interpolate the training data using the Gaussian process algorithm implemented using george (Ambikasaran et al. 2015).That is, we model the joint probability of the targeted θ p at parameter space x = (M r , z), and the training set θ t as,  L|L) is a Gaussian distribution with a constant scatter, conv is a convolution operation and deconv is a deconvolution operation.The x-axis shows −2.5logL.In the first panel, we perform the process on a constant scatter (0.2 dex), where we can compare the performance of our implementation (blue line) to other existing codes (orange).In the second panel, we perform the process on varying scatter using the best-fit parameters in this paper.where θ t = (θ 1 , θ 2 , . . ., θ n ) is the measured θ at x t = ((M r , z) 1 , (M r , z) 2 , . . ., (M r , z) n ), I is an identity matrix, σ n are free parameters to prevent overfitting.One can then find the θ p that maximizes the probability and use it as the interpolated value of θ at x = (M r , z).The remaining task is then to determine K and σ n .We model K(x 1 , x 2 ) combination of exponential kernel and Matérn kernel that has the following forms, where l 1 , l 2 are free parameters.We can then determine l 1 , l 2 , σ n by maximizing equation C4 evaluated on the training set θ t .
Once l 1 , l 2 , σ n is determined, we can make a prediction of θ t at other parameters by maximizing equation C4 analytically.
Figure 16 shows the Gaussian process interpolations of the training data on grids of test points as well as the training data.We find that the Gaussian process model agrees with the training data and does not show signs of over-fitting.

D. SYSTEMATIC WEIGHT
The observed galaxy densities can have spurious correlations due to varying observing conditions across the sky.While we do not have most of the varying observing conditions in the simulations, such as dust reddening, seeing, and stars, we have spatially varying survey depths in the simulation.These spatially varying survey depths will produce spurious correlations of galaxies.We remove this signal following the methods described in Rodríguez-Monroy et al. (2022).First, we bin the galaxy density into healpix maps with nside = 512 corresponding to a resolution of 0.11 degrees.Next, we measure the relation of galaxy number density and the input survey depth s used to generate Cardinal.Finally, we fit a linear function to this relation.The fitted function reads the form F = ms + c, where m and c are free parameters.The weight of each galaxy to remove the spurious correlations is then given by 1/F. Figure 17 shows the performance of this algorithm.One can see that the weights significantly suppress the correlations between galaxy density and survey depths.

E. VALIDATION OF COSMIC SHEAR
In section 3.5.2,we present an algorithm to fix the small-scale lensing around halos (cluster lensing and galaxy-galaxy lensing) using particle-halo cross-correlations.In this appendix, we verify that this correction has a negligible impact on large-scale cosmic shears and makes small-scale cosmic shears less affected by ray-tracing resolutions.Figure 18   Emu calculated using CCL (Chisari et al. 2019).This indicates that the ξ +/− calculated using corrected shapes is less affected by the limited resolution of ray tracing.This finding is consistent with Takada & Bridle (2007), where the authors show that the lensing effect from massive clusters (M > 10 13 M ) significantly contributes to the one-halo term of cosmic shear power spectra.

F. CONDITIONAL LUMINOSITY FUNCTION
We compare the conditional luminosity function in Cardinal, Buzzard v2.0, and DES-Y3 data.The measurements follow the prescriptions detailed in To et al. (2020).Figure 19 shows the result.We find that for the lower two redshift bins, the central galaxy luminosity distributions are consistent between Buzzard v2.0 and Cardinal.For the highest redshift bin, the central galaxy  luminosity in Cardinal is somewhat fainter than Buzzard v2.0.However, the relative brightness of centrals and satellites in Cardinal is more consistent with the DES-Y3 data than Buzzard v2.0.This is an important feature for the performance of cluster finders, as indicated in Kovacs et al. (2022) and To & Krause et al. (2021b).In terms of the satellites, we find that the satellite luminosity distributions in Cardinal are more consistent with the DES-Y3 data than Buzzard v2.0 on the faint end.On the bright end, Cardinal and Buzzard's satellite luminosity distributions are consistent but are fainter than the DES-Y3 data.
To et al

G. ALGORITHM OF AVOIDING WIDENING RED SEQUENCE WITH CONDITIONAL ABUNDANCE MATCHING
Here, we present an algorithm that uses redMaPPer to avoid widening the width of the red sequence in the conditional abundance matching step.First, we run the redMaPPer cluster finder in the mocks generated before applying the conditional abundance matching.For each galaxy in the mock, we obtain a χ 2 , quantifying the consistency of the multidimensional colors with the empirically constructed red-sequence model, and z red , the redshift that minimizes the χ 2 (See details in Rykoff et al. 2014).For the galaxies consistent with the red sequence (i.e., small χ 2 ), z red provides a good estimator of galaxies' redshift, with a typical uncertainty ∼ 0.02.Second, we abundance-match this χ 2 measured in mocks to χ 2 measured in the data by enforcing the equality of p(< χ 2 |m z ).In this way, we can avoid the possibility that χ 2 distributions in the mocks and data might differ.Third, we select red galaxies in the data and mocks using the abundance-matched χ 2 .According to Rozo et al. (2015b), we select red galaxies as χ 2 < 20.Third, we perform abundance matching separately for blue and red galaxies.For blue galaxies with χ 2 > 20, we enforce the equality of equation 25 between mocks and data.For red galaxies with χ 2 < 20, we use the z red information.To avoid the possibility that z red distributions in mocks and data differ, we first compute z a,red , the abundance matched z red in Cardinal by matching p(< z red |m z ) between Cardinal and data.We then match the overall color distributions of mocks and data by enforcing the equality of the following equation, P(< c i |m z , c j<i , z a,red ).(G6)

H. ALGORITHM OF MATCHING RED SEQUENCE IN MOCKS AND DATA
In this appendix, we describe an algorithm that matches red sequences in mocks and the data.We first run the redMaPPer algorithm on mocks and data to obtain a red-sequence model.This model has the following functional form, P(c|z, m z ) ∝ exp(−0.In the above algorithm, x d represents the model evaluated using DES-Y3 data, x m represents the model evaluated using mocks, R n is the distance to the nearest resolved halo whose mass is M n , and s is the square root of the diagonal terms of C int (z).The (1 − χ 2 100 ) term is to ensure a smooth transition between red galaxies and blue galaxies.F controls the non-linear dependence of the color shift and (1− χ 2 100 ), which is given by F(x) = 0.5+0.5erf((x−0.1)/0.05),where erf is the error function.(s d /s m ) 2 term ensures consistency of the scatter in the observed colors, which are proportional to the square root of the scatter of c due to the Poisson draws.The G(z) removes additional noise in the red sequence introduced by the CAM method.This additional noise becomes large when the galaxies' magnitude approaches the survey depth limits.Therefore, G(z) must be larger at higher redshifts when galaxies are fainter than their low redshift counterparts.We empirically find that G(z) = 2(0.5 + 0.5erf((z − 0.8)/0.3)+ 1.2 produces reasonable galaxy properties.Further, shifting all galaxies with χ 2 < 100 will produce too many red galaxies.We, therefore, control the number of red galaxies and their clustering with two additional cuts M n > M(z) and R n < R(z).Again, because the additional noise introduced by the CAM model becomes larger at higher redshifts, M(z) should decrease, and R(z)  Szalay 1993) to estimate Σ g (R) like the one described in Chang et al. (2018) would include additional terms and potentially lead to biases.In survey data, not all clusters live at the same redshifts.We, therefore, follow the prescription described in Chang et al. (2018) for the full survey data.We first divide the redMaPPer clusters with λ > 20 from z = 0.2 − 0.55 into ∆z = 0.025 bins.We estimate galaxy density profiles around clusters for each redshift bin ( Σg (R) i ) using equation I11.We then estimate the Σ g (R) using the weighted sum of Σg (R) i , with the number of clusters in each redshift bin as the weight.To avoid mixing galaxies with different luminosities across the entire redshift range, we further make a cut on galaxy samples at each redshift bin M r − 5 log(h) < −20.The M r is calculated assuming galaxies are at the same redshift as galaxy clusters.
In Cardinal, we repeat all the above calculations.

J. COLOR TEMPLATE BIAS SIMULATION
This section employs a simple simulation to offer valuable insights into the potential impact of the color biases presented in figure 8 on galaxy color distribution.We accomplish this by generating mock galaxy samples through a combination of two Gaussian distributions.Specifically, we draw 100k blue galaxies with a mean g − r color of 1 and a standard deviation of 0.3, followed by drawing 5k red galaxies with a mean g − r of 1.65 and a standard deviation of 0.05.The resulting galaxy color distribution is presented as the blue histogram in figure 20, designed to mirror the cosmos samples' color distribution at z = 0.43 − 0.63, as depicted in figure 5 of DeRose et al. (2019).Next, we shift the color up by 0.1 magnitude, a typical value when comparing the g − r color distribution between Buzzard and DES-Y3 data.The resulting color distribution is illustrated as the red histogram in figure 20.Subsequently, we incorporate the color biases displayed in the right panel of figure 8, resulting in a galaxy sample with the color distribution depicted as the black histogram in figure 20.Strikingly, the black histogram reproduces two key features in figure 5 of DeRose et al. (2019).First, the red-sequence galaxies corresponding to the black histogram have a similar mean color to those in the unaltered galaxy samples.Second, the color distribution of red-sequence galaxies is significantly more peaked than those of unaltered samples.This simple yet compelling simulation further supports our hypothesis regarding the color discrepancies' origin between Buzzard and data.
galaxy and galaxy-group clustering SED templates + SDSS coefficients PRIMUS red fraction SDSS color-dependent clustering Data DES luminosity and color distributions New luminosity evolution and SED parameters (Section 3.6).

Figure 4 .
Figure4.Distribution of R δ , the radius that encloses a dark matter mass of 1.3 × 10 13 h −1 M , at z = 0.Each panel shows a magnitude cut, with the brightest samples on the top and the faintest samples on the bottom.Black dots show measurements in SHAM catalogs with error bars showing 1σ Poisson error.Red lines show measurements of R δ using mock galaxies in lightcone simulations.We further break the R δ distributions in mocks into centrals (blue) and satellites (orange).As a comparison, the previous version Buzzard v2.0(Wechsler et al. 2022) is shown as gray lines.

Figure 5 .
Figure 5.Comparison of color-dependent clustering measured in Cardinal (line) and data (dots with error bars,Zehavi et al. 2011).Errorbar shows 1σ uncertainties.Red lines and dots correspond to red galaxies, while blue lines and dots correspond to blue galaxies.Different columns correspond to different magnitude bins in galaxies.For comparison, results from Buzzard v2.0 are shown in gray.

Figure 6 .
Figure 6.Mean ∆Σ profiles measured around halos with M vir > 2 × 10 14 h −1 M and z = [0.6,0.62].The x−axis is measured in physical coordinates.The green dots are derived from dark matter particle-halo cross-correlations.The orange dots are calculated from ray-tracing shapes (and are similar to previous catalog versions).The blue dots show our improved model, calculated from corrected ray-tracing shears.Error bars are estimated with 10000 bootstrap resampling.The black vertical dashed lines show ten times the pixel size for ray-tracings.

Figure 7 .
Figure7.Comparison of apparent magnitude between Cardinal and DES Y3 data.The top left panel compares overall magnitude distributions, and the bottom left panel shows the fractional differences.The top right panel compares g − r and r − i distributions, and the bottom right compares g − r and i − z distributions.The black lines show measurements of DES Y3 data, the dashed lines show Cardinal before applying the conditional abundance matching scheme (detailed in section 3.6), and the solid lines show Cardinal after applying the conditional abundance matching scheme.Contours show 1σ and 2σ boundaries of the density distributions.The new conditional abundance matching scheme greatly improves the consistency between mocks and data.

Figure 8 .
Figure 8. Color (g − r) residuals as a function of g − r color (c d).c m is the g − r color constucted using k-correction templates and bestfit coefficients.This calculation uses redMaPPer member galaxies with redMaPPer run on DES-Y3 data.Each column corresponds to a redshift slice using redshifts of host galaxy clusters.Top panels: dots show the median c m − c d in each color bin, while error bars show 68 percent scatter.Magenta lines show 1σ widths of red sequence reported by redMaPPer.We split the samples into two subsets: galaxies that are bright with m z < 18 (green), and galaxies that best match redMaPPer's red-sequence template (blue).Both subsets are artificially shifted horizontally to improve clarity.Bottom panels: histograms show the measured color distributions of all galaxies in the calculation.Magenta lines show the mean (solid) and width (dashed) of the red sequence reported by redMaPPer.For all samples, the reconstructed color is mostly unbiased at z = 0.2-0.3.At z = 0.4-0.5, the reconstructed color is biased for red galaxies and mostly unbiased for blue galaxies.

Figure 9 .
Figure 9.Comparison of red-sequence galaxies in Cardinal (red) and DES Y3 data (black) estimated by redMaPPer.Three columns correspond to g−r, r−i, i−z colors, respectively.The top row shows the mean colors at m z = 19 as a function of redshift.The middle row shows the slope of colors-m z relation as a function of redshifts estimated by redMaPPer.The bottom row shows the scatter of redsequence colors.The largest discrepancy occurs at high redshift g − r colors, where the data is particularly noisy.
compares the cluster abundances as a function of richness in four different redshift bins

Figure 10 .
Figure 10.Comparison of cluster number density in Cardinal and DES-Y3 data.The top left plot shows redMaPPer cluster abundances per deg 2 as a function of richness (λ) and redshift (z λ ).The top right plot shows the relative cluster number density of each realization relative to Buzzard v2.0.Four different rows in the top two panels show four redshift bins based on z λ , the most probable redshifts of clusters estimated by redMaPPer.The black lines correspond to the DES-Y3 data, and the blue lines correspond to Buzzard v2.0 (DeRose et al. 2022).We note that in both cases, we re-run redMaPPer using the same version of redMaPPer as used in Cardinal to ensure apples-to-apples comparisons.The red lines show the number of redMaPPer identified clusters in Cardinal.Evidently, one can see that the cluster abundances in Cardinal are much more consistent with the data.The green and orange lines show two different realizations that are intermediate steps from Buzzard v2.0 to Cardinal.Specifically, the orange line corresponds to a realization where the only difference to Buzzard v2.0 is including orphans in the SHAM model (detailed in section 2).The green line represents one step forward by changing the galaxy color assignment models detailed in section 3.4.Error bars show the Poisson noise.For further comparison, the bottom panel shows the richness ratio as a function of cumulative cluster abundance between simulations and DES-Y3 data.The top x-axis shows the corresponding richness of DES-Y3 data at a given cumulative number density.Error bars are estimated with 50 jackknife resampling.

Figure 11 .
Figure11.Stacked surface densities of galaxies around redMaP-Per clusters with λ > 20 and redshift 0.2 to 0.55.Errorbars show 1σ uncertainties estimated using 64 jackknife resampling.Black dots show measurements for DES-Y3 galaxies and red dots show measurements for Cardinal.

Figure 12 .
Figure 12.Comparison of redshift distributions.The top panel shows redshift distributions estimated by redMaGiC in Cardinal (dashed) and data (solid).The bottom panel compares true (solid) and photometric redshift distributions estimated by red-MaGiC (dashed) in Cardinal.Different colors show different tomographic bins.

Figure 13 .
Figure 13.Comparison of redMaGiC clustering in Cardinal (red lines) and DES Y3 data (black lines).Five different panels show the five tomographic bins based on z redmagic , with edges at [0.15, 0.35, 0.5, 0.65, 0.8, 0.9].Error bars show 1σ confidence region.For comparison, blue lines show the galaxy clustering of Buzzard v2.0 using the same dark matter simulations as Cardinal.The magenta lines show galaxy clustering of Buzzard v2.0 averaged over 18 DES-Y3 realizations (DeRose et al. 2022).
is http://www.sdss.org/.The SDSS is managed by the Astrophysical Research Consortium for the Participating Institutions.The Participating Institutions are the American Museum of Natural History, Astrophysical Institute Potsdam, University of Basel, University of Cambridge, Case Western Reserve University, University of Chicago, Drexel University, Fermilab, the Institute for Advanced Study, the Japan Participation Group, Johns Hopkins University, the Joint Institute for Nuclear Astrophysics, the Kavli Institute for Particle Astrophysics and Cosmology, the Korean Scientist Group, the Chinese Academy of Sciences (LAM-OST), Los Alamos National Laboratory, the Max-Planck-Institute for Astronomy (MPIA), the Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, Ohio State University, University of Pittsburgh, University of Portsmouth, Princeton University, the United States Naval Observatory, and the University of Washington.This project used public archival data from the Dark Energy Survey (DES).Funding for the DES Projects has been provided by the U.S. Department of Energy, the U.S. National Science Foundation, the Ministry of Science and Education of Spain, the Science and Technology FacilitiesCouncil of the United Kingdom, the Higher Education Funding Council for England, the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the Kavli Institute of Cosmological Physics at the University of Chicago, the Center for Cosmology and Astro-Particle Physics at the Ohio State University, the Mitchell Institute for Fundamental Physics and Astronomy at Texas A&M University, Financiadora de Estudos e Projetos, Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro, Conselho Nacional de Desenvolvimento Científico e Tecnológico and the Ministério da Ciência, Tecnologia e Inovação, the Deutsche Forschungsgemeinschaft, and the Collaborating Institutions in the Dark Energy Survey.The Collaborating Institutions are Argonne National Laboratory, the University of California at Santa Cruz, the University of Cambridge, Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas-Madrid, the University of Chicago, University College London, the DES-Brazil Consortium, the University of Edinburgh, the Eidgenössische Technische Hochschule (ETH) Zürich, Fermi National Accelerator Laboratory, the University of Illinois at Urbana-Champaign, the Institut de Ciències de l'Espai (IEEC/CSIC), the Institut de Física d'Altes Energies, Lawrence Berkeley National Laboratory, the Ludwig-Maximilians Universität München and the associated Excellence Cluster Universe, the University of Michigan, the National Optical Astronomy Observatory, the University of Nottingham, The Ohio State University, the OzDES Membership Consortium, the University of Pennsylvania, the University of Portsmouth, SLAC National Accelerator Laboratory, Stanford University, the University of Sussex, and Texas A&M University.Based in part on observations at Cerro Tololo Inter-American Observatory, National Optical Astronomy Observatory, which is operated by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation.APPENDIX A. VARYING SCATTER IN THE SUBHALO ABUNDANCE MATCHING MODEL This section describes an algorithm to incorporate varying scatter in the Subhalo Abundance Matching Model.Given an observed luminosity function φ(L) and a halo mass function N(M), one can relate the two functions via, φ(L) = Poisson P(L|L )N(L ) dL , N(L ) := D(L , M)N(M) dM, (A1)

Figure 14 .
Figure14.Performance of the algorithm.The y-axis shows the relative error caused by the Richardson-Lucy process, evaluated as conv (deconv (φ(L))) /φ(L) − 1, where conv shows the process in equation A1, deconv shows the Richardson-Lucy process, and φ(L) shows the original luminosity function.If P(L|L ) is a Gaussian distribution with a constant scatter, conv is a convolution operation and deconv is a deconvolution operation.The x-axis shows −2.5logL.In the first panel, we perform the process on a constant scatter (0.2 dex), where we can compare the performance of our implementation (blue line) to other existing codes (orange).In the second panel, we perform the process on varying scatter using the best-fit parameters in this paper.

Figure 15 .
Figure15.Accuracy tests of the SHAM emulators.The top row corresponds to galaxy-galaxy group correlations in each magnitude bin and the bottom row shows galaxy auto correlations.Each panel shows emulator errors relative to the total error budget.Shaded blue regions are constructed by performing the leave-one-out test: removing one training data, training emulators, and calculating the error of the emulator at the removed point.The mean of the distribution is indicated by white dots and the black line shows 25 to 75 percent quantiles.Evidently, the emulator errors are much smaller than the total error budget in all radial bins and can therefore be ignored in the likelihood inferences.

Figure 16 .
Figure 16.Parameters of P(R δ |M r , z) model.The line shows the Gaussian process models for the redshift and magnitude dependence.

Figure 17 .
Figure 17.Performance of systematic weights on removing spurious correlations.The plot shows the observed pixel number densities evaluated on nside = 512 maps as a function of the limiting magnitudes of each pixel.The original redMaGiC number density is shown as the blue line, and the weighted number density is shown as the orange dots.Error bars show the 1σ error estimated with 150 jackknife resamples.

Figure 18 .
Figure18.Comparison of ξ +/− calculated using original shear from ray-tracing (orange) and corrected shears (blue) of galaxies in Cardinal with z = [0.6,0.9].Green lines show theory prediction using CAMB and Cosmic Emulator.Shaded regions show 1σ error estimated with 204 jackknife resampling.Bottom panels show fractional differences of ξ +/− measurements and theory.The shaded region shows 1σ error.The vertical dashed line shows ten times the ray-tracing resolution.

Figure 19 .
Figure 19.The conditional luminosity function for central galaxies (solid lines) and satellite galaxies (dashed lines) for redMaPPer clusters in Buzzard v2.0 and Cardinal (blue and orange lines, respectively) and the DES-Y3 data (black lines).Different panels show the different richness and redshift bins, as indicated in the legend in each panel.Richness increases from top to bottom, and redshift increases from left to right.Error bars and shaded regions show the 1σ errors estimated from 50 jackknife resampling.

F
5χ 2 ), χ 2 = (c − c|z, m z ) (C int (z) + C err (z)) −1 (c − c|z, m z ) T ,(H7)wherec = [g − r, r − i, i − z], C int (z)is the intrinsic scatter of red sequence, and C err (z) is the observational noise.The mean color c|z, m z is modeled as a simple power law c(z) + s(z)(m z − m p (z)), where c(z), s(z), and m p (z) are free parameters.With P(c|z, m z ) models for mocks and data, we can enforce the consistency of red sequence using the following algorithm,for galaxies with redshift z and observed magnitudes m g,r,i,z do m o g,r,i,z ← noisy realizations of m g,r,i,z using equation 20.Calculate χ 2 using m o g,r,i,z and P m (c|z, m z ) if χ 2 < 100 and R n < R(z) and M n > M(z) then for c ∈ {g − r, r − i, i − z} doc+ = c|z, m z m ) s d sm 2 + c|z, m z d − c end for m g ← c g−r + c r−i + c i−z + m z m r ← c r−i + c i−z + m z m i ← c i−z + m z end if end for

Figure 20 .
Figure 20.Color distributions of mock samples with the blue histogram representing the true color distribution.The red histogram corresponds to the samples when colors are shifted by 0.1 magnitude.The black histogram shows the case when we summarize the color using SED templates.To generate the black histogram, we apply the biases shown in the right panel of figure 8 to the galaxies in the red histogram.
K.DIFFERENCES BETWEEN CARDINAL AND BUZZARD V2.0This section summarizes the main improvements from Buzzard v2.0 to Cardinal.We divide these improvements into four categories, summarized below.1. Subhalo abundance matching: (a) We consider a luminosity-dependent scatter.(b) Orphan subhalos are included, whose abundances are controlled by a specially-designed orphan model.(c) The group-galaxy cross-correlation function is used in addition to the galaxy-galaxy correlation function to constrain the model.
Reddick et al. 2013;Lehmann et al. 2017;DeRose et al. 2021;Contreras et al. 2021), one parametrizes P(L|L ) as a log-normal distribution with mean L = L and scatter σ.With this assumption, equation 5 can be viewed as a convolution problem and can be solved using standard deconvolution algorithms.However, numerous lines of evidence based on observations and simulations have shown that the scatter in P(L|L ) might depend on halo mass (seeWechsler & Tinker 2018, for review).Wechsler & Posteriors of the extended SHAM model given SDSS galaxy clustering and group galaxy cross-correlations.Contours show 68 percent confidence region of the posteriors given w gg and w cg .Red lines show constraints based on Chinchilla-T1, and blue lines show constraints based on SMDPL.Details of these simulations can be found in table 1. Descriptions of these parameters are given in table 2.