Merger-tree-based Galaxy Matching: A Comparative Study across Different Resolutions

We introduce a novel halo/galaxy matching technique between two cosmological simulations with different resolutions, which utilizes the positions and masses of halos along their subhalo merger tree. With this tool, we conduct a study of resolution biases through the galaxy-by-galaxy inspection of a pair of simulations that have the same simulation configuration but different mass resolutions, utilizing a suite of IllustrisTNG simulations to assess the impact on galaxy properties. We find that, with the subgrid physics model calibrated for TNG100-1, subhalos in TNG100-1 (high resolution) have ≲0.5 dex higher stellar masses than their counterparts in the TNG100-2 (low resolution). It is also discovered that the subhalos with M gas ∼ 108.5 M ⊙ in TNG100-1 have ∼0.5 dex higher gas mass than those in TNG100-2. The mass profiles of the subhalos reveal that the dark matter masses of subhalos in TNG100-2 converge well with those from TNG100-1, except within 4 kpc of the resolution limit. The differences in stellar mass and hot gas mass are most pronounced in the central region. We exploit machine learning to build a correction mapping for the physical quantities of subhalos from low- to high-resolution simulations (TNG300-1 and TNG100-1), which enables us to find an efficient way to compile a high-resolution galaxy catalog even from a low-resolution simulation. Our tools can easily be applied to other large cosmological simulations, testing and mitigating the resolution biases of their numerical codes and subgrid physics models.


INTRODUCTION
Emerging as powerful tools in modern astrophysics, cosmological simulations enable scientists to study the formation of dark matter (DM) halos, galaxies, and large-scale structures.DM-only (DMO) simulations, paired with semianalytic models, have achieved significant success in advancing galaxy formation theory (e.g., Lacey & Cole 1993).Fur-thermore, the increase in computational power has also paved the way for the reproduction of baryonic components -such as the interstellar medium, stars, and black holes (for a recent review, see Crain & van de Voort 2023).However, even as the resolution of these simulations improves, it remains computationally challenging to fully resolve the intricate physics of many processes of interest.
Given the complexities of baryon physics and the differences in resolution scales, a purely ab initio approach is impractical.This is where subgrid models become indispensable in cosmological simulations, serving as a mechanism to implement crucial baryonic physics -star formation and feedback, interstellar medium, massive black hole (MBH) accretion and feedback, and so on -when the computational resolution is limited.The same model, however, can behave differently based on the numerical scheme and resolution employed in the simulation (Schaye et al. 2015).To enhance the reliability of simulations, it is crucial to quantify the impacts of resolution on the subgrid recipes.Schaye et al. (2015) introduced the terms 'weak convergence' and 'strong convergence' in the context of numerical simulations' predictive power with respect to observables.'Strong convergence' is achieved when the subgrid model yields consistent results across different resolutions.However, most observables exhibit only 'weak convergence', meaning that the results require re-calibration of the subgrid parameters with changes in resolution to some extent.For example, the EAGLE simulation (Schaye et al. 2015;Crain et al. 2015) exhibits a good weak convergence in the galaxy stellar mass function.This means that the convergence of the galaxy stellar mass function is achieved by adjusting the feedback parameters for a different resolution accordingly.However, a strong convergence test -conducted without calibrating subgrid parameters -shows that the galaxy stellar mass function at M ⋆ ∼ 10 9 M ⊙ in a simulation with 8 times better mass resolution is 0.4 dex higher.Schaye et al. (2015) also tested the convergence of the recalibrated model -feedback parameters were adjusted to achieve the galaxy stellar mass function -in the other properties.They found a good weak convergence in terms of galaxy sizes, black hole mass, and star formation rate for a given galaxy stellar mass in the EAGLE simulation.However, for the mass-metallicity relation, the strong convergence test yields significantly better results than the weak convergence test with the recalibrated model.This result suggests that, while weak convergence can be achieved for some properties by adjusting subgrid parameters with resolution, other properties may not converge using these parameters.
Resolution convergence have been studied in large-scale cosmological simulation, using matter power spectrum (van Daalen & Schaye 2015;Schneider et al. 2016;Snaith et al. 2018), gas fraction (Crain et al. 2007;Davé et al. 2013;Qin et al. 2017;Ludlow et al. 2020), internal structures of dark matter halos and stellar kinematics (Ludlow et al. 2019(Ludlow et al. , 2020(Ludlow et al. , 2023)), major mergers (Sparre & Springel 2016), and compact galaxies (Chabanier et al. 2020).In the ILLUSTRISTNG (TNG hereafter) simulations, Pillepich et al. (2018a) investigated the effects of varying numerical resolution while maintaining a fixed subgrid physics model.They observed that as the resolution increases, galaxies tend to have higher stellar masses.They also found that, for a given stellar mass, the increased resolution results in higher black hole mass and metallicity.In their highest resolution run, TNG50, a convergence test was conducted on morphological properties such as galaxy sizes, disk heights, and kinematics (Pillepich et al. 2019).Comparisons of various other properties, such as stellar morphologies and star-forming activities (Zanisi et al. 2021) and the radial distribution of satellite galaxies (Riggs et al. 2022) with simulations of different resolutions within the TNG suite have also been performed.
However, most of these resolution studies have been done through several key scaling-relation comparisons, not galaxy-by-galaxy comparisons.This is primarily due to the difficulty of matching halos and galaxies in simulations with different resolutions.It is therefore notable that in comparison studies between hydrodynamic simulations and their DMO counterparts -where the number of dark matter particles is identical in both simulations -researchers have been able to directly investigate the role of baryonic physics by matching galaxies across the two types of simulations (e.g., Sawala et al. 2013;Schaller et al. 2015).Lovell et al. (2022) adopted a similar matching technique, and applied a machine learning (ML) method to correct the differences induced by baryonic physics.These comparison studies were possible because DMO simulations, in most cases, start from the same initial conditions and with the same DM particle IDs as used in the hydro simulations.If the DM particles in both the hydrodynamic and DMO simulations share the same particle numbering scheme, matching pairs of halos can be readily identified by locating the halo containing particles with same particle IDs (e.g., as demonstrated by Schaller et al. 2015).This indicates that they originated from the same patch region in the initial conditions.
Applying this matching technique directly to pair halos/galaxies in high-and low-resolution simulations is challenging because the numbers of DM/gas particles are different in the two simulations.In such cases, one possible solution is to trace each particle back to its initial conditions and identify the nearest counterpart particle in the other simulation (e.g., Ludlow et al. 2023).Instead, we develop a physically-motivated matching algorithm that utilizes the positions and masses of halos along the merger tree in the matching process.Using this algorithm, we for the first time perform a galaxy-by-galaxy resolution study with our matching halo/galaxy catalogs, and investigate how the galaxy properties differ with varying resolutions in the TNG simulation suite.Moreover, inspired by ML techniques used to paint baryonic properties onto DM halos in DMO simulations (e.g., Jo & Kim 2019;Lovell et al. 2022;Jespersen et al. 2022), we seek to adjust the physical properties of halos found in low-resolution simulations.As a first test, we implement an ML model for TNG300-1 subhalos which "corrects" their properties to match those in a higher resolution simulation, TNG100-1.1.The key parameters for the TNG simulation suite used in our study.From left to right: the box length (in a comoving unit), number of DM particles, DM particle mass, target baryon mass, Plummer equivalent gravitational softening length of the DM and stellar particles, minimum value of the adaptive gravitational softening length for gas (in a comoving unit), and median gas cell radius.Note that the gas cell mass has continuous values centered around the target baryon mass, m gas , within a factor of two (Pillepich et al. 2018a).See Pillepich et al. (2019), Nelson et al. (2019a), or the TNG Collaboration website for a detailed list of the parameters.
The remainder of this paper is organized as follows.In Section 2, we describe the simulation suite used, the subhalo matching technique, and a brief illustration of the ML model employed.In Section 3 we demonstrate that, while DM halo properties exhibit strong resolution convergence, the baryonic properties change with resolution (Section 3.1).We further investigate the differences in the radial mass profiles of the subhalos (Section 3.2).We then apply an ML method to "correct" the resolution biases in stellar mass, gas mass, and metallicity (Section 3.3).Lastly, in Section 4, we discuss the validity of our matching catalogs and analyze which halo features are more influential in our ML model.

ILLUSTRISTNG Simulation Suite
ILLUSTRISTNG (TNG in short) is a series of large cosmological simulations using the moving-mesh code AREPO (Springel 2010).TNG implemented various sophisticated subgrid physics model including magnetohydrodynamics (Pakmor et al. 2011;Pakmor & Springel 2013), black hole accretion and feedback (Weinberger et al. 2017), galactic winds and stellar evolution (Vogelsberger et al. 2013;Pillepich et al. 2018a), and metal advection (Naiman et al. 2018;Pillepich et al. 2018a).In this paper, we primarily employ four simulations in their suite: TNG100-1, TNG100-2, TNG50-1, and TNG50-2.TNG100-1 served as the fiducial simulation for which the subgrid physics model parameters have been calibrated, with a box size of (106.5 cMpc) 3 (Nelson et al. 2019b). 1 The physics model and parameters used in TNG100-1 were implemented in all TNG simulations.TNG50-1 was the highest resolution run in the TNG suite, with a smaller box size of (51.7 cMpc) 3 .TNG100-2 (TNG50-2) was a lower-resolution counterpart of TNG100-1 (TNG50-1).With the same initial condition, each parti-cle mass in TNG100-2 (TNG50-2) is 8 times higher than the flagship run, TNG100-1 (TNG50-1).The mass and spatial resolutions of these simulations are listed in Table 1, including TNG300-1 which is used for the analysis in Section 3.3.To construct the matching algorithm in Section 2.2, we employ TNG100-1-Dark and TNG50-1-Dark, the DMO counterpart simulation of TNG100-1 and TNG50-1, respectively.The pair of simulations have an identical number of DM particles, but lack the baryon components.Therefore, the DM particle mass in TNG100-1-Dark and TNG50-1-Dark is Ω matter /Ω DM times higher than that in the hydrodynamic simulations.
TNG implemented different softening lengths by mass resolution, following ε DM, stars = L box /N 1/3 DM /40 for DM and stellar particles, where L box is the simulation box length and N DM is the total number of DM particles (Pillepich et al. 2018a).The adaptive softening length of gas followed ε gas = 2.5 r cell , where r cell is the effective radius of a gas cell (Pillepich et al. 2018a).The black hole kernel-weighted neighbor number was also scaled with the resolution as n ngb ∝ m −1/3 baryon (Weinberger et al. 2017).In addition, due to numerical reasons, the physics model for TNG50 was slightly modified -specifically, the dependence of the star formation timescale on gas density (see Section 2.2 in Nelson et al. 2019a).Except for these listed above, all the physics models are identical across all TNG simulations, regardless of the resolution.Most notably, all TNG simulations adopt the same star formation density threshold, n H ≃ 0.1 cm −3 , above which stochastic star formation occurs (Pillepich et al. 2018a).
DM halo catalogs distributed by the TNG Collaboration were compiled using halos identified by the Friends-of-Friends algorithm (FoF; Davis et al. 1985).The subhalos, which are self-bounded substructures like the central and satellite galaxies within the halos, were further identified using the SUBFIND algorithm (Springel et al. 2001;Dolag et al. 2009).In this work, the gas mass and stellar mass of the subhalos represent the total mass of components bound to the subhalos, determined by the SUBFIND algorithm.From the .Schematic diagram of the subhalo matching method between TNG100-1 (high-resolution) and TNG100-2 (low-resolution) using machine learning (ML).We start from subhalos in TNG100-1 and their DMO counterparts in TNG100-1-Dark, where the subhalo matching catalog already exists (first panel from top left).We extract positions and masses of each subhalo along the main branch of their merger tree, and evaluate the similarity scores for all possible pairs between subhalos in TNG100-1 and TNG100-1-Dark (second and third panel).The matching function is governed by several parameters as described in Eqs.( 1) and ( 2).After we tune the parameters that optimally predict the existing catalog (fourth panel), the matching function with the optimal set of parameters is used to construct a matching catalog between the subhalos in TNG100-1 and TNG100-2 (fifth panel).See Section 2.2 and Appendix A for a detailed description of the pipeline.
snapshot 0 (z = 20.05) to snapshot 99 (z = 0), subhalos across cosmic times were linked together in a merger tree using two different algorithms: LHALOTREE (Springel et al. 2005) and SUBLINK (Rodriguez-Gomez et al. 2015).For a detailed comparison of the similarities and differences between the two algorithms, refer to Nelson et al. (2015).Unless specified, we use the SUBLINK merger trees for our analyses.
Since the DM particle IDs in the hydrodynamic simulations and their DMO counterparts are numbered in exactly the same way, the TNG Collaboration provides subhalo matching catalogs between the hydrodynamic and DMO runs (Nelson et al. 2015).For each subhalo in the hydrodynamic simulation, the subhalo with the largest number of the shared DM particles in their LHALOTREE merger trees in the DMO run was chosen to be the best-matched candidate.After repeating the process, but this time finding a best-matched candidate in a hydrodynamic run for a subhalo in the DMO run, only the matching pairs that were bijectively matched (i.e., both directions yield identical matching results) were saved.

Matching Subhalos Between Simulations
To construct and test the matching algorithm, we start with TNG100-1 and its counterpart DMO run, TNG100-1-Dark, where matching catalogs between the two simulations are provided by the TNG Collaboration (Section 2.1).During this training phase, our aim is to determine the optimal Subhalo DM mass (log [M DM /M ]) Subhalos in TNG50-1 that have pairs in TNG50-2 Subhalos in TNG50-2 that have pairs in TNG50-1 The accuracy of the matching method compared to the existing subhalo matching catalog.Matching accuracies for subhalo matching between TNG100-1 and TNG100-Dark are represented by blue circles, and those between TNG50-1 and TNG50-Dark by orange triangles.Central subhalos (the primary subhalo in its FoF group) and satellite subhalos are differentiated by solid and dashed lines, respectively.
The matching accuracies of central subhalos in both simulation pairs remain ∼ 0.995 across all subhalo mass ranges, while the matching accuracies for satellite subhalos are relatively lower.Note that at the high mass end, with M DM > 10 14 M ⊙ , only central subhalos in TNG100-1 exist in the test set.(middle and right) The fraction of subhalos in the high-resolution simulation for which their counterparts in the lowresolution simulation are identified by our subhalo matching method (and vice versa).The gray vertical lines indicate 3 × 10 9 M ⊙ and 3 × 10 8 M ⊙ in the middle (for TNG100) and right panel (for TNG50), respectively.For over 98% of subhalos with mass M halo > 10 10 M ⊙ in TNG100-1 (TNG100-2), their matching pairs are identified in TNG100-2 (TNG100-1).See Section 2.2 for more details.
parameters for the galaxy/subhalo matching model between two simulations. 2 We first evaluate the "similarity" between a subhalo in TNG100-1 (denoted as "hydro") and a subhalo in TNG100-1-Dark (denoted as "DMO").This evaluation begins with acquiring the positions and masses of all the halos at z = 0 along the main branches of their merger trees, x , where I indicates a snapshot number (ranging from N start to 99, with N start indicating the snapshot of the last leaf on main branch).Our goal is then to determine whether the two subhalos have similar positions and DM masses at all times along the main branches of their respective merger trees.Specifically, we employ the following scoring scheme to evaluate the similarity S between the two halos: where 2 The basic assumption here is that the optimal ML model for matching subhalos between high-and low-resolution simulations is very close to that between hydrodynamic and DMO simulation.However, the optimal set of parameters could be somewhat different between the two pairs of simulations.Therefore, in order to acquire a solid performance, we impose a very strict criteria as we build our training set, selecting only the subhalo pairs that are bijectively matched between two simulations (Section 2.1), and those with prediction probabilities higher than 0.7 (Appendix A).
Here, γ, w mass , w I , and b are parameters that we have trained using TNG100-1 and TNG100-1-Dark pairs, choosing parameters that optimally predict the "true" pair in the existing catalog provided by the TNG Collaboration. 3 We calculate the distance, |x |, in each snapshot I, for all progenitors along the main branch of the merger tree.If a subhalo in the DMO simulation has no progenitors in a certain snapshot (e.g., when the subhalo formed at a later epoch), we impose a penalty value of f (I) = p to that pair.The value of p is also determined during the training phase.The score S close to zero means that the two subhalos in two simulations formed at similar epoch, and have almost identical growth histories in terms of their DM masses and positions.
In the left panel of Figure 2, we present the accuracy of matched pairs, defined as the fraction of predictions that are identical to those from the existing catalog, tested between the hydro and DMO simulations.For verification purposes, we allocate 60% of the simulation volume to the training set and 20% each to the validation and test sets.The results from the test set indicate that, for the pairs predicted by our matching algorithm, 99.6% of the central subhalos are matched identically with the catalog provided by the TNG Collaboration.However, the satellite subhalos exhibit relatively lower accuracy, with 87.5% of subhalo pairs in TNG100 and 84.3% in TNG50 being correctly matched.It is noteworthy that most of the incorrect matches are attributed to subhalos lacking counterparts in the existing matching catalogs.Our model shows a limitation in eliminating those subhalos from the catalogs, occasionally pairing with subhalos considered to have a similar growth history.A similar, possibly lower level of accuracy is expected when the model is applied to actual datasets, comparing the high-and low-resolution simulations.In Section 4.1, we further discuss that most of those pairs we match originate from a similar dark matter patch in the initial condition.However, for some pairs, the extent of the similarity -measured by the fraction of shared DM particles -is insufficient compared to those in the existing catalog by the TNG Collaboration.
During the application phase, we utilize the trained machine to match subhalo pairs between high-resolution (TNG100-1) and low-resolution (TNG100-2) simulations.By pairing the two subhalos with the highest similarity, we construct the matching catalog between the simulations with two different resolutions.We apply the same matching process for TNG50-1 and TNG50-2.The fraction of subhalos that have a counterpart in the other simulation is shown in Figure 2. We are able to identify matching subhalos for over 98% of those with M DM > 10 10 M ⊙ in TNG100-1 or TNG100-2 (or M DM > 10 9 M ⊙ in TNG50-1 or TNG50-2).In contrast, the fraction drops dramatically for subhalos close to the halo mass threshold of 2 × 10 9 M ⊙ or 33 DM particles for TNG100 (or, 2 × 10 8 M ⊙ or 55 DM particles for TNG50).This drop occurs mainly due to the cases where one of the subhalo pairs falls below the mass threshold.Therefore, for most of the subsequent analyses we employ the matched pairs in which DM masses in both of the simulations exceed 3 × 10 9 M ⊙ or 50 DM particles for TNG100 (or, 3 × 10 8 M ⊙ or 82 DM particles for TNG50; the gray vertical lines in Figure 2).We refer the interested readers to Appendix A for the detailed explanation of the matching process.

Machine Learning Model To Correct Low-resolution Subhalos
Because the properties derived from lower-resolution galaxies are biased, we compensate for the resolution difference with a model trained on the high-resolution simulations.We employ LIGHTGBM (Ke et al. 2017), a modified version of the gradient tree boosting algorithm, and construct three ML models -each predicting stellar mass, gas metallicity, and gas mass.In the algorithm, rather than training a single decision tree, an ensemble of decision trees is iteratively trained.Each subsequent tree emphasizes correcting the mistakes of its predecessor by assigning greater weight to exam-ples that the previous tree found challenging to predict.However, unlike traditional gradient boosting algorithms which build trees level (depth)-wise, LIGHTGBM employs a leafwise growth strategy, splitting the leaf with the maximum loss reduction.For an in-depth discussion of LIGHTGBM and the tree boosting algorithms, see Ke et al. (2017).
For our input data, we utilize 23 features for each snapshot, encompassing both the subhalo properties and their host halo properties.The features used are detailed in Table 2.It is worth to note that, inspired by the previous studies that either preprocessed the temporal history of a target halo into tabular data (Jo & Kim 2019), or those that used the merger trees themselves as inputs (McGibbon & Khochfar 2022;Jespersen et al. 2022), we extract all 23 features for all the main progenitors of the target subhalo.That is, we use a total of 2300 features (23 features × 100 snapshots) as inputs for a single subhalo.For the machine training we use the matching subhalo catalogs from TNG100-1 and TNG100-2 (Section 2.2).We divide the subhalo pairs into 80% for a training set and 20% for a test set.A 3-fold cross-validation method is used for the hyperparameter tuning during the training.
As we will discuss in Section 3.1, a significant number of subhalo properties at the lower-mass end of the subhalo sample have null values, due to the resolution limit of the simulation.To properly handle our data with frequent zero-valued properties (so-called "zero-inflated" data), we adopt a twostep approach.We develop a binary classification model that predicts whether a property (e.g., the gas mass of the subhalo) is zero or not.For this model, we use the AUC-ROC score (Area Under the Curve -Receiver Operating Characteristic) as a metric.Then, for the subset that the initial model identifies as possessing a physical component, we apply a regression model to predict the actual value of the property, using the mean squared error (MSE) as a performance metric.

Galaxy-by-galaxy Resolution Study On Key Halo Properties
In this section, we analyze the differences in subhalo properties between the high-resolution simulation (TNG100-1) and its low-resolution counterpart (TNG100-2; see Section 2.2).In Figure 3, we show how three key halo properties that are primarily influenced by DM and gravity -DM mass, velocity dispersion, and peculiar velocity -differ between TNG100-1 and TNG100-2.These properties show very similar values even when the resolution changes, although the subhalos' velocity dispersions in TNG100-1 are ∼ 4% higher than in TNG100-2, especially at the high-mass end.This agreement suggests that these subhalo properties exhibit ro-

Feature name Description
Total DM mass of the subhalo Total stellar mass of the subhalo Total gas mass of the subhalo Total black hole mass of the subhalo Maximum value of the spherically-averaged rotation velocity One-dimensional velocity dispersion of all the member particles/cells Magnitude of the subhalo total spin VelSpinAngle Cosine similarity between peculiar velocity and spin Magnitude of the subhalo peculiar velocity Mass-weighted average metallicity of the star particles Mass-weighted average metallicity of the gas cells within twice the stellar half mass radius Mass-weighted average metallicity of the gas cells within the maximum circular velocity radius Radius containing half of the DM mass of the subhalo Radius containing half of the stellar mass of the subhalo Radius containing half of the gas mass of the subhalo Sum of the star formation rates (SFRs) of all the gas cells in the subhalo DM mass of the host halo containing the target subhalo Stellar mass of the host halo containing the target subhalo Gas mass of the host halo containing the target subhalo Virial radius of the host halo containing the target subhalo Mass-weighted average metallicity of the host halo containing the target subhalo Distance between the subhalo and center of the host halo Mass fraction of the main progenitor among all progenitors Table 2.The list of features used in our ML model to correct the resolution biases of low-resolution subhalos to match their high-resolution counterparts.We use these features along 100 snapshots of the main branch of each subhalo's merger tree.The features marked with a superscript "(a)" follow the same definition described in the TNG webpage, https://www.tng-project.org/data/docs/specifications/.For the ablation study in Section 4.2, we classify some feature into the following types: the features marked with "(b)" can also be obtained in the DMO simulations, and the features marked with "(c)" contain local environmental information.
bust convergence to changes in DM mass resolution. 4We conduct the same analysis for the TNG50-1 and TNG50-2 subhalo pairs, as shown in Figure 4. We find similar agreements with the varying resolution.Although we observe a robust "resolution-wide" convergence for those three subhalo properties, there are noticeable systematic discrepancies in baryonic characteristics between different resolutions.In Figure 5, we compare four baryonic properties: stellar mass, gas metallicity, gas mass, and black hole mass.First, in panel (a), we observe that subhalos in TNG100-1 have ≲ 0.5 dex higher stellar mass than those in TNG100-2 across the entire stellar mass range.The difference in stellar masses are attributed to the (subgrid) baryon physics models in TNG, which exhibits a higher star formation rate and stellar mass in a simulation with higher resolution (Pillepich et al. 2018a,b).Our results align with the earlier studies in which the stellar-to-halo mass ratio was found to be a factor of two higher in TNG100-1 compared to TNG100-2 for halos with M DM ∼ 3 × 10 11 M ⊙ (Pillepich et al. 2018a).Note that some subhalos which lack a stellar component in either of two simulations are excluded.In the upper marginal panel, the black solid line represents the fraction of subhalos of a given stellar mass in TNG100-1 that also have stellar particles in their TNG100-2 counterparts.One can see, for example, that less than half of the subhalos with M star ∼ 10 7 M ⊙ in TNG100-1 have their counterpart subhalos in TNG100-2 that contain non-zero stellar particles.Conversely, the black dashed line in the right marginal plot indicates the fraction of subhalos in TNG100-2 that also have stellar particles in TNG100-1.Since it is easier for the subhalos at better spatial resolutions to host a gas cell that exceeds the star formation density threshold, many subhalo pairs have stellar particles only in TNG100-1.
The difference in star formation rates results in higher gas metallicities in the high-resolution versions of the two simulations, as depicted in panel (b) of Figure 5. Significant scatters exist between the subhalo pairs, suggesting that the   3, but for TNG50-1 (high-resolution run) and TNG50-2 (low-resolution run).Again, the three properties show good convergence with the varying resolutions.metallicity variations are influenced not only by the resolution differences, but also by factors beyond the resolution biases, such as the baryon cycle in the galaxies (Oppenheimer & Davé 2008).Here, the gas metallicity is calculated in a region within twice the stellar half-mass radius.It is noteworthy that for most of the subhalos in TNG100-1 with gas metallicities of log[Z gas /Z ⊙ ] ≲ −1.5, their counterpart subhalos in TNG100-2 have predominantly null values in gas metallicity, due to resolution limit.In these halos, either the gas components are absent, or the stellar components that define the stellar half-mass radius are absent.We note that similar trends are found in panels (a) and (b) of Figure 6 when comparing TNG50-1 and TNG50-2.
Next, in panel (c) of Figure 5 we compare the gas masses of matching subhalos.While gas mass shows a relatively good agreement between the two simulations for subhalos with high gas mass (M gas > 10 10 M ⊙ ), the gas mass in TNG100-2 tends to be higher than that in TNG100-1 for subhalos with M gas ∼ 10 8.5 M ⊙ , seen as a "kink" in that mass range.For these halos the gravitational potential well is relatively shallow (M DM ∼ 10 10 M ⊙ ).Therefore, stellar feedback and other baryonic physics can expel gas from the gravitational potential well of the halo (Schaller et al. 2015).The higher star formation rate and the associated increase in stellar feedback in the high-resolution simulation (TNG100-1) results in a lower gas mass than that in the low-resolution simulation The red dot-dashed line in each panel indicates a perfect convergence with respect to numerical resolution.Due to the higher star formation rates in TNG100-1 with higher resolution, both the stellar mass and the gas metallicity in TNG100-1 are higher than those in TNG100-2.Note that some subhalos are not included in these histograms since they lack either a stellar, gas, or black hole component in one of the two simulations.The black solid/dashed lines in the marginal plots at the top and right of each panel indicate the fraction of subhalos for which a given component is present in the counterpart simulation.The blue dot-dashed line in the top marginal plot of the panel (c) depicts the fraction of subhalos that have any stellar particle in TNG100-1.Similarly, the orange dotted line shows the fraction of subhalos that have a positive stellar mass in their TNG100-2 counterparts for a given M gas in TNG100-1.Here one can observe that, in the subhalos of M gas ∼ 10 8.5 M ⊙ , star formation occurs only in TNG100-1 but not in TNG100-2.This entails a lack of stellar feedback in TNG100-2 that would have otherwise expelled the gas.Consequently, the gas mass in TNG100-2 is higher than TNG100-1 at M gas ∼ 10 8.5 M ⊙ .See Section 3.1 for more in-depth discussion.
(TNG100-2).To verify this scenario, with a blue dot-dashed line, we show the fraction of subhalos which have non-zero stellar masses in TNG100-1.And with an orange dotted line we plot the fraction of subhalos which have non-zero stellar masses in their TNG100-2 counterparts for a given M gas in TNG100-1.Here one observes that in most subhalos of M gas ∼ 10 8.5 M ⊙ , stars form only in TNG100-1, but not in TNG100-2.This entails a lack of stellar feedback in TNG100-2 that would have otherwise expelled the gas.Consequently, the gas mass of a subhalo in TNG100-2 tends to be higher than that in TNG100-1 at M gas ∼ 10 8.5 M ⊙ .However, we note that this "kink" is less pronounced in panel (c) of Figure 6 where we compare TNG50-1 and TNG50-2.It is primarily due to the better numerical resolution of the lowerresolution simulation.The two simulations show relatively good convergence for M gas ≳ 10 7.5 M ⊙ . .Same plot as Figure 5, but for TNG50-1 (high-resolution run) and TNG50-2 (low-resolution run).While we observe similar trends with the change in resolution as those presented in Figure 5, the differences in gas masses are less pronounced.See Section 3.1 for more details.
At the low-mass end of M gas < 10 8 M ⊙ , however, stars do not form in most subhalos in either TNG100-1 or TNG100-2.Having an insufficient number of gas parcels (≲ 10 parcels in TNG100-2), the gas mass of a subhalo in TNG100-2 tends to be lower than that in TNG100-1 at M gas < 10 8 M ⊙ , or the subhalo may even lack a gas component entirely. 5astly, there is a resolution-wide agreement in the mass of MBH of a subhalo as can be seen in panel (d) of Figures 5  and 6.It shows similar black hole masses in both the highand low-resolution subhalos, except when near the resolution limit of the low-resolution simulation, M BH ∼ 10 7.5 M ⊙ (for TNG100-2) and ∼ 10 6.5 M ⊙ (for TNG50-2).For subhalos harboring an MBH, most of their counterparts in the lowresolution simulation also have an MBH, unlike the trend observed in gas and stellar components.This can be seen in the marginal plot at the top of panel (d), which shows the fraction of MBH-harboring subhalos in TNG100-1 (TNG50-1) for which an MBH is also present in its matching halo in TNG100-2 (TNG50-2).This behavior can be attributed to the TNG physics model that implements an MBH seed for all halo above a certain mass threshold (Weinberger et al. 2017).Since the seeding mechanism depends solely on the DM properties -which show a great convergence as illustrated in Figures 3 and 4 -the black hole mass in the TNG simulations exhibit robust numerical convergence with resolution.We caution the readers about the difference between TNG100 (Figures 5) and TNG50 (Figures 6) observed at the low-mass end.While subhalos in TNG100-1 tend to harbor less massive MBHs than those in TNG100-2 do, subhalos in TNG50-1 tend to have more massive MBHs than those in TNG50-2 do.This is because TNG100-1 and TNG100-2 use different MBH seed masses (M seed ), whereas TNG50-1 and TNG50-2 adopt the same seed mass, M seed = 8 × 10 5 h −1 M ⊙ (Weinberger et al. 2017).

Galaxy-by-galaxy Resolution Study On Radial Profiles of Halos
After compiling the subhalo-matching catalog between two simulations of different resolutions, we are able to conduct a galaxy-by-galaxy comparison study on the subhalos' radial profiles.In Figure 7, we examine the differences in the spatial distribution of particles within subhalos by plotting the radial profiles of enclosed masses in TNG100-1 and TNG100-2 (top row). 6We also plot the mass ratios in each radius bin (bottom row; dividing TNG100-2 by TNG100-1).We group the subhalos into three mass bins in units of M DM / M ⊙ as [10 12.5 , 10 13.5 ], [10 11.5 , 10 12.5 ], and [10 10.5 , 10 11.5 ].Each is denoted in the figure as ∼ 10 13 , ∼ 10 12 , and ∼ 10 11 M ⊙ , respectively.We also limit our analysis to central galaxies, which is the primary subhalo in its FoF group.Subhalos that possess no stellar particles, constituting 10% of the lowest mass bin, are excluded from this analysis.For reference, in each panel, we display vertical dotted lines representing three times the gravitational softening lengths of TNG100-2, divided by the mean virial radius of the halos in each mass bin, 3 × ε z=0 DM, star /⟨R 200 ⟩ (Pillepich et al. 2019). 7n the first column of the figure, the DM mass shows strong convergence between the two simulations, except near the resolution limit (r/R 200 ≲ 0.01, 0.02, 0.05 for each mass bin, or r ≲ 4 kpc), where the density in the low-resolution simulation is approximately 60% of that in the high-resolution simulation.This result is in line with previous studies, such as Jing & Suto (2000).The baryon components such as stars and gas display more significant differences than what is seen in the DM profiles.In the second column, the deficit in stellar mass in the low-resolution simulation is more pronounced in the inner region, within a few times of the gravitational softening length, compared to that in DM mass.Moreover, the median ratio in the outer region situated ∼ 0.6, showing clear deficit of stellar particles in the low-resolution galaxies with M DM ∼ 10 13 and ∼ 10 12 .This trend can be observed more clearly in the lowest mass bin, M DM ∈ [10 10.5 , 10 11.5 ] M ⊙ , where the median value of the stellar profile ratios is lower than 0.4 at all radii.
We label gas parcels with T < 10 4.5 K as cold gas and the remainder as hot gas, following Rohr et al. (2023).For galax-ies with M DM ∈ [10 12.5 , 10 13.5 ] M ⊙ , the median ratio of the cold gas masses between the two simulations is below unity at all radii, despite a large variance (see the third column of Figure 7).The deficit of gas is more pronounced near the central galaxies (r/R 200 ≲ 0.2) whereas the circumgalactic medium (r/R 200 ≳ 0.2) shows better convergence.In contrast, galaxies with M DM ∼ 10 12 M ⊙ and ∼ 10 11 M ⊙ exhibit better agreement with the resolution change; their median cold gas mass ratio is close to unity.We observe the opposite result for the hot gas distribution (fourth column).The galaxies in the highest mass bin (M DM ∼ 10 13 ) show good agreement with the resolution change, but those in lower mass bins tend to have decreased hot gas density near the central galaxies (r/R 200 ≲ 0.1 for the galaxies with M DM ∼ 10 12 M ⊙ and r/R 200 ≲ 0.3 for galaxies with M DM ∼ 10 11 M ⊙ ) in the low-resolution simulation. 8In the highest mass bin, AGN feedback from supermassive black holes plays important role in suppressing the cold gas mass (Davé et al. 2020).We speculate that gas content in the low-resolution is more easily heated by AGN feedback, as reported by Bourne et al. (2015).In the lower mass bins, where stellar feedback dominates over AGN feedback, the reduced stellar masses in the low-resolution galaxies could lead to the decrease in hot gas content in the inner region.

Mitigating the Resolution Biases: Correcting
Low-resolution Subhalos With Machine Learning ML techniques have been used to paint baryonic properties onto DM halos in DMO simulations after it learned the relationship between the DM properties and baryonic properties of galactic halos (e.g., Jo & Kim 2019;Lovell et al. 2022;Jespersen et al. 2022).Inspired by these pioneering studies, we aim to "correct" the baryonic properties of halos found in low-resolution simulations, by constructing a regression model that learns the relationship between the halo properties in a low-resolution simulation and in a high-resolution simulation (see Section 2.3).In this section, we illustrate the performance of the ML model we have built.Then, as a first test, we apply the ML model to the subhalos in TNG300-1, which "corrects" their baryonic properties to match those in a higher resolution benchmark, TNG100-1.
In the left panel of Figure 8, we present the machine's predictions of subhalo stellar masses.The machine predicts a subhalo stellar mass in a high-resolution simulation TNG100-1, solely based on its matching pair's stellar mass in a low-resolution simulation TNG100-2.Our ML model is shown to predict a subhalo's stellar mass in TNG100-1 with great accuracy, using only the information from TNG100- 1,cold gas (r) 1,hot gas (r) Figure 7. Top row: Median enclosed-mass profile of "central" galaxies in various sample mass ranges at z = 0. Solid lines denote subhalos in TNG100-1 (high-resolution run), while dashed lines represent those in TNG100-2 (low-resolution run).From left to right, the median enclosed mass profiles for the DM, stellar, cold gas, and hot gas components are depicted, respectively.The bottom row presents the median values of the mass ratios in each radius bin (dividing TNG100-2 by its TNG100-1 counterpart).The shaded regions indicate 16 th − 84 th percentile ranges.Vertical dotted lines in each panel show the three times the gravitational softening length of TNG100-2 with respect to mean virial radius of each mass bin, denoting the resolution limit of the lower-resolution simulation in each mass bin.In general, massive subhalos show better agreement with resolution changes, but there is a noticeable deficit of cold gas in the low-resolution galaxies with M DM ∼ 10 13 M ⊙ .For a detailed discussion on this plot, see Section 3.2.

Gas mass
Before correction (Figure 5) 0.477 0.043 0.1877 0.902 After correction by gaussian process regression 0.364 0.031 -0.0000 0.920 After correction by ML (Figure 8) 0.269 0.022 -0.0025 0.957 Table 3.Comparison of performance metrics for three subhalo properties: stellar mass, gas metallicity, and gas mass.Here we compare the root mean squared errors (RMSE), the mean absolute percentage errors (MAPE), bias, and Pearson correlation coefficient (ρ) when comparing the matching subhalos' properties between TNG100-1 and TNG100-2.The first method "before correction" means the pure, inherent resolution effect seen in Figure 5, between two simulations of different resolutions.The second method is the error after we have applied a gaussian process regression model to compensate for the resolution biases.Errors in stellar mass and gas metallicity are significantly reduced when the gaussian process regression model is applied.The ML model further reduces the error across all three properties, as illustrated in Figure 8. See Section 3.3 for more details. .The comparison of baryonic subhalo properties between the true values in TNG100-1 (high-resolution run) and the machine-predicted values based on the properties in TNG100-2 (low-resolution run).Our ML model predicts a subhalo's stellar mass, gas metallicity, and gas mass in the high-resolution simulation TNG100-1 (from left to right), based solely on its matching pair's properties in the low-resolution simulation TNG100-2.Using the ML technique, we successfully "correct" the resolution biases exhibited in Figure 5 (notice that the white contours in Figure 5 are copied to this figure as red contours).The range of the color bar is scaled down from that of Figure 5, by the ratio of the number of subhalos in the test set to the total number of subhalos.See Section 3.3 for more details. .Correction of subhalo properties at z = 0 in TNG300-1 (low-resolution run) using an ML model trained on TNG100-1 (high-resolution run) -TNG100-2 (low-resolution run) subhalo pairs.The left panel displays the stellar mass -halo mass (SMHM) relation for subhalos in TNG100-1 and TNG300-1, and for the TNG300-1 subhalos after ML-based corrections.The right panel shows the stellar mass -gas metallicity relation (MZR) for the same subhalos.Shaded areas indicate the 16 th − 84 th percentile ranges.While a systematic discrepancy exists between TNG300-1 (blue triangles) and TNG100-1 (orange squares), the corrected TNG300-1 subhalos (green circles) align very closely with the TNG100-1 subhalos, both in the SMHM relation and in the MZR.Interestingly, the ML model reproduces the scatter of the SMHM relation seen in the TNG100-1 subhalos (i.e., the green-shaded region nearly overlaps the orange-shaded region).See Section 3.3 for more details.
2. This is remarkable especially when compared to Figure 5 (or red contours in Figure 8), which clearly demonstrates the systematic discrepancy between high-and low-resolution simulations.Notice that the white contours in Figure 5 are copied to Figure 8 as red contours in order to guide the eyes and compare the two figures.Similar to Figure 5, in the upper marginal plot, the black solid line represents the fraction of subhalos of a given stellar mass in TNG100-1 that also have stellar particles in their "corrected" TNG100-2 counterpart halos.For example, nearly all the subhalos with M star ∼ 10 7 M ⊙ in TNG100-1 have their "corrected" counterpart subhalos in TNG100-2 containing some stellar particles.This finding is in stark contrast to Figure 5, where only less than half of these TNG100-1 halos had counterpart TNG100-2 subhalos with any stars.While a low-resolution simulation like TNG100-2 tends to lack stellar particles towards the lower end of the stellar mass range, our prediction model is shown to be able to "correct" the stellar masses of these subhalos across nearly the entire range of stellar masses above ∼ 10 7 M ⊙ .
To quantitatively evaluate the performance of our model, in Table 3 we present the root mean squared error (RMSE), mean absolute percentage error (MAPE), bias, and Pearson correlation coefficient (ρ), when comparing the matching subhalos' properties between TNG100-1 and TNG100-2. 9 The first column "before correction" means the pure, inherent error due to the resolution biases between TNG100-1 and TNG100-2, as can be clearly seen in Figure 5.The second column is the error after we have applied a gaussian process regression model to compensate for these resolution biases. 10ne can see that the errors in stellar mass are greatly reduced even with this simple fix.Then, the third column shows that our ML model further reduces the RMSE and MAPE in stellar mass by ∼ 50%, as is seen in Figure 8.
For the gas metallicity in the middle panel of Figure 8, we again see that our ML model successfully "corrects" the properties of low-resolution subhalos to match those of the high-resolution counterparts.However, the scatter is larger compared to that in the stellar mass panel, with a MAPE value about three times as high as that for stellar mass (0.059 vs. 0.019).The predictions also exhibit a steep lower bound at log [Z gas /Z ⊙ ] ∼ −1.5, indicating that the model complex-9 Errors are calculated for all the subhalos for which both the stellar mass predicted from TNG100-2 and the actual stellar mass in the two simulations are nonzero.For these calculations we use logarithmic values, The same approach is adopted for the gas metallicity (middle panel of Figure 8) and gas mass (right panel).Although we represent gas metallicity using the unit log [Z gas /Z ⊙ ], we compute the metrics using log [Z gas ] to prevent MAPE from diverging when log [Z gas /Z ⊙ ] is close to zero.
ity is significantly lower compared to those of other models.
In addition, about 30% of the subhalos with log [Z gas /Z ⊙ ] < −1.5 in TNG100-1 have their "corrected" counterpart subhalos in TNG100-2 with a wrongly-predicted, null metallicity value (see the upper marginal plot).The metallicity prediction is particularly challenging because there is a large scatter in the training set (see panel (b) in Figure 5), and because most TNG100-1 subhalos with log [Z gas /Z ⊙ ] < −1.5 have their counterpart TNG100-2 subhalos with null gas metallicity due to its resolution limit (see the upper marginal plot of panel (b) in Figure 5).Yet, the machine still performs significantly better than a gaussian process regression, reducing the RMSE and MAPE by ∼ 33% (see Table 3).
Our ML model for gas mass also performs well, as shown in the right panel of Figure 8.Most notably, the "kink" seen in panel (c) of Figure 5 has largely disappeared.The performance slightly deteriorates compared to the stellar mass model, exhibiting a larger RMSE and MAPE in Table 3.In the right marginal plot, the fraction, which denotes the accuracy of the model, deviates from unity for subhalos with M gas ≲ 10 8 M ⊙ .Nevertheless, when compared to the gaussian process regression approach, the ML model reduces the MAPE by ∼ 30% (see Table 3).
We conclude that ML techniques can help predict the properties of a subhalo in a high-resolution simulation based solely on its properties in a low-resolution simulation.The prediction is particularly good for the stellar mass and gas mass of subhalos, thereby mitigating resolution biases.In Appendix B, we present results for both central and satellite subhalos, which show no notable difference in performance, except in the prediction of gas mass.Now we apply this ML model to the subhalos in a large cosmological simulation volume (TNG300-1) and "correct" their properties to match those in a higher resolution benchmark, TNG100-1.TNG300-1 has an identical resolution as TNG100-2, but with a larger (302.6 cMpc) 3 box size.Therefore, we can apply our trained ML model to the subhalos in TNG300-1, adjusting their subhalo properties to those in TNG100-1.In Figure 9, we show two important relationships -the stellar mass -halo mass (SMHM) relation and the stellar mass -gas metallicity relation (MZR) -from the two flagship TNG simulations with different box sizes.11First, it can be observed that a systematic discrepancy exists between TNG300-1 (blue triangles) and TNG100-1 (orange squares) due to the fact that the resolution of TNG300-1 is 8 times worse in mass compared to that of TNG100-1.The deficit in stellar mass in a low-resolution simulation is exactly what is shown in panel (a) of Figures 5 and 6.Since both the stel-lar mass and the gas metallicity of a subhalo tend to increase with resolution (i.e., panels (a) and (b) of Figures 5 and 6), a higher resolution simulation TNG100-1 yields a slightly higher MZR than TNG300-1 does.
Second, the green circles in Figure 9 display the results from our ML model where properties of the subhalos in TNG300-1 are "corrected" as if they are from a higher resolution simulation.The SMHM relation of these "corrected" subhalos (green circles) in the left panel becomes nearly identical to that in TNG100-1 (orange squares), except on the high-mass end (M DM > 10 14 M ⊙ ), where the amount of the training set is limited due to a smaller (106.5 cMpc) 3 box size.The "corrected" MZR also closely matches that of TNG100-1, except on the high-and low-mass ends.Interestingly, the ML model is shown to be able to reproduce the same scatter in the SMHM relation, with a similar 1-sigma shade (68% confidence interval) as the TNG100-1's relation (i.e., the green-shaded region overlaps the orange-shaded region).Achieving this has been often considered unlikely when only the features from a DMO simulation were used as inputs to an ML model (Agarwal et al. 2018).However, because we directly provide information on baryonic properties, the model does not heavily rely on their halo mass at z = 0 alone, but also uses various other inputs to replicate the galaxy-to-galaxy scatter seen in a hydrodynamic simulation.And yet, we fail to reproduce the same amount of scatter when predicting the MZR.This finding is in agreement with the ML model's relatively weak performance in predicting metallicity, as demonstrated in Figure 8.

Further Verification of the Matching Algorithm
We have demonstrated that the subhalo in our matching catalog have nearly identical peculiar velocities, as well as similar DM masses and velocity dispersions in the highresolution simulation, TNG100-1 and TNG50-1, and its lowresolution counterpart, TNG100-2 and TNG50-2 (see Sections 2.2 and 3.1).In this section, we further test our matching algorithm by tracking the DM particles of the matching subhalo pairs.Although we only identify the matching pairs based only on the positions of subhalos along their evolution histories (see Sections 2.2), it is expected that DM particles in these subhalo pairs indeed come from the same Lagrangian patch in the initial conditions.We trace each DM particle of the subhalo in TNG100-1 back to the initial condition at z = 127, and locate the corresponding particles by finding the nearest particles in position in the initial condition of TNG100-2.Finally, we check whether these corresponding particles indeed belong to the matched TNG100-2 subhalo in our matching catalog.
In the left panels of Figure 10, we present the fraction of DM particles in a TNG100-1 subhalo that have correspond-ing particles inside its matching subhalo in TNG100-2.For the practical purpose, we limit our investigation to a subset of 50,000 subhalos.The fractions are plotted against the smaller of the two DM masses of each matched subhalo pair.For each matched pair of subhalos, we take the DM mass that is smaller and use that value for the plot.On average, a TNG100-1 subhalo shares 70% of its DM particles with its counterpart halo in TNG100-2.Based on the fractions of the pairs from the catalog provided by the TNG Collaboration, we consider pairs with a fraction below 0.4 to be insufficient for matching (see Figure 14).
The fractions of satellite subhalos are lower than those of central subhalos (orange and blue density histograms in the marginal plots of Figure 10).This finding aligns with the limited accuracy of our matching method for satellites during the verification step (see Figure 2).Furthermore, 0.89% of centrals and 8.86% of satellites (2.38% in total) have a fraction below 0.4, which is considered insufficient for a match in the matching method based on particle IDs.Therefore, it is apparent limitation of our study that including those subhalos in the analysis.Nevertheless, only 0.024% of the subset have a fraction of 0.0, indicating that the two matched pairs originated from completely unrelated patches in the initial conditions.
The subhalo pairs with an insufficient number of shared DM particles complicate the 'galaxy-by-galaxy' comparison approach.The dark matter halos of these pairs are too dissimilar to effectively study the impact of resolution changes on baryon physics.Furthermore, the chaotic behavior inherent in cosmological simulations introduces intrinsic scatter between the simulations; galaxies and circumgalactic media can diverge from each other by a tiny shift in the early universe, even for the pairs with sufficient shared DM particles (Genel et al. 2019;Keller et al. 2019;Borrow et al. 2023b).These studies show that minute perturbations in the early universe can lead to galaxy-level differences in cosmological simulations, especially in the presence of feedback.Late major mergers amplify this stochasticity (Keller et al. 2019), making satellites in dense environments particularly challenging to compare.Similarly, Grand et al. (2021) show that the trajectories of matched satellites at two different resolutions can diverge after passing the second pericentric passage.Consequently, satellite subhalos in different simulations could be highly dissimilar.Therefore, comparing subhalos across simulations necessitates examining a substantial number of pairs to draw meaningful conclusions.A conclusion drawn from a single galaxy pair is susceptible to being severely impacted by these chaotic effects.
We have demonstrated that most of the subhalo pairs originating from almost the identical regions in the initial conditions can be identified by only using the mass and positions of main progenitors.In Appendix C, we compare the fractions .The fraction of DM particles in a TNG100-1 subhalo that have corresponding particles in its matching subhalo in TNG100-2.A fraction of 1.0 suggests that all DM particles in a subhalo at z = 0 are present in its counterpart TNG100-2 subhalo.We plot only the subhalos that exceed the mass threshold for the matching process, 3×10 9 M ⊙ (see Section 2.2).Due to computational constraints, only 50,000 subhalos are represented in this plot.The marginal plots on the right and at the top display density histograms, showing that most matched pairs share approximately 70% of DM particles.Orange and blue histograms represent central and satellite subhalos, respectively, with the satellites exhibiting lower fractions.See Section 4.1 for more details.
with those obtained from the subhalo pairs identified between DMO and hydrodynamic simulations.These fractions are higher than those in comparisons between TNG100-1 and TNG100-2, which suggests that the resolution difference has a greater impact on the similarity between the subhalo pairs than the presence of baryons.

Feature Importances and the Ablation Study
We have shown that an ML technique LIGHTGBM, a modified version of the gradient tree boosting algorithm, can be used to mitigate the resolution biases by correcting the subhalo properties in a low-resolution simulation to match those in a high-resolution simulation (see Sections 2.3 and 3.3).An inherent advantage of using decision tree-based regression models such as LIGHTGBM is the ability to identify the key attributes that significantly influence the prediction of the output, among numerous input features.In contrast to a deep learning model, where it is often a challenge to understand the specific procedures that led to a particular decision, the ability to explain the results is an advantage of a LIGHTGBM model.This allows us to study which features are important to determine target properties.We study two types of feature importance of our ML model in Figure 11, gain (top row) and split (bottom row).Gain represents the improvement in accuracy (or the reduction in loss) brought about by a feature.On the other hand, split indicates how many times the feature is used for decision-making within the gradient boosting decision tree.Since we feed a total of 2300 features (23 features × 100 snapshots) as inputs to the ML model, the feature importances can be computed for each of 23 features at each of 100 epochs.Hence, Figure 11 shows the computed importances of the four features with the highest gain values in each prediction model at 100 epochs.
In the left column of Figure 11, one can see that in order to predict the stellar mass of a subhalo in TNG100-1, the ML model relies -obviously -on the stellar mass of its counterpart subhalo in TNG100-2 at z = 0 (SubhaloMassType4 in Table 2; green triangles).From z ∼ 0.5 to z = 0, the machine gives increasing weight to the stellar mass, considering it as a crucial component for predictions.However, other features also contribute to the stellar mass prediction.For example, the gain values of the subhalo's maximum circular velocity (SubhaloVmax in Table 2; pink triangles) or the host halo's virial radius (Group R TopHat200; gray crosses) are close to or above ∼ 10 5 at z ∼ 1.Even more interestingly, the split value of the host halo's gas mass at z ∼ 3 (GroupMassType0; purple squares) is equal to or greater than that of the subhalo's stellar mass at z = 0. Our ML model seems to reflect the baryonic physics at cosmic noon, when the cosmic star formation rate reached its peak due to mergers and other galaxy-galaxy interactions.
Given that gas metallicities are closely tied to stellar masses, the feature importances for the gas metallicity prediction model show similar results, as seen in the middle column of Figure 11.The model again emphasizes the subhalo's dynamical mass (SubhaloVmax; pink triangles) and the host halo's virial radius (Group R TopHat200; gray crosses) at high z.Since there exists a large scatter when comparing the gas metallicities between the low-and high-resolution simulations (see panel (b) of Figures 5 and 6), the model does not rely as much on the the subhalo's metallicity in TNG100-2 to predict that in TNG100-1.
In contrast to the two previous models, for the gas mass prediction, the ML model prefers to use features closer to z = 0.It bases its predictions on the subhalo's gas mass (SubhaloMassType0; orange triangles) and DM mass (SubhaloMassType1; blue circles), and its host halo's gas mass (GroupMassType0; purple squares).This suggests that the gas mass of the subhalo is more closely related to its state at z = 0 rather than to properties of its progenitor in the past.These trends are reasonable in that the stellar mass and the gas metallicity are integrated properties over several billion years in the past, while the gas mass can be considered as an instantaneous property. .Feature importances of our ML model that predicts a subhalo's stellar mass, gas metallicity, and gas mass in the high-resolution simulation TNG100-1 (from left to right), based on its matching pair's properties in the low-resolution simulation TNG100-2 (Sections 2.3, 3.3, and Figure 8).The number shows the degree to which the model depends on each input feature.The thin, semi-transparent lines show the raw data, while the thick lines represent the smoothed data.We present two types of feature importances: gain (top row) and split (bottom row).Since we provide a total of 2300 features (23 features × 100 snapshots) as inputs for a single subhalo, the feature importances can be computed for each of 23 features at each of 100 epochs.The computed numbers are presented for the four features with the highest gain values in each prediction model at 100 epochs.Notably, when predicting the stellar mass and gas metallicity of subhalos, the model heavily relies on the information from z ≳ 1. See Section 4.2 for more details.
Readers should note that features with higher importances do not necessarily mean that these features are essential for the prediction.For example, the models highly rely on the subhalo maximum circular velocity and host halo virial radius when predicting a subhalo's stellar mass and gas metallicity.Although the models prefer to use these features as proxies for the dynamical mass of the subhalo and host halo, the models could achieve comparable performance without these parameters by utilizing other features with similar physical meanings (e.g., subhalo DM mass).Interestingly, McGibbon & Khochfar (2022) show similar trends when predicting the stellar masses and stellar metallicities of galaxies out of DM halos.In their model, the feature importances of velocity dispersion and maximum circular velocity are more than five times higher than that of DM mass, with a similar peak observed at z ∼ 1.The pronounced preference for the virial radius of the host halo and the maximum circular velocity dispersion of subhalos, in comparison to other features related to dynamical mass, requires further investigation to elucidate the (possible) physical basis of this phenomenon.
To further verify the higher feature importances of highz progenitors, we conduct an ablation study, as presented in Figure 12.We study the importance of input features from early epochs when predicting the stellar mass in a highresolution simulation, by removing the high-z features in the input of our ML models.For example, one can observe that training the model with only a single snapshot (i.e., only 23 features at z = 0, instead of the full 2300 features at 20.5 ≤ z ≤ 0) leads to performance degradation -∼ 15% worse for stellar mass and ∼ 5% worse for gas mass and gas metallicity, as measured by the MAPE metric.The result with the RMSE metric is almost identical with this figure.As we include more features from earlier epochs, the machine prediction becomes more accurate.The model reaches its best performance when it considers more than 80 snapshots out of the total 100 (i.e., all the snapshots below z = 4.18).We conclude that the input features from the early epochs are important in our ML predictions, especially for the stellar mass prediction.This is reasonable because the subhalo's maximum circular velocity (SubhaloVmax)  Figure 12.The relative performance of the ML model that predicts the subhalo properties in a high-resolution simulation, as a function of the number of input snapshots fed to the model.Training the model with only a single snapshot (i.e., only 23 features at z = 0, instead of the full 2300 features) leads to 15% worse prediction for stellar mass, as measured by the MAPE metric.The model maintains its best performance as long as it considers more than 80 snapshots (i.e., all the snapshots below z = 4.18).As the number of input snapshots decreases, the prediction accuracy degrades significantly, especially for stellar mass.See Section 4.2 for more details.
virial radius (Group R TopHat200) in the early epoch plays a crucial role in this prediction, as illustrated in the left column of Figure 11.
We carry out two more ablation studies: (i) we train the ML model by excluding the local environmental information like the host halo's mass (i.e., excluding the features marked with "(c)" in Table 2), and, (ii) we train the ML model with only the features that can be obtained in DMO simulations (i.e., using only the features marked with "(b)" in Table 2).The gas mass prediction is the most sensitive in both ablation studies; that is, ∼ 2.5% and ∼ 13% performance degradation for test (i) and (ii), respectively.The inclusion of the baryonic properties is particularly essential for accurately predicting the gas mass of a subhalo in a high-resolution simulation.However, for the stellar mass prediction, the ML model maintains its performance even without the environmental features or the baryonic features -only ∼ 1% and ∼ 2.5% performance degradation for test (i) and (ii), respectively.We speculate that this is because the ML model can indirectly infer the subhalo's interactions with its surrounding environment by tracking the variations in subhalo features over time.For the stellar mass prediction, it seems more important to provide the information from the early epochs (as seen in Figure 12) than to provide the environmental features or the baryonic features.

SUMMARY AND CONCLUSION
To compare the subhalos between a high-resolution and low-resolution simulations, we have developed a unique subhalo matching algorithm which uses the main branches of the subhalo merger tree, instead of accessing the entire simulation snapshot data.Subhalos have been matched by pairing those with similar mass and position histories in the two simulations with differing resolutions (Section 2.2).However, this approach shows a limitation in accuracy when pairing satellite subhalos.We have studied the resolution biases using two large cosmological simulations that offer resolutiondependent results, TNG100-1/-2 and TNG50-1/-2.In our galaxy-by-galaxy resolution study with our matching subhalo catalogs, the following resolution biases have been found in subhalo properties: • The DM mass, velocity dispersion, and peculiar velocity of subhalos, which is largely determined by the gravitational interaction of DM, show strong resolution convergence in both TNG100 and TNG50 (Section 3.1; Figures 3 and 4).
• The stellar masses of subhalos in a high-resolution simulation (TNG100-1/TNG50-1) are ≲ 0.5 dex higher than in their low-resolution counterparts (TNG100-2/TNG50-2), consistent with previous studies on TNG simulations.Since the subhalos' gas metallicities are directly related to their stellar masses, subhalos in a high-resolution simulation have higher metallicities than their low-resolution counterparts (Section 3.1; panels (a) and (b) in Figures 5 and 6).
• The gas masses of subhalos with M gas ∼ 10 8.5 M ⊙ in a low-resolution simulation are higher than in their highresolution counterparts, due to the lack of stars and stellar feedback in a low-resolution run, which would have otherwise expelled the gas out of the relatively shallow gravitational potential.The MBH masses of subhalos show relatively good resolution convergence, as they are directly linked to the halo masses (Section 3.1; panels (c) and (d) in Figures 5 and 6).
We have then investigated the resolution biases in the radial profiles of enclosed masses for central galaxies.Our findings in this galaxy-by-galaxy resolution study on radial profiles are as follows: • The DM mass profiles show good resolution convergence, except in the region near the resolution limit (r ≲ 4 kpc).The deficit in the stellar mass profiles in a low-resolution simulation (TNG100-2) compared to their counterparts in a high-resolution run (TNG100-1) is more pronounced, especially in the inner regions (Section 3.2; left two columns in Figure 7).
• The cold gas mass profiles show relatively good resolution convergence for galaxies with M DM < 1012.5 M ⊙ , albeit with large galaxy-to-galaxy variance.However, for these galaxies, a deficit in hot gas mass is observed in the low-resolution simulation.In galaxies with M DM ∼ 10 13 , where AGN heating is significant, a deficit in cold gas mass is observed in the lowresolution simulation, while the hot gas mass shows good resolution convergence.The gas metallicity profiles also exhibit solid resolution convergence, except for the less massive central galaxies with M DM ∈ [10 10 , 10 11 ] M ⊙ (Section 3.2; right two columns in Figure 7).
Finally, using the matching catalog, we have developed an ML technique to construct a regression model that "corrects" subhalo properties in a low-resolution simulation, in order to mimic its counterpart's properties in a high-resolution simulation (Section 2.3).This technique allows us to "correct" discrepancies arising from the differences in resolution, particularly for the stellar mass and gas mass of subhalos, although its performance is lower in correcting gas metallicity.The ML model has been applied to the subhalos in TNG300-1 which has the same resolution as TNG100-2, our low-resolution run in the training set.Our tests demonstrates that the galaxy scaling relations, such as the SMHM relation or the MZR from the low-resolution TNG300-1, can be "corrected" to mitigate resolution biases.Our ML models can be used for a variety of scientific purposes.Large cosmological simulations like TNG300-1 are designed to sacrifice resolution in favor of a larger box size.However, we can still obtain a larger sample of galaxies with improved accuracies and reduced resolution biases, with the help of our ML model.
An important aspect of our study is the versatility of the pipeline we have constructed.Both the subhalo matching algorithm and the correction model can be easily applied to any other large cosmological simulation.This makes it a valuable tool for testing and mitigating the resolution biases of differ-ent numerical codes and physics models.The scripts used to perform the analysis and generate this manuscript, along with the matching catalogs and the corrected physical quantities of TNG300 subhalos, are available on GitHub 12 and archived in Zenodo (Jung et al. 2024).1) and ( 2).The left panel represents the weights of each snapshot, w I .A higher value of w I indicates that the main progenitors in snapshot I are more significant in the prediction.The right panel represents the parameters related to the mass differences, γ and w mass .Higher values of γ and w mass suggest that mass differences between the two subhalos outweigh differences in their positions.We divide subhalos into five different mass bins and choose different parameters for each bin: (a) indicates the smallest mass bin (2 × 10 9 ≤ M DM /M ⊙ < 5 × 10 9 ), and (e) indicates the most massive one (2 × 10 11 ≤ M DM /M ⊙ ).In the lower mass bins, the model prefers to give more weight to the progenitors at higher redshifts and prioritize position differences over mass differences.The higher mass bins show the opposite trend, giving more weight to the progenitors at low redshifts and emphasizing mass differences.See Section A.1 for more details.
Training the model with the cross-entropy loss function (− ∑ i y i log σ i (S); where y i is the true label), we observed that w I , the weight of a given snapshot I, fluctuates significantly between snapshots.We suspect these fluctuations stem from specific preferences for certain subhalos, rather than any inherent physical significance.Consequently, we have adopted several strategies to reduce model complexity and prevent overfitting, while still maintaining similar accuracy in model performance.First, we utilize the same weight w snap,I for every four snapshots, training only 25 sets of weights and expanding these into 100 (for a total of the 100 snapshots).Second, we incorporate a regularization term into the cross-entropy loss function to penalize fluctuations between consecutive weights.This discourages the model from overly depending on selective information from specific time periods, while maintaining similar performance during the training phase.Without these two strategies, the training could potentially be skewed by a small number of subhalos that display specific tendencies due to their unique characteristics.The loss function is presented as follows: We train this model until the cross-entropy loss (without the regularization term in Eq.( A1)) starts to rise in the test set.
It is important to note that even if a subhalo follows the same growth history in the simulations of two different resolutions, the main progenitors in the higher resolution simulation could be identified at an earlier snapshot.This occurs because the halo finder can identify smaller subhalos in a simulation with higher resolution.Consequently, a "true" pair could be incorrectly penalized because the corresponding progenitors in TNG100-2 have yet to be identified.Differences in the number of subhalos between TNG100-1 and TNG100-2 arise for those with M halo ∼ 10 9 M ⊙ .In order to minimize the impact of resolution differences between TNG100-1 and TNG100-2, we constrain our analysis to subhalos with masses greater than 2 × 10 9 M ⊙ (or 33 particles for TNG100-2).Additionally, we restrict their merger trees to only include the main progenitors with M DM > 1.5 × 10 9 M ⊙ (or 25 particles for TNG100-2).These two restrictions are applied during both the training phase and the application phase.
In the left panel of Figure 13, we display the weights of each snapshot across five mass bins, ranging from the lowest to the highest.Low-mass subhalos, which typically exhibit a relatively peaceful merger history and frequently coexist with many other subhalos of similar sizes and positions, tend to give more weight to information from the early epoch.In contrast, massive subhalos are more inclined to prioritize the position and mass information from the low redshifts over that from earlier times.The highest mass bin displays significant fluctuation due to a lack of subhalos.In the right panel of Figure 13, we show the two parameters: γ and w mass , across the five mass bins.The model prefers lower γ and w mass values for the smaller mass bins, emphasizing position differences over mass differences.As the DM masses in low-mass subhalos could vary to some extent due to the baryonic physics, the model downplays the mass differences.Consequently, the differences in parameters across the mass bins can be interpreted as the differences in growth histories of high-mass and low-mass galaxies.

A.2. Application Step
The similarity function (Eq.( 1)) with the trained parameters is utilized to pair subhalos in TNG100-1 and TNG100-2.We apply the same method as described in Section A.1 to match the subhalos in the two simulations.However, they are considered to be matched only if they are bijectively matched (i.e., both directions yield identical matching results).A step-by-step description of the application phase follows.
1.For a target subhalo in TNG100-1, we identify all subhalos in TNG100-2 that have M DM > 2 × 10 9 M ⊙ and are situated within a 5 Mpc/h sphere centered on the position of the subhalo in TNG100-1.
2. We compute the similarity scores for the subhalos in TNG100-2.The probability associated with those subhalos is computed using the softmax function, σ i (S) for the ith subhalo.We then identify the subhalo with the highest score and save the corresponding probability.
5. We merge the two matching catalogs, retaining matches where both catalogs give identical results.We further narrow down the selections based on the following criteria: (i) The probability of the prediction is higher than 0.33 in both directions.
While this method ensures matched subhalos will have notably similar growth histories, it does not guarantee that the two halos indeed originated from the same DM region in the initial conditions.We provide further verification in Section 4.1, demonstrating that nearly all subhalo pairs matched with this method indeed originate from similar dark matter regions in the initial conditions.

B. MACHINE LEARNING CORRECTION OF CENTRAL AND SATELLITE SUBHALOS
In Section 3.3, we present the correction of subhalo properties in the low-resolution simulation.Readers may expect that the satellite subhalos, which are less similar in terms of shared DM particles in each pair, show worse performance than the central subhalos.The performance metrics for central and satellite subhalos are listed in Table 4.However, there is no systematic difference in the four metrics for stellar mass and gas metallicity.For the regression model predicting gas mass, the model performance is noticeably higher for the central subhalos, with all four metrics showing better performance.We conclude that the ML model can correct the satellite subhalos with consistent performance comparable to the central subhalos.

C. SUBHALO MATCHING BETWEEN HYDRO AND DMO SIMULATIONS
We observe in Section 4.1 that the matching pairs we construct share a considerable amount of DM particles, even though we match subhalos by pairing those with similar growth histories.We now compare these results with those obtained from the Figure1.Schematic diagram of the subhalo matching method between TNG100-1 (high-resolution) and TNG100-2 (low-resolution) using machine learning (ML).We start from subhalos in TNG100-1 and their DMO counterparts in TNG100-1-Dark, where the subhalo matching catalog already exists (first panel from top left).We extract positions and masses of each subhalo along the main branch of their merger tree, and evaluate the similarity scores for all possible pairs between subhalos in TNG100-1 and TNG100-1-Dark (second and third panel).The matching function is governed by several parameters as described in Eqs.(1) and (2).After we tune the parameters that optimally predict the existing catalog (fourth panel), the matching function with the optimal set of parameters is used to construct a matching catalog between the subhalos in TNG100-1 and TNG100-2 (fifth panel).See Section 2.2 and Appendix A for a detailed description of the pipeline.

Figure 3 .
Figure3.Subhalo properties at z = 0 that are primarily influenced by DM and gravity, depicted as a two-dimensional histogram.The x-axis represents the values of subhalos in TNG100-1 (high-resolution run), while the y-axis represents the corresponding values of their matched subhalos in TNG100-2 (low-resolution run).From left to right, the dark matter mass, velocity dispersion, and peculiar velocity of the subhalos are presented.The color bar denotes the number of subhalos in each bin.The red dot-dashed line in each panel indicates a perfect correspondence between two simulations.All three properties exhibit good convergence between the simulations of two different resolutions, although the subhalos in the high-resolution run (TNG100-1) have a greater velocity dispersion at the high-mass end.See Section 3.1 for more details.

Figure 5 .
Figure5.Baryonic properties of the matching subhalos at z = 0 in TNG100-1 (high-resolution run) and TNG100-2 (low-resolution run), illustrated as a two-dimensional histogram: (a) the stellar mass, (b) the gas metallicity, (c) the gas mass, and (d) the black hole mass of the subhalos.The red dot-dashed line in each panel indicates a perfect convergence with respect to numerical resolution.Due to the higher star formation rates in TNG100-1 with higher resolution, both the stellar mass and the gas metallicity in TNG100-1 are higher than those in TNG100-2.Note that some subhalos are not included in these histograms since they lack either a stellar, gas, or black hole component in one of the two simulations.The black solid/dashed lines in the marginal plots at the top and right of each panel indicate the fraction of subhalos for which a given component is present in the counterpart simulation.The blue dot-dashed line in the top marginal plot of the panel (c) depicts the fraction of subhalos that have any stellar particle in TNG100-1.Similarly, the orange dotted line shows the fraction of subhalos that have a positive stellar mass in their TNG100-2 counterparts for a given M gas in TNG100-1.Here one can observe that, in the subhalos of M gas ∼ 10 8.5 M ⊙ , star formation occurs only in TNG100-1 but not in TNG100-2.This entails a lack of stellar feedback in TNG100-2 that would have otherwise expelled the gas.Consequently, the gas mass in TNG100-2 is higher than TNG100-1 at M gas ∼ 10 8.5 M ⊙ .See Section 3.1 for more in-depth discussion.
Figure6.Same plot as Figure5, but for TNG50-1 (high-resolution run) and TNG50-2 (low-resolution run).While we observe similar trends with the change in resolution as those presented in Figure5, the differences in gas masses are less pronounced.See Section 3.1 for more details.
Figure8.The comparison of baryonic subhalo properties between the true values in TNG100-1 (high-resolution run) and the machine-predicted values based on the properties in TNG100-2 (low-resolution run).Our ML model predicts a subhalo's stellar mass, gas metallicity, and gas mass in the high-resolution simulation TNG100-1 (from left to right), based solely on its matching pair's properties in the low-resolution simulation TNG100-2.Using the ML technique, we successfully "correct" the resolution biases exhibited in Figure5(notice that the white contours in Figure5are copied to this figure as red contours).The range of the color bar is scaled down from that of Figure5, by the ratio of the number of subhalos in the test set to the total number of subhalos.See Section 3.3 for more details.
Figure9.Correction of subhalo properties at z = 0 in TNG300-1 (low-resolution run) using an ML model trained on TNG100-1 (high-resolution run) -TNG100-2 (low-resolution run) subhalo pairs.The left panel displays the stellar mass -halo mass (SMHM) relation for subhalos in TNG100-1 and TNG300-1, and for the TNG300-1 subhalos after ML-based corrections.The right panel shows the stellar mass -gas metallicity relation (MZR) for the same subhalos.Shaded areas indicate the 16 th − 84 th percentile ranges.While a systematic discrepancy exists between TNG300-1 (blue triangles) and TNG100-1 (orange squares), the corrected TNG300-1 subhalos (green circles) align very closely with the TNG100-1 subhalos, both in the SMHM relation and in the MZR.Interestingly, the ML model reproduces the scatter of the SMHM relation seen in the TNG100-1 subhalos (i.e., the green-shaded region nearly overlaps the orange-shaded region).See Section 3.3 for more details.
Figure10.The fraction of DM particles in a TNG100-1 subhalo that have corresponding particles in its matching subhalo in TNG100-2.A fraction of 1.0 suggests that all DM particles in a subhalo at z = 0 are present in its counterpart TNG100-2 subhalo.We plot only the subhalos that exceed the mass threshold for the matching process, 3×10 9 M ⊙ (see Section 2.2).Due to computational constraints, only 50,000 subhalos are represented in this plot.The marginal plots on the right and at the top display density histograms, showing that most matched pairs share approximately 70% of DM particles.Orange and blue histograms represent central and satellite subhalos, respectively, with the satellites exhibiting lower fractions.See Section 4.1 for more details.
Figure11.Feature importances of our ML model that predicts a subhalo's stellar mass, gas metallicity, and gas mass in the high-resolution simulation TNG100-1 (from left to right), based on its matching pair's properties in the low-resolution simulation TNG100-2 (Sections 2.3, 3.3, and Figure8).The number shows the degree to which the model depends on each input feature.The thin, semi-transparent lines show the raw data, while the thick lines represent the smoothed data.We present two types of feature importances: gain (top row) and split (bottom row).Since we provide a total of 2300 features (23 features × 100 snapshots) as inputs for a single subhalo, the feature importances can be computed for each of 23 features at each of 100 epochs.The computed numbers are presented for the four features with the highest gain values in each prediction model at 100 epochs.Notably, when predicting the stellar mass and gas metallicity of subhalos, the model heavily relies on the information from z ≳ 1. See Section 4.2 for more details.

Figure 13 .
Figure 13.Parameters in the similarity function, Eqs.(1) and (2).The left panel represents the weights of each snapshot, w I .A higher value of w I indicates that the main progenitors in snapshot I are more significant in the prediction.The right panel represents the parameters related to the mass differences, γ and w mass .Higher values of γ and w mass suggest that mass differences between the two subhalos outweigh differences in their positions.We divide subhalos into five different mass bins and choose different parameters for each bin: (a) indicates the smallest mass bin (2 × 10 9 ≤ M DM /M ⊙ < 5 × 10 9 ), and (e) indicates the most massive one (2 × 10 11 ≤ M DM /M ⊙ ).In the lower mass bins, the model prefers to give more weight to the progenitors at higher redshifts and prioritize position differences over mass differences.The higher mass bins show the opposite trend, giving more weight to the progenitors at low redshifts and emphasizing mass differences.See Section A.1 for more details.