Zooming by in the CARPoolGP Lane: New CAMELS-TNG Simulations of Zoomed-in Massive Halos

Galaxy formation models within cosmological hydrodynamical simulations contain numerous parameters with nontrivial influences over the resulting properties of simulated cosmic structures and galaxy populations. It is computationally challenging to sample these high-dimensional parameter spaces with simulations, particularly for halos at the high-mass end of the mass function. In this work, we develop a novel sampling and reduced-variance regression method, CARPoolGP, which leverages built-in correlations between samples at different locations in a high-dimensional parameter space to provide an efficient way to explore the space and generate low-variance emulations of summary statistics. We use this method to extend the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) to include a set of 768 zoom-in simulations of halos in the mass range of 10^13–10^14.5 M⊙ h^-1 that span a 28-dimensional parameter space in the IllustrisTNG model. With these simulations and the CARPoolGP emulation method, we explore parameter trends in the Compton Y–M, black hole mass–halo mass, and metallicity–mass relations, as well as thermodynamic profiles and quenched fractions of satellite galaxies. We use these emulations to provide a physical picture of the complex interplay between supernova and active galactic nuclei feedback. We then use emulations of the Y–M relation of massive halos to perform Fisher forecasts on astrophysical parameters for future Sunyaev–Zel'dovich observations and find a significant improvement in forecasted constraints. We publicly release both the simulation suite and the CARPoolGP software package.


INTRODUCTION
A wide range of cosmological surveys are coming soon, including cosmic microwave background (CMB) experiments such as the Simons Observatory (Ade et al. 2019) and CMB-S4 (Abazajian et al. 2016); galaxy surveys such as the Legacy Survey of Space and Time (LSST) at the Vera C. Rubin Observatory (Ivezić et al. 2019), Euclid (Euclid Collaboration et al. 2022), and the Nancy Grace Roman Space Telescope (Eifler et al. 2021); X-ray observatories such as eROSITA (Brunner et al. 2022); and radio interferometers such as the SKA (Dewdney et al. 2009). These new surveys have the potential to provide solutions to long-standing questions in cosmology. However, uncertainties in baryonic processes, such as active galactic nuclei and supernova feedback, will limit the theoretical interpretation of these future observations. The small-scale systematics these processes introduce, previously a subdominant effect in cosmological analyses, will become a prevalent source of uncertainty. It has therefore become necessary to develop methods for modeling them.
The most straightforward treatment of baryonic processes is to directly remove the small-scale data affected by baryonic systematics. This method has indeed been used, for example, in weak lensing analyses with the Dark Energy Survey (DES) (Secco et al. 2022), the latest Kilo-Degree Survey (KiDS) (Li et al. 2023a), and, more recently, Hyper Suprime-Cam (HSC) (Li et al. 2023b). However, the scale cuts must be sufficiently aggressive to remove baryonic effects, resulting in a loss of information.
Instead of a scale cut, one could model baryonic effects using cosmological hydrodynamical simulations (Vogelsberger et al. 2014; Crain et al. 2015; Pillepich et al. 2018; Davé et al. 2019). This approach offers the advantage that the small, nonlinear scales, shown to contain a wealth of cosmological parameter constraining power (Horowitz & Seljak 2017; Villaescusa-Navarro et al. 2021a,b; de Santi et al. 2023), can be incorporated into the analysis. However, even these simulations must approximate physical effects on scales below the resolution of the simulation grid (subgrid physics), which are uncertain and handled differently across simulations (van Daalen et al. 2014; Vogelsberger et al. 2020; Villaescusa-Navarro et al. 2021c). These differences reflect current uncertainties in feedback prescriptions and galaxy formation in general.
The high computational cost associated with hydrodynamical simulations limits their use in approaches that attempt inference through forward modeling. However, leveraging advances in machine learning (ML) can provide an avenue for marginalizing over uncertainty in baryonic effects by developing neural networks (Ribli et al. 2019; Villaescusa-Navarro et al. 2022; Lu et al. 2023). An essential requirement for such a task is a data set on which to train the ML models. The Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project (Villaescusa-Navarro et al. 2021c) is pursuing the generation of such a training set with state-of-the-art cosmological hydrodynamical simulations spanning multiple subgrid prescriptions, including IllustrisTNG (Weinberger et al. 2017; Pillepich et al. 2018), SIMBA (Davé et al. 2019), and Astrid (Ni et al. 2022; Bird et al. 2022). Thousands of such simulations have already been created and have proven helpful in a wide variety of astrophysical and cosmological contexts, ranging from improvements in galaxy cluster scaling relations (Amodeo et al. 2021; Wadekar et al. 2023a,b) to exploring the impact of active galactic nuclei (AGN) and supernova (SN) feedback on galaxy clustering (Gebhardt et al. 2023; Delgado et al. 2023).
All existing CAMELS simulations have in common that they encompass a volume of (25 Mpc h^-1)^3. Irrespective of the cosmological or astrophysical parameters, this significantly limits the halo mass range in these volumes, with only around half of the simulations containing any halos with M > 10^13.5 M⊙ h^-1 and 15% containing any halos with M > 10^14.5 M⊙ h^-1. The range of masses above 10^13 M⊙ h^-1 is particularly interesting for future cosmology and astrophysics studies, as it will be observed through secondary CMB anisotropies such as the thermal and kinetic Sunyaev–Zel'dovich (SZ) effects. These observations can provide stringent constraints on the matter density of the universe (Carlstrom et al. 2002), the amplitude of matter fluctuations (Komatsu & Seljak 2001; Horowitz & Seljak 2017), the sum of the neutrino masses (Madhavacheril et al. 2017), and the dark energy equation of state. Furthermore, observations of the SZ effect can constrain the astrophysical processes occurring in galaxy groups and clusters and their thermodynamic properties (Battaglia et al. 2017; Schaan et al. 2021; Moser et al. 2022; Hadzhiyska et al. 2023a; see Mroczkowski et al. 2019 for a review). Similarly, observed X-ray emission and spectra from galaxy groups and clusters can aid in constraining the complex feedback and formation processes in the multiphase intracluster medium (ICM) and intragroup medium (IGrM) (Singh et al. 2018; McCabe et al. 2021; Singh et al. 2022; Lee et al. 2022; Lovisari et al. 2023).
In fact, attempts to constrain both galaxy formation and cosmology using multi-wavelength observations, improved simulations, and new analysis algorithms are currently ongoing, for example, in the Learning the Universe collaboration. However, for future analyses to proceed, it is imperative to increase the mass range of simulated models to include group- and cluster-scale halos.
In this work, we seek to do just this. Our main goal is to generate a set of galaxy group to cluster scale zoom-in simulations that span a wide range of astrophysical and cosmological parameters in the IllustrisTNG model's parameter space. We design this suite so that it can be used to train an emulator to predict halo quantities (such as the integrated Compton Y_SZ signal of the thermal SZ effect) throughout the entire parameter space. In this way, the emulated halo quantities can easily extend previous CAMELS experiments to the high-mass end of the mass function, opening up the possibility of future analyses focused specifically on the effects of more massive halos. This task is nontrivial because of the roles played by sample variance and the high dimensionality of the parameter space. Unlike the (25 Mpc h^-1)^3 CAMELS boxes, where there are typically multiple halos at each mass and parameter-space location (below the group mass scale), in this work we are limited to significantly fewer halos due to the high computational cost of simulating massive objects, leading to results that are more vulnerable to sampling effects. Similarly, generating a simulation suite to cover such a high-dimensional parameter space introduces the curse of dimensionality. Often, a Latin hypercube or Sobol sequence (Sobol' 1967) is used to sample high-dimensional spaces to best capture variations throughout the entire parameter space. However, whether either of these sampling approaches is optimal for generating an emulator training set remains to be determined.
Irrespective of the sampling strategy, it is imperative that we minimize the uncertainty introduced by the small number of groups and clusters we can simulate. In this work, we introduce the novel reduced-variance emulator CARPoolGP, with an associated active-learning parameter sampling method, which builds on the success of the CARPool reduced-variance estimator (Chartier et al. 2021; Chartier & Wandelt 2022a,b) and generalizes the approach to a broad parameter space. We then apply our novel method to the 28-dimensional IllustrisTNG galaxy formation model to build a new suite of 768 simulated galaxy groups and clusters.
The organization of this work is as follows. In Section 2, we introduce the general formalism of the CARPoolGP emulation and sampling approach and provide a one-dimensional example to demonstrate its efficacy. We then discuss the setup of the cosmological zoom-in simulation suite in Section 3 and how we have applied the CARPoolGP methodology to this case. We present simple applications of the zoom-in suite and CARPoolGP by emulating various scaling relations, including the Compton Y–M relation, in Section 4. We then perform a Fisher forecast on astrophysical parameters using the derivatives of the emulated Y–M relation and discuss caveats to our testing methods in Section 5. We conclude with the key takeaways from this work in Section 6.

SAMPLING AND EMULATING WITH CARPOOLGP
Many situations arise where one is interested in the functional form of the parameter-space dependence of a quantity but is limited to a few data points riddled with sample variance. For example, it would naively require a diverse population of halos across numerous masses to predict how the circumgalactic medium temperature changes as a function of halo mass. However, the generation of samples may be expensive, which in turn can stymie attempts at obtaining a low-variance prediction of the quantity's parameter dependence. It is then advantageous to develop an emulator that provides not only the quantity of interest but also a measurement of the associated uncertainty. One general solution to this problem is Gaussian process regression, which treats each sample at a location in parameter space as a random process drawn from a distribution that can be learned through regression. In this way, one can extract the correlations between quantities throughout the parameter space and the mean dependence of the quantities on the parameters (see Bird et al. 2019; Aigrain & Foreman-Mackey 2022; Bird et al. 2022; Ho et al. 2022 for reviews of Gaussian process regression and its applications).
Correlations in parameter space can often be constructed between samples, as in the case of cosmological simulations, where the initial conditions can be controlled. This work seeks to improve upon general Gaussian process regression in situations such as these, where correlations naturally exist or can be generated between samples. In this first section, we introduce a novel sampling approach and an associated emulator called CARPoolGP (Lee 2024) to do exactly this.
We begin the section by developing a general formalism for the method. We then describe an improvement to the sampling scheme, and finally, we end the section with a simple, one-dimensional toy example that demonstrates the power of CARPoolGP. We provide a consolidated summary of variables, their definitions, and the equations where they are introduced in Table 1 for easy reference.

Gaussian Process general formalism
Consider some quantity Q that we want to fit as a function of parameter-space locations θ_i using noisy samples Q̃ with sampling error ϵ. The quantity Q and the samples Q̃ are then related through

Q̃(θ_i) = Q(θ_i) + ϵ,  (1)

where we take ϵ to have ⟨ϵ⟩ = 0 and ⟨ϵ²⟩ = σ²_Q(θ_i), the variance of the noisy quantity, which can depend on the location in parameter space.

(Fragment of Table 1:
X_ij : covariance between base and surrogate quantities, Eq. 9;
Y_ij : smooth noise kernel for the base-surrogate covariance, Eq. 9;
M_ij : covariance kernel for the sample variance between bases and surrogates, Eq. 9;
µ_S : prior on surrogate quantities, Eq. 12.)
One typically wants to average over multiple samples Q̃(θ_i), each with a different noise realization, to obtain a good estimate of Q(θ_i). Using a fitting procedure, new parameter points can then be emulated with some level of predictive variance on the noiseless quantity Q(θ_i). Unfortunately, if the cost of producing a single noisy measurement Q̃(θ_i) is high, as is often the case, this procedure is difficult and can lead to high variance and poor accuracy of the emulated quantity. Our task is then to answer: how do we achieve an emulation with a low predictive variance of a quantity across a parameter space when we can only obtain a small number of noisy measurements of the quantity at each of a limited set of parameter locations?
With Gaussian process regression, we imagine that N_B samples of some noisy quantity Q̃(θ_i) exist throughout the parameter space. We call this set, B ≡ {θ_i | i = 1, 2, ..., N_B}, the base samples (for reasons that will soon become clear). Base samples have a covariance matrix,

B_ij = V_ij + F_ij,  (2)

that contains two matrix terms: V, which represents the smooth and systematic variation of the quantity Q(θ_i) as a function of the parameters, and F, which contains the sample fluctuations stemming from the noise. In this way, V can be considered the "true" covariance associated with Q, while F is an error term on that covariance, stemming from the fact that our measurements are not of Q but rather of Q̃.
Both matrices encode correlations between individual samples, in their systematic and sample variances, respectively, but in the typical case of independent samples with uncorrelated sample variance, F can be written in the less general, diagonal form F_ij = σ²_{Q,i} δ_ij. We have absorbed the parameter dependence into the subscripts of our matrices for brevity, such that the parameter indices become matrix indices (e.g., V(θ_i, θ_j) → V_ij). The kernel matrices can be any linear combination of Gaussian kernel functions, such as radial basis functions (RBF), Matern, or linear exponential kernels (see Ch. 4 of Rasmussen & Williams 2006 for more kernels). Kernel functions contain free hyperparameters, τ, and depend on the distance between the parameter points. As an example, an RBF kernel has the form

V(θ_i, θ_j) = α² ∏_{p=1}^{N_p} exp[ -d_E(θ_i^p, θ_j^p)² / (2γ_p²) ],  (3)

where d_E(·) is the Euclidean distance between the parameter points, γ is a vector of smoothing scale factors, and α is an amplitude constant. The product comprises N_p kernels, where N_p is the dimensionality of the parameter space and p is the index over a given parameter, each containing a unique smoothing scale factor. We can consider arbitrarily complex kernel functions with increasing numbers of entries in the hyperparameter vector τ.
Given prior knowledge of a quantity's variation, we can impose a prior, µ_Q(θ_i), on the mean function and use the Gaussian log-likelihood,

ln L(τ) = -(1/2) (Q̃ - µ_Q)^T (V + F)^{-1} (Q̃ - µ_Q) - (1/2) ln |V + F| - (N_B/2) ln 2π,  (4)

which depends on the vector of hyperparameters, τ, used to generate the kernel matrices, and on N_B, the number of parameters in the set B. An optimal set of hyperparameters, τ, can be found through an optimization method that maximizes Eq. 4, such as stochastic gradient descent (SGD). We impose prior bounds on the parameters in τ during optimization such that the hyperparameters controlling the scale and amplitude of a given Gaussian kernel (e.g., γ_p and α) can span [0, ∞), while the mean function, µ_Q, can span (-∞, ∞). The optimized τ can then be used to generate an estimator for Q(θ*_i) and its predictive variance, σ²(θ*_i), at any new set of parameter values,

Q̂(θ*_i) = µ_Q + K_*^T (V + F)^{-1} (Q̃ - µ_Q),  (5)
σ²(θ*_i) = K_** - K_*^T (V + F)^{-1} K_*,  (6)

with K_* = V(θ*_i, θ_j; τ) and K_** = V(θ*_i, θ*_i; τ), both evaluated using the set of optimized hyperparameters.
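To make the fit-and-predict step concrete, the prediction of Eqs. 5 and 6 for a one-dimensional parameter can be sketched in a few lines of NumPy. This is a minimal sketch assuming a single RBF kernel with fixed (not optimized) hyperparameters and a constant prior mean; the function names are ours and are not the CARPoolGP package API.

```python
import numpy as np

def rbf_kernel(x1, x2, gamma=1.0, alpha=1.0):
    """RBF kernel V(theta_i, theta_j) for a 1-D parameter (cf. Eq. 3)."""
    d = x1[:, None] - x2[None, :]
    return alpha**2 * np.exp(-0.5 * (d / gamma) ** 2)

def gp_predict(theta, q_noisy, theta_star, gamma=1.0, alpha=1.0,
               sigma_q=0.1, mu_q=0.0):
    """Posterior mean (cf. Eq. 5) and variance (cf. Eq. 6) at theta_star."""
    V = rbf_kernel(theta, theta, gamma, alpha)
    F = sigma_q**2 * np.eye(len(theta))      # uncorrelated sample noise
    K_inv = np.linalg.inv(V + F)
    K_star = rbf_kernel(theta_star, theta, gamma, alpha)
    K_ss = rbf_kernel(theta_star, theta_star, gamma, alpha)
    mean = mu_q + K_star @ K_inv @ (q_noisy - mu_q)
    cov = K_ss - K_star @ K_inv @ K_star.T
    return mean, np.diag(cov)
```

Fitting a few dozen noisy samples of a smooth curve with this sketch recovers the underlying function at interpolated points, with the predictive variance growing in sparsely sampled regions.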
To be clear, the quantity σ²(θ*_i) is the variance associated with the emulator's prediction of the mean Q(θ*_i) at one parameter location. When predicting the mean at multiple parameter points, we instead obtain the predictive covariance, σ²(θ*_i, θ*_j). Throughout the text, when making predictions at multiple values, we will refer to the diagonal of the predictive covariance matrix as the predictive variance and to the square root of the diagonal, σ(θ*_i, θ*_i), as the predictive uncertainty. These quantities are distinct from the sample variance, σ²_Q, which represents the intrinsic scatter associated with a quantity and can, in general, be a hyperparameter learned through regression.
So far, we have described a brief and general formalism for typical Gaussian process regression. For a more detailed description, we recommend Aigrain & Foreman-Mackey (2022) or Rasmussen & Williams (2006).

Introducing CARPoolGP
CARPoolGP builds upon this general framework of Gaussian process regression while borrowing the key principle of the CARPool method to reduce variance in a predicted quantity (Chartier et al. 2021). CARPool achieves this through an experimental design that leverages correlations between 'expensive' base samples and 'cheap' surrogate samples. In CARPool, numerous inexpensive surrogate samples can be generated at arbitrarily low cost to accurately estimate the mean of the surrogate quantity. The correlation between the surrogates and a few base samples is then leveraged to transfer this accurate estimate to the base quantity. In this way, a reduced-variance, accurate estimate of some mean quantity can be constructed without the need for many expensive samples. CARPool has been adopted for mock-catalog generation in DESI (Ding et al. 2022), and related work has explored sampling and emulation using correlated simulations at different fidelities (e.g., Ho et al. 2022).

Figure 1. A cartoon of the CARPoolGP components. The base samples are spread over the parameter space at θ_i, while the surrogates live on a parameter island at θ_S. A base-surrogate pair can be identified as the duo with matching sample-noise realizations, denoted by the number in the circle. In this simple cartoon, the surrogates on the parameter island can broadcast their knowledge of their error to the base samples, represented here by the colors (e.g., red samples are above the mean at the parameter island, blue samples are below the mean at the parameter island).

The novelty of CARPoolGP is in applying this reduced-variance procedure across an entire parameter space. In CARPoolGP, base samples provide coverage at many parameter-space locations but are expected to carry high variance. Surrogates can be generated with the same or a different process than the bases (e.g., at a different cost) and are placed at an alternate set of locations that we call parameter islands. Parameter islands contain multiple surrogates, providing a reduced-variance estimate at these locations. By correlating isolated base samples with surrogates, the parameter space can be sampled more densely while borrowing the variance reduction achieved at the parameter islands. In Fig. 1, we present a simplistic picture of the CARPoolGP sampling method. We show three base samples spread through a one-dimensional parameter space with some associated noisy quantity Q̃(θ_i). Surrogate samples live on the parameter island, shown as the black bar located at θ_S. A base-surrogate pair, which has correlated sample-noise realizations, is shown by samples with matching numbers. The colors show how surrogates can retain some notion of their deviation at the parameter island (e.g., red is too high, blue is too low) and broadcast this to the base samples.
Parameter islands are a set of parameter-space locations S ≡ {θ_i | i = 1, 2, ..., N_S}, with S ∩ B = ∅, where N_S is the total number of parameter islands. Surrogates are unique in that their sample variance can be correlated with that of a base sample, and typically there will be numerous surrogates on an individual island. In CARPoolGP, surrogate samples can be generated by processes of varying cost, as done in Chartier et al. (2021), or by the same process as the base samples (i.e., at the same cost). The advantage of this flexibility is clearest when we are interested in the evolution of a quantity across a parameter space for a particular process and assume that locations in this space are correlated. Then, with surrogate quantities generated by the same process as the base quantities (i.e., at the same cost), one can leverage the variance reduction at the surrogate locations and broadcast it via correlations to the base locations.
In analogy to the base quantities, we call the quantities we measure from the surrogates Q̃_S, which have sample noise ϵ_S: Q̃_S = Q_S + ϵ_S.
In principle, Q_S and Q can arise from different functions, and ϵ_S can differ from ϵ. This scenario could arise when the surrogate quantities are drawn from processes with different costs, e.g., expensive versus fast simulations (Chartier et al. 2021). However, the surrogates may also be produced by the same process as the base samples, which is indeed the specific implementation in this work, as described below. Therefore, in the examples used in this work, the functions that generate Q and Q_S are the same. Similar to the bases, the surrogates have a covariance matrix,

D_ij = W_ij + E_ij,  (7)

where W_ij is the smoothly varying portion of the quantity's covariance, in analogy to V_ij in Eq. 2. Unlike the F_ij matrix in Eq. 2, which is typically diagonal for independent samples, the E_ij matrix need not be diagonal. Off-diagonal elements can exist if correlations exist between the surrogate simulations at different islands. For example, if surrogates at different islands in parameter space share the same realization of the sample noise, then the off-diagonal elements E_{i≠j} would contain values r_ij σ_i σ_j, where r_ij represents the correlation between the two surrogates with sample standard deviations σ_i = σ_S(θ_i) and σ_j = σ_S(θ_j).
Assuming again that the noise among the surrogates is Gaussian and can be assigned some prior mean term, µ_S, the surrogate and base samples can be jointly represented as normally distributed,

(Q̃, Q̃_S) ~ N( (µ_Q, µ_S), Σ ),  (8)
where X_ij is the covariance between the base and surrogate samples. This, in turn, can again be modeled with a combination of smooth and sample covariance kernels,

X_ij = Y_ij + M_ij,  (9)

where Y_ij and M_ij should be considered analogs of the matrices in Eq. 2 and Eq. 7. However, care must be taken in their construction. First, the most general form of Y_ij should allow the smooth functions Q(θ_i) and Q_S(θ_i) to be different, in which case the distance between the locations of the parameters in B and S must be modified to account for the difference in functional form. We account for this modulation with the learnable hyperparameter ∆q_BS, which can be linearly incorporated into the distance calculation,

d(θ_i, θ_j) = d_E(θ_i, θ_j) + ∆q_BS.  (10)

The M_ij matrix is similar to the F_ij and E_ij matrices in Eqs. 2 and 7, except that M_ij encodes the correlation between bases and surrogates and can be non-zero for any base-surrogate pair generated to have correlated sample variance. This correlation is the critical component of the CARPoolGP method; without it, we do not expect improved performance compared to a standard Gaussian process with an equal number of samples. With a high level of correlation between base and surrogate samples through M_ij, both bases and surrogates can calibrate themselves on nearby samples from multiple locations in the parameter space.
Writing the block covariance matrix in compact form,

Σ = [[ B_ij, X_ij ], [ X_ij^T, D_ij ]],  (11)

allows us to rewrite the likelihood function of Eq. 4 for the general CARPoolGP case as

ln L(τ) = -(1/2) (Q̃_+ - µ_+)^T Σ^{-1} (Q̃_+ - µ_+) - (1/2) ln |Σ| - (N/2) ln 2π,  (12)

where Q̃_+ = (Q̃, Q̃_S) and µ_+ = (µ_Q, µ_S) are the concatenated sample and prior-mean vectors and N is the total number of base and surrogate samples. It should be noted that the most general form of Σ allows the generation of multiple surrogates for each base, for example, in the scenario where a base sample has surrogates drawn from multiple different cost-varying processes. The effect on Σ would be the addition of kernel matrices analogous to D_ij along the diagonal of the block matrix, matrices analogous to X_ij (each with an associated hyperparameter ∆q_BS) along the off-diagonals, and a new set of kernel matrices describing the correlations between surrogates of different processes, further extending the off-diagonals of the block matrix. By maximizing the likelihood function of Eq. 12 with respect to τ and using the optimized Σ(τ) as the covariance matrix, one can again build an estimator, as in Eq. 5, for the quantity of interest at new points in parameter space,

Q̂(θ*_i) = µ_+ + K_*^T Σ^{-1} (Q̃_+ - µ_+),  (13)
σ²(θ*_i, θ*_j) = K_** - K_*^T Σ^{-1} K_*,  (14)

with the predictive kernels K_* and K_** now evaluated between the new points and the full set of base and surrogate locations. We again note that during optimization, the hyperparameters controlling scales and amplitudes can explore all positive real numbers, while the mean function and the hyperparameter ∆q_BS, introduced in Eq. 10, can explore all real numbers. In practice, we set the initial values of τ to random numbers close to zero and find that, irrespective of the initialization, we obtain the same optimized τ. Similarly, we find that the hyperparameters of the optimizer, such as the learning rate, do not affect our results: when we perform optimizations with a higher learning rate and fewer iterations, we obtain the same results as with a lower learning rate and more iterations. Finally, the results appear stable across different optimizers. We explored both ADAM and SGD, finding similar optimized results; we use the SGD optimizer in this work, as we noticed it converges to a solution faster than ADAM.
To summarize the CARPoolGP strategy, we provide a step-by-step algorithm.
1. Define a set of parameters, B, at which to generate base sample quantities, Q̃.
2. Define a set, S, of parameter islands at which to generate surrogate sample quantities, Q̃_S. Ensure that base-surrogate pairs have some level of correlation between them.
3. Define the noise kernels and associated hyperparameters, τ, for the base, surrogate, and combined covariances (Eqs. 2, 7, and 9).

4. Maximize the likelihood function (Eq. 12) to obtain the optimal set of hyperparameters, τ.
5. Emulate using Eqs. 13 and 14 to find the predicted mean, Q̂(θ*_i), and its associated predictive covariance, σ²(θ*_i, θ*_j), at new locations in parameter space.
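The five steps above can be strung together in a self-contained sketch, assuming a one-dimensional parameter, fixed (not optimized) hyperparameters, a zero prior mean, and one perfectly correlated surrogate per base (r = 1); the names are illustrative only, not the released package API.

```python
import numpy as np

def rbf(x1, x2, gamma=1.0, alpha=1.0):
    d = x1[:, None] - x2[None, :]
    return alpha**2 * np.exp(-0.5 * (d / gamma) ** 2)

def carpoolgp_emulate(theta_b, q_b, theta_s, q_s, sigma, theta_star,
                      gamma=1.0, alpha=1.0):
    """Steps 3-5: build Sigma, then emulate mean and covariance (cf. Eqs. 13-14).

    Base sample i is assumed noise-correlated (r = 1) with surrogate i,
    so the cross block Y + M picks up sigma^2 on its diagonal.
    """
    nb = len(theta_b)
    V = rbf(theta_b, theta_b, gamma, alpha) + sigma**2 * np.eye(nb)
    W = rbf(theta_s, theta_s, gamma, alpha) + sigma**2 * np.eye(len(theta_s))
    X = rbf(theta_b, theta_s, gamma, alpha)
    X[np.arange(nb), np.arange(nb)] += sigma**2    # M_ii = r sigma^2, r = 1
    Sigma = np.block([[V, X], [X.T, W]])
    theta_all = np.concatenate([theta_b, theta_s])
    q_all = np.concatenate([q_b, q_s])
    K_star = rbf(theta_star, theta_all, gamma, alpha)
    mean = K_star @ np.linalg.solve(Sigma, q_all)          # zero prior mean
    cov = rbf(theta_star, theta_star, gamma, alpha) \
        - K_star @ np.linalg.solve(Sigma, K_star.T)
    return mean, cov
```

Running this on noisy samples of a smooth test function, with surrogates placed at a handful of islands, recovers the underlying curve while exploiting the shared noise realizations between pairs.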

Active learning approach
The above outline for CARPoolGP says nothing about the optimal locations in parameter space at which to draw base or surrogate samples. Instead, it describes a process for performing emulation given base and surrogate samples that were designed to have correlated noise. We show in Section 2.4 that randomly distributing base samples and uniformly placing surrogate samples already reduces the variance of an emulated quantity. However, one could imagine that drawing samples in specific regions of parameter space could provide more or less benefit in the pursuit of variance reduction. Indeed, there has been research on choosing optimal sampling locations in parameter space using both Gaussian process regression and Bayesian optimization (Leclercq 2018). Here, we implement a simplified approach that fits within the framework developed in the previous sections.
This section explores a sampling method that picks base-parameter locations based on their predicted influence on the emulator's variance reduction. We develop an approach in which sample locations are not chosen a priori but instead on the fly using CARPoolGP. This approach treats the parameter space equitably, buttressing regions predicted to have enhanced variance with extra samples and collecting data only sparsely in regions that do not require additional assistance.
As we have already seen, given a conditioned covariance matrix, Σ(τ), one can emulate some quantity at a new parameter point θ*_i, obtaining the predicted quantity Q̂(θ*_i) and its variance σ²(θ*_i, θ*_j) using Eq. 13. Furthermore, the predictive variance is merely a function of the optimal hyperparameters in Σ(τ) and of the parameter-space locations used in conditioning, θ_i, and emulating, θ*_i (Eq. 14); in particular, the variance does not explicitly depend on the values of the quantities. This means that, by incorporating more parameter-space locations into the base and surrogate sample vectors, we can predict the change in predictive variance following Eq. 13 even before actually generating new samples at these parameter points. We use this change in predictive variance to explore a range of potential new parameter locations, called candidate points, finding the candidate that provides the largest reduction in predictive variance across the entire parameter space.
More rigorously, we consider a set of candidate points in the parameter space, C ≡ {θ_k | k = 1, 2, ..., N_C}. Each candidate is individually incorporated into the base parameter vector, θ_i, and an associated surrogate with correlated sample variance is placed at a parameter island following some algorithm. In this work, we place surrogates on the nearest-neighbor islands, but other choices could be made. The updated covariance kernels are then used to predict the variance across a full set of test points, T, using Eq. 13. From this we obtain the variance σ²_k(θ*_i, θ*_j) at the points in the test set; we place k in the subscript of σ² to make clear that the variance is unique to each candidate point θ_k. We summarize the effect of adding the candidate with the trace of the predictive covariance matrix, which we call β_k,

β_k = Tr[ σ²_k(θ*_i, θ*_j) ] = Σ_i σ²_k(θ*_i, θ*_i).  (15)

The candidate point that attains the smallest value of β_k is considered the best place to perform the next sampling, as it provides the largest variance reduction when emulating across the entire space. This process is not limited to finding a single next parameter location. Instead, a batch of future points can be obtained by repeating the above process and updating the covariance matrices at each iteration. In this way, one can choose an arbitrary number of points to sample next. However, it should be noted that the chosen parameters depend on the current best-fit set of hyperparameters, τ. It could be that τ is sensitive to the addition of numerous new parameter points, so it is important to be frugal in the number of new points or to test the sensitivity of τ with respect to the number of additional points.
There are three further generalizations of this approach, which we briefly identify to assist understanding but do not explore further in this work. First, in Eq. 15, we compute the predictive variance of only one quantity, but this is a simplifying choice that can be made arbitrarily more complex. In practice, one can choose to predict a plethora of quantities and sum their individual predictive-variance traces after some normalization to remove units. The quantity with the most dominant normalized variance will likely determine the choice of successful candidate points, but for those interested in a variety of quantities drawn from an individual sample, this more generalized approach may be desirable. Second, the choice to use the trace in Eq. 15 is somewhat arbitrary, and other compression operations, such as the determinant, could be used. In practice, we use the trace, as it is typically a computationally cheap operation and, more importantly, the most straightforward and intuitive measure of the predictive covariance over the parameter space. Finally, the above process places new candidates as base samples, with surrogates located at the nearest-neighbor parameter island. However, the process could be extended to remove this association, letting the active-learning step determine whether a candidate point should be placed as a base or as a surrogate at a new parameter-island location. In this second case, one must find the best base to correlate with the new surrogate.
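The candidate search can be written compactly precisely because the predictive covariance depends only on sample locations and hyperparameters, never on the sampled values. The sketch below is a simplified, base-only version (each candidate is added as a base sample; surrogate placement and the block covariance are omitted), with an RBF kernel and fixed hyperparameters; the function name is ours.

```python
import numpy as np

def rbf(x1, x2, gamma=1.0, alpha=1.0):
    d = x1[:, None] - x2[None, :]
    return alpha**2 * np.exp(-0.5 * (d / gamma) ** 2)

def best_candidate(theta_train, sigma_q, theta_test, theta_cand,
                   gamma=1.0, alpha=1.0):
    """Return the candidate minimizing beta_k (cf. Eq. 15), the trace of
    the predictive covariance over the test set T, plus all beta_k."""
    betas = []
    for theta_k in theta_cand:
        th = np.append(theta_train, theta_k)   # tentatively add the candidate
        K = rbf(th, th, gamma, alpha) + sigma_q**2 * np.eye(len(th))
        K_star = rbf(theta_test, th, gamma, alpha)
        cov = rbf(theta_test, theta_test, gamma, alpha) \
            - K_star @ np.linalg.solve(K, K_star.T)
        betas.append(np.trace(cov))            # no sample values are needed
    return theta_cand[int(np.argmin(betas))], betas
```

With training points clustered on one side of the domain, the search favors candidates in the uncovered region, since they remove the most predictive variance.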

A toy example
We now illustrate the potential of CARPoolGP and the active learning approach on a simple one-dimensional toy model, where the base and surrogate samples are generated via the same process. We consider some mean quantity, Q, with the functional form of Eq. 16, where a and b are constants and θ_i is our independent, one-dimensional parameter. Once again, this quantity and its functional form are completely arbitrary and serve only to build understanding of how CARPoolGP works. As we will show in later sections, CARPoolGP works just as well in higher-dimensional parameter spaces with more complicated processes.
We will follow the procedure outlined at the end of Section 2.1.

Generate random parameters and base sample quantities
Figure 2. The sampling for our toy model, which we use to explore the efficacy of CARPoolGP. The left panel shows fifty base samples in red and fifty surrogate samples in blue, where each base sample has a sample variance correlated to a surrogate sample at the nearest parameter island. The grey double-headed arrow indicates a pair of samples that have been correlated with one another. In the right panel, we show an alternative sampling approach in which one hundred data points have been drawn from a uniform random distribution but contain uncorrelated sample variance. Both panels show in black the true variation of the quantity Q, which follows Eq. 16.
We build the base quantities by sampling a uniformly random distribution within the domain [−10, 10] to generate a set of parameter points B ≡ {θ_i | i = 1, 2, ..., N} and quantities Q(θ_i). We choose N = 50 and draw ϵ from a normal distribution with ⟨ϵ⟩ = 0 and ⟨ϵ²⟩ = σ²_Q, which we set to a constant value of σ_Q = 0.1.

Generate parameter islands at which to extract surrogate sample quantities
We then generate parameter islands in the set S ≡ {θ_i | i = 1, 2, ..., N_S} by linearly spacing N_S = 5 points in the range [−8, 8], with quantities generated by the same process as defined in Eq. 16. For each base sample, the island closest to the parameter is identified, and a surrogate sample is drawn at this island location, θ_i, to generate Q_S(θ_i), where the noise, ϵ_s, is perfectly correlated with the noise of the base simulation (i.e., the same amplitude of the noise is used, ϵ_s,i = ϵ_i).
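The base and surrogate construction above can be sketched as follows. The actual functional form of Eq. 16 is not reproduced in this excerpt, so a made-up quadratic stands in for Q:

```python
import numpy as np

rng = np.random.default_rng(42)

def Q_true(theta, a=0.05, b=1.0):
    # stand-in for Eq. 16; the paper's actual a, b, and form are not assumed
    return a * theta**2 + b

N, sigma_Q = 50, 0.1
theta_B = rng.uniform(-10, 10, N)     # base parameter points B
eps = rng.normal(0.0, sigma_Q, N)     # sample variance realizations
Q_B = Q_true(theta_B) + eps           # base quantities

islands = np.linspace(-8, 8, 5)       # parameter islands S
nearest = np.argmin(np.abs(theta_B[:, None] - islands[None, :]), axis=1)
theta_S = islands[nearest]            # each base's nearest island
Q_S = Q_true(theta_S) + eps           # perfectly correlated noise: eps_s = eps
```

Reusing the same `eps` array for both sets is what enforces ϵ_s,i = ϵ_i, so the residuals of each base-surrogate pair are identical.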
The left panel of Fig. 2 shows the data we construct for this toy model. The red points are the base samples, none of which is correlated with any other, and each of which lives at an independent point in the parameter space. The blue squares are surrogate samples that live together on the parameter islands. Each surrogate has noise correlated with a neighboring base sample. We show a two-headed arrow in grey to represent one base-surrogate correlation pair, but every base sample in the left panel is correlated with a surrogate on a parameter island, making a unique base-surrogate pair. The quantity, Q(θ*_i), has some true evolution at linearly spaced test locations within the domain [−10, 10]: T ≡ {θ*_i | i = 1, 2, ..., N_T}, which follows Eq. 16. We set N_T = 1000 and plot Q(θ*_i) in black.
The right panel of Fig. 2 shows a more typical approach, where samples are randomly drawn from a uniform distribution throughout the parameter space and are not correlated with one another. We will be interested in comparing CARPoolGP, with fifty base and fifty surrogate samples, to a standard GP that uses the purely random sampling approach with one hundred uncorrelated samples.

Determine noise kernels
For both base and surrogate samples, we use the radial basis function defined in Eq. 3 to describe the smoothly varying component of the covariance. Base and surrogates are drawn from the same underlying process and with the same level of sample variance, so the hyperparameters, τ, are shared across both matrices.
Figure 3. The CARPoolGP emulation with 95% confidence intervals (left panel, blue line and shaded region) and a standard sampling approach (right panel, blue line and shaded region). In both subplots, we show the true quantity variation as a black dotted line. Both emulations have the same cost in terms of the number of samples, but when using the CARPoolGP approach, with correlated base-surrogate samples that live on parameter islands as shown in Fig. 2, we obtain a much more accurate emulator with a greatly reduced predictive variance. Unlike the following figure (Fig. 4), no active learning has been used here to select the sample locations.

The only difference between the two matrices is the parameters used to generate them: V uses the base samples, while W uses the surrogate samples. The full covariance for the base samples and the surrogate samples can be written following Eq. 2 and Eq. 7. We choose the kernel that describes the smooth covariance between the base and surrogate samples to be an RBF following Eq. 9, but we set the additional parameter ∆q_BS = 0, as the processes between the base and surrogates are the same. We use the same scale and amplitude parameters as in the V_ij and W_ij matrices to define the covariance between base and surrogate samples. To relate the base samples to the surrogates, we use the fact that we have set a perfect correlation between the sample fluctuations, and therefore set the M matrix to M_ij = δ_ij, where δ_ij is a delta function that is 1 at locations of base-surrogate pairs and 0 elsewhere. Recall that the distances between parameter space locations in Y_ij and M_ij are evaluated between base and surrogate samples. Following Eq. 9, we can then build the block covariance matrix containing all of these components following Eq. 11, where τ is the vector of hyperparameters, τ = (α, γ, σ²_Q). Because we are interested in studying the effect of using correlated samples, we build the same covariance matrices for the scenario in which we have no surrogate simulations and each sample is simply uncorrelated (right panel of Fig. 2). In this scenario of completely uncorrelated samples, the block covariance matrix is equivalent to a standard RBF kernel with three free hyperparameters describing the amplitude and scale of the RBF, plus a diagonal term representing the variance level of the samples.
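The assembly of the block covariance can be sketched as follows. Eqs. 2, 7, 9, and 11 are not reproduced in this excerpt, so the exact placement of the correlated-noise term here is an assumption; the structure shown (smooth RBF blocks V, W, Y plus a pair-indicator matrix M carrying the correlated sample variance) follows the description in the text:

```python
import numpy as np

def rbf(a, b, amp, scale):
    return amp * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / scale**2)

def block_covariance(theta_B, theta_S, amp, scale, sig2):
    """Assemble a CARPoolGP-style block covariance (a sketch of Eq. 11)."""
    V = rbf(theta_B, theta_B, amp, scale)   # base smooth covariance
    W = rbf(theta_S, theta_S, amp, scale)   # surrogate smooth covariance
    Y = rbf(theta_B, theta_S, amp, scale)   # base-surrogate smooth covariance
    # M: 1 at base-surrogate pairs (pair i <-> i here), 0 elsewhere
    M = np.eye(len(theta_B))
    top = np.hstack([V + sig2 * np.eye(len(theta_B)), Y + sig2 * M])
    bot = np.hstack([(Y + sig2 * M).T, W + sig2 * np.eye(len(theta_S))])
    return np.vstack([top, bot])

theta_B = np.linspace(-10, 10, 6)
theta_S = np.round(theta_B / 4) * 4   # toy nearest-island locations
Sigma = block_covariance(theta_B, theta_S, amp=1.0, scale=3.0, sig2=0.01)
```

Because the smooth blocks come from one RBF evaluated over the concatenated point set and the noise block is itself positive semi-definite, the assembled matrix is a valid covariance.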

Maximize the likelihood function
We use the Gaussian likelihood function as defined in Eq. 12 and choose uninformative priors for µ_B and µ_S, but allow them to be learned as additional hyperparameters in the regression. We have written CARPoolGP in the JAX (Bradbury et al. 2018) programming language, allowing for auto-differentiation support and efficient gradient descent optimization. We minimize the negative log of the likelihood function to obtain an optimal set of hyperparameters, τ, using Stochastic Gradient Descent (SGD) from the optax package (DeepMind et al. 2020). To assess the effect of CARPoolGP, we perform an identical optimization process on the set of one hundred uncorrelated samples.
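The marginal-likelihood minimization can be sketched without the JAX/optax machinery. This dependency-free version uses finite-difference gradients on a plain RBF-plus-noise GP with synthetic data (the kernel, data, and learning rate are all made up for illustration; the paper's actual optimization uses autodiff and the full CARPoolGP block covariance):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-10, 10, 30)
y = np.sin(0.5 * x) + rng.normal(0.0, 0.1, 30)

def neg_log_like(log_tau, x, y):
    """Negative Gaussian log-likelihood (Eq. 12-style) for an RBF + noise kernel."""
    amp, scale, sig2 = np.exp(log_tau)   # optimize in log space to keep tau positive
    K = amp * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / scale**2)
    K += sig2 * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.log(np.diag(L)).sum() + 0.5 * len(x) * np.log(2 * np.pi)

# Plain gradient descent with finite-difference gradients; the paper instead
# uses optax's SGD with JAX auto-differentiation.
tau = np.log(np.array([1.0, 1.0, 0.1]))
nll_init = neg_log_like(tau, x, y)
best_tau, best_nll = tau.copy(), nll_init
for _ in range(200):
    grad = np.zeros(3)
    for i in range(3):
        e = np.zeros(3)
        e[i] = 1e-5
        grad[i] = (neg_log_like(tau + e, x, y) - neg_log_like(tau - e, x, y)) / 2e-5
    tau = tau - 0.01 * grad
    nll = neg_log_like(tau, x, y)
    if nll < best_nll:
        best_tau, best_nll = tau.copy(), nll
```

Working in log-parameters is a common trick to keep the amplitude, scale, and noise variance positive during unconstrained descent.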

Emulate the quantity
Figure 4. The results of emulation with this parameter sampling arrangement at each stage of active learning, shown with 95% confidence intervals. The next two panels show the following two stages of active learning, the last containing a cumulative total of fifty base samples with fifty associated surrogates, as in the left panels of Fig. 2 and Fig. 3.

We now have all we need to perform an emulation at sample points from the set T using Eq. 13. The left panel of Fig. 3 shows the CARPoolGP prediction, where the shaded region shows the 95% confidence region evaluated from σ²(θ*_i, θ*_i). We compare this with the right panel, which contains the prediction and error when the samples are uncorrelated and drawn from a uniform distribution in the parameter space. Both panels were generated with the same 'cost,' meaning that they both contain one hundred samples. However, the variance in predicted quantities in the CARPoolGP panel is much smaller than in the typical case and contains fewer spurious fluctuations.

Apply active learning
We now show that by performing our sampling and emulation in stages, we can choose the next places to sample that will provide the largest variance reduction in our final emulation. We follow the procedure outlined in Section 2.3 and test the efficacy of the active learning approach against the CARPoolGP approach without active learning from the left panel of Fig. 3.
We follow the outline above for the CARPoolGP case but set N = 20. The emulator is trained with the same kernels as previously described, but now with only twenty randomly sampled points and their associated surrogates, which live amongst the same five parameter islands as in the case above. The base and surrogate samples and the CARPoolGP prediction from this first stage are shown in the left-most panel of Fig. 4. This sparse sampling of points leads to a large predictive variance, particularly at undersampled locations (e.g., θ = −5).
Following Section 2.3, we first choose the number of new samples we want to generate, N_t, and a set of candidate points to pick from, N_c. To be clear, each new parameter location, θ_j, is chosen from a unique set of candidate points K_c,j, with K_c,j ∉ S and K_c,j ∉ T. The nearest-neighbor parameter island is identified for each candidate point, and an associated surrogate sample is placed on this island, increasing its population by one.

Figure 5. The standard error σ(θ*) is shown as a function of test parameter space locations, θ*, for the active learning stages (increasingly dark red lines), the standard CARPoolGP approach (blue line), and no CARPoolGP (dashed black line). The first stage of active learning contains 20 base and 20 surrogate samples, and each subsequent stage adds 10 base and 10 surrogate samples following the active learning method, such that stage 4 has a total of 50 base and 50 surrogate samples, as does the case of standard CARPoolGP, while the no-CARPoolGP case has the same total of 100 samples. CARPoolGP vastly outperforms the random sampling approach and is further boosted when combined with active learning.
We set N_t = 10, and for each point we check N_c = 512 candidates and associated surrogates within the domain of the problem. We then iteratively add each candidate point and associated surrogate to the covariance matrices to recalculate Σ(τ), K*(τ), and K**(τ), and compute the error on the set of test points following Eq. 13. We then compute the trace of the predictive covariance matrix as in Eq. 15 and save this value into the vector β_c. We find the best next place to sample the space as the candidate that minimizes this trace, θ_j = arg min_k β_c,k. Repeating this process N_t times provides a set of new locations at which to perform base sampling. Using the model of Eq. 16, we compute the value of Q at the new base and surrogate parameter locations and retrain the CARPoolGP emulator to obtain a new set of τ. In the second panel of Fig. 4, we show the results of our second stage of active learning. The ten new base samples are placed as red points, and the associated surrogates as blue squares. Not surprisingly, we find that the regions of highest variance in the first subpanel are the regions that CARPoolGP's active learning procedure targets to sample next (e.g., θ = −5). With this addition, we further find that the accuracy of the mean prediction is enhanced and the variance is vastly reduced.
As shown in the third and fourth panels, we performed two more active learning stages. Each shows the next base samples that were chosen and their associated surrogates. Following the final step, our parameter space has been sampled one hundred times, with fifty base samples and fifty surrogates. We wish to make three error comparisons: CARPoolGP vs. no CARPool, CARPoolGP vs. CARPoolGP with active learning, and CARPoolGP with active learning vs. no CARPool. These comparisons are presented in Fig. 5, where the error, σ(θ*_i) = √(σ²(θ*_i, θ*_i)), is shown as a function of the parameter space location. The increasingly dark red lines represent the active learning stages, starting with twenty base and twenty surrogate samples (lightest red) and ending with fifty base and fifty surrogate samples (darkest red). The scenario in which CARPoolGP is used with no active learning but contains fifty bases and fifty surrogates is shown in blue. We finally show the error from typical random sampling with one hundred uncorrelated samples in dashed black.
Here, we see that by performing active learning, we obtain a ∼36% reduction in σ(θ*) compared to regular CARPoolGP, and a ∼65% reduction compared to the standard random sampling approach. Even if active learning is not used, CARPoolGP outperforms random sampling by ∼45%. To put these results in terms of cost, assuming the error is reduced as 1/√N, Fig. 5 implies that one would require ∼3 times more samples without CARPoolGP to achieve the same error, and ∼8 times more to achieve the error of the active learning case. It should be noted that here, and in the rest of this work, we only use active learning to reduce the variance of a single quantity, which may not guarantee the same variance reduction for quantities not explicitly used in the active learning procedure. In general, more quantities can be used, and further research is required to investigate the changes in variance across quantities. We leave this multi-quantity active learning approach as an interesting avenue for future research.
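The quoted cost factors follow directly from the 1/√N error scaling: matching an error that is a fraction r of the original requires 1/r² times as many samples. The arithmetic can be checked:

```python
# sigma scales as 1/sqrt(N), so reaching an error reduced by a given fraction
# requires 1/(1 - reduction)^2 times as many samples.
def extra_sample_factor(error_reduction):
    remaining = 1.0 - error_reduction
    return 1.0 / remaining**2

factor_vs_plain = extra_sample_factor(0.45)    # CARPoolGP: ~45% lower error -> ~3x
factor_vs_active = extra_sample_factor(0.65)   # active learning: ~65% lower -> ~8x
```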

SIMULATIONS
The following subsections detail the generation of our simulation suite and the context in which CARPoolGP is used. We begin with an overview of the galaxy formation model used in this work and the parameter space spanned by the simulation suite. We then discuss the generation of base and surrogate simulations and how we utilize CARPoolGP active learning to iteratively choose future parameter space coordinates for simulations.
The flagship CAMELS suite parameterizes IllustrisTNG's galaxy formation model with four astrophysical parameters corresponding to supernova and AGN feedback, but more recently Ni et al. (2023) (henceforth Ni23) extended this set to include 2048 simulations sampled from a Sobol sequence (Sobol' 1967) over a 28-dimensional astrophysical and cosmological parameter space in the IllustrisTNG model (TNG-SB28).
This work explores the same 28-dimensional parameter space as Ni23. The parameter names, brief descriptions, and bounds can be found in Appendix A of Ni23. We also add a parameter to control the desired halo mass of the main halo in the zoom-in region. We limit this mass to the range 13 ≤ log(M_200 / M_⊙ h⁻¹) ≤ 14.5, where M_200 is the total mass enclosed within the radius R_200, inside which the mean density is two hundred times the critical density, ρ_c, of the universe. We consider this mass range as it is relevant to current X-ray and upcoming CMB experiments.
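For reference, M_200 and R_200 are related through M_200 = (4/3)π R_200³ · 200 ρ_c. A quick sketch, using the standard comoving critical density ρ_c ≈ 2.775 × 10¹¹ M_⊙ h² Mpc⁻³:

```python
import numpy as np

RHO_CRIT = 2.775e11   # critical density in (Msun/h) per (Mpc/h)^3, comoving

def r200_from_m200(m200):
    """R200 in Mpc/h from M200 = (4/3) pi R200^3 * 200 * rho_crit."""
    return (3.0 * m200 / (4.0 * np.pi * 200.0 * RHO_CRIT)) ** (1.0 / 3.0)

r = r200_from_m200(1e14)   # a halo near the top of the suite's mass range
```

A 10¹⁴ M_⊙ h⁻¹ halo comes out at roughly three-quarters of a Mpc/h, a useful sanity check on the overdensity definition above.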
We further highlight that our sampling method differs from TNG-SB28 in that we do not sample all of our simulations from the same Sobol sequence; instead, we utilize the active learning method outlined in Section 2.3. We discuss this process and its implications in the following section.
At each parameter space location, θ_i, we run three separate simulations: a parent simulation used to identify a halo for zooming in, a hydrodynamical zoom-in, and a dark matter-only analog zoom-in of the halo chosen from the parent simulation. We first create a (200 Mpc h⁻¹)³ parent box with 256³ dark matter particles, whose mass scales with Ω_m, the matter density of the universe, which is varied in the 28-dimensional parameter space. Note that this is a very low-resolution simulation and exists only for the purpose of generating a large cosmological box with a large population of different halo masses. Initial fluctuations are generated at z = 127 using MUltiScaleInitialConditions (MUSIC) (Hahn & Abel 2011) with second-order Lagrangian Perturbation Theory, and evolved to z = 0 using AREPO (Weinberger et al. 2020). We identify halos in the parent box using the Friends-of-Friends algorithm (Davis et al. 1985). We then look for a halo at z = 0 whose mass M_200 (calculated by SUBFIND (Springel et al. 2001) around the particle with the minimum energy, with respect to the critical density) is closest to the desired mass θ_i,Mass. The true mass of the halo is then used to replace the mass entry in this vector, θ_i,Mass = M_200.
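The parent-box particle mass expression is elided in this excerpt; for a uniform box it follows the standard relation m_DM = Ω_m ρ_c (L/N)³. A sketch with an assumed example value of Ω_m (the suite varies Ω_m, so this single number is illustrative only):

```python
RHO_CRIT = 2.775e11   # critical density in (Msun/h) per (Mpc/h)^3, comoving

def parent_particle_mass(omega_m, boxsize=200.0, n_side=256):
    """Dark matter particle mass in Msun/h for the low-resolution parent box."""
    return omega_m * RHO_CRIT * (boxsize / n_side) ** 3

m = parent_particle_mass(0.3)   # Omega_m = 0.3 is an assumed example value
```

This gives a particle mass of order 10¹⁰ M_⊙ h⁻¹, consistent with the statement that the parent box is very low resolution compared to the zoom-in regions.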
All particles within 6 × R_200 of the chosen halo are identified and traced back to their locations in the initial conditions, where a bounding rectangular Lagrangian region is drawn to encapsulate all of the traced particles. We choose such a large region surrounding each halo to reduce low-resolution dark matter contamination in the final zoom-in simulation (Oñorbe et al. 2014). The hydrodynamical zoom-in simulation is then run centered on the chosen halo, where the gas particles have M_gas = 1.26 × 10⁷ (Ω_b/0.049) h⁻¹ M_⊙ and the high-resolution Lagrangian region contains dark matter particles with mass M_DM,HighRes = 6.49 × 10⁷ ((Ω_m − Ω_b)/0.251) h⁻¹ M_⊙, matching the resolution of the (25 Mpc h⁻¹)³ CAMELS boxes. For completeness, we perform associated dark matter-only zoom-in analogs using the same method and at the same resolution.

Base-surrogate simulation pairs
The benefit of CARPoolGP described in Section 2.1 relies on correlations in the sample fluctuations between the bases and surrogates. In the toy model (Section 2.4), we forced a perfect correlation by adding the same Gaussian distributed noise realization to base-surrogate pairs, but when the quantity of interest is extracted from a simulation, the sample variance, often called "cosmic variance," derives from the random seed describing the phases and amplitudes of initial fluctuations in the simulations themselves. The base and surrogate simulations are then built by performing a zoom-in of the same halo at two separate points in parameter space: one at a unique parameter space location (base), and one on a parameter island (surrogate).
The process for creating this pair starts with the generation of the base parent box and the choice of a halo at some parameter space location, following the previous section. The surrogate halo is then chosen by first finding the closest (smallest Euclidean distance) parameter island, θ_S, to the parameter space coordinate of the base simulation. We then generate a parent box for the surrogate following the same process as for the base, but at the parameter location θ_S. While the amplitudes of the MUSIC initial conditions depend on cosmological parameters, the random numbers controlling the phases of the initial fluctuations of the surrogate simulation are matched to the base simulation, so that their realizations are affected only by the changes in parameter space location and have correlated sample variance. We determine the halo catalog in the surrogate simulation using FoF, and the surrogate halo is then found by performing a bijective matching between the base halo and the surrogate parent box to find the most probable surrogate halo match. Just as in the base simulations, the mass parameter θ_S,Mass is set to the true value of the surrogate halo mass: θ_S,Mass = M_200, where M_200 is computed through SUBFIND. We note that the differences in cosmology translate to differences in mass between a base-surrogate pair. We describe this effect on the distribution of masses in the full suite of simulations in Sec. 3.4. Now that both the base and surrogate have their associated parent boxes and zoom-in regions, we run the hydrodynamical and dark matter-only analogs following the previous section.
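The phase matching works like seeding the same random number generator for both simulations; schematically (this is a toy stand-in, and MUSIC's actual interface is not assumed here):

```python
import numpy as np

def ic_phases(seed, n_modes=8):
    """Toy stand-in for the random phases of initial-condition Fourier modes."""
    return np.random.default_rng(seed).uniform(0.0, 2.0 * np.pi, n_modes)

base_phases = ic_phases(seed=1234)
surrogate_phases = ic_phases(seed=1234)   # same seed -> identical phases
# The mode amplitudes would still differ, since they depend on the surrogate's
# cosmological parameters; only the phase realization is shared.
```

Sharing the phase realization is what induces the correlated cosmic variance between the base and surrogate halos that CARPoolGP exploits.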

Sampling stages
In this work, we apply the active learning procedure of Section 2.3, unlike the Sobol sequence used in Ni23 or the Latin hypercube in Villaescusa-Navarro et al. (2021c). Our goal is to perform simulations at locations in the 29-dimensional parameter space (28 IllustrisTNG parameter dimensions and 1 mass parameter dimension sampled logarithmically) that optimally reduce the predictive covariance of some chosen quantity, as presented in the toy model of Section 2.4.
This process requires the choice of 1.) some quantity to extract from the simulations, 2.) a covariance kernel trained on a set of existing simulations, 3.) candidate parameter space locations at which to consider running future simulations, and 4.) an independent set of test parameter space locations over which to compute the variance. We address each of these steps in the following paragraphs.
We focus our active learning on the Compton Y parameter, defined as

Y = (σ_T / m_e c²) ∫_0^{R_200} P_e dV,

where σ_T is the Thomson cross section, m_e is the mass of an electron, c is the speed of light, and the integral of the electron pressure (P_e) is performed within a sphere of radius R_200. We will refer to this quantity as Y in the text for brevity.
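Numerically, the integral defining Y amounts to summing the electron pressure over spherical shells. A minimal sketch with SI constants and a made-up power-law pressure profile (purely illustrative of the machinery, not the simulations' P_e):

```python
import numpy as np

SIGMA_T = 6.6524e-29   # Thomson cross section [m^2]
M_E = 9.1094e-31       # electron mass [kg]
C = 2.9979e8           # speed of light [m/s]

def compton_y(pressure, r, r200):
    """Y = sigma_T/(m_e c^2) * integral of P_e dV inside a sphere of radius R200.

    `pressure` returns the electron pressure [Pa] at radius r [m]; the volume
    integral is done over spherical shells with the trapezoidal rule.
    """
    rr = r[r <= r200]
    f = pressure(rr) * 4.0 * np.pi * rr**2
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(rr))
    return SIGMA_T / (M_E * C**2) * integral

# Toy power-law pressure profile over cluster-like scales
r = np.linspace(1e20, 1e23, 2000)   # radii in meters
y = compton_y(lambda x: 1e-12 * (x / 1e22) ** -2.0, r, r200=5e22)
```

The result carries units of area (m² here); dividing by the square of a reference length such as 1 Mpc gives the dimensionless convention often quoted.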
As discussed in Section 1, future CMB and X-ray experiments will resolve the tSZ effect in halos down to the M_200 ∼ 10¹³ M_⊙ h⁻¹ scale, providing constraints on astrophysical processes such as AGN feedback and galactic winds. In addition, Y is a proxy for halo mass following commonly used self-similar relations and also represents the thermodynamic properties of the halo gas (Nagai 2006; Battaglia et al. 2012; Ettori et al. 2013). In numerous recent studies using CAMELS, the Y signal has been explored for the purpose of finding improved scaling relations (Wadekar et al. 2023a,b), constraining galaxy environments (Hadzhiyska et al. 2023b), and predicting feedback constraints from the CGM (Moser et al. 2022). In these works, the halo mass is typically limited to 10¹³ M_⊙ h⁻¹ because of the small number of halos that exist across the CAMELS suite above this range. By including a reduced-variance emulator of Y for high-mass halos, we hope to provide a means of improving the mass coverage of these works across the full range of parameter space.
The first set of base simulations, B_1 ≡ {θ_i | i = 1, 2, ..., 128}, is drawn from a Sobol sequence with 128 parameter space locations across the 29-dimensional parameter space that spans the prior ranges shown in Ni23. The parameter islands, S ≡ {θ_i | i = 1, 2, ..., 128}, are chosen to be a set of 128 parameter space coordinates from a Sobol sequence initialized such that S ∩ B_1 = ∅, and the set of surrogates is chosen by finding the nearest-neighbor island to each base simulation. The parameter space islands used in this first stage are kept constant throughout all the active learning stages, and each surrogate is always chosen in this way. We run Stage 1 of the group-scale zoom-in simulation suite with the first set of 128 base and 128 surrogate parameter space locations following Section 3.1 and compute the resulting Y from Eq. 23 for each simulation.
Covariance kernels are generated following Section 2.1, using RBF kernels (Eq. 3) for the smooth covariance matrices (V in Eq. 2, W in Eq. 7, Y in Eq. 9) and linear exponential kernels to correlate the sample variance (M in Eq. 9). Just as in the toy example of Section 2.4, the base and surrogate simulations are drawn from the same process (i.e., resolution), so the kernels share the same hyperparameters. We minimize the negative log-likelihood function in Eq. 4 using an SGD optimizer and the extracted values of Y from each halo in the simulations to obtain the optimal covariance kernel. We can now generate predictions and find the predictive covariance of Eq. 13.
The next step in the active learning procedure is to test candidate parameter locations and observe their relative effect on the predictive covariance matrix. To do this, we generate a new Sobol sequence with 1024 candidate points, K ≡ {θ_i | i = 1, 2, ..., 1024}, with K disjoint from B_1 and S, and another with test points T ≡ {θ_i | i = 1, 2, ..., 1024}, with T disjoint from B_1, S, and K. Each candidate point is given a surrogate at the nearest parameter island. We iteratively compute the predictive covariance matrix at the test points after each candidate (and associated surrogate) is added, independently of all other candidates, to the predictive kernel. We then build the β vector following Section 2.3 by computing the trace of the predictive covariance matrix. The candidate and surrogate pair that produces the smallest β value is saved, and these parameter space locations are chosen for the next base and surrogate simulations. We keep this base-surrogate pair in the covariance matrix and repeat this process 128 times, with a new set K each time, until we have a new vector of 128 base and 128 surrogate parameter space locations at which to perform simulations. This next set of 256 simulations is called stage 2.
Figure 6. The distribution of masses from the suite of zoom-in simulations. We use a kernel density estimate of the distribution, drawn as the blue line. For a Sobol sequence, the distribution in this space would be flat, i.e., have the same number of simulations in each bin, but it is clear that there are particular places in parameter space that are preferred when using the active learning approach as opposed to a standard Sobol sequence.

We perform a third and fourth active learning step following the same process as in the above paragraph, except with a smaller number of simulations: 64 base and 64 surrogate. We call these stages 3 and 4, and following this set, we are left with a total of 384 base and 384 surrogate simulations.
Fig. 6 shows the mass distribution produced by the full suite of simulations. There are two important pieces of information in this plot. First, the distribution of simulations in the parameter space is not uniform, as one would expect from a Sobol sequence, and instead contains regions of apparent preference. This makes clear that the resultant vector of parameters from the active learning CARPoolGP approach differs from what one would obtain using a Sobol sequence or Latin hypercube. Second, the prior range of masses, 13 ≤ log(M_200 / M_⊙ h⁻¹) ≤ 14.5, is exceeded by some of the simulations. There are a few reasons for this. First, some base simulations have a desired mass parameter near the bounds of the prior range. In these cases, the halo with the most similar mass may lie slightly outside this range, broadening the distribution. A more frequent occurrence is that a base simulation with a relatively high (low) extracted mass can have a surrogate simulation at a parameter island with a larger (smaller) value of Ω_m, causing the matched halo in the surrogate to be more (less) massive than the prior bounds. Both of these effects lead to a general broadening of the mass distribution. While we include all of the halos in the following analysis, we limit the emulated values to within the prior bounds and treat masses outside this range as extrapolations rather than interpolations.

Active learning stages
For each active learning step, we compute the predictive covariance of Eq. 13 on a set of 4096 test locations drawn from a Sobol sequence within the priors described in Ni23 and the additional mass prior. We then fit a Gaussian to the distribution of the square root of the diagonal of the predictive covariance matrix, which represents the predictive standard deviation or uncertainty, σ_Y. In Fig. 7, we plot the fits for each stage as solid lines. As expected, there is a clear trend of decreasing predictive uncertainty with each stage.
It is also interesting to consider the expected variance reduction from CARPoolGP prior to generating the simulations. We can do this by simply incorporating the suggested new points into the covariance matrices in Eq. 13. The dotted lines in Fig. 7 represent what CARPoolGP predicts the distribution should look like after each step.
This experiment indicates that stage 2 did not improve the uncertainty as much as expected. On the other hand, stage 3 performed almost exactly as expected, and stage 4 outperformed expectations. This discrepancy could be due to several factors. First, the predictions of Y using only the data from stage 1 contain some predictive uncertainty that could propagate into the expected results. An alternative explanation is that stage 2 doubled the number of simulations compared to stage 1. As noted in Section 2.3, the active learning procedure operates under the assumption that the best-fit kernel parameters, τ, vary slowly with respect to the addition of simulations. We used that assumption to incorporate many samples in stage 2, but we likely included too many in this stage, violating this assumption. After observing these results following stage 2, we reduced the number of new simulations in each stage by half, leading to variance reduction at the predicted level.
The most conservative approach to active learning would be to add the simulations one at a time in each active learning step. This would ensure that the optimal hyperparameters stay very similar between stages but would require running the simulations in serial, which would be infeasible for running hundreds of simulations in a timely manner. A balance must be struck in which the number of new simulations does not significantly change the values of the hyperparameters but still allows a generous number of simulations to be run simultaneously in parallel. We do not investigate this further here, leaving exploration of this balance to future work.

RESULTS AND PHYSICAL INTERPRETATIONS
In this section, we present results using the zoom-in simulations and CARPoolGP. We demonstrate the utility of the new suite and emulator in addressing questions in numerical galaxy formation and cosmology that were hitherto inaccessible. We begin by emulating the Y − M relation at the fiducial IllustrisTNG parameter space location and with variations of individual parameters around it to study its dependencies. We then use CARPoolGP to emulate other halo summary statistics, allowing qualitative explanations of the observed parameter trends in the Y − M relation.

Y-M emulations
Under the approximation that gas in the most massive halos is in hydrostatic equilibrium in their deep gravitational potential wells, a simple power-law relationship exists between the mass and the integrated Compton Y parameter (Eq. 23): Y ∝ M^{5/3} (Kaiser 1986; Bryan & Norman 1998). The amplitude of this power law, c_0, must be calibrated against the largest-mass halos, as in detail it depends upon the baryonic and total matter density profiles, as well as the gas temperature. We computed c_0 for IllustrisTNG by fitting the power law to halos with mass log(M_200 / M_⊙ h⁻¹) > 14 and found c_0 = 10^{−27.84} M_⊙^{−5/3} Mpc² h^{1/3}. It is common to rescale Y by this self-similar expectation (Wadekar et al. 2023a).

Figure 8. The data are shown with Poisson error bars, while the emulated profile is shown as the blue line with its 1σ predictive uncertainty in the shaded band.

It is worth noting that the resolution of our zoom-in simulations is similar to that of IllustrisTNG300-1, such that we ideally expect our emulator to closely reproduce the IllustrisTNG300-1 results.
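With the 5/3 slope fixed by self-similarity, the amplitude calibration is a one-parameter least-squares fit in log space; a sketch on synthetic data (the scatter level here is made up, and the mock data are built around the amplitude quoted in the text):

```python
import numpy as np

rng = np.random.default_rng(3)
log_m = rng.uniform(14.0, 14.5, 40)        # log(M200 / Msun h^-1) > 14 halos
log_c0_true = -27.84                       # amplitude quoted in the text
log_y = log_c0_true + (5.0 / 3.0) * log_m + rng.normal(0.0, 0.05, 40)

# With the slope fixed, the least-squares amplitude in log space is simply
# the mean residual about the self-similar line.
log_c0_fit = np.mean(log_y - (5.0 / 3.0) * log_m)
```

Restricting the fit to the highest-mass halos, as done here, keeps the calibration in the regime where the hydrostatic, self-similar approximation is most accurate.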
There is indeed an overall strong agreement between CARPoolGP and IllustrisTNG300-1, with a deviation from self-similarity at lower masses and strong agreement at the highest masses. There appear to be four data points in tension with the IllustrisTNG result between 13.2 ≤ log(M_200 / M_⊙ h⁻¹) ≤ 13.6; however, the emulated values of these points are ∼95% correlated with one another, implying that they can be represented as a single point in this mass range. The true tension between the emulated and IllustrisTNG data in this region is then small, at the ∼2σ level. This result is promising evidence that CARPoolGP has learned enough to perform predictions across the entire mass range at the fiducial parameter values and to match IllustrisTNG's results with a reduced-variance estimate. We emphasize that the emulator was not trained on simulations at the fiducial IllustrisTNG parameter space coordinate, let alone across the full mass range at this location.
In Fig. 9, we modulate individual parameters between the extrema of their prior bounds while keeping all other parameters fixed at their fiducial values. We focus on the three stellar-process-related parameters with the largest influence on the Y − M relation, while showing the Y − M relation over the full 28-dimensional parameter space in Fig. 15 of Appendix A (see Ni23 for a full description of the parameters). Fig. 9 shows that the Y − M relation and its deviation from self-similarity are sensitive to stellar processes such as SN feedback (controlled by ASN1 and ASN2) and star formation (IMF slope). As the strength or velocity of SN winds (via ASN1 and ASN2, respectively) increases (decreases), we find an overall enhancement (suppression) of the Y − M relation. Similarly, as the number of massive stars increases (decreases) with a shallower (steeper) IMF slope, we observe a suppression (enhancement) of the Y − M relation. We discuss the physical causes of these effects below in Section 4.2.
The flagship CAMELS (25 Mpc h−1)^3 volume boxes include a set of simulations, the CAMELS 1P set, that performed this exact method of individual parameter variation to explore these dependencies at the level of cosmological boxes. For validation purposes, we also generated a similar suite of individual-parameter-variation zoom-in simulations containing two halos, along with a small subset of zoom-in simulations run at the prior bounds of ASN1 and ASN2. We add points extracted from the CAMELS 1P set, as well as points generated from the 1P zoom-in simulations, to Fig. 9 to gauge the emulator's performance, particularly at the lowest masses. We see that, qualitatively, there is agreement between the emulated prediction of a parameter's evolution and the 1P set. When the emulated Y − M relation is unchanged after modulating a parameter, that parameter does not influence the Y − M relation.

A physical picture of parameter trends in the Y-M relation
Two competing effects modify the pressure in the IGrM/ICM and, therefore, the Y − M relation (Eq. 23): SN winds originating from the ISM and AGN feedback from the central black hole.
This section provides a qualitative physical picture of how the IllustrisTNG model parameters influence SN and AGN feedback in the IGrM and ICM. This allows us to better understand the parameter trends in the Y − M relation (see Fig. 9 and Fig. 15). It is important to note that a rigorous investigation into the interplay between SN and AGN feedback in the IGrM/ICM is beyond the scope of this work, requiring more in-depth theoretical study and comparisons to observations. However, for the first time, we can observe how changes in the galaxy formation model parameters influence these processes and their subsequent effects on galaxy formation. We begin by introducing various emulations of physical quantities, which we then tie together into a physical picture of the IGrM and ICM. During this process, we discover puzzling effects that open the door to interesting avenues for future research.

ASN1
The model parameters showing dominant effects on the Y − M relation are ASN1, ASN2, and the IMF slope. Here, ASN1 and ASN2 are related to the IllustrisTNG model's supernova feedback and represent normalization factors for the energy of galactic winds per unit star formation rate and the velocity of galactic winds, respectively (Pillepich et al. 2018; Ni et al. 2023). The IMF slope is the slope of a modified Chabrier (2003) initial mass function above 1 M⊙.
In the IllustrisTNG model, SN feedback is parameterized by the mass loading factor (Pillepich et al. 2018),

η ≡ ṀW / Ṁ⋆ = (2 / v_w^2) e_w (1 − τ_w), (26)

where ṀW is the wind mass outflow rate, Ṁ⋆ is the star-formation rate, v_w is the redshift-dependent SN wind velocity normalized by ASN2, e_w is the metallicity-dependent energy per unit mass of formed stars in SN winds normalized by ASN1, and τ_w is the fraction of thermal energy given to SN winds. AGN feedback in the IllustrisTNG model occurs in two main states: thermal and kinetic. In the thermal mode, the black hole is inefficient at heating or ejecting gas far away from the black hole, while in the kinetic mode, major ejection events occur that can efficiently heat the halo gas and even expel it out of the halo. The transition to the kinetic mode in the fiducial TNG model occurs when a black hole's Eddington ratio falls below the accretion rate threshold defined as (Weinberger et al. 2017)

χ = min[χ0 (M_BH / 10^8 M⊙)^β, 0.1], (27)

where χ0 is a threshold normalization (QuasarThreshold) and β is a power-law scaling index (QuasarThresholdPower) (Weinberger et al. 2017; Ni et al. 2023). In the fiducial IllustrisTNG model, this transition occurs at a black hole mass of 10^8 M⊙. Thus, any suppression or enhancement of the black hole's accretion can affect the thermodynamic state of the halo.

Figure 11. The emulated black hole mass (upper panels) and gas metallicity (lower panels) as a function of halo mass at the high (red), fiducial (green), and low (blue) bounds of ASN1 (left panels), ASN2 (center panels), and the IMF slope (right panels). We find a significant suppression (enhancement) of the central black hole mass with an increase (decrease) in the energy per unit mass of stars formed in SNe winds, ASN1. We speculate that this is due to an increased (decreased) mass loading factor (Eq. 26), which removes gas from the black hole's accretion supply. The increases in the IMF slope lead to a metallicity enrichment in the halo, which cools and collapses the gas and allows the central black hole to grow larger.
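The mass loading factor (Eq. 26) and the kinetic-mode accretion threshold can be sketched concretely as follows (hypothetical function names, black hole mass assumed in units of M⊙; this illustrates the published formulas, not the IllustrisTNG source code):

```python
import numpy as np

def mass_loading(v_w, e_w, tau_w):
    """Wind mass loading factor, eta = Mdot_W / Mdot_* = (2 / v_w^2) e_w (1 - tau_w)
    (Pillepich et al. 2018). v_w is the SN wind velocity (normalized by ASN2) and
    e_w the energy per unit mass of formed stars (normalized by ASN1)."""
    return 2.0 * e_w * (1.0 - tau_w) / v_w**2

def kinetic_mode_threshold(m_bh, chi0, beta):
    """Eddington-ratio threshold for the kinetic AGN mode,
    chi = min[chi0 (M_BH / 1e8 Msun)^beta, 0.1] (Weinberger et al. 2017).
    chi0 is QuasarThreshold and beta is QuasarThresholdPower."""
    return np.minimum(chi0 * (m_bh / 1e8) ** beta, 0.1)
```

Because the threshold grows with black hole mass and is capped at 0.1, more massive black holes transition into the kinetic mode more readily, which is why suppressing or enhancing black hole growth changes the halo's thermodynamic state.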
In Fig. 10, we present the emulated temperature, electron density, and pressure profiles for a log(M200/M⊙ h−1) = 13.75 halo for ASN1, ASN2, and the IMF slope. To generate this, we compute the mass-weighted temperature, the mean electron density, and the electron pressure in twenty log-spaced bins between 0.01 ≤ r/R200 ≤ 1 from the suite of zoom-in simulations. We then train a CARPoolGP emulator for each bin. We use this set of CARPoolGP emulators to construct profiles at the highest and lowest bounds of ASN1, ASN2, and the IMF slope. Then, we normalize the profiles by emulations at the fiducial parameter space location in order to highlight the variations with respect to the fiducial model.
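The per-bin emulation strategy can be sketched with an off-the-shelf Gaussian process standing in for CARPoolGP (illustrative only; scikit-learn's GaussianProcessRegressor replaces the CARPoolGP emulator, and the array names are hypothetical):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def train_profile_emulators(X, profiles):
    """Train one Gaussian-process emulator per radial bin.
    X: (n_sims, n_params) parameter-space coordinates of the simulations.
    profiles: (n_sims, n_bins) binned quantity, e.g. log mass-weighted temperature."""
    emulators = []
    for b in range(profiles.shape[1]):
        gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
        gp.fit(X, profiles[:, b])
        emulators.append(gp)
    return emulators

def emulate_profile(emulators, theta):
    """Predict the full radial profile (mean and 1-sigma) at one parameter point."""
    preds = [gp.predict(theta[None, :], return_std=True) for gp in emulators]
    mean = np.array([m[0] for m, _ in preds])
    std = np.array([s[0] for _, s in preds])
    return mean, std
```

Profiles at the prior bounds of a parameter can then be divided by the emulated fiducial profile, as in Fig. 10.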
For reasons that will soon become clear, we complement Fig. 10 with Fig. 11 and Fig. 12, where we emulate the central black hole mass and gas metallicity as a function of halo mass, and the satellite galaxy quenched fraction for a log(M200/M⊙ h−1) = 13.75 halo as a function of satellite stellar mass, respectively. We define the satellite galaxy quenched fraction in the i-th stellar mass bin following Donnari et al. (2021),

f_q,i = (1/N_i) Σ_j Θ(10^−11 yr^−1 − Ṁ⋆,j / M⋆,j), (28)

where N_i is the number of satellite galaxies in the i-th stellar mass bin, Θ is the Heaviside step function, Ṁ⋆,j is the sum of star formation rates within twice the stellar half radius R_1/2,j of satellite j, M⋆,j is the total stellar mass within R_1/2,j, and the sum is performed over all satellites within the appropriate stellar mass bin. We consider five logarithmically spaced bins between 8 ≤ log(M⋆/M⊙) ≤ 11, and we only consider satellites inside R200. As ASN1 increases, the energy of the winds increases along with the mass loading factor. We expect a decrease in the star formation rate, as stellar winds can transfer more energy and mass out of the interstellar medium (ISM) and into the IGrM and ICM. This, in turn, reduces the metallicity of the IGrM and ICM (Fig. 11, lower left panel) and restricts the growth of the central black hole (Fig. 11, upper left panel).
The delayed transition into the kinetic mode can reduce the influence of the AGN feedback compared to the fiducial model. The combination of an increased mass loading factor that removes more gas from the ISM, decreased metallicity, which limits the cooling of gas and star formation, and delayed BH growth, which restricts the powerful expulsion of gas from the halo, leads to a net increase in the density throughout the halo (Fig. 10, left center panel). Further, we expect the reduced black hole growth to limit the powerful AGN quenching of surrounding satellite galaxies. We see this to be true in the left panel of Fig. 12. The overall temperature (Fig. 10, left upper panel) of the halo is not greatly affected by these changes, as expected (Loken et al. 2002). The effects on the pressure profile (Fig. 10, left lower panel) appear to be dominated by the density changes. A similar, mirrored story can be told for the decrease in ASN1. The decrease in mass loading allows the central black hole to accrete faster and enter the kinetic feedback mode at earlier times. This efficient feedback mode displaces gas throughout the halo and decreases density at all radii while efficiently quenching satellite galaxies in the process.
In this qualitative picture, we speculate that the dominant mechanism affecting the pressure is the AGN feedback, even though the suppression or enhancement of the central black hole's growth via SN winds spawns this effect. Strangely, we find that the AGN parameters, which can also affect the growth of the central black hole, do not give rise to significant changes in the Y − M relation. This is somewhat in tension with the above picture and implies either an alternative mechanism controlling the density of the IGrM and ICM or a complex interaction between the SN and AGN feedback, which we have overlooked. We leave this as an exciting avenue for future research.
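The satellite quenched-fraction statistic (Eq. 28) used in Fig. 12 can be sketched as follows (hypothetical function names; we assume the common specific star formation rate cut of 10^−11 yr^−1 from Donnari et al. 2021, with SFR and stellar mass measured within twice the stellar half radius):

```python
import numpy as np

def quenched_fractions(m_star, sfr, log_mass_bin_edges, ssfr_cut=1e-11):
    """Fraction of satellites whose sSFR = SFR / M_* falls below ssfr_cut (per yr),
    computed in logarithmic stellar-mass bins. m_star in Msun, sfr in Msun/yr;
    bins containing no satellites return NaN."""
    ssfr = sfr / m_star
    fractions = []
    for lo, hi in zip(log_mass_bin_edges[:-1], log_mass_bin_edges[1:]):
        in_bin = (m_star >= 10.0**lo) & (m_star < 10.0**hi)
        if in_bin.sum() == 0:
            fractions.append(np.nan)
        else:
            fractions.append((ssfr[in_bin] < ssfr_cut).mean())
    return np.array(fractions)
```

In practice, only satellites inside R200 of the host would be passed in, matching the selection described above.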

ASN2
For ASN2, as the velocity of stellar winds is enhanced, wind particles deposit their energy farther out in the halo and, similar to ASN1, decrease the star formation rate. This leads to an effect similar to ASN1, reducing the metallicity of the gas (Fig. 11, center lower panel) and starving the black hole (center upper panel). However, it is clear from Fig. 11 that this effect is smaller than for ASN1. In fact, the density profiles of Fig. 10 (center middle panel) show what we would expect from an increase in SN wind velocity in the absence of kinetic AGN feedback. Namely, we find that increasing ASN2 depletes the ISM of gas, moving it to the outer regions of the halo. This causes an increase in the quenching of satellite galaxies (Fig. 12, center panel), as their star-forming material is removed and deposited farther out in their CGM, where it is less able to recycle back and contribute to further star formation.
SN-driven winds are generally not powerful enough to expel gas above the virial velocity of these halos, as kinetic AGN feedback can. In this way, we find that for a high-velocity wind scenario, there are denser regions in the outer IGrM and ICM, whereas, in low-velocity scenarios, the density is enhanced in the inner 10% of the halo. We find a slight increase in the average gas temperature in the inner 10% of the halo but a mostly constant temperature profile throughout the rest of the halo. Therefore, the dominant effect on the pressure is again due to changes in density. We conclude from this that the observed effects in the Y − M relation of Fig. 9 are sourced by changes in the SN feedback, such that increases in the SN winds enhance the density and pressure of the IGrM, which leads to an enhanced Y − M relation.

IMF Slope
The IMF slope controls the distribution of the masses of individual stars. As the IMF slope increases (becomes less negative), more massive stars are formed, which implies more supernovae and, hence, more chemical enrichment and SN feedback. On the one hand, we expect an increase in the IMF slope, and hence in the number of supernovae, to give rise to effects similar to those seen for ASN1: a reduction in the growth of the black hole and depletion of the gas. Instead, we find almost the exact opposite. The reason for this is the IMF slope's major effect on the gas's metallicity (Fig. 11, lower right panel). The increased chemical enrichment in the TNG model leads to a reduction in supernova feedback efficiency (Pillepich et al. 2018), but also to more efficient cooling (Vogelsberger et al. 2013). The heavy ions lead to an enhanced cooling flow in the halo, allowing the central black hole to grow significantly faster (Fig. 11, upper right panel). The faster transition into the kinetic AGN mode ejects more gas than in the fiducial and low IMF slope cases.
To further support this picture, we show that the quenching of satellite galaxies (Fig. 12, right panel) is reduced as the IMF slope increases. The increased cooling from the enrichment of the IGrM and ICM leads to an enhancement of star formation in satellite galaxies. While the satellites may still experience environmental quenching from the central black hole in the kinetic feedback mode, their own black holes have likely not grown large enough to start internal quenching processes. Thus, we see that star formation due to the increased reservoir of cold gas in the satellites dominates over the environmental quenching mechanisms.

Figure 12. The quenched fraction of satellite galaxies in a log(M200/M⊙ h−1) = 13.75 halo as a function of satellite stellar mass. The parameter variations are the same as in Fig. 11. Decreases in the black hole mass (Fig. 11) correspond to suppressed quenching of satellite galaxies for ASN1, while for ASN2 and the IMF slope, slowing the speed of stellar winds and enhancing the gas enrichment, respectively, reduce the quenching of satellite galaxies.

DISCUSSION
In this section, we first compute Fisher matrix constraints on the astrophysical parameters for the Y − M relation. We show that by using high-mass halos, the predicted constraints from next-generation SZ experiments are significantly tightened compared to Wadekar et al. (2023b) (henceforth W23), but that when marginalizing over the entire IllustrisTNG parameter space, these constraints significantly diminish. We then discuss the caveats in our CARPoolGP testing methods.

Astrophysical parameter constraints
In Fig. 9, we showed that astrophysical parameters modify the magnitude of the Y − M relation and its deviation from self-similarity. In this section, we consider the possibility of constraining the strength of astrophysical parameters with a CMB-S4 and a DESI-like survey. In W23, the strength of four astrophysical parameters was constrained using the CAMELS (25 Mpc h−1)^3 boxes containing halos with log(M200/M⊙ h−1) < 13. Here, we replicate the analysis but use halos with 13 ≤ log(M200/M⊙ h−1) ≤ 14.5, based on our zoom-in simulations and CARPoolGP, and observational constraints in larger mass bins. Just as in W23, we use forecasts of the Y − M relation following Pandey et al. (2020). We refer the reader to Pandey et al. (2020) for details on the generation of the Y − M relation, and show in Table 2 the forecasted constraints for three log-mass bins between 13 ≤ log(M200/M⊙ h−1) ≤ 14.5 (Pandey et al. 2020, private communication).
W23 solely had access to four astrophysical parameters (WindEnergyin1e51erg - ASN1, RadioFeedbackFactor - AGN1, VariableWindVelFactor - ASN2, RadioFeedbackReorientationFactor - AGN2). To provide a like-for-like comparison, we limit the astrophysical parameter space to these same four parameters. We perform a Fisher forecast assuming the parameter distributions are Gaussian and compute (Tegmark 1997)

F_ab = Σ_ij (∂ log Ȳ_i / ∂θ_a) C^−1_ij (∂ log Ȳ_j / ∂θ_b),

where Ȳ_i is the mean Y in the i-th mass bin, θ_a is an astrophysical parameter, and C^−1 is the inverse covariance matrix.
CARPoolGP is equipped to emulate Y at any mass, so we treat log(Ȳ) as an emulation at the center of each mass bin (log(M200/M⊙ h−1) = [13.25, 13.75, 14.25]), given the fiducial set of IllustrisTNG parameters. We have written CARPoolGP in the JAX programming language, which offers auto-differentiation of input variables. This allows us to compute the derivatives with respect to the astrophysical parameters directly.

Figure 13. A Fisher forecast, using our high-mass zoom-in simulation suite, on the four astrophysical parameters used in the flagship CAMELS (25 Mpc h−1)^3 boxes. We generate derivatives and mean log Y values using CARPoolGP and its auto-differentiability, along with forecasted constraints on the Y − M relation from Pandey et al. (2020). The Fisher forecast using low-mass halos in W23 is shown as red contours and lines. We find strong constraints on each astrophysical parameter, except for AGN1.
Finally, for the covariance matrix, we use the values in Table 2 and assume that the mass bins are uncorrelated, so that the matrix is diagonal. We generate the matrix elements from the mean Y values and error bars such that C_ij = δ_ij (ΔY_i)^2, where ΔY_i is one-half the width of an error bar on Y shown in Table 2. Just as in W23, we center the covariance measurements on the emulated values of Y, but find minimal effects when using the mean values provided in Table 2.
To ensure that the Fisher matrix is invertible when using three mass bins but four parameters, we add a weak Gaussian prior, σ_log θ = 1, to the diagonal elements. We then invert this to find F^−1_ab and construct Gaussian constraints. We show the results in Fig. 13. Our results are a significant improvement over those shown in Fig. 6 of W23. We find that, apart from AGN1, all astrophysical parameters are strongly constrained, with up to an order-of-magnitude improvement in constraining power. One reason for this is that the error on the Y − M relation is much smaller at the highest masses, allowing for more sensitivity to the astrophysical parameters.

With access to the 28-dimensional IllustrisTNG model parameter space, we can extend the analysis of W23. In Fig. 14, we show constraints on the same astrophysical parameters after marginalizing over the entire parameter space. We follow the same process as in the preceding paragraphs, with the same covariance matrix from Table 2. We find that the constraints are greatly reduced. This highlights the importance of including the full set of parameters when constraining model parameters. Furthermore, it shows that with only three mass bins in the Y − M relation, we cannot provide tight constraints and would instead require more observations or complementary probes.
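The forecast machinery above can be sketched in JAX, the language CARPoolGP is written in. Here `log_y_mean` stands in for the emulator's mean prediction of log Ȳ in each mass bin (a hypothetical differentiable function, not the actual CARPoolGP interface):

```python
import jax
import jax.numpy as jnp

def fisher_matrix(log_y_mean, theta_fid, cov_diag, prior_sigma=1.0):
    """Gaussian Fisher matrix F_ab = sum_ij dlogY_i/dtheta_a C^-1_ij dlogY_j/dtheta_b
    (Tegmark 1997), with a weak Gaussian prior added to the diagonal so that F
    remains invertible when there are fewer mass bins than parameters."""
    J = jax.jacobian(log_y_mean)(theta_fid)   # (n_bins, n_params), via autodiff
    c_inv = jnp.diag(1.0 / cov_diag)          # mass bins assumed uncorrelated
    fisher = J.T @ c_inv @ J
    return fisher + jnp.eye(fisher.shape[0]) / prior_sigma**2
```

Parameter constraints then follow from the inverse, F^−1, whose diagonal gives the marginalized 1σ errors.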

Testing CARPoolGP
CARPoolGP was designed to emulate astrophysical quantities across a high-dimensional astrophysical and cosmological parameter space, extract parameter dependencies for given quantities, and help determine the best locations to draw future samples. Testing its performance in this context is difficult, as there is no existing suite of simulations with a similar parameter space exploration. In this work, we have used comparisons with existing simulations such as IllustrisTNG300-1 (Pillepich et al. 2018) and a limited sample of CAMELS boxes (Villaescusa-Navarro et al. 2021c) (Section 4.1), as well as an internal measure of variance reduction (Section 3.3). Although each test is instructive, it is essential to discuss their caveats.
The most straightforward test we perform is an internal consistency test (Fig. 7), where we use Eq. 13 to predict the mean and variance of some quantity at parameter space locations that have not been used in the simulation suite. We use this test to better understand the parameter space locations that provide a global reduction in variance and to explore the efficacy of each active learning step. We found that the variance in the predictions decreased effectively after each of the four stages, with the latter stages performing better than expected. However, testing this way does not provide insight into a potentially biased estimator. It could be that each step internally reduces the global variance but, ultimately, the emulator maintains some level of biased predictions. It is unclear whether a bias could even occur due to this sampling strategy, but this cannot be ruled out without further testing. In some ways, this can be mitigated by using the base samples of stage 1, which are drawn from a Sobol sequence. This would reduce any potential bias but result in an increased predictive variance.
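One way to make the bias concern above concrete is to track standardized residuals at held-out points rather than the variance alone; a minimal sketch (hypothetical names, not part of the released package):

```python
import numpy as np

def standardized_errors(y_pred, y_var, y_true):
    """Standardized residuals z = (prediction - truth) / predicted sigma at
    held-out parameter-space locations. For an unbiased emulator with calibrated
    variance, z should be ~ N(0, 1); a nonzero mean of z flags exactly the kind
    of bias that a variance-only internal test cannot detect."""
    z = (y_pred - y_true) / np.sqrt(y_var)
    return z.mean(), z.std()
```

A mean consistent with zero and a standard deviation near one would support both unbiasedness and calibrated predictive variance.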
In Fig. 8, we compare the emulated results with IllustrisTNG300. We found strong agreement between the emulated results and IllustrisTNG throughout the full mass range, with a small tension between 13.2 ≤ log(M200/M⊙ h−1) ≤ 13.6. We noted that sample variance fluctuations in this mass range were highly correlated, and therefore the discrepancy between the emulated predictions and the IllustrisTNG results is consistent with a low-significance fluctuation. While we expect the CARPoolGP emulator to provide a low-variance description across the entire space, we note that this emulation cannot be sensitive to large fluctuations in a quantity that may exist between very nearby parameter space locations. This is the case for any emulator or interpolation with limited training data. Furthermore, testing CARPoolGP at a single location in the parameter space does not provide any information on its ability to capture specific parameter dependencies or degeneracies. These degeneracies matter in scenarios where one performs inference on a survey data set to obtain best-fit astrophysical and cosmological parameters, or where one wants to marginalize effectively over baryonic effects; in either case, knowledge of the individual parameter dependencies will prove critical. While testing CARPoolGP at the fiducial set of parameters does not explore these dependencies, it does provide an intuitive demonstration of CARPoolGP's capabilities.
We show CARPoolGP's ability to capture parameter dependencies by comparing emulations to previous CAMELS simulations and find strong agreement between the emulated and simulated parameter dependencies (Fig. 9 and Fig. 15). However, we must consider two caveats when we explore individual parameter variations with the CAMELS 1P set. First, the flagship CAMELS box is small, at (25 Mpc h−1)^3, while the suite of zoom-in simulations is generated in a large, (200 Mpc h−1)^3 box. The zoom-in simulations are, therefore, generated with large-scale modes that the small boxes cannot access. These largest scales can affect the formation of galaxy groups and clusters and provide different results compared to small boxes when extracting summary statistics. It is then difficult to discern whether any tension with the small CAMELS boxes is due to the scale differences or sample variance. Second, there are not enough high-mass halos within these boxes for a robust comparison. Although some halos of mass 10^13 M⊙ exist, this can only provide constraints on the lowest-mass predictions from CARPoolGP. We also include in Fig. 9 two zoom-in clusters at two mass scales simulated with one parameter varied at a time, but here it is clear that a single halo contains a great deal of sample variance, again making any comparison difficult. Finally, while the CAMELS 1P set, which contains individual parameter variations, can assist us in exploring parameter dependencies, we run into a similar issue as described above when testing on IllustrisTNG300-1.

CONCLUSIONS
This work introduces the CARPoolGP sampling and regression method, as well as a suite of group-to-cluster mass zoom-in simulations, and utilizes these tools to study the SZ effect in terms of the Compton Y − M relation and additional scaling relations within the framework of the IllustrisTNG galaxy formation model. We highlight the key conclusions in the following.
• We developed the general formalism for a novel reduced-variance sampling and regression method called CARPoolGP. In CARPoolGP, correlations between quantities across a parameter space can be leveraged to provide high-accuracy and low-variance emulations. The CARPoolGP structure places one set of samples throughout the parameter space and a correlated set of samples at or near predefined locations in the parameter space to calibrate the predictions and variance effectively.
• We showed that in a simple one-dimensional problem, CARPoolGP outperforms a random sampling approach, providing more accurate, reduced-variance emulations: CARPoolGP achieved ∼45% better variance reduction than random sampling in a one-dimensional toy example.
• We built an active learning algorithm that predicts the best parameter space locations to draw future samples so as to reduce the variance of some quantity of interest across the entire parameter space. We found that this approach reduced the variance by ∼65% compared to the random sampling approach and by ∼36% compared to the standard CARPoolGP approach in a one-dimensional toy example.
• We used CARPoolGP and the active learning approach to generate a suite of 768 hydrodynamical and analogous dark matter-only simulations spanning a 28-dimensional astrophysical and cosmological parameter space in the IllustrisTNG model, as well as a mass range 13 ≤ log(M200/M⊙ h−1) ≤ 14.5. We used active learning to enhance the variance reduction on the Compton Y parameter.
• We trained a CARPoolGP emulator on the Y − M relation. Our emulator predicted the fiducial IllustrisTNG Y − M relation to high accuracy with reduced variance and matched expectations when emulating individual parameter variations and comparing them to existing simulations.
• We explored the emulated thermodynamic profiles of halos along individual parameter variations, finding that the normalization of supernova wind energy per star formation rate (ASN1), the normalization of supernova wind speed (ASN2), and the IMF slope above 1 M⊙ provide the largest changes to the Y − M relation, while, remarkably, the AGN feedback parameters have little impact on this relation.
• We developed a qualitative physical picture of gas in the IGrM and ICM, using emulations of thermodynamic profiles, metallicity-mass and black hole mass-halo mass relations, and satellite quenched fractions to explain the observed trends in the Y − M relation due to ASN1, ASN2, and the IMF slope.
• We used the auto-differentiation capabilities of CARPoolGP and forecasted constraints for future SZ experiments to perform a Fisher forecast on four IllustrisTNG parameters. We found tighter constraints, by an order of magnitude, on three of the four astrophysical parameters compared to previous CAMELS studies, yet further showed that marginalizing over the full parameter space significantly weakens these constraints.
We highlight that in this work we applied CARPoolGP to simulate group- and cluster-mass halos, but this is just one use case. Future astrophysical studies that explore a model's parameter space, with prior knowledge of correlations across this space, can benefit greatly from using CARPoolGP. While more research is required to investigate alternative architectures, such as multiple surrogates per base sample, the current "out of the box" implementation is already beneficial and ready for use.
Further, with the novel suite of high-mass halos spanning the IllustrisTNG model parameter space, a wide range of science applications is possible. We foresee that this suite will aid in developing galaxy formation models and allow for more robust cosmological analyses. We find a weak effect of the AGN feedback parameters on the thermodynamic properties of halos in the IllustrisTNG model. This has been shown in multiple CAMELS-based works and matches our expectations (Moser et al. 2022; Wadekar et al. 2023a). Similarly, Singh et al. (2022) and Tillman et al. (2023) found that, when exploring the six-dimensional parameter space of the original (25 Mpc h−1)^3 CAMELS boxes, the most significant effects on the multiphase CGM and intergalactic medium (IGM) occurred with modulations of the supernova parameter ASN2. Note that both Singh et al. (2022) and Tillman et al. (2023) speculated that the effects of ASN2 on the CGM and IGM were due to complex interactions between SN winds and AGN feedback, such that changes in the SN parameters affected the growth of black holes and the power of the AGN feedback. With the high halo-mass range of the mass function and the expanded parameter space, we find new parameter dependencies, such as a significant impact made by the IMF slope.
Figure 3. The CARPoolGP emulation with 95% confidence intervals (left panel, blue line and shaded region) and a standard sampling approach (right panel, blue line and shaded region). In both subplots, we show the true quantity variation as a black dotted line. Both emulations have the same cost in terms of the number of samples, but when using the CARPoolGP approach - with correlated base-surrogate samples that live on parameter islands, as shown in Fig. 2 - we obtain a much more accurate emulator with a greatly reduced predictive variance. Unlike the following figure (Fig. 4), in this figure no active learning has been used to select the sample locations.

Figure 4. A demonstration of the active learning approach to CARPoolGP. Each panel represents a stage in the active learning process, and the first panel shows a standard CARPoolGP approach with twenty base and twenty surrogate samples. In the next subplot, we show the new ten base points (red dots) and ten surrogate points (blue) chosen for sampling at the next stage. The results of emulation with this parameter sampling arrangement are shown with the 95% confidence interval. The next two panels show the following two stages of active learning, the last containing a cumulative total of fifty base samples with fifty associated surrogates, as in the left panels of Fig. 2 and Fig. 3.
Figure 5. The standard error σ(θ) is shown as a function of test parameter space locations, θ*, for the active learning stages (increasingly dark red lines), the standard CARPoolGP approach (blue line), and no CARPoolGP (dashed black line). The first stage of active learning contains 20 base and 20 surrogate samples, and each subsequent stage adds 10 base and 10 surrogate samples following the active learning method, such that stage 4 has a total of 50 base and 50 surrogate samples, as does the case of standard CARPoolGP, while the no-CARPoolGP case has the same total of 100 samples. CARPoolGP vastly outperforms the random sampling approach and is further boosted when combined with active learning.
Figure 6. The distribution of masses from the suite of zoom-in simulations. We use a kernel density estimate of the distribution, drawn as the blue line. For a Sobol sequence, the distribution in this space would be flat, i.e., have the same number of simulations in each bin, but it is clear that there are particular places in parameter space that are preferred by the active learning approach, as opposed to a standard Sobol sequence.

Figure 7. The predictive error is tested for each stage in the simulation suite over a set of 4096 parameter space locations drawn from a Sobol sequence. A Gaussian distribution is fit to the histogram of error values and is shown for each stage as solid lines. We plot the expected histograms as dotted lines in their associated colors. Each stage reduces the mean of the predictive error distribution and decreases the width.

Figure 8. The fiducial parameters for IllustrisTNG300-1 are emulated using CARPoolGP (blue line) with the associated 1σ error bars and compared with the true values from IllustrisTNG (black points). The bottom panel shows the same emulation and comparison, but for the scaled Y given by Eq. 25. The discrepancies between the emulated and IllustrisTNG predictions at 13.2 ≤ log(M/[M⊙ h−1]) ≤ 13.6 are sample fluctuations that are ∼95% correlated with their neighbors, consistent with a low-significance fluctuation.

Figure 9. Individual parameter variations for the scaled Y − M relation. We emulate each line with CARPoolGP trained on the full suite of zoom-in simulations, where all parameters are fixed to their fiducial values while one parameter modulates between its bounds. Shaded regions represent the 1σ predictive uncertainty. We explore, in particular, the stellar evolution and feedback parameters with the largest influence on the Y − M relation. The CAMELS 1P set is incorporated into this plot using the largest halos in each box (dots), with parameter values colored accordingly. We include a set of two zoom-in halos, each with individual parameter variations (crosses), and a small subset of zoom-in halos run at the prior bounds of ASN1 and ASN2. See Fig. 15 for the full 28-dimensional emulation of the Y − M relation.
Figure 10. The emulated temperature (top row), electron density (middle row), and pressure (bottom row) profiles for a log(M200/M⊙ h−1) = 13.75 halo as a function of radius, in twenty log-spaced bins between 0.01 ≤ r/R200 ≤ 1, at the extreme values of three astrophysical parameters. Each plot shows the profiles normalized by the fiducial to highlight their deviations. We find that the dominant source of pressure changes for each of the chosen parameters is changes in their density profiles. Changes in the temperature profiles of the gas are suppressed, except for the IMF slope.

Figure 14. Exactly the same as Fig. 13, but after marginalizing over the full 28-dimensional IllustrisTNG parameter space.
supported by the Simons Collaboration on "Learning the Universe." The Flatiron Institute is supported by the Simons Foundation. MEL is supported by NSF grants DGE-2036197 and AST-2108678. DN acknowledges support from NSF grant AST 2206055. G.L.B. acknowledges support from the NSF (AST-2108470, ACCESS), a NASA TCAN award, and the Simons Foundation. DAA acknowledges support by NSF grants AST-2009687 and AST-2108944, CXO grant TM2-23006X, JWST grant GO-01712.009-A, Simons Foundation Award CCA-1018464, and Cottrell Scholar Award CS-CSA-2023-028 by the Research Corporation for Science Advancement.

APPENDIX A. Y-M RELATION ACROSS FULL PARAMETER SPACE

We present in Fig. 15 emulations of the Y − M relation for individual parameter variations across the entire parameter space of the IllustrisTNG model. We add this as a complement to Fig. 9, where we selected the three SN parameters providing the largest impact on the Y − M relation and a few AGN parameters. We show this to allow readers to draw their own interpretations from the full set of parameters and to further demonstrate the efficacy of CARPoolGP at emulating across the high-dimensional parameter space.

Figure 15 .
Figure 15. We present individual parameter variations for the scaled Y − M relation. CARPoolGP emulates each line, where all parameters are fixed to their fiducial values while one parameter modulates between its high and low bounds. Shaded regions represent the 1σ errors on predictions. The CAMELS 1P set is incorporated into this plot using the largest halos in each box, with parameter values colored accordingly.

Table 1 .
Summary of the variables used in developing CARPoolGP in Section 2.1 and Section 2.2, collected into a table for reference.