Emulating Power Spectra for Prereconstructed and Postreconstructed Galaxy Samples

The small-scale linear information in galaxy samples typically lost during nonlinear growth can be restored to a certain level by the density field reconstruction, which has been demonstrated for improving the precision of the baryon acoustic oscillation (BAO) measurements. As proposed in the literature, a joint analysis of the power spectrum before and after the reconstruction enables an efficient extraction of information carried by high-order statistics. However, the statistics of the postreconstruction density field are difficult to model. In this work, we circumvent this issue by developing an accurate emulator for the prereconstructed, postreconstructed, and cross-power spectra ( Ppre , P post, P cross) up to k = 0.5 h Mpc−1 based on the Dark Quest N-body simulations. The accuracy of the emulator is at the percent level; namely, the error of the emulated monopole and quadrupole of the power spectra is less than 1% and 10% of the ground truth, respectively. A fit to an example power spectrum using the emulator shows that the constraints on cosmological parameters get largely improved using Ppre +P post+P cross with kmax=0.25hMpc−1 , compared to that derived from Ppre alone; namely, the constraints on (Ω m , H 0, σ 8) are tightened by ∼41%–55%, and the uncertainties of the derived BAO and RSD parameters (α ⊥, α ∣∣, f σ 8) shrink by ∼28%–54%, respectively. This highlights the complementarity among Ppre , P post, and P cross, which demonstrates the efficiency and practicability of a joint Ppre , P post, and P cross analysis for cosmological implications.


Introduction
Wide-area spectroscopic surveys are fundamental tools for cosmological studies since they enable us to probe the Universe both geometrically and dynamically.In particular, the observed baryon acoustic oscillations (BAOs) and redshift-space distortions (RSDs), which are specific 3D clustering patterns of galaxies, can be used to reconstruct the cosmic expansion history and the growth rate of the cosmic structure.Over the last few decades, massive spectroscopic surveys, including the Sloan Digital Sky Survey (SDSS; York et al. 2000), the Twodegree Field Galaxy Redshift Survey (2dFGRS; Colless et al. 2001), WiggleZ (Drinkwater et al. 2010), the SDSS-III Baryon Oscillation Spectroscopic Survey (BOSS; Dawson et al. 2013), and the SDSS-IV extended Baryon Oscillation Spectroscopic Survey (eBOSS; Dawson et al. 2016) have proven to be powerful probes for cosmology (Peacock et al. 2001;Cole et al. 2005;Eisenstein et al. 2005;Percival et al. 2007;Blake et al. 2011;Alam et al. 2017Alam et al. , 2021)).
In Fourier space, the BAO feature manifests itself as a set of wiggles in the power spectrum, which can be used as a standard ruler to measure the cosmic expansion history.Unfortunately, the BAO feature is generally blurred by the nonlinear evolution of the cosmic structure, reducing its strength as a cosmic probe.To sharpen the BAO feature, the reconstruction scheme was proposed (Eisenstein et al. 2007), which effectively restores the linearity of the density field to a certain extent by partially undoing the nonlinear structure evolution.This process brings the high-order information dominated by the three-and fourpoint statistics back to two-point statistics (Schmittfull et al. 2015), such that it is not only useful for boosting the BAO signal but also helpful for a general full-shape analysis of the power spectrum (Hikage et al. 2020).
Recently, a novel method was proposed (Wang et al. 2022) to extract information carried by high-order statistics from a joint analysis of the power spectrum of the prereconstructed Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence.Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
density field (P pre ), the postreconstructed field (P post ), and the cross-power spectrum between the pre-and postreconstructed fields (P cross ).Their analysis, based on the Fisher matrix method, showed that a joint analysis using P pre , P post , and P cross can tighten the constraints on the cosmological parameters compared to that using P post alone, as part of the information from three-point and four-point statistics of the density field can be efficiently extracted (Wang et al. 2022).
In order to exploit the information content from the galaxy clustering, an accurate model for the statistics of the density field before and after the reconstruction is required.Traditional methods for the model building rely on the perturbation theory (PT).For P pre , PT-based models can work up to scales of k = 0.2 or 0.25 h Mpc −1 , depending on the effective redshift of the galaxy sample (Taruya et al. 2010;Carrasco et al. 2012;Beutler et al. 2014;d'Amico et al. 2020;Ivanov et al. 2020;Chen et al. 2022).However, it is much more challenging to build PT-based models that can work on the same scales for P post or P cross due to complexities brought in by the reconstruction process (Hikage et al. 2020).One alternative to building PT-based models is to develop simulation-based models, i.e., the emulators, which have been extensively studied and developed for statistics for the prereconstructed density fields (Wibking et al. 2019;Winther et al. 2019;Zhai et al. 2019;Kobayashi et al. 2020;Donald-McCann et al. 2022;Yuan et al. 2022;Cuesta-Lazaro et al. 2023a, 2023b;Kwan et al. 2023).
In this work, we develop an emulator for P P , pre post , and P cross up to k = 0.5 h Mpc −1 , which is trained using the DARK QUEST simulations (Nishimichi et al. 2019) and a halo occupation distribution (HOD) model (Zheng et al. 2007).Our emulator is then validated using simulations that are not used for the training.Using our emulator, we perform a likelihood analysis using the monopole and quadrupole of galaxy power spectra up to the scale of k = 0.25 h Mpc −1 and find a significant information gain by a joint P P P , , pre post cross { } analysis, compared to using P pre alone.
This paper is structured as follows.The next section is a description of the simulations and galaxy mocks used for the training and validation, and Section 3 presents the details for creating the emulator.In Section 4, we perform a likelihood analysis using various types of power spectra and show the main result of this work before concluding in Section 5.

The DARK QUEST Simulations and Galaxy Mocks
The DARK QUEST simulations that we use to develop our emulator are a suite of N-body simulations with 2048 3 dark matter particles in a 2 h −1 Gpc side-length box (Nishimichi et al. 2019).The emulator is built using a single DARK QUEST snapshot at z = 0.549.The cosmologies used in the DARK QUEST simulations cover the 100 spatially flat wCDM models16 with six variable parameters and one spatially flat ΛCDM model with the best-fit value of Planck 2015 (Planck Collaboration et al. 2016) presented in Table 1, where ω b ≡ Ω b h 2 and ω c ≡ Ω c h 2 are the physical density parameters of the baryon and cold dark matter, respectively.Ω de is the dimensionless dark energy density parameter.A s and n s are the amplitude and slope of the primordial power spectrum, respectively.w is the equation-of-state parameter of dark energy.In addition, the total neutrino mass is fixed to ∑m ν = 0.06 eV.The effect of massive neutrinos was included in simulations at the level of linear transfer function.Cosmological parameters are sampled over the parameter range presented in Table 1 using optimal maximum distance sliced Latin hypercube designs (Ba et al. 2015) so that parameter samplings can cover the parameter space as uniformly as possible (Nishimichi et al. 2019).We have 15 realizations for the fiducial ΛCDM cosmology.
The halos were identified using the phase-space temporal friends-of-friends halo finder, ROCKSTAR (Behroozi et al. 2013).The center of each halo is given as the center-of-mass location of a subset of member particles in the inner part of that halo, i.e., "core particles," and the velocity of each halo is defined as the center-of-mass velocity of the core particles.r .The direct outputs of ROCKSTAR contain both distinct "host" halos and substructures.For the subsequent analyses, we remove substructures, which are found within the R 200m of a more massive nearby halo.
Galaxy mock catalogs are constructed from the DARK QUEST halo catalogs using the HOD framework, which is implemented in Halotools (Hearin et al. 2017).We use the functional form of HOD as proposed in Zheng et al. (2007) to model the mean number 〈N(M)〉 of galaxies in halos of mass M. The mean occupation functions of central and satellite galaxies are parameterized as M log s describes the profile for the halo mass cutoff, making 〈N c (M)〉 smoothly transit from 0 to 1. M 0 is the minimum halo mass to host satellite galaxies.M 1 is the normalization mass scale.α is the power-law slope of the satellite HOD at the massive end.
The occupations of central and satellite galaxies are drawn from Bernoulli and Poisson distributions, respectively.Central galaxies are placed at the halo centers with the same velocities as their host halos, where we have ignored the effect of galaxy velocity bias (Guo et al. 2015(Guo et al. , 2016)).We also assume that the satellite galaxy distribution within the halos follows the Navarro-Frenk-White profile (Navarro et al. 1996).
We adopt the fiducial values of HOD parameters based on the best-fit values ( M log 13.09 min = , 0.596 , and α = 1.0127) obtained by fitting to the CMASS ("constant mass") galaxy sample (Manera et al. 2013).The number density n can be derived by performing an integral over the mass function, where dn dM M ( ) is the halo mass function.We take its fitting formula from Tinker et al. (2008).The resulting HOD catalog has a number density of n = 5.6 × 10 −4 h 3 Mpc −3 .In our work, we choose to fix the number density,17 then we sample four out of the five HOD parameters of our model.Here we reparameterize the HOD parameters as as used in Wibking et al. (2020).Their fiducial values and flat prior ranges are presented in Table 1.We utilize the (randomized) quasi-Monte Carlo method to sample reparameterized HOD parameters in the prior range.Specifically, we generate 2450 points in 4D using the Sobol sequence (Sobol' 1967) utility in the scipy.stats.qmcpackage (Virtanen et al. 2020).We scramble the Sobol sequence with a random seed searched among integers from 0 to 65535 to minimize the mixture discrepancy (Zhou et al. 2013) as the uniformity measure.The first 2400 HOD samples are assigned to 80 cosmologies for training; i.e., each training cosmology is assigned 30 HODs.The remaining 50 HODs are assigned to each testing cosmology, yielding a testing set of 1000 models.For each sampling, we use Equation (3) to find the value of M log min that yields the fixed n.We include shifts due to RSD along the z-axis to generate the simulated galaxy samples in the redshift space; i.e., the simulated galaxies in the real space are shifted by v z /(aH), where v z is the peculiar velocity of the simulated galaxies along the line of sight and a is the scale factor.

Emulating Pre-and Postreconstructed Galaxy Power Spectra
In this section, we use the galaxy samples described in the previous section to emulate the P P , pre post , and P cross of galaxies.We first present the measurement of power spectra with and without the density field reconstruction, then detail the training process of our emulator, and finally discuss the performance of the emulator.

The Density Field Reconstruction and the Power Spectrum Measurement
Before performing the density field reconstruction, we implement the Alcock-Paczynski (AP) effect (Alcock & Paczynski 1979), which arises from the discrepancy between the fiducial cosmology used for redshift-distance conversion and the underlying true cosmology.Although the equation relating the power spectrum before and after applying the AP effect is analytically known (Ballinger et al. 1996), including this effect in the reconstruction is complicated and requires nontrivial modeling (Sherwin & White 2019).An easier way to account for the AP effect is to manipulate the catalog by changing the coordinates of the samples.Specifically, we convert the galaxy positions in the true coordinate x¢ to the "observed" coordinate x and stretch the size lengths of simulation box L using the relations of here D A and H are the comoving angular diameter distance and Hubble parameter, and quantities with subscript "fid" denote those in the fiducial cosmology.The galaxy density field is smoothed by convolving with the kernel where k is the modulus of the conjugate wavenumber k of the observed coordinate x, and we set the smoothing scale to be Σ s = 10 h −1 Mpc, which is close to the optimal smoothing scale for the reconstruction efficiency (Seo et al. 2016;Vargas-Magaña et al. 2017).The displacement field is then estimated using the Zeldovich approximation, i.e.,, , where δ(k) denotes the nonlinear redshift-space galaxy overdensity in the observed coordinate, b in is the input linear bias of the galaxy sample, and f in is the input logarithmic growth rate.An inverse Fourier transformation on s ˜returns the configuration-space shift field s(x), which is used to move both galaxies and randoms.Although it is natural to use the true (fiducial) values of b and f as b in and f in for the reconstruction, this does not have to be the choice.Actually, the true values of b and f are not known before performing the analysis.As we shall demonstrate later, the final parameter estimation is largely insensitive to the choice of b in and f in .In what follows, we use the fiducial b and f to start with and repeat the analysis with a significantly different set of b in and f in to demonstrate the robustness of the final result against the choice of these input parameters.
We measure the multipoles of P P , pre post , and P cross using a fast-Fourier-transform-based estimator (Hand et al. 2017) implemented in nbodykit (Hand et al. 2018).The number density field of galaxies is constructed using the cloud-in-cell scheme to assign galaxies to the grid, and we correct for the aliasing effect using the interlacing scheme in Sefusatti et al. (2016).For the monopole of the auto power spectrum before and after density field reconstruction, the shot noise is removed as a constant.Note that the shot noise of P cross is scaledependent (Wang et al. 2022), which is estimated using the "half-sum half-difference" approach and then subtracted off, as in Ando et al. (2018) and Wang et al. (2022).The k-bin width is set to be Δk = 0.01 h Mpc −1 for all P(k) measurements.

Emulating the Power Spectra
In order to avoid the emulated quantities spanning several orders of magnitude, we choose to normalize the power spectrum multipoles using the linear Kaiser power spectrum (Kaiser 1987) with the BAO feature removed in the fiducial cosmology, i.e., where P nw,lin is the linear power spectrum without the BAO feature (Eisenstein & Hu 1998).The superscript "X" runs for "{pre, post, cross}."To well capture the BAO wiggles in the monopole, we decompose R 0 X into two parts, i.e., the smoothed broadband shape (S) part and the BAO wiggles (W) part.The S part is obtained by applying a Savitzky-Golay filter (Savitzky & Golay 1964) to R 0 X , i.e., fitting to a certain number of data points (N) with a polynomial of pth order, and we find that N = 41 and p = 4 is a reasonable choice for the filtering.Then the BAO wiggles are extracted, i.e., W R S We follow Zhai et al. (2019Zhai et al. ( , 2023) ) to construct the emulator, based on the George (Ambikasaran et al. 2016) code.In the Gaussian process (GP) modeling, the correlation between different training data points is modeled by a covariance matrix generated by a kernel function.This is of critical importance in the GP modeling since it defines the function we wish to learn.Due to the lack of prior knowledge of the correlation between training data points, the definition of the kernel function can be arbitrary.For the modeling of the galaxy power spectrum in this work, we adopt a Matern class kernel (), as we find that it produces sufficiently accurate predictions.In this model, the hyperparameters in the kernel define the strength of the correlation between neighboring points.The following process of training is to optimize the hyperparameters in the kernel function: where M I 2 s = +  ,  is the covariance matrix populated by the kernel function, and σ represents the error of the training data P. Since each cosmology in the training data has only one realization, we estimate the uncertainty of the training data using the fiducial cosmology with 15 realizations.We find that 15 simulations gives a good approximation of σ, but more simulations might improve the emulator, which would be interesting to check in the future.With the optimized hyperparameters fed into the kernel function, we can obtain the power spectra for an arbitrary point in the parameter space.

Covariance Matrix
The DARK QUEST simulations only have 15 realizations in the fiducial cosmology, which is insufficient to construct a robust covariance matrix for galaxy clustering analysis.We therefore compute the correlation matrix using GLAM simulations, for which there are 986 independent realizations in the Planck cosmology 18 (Klypin & Prada 2018).We adopt the best-fit HOD parameters for the M i < −21.6 CMASS samples in Guo et al. (2014), leading to a contribution of shot noise (∼2 × 10 −4 h 3 Mpc −3 ) to the covariance.The side length of the GLAM simulation box is 1 h −1 Gpc.Simulations with a larger box size can help investigate the effect of supersample covariance (SSC) (Bayer et al. 2023).Limited by the simulation suite, the covariance determination in this work neglects supersample variance components. 19To be close to the volume of the BOSS survey (Alam et al. 2017), the data covariance matrix C data is rescaled by a factor of 3. Specifically, we derive the C data from GLAM mocks, i.e., where the mean of the power spectra is defined as and N s = 986 is the number of mocks.Note that the effect of the error induced by the estimation of the covariance matrix from the finite number of mocks will be corrected as described later in this work.

Emulator Validation
In Figure 1, we show the prediction from our emulator for multipole moments of P P , pre post , and P cross for the fiducial cosmology that is not used for the training.The symbols in the upper panels are the average of the power spectra measured from 15 realizations in the fiducial cosmology.The error bars are the statistical errors computed using Equation (9).
The lower panels of Figure 1 show the fractional difference between the emulated and the measured power spectra from the galaxy mocks.It indicates that the monopole and quadrupole measured from the galaxy mocks can be well described by our emulator by better than 1%-2% for the monopole and 2%-10% for the quadrupole at most scales.The quadrupole error can sometimes be >10%, particularly around the scales where the quadrupole happens to cross 0. For the hexadecapole, the fractional difference is noisy because the amplitudes of the hexadecapole are close to 0. Within the statistical errors, our emulator gives an excellent prediction for the hexadecapole as well.
We quantify the accuracy of our emulator using 1000 test galaxy mocks that are not used in the training set.The three columns of Figure 2 from left to right show the performance of our emulator for P pre (left), P post (middle), and P cross (right). 18The cosmological parameters of the GLAM simulations are from Planck 2013 (Ade et al. 2014), which is slightly different from the fiducial cosmology used in the DARK QUEST simulations.The minor difference between them is ignored in this work. 19 Hikage et al. (2020) found that the improvement of the error on the growth rate by the reconstruction was comparable between the cases with and without the SSC effect.It would be interesting to explore the impact of the SSC on cosmological parameters in future work.
The symbols in the upper panels of Figure 2 show the average fractional error of the monopole power spectrum obtained by comparing the emulator predictions with the P 0 measurements from 1000 test mocks.The error bars are the standard deviation estimated from 1000 test mocks.The fractional error is within ∼1%-2% over most scales.The solid lines show the inverse signal-to-noise ratio computed using the average of the monopole measurements among 15 realizations from the fiducial cosmology.
Because the quadrupole and hexadecapole moments can cross 0, leading to large fractional errors, we instead show the absolute error between the emulator prediction and the measurement from the testing mocks relative to the statistical error in the middle and lower panels of Figure 2. We find that Figure 1.Upper panels: the prediction from our emulator for multipole moments of P P , pre post , and P cross for the fiducial cosmology that is not used for the training.The symbols are the average of 15 realizations in the fiducial cosmology.The errors are the statistical errors for a volume of 3 h −3 Gpc 3 .Lower panels: the fractional difference between the emulator prediction and the measured power spectra from mocks in the fiducial cosmology.
Figure 2. Upper panels: the average of the fractional difference between the emulated and the measured monopole power spectrum from 1000 testing mocks.The black solid lines show the inverse signal-to-noise ratio of the mean fiducial monopole measurement.The statistical error for the monopole power spectrum P0 s is computed using Equation (9).Middle panels: the average of the difference between the emulated and the measured quadrupole power spectrum relative to the statistical error.Lower panels: the average of the difference between the emulated and the measured hexadecapole power spectrum relative to the statistical error.
the emulator error for P 2 and P 4 is subdominant, roughly 50%-70% of the statistical error for a volume of 3 h −3 Gpc 3 .

Cosmological Application to Mock Catalogs
In this section, we test our emulator by applying it to the power spectrum measurements from mock galaxy catalogs in the fiducial cosmology, which are not in the training set.We use Cobaya (Torrado & Lewis 2021) to perform a Markov Chain Monte Carlo sampling of the 9D parameter space within the flat ΛCDM framework; i.e., the w parameter is fixed to −1.The following χ 2 gets minimized in the fitting, and we add a Gaussian prior for ω b centered on 0.0223 with the width 0.00036 from Big Bang Nucleosynthesis (BBN) constraints (Mossa et al. 2020) and a Gaussian prior for n s parameters centered on 0.965 with the width 0.0042 from Planck constraints (Aghanim et al. 2020).For other parameters, a flat prior over the range shown in Table 1 is used.Here P emu is the prediction of the emulator, and P mea denotes the average of the power spectra measured from 15 realizations in the fiducial cosmology.C is the covariance matrix consisting of two terms, where C data (computed in Section 3.3) is the contribution of the sample statistics, and σ emu corresponds to the uncertainty due to the emulating error in the model prediction.Since the emulator is constructed for individual scale bins, here we assume that the emulating error is independent among different scale bins, which is computed using the testing set as discussed in Section 3.4.Since the covariance matrix C data is estimated from a finite number of mocks, C data is generally biased.To correct, we multiply C data by a factor of  (Percival et al. 2022), Here, N s is the number of simulations used to estimate the covariance, N d is the number of the data vector, and N θ is the number of parameters that are being fitted.Note that the -factor generally dilutes the constraints on the parameters being fitting by rescaling the covariance; namely, when using P pre , P post , or P cross alone, the -factor increases the covariance by 9%.For joint analyses of P P pre post

+ and P P
pre post

+ +
P cross (i.e., P all ), the -factor enlarges the covariance by 22% and 36%, respectively.These (10%) enlargements of the posteriors are a first-order way to correct for the lack of convergence of the covariance due to using a small number of simulations.Running more simulations would reduce the size of this correction and provide more robust contours.We plan to do this in future work.When a covariance matrix is constructed using Equation (9), we have, in effect, drawn the matrix as a random variable from a Wishart distribution.Using this knowledge, we can consider the set of covariance matrices that could have been drawn and determine the average effects on results from them.It is often found that the results are biased because the values of interest are skewed by the errors in the covariance.Examples of such effects include a skewed inverse covariance matrix (Hartlap et al. 2007) or skewed parameter errors (Dodelson & Schneider 2013).The effects can be modeled using PT, leading to correction terms such as those in Equation ( 13).The PT-based derivation assumes a linear model, but this has been shown to work well for typical cosmological problems (Percival et al. 2022).Nevertheless, this is only a first-order correction in terms of the link from the likelihood to model parameters, and having more mocks will always be better.The correlation matrix being estimated for the combined power spectra (P P P , , pre post cross ) is shown in Figure 3. Using k modes at k 0.25 h Mpc −1 for both the monopole and quadrupole of the power spectra,20 we obtain the 1D posterior distributions and 2D contour plots for the derived cosmological parameters Ω m , H 0 , and σ 8 , as shown in Figure 4.The mean values with 68% credible intervals of Ω m , H 0 , and σ 8 are presented in Table 2.The left contour plot in Figure 4 shows a comparison of the fitting results using the prereconstructed power spectrum alone (gray), the postreconstructed power spectrum alone (red), the cross-power spectrum alone (green), and the joint fitting of prereconstructed, postreconstructed, and cross-power spectra (blue) for k max = h 0.25 Mpc 1 -. Figure 4 shows that our emulator-based analysis can recover the expected values of cosmological parameters within statistical errors.The postreconstructed power spectrum alone is more informative, tightening the constraints on Ω m , H 0 , and σ 8 by 10.9%, 35.7%, and 23.7%, respectively, compared to that from P pre alone.It is found that the joint fit of the prereconstructed, postreconstructed, and cross-power spectra, denoted as P all , gives the tightest constraint; namely, the constraints on Ω m , H 0 , and σ 8 from P all are improved by 44.5%, 41.7%, and 55.3%, respectively, compared to that from P pre alone.The relative information gain from P all compared to that from P pre is expected to be greater in the nonlinear regime, i.e., including modes with k > 0.25 h Mpc −1 .However, given the number of mocks and data points, we do not go further than k h 0.25 Mpc max 1 = -for a P all analysis.Instead, we perform a P post -alone analysis with k h 0.5 Mpc max 1 = -for a demonstration.We compare the constraints on Ω m , H 0 , and σ 8 using P post alone for k 0.25 max = and 0.5 h Mpc −1 in the right panel of Figure 4.As shown, adding modes on smaller scales helps to constrain σ 8 ; namely, its uncertainty gets reduced by 44.8% as k max increases from 0.25 to 0.5 h Mpc −1 .The galaxy clustering on smaller scales is more sensitive to the amplitude-related parameter σ 8 compared to Ω m and H 0 .In addition, we perform the analysis for different k max using P post alone.The fitting results as a function of k max are presented in Figure 7 in the Appendix.Also, adding more modes does not generate bias in the posteriors, demonstrating the robustness of our emulator.
We then derive the BAO and RSD parameters (α ⊥ , α || , fσ 8 ) and show the 1D posterior distributions and 2D contour plots in Figure 5, with the mean values and 68% credible intervals of the BAO and RSD parameters listed in Table 2. Compared to P pre alone, the constraints on the (α ⊥ , α || , fσ 8 ) parameters from P all are improved by 33.9%, 28.8%, and 54.8%, respectively.P post alone gives a tighter constraint than that using P pre only but is outnumbered by P all by 13.6% for α ⊥ , 20.8% for α || , and 42.2% for fσ 8 .The right panel of Figure 5 shows the contours of the derived BAO and RSD parameters with two different choices of k max , as in Figure 4.As expected, adding small-scale modes (k ä [0.25, 0.5] h Mpc −1 ) helps to tighten the constraint on fσ 8 significantly; namely, the uncertainty gets reduced by .Left: the 1D posterior distribution and 2D contour plots showing 68% and 95% credible regions for Ω m , H 0 , and σ 8 using the prereconstructed power spectrum alone (gray), the postreconstructed power spectrum alone (red), the cross-power spectrum alone (green), and the joint result of the prereconstructed, postreconstructed, and cross-power spectra (blue).Right: the same plot derived from the postreconstructed power spectrum alone with two choices of k h 0.25 Mpc  Note.The fiducial values of the parameters derived are Ω m = 0.3156, H 0 = 67.24,σ 8 = 0.831, α ⊥ = 1, α || = 1, and fσ 8 (z = 0.549) = 0.485, which are well recovered in all cases.The factor  here is calculated using Equation (13).Our default choices of (b, f ) parameters for reconstruction are b fid = 1.824 and f fid = 0.778 determined in the fiducial cosmology.To explore the effect of these inputs, we vary the bias by −10% (i.e., 0.9 b fid ) and the f by −30% (i.e., 0.7 f fid ).
44.4%.Note that this level of constraint can be achieved by using P all with k h 0.25 Mpc max 1 = -.We confirm that the information content in P cross is complementary to that in P pre and P post , as claimed in Wang et al. (2022).Specifically, adding P cross to our joint analysis using P pre and P post improves the constraints on (Ω m , H 0 , σ 8 ) and (α ⊥ , α || , fσ 8 ) by 5.5%-25.6%,as presented in Table 2.
Since the BAO reconstruction process requires a pair of input b and f, denoted as b in and f in , the reconstructed power spectrum depends on b in and f in .One natural question is whether and how much the final posterior depends on b in and f in .To investigate, we use a set of b in and f in that are significantly different from the fiducial b and f, namely, b in = 0.9 b fid and f in = 0.7 f fid .Note that this level of deviation from the true values is much greater than that constrained by a typical galaxy survey such as BOSS (Beutler et al. 2017) and thus is sufficient to study the impact of using the "wrong" cosmological parameters for the reconstruction on the final result (Sherwin & White 2019).We repeat our analysis using this set of b in and f in and show the parameter constraint from P all in this case in Table 2 and in Figure 8 in the Appendix.As shown, the constraint is largely unchanged, demonstrating the robustness of our method against the choice of b in and f in .
For completeness, we show the full contour plot for all parameters, including the cosmological and HOD parameters, in Figure 9 in the Appendix using different combinations of the power spectra.As expected, P all provides the tightest constraint for all parameters, as predicted by the Fisher matrix analysis (Wang et al. 2022).
The results presented so far do not include information from P 4 , the hexadecapole, so it is useful to explore how P 4 can help to reduce the uncertainties.We perform an additional analysis using P all including P 4 for all types of power spectra with k h 0.25 Mpc max 1 = -and find that P 4 can barely further improve the constraints on the cosmological parameters, as shown in Figure 10 in the Appendix, because the hexadecapole has a relatively lower signal-to-noise ratio compared to P 0 and P 2 .A similar conclusion is found when only using P pre , and the contour plot is presented in Figure 11.

Conclusion and Discussions
In this work, we develop an emulator for galaxy power spectra for catalogs with and without the BAO reconstruction based on the DARK QUEST simulations with HOD models to populate galaxies.The theoretical predictions of power spectra derived from our emulator are in excellent agreement with the ground truth (with a deviation of less than 10%).Our emulatorbased likelihood analysis of mock galaxy catalogs demonstrates that input cosmological parameters can be accurately recovered from power spectra up to scales of k = 0.5 h Mpc −1 .
Our analysis shows that P pre , P post , and P cross are highly complementary; thus, jointly using these power spectra can significantly improve constraints on cosmological parameters, which is consistent with the claim based on a Fisher matrix analysis (Wang et al. 2022).Specifically, the uncertainty of (Ω m , H 0 , σ 8 ) derived from P P P pre post cross + + gets tightened by 44.5%, 41.7%, and 55.3%, respectively, compared to that derived from P pre (k h 0.25 Mpc max 1 = -in all cases).The derived BAO and RSD parameters, α ⊥ , α || , and fσ 8 , are better determined by 33.9%, 28.8%, and 54.8%, respectively.Adding small-scale modes to the analysis helps to constrain parameters related to the amplitude of the power spectra.For example, extending k 0.25 max = to 0.5 h Mpc −1 for P post reduces the uncertainty on σ 8 and fσ 8 by 44.8% and 44.4%, respectively.We also find that the posteriors of the parameters are largely insensitive to input values of b and f, which are required for the BAO reconstruction process.
The methodology and pipeline developed in this work make it possible to extract high-order information from two-point statistics, which is of significance for cosmological studies.Our Figure 5. Left: the 1D posterior distribution and 2D contour plots showing 68% and 95% credible regions for the derived α ⊥ , α || , and fσ 8 parameters using the prereconstructed power spectrum alone (gray), the postreconstructed power spectrum alone (red), the cross-power spectrum alone (green), and the joint result of the prereconstructed, postreconstructed, and cross-power spectra (blue).Right: the result using the postreconstructed power spectrum alone with two choices of k h 0.25 Mpc method and emulator can be directly applied to existing and forthcoming galaxy surveys including BOSS (Dawson et al. 2013), eBOSS (Dawson et al. 2016), the Dark Energy Spectroscopic Instrument (Aghamousa et al. 2016a(Aghamousa et al. , 2016b)), the Prime Focus Spectrograph (Takada et al. 2014), and so forth after the required tuning in the emulation process for the number density, effective redshifts of the galaxy samples, etc., which is technically straightforward.13 The Astrophysical Journal, 966:35 (15pp), 2024 May 1 Wang et al.
the halo mass definition in DARK QUEST, where R 200m is the spherical halo boundary radius within which the mean mass density is 200 times the mean mass density today, m0

a
and 〈N s (M)〉 = 0 when M < M 0 .M min is the cutoff halo mass scale for hosting central galaxies, with N M 0 shows the observables (2400 in total) used for training the emulator.

Figure 3 .
Figure 3.The correlation matrix for the monopoles and quadrupoles of power spectra P P P , , pre post cross ( ) .

Figure 4
Figure4.Left: the 1D posterior distribution and 2D contour plots showing 68% and 95% credible regions for Ω m , H 0 , and σ 8 using the prereconstructed power spectrum alone (gray), the postreconstructed power spectrum alone (red), the cross-power spectrum alone (green), and the joint result of the prereconstructed, postreconstructed, and cross-power spectra (blue).Right: the same plot derived from the postreconstructed power spectrum alone with two choices of k h 0.25 Mpc

Figure 8 .
Figure 8.The 1D posterior distribution and 2D contour plots showing 68% and 95% credible regions for the derived parameters (Ω m , H 0 , σ 8 , α ⊥ , α || , fσ 8 ) from P all reconstructed using two different sets of b in and f in shown in the legend.The dashed lines show the expected values of the parameters.

Figure 9 .
Figure 9.The 1D posterior distribution and 2D contour plots showing 68% and 95% credible regions for cosmological and HOD parameters derived from combinations of different types of power spectra shown in the legend.The dashed lines show the expected values of the parameters.

Table 1
Cosmological and HOD Parameters Used in Our Emulator