MODELING THE Lyα FOREST IN COLLISIONLESS SIMULATIONS

Daniele Sorini; José Oñorbe; Zarija Lukić; Joseph F. Hennawi

doi:10.3847/0004-637X/827/2/97

1. INTRODUCTION

The neutral hydrogen in the intergalactic medium (IGM) imprints a characteristic pattern in the absorption spectra of quasars, known as the "Lyα Forest." It represents an extraordinary cosmological probe, which is able to trace density fluctuations in the redshift range of 0 ≤ z ≲ 6 (for a recent review, see Meiksin 2009). The large number of quasars discovered to date enables statistical analyses of the absorption spectra by considering the transmitted flux along many different lines of sight, called "skewers." The measured statistical properties can be compared to theoretical models of the IGM, constraining cosmological parameters as well as the thermal history of the IGM. In this work, we focus on the three observationally most relevant statistics of the transmitted flux: the probability density function (PDF; Rauch et al. 1997) the line-of-sight power spectrum (1DPS; Croft et al. 1999; Palanque-Delabrouille et al. 2015), and the 3D power spectrum (3DPS; Slosar et al. 2011). An additional motivation for studying the IGM is that it contains 90% of the baryons in the universe, acting as a gas reservoir for forming galaxies within the context of ΛCDM cosmology (see, e.g., Rauch 1998). Furthermore, the 3DPS can be used for an independent measurement of the Baryon Acoustic Oscillations (BAO) characteristic scale (Font-Ribera et al. 2014b; Delubac et al. 2015); the future increase in the number of observed quasars at redshifts z > 2 promises tight constrains on the expansion history of the universe at high redshifts and other cosmological parameters (Font-Ribera et al. 2014a).

However, performing all the above mentioned studies requires not only precise observations, but also accurate theoretical modeling, which is far from being straightforward. The Lyα forest is the observational signature of neutral hydrogen, and is set by the interplay of gravitational collapse, expansion of the universe, and reionization processes due to the buildup of a background of UV photons emitted by active galactic nuclei (AGN) and star forming galaxies (Cen et al. 1994; Hernquist et al. 1996; Zhang et al. 1997; McDonald et al. 2000; Meiksin & White 2001; Croft et al. 2002). There is no analytic solution for the small-scale evolution of the (baryon) density fluctuations over time. In order to precisely describe the behavior of the IGM, it is therefore necessary to treat the problem numerically. In this respect, hydrodynamic cosmological simulations have led to a consistent description of the IGM in the framework of structure formation (Cen et al. 1994). However, they are computationally expensive, making it challenging to reach high resolutions. Furthermore, available memory limits how large volume can be run in high-resolution simulations. For example, it would be necessary to run a simulation of ∼1 cGpc on a side to probe the scales of BAO and to study their signature in the Lyα forest (Norman et al. 2009; Slosar et al. 2009; White et al. 2010). The absorption lines are set by physical processes occurring around the Jeans scale, of which the order of magnitude is expected to be 100 ckpc (Gnedin & Hui 1996, 1998; Rorai et al. 2013; Kulkarni et al. 2015). Recent work indicates that a resolution of 20 ckpc is required to achieve ∼1% precision in the description of the statistics of the Lyα forest (Lukić et al. 2015). This implies that IGM BAO simulations would require at least 50,000³ resolution elements to span such a wide dynamic range, far beyond current (and near future) computational resources.

Collisionless simulations neglect baryonic pressure; therefore, they are not as accurate as hydrodynamic simulations on small scales. However, on large scales, baryonic forces are negligible; thus, collisionless simulations are as good as hydrodynamic ones in this regime. For this reason, N-body collisionless simulations are often used in cosmology to study the formation and evolution of structure in large volumes, but with poor mean inter-particle spacing (often several hundreds of ckpc). Clearly, it is desirable to find strategies that combine the volume of collisionless simulations but retain the accuracy of high-resolution hydrodynamic simulations. This objective has been recognized in the past, resulting in the development of various approximate methods to predict the Lyα forest from N-body simulations.

The simplest approach assumes that baryons perfectly trace dark matter (DM; e.g., Petitjean et al. 1995; Croft et al. 1998, pp. 69–75). In this over-simplified picture, the baryon density field is the scaled version of the matter density field. However, DM particles are collisionless, so the pressure of baryons, which competes with gravitational collapse, is simply neglected. The effect of pressure was instead included by, e.g., Gnedin & Hui (1998) as a modification of the gravitational potential. Another approach consists in running three simulations, all with insufficient resolution or box size, and compensating for the resulting errors in the estimation of the power spectrum applying splicing techniques (McDonald 2003; Borde et al. 2014). It has been shown by Lukić et al. (2015) that the accuracy of this method is around 10%. A different widely used strategy is to mimick baryon pressure by smoothing the matter density field with a Gaussian kernel (Gnedin & Hui 1996; Meiksin & White 2001; Viel et al. 2002, 2006; Rorai et al. 2013). The flux field produced by the smoothed density field can then be computed imposing a polytropic temperature–density relation to the IGM (Hui & Gnedin 1997). This method reproduces the statistics of the flux field reasonably well. For example, Meiksin & White (2001) claim 10% agreement between Gaussian-smoothed collisionless simulations and hydrodynamic simulations in the cumulative distribution of the flux.

A more refined way to reconstruct the baryon density is applying ad hoc transformations to the matter density field, calibrated with a reference hydrodynamic simulation (Viel et al. 2002). Mocks of Lyα forest spectra can be obtained generating a Gaussian random field and then transforming it so that it matches a certain flux PDF and power spectrum (Font-Ribera & Miralda-Escudé 2012; Font-Ribera et al. 2012a; Bautista et al. 2015). Recently, Peirani et al. (2014) exploited a hydrodynamic simulation to calibrate a mapping from the density field of an N-body simulation to the Lyα forest flux, tuned to reproduce the PDF of the flux. Then, artificial flux skewers are created in order to reproduce the two-point function of the flux given by the calibrating simulation. This is done first by computing the conditional PDF of the flux, given the DM density, from the reference simulation. Subsequently, each pixel is assigned a value of such "conditional flux." This procedure seems to yield reasonable correlation functions, but noisy skewers as well. This problem is remedied by drawing flux values from the Gaussianized percentile distribution of the conditional flux, and then forcing it to match the PDF of the conditional flux. Visually examining the plots of the resulting flux power spectrum, it appears close to the one provided by the reference hydrodynamic simulation, but the accuracy is not quantified by the authors.

The lack of quantitative assessments in the literature makes it harder to compare the results obtained via different methods. Conducting a more quantitative study is important for establishing which problems can be addressed by what methods. Another important point regards the value of the filtering scale generally adopted in the Gaussian smoothing of matter. The value of the filtering scale has not yet been measured from observational data (Rorai et al. 2013). So, in previous numerical studies it has been set to "reasonable" values, in any case not smaller than the mean inter-particle spacing of the simulations involved (otherwise, the smoothing would have a negligible effect). For example, White et al. (2010) use 139 ckpc as a smoothing scale in a simulation with a box size of 1.02 cGpc and 4000³ particles. Other authors have chosen larger values, for example, Peirani et al. (2014) compare their method with Gaussian-smoothed DM simulations with a filtering scale of 300 h⁻¹ ckpc and 1 h⁻¹ cMpc.

In this work, we use the Gaussian smoothing technique as a starting point upon which we add more refined transformations of the matter density field. Following this line of reasoning, we develop two methods, named 1D-IMS and 3D-IMS, where IMS stands for "Iteratively Matched Statistics," the technique on which they are grounded. The purpose of our methods is to accurately obtain the flux statistics from collisionless simulations. This is done through hydro-calibrated mappings, which are conceptually simpler than those adopted by Peirani et al. (2014). We quantify how accurately our methods reproduce the PDF, 1DPS and 3DPS of the flux given by a reference hydrodynamic simulation.

The high accuracy of our methods and a weak dependence on the initial smoothing scale, represents a clear advantage over the Gaussian smoothing technique. Our methods thus enable using large-box collisionless simulations, which do not resolve the Jeans scale. One important application we have in mind is modeling the BAO signature in the Lyα forest. However, there are more topics that can benefit from it: studies of UV background fluctuations, cross-correlations between galaxies and Lyα forest, and others.

The paper is organized as follows. In Section 2, we describe our simulations and the calculation of Lyα flux. In Section 3, we discuss the impact of the most important assumptions underlying approximate techniques to predict the Lyα forest in collisionless simulations. The Gaussian smoothing method is explored in great detail and we present a first quantitative analysis of its accuracy in reproducing the 3DPS of flux, as a function of the smoothing length. In Section 4, we describe 3D-IMS and 1D-IMS, assessing their accuracy. We compare the performances of the various methods in Section 5. In Section 6, we apply 3D-IMS to a future relevant context: we compute the flux statistics through an N-body simulation, calibrating the transformations involved in our technique with a smaller hydrodynamic simulation. We show that the method retains accuracy, while at the same time we demonstrate that the Gaussian smoothing technique does not yield accurate predictions when applied to simulations involving large boxes. We also compare the techniques considered by us with previous work in Section 7. Finally, in Section 8, we present our conclusions, discussing possible future applications of our work as well.

2. SIMULATIONS

The hydrodynamic simulations we use in this paper are carried out with the Nyx code (Almgren et al. 2013; Lukić et al. 2015), while N-body runs are performed with the Gadget code (Springel 2005). Both codes employ leapfrog—a second order accurate method for integrating the equations of motion of particles. Both codes also adopt the particle-mesh (PM) method with cloud-in-cell (CIC) interpolation for calculating gravitational forces. On top of the PM calculation, Gadget adds gravitational short-range force using Barnes-Hut (Barnes & Hut 1986) hierarchical tree algorithm, therefore, going to higher resolution than our Nyx runs done on a uniform Cartesian grid. Nyx, in addition to gravity, solves equations of gas dynamics using a second order accurate piecewise parabolic method. To better reproduce the 3D fluid flow, a dimensionally unsplit scheme with full corner coupling is adopted (Colella 1990). Heating and cooling are integrated using VODE (Brown et al. 1989) and are coupled to hydrodynamics through Strang splitting (Strang 1968). All cells are assumed to be optically thin and radiative feedback is considered only through the UV background model, given by Haardt & Madau (2012). For cooling rates and further details on the physics in Nyx simulations, we refer the reader to Lukić et al.'s (2015) paper. The cosmological model assumed is the ΛCDM model with parameters consistent with the seven-year data release of WMAP (Komatsu et al. 2011): Ω_m = 0.275, ${{\rm{\Omega }}}_{{\rm{\Lambda }}}=1-{{\rm{\Omega }}}_{{\rm{m}}}=0.725$ , Ω_b = 0.046, h = 0.702, σ₈ = 0.816, n_s = 0.96. Six baryonic species are implemented: e⁻, H i, H ii, He i, He ii, and He iii. The simulations are initialized at z = 159 with a grid distribution of particles and Zel'dovich approximation (Zel'dovich 1970).

To recover the absorption spectra from our simulations, we choose the lines of sight, which we refer to as "skewers," drawn parallel to one of the sides of the simulation box. The optical depth is computed according to Equation (1). After extracting skewers, we rescale the optical depth so that the mean flux is $\langle F\rangle =0.68$ at z = 3, a value consistent with current observations (Faucher-Giguère et al. 2008; Becker & Bolton 2013). Unless otherwise indicated, the results presented in this work refer to a redshift of z = 3, but in order to confirm that our conclusions are not dependent on redshift, we have also analyzed redshifts z = 2 and z = 4.

In this work, we use two hydrodynamic and two N-body simulations. The two Nyx hydrodynamic simulations have identical physics and the same spatial resolution, differing only in the choice of the box size: the smaller one has a box of 14.2 cMpc on a side, while the larger one has a box of 114 cMpc. The two simulations have 512³ and 4096³ resolution elements, respectively, and they were part of the convergence study done in Lukić et al. (2015). We will first use only one Nyx simulation to test how well we can reproduce the forest statistics given only DM particles and no gas information. This simulation is ∼1% converged resolution-wise, but the box size is too small for accurate reproduction of flux statistics. However, the main point of our work is to test how well we can match the given flux statistics, and it is irrelevant how accurately the statistics describe a particular cosmological model. In other words, it is important to have a resolution good enough to correctly capture small-scale physics, but it does not matter that the large-scale power is missing in the simulation.

We then want to test how accurately the forest statistics can be reproduced in large-volume simulations, i.e., with box sizes of 1 cGpc and larger. Of course, we do not have the "true" answer for such large boxes, as it would be obtained with hydrodynamic simulations. Instead, we will use the results of the large-box Nyx run, which is demonstrated to be converged both in resolution and box size (Lukić et al. 2015), as the "truth," and we will reconstruct its flux statistics using an N-body Gadget run in the same box together through the small-box Nyx run. We ran two Gadget N-body simulations with the same box size as the larger Nyx simulation, 114 cMpc, but with only 512³ and 256³ particles. The number of particles in Gadget runs is chosen to be representative of the mean inter-particle spacing in the state-of-the-art N-body simulations of "Hubble" volumes (e.g., Habib et al. 2012, 2013; Skillman et al. 2014). As an example, we show in the left panel of Figure 1 all particles in a 0.7 cMpc thick slice from the 256³ Gadget run. This run in 114 cMpc box yields approximately the same mass resolution as one trillion particles in a 4 cGpc simulation would. The right panel displays only 1% of the particles in the same region, from the 4096³ Nyx hydrodynamic run. The comparison of the two panels makes the high resolution, which is needed for modeling the Lyα forest statistics at ∼1% accuracy, immediately apparent.

Gadget simulations share the same phases in the initial conditions as the large Nyx run (as clearly visible in Figure 1), enabling the comparison of individual skewers. We emphasize that although somewhat artificial, this test is actually more difficult than the real-world situation, and thus we expect that we can only overestimate the error of our method. The reason is that, in reality, we would use a fully converged ∼100 cMpc hydrodynamic simulation to model flux statistics in ∼1 cGpc N-body simulations, making box size errors negligible, whereas here we cannot avoid them.

2.1. Lyα Skewers

The Lyα forest arises from the scattering of photons along their path from a background quasar to the observer. The fraction of the transmitted flux is $F=\exp (-\tau )$ , where τ is the opacity of the intervening IGM. The opacity in redshift space at a given velocity coordinate u along the line of sight is given by

$\begin{eqnarray}&&\tau (u)=\int {du}^{\prime} \,\displaystyle \frac{{\lambda }_{\mathrm{Ly}\alpha }\sigma \,{n}_{{\rm{H}}{\rm{I}}}({{\boldsymbol{u}}}^{\prime })}{H(z)b({{\boldsymbol{u}}}^{\prime })}\exp \left[-\displaystyle \frac{{(u-{u}_{0}({{\boldsymbol{u}}}^{\prime }))}^{2}}{b{({{\boldsymbol{u}}}^{\prime })}^{2}}\right]\end{eqnarray} \tag{ 1 }$

where u' is the component of the Hubble flow velocity field ${{\boldsymbol{u}}}^{\prime }$ along the line of sight, over which the integral is calculated. In the above expression, ${n}_{{\rm{H}}{\rm{I}}}({{\boldsymbol{u}}}^{\prime })$ is the number density of neutral hydrogen and σ and λ_Lyα are the cross section⁴ and wavelength of the Lyα transition in the rest frame, respectively. The line-of-sight velocity of gas particles is then given by ${u}_{0}({{\boldsymbol{u}}}^{\prime })=u^{\prime} +{u}_{\mathrm{pec}}({{\boldsymbol{u}}}^{\prime })$ , the second term being the peculiar velocity of the gas. In Equation (1), thermal broadening is described by $b({{\boldsymbol{u}}}^{\prime })=\sqrt{2{k}_{B}T({{\boldsymbol{u}}}^{\prime })/{m}_{{\rm{p}}}}$ , where $T({{\boldsymbol{u}}}^{\prime })$ is the temperature of the gas and m_p the proton mass. The convolution with thermal broadening and peculiar velocities actually yields a Voigt profile in Equation (1) instead of a Gaussian. However, the latter is a good approximation for τ < 100 (Lukić et al. 2015), a regime relevant for the Lyα forest studies. Computation of τ requires a determination of the neutral hydrogen density, which in turn depends on baryon density and temperature, as well as the hydrogen ionization and recombination rates. The challenge for approximate methods is to recover relevant Lyα forest statistics, without the knowledge of baryon thermodynamical quantities.

3. LIMITATIONS OF APPROXIMATE METHODS

The very first task of approximate methods is to obtain an estimate for the baryon density field. This is commonly done via manipulation of the density field in an N-body run (Meiksin & White 2001; Viel et al. 2002; Peirani et al. 2014), to account for the baryonic pressure smoothing. The functional form for the smoothing is usually a Gaussian and that is indeed the starting point for all methods considered in this paper. Second, approximate methods need other assumptions, concerning the estimate of the temperature of the IGM and its velocity field. In this section, we review these approximations and assess their impact on the accuracy of the Lyα flux, as a function of the smoothing length. In order to better understand inherent limitations of the approximate methods, we will also consider separately the accuracy of baryon density reconstruction of other thermodynamic quantities.

3.1. Gaussian Smoothing

A pseudo baryon density field can be generated from a collisionless simulation by smoothing the matter density fluctuations δ at a characteristic smoothing length λ_G, given as

$\begin{eqnarray}&&{\delta }^{{\lambda }_{G}}({\boldsymbol{k}})=\delta ({\boldsymbol{k}})\exp (-{\lambda }_{G}^{2}{k}^{2}).\end{eqnarray} \tag{ 2 }$

This length is expected to be of the order of the Jeans filtering scale, which is in comoving units (Binney & Tremaine 2008):

$\begin{eqnarray}&&{\lambda }_{J}^{2}(t)=\displaystyle \frac{{c}_{{\rm{s}}}^{2}(t)a(t)}{4\pi G{\rho }_{0}},\end{eqnarray} \tag{ 3 }$

where c_s is the speed of sound at time t, a(t) is the scale factor, ρ₀ is the mean matter density, and G is Newton's gravitation constant. The same line of reasoning can be applied to the line-of-sight velocities of particles as well. Once both matter density and velocities are smoothed, flux skewers can be computed with some approximation for the IGM temperature (discussed in Section 3.2), replacing baryon density and velocity fields with the corresponding smoothed matter quantities. For the sake of clarity, we summarize the inputs required to apply the Gaussian smoothing technique (and our methods, which will be discussed in Sections 4.1 and 4.2) in Table 1.

Table 1. Inputs Needed for the Different Methods Considered

Method	DM Particle Distribution	(λ_G, T₀, γ)	F_real: 3D Power Spectrum and PDF	F: 1D Power Spectrum and PDF
GS+FGPA	✓	✓
3D-IMS	✓	✓	✓
1D-IMS	✓	✓	✓	✓

Download table as: ASCII Typeset image

There are quantitative studies in the literature aiming to understand how well the Gaussian smoothing technique reproduces various flux statistics computed through hydrodynamic simulations (see Section 7 for details), but none of them consider the flux 3DPS. We take λ_G as a free parameter and assess the accuracy with which the Gaussian smoothing technique recovers the flux 1DPS, PDF, and 3DPS, through the following steps.

1.
We have particle positions and velocities from a simulation. We deposit them on a grid using CIC deposition; we use the grid with as many cells as the number of particles in the simulation.
2.
We smooth this density field with a certain smoothing scale λ_G and the velocity field at 228 ckpc (see the Appendix).
3.
We compute 1DPS, 3DPS, and PDF of the flux field obtained using FGPA (see Section 3.2).

3.2. Fluctuating Gunn–Peterson Approximation

Equation (1) can be simplified expressing the neutral hydrogen density ${n}_{{\rm{H}}{\rm{I}}}$ as a function of the baryon density fluctuations δ_b. Let us consider a gas composed by hydrogen and helium. Let ${x}_{{\rm{H}}{\rm{II}}}$ , x_{He ii}, and x_{He iii} be the fractions of ionized hydrogen, singly and doubly ionized helium respectively. The total number densities of hydrogen n_H and helium n_He are related through ${n}_{{\rm{He}}}=\chi {n}_{{\rm{H}}}$ , where $\chi =X/4Y$ . Assuming photoionization equilibrium, the number density of neutral hydrogen is given by

$\begin{eqnarray}&&{n}_{{\rm{H}}{\rm{I}}}=\displaystyle \frac{\alpha (T)}{{{\rm{\Gamma }}}_{{\rm{H}}{\rm{I}}}}{x}_{{\rm{H}}{\rm{II}}}[(1+\chi ){x}_{{\rm{H}}{\rm{II}}}+\chi {x}_{\mathrm{He}{\rm{III}}}]{n}_{{\rm{H}}}^{2,}\end{eqnarray} \tag{ 4 }$

where $\alpha (T)\propto {T}^{-0.7}$ is the Case A recombination coefficient per proton and Γ_{H i} is the photoionization rate of hydrogen. The commonly used Case A and B definitions differentiate media that allow the Lyman photons to escape or that are opaque to these lines (except for Lyα), respectively. Case A is used in the optically thin limit, which can be applied in most of the Lyα forest, since it describes regions where the density is low enough to let Lyman limit photons escape (Furlanetto et al. 2006; Kuhlen & Faucher-Giguère 2012; see also the discussion in Miralda-Escudé 2003 and Kaurov & Gnedin 2014). If helium is only singly ionized, the factor between square brackets in Equation (4) becomes $(1+\chi ){x}_{{\rm{H}}{\rm{II}}}$ , while for ${x}_{{\rm{H}}{\rm{II}}}={x}_{\mathrm{He}{\rm{III}}}$ it is $(1+2\chi ){x}_{{\rm{H}}{\rm{II}}}$ . Apart from the detailed modeling of the ionized fractions, the important point of Equation (4) in this context is that ${n}_{{\rm{H}}{\rm{I}}}\propto {T}^{-0.7}{n}_{{\rm{H}}}^{2}$ .

Simulations show that the temperature–density relationship of the IGM is a power law over a wide range of density and temperature (Hui & Gnedin 1997), so that

$\begin{eqnarray}&&T({\boldsymbol{u}})={T}_{0}{(1+{\delta }_{{\rm{b}}}({\boldsymbol{u}}))}^{\gamma -1}\end{eqnarray} \tag{ 5 }$

where T₀ and γ are constants. From our simulation, at redshift z = 3, we obtained T₀ = 1.09 × 10⁴ K and γ = 1.56, following the fitting procedure described by Lukić et al. (2015). Assuming (5), the relationship between n_{H i} and δ_b can be expressed in terms of the parameters of our simulation as follows.

$\begin{eqnarray}{n}_{{\rm{H}}{\rm{I}}}({\boldsymbol{u}}) & = & A\displaystyle \frac{8.28\,\times \,{10}^{-13}\,{{\rm{s}}}^{-1}}{{{\rm{\Gamma }}}_{{\rm{H}}{\rm{I}}}}\displaystyle \frac{{{\rm{\Omega }}}_{{\rm{b}}}{h}^{2}}{0.0227}{\left(\displaystyle \frac{1+z}{4}\right)}^{3}\\ & & \times {\left(\displaystyle \frac{{T}_{0}}{1.09\times {10}^{4}{\rm{K}}}\right)}^{-0.7}{[1+{\delta }_{{\rm{b}}}({\boldsymbol{u}})]}^{2-0.7(\gamma -1)},\end{eqnarray} \tag{ 6 }$

where A is a proportionality constant. For our simulation, A = 3.09 × 10⁻¹² cm⁻³. Neglecting the scatter in the temperature–density relationship of the IGM, i.e., assuming (5) and consequently ${n}_{{\rm{H}}{\rm{I}}}\propto {(1+{\delta }_{b})}^{2-0.7(\gamma -1)}$ , is usually referred to as "Fluctuating Gunn–Peterson Approximation" (FGPA; Weinberg et al. 1997; Croft et al. 1998, pp. 69–75). Since the FGPA is useful when one cannot, or does not, wish to run a hydrodynamic simulation, one also needs an approximation for δ_b in Equation (6). For this reason, in any practical situation, δ_b is replaced by the DM density fluctuations δ_DM, with or even without Gaussian smoothing. For the sake of clarity, in the remainder of our work, we shall refer solely to the operation described by Equation (2) with "Gaussian smoothing." On the contrary, the Gaussian smoothing of the DM density field, combined with the FGPA to compute the Lyα flux field, shall be denoted as "Gaussian smoothing and FGPA" (GS+FGPA).

We now define a new field, the "flux in real space" (or simply "real flux") F_real as the flux that would be obtained neglecting thermal broadening and peculiar velocities. This is not a physical observable, but the shape of its power spectrum is sensitive to the Jeans scale (Kulkarni et al. 2015) and it will be a useful quantity in our computations. As such, we can define the opacity in real space

$\begin{eqnarray}&&{\tau }_{\mathrm{real}}({\boldsymbol{u}})=\displaystyle \frac{{\lambda }_{\mathrm{Ly}\alpha }\sigma }{H(z)}{n}_{{\rm{H}}{\rm{I}}}({\boldsymbol{u}}).\end{eqnarray} \tag{ 7 }$

Within the FGPA, ${\tau }_{{\rm{real}}}\propto {n}_{{\rm{H}}{\rm{I}}}\propto {(1+{\delta }_{b})}^{2-0.7(\gamma -1)}$ . Convolving (7) with the gas velocities and thermal broadening, one obtains (1).

3.3. Accuracy of FGPA

We now assess the accuracy of the FGPA using the baryon density field from the hydrodynamic simulation as a reference. Indeed, we want to focus on how the accuracy of the FGPA is influenced by approximating the baryon velocity field with the Gaussian-smoothed DM velocity field and by the assumption that the temperature–density relationship is a power law. In this way, we investigate the "inherent" accuracy of the FGPA.

In our hydro simulation, we also have the velocities of DM particles. Thus we construct the velocity field by CIC-binning them on a grid with as many cells as the number of particles and then smooth it with a Gaussian kernel. In principle, the smoothing length of velocity could be different from the one of the DM density field. We keep it fixed to 228 ckpc throughout the paper, since we verified that this value gives the best overall accuracy in reproducing the statistics considered (see the Appendix for further details). However, we have also checked that modifying the smoothing length for the velocity field does not significantly change our conclusions.

In Figure 2, we show different physical quantities along one skewer as an example, to display the differences between the hydrodynamic simulation (solid green lines) and the FGPA (dashed blue lines). The top panel shows the density fluctuations along the skewer considered. The second panel underscores the differences between a temperature–density relationship with no scatter and the temperature given by the hydrodynamic simulation. We see that the biggest differences arise around the highest density peaks, where shocks could be present. In the third panel, we plot the line-of-sight velocity of DM particles⁵ (black line) and baryons (green line). Here we also plot the smoothed DM velocity, which is the one we actually adopt (dashed blue line). In the last panel, we show the difference between the flux computed as explained in this section and from the hydrodynamic simulation. We notice that the FGPA recovers the flux skewer remarkably well.

We show the results about the statistics of flux skewers in Figures 3 and 4. In the upper panels of Figure 3, we show the flux 1DPS and PDF given by the hydrodynamic simulation and the FGPA applied as explained above. In the lower panels, we show the relative difference of the statistics obtained with respect to the results of the reference simulation. Analogous plots for the flux 3DPS can be seen in Figure 4. We recall that the 3DPS can be expressed as a function of the norm of the ${\boldsymbol{k}}$ -mode considered and of $\mu =\hat{{\boldsymbol{n}}}\cdot {\boldsymbol{k}}/k$ , where $\hat{{\boldsymbol{n}}}$ is the unit vector parallel to the line of sight. We shall denote the dimensionless 3DPS as ${{\rm{\Delta }}}^{2}(k,\mu )={k}^{3}{P}_{F}(k,\mu )/2{\pi }^{2}$ .

**Figure 4.** We show the dimensionless 3DPS Δ²(k, μ) of the flux given by our reference hydrodynamic simulation (solid green lines) and of the flux computed by imposing a one-to-one temperature–density relationship on the baryon density given by the hydrodynamic simulation, and using the Gaussian-smoothed line-of-sight velocities of dark matter instead of baryons (dashed blue lines). We consider four bins of μ, and show ${{\rm{\Delta }}}^{2}(k,\mu )$ as well as the relative difference between the spectra. The dashed vertical line marks the dynamic range considered to compute the accuracy of the FGPA (see the text for details). The relative errors plotted show the intrinsic limitations of approximate techniques predicting the Lyα forest through the manipulation of the DM density field given by collisionless simulations.
Download figure:
Standard image High-resolution image

**Figure 4.** We show the dimensionless 3DPS Δ²(k, μ) of the flux given by our reference hydrodynamic simulation (solid green lines) and of the flux computed by imposing a one-to-one temperature–density relationship on the baryon density given by the hydrodynamic simulation, and using the Gaussian-smoothed line-of-sight velocities of dark matter instead of baryons (dashed blue lines). We consider four bins of μ, and show ${{\rm{\Delta }}}^{2}(k,\mu )$ as well as the relative difference between the spectra. The dashed vertical line marks the dynamic range considered to compute the accuracy of the FGPA (see the text for details). The relative errors plotted show the intrinsic limitations of approximate techniques predicting the Lyα forest through the manipulation of the DM density field given by collisionless simulations.
Download figure:
Standard image High-resolution image

The accuracy of the FGPA depends on the Fourier modes considered for the power spectra and on the specific binning adopted for the flux PDF. We now define a set of parameters describing the overall goodness of the method. For this purpose, we first delimit a range of Fourier modes and flux in which it is sensible to compare the statistics obtained via the simulation and the FGPA. In current state-of-the-art high-resolution spectra, metal absorption features can increase the 1DPS at k ≳ 0.1 s km⁻¹ (McDonald et al. 2000, 2005; Lidz et al. 2010; Viel et al. 2013). For this reason, this is typically the maximum k considered in high-resolution power spectra studies. We shall therefore set our upper limit of the range of k considered to 0.1 s km⁻¹. This upper bound is indicated with the vertical dashed line in the left panels of Figure 3. The lower bound of the k-range considered by us is simply given by the scale of the simulation box. The overall accuracy of the FGPA is assessed by the arithmetic mean of the modulus of the relative error in the range of k considered:

$\begin{eqnarray}&&m=\displaystyle \frac{1}{N}\sum _{k\lt 0.1\,{\rm{s}}\,{\mathrm{km}}^{-1}}\,\displaystyle \frac{| {P}_{F}^{\mathrm{hydro}}(k)-{P}_{F}^{\mathrm{FGPA}}(k)| }{{P}_{F}^{\mathrm{hydro}}(k)}\end{eqnarray} \tag{ 8 }$

where N is the number of modes in such a range. A small value of m implies a good mean accuracy. Note, however, that it does not necessarily mean that the accuracy is good everywhere. Indeed, a low value of m can be achieved by a set of points where the relative error is extremely close to zero for many of them but large for just a couple of modes. In other words, m tells us nothing about the dispersion of the relative error around its mean value. To estimate such a dispersion, we simply compute the rms s of the relative error in the range considered:

$\begin{eqnarray}&&{s}^{2}=\displaystyle \frac{1}{N}\sum _{k\lt 0.1\,{\rm{s}}\,{\mathrm{km}}^{-1}}\,{\left(\displaystyle \frac{| {P}_{F}^{\mathrm{hydro}}(k)-{P}_{F}^{\mathrm{FGPA}}(k)| }{{P}_{F}^{\mathrm{hydro}}(k)}-m\right)}^{2}.\end{eqnarray} \tag{ 9 }$

The range within which we compute m and s is 0.1 < F < 0.9. The upper bound means that we are excluding a range of flux often limited by continuum placement uncertainties (Lee 2012), whereas the lower bound translates into ignoring flux values susceptible to inaccuracies in modeling optically thick absorbers (Lee et al. 2015). The same analysis is applied to the 3DPS as well, by doing a separate calculation for each bin of μ (we consider four μ-bins, evenly spaced between zero and one).

The mean accuracy of FGPA at a smoothing length of 228 ckpc in reproducing 1DPS and PDF of the flux given by the hydrodynamic simulation is 2%. For the 3DPS, it is between 3% and 5%, depending on the μ-bin considered. We stress that these levels of accuracy are obtained employing in the computations the baryon density provided by the hydrodynamic simulation. This means that, regardless of how well we create the pseudo density field, this sets our limiting accuracy. To improve it even more, one should come up with more refined ways of reproducing the velocity field and the scatter in the temperature–density relationship.

To sum up, one source of error is considering DM velocities instead of baryonic ones. This is minimized because we looked for the optimal smoothing length for the velocity field. The remaining uncertainty arises from the scatter in the temperature–density relationship, which is not captured by the FGPA.

4. ITERATIVELY MATCHED STATISTICS

To better model the Lyα forest in collisionless simulations, we developed two novel methods, which iteratively match certain Lyα forest flux statistics given as input. The most accurate inputs today come from hydrodynamic simulations, and that is what we use here. We name this technique "Iteratively Matched Statistics" (IMS); the two methods are called 3D-IMS and 1D-IMS.

4.1. 3D Iteratively Matched Statistics

The basic idea of 3D-IMS is to compute the flux from a collisionless simulation and match its one- and two-point statistics to a reference hydrodynamic simulation. Because redshift space distortions and thermal broadening make the flux field anisotropic, for simplicity, we conduct this matching in real space, where the flux is an isotropic random field. So, in general, one needs a collisionless simulation and a model for the 3D power spectrum and PDF of the flux in real space to apply 3D-IMS. In our case, the model for these statistics is the result of our hydrodynamic simulation. The tabulated 3D power spectrum and PDF of the flux in real space are the inputs of the method, together with the DM particle distribution given by the collisionless simulation and the thermal parameters of the IGM (see Table 1). Before going into the details of the procedure, it is worth enumerating the main steps, to better understand the logical flow.

1.
As a starting point, the DM density is smoothed with a Gaussian kernel with a smoothing length λ_G. In a situation where the DM was simulated on a coarse grid (e.g., with a PM code), λ_G would be at least as large as the inter-particle simulation. The smoothed field is used to compute the flux in real space within the FGPA, following Equations (6) and (7). We shall call this flux field ${F}_{{\rm{real}}}^{\mathrm{DM}}$ .
2.
The input real flux dimensionless 3DPS and PDF, taken from the hydrodynamic simulation, are used to calibrate two transformations. Such transformations are iteratively applied to ${F}_{{\rm{real}}}^{\mathrm{DM}}$ , forcing its dimensionless 3DPS and PDF to match the ones given as input. The iterations are implemented until both statistics are matched with high precision.
3.
From the resulting pseudo real flux field, a pseudo baryon density field is obtained inverting Equations (7) and (6).
4.
The pseudo baryon density is Gaussian-smoothed with a smoothing length equal to the size of a grid cell. As we shall explain later, this step is necessary to remove hot pixels that give rise to non-physical density skewers. The smoothed baryon density field is then used to compute flux skewers within the FGPA.

The points just enumerated, which can be visualized using the flow chart in Figure 5, give our method its name: Iteratively Matched Statistics (IMS). The prefix 3D stresses that we are matching the dimensionless 3DPS of the flux in real space. Matching this statistics is straightforward, as the ${F}_{\mathrm{real}}$ 3DPS can be analytically fit by a power law with a Gaussian cutoff (Kulkarni et al. 2015) and is isotropic in redshift space.

On the contrary, reproducing the 3DPS of flux in redshift space would be more complicated because it is an anisotropic power spectrum. It would require performing transformations in real space after deconvolving redshift space distortions and thermal broadening. Matching the 3DPS of the baryon density would not be optimal either, since it does not exhibit an obvious Jeans cutoff (Kulkarni et al. 2015), being dominated by higher density structures in collapsed halos at small scales. Although these rare dense regions dominate the baryon power spectrum, they contribute negligibly to variations in the Lyα forest flux because the exponentiation of the opacity field maps them to zero. As such, we choose to match the statistics of the real-space flux field, since this is an isotropic field that is directly related to the observable, that is, the flux in redshift space.

We shall now examine the details of each step of the method. We want to remap ${F}_{{\rm{real}}}^{\mathrm{DM}}$ to a new field ${F}_{\mathrm{real}}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}$ with the same dimensionless 3DPS as ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ . To do this, let us consider the real flux fluctuations ${\delta }_{{F}_{{\rm{real}}}^{\mathrm{DM}}}$ and ${\delta }_{{F}_{\mathrm{real}}^{{\rm{HYDRO}}}}$ in Fourier space. We define ${F}_{\mathrm{real}}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}$ as ${\delta }_{{F}_{\mathrm{real}}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}}({\boldsymbol{k}})=T(k){\delta }_{{F}_{{\rm{real}}}^{\mathrm{DM}}}({\boldsymbol{k}})$ , where T(k) is a function tuned to match the dimensionless 3DPS of ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ . We shall call it the "transfer function" and its explicit expression is given by

$\begin{eqnarray}&&T(k)=\sqrt{\displaystyle \frac{{{\rm{\Delta }}}_{{F}_{\mathrm{real}}^{{\rm{HYDRO}}}}^{2}(k)}{{{\rm{\Delta }}}_{{F}_{{\rm{real}}}^{\mathrm{DM}}}^{2}(k)}},\end{eqnarray} \tag{ 10 }$

where ${{\rm{\Delta }}}_{X}^{2}(k)={k}^{3}{P}_{X}(k)/2{\pi }^{2}$ denotes the dimensionless 3DPS of field X. Let us point out that in our case it is straightforward to apply Equation (10), because both ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ and ${F}_{{\rm{real}}}^{\mathrm{DM}}$ sample the same modes, having been built from the same simulation. However, one can apply it also to the more interesting case where ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ is computed from a small-box hydrodynamic simulation and ${F}_{{\rm{real}}}^{\mathrm{DM}}$ from a large-box N-body simulation. This will be discussed into more detail in Section 6.

At this point, we compute the pseudo real flux field simply as ${F}_{\mathrm{real}}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}({\boldsymbol{x}})={\bar{F}}_{\mathrm{real}}^{{\rm{HYDRO}}}(1+{\delta }_{{F}_{\mathrm{real}}^{3{\rm{D}}-\mathrm{IMS}}}({\boldsymbol{x}}))$ , where ${\bar{F}}_{\mathrm{real}}^{{\rm{HYDRO}}}$ is the mean value of the real flux field obtained from the hydrodynamic simulation.

The field ${F}_{\mathrm{real}}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}$ does not have the same PDF as ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ . To match the PDF, we use the argument explained by Peirani et al. (2014). We compute the cumulative distribution of both fields and we construct a mapping between the two fluxes by assigning to each value of ${F}_{\mathrm{real}}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}$ the value of ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ corresponding to the same percentile in their respective cumulative distributions. We now have a new pseudo real flux field, whose PDF matches by construction the one of ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ . However, its dimensionless 3DPS is no longer the same as ${{\rm{\Delta }}}_{{F}_{\mathrm{real}}^{{\rm{HYDRO}}}}^{2}(k)$ .

To match both dimensionless 3DPS and PDF of ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ , we iterate the two transformations. We verified that both 3DPS and PDF converge to their counterparts in the simulation. This is a non-trivial result.⁶ Convergence occurs between 10 and 20 iterations, after which the improvement in the transfer function at every additional iteration is less than 0.3%. It is worth pointing out that every time we match the 3DPS there is no warranty that the new flux field has physically meaningful values, i.e., between zero and one. This is indeed the case, so we cannot simply compute ${\delta }_{{\rm{b}}}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}$ from the resulting flux field. This issue is fixed naturally when we match the PDF. Since the distributions are mapped percentile to percentile and ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ obviously contains only physical values, pixels with negative flux are mapped to small but positive values and pixels with flux larger than one are mapped to values close to but less than one. It is then fundamental to conclude the iteration process matching the PDF.

At the end of the last iteration, we have the final pseudo real flux field, whose PDF matches by construction the one of ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ . Since in our model there is a one-to-one correspondence between δ_b, ${n}_{{\rm{H}}{\rm{I}}}$ , and F_real, the PDF of the pseudo baryon density and the hydrogen number density have also converged to an asymptotic distribution. However, in the hydrodynamic simulation there is no such correspondence, since skewers are not computed within the FGPA. As a result, the PDF of the final ${\delta }_{{\rm{b}}}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}$ does not perfectly match the corresponding field ${\delta }_{{\rm{b}}}^{{\rm{HYDRO}}}$ from the reference hydrodynamic simulation. Furthermore, pseudo baryon density skewers present some non-physical cuspy overdensities. They arise because, whereas the transformation matching the dimensionless 3DPS of real flux keeps the overall shape of real flux skewers intact, it generally changes the value of the real flux in single pixels. Even a fluctuation as little as ∼10⁻³ can increase the real flux at a local minimum above the values of the neighboring pixels, effectively changing their rank ordering. Since the transformation matching the real flux PDF preserves the rank ordering of pixels, and because of the exponentiation of Equation (7), low-flux regions can give rise to large discontinuities in density. These cusps can be eliminated with a Gaussian smoothing. In this way, the density values in neighboring pixels are "blended" together and, as a result, very high values are turned into physical ones. The drawback is that, if we compute the real flux from the smoothed field, it will not have the same PDF as ${F}_{\mathrm{real}}^{{\rm{HYDRO}}}$ anymore. A good compromise is adopting the shortest possible length scale for the smoothing, that is, the size of one cell of the grid on which we CIC-binned the DM particle distribution. In our case, that corresponds to 28 ckpc. We emphasize here that this last smoothing must always be below the smallest relevant physical scale in the hydrodynamic simulation, which in our context is the Jeans scale, not to considerably affect the resulting statistics.

Running the method for different values of λ_G, we investigate if there is a trend of the accuracy of the various flux statistics. We remind the reader that the initial smoothing serves only as a starting point for the method. In any realistic situation, the value of λ_G is going to be related to the inter-particle separation of the underlying simulation. Indeed, smoothing on a scale smaller than that would make the PDF of the baryon density inaccurate, especially in voids (Rorai et al. 2013), which are the most relevant regions as far as the Lyα forest signal is concerned. The results of our analysis are discussed in Section 5.

4.2. 1D Iteratively Matched Statistics

The method called 1D-IMS has 3D-IMS as a starting point, on top of which further transformations are applied. Alongside the inputs required by 3D-IMS, one needs to provide a model for the line-of-sight power spectrum and PDF of the flux in redshift space as well (see Table 1). Once again, we computed these inputs from the hydrodynamic simulation. After running 3D-IMS, we are left with a real flux field whose dimensionless 3DPS and PDF match the ones of the real flux from the reference hydrodynamic simulation. We then compute the flux in redshift space and apply again the IMS procedure, this time aiming at matching the dimensionless line-of-sight power spectrum and PDF of the flux in redshift space from the hydrodynamic simulation. As in 3D-IMS, we apply two ad hoc transformations. Analogously to Equation (10), we define a transfer function as follows.

$\begin{eqnarray}&&T(k)=\sqrt{\displaystyle \frac{{P}_{{F}^{\mathrm{HYDRO}}}^{{\rm{1D}}}(k)}{{P}_{{F}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}}^{{\rm{1D}}}(k)}},\end{eqnarray} \tag{ 11 }$

where ${P}_{{F}^{\mathrm{HYDRO}}}^{{\rm{1D}}}(k)$ and ${P}_{{F}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}}^{{\rm{1D}}}(k)$ are the line-of-sight power spectra of the flux in redshift space given by the hydrodynamic simulation and obtained after running 3D-IMS, respectively. After multiplying the Fourier modes of the fluctuations of ${F}^{3{\rm{D}} \mbox{-} \mathrm{IMS}}$ by T(k), we have a flux field whose dimensionless power spectrum matches ${{\rm{\Delta }}}_{{F}^{\mathrm{HYDRO}}}^{2}(k)$ by construction.

At this point, we match its PDF to the one given by the hydrodynamic simulation exploiting the cumulative distributions, just like in Section 4.1. We then reiterate the two transformations until we achieve convergence in both 1DPS and PDF. Since these statistics are now matched by construction, it would be interesting to check if the 3D correlations are preserved. We then investigate the trend of the accuracy of the 3DPS as a function of λ_G.

We run 1D-IMS for different values of the initial smoothing length λ_G. The results are discussed in the next section.

5. VALIDATION OF IMS

After implementing the methods described in the previous sections, we assess the accuracy with which we can reproduce the results of the hydrodynamic simulation. In Section 5.1, we compare the performance of the various techniques in reproducing the skewers of the simulation. In Sections 5.2 and 5.3, we investigate how accurately the statistics of flux are recovered.

5.1. Skewers

In Figure 6, we show different quantities along one skewer as an example. From top to bottom, we plot the baryon density fluctuations, temperature, velocity field, flux in real space, and flux in redshift space. In all panels, solid green lines refer to the skewers extracted from the hydrodynamic simulation, solid blue lines to the ones obtained through GS+FGPA, and solid cyan and dashed red lines to 3D-IMS and 1D-IMS, respectively. Each curve corresponds to the optimal smoothing length for the respective method. The dashed blue line in the third panel from the top refers to the line-of-sight velocities of DM, Gaussian-smoothed at λ_G = 228 ckpc. This is the velocity field used in all approximate methods to compute all quantities above (see the Appendix).

We recall that 1D-IMS has 3D-IMS as a starting point and differs from it for two additional transformations to match the dimensionless 1DPS and the PDF of the flux in redshift space with the results from the hydrodynamic simulation. Therefore, the flux in real space, and consequently baryon density fluctuations and temperature fields, are the same as in 3D-IMS.

We can see that all methods result in skewers that trace those of the hydrodynamic simulation very well, and are also consistent with one another. This means that not only is IMS able to reproduce the statistics of the Lyα forest correctly, but it also generates reasonable mock skewers. This did not have to be the case. For example, the method LyMAS (Peirani et al. 2014) is designed to match the 1DPS and PDF of the flux from hydrodynamic simulations as well, but only the more complex version of LyMAS, which involves two additional transformations, produces reasonable-looking skewers. In our techniques, the mappings guarantee that both statistics and flux skewers are reproduced accurately.

Furthermore, not only is the flux accurately reproduced, but also the other quantities plotted in Figure 6. The biggest difference between IMS methods and GS+FGPA is that the former better reproduces high and narrow density peaks, like the ones around 3 and 4 cMpc in Figure 6.

Conversely, this is not always the case for smaller overdensities, such as the one around 0.7 cMpc in Figure 6, where IMS produces a lower density peak than in the hydro. Note, however, that at this location the flux in real space is still much more accurate with our methods, since they are designed to match its 3D power spectrum. Small differences in F_real can easily yield large differences in density because of the exponential in Equation (7). Such differences persist also in temperature, which in our context is connected to the density through a pure power law. Flux skewers in redshift space appear to be more similar among the various methods, since the convolution of real flux with the velocity field and thermal broadening tends to smooth out the differences.

5.2. Comparison of the Methods

All methods we considered have GS+FGPA as their starting point, with λ_G as a free parameter. We now compare the flux statistics given by each method with the ones from the hydrodynamic simulation, varying λ_G in the range of 0–570 ckpc, in steps of 57 ckpc. The dependence on this parameter of the accuracy in reproducing the various statistics is different for each method. It is generally possible to identify an optimal value of λ_G for a given method by matching a certain statistic, but this may not be optimal for all flux statistics.

In the top panels of Figure 7, we compare 1DPS and PDF of the flux in redshift space given by the hydrodynamic simulation (solid green line) to the approximate methods considered. Solid blue, solid cyan, and dashed red lines refer to GS+FGPA, 3D-IMS, and 1D-IMS, respectively. We plotted the curves corresponding to λ_G = 228 ckpc for each method. Since 1D-IMS is designed to match dimensionless 1DPS and PDF of the flux, solid green and dashed red lines are indistinguishable. We can see that 3D-IMS reproduces well both 1DPS and PDF, whereas GS+FGPA does not recover the 1DPS well. The lower panels make the comparison quantitative, showing the relative error of each method by recovering the results of the reference simulation. The gray shaded area represents the region within which the relative difference is smaller than the one obtained applying the FGPA to the baryon density given by the hydrodynamic simulation and using the smoothed DM velocity field, as explained in Section 3.3. We recall that this sets the limits on the accuracy due to adopting the DM-smoothed velocity field and neglecting the scatter in the temperature–density relationship of the IGM.

Figure 7 then tells us that 1D-IMS is able to recover the information lost with these approximations by construction, since it was forced to match the redshift space 1DPS and PDF of the hydrodynamic simulation. In contrast, the flux PDF given by 3D-IMS does not appear very accurate, perhaps even erroneously suggesting a flaw in the method. This is not the case, as 3D-IMS matches the PDF of the flux in real space, whereas in the right panel we are considering the PDF of the flux in redshift space. Although the relative error of the 3D-IMS PDF is as large as 30% at F = 0.2, the average accuracy is 15%. When the optimal value of λ_G is used for 3D-IMS (57 ckpc), the PDF is reproduced with an average accuracy of 8%. The variability of the accuracy at different flux values is not too surprising because in the last step of 3D-IMS we smooth the pseudo baryon density field to remove hot pixels and this impacts the accuracy of the corresponding flux PDF.

Figure 8 shows the 3DPS given by the simulation and the various methods at λ_G = 228 ckpc, as well as the relative error in matching this statistic. The color coding is the same as in Figure 7. Each panel refers to a different μ-bin of the 3DPS. In all bins, the accuracy of 3D-IMS and 1D-IMS looks on average comparable to the limit set by the FGPA.

In Figure 8, one can clearly see that GS+FGPA in the top-left panel (0.0 < μ < 0.25, farthest from the line of sight) does not match the hydrodynamic result as well as in the other panels. This is due to the different effects at work at different directions from the line of sight. For very transverse modes (0.0 < μ < 0.25) the behavior of baryons is mostly influenced by the filtering scale. Whereas for modes that are parallel to the line of sight (0.75 < μ < 1.0) the effect of the Jeans filtering is degenerate with thermal broadening and redshift space distortions (Rorai et al. 2013). As a result, in the bin closest to the line of sight (0.75 < μ < 1.0), one can compensate a bad choice of λ_G with an accurate description of the thermal state of the IGM, whose parameters (T₀ and γ) are obtained by fitting outputs of the hydrodynamic simulation. However, it is not possible to apply this correction in the bin with modes most transverse to the line of sight, where the effect of the filtering scale dominates the shape of the 3D power. Furthermore, 3D-IMS is superior to GS+FGPA to the extent that it also matches the dimensionless 3D power spectrum of flux in real space by construction.

Regarding 1D-IMS, it might seem puzzling that it does not perfectly match the result of the hydrodynamic simulation in the bin closest to the line of sight, since 1D-IMS is forced to reproduce the dimensionless line-of-sight power spectrum by construction. However, the 3DPS in the bin closest to the line of sight is not exactly the 1DPS. Indeed, the power spectrum in that bin considers all flux fluctuations whose wavevector forms an angle with the line of sight such that its cosine is between 0.75 and 1. This is a 3D region in redshift space. In the case of the 1DPS, the situation is much different, since one considers flux fluctuations exclusively along the direction of the line of sight. Therefore, the 1DPS and 3D power for the bin closest to the line of sight are not exactly the same, and matching the 1DPS by construction does not guarantee a perfect match in this bin of the 3DPS. It is true, however, that the agreement should be much better if one considers μ values progressively closer to being parallel to the line of sight (μ = 1), which we have verified directly.

5.3. Accuracy versus Smoothing Length

The best simulations reproducing BOSS/DESI-like surveys have a mean inter-particle separation of ∼400 ckpc. As we have already mentioned, Gaussian smoothing below the the inter-particle separation has a negligible effect. Therefore, it is of great interest to test the accuracy of GS+FGPA and our methods at different values of the smoothing length, including λ_G > 300 ckpc. For this purpose, we compute the mean and rms of the accuracy for each value of λ_G, as explained in Section 3.3.

We show the results of our analysis for the 1DPS and PDF in Figure 9, in the left and right panels, respectively. The information given by Figure 7 is condensed in three points here, one for each method, at the corresponding value of λ_G. Blue squares represent the mean accuracy m of GS+FGPA and the corresponding error bars the rms s, as defined in Equations (8) and (9), respectively. Likewise, cyan triangles and red circles refer to 3D-IMS and 1D-IMS, respectively. The information encoded by the gray shaded areas in Figure 7, which shows the limitations of the FGPA, is represented by the green band in Figure 9. Hence, the green line shows the mean accuracy given by the FGPA implemented as in Section 3.3 and the shaded green area delimits 1σ deviations from this mean. There is no dependence on λ_G in this case, because the flux within the FGPA is computed from the baryon density field given by the hydrodynamic simulation and not from a Gaussian-smoothed DM density field. Figure 10 shows the results for the 3DPS, with the same format as Figure 9. Each panel of Figure 10 refers to a different μ-bin.

As expected, Figures 9 and 10 show that GS+FGPA is strongly dependent on λ_G. The optimal value appears to be around $114\,\mathrm{ckpc}$ for the 1DPS and between 57 and $114\,\mathrm{ckpc}$ for the PDF. Around these values, m ≈ 7% for the 1DPS and m ≈ 4% for the PDF, as can be seen from the blue points in Figure 9.⁷

The trend of the accuracy of GS+FGPA in reproducing the 3DPS is similar to the one of the 1DPS (blue squares in Figure 10). The mean accuracy achieved in the different μ bins at the optimal scale for the 3DPS (λ_G = 57 ckpc) is around 4%. Remarkably, the accuracy of GS+FGPA for all statistics approaches the limit set by the FGPA, as long as the "correct" smoothing length is chosen. Since the optimal scales for the statistics considered vary up to a factor of two, one should decide in advance whether to prioritize 1DPS, 3DPS, or PDF. For λ_G ≳ 171 ckpc, the accuracy of all statistics becomes worse than ∼20%. Moreover, the error bars are very large for smoothing scales ≳200 ckpc. As such, it can be much worse than the mean in certain ranges of k-modes and flux. We also note that the performance of GS+FGPA degenerates as one moves farther from the line of sight, as previously discussed in the context of Figure 8.

Even for initial smoothing lengths ≳200 ckpc, 3D-IMS yields m < 20% for 1DPS and PDF, as shown by the cyan triangles in Figure 9, performing significantly better than the Gaussian method for these large smoothing lengths. At smaller smoothing lengths, 3D-IMS is basically as accurate as GS+FGPA. The accuracy in the 1DPS is better than in the PDF. This is not so surprising since, as we already pointed out, we ended our iterations matching the PDF of the flux in real space, and not redshift space, of the hydrodynamic simulation. Figure 10 shows that 3D-IMS does a remarkable job of reproducing the 3DPS, with an accuracy comparable to the FGPA at small λ_G, and still around 7% even for initial smoothing lengths as large as 500 ckpc. Moreover, the accuracy of 3D-IMS is only weakly dependent on λ_G, and it performs much better than GS+FGPA for large smoothing lengths.

By construction, 1D-IMS matches 1DPS and PDF, resulting in m ≈ 0.03% independent of λ_G (red circles in Figure 9). Figure 10 shows that 1D-IMS preserves 3D correlations, yielding m = 3.3% for the 3DPS in the best case (57 ckpc in the bin 0.25 < μ < 0.5) and m = 27% in the worst one (570 ckpc in the bin 0.25 < μ < 0.5). In all bins, 1D-IMS is as accurate as Gaussian smoothing at small smoothing lengths and it performs better than this method for λ_G ≳ 142 ckpc.

The accuracy of 1D-IMS improves as the μ-bin considered approaches the line of sight. It performs worse than 3D-IMS in the bin closest to it (0.75 < μ < 1.0). This is counter-intuitive, but we recall that the most parallel bin takes into account correlations in a 3D region of space and is thus conceptually distinct from the 1DPS, which 1D-IMS matches by construction. When recovering the 3DPS in the bin closest to the line of sight (0.75 < μ < 1.0), it is still more important to correctly reproduce the 3D correlations rather than the correlations along the line of sight. This is why 3D-IMS looks better than 1D-IMS close to the line of sight. Similar to 3D-IMS, the accuracy of 1D-IMS is only weakly dependent on the smoothing length, and much better than GS+FGPA for large smoothing lengths.

Among the methods considered in this work, 1D-IMS seems to perform the best. Indeed, it perfectly matches the 1DPS and PDF (by construction) and reproduces the 3DPS with good accuracy. If one is primarily interested in the 3DPS, 3D-IMS may be more suitable, since it yields the best accuracy in this statistic, though the differences with 1D-IMS are small. The drawback of 3D-IMS is the relative inaccuracy in the 1DPS and PDF compared to 1D-IMS, which matches these statistics by construction. The Gaussian smoothing can recover all statistics as well as 3D-IMS, provided the appropriate λ_G is adopted. In particular, the errors in estimating the 3DPS in the bin farthest from the line of sight (0.0 < μ < 0.25, top-left panel in Figure 10) are larger than ∼20% for λ_G ≳ 171 ckpc. For comparison, 3D-IMS and 1D-IMS achieve ≲10% accuracy for λ_G ≲ 228 ckpc in the aforementioned μ-bin. This means that our methods are able to recover information that otherwise gets lost when performing a Gaussian smoothing. They are accurate and computationally cheap ways to reproduce the statistics of the Lyα forest, which have promise for future modeling and data analysis.

We have also applied the same analysis described so far to two snapshots at redshifts z = 2 and z = 4, respectively. The accuracy of all methods are comparable with the results obtained at z = 3, meaning that the techniques tested are robust in the range of 2 < z < 4. We have also verified that, with 256³ resolution elements, the accuracy of all methods differs by ∼1% from the values obtained with our reference simulations, in most of the range of λ_G. This means that the accuracy of the methods has converged in our study.

When applying our methods, the choice of the initial smoothing length for the DM density is set by the inter-particle separation of the simulation adopted. If this is smaller than the optimal smoothing length, then one should smooth the DM density at the optimal λ_G. Otherwise, the best one can do is adopt a smoothing length of the order of the inter-particle separation. In any case, the smoothing scale for the DM line-of-sight velocities can be larger than the value adopted for the DM density. Indeed, the velocity field itself is smooth in voids (van de Weygaert & van Kampen 1993; Aragon-Calvo & Szalay 2013) and these are the most relevant regions for our study, as the exponentiation in Equation (7) suppresses large overdensities. In our analysis, we kept the smoothing length for the DM velocity field fixed to 228 ckpc, which is the value that yields the best overall accuracy in reproducing the flux statistics considered (see the Appendix). In this way, we focused on the impact of the initial smoothing of the DM density field and on the accuracy of the methods. Due to our choice of optimizing the smoothing length of the DM velocity field, the errors quoted for the different methods are minimized. However, even if we did not use the optimal λ_G for the velocity, the trend of the accuracy versus the smoothing length of the DM density field would be unaffected, as well as the rank ordering of the accuracy of the various techniques investigated (see the Appendix for a detailed discussion).

6. LARGE-VOLUME COLLISIONLESS SIMULATION

We want to check if our methods still perform well when applied to an actual N-body run, with a larger box than our calibrating hydrodynamic simulation. In fact, in the previous sections, we have validated our methods extracting all relevant fields from the same hydrodynamic simulation. However, the purpose of approximate methods is avoiding expensive hydrodynamic simulations. In practice, one would assume a certain model for the flux statistics and apply our techniques to a large-box low-resolution DM-only run. In this way, one would be able to probe large scales and at the same time accurately describe the small-scale physics thanks to our iterative procedure.

We consider the snapshot at redshift z = 3 of a Gadget DM-only run with a box size of 114 cMpc and 512³ particles. We CIC-bin the particle positions and velocities on a grid with 512³ elements, to get the density and velocity fields. To mock baryonic pressure, we smooth both fields with a length scale of λ_G = 228 ckpc, very close to the cell size (223 ckpc). This is the smallest smoothing length one can choose to have a non-negligible effect on the density and velocity fields.⁸ We then apply GS+FGPA in Section 3.1.

To apply 3D-IMS, we need an input model for the 3D power spectrum and PDF of the flux in real space. These statistics are computed from the flux in real space given by our hydrodynamic simulation. However, its box size is smaller than the one of the N-body simulation. This does not affect the computation of the PDF, but it poses some problems with the 3D power spectrum because the hydrodynamic simulation lacks the large modes that are present in the DM-only simulation. To generalize the 3D-IMS method, we construct the transfer function defined in Equation (10) as follows. First of all, we fit the 3D power spectrum of the flux in real space given by the hydrodynamic simulation with the formula provided by Kulkarni et al. (2015). Then, at every iteration of 3D-IMS, we define the transfer function, applying Equation (10) for k-modes larger than the fundamental mode ${k}_{{\rm{f}}}^{{\rm{HYDRO}}}$ of the hydrodynamic simulation. For the modes smaller than ${k}_{{\rm{f}}}^{{\rm{HYDRO}}}$ we set the transfer function to a constant, equal to the value assumed at ${k}_{{\rm{f}}}^{{\rm{HYDRO}}}$ . In this way, we have a continuous transfer function, which simply rescales by a constant the large-scale modes probed only by the DM-only simulation. This guarantees that any peculiar large-scale feature in the DM-only simulation (e.g., BAO signal) will not be affected.

Modeling the 1DPS of the flux in redshift space presents similar issues. Once again, one should come up with a method to estimate the large-scale Fourier modes without actually running a calibrating large-box hydrodynamic simulation. It would then be desirable to adopt an approach analogous to the one described for the 3D power spectrum of the real flux field. Unfortunately, there is no simple fitting function available for the 1DPS of the flux in redshift space, which would grant the high level of accuracy we are aiming for. Therefore, we are not applying 1D-IMS to the DM-only simulation. Nevertheless, following Kulkarni et al. (2015), future work may provide a fitting function for the flux 1DPS as well.

To assess the accuracy of the various techniques in this test, we shall compare the results of each method with the flux statistics obtained from a Nyx hydrodynamic run with a box size of 114 cMpc and 4096³ resolution elements, which we assume to be the "truth." This simulation, to which we shall refer as "L80N4096," covers the largest modes present in the DM-only simulation and, at the same time, has the same resolution limit (∼28 ckpc) as the small calibrating hydrodynamic simulation, thus resolving the Jeans scale.

In Figure 11, we show a sample of five skewers extracted from the hydrodynamic simulation (solid green line) and obtained through GS+FGPA and 3D-IMS (solid blue and dashed orange lines, respectively). By visual inspection, all skewers look consistent with one another.

In the upper-left panel of Figure 12, we show the 1DPS of the flux given by the hydrodynamic simulation (solid green line) and obtained applying GS+FGPA and 3D-IMS to the DM-only run (solid blue and cyan lines, respectively). In the lower-left panel, we plot the relative difference of the flux 1DPS obtained in each case, with respect to the results of L80N4096. In the right panels, we show analogous plots for the flux PDF, following the same color coding. Applying the same analysis outlined in previous sections, we find out that GS+FGPA (3D-IMS) recovers the flux 1DPS with m = 41% (10%) and s = 18% (4%). Thus, GS+FGPA is not accurate at all in this context. This fact should be born in mind when dealing with a low-resolution DM-only simulation with ∼100 cMpc boxes. On the contrary, the accuracy is dramatically improved by 3D-IMS. For the PDF, we have m = 18% (13%) and s = 10% (6%) for GS+FGPA (3D-IMS). Therefore, the statistics are better reproduced by 3D-IMS. We note that the precision achieved for the flux 1DPS and PDF is of the same order as what we obtained when we extracted the DM density field from the same hydrodynamic simulation used for the calibration.

In Figure 13, we plot the 3DPS of the flux given by the hydrodynamic simulation and obtained with GS+FGPA and 3D-IMS, applied to the DM-only simulation. The color coding is the same as in Figure 12. The different panels show the 3DPS in four evenly spaced μ-bins. We recall that the bin 0.0 < μ < 0.25 corresponds to modes farther from the line of sight, whereas the bin 0.75 < μ < 1.0 is the closest to it. Below the plots obtained for each bin, we show the relative difference of the results of each method with respect to the ones given by L80N4096. Once again, we applied the same analysis technique adopted throughout this work, obtaining that m ≈ 10% for 3D-IMS in all μ-bins. On the contrary, the accuracy of GS+FGPA is strongly dependent on the μ-bin considered. We have m = 10% in the μ-bin farthest from the line of sight, whereas m = 58% in the bin closest to it. These results are consistent with the findings presented in the previous sections.

The accuracy of 3D-IMS in reproducing the 1DPS, 3DPS, and PDF of the flux in redshift space is higher than in the case of GS+FGPA. The accuracy is comparable to the results obtained when applying 3D-IMS to the DM field extracted from our reference simulation. To probe the limitations of our technique, we applied the analysis explained in the present section also to the 256³ Gadget run, adopting the same calibrating simulation. We verified that the ratio of the accuracy of the two methods does not change significantly. In conclusion, our method is solid when applied to a large-box low-resolution DM-only simulation. This achievement makes our method attractive for studies requiring both large volumes and high-resolution simulations, for which running hydrodynamic simulations is not a viable option. One example is modeling the signature of the BAO on the Lyα forest flux power spectrum.

The results presented in this section clearly show that GS+FGPA can yield a very poor accuracy with respect to what would be obtained through a hydrodynamic simulation. Any result claimed after applying this technique should then be considered carefully. Future works making use of GS+FGPA should refer to Figures 9 and 10 to assess the error intrinsic in the method adopted.

1D-IMS has not been validated for a large-box DM-only run because of the lack of a recipe to model the flux 1DPS (e.g., an analytic fitting formula), extending it to large scales. Though, we are confident that in future work such fitting procedure could be provided. If such a technique becomes available, we do not expect that 1D-IMS will fail the test presented in this section. Indeed, 1D-IMS and 3D-IMS are both grounded on the IMS technique. We proved that 3D-IMS is accurate when applied to a large-box DM-only run, so this is encouraging for 1D-IMS as well.

7. COMPARISON TO PREVIOUS WORKS

With our analysis technique, we defined a criterion to assess the accuracy in reproducing 1DPS, PDF, and 3DPS of flux skewers, which we wish will be used also by other authors in the future. This would make the comparison with upcoming works more direct and straightforward. It is of course interesting to compare our results with previous work. We do that with the best of our possibilities, since the statistics discussed in the relevant literature are not always the same as the ones considered by us. Furthermore, it is the first time that the performances of approximate methods in reproducing the 3DPS of flux are quantified.

Gnedin & Hui (1998) proposed a hydro-particle mesh (HPM), a method to describe baryonic pressure as a modification of the gravitational potential in collisionless simulations. They compare the results of their own technique with two reference hydrodynamic simulations. After computing 300 flux skewers at z = 3, they found out that the mean error on the fractional flux decrement is smaller than 10% in the whole dynamic range. They do not consider other statistics of the flux, but they show that the accuracy in reproducing the column density distribution is around 13%. They claim that HPM would be suitable when an accuracy of 10%–15% is needed in the modeling. Both our methods and GS+FGPA (with appropriate smoothing length) result in higher accuracy.

Meiksin & White (2001) also used the HPM technique to compute the flux field. In addition, they considered an N-body particle mesh code, from which they computed the flux using GS+FGPA. They show that the two methods yield the same cumulative distribution of the flux within ∼10% accuracy, for four different cosmological models. The cumulative distributions of column density and Doppler parameter differ up to ∼10% and ∼20%, respectively. Although we consider the PDF and not the cumulative PDF of the flux, their findings agree with our results for the PDF given by GS+FGPA.

Viel et al. (2002) tested GS+FGPA against a smoothed particle hydrodynamic simulation. Furthermore, they developed a hydro-calibrated approximate method to predict the Lyα forest, based on an adaptive filtering scale for the DM density. Although the accuracy of these techniques in recovering the logarithm of the flux PDF given by the hydrodynamic simulation is not quantified, it can be inferred from their plots that GS+FGPA recovers such statistics on average better than 10%, even though the agreement looks worse in certain regions of flux (e.g., around 0.2 or 0.8). The PDF is reproduced much better by the hydro-calibrated method. Its accuracy can be estimated through eye-balling to be at percent level. It would then mean that 3D-IMS is comparable to the method provided by Viel et al. (2002), as far as the flux PDF is concerned. 1D-IMS still performs much better, matching this statistics by construction.

Font-Ribera et al. (2012a) introduced a method to generate mock data sets for the measurement of the Lyα forest flux correlations at large scales. As a first step, a Gaussian random field is generated along a set of line of sights. Flux skewers are then generated transforming such a field via an exponential function dependent on two parameters, which are tuned to reproduce a certain flux PDF. The flux power spectrum can be modeled to the desired shape by choosing the power spectrum of the initial Gaussian random field accordingly. Bautista et al. (2015) compared the mock data obtained through this technique with real high-redshift quasar spectra from BOSS. It can be deduced from the plots presented that the agreement between the real and mock 1D correlation functions is almost perfect for small wavelength separations, but it can be as worse as ∼30% for larger separations. The mock and measured 3D correlation functions are compatible within ±1σ. The advantage of the described method is that one does not need to model a large-box density field nor a velocity field, but only to generate the required sample of mock skewers. On the other hand, IMS has proven to be accurate at reproducing the flux PDF, 1DPS, and 3DPS even on scales much larger than the ones probed by the calibrating hydrodynamic simulation.

The LyMAS method (Peirani et al. 2014) also matches the flux PDF by construction. In its simplest version, this method consists of two hydro-calibrated transformations of the matter density field. Qualitatively, it can be seen that the method reproduces well the 1DPS given by the reference hydrodynamic simulation, except for $k\gtrsim 1.5\,{\mathrm{cMpc}}^{-1}$ . However, LyMAS can be extended with two further transformations of the flux field (LyMAS full scheme). In this way, the accuracy of the 1DPS is dramatically improved, though this was not quantified in Peirani et al. (2014). Flux skewers appear reasonable only in the full scheme, while in the simplest incarnation they are quite noisy.

Comparing LyMAS to our methods, it certainly does better than 3D-IMS at reproducing the PDF. In this respect, it is as good as 1D-IMS, since both match the flux PDF by construction. It also looks like LyMAS recovers the 1DPS given by the hydrodynamic simulation to a very high accuracy. Likewise, 1D-IMS reproduces the 1DPS almost perfectly and accurately reproduces the 3DPS as well. Furthermore, both 3D-IMS and 1D-IMS provide good-looking skewers applying simple transformations. The LyMAS methodology may be improved by also using the velocity field of the N-body run, which is currently being neglected.

An important feature of 3D-IMS is that it matches the flux provided by the hydrodynamic simulation in real space. This sets the correct filtering scale, allowing us to explore many values of T₀ and γ when computing the redshift-space flux. On the contrary, LyMAS connects the DM density directly with the redshift-space flux, so each choice of (T₀, γ, λ_G) requires an additional hydrodynamic simulation. Therefore, in this regard, 3D-IMS appears to be more flexible than LyMAS.

Recently, Lochhaas et al. (2015) used LyMAS to predict the cross-correlation between DM halos and Lyα forest flux, and compared it to quasars-damped Lyα systems cross-correlation measurements from BOSS (Font-Ribera et al. 2012b, 2013). From the plots presented, one can tell that the DM halos-Lyα forest cross-correlation given by LyMAS reproduces very well the results of their calibrating hydrodynamic simulation (Horizon-AGN; Dubois et al. 2014). However, other statistics relevant for our work, like the 1DPS, are not computed.

8. CONCLUSIONS

In this work, we investigated approximate methods to obtain statistical properties of Lyα forest from N-body simulations. We focus our attention on the PDF, 1DPS and 3DPS of the flux field, comparing results of approximate methods with a reference hydrodynamic simulation.

We studied the limitations of the FGPA, which is the basis of many approximate methods. The primary sources of error are the differences between DM and baryon velocity fields and, to a smaller degree, the impact of scatter in the temperature–density relationship of the IGM. The accuracy of the FGPA in reproducing the 1DPS and PDF is around 2%, and around 5% for the 3DPS. We also assessed the accuracy of the widely used Gaussian smoothing technique, combined with the FGPA (GS+FGPA). This method consists of mocking the baryon density by smoothing the matter density with a Gaussian kernel. Such a field is then used to compute the flux within the FGPA. The accuracy at which the statistics of the flux given by the reference hydrodynamic simulation is reproduced varies a lot with the choice of the smoothing scale λ_G. We explored a wide range of smoothing lengths and found out that the best accuracy achieved for 1DPS and 3DPS is ∼7% and ∼5%, respectively (at λ_G = 57 ckpc), and ∼4% (λ_G = 114 ckpc) for the PDF. For smoothing scales ≳171 ckpc, the mean accuracy is worse than 20%. This dependence of GS+FGPA on the smoothing scale is rather unfortunate, as the "optimal" smoothing scale is guaranteed to differ for models with different thermal IGM history. As one does not know a priori this optimal value, in practice, it means that works using any particular smoothing scale will have errors varying in an uncontrolled manner.

To remedy these problems, we have developed two new methods, 3D-IMS and 1D-IMS, based on the idea of "Iteratively Matched Statistics" (IMS). Their starting point is also Gaussian smoothing the matter density on a certain scale, which corresponds to the mean inter-particle spacing of the simulation considered. In 3D-IMS, smoothing is followed by matching the 3D power spectrum and PDF of the flux in real space to the reference hydrodynamic simulation. With 1D-IMS, we additionally match the 1DPS and PDF of the flux in redshift space. In contrast to GS+FGPA, 3D-IMS is much less dependent on the initial smoothing length. It reproduces the 3D power spectrum of the flux in redshift space as accurate as GS+FGPA when smoothing scales are small, but performs significantly better for large smoothing scales, with an accuracy of ∼7% even for smoothing scales as large as ∼500 ckpc. This is a very important property because it provides significantly more accurate models of the Lyα forest statistics in large-volume simulations, where the mean inter-particle spacing has to be large due to computational constraints. The 1D-IMS method matches flux 1DPS and PDF by construction. It still performs equally well, or better than GS+FGPA in reproducing the 3DPS (∼5%). It is not necessary to use both methods; one can use 3D-IMS only, reproducing the flux 3D power spectrum accurately, at the expense of a lower accuracy for 1DPS and PDF.

These assessments stand for modeling the Lyα forest even in high-resolution N-body simulations, but are especially prominent when large-volume (thus coarse resolution) N-body simulations are used. We have showed that IMS approximate methods significantly outperform GS+FGPA in such cases. Indeed, through the iterative procedure, our method correctly recovers small-scale physics, which is otherwise not present in low-resolution simulations. In particular, 3D-IMS improves the accuracy in the 1DPS by a factor of four with respect to GS+FGPA. In addition, 3D-IMS appears more robust and easy to implement, constituting an improvement over previous techniques.

8.1. Perspectives

Our methods have applicability in any context where large-box simulations are needed. The high accuracy of 3D-IMS and 1D-IMS at large smoothing lengths demonstrates that the hydro-calibrated mappings are able to "paste" information about the small-scale physics of the IGM not present in a large-volume simulation, without compromising large-scale statistics. To be quantitative, at λ_G = 228 ckpc 1D-IMS matches perfectly the flux 1DPS and PDF of the reference hydrodynamic simulation and recovers the 3DPS within 7% accuracy. Since in any realistic situation λ_G has to be at least as large as the inter-particle separation, it means that one would achieve the aforementioned accuracy applying 1D-IMS to a collisionless simulation with a trillion particles in a ∼2.5 cGpc box. This size is large enough to comfortably study the signature of the BAO signal in the Lyα forest. As a reference, in the context of the BOSS survey, White et al. (2010) ran a suite of N-body simulations with a box size of 1.02 cGpc and 4000³ particles (inter-particle separation 260 ckpc), applying the Gaussian smoothing technique. Currently, the state of the art for N-body simulations is represented by the "Outer Rim" (box size 4.3 cGpc, 10240³ particles; Habib et al. 2012, 2013) and "Dark Sky" (box size 11.5 cGpc, 10,240³ particles; Skillman et al. 2014) simulations.

The BAO signal can be modulated by UV background fluctuations, which are coupled to fluctuations in the mean-free path of ionizing photons on large scales (Gontcho A Gontcho et al. 2014; Pontzen 2014). For a proper modeling, one needs a simulation with a box size much larger than the mean-free path (Davies & Furlanetto 2016), which is of the order of the BAO scale at z ∼ 2.5 (Worseck et al. 2014 and references therein). Therefore, one would have to run radiative transfer simulations with box sizes of the order of 1 cGpc—far beyond current computational capabilities. The high quasar density in the BOSS survey allows for the measurement of the 3DPS, which can be exploited to improve cosmological constrains and/or constrain IGM thermal properties (McQuinn et al. 2011; McQuinn & White 2011). Finally, our technique could help the modeling of the cross-correlation between the Lyα forest and the H i 21 cm signal (Guha Sarkar & Datta 2015), as well as between CMB lensing and Lyα forest (Vallinotto et al. 2009, 2011).

We are thankful to the anonymous referee for useful comments. We thank Martin White, Casey W. Stark, and the members of the ENIGMA group at the Max Planck Institute for Astronomy (MPIA) for helpful comments and discussions. We are grateful to Vetter's Alt Heidelberger for providing a supportive and enriching environment for many of those discussions. We thank the Esalen Institute for kind hospitality. Calculations presented in this paper used resources of the National Energy Research Scientific Computing Center (NERSC), which is supported by the Office of Science of the U.S. Department of Energy under contract No. DE-AC02-05CH11231. Hydrodynamical runs in this paper were done under the ASCR Leadership Computing Challenge (ALCC) allocation. Z.L. acknowledges support from the Scientific Discovery through the Advanced Computing (SciDAC) program funded by U.S. Department of Energy Office of Advanced Scientific Computing Research and the Office of High Energy Physics. This work made extensive use of the NASA Astrophysics Data System and of the astro-ph preprint archive at arXiv.org.

APPENDIX: OPTIMAL SMOOTHING LENGTH FOR VELOCITY

The baryon density field can be mocked through a Gaussian smoothing of the DM density field. Also, the velocity field of DM should be smoothed accordingly to reproduce the velocity field of baryons. In principle, there might be two different optimal values of λ_G for density and velocity, so one should vary λ_G for both quantities and explore all possible combinations within the dynamic range considered. However, this extensive study can be quite time consuming and is probably not the most efficient way to proceed. In all techniques tested, we shall vary λ_G for the density, keeping it fixed for the velocity. We determine the optimal fixed smoothing length for the velocity field as follows. We apply the FGPA (Equations (1) and (6)) using the baryon density fluctuations given by our reference hydrodynamic simulation, but smoothing the DM velocity field from the same simulation at different values of λ_G. We then choose the smoothing length best matching the flux 1DPS, PDF, and 3DPS given by the hydrodynamic simulation.

The outcome of our analysis can be seen in Figure 14. In the left panel, we plot the accuracy of reproducing the 1DPS of the hydrodynamic simulation, versus the smoothing length. The right panel displays the analogous plot for the PDF. We notice that the smoothing of the velocities has a strong impact on the 1DPS, the optimal value being 171 ckpc, for which the mean accuracy is ∼1%. The PDF is less sensitive to the smoothing length for λ_G ≳ 285 ckpc.

In Figure 15, we show the results for the 3DPS. We see that the impact of the smoothing length is more important for larger μ, whereas λ_G ≲ 228 ckpc yield a better accuracy with respect to λ_G ≳ 228 ckpc. Given the different trends of the 1DPS, 3DPS, and PDF accuracy, there is no unique optimal value of λ_G to maximize the accuracy in all statistics, so we have to make a compromise. We chose λ_G = 228 ckpc/h, for which the accuracy of both PS and PDF is ∼2% and the accuracy of the 3DPS is between 3% and 5%, depending on the μ-bin considered. We kept this value fixed in all our work.

Our choice of optimizing the smoothing length of the velocity allows us to focus our analysis on the impact of the smoothing length of the density field on the accuracy of our methods (see Section 5.2 for details). As a consequence, the errors quoted for the techniques considered are minimized. Indeed, when we show the error for λ_G < 228 ckpc in the density field, we are still smoothing the velocity at 228 ckpc. We recall that the smoothing length has to be at least as large as the inter-particle separation of the simulation adopted. If such separation is 228 ckpc, one can smooth the velocity field at this value and still adopt a larger λ_G for the DM density. Conversely, if the inter-particle separation is smaller than 228 ckpc, one can smooth the DM density choosing λ_G to be equal to such a separation. One can still smooth the velocity field at 228 ckpc, without losing much information. Indeed, the velocity field in voids (which are the most important regions in terms of the flux statistics due to the exponentiation of Equation (7)) is very smooth for very different cosmological models (van de Weygaert & van Kampen 1993; Aragon-Calvo & Szalay 2013). As such, there is no conspicuous small-scale structure in the velocity field (see also Figure 2), which is thus less affected by the smoothing than the DM density field. So, it is sensible to consider a fixed smoothing length for the velocity field but varying it for the DM density.

To get a sense of the accuracy obtained with a certain method adopting a different smoothing length for the velocity field, one can sum in quadrature the errors reported in Figures 9 and 10 with the errors shown in the corresponding plots in this section, i.e., Figures 14 and 15. Doing so, the mean accuracy of our methods would decrease, but the trend of the accuracy versus the smoothing length would be unaffected. In particular, except for the bin along the line of sight of the 3DPS, GS+FGPA would still look worse than our methods.

MODELING THE Lyα FOREST IN COLLISIONLESS SIMULATIONS

Article metrics

Permissions

Author e-mails

Author affiliations

Author notes

Dates

ABSTRACT

1. INTRODUCTION

2. SIMULATIONS

2.1. Lyα Skewers

3. LIMITATIONS OF APPROXIMATE METHODS

3.1. Gaussian Smoothing

3.2. Fluctuating Gunn–Peterson Approximation

3.3. Accuracy of FGPA

4. ITERATIVELY MATCHED STATISTICS

4.1. 3D Iteratively Matched Statistics

4.2. 1D Iteratively Matched Statistics

5. VALIDATION OF IMS

5.1. Skewers

5.2. Comparison of the Methods

5.3. Accuracy versus Smoothing Length

6. LARGE-VOLUME COLLISIONLESS SIMULATION

7. COMPARISON TO PREVIOUS WORKS

8. CONCLUSIONS

8.1. Perspectives

APPENDIX: OPTIMAL SMOOTHING LENGTH FOR VELOCITY

Footnotes

MODELING THE Lyα FOREST IN COLLISIONLESS SIMULATIONS

Article metrics

Permissions

Share this article

Author e-mails

Author affiliations

Author notes

Dates

ABSTRACT

1. INTRODUCTION

2. SIMULATIONS

2.1. Lyα Skewers

3. LIMITATIONS OF APPROXIMATE METHODS

3.1. Gaussian Smoothing

3.2. Fluctuating Gunn–Peterson Approximation

3.3. Accuracy of FGPA

4. ITERATIVELY MATCHED STATISTICS

4.1. 3D Iteratively Matched Statistics

4.2. 1D Iteratively Matched Statistics

5. VALIDATION OF IMS

5.1. Skewers

5.2. Comparison of the Methods

5.3. Accuracy versus Smoothing Length

6. LARGE-VOLUME COLLISIONLESS SIMULATION

7. COMPARISON TO PREVIOUS WORKS

8. CONCLUSIONS

8.1. Perspectives

APPENDIX: OPTIMAL SMOOTHING LENGTH FOR VELOCITY

Footnotes