Quijote-PNG: Simulations of primordial non-Gaussianity and the information content of the matter field power spectrum and bispectrum

Primordial non-Gaussianity (PNG) is one of the most powerful probes of the early Universe and measurements of the large scale structure of the Universe have the potential to transform our understanding of this area. However relating measurements of the late time Universe to the primordial perturbations is challenging due to the non-linear processes that govern the evolution of the Universe. To help address this issue we release a large suite of N-body simulations containing four types of PNG: \textsc{quijote-png}. These simulations were designed to augment the \textsc{quijote} suite of simulations that explored the impact of various cosmological parameters on large scale structure observables. Using these simulations we investigate how much information on PNG can be extracted by extending power spectrum and bispectrum measurements beyond the perturbative regime at $z=0.0$. This is the first joint analysis of the PNG and cosmological information content accessible with power spectrum and bispectrum measurements of the non-linear scales. We find that the constraining power improves significantly up to $k_\mathrm{max}\approx 0.3 h/{\rm Mpc}$, with diminishing returns beyond as the statistical probes signal-to-noise ratios saturate. This saturation emphasizes the importance of accurately modelling all the contributions to the covariance matrix. Further we find that combining the two probes is a powerful method of breaking the degeneracies with the $\Lambda$CDM parameters.


INTRODUCTION
Measurements of the cosmic microwave background (CMB) (Fixsen et al. 1997;Bennett et al. 2013;Planck Collaboration I 2020) and of the large scale structure (LSS) (Alam et al. 2017;d'Amico et al. 2020;Ivanov et al. 2020;Kobayashi et al. 2021) have revolutionized our understanding of the primordial Universe, demonstrating that it was nearly homogeneous and isotropic with small, almost scale-invariant perturbations. However, wide classes of theoretical models, ranging from inflationary to ekpyrotic models (Lehners 2010;Martin 2016;Meerburg et al. 2019), can explain these observations. There are two proposed observational channels that have great power at discriminating between these theories: measurements of primordial gravitational waves and measurements of primordial non-Gaussianity (PNG) (see e.g Achúcarro et al. 2022, for a recent overview of prospects for learning about inflation).
Primordial non-Gaussianity is interesting due to its strong sensitivity to the early universe physics: PNG can reveal information about the field content of the Universe, the strength of interactions and particle content (Chen et al. 2007). Work by Green and Porto (2020) suggests that PNG measurements offer a method to ver-ify the quantum nature of the primordial perturbations. The information encoded in PNG is complementary to other probes, such as primordial gravitational waves and the slope and running of the primordial power spectrum, that contain information about the duration and energy scale of inflation (see e.g. Lyth and Riotto 1999, for a review).
Given the strong theoretical motivation, there have been many searches for primordial non-Gaussianity (Ferreira et al. 1998;Komatsu et al. 2002Komatsu et al. , 2003Creminelli et al. 2006;Slosar et al. 2008;Leistedt et al. 2014), with the leading constraints from measurements of the CMB by the Planck satellite (Planck Collaboration IX 2020). The CMB is an ideal probe for studying PNG as CMB anisotropies are linearly related to the primordial perturbations and so CMB measurements can be straightforwardly related to the primordial universe. Whilst upcoming CMB experiments will improve upon existing constraints (Ade et al. 2019;Abazajian et al. 2016), the challenges from foregrounds and the limitations imposed by silk-damping and the 2D nature of the CMB mean that improvements needed to reach the most interesting theoretical levels will be highly challenging (Hill 2018;Jung et al. 2018;Kalaja et al. 2021;Coulton et al. 2022a).
Constraints on PNG from measurements of the LSS promise, in principle, to far surpass existing constraints (Meerburg et al. 2017;Slosar et al. 2019). However, to date, the important measurements of PNG from LSS (Slosar et al. 2008;Leistedt et al. 2014;Giannantonio et al. 2014;Ross et al. 2013;Ho et al. 2015;Cabass et al. 2022a,b;D'Amico et al. 2022) have not reached the level of the CMB. This is due to complexities of large scale foregrounds (Pullen and Hirata 2013;, understanding the optimal statistics to use, and the challenge of modelling the LSS. Unlike the CMB, measurements of the late-time LSS are non-linearly related to the primordial perturbations which complicates inferences about the primordial universe. For the largest scales, powerful perturbation theory models have been developed and recently applied to data (Baldauf et al. 2011;Cabass et al. 2022a,b;D'Amico et al. 2022). However these methods will not be able to probe arbitrarily small scales, and thus it is unclear whether alternative approaches may be better suited to the problem. This is particularly interesting as recent work has suggested that information of PNG could be separated out from other small scale effects by exploiting the locality of gravitational and baryonic processes (Baumann and Green 2021).
To aid our understanding of the relation between the primordial non-Gaussianity and late time observables, we have run an extensive suite of simulations with PNG. This work builds on the results of numerous previous investigations : Dalal et al. (2008); Desjacques et al. (2009) generated simulations containing local PNG and used these to discover and understand the impact of PNG on scale dependent bias. This work was followed by Wagner et al. (2010); Scoccimarro et al. (2012) who developed and implemented methods to generate simulations with PNG beyond the local. In this work we use the methods developed in Scoccimarro et al. (2012) to generate a large ensemble of simulations with four different PNG shapes.
These simulations were designed to fit within the quijote suite of simulations ); a large suite of simulations designed to quantify the information content on generic summary statistics and to provide enough training data for machine learning algorithms. By making this choice our PNG simulations can be used within a consistent framework to study how uncertainties in the standard cosmological parameters impacts inferences of PNG. For example, whilst previous work has shown that the halo mass function (Lucchin and Matarrese 1988;LoVerde and Smith 2011), the matter probability density function (Valageas 2002;Uhlemann et al. 2018;Friedrich et al. 2020), topological measures (Biagetti et al. 2020 and field level analyses (Andrews et al. 2022) are potentially powerful probes of PNG, simulations are required to validate theoretical predictions or model these novel probes. An aim of this work is to help facilitate such analyses.
As both a validation of our simulations and a first application we explore the properties of the matter power spectrum and bispectrum in these simulations. On the largest scales the impact of PNG on the matter power spectrum and bispectrum is well understood (e.g. Desjacques et al. 2009;Baldauf et al. 2011;Karagiannis et al. 2018) and so these measurements can be used to validate our simulations. However our simulations also allow us to push beyond the perturbative regime (k ≈ 0.1 h/Mpc at z = 0) and explore the information content available on smaller scales. Recent work by Hahn et al. (2020) has shown that for other cosmological parameters, small scale bispectrum measurements of the halo and galaxy field potentially contain large amounts of information. Through the use of a simulation-based covariance matrix we are able to assess the importance of all the contributions, including the super-sample covariance, for parameter constraints.
This paper and our companion paper, Jung et al. (2022), are the first two papers in a series dedicated to investigating how we can learn more about primordial non-Gaussianity from upcoming measurements of the LSS.
In our companion paper, Jung et al. (2022), we explore the properties of matter power spectrum and bispectrum from a different and complementary perspective: we use the modal bispectrum (Fergusson et al. 2012;Schmittfull et al. 2013) to measure the matter bispectrum and then develop and validate a set of optimal compressed statistics, which enable the information in the matter power spectrum and bispectrum to be captured within a data vector of length the number of parameters. Additionally the Jung et al. (2022) analysis focuses on measurements at z = 1.0, the redshift range relevant for upcoming surveys such as Euclid (Amendola et al. 2018), whereas here we examine z = 0.0. Through this complementarity we can better understand the importance of small scale information across redshifts. We stress that both of these works consider an idealized setup -the 3D matter field is not directly observable and we neglect all observational effects and their associated challenges, which will likely limit future analyses -and thus it is difficult to directly relate constraints reported here to future surveys. Further the unique scale dependent bias feature that PNG introduces to biased tracers, means that the measurements of biased tracers could contain more information than the matter field, on the same scales. However the aim of these works is not to generate specific predictions but rather assess the value of pushing to smaller scales and provide simulations that can be used as a sandbox to develop intuition into new estimators in an simplified environment. This represents a first step, and motivation, towards more complex and realistic analyses.
These two papers represent the first comprehensive investigation into what can be learnt about PNG from small scale measurements of the matter bispectrum and power spectrum. These papers are accompanied by the release the simulation data and our power spectrum and bispectrum measurements. In the next papers we will consider the information content in the halos (Coulton et al. 2022b) and the information accessible from analyses of the field.
Our paper is structured as follows: in Section 2 we briefly overview the shapes of non-Gaussianity considered in this work and in Section 3 we describe our simulations of PNG. We describe our power spectrum and bispectrum estimators in Section 4 and then present our measurements in Section 5. In Section 6 we describe the details of how we compute our Fisher forecasts and then explore the constraining power of the bispectrum and power spectrum measurements in Section 7. We then summarize our key results in Section 8. In the three ap-pendices we describe the details of the generation of non-Gaussian initial conditions (Appendix A), the details of the computation of the power spectrum and bispectrum covariance matrices (Appendix B), joints constraints of PNG and ΛCDM extensions and finally the impact of a prior on the power spectrum convergence (Appendix D).

PRIMORDIAL NON-GAUSSIANITY
Whilst primordial non-Gaussianity refers to any deviation from Gaussian initial conditions, in this work we restrict our analysis to the primordial bispectra, (1) where Φ(k) is the primordial potential. We further restrict our analysis to four different primordial bispectra shapes, detailed below. These shapes are studied as they accurately approximate theoretically well motivated shapes, provide insight into the primordial physics and can be generated by a range of generic methods (see e.g. Chen 2010; Achúcarro et al. 2022, for reviews of PNG). The first shape we consider is the local shape and has the primordial bispectrum where P Φ (k 1 ) is the primordial power spectrum and f local NL is the amplitude of this type of non-Gaussianity. Local non-Gaussianity is of observational interest as it is a powerful probe of the primordial field content. Maldacena (2003);  showed that in single-field slow roll inflation, the amplitude of the local bispectrum is < O(η, ϵ), where ϵ and η are the slow roll parameters, thus a measurement of larger local non-Gaussianity would present difficulties for slowroll, single-field inflation. On the other hand, observable levels of local non-Gaussianity can be generated in multi-field inflationary models, such as the curvaton or modulated reheating models (Lyth and Wands 2002;Dvali et al. 2004), and alternatives to inflation models (Lehners 2010). Note that whilst there is on-going discussion (Pajer et al. 2013;Matarrese et al. 2021) as to whether the squeezed-limit bispectrum exactly vanishes for single-field models, this does not impact the power of a detection of local >> O(ϵ, η) to rule out these models.
The second shape we consider is the equilateral non-Gaussianity, which has the following primordial bispec- (3) Equilateral non-Gaussianity well approximates non-Gaussianities generated in Dirac-Born-Infeld inflation (Alishahiha et al. 2004), ghost inflation (Arkani-Hamed et al. 2004) and generically in models with local, derivative interactions (Cheung et al. 2008a,b). The final shape that we consider is the orthogonal shape. The orthogonal bispectrum, along with the equilateral bispectrum, is used to span the types of non-Gaussianity within the effective field theory of inflation (EFTi) (Cheung et al. 2008a,b). Measurements of the equilateral and orthogonal bispectra can be used to constrain the parameters of EFTi, including constraining the sound speed of primordial perturbations (Senatore et al. 2010;Planck Collaboration XXIV 2014). The full shape of the orthogonal bispectrum in EFTi is not separable (i.e expressible as F (k 1 )G(k 2 )H(k 3 ) + perms.) and so is challenging to simulate and analyze. Two approximations have been developed in Senatore et al. (2010); the first approximation was performed to approximate the orthogonal bispectrum in the CMB and hence we refer to this as orthogonal-CMB. It has the bispectrum Whilst this functional form is a good approximation for the CMB, which is a 2D projection of the primordial bispectrum, it is less accurate for measurements of the LSS (Senatore et al. 2010;Creminelli et al. 2011). In particular, its squeezed-limit divergence, which is suppressed for the 2D CMB, differs from the EFTi orthogonal shape. This different squeezed limit has important consequences for scale dependent bias measurements and for the detectibility. Despite these issues, we include it in our suite of simulations as the resulting scale-dependent bias (∼ 1/k) is different from the local case (∼ 1/k 2 ), and orthogonal-LSS (see below) and equilateral shapes (∼ 1), so the simulations help span possible PNG signatures (Schmidt and Kamionkowski 2010). Further folded bispectra, which can arise if the assumption of the Bunch-Davis vacuum is violated, is well approximated by a linear sum of this shape with the equilateral shape (Meerburg et al. 2010;Chen et al. 2007). The impact of these subtleties on the halo field, where scale dependent bias is important, is further explored in Coulton et al. (2022b). The second approximation, also developed by Senatore et al. (2010) and hereafter referred to as orthogonal-LSS, has a bispectrum given by where This shape is a better approximation to the EFTi orthogonal shape for the matter field and exhibits the correct squeezed limit scaling. Note -as is discussed in Appendix A -that the orthogonal-LSS shape we consider is slightly modified to account for the fact that our simulations are not scale invariant, i.e. n s ̸ = 1.

SIMULATIONS
In this work we extend the quijote simulations by running a set of simulations at the quijote fiducial cosmology with primordial non-Gaussianity: quijotepng. These simulations have been run with the same settings (i.e. PM grid size, force resolution...etc) as the original quijote simulations. Additionally, quijotepng uses simulations with matching random seeds to compute partial derivatives. Thus these simulations are ideal for investigating the information in PNG jointly with other cosmological parameters. In Tables 1 and 2 we summarize the properties of the new simulations. In this section we describe the details of the PNG initial conditions and the simulations. We refer the reader to  for full details on the quijote simulations. All the PNG simulations are run with the ΛCDM cosmological parameters shown in Table 1 . They follow the evolution of 512 3 dark matter particles from z = 127 down to z = 0. The following simulation products are publicly available: particle data at z=0.0, 0.5, 1.0, 2.0, and 3.0 and the power spectrum and bispectrum measurements. See https://quijote-simulations.readthedocs.io/ en/latest/png.html for more details.

Non-Gaussian Initial Conditions
To run non-Gaussian simulations we use the method developed in Scoccimarro et al. (2012) to generate initial conditions that have primordial non-Gaussianity, and we implement several small changes to the public code released by Scoccimarro et al. (2012). Our updated code is available at https://github.com/dsjamieson/ 2LPTPNG. Here we briefly review this method and refer the reader to Scoccimarro et al. (2012) for more details.
Generating initial conditions with local non-Gaussianity is straightforward. First, the modes of a Gaussian primordial potential field, Φ(k), are generated in Fourier space from the input power spectrum. This field is then inverse Fourier transformed to realspace. The real-space field is squared, mean subtracted, scaled by the desired amplitude, f local NL , and then added back to the original potential. The resulting field is real Fourier transformed to obtain the modes of the primordial potential with local PnG.
In Fourier space this corresponds to a convolution so that the process can be summarized as where the second term removes the contribution from the mean, < Φ 2 >, which otherwise would contribute to background expansion, Generating other types of primordial bispectra corresponds to adding a kernel to the convolution Note that we impose K(k 1 , k 2 , 0) = 0 as this automatically accounts for the removal of the mean.
Crucially, for the orthogonal-CMB, orthogonal-LSS and equilateral shapes the kernels required can be written as a linear combination of separable terms, i.e. they can be written as sums over terms with the form K(k 1 , k 2 , k) = G(k 1 )H(k 2 )M (k). These can thus be generated in a similar manner to the local with two modifications: first the Gaussian potential modes are filtered before the inverse Fourier transform, and then again after the final Fourier transform operation. The specific kernels we used and their coefficients are given in Appendix A.
The initial conditions are generated as follows: we generate the primordial anisotropies from a primordial power law spectrum characterized by the amplitude, A s , and tilt, n s . These anisotropies are generated on a grid of size 1024 3 to minimize the impact of aliasing effects. We add non-Gaussianity via Eq. 8. The perturbations are evolved to redshift z = 0.0 using linear transfer function, T (k) from CAMB (Lewis et al. 2000). We then scale back its amplitude to z = 127 using the traditional growth factor, D(z): P (k, z = 127) = D 2 (z = 127)/D 2 (z = 0)P (k, z = 0) 1 . The gridded field at z = 127 is then used with 2LPT to create the initial displacements of particles for our simulation. We note that we made a small change to the public code to generate initial conditions with n s ̸ = 1, and with orthogonal-LSS non-Gaussianity.

N-body Evolution
After generating the initial conditions, we follow their gravitational evolution down to z = 0 using the TreePM code Gadget-3 (Springel 2005). As stated above, we use the same parameter settings as the original quijote simulations. Halos are identified through the Friends-of-Friends (FoF) algorithm (Davis et al. 1985) with a value of the linking length equal to b = 0.2.

Grid assignment
To measure our statistics we construct a mesh with N side = 512 voxels for each simulation at the considered redshift. We use the Cloud-in-Cell (Hockney and Eastwood 1981) assignment scheme to distribute the particle positions to the grid. We account for this when computing the power spectra and bispectra by deconvolving the Cloud-in-Cell window function (Jing 2005). Note that the large N side used in our analysis means the effects of the window function and aliasing are negligible on the scales of interest: k ≤ 0.5 h/Mpc. This was verified by comparing our measurements to a set of grids generating using a fourth order interpolation scheme with interlacing (see e.g. Sefusatti et al. 2016, for a detailed discussion of these effects). Name In this work we examine two statistics of the field: the matter power spectrum and matter bispectrum.

Matter power spectrum
The matter power spectrum is defined as We estimate the binned power spectrum,P (k i ) from simulations by computinĝ where we sum over all modes that lay within a k-space bin of width ∆k and N i is the normalization that effectively counts the number of modes within our bin. We use the public Pylians3 2 code with bins of width the fundamental mode, k F , from k F to k max = 0.5 h/Mpc.

Matter bispectrum
The matter bispectrum is defined as We estimate the bispectrum using the binned bispectrum estimator, as with the implementation details as in Foreman et al. (2020) 3 and uses the methods developed in (Scoccimarro 2000;Tomlinson et al. 2019) The binned bispectrum estimator computeŝ which sums over all bispectra configurations that have wavenumbers that fall within three bins and is normalized by N ijk , which counts the number of configurations in each bin. Note that as we sum over the discretized modes on the grid, the Dirac Delta function is really a Kronecker Delta for each of the three directions. The estimator is efficiently implemented by Kronecker delta where the summation is over the grid points with a volume factor, V , and then exchanging the order of the summations. The resulting expression can be rapidly evaluated by fast Fourier transforms. In this work we use the same bins as in Hahn et al. (2020): we use bins of width 3k F starting from half the fundamental to k max = 0.5 h/Mpc.

POWER SPECTRUM AND BISPECTRUM MEASUREMENTS
We start by exploring the initial conditions used in our simulations. To do so we generate 500 realizations of the Gaussian primordial potential field and 500 realizations for each of the primordial bispectra. In Fig. 1a we show the impact of PNG on the primordial potential power spectrum. As is discussed in Wagner et al. (2010); Scoccimarro et al. (2012), care needs to be taken that the kernels to generate the bispectrum do not generate unphysical corrections to the power spectrum. We see that there are no divergent corrections to the power spectrum introduced by our method. We do see that there are small corrections, 1 × 10 −5 f NL , to the power spectrum for the orthogonal-CMB non-Gaussianity. These corrections arise due to the loop correction to the power spectrum, some of which give k-dependent corrections and others are equivalent to an amplitude renormalization (see the Appendix of Wagner et al. 2010, for a detailed discussion). In principle the scale dependent contributions could be further minimized by different kernel choices, however there is limited theoretical motivation to chose one parameterization over another and hence we opted for the simplest implementation. In Fig. 2a we plot three slices showing how the bispectrum of the primordial initial conditions is affected by PNG. In addition we plot the theoretically computed bispectrum. Firstly we note that the simulated initial conditions are in excellent agreement with the theoretical bispectrum, validating that our initial fields do contain the intended bispectrum shape. Note the deviations seen in the squeezed limit of the equilateral shape are statistical fluctuations that reduce when more simulations are considered. They are most visible in the squeezed limit of the equilateral shape as there the signal is smallest. Second it can clearly be seen that there are strong differences between the orthogonal-LSS and orthogonal-CMB templates. The differences are particu-larly large in the squeezed limit, as is expected, and will be very important for accurate modelling of effects such as scale-dependent bias (Dalal et al. 2008;Desjacques et al. 2009).
It is interesting to compare the primordial measurements to those at late times; in Fig. 1b we show the matter power spectrum derivative with respect to PNG at z = 0.0. We see that the impact of PNG on the power spectrum for the local and equilateral shapes is similar in structure. Our results for local and equilateral are consistent with the results found in Wagner et al. (2010), which is a non-trivial check as Wagner et al. (2010) use a different approach to generating the equilateral shape initial conditions. We also show the 1-loop power spectrum prediction from EFT of LSS (D' Amico et al. 2022;Cabass et al. 2022a,b) -as expected we find excellent agreement on the largest scales and significant differences on small, non-perturbative scales, where the 1-loop term is inaccurate. Note that a subtlety of the theory model is that the counter term, required for the EFT of LSS to be a consistent theory, has an implicit dependence on f NL which needs to be accounted for. The level of agreement is consistent with past work, e.g. Wagner et al. (2010) who compared simulated power spectra with PNG at z = 0 to the time renormalization group prediction.
Finally in Fig. 2b we explore how the z = 0.0 bispectrum is affected by PNG. We can see that at z = 0.0 the tree level prediction for the bispectrum agrees on the largest scales, providing a simple validation of our simulations. However as we move towards smaller scales we find significant deviations from the tree-level prediction and the simulations -as is expected. Our results are in agreement with previous investigations of the matter bispectrum in Sefusatti et al. (2010), who measured the bispectrum in simulations with local non-Gaussianity up to k max ≈ 0.3 h/Mpc. Whilst our simulations are based off the same codes this check is an important validation of the robustness to choices of the simulation parameters, e.g. resolution and force softening. Our results for the local case are also in agreement with the results in Enríquez et al. (2022), who investigated the similarities between local PNG and relativistic effects on the matter bispectrum. Note that the tree level approximation is the leading method for most analyses and forecasts (Karagiannis et al. 2018;Slosar et al. 2019;Karagiannis et al. 2020;Cabass et al. 2022a,b) to date. 1-loop perturbative methods (e.g Sefusatti 2009) have very recently been applied to galaxy survey measurements (D' Amico et al. 2022) and have been used to extend range of scales that can be accurately modeled. These extract significantly more information about the primordial universe than the tree level calculations, but it is impossible to use such methods on the maximum scales considered here as they are non-perturbative. In this work we investigate whether we can utilize the visible small scale information.

FISHER METHODOLOGY
To estimate the information contained in the power spectrum and bispectrum we use the Fisher formalism (Fisher 1935;Tegmark et al. 1997). The Fisher informa- tion, F IJ , on parameters, θ, is defined as the variance of the score where L(X|θ) is the likelihood. The Fisher information is useful as the minimum variance of an unbiased estimator,θ, for θ is given by (Fréchet 1943;Aitken and Silverstone 1942;Darmois 1945) with no summation over the repeated indices. Thus by computing the inverse of the Fisher information we can infer the maximum information that we can learn about θ from measurements of an observable, X.
In this work we assume that our observables, the matter power spectrum and bispectrum, are well approximated by a normal distribution. Thus where O(k i ) denotes either the power spectrum or bispectrum,Ō is the observable mean and Σ is the covariance matrix. We also consider the information available from joint measurements of the power spectrum and bispectrum, in which case O(k i ) is then the vector of both the power spectrum and bispectrum measurements. Whilst the Gaussian likelihood assumption is not perfectly accurate (see e.g. Scoccimarro 2000; Sellentin and Heavens 2018), it is sufficiently accurate to get an estimate of the information available. For a Gaussian distribution, whose covariance is independent of the parameters of interest, the Fisher Information can be rewritten as In practice the covariance matrix of the power spectrum and bispectrum will likely have some cosmological dependence, however as was shown in Carron (2013) neglecting this dependence gives a better approximation for the true information of statistics like the power spectrum, whose true distribution is close to but not exactly Gaussian.
Thus to assess the information content in the bispectrum and power spectrum we require two ingredients, the derivative of our statistics of interest with respect to the parameters and the covariance of the statistics.
The derivatives are computed numerically by central differencing: we compute mean power spectra and bispectra signal at θ + δθ and θ − δθ and estimate the derivative as for the PNG derivatives we use our new simulations, which where run with δf NL = ±100 and for the other cosmological parameters considered (Ω m , Ω Λ , Ω b , σ 8 , h) we use the simulations summarized in Table 1 in . The covariance matrix is the second important part of our analysis. The contributions to the covariance matrix can be broken down into three terms the Gaussian contribution, C Gaussian ij , is straightforward to compute, whereas the connected, i.e. non-Gaussian, C connected , and super sample covariance terms C SSC are generally difficult to compute (see e.g Kayo et al. 2013;Chan and Blot 2017;Gualdi et al. 2018;Sugiyama et al. 2020). We compute the first two terms by measuring the covariance of the observables using 12500 simulations in the fiducial cosmology (given the highly converged covariance matrix, we did not need to use all 15000 simulation available in the quijote suite) . The super sample covariance term is computed as in Li et al. (2014a); Chan et al. (2018), the details are summarized in Appendix B. Our base analysis does not include the super sample covariance term, but in Section 7.3 we assess the relative importance of it compared to the other terms. When inverting purely numerically estimated covariance matrices, we include a correction, the Hartlap/Anderson factor, to unbias our estimate of the inverse covariance matrix, precision matrix (Hartlap et al. 2007).
We validate the accuracy of our results by verifying that the derivatives and covariance matrix are converged. In Fig. 3 and Fig. 4 we examine how our Fisher forecasts change as we vary the number of simulations used to estimate the two components of our forecast. We see that the forecast is highly stable against variations in the number of simulations used in the calculation. We also see that the bispectrum and the combined power spectrum and bispectrum -hereafter jointforecasts are also stable to changes in the number of simulations used to estimate the derivatives. However we see that constraints from the power spectrum are not converged. This means that constraints from these statistics are likely over optimistic, indicating that actual constraints derived from power spectrum measurements will be reduced compared to the results reported below.
Typically one expects the power spectrum to converge more rapidly than higher order statistics so the convergence results in Fig. 4 are at first surprising. In the case of primordial non-Gaussianity however, the power spectrum is only sensitive to PNG through 1-loop and higher order terms, whereas the bispectrum is sensitive at the tree level. The power spectrum is thus only weakly sensitive to variations in the PNG parameters. Further, as will be seen in the Section 7, the effect of PNG on the power spectrum is highly degenerate with other parameters. The strong degeneracies mean that the Fisher information needs to be estimated to a very high precision to be stable. Whilst the inferences can be regularized, e.g. by the use of a prior as is explored in Appendix D, in the main text we prefer to present these unconverged forecasts. Alternatively we could consider reparameterizations, for example those of the form σ 8 f α NL , which characterize the information probed by the power spectrum. However we find that the bispectrum both measures different combinations and allows inferences on the physical parameters, e.g. f NL , thus we retain the original parameterization. It is straightforward to see that the unconverged forecast is biased towards being overly constraining, see e.g. Coulton and Wandelt (2023), and it is interesting to compare how broad even these optimistic power spectrum constraints are when compared to the bispectrum ones. Note, if we exclude the PNG parameters from the Fisher forecast then the power spectrum forecasts are highly stable to variations in the number of derivative simulations used.

COSMOLOGICAL CONSTRAINTS
Next we explore the constraining power from measurements at scales beyond the perturbation regime. Our constraints are obtained for a forecast experiment at z = 0.0 and using modes from the fundamental mode, k F = 6.3 × 10 −3 h/Mpc, to k max = 0.5 h/Mpc.

Degeneracies
In Fig. 5 we plot the constraints obtainable with measurements of the matter power spectrum and bispectrum including modes up to k max = 0.5 h/Mpc. Focusing first on the constraints from only power spectrum measurements, we find that they have negligible information on primordial non-Gaussianity when jointly measuring the different templates and marginalizing over cosmological parameters. This is expected given the small impact of PNG on the power spectrum and the similarity of the induced changes to other cosmological parameters (e.g. n s and s 8 ). Quantitatively the impact of marginalization, as seen in Fig. 6, can widen constraints by up to a factor of ∼ 100! Note that this marginalization also significantly degrades the cosmological parameter constraints -especially σ 8 . The σ 8 − f NL degeneracy is very strong as the PNG contributions to the matter power spectrum are very featureless, unlike for example h which impacts the BAO, and can be largely captured by a rescaling of the amplitude.
Focusing now on the bispectrum, we see that it offers vastly improved constraints. However, the bispectrum constraints also exhibit strong degeneracies with cosmological parameters. This indicates the challenge of separating the gravitationally sourced non-Gaussianity from PNG. When considering the joint power spectrum and bispectrum constraints, we see significant improvements that are largely driven by the improved constraining power on the standard, ΛCDM cosmological parameters. The joint analysis partially breaks the strong degeneracies (e.g. with Ω m and σ 8 ) between the ΛCDM parameters, leading to better PNG constraints.
In Appendix C, we explore the degeneracies present when including two beyond ΛCDM parameters: w and m ν .

The Value of small scales
Given the large information content available in the bispectrum, it is interesting to assess how much has been gained from the smaller scales. To answer this, we plot the constraining power of our probes as a function of maximum included scale in Fig. 6. We perform this investigation for three cases 1) when constraining just one PNG shape, 2) when jointly constraining one PNG shape and the cosmological parameters and 3) when we jointly fit the three PNG shapes and the cosmological parameters.
Focusing initially on case 1), we see that all probes yield significant gains by pushing the maximum scale up to k max ≈ 0.3 h/Mpc. However, the rate of improvement slows significantly beyond this scale. This arises due to the increasingly strong non-Gaussian contributions to covariance matrix, which we explore further in Section 7.3. Importantly, the combined power spectrum and bispectrum analysis leads to significant improvements over analyses of either probe alone. This is explored more in Jung et al. (2022).
By considering cases 2) and 3), when we include the effects of marginalization, we can see the strength of the degeneracies. For the power spectrum the constraints degrade by up to two orders of magnitude. Whilst the bispectrum constraints are less affected we still find degradation's of ∼ 100%, reflecting the correlations seen earlier in Fig 5. In this scenario there is more benefit to pushing to smaller scales as the extra information helps resolve the parameter degeneracies. The joint constraint panel shows that combining the probes mitigates the impact of marginalization and again the constraints generally improve only modestly beyond k max = 0.3 h/Mpc. We also find minimal improvement for the equilateral shape, further underlying the degeneracy between the primordial shape and gravitational bispectra.
To contextualize the expected improvement with scale, we investigate the information available in the initial conditions. Given that the primordial universe is nearly Gaussian, we can compute the constraining power of an optimal estimator as where the sum counts the discrete Fourier modes in our simulation box. In Fig. 7 we compare our constraints to the information available in the primordial bispectrum as a function of scale. As expected, the primordial constraints are tighter on all scales. Further, we see that pushing to increasingly small scales produces a far faster improvement on the constraining power (see Kalaja et al. 2021, for a more detailed discussion). As a cross check we have compared the primordial information captured if we were to use our sub-optimal, binned estimator. We find that the constraints degrade by 20 − 30% . This could be reduced by optimizing the binning choice further (e.g. using a smaller bin width or having more bins for configurations where the bispectrum varies rapidly) or using an alternative estimator such as the modal estimator. In this analysis we also consider the information available to constrain the orthogonal-CMB PNG but note that in our joint fits we only include either the orthogonal-CMB or -LSS shapes, not both. We do not include both simultaneously as they are both approximations to the same EFTi bispectrum and thus are expected to be highly degenerate. We see similar behavior in the constraints for the orthogonal-CMB bispectrum.

The importance of covariance matrix modelling
The flattening of the information content arises as the bispectrum and power spectrum SNR flatten beyond k ≈ 0.3 h/Mpc as is seen in Fig. 8. These results are consistent with those reported in Chan and Blot (2017), who found that the bispectrum signal-tonoise ratio (SNR) is degraded by a factor of ∼ 10 at k max = 0.5 h/Mpc at z = 0.0 and that the SNR flattens above k max ≈ 0.  as the nonlinear evolution of structure formation leads to strong correlations between the small scale modes, reducing the available primordial information.
To further understand the flattening of our constraints we explore how the different contributions to the covariance matrix propagate to parameter constraints. In Fig. 9 we compared the bispectrum only constraints obtained with the full covariance matrix to the constraints obtained when only a subset of the contributions are included. We find that including the non-Gaussian contributions, both the diagonal and off-diagonal elements, has a significant impact on the constraining power and leads to a degradation of the constraints by a factor of ∼ 4. This supports our assertion that the non-Gaussian terms cause the flattening of the SNR. These results agree with those seen in Barreira (2020); Biagetti et al. (2021) and Gualdi and Verde (2020); Biagetti et al. (2021) find that the off-diagonal covariance matrix terms are highly important for constraints on local PNG and, when included, lead to significantly degraded constraints. Our work extends these results highlighting their importance for other types of PNG.
Note that at smaller k max , the size of the non-Gaussian diagonal and off diagonal terms is reduced, but not negligible. At k max = 0.1 h/Mpc there is still a degradation of some constraints by ∼ 2. It is interesting to note that the super sample covariance matrix terms have minimal impact on our parameter constraints despite being a non-trivial contribution to the covariance matrix as was found in Hamilton et al. (e.g. 2006); Rimes and Hamilton (e.g. 2006); Takahashi et al. (e.g. 2009);Chan et al. (e.g. 2018). When we examine the unmarginalized constraints we find, for the power spectrum, that these super-sample covariance terms are important. This is similar to the results found in (Li et al. 2014b). Thus for the power spectrum, the large increase in constraints due to marginalization swamps the impact of the supersample covariance terms, at least for the setup considered here. For the bispectrum the limited degradation to the constraints arises as the strongly affected configurations are small-scale equilateral shapes (Chan et al. 2018;Barreira 2019), which contribute minimally to the constraining power due to the large non-Gaussian covariance.

CONCLUSIONS
In this work we have presented and validated a suite of N-body simulations that contain primordial non-Gaussianity: quijote-png. These augment the extensive quijote suite, matching the random seeds, configurations, and data formats of the previously run quijote simulations. This allows for consistent and seamless joint analyses of available information from the primordial universe and cosmological parameters. The simulation data, include particle data at z = 0.0, , 0.5 , 1.0 , 2.0 and 3.0 and the power spectrum and bispectrum measurements at z = 0 are publicly available. Please refer to the documentation at https:// quijote-simulations.readthedocs.io/en/latest/png.html As a first use case, we investigated the information content accessible via measurements of the matter power spectrum and bispectrum. Our simulations allow us to extend the range of scales used in our analysis beyond the perturbative scales used in most advanced forecasts and analyses (e.g. Karagiannis et al. 2018;Cabass et al. 2022a;D'Amico et al. 2022). This analysis, along with our companion paper (Jung et al. 2022), are the first investigations of this kind. Our results show that significant gains can be made by including modes up to ∼ k max ≈ 0.3 h/Mpc, after this the improvements are less dramatic.
Beyond these scales, the large non-Gaussian covariance of our statistical probes limits improvements, despite large signals from PNG on these scales, by saturating the SNR of our probes. This saturation is not captured when Gaussian approximations are used to model the covariance matrix, thus highlighting the importance of accurately including the non-Gaussian terms. Similar saturation has been observed in the power spectrum (Neyrinck and Szapudi 2007;Rimes and Hamilton 2005), however these authors suggest that when pushing to even smaller, k > 1h/Mpc, the information content again improves rapidly with scale. The resolution limitations of our simulations prevent us from investigating this, though we note that robustly exploiting such scales would be challenging to due the impact of baryonic processes (see e.g. Chisari et al. 2018). Interestingly, we find that on the scales considered here the super sample contributions to the covariance matrix are unimportant. In our companion paper Jung et al. (2022) we examine the detailed features of the joint power spectrum -bispectrum covariance, which is an important component.
The challenge of modelling the large bispectrum covariance provides a large motivation to consider more efficient representations of the bispectrum information. Two such representation are explored in detail Jung et al. (2022). First we measure the modal bispectrum, which -by a judicious choice of basis functions-allows the information in the bispectrum to be fully captured in O(100) modal coefficients instead of the 1800 bispectrum bins used here. Second we then implement a further compressed estimator that compresses the information down to just 8 numbers.
We chose to work at z = 0.0, despite the limited volume of our Universe that is observable at z = 0.0. This was done to allow a comparison to the information content in the halo field (Coulton et al. 2022b), where resolution limitations mean we need to work at z = 0.0 to obtain observationally relevant tracer densities. Measurements at higher redshifts will demonstrate similar features to those seen in our analysis with the modification that the non-linear scale will be moved to smaller scales -this is explicitly seen in our companion paper, Jung et al. (2022), where we investigate the information at z = 1.. A second important choice is the size of box, and therefore k min , which is small compared to the volumes probed by upcoming experiments (e.g. Doré et al. 2014;DESI Collaboration et al. 2016). PNG constraints are highly sensitive to the minimum scale used in the analysis, see Kalaja et al. (2021) for an extended discussion. Thus the constraints on PNG obtainable from larger surveys will not just scale as √ Vol. compared with those presented here. It is also expected that the degeneracies seen with cosmological parameters will be somewhat reduced as the shapes of the gravitationally induced and the primordial bispectra will be more easily differentiated over a larger domain. Running larger simulations, whilst maintaining the small scale resolution is computationally too expensive for this work, however our results demonstrate the importance of small scale measurements that will only be enhanced for larger volume surveys.
A comparison between the information captured by our bispectrum measurements and the information available in the primordial field, which bounds how much we could possibly know about PNG, shows that the matter power spectrum and bispectrum estimators only capture ∼ 10% of the total available information. This highlights the need to explore alternative approaches, such as topological measures (Biagetti et al. 2020), the matter pdf (Friedrich et al. 2020), machine learning approaches (Giusarma et al. 2019; or field level approaches (Andrews et al. 2022), to fully capture the information available in the matter field. The simulations presented here are designed to facilitate future investigations along these lines.
Finally, it is important to emphasize that the 3D matter field considered in this work is not directly accessible. Observationally, we can access the 2D integrated matter field through gravitational lensing and these observations will be impacted by observational systematics and baryonic processes, which are neglected here. Both of these effects will likely impact the observationally attainable information. We can also make biased estimates of the matter field from surveys of biased tracers, such as galaxy positions and velocities, and signatures of the neutral hydrogen distribution through 21cm mapping. Thus we stress the results of this work are not forecast constraints for upcoming surveys, but provide an assesses the value of smaller scale bispectrum measurements and provide a suite of simulations for the community to use to develop and test new analysis tools for PNG. This work represents an important first step towards these more sophisticated probes. In a followup work we will explore more direct observables, such as the halo and galaxy fields.  The kernels used to generate initial conditions with local, equilateral and orthogonal-CMB non-Gaussianity are identical to those used in Scoccimarro et al. (2012). We use the notation and the set of kernels defined in Table 3 As in Scoccimarro et al. (2012), we generate local non-Gaussianity using and orthogonal-CMB with In this section we derive the kernels used in Eq. 8 to generate initial conditions with the orthogonal-LSS non-Gaussianity. This derivation mirrors the derivation and choices made for the other shapes in Scoccimarro et al. (2012). The bispectrum in Eq. 5 is separable and the first three terms are of the same form as the equilateral and f terms used in Scoccimarro et al. (2012). The next step is to enumerate the possible kernels associated with each term and these are listed in Table 3. Table 3. The kernels associated with each term in the bispectrum (as shown in the first column). Each bispectrum term can typically be generated by multiple kernels. Whilst the result is equivalent at the bispectrum level, for other statistics they have different properties.
Using this we can show that the orthogonal-LSS bispectrum can be generated by the following operations There is a large family of fields with the correct bispectrum with differing N ̸ = 3-point functions. Our approach to choosing the terms is to impose the constraint that the largest corrections to the power spectrum should scale as k −2 and, for simplicity, to set as many of the remaining coefficients as possible to zero. For two kernels K A (k 1 , k 2 , k 3 ) = k a 1 k b 2 k c 3 and K B (k 1 , k 2 , K 3 ) = k α 1 k β 2 k γ 3 the leading correction to the power spectrum is given as Bringing these terms together we have These operators have been derived assuming a scale invariant spectrum (n s = 1) for simplicity. It is simple to adapt to a non-scale invariant spectrum by replacing k with P (k) − 1 3 . Whilst this replacement alters the primordial bispectrum, the distortion is small, as n s ∼ 1, and it allows us to simply use non-scale invariant simulations in our analysis (i.e. n s ̸ = 1) as is required to be consistent with observations (Planck Collaboration VI 2020).

B. COVARIANCE MATRIX COMPUTATION
The covariance matrix given in Eq. 19 is composed of several terms, the Gaussian, connected and super sample covariance terms.

B.1. Gaussian Covariance matrix
The computation of the Gaussian covariance matrix is for the power spectrum estimator, Eq. 10, is given by (Feldman et al. 1994) where N modes is the number of modes in the power spectrum binned. For the binned bispectrum estimator the equivalent expression is where N triplets is the number of bispectrum configurations in the bin, the sum is over all bispectrum triplets and g IJK ijk selects only the diagonal elements of the covariance matrix and is 1,2 or 6 depending on whether there are 3,2 or 1 unique values in {ijk}.

B.2. Super sample covariance
Generally the super-sample covariance contribution to the covariance matrix is given by We use follow Li et al. (2014a) and use separate universe simulations to compute the responses to a long wavelength mode (the derivative terms in Eq. B14). For the power spectrum of global mean observables this is given by (k,k,k) P(k) Figure 10. The response functions of the power spectrum and bispectrum to a long wavelength mode. These have been computed using separate universe simulations as described in Li et al. (2014a) and for local power spectra observables we have an additional −2P (k). For the bispectrum of global mean observables this is (Chan et al. 2018) and for local mean observables there is an additional −3B(k 1 , k 2 , k 3 ). In both case the derivative ∂P SU (k)/∂δ b and ∂B SU (k 1 , k 2 , k 3 )/∂δ b is computed from the separate universe simulations using finite difference. In Fig. 10 we plot the response functions and these show great agreement with the equivalent plots in Li et al. (2014a) and Chan et al. (2018). Further we find that the relative importance of the super-sample covariance terms is very similar to that found in Chan et al. (2018) providing validation of our implementation.

C. JOINT CONSTRAINTS WITH BEYOND ΛCDM PARAMETERS
In this Appendix we extend our analysis to include two common extensions: the dark energy equation of state parameter, w, and the sum of the masses of the neutrinos m ν . We use the simulations described in  to compute the derivatives with respect to w and the simulations in Hahn et al. (2020) for the neutrino mass constraints. For the neutrino mass we use two modifications to our method: firstly we compute the properties with respect to the total matter field (rather than just the dark matter field). This is trivial done by include the neutrinos when gridding the simulation outputs. Second we use the third order difference method, Eq. 4.5 in Hahn et al. (2020), to compute the derivatives. This is used as the neutrino mass is positive definite and thus we cannot use the central difference method used for the other parameters. Additionally this higher order method provides a more accurate estimate of the derivatives. In Fig. 11, we plot the joint constraints of all the cosmological and PNG parameters. The inclusion of w primarily degrades the constraints on PNG by decreasing the constraining power on the ΛCDM parameters. On the other hand we find strong degeneracies between m ν and the f NL parameters. This is not unexpected given that the PNG parameters exhibit degeneracies with σ 8 and the impact of m ν is similar to σ 8 on these scales.

D. IMPACT OF A PRIOR ON THE POWER SPECTRUM CONVERGENCE
In Section 6 we found that the power spectrum constraints were not converged. As is seen in Fig 6 the power spectrum constraints are increased by a factor of ∼ 10−100 when marginalizing and this large degeneracy is the reason for the lack of convergence. To test this idea we perform our convergence test when imposing a Gaussian prior of width f NL = 1000 for all the PNG shapes. In Fig. 12 we see the convergence in this case finding that all derivatives are sufficiently converged. We show the parameter constraints obtained when including this prior in Fig. 13. The only effect of the prior is for the power spectrum constraints where it improves the constraints on almost all of the cosmological parameters by ∼ 20% with the exception of s 8 which is significantly improved.