Multi-tracer power spectra and bispectra: formalism

The power spectrum and bispectrum of dark matter tracers are key and complementary probes of the Universe. Next-generation surveys will deliver good measurements of the bispectrum, opening the door to improved cosmological constraints and the breaking of parameter degeneracies, from the combination of the power spectrum and bispectrum. Multi-tracer power spectra have been used to suppress cosmic variance and mitigate the effects of nuisance parameters and systematics. We present a bispectrum multi-tracer formalism that can be applied to next-generation survey data. Then we perform a simple Fisher analysis to illustrate qualitatively the improved precision on primordial non-Gaussianity that is expected to come from the bispectrum multi-tracer. In addition, we investigate the parametric dependence of conditional errors from multi-tracer power spectra and multi-tracer bispectra, on the differences between the biases and the number densities of two tracers. Our results suggest that optimal constraints arise from maximising the ratio of number densities, the difference between the linear biases, the difference between the quadratic biases, and the difference between the products b 1 b Φ for each tracer, where b Φ is the bias for the primordial potential.


Introduction
Spectacular advances in observations and computations have transformed our understanding of the Universe.At the same time, this has brought new puzzles to the fore, such as the still unknown nature of dark matter and dark energy that are core to the current standard model, and the emergence of tensions between observations that appear to challenge the foundations of the standard model.The next generation of large-scale structure surveys will deliver major advances in precision cosmology, cast new light on current tensions and anomalies, and open new windows on the primordial and late-time Universe.
This suggests the use of a multi-tracer bispectrum analysis, in combination with the corresponding multi-tracer power spectra.Here we develop for the first time a Fisher forecast multi-tracer formalism for the tree-level power spectra + bispectra combination.Multi-tracer power spectra have been combined with the auto-bispectrum of one of the tracers by [38].We extend such approaches to include all the bispectra.A multi-tracer forecast has been applied to bispectra alone (in real space) by [42], on the basis of simplifying assumptions and without giving the covariances or other details.We fill the gap by providing a systematic analysis in redshift-space, including all clustering bias and noise parameters at tree level and giving explicit forms for the covariances.
In this first paper, we do not perform detailed forecasts, which will be presented in future work.However, we do show qualitatively the improvements that are potentially delivered by the full power spectrum and bispectrum multi-tracer of 2 tracers, by investigating the parametric behaviour of the conditional errors on PNG.To quantify the potential of this method we consider a comparison between the forecasts generated by the multi-tracer and each single-tracer, over a single volume and with the same redshift range.This is done in order to investigate the effect of combining the samples or not on primordial non-Gaussianity, without the effect of survey specifications for the different tracers, e.g.new volumes and different redshift overlap.To begin with, we do not include wide-angle or relativistic effects.

Single-tracer power spectrum and bispectrum
The power spectrum of the Bardeen gauge-invariant primordial gravitational potential is defined in Fourier space by where P Φ (k) is directly related to the power spectrum of the primordial curvature perturbations ζ (during the matter-dominated era, Φ = 3 ζ/5), which are generated during inflation.
In the case of the standard single-field slow-roll inflationary scenario, they have a nearly perfect Gaussian distribution, which means that they can be adequately characterised by their power spectrum.The primordial perturbations Φ are in turn related to the linear dark matter over-density field through the Poisson equation, δ(k, z) = M (k, z) Φ(k), where Here D(z) is the linear growth factor (since the linear fluid equations generate a linearly evolved matter density field), normalised to unity at z = 0, and T (k) is the matter transfer function normalized to unity at large scales k → 0. The factor g dec is the Bardeen potential growth factor at decoupling, which ensures that f NL is in the CMB convention [43,44].The linear power spectrum is computed with the numerical Boltzmann code CAMB [45].
Deviations from the standard inflationary model will produce a violation of Gaussian initial conditions [46].PNG generates nonzero higher-order correlators in the primordial curvature perturbations, starting with the bispectrum, i.e. the Fourier transform of the threepoint function, Higher-order correlators in the primordial density field are a product of nonlinear interactions during the inflationary or reheating stage.The primordial bispectrum arising from such interactions is characterized by a dimensionless amplitude parameter f NL , giving the strength of the PNG signal, and a shape function F describes the dependence of the bispectrum on various triangles, where different inflationary models generate signal that peaks at distinct triangle configurations.By Eq. (2.2), the leading order PNG contribution to the matter density bispectrum is The matter bispectrum has additional contributions from nonlinearity induced by gravity, even at zeroth order.In other words, the dark matter density field is intrinsically non-Gaussian, even in the absence of PNG.Therefore, in order to disentangle the PNG information from late-time nonlinearity, an adequate description for the latter is needed.Here we model nonlinearity in the framework of Standard Perturbation Theory (SPT) (see e.g.[47] for a review).In addition to the nonlinear nature of the dark matter density field, the hierarchical formation of halos and the way they are populated by luminous matter induces non-Gaussianity in the distribution of observable structures.Hence, in order to extract the PNG signal from large-scale structure, we need the bias relationship between the observed tracers and the matter field.
We follow the perturbative approach of [48][49][50], in which the halo over-density field δ h is described as a function of all possible local gravitational observables, which are introduced in the expansion in the form of renormalised operators.A complete Eulerian bias expansion can be built from the tensor ∂ i ∂ j Φ, which contains the trace ∇ 2 Φ ∝ δ and the trace-free tidal field Note that the term 'halo' here can be replaced by a general matter tracer.
We consider up to second-order terms in the expansion, which are sufficient for the spatial scales considered here, i.e. much larger scales than those involved in halo formation.For Gaussian initial conditions, the Eulerian halo density contrast is [44,[48][49][50]: Here τ is conformal time, x are spatial comoving Eulerian coordinates, s 2 = s ij s ij is the simplest scalar that can be formed from the tidal field, ε is the leading stochastic field [51][52][53] and ε δ is the stochastic field associated with the linear bias.These fields take into account the stochastic relation between the tracer density and any large-scale field.The second-order tidal field bias coefficient, following the convention in [54], is given by b s = −4 (b 1 − 1)/7.
In the presence of PNG there is a scale-dependent correction to the linear bias b 1 , especially in the case of local PNG [55][56][57][58][59][60], since the local PNG bispectrum peaks in squeezed triangles.(A similar scale-dependent bias correction can be derived for any general nonlocal quadratic non-Gaussianity template [61][62][63][64].)Then the non-Gaussian bias terms, linear in f NL , are [65]: where q are the spatial coordinates in the Lagrangian frame and Ψ is a nonlocal transformation of Φ: The parameter α = 2, 1, 0 for equilateral, orthogonal and local shapes.Note that in the local case, Ψ → Φ.The stochastic counterpart of Ψ is ε Ψ .The bias coefficients in Eqs.(2.5) and (2.6) can be derived by using the peak-background split argument (see e.g.[21]).The single-tracer power spectrum and bispectrum in redshift space are At tree level, the first-and second-order kernels Z 1,2 include redshift space distortions (RSD) [66][67][68] and PNG.In order to deal with redshift errors and nonlinear RSD, we include the phenomenological factors D P,B .Explicit expressions are in Appendix A. In the case of power spectra, an alternative to the phenomenological factor D P is to use the 1-loop EFT model, as in the multi-tracer power spectrum analysis of [69].The stochastic terms P ε , P εε δ , B ε are generated by stochastic bias and their fiducial values are taken to be those predicted by Poisson statistics [70]: (2.10) 3 Multi-tracer power spectra and bispectra In the multi-tracer framework (e.g.[25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40][41]) we consider the combined information of all correlations of different tracer samples in a given redshift bin and sky patch.The multitracer power spectra and bispectra are ) where here and elsewhere we omit the z-dependence for brevity.The tracer indices I, J, K range over the tracer labels: I, J, K = t, t ′ , t ′′ , • • • .Note that the cross-power spectra are symmetric under permutations of the tracers, while the cross-bispectra require symmetrisation in order to avoid double counting of information: We will also use the short-hand notation where convenient.We define the estimators: where q 123 ≡ q 1 + q 2 + q 3 .For the power spectra the sum runs over all wavenumbers q in the spherical shell of radius k and width ∆k.For the bispectra, the Kronecker delta symbol ensures the formation of 'fundamental triangles' with sides q i , satisfying the condition q 123 = 0, that fall in the 'triangular bin' defined by the triplet of bin centres (k 1 , k 2 , k 3 ) and width ∆k.The volume in Fourier space of the power spectra shells and the bispectra fundamental triangle bins is, in the thin shell limit [71], The estimators in Eqs.(3.6) and (3.7) in the case of one sample (I = J = K) reduce to the well known auto-correlation results.

The two-tracer case
From now on we consider the case of two tracers, t and t ′ , so that I, J, K = t, t ′ .The data vector of the power spectra is The expected values of the spectra are: where Z I 1 and D IJ P are given in Appendix A. The data vector of the bispectrum is In a tree-level description, we consider terms up to second order PT theory, RSD and bias, while keeping only linear f NL terms in the perturbative expansion.Then the expected values of the estimator for each bispectrum data vector entry are: ⟨ B(ttt where Z I 2 and D IJK B are given in Appendix A. Note that we omitted the redshift dependence in all terms, and also the k-dependence in D IJK B .The stochastic terms have fiducial values: We assume that the cross shot noise of the power spectra and bispectra is zero: The extension of the power spectrum and bispectrum multi-tracers from 2 to 3 tracers is given in Appendix B.
Note that the multi-tracer tree-level expressions listed above are for a general form of the redshift-space kernels Z I 1 and Z I 2 .This means that additional effects that modify these kernels can be introduced fairly easily in the formalism.For example, we can incorporate local relativistic effects by replacing the kernels given in Appendix A with those given in [14].

Multi-tracer covariance
The multi-tracer combination of power spectra and bispectra for two tracers has data vector The covariance of the data vector contains four blocks, coming from the covariance of the power spectra, the bispectra and the power spectra -bispectra covariance.In matrix form: where We assume that the covariance between two power spectra is Gaussian, i.e. the offdiagonal contributions are neglected.Then the cross-sample covariance of the power spectra is a symmetric 3×3 diagonal block matrix: , where each sub-block is a diagonal matrix in k i .
The covariance of the symmetrised bispectra is a symmetric 4×4 block matrix: , where s 123 = 6, 2, 1 for equilateral, isosceles and scalene triangles respectively.Each block is an N T ×N T matrix, where N T is the total number of triangles formed.The index a represents the triangle triplet (k a 1 , k a 2 , k a 3 ).The Kronecker delta indicates that each sub-block of the bispectrum-bispectrum covariance is a diagonal matrix, i.e. we only consider the Gaussian contribution.
These sub-block matrices are: Finally, the cross-covariance between the power spectra and bispectra is itself a 3×4 block matrix: where Here the partially symmetric bispectra are given by: These bispectrum tracer indices are used for convenience and can be generalised for any 2-tracer combination, e.g.tt ′ t ′ .Note that a nonzero power spectrum-bispectrum crosscovariance is generated by configurations in which the power spectrum k mode coincides at least with one of the sides of a triangle configuration.The results presented above for the cross-covariance between power spectrum and bispectrum, consider only the dominant contribution, while the 5-point connected correlation function part is neglected (see [72] for a discussion).Note that the block matrix of Eq. (4.8) and each of the sub-blocks are not square matrices.
The expressions for 3 tracers are in Appendix B.

Fisher information on f NL
The previous section presented the formalism for the multi-tracer power spectra, the multitracer bispectra and their combination.In this section, we use the formalism for a simplified qualitative analysis of the improvements from the multi-tracer constraints on f NL relative to the single-tracer as baseline.In order to achieve this, we consider the same volume and redshift range for each single-tracer and their multi-tracer.This means that the improvement shown here on local PNG only refers to the effect of combining the samples or not, without taking into account the effect of different volumes and redshift overlap of two different samples.
To this end, we consider only the conditional errors, and we use nominal surveys without considering details and neglecting systematics.More realistic forecasts, that marginalise over relevant cosmological and nuisance parameters and use specifications for current and upcoming galaxy surveys, will be treated in a follow-up paper [73].Here we also confine ourselves to separate investigations into the precision from the multi-tracer power spectra and from the multi-tracer bispectra -without combining the information obtained from the two correlators.The follow-up paper [73] will include this combination of multi-tracers as part of more comprehensive forecasts.The Fisher information on f NL in each redshift bin for the multi-tracer power spectra is given by For a rough estimate of the improvements on precision that are produced by the multi-tracer, we consider the error σ P (f NL ) = (F P f NL ) −1/2 .Then an analytic expression can be derived for the single-tracer power spectrum: where we omitted the redshift dependence.Here α = 1, 0 for orthogonal and local shapes, respectively [44,55,56,61,[74][75][76]. Note that equilateral PNG (α = 2) cannot be constrained using the galaxy power spectrum, as shown in [65].Using the linear redshift space kernel Eq. (A.1), the analytic result Eq. (5.2) shows that, for a fixed number density and µ, the tightest constraints on f NL can be obtained from a sample that has b I Ψ ≫ b I 1 .For the multi-tracer case the analytic expression of the Fisher matrix on f loc NL is given by: where . (5.4) is the Gaussian version of Eq. (A.1), i.e. for f NL = 0. Note that this expression is valid for a general linear redshift kernel (e.g.including non-integrated relativistic corrections).
Our Eq. (5.3) generalises the result of [40], which only considers local PNG, takes the cosmic variance limit and neglects RSD effects.At fixed redshift bin and scales, Eq. (5.3) indicates that precision on f NL is increased by (5.5) One way to maximise the former is by choosing appropriately the line-of-sight angle µ to gain a boost from the presence of RSD, which is also evident in forecasts that utilise single tracers (e.g.[77]).If we neglect RSD (or fix µ) and consider local PNG, then Eq. (5.5) implies maximising , as shown also in [40].For the multi-tracer bispectra, the Fisher matrix is Analytical calculations for the bispectrum multi-tracer are long and uninformative, without assuming some limit, either for the shot noise or the triangle configurations.Consequently, we perform the bispectrum analysis numerically.
• Tracer t ′ : lower density, O(10 −4 ) (h/Mpc) 3 ; n t ′ (z) similar to DESI sample [79]; linear bias b t ′ 1 (z) = 0.84/D(z).In order to derive the higher-order and non-Gaussian bias coefficients needed for the tree-level bispectrum, we use a simplified halo-occupation distribution (HOD) model from [80].The HOD model has two free parameters which are calibrated from the values of the linear bias and number densities at each redshift.
This set-up for the nominal surveys exploits the findings of Eq. ( 5.3) and will function as the benchmark case.The fiducial cosmology is a flat ΛCDM model, with ω c = 0.1199, ω b = 0.02237, h = 0.6736, n s = 0.96488, A s = 2.1 × 10 −9 , as measured by [81].In addition, we assume non-Gaussian initial conditions, with an amplitude of f NL = 20, for all three PNG types considered here.The scales used to perform the Fisher matrix analysis are well within the perturbative regime: where k f (z) and V s (z) are the fundamental frequency and the volume of the survey within a given redshift bin, respectively, and k NL is given by the inverse square root of the one-dimensional velocity dispersion (see [19]; a plot of k NL (z) is shown in [4]).
σ(f NL )(z): benchmark case Fig. 1 presents the relative change in the conditional f NL errors from the multi-tracer approach with respect to the two nominal single-tracers t and t ′ , as a function of redshift.The results are for three types of PNG, i.e. local, equilateral and orthogonal, using the power spectrum and bispectrum.For both correlators, the multi-tracer delivers a noticeable overall improvement relative to each single-tracer survey.
In the case of local PNG, the multi-tracer power spectrum improvement over the f NL errors from a single tracer is significant -well above 30% across the redshift range.For the bispectrum, this applies only in the low-redshift bins of tracer t, while for tracer t ′ the gain is more modest.However, the overall enhancement of the bispectrum multi-tracer approach is still important and in the range 20 − 30%.
For orthogonal PNG, the improvement from the multi-tracer with respect to tracer t ′ is equivalent for both correlators (∼ 70%), while regarding tracer t it is significantly more for P MT than for B MT .Note that the performances of the single tracers t and t ′ are opposite to the local case.Eq. (5.2) shows that the single-tracer conditional error from the power spectrum on f NL is parametrically determined by b 1 , b Ψ and P ε .Numerical tests and the analytic results show that the constraints are more sensitive to the values of b Ψ , in particular for the case where b I Ψ ≫ b I 1 .For the local case, b t ′ Ψ > b t Ψ and hence we see that tracer t outperforms the constraints from t ′ , for both correlators.On the other hand, for orthogonal PNG, due to the nature of this shape (see [44] for a discussion), b t Ψ > b t ′ Ψ , leading to an opposite behaviour with respect to the local PNG.
For equilateral PNG, the multi-tracer bispectrum gives only a marginal improvement over the performance of tracer t, as in the orthogonal case.On the other hand, the gain for tracer t ′ increases with redshift, reaching a substantial ∼ 50%.Note that the power spectrum does not provide any constraints on f equil NL (see e.g.[65]).The attributes of the nominal surveys were selected to produce a large |b J The results in Fig. 1 indicate that in the general benchmark case -namely, with no additional assumptions -this is enough to achieve a significant gain from the multi-tracer approach in most cases, but for the rest it is insufficient.Apart from the dependence on PNG type shown in Fig. 1, additional parameters have a significant effect on the level of improvement achieved by the multi-tracer.The main tracerdependent variables that affect σ(f NL ), besides the linear and PNG biases (b I 1 and b I Ψ ), are the number densities n I , which in turn affect the stochastic contributions in Eq. (2.10).In addition, for the bispectrum, higher-order bias parameters such as b I 2 will have an effect on the constraining power.
Further examination is needed into the parametric behaviour of the multi-tracer power spectrum and bispectrum, in order to determine the tracer properties that maximise gain from combining their information.Here we confine attention to local PNG, and we fix the redshift at z = 1.Note that b Ψ → b Φ for local PNG, as follows from Eq. (2.7) with α = 0.

σ(f loc NL ): effect of number densities
We start with the effect of number densities on f loc NL conditional errors.We vary n t , while fixing all other parameters of tracers t and t ′ , including n t ′ .The dependence of the f loc NL errors on the ratio n t /n t ′ , at z = 1, is shown in the left panel of Fig. 2. In addition to the multi-tracer results (marked with the superscript MT), we present the forecasts from simply The dependence of the 1σ conditional f loc NL errors on the ratio n t /n t ′ , at fixed redshift z = 1.We vary only the number density of sample t, while fixing all other parameters of tracer t, together with those of sample t ′ .Right: Testing the cosmic variance limit, by enforcing equal number densities, i.e. n t = n t ′ = n, and varying n (at z = 1).Both panels show the multi-tracer forecasts (circles) and the results from the simple combination of the two Fisher matrices (squares) (without considering the cross-terms in the data vectors and block covariances), divided by the benchmark results of Fig. 1 at z = 1, for both summary statistics.
adding the Fisher matrices of the two tracers (marked with the superscript t + t ′ ), without considering the cross-terms in the data vectors, Eqs.(3.9) and (3.11), and block covariances, Eqs.(4.4) and (4.5).
The left panel of Fig. 2 shows that for both correlators the multi-tracer gives maximal improvement on f loc NL errors for n t /n t ′ ≳ 10.In other words, the number density of one tracer should be at least an order of magnitude larger than that of the other.In fact, if the ratio is beyond that point, the constraints do not improve further, as we remain shot-noise dominated by the low-density tracer.This behaviour occurs since only one of the two tracers is allowed to reach the limit where stochastic contributions can be neglected (i.e. the cosmic variance limit).The other tracer has a constant shot-noise component, that ultimately saturates the multi-tracer Fisher information.Nonetheless, in this setup, the multi-tracer power spectrum seems to be more sensitive to the cosmic variance limit of tracer t, than the bispectrum.This is an expected behaviour, since the effect of local PNG appears only on the largest scales for the tracer power spectrum, whereas the effect on the tracer bispectrum is from the squeezed configurations that are not purely dependent on the largest scales, since they couple large scales to small scales.
On the other hand, the constraints from just summing the Fisher matrices of the two tracers (i.e.neglecting the multi-tracer cross-terms), exhibit a minimal change as a function of the number density ratio.This indicates that the complete multi-tracer approach takes advantage of the increasing number density in an optimal way, by maximising the contribution of the cross-terms and hence providing a significant improvement on the f loc NL constraints, i.e. up to 10% for n t /n t ′ ≳ 10.Moreover, for the low density regime (n t /n t ′ < 1), neglecting the multi-tracer cross-terms, when considering the summed information from two overlapping tracers, can lead to severe overestimation of the f loc NL errors (even up to 40 − 60%), in the case of both correlators.
Note that the scenario with equal number densities of tracers (i.e.n t /n t ′ = 1) is not an optimal set-up for a multi-tracer application, at least when neither of the tracers approaches the high-density limit.
In order to investigate further the limit where the shot noise can be neglected, we assume equal number densities.Our aim is to explore the low-and high-density limits.The dependence of the forecast error on f loc NL as a function of number density is presented in the right panel of Fig. 2, for both correlators.This indicates that the multi-tracer constraints on local PNG exhibit an inverse power-law behaviour as the number densities of the tracers approach the cosmic variance limit, n ≫ 10 −3 (h/Mpc) 3 .By contrast, the results of simply adding the Fisher matrices of the two tracers saturate around the same value, without exhibiting any further improvement as the number densities increase.This shows the clear advantage of the multi-tracer approach, where the cross-term contribution is enhanced substantially, due to the decreasing shot-noise terms, providing significant improvement on f loc NL .Our power spectrum results clearly show the benefit of the multi-tracer, for the high density samples, as well as for tracers that approach the cosmic variance limit.This agrees with the findings of the original work on the multi-tracer approach [25], as well as the more recent results of [40].Moreover, here we show that the advantage of the multi-tracer towards the high-density limit can also be extended to the bispectrum, which exhibits a significant improvement compared to the power spectrum in the cosmic variance limited regime.

σ(f loc NL ): effect of linear biases
Next we investigate the effect of the linear biases on the multi-tracer power spectrum and bispectrum.Due to the symmetrisation of the two correlators under tracer permutations, the outcome of the parametric analysis does not depend on which tracer's parameters we chose to vary.Note that the higher-order bias coefficients of each tracer depend on the HOD model, whose free parameters are calibrated by the value of the linear bias at each redshift.This means that varying b t 1 will, in the HOD framework, affect all the bias coefficients of tracer t.Here we are interested in isolating the effect of a change in bias and we vary only b 1 t for tracer t, while fixing all other parameters for tracers t and t ′ .
The relative change of the multi-tracer constraints on f loc NL over the benchmark case, for the two correlators and at z = 1, are presented in Fig. 3 as a function of the difference between the linear biases, b t The results indicate that when the difference between the two biases is ≳ 2, there is a significant improvement (up to ∼ 30%) with respect to the benchmark results, σ(f loc NL )/σ(f loc NL ) bench = 1, for the multi-tracer power spectrum.For the multi-tracer bispectrum, the gain is more modest.Beyond that point, the improvement slowly saturates (|b t 1 − b t ′ 1 | > 5) for both correlators.The results show that the constraining power on Note that the case where the linear biases of the tracers are similar, |b t 1 − b t ′ 1 | ∼ 0, corresponds to the least tight constraints on f loc NL , for both correlators.In this case, the power spectrum constraints degrade significantly (∼ 60%) compared to the bispectrum (∼ 10%). 1 , while fixing all other tracer-dependent parameters for both t and t ′ .The results are normalised to the benchmark forecasts of Fig. 1.
We proceed by allowing all other bias coefficients of tracer t to vary as well, through the dependence of the HOD free parameters on the b t 1 value, including b t Φ (b t 1 ) via the universality relation assumption.The relative change of the f loc NL forecasts, compared to the benchmark results, as a function of 4. The behaviour exhibited indicates that for both correlators, the gain on the f loc NL forecasts increases as a function of For the multi-tracer power spectrum, the improvement on f loc NL constraints -i.e., the case where σ(f loc NL )/σ(f loc NL ) bench < 1 -saturates at ∼ 20% when dealing with large positive values of b The maximum gain seems to be achieved towards the negative The improvement towards the negative values is in fact significantly larger than the saturation regime towards the large positive values.Nonetheless, maximising provides the highest gain from the multi-tracer power spectrum, verifying the analytical findings of Eq. (5.3).
For the multi-tracer bispectrum, the tightest constraint on Beyond that, the improvement relative to the benchmark saturates.This indicates the different behaviour of the two correlators as a function of Φ , once we allow for all bias parameters of tracer t to change and b t Φ ∝ b t 1 .Moreover, the multi-tracer power spectrum results for b ) relative to the benchmark case, while for the bispectrum the constraints are close to the optimal outcome.
The different parametric behaviour of the multi-tracer power spectrum and bispectrum can also be seen in the contour plots of Fig. 5.In this case, we vary b t 1 and b t Φ , without assuming any relation between the two (i.e.no universality relation), while fixing all other tracer-dependent parameters for both tracers.The black dotted contours are lines of constant The results show that for the multi-tracer power spectrum, the case b t ′ 1 b t Φ − b t 1 b t ′ Φ ≈ 0 corresponds to the weakest constraints, where errors increase compared to the benchmark case.The maximal improvement on the forecast precision originates from the case of maximal b t ′ 1 b t Φ − b t 1 b t ′ Φ values -in particular for −10 ≤ b Φ ≤ 10 and for b 1 ≥ 2. The behavior of the multi-tracer bispectrum is somewhat similar, but in the b t ′ 1 b t Φ − b t 1 b t ′ Φ ≈ 0 regime, which corresponds also to weaker constraints than the benchmark, the performance is better than the power spectrum equivalent.As shown in Fig. 4, the optimal constraints from the multi-tracer bispectrum arise for an increasing In particular, in order to achieve a significant improvement, growing b t 1 values and very negative b t Φ are required (blue shaded region in the right panel of Fig. 5).
The results presented in Figs. 3, 4 and 5 indicate the potential of the multi-tracer to significantly improve constraints on local PNG by carefully choosing the two tracers.For the multi-tracer power spectrum and bispectrum, increasing gives the optimal results, where the latter leads to overall substantial enhancement.σ(f loc NL ): effect of second-order biases In order to investigate further the parametric behaviour of the multi-tracer tree-level bispectrum, we vary the second-order bias coefficient b t 2 , while all other parameters for both tracers are fixed.The relative improvement results are shown in Fig. 6 as a function of the difference b t 2 − b t ′ 2 for the complete multi-tracer approach, and for the sum of the individual Fisher matrices.Comparing the two shows once again the importance of utilising the full multi-tracer in order to optimise the gain on f loc NL constraints compared to the summed information of the bispectrum of two tracers.
In addition, varying b t 2 shows a peak around the case where the nonlinear bias parameters of tracers t and t ′ are similar, i.e. the difference |b t 2 − b t ′ 2 | is small.This indicates the regime where the multi-tracer bispectrum offers minimal improvement with respect to the benchmark results.Therefore, the constraining power on  3 and 4).This reveals the importance of the nonlinear bias parameter in utilising the full potential of the multi-tracer bispectrum.σ(f loc NL ): effect on scales The benefits of the multi-tracer approach for local PNG constraints beyond the nominal scenario studied here, should be explored.Although this is left for a future work, we can provide in this section some preliminary indication of whether the improvement shown in Fig. 1 would persist in any upcoming survey.To achieve this we investigate the sensitivity of the results to the minimum and maximum accessible scales.This is done in order to explore whether or not the multi-tracer gain originates from the expected scale range -and hence if it is likely to persist in a realistic set-up.The information on the amplitude of local PNG, in the case of the power spectrum, comes from the scale-dependent correction to the linear bias (see Section 2), while for the bispectrum it mostly originates from squeezed configurations that couple long and short modes.The latter has an additional source of information, similar to the scale-dependent bias correction in the power spectrum (see e.g.[77]).While the power spectrum PNG constraints are very sensitive to these scales, the bispectrum can still provide competitive results in the absence of the very large scales, as long as enough squeezed triangles are probed by the survey [19,21,77].
The sensitivity of the multi-tracer improvement in f loc NL constraints, as a function of the largest accessible scales k min (z), is presented in the right panel of Fig. 7.Here the expression for k max (z) remains the same as in the benchmark case.The results are presented for both correlators, where the 1σ errors originate from the cumulative information over the whole redshift range and each point assumes a fixed value of the ratio k min (z)/k f (z).
In the case of the power spectrum, the relative gain of the multi-tracer with respect to the single-tracer exhibits only a small loss (up to ∼ 10% reduction), once we reduce the access to the large scales (large values of the ratio k min (z)/k f (z)).Although power spectrum constrains on f loc NL are the most sensitive to these scales, the multi-tracer approach successfully maintains the relative improvement over single-tracer results, at more or less the same level.
The multi-tracer bispectrum follows a similar trend, where a ∼ 10% reduction to the benchmark (k min (z)/k f (z) = 1) improvement is observed.This decrease can be important for surveys like those considered here, where a maximum 20 − 30% gain from the multi-tracer bispectrum over the single-tracer (see Fig. 1) is provided.However, in the case where the parametric behaviour (discussed in the previous subsections) is used to advantage in the tracer selection process, the multi-tracer bispectrum could still provide a significant improvement to the local PNG constraints.
To investigate further the sensitivity of the multi-tracer improvement seen in the benchmark case to the small clustering scales, we vary the ratio k max (z)/k NL (z), while fixing k min (z)  to the fiducial choice.The results are presented in the left panel of Fig. 7.The power spectrum exhibits a reduction, in the overall gain, that does not surpass ∼ 10%.The constraints on f loc NL in this case do not depend on the information within the small scales.The observed decrease in the relative gain is simply attributed to the improvement of the f loc NL constraints from each single-tracer, while the multi-tracer errors are more or less unaffected.
On the other hand, limiting substantially the accessibility to small scales, i.e. in the regime k max (z)/k NL (z) < 0.5, reduces the number of squeezed triangles, affecting the performance of the bispectrum in constraining f loc NL .This is true for both the multi-tracer and single-tracer analysis.However, in this case we see that the gain of the multi-tracer bispectrum is greater, managing to maintain the relative improvement relative to each single tracer at significant levels (within the margin 20-50%).This means that reducing the small-scale access affects the single-tracer analysis substantially more than the multi-tracer.In other words, the multi-tracer bispectrum does much better in a regime which is not shot-noise dominated.The gain exhibited reaches the benchmark values as soon as the small scale access is restored to the fiducial choice (i.e.k max (z)/k NL (z) ≳ 0.75).
The overall trends suggests that the f loc NL constraints behave differently in the single and multi-tracer case.The single-tracer case is more sensitive to k max than the multi-tracer, which leads to a larger improvement at lower k max cuts.This indicates the significance of the multi-tracer approach, and especially for the bispectrum, in upcoming surveys.In cases where the small scales are lost due to the survey's design and systematics, the multi-tracer approach becomes ever more important.

Summary and conclusion
In this work we investigate the potential of the multi-tracer power spectrum and multi-tracer bispectrum analysis in improving the constraints on the amplitude of PNG, with a particular focus on local PNG.We consider two nominal next-generation surveys and use the Fisher matrix to probe the information on f loc NL , within the power spectra and bispectra of the two tracers.The main goal is to quantify the impact of the various tracer-dependent parameters on the constraining power on f loc NL , in order to maximise the gain from the multi-tracer technique and provide the tracer-selection criteria for future galaxy surveys.This is done by fixing the survey specifications for each single-tracer and the multi-tracer, which effectively means comparing the potential gain from combining the samples or not over the same volume and redshift range.The main findings can be summarised as follows: • The complete multi-tracer approach takes advantage of increasing number density in an optimal way, by maximising the contribution of the cross-terms.This is particularly important for high-density samples, as well as for tracers that approach the cosmic variance limit.The scenario where the number densities of the tracers are similar does not provide the tightest constraints, except for high-density samples.
• The constraining power of the power spectrum and bispectrum multi-tracers is , where the power spectrum exhibits a much larger gain compared to the bispectrum.In particular the optimal regime is for an increasing difference between the linear biases of the two tracers, as well as for a negative b Φ for one of the two tracers.
• Maximising the quantity |b t ′ 1 b t Φ − b t 1 b t ′ Φ | is not enough for to fully exploit the potential of the multi-tracer bispectrum.Increasing as well the second-order bias difference |b t 2 − b t ′ 2 | provides a significant gain in the bispectrum constraints, compared to only increasing • The findings of this work demonstrate a substantial improvement in constraining f NL through the utilization of the multi-tracer bispectrum.However, realizing this benefit necessitates satisfying the criteria outlined in Section 5, similar to the specifications employed by the nominal surveys in this analysis.These criteria involve maximising the number density ratio, as well as the difference between the linear biases, quadratic biases, and the products b 1 , b Φ for each individual tracer.
• The advantage exhibited by the multi-tracer approach over a single-tracer analysis, for both power spectrum and bispectrum, could be maintained in a realistic setup, where the accessible scales can be limited.In particular, a bispectrum multi-tracer analysis could improve single-tracer constraints on f loc NL up to ∼ 50%, for an analysis restricted within the very large linear and not shot-noise dominated scales.
The results presented here do not consider correlations between the various parameters, within the Fisher matrix formalism, but address only the case of the conditional f loc NL error.Correlations are important since the power of the multi-tracer approach can be applied to reducing or breaking the various degeneracies between parameters.In order to fully investigate the parametric behaviour of the multi-tracer technique we need to perform a complete Fisher analysis, including the various parameters of the model.This is the subject of the follow-up paper [73].
We considered only the full shape of the power spectrum and bispectrum.We leave the study of the possible gains when using different types of data compression of the bispectrum for follow-up work.Future work will also investigate the inclusion of relativistic and wideangle effects, which have been shown to be important in the multi-tracer power spectrum and single-tracer bispectrum.

Figure 1 .
Figure 1.Relative change in the cumulative 1σ error on f NL , provided by the multi-tracer (MT) approach relative to each single tracer (ST), as a function of redshift.Power spectrum (red lines) and bispectrum (blue lines) results, are shown for three PNG types.t and t ′ are tracers of the two nominal next-generation surveys considered here.

Figure 2 .
Figure2.Left: The dependence of the 1σ conditional f loc NL errors on the ratio n t /n t ′ , at fixed redshift z = 1.We vary only the number density of sample t, while fixing all other parameters of tracer t, together with those of sample t ′ .Right: Testing the cosmic variance limit, by enforcing equal number densities, i.e. n t = n t ′ = n, and varying n (at z = 1).Both panels show the multi-tracer forecasts (circles) and the results from the simple combination of the two Fisher matrices (squares) (without considering the cross-terms in the data vectors and block covariances), divided by the benchmark results of Fig.1at z = 1, for both summary statistics.

Figure 3 .
Figure 3.The dependence of the 1σ conditional f loc NL error on linear bias difference b t 1 − b t ′ 1 , at fixed redshift z = 1, for the multi-tracer power spectrum and bispectrum.Here we vary only b t 1 , while fixing all other tracer-dependent parameters for both t and t ′ .The results are normalised to the benchmark forecasts of Fig. 1.

Figure 4 .
Figure 4. Same as Fig. 3, where we only vary b t 1 , but we allow all the other bias parameters of tracer t to change in accordance with the dependence of the HOD model on the linear bias, including b t Φ .The relative change, with respect to the benchmark results, is shown as a function of b t ′ 1 b t Φ −b t 1 b t ′ Φ .

Figure 5 .
Figure 5.The ratio σ(f loc NL )/σ(f loc NL ) bench , with values given by the colour bar, in the (b t Φ , b t 1 ) plane, for multi-tracer power spectrum (left) and bispectrum (right).We vary b t 1 and b t Φ independently, without assuming a universality relation, while fixing all other parameters for both tracers.Black dotted lines are contours of constant b t ′ 1 b t Φ − b t 1 b t ′ Φ .

Figure 6 .
Figure 6.σ(f loc NL )/σ(f loc NL ) bench as a of the difference b t 2 − b t ′ 2 , varying b t 2 , while keeping fixed all other parameters for both tracers.Square points correspond to the summation of the two Fishers, while dots represent the full multi-tracer results.

Figure 7 .
Figure 7.The relative change of the multi-tracer compared to single tracers t and t ′ , for the power spectrum (red) and bispectrum (blue), as a function of the maximum wavenumber k max (left panel) and minimum k min (right panel).The f loc NL errors correspond to the cumulative contribution from all redshift bins, while k NL and k f are given in the text under Fig. 1.Note that the benchmark results in Fig. 1 correspond to k max /k NL = 0.75 and k min /k f = 1 respectively.
In fact, any value outside of the range −1 < |b t 2 − b t ′ 2 | < 5 can have an improvement of > 20%.Moreover, increasing the value of |b t 2 − b t ′ 2 | improves the gain on f loc NL as much as, or even more than, the case of an increasing |b t 1