Constraining primordial non-Gaussianity from DESI quasar targets and Planck CMB lensing

We detect the cross-correlation between 2.7 million DESI quasar targets across 14,700 deg2 (180 quasars deg-2) and Planck 2018 CMB lensing at ∼30σ. We use the cross-correlation on very large scales to constrain local primordial non-Gaussianity via the scale dependence of quasar bias. The DESI quasar targets lie at an effective redshift of 1.51 and are separated into four imaging regions of varying depth and image quality. We select quasar targets from Legacy Survey DR9 imaging, apply additional flux and photometric redshift cuts to improve the purity and reduce the fraction of unclassified redshifts, and use early DESI spectroscopy of 194,000 quasar targets to determine their redshift distribution and stellar contamination fraction (2.6%). Due to significant excess large-scale power in the quasar autocorrelation, we apply weights to mitigate contamination from imaging systematics such as depth, extinction, and stellar density. We use realistic contaminated mocks to determine the greatest number of systematic modes that we can fit, before we are biased by overfitting and spuriously remove real power. We find that linear regression with one to seven imaging templates removed per region accurately recovers the input cross-power, f NL and linear bias. As in previous analyses, our f NL constraint depends on the linear primordial non-Gaussianity bias parameter, bϕ = 2(b - p)δc assuming universality of the halo mass function. We measure f NL = -26+45 -40 with p = 1.6 (f NL = -18+29 -27 with p = 1.0), and find that this result is robust under several systematics tests. Future spectroscopic quasar cross-correlations with Planck lensing can tighten the f NL constraints by a factor of 2 if they can remove the excess power on large scales in the quasar auto power spectrum.


Introduction
Single-field slow roll inflation (SFSR) generates nearly Gaussian, nearly scale-invariant primordial fluctuations. Deviations from Gaussianity are of order of the slow-roll parameter, 10 −2 . In fact, the consistency relations for SFSR inflation [1][2][3] state that the squeezed-limit primordial bispectrum follows the "local" shape, with f loc NL = 0.015 for n s = 0.963. In contrast, multi-field inflation models can produce large amounts of local primordial non-Gaussianity, with f loc NL generically around 1 [4]. Hence, a detection of f loc NL 0.01 would rule out single-field inflation, whereas constraining f loc NL < 1 would put significant pressure on multi-field models. In tandem with the tensor-to-scalar ratio measured from CMB B modes, local primordial non-Gaussianity measurements can disfavor significant portions of inflationary parameter space.
The current best constraints on f NL are from Planck's measurement of the bispectrum, f NL = −0.9 ± 5.1 [5]. In what follows, we will drop the superscript "loc" for clarity, but all constraints should be understood as referring to the local shape only. The Planck measurements are close to the cosmic variance limit, and CMB-S4 is only expected to improve the constraint by a factor of 2 [6]. Thus, future CMB experiments will not be sufficient to reach O(1) constraints on f NL . Local primordial non-Gaussianity also induces a scale-dependent halo bias ∝ 1/k 2 , leading to an enhancement in clustering on very large (low k) scales [7,8]. The scale dependence of the bias can also be used to probe interactions in multi-field inflation [9,10] and to measure the mass of massive fields during inflation [11].
Measuring galaxy clustering on very large scales is challenging. The fluctuations are small, and systematic effects can be quite significant, leading to significant added noise power on large scales. Indeed, all LSS f NL constraints to date have been systematics-limited. CMB lensing cross-correlations [33][34][35][36][37][38][39][40][41][42] are therefore an attractive alternative to galaxy autocorrelations. Most sources of noise are uncorrelated between galaxy and CMB surveys [e.g. [43][44][45][46]. Hence, galaxy survey systematics add a noise bias to the galaxy autocorrelation, but only add noise to the cross-correlation, since the cross-correlation covariance is proportional to the square root of the autocorrelation. Therefore, cross-correlations can be quite useful for controlling large-scale systematics [24,46]. CMB lensing is particularly sensitive to high redshifts, where the f NL signal is stronger. Last, the Planck CMB lensing map covers 70% of the sky, critical for reducing cosmic variance on the largest scales. Ref. [29] forecast that LSST-CMB lensing cross-correlations can reach σ f NL < 1, taking into account sample-variance cancellation for future low-noise surveys.
In this paper, we cross-correlate DESI quasar targets with Planck 2018 CMB lensing to constrain f NL solely from the cross-correlation, neglecting information from the highlycontaminated auto-correlation. In Section 2, we describe the theory necessary to compute angular correlation functions at low . In Section 3, we describe the quasar and CMB lensing data, and in Section 4, we describe our angular power spectrum pipeline. In Section 5, we validate our pipeline on mocks, including contaminated mocks to verify that we do not overcorrect when mitigating imaging systematics. Finally, in Section 6, we present the results, and in Section 7 we compare them to previous f NL constraints. Throughout this paper, we fix the cosmological parameters to the Planck 2018 flat ΛCDM model [47] with h = 0.6766, A s = 2.105 × 10 −9 , n s = 0.9665, Ω m = 0.3096, Ω b = 0.049, one neutrino with mass 0.06 eV, and σ 8 = 0.8102.

Theory
Local primordial non-Gaussianity can be parameterized by an additional term in the primordial potential, proportional to f NL [48] where φ G is the initial potential. Correspondingly, non-Gaussianity also increases the density which comes from taking the Laplacian of Eq. 2.1 and setting ∇φ G = 0 since the first derivative is zero at a peak. By changing the height of rare peaks, f NL modifies the response of the halo abundance to a background long-wavelength mode, i.e. the halo bias [7,8,49]. At the threshold for collapse δ c , the peak height is enhanced by 2f NL φ G δ c . Therefore, defining the relationship between the late-time density field and the primordial potential with δ(k) = α(k)φ G (k), the bias is enhanced by where the response p = 1 for a mass-selected sample (hence b − p is simply the Lagrangian bias). For a sample dominated by recent mergers, p = 1.6 is more appropriate. The Laplacian in Fourier space becomes k 2 . Hence α(k) is proportional to k 2 and ∆b(k) is proportional to k −2 , leading to a distinctive upturn in halo clustering on very large scales. More specifically, α(k) is given by where T (k) is the transfer function, D(z) is the linear growth factor normalized to 1 at z = 0, and g(z) = (1 + z)D(z). α(k) is evaluated using the CMB normalization [50], in which Φ refers to the (constant) potential in the matter-dominated era, not the potential at z = 0, which is lower by a factor of ∼ 1.3 due to dark energy. It has been argued the p = 1 is appropriate for luminous red galaxies and star-forming emission line galaxies, and p = 1.6 is appropriate for quasars, which may result from mergers [12,13]. Following Ref. [12,13], in Eq. 2.3 we use δ c = 1.686, fix p to a fiducial value of 1.6, and also test p = 1. In deriving Eq. 2.3, we have assumed universality of the halo mass function to relate b φ to the linear bias. Recent work finds that a significantly different form of the b φ (b 1 ) relation may be possible for various galaxy samples selected from hydrodynamical simulations [51][52][53][54][55]. In light of these theoretical uncertainties, this work may be regarded as placing constraints on b φ f NL , and further theoretical work or additional data is needed to break the b φ f NL degeneracy for the tracer of interest.
We probe ∆b(k) using the matter-galaxy cross-power spectrum, P gm (k). Specifically, since CMB lensing is a 2-dimensional projected field, our observable is the angular crosspower spectrum where dN/dz is the quasar target redshift distribution, and W κ (z) is the CMB lensing kernel with χ the distance to the surface of last scattering. We assume that at low-k, higher-order contributions to the bias are small, so P gm (k, z) = (b(z) + ∆b(k, z))P mm (k, z). We fix the redshift evolution of the bias to the functional form of Ref. [56], b L (z) (similar to other fits from [57,58]), and scale it with a free amplitude b 0 b(z) = b 0 b L (z) = 0.278b 0 (1 + z) 2 − 6.565 + 2.393 (2.7) We also investigate alternatives for the bias evolution in Section 6 and Appendix A.
We also consider the contributions from lensing magnification and redshift-space distortions. In lensing magnification, foreground structure changes the background number density. This is correlated with CMB lensing, leading to an extra contribution to the cross-correlation: where the magnification bias window function W µ (z) is The inner integral runs over the distribution of galaxies, and s ≡ d log 10 n dm [59] is the response of the number density n to achromatic changes in the brightness dm.
Redshift space distortions also contribute to the observed number counts where f (z) is the logarithmic derivative of the growth rate with respect to scale factor, f ≡ d ln D d ln a ≈ Ω m (z) 0.55 , and the spherical Bessel function in Eqs. 2.5 and 2.8 is replaced with its second derivative. Then our final model is C = C κg + C κµ + C κRSD (2.11) Fig. 1 shows the fractional contributions of magnification and RSD (using the fiducial values of b 0 = 1 and s = 0.276), compared to f NL = 50. The magnification term is a considerable fraction (∼ 10%) of the clustering term, and rises in a scale-dependent way that is approximately degenerate with f NL at > 30, emphasizing the importance of low multipoles. The RSD term is very subdominant, only rising to 1% of the fiducial model at < 6. The nonlinear term is only non-negligible at > 200, and it smallness shows that the measurement is well within the linear regime, using large scales at high redshift. Equation 2.5 is a numerically challenging three-dimensional integral over oscillatory spherical Bessel functions. We simplify the problem by splitting the power spectrum into linear and nonlinear regions, P (k) = P lin + (P NL − P lin ) [60]. The redshift and k dependence of the linear power spectrum are separable, P (k, z) = P (k)D 2 (z), allowing us to perform the z 1 and z 2 integrals as 1D integrals, and subsequently the k integral. We handle the spherical Bessel functions using the FFTLog algorithm [60]. For the nonlinear part (P NL − P lin ), we can make the Limber approximation [61] j (kχ) → π 2 +1 δ D ( + 1/2 − kχ), reducing the three-dimensional integral to a one-dimensional integral in k or χ.

Quasar target selection
DESI is a redshift survey targeting 40 million luminous red galaxies, emission line galaxies, and quasars, mainly at 0.5 < z < 2 [62,63]. DESI measures redshifts using 5000 roboticallypositioned fibers across a 7.5 deg 2 field of view [64][65][66]. Three spectrographs with R ∼ 2000−4000 span 360-980 nm. DESI's goal is to determine the nature of dark energy through the most precise measurement of the Universe's expansion history ever made [67]. We use data drawn from both the DESI Early Data Release (EDR) [68] and from the first two months of first year release (DR1). We validate the redshifting and quasar classification pipeline with a small (∼1500) sample of visually inspected quasars from DESI Survey Validation [69], and check these results using the first two months of DR1 over a much wider area with a larger number of targets. DR1 redshift catalogs beyond the first two months of observing are blinded; we therefore only use the first two months of observations. DESI targets quasars as a direct tracer of large-scale structure, as well as backlights for the Lyα forest at z > 2.1. Quasars are selected from DR9 of the Legacy Imaging Surveys [70][71][72] covering nearly 20,000 deg 2 in g, r, and z bands. The Legacy Surveys consist of three independent programs, the Beijing-Arizona Sky Survey (BASS) and the Mayall z-Band Legacy survey (MzLS) in the north; and the DECam Legacy Survey (DECaLS) in the south. Due to varying depths and the non-contiguity imposed by the Galactic plane, we divide the sky into four independent regions (Fig. 2). In the north (δ > 32.375 • ), BASS observed g and r over 5100 deg 2 using the 2.3-meter Bok telescope [70]. z band observations in this region come from MzLS [73], using the 4-meter Mayall telescope. In the south, DECaLS include the publicly available Dark Energy Survey imaging across 5000 deg 2 [74] using the Dark Energy Camera (DECam) [75]. DECaLS subsequently expanded the southern imaging to encompass the entire southern DESI footprint. Because DES was a dedicated weak lensing survey with separate goals, it is substantially deeper than the DECaLS imaging, with typically 4 or more passes of the sky rather than 3. In addition to the optical data, infrared data in W1 (3.4 µm) and W2 (4.6 µm) from the all-sky Wide-field Survey Explorer (WISE) [76] is critical for separating stars and quasars. The unWISE coadds [77] use Tractor [78] forced photometry to improve and deblend WISE flux measurements for optical sources.
The differences in surveys and imaging quality lead us to consider the quasar sample as consisting of 4 disjoint areas: North (MzLS + BASS), DECaLS-North (the northern galactic cap observed by DECaLS), DECaLS-South (the southern galactic cap observed by DECaLS), and DES. As DECaLS-North and DECaLS-South are drawn from the same underlying imaging catalog, they are sometimes considered as one area. However, the stellar density and extinction are quite different between the two regions, and they are also calibrated separately. As a result, each region may show a different relationship between observed quasar density and observational properties such as depth or seeing. In order to test against the impact of these varying imaging systematics, we consider DECaLS-North and DECaLS-South separately.
We also find an unexplained excess power in quasar targets in the DES region, which is not mitigated by systematics weights. This excess power is driven by visually apparent large fluctuations in the quasar density field at δ < −30 • . The δ < −30 • part of DES was found to have slightly stronger systematics trends than the entire DES footprint in Ref. [79]. This is due to a known shift of approximately 0.02 mag in the z-band, where the calibration method transitions from Pan-STARRS1 to Ubercal [72,80]. We find that excluding the 60% of the DES footprint at δ < −30 • reduces the excess low power (Fig. 15), and therefore define DES as only including the region above δ = −30 • . DESI uses quasars' near-infrared excess to separate them from stars, using 5 band photometry (grzW 1W 2) [81]. Quasar target selection is extensively described in Ref. [82], a preliminary version is described in Ref. [83], and target selection for the entire survey is described in Ref. [84]. Angular clustering of quasar targets is measured in Ref. [79]. To maximize the number of quasar targets, a machine-learning Random Forest (RF) algorithm is used to separate stars from quasars. Before applying the random forest algorithm, basic color cuts are applied (16.5 < r < 23, W1< 22.3, W2< 22.3), and targets are restricted to point-like sources ('PSF') in DR9 imaging. The RF algorithm is trained on equal numbers of quasars and stars (332,650 each), using 11 input parameters: the 10 possible colors and r band magnitude. The RF threshold p is a function of r magnitude and varies slightly within the 3 imaging regions North, DECaLS (non-DES) and DES: p(r) = α − β tanh(r − 20.5), with (α, β) equal to (0.88, 0.04), (0.7, 0.05) and (0.84, 0.04), respectively. After selecting quasar targets, we additionally remove targets with any of maskbits 2 1, 8, 9, 11, 12, or 13 set (near optical or WISE bright stars, SGA 3 large galaxies, or globular clusters). The density of quasar targets is strongly dependent on imaging properties, and [79] develops a method to correct for these systematic fluctuations and validates the angular correlation function of the quasar targets. The DESI target selection pipeline is described and documented in [85].
Sky maps of the quasar targets and imaging systematic weights are shown in Fig. 3, and key statistics for the quasar sample are given in Table 1. The angular extent of the usable imaging regions is represented by a random catalog with density 45000 deg −2 . Since the random catalog has features on sub-pixel scales (e.g. excluded areas around moderately bright stars), we measure the random density in NSIDE=2048 HEALPixels [86] to create an imaging completeness mask. We also create binary masks for each of the four imaging regions, and set any pixel with completeness < 80% to zero in the binary masks. To compute the density in each pixel, we divide the quasar counts by the imaging completeness mask in each pixel. After removing low-completeness areas, the binary imaging masks cover 14,719 deg 2 (35.7% of the sky).
Imprints from the various feature maps are clearly visible in the map of systematic weights in Fig. 3. In DECaLS S, regions of high extinction are clearly visible as up-weighted regions; another high-extinction region is visible in the North, near RA 150 • , Dec 75 • . Striping in ecliptic longitude is visible in the DES region, from the WISE scan strategy. Close to the boundary between them, the systematics weights in DeCALS N appear to have much lower resolution than in the North. This is because the systematics weights are dominated by imaging depth and PSF size, which vary from field to field. The northern weights are dominated by PSFISZE_Z, which comes from the relatively small field of view (0.4 deg 2 ) MzLS survey [87]. In contrast, DECam has a 3.2 deg 2 field of view [71], leading to much coarser structure in DECaLS N.
We use DESI spectroscopy from the first two months of DR1 to measure the redshift distribution of the quasar target sample (Fig. 4). The quasar redshift measurement is based on the automated RedRock pipeline [88] supplemented by MgII and QuasarNet [89] afterburners ( Fig. 9 in [79] describes the path to a successful quasar redshift). This pipeline has been validated for high purity (99%) and completeness (95%) with a visually inspected sample of quasars from 3 deep Survey Validation tiles with 7000-9000 s exposure time [90]. These tiles are 7-9 times deeper than the Main Survey tiles, allowing both for true redshifts to be definitively identified, and for the creation of Main Survey like exposures to study the pipeline's performance.

Classifying quasar targets as quasars, stars, or galaxies
The DESI quasar target selection maximizes the number density of quasars, but at the cost of a large fraction of interlopers. Within the deep visual inspection sample of main targets (1361 targets 4 ), 70.9% are quasars, 16.1% are galaxies, 6.3% are stars, and 6.7% are unclassified low-quality objects [90]. We also determine the quasar fraction for the automated redshifts from the first two months of DR1 (396,957 quasar targets).   Table 1. Density of quasars and unmasked area in each imaging region. The bottom two lines give the average density and total area covered by the High Purity sample, and the same statistics for the Main sample used in Ref. [79]. The areas differ because high purity includes our cut on DES to exclude δ < −30 • , whereas main includes the entire DES region.
The spectroscopic quasar classification pipeline is described in Section 3.1 above. Stars are defined as any object with a RedRock redshift < 0.01 that is not included in the quasar catalog. Galaxies are more complex to classify. Most of the quasar targets that are classified as galaxies show strong emission lines, and indeed there is substantial overlap between the quasar sample and the emission line galaxy sample. However, the galaxies show significant diversity, and a minority have strong 4000 Å breaks indicative of an older stellar population. As a result, we blend components of the ELG and LRG selection criteria [91][92][93]. We define a successful galaxy redshift as any object that is not a quasar or a star, which either: 1. meets the [OII] flux-DELTACHI2 criterion of [91]: log 10 (FOII_SNR) > 0.9 − 0.2 × log 10 (DELTACHI2); 2. or has DELTACHI2 > 30; 3. or has [OIII] SNR > 5, for a small minority of objects that have strong [OIII] emission but not much [OII].
We validate this selection on the deep-field VI tiles, using custom coadds reaching the exposure time of the DESI main survey (1000 s). We average our results across the five coadds that were created. Treating the VI classifications and redshifts as truth (and restricting to VI sources with quality >= 2.5), we compute the completeness and purity of quasars, galaxies, and stars ( Table 2). We have on average 933 high-quality VI quasars, 212 galaxies, and 72 stars. We remove targets which do not pass the spectroscopic quality flag COADD_FIBERSTATUS. Our results are slightly different from those of Ref. [90], since we apply extra maskbits 8, 9, and 11 compared to the DESI main selection.
The completeness of the galaxy sample is comparable to the quasar sample, though the purity is 10% lower. But the redshift accuracy of the 10% mis-classified galaxies is sufficient for our purposes, and the good redshift purity is very high. It is key to note that since we are performing a 2D analysis, redshift errors of less than ∼ 0.1 are completely negligible, while they are catastrophic to the DESI spectroscopic survey. Additionally, we do not care if the classifications of the spectra are wrong, only if their redshifts are catastrophically in error (or mis-classified as stars). Of the 116 mis-classified galaxies (summing over the five subsets), 113 are quasars and only 3 (2.6%) are stars. The redshifts are nearly all correct; only 14 (12.1%) are significant outliers with |∆z| > 0.1. Therefore, the good redshift purity, or the fraction of pipeline-selected galaxies with redshifts accurate to within |∆z| < 0.1, is 98.5%. The pipeline redshifts match the VI redshifts extremely closely, with a median offset of -0.06 km s −1 and median absolute deviation scaled by 1.4828 of 9 km s −1 . This is merely a statement that the pipeline and VI redshifts agree extremely well, not a quantification of the absolute redshift error. The absolute redshift error is measured in Ref. [90] by comparing DESI and SDSS redshifts, and is 60 km s −1 across the entire quasar sample. Therefore, the agreement between the pipeline and VI redshifts for the quasar targets classified as galaxies implies that their redshift errors are typical for VI galaxies.
While we use the union of three separate galaxy cuts to create as inclusive a catalog as possible, most good redshift galaxies satisfy all three. 198 of the 211 VI galaxies have both log 10 (FOII_SNR) > 0.9 − 0.2 × log 10 (DELTACHI2) and DELTACHI2 > 30. Allowing for galaxies passing the DELTACHI2 > 30 cut and not the log 10 (FOII_SNR) > 0.9 − 0.2 × log 10 (DELTACHI2) increases the completeness from 93.8% to 94.8%, and additionally allowing for galaxies passing the emission line cut but not the DELTACHI2 > 30 further increases the completeness to 95.2%. Within the VI sample, we do not find any galaxies with only the [OIII] SNR criterion satisfied. However, this cut modestly increases the completeness on a separate VI sample of 86 quasar targets from the main DESI survey with our fiducial ("high purity") flux and photometric redshift cuts applied, as described in Section 3.3 below.
The main limitation of the pipeline classification is that it misses ∼ 20% of the stars, which are classified as bad redshifts. We therefore boost the measured stellar contamination fraction by 28%, following Table 2.
For the main target sample, we obtain 60.2% quasars, 5.4% stars, 16.4% galaxies, and 18.0% unclassified. The unclassified fraction is substantially higher than in the deeper VI data. The quasar fraction is consistent given sample variance and the expectation that the contaminated pipeline recovers 95% of true quasars [90]. The consistent object classification between DR1 and VI validates our automated star/galaxy classification.  Table 2. Purity and completeness of pipeline redshifts, measured using the deep VI spectra (with high-quality VI classification) as truth. The quasar pipeline is described in Refs. [79,90]. The galaxy pipeline is described in Sec. 3.3 and merges elements of the ELG and LRG redshift success criteria. Stars are spectra with z < 0.01 or RedRock classification as stars. The good redshift purity is the fraction of spectra where the redshift is correct to within ∆z = 0.1, regardless of the classification.

Selecting a cleaner quasar sample
The large numbers of stars and unclassified objects are not ideal for this work. Stars imprint complex angular systematics into the sample, since they both directly add to the sample and modulate the density of true quasars (e.g. by spuriously modulating the inferred sky background or directly occulting background quasars). Moreover, the stellar fraction of the sample is difficult to characterize, as it is a strong function of sky position (depending most critically on Galactic latitude and position relative to the Sagittarius Stream), raising the issue that the star fraction may be different in the region covered by DESI spectroscopy and the entire imaging sample (Fig. 2). Unclassified objects pose a problem because they may have a different redshift distribution than the quasars as a whole, biasing predictions for the cross-correlation.
To mitigate these issues, we impose three separate cuts beyond the DESI quasar target selection, which reduce the target density by more than a factor of two but yield a much cleaner selection. We call the full quasar sample the "main" sample, and the lower-density sample the "high purity" sample.
The high purity sample consists of three separate cuts to maximize the fraction of quasars. First, we use the Gaussian mixture model of [94] to separate stars and quasars based on their colors. We reject sources with P star > 0.01 as likely contaminants. We find that this reduces the stellar fraction to 1.0%, increases the quasar fraction to 66.2%, and keeps the galaxy and unclassified fractions similar at 16.7% and 16.0%, respectively. Second, we increase the purity of the sample by removing roughly the faintest quarter of sources in W2, requiring W2 < 20.8. This cut yields 77.4% quasars, 1.1% stars, 9.5% galaxies, and 12.1% unclassified objects. Finally, we also remove the faintest quarter in r band, requiring r < 22.3. This ultimately yields 90.4% quasars, 2.0% stars, 3.6% galaxies, and 4.0% unclassified. While not strictly required, the reduction in galaxies is also helpful, since the galaxies may respond differently than the quasars to imaging systematics, and have a different bias evolution.
We repeat our measurements of the completeness, purity, and good redshift purity using the single-depth deep-field VI data for the high-purity quasar targets ( Table 3). The sample size is smaller, with (averaging over the five subsets) 660 quasars, 20 galaxies and 12 stars. The results are largely similar to the full quasar sample; while the galaxy purity drops to 63%, the good redshift purity remains high at 97%. Due to the small sample of galaxies, we additionally visually inspect 86 main DESI spectra of high-purity quasar targets which are not classified as stars or quasars. 44 of them have high-quality redshifts, and 41 are recovered by the galaxy classification of Sec. 3.2. This sample of galaxies shows more diversity than the main quasar targets, with only 31 VI galaxies ( Table 3. Same as Table 2, but for the high-purity sample. redshifts with DELTACHI2 > 30 only, 2 good redshifts with the [OII] flux cut only, and 2 good redshifts with the [OIII] signal-to-noise cut.Therefore, all three cuts are desirable to maximize the completeness of the pipeline-classified galaxies. The spectroscopic classifications show some variation across imaging regions, with the stellar fraction ranging from 1.6% in the North, to 2.1% in DECaLS N, and 2.7% in DECaLS S. The early DESI spectroscopy does not cover the DES region (Fig. 2). The fraction of unclassified sources also rises with the same trend, at 3.4% (4.0%, 4.6%) in North (DECaLS N, DECaLS S). As a result, the quasar fraction drops from north to south: 91.1% (90.2%, 88.9%) in North (DECaLS N, DECaLS S).
Finally, we use the visual-inspection results from the full-depth deep coadds to check our measurement of the sample purity. Of 721 sources with deep VI data, 94.7 ± 3.6% are quasars, 2.8 ± 0.62% are galaxies, 1.7 ± 0.48% are stars, and 0.9 ± 0.34% are unclassified (errors from Poisson distribution). This is perfectly consistent with the fractions measured from the first two months of DR1, but with a much smaller fraction of unclassified objects. This implies that most unclassified objects are quasars, and they are not biasing our estimate of the stellar fraction. Fig. 4 shows dN/dz derived from DESI spectroscopy of 194,473 quasar targets passing the "high purity" selection cuts. For the 4.0% unclassified sources, we use photometric redshifts from [94], which are specifically tailored for high-redshift galaxies and quasars. Fig. 5 shows how these photometric redshifts compare to the DESI spectroscopic redshifts of the quasar targets classified as quasars or galaxies, both in the redshift bias (median redshift difference) and normalized median absolute deviation, σ NMAD = 1.48 * median(|δz|/(1 + z spec )).

Redshift distribution
The comparison in Fig. 5 is very similar to Figs. 8 and 10 in Ref. [94]; they also find significant biases towards negative ∆z for faint objects at low redshift, and biases to positive ∆z for 2.5 < z < 4 quasars. We also show the same comparison for DESI redshifts that are classified as bad, showing that the bad DESI redshifts are less correlated with the photometric redshifts and are significantly bunched at known problem redshifts around z ∼ 0.5 and z > 1.6. Finally, we use the deep-field VI to study the relationship between Ref. [94] photometric redshift and true spectroscopic redshift from VI. The sample is quite small, only 64 spectra with good VI and good COADD_FIBERSTATUS. 16 of these pipeline-classified galaxies are classified as stars: this is expected from the known incompleteness of the stellar classification and the relatively high stellar fraction of the main sample. This fraction will be lower for the high-purity sample, which has a lower fraction of stars to begin. For the remaining galaxies, we plot the performance of the photometric redshifts in the right panel of Fig. 5, finding good performance at 1 < z spec < 2 where 73% of the bad redshifts lie, and similar degradations at high and low redshift as the overall sample. This justifies our choice to use the photometric redshifts of Ref. [94] for the 4.0% bad redshifts. Nevertheless, because of the small fraction of unclassified objects, the dN/dz from spectroscopic DESI redshifts alone is nearly the same as the dN/dz including the photometric redshifts. We also find that the redshift distribution is very similar across the different imaging regions. Moreover, we can use the VI data to make a noisy measurement of dN/dz (due to the small number of VI sources), which is also perfectly consistent with the dN/dz from the first two months of DR1.
The redshift distribution can be used to interpret the clustering measurement at an effective redshift [39,95] slightly different from the mean redshift. The high-purity sample has an effective redshift of 1.51 using the fiducial redshift distribution. DESI is a multi-pass survey, with quasars targeted at highest priority in the first pass. Therefore, we expect early (single-pass) DESI data to have a complete sampling of quasar redshifts. We verify that the quasar redshifts are complete by removing 5% (25%) of targets with the lowest spectroscopic signal-to-noise. The fraction of unclassified spectra barely changes, from 4.0% to 3.7%, and overall the quasar fraction increases from 90.4% to 90.5%. Similarly, the mean redshift stays constant at 1.621 when removing the lowest 5% (25%) in signal-to-noise. We conclude that the spectroscopic completeness of the early DESI data is sufficient for this work.

Imaging systematics weights
To obtain the cleanest measurement of large-scale modes, we also must remove contaminating power from the quasar maps. We use the linear regression method [96][97][98][99][100][101][102][103], as developed in the Regressis software package [79]. 5 This is the same method that was used for the full quasar target sample, but we must re-measure the weights since we have substantially modified the selection. We use the same set of 11 imaging maps as in [79]: Figure 5. Comparison between DESI redshifts and photometric redshifts from Ref. [94]. Left: Comparison between photometric and DESI spectroscopic redshifts for quasars and galaxies. The overplotted red points are objects without a good DESI redshift, which exhibit significant spurious structure in their DESI redshift distribution. Right: Median bias and normalized median absolute deviation (σ NMAD ) for the quasars and galaxies, and for a randomly shuffled version of the same sample. We also show the median bias for 48 VI targets classified as bad redshifts by our pipeline (not stars, galaxies, or quasars), and not identified as stars by VI, comparing the photometric redshifts to the true spectroscopic redshifts from deep VI (red points).
Planck 2018 lensing map Figure 6. Left: Planck 2018 CMB lensing map, with masked regions shown in gray, in Galactic coordinates. Right: Monte Carlo normalization correction that must be applied to the raw Planck lensing map. This is estimated from the 300 Planck lensing mocks, by dividing the autocorrelation of the true lensing map by the cross-correlation between the true map and the reconstructed map. The cross-correlation is measured separately for each of the four regions, and we find that the lowscale-dependence is different in different regions.
• Stellar density, as defined by the density of point sources from Gaia DR2 with 12 < PHOT_G_MEAN_MAG < 17 [104] • Extinction E(B-V), from [105] and [106] • Sagittarius Stream catalog from [107], with known quasars removed and cut to r > 18 • 5-σ point source depth in g, r, z, W 1, and W 2 magnitudes Region Imaging templates removed North STARDENS, EBV, PSFDEPTH_G, PSFDEPTH_R, PSFDEPTH_Z, PSFDEPTH_W2, PSFSIZE_Z DECaLS N STARDENS, EBV, PSFDEPTH_R, PSFSIZE_Z, SGR_STREAM, PSFSIZE_R DECaLS S EBV DES EBV, PSFDEPTH_W1, PSFDEPTH_W2 Table 4. Imaging templates removed from each region to minimize extra low-noise from angular systematics. We limit the number of imaging templates removed to prevent overfitting, which will decorrelate the quasar field from the true density field and remove power at low . A different set of templates is removed from each region; in particular, only E(B-V) is removed for DECaLS S.
• PSF size in g, r, z. Due to the 6" resolution of WISE, essentially all galaxies are point sources in W1 and W2.
We measure correlations between quasar target density and the systematics maps in HEALPixels of NSIDE=256 (Figs. 7 to 10). Regressis works directly with the counts and template maps in pixel space (rather than binning by systematic property as in Figs. 7 to 10), using Poisson errors on the quasar counts in each pixel as an inverse-noise weight.
Overfitting is a major concern for this analysis; if too many templates are used, or a very flexible regression model is employed, the weights may accidentally remove power from the true density field at low . We use mock catalogs to validate the regression procedure and templates used, as described in Section 5. We use linear regression (as implemented in the regressis package) and treat each of the four regions separately. The templates used in each region are listed in Table 4 and are justified with the contaminated mocks presented in Section 5.1. While these weights do a reasonable job of mitigating trends with systematics, some trends still remain-most notably in DECaLS South, where overfitting has the most severe impact and we are limited to using only a single template.
The weights are defined as the inverse of the relationship between the imaging systematics and the quasar target density field. Figs. 7 to 10 show that the weights reduce the relationship between quasar target density and imaging systematics.

Magnification bias slope
We measure the magnification bias slope s ≡ d log 10 N dm by perturbing the input Legacy Survey photometry by a uniform offset in all bands, re-running quasar target selection, and measuring the change in number density. We try both increasing and decreasing all magnitudes by 0.05, and find s = 0.2751 when we make the input sources fainter, and s = 0.2777 when we make the sources brighter. Given the consistency between these measurements, we use the mean value, s = 0.2764.
Since lensing preserves surface brightness, brightening or dimming is solely caused by a change in the object's size. Our procedure thus assumes that flux measurements capture all of the light from the quasar target. While this is not a good assumption for galaxies [108,109], it is reasonable for the quasars, which are dominated by point sources. The selection requires that the quasar targets are identified as point sources in DR9 imaging, and less than 4% of the sample are galaxies. Moreover, quasar target selection is entirely based on total magnitudes rather than fiber magnitudes, which are measured within a fixed angular aperture. It is therefore a very good approximation to treat lensing magnification as entirely brightening or dimming quasar flux.  This procedure measures the slope using the observed (post-magnification) magnitude distribution, rather than the true, pre-magnification distribution. Magnification will scatter quasars both above and below the flux limit; given that this is a small effect overall, the change in slope induced by magnification is tiny.
The slope varies slightly in the different imaging regions (which have slightly different selection functions), and we use the value of s appropriate for each region. For the North region, we find s = 0.2880, for DECaLS-NGC we find s = 0.2767, for DECaLS-SGC we find s = 0.2778, and for DES we find s = 0.2548.
Our value of s is quite consistent with previous measurements for quasars (s ∼ 0.3), either from the faint end of the luminosity function [110][111][112][113] or from similar methods where the selection is re-run after perturbing the fluxes by a small amount [114]. This agreement is expected based on the similarity between the quasar samples and the selection methods used, and provides a good validation of our method to measure the slope.

Stellar contamination
Approximately 2% of the DESI quasar targets are classified as stars. Provided that the stars are uncorrelated with CMB lensing, they add an additional source of uncorrelated noise and   therefore lower the measured power spectrum Therefore we write the scale dependent bias as In this expression, b is the true bias. The small-scale power spectrum measures b/(1 + f ); by fixing f we can therefore measure the true bias b. Likewise, 1 + f is included in the denominator of the scale-dependent bias. An error on f will change b, but will have a smaller effect on f NL since b − p is divided by 1 + f in the scale-dependent bias term. We use a fixed f of 2.6%, which corrects for the redshift pipeline's incompleteness in categorizing stars.

Planck CMB lensing map
We use the 2018 Planck CMB lensing map [115] and its associated masks and simulations, downloaded from the Planck Legacy Archive 6 (Fig. 6). We convert the provided spherical harmonic coefficients of convergence, κ m , into an NSIDE=2048 HEALPix map in Galactic  coordinates. We use the minimum variance (MV) reconstruction from both temperature and polarization, using the SMICA foreground-cleaned CMB map. This map does not have the Monte Carlo multiplicative correction applied (Eq. 10 in [115]). Following Ref. [116], we apply this correction by measuring on the 300 FFP10 simulations the ratio of the true lensing power spectra to the cross-correlation between the input and reconstructed lensing maps, masking both in each of our 4 regions (Fig. 6). We then multiply the measured cross power spectra by this factor (binned with ∆ = 5 and then linearly interpolated), which is generally small (∼5%) at high , but becomes large on large scales. At the precision of this analysis, biases due to CMB foregrounds are expected to be small [117][118][119]. To verify that we aren't affected by foregrounds, we also cross-correlate with a lensing reconstruction from the CMB temperature map with the thermal Sunyaev-Zel'dovich (tSZ) effect explicitly nulled (tSZ-free). In addition to testing contamination from the tSZ effect, the tSZ-free map will also be differently sensitive to other foregrounds, such as Galactic dust, which could be correlated with the quasar map.
At very low where f NL sensitivity is maximal, the mean-field correction, which must be obtained from simulations, is 100 to 1000 times larger than the CMB lensing signal (Fig. B1 in [115]). The Planck lensing cosmology analysis excludes < 8 due to concerns about the number and fidelity of the simulations used to derive the mean field. However, they note that these very low multipoles do not explicitly look anomalous, except for = 2. Unlike the auto-spectrum, where an underestimated mean field would directly bias the recovered power spectrum, in the cross-correlation it will only affect the covariance. Hence our analysis is less  sensitive to potential low issues, and we use min < 8 in our fiducial analysis.
We determine min by cross-correlating the Planck lensing map with the 11 quasar systematics templates. We measure the cross-correlation in bins of ∆ = 10 starting at min = 4 or 6. For each template, we subtracted its mean across the union of the four imaging masks. For the stellar density and Sagittarius Stream templates, we turned them into overdensity maps (since these are stellar counts or normalized stellar counts respectively). Errorbars are computing using the Gaussian diagonal approximation, which works well for this case. To avoid instabilities in the mask deconvolution, we measure the convolved power spectrum and divide by f sky = 0.45.
While we find no significant correlations between the Planck lensing map and the systematics templates, we do find marginal anti-correlations in the lowest bin (4 ≤ < 13) between the lensing map and PSFSIZE_G, PSFSIZE_R, PSFSIZE_Z, PSFDEPTH_W1, PSFDEPTH_W2, and STARDENS. These anti-correlations are generally significant at 2-2.5σ. We note that some of these maps are very similar (particularly G and R sizes, and W1 and W2 depths), so these are not independent measurements. As an example, we show the signal-to-noise of the cross-correlations with PSFSIZE_R and PSFSIZE_Z in Fig. 11.
We translate these measurements of C κ,syst into C κg using the slope dδg dsyst from Figs. 7 to 10. We use either the linear regression slope from Figs. 7 to 10 (after weights are applied), or the 95% upper bound in the case that the slope is not significantly detected. We find that the estimated bias to the cross-power spectrum can exceed half of the statistical error on C κg at 4 ≤ < 13 in some of the 4 regions. The most problematic templates are PSFSIZE_R and Figure 11. Top: Signal to noise ratio of the cross-correlation between the Planck lensing map and r-band PSF size (left) and z-band PSF size (right). An anti-correlation is marginally detected in the first bin (4 ≤ < 14) and is less significant when the bin is changed to 6 ≤ < 16. Bottom: Estimated impact on C κg from the systematic cross-correlation, using the "systematics-correction" slope from Figs. 7 to 10 to propagate from r or z PSF size to δ g . We use the slope from DES and these 2 systematics maps because they show the largest impact on the low bin, nearly 0.5σ. However, the impact on C κg is considerably reduced when we use min = 6 instead. PSFSIZE_Z in DES. Switching to min = 6 reduces the significance of the cross-correlation. In this case, none of the systematics correlate with Planck lensing at > 2σ in the first bin. Moreover, the impact on C κg is also reduced by 30-50%. Therefore, with min = 6, the upper bound on biases in C κg is < 0.3σ.
Because the cross-correlation covariance may be under-estimated at < 8, we present our results using both min = 6 and min = 8, which has a 15% larger f NL error. Figure 12. Bandpower window functions that must be multiplied by the theory curve to produce a binned theory prediction. Due to mask convolution and deconvolution, the binning is not a perfect tophat, but rather contains some contributions from multipoles outside the desired range.

Mask deconvolution
The presence of masked regions of the sky causes the measured power spectrum to differ from the theoretical expectation. The measured power spectrum is approximately scaled by f sky , and neighboring modes are coupled. On small scales, it is adequate to simply divide the measured power spectrum by f sky to account for the mask. However, at low , the mask also changes the shape of the power spectrum, and a more correct treatment is needed.
We use a pseudo-C estimator [120,121] with the measured power spectrum given bỹ The expectation value of the pseudo-C is related to the true C through the mode coupling matrix: We use the MASTER algorithm [120] as implemented in NaMaster [121] to deconvolve the mask from the windowed power spectra. While the unbinned mode coupling matrix is often singular, MASTER bins the power spectrum to make the (binned) mode coupling matrix invertible. Then the binned, deconvolved bandpowers are estimated by multiplying the measured bandpowers by the inverse of the binned mode-coupling matrix: To account for the binning and unbinning, a bandpower window matrix must be applied to the theory curve We bin the quasars into NSIDE=2048 HEALPixels in Galactic coordinates [86]. We use bins of ∆ = 2 until = 20, ∆ = 5 until = 100, and ∆ = 20 thereafter. The resulting bandpower window functions are shown in Fig. 12. Since we are concerned with very low , we do not correct for the HEALPix window function, which is nearly one over the entire range of scales considered. We use separate masks for the CMB lensing and galaxy fields. To create the CMB lensing mask, we multiply the binary Planck lensing mask by the binary region masks, and then apodize each region's lensing mask by smoothing with a 1 • FWHM Gaussian. The product of the two masks yields an f sky = 0.3364.
We verify that our pipeline can correctly recover the angular power spectrum in the presence of a mask using 100 Poisson-sampled Gaussian simulations. We begin by creating correlated Gaussian quasar and CMB lensing fields. We then multiply each pixel of the mock quasar map by the imaging completeness mask. Next we Poisson sample the resulting field (after setting any negative pixels to zero), and then divide by the imaging completeness map to account for the partial coverage of each pixel. From this map, we compute δ within the four imaging regions, apply the binary region masks, and measure C κg in the four regions. As in the data, we do not consider any pixel with imaging completeness < 80%.
We find that the average C κg from these 100 simulations is statistically identical to the bandpower-convolved input theory spectrum (Fig. 13). The scatter in Fig. 13 is consistent with Gaussian noise: we find χ 2 = 69.1, 50.1, 62.6, and 75.3 over 60 bins from 0 < < 300 for the North, DECaLS N, DECaLS S, and DES regions, and with no significant outliers. The mean ratio is 0.9955 ± 0.0012, 1.0029 ± 0.0013, 1.0048 ± 0.0025, 0.9912 ± 0.0040 at < 300. While the deviation in the North is significant at 3.75σ, and the means of the other regions are ∼ 2σ away from one, these deviations are tiny compared to the statistical errors on the bandpowers, so we do not apply any correction after multiplying by the bandpower window matrix.

Covariance matrix
If there is no mask and the underlying field is Gaussian, the covariance matrix for the binned bandpowers is given by In the cut-sky case, if the bandpowers are sufficiently broad that the off-diagonal terms are small, the covariance can still be well-approximated as diagonal, but with the number of modes in each bin reduced by f sky ("Knox formula") [120,122]: However, our measurement is not in this regime, due to the small width of the bins used and the large correlation between neighboring bins, particularly at low . We therefore primarily rely on a mock-based covariance and check it with a Gaussian analytic covariance from Na-Master [123]. The mock-based approach has the additional advantage of correctly accounting for the excess noise arising from systematic power in C gg at low .

Excess low-power in quasar autocorrelation
The quasar auto-correlation disagrees significantly with theoretical expectations at low (Fig. 15). Due to strong excess power at low , mask deconvolution in NaMaster yields negative binned bandpowers at low . To prevent this issue, we instead plot the convolved bandpowers, averaged over bins of width ∆ = 10 and starting at min = 2, divided by f sky for each sample. Modifying the quasar sample to remove likely stars and faint quasar targets significantly reduces this excess power (blue to red line in Fig. 15). Applying systematics weights also reduces the low-excess power in all samples except DES (dashed lines). As discussed in Section 3.1, for DES we must also remove the region below δ = −30 • to reduce the large-scale power. Nevertheless, there is still significant excess power at low-even after applying the systematics weights. This is possibly due to the clustering of the residual stellar contaminants, f 2 star C . Measuring the stellar power spectrum using Gaia stars with 12 < G < 17 and assuming f star = 0.026, f 2 star C exceeds the C gg theory curve at < 20. If the stars' only impact is extra power, it can be removed by linear regression; however, stars may also reduce the density of nearby quasars by occulting them or raising the effective sky background. Disentangling these competing effects can be difficult with linear regression. We try to mitigate this effect by first subtracting an estimate for the stellar contamination, f starnqso,tot /n N , before applying the systematics weights (wheren qso,tot is the total observed density of quasar targets, including stars). However, this does not reduce the excess power in the unweighted low-power spectrum (similar to the findings of [79]). This perhaps suggests that a different stellar template is neeeded, since the stellar contaminants occupy a very particular region in color space that may have a different spatial distribution from all stars. Using only samples of spectroscopically confirmed quasars is another option, which will soon be available when early DESI data covers a sufficiently large fraction of the sky.
Given the significant excess power at low , we conclude that the DESI quasar targeting data is not sufficiently clean to use the low-autocorrelation. This is similar to previous work constraining f NL from quasar samples [18,20], who concluded that the quasar samples should only be used in cross-correlation, where correlated systematics are much less likely (though see [21] for an attempt to fully remove clustering systematics in the quasar autocorrelation).

Mock-based covariance
To estimate the covariance (shown in Fig. 14), we follow Ref. [124] and cross-correlate CMB lensing maps from 300 realistic Planck mocks (after applying the mean-field correction) with the galaxy maps from the data. Despite lacking correlations between the CMB lensing field and the galaxies, this approach correctly accounts for the excess noise arising from systematic power in the quasars at low . It leads to two sources of bias compared to the correct covariance matrix. First, because the galaxy and CMB lensing fields are not correlated, the measured covariance does not contain the (C κg ) 2 term, the "correlation bias." However, the covariance is dominated by the product C gg N κκ (> 5× larger than (C κg ) 2 ), so the correlation bias is < 5% of the estimated covariance. Second, there is a "realization bias" arising from the fact that we use only one realization of the galaxy field rather than many. In Ref. [124], the realization bias was argued to be small, but this argument relies on the fact that their broad bins average over many modes. This is not the case for our low bins, which are crucial in constraining f NL . We estimate the realization bias from Gaussian mocks and apply the correction factor to the covariance matrix.
To estimate the realization bias, we begin by generating 300 Gaussian mock skies with correlated κ and galaxy fields. We compute the Gaussian covariance C full from these 300 mocks. We then generate a further single uncorrelated galaxy field, measure its crosscorrelation with the 300 κ mocks, and compute the single-realization covariance C i . The realization bias is then the ratio of the matrices C full to C i . These matrices are diagonal dominated but still have significant off-diagonal structure. If the matrices were diagonal, the realization bias would be simply defined as the ratio of their diagonal elements. However, their off-diagonal elements may be different as well.
We enforce that the single-realization covariance matrix derived from the Planck simulations has the same eigenvectors as the covariance matrix from the 300 Gaussian simulations. We define the eigenvectors Q full and eigenvalues Λ full We then rotate C data and C i into the basis defined by Q full We scale Λ data, full by the ratio of Λ full /Λ i,full and neglect its off-diagonal elements We then rotate the basis back to find C data, rb We test this method on simulations and find that it adds negligible variance (≤ 5%) compared to the statistical error. We show the realization bias in the right panel of Fig. 14 Table 5. Fisher forecasts and constraints from noiseless mocks. Fisher forecasts are run for three cases: using the theory curve for C gg in Eq. 4.7; adding realistic low-noise, partially mitigated by systematics weights (dashed red line in Fig. 15); and adding low-noise without any systematics mitigation (solid red line in Fig. 15). For the noiseless mocks, we use both the fiducial mock-based covariance, requiring a multivariate t-distribution likelihood to account for uncertainties in the covariance, and an analytic Gaussian covariance. We also show the impact of applying a correction factor for residual overfitting, as measured from C κg of 100 contaminated mocks (Fig. 19).

Contaminated mocks
Given the strong contamination in the quasar catalogs, systematics weights are crucial in reducing low-noise in the CMB lensing cross-correlation and thus tightening f NL constraints (Fig. 15). However, over-subtraction of systematics templates can decorrelate the quasar map with the true density field at low , removing precisely the cosmological information that we need to constrain f NL . We therefore create a set of contaminated mocks with similar systematics trends and contamination power as the data, but with a correlated CMB lensing field. We generate systematics-correcting weights and measure the cross-correlation of the systematics-corrected, contaminated mocks with the mock CMB lensing maps, to ensure that we can recover an unbiased cross-correlation on all scales. We use this test to determine whether to use a linear or nonlinear (Random Forest) systematics correction, and to determine the number of imaging templates that we can remove before incurring a bias in C κg on large scales. We find that Random Forest generically removes real power at low even with a very small number of templates, and therefore use linear regression instead. These tests justify our choices of imaging templates to remove in Table 4.
We create contaminated mocks by generating Poisson-sampled Gaussian mocks with number density 10 times higher than in data. We down-sample the mocks so that the average overdensity in any NSIDE=256 pixel is proportional to the inverse of the imaging systematic weight. This yields mocks with the same imaging systematics trends as the data, while also preserving the correlated fluctations from the mock density field (Fig. 16 and 17).
The weights w i used to contaminate the mocks are produced using the Random Forest (RF) method with all 11 templates, and are thus different from the weights that we use for the f NL analysis on data (from linear regression with templates given in Table 4). The random forest weights are useful for this test because they ensure fully realistic contaminated mocks, and also allow for the existence of extra systematics in the contaminated mocks beyond the templates that we remove. This mimics the situation in the real data, where we don't know the correct imaging templates to remove. Fig. 18 validates the contaminated mocks: the auto-power spectrum C gg qualitatively matches the data power spectrum both before and after applying weights.
We then process each contaminated mock in exactly the same way as the data: linearly regressing against the contaminant templates in Table 4, applying the resulting systematic weights, and measuring the cross-spectrum with the correlated CMB lensing field.
In Fig. 19, we show the ratio between the mean of 100 contaminated mocks with systematic weights applied, and the mean of 100 uncontaminated mocks. We compare a variety of methods with linear or Random Forest regression, and a varying number of templates used. We use jackknife resampling to derive the errorbars shown in the figure. This figure justifies our choice of regression method and imaging templates in Table 4.
The solid red lines in Fig. 19 show that Random Forest is unsuitable for this work because it leads to a significant drop in C κg at < 50 measured after systematics correction. To test if this result is sensitive to the input weights used to generate the contaminated mock, we tried using neural net input weights rather than random forest, and found very similar results (black solid line). To mitigate overfitting, we first tried Random Forest with a subset of the templates, removing PSFSIZE_G, PSFSIZE_R, SGR_STREAM, and PSFDEPTH_W1. The first three templates have the least-significant correlations with the density field, and we remove PSFDEPTH_W1 because it is very degenerate with PSFDEPTH_W2. However, Random Forest with the smaller set of 7 templates still does not recover C κg well on large scales (dashed red lines).
A linear regression with all 11 templates also leads to overfitting, but less significantly (dashed blue lines). To determine which templates to use in the regression, we used a Regressis output called permutation importance. Permutation importance is defined for each feature under consideration and measures how significantly that feature contributes to the fit. It randomly permutes the elements of that feature, measuring the change in goodness of fit (i.e. χ 2 ), and averages this over many random realizations. Therefore, it measures how much the fit is driven by each parameter.
We started by removing PSFSIZE_G, PSFSIZE_R, SGR_STREAM, and PSFDEPTH_W1, which have both low correlation with the contaminated density field and low permutation importance. With the restricted set of 7 templates and linear regression, we found adequate recovery of C κg in the North region, but overfitting in the other three regions.
In the other three regions, we removed the templates with the lowest permutation importance. We also needed to add back SGR_STREAM and PSFSIZE_R for DECaLS N, and PSFDEPTH_W1 for DES, which had relatively high permutation importance. Indeed, PSFDEPTH_W1 had the largest permutation importance of any template in the DES region. We ultimately settled on the template set in Table 4 (solid blue lines in Fig. 19), which struck the right balance between removing the most important and correlated features (and thus removing low-power in C gg ); and avoiding overfitting from too many templates or too complicated a regression method.
With this template set, Fig. 19 shows we can correctly recover C κg at all multipoles. Over 60 bins at < 300, we find χ 2 = 54.8, 61.7, 63.9, and 63.3 for North, DECaLS N, DECaLS S, and DES, respectively. We find unbiased recovery of C κg , with the mean of ratio 0.9959 ± 0.0024, 0.9970 ± 0.0018, 0.9924 ± 0.0054, and 0.9898 ± 0.0080. Fig. 19 shows that we can accurately recover C κg for contaminated mocks with f NL = 50 (dotted lines). Indeed, the ratio between the simulated and recovered C κg is very similar between the f NL = 50 mocks and the default f NL = 0 mocks. Fig. 19 also shows that we can accurately recover the cross-correlation for the z phot > 2 sample, with a slightly different set of systematics templates as described in Appendix A.  Table 6. f NL measurements from the contaminated mock C κg in Fig. 19. For each true value of f NL , the left columns give the mean and standard deviation of the median marginalized f NL from each of the 100 mocks. The right column gives the median marginalized f NL from a fit to the mean of the 100 mocks, with the covariance scaled down by a factor of 100. The Fisher column is a Fisher forecast for the f NL error, using the setup of the contaminated mocks (no lensing noise and 10 times higher number density than the data).
We propagate these C κg measurements on contaminated mocks to f NL constraints in Fig. 20. For each mock, we measure f NL using the analytic covariance matrix for each region and flat priors on f NL between -500 and 1000 and on b 0 between 0.5 and 2. We measure both the distribution of median marginalized f NL and b 0 from the 100 mocks individually, and f NL from the mean of the 100 mocks, with the covariance scaled down by a factor of 100. Fig. 20 shows the distribution of the marginalized median f NL from all mocks, and Table 6 gives key summary statistics. As the f NL error gets large (as for DES and DeCALS S), we find that some simulations can scatter to very high values of f NL , biasing the mean recovered f NL and even the median. However, if we fit f NL to the average power spectrum from the 100 simulations and scale down the covariance, the long tails in the PDF are suppressed and the fitted value is much closer to the input value.
We find that we can successfully recover the input value of f NL for both f NL = 0 and f NL = 50 mocks, with biases ∆f NL 10. This is acceptable given our statistical error bars of σ f NL = 40. These mocks are created with less noise than the data, to allow us to use fewer mocks to test the C κg pipeline. Therefore, the standard deviations of the median f NL are somewhat smaller than the data errorbars in Table 7 or the Fisher forecasts in Table 5. To check the simulations, we determine the Fisher errorbars for the mock setups: no CMB noise and 10 times higher number density than in the data. These agree reasonably well with the f NL standard deviation in the contaminated mocks for f NL = 0. As expected, f NL = 50 has a higher standard deviation because the true C gg and C κg are larger, increasing the covariance. Finally, we find that the recovered b 0 matches the input extremely well, with deviations of 2%.

Noiseless mock tests
We estimate an error budget for our measurement by comparing our constraints to a Fisher forecast. In the Fisher forecast, we use the Knox formula (equation 4.7) for the covariance, set min = 6, and test both including and excluding N syst in the C gg term. Table 5 shows constraints on f NL using noiseless mocks with the same covariance as the data. In this case, the "noiseless mock" is the f NL = 0 theory curve multiplied by the bandpower window functions of Fig. 12. The fiducial covariance matrix is measured from 300 Planck mocks, with the realization bias correction as described in Section 4.2. To account for uncertainties in the covariance matrix due to the finite number of mocks, we use the multivariate t distribution rather than a Gaussian distribution [125]. Throughout this work, we quote the median of the marginalized posterior and distance to the 16th and 84th percentiles as the upper and lower errorbars.
We find good agreement between the Fisher errorbar (including the additional lowsystematic power, i.e. C gg from the dashed red line in Fig. 15) and the errorbar from the noiseless mocks (σ f NL = 47 vs. 52). The constraining power is significantly degraded by the extra noise, with σ f NL increasing by 50%. On the other hand, if we didn't apply any systematics weights at all, σ f NL would be 60% higher (σ f NL = 65). To test the covariance matrix, we replace the mock-based covariance with an analytic covariance from NaMaster (switching back to a Gaussian likelihood); we find very similar f NL constraints. Furthermore, we find that if we do not apply the matrix rotation of Section 4.2 and instead use the "raw" mock-based covariance, the combined f NL constraint changes from −7 +55 −48 to −5 +53 −46 , so this rotation has a very small impact on the constraints. We also test the simpler diagonal Knox formula for the covariance (Eq. 4.7), which underestimates the f NL errorbar by 10%, due to the strong correlations between the narrow low-bins (Fig. 14). Therefore, we use the Planck mocks for our fiducial covariance matrix.
Finally, we test the impact of potential biases in C κg recovery from linear regression of the systematics templates. While we do not detect a significant difference between the input and measured C κg after applying the systematics weights (Fig. 19), it is possible that the contaminated C κg does have a bias that is smaller than the errorbars. Therefore, we apply a correction factor to the noiseless mock data vector, defined as the ratio between the input and output C κg in Fig. 19. This leads to almost no difference on the f NL constraint, with only a 1% increase in the errorbar. We therefore conclude that any residual overfitting does not affect our measurement.

Measurement on data
The measured C κg is shown in Fig. 21 and compared to a theory curve with f NL fixed to zero. We detect the cross-correlation with total signal-to-noise (S/N) 26.2 at < 300 combining all 4 regions. Signal-to-noise is defined as the difference between the χ 2 with respect to no detection, and the best-fit In the individual regions, the cross-correlation is detected at S/N 14.3, 15.9, 13.2, and 7.5 for North, DECaLS N, DECaLS S, and DES, respectively. Table 7 describes the marginalized f NL and linear bias constraints from data. All bias constraints are presented in terms of the scaling amplitude b 0 defined in Eq. 2.7, and the bestfit measurement of b 0 = 1.09 corresponds to b(z eff = 1.51) = 2.38. We find f NL = −26 +45 The posterior on f NL is notably asymmetric (Fig. 22). The asymmetry is worse in the individual regions, where the constraints are poorer. For the combined constraint, a Gaussian provides a poor fit to the likelihood, and it is best to use the full PDF rather than a summary statistic.
We perform a number of systematics tests on the data, and describe the results in Tables 7 and 8. Table 7 first shows tests in which we use the fiducial data vector but change the likelihood or theory, verifying that the results are robust to changing from the mock-based to analytic covariance. If we use p = 1 instead of p = 1.6, the constraints become tighter (σ f NL = 43 to 28) since the response to f NL is stronger. Likewise, using σ 8 = 0.77 [e.g. [38][39][40][126][127][128][129][130] raises the linear bias and therefore also tightens constraints on f NL to −25 +39 −35 . On the other hand, using an alternative bias evolution which better fits the high-redshift quasar clustering (Appendix A) lowers the high-redshift bias and therefore weakens the f NL constraints, f NL = −54 +51 −45 . Varying the redshift distribution within the options plotted in Fig. 4 yields tiny changes in the median marginalized f NL . Changing to the "spectroscopic only" dN/dz changes f NL by ∼ 0.1. Using the dN/dz measured in North or DECaLS S changes f NL by ∼ 4, whereas using dN/dz from DECaLS N changes dN/dz by ∼ 0.2. The largest change is from using the VI redshift distribution, which shifts f NL by ∼ 9. However, most of this shift is due to the larger noise in the VI dN/dz, which has nearly two orders of magnitude fewer quasars. Therefore, uncertainty in the redshift distribution has an impact of ∆f NL 5.
The main cosmological parameter that matters for the f NL inference is σ 8 , since it controls the amplitude of the power spectrum (and hence the bias). The other parameter that affects the amplitude is the neutrino mass, which changes the relationship between large and small scales by creating a step in the power spectrum at k ∼ 0.1 h Mpc −1 . This affects our f NL measurement, which relies on measuring the bias on k ∼ 0.1 h Mpc −1 scales to calibrate the theory prediction for f NL on large scales. Hence, changing the neutrino mass from its fiducial minimum value of 0.06 eV modestly shifts f NL . Fixing m ν to 0.20 eV (approximately the current upper limit [47]) causes f NL to drop by ∼ 7. Hence future high-precision f NL measurements from C κg (with σ f NL < 10) should likely marginalize over m ν as well. The impact of varying the other five parameters within the Planck uncertainties are tiny, and hence we fix them to their best-fit values. Table 8 shows tests in which we change the data vector: removing the Galactic m = 0 mode (−19 +47 −41 ), using min = 8 (−8 +52 −45 ), and using the tSZ-free lensing map (−25 +69 −57 ). If we don't apply the systematics weights to the quasars, we obtain a statistically compatible but weaker constraint, −5 +64 −56 , showing that the weights mainly affect the covariance by reducing low-noise (see Fig. 21 to compare C κg measured with and without the quasar weights). Interestingly, f NL shifts away from zero for three of the four samples both when turning off systematics weights and when using the tSZ-free lensing map, suggesting that both combinations may be affected by residual systematics.
There are no significant differences in marginalized f NL constraints between any of the regions. All except DES have acceptable χ 2 ; for DES, the χ 2 is slightly high (p = 0.004). If we remove DES altogether, we find f NL = −49 +46 −41 . Alternatively, we find acceptable fits for DES when using either the tSZ-free lensing map or when removing the m = 0 Galactic mode. If we replace the fiducial DES cross-correlation with the tSZ-free one, we find f NL = −30 +46 −40 ; and if we instead use the m = 0 DES cross-correlation, we find f NL = −27 +45 −40 . Hence, our fiducial results are not driven by the DES region.
The overall errors are similar to the Fisher forecasts in Table 5. The individual regions show some differences; this is because the best-fit b 0 differs from one, which can substantially change ∆b and thus sensitivity to f NL .
We fix the magnification bias slope s in addition to fixing the cosmological parameters. The formal statistical error on s is extremely small due to the large number of sources used to measure the slope. Hence the uncertainty on s is dominated by systematic errors. The most rigorous way to determine s is by synthetic source injection [108,131]. Ref. [108] finds a typical systematic error of ∆s ∼ 0.1 − 0.2 between source injection and the simpler flux modification method that we use. This can propagate to a fairly large change in f NL (though still subdominant to our statistical errors): ∆s = 0.1 yields ∆ f NL ∼ −15 (anti-correlated with s). This correlation will asymmetrically broaden the f NL constraint, as it has a much larger impact (|∆ f NL | ∼ 15-20) on f NL ∼ 0 and negative f NL than on f NL ∼ 100 (|∆ f NL | ∼ 5).  Table 7.
Marginalized constraints on f NL and linear bias scaling amplitude b 0 using the fiducial Planck lensing map. In this table, the underlying data is the same, but we change aspects of the likelihood or theory such as the covariance, the assumed value of p, the amplitude of the fiducial cosmology, and the bias evolution. Each region has 21 degrees of freedom.
However, this systematic error is dominated by size selection effects (i.e. lensing increases size while keeping surface brightness constant, so galaxy magnitudes measured in a fixed aperture do not increase by the full amount of magnification). Because the quasar sample is restricted to point sources, these effects are much smaller here and thus the typical systematic error is likely smaller than ∆s = 0.1. In any case, the effect on f NL is much smaller than our statistical errors. However, validating the magnification bias slope measurement via sourceinjection simulations (and further accounting for the flux dependence of the redshift success rate, for spectroscopic surveys) will be necessary for future measurements with σ f NL < 15.  Table 8. Marginalized constraints on f NL and linear bias scaling amplitude b 0 . In this table, we show various systematics tests with different data vectors. Each region has 21 degrees of freedom, except for min = 8, where each region has 20 degrees of freedom.

Discussion
We use the cross-correlation between Planck 2018 CMB lensing and DESI quasar targets at z eff = 1.51 to constrain f NL = −26 +45 −40 with the fiducial p = 1.6 (or f NL = −18 +29

−27
with p = 1.0). Strong systematics are present in the quasar autocorrelation, exceeding the clustering signal by ∼ 3-5× at = 10 even after applying systematic weights (Fig. 18). 7 Since this excess noise is uncorrelated with CMB lensing, we obtain an unbiased cross-correlation measurement down to = 6 in spite of the strong contamination. This result highlights the power of cross-correlations to allow cosmological measurements even from galaxy samples 7 We note that this is nevertheless an improvement over the DR8 quasar autocorrelation measurement from Fig. 19 in [132].
with very noisy large-scale clustering. At low where systematics are unavoidable, crosscorrelations thus offer an alternative to precisely modelling and removing the contaminant power [133].
Our f NL constraint is generally weaker than other LSS constraints from the galaxy auto-power spectrum and bispectrum. The strongest constraints currently come from the 3D power spectrum of eBOSS quasars at 0.8 < z < 2.2, with f NL = −12 ± 21 [12]. A previous result using early eBOSS data found −81 ≤ f NL ≤ 26 at 95% confidence interval [13] (both results fix p = 1.6). Analyses of the BOSS galaxy power spectrum and bispectrum using the effective field theory of large scale structure find f NL = −33 ± 28 [14] and f NL = −30 ± 29 [15]. The constraining power is dominated by the power spectrum, and adding the bispectrum improves the constraint by ∼ 20%. These are significant improvements over early f NL constraints from BOSS DR9 (σ f NL = 60) [134], with a larger sample of galaxies and more robust modelling to smaller scales. While the quasars generally offer slightly better constraints on f NL , comparing results is somewhat tricky due to different assumptions on the b 1 − b φ relationship (or alternatively, assumptions about p in Eq. 2.3: the quasar papers generally use p = 1.6, whereas Ref. [15] use p = 1 and Ref. [14] use p = 0.55 for BOSS galaxies). Further, allowing b φ to vary freely can dramatically degrade the constraints [51].
Analyses of the angular clustering of photometric samples have generally also found competitive f NL constraints, with σ f NL ∼ 20.
Our constraint, with σ f NL ∼ 43, has comparable constraining power to the recent results of Ref. [23], who find σ f N L ∼ 41 from CIB-CMB lensing cross-correlations. We have a 50% smaller errorbar than LSS-CMB lensing and ISW constraints from Ref. [24]. This is primarily due to the lower noise in the Planck 2018 CMB lensing map that we use, compared to Planck 2013 in [24].
Future cross-correlation measurements with a cleaner quasar sample can significantly improve the constraining power on f NL . If the "high purity" DESI quasar target sample had no excess low-power, we would find σ f NL = 26 (and a further improvement to σ f NL = 24 if we could use the entire DES footprint). A fully spectroscopic quasar sample would allow us to remove the residual contamination from stars and unclassified spectra. Moreover, the cleaner spectroscopic sample may allow us to use the higher-number density "main" sample, which would further reduce σ f NL ∼ 15. A cleaner quasar sample will also allow f NL constraints from the autocorrelation, likely with more statistical power than the cross-correlation (depending on the performance of the low-systematics removal). However, we emphasize that while systematics add power to the autocorrelation, they only add noise to the cross-correlation. Interpreting the autocorrelation measurement will always require some level of systematics subtraction, and the cross-correlation measurement is more robust to unmodelled systematics.
Redshift binning is another clear area for improvement. Already, our broad redshift bin combines quasars with positive and negative f NL response (b(z) − 1.6), partially cancelling its constraining power. In Appendix A, we explore whether we can improve the f NL constraint by constructing a second redshift bin that excludes the z < 1 tail with negative f NL response.
The high-redshift bin is 30% more constraining on f NL in a Fisher forecast. However, in data we find the bias of the sample is 30% below expectation, dramatically reducing b(z) − p and thus yielding a weaker f NL constraint than the main sample. The reason for this discrepancy is unknown, potentially pointing to an issue with the photometric redshift bins, which can be resolved with spectroscopic redshifts for all DESI quasar targets. Moreover, spectroscopic redshifts will allow for far more granular redshift bins; the resolution of the photometric redshifts is poor enough that the z phot > 2 split in Appendix A is essentially the best we can do. This will allow for further improvements in the f NL constraints from CMB lensing cross-correlations.
Finally, considerably better CMB lensing data is expected in the near future. Already, the Planck PR4 lensing maps have 10-20% less noise than Planck 2018 [135], as well as better mean-field estimation and improved systematics control at low . Further in the future, the Simons Observatory is forecast to detect CMB lensing at S/N ∼ 140, or twice the signalto-noise of Planck [136]. SO's CMB lensing reconstruction noise is low enough to allow for sample variance cancellation between the galaxy and CMB lensing fields [29], leading to a forecasted σ f NL ∼ 2.

Conclusions
We detect the cross-correlation between Planck 2018 CMB lensing and DESI quasar targets at 26σ over the range 6 < < 300. Rather than using the DESI main quasar target sample (with density 320 deg −2 ), we remove faint quasar targets and potential stars to create a "high purity" quasar sample (density 160 deg −2 ). This sample has much higher completeness, with only 2% stars and 4% redshift failures, compared to 6% (7%) stars (redshift failures) in the main sample. Nevertheless, even after applying weights to mitigate correlations with imaging systematics templates, the quasar autocorrelation has considerable excess power at low , rendering any cosmological interpretation impossible.
Despite the excess noise in the quasar auto-correlation, the quasar-CMB lensing crosscorrelation shows no evidence of contamination. The systematics tests pass, and we constrain f NL = −26 +45 −40 . CMB lensing cross-correlation therefore allows us to extract cosmological information out of this sample, despite the highly contaminated autocorrelation.
When regressing the imaging systematics templates, overfitting can decorrelate the quasar and CMB lensing fields, leading to a dramatically lower cross-correlation at low . Contaminated mocks are critical to verify that we correctly recover the low cross-correlation, and we use them to determine the maximum number of systematics templates that we can regress against. We find that regression can reduce the cross-correlation while still recovering approximately the correct auto-correlation on all scales. These results emphasize that large-scale cross-correlations are especially sensitive to overfitting when mitigating imaging systematics.
We present the first cosmological analysis of the clustering of DESI quasar targets on large scales, complementing the small-scale clustering measurement of Ref. [79]. By lowering the excess systematic power at low and allowing for finer redshift binning, the full spectroscopic quasar sample will allow for improved constraints on f NL from CMB lensing cross-correlations.

Data Availability
All data shown in the paper (including angular spectra, covariances, and a gridded likelihood) is publicly available at this site: https://doi.org/10.5281/zenodo.7921905.    Table 4; dashed lines show weighted power spectra and solid lines show unweighted. Shot noise equal to 1/n is subtracted from all spectra to allow comparison between the high purity sample with half the density of the main sample. The theory curve is computed using the fiducial cosmology and the bias evolution of Ref. [56] (Eq. 2.7), with b 0 = 1. For DES, we show both the full DES region (4162 deg 2 ; solid and dashed lines); and the fiducial δ > −30 • region, which has significantly less excess power on large scales (dotted red line). Both high purity and main have excess power at low relative to the theoretical expectation, even with systematics weights applied. However, the excess power is smaller for High Purity than Main, and is further reduced after applying systematics weights. The covariance on C κg is directly proportional to C gg , including extra noise (Eq. 4.7). Thus, to achieve the tightest constraints on f NL , we must reduce the additional low-contamination power as much as possible.  Figure 16. Relationship between imaging systematics and contaminated mock density for the North region. Both the data (blue) and contaminated mocks (red) have similar trends with imaging systematics. After multiplying by the systematics weights, the trends are similarly mitigated in both data (green) and mocks (orange). Figure 17. Comparison between the data (left) and the contaminated mocks (right), before imaging systematic weights are applied. The contaminated mocks have a slightly lower density than the data (155 deg −2 versus 160 deg −2 ), since the systematic weights used to generate them have a mean of 1.042 (weighted by the fractional coverage of each pixel), not 1. Therefore, for this figure, we multiply the mock quasar density by 1.042 to match the overall density.    Figure 19. Like Fig. 13, but using 100 contaminated Poisson-Gaussian mocks instead. A separate regression is run for each contaminated mock to remove imaging systematics. The comparison C κg is the true input cross-power spectrum. The error bars are computed using jackknife resampling. The left panels show the result both using the fiducial linear regression on the restricted set of templates from Table 4 (solid blue) and the Random Forest regression with all 11 templates (solid red). We also show that the input power is recovered well for a nonzero f NL (dotted blue), and for the high-redshift sample (dot-dash blue), which uses a slightly different template set (Appendix A). For comparison, we also show the result for all 11 templates with linear regression (blue dashed); all 11 templates with Random Forest but with input weights generated with a neural net (black solid); and 7 templates with random forest (red dashed). The center and right panels show the residuals from the linear regression, which agree well with a normal distribution (red).    , the forecasted σ f NL constraint for the low-redshift sample is 37 without noise, and 88 without systematic weights. In constrast, the higher-redshift sample offers better constraints despite the lower number density, with σ f NL of 21 and 41, respectively. The combination of the two samples (correctly accounting for their covariance) offers little improvement over the z phot > 2 sample alone, with σ f NL = 18 and 38 for the noiseless and unweighted cases, respectively. However the z phot > 2 sample alone offers a 25-35% increase in constraining power on f NL compared to the single broad redshift bin. Motivated by this forecast, we also measured the cross-correlation between the highredshift quasar sample and Planck CMB lensing. Due to the lower number density, we linearly regressed systematics templates at resolution NSIDE=128 rather than 256. We also found that slightly different systematics templates were required for the high-redshift sample. We used E(B-V) alone for North and DECaLS S; E(B-V), stellar density, Sagittarius Stream, and PSFDEPTH in both W1 and W2 for DECaLS N; and E(B-V) and W1 and W2 PSFDEPTH for DES. We verified that this was a sufficiently limited template set to allow us to correctly recover the cross-correlation from mocks. Using these templates led to no loss of power on large scales in the cross-correlation. The stellar fraction is also slightly higher, 3.2% versus 2.0% for the full sample, and the magnification bias slope s is slightly different, s = 0.3384, 0.3455, 0.3481, 0.3162 for North, DECaLS N, DECaLS S, DES, respectively. We also find an anomalous anti-correlation between quasars and CMB lensing in the lowest-bin in DECaLS N ( = 4 and 5), significant at 3.3σ. This may be due to the anomalous (and presumably foreground-affected) CMB lensing quadrupole ( = 2) correlating with systematics in the quasars, and this multipole enters the = 4 to 5 bin due to the mask-induced mode coupling. We therefore set min = 6 for DECaLS N.
The quasar-CMB lensing cross-correlation is detected at 14.4σ across all imaging regions and at 0 < < 300, and the bandpowers are shown in Fig. 24. Despite the optimistic Fisher forecasts, the f NL constraints are much weaker for the high-redshift sample than the full sample, with f NL = 124 +283 −129 . This is due to the fact that the clustering amplitude is much lower than expected: b 0 = 0.63 ± 0.07, yielding a much lower value of b(z) − 1.6 and thus poor f NL sensitivity.
Defining the effective bias following Refs. [39,95] b eff = dz 1 the b 0 constraint yields an effective bias of 2.26±0.23 at an effective redshift of 2.07, compared to the predicted b(z = 2.07) = 3.33 in Ref. [56]. The poor constraining power on f NL also leads to a highly non-Gaussian and asymmetric posterior. We repeat the systematics tests in Tables 7 and 8 and do not find any significant deviations. We do find ∼ 2σ discrepancies in the bias measured between different imaging regions, as well as a ∼ 2σ deficit in power between < 200 and > 200 in the combination of DECaLS S, DECaLS N, and North. However, given the large number of tests performed, some 2σ deviations are expected. The stellar fraction is slightly higher for the high-redshift sample than the high-purity sample, but fluctuations in stellar fraction are unlikely to explain the > 30% drop in clustering amplitude. We find that the stellar fraction ranges from 2.6% to 4.3% between the different imaging regions (following the same trend as the high-purity sample, decreasing from North to DECaLS N to DECaLS S). Moreover, the fitted bias does not significantly vary between the fiducial measurements and the imaging regions restricted to Galactic latitude |b| > 40 • .
We also measure the clustering of the z phot < 2 sample. Here we find a clustering amplitude consistent with the evolution of Ref. [56], with b 0 = 1.04 ± 0.05 (fixing f NL = 0 as it is poorly constrained anyway). Given these three clustering measurements, we adjust the bias evolution to determine if they are consistent with each other. We try a modified bias evolution with flat bias at z > 1. Using the modified bias evolution, we find (at fixed f NL = 0) b 0,mod = 1.08 ± 0.075 for the z phot < 2 sample, 1.03 ± 0.03 for the entire sample, and 0.95 ± 0.07 for the z phot > 2 sample. Hence there is no evidence that the three samples are inconsistent with each other. In Fig. 25, we compare our cross-correlation bias measurement to a compilation of quasar clustering results (taking the bias measurements at "face value" without homogenizing the Figure 24. DESI quasar-Planck CMB lensing cross-correlation for the z phot > 2 sample. Note that due to the large bin-widths (∆ = 50), this measurement can constrain b 0 but not f NL , for which we use much smaller bins at low . We show both the C κg theory curve for the Ref. [56] bias evolution (b 0 = 1), and the theory curve with the best-fit amplitude of the bias b 0 .
cosmology or other aspects of the analysis). We present both the modified bias evolution and three datapoints showing the effective bias at the effective redshift, computed using the modified bias evolution. We note that the effective bias differs by ∼ 10% if we use the Ref. [56] bias evolution instead.
For comparison to past work, we also show the Ref. [56] bias evolution. This was derived from 3D autocorrelation measurements of eBOSS [56] and BOSS [137] quasars. We also show the eBOSS 3D autocorrelation measurement of Ref. [138] (averaging over their color-selected bins); and BOSS 3D autocorrelation measurement from Ref. [139]. Ref. [58] used the 2QDES survey to measure quasar clustering across a wide range in color and luminosity (we do Figure 25. Quasar bias measurements from DESI-Planck CMB lensing cross-correlations (red circles, with the diameter corresponding to the 1-σ range). We compare these measurements to previous measurements (colored points) including previous CMB lensing cross-correlations (gray points); the best-fit bias evolution from Ref. [56], and the modified bias evolution (Eq. A.2) that fits our crosscorrelation measurements.
not combine their three luminosity-selected bins per redshift). They find no relationship between quasar clustering and luminosity. Indeed, they measure the highest bias for the lowest-luminosity bins. This result agrees with several other studies finding no significant relationship between quasar clustering and physical properties [138,[140][141][142][143][144]. This suggests that it is reasonable to compare quasar clustering measurements spanning a wide range in luminosity and selection technique.
We also show older quasar clustering measurements from 2dF [57,145] and SDSS DR7 [146]. We show projected rather than 3D clustering both for SDSS photometric quasars [96] and for WISE-selected quasars [147]. The quasar bias was also measured by Ref. [148] using cross-correlations between BOSS quasars and the Lyman-α forest.
Our measurements at z eff = 1.32 and 1.51 are quite consistent with previous results, but the high-redshift cross-correlation yields a significantly lower bias. Likewise, the modified bias evolution required to fit the high-redshift point agrees well with previous measurements at z < 1.5, but is significantly low at higher redshift. While the scatter of the earlier bias measurements is large, the high-precision BOSS and eBOSS results are quite close to the Ref. [56] bias evolution. We do note a weak trend for the z > 2 cross-correlations to be lower than the auto-correlations, with the exception of the lowest-redshift point from [148]. These measurements are not independent: the three CMB lensing cross-correlations at z ∼ 2.4 are all from the same sample (BOSS). Furthermore, the large errorbar of these measurements means that the deviation from Ref. [56] is not highly significant.
To investigate this discrepancy further, we measure the 3D quasar correlation function from the Early Data Release (following the large-scale structure catalog creation described in Ref. [153]). Following Ref. [56], we fit a linear bias and linear (Kaiser effect) RSD model to the monopole ξ 0 in the range 20 < r < 80 h −1 Mpc: where β = f /b and f is the growth rate d ln D d ln a . We find that this simple model fits the data well on these scales. The results are summarized in Table 9, split into a number of spectroscopic redshift bins. These results for the main sample are consistent with the DESI One Percent Survey measurements presented in Ref. [154] (measured on a similar scale range, 10 < r < 85 h −1 Mpc), and are consistent with the Uchuu SHAM mocks presented in Ref. [154]. We find that the 3D correlation function generally prefers a higher bias than Ref. [56], rather than a lower bias; the bias is very similar between the main sample and the high purity sample. We also find a very similar bias between the full high purity sample at 1.9 < z < 2.3 (3.63 ± 0.18) and the z phot > 2 high purity sample at 1.9 < z < 2.3 (4.01 ± 0.33).
We therefore conclude that the origin of the low CMB lensing cross-correlation comes from the CMB lensing analysis itself, and not because the DESI quasars have lower bias than previous quasar samples. One possibility is magnification bias: if s = 0 for the z phot > 2 sample, this would explain the low clustering measurement. We are unable to re-run the photometric redshift code of Ref. [94] after perturbing the photometry to measure the magnification bias slope, and it is possible (though somewhat unlikely) that this has a massive impact on s, lowering it from ∼ 0.34 to 0. Another possibility is residual imaging systematics in the z phot > 2 sample. We regress fewer templates for the z phot > 2 sample than for the high-purity sample as a whole, and we do find a visually significant trend with Galactic latitude in DECaLS S. However, we do not find any significant shifts in our results if we use the tSZ-deprojected lensing map, or if we turn off the angular systematic weights (or replace them with more-aggressive random forest weights), nor do we find any difference between the cross-correlation above and below Galatic latitude |b| = 40 • . Finally, the largest CMB lensing foreground at these redshifts is CIB contamination, since the CIB peaks at z ∼ 2. Previous results suggest that the CIB bias to CMB lensing cross-correlated with z ∼ 0.5 galaxies is similar to the CIB bias on the lensing autocorrelation [117]. The Planck lensing team estimates < 1% biases from CIB on the autospectrum (Section 4.5 of Ref. [115]). On the other hand, the CIB redshift kernel increases from z ∼ 0.5 to z ∼ 2, so it is plausible that CIB biases are larger for our sample by a factor of a few. Interestingly, the sign is correct as the CIB bias is negative. However, the average bias shifts by only 3% when using the tSZ-deprojected lensing map in place of the default SMICA map; these two maps have very different response to CIB. Overall, it seems like it would be difficult to make the CIB contamination large enough to explain the ∼ 30-40% discrepancy.