A NOEMA Molecular Line Scan of the Hubble Deep Field North: Improved Constraints on the CO Luminosity Functions and Cosmic Density of Molecular Gas

We present measurements of the CO luminosity functions (LFs) and the evolution of the cosmic molecular gas density out to z ∼ 6 based on an 8.5 arcmin2 spectral scan survey at 3 mm of the iconic Hubble Deep Field North (HDF-N) observed with the NOrthern Extended Millimeter Array (NOEMA). We use matched filtering to search for line emission from galaxies and determine their redshift probability distributions exploiting the extensive multiwavelength data for the HDF-N. We identify the seven highest-fidelity sources as CO emitters at 1 < z < 6, including the well-known submillimeter galaxy HDF 850.1 at z = 5.18. Four high-fidelity 3 mm continuum sources are found to be radio galaxies at z ≤ 1, plus HDF 850.1. We constrain the CO LFs in the HDF-N out to z ∼ 6, including a first measurement of the CO(5–4) LF at 〈z〉 = 5.0. The relatively large area and depth of the NOEMA HDF-N survey extends the existing LFs at 1 < z < 4 above the knee, yielding a somewhat lower density by 0.15–0.4 dex at the overlap region for the CO(2–1) and CO(3–2) transitions, attributed to cosmic variance. We perform a joint analysis of the CO LFs in the HDF-N and Hubble Ultra Deep Field from ASPECS, finding that they can be well described by a single Schechter function. The evolution of the cosmic molecular gas density from a joint analysis is in good agreement with earlier determinations. This implies that the impact of cosmic field-to-field variance on the measurements is consistent with previous estimates, adding to the challenges for simulations that model galaxies from first principles.


INTRODUCTION
As star formation takes place inside clouds of cold molecular gas (McKee & Ostriker 2007), the cosmic density of molecular gas ( mol ) plays a key role in our understanding of what drives the cosmic star formation rate density (Madau & Dickinson 2014) and the baryon cycle of matter flowing Corresponding author: Leindert Boogaard boogaard@mpia.de in and out of galaxies (Walter et al. 2020;Péroux & Howk 2020).
Measurements of the cosmic molecular gas density have matured over the last decade, in particular through so-called spectral scan surveys in the (sub-)millimeter regime of extragalactic deep fields with large interferometers.By scanning for emission from the low- transitions of carbon monoxide (CO)-one of the key tracers of cold molecular gas in the local universe (Solomon et al. 1992;Bolatto et al. 2013)-a flux-limited census of the molecular gas reservoirs in galax-B .
ies can be obtained, provided the physical conditions of the systems under study are known.
The first constraints on the cosmic molecular gas density from individual CO detections were obtained over a 1 arcmin 2 area in the Hubble Deep Field North (Williams et al. 1996) in the 3 mm band with the Plateau de Bure Interferometer (PdBI, Walter et al. 2014;Decarli et al. 2014).Building on these, the ALMA Spectroscopic Survey of the Hubble Ultra Deep Field (ASPECS)-Pilot program targeted a ∼ 1 arcmin 2 area in the Hubble Ultra Deep Field (Beckwith et al. 2006) in both the 3 mm and 1.2 mm bands at greater depth (Walter et al. 2016;Aravena et al. 2016a,b;Bouwens et al. 2016;Decarli et al. 2016a,b;Carilli et al. 2016).These initial efforts led to the first extragalactic ALMA large program, ASPECS, that covered the entire eXtreme Deep Field region of the HUDF (Illingworth et al. 2013) over a 4.6 arcmin 2 scan at 3 mm (Aravena et al. 2019;Boogaard et al. 2019Boogaard et al. , 2021b;;Decarli et al. 2019;González-López et al. 2019;Popping et al. 2019;Uzgil et al. 2019) and 1.2 mm (Aravena et al. 2020;Boogaard et al. 2020;Bouwens et al. 2020;Decarli et al. 2020;González-López et al. 2020;Inami et al. 2020;Popping et al. 2020;Magnelli et al. 2020).At the same time a significantly larger ∼ 60 arcmin 2 area in the COSMOS and GOODS-North fields was targeted as part of the COLDz survey at 9 mm (Pavesi et al. 2018;Riechers et al. 2019) on the Karl G. Jansky Very Large Array, probing the bright end of the luminosity function for lower- CO transitions at similar or higher redshifts.
Through the larger area and complementary line and redshift coverage, these surveys have provided an increasingly detailed picture of the cosmic molecular gas density out to  ∼ 4.These reveal that  mol () increases by a factor ∼ 6× going out to  ∼ 1.5, with a subsequent decline out to the highest redshifts probed (Walter et al. 2020).However, the speed at which spectral scan surveys can be conducted is limited by the bandwidth that can be observed per tuning (∼ 4 GHz per sideband for ALMA) and the requirement to perform mosaicing over larger areas.This implies that the deepest surveys to date are still probing volumes that are subject to cosmic variance (Decarli et al. 2020;Popping et al. 2020), though the three-dimensional volumes probed are significantly larger than the modest on-sky areas may suggest.
To address these issues, we have conducted a new survey using the NOrthern Extended Millimeter Array (NOEMA), covering the full Hubble Deep Field North in a 45 pointing mosaic encompassing 8.5 arcmin 2 at 3 mm.Capitalising on the quadrupled instantaneous bandwidth of the POLYFIX correlator (compared to the original PdBI scan) of 16 GHz and the increased sensitivity from four extra antennas in the array, we cover almost the full 3 mm window between 82-113 GHz in only 2 setups.For comparison, the NOEMA  113.322GHz.The comoving volume and volume-weighted average redshifts are computed within 0.5 of the primary beam peak sensitivity (8.5 arcmin 2 at 98 GHz), accounting for its frequency dependence.
HDF-N survey covers almost ∼ 2× the area of ASPECS at a ∼ 3× shallower depth.This paper is organised as follows.In § 2 we present the observations and data reduction.We discuss the line search, the identification of both the line and continuum emitters, and the subsequent computation of the luminosity functions (LF) in § 3, also including a comparison to the original PdBI scan.We discuss constraints on the CO LFs and the cosmic molecular gas density in § 4. We summarize and conclude in § 5. Throughout this paper, we adopt a Planck Collaboration et al. ( 2020) cosmology (flat ΛCDM with  0 = 67.66km s −1 Mpc −1 , Ω  = 0.3111 and Ω Λ = 0.6889).We use log to denote log 10 and ln for the natural logarithm.

NOEMA OBSERVATIONS
The NOEMA mosaic consists of 45 pointings that are laidout in a Nyquist-sampled hexagonal pattern at around 98 GHz, with the phase center set to 12:36:47.60+62:13:02.0.It covers 8.5 arcmin 2 (50% peak sensitivity at 98 GHz) in the GOODS-North area, encompassing the complete Hubble Deep Field North (Fig. 1).The mosaic was observed in two setups that cover nearly the full 3 mm band from 82. 394-113.322GHz.The main emission lines that are covered in spectral scan and their associated redshift range and cosmic volume are listed in Table 1.Williams et al. (1996, in white).The white circle indicates the original Plateau de Bure scan (Walter et al. 2014;Decarli et al. 2014).The background shows the Hubble imaging from CANDELS (Grogin et al. 2011;Koekemoer et al. 2011) in the WFC3/F160W, ACS/F850LP, and ACS/F606W filters (RGB).The observations were taken between the 27th of March and 20th April 2019 for setup 1 and between 9th of May and 15th of October 2020 for setup 2. The calibration of the mosaic was performed in .For setup 1, a total of 7 tracks were used in the final reduction.The calibrators were 3C273 and 3C84 for the bandpass, 1125+569 and J1302+690 for the amplitude and phase (using the average polarisation for the amplitude when detected), and LKHA101 and MWC349 for the absolute flux calibration, except for the track on the 2nd of April 2019, for which 1055+018 and 1125+569 were used as the bandpass and flux calibrators respectively.For setup 2, a total of 10 tracks were used, with bandpass calibrators 3C273, 0851+202, 3C84, and 3C345, amplitude and phase calibrator 1125+569, and flux calibrators MWC349 and 3C84.
We create a dirty cube of the entire mosaic for each of the four sidebands separately, with 9 MHz channels and a 0. 75 pixel size, using G (version August 22a).The reference frequency is set to the center of each sideband and we take into account frequency dependence of the synthesised beam for every channel by explicitly setting map_beam_step = 1 in uvmap.We subsequently compute the (frequency-dependent) sensitivity map of the mosaic (or 'primary beam of the mosaic') as the weighted primary beam response of individual pointings.We use the sensitivity map to create both (fluxcalibrated) cubes where the noise is flat (that are used for the line search) and sensitivity-map-corrected cubes (with noise increasing towards the edges of the mosaic).The beam shape and root-mean-square (rms) noise of each of the four cubes is detailed in Table 2.
We mask individual channels with increased noise, such as those at the edges of the two basebands in each sideband, by identifying all channels that deviate by more than 5% from the median filtered rms over 50 channels (in total < 4% of all channels).The resulting rms in each sideband is shown in Fig. 2 and the overall average rms over the full frequency coverage is 0.62 mJy beam −1 channel −1 .Because of the dif- ferences in the rms and beam sizes between the individual sidebands, we do not create a single combined cube.Instead, throughout the remainder of this paper, we perform all analysis (i.e., line searches, completeness corrections) on each of the four individual cubes separately.
We also create a map of the 3 mm continuum map by combining all setups, after masking channels with increased noise in the same way as for the cubes, albeit more aggressively (using a 5% cut with a 200 channel median filter).The rms noise is 11 Jy beam −1 and the beam shape is listed alongside other properties in Table 2.

Line Search
We search the cubes for positive and negative line emission using matched filtering.Several codes have been developed in the context of different spectral scan surveys, including F (Walter et al. 2016;Decarli et al. 2019) , L -González-López et al. (2019) and MF3D (Pavesi et al. 2018).All these codes have been found to perform qualitatively similarly, though differ somewhat in the way they perform the matched filtering and group the final list of candidates (e.g., González-López et al. 2019).
As our fiducial line search code, we use MF3D, which conducts the matched filtering on the S/N cube using spatial and spectral kernels of different sizes.We use a Gaussian kernel in frequency space, with widths ranging from 3 to 18 channels, corresponding to line widths of about 75 to 500 km s −1 (FWHM).We use a single-pixel (point-like) spatial kernel, as all sources are expected to be unresolved at the beam size.
We use the distribution of the negative lines (expected to be produced by noise), to estimate a 'fidelity' of the positive line signal as a function of S/N.The fidelity is computed per kernel () by taking the ratio of the number of positive and negative lines ( pos ,  neg ) in bins of S/N, To mitigate the effect of low-number statistics on the estimate of  neg (S/N, ), we fit the counts with a tail of a Gaussian function centered at zero (as in González-López et al. 2019;Decarli et al. 2019Decarli et al. , 2020)).We finally compute a smooth estimate of the  (S/N) at fixed  by fitting an error function shape to Equation 1 (cf.Walter et al. 2016).As the number of sources detected in the individual cubes is rather limited for a given kernel width, the estimate of the fidelity at a fixed S/N is uncertain (except at the tails of the distribution, close to zero and unity fidelity).To mitigate this effect, we combine the line search results in S/N-space and compute a single fidelity estimate for a given kernel width for all cubes combined.This is equivalent to performing the analysis on the combined cube, as is typically done.This provides a more robust estimate of the fidelity, though we still caution against over-interpreting the exact fidelity values in the intermediate range.
For the final catalog we only consider lines candidates with a S/N > 4 (also when remeasured with a Gaussian fit, see § 3.5), a  > 0.2 (consistent with earlier work), and that lie in the area of the mosaic where the sensitivity is above 40% of the peak sensitivity.This leaves a total of 23 sources, all of which have a S/N ≥ 5 due to the fidelity cut being more stringent than the S/N cut, and that lie within ∼ 50% of primary peak sensitivity.There is a relatively sharp drop in fidelity below ∼ 0.7 and the majority of the sources lie at lower fidelity-and S/N-end.We list the highest fidelity sources, with  ≥ 0.7, in Table 3, which have a S/N ≥ 5.85.We estimate the completeness of our line search by injecting simulated emission lines into the cubes and determining what fraction is recovered by our line search procedure.We assume a 3D Gaussian profile for the simulated lines, which matches shape of the beam for each cube in the spatial directions.We draw 1000 sources from a uniform distribution in peak flux and FWHM line width, ranging from 0 to 3 mJy and 0 to 800 km s −1 respectively.The sources are then injected uniformly across the cubes in the area above 50% of the peak sensitivity.We perform the line search in the same manner as described above and define the completeness () as the fraction of the injected sources that are recovered for a given line width and peak flux.We repeat the experiment 5 times for each cube to increase statistics to a total of 5000 lines per sideband (20.000 simulated lines in total).The resulting completeness fractions are shown in Fig. 11 in Appendix A.  2021) in combination with a typical CO ladder (Danielson et al. 2011;Boogaard et al. 2020).
In the end, we find the completeness corrections are minor, as the uncertainties are dominated by the sample purity (fidelity) and not completeness.
We cross check the line candidates found with MF3D against those found by F and L .Overall, we recover the same high-S/N and -fidelity candidates with all the codes (though there are differences in the absolute S/N, due to the different methods), while there are increasing numbers of candidates found by only a subset of the codes at lower S/N.Similar conclusions were also reached by Decarli et al. (2019) and González-López et al. (2019).We further assess the impact of the line-search code used in Appendix A, where we show we recover the same CO luminosity functions if we use the line candidates from F in favor of MF3D.

Counterpart association
We identify the redshifts of the candidates from the linesearch by exploiting the extensive multi-wavelength data that is available over the HDF-N as compiled by the CANDELS and 3D-HST surveys (Grogin et al. 2011;Koekemoer et al. 2011;Brammer et al. 2012;Skelton et al. 2014;Momcheva et al. 2016), including spectroscopic redshifts (e.g.Barger et al. 2008), as well as the Herschel data (Elbaz et al. 2011).
While for the brightest and/or lower-redshift galaxies the counterpart association is often clear, the relatively large beam of the NOEMA data (compared to typical galaxy sizes) can sometimes lead to ambiguities in the association.There may be multiple galaxies near the peak of the emission or, especially at lower S/N, the peak of the emission may be offset from the position of the counterpart.Moreover, the counterpart may not be detected in the optical/near-infrared imaging at all.To deal with these uncertainties, we assign a redshift probability for every line candidate, based on the photometric redshift distributions of galaxies in the vicinity, weighted by their relative distance to the source.We also include a 'dark' solution, in which there is no optical/NIR photometric counterpart.We use the photometric redshift distributions that are derived with by 3D-HST (Brammer et al. 2012;Skelton et al. 2014;Momcheva et al. 2016).We take a prior on the separation between the line candidate and the photometric counterpart in the form of a Gaussian with a 2" FWHM (about half the beam size), i.e., a radial down-weighting of sources at a larger separation from the line position.
Because it is well known that not all CO transitions are observed in equal numbers at fixed observing frequency, we take into account different prior probabilities on the CO line identification.For the sources with a photometric counterpart, we adopt a prior that is loosely based on the line-flux distribution from ASPECS at 3 mm (Decarli et al. 2020).For the sources without a photometric counterpart, we adopt the redshift distribution of optically-faint submillimeter galaxies (-faint) as found by Smail et al. (2021) and convert it to a CO line distribution in  assuming a fiducial CO ladder based on the typical integrated line flux ratios in SMGs (Danielson et al. 2011, cf. Boogaard et al. 2020).Both CO line priors are shown in Fig. 4. Above a redshift of ∼ 3 there is some ambiguity in the line identification, where multiple lines from CO and [C ] are potentially present in the spectrum.For the identification, we assume that the line is always the strongest line visible at the respective redshift, and check for fainter lines afterwards.This concerns [C ](1-0), that is typically weaker than CO(4-3) (e.g., Valentino et al. 2020), and the  > 6 CO lines, that are rarely significantly stronger than CO(6-5) (Carilli & Walter 2013).Effectively, this means that all lines are assumed to be CO with  ≤ 6.We discuss the impact of these priors on the final results in more detail in Appendix A.
The redshift probability for a line candidate observed at a frequency  and position  is given by: where the sum is taken over all galaxies that lie within a 6" diameter circle (i.e., significantly larger than the beam and typical galaxy sizes).Here   (|  ) is the photometric redshift distribution for galaxy  at position   , (|) is the redshift prior determined by the CO line prior and the observed frequency, and (  |) is the radial weighting that depends only on the absolute separation |  − |.The last term represent the no-counterpart or 'dark' case, where   (|,   ) is the associated (prior) redshift probability distribution and (  |) accounts for the relative weight that is given to the no-counterpart solution.For the latter we assume the value of the radial weighting at 2 , which means the no-counterpart solution gets a larger weight than solutions with a counterpart for separations larger than 2 .The overall probability is normalised to unity for each line candidate, depending on the number of photometric counterparts and their radial weight.
The () distribution for the top sources are shown in Fig. 13 in Appendix B. We cross-check the solutions for the highest fidelity sources with the additional spectroscopic redshifts from literature and find that the () solutions are in perfect agreement with the known spectroscopic redshifts analysis in all cases.The redshifts and associated line identifications are reported in Table 3.
We find the majority of lines (4/7) are CO(2-1) emitters with redshifts 1 <  < 2. Three out of 4 are supported by a spectroscopic redshift, while for ID.6 the () solution contains more than 90% of the total probability.The high number of CO(2-1) emitters is very consistent with the findings from ASPECS, where the majority of lines were from CO(2-1) (roughly 60%), followed by CO(3-2) (roughly 30%), cf.Fig. 4.There is no CO(3-2) emitter with an unambiguous redshift identification, though the redshift probabilities of both sources with a broader () peak at  = 3.We detect CO(5-4) in HDF 850.1 and its spectrum also reveals reveals CO(6-5), albeit at lower fidelity and S/N (below the line-search threshold), shown in Fig. 5.
An alternative method to identify the CO line redshifts is to use the long-wavelength dust spectral energy distribution and compare the inferred dust temperature of the line emitter to the dust temperature distribution of sources with known redshifts (e.g., Weiß et al. 2013;Strandet et al. 2016).To this end, we cross-match our line list to the Herschel catalog from Figure 6.Continuum map at 97.8 GHz before applying the mosaic primary beam correction.The rms noise is 11 Jy beam −1 and the beam size is shown in the bottom left corner.The brightest sources are marked with circles (cf.Fig. 7).Elbaz et al. (2011).We find a clear Herschel counterpart within 1 for all sources, with the exception of IDs 4, 5 and 6, hence no additional constraints on their potential redshift can be derived in this way.

Comparison to earlier PdBI observations
We re-examine the 16/21 candidate emission lines from the earlier PdBI observations in the HDF-N (Walter et al. 2014;Decarli et al. 2014, with a similar average rms of ∼ 0.55 mJy per 9 MHz channel) that fall within the present frequency coverage (i.e., excluding their ID.1, 2, 19, 20 and 21).We confirm their ID.3 and ID.8 (HDF 850.1), which are recovered as ID.2 and ID.3 in the present line search.In addition, we also confirm ID.17, which is recovered at low fidelity in the present line search, but is robust, corresponding to CO(6-5) in HDF 850.1 (see Fig. 5).For the remaining original line candidates no significant emission is identified in the new observations.

Continuum
The 3 mm continuum map is shown in Fig. 6, prior to the mosaic primary beam-correction (i.e., with flat noise).We search for sources in the dirty continuum map using matched filtering in the same way as for the cubes, but with a 1-channel spectral template.We detect four continuum sources with a fidelity above 0.8 (S/N > 4.3).We Hogbom clean the cube around the brightest sources from the line search down to 2.We measure the fluxes in the cleaned image by fitting  2D Gaussians using imfit in .The properties of the sources and measured fluxes are listed in Table 4.
We show HST cutouts of the continuum sources in Fig. 7.The second brightest source corresponds to HDF 850.1.The other three sources are all identified as known (radio) galaxies detected at 1.4, 5, 10 and 34 GHz (Morrison et al. 2010;Owen 2018;Gim et al. 2019;Murphy et al. 2017;Algera et al. 2021) showing AGN signatures in their X-ray (Xue et al. 2016) and/or radio emission (Algera et al. 2021), with known spectroscopic redshifts ≤ 1 (Barger et al. 2008).No bright lines fall within the frequency range of the mosaic at the redshift of these three sources (cf.Table 1).The 3 mm number counts at 5 of  (> 0.09 mJy) ≈ 850 deg −2 (no completeness or flux-boosting correction correction) are in good agreement with the recent estimates from Zavala et al. (2021).

Properties of sources
We fit the lines recovered in the line search with a Gaussian line profile using (Newville et al. 2019).The resulting line fluxes, frequencies, and widths are reported for the highfidelity sources in Table 3.We do not apply any flux boosting corrections, as the completeness simulations ( § 3.1) show that this only has a minor impact on the line flux at the S/N levels under consideration.As in Decarli et al. (2020), we also do not correct the line fluxes for the impact of the increasing temperature of the Cosmic Microwave Background (CMB) with redshift, though note these would only significantly decrease the observed line flux at relatively low intrinsic excitation temperatures.We refer the reader to Decarli et al. (2020) for a more in-depth discussion of both topics.We compute the line luminosities ( ) in units of K km s −1 pc 2 (e.g., Solomon et al. 1992;Carilli & Walter 2013) for each of the the various redshift solutions from the () analysis, which are used for the computation of the CO LFs.

Luminosity Function Analysis
The luminosity functions are computed following the approach of Decarli et al. (2016aDecarli et al. ( , 2019Decarli et al. ( , 2020)), but taking all 23 sources ( § 3.1) into account for the luminosity functions of all the different lines, with a weight depending on their () solution.The exception to this are the high-fidelity sources with a spectroscopic redshift, for which we only include the actual redshift with a fidelity of unity.
The luminosity function for a certain transition is then defined as: Here  is the number of sources per comoving Mpc 3 in an interval of log  ± 0.5 log  ,  is the volume over which a transition is detectable, Δ(log  ) is the bin width, and   and   are the fidelity and completeness for a given transition.To construct the luminosity functions, 5000 independent realisations are created, where in each realisation the line luminosities are varied within errors.The number of sources and associated 1 Poissonian confidence intervals (Gehrels 1986) are computed in 0.5 dex bins, unless a bin contains less than one source on average, in which case a 3 upper limit is provided.The resulting counts and uncertainties are finally scaled by the completeness corrections, divided by the volume and averaged over the realisations.Following earlier work, the luminosity functions are computed five times with offsets of 0.1 dex (which are therefore not independent), to expose the intra-bin variations given the modest statistics.
The luminosity functions in the HDFN are shown in Fig. 8 for CO(1-0) up to CO(6-5) and are tabulated in Table 7.We compare these to the recent luminosity functions for the HUDF determined by the ASPECS Large Program (Decarli et al. 2019(Decarli et al. , 2020)).These are also determined from a spectral scan at 3 mm, over almost exactly the same redshift interval (cf.Fig. 2), but probe down to fainter luminosities over a ∼ 2.5× smaller volume.We also show the LFs derived from background sources in the PHIBSS2 fields (Lenkić et al. 2020) for the redshift ranges that are reasonably close to those from NOEMA HDFN and ASPECS.
We compare the observed LFs to the recent theoretical predictions from the SIDES (Béthermin et al. 2022) and S (Bisigello et al. 2022) simulations.In brief, both simulations use empirical prescriptions for the IR luminosity and CO-to-IR scaling relations.SIDES simulates the galaxy population using a semi-analytical model that that is coupled to the stellar mass of dark matter halos determined via abundance matching on a dark matter-only light-cone from the Bolshoi-Planck simulations (Rodríguez-Puebla et al. 2016), while S is a fully empirical model that is based on the observed galaxy stellar mass and IR luminosity functions.In both the local universe and at higher redshift, the (lower-) luminosity functions have been found to be reasonably well described by Schechter (1976) functions (e.g., Saintonge et al. 2017;Fletcher et al. 2020;Riechers et al. 2019;Decarli et al. 2019).Given the broad overall agreement between the LF's from the HDF-N and HUDF, further discussed in Equation 3.6, we perform a joint fit using a Schechter function in logarithmic units (e.g., Riechers et al. 2019): Here  * is the normalisation defining the overall density of galaxies at the characteristic luminosity  * (in the same units as Equation 3) and  is the faint-end slope.We only fit uncorrelated bins for each dataset, i.e., one-fifth of bins evenly spread across the full range probed, to avoid underestimating the uncertainties (note that we find the same median posterior values if we would fit all the bins, but with narrower posteriors).We also chose not to fit the PHIBSS2 data, as they are not derived consistently with the redshift intervals of the other surveys.As we do not provide new constraints on the   N -See Equation 4. We fix  = −0.2.
faint-end, we fix the slope  = −0.2following Decarli et al. (2020), that is consistent with the local value (Saintonge et al. 2017).We take uniform priors, where −6 ≤ log  * ≤ −2 and 8.5 ≤ log  * ≤ 11.5, and sample the posterior on the parameters using nested sampling with U (Buchner 2021).The resulting fits are shown in Fig. 9 and the marginalised estimates for the parameters are tabulated in Table 5.

CO luminosity functions
The large volume of the NOEMA survey provides new constraints on the bright end of the CO LF, as shown in Fig. 8. Compared to ASPECS, the roughly 3× shallower observations over 2× the area extend the overall constraints past the knee of the LF, while reaching comparable constraints near the knee.
We find a lower luminosity density in the overlap regions of the LF in the HDF-N compared to the HUDF, of roughly 0.15 dex for CO(2-1) and 0.4 dex for CO(3-2).It is not immediately clear where these differences come from, but they do not appear to be methodological (Appendix A).More likely they are due to field-to-field variance.It is known that there is an overdensity in the HUDF at  ∼ 1.1 that may bias the CO(2-1) measurements from ASPECS high (Boogaard et al. 2019).It is unclear if similar over-or underdensities affect the comparison of the CO(3-2) LF, though we cannot confidently identify the same fraction of CO(3-2) emitters as were found in the HUDF.Even larger variations are seen when comparing to the LFs from PHIBSS2.This suggest that the variations seen between the fields can be attributed to cosmic variance.Interestingly both SIDES and S seem to predict a lower luminosity density for CO(2-1) than is observed in both fields, making it less clear whether this is due to cosmic variance (Béthermin et al. 2022) or a missing ingredient in the models.Taken together, the combined measurements from the HDF-N and HUDF are (still) reasonably well described by a single Schechter function (Fig. 9).The joint fits provide improved constraints on the overall shape of the LFs, which now explicitly take into account the measured cosmic variance between the fields.
For CO(4-3), it appears we find a larger number of sources at the bright end than may be expected from an extrapolation from ASPECS.One should note the total number of sources entering the LF here is very limited: there are only 1 and 2 independent LF bins for ASPECS and the HDF-N respectively, with very few sources, hence the differences could simply be due to noise and low-number statistics.Given there are no CO lines with spectroscopic confirmation entering this bin, it is also more sensitive to the assumed priors, and only upper limits can be derived if one limits to the top-sources (see Appendix A for more details).We do find that the bright end of the CO(4-3) LF is consistent with the predictions from both SIDES and S , while the models predict a lower luminosity density than is observed towards the fainter end (Béthermin et al. 2022;Bisigello et al. 2022).While it appears there is a tantalising break in the CO(4-3) LF we find it is not statistically significant (see § 3.6) and the differences could again be caused by cosmic variance.If real, such a break could be caused by a rapid change in the average excitation between the faint and bright end of the LF at these redshifts, though such a scenario is not seen in the simulations, which are otherwise consistent with the LF in the HDF-N.
As the cosmological deep fields are biased against having very low-redshift galaxies in the foreground, we only provide a 3 upper limit on the CO(1-0) luminosity function at the bright end of log  [Mpc −3 dex −1 ] ≤ −1.9.This is somewhat more stringent than that of ASPECS due to the increased volume.There are no high-fidelity sources contributing to CO(6-5) at the highest redshifts, implying a 3 upper limit at the bright end of the luminosity function of log  [Mpc −3 dex −1 ] ≤ −3.6.We do note that the photometric redshifts do not fully cover the redshift range spanned by CO(6-5), which implies some signal may be lost in this bin (though the majority of the signal is expected to come from the no-counterpart sources), hence the constraints should be viewed as conservative.The same upper limit also extends to the higher- lines, nearly independent of transition (cf.Fig. 8).For both CO(1-0) and CO (6-5), the upper limits are comfortably in agreement with theoretical models.
The original PdBI pointing (Walter et al. 2014;Decarli et al. 2014) was chosen specifically to include the bright submillimeter galaxy HDF 850.1 and hence could not provide constraints on the CO(5-4) LF.The full mosaic, however, is not chosen to include this source.Therefore, it now provides the first measurement of the CO(5-4) LF at  = 5.The constraints on the LF are still limited and subject to cosmic variance (given that the expected source density of galaxies like HDF 850.1 is 1 for the area of the HDF-N survey, e.g., Zavala et al. 2021), but in overall agreement with the predictions from S .In contrast to the empirical models discussed above, Popping et al. ( 2019) have used the IllustrisTNG hydrodynamical simulations (Weinberger et al. 2017;Pillepich et al. 2018) and the Santa Cruz semi-analytical model (Somerville & Primack 1999;Somerville et al. 2001) to demonstrate that simulations which model galaxies from first principles generally do not reproduce the observed luminosity functions, unless the HUDF is a strong outlier (i.e., only a small fraction of the various realisations agreed with the HUDF).The result that our bestconstrained CO LFs in the HDF-N, CO(2-1) and CO(3-2), do not deviate strongly from those the HUDF implies there is increasing tension with the simulations that predict significantly smaller amounts of molecular gas in galaxies on average (Popping et al. 2019).These simulated CO LFs are quite sensitive to the assumed prescription for the value of  CO , that is needed to convert the simulated molecular gas mass function to a CO LF.While better agreement can be found by assuming significantly lower average values of  (Lenkić et al. 2020), and the measurements from ASPECS in the HUDF (Decarli et al. 2019(Decarli et al. , 2020)), incl.the dust continuum stacking (Magnelli et al. 2020).The gray line shows the join fit to all the literature data from Walter et al. (2020).The joint measurements confirm a rise of the cosmic molecular gas density from the local universe by a factor 4.5-11× to  ∼ 1.5 and subsequent decline out to higher redshift, with slightly larger uncertainties reflecting the field-to-field variance between the HUDF and HDF-N.
CO 1, tension then still remains in matching the observed faint-and bright end of the mass and luminosity functions simultaneously (e.g., Popping et al. 2019;Dave et al. 2020).

Cosmic molecular gas density and cosmic variance
The evolution of the cosmic molecular gas density is determined from the CO LFs at different redshifts.We combine the constraints from NOEMA in the HDF-N and ASPECS in the HUDF, by integrating the Schechter fits to both surveys down to the lowest  probed in the data (i.e., we do not extrapolate the faint end).This is consistent with earlier work and the integral should trace the bulk of the molecular gas mass given that the knee of the luminosity function is well sampled.Indeed, the stacking results at 1 ≤  < 2 from ASPECS (Inami et al. 2020) imply that there is not a large amount of gas mass missed at fainter luminosities.To convert the observed CO emission to a gas mass one needs to adopt a CO-to-molecular gas conversion factor,  CO (including He), and average excitation correction that is representative for galaxies around the knee of the LF.We assume an  CO = 3.6 M (K km s −1 pc 2 ) −1 ( Daddi et al. 2010;Bolatto et al. 2013), mainly to be consistent with earlier work, but note all results can simply be linearly rescaled to a different  CO .For the excitation, we adopt the average values derived for galaxies at  < 2 and  > 2 from Boogaard et al. (2020), with   1 = [0.75,0.80, 0.61, 0.44] for  = [2, 3, 4, 5].These ratios have been derived from the fluxlimited sample of CO-emitters from the ASPECS survey and were also used for the ASPECS measurements (Decarli et al. 2020).Note the assumed  51 is also reasonably similar to the excitation measured in HDF 850.1, with  52 ≈ 0.47 ± 0.15 (Walter et al. 2012, assuming that CO(2-1) will be highly excited in the intense starburst, even if only from the high CMB temperature at this redshift).The resulting cosmic molecular gas densities are shown in Fig. 10 and tabulated in Table 6.
The new constraints on the cosmic molecular gas density at 1 <  < 1.8 and 2 <  < 3.1 are in good agreement with earlier measurements, including those derived (independently) from the dust continuum in the HUDF (Magnelli et al. 2020), as well as the constraints from lower- transitions at the same redshift in COSMOS, GOODS-N and the HUDF (Pavesi et al. 2018;Riechers et al. 2019Riechers et al. , 2020)).The uncertainties are similar or in fact somewhat larger than the measurement from ASPECS alone at  = 1.4,which reflects the impact of cosmic variance on the LF and subsequent  mol measurements.The combined constraints confirm the rise in the cosmic molecular gas density from redshift zero out to  ∼ 1.5, with a factor between 4.5-11×, and a subsequent decline out to higher redshift.
The relatively high  mol (3 <  < 4.5) is because the bestfit Schechter function appears to somewhat overestimate the knee of the LF.As stated in Equation 3.6, the constraints on the CO(4-3) LF are rather limited, and we conclude that there is still significant uncertainty in the detailed shape of the LF as well as the molecular gas density at these redshifts.Indeed, the measurements from (Lenkić et al. 2020) fall significantly below the average (Walter et al. 2020).Alternatively, it could be that the average excitation is higher than assumed, though this is unlikely to explain the full discrepancy.Further analysis is needed to better constrain the luminosity density at these redshifts (cf.Boogaard et al. 2021b).We also show the constraints on  mol (4 <  < 6) based on the CO(5-4) LF.The large uncertainties are due to the limited constraints on the LF, though overall the value are consistent with early measurements.The scatter around the average excitation is expected to be significantly larger for the higher- lines that trace warmer and denser gas, which makes the estimate of the associated total gas mass from these lines more uncertain.
The new measurements from the NOEMA HDF-N survey provide an important complement to the earlier measurements in other fields.For both the luminosity function and molecular gas density, perfect agreement is not expected because these are determined from different fields.The overall good agreement between the LFs in the HDFN and HUDF, especially for CO(2-1) and CO(3-2) implies that the impact of cosmic variance on these LFs is not more severe than previously estimated (Popping et al. 2019;Decarli et al. 2020).Indeed, while the area-on-sky of the spectral scan surveys are typically modest, the broad redshift coverage implies that substantial volumes are probed (cf.Table 1), mitigating the impact of field-to-field variance (Popping et al. 2019).The combined measurements from the HDF-N and HUDF presented here now fold in the uncertainties due to cosmic variance between the two fields directly.

CONCLUSIONS
This paper presented an 8.5 arcmin 2 NOEMA survey of the Hubble Deep Field North (HDF-N), that scans nearly the complete 3 mm band (from 82-113 GHz) band in 45 pointings, to identify for molecular line emission in distant galaxies, measure the CO luminosity functions (LF), and constrain the cosmic molecular gas density ( mol ) out to  ∼ 6.The main conclusions of this study are as follows.
1. We search for line candidates in the cube via matched filtering and determine a redshift probability distribution, (), for each of the line candidates exploiting the existing photometric redshift distributions of nearby counterparts in combination with a CO redshift prior, including a no-counterpart solution.Out of the larger sample of candidates, we identify 7 high-confidence line emitters (with S/N ≥ 5.85 and fidelity  > 0.7).Four are CO(2-1) emitters at 1 <  < 2 (of which three spectroscopically confirmed), two have a broader () but are most likely CO(3-2) emitters at 2 <  < 3, and the final source is HDF 850.1, a well-known starburst galaxy at  = 5.184 (Walter et al. 2012) detected in both CO(5-4) and CO(6-5).
2. We detect four high-confidence 3 mm continuum sources.One is HDF 850.1, while the other three are all identified as known radio galaxies with spectroscopic redshifts  ≤ 1.
3. The larger area and significant depth of the NOEMA HDF-N survey compared to earlier studies provides the first constraints on the bright end of the CO LFs for  = 2 up to 5 at 1 <  < 6, extending the existing LF measurements up to  ∼ 4 from the knee upwards.We find a lower density in the overlap region near the knee of the CO(2-1) and CO(3-2) LFs in the HDF-N compared to the HUDF (from ASPECS) of ∼ 0.15 and 0.4 dex, respectively.We find tentative evidence for a higher CO(4-3) luminosity density at the bright end than expected from extrapolations of earlier surveys, though in good agreement with simulations.Finally, we provide the first constraints on the CO(5-4) LF at  = 5.
4. We perform a joint analysis of the LFs in the HDF-N and HUDF (from ASPECS) and find that they are well described by Schechter functions up to at least  = 3.Given that the constraints were determined from two completely independent fields, this suggests that the current measurements of the LFs and subsequently the cosmic molecular gas density are not strongly affected by cosmic variance.
5. The agreement between the HDF-N and HUDF poses some challenges for simulations that model galaxies from first principles, that (under the assumption of an  CO ) generally predict lower values for the CO LFs than are observed.
6.We integrate the combined HDF-N and HUDF LFs to provide revised constraints on the molecular gas density from the joint fields.The uncertainties on  mol from the joint determination are similar and in some cases even slightly larger.The latter is a direct consequence of the field-to-field variance which is now reflected in the measurements.We find very good agreement with earlier surveys for  mol (1.0 <  < 1.8) and  mol (2 <  < 3.2) and  mol (4 <  < 6).The results show that the cosmic molecular gas density increases by a factor 4.5-11 from redshift 0 to  ∼ 1.5, in agreement with previous measurements including independent measurements from the dust continuum.
The independent constraints from the NOEMA HDF-N survey provide important constraints on the cosmic variance in the CO LF compared to the other deep fields such as the HUDF.On-going efforts such as WIDE ASPECS are expanding the spectral scan surveys to even larger areas to further constrain the variance in the bright end of the CO LF (not covered by ASPECS).The key combination of depth and relatively large area of the NOEMA HDF-N survey is made possible by the increased sensitivity of the extra antennas and in particular the large instantaneous bandwidth, allowing to scan the entire 3 mm band in only 2 tunings.Future upgrades on NOEMA, such as the dual-band, full-band and multi-beam receivers, as well as the upcoming Band 2 and bandwidth upgrades for ALMA (Carpenter et al. 2022), would allow to constrain the evolution of the cosmic molecular gas density even further.
We want to thank the referee for a constructive and helpful report.We are grateful to Matthieu Béthermin and Laura Bisigello for providing the simulated CO luminosity functions from SIDES and S , respectively.M.  The completeness is similar between the four sidebands, with a slightly higher completeness for the sidebands with lower rms.Overall, the completeness exceeds 90% for integrated line fluxes above roughly 0.4 Jy km s −1 , being higher for narrower lines at fixed integrated flux.

A. LUMINOSITY FUNCTIONS: IMPACT OF METHODOLOGY, PRIORS AND COMPLETENESS
In this section we investigate the impact of the methodology assumptions made for the CO-line identification and the subsequent impact on the luminosity function and cosmic molecular gas density.The completeness corrections as a function of line width and peak flux for each of the four cubes (discussed in § 3.1) are shown in Fig. 11.Overall, we find the completeness corrections are minor, as the uncertainties are dominated by the sample purity (fidelity) and not completeness.
While the different line-search code typically agree on the high-fidelity candidates, there are increasing differences in the exact number of candidates and their S/N and Fidelity for fainter lines.We therefore repeat the full CO LF analysis using the line candidates from F instead of MF3D.The results are shown in Fig. 12. Reassuringly, the CO LFs are effectively unchanged, with the only difference being a slightly higher count for the faintest bins.To check whether our computation of the LF is consistent with earlier work, we also recompute the CO(2-1) and CO(3-2) LFs from the ASPECS large program using the top 15 high-fidelity line candidates (González-López et al. 2019) and find that we recover the LFs from Decarli et al. (2019) perfectly (up to the completeness corrections at the faint end).We also investigate the impact of the prior on the sources with out a photometric counterpart.The first alternative we explore is to assume the same prior as for sources that do have a counterpart.This redistributes the lines towards the CO(2-1) bin, slightly reducing in the CO(3-2) bin, and loses constraints on the CO(4-3) bin (leaving only upper limits), whilst leaving CO(5-4) unchanged.We also explore take sources without counterpart into account with a weight purely proportional to the volume probed at each redshift (as in Decarli et al. 2019).This has the opposite effect, of .Impact on the CO LFs of different codes and assumptions, compared to the fiducial result from the paper (Fig. 8).The result obtained when using the line candidates from F (grey) are nearly identical to those obtained with MF3D, which is not only reassuring, but also supports the join analysis with ASPECS.Including only the top sources ( > 0.7; dark green) leaves the  = 2 and 5 LFs unchanged, but turns the constraints for  = 3 and 4 into upper-limits only.The effect of changing the CO-line prior for the sources without counterpart to that of the sources with counterpart (dark blue) lowers the constraints on  = 4, with minimal impact on the other lines.Using the volume prior used in (Decarli et al. 2019, light blue) instead, boosts the  ≥ 5 LFs in favor of the lower- lines.
reducing the  ≤ 4 transitions more strongly towards lower-, whilst slightly boosting the  ≥ 5 luminosity functions, such that there are sufficient counts also in the  = 6 LF (i.e., effectively more than one source).
In summary, we find the CO(2-1) and CO(5-4) luminosity functions are robust, because they are well-defined by the sources that have a confident identification, such that the prior assumptions (or different line search codes) have minimal impact on the final result to within uncertainties.Moreover, the impact on the (combined)  mol constraints for these lines are negligible.For CO(3-2) and CO(4-3) somewhat larger variation is seen towards the faint end, while the bright end remains robust.Using only the top-fidelity sources, however, results in only upper limits on CO LFs for the latter two transitions.

B. REDSHIFT DISTRIBUTIONS
The redshift probability distributions (Equation 2) for the top candidates from Table 3 are shown in Fig. 13, except for HDF 850.1 which is known to have no photometric counterpart and a spectroscopic redshift of  = 5.184 (Walter et al. 2012).

Figure 1 .
Figure 1.Footprint of the 45-pointing NOEMA mosaic around the central frequency of 98 GHz (in blue) compared to the footprint of the original Hubble Deep Field North observations with WFPC2 fromWilliams et al. (1996, in white).The white circle indicates the original Plateau de Bure scan(Walter et al. 2014;Decarli et al. 2014).The background shows the Hubble imaging from CANDELS(Grogin et al. 2011;Koekemoer et al. 2011) in the WFC3/F160W, ACS/F850LP, and ACS/F606W filters (RGB).

Figure 2 .
Figure 2. The root-mean-square noise (rms) as a function of frequency.The rms is measured per 9 MHz channel and the upper and lower sidebands of each of the two setups of the NOEMA HDF-N scan are shown separately.For comparison, the rms of the ASPECS-LP (Decarli et al. 2019, measured per 7.8 MHz channel) is shown in blue.

Figure 3 .
Figure 3. Emission line candidates with highest fidelity and S/N.The left panels show the moment zero maps overlaid on the HST images (F160W/F850LP/F606W). Contours are spaced from ±3 up to ±10, spaced by ±1 (no negative contours are seen) and the synthesised beam is shown in the bottom left corner.The right panels show the peak pixel spectra and the rms noise (brown line), including Gaussian fits and 1-uncertainties (red line and shading).

Figure 4 .
Figure 4. CO line prior used for the redshift associations.The prior for sources with an (HST) counterpart is based on the observed CO line distribution from ASPECS (Decarli et al. 2020).For the sources without counterpart, we use a line distribution based on the redshift distribution of optically-faint submillimeter galaxies from Smail et al. (2021) in combination with a typical CO ladder(Danielson et al. 2011;Boogaard et al. 2020).

Figure 7 .
Figure 7. Continuum sources overlaid on the HST imaging (F160W/F850LP/F606W). Contours are spaced from ±3 up to ±10, spaced by ±1 and the synthesised beam is shown in bottom left corner.The second continuum source is HDF 850.1, the other three are all radio galaxies at  ≤ 1 (see references in text).

Figure 8 .
Figure8.CO luminosity functions in the HDFN (green boxes).Each panel indicates the CO transition and mean redshift, as well as the mean number of sources that entered the luminosity function (from 5000 realisations).The blue boxes show the results from ASPECS(Decarli et al. 2020) and the pink boxes from PHIBSS2(Lenkić et al. 2020).

Figure 9 .
Figure 9. Schechter fits to the CO luminosity functions of the HDFN (green boxes) + ASPECS (blue boxes).Each panel indicates the CO transition and the fit parameters (cf.Table5).

Figure 10 .
Figure 10.Cosmic molecular gas density from integrating the CO luminosity function fits for the HDF-N + HUDF (green boxes).We compare this to the local measurements from xCOLDGASS (Fletcher et al. 2020), COLDz in COSMOS and GOODS-N (Pavesi et al. 2018; Riechers et al. 2019), VLASPECS in the HUDF (Riechers et al. 2020), PHIBSS2 background sources (Lenkić et al. 2020), and the measurements from ASPECS in the HUDF (Decarli et al. 2019, 2020), incl.the dust continuum stacking (Magnelli et al. 2020).The gray line shows the join fit to all the literature data fromWalter et al. (2020).The joint measurements confirm a rise of the cosmic molecular gas density from the local universe by a factor 4.5-11× to  ∼ 1.5 and subsequent decline out to higher redshift, with slightly larger uncertainties reflecting the field-to-field variance between the HUDF and HDF-N.

Figure 11 .
Figure 11.Completeness fractions for the four fields determined from injecting and recovering mock sources for a range of peak fluxes and line widths.The completeness is similar between the four sidebands, with a slightly higher completeness for the sidebands with lower rms.Overall, the completeness exceeds 90% for integrated line fluxes above roughly 0.4 Jy km s −1 , being higher for narrower lines at fixed integrated flux.
Figure12.Impact on the CO LFs of different codes and assumptions, compared to the fiducial result from the paper (Fig.8).The result obtained when using the line candidates from F (grey) are nearly identical to those obtained with MF3D, which is not only reassuring, but also supports the join analysis with ASPECS.Including only the top sources ( > 0.7; dark green) leaves the  = 2 and 5 LFs unchanged, but turns the constraints for  = 3 and 4 into upper-limits only.The effect of changing the CO-line prior for the sources without counterpart to that of the sources with counterpart (dark blue) lowers the constraints on  = 4, with minimal impact on the other lines.Using the volume prior used in(Decarli et al. 2019, light blue) instead, boosts the  ≥ 5 LFs in favor of the lower- lines.

Figure 13 .
Figure 13.Redshift probability distributions, (), for the top candidates (except HDF 850.1).The solid red bars show the final () and the other bars show the priors (cf.Fig. 4).The black lines show the photometric redshift distributions of the individual sources weighted by the separation, () () (with arbitrary normalisation between the panels).The black arrow in the top of each panel shows the spectroscopic redshift for the most-likely photometric counterpart (if available)-in all cases coincident with the peak of the ().

Table 1 .
Lines and corresponding redshift ranges and volume covered in the HDF-N mosaic.

Table 2 .
Properties of the dirty cubes and continuum map.

Table 3 .
Highest fidelity lines from the line search and their identification.

Table 4 .
Properties of continuum sources.

Table 5 ) . Table 5 .
Schechter function parameters posterior percentiles for different CO transitions.