Constraining the halo-ISM connection through multi-transition carbon monoxide line-intensity mapping

Line-intensity mapping (LIM) surveys will characterise the cosmological large-scale structure of emissivity in a range of atomic and molecular spectral lines, but existing literature rarely considers whether these surveys can recover excitation properties of the tracer gas species, such as the carbon monoxide (CO) molecule. Combining basic empirical and physical assumptions with the off-the-shelf Radex radiative transfer code or a Gaussian process emulator of Radex outputs, we devise a basic dark matter halo model for CO emission by tying bulk CO properties to halo properties, exposing physical variables governing CO excitation as free parameters. The CO Mapping Array Project (COMAP) is working towards a multi-band survey programme to observe both CO(1–0) and CO(2–1) at z ∼ 7. We show that this programme, as well as a further `Triple Deluxe' extension to higher frequencies covering CO(3–2), is fundamentally capable of successfully recovering the connection between halo mass and CO abundances, and constraining the molecular gas kinetic temperature and density within the star-forming interstellar medium in ways that single-transition CO LIM cannot. Given a fiducial thermal pressure of ∼ 104 K cm-3 for molecular gas in halos of ∼ 1010 M ⊙, simulated multi-band COMAP surveys successfully recover the thermal pressure within 68% interval half-widths of 0.5–0.6 dex. Construction of multi-frequency LIM instrumentation to access multiple CO transitions is crucial in harnessing this capability, as part of a cosmic statistical probe of gas metallicity, dust chemistry, and other physical parameters in star-forming regions of the first galaxies and proto-galaxies out of reionisation.


Introduction
Line-intensity mapping (LIM or IM) is a nascent mode of survey at the interface of extragalactic astrophysics and cosmology, targeting the unresolved aggregate spectral line emission of galaxies as opposed to individual emitters [1][2][3].Through the summary statistics of a spatial-spectral observation spanning cosmological scales, the data resulting from a LIM survey will distinguish three-dimensional modes of large-scale structure as illuminated by atomic or molecular line emission, tracing a particular species of gas and thus the elemental abundances and physical processes that excite the observed line emission.
The task is not without challenges, with interloper line emission and observational systematics all posing obstacles to achieving background-limited sensitivity.However, all LIM surveys-and indeed all cosmological surveys-trace the same underlying large-scale structure lit up in different but correlated ways.As such, cross-correlations have led to the first detections of cosmological neutral atomic hydrogen emitting in the 21 cm line [4,5] as well as tentative hints of ionised carbon line emission [6,7], by leveraging external crosscorrelations against galaxy surveys.LIM experiments targeting other gas species like carbon monoxide (CO) have also analysed existing data for possible external cross-correlations [8,9], and expect to achieve sufficient sensitivity for a strong (≳ 5 sigma) cross-correlation detection well ahead of any auto-correlation detection of equally high significance [10].(Interferometric CO LIM surveys have also claimed auto-correlation detections at the 2-3 sigma level [11,12].)Importantly, a cross-correlation result is more than the sum of two auto-correlation results.Beyond the practical advantage of rejecting disjoint foregrounds and systematics-thus removing obstacles to an initial demonstration of detectability of cosmological line emissioncross-correlation between two data sets enables science not possible with either data set in isolation.Cross-correlation between LIM and galaxy surveys, for instance, will enable insights into the atomic and molecular gas content of the cross-correlated galaxies [13][14][15][16].Just as important are LIM-LIM cross-correlations, particularly between 21 cm IM and cmto mm-wave LIM in star-formation lines [17][18][19], but also between different star-formation lines [20][21][22][23]-in some cases even within the same survey.For instance, the CO Mapping Array Project (COMAP; [24]), a single-dish CO LIM experiment, is currently in a Pathfinder phase operating at observing frequencies of 26-34 GHz and primarily targeting CO(1-0) at z ∼ 3.However, as both 12 CO and 13 CO emission fall within these observing frequencies, an internal cross-correlation across the appropriate frequency channels could enable insights into isotopologue ratios and saturated fractions [20].
The benefits of cross-correlation motivate the fiducial design of future COMAP iterations beyond the current Pathfinder phase [25], involving an expansion not only of existing instrumentation centred at 30 GHz, but also construction of new spectrometers observing at 15 GHz.As Figure 1 shows, this enables extraction of CO emission at z ∼ 7-towards the end of the epoch of reionisation (EoR)-through cross-correlation of the lowest rotational energy transitions of the CO molecule.Any detection will be robust not only to excitation conditions at the emission epoch that may suppress higher energy transitions, but also to disjoint foregrounds.This includes CO(1-0) emission at z ∼ 3-the primary target for the Pathfinder, but a prominent foreground in the way of surveying CO(2-1) emission at z ∼ 7.
However, the benefits of multi-transition CO LIM should extend beyond merely improving robustness of a late-reionisation CO detection.Much in the same way that line ratios derived from stacking resolved sources sample the spectral line energy distribution (SLED) of the galaxies and thus reveal the average properties of the interstellar medium (ISM) of the population (e.g., [26][27][28]), multi-transition LIM data will effectively sample the SLED of all contributing unresolved sources.By probing multiple transitions of CO, the data from future iterations of COMAP will have constraining power over the physical conditions in the ISM of late-reionisation galaxies that govern the excitation of these CO rotational transitions, such as the density and kinetic temperature of the surrounding medium as populated by molecular hydrogen (H 2 ), the primary collisional partner for CO.
Accounting for such conditions is key to a complete understanding of the physics of star formation towards the end of reionisation.To successfully recover CO and H 2 densities at these epochs, even in the 'cosmic average' molecular cloud, is to understand the conversion between the two and thus the metallicity and turbulence in these environments (cf. the review of the CO-to-H 2 conversion factor by [29]), with turbulence in particular governing cloud collapse and fragmentation.Constraints on parameters like gas kinetic temperature will be especially powerful in combination with measurements of the redshift evolution of dust temperature (e.g., [30]).As feedback processes influence the complex chemical interplay of molecular gas and dust in dense environments (cf., e.g., [31,32]), CO measurements at these high redshifts will provide important insights into cooling and heating mechanisms governing star formation at the very inception of the star-forming cosmic web.
The addition of instrumentation outside the 26-34 GHz COMAP Pathfinder frequency range is thus transformative, not incremental.The future may see not only a 'double deluxe' COMAP-EoR survey and a COMAP Extended Reionisation Array (COMAP-ERA) incorporating 15 GHz and 30 GHz observations (in the so-called Ku and Ka bands)-as outlined by the forecasts of [25]-but in fact a COMAP 'Triple Deluxe' also targeting CO  at the same late-reionisation redshifts, with additional 45 GHz instrumentation (in the so-called Q band).Taking full advantage of this transformative capability will however require consistent modelling of line ratios across the CO SLED grounded in at least a statistical picture of the physical conditions of the ISM.
Yet previous literature forecasting CO LIM signals at z ≳ 6 [9,[33][34][35][36] leans towards either modelling only a single CO transition, or applying empirical halo models for multiple CO transitions that ultimately abstract away many of the physical variables governing CO excitation in dense gas.That said, some works have constructed modelling frameworks for LIM observables encompassing multiple lines based on explicit, self-consistent physical approaches (e.g., [37][38][39][40][41]).For quite possibly the first time, we move beyond even those works to present a detailed consideration of recoverability of ISM environmental variables specifically from multi-transition CO LIM.
To quantify the potential impact of multi-transition CO LIM in idealised scenarios, we consider here explicit calculations of observables and observabilities for different iterations of COMAP targeting late reionisation, rooted in self-consistent modelling of the CO SLED aided by the off-the-shelf non-LTE radiative transfer code Radex [42] (specifically the implementation of [43]).In doing so, we aim to answer the following questions: • What constraining power can future CO line-intensity mapping provide on the environmental variables that govern CO excitation?
• How much does this constraining power depend on the ability of future surveys to observe multiple lines, in particular multiple transitions of CO?
• Do qualitative aspects of answers to the above change with our approach to modelling the high-redshift CO signal?
The paper is structured as follows.After detailing methods for simulations of signal, noise, and parameter recovery in Section 2, we present detectability and parameter recovery forecasts in Section 3. We discuss key takeaways in Section 4 before concluding in Section 5.

Signal: mocking the halo-CO connection
We consider two possible ways to model the signal, i.e., to model CO line fluxes per dark matter halo.One is an entirely empirical model of the bulk CO column density and other properties as a function of halo mass and redshift, while the other attempts to provide some physical motivation by modelling a population of molecular clouds and formulating a percloud prescription of ISM properties.We will dub these Model A (for 'amalgamated') and Model B (for 'broken-down') respectively.
Across both models, we assume a background temperature of (1 + z) × 2.7255 K with z = 6.68 set to the redshift observed at 15 GHz in CO(1-0) or 30 GHz in CO(2-1).However, the Radex calculation of CO line fluxes from a homogeneous object demands definition of further properties, namely a geometry for the escape probability, the kinetic temperature T k , H 2 volume density n H2 , the line width ∆v, and the CO column density N CO .We assume throughout that the kinetic temperature in environments hosting CO is the same, at 86 K.The choice is motivated by the recent analysis of [30]; at z ∼ 7 their model suggests a dust temperature of ≈ 86 K and we expect the kinetic temperature to be at least weakly coupled to this temperature.The choices of all other parameters will differ significantly between Models A and B owing to their divergent approaches to describing the CO content of dark matter halos.

Model A: amalgamated CO properties per dark matter halo
The key assumption at the core of Model A is the assumption that the CO column density follows a double power law with halo mass: At low masses this assumes a power law of index α, and at high masses the power law assumes a fixed index of −2/3.The resulting CO line fluxes per halo are then proportional to this column density scaled by various physical variables, including population fractions at each energy level.The motivation for choosing the high-mass power law index of −2/3 is to effect a saturation of area-integrated CO luminosity per halo at high halo mass.Following [37], we assume that the CO emission and star formation both originate from a disc with radius R d = (λ/ √ 2)(j d /m d )r vir , with λ = 0.05 (isolated exponential profile) and j d /m d = 1 (same specific angular momentum for disc and halo).Since the total CO luminosity is the product of the velocity-integrated CO flux with the area of this disc, it thus scales as N CO (M h )×M 2/3 h , and should then plateau at high M h given our form of N CO (M h ).
We assume that M break lies where the star formation rate peaks, around 10 12.3 M ⊙ .We also derive a fiducial value of α = 0.9 based on the following sequence of considerations: • The best-fit UniverseMachine model from [46] finds that the SFR at low masses evolves as v 5 peak at COMAP-EoR redshifts.If h , then the SFR evolves as M 5/3 h .
• The SFR surface density then must scale roughly linearly with M h , since the area over which this SFR occurs scales as • If the CO surface density follows a Kennicutt-Schmidt law with Σ SFR ∼ N n CO with n somewhere between 1 and 1.2 [47,48], then at the faint end this yields N CO ∼ M 1/n h , so that α = 1/n is somewhere between 1 and 0.8.The choice of α = 0.9 is the midpoint of this range.
We also assume a minimum CO emitter halo mass of M min = 10 10 M ⊙ , consistent with the photoionisation considerations in [37] introducing a sharp cutoff at this halo mass.
Radex provides three options for escape probability geometry: a uniform sphere, a turbulent sphere (equivalent to a large velocity gradient approximation), and a plane-parallel slab.We use the first option for Model A.
We also assume a rather strong evolution of the H 2 volume density with the host halo mass: where α is the same α found in the definition of N CO (M h ).The form is motivated by the expectation discussed by [37] that the volume density will go as the square of the surface density, whereas the column density scales linearly with surface density.We do not prescribe a break here because molecular hydrogen is more numerous and thus more capable of selfshielding than CO, and expect this to be the case for the mass range that dominates the signal.Our fiducial value for n H2,10 is 10 2 cm −3 .With all of these inputs, Radex can calculate the velocity-integrated flux F J per halo in each CO(J → J − 1) line, leading to the luminosity per halo being L ′ J = πR 2 d F J .

Model B: break-down of the CO content of halos into molecular clouds
This model attempts to model the CO content of each dark matter halo as the ensemble of some number of molecular clouds, which altogether make up the molecular mass of the galaxy.For this, we take additional inspiration from the work of [49], which couples the Santa Cruz semi-analytic model of galaxy formation to a radiative transfer code to model a range of carbonic lines.
For simplicity we assume all clouds are near-virialised, and furthermore have the same mass and size, taking the properties of the average cloud from the cloud mass function assumed by [49].The power-law (truncated Pareto) form of that cloud mass function yields an average cold gas mass per cloud of M MC ≈ 2.2 × 10 4 M ⊙ .Based on the virial theorem, there is an expectation that the external pressure is related to the mass and radius of the molecular cloud, and in particular P ext ∼ M2 MC r −4 MC (see, e.g., [50]).Taking the external pressure acting on the cloud to be P ext /k B ∼ 10 4 K cm −3 ,1 the cloud size for this mass is of order r MC ≈ 8.7 pc based on the calibration given by [49] (in turn taken from [51]).This in turn naturally suggests n H2 ≈ 110 cm −3 (see Equation 2.9)-so now changing this parameter equates to changing the cloud size by modulating the environmental pressure necessary to arrive at the given density.
Incidentally, the corresponding molecular hydrogen column density is N H2 ∼ 3 × 10 21 cm −2 , or a mass surface density of Σ ∼ 90 M ⊙ pc −2 .As a point of comparison, the work of [52] catalogued molecular clouds across the entire Milky Way disk and found their surface densities to span a range of 2 to 300 M ⊙ pc −2 .
[52] also obtained a fitting formula for the velocity dispersion σ v in terms of Σ and r MC : This suggests σ v ∼ 4 km s −1 for our average cloud.We assume a Gaussian line velocity profile with this σ v , leading to a full width at half maximum (FWHM) of ∆v = 10 km s −1 .This value is similar to the effective sound speed assumed by [37].As [52] note, the origin of the line width and the nature of its relation to other cloud parameters is the subject of some debate [53], so we avoid coupling ∆v too tightly to our assumption of virial clouds, and leave it fixed at the fiducial value of 10 km s −1 throughout. 2  The simulation work of [49] suggests that the evolution of H 2 mass with increasing halo mass is roughly linear up to M h ∼ 10 12 M ⊙ , then plateaus.We also take cues from the work of [54], starting with the UniverseMachine empirical approach of modelling star-formation histories and applying it to neutral atomic and molecular gas histories.That work suggests a slightly superlinear growth of molecular gas mass with halo mass at high redshift: the ratio of H 2 mass to halo mass grows roughly as M 0.5 h , although again shrinking slightly at higher masses.We thus consider the cloud count to saturate at high halo mass, and take with log(M brk /M ⊙ ) = 12.3 as before.The denominator implying a H 2 mass of 2.1 × 10 11 M ⊙ for a halo of mass M brk ≈ 2 × 10 12 M ⊙ is tuned to yield a cosmic H 2 mass density of 2.5 × 10 7 M ⊙ Mpc −3 at z = 6.7.
For comparison, the only observational constraint that approaches these redshifts comes from the CO Luminosity Density at High Redshift (COLDz) survey [55,56], an interferometric untargeted search for CO line emitter candidates in CO(1-0) at z ∼ 2-3 and CO(2-1) at z ∼ 5-7.The most conservative upper limit, assuming all unconfirmed emission identified as z ∼ 3 CO(1-0) lines could be z ∼ 7 CO(2-1) lines, is ρ H2 < 4.0 × 10 7 M ⊙ Mpc −3 .Our fiducial functions for cloud mass and count per halo thus result in a value consistent with the primary observational constraint at this redshift, which also depends strongly on extrapolation (if any) of the CO luminosity function beyond the survey detection limit (noting that the COLDz analysis of [56] carries out no such extrapolation), and on assumptions around CO excitation and the CO-to-H 2 conversion factor.
The halo mass modulates the CO column density per cloud through its connection to the hosted gas-phase metallicity.Like [37], we assume that all carbon in molecular clouds is locked up in CO on average, taking the most optimistic end of the 30-100% range considered by the work of [57], which is also how [37] obtains the conversion from H 2 column density to CO column density as We assume Z ′ ≃ 1 (in units of solar metallicity) at M h ≃ M brk , with the evolution essentially following a double power law with the same break scale as the cloud count: Our fiducial assumption is α = 1/3 and Z ′ brk = 2 so that Z ′ (10 10 M ⊙ ) ≈ 0.25 and Z ′ (10 9 M ⊙ ) ≈ 0.125.In general we assume that α > 0.
Because the metallicity gives χ CO , combining this with n H2 forces a value for N CO .To be explicit: meaning But furthermore, we derive r MC anyway from the number density, size, and total mass of each molecular cloud: here µm H2 is the mass per molecule of molecular gas, where following [37] we multiply the H 2 molecular mass by a factor of µ = 1.36 to account for the occasional contribution of helium to the total mass of cold gas.We invert this to obtain As a result, our expression for the CO column density simplifies into For our fiducial model parameter values, this ends up in the range of ∼ 10 17 -10 18 cm −2 for typical halo masses of 10 9 -10 12 M ⊙ .Note that several of our model parameters end up in a degenerate combination as an overall normalisation of column density at this step, namely • the H 2 mass M MC = 2.2 × 10 4 M ⊙ per cloud; • the metallicity at break mass, Z ′ brk /2 = 1; • the CO-H 2 conversion factor at solar metallicity, χ CO,1 = 3 × 10 −4 ; • and the H 2 volume density n H2 = 110 cm −3 in each cloud.
We merge the combination of the first three into a single parameter: All in all, the model follows these steps: • Assume that only halos above some minimum mass M min = 10 9 M ⊙ host CO emission.
(Note that this is a different value from that assumed for Model A.) • Assume the escape probability geometry of a turbulent sphere.
• Assume the same kinetic temperature for each cloud, with the fiducial value again being T k = 86 K.Note that this is completely decoupled from the virial temperature of the molecular cloud, as we do not assume that the thermal energy (as opposed to magnetic or turbulent kinetic energy) is what primarily counteracts the gravitational potential energy or any external confining pressure.
• Assume the same n H2 for each cloud, with the fiducial value being 110 cm −3 .
• The CO column density per cloud as a function of host halo mass is ultimately (2.13) • The number of clouds at each halo mass is given by where the fiducial values of the free parameters are M brk = 10 12.3 M ⊙ , N MC,brk = 9.5 × 10 6 , and β = 3/2, based on the assumptions made above about the ratio of H 2 mass to halo mass and the H 2 mass per cloud.
The CO(J → J − 1) luminosity per cloud, phrased as the velocity-and area-integrated brightness temperature, then becomes πr 2 MC times the velocity-integrated flux F J,MC obtained from Radex.Although r MC depends on both the mass per cloud and the number density, here we will only account for the latter dependence, effectively fixing M MC .In this case, r MC ≈ 42 pc when n H2 = 1 cm −3 .We also assume that the luminosities of each cloud add together linearly to result in the total luminosity per halo, meaning that for each line Seven free parameters {T k , n H2 , M brk , N MC,brk , N CO,brk , α, β} control this relation in the end, with fiducial values of {86 K, 110 cm −3 , 10 Before we move on to how these models of L ′ CO (M h ) lead to observables like the power spectra of the CO lines, we provide a few summaries of Model A and Model B. We summarise the parameters and their fiducial values in Table 1, and the resulting L ′ J relations and ratios between L ′ J and L ′ 1 in Figure 2. From the latter, in particular, we note that Model A predicts that the excitation of CO evolves from sub-thermal at low halo masses to super-thermal at high masses, owing to the strong evolution of n H2 (M h ) ∼ M 2α h = M 1.8 h with halo mass.Model B on the other hand, with n H2 kept the same across all clouds in all halos, essentially predicts sub-thermal excitation across the entire halo mass range with little evolution.
We also show in Appendix A the effect of parameter variations in Model B, on both the L ′ J (M h ) relation shown in Figure 2 and on the power spectra calculated in the next section.

Observables: auto and cross power spectra
We consider auto and cross power spectra across the different CO transitions.We specifically only consider the spherically averaged 3D power spectrum P (k) as a function of comoving wavenumber k; while [25] also considered the quadrupole of the 3D power spectrum, we conservatively model only the 'monopole' P (k), assuming that it is the easiest to detect and correlate across all bands.Otherwise we draw heavily from the formalism of [25] below.We translate the CO luminosities L ′ J (M h ) as functions of halo mass into the observable power spectra using the fit of [58] for the halo mass function dn/dM h (the differential comoving number density of halos of mass M h ) and the model of [59] for the halo bias b(M h ) with which halos of mass M h trace the matter density contrast.Given the CO(J → J − 1) luminosity L ′ J (M h ) as a function of halo mass, we can find the average line brightness temperature, and the average temperature-bias product, with H(z) being the redshift-dependent Hubble parameter for our fiducial cosmology.The luminosity-weighted average bias of the CO emission is then b J = ⟨T b⟩ J /⟨T ⟩ J .
In real comoving space, the spherically averaged auto power spectrum for any transition is the sum of a clustering component proportional to the matter power spectrum P m (k) and a scale-independent shot noise component P s : where (Note that we do not assume any halo-to-halo scatter around the mean L ′ J (M h ) relation; Appendix A discusses the effect of adding such scatter to model predictions.) The actual observed auto power spectrum undergoes a number of anisotropic distortions due to large-scale structure flows, line broadening, and angular resolution.Accounting for all such distortions results in a 2D auto pseudo-power spectrum P (k, µ), dependent on both k and µ (the cosine of the angle of the wavevector relative to the line-of-sight direction, such that the line-of-sight component is k ∥ = kµ and the transverse component is k ⊥ = k 1 − µ 2 ).It is necessary to then average over µ ∈ (−1, 1) to obtain the spherically averaged pseudo-power spectrum P (k) for each line.
Assuming both the beam profile and line profile are Gaussian functions, the attenuation due to these two factors is given by The transverse profile is defined by σ ⊥ = χ(z)θ B / √ 8 ln 2, where the comoving angular diameter distance to redshift z is equal to simply the comoving distance χ(z) to redshift z for our fiducial flat cosmology, and θ B is the angular FWHM of the main beam.We assume values of 3.7/3.9/4.0 arcminutes for θ B at 15/30/45 GHz, which are possibly slightly optimistic at the level of 10-20%, but nonetheless realistically achievable with dishes of diameters in the range of 24-7 m.For reference, the dishes expected to be used for COMAP-EoR are 18 m class at 15 GHz and 10 m class at 30 GHz.
Meanwhile, the line-of-sight profile is given by where here we assume a typical line width of v eff = 250 km s −1 .Note that this represents the velocity dispersion per halo, not per cloud, and therefore we apply it equally to Models A and B. We also put in two additional factors to account for loss in sensitivity due to survey parameters.The first is the main beam efficiency η of the instrument, which will vary by band depending on instrumental details like blockage due to on-axis reflectors.We follow the values of η 1 = 1 and η 2 = 0.72 for the 15 GHz and 30 GHz instrumentation assumed in [25], and pick a value of η 3 = 0.86 roughly in the middle for the 45 GHz instrumentation (assuming effectively that moderate improvements can be made over the 30 GHz instrumentation even in the case of on-axis antennas being used).The loss of signal manifests as a factor of η J in temperature and thus a factor of η 2 J in the auto power spectrum.The second is a volume window function, following [60], to account for the loss of modes at scales approaching the limits of the survey volume.Here we assume that the full overlap between all data sets across the three transitions spans z = 6.2-7.2,corresponding to observations of CO(1-0) at 14-16 GHz, and fields of size Ω F = 4 deg 2 .Assuming then that the maximum accessible transverse scale is L ⊥ = χ(z = 6.7) √ Ω F (the transverse span of each survey patch at the central redshift) and the maximum accessible line-of-sight scale is L ∥ = χ(z = 7.2) − χ(z = 6.2), the corresponding minimum accessible wavenumbers are k ⊥,min = 2π/L ⊥ and k ∥,min = 2π/L ∥ , and the volume window function is With one final assumption that the dominant mode of redshift-space distortion is linear growth of structure, we obtain So for each CO transition, given σ ⊥ ∝ θ B and σ ∥ defined as above for all lines, As an illustration of the halo masses that tend to dominate the overall comoving line luminosity density that determines the signal, Figure 3 shows the integrand of Equation 2.16 for the three CO transitions considered.In both Model A and Model B, halos of mass ≳ 10 12 M ⊙ are too rare to contribute significantly to the predicted clustering signal, with the differential brightness temperature peaking for halos of mass ∼ 10 10 -10 11 M ⊙ .Cross power spectra calculations are largely similar.The cross shot noise amplitude is calculated as and based on this we may find the total cross-correlation pseudo-power spectrum as (2.26)

Noise: COMAP survey parameters at reionisation
For survey parameters, we borrow the COMAP-EoR and COMAP-ERA concepts proposed in a COMAP Collaboration paper [25].These assume receivers operating at 30 GHz with 19 spectrometers per dish and system temperature T sys = 44 K, and at 15 GHz with 38 spectrometers (19 dual-polarisation feeds) per dish and T sys = 22 K.We follow the original paper in assuming 7000 dish-hours in the Ka band and 29000 dish-hours in the Ku band for the COMAP-EoR survey, and a significantly higher 57000 dish-hours in the Ka band and 110000 dish-hours in the Ku band for the COMAP-ERA survey.In all cases, we make the highly approximate assumption of uniform noise achieved across N F = 3 survey fields, each of size Ω F = 4 deg 2 .Additionally, we consider here a COMAP 'Triple Deluxe' concept, which adds a Q band component to the COMAP-ERA proposal.We conservatively assume 38 spectrometers (19 dual-polarisation feeds) with T sys = 60 K for the system temperature3 and only 30000 dish-hours achieved on sky in the same fields.
In all cases, the number of dish-hours t obs and the number of spectrometers per dish N s/d provides the total integration time per sky pixel of solid angle Ω px : The noise level per voxel is then given simply by the radiometer equation, with an additional dependence on the bandwidth δν per frequency channel: (2.28) Given the comoving volume per voxel V vox , we then obtain the contribution of noise to the observed power spectrum as Where necessary, we assume δν = 2 MHz and Ω px = θ B / √ 8 ln 2. However, in reality the factors of δν −1 and Ω −1 px in σ 2 N will cancel the factors of δν and Ω px that must figure into the calculation of V vox .Therefore, the calculation of P N is ultimately agnostic to voxelisation.
Beyond power spectrum uncertainties from sample variance and radiometer noise, we must consider the contribution of interloper line emission to the 30 GHz and 45 GHz observations.Suppose that interloper line emission in CO(J i → J i − 1) from some redshift z ∼ z Ji is described by the power spectrum P J (k ∥ , k ⊥ ) in its original comoving frame.In the z = z target ∼ 7 observation, this interloper component will present with modes scaled in the transverse dimensions by and along the line of sight by so that the apparent power spectrum is as a function of the line-of-sight and transverse components (k ∥ , k ⊥ ) of the wavevector in the projected z ∼ 7 comoving frame.This projected power spectrum may then be subjected to the same resolution and volume window functions as the primary target power spectra, as well as the appropriate factor of η 2 J , to yield the projected spherically averaged pseudopower spectrum PJi,proj (k).(Note that the definition of the original P Ji (k ∥ , k ⊥ ) would cover redshift-space distortions from linear structure growth.) We consider all interlopers to follow a luminosity function Φ(L ′ ) parameterised as a Schechter function with a smooth but rapid cutoff at low luminosities, and tuned to mimic the luminosity function of [63]. 4 Given that the interloper emission is not a primary focus of this paper and largely manifests as a source of additional uncertainty in the measurement for example, the achieved system temperature of ≈ 38 K for the Q/U Imaging ExperimenT (QUIET) Q-band coherent amplifier array [61] (albeit from the Atacama desert, a particularly dry site with median precipitable water vapour of 1.2 mm during QUIET's Q-band observations), or the expected system temperature of ≈ 45 K for the Next-generation Very Large Array (ngVLA) 41 GHz receivers [62].
of reionisation-epoch signals, we consider it appropriate to apply this level of modelling across both the z ∼ 3 CO(1-0) interloper in the 30 GHz observation and the two additional interloper components that would manifest in the 45 GHz observation.Appendix B provides further details including a favourable comparison against current observational constraints on CO luminosity functions at z ∼ 1-4.
With the power spectrum for thermal noise and any interlopers, the total observed auto pseudo-power spectrum from an observation targeting CO(J → J − 1) at z ∼ 7 becomes PJ×J,tot (k) = P N + PJ×J (k) + (2.33) The expected values of all cross pseudo-power spectra remain unaffected (i.e., PJ×J ′ ,tot (k) = PJ×J ′ (k) for J ̸ = J ′ ) as all assumed sources of noise and interloper emission are independent between the different observations being cross-correlated.At last we are ready to calculate the uncertainties for auto and cross power spectra.We assume Gaussianity, meaning diagonal covariances within each observable so that no correlations exist between different wavenumber bins.Within each wavenumber bin of width ∆k (not necessarily uniform) we expect to observe a number of independent modes equal to (2.34) (The total number of modes observed is twice this, but half of the Fourier modes perfectly track the other half for Fourier transforms of real variables.)Then the uncertainty in the auto pseudo-power spectrum at each wavenumber is given simply as

.35)
Most generally, the covariance between any of the pseudo-power spectra will be given as owing to the assumption of Gaussian random fields and the consequent ability to express fourth-order moments purely in terms of second-order moments (cf., e.g., Appendix A of [64] for more explicit consideration).This recovers the expression for the variance of the cross pseudo-power spectrum, given by [25] as and the covariance between auto and cross pseudo-power spectra, given by [25] as given J ̸ = J ′ .However, the most general expression also allows us to consider covariances between different cross pseudo-power spectra.For example, the covariance between the 15 GHz-30 GHz auto pseudo-power spectrum (J 1 = 1, J ′ 1 = 2) and the 15 GHz-45 GHz cross pseudo-power spectrum (J 2 = 1, This does imply, incidentally, that even when simulating an experimental scenario where only the 15 GHz-30 GHz and 15 GHz-45 GHz cross spectra inform parameter recovery, calculation of the 30 GHz-45 GHz cross spectrum is necessary to inform covariances.For most of this work, we only use wavenumbers in the range of k ∈ (0.03, 0.5) Mpc −1 , including when calculating the total signal-to-noise ratio for each observable: (2.40) (For a combination of multiple observables, this generalises to the square root of the vectormatrix-vector product between the transposed observable vector, the full covariance matrix, and the observable vector.)We take this step to avoid unduly over-emphasising wavenumbers where in reality miscalibrated beams or other transfer functions may strongly bias measurements.We show observables in some contexts across a wider range of k ∈ (10 −2 , 10 1 ) Mpc −1 , but purely for illustrative purposes.

Recovery capability
To test (albeit in a highly idealised simulation) whether any of the COMAP scenarios described above are capable of recovering the original model parameters, we run a series of Markov chain Monte Carlo (MCMC) exercises using emcee [65].In addition to model parameters for the z ∼ 7 CO flux calculation, we also explore the posterior distribution for three nuisance parameters describing the 30 GHz interloper emission in CO(1-0) from z ∼ 3.
The CO(3-2) auto power spectrum is never used, so we forgo the need to marginalise over six more nuisance parameters for the two interloper line components in the Q band.We provide fairly constrained Gaussian priors on ⟨T b⟩, b, and P shot for the z ∼ 3 CO(1-0) interloper (20%, 33%, and 10% relative uncertainties in each case).This reflects an expectation that external data (from ngVLA surveys, for instance, or even from the COMAP Pathfinder) may constrain parameters for this CO component by the time of COMAP-EoR.
Otherwise, we use broad tophat priors for all model parameters.
Most of these priors are meant to be broad and uninformative.The range of allowed T k values in Model B reflects an expectation that the dust temperature T dust is in the range of 50 to 100 K, and that the kinetic temperature is comparable to this but is driven by slightly different physics to the point where T k /T dust ∈ (0.5, 3).We allow a slightly wider range for Model A to reflect the more phenomenological nature of the model.The noise budget for the 30 GHz observation includes both contributions from the radiometer noise and the CO(1-0) interloper emission from z ∼ 3, unless noted otherwise in the last column.
Table 2 describes the COMAP scenarios considered and the observables used in simulating each scenario.Note in particular that we consider an alternate COMAP-ERA scenario where the Ku band instrumentation is replaced rather than supplemented (as in 'Triple Deluxe') with Q band instrumentation, requiring us to fall back on the CO(2-1)-CO(3-2) cross and CO(2-1) auto power spectra.In fact, neither of these observables are used in the ideal 'Triple Deluxe' simulation, which leverages the Ku band data to detect the CO(1-0) auto, CO(1-0)-CO(2-1) cross, and CO(1-0)-CO(3-2) cross power spectra, probing the CO SLED with the ground transition as a firm reference point.
Note that we also add a number of idealised scenarios.In two of these, only the 15 GHz or 30 GHz survey occurs but with infinite survey time such that the radiometer noise goes to zero.This results in a noise-free 15 GHz standalone survey in one case, and a noise-free, interloper-limited 30 GHz standalone survey in the other.We also consider a scenario where COMAP-ERA only operates in the Ka band with 110000 dish-hours, but the interloper has somehow been subtracted with perfect knowledge so that its contribution to the uncertainty in the reionisation-epoch CO(2-1) signal may be excluded from the noise budget.In this scenario (and the full 'Triple Deluxe' scenario), we do not marginalise over the interloper nuisance parameters as there is no need to do so.

Streamlining evaluation: a Gaussian process emulator for Model B
Running MCMC exercises requires as many as millions of evaluations of the outputs of our CO model.For instance, for the Model A MCMC exercises, we iterate 48 walkers over 55865 steps, discarding the 16000 steps as burn-in.This means we calculate the CO observables a total of almost 2.7 million times.While the Radex-based halo model calculation is relatively fast compared to a full hydrodynamical simulation, it still demands enough computational resources for each MCMC exercise to take multiple days of continuous wall time.
There is little leeway around this for Model A, due to the relatively wide range of column densities and number densities that may exist across the halo population.However, for Model B, the Radex calculation is done not per halo but per cloud, so that we have a narrower expectation for the range of plausible environmental conditions.Motivated by this, we design an 'emulator' to interpolate the Radex calculation of CO fluxes per cloud for a given set of parameters, in lieu of a simple lookup table.
After fully optimising the regression, we verify the performance of our Gaussian process emulator with 27000 randomly drawn values of {T k , n H2 , N CO } in the range.Calculation of these outputs took roughly 12 seconds with the Gaussian process emulator, versus 220 seconds with Radex.Relative errors for fluxes were contained within 5% in the overwhelming majority of cases, with the 99.9% CI for GP prediction being 1.00 +0.04 −0.03 times the actual flux value for each line.The accuracy is best towards the middle of this column density range, but even a run of 8000 randomly drawn values restricted to N CO ∈ (10 15 , 10 20 ) cm −2 found 1.00 +0.05 −0.03 for J = 1 and J = 3, 1.00 ± 0.03 for J = 2.Even extreme outliers in this run fell within 20% of truth.The acceptable accuracy of the emulator as a surrogate for Radex and its order-of-magnitude improvement in calculation speed allows use of more and larger MCMCs for Model B, using 64 walkers over 86555 steps, discarding 30969 steps as burn-in.
Note that we have only trained our emulator up to column density values of N CO = 10 20 cm −2 .The reason lies in the typical column density values for our fiducial incarnation of Model B, where halos of M h ≲ 10 11 M ⊙ hosting clouds with N CO ∼ 10 18 cm −2 dominate the signal.Values much higher may be unphysical or violate assumptions surrounding our model.For one, Radex itself limits the user to evaluating CO fluxes for N CO ∈ (10 5 , 10 25 ) cm −2 .But furthermore, if we believe Equation 2.11, then a value of N CO = 10 20 cm −2 in the average molecular cloud requires 100% of carbon to be locked up in CO with, e.g., n H2 = 10 5 cm −3 and Z ′ = Z ⊙ , or n H2 = 10 6 cm −3 and Z ′ = 0.2Z ⊙ -fairly extreme expectations for the nascent molecular clouds of z ∼ 7. Given all of this, during the MCMC we clip column densities to the range covered by the emulator before evaluating line fluxes.
By reverting to a prediction for lower column densities than the model would otherwise predict, this truncation could bias us subtly towards models that favour higher CO column densities than 'truth', i.e., higher values of N CO,brk or n H2 .In practice, we find that the posteriors for the multi-band COMAP scenarios tend not to clip into the the upper limit, and certainly not into the lower limit.In fact, the recovered distribution of both N CO,brk and n H2 tend to have heavy left tails, not the heavy right tails that one might expect based on the emulator input truncation.We show this and the full posterior distribution for all MCMC exercises in Appendix C.
We emphasise that Gaussian process emulators allow highly flexible interpolation of smooth multivariate output from multivariate input, but in some cases require significant tuning of kernel forms and/or have difficulty with highly unsmooth outputs.We also note that the emulator used in this work does not fully explore the possibilities of Gaussian processes.For instance, the emulator here is agnostic to correlations between CO lines, which could be used to the advantage of an emulator with an appropriate cross-covariance model (see, e.g., the review by [67]-although see also [68]).and Model B (black solid curves) from this work.We also show a number of models from existing literature as considered in [25], representing the range of predictions; see the main text for details.

Results
We first discuss in Section 3.1 the raw detectability of the observables given our CO emission models.Following this, we consider in Section 3.2 whether the various single-and multi-band surveys simulated are fundamentally capable of recovering ISM properties.

Observables and signal-to-noise
Figure 4 shows all of our predicted CO auto and cross power spectra based on both Model A and Model B. While some of our assumptions around both models are somewhat optimistic, we note that our predictions fall within an order of magnitude of the so-called 'Li+16-Keating+20' model [12,63], which falls in the middle of the pack of models considered by [25].Given that this model combines a number of fairly straightforward scaling relations between halo mass, star-formation rate, infrared luminosity, and CO line luminosities drawn from a number of numerical and observational works [69][70][71], it is not an unreasonable model for us to track.For additional reference, we also reproduce the two models at the extremes of the pack considered by [25]: the 'Lidz+11' model from [33] featuring a simple linear scaling relation between halo mass and CO luminosity, and the 'Yang+22' model from [36] featuring scaling relations calibrated based on semi-empirical models applied to cosmological simulations.We refer the reader to the individual papers, including [25], for further details on the models not introduced in this work, which we have recalculated at z = 6.7 for Figure 4.
Based on the parameters and equations laid out in Section 2.2, we calculate the uncertainties for each observable and the signal-to-noise ratio obtained in each survey scenario.Figure 5 shows all pseudo-power spectra and sensitivities for the COMAP 'Triple Deluxe' scenario, including for the unused CO(3-2) auto power spectrum.Under Model B, the entirely sub-thermal CO excitation predicted renders the two higher transitions dimmer but not out of reach of a detection, especially in cross-correlation against CO(1-0).However, CO(2-1) and CO(3-2) emission are certainly easier to detect under the more optimistic Model A.
We confirm this quantitatively with the calculated signal-to-noise values shown in Table 3 and Table 4.Given either model, however, the two-band COMAP-EoR scenario achieves a significant detection of CO emission with total signal-to-noise of 17-18 across all observables.COMAP-ERA and 'Triple Deluxe' scenarios also achieve total signal-to-noise near or above 60.

Recovery capability
Figure 6 and Table 5 show the fundamental recovery capabilities under Model A of the experimental scenarios considered in Table 3   halo masses (≲ 10 12 M ⊙ ) that actually dominate the signal.The multi-band COMAP-EoR scenario, despite achieving around roughly one-third of the total signal-to-noise of either single-band survey scenario, recovers the CO column density at low halo masses comparably if not better.COMAP-ERA and 'Triple Deluxe' simulations improve recovery of the relation at these masses and extends meaningful constraints to halos of somewhat higher masses, which COMAP-EoR is able to constrain less owing to their expected relative scarcity.Interestingly, the alternate COMAP-ERA scenario forgoing 15 GHz in favour of 30/45 GHz does not improve on the single-band scenarios.The story is largely similar for the ISM parameters of kinetic temperature, H 2 number density, and H 2 thermal pressure (the product of the first two parameters).In particular, the 'Triple Deluxe' scenario is capable of noticeably constraining all three parameters beyond the limits of the prior distribution.The CO(1-0) emission alone is incapable of constraining any of these variables too much beyond the priors, while the higher transitions by themselves can bound the kinetic temperature but have trouble placing limits on the number density  and pressure without the ground transition.The 'Triple Deluxe' constraint on the thermal pressure of log [T k n H2 /(K cm −3 )] = 2.99-4.04(68% CI, for a half-width of 0.53 dex) is somewhat skewed towards lower values as is apparent from the heavy left tail of the posterior distribution shown in Figure 6, but the constraint is fully consistent with the ground truth of log [T k n H2 /(K cm −3 )] = 3.93, and we also see in Figure 6 that the maximum a posteriori value is close to this 'true' value.Compared to Model A, we have allowed more parameters to vary for Model B in the mock recovery exercise, and Model B predicts sub-thermal excitation and thus dimmer signals from higher transitions relative to Model A. Nonetheless, we find that many of the findings under Model A apply for Model B, for which we show results across the full range of experimental scenarios in Figure 7 and Table 6.In particular, in both cases, the typical lower bound of T k > 42 K is within striking distance of the 40-60 K dust temperature values ascribed to observations of individual objects at z > 6 [72][73][74], constraining interactions between molecular gas and dust.
One notable difference is that under Model B, tight constraints on the product of cloud  count and CO column density per cloud (representing the bulk abundance of CO per halo) do not appear to require multi-band CO LIM.In principle, even surveying CO(1-0) only is capable of achieving constraints equivalent to those achieved by the multi-band versions of COMAP-EoR or COMAP-ERA.However, such results certainly cannot be obtained without CO(1-0), barring perfect removal of interloper emission at 30 GHz and 45 GHz.When it comes to the ISM parameters, we once again find that multi-transition LIM yields the best results, particularly with the inclusion of the CO(1-0) transition.Even with slightly dimmed detection significance for the higher CO transitions, the 'Triple Deluxe' -22 - halo mass (M )  simulation recovers the strongest bounds on the kinetic temperature and H 2 number density.The resulting bound on H 2 thermal pressure, log [T k n H2 /(K cm −3 )] = 3.24-4.56(68% CI, for a half-width of 0.66 dex), is comparable to the COMAP-ERA constraint (68% CI of log [T k n H2 /(K cm −3 )] = 3.27-4.50for a half-width of 0.62 dex), but is arguably more robust as it is obtained without relying on the CO(2-1) auto spectrum.

Discussion
We will discuss first what some of our results mean for the model-dependence of our primary qualitative findings in Section 4.1, then how mapping multiple transitions lends greater sensitivity to ISM parameters in Section 4.2.

Model-dependence of results
We have presented two very different models of CO emission in this work.Where Model A models the ISM within a dark matter halo as a homogeneous medium-essentially as if it were a single uniform sphere of molecular hydrogen and carbon monoxide somewhere inside the halo-Model B considers the star-forming phase of the ISM in each halo to be made up of many clouds and effectively builds up the CO luminosity per halo from many copies of the 'average' molecular cloud.In part due to their starting points (empirical versus physical), these models ended up making rather different predictions about CO excitation, with the SLED evolving strongly with halo mass for Model A from sub-to super-thermal, but only weakly for Model B and staying sub-thermal for the entire halo mass range considered.This may be responsible in particular for the differences in the expected constraining power of observations of CO(1-0) alone versus observations of one or both of the higher transitions.In the case of a model like Model A, the CO excitation will depend strongly on the model parameters-for instance, the H 2 number density can evolve strongly with halo mass depending on the value of α.Information from higher transitions becomes particularly valuable in this case, whereas the ground transition may not reveal as much.However, in the case of Model B, the H 2 number density is held constant for each cloud across all clouds in a given halo.As such, we do not expect the CO SLED to evolve as strongly with the other input parameters, and the ground transition is more informative given the physical assumptions implicit in the formulation of the model (as as more observable than the non-ground transitions due to the sub-thermal excitation).
Yet, the qualitative outcomes of the simulation remain similar between the two models we have presented.In particular, the best outcome always results from the COMAP-ERA and 'Triple Deluxe' simulations, significantly outperforming the single-transition survey simulations in parameter recovery even when the raw signal-to-noise ratio is comparable.While we will shortly further discuss the importance of surveying multiple lines, it is important to note before doing so that we were able to quantify the advantage of multi-transition LIM over single-transition LIM for two very different models.
There are further model variations well beyond the scope of this paper, including models predicting CO emission so faint as to be barely detectable if at all even with the full might of the 'Triple Deluxe' instrumentation performing in impossibly ideal fashion.However, as we noted above, both models fall in the middle of the pack of models predicting CO emission at late reionisation, so they are certainly not unreasonably optimistic given the state of knowledge about high-redshift molecular gas as examined by a fair sampling of literature within roughly the past decade.
In addition, whether future CO LIM experiments can constrain CO excitation is only one of the three key questions driving this paper.Provided that the CO emission is at a detectable level where a COMAP-ERA or 'Triple Deluxe' survey ought to be able to constrain the nature of the emitting galaxies, it is clear that those constraints depend strongly on the ability to probe multiple CO transitions in a way that does not depend extremely strongly or sharply on the CO emission model used for simulation and (mock) inference.

The necessity of probing multiple transitions
Based on the results of our idealised simulations, we find a twofold need for COMAP to observe CO in multiple rotational transitions.The first is improved signal-to-noise over simply maintaining the 30 GHz instrumentation designed for the COMAP Pathfinder.The fiducial COMAP-EoR proposal [25] to add a 15 GHz component to the survey to observe the ground CO(1-0) transition has the particular advantage that it will not be limited by the presence of interloper emission in CO lines from lower redshift, which will be the case for the largest-scale modes of CO(2-1) and CO .Rejection of interloper emission may be possible through a range of methods proposed in the literature [75][76][77] but it is best to avoid having interlopers add to the uncertainty in the z ∼ 7 observation in the first place as much as possible.Indeed, in the case of Model B, a noise-free interloper-limited 30 GHz survey would only achieve a signal-to-noise ratio for the CO(2-1) auto spectrum on par with the COMAP-ERA signal-to-noise for the CO(1-0)-CO(2-1) cross spectrum alone.
However, the simulations demonstrate clear differences in the qualitative knowledge obtained from single-versus multi-transition CO LIM detections with similar signal-to-noise.It is true, for instance, that a single-band survey would be sufficient to constrain the overall abundance of CO (in the form of the total CO column density per halo, for instance), particularly if it is possible to reject interloper emission.However, such a survey is insufficient to constrain physical variables beyond this that govern the excitation of CO, such as the H 2 number density or thermal pressure, or the kinetic temperature.
Most interesting is the alternate COMAP-ERA concept of adding 45 GHz instrumentation instead of 15 GHz instrumentation.In the case of Model A, where the CO content of the halos becomes sufficiently excited for the CO(2-1) and CO  fluctuations to be as bright as CO(1-0) fluctuations in aggregate, the combination of Ka-and Q-band data can constrain the kinetic temperature, but is unable to sufficiently constrain H 2 number density.In the case of Model B, where the higher transitions are clearly subthermal and thus less observable (by a factor of 2.5 in terms of total signal-to-noise), even the kinetic temperature is not constrained significantly beyond the prior distribution.The story unfolds conversely for hypothetical 15 GHz only surveys, whose constraining power is weaker under Model A and stronger for Model B, although still much weaker than multi-band scenarios in any case.
While differences between the models (as outlined in the previous subsection) clearly result in differences in the constraining power of these scenarios, the common upshot is that CO LIM observations cannot rely on the higher rotational transitions alone to tell the whole story about the high-redshift ISM, certainly not one non-ground transition on its own.Meanwhile, CO(1-0) observations may have some constraining power on their own, but the signal-to-noise required to match constraints from multi-band LIM is outrageously high.In the case of Model B, constraints on T k and n H2 from a completely noise-free, sample variance-limited 15 GHz survey-with a total signal-to-noise ratio of 310-essentially has the same constraining power in between a COMAP-ERA 15/30 GHz analysis or a 'Triple Deluxe' 15/30/45 GHz analysis, with total signal-to-noise ratios no higher than 70.Despite the equivalent of roughly four to five times higher survey time, the information content of CO(1-0) emission alone is simply not as 'dense' as multi-transition data.
Thus, a multi-band approach to future phases of COMAP is a necessity in both practical and scientific terms.Not only will the additional data increase detectability of high-z CO, but also at a given total detection significance across all observables, the constraints obtained on ISM parameters are simply much stronger for multi-band LIM than for single-band LIM.

Conclusions
Through modelling detection significance and corresponding parameter recovery for a number of signal models and experimental scenarios, we end up with the following answers to the questions we posed in the Introduction: • What constraining power can future CO line-intensity mapping provide on the environmental variables that govern CO excitation?In addition to constraining the overall column density of CO hosted in dark matter halos, future phases of COMAP can place lower bounds on the ISM kinetic temperature and upper bounds on the abundance of H 2 , the chief collisional partner for CO.
• How much does this constraining power depend on the ability of future surveys to observe multiple lines, in particular multiple transitions of CO?The ability to constrain overall CO abundances per galaxy or proto-cluster (in terms of column density) does not depend strongly on the ability to observe multiple transitions.However, constraints on kinetic temperature, H 2 density, or H 2 thermal pressure strengthen considerably by measuring auto and cross power spectra across multiple rotational transitions of CO.
• Do qualitative aspects of answers to the above change with our approach to modelling the high-redshift CO signal?Apart from changes in relative detectability of different transitions, our Model A and Model B simulations a consistent qualitative result: observations of a single CO transition alone cannot robustly constrain ISM variables beyond bulk CO abundances.
This work thus demonstrates not only the ability of multi-transition CO LIM as envisioned by COMAP to constrain the characteristics of molecular gas, but also the necessity of the 'multitransition' aspect of the COMAP-EoR, COMAP-ERA, and 'Triple Deluxe' survey designs in obtaining such constraints.The endpoint of LIM however, of course, is multi-phase, multispecies LIM, interpreted using physically motivated, self-consistent modelling of a diversity of atomic and molecular gas species (cf., e.g., [38]), for a cosmic statistical survey of the high-redshift ISM, CGM, and IGM.Atomic hydrogen lines are key in this endeavour, but so are carbonic lines beyond CO like [C i] and [C ii], as additional probes of carbon abundance and thus gas metallicity in different ISM phases at high redshift-see, e.g., the analysis of [27] combining CO and [C i] line observations with far-infrared continuum data to constrain the properties of photodissociation regions in CO-selected galaxies at z ∼ 1-3.
In devising suitable models of the high-redshift ISM, LIM signal forecasts will need to interface more with both semi-analytic models and hydrodynamical simulations that explicitly simulate processes of cloud formation and dissociation.Of particular value will be the use of Gaussian process emulators (as considered in this work, although hopefully employed with more sophistication) or Gaussian mixture models (cf., e.g., [78,79]) to inform halo models to 'paint' onto large-scale cosmological simulations, or dimensionality reduction techniques to gain insights into specific variables that describe the 'painting'.The advantage of some of these machine learning models will be the ability to characterise uncertainty in simplifying or extrapolating the fine-grain information, and the ability in principle to propagate these uncertainties all the way through to parameter inferences.
Multi-transition and multi-line LIM scenarios provide compelling science prospects not just for future ground-based observations culminating in arrays of single-dish receivers, but also perhaps for space missions providing multi-or wide-band spectrometer backends, with more reliable access to certain frequency windows and thus certain higher energy transitions of CO and other lines redshifted into sub-cm observing wavelengths.This includes not only concepts that could resolve the LIM signal fluctuations at scales relevant to galaxy formation, but also missions whose primary focus is on the global sky-averaged microwave intensity spectrum (e.g., [80,81]).These observations will contain high information content about the redshift evolution of cosmic line-luminosity densities [82] and thus can strongly complement LIM measurements of line-intensity fluctuations, along with measurements of dust and continuum emission in individual objects.Only through a concerted community halo mass (M ) Effect of varying M min (upper panels) and α (lower panels) on the halo mass-line luminosity relations (left panels) and the CO auto and cross power spectra (right panels).We indicate the fiducial predictions in fainter grey curves in the left panels, and in thick black curves in the right panels.Note that varying M min does not affect the L ′ CO,J (M h ) relation above M min , with the shaded areas in the left panels corresponding to the cutoffs imposed for the power spectra plotted in the right panel.
We show in Figure 10 the effect of introducing scatter to Model B, comparing the resulting predictions to each other as well as to the predictions of the 'Li+16-Keating+20' model [12,63] considered in [25] and discussed in this work in Section 3.1.This 'Li+16-Keating+20' model (which we denote L16K20 below for brevity) by default assumes ρ J,J ′ = 0 between J = 1 and J ′ = 2, and we follow this choice in the main text where we assume ρ J,J ′ = 0 for all J ̸ = J ′ .However, in Figure 10 we show the effect of assuming non-zero inter-line correlations by showing a variation of the model setting ρ J,J ′ = 1 for all J ̸ = J ′ .This in turn shows that the cross power spectra for Model B only fall off more slowly at high k than the L16K20 model because of assuming ρ J,J ′ = 0 for all J ̸ = J ′ , which suppresses the cross shot noise.In fact, the second moment of CO line luminosities is intrinsically higher halo mass (M )  L (K km s −1 pc 2 ) CO(4-3) z = 3.01-4.49Figure 11.The function assumed for all interloper line emission components, compared against all constraints on CO luminosity functions derived from 3 mm ASPECS LP data [83].For completeness we also show ASPECS LP upper limits on the CO(1-0) luminosity function at z ∼ 0, although not strictly relevant to this particular work.The original parameter values tuned to match the CO(1-0) luminosity function of [63] are ϕ * /L * = 8.7 × 10 −11 Mpc −3 L −1 ⊙ , L * = 2.1 × 10 6 L ⊙ , α = −1.87,L min = 500 L ⊙ .However, here we make a subtle change, which is to recast the luminosity function in terms of observer units of velocity-and area-integrated brightness temperature:
Figure 11 provides a comparison of this luminosity function (now as Φ(L ′ ) ≡ dn/d(log L ′ ) instead of the original ϕ = dn/dL ′ ) against constraints derived from 3 mm observations of the ALMA SPECtroscopic Survey Large Programme (ASPECS LP) in the Hubble Ultra-Deep Field [83].We see that the same luminosity function is broadly consistent with all constraints on all observed transitions, with particularly good agreement with the observed CO(2-1) luminosity function at z ∼ 1.0-1.7.Deviations exist but are within an order of magnitude.Note also that each one of our interloper transitions is one rung below the transition observed at the same redshift in the 3 mm ASPECS LP data; even somewhat subthermal excitation could mean sizeable swings in luminosities further up the ladder of rotational transitions.
C Monte Carlo posterior distributions in original parameter spaces Figure 12 and Figure 13 show the full MCMC 'corner plots' for all exercises in the main text, for Models A and B respectively.For Model A, the difference between the multi-and single-band MCMC exercises are evident in the bounded, mostly well-defined contours for the multi-band scenarios and the foamy flat distributions for the single-band (and Ka-/Qband) scenarios.For Model B, we find that even the multi-band scenarios cannot provide significant constraints on certain parameters, like the power-law indices α and β for the CO column density and molecular cloud count, but otherwise we encounter a similar narrative.The exception, oddly enough, is that the noise-free CO(1-0)-only scenario can successfully bound both power-law indices for Model B from above, owing to its sensitivity to at least the overall abundance of CO.
In the case of Model B, we find that the relevant mass scales are largely misidentified, with the minimum mass for CO emission unconstrained, and M brk identified as ∼ 10 11 M ⊙ instead of the fiducial value, which is around 20 times that.We ascribe this to the fact that the observation senses not all halos above a certain minimum mass or all halos below a certain maximum mass, but rather whatever halos dominate the signal.Indeed, despite misidentification of the break mass, we find in Figure 7 a successful recovery of the product of cloud count and CO column density at these masses.Still, this suggests that in practice, when the break in a double power law is unlikely to affect the signal-which Figure 3 suggests is the case-fitting a single power law may be preferable so as to reduce complexity without sacrificing information recovery.
We also note that analysing the CO(1-0) data by itself tends to result in a downward bias of n H2 from the 'true' value, whereas analysing one or both of the higher transitions results in a very weak upward bias from the 'true' value (although some of this may be due to our emulator limits as discussed in Section 2.3.1).This once again highlights the need for multi-transition CO LIM; adding further integration time and further noise may well reduce the remaining (albeit much reduced) bias in the COMAP 'Triple Deluxe' posterior.

7 CO 1 Figure 1 .
Figure 1.Illustration of the correspondence of observing frequency with the emission redshift of the three lowest-energy CO rotational transition lines.We also show the possible observational bands for future COMAP phases considered in simulated scenarios in this work, as well as the specific target and interloper components (solid and dotted boxes respectively) for reionisation-epoch CO LIM observations.

Figure 2 .
Figure 2. Summaries of Model A (left) and Model B (right), showing the relations between halo mass and line luminosity (upper panels) and ratios between line luminosities (lower panels) given fiducial parameter values.

Figure 3 .
Figure 3. Differential contribution to the average line brightness temperature ⟨T ⟩ per logarithmic halo mass bin for each CO transition considered in this work, for Model A (left) and Model B (right).

Figure 4 .
Figure 4. Predictions for CO auto and cross power spectra based on Model A (red solid curves) and Model B (black solid curves) from this work.We also show a number of models from existing literature as considered in[25], representing the range of predictions; see the main text for details.

Figure 5 .
Figure 5. Predictions for CO auto and cross power spectra given Model A (left panels) and Model B (right panels), alongside the expected uncertainty on the power spectra evaluated for k-bins of size ∆[log (k Mpc)] = 0.03.We show the sensitivities expected for the COMAP-ERA or COMAP Triple Deluxe scenarios; for COMAP-EoR, the radiometer noise will be worse for the CO(1-0) auto power measurement by a factor of 8, and for CO(2-1) by a factor of 4.

Figure 6 .
Figure 6.Predicted constraints, given Model A, on the CO column density per halo (upper panels) and the H 2 thermal pressure derived from the product of kinetic temperature and number density for a dark matter halo of mass 10 10 M ⊙ (lower panels).The left and right panels show different experimental scenarios as indicated in the legends.Also shown for reference are the true parameter values for Model A (black dotted lines), as well as the distribution of H 2 thermal pressure values implied by the MCMC priors alone (red dashed lines).

Figure 7 .
Figure 7. Predicted constraints, given Model B, on the product of the CO column density and cloud count per halo (upper panels) and the H 2 thermal pressure derived from the product of kinetic temperature and number density (lower panels).The left and right panels show different experimental scenarios as indicated in the legends.Also shown for reference are the true parameter values for Model B (black dotted lines), as well as the distribution of H 2 thermal pressure values implied by the MCMC priors alone (red dashed lines).

2 Figure 8 .
Figure 8.Effect of varying M min (upper panels) and α (lower panels) on the halo mass-line luminosity relations (left panels) and the CO auto and cross power spectra (right panels).We indicate the fiducial predictions in fainter grey curves in the left panels, and in thick black curves in the right panels.Note that varying M min does not affect the L ′ CO,J (M h ) relation above M min , with the shaded areas in the left panels corresponding to the cutoffs imposed for the power spectra plotted in the right panel.

3 haloFigure 9 .
Figure 9. Same as Figure8, but for T k n H2 , and N CO,brk as indicated in each panel.

For
all interloper line emission, the basic form of the luminosity function assumed is the combination of a quasi-exponential low-luminosity cutoff and a Schechter function, itself the combination of a power law and an exponential high-luminosity cutoff:

Figure 12 .
Figure 12.Full MCMC posterior distributions for exercises carried out with Model A.

Figure 13 .
Figure 13.Full MCMC posterior distributions for exercises carried out with Model B.

Table 1 .
Low-mass power-law index for N CO (M h ) relation (half of power-law index for n H2 (M h ) relation) M break 10 12.3 M ⊙ Characteristic mass scale for N CO (M h ) relation N break 10 19.86 cm −2 Total CO column density in halo with M h = M brk T Low-mass power-law index for N MC (M h ) relation N MC,brk 9.5 × 10 6 Number of molecular clouds in halo of mass M brk M min 10 9 M ⊙ Minimum halo mass for CO emission (fixed at 10 10 M ⊙ for Model A) Parameters allowed to vary for both Models A and B throughout this work.Some parameters relate to the CO column density N CO (M 12.3M ⊙ , 9.5 × 10 6 , 5.2 × 10 16 , 1/3, 3/2}.h ) (per halo for Model A, per molecular cloud for Model B) or the number of molecular clouds per halo N MC (M h ) (for Model B) as functions of halo mass.

Table 2 .
Experimental scenarios considered in this work, and the specific auto and cross spectra included in mock parameter recovery for each scenario.COMAP-ERA (COMAP-EoR) scenarios assume 57000 (7000) dish-hours at 15 GHz and 110000 (29000) dish-hours at 30 GHz.Both the alternate COMAP-ERA and 'Triple Deluxe' scenarios assume accrual of 30000 dish-hours at 45 GHz.
. The multi-transition survey simulations that include the Ka band all appear to significantly outperform the single-transition survey simulations in recovering the relation between halo mass and CO column density, particularly at the Ku × . . .

Table 3 .
Predictions for signal-to-noise ratios both for individual observables and in total for selected scenarios described in Table2, assuming Model A. COMAP-ERA (COMAP-EoR) scenarios assume 57000 (7000) dish-hours at 15 GHz and 110000 (29000) dish-hours at 30 GHz. 'Triple Deluxe' assumes 30000 dish-hours at 45 GHz, on top of COMAP-ERA sensitivities at 15 GHz and 30 GHz.The noise budget for the 30 GHz observation includes both contributions from the radiometer noise and the CO(1-0) interloper emission from z ∼ 3, unless noted otherwise in the last column.

Table 4 .
Predictions for signal-to-noise ratios both for individual observables and in total for the experimental scenarios described in Table2, assuming Model B. COMAP-ERA (COMAP-EoR) scenarios assume 57000 (7000) dish-hours at 15 GHz and 110000 (29000) dish-hours at 30 GHz. 'Triple Deluxe' assumes 30000 dish-hours at 45 GHz, on top of COMAP-ERA sensitivities at 15 GHz and 30 GHz.The noise budget for the 30 GHz observation includes both contributions from the radiometer noise and the CO(1-0) interloper emission from z ∼ 3, unless noted otherwise in the last column.

Table 5 .
Predicted constraints, given Model A, on kinetic temperature T k , H 2 number density n H2,10 in dark matter halos of mass 10 10 M ⊙ , and the H 2 thermal pressure derived from the product of the two.Also shown for reference are the true parameter values for Model A and the percentiles implied solely by tophat priors bounding the MCMC posterior.Values in bold type indicate shifts away from the prior percentiles by more than 10% of the 90% width of the prior distribution (i.e., by more than 0.15 dex for T k , 0.54 dex for n H2,10 , and 0.57 dex for T k n H2,10 ), unless either bound of the posterior interval falls beyond the corresponding bound of the prior.

Table 6 .
Predicted constraints, given Model B, on kinetic temperature T k , H 2 number density n H2 , and the H 2 thermal pressure derived from the product of the two.Also shown for reference are the true parameter values for Model B and the percentiles implied solely by tophat priors bounding the MCMC posterior.Values in bold type indicate shifts away from the prior percentiles by more than 10% of the 90% width of the prior distribution (i.e., by more than 0.10 dex for T k and 0.54 dex for both n H2 and T k n H2 ), unless either posterior bound falls beyond the corresponding bound of the prior.