Mapping a Lower Limit on the Mass Fraction of the Cold Neutral Medium Using Fourier-transformed H i 21 cm Emission Line Spectra: Application to the DRAO Deep Field from DHIGLS and the HI4PI Survey

We develop a new method for spatially mapping a lower limit on the mass fraction of the cold neutral medium by analyzing the amplitude structure of Tˆb(kv) , the Fourier transform of T b (v), the spectrum of the brightness temperature of the H i 21 cm line emission with respect to the radial velocity v. This advances a broader effort exploiting 21 cm emission line data alone (without absorption line data, τ) to extract integrated properties of the multiphase structure of the H i gas and to map each phase separately. Using toy models, we illustrate the origin of interference patterns seen in Tˆb(kv) . Building on this, a lower limit on the cold gas mass fraction is obtained from the amplitude of Tˆb at high k v . Tested on a numerical simulation of thermally bi-stable turbulence, the lower limit from this method has a strong linear correlation with the “true” cold gas mass fraction from the simulation for a relatively low cold gas mass fraction. At a higher mass fraction, our lower limit is lower than the “true” value, because of a combination of interference and opacity effects. Comparison with absorption surveys shows a similar behavior, with a departure from linear correlation at N H I ≳ 3–5 × 1020 cm−2. Application to the DRAO Deep Field from DHIGLS reveals a complex network of cold filaments in the Spider, an important structural property of the thermal condensation of the H i gas. Application to the HI4PI survey in the velocity range −90 < v < 90 km s−1 produces a full sky map of a lower limit on the mass fraction of the cold neutral medium at 16.′2 resolution. Our new method has the ability to extract a lower limit on the cold gas mass fraction for massive amounts of emission line data alone with low computing time and memory, pointing the way to new approaches suitable for the new generation of radio interferometers.

In recent decades, huge efforts have been made to map the 21 cm emission of Galactic H I (e.g., Taylor et al. 2003;Kalberla et al. 2005;Stil et al. 2006;McClure-Griffiths et al. 2009;Martin et al. 2015;HI4PI Collaboration et al. 2016;Winkel et al. 2016;Beuther et al. 2016;Blagrave et al. 2017;Peek et al. 2011Peek et al. , 2018;;Wang et al. 2020b), as well as H I in nearby galaxies (e.g., Kim et al. 1998;Walter et al. 2008;Koch et al. 2018;Pingel et al. 2022).However, the spatial information about the multiphase and multi-scale nature of the neutral ISM contained in these large hyper-spectral positionposition-velocity (PPV) cubes has remained elusive due to the difficulty in separating the emission from the different phases along each line of sight (see Marchal et al. 2019, and references within).
In recent decades, methods aimed at phase separation of 21 cm emission-only data have been developed to recover information about the physical properties of each phase.Among them, Gaussian decomposition algorithms have focused on modeling the contribution of each phase to H I PPV cubes (Haud 2000;Nidever et al. 2008;Lindner et al. 2015;Kalberla & Haud 2018;Riener et al. 2019;Marchal et al. 2019).Through the use of spatial regularization of the Gaussian model, the code ROHSA (Marchal et al. 2019) was developed to model each phase with spatial coherence across velocities and allowed the thermal and turbulent properties of the WNM to be studied with emission line data only (Marchal & Miville-Deschênes 2021).However, these methods are often computationally expensive and difficult to apply to large data sets.
Furthering this ongoing effort, Murray et al. (2020) have recently introduced a novel approach based on a 1D convolutional neural network (CNN) trained on a library of synthetic observations of emission line and absorption spectra of three-dimensional hydrodynamical simulations of the multiphase ISM (Kim et al. 2013(Kim et al. , 2014)).In contrast to Gaussian decomposition algorithms, Murray et al. (2020) focused on extracting the total cold gas mass fraction, and the correction to the optically thin estimate of H I column density because of optical depth along each spectrum, losing information on the velocity field.In this paper, although focusing on a similar goal (i.e., extracting the total cold gas mass fraction), we develop a new method by analyzing the amplitude structure of T (k ), the Fourier transform (hereafter FT) of T b ( ), the spectrum of the brightness temperature of H I 21 cm line emission with respect to the radial velocity .
Previous studies performing spectral modeling in the k domain, such as the Velocity Coordinate Spectrum (VCS) (VCS, Lazarian & Pogosyan 2000, 2006, 2008;Lazarian 2009), have shown that the power spectrum of spatially averaged 21 cm data cubes contains information about the multiphase structure of the gas (Chepurnov & Lazarian 2006;Chepurnov et al. 2010Chepurnov et al. , 2015)), but the VCS loses spatial information on the plane of the sky and precludes mapping out integrated properties of each phase.Our new method using Fourier transformed spectra exploits valuable information on the amplitude in the k domain along individual lines of sight and we obtain a spatial map of a lower limit on the contribution of the cold phase to the total line emission (i.e., the gas cold mass fraction). 1he paper is organized as follows.In Section 2, we present the interference patterns seen in the FT of emission data from the 21 cm Spectral Line Observations of Neutral Gas with the VLA (21-SPONGE) survey (Murray et al. 2015;Murray et al. 2018) and use toy models to understand their origins.Building on this, we develop a methodology to evaluate a lower limit on the mass fraction of gas below an arbitrary kinetic temperature.Evaluation using a realistic numerical simulation of the neutral ISM is presented in Section 3. In Section 4.1, we apply our new method to emission data alone from 21-SPONGE and compare our lower limits to the cold gas mass fraction obtained by Murray et al. (2018) by joint analysis of emission and absorption data.Application to emission-line data for the DRAO Deep Field (hereafter DF) from the DHIGLS survey and the HI4PI survey in Sections 6 and 7, respectively, shows that the information contained at high k in PPK cubes2 of Fourier transformed emission-line spectra is intimately related to the emission in small spatial scale features often seen in individual channel maps of PPV cubes.A discussion and a summary are provided in Sections 9 and 10, respectively.
where j is the imaginary unit.At k = 0, Tb (0) = T b ( ) d , which is the estimator of the total H I column density in the optically thin limit, N * HI , divided by C = 1.82243 × 10 18 cm −2 (K km s −1 ) −1 .We use the information contained in the first mode to define the k = 0 normalization of the FT data whose amplitude has values in the range [0, 1].

A case from the 21-SPONGE survey
To illustrate the transformation of the data from the velocity domain to the k domain, we used observa-  The matching emission-line spectrum T b ( ) was interpolated from off-source observations with the Arecibo telescope with angular resolution of about 3. ′ 5 and velocity resolution of 0.16 km s −1 (see details about the matching procedure in section 2.4 of Murray et al. 2015).These complementary spectra 3 Publicly available on the 21-SPONGE Dataverse are shown in Figure 1, with the τ( ) spectrum scaled to the same maximum as T b ( ) to highlight the relationship between the two.
Multiple components of cold gas were identified by Murray et al. ( 2018) using a joint multi-Gaussian decomposition of T b and τ.Inferred optical depth, velocity, velocity dispersion, spin temperature, and column density of each component can be found in their table 5.The total mass fraction of cold gas that they detect in absorption along the line of sight against J2232 is 0.30±0.10(see their table 2).
Figure 2 shows the amplitude of k 0 normalized Fourier transformed spectra of T b and τ on a logarithmic scale, using the same color-code as in Figure 1.Several "dips" at similar k are observed in the two amplitude spectra, marked by the vertical dashed lines.Because the absorption spectrum (τ) is revealing only cold gas along the line-of-sight, these dips observed in both the FT of τ and T b suggests that CNM gas plays an important role in shaping the FT of the emission spectrum as well, even though emission includes contributions from H I at all temperatures.Note also that at high k , noise dominates the signal in both spectra.However, in the FT of τ the position of the dips can be traced unambiguously up to relatively high k (low velocity dispersion), about k ≳ 0.4 (km/s) −1 .

Understanding interference patterns in the amplitude of the FT spectrum
To understand the origin of the observed features in the Fourier transformed emission-line spectra, we used a toy model based on the mixing of multiple Gaussians with various amplitudes, velocities, and velocity dispersions (an arbitrary mixture of kinetic and turbulent broadening) to generate simplified mock observations of 21 cm emission data.

Toy model
The generative model is where each of the N Gaussians is parameterized by θ n = (a n , µ n , σ n ), with amplitude a n ≥ 0, mean velocity µ n , and velocity dispersion σ n .Taking the Fourier transform and using the shift theorem, the normalized FT of the model is where is the mass fraction of each Gaussian n (bounded between zero and unity) relative to the optically thin estimator of the total column density, and σ k ,n = 1/(2πσ n ).

Single-Gaussian case
For a simplified model with N = 1 the complex exponential that defines the phase of the signal, parametrized by µ, simplifies when one considers the normalized amplitude of the FT, and can be fully parametrized by a single f = 1 and the corresponding σ k , dropping the subscript n = 1 in Equation 4.
The amplitude of the FT of a single Gaussian is also a Gaussian, centered on k = 0 with a width inversely proportional to the velocity dispersion of the original Gaussian.Therefore, a narrow (broad) Gaussian describing cold (warm) gas T b ( , θ) spectrum will appear broad (narrow) in its FT amplitude spectrum.This is illustrated in Figure 3 by the narrow (σ = 2 km s −1 ) and broad (σ = 7 km s −1 ) Gaussians with µ = 0 km s −1 (i.e., centered at = 0 km s −1 ) (black and blue lines, respectively), chosen to have the same area, representing the same column density of gas.Their corresponding normalized FT amplitude spectra are shown in Figure 4.

Two centered Gaussians
Here the spectral model is the sum of the narrow (black) and broad (blue) Gaussians in Figure 3, though not shown there.Thus, there is twice as much gas, half of which could be classified as cold; i.e., the cold gas mass fraction is 0.5.The normalized amplitude of the FT is given by where f = f 1 = f 2 = 0.5.Because µ 1 = µ 2 there is no information from the phase of the FT (cf.Equation 7), and Equation 6 has the same functional form as the spectral model (i.e., a sum of two Gaussians).But now the broad component comes from the cold gas (black, σ k ,1 = 0.079577 ( km s −1 ) −1 ), while the narrow component originates from the warm gas (blue, σ k ,2 = 0.022736 ( km s −1 ) −1 ).The normalized amplitude is plotted as the cyan curve in Figure 4.

Analogy to Fraunhofer diffraction
It is useful to make an analogy to Fraunhofer diffraction.When a plane wave passes through an aperture, the patterns on a distant screen can be calculated by the interference of secondary wavelets according to the Huygens-Fresnel principle.The diffraction pattern calculated for a sharply bounded transparent aperture has characteristic oscillations, but the diffraction pattern calculated for an aperture with a Gaussian transmission profile is a smooth Gaussian.This can be generalized to the case of multiple combined Gaussian transmission profiles of different widths but all centered at the same position.
Alternatively, in Fourier optics the Fraunhofer diffraction can be written as the FT of the aperture function, whose functional form again determines the shape of the diffraction pattern observed on the distant screen.Here, the brightness temperature T b ( ) can be seen as analogous to a 1D aperture function and so the amplitude of its FT is analogous to the amplitude of the resulting 1D Fraunhofer diffraction pattern.

A "double-slit" analogy
When a plane wave passes through two slits, there is further interference of the secondary wavelets to be accounted for.The resulting diffraction can again be calculated with the Fourier transform.This can be appreciated with a simple double-Gaussian model with a 1 = a 2 = a = 0.5, and σ 1 = σ 2 = 2 km s −1 , and µ 1 = −µ 2 = 6 km s −1 .Note that this model, red in Figure 3, has the same column density of gas as the single Gaussian model (black) with the same σ.
Equation 4 simplifies to where f = 1 and ∆ = µ 1 − µ 2 = 12 km s −1 is the velocity separation between the two Gaussians.This is the same as Equation 5for the single Gaussian, except now modulated by | cos(π ∆ k)|.The normalized amplitude of its FT is shown by the red curve in Figure 4.The spectral resolution was set to 0.1 km s −1 .This modulation affects the Gaussian function only destructively and so the single-Gaussian case (black) forms an envelope to the amplitude of the FT of the double-Gaussian model (red), as is familiar in the Fraunhofer diffraction pattern of any double slit experiment.

Multiple Gaussians
Using Equation 3, we can generate more realistic synthetic spectra of a turbulent fluid with multiple components of various widths representing multiple thermal phases.Specifically, we built six-Gaussian models with one Gaussian assigned to the velocity dispersion range 6 km s −1 < σ < 8 km s −1 (about 4400 K < T k,max < 7800 K), two in the range 3 to 6 km s −1 (about 1100 K < T k,max < 4400 K), and three in the range 0.7 to 2 km s −1 (about 70 K < T k,max < 500 K).The six amplitudes a n were chosen randomly, in the range 0 to 16 K.
A baseline model was made with all components centered at µ n = 0 to produce a non-destructive reference.Next, five more models were built with six velocities µ n randomly chosen in the velocity range −6 km s −1 < µ < 6 km s −1 , to be different for each of these five models.However, the six models share the same amplitudes a n and velocity dispersions σ n and hence have the same mass fractions associated with each of the six individual Gaussians.
Figure 5 shows the brightness temperature of all six models for a case with cold gas mass fraction (the three Gaussians with σ < 3 km s −1 ) about 0.5.The spectral resolution was again set to 0.1 km s −1 .The black dashed curve shows the baseline model with µ n = 0.The five colored curves show models with varying µ n .Figure 6 shows their FT amplitude spectra, using the same color code.As expected from the single and double-Gaussian models, interference patterns are seen in the five models with µ n 0, resulting from the multiple peaks in their T b spectra.As in the double-Gaussian model, interference appears only destructively and the µ n = 0 baseline model provides an envelope for the other five models, regardless of the specific patterns produced by the random positions of each Gaussian.

A lower limit on the cold gas mass fraction
Building on our understanding of patterns produced in the FT amplitude spectrum by the multi-component nature (phase and clouds/turbulent structures) of 21 cm data, here we develop a new method to set a lower limit on the mass fraction of gas with an arbitrary maximum velocity dispersion (maximum kinetic temperature) in PPV cubes of H I data.Specifically, we focused our analysis on the cold gas mass fraction.
Figure 7 shows the normalized amplitude of the FT of single-Gaussian models with 0.5 < σ < 8 km s −1 .Gaussians with larger σ appear narrower in the k -domain.In general, the presence of cold gas (narrow spectral lines) is detectable by the fact that the amplitude of the FT remains high at relatively large k .Furthermore, if there is warm gas contributing to the spectrum, it will not be detectable in the amplitude of the FT at high k , but it will affect the normalization.Therefore, what we can estimate at high k is a lower limit to the cold gas mass fraction of the emitting gas.
Focusing on colder gas, at the vertical black line k = k ,lim = 0.12 ( km s −1 ) −1 gas characterized by a Gaussian with σ = σ lim = 3 km s −1 can be detected in the FT, though only at a threshold fractional level t = t lim = 0.05 of its true total mass.The value of the normalized amplitude is thus an underestimate of the total gas present in an amount dependent on k and σ.
Without measurements of the spin temperature T s , which is close to the kinetic temperature T k in cold gas, we need to consider line widths in the T b spectra.Using the approximation in units of K, σ = 3 km s −1 corresponds to gas with a maximum kinetic temperature of about 1000 K.In reality, accounting for turbulent broadening, T k is probably much less.This can be assessed by looking at the data in the BIGHICAT meta-catalog compiled by McClure-Griffiths et al. (2023) Figure 8. Histogram of σ of absorption line components in the BIGHIHAT catalog (black).The histogram for the CNM subset, defined as having T s < 250 K, is also plotted (red).
(Section 4.4).Figure 8 shows a histogram of σ for all Gaussian components identified in absorption with closely matching profiles in emission.These analyses typically use T s < 250 K to define CNM, not the line width (because of turbulent broadening).With that definition, we overplot the histogram for the subset that are CNM components.This shows that we want an estimator that detects gas with σ up to a few km s −1 .Most of this gas will be CNM and any unstable LNM with be suppressed.As first introduced in Section 2.3.5, destructive interference patterns caused by multiple peaks in the T b spectra can reduce the amplitude of the FT at k = k ,lim , lowering detectability of the contribution to the T b signal coming from Gaussians with σ ≤ σ lim .Note that because of the modulation, higher values of the amplitude might occur at higher k > k ,lim , improving detectability, though still providing a lower limit on the gas mass fraction.
Following this discussion, we define the estimator of the lower limit on the cold gas mass fraction to be Except where specified in the appendices, in the rest of this paper we have evaluated this estimator of the cold gas mass fraction with k ,lim = 0.12 ( km s −1 ) −1 .For compactness, this is denoted f 0.12 low .For other applications based on the maximum kinetic temperature one wishes to probe in the data (e.g., probing the structure of cold gas and unstable gas simultaneously), k ,lim can be chosen judiciously following the methodology previously described pairing σ lim and t lim .
2.4.1.Assessing f 0.12 low for toy models First, we evaluated f 0.12 low for the instructive cases in Figure 4.In the case of a simple spectrum with a single peak, the amplitude of the FT (blue, black) gives a direct quantitative estimate.Our estimator evaluates as 1×10 −7 for the warm gas (blue); i.e., warm is not detectable.For the cold gas (black) the estimate is 0.26 for the selected k ,lim , thus detectable but lower than the true value of the cold gas mass fraction, which is 1.0.In this simple model, only for σ even smaller than 2 km s −1 does this estimator approach unity (Figure 7).For the mixture of warm and cold gas (cyan), the warm gas is not detectable directly but affects the normalization and we find 0.13 compared to the true value 0.5.For the double Gaussian model (red), which has the same amount of cold gas as the black model, the amplitude of the FT is smaller than the black envelope because of the modulation, by a factor depending on the (range of) k examined.Because of this modulation, the estimate of the cold gas mass fraction in more complex spectra can be even lower than the true value; for the red curve we find 0.14, compared to 0.26 for the black curve and the true 1.0.
Similarly, for the more realistic six-Gaussian models in Figure 6, we have evaluated f 0.12 low .Our estimator evaluates as 0.26 for the baseline model (black line in Figure 6), about half of the true input cold gas mass fraction, which is about 0.5.For the related five models with random µ n in the range −6 km s −1 < µ < 6 km s −1 (colored lines), our estimator fluctuates due to interference but is lower than the value inferred from the baseline model.
We have expanded this six-Gaussian experiment to a sample of 2 18 spectra with randomly selected amplitudes and dispersions (hence cold gas mass fractions) and velocities µ n in the same parameter ranges described in connection with Figure 5.The top panel of Figure 9 shows the 2D histogram of the estimator f 0.12 low vs. the actual f (σ < 3 km s −1 ) for modified spectra resulting from setting all µ n = 0, like the baseline model in Figure 5, which is shown here as the black star.For a constant f (σ < 3 km s −1 ), there is a vertical spread of f 0.12 low though still consistent with being a lower limit.This illustrates well that our estimator is sensitive to the specific distribution of the underlying amplitudes and dispersions for all phases.
The bottom panel is based on the original 2 18 spectra including the non-zero µ n .The colored stars are for the other five spectra in Figure 6.This illustrates again that destructive interference lowers f 0.12 low but our estimator is still aptly called a lower limit.We also note that even with interference, f 0.12 low retains a good correlation with f (σ < 3 km s −1 ).
3. ASSESSING f 0.12 low USING A NUMERICAL SIMULATION Because of the simplicity of the toy models, it is unclear if f 0.12 low also retains information about the spatial structure of cold gas in a realistic situation.This is investigated here with the high-resolution hydrodynamical simulation of thermally low and the actual f (σ < 3 km s −1 ) for 2 18 six-Gaussian toy models with random amplitudes and dispersions in the parameter ranges used in Figure 5 (see text).Top: spectra modified to have all µ n = 0.The black star is for the baseline model in Figure 5. Bottom: spectra retaining random µ n in the range −6 km s −1 < µ < 6 km s −1 .The colored stars are for the other five spectra that have interference in Figure 6, which lowers the estimator.

Numerical simulation
In HERACLES, heating and cooling processes were implemented based on the prescription of Wolfire et al. (1995Wolfire et al. ( , 2003))  Specifically, we used part of their 1024N01 simulation (1024 3 pixels and a physical size of the box of 40 pc) characterized by an initial volume density n 0 = 1 cm −3 , a largescale velocity S = 12.5 km s −1 and a spectral weight ζ = 0.2 that controls the modes of the turbulent mixing (here a majority of compressible modes).The initial density corresponds to the typical density of the WNM before thermal instability and condensation, and the large-scale velocity represents the amplitude given to the field that generates large-scale turbulent motions in the box.A stationary state was reached after running the simulation for about 16 Myr out of a total of 40 Myr.The snapshot used here was taken after 16 Myr (Saury et al. 2014).
Figure 10 shows the 2D histogram of thermal pressure P th and volume density n for cells in the numerical simulation, weighted by n.The red, black, green, and blue lines indicate isothermal curves at 8000 K, 1000 K, 500 K, and 100 K, respectively.
To explore the effect of opacity we concentrate our analysis on cells in a 512 × 512 × 1024 pixel region with a large range of cold gas mass fraction.

21 cm line synthetic observations
Following Marchal et al. (2019), we generated synthetic 21 cm H I observations both in the optically thin limit and considering the full radiative transfer, which is sensitive to the optical depth of the line.We used an effective velocity resolution of 0.8 km s −1 in the velocity range −40 < < 40 km s −1 .
In a given cell at spatial coordinates r, the turbulent velocity field of the gas along the line of sight (taken to be the z axis) is z (r, z).The 1D distribution function of the z-component of the Maxwellian velocity distribution in that cell is given by ϕ z (r, z), a normal distribution (Gaussian) shifted by z (r, z), where ∆(r, z) = √ k B T (r, z)/m H is the thermal broadening of the 21 cm line, m H is the mass of the hydrogen atom, k B is the Boltzmann constant, and T (r, z) is the gas temperature.
The general case for the computation of the 21 cm brightness temperature T b ( z , r) is based on the following radiative transfer equation: where τ( z , r, z) is the optical depth of the 21-cm line across the cell defined as where ρ(r, z) is the gas density.In this representation, emission by gas in the cell at z is absorbed by gas in foreground cells at positions z ′ < z.
A complete derivation of T s (r, z) requires determination of the level populations of the hyper-fine state of the 21 cm line (see e.g., section 2.3 in Kim et al. 2014).The three dominant mechanisms are: collisional transitions, direct radiative transitions by 21 cm photons, and indirect radiative transitions (e.g., Lyα scattering).To simplify our calculation of the opacity, we assume here that T s (r, z) = T (r, z), motivated by the fact that collisions dominate the transition between levels in the cold dense phase.In the warmer phase T s might depart from T if the rate of collisional transitions is insufficient to thermalize the gas, resulting in low values of T s in the range 1000-4000 K (Liszt 2001).However, the detection of a widespread warm component in 21-SPONGE studies with ⟨T s ⟩ = 7200 +1800 −1200 K (Murray et al. 2014) or higher,4 suggests that additional excitation mechanisms such as resonant Lyα scattering are present, making T s close to the kinetic temperature of the gas.
In the optically thin limit where the opacity is negligible (i.e., τ( z , r, z) << 1 everywhere), the 21 cm brightness temperature is where L is the total depth of atomic gas along the line of sight.Of course, T * b ( z , r) can be calculated even when the Table 1.Summary statistics of the correlation between f 0.12 low and f (T < T k,max ) for the four cases presented in Figures 13 and 14 gas is not optically thin, and the difference of T b ( z , r) with respect to this reveals the effect of opacity.We generated two additional cubes, T b ( z = 0, r) and T * b ( z = 0, r), both with the turbulent velocity field fixed at 0 km s −1 for each cell in the simulation.These four cubes were used to explore the independent effects of opacity and of interference patterns induced by the turbulent velocity field on the Fourier transformed 21 cm emission line.Figure 11 shows brightness temperature spectra of a randomly selected line of sight within these four variants of the simulated PPV cube to illustrate the shape of the simulated spectra and the effects of opacity and velocity field on the line.

Testing the lower limit
As a baseline, Figure 12 shows a projection of the mass fraction of cold gas in the simulation, obtained by integration of cells with T < T k,max = 1089 K.This is the "true" cold gas mass fraction, denoted f (T < T k,max ). Figure 13 shows maps of f 0.12 low evaluated using Equation 9 on the four different synthetic spectral cubes.For all cases, the map of the lower limit on the cold gas mass fraction strongly resembles the map of the true mass fraction obtained directly from the density field alone ( f (T < T k,max )), shown in Figure 12.The spatial structure of the cold gas mass fraction is highlighted well.
This similarity is quantified in Figure 14 which shows 2D histograms of f 0.12 low vs. f (T < T k,max ) for the four different cases.Table 1 summarizes four statistics of the correlations: the Pearson correlation coefficient r p , the slope and intercept, and the scale-averaged5 cross correlation coefficient ⟨cc(k)⟩.
Regardless of the opacity regime (top row or bottom row in Figures 13 and 14; columns 2 and 3 or 4 and 5 in Table 1), r p is the highest when z (r, z) = 0 (left column in figures; columns 2 and 4 in table ).Interference patterns induced by the turbulent velocity field (right column in figures; columns 3 and 5 in table) lower the slope and spread the linear correlation with increasing "true" cold gas mass fraction, resulting in lower r p .The synthetic spectra are indeed expected to be more complex (possibly showing multiple peaks) and their FTs to be more susceptible to significant interference patterns as the mass fraction of cold gas rises.
Regardless of the velocity field (left or right column), the effect of opacity (bottom row) translates into a saturation of the upper envelope to f 0.12 low at about 0.5.This saturation is intrinsic to the simulated spectra where the intensity of their peaks is lowered (compared to the optically-thin treatment) by the opacity of the line.Its effect is all the more important when there is more cold gas along the line of sight.It can be seen visually in the successive panels in Figure 13 that despite the effects of both opacity and the velocity field, information about the projected spatial structure in Figure 12 is retained.The scale-averaged cross correlation coefficients ⟨cc(k)⟩ between f 0.12 low and f (T < T k,max ) for the four cases are tabulated in Table 1.Across scales and for both treatments of opacity, the imprint of the turbulent velocity field on f 0.12 low removes less than 20% of the spatial correlation with f (T < T k,max ).Qualitatively, the impact of the turbulent velocity field appears as local attenuation on f 0.12 low .This is likely to change the statistical properties of the field (hence the statistics of the corresponding column density field).
4. ASSESSING f 0.12 low WITH POINTED OBSERVATIONS 4.1.Using T b and τ spectra from 21-SPONGE For an initial exploration of H I observations, we return to the line of sight to J2232 (Galactic latitude −38.• 582) discussed in Section 2.2, whose 21-SPONGE survey τ and interpolated T b spectra and FTs are displayed in Figures 1 and  2. Murray et al. (2018) carried out a fairly classical joint analysis of τ and T b spectra, including Gaussian decomposition and exploiting the fact that the optical depth is inversely dependent on the spin temperature T s , to obtain a cold gas mass fraction of f 21 = 0.30 ± 0.10.For our simple estimate, using only the FT of the interpolated emission spectrum T b , Equation 9for the lower limit evaluates to f 0.12 low,Arecibo = 0.08, indeed lower than f 21 , which is expected to be closer to the true value.
The 21-SPONGE dataset covers lines of sight over a range of Galactic latitude with spectra displaying a variety of complexity.We extended the comparison made for J2232 to the entire dataset.A scatter plot is shown in Figure 15, where dots are color coded by absolute Galactic latitude and their sizes encode the complexity of the τ spectra (the number of absorption components, ranging from 0 to 13, identified in the Gaussian decomposition by Murray et al. 2018).The dot with the red edges denotes the line of sight to J2232.
Most of the f 0.12 low,Arecibo values are below the 1:1 line shown by the solid black line (i.e., < f 21 ), which supports our proposition that f 0.12 low would provide a lower limit.The lower limit can approach the true value in some circumstances: at high Galactic latitudes and relatively low cold gas mass fraction ( f 21 ≲ 0.2) and spectral complexity (see encoding of dots), we observe a fair agreement between f 0.12 low,Arecibo and f 21 .See also the discussion of Figure 16 below.However, for f 21 ≳ 0.2 and increasing spectral complexity, f 0.12 low,Arecibo saturates below about 0.1.At lower Galactic latitudes the trend is similar, but the values of f 0.12 low,Arecibo are systematically lower for a given f 21 and saturate below about 0.03.This relative behavior of f 0.12 low,Arecibo is plausibly due to a combination of increasing opacity coupled with a higher complexity of the emission line (i.e., more components seen in absorption, as shown by the size of each dot, is likely associated with a more complex line profile in emission as well), leading to more interference in the FT.
In the Taurus and Gemini Regions, Nguyen et al. ( 2019) observed a strong dependency of the correction to the optically thin H I approximation, R = N H I /N * H I , on the cold gas mass fraction.The authors found that for the CNM mass fraction below 0.2, N * H I and N H I are consistent within the errors.However, for mass fraction higher than 0.2 (and lower than 0.75), R rises from 1 to 2.
A scatter plot of the difference f 21 − f 0.12 low,Arecibo vs. total column density N H I so corrected for opacity is shown in That means that at low column density either our lower limit is close to the true value, which it would tend to be for narrow lines from cold gas such as are favored by absorption line measurements (also low turbulent broadening), or the true value f 21 is also an underestimate.The dot with the red edges denotes the line of sight to J2232 (Figure 2), near this threshold column density.The effect of the combination of increasing opac-  ity coupled with a higher complexity of the line (dot size), especially at low latitude, can be appreciated here again. 6lthough values of f 21 and f 0.12 low,Arecibo are fairly similar at low column density, we do note that seven lines of sight have f 21 − f 0.12 low,Arecibo < 0, which seems to suggest an inconsistency between our lower limit and the cold gas mass frac-  D1).Four of them are consistent within errors but three lines of sight remain unexplained, namely 3C245A, 1055+018, and 3C273.We have explored this further in Section 4.4 (validation on BIGHICAT) where a significant number of lines of sight show the same behavior and cannot be reconciled within uncertainties.

Lower resolution emission line data from HI4PI
We evaluated the applicability of our FT method to 21 cm emission line data with lower spatial and spectral resolution than 21-SPONGE (3.′ 5), namely that from the HI4PI survey, 7 which has a 16. ′ 2 beam and 1.49 km s −1 spectral resolution (HI4PI Collaboration et al. 2016).At the position of each source from 21-SPONGE for which we previously evaluated f 0.12 low,Arecibo , we extracted the on-source T b spectrum in the spectral range −90 < < 90 km s −1 , apodizing it with a cosine function at both ends (Section 6.2).On this extracted HI4PI spectrum, we evaluated our FT based estimate of the lower limit of the cold gas mass fraction, f 0.12 low,HI4PI .Using the on-source spectrum rather than an interpolated spectrum assumes that at the low spatial resolution of HI4PI the peak brightness temperature of the continuum source T c 7 HI4PI is a combined all-sky product based on data from the Effelsberg-Bonn H I Survey (EBHIS, surveyed with the Effelsberg 100-meter radio telescope; Kerp et al. 2011;Winkel et al. 2016) and from the third revision of the Galactic All-Sky Survey (GASS, surveyed with the Parkes telescope; McClure-Griffiths et al. 2009;Kalberla et al. 2010;Kalberla & Haud 2015).
Figure 18.Scatter plot of f 0.12 low,HI4PI vs. f 0.12 low,Arecibo for lines of sight in 21-SPONGE.Color code, size, and annotations are as in Figure 15.The dot with the red edges denotes the line of sight to 3C273 shown in Figure 17.
is diluted so much that the on-source emission spectra do not show the effects of absorption.To quantify this, T c seen by HI4PI at 16. ′ 2 is (Blagrave et al. 2017) where S ν is the source flux density.The background source is reduced by H I absorption producing a negative-going feature in the continuum-subtracted spectrum with profile T c (1 − exp(−τ)).This is to be compared to the emission line profile, which in a gas of constant spin temperature T s is T b = T s (1 − exp(−τ)).Therefore, we want to check that T c is significantly less than T s .We adopt T s = 88 K for cold gas (Blagrave et al. 2017) and note that even for the brightest source in 21-SPONGE, 3C273, T c = 36 K.As an empirical check, in Figure 17 we have compared the brightness temperature spectrum interpolated from offsource spectra around 3C273 from 21-SPONGE (red) and the on-source spectrum extracted from the HI4PI survey (black).Their peak brightness temperatures differ by less than 1 K. Importantly for our estimator of the cold gas mass fraction, the value f 0.12 low,HI4PI = 0.06 for the black spectrum is close to f 0.12 low,Arecibo = 0.036 for the red spectrum.
In addition, we extracted a 2.56 deg 2 HI4PI brightness temperature PPV cube (not shown here) centered on the sources and found no evidence of attenuation in individual channel maps.
In summary, Figure 18 plots f 0.12 low,HI4PI vs. f 0.12 low,Arecibo .Points are scattered around the 1:1 relation shown by the black solid line with a relatively low dispersion that shows that f 0.12 low is quite consistent over spatial resolutions varying from 3. ′ 5 to 16. ′ 2, at least for lines of sight in 21-SPONGE.Although the cold gas emission is likely to arise in small scale structures, its fractional contribution is not being modified significantly within a 16. ′ 2 beam.Furthermore, despite the lower velocity resolution of HI4PI, the cold gas with narrow lines is being reliably detected.

Using a CNN on T b spectra
As mentioned in the introduction, Murray et al. (2020) developed a 1D CNN, trained on simulated data that included opacity information and designed to infer an absolute value of the cold gas mass fraction when applied to emission line data.For the same 21-SPONGE emission data for the J2232 line of sight, using this 1D CNN they obtained f CNN = 0.08 ± 0.03, similar to the lower limit obtained with our FT method.Both f CNN and f 0.12 low,Arecibo are lower than f 21 .We used table D1 in Murray et al. (2020) to produce the scatter plot of f CNN − f 0.12 low,Arecibo against N H I shown in Figure 19 for all the lines of sight at high Galactic latitude (i.e., |b| > 30 • ) used in Murray et al. (2020).The color code, size, and annotations are as in Figure 16.Both the CNN and our FT method are consistent within 10% for N H I ≲ 3 − 5 × 10 20 cm −2 .The FT values are lower than the CNN values above this threshold, which again supports our proposition that f 0.12 low is a lower limit, affected by both the opacity of the line and the complexity of the line shape.

BIGHICAT
We extended the comparison made with 21-SPONGE to the 373 unique lines of sight in the BIGHICAT meta-catalog compiled by McClure-Griffiths et al. (2023).
BIGHICAT combines results from absorption surveys with publicly available spectral Gaussian modeling (Heiles & Troland 2003;Mohan et al. 2004a;Stanimirović et al. 2014;Dénes et al. 2018;Murray et al. 2018;Nguyen et al. 2019;Murray et al. 2021)   Figure 20 shows the location of the 373 lines of sight in BIGHICAT on a Mollweide projection of the total column density map N * H I (on a logarithmic scale) from the HI4PI survey over the velocity range −90 < < 90 km s −1 .The purple crosses show the positions of non-detections and the white dots, whose sizes encode the number of absorption components (ranging from 1 to 16) in their τ spectrum, show the detections.
For comparison to f BIGHICAT , we need an emission spectrum on which to apply our FT method.For a homogeneous product, we used on-source spectra from HI4PI as described in Section 4.2. Figure 21 shows the scatter plot of f BIGHICAT − f 0.12 low,HI4PI vs. N H I color coded by absolute latitude for all lines of sight in BIGHICAT.Dots with red edges show lines of sight from 21-SPONGE.Due to the high correlation observed between f 0.12 low,HI4PI and f 0.12 low,Arecibo (see Figure 18), the distribution for 21-SPONGE lines of sight is similar to that in Figure 16.The comparison with BIGHICAT as a whole shows a similar trend to that of 21-SPONGE only, but many more lines of sight.Again our explanation of the scatter at large column density is that f 0.12 low,HI4PI < f BIGHICAT , with the amount of underestimation by our lower limit f 0.12 low,HI4PI depending on the opacity and the destructive interference introduced by the complexity of the line profile.
However, at low column density there are now many sight lines where our lower limit is higher than the cold gas mass fraction tabulated in BIGHICAT, by nearly 20% (as opposed to less than 10% for 21-SPONGE only).On investigating their distribution on the sky, we found most to be part of the MACH survey (Murray et al. 2021) that targets a small region of the sky located in the first quadrant above the Galactic plane (see the concentration of purple crosses and white dots in Figure 20).This is a region where we found a large cloud of relatively low column density and high cold gas mass fraction when evaluating Equation 9 on HI4PI over the whole sky (see Section 7 and Figure 26 therein).
Interestingly, in their analysis whose results are incorporated into BIGHICAT Murray et al. (2021) reported a case (source named J14434) where their Gaussian modeling of the spectra yielded no detection of CNM in absorption at local velocities while the emission data from EBHIS at 10. ′ 8 resolution show a clear narrow peak at the same velocities.This can be appreciated in their figure 2 (first column and second row).They suggested that this is likely to reflect the nature of small scale structure of the CNM in this region, i.e., a porous medium in which the pencil beam of the absorption line ob- ).The case of J14434 suggests that other lines of sight in this region are likely to be affected by the same effect.EBHIS data appear in HI4PI at degraded 16. ′ 2 resolution and it seems reasonable that a 16. ′ 2 beam probing the same porous medium would exhibit the same effect in emission.The HI4PI spectrum at the position of the source (not shown here) is similar to that of the EBHIS data, also showing a narrow peak at local velocities.In that specific case, our FT approach finds f 0.12 low,HI4PI = 0.1.Because of the underlying structure of the phases within the 16.′ 2 beam, f 0.12 low,HI4PI > f BIGHICAT but f 0.12 low,HI4PI is still a lower limit at the beam size at which the FT was applied.
5. ASSESSING f 0.12 low using the ROHSA thermal phase decomposition of lliv1 Vujeva et al. (2023) used the spectral decomposition code ROHSA (see details in Section 6.3) to model the column density of different thermal phases in the Low-Latitude Intermediate-Velocity Arch 1 (LLIV1) dataset from GHIGLS in the velocity range −81 < < −27 km s −1 (Vujeva et al.   8 Depending on the spatial structure, this seemingly paradoxical situation might be avoided by an emission-line survey at very high spatial resolution.However, in that case T c of the radio source would be very high and any absorption would compromise the on-source spectrum.Interpolating emission spectra from the surrounding annulus would reintroduce the issue of sampling different spatial structures.2023).At each pixel, the ROHSA decomposition is expressed as Gaussian components, each with an amplitude, central velocity (µ), and velocity dispersion (σ).These parameters cluster in a 2D histogram of σ vs. µ (the σ−µ diagram, figure 6 in Vujeva et al. 2023).The Gaussians in clusters with σ ∼ 2 km s −1 are interpreted as from CNM gas, leading to a map of the column density of CNM gas, their figure 7 (upper left); that map, N * HI CNM, ROHSA , is the basis for the test here.We have applied the FT method to the same data cube and produced a map of f 0.12 low N * H I , shown in Figure 22.Visual comparison of these two maps reveals that the FT method retains information about the projected spatial structure, just as we found using simulated data in Section 3.3.However, because f 0.12 low is a lower limit, the column density is lower.This is quantified in Figure 23, a 2D histogram correlating the values in the two maps.
The CNM clusters in the ROHSA σ−µ diagram are centered near 2 km s −1 .If all CNM Gaussians had this dispersion, then their f 0.12 low would give the slope of dark blue line (second lowest), absent destructive interference effects.However, the clusters have some spread.Some Gaussians are as broad as σ = 3 km s −1 , leading to slope 0.05 (lowest line), the fiducial value discussed above, and some are narrower, a few as small as 1 km s −1 .These lines nicely bracket the 2D histogram in Figure 23.
At low N * HI CNM, ROHSA (≲ 2×10 19 cm −2 ), the plateau of low but finite f 0.12 low N * H I (corresponding to the darker regions in Figure 22) arises from map areas with very small values of f 0.12 low (typically below 0.05) from broad WNM emission and noise multiplied by significant N * H I from WNM gas.

APPLICATION TO MAPPED 21 CM DATA FOR DHIGLS DF
The main science goal illustrated by the applications below is to search for regions where there has been a thermal phase transition with condensation of cold dense gas.Searches in the future will need to be done over larger and larger data sets and in different spectral ranges and so require an efficient approach, such as we have developed in our FT-based method.Interesting regions found can then be targeted for analysis with more compute-intensive methods like ROHSA (Marchal et al. 2019) to quantify the properties of the multiphase medium.
We have used the FT method on data from GHIGLS and DHIGLS for a rapid evaluation of other regions and velocity ranges to study.The reader can explore this with the notebook supplied (footnote 1).Two applications are described below, here and in Section 7.

Data
The 57.4 square degree DF dataset used in the application here, located at (α, δ) = (10 h 30 m , 73 • 48 ′ ), was produced in the DHIGLS9 H I survey (Blagrave et al. 2017) with the Synthesis Telescope (ST) at the Dominion Radio Astrophysical Observatory.The 256-channel spectrometer, spacing ∆ = 0.824 km s −1 sampling data with velocity resolution 1.32 km s −1 , was centered at c = −60 km s −1 relative to the Local Standard of Rest (LSR).The spatial resolution of the ST interferometric data was about 0. ′ 9. DF, which encompasses the Spider region (named for its prominent "legs" emanating from a central "body": Marchal & Martin 2023 and references therein), is embedded in the North Celestial Pole Loop (NCPL) mosaic of the GHIGLS H I survey (Martin et al. 2015) with the Green Bank Telescope (GBT), with spatial resolution about 9. ′ 4. The DHIGLS DF product has the full range of spatial frequencies, obtained by a rigorous combination of the ST interferometric and GBT single dish data (see section 5 in Blagrave et al. 2017).The pixel size is 18 ′′ .A map of the total column density in the DF field in the velocity range −15 < < 15 km s −1 is shown in the Figure 24.

Apodization
Real measurements of the 21 cm line frequently include emission from intermediate velocity clouds (IVCs) that overlap the local velocity component (LVC).For example in the NCPL, there is also significant emission bridging velocities in between, which does not show the loop and has some characteristics of IVC, but was distinguished as a Bridge Velocity Component (BVC) by Taank et al. (2022) in their analysis of the multiphase structure of the loop.Building on this, we used T b in the velocity range −15 < < 15 km s −1 , effectively setting T b to zero outside of this range.Introducing this sharp edge (step function) in the signal produces a "ringing" in the Fourier transformed spectrum (Gibbs phenomenon).However, we suppress this by applying a cosine apodization on four channels of the signal at both ends of the spectral range.

Denoising
Evaluation of the cold gas mass fraction using the method described in Section 2.4 is sensitive to noise.This can be appreciated in Figure 2 where noise dominates the signal at high k ,lim values in the amplitude spectrum of the FT of the interpolated T b spectrum (black) of J2232 from 21-SPONGE.To overcome this limitation, we used the ROHSA algorithm (Marchal et al. 2019) to "denoise" the whole hyper-spectral cube of the DF dataset.
ROHSA is a regularized optimization algorithm that decomposes position-position-velocity (PPV) data cubes into a sum of Gaussians (Marchal et al. 2019).ROHSA takes into account the spatial coherence of the emission and its multi-phase nature to perform a separation of different thermal phases.Earlier applications of ROHSA were dedicated to mapping out thermal phases in 21 cm data (e.g., Marchal et al. 2021;Marchal & Miville-Deschênes 2021;Taank et al. 2022;Vujeva et al. 2023).
Here we simply made use of the spatial regularization to obtain a spatially coherent model of the data that concurrently reduces noise.The decomposition used in this work was obtained using N = 6 Gaussians and the hyperparameters λ a = λ µ = λ σ = λ σ ′ = 10, which control the strength of the regularization that penalizes small-scale spatial fluctuations of each Gaussian parameter (amplitude, central velocity, dispersion), measured by the energy at high spatial frequencies (λ a = λ µ = λ σ ), and the regularization of the variance of each velocity dispersion field (λ σ ′ ).This decomposition reproduces the original data with a good χ 2 and we refer to the PPV cube built from the model as "denoised data." 10 Maps of the gas column density, f k ,lim low N * H I , as a function of k ,lim computed from the denoised data and the original data are shown in Figures A.2 and A.3,respectively,in Appendix A.2.At low k ,lim , the maps are similar but the relative 10 This simple decomposition is good enough for denoising DF, then applying the FT method.However, it has not undergone the rigorous analysis and testing needed to recommend it for thermal phase separation in this complex field.Application of a denoising algorithm should be considered based on the maximum kinetic temperature one wishes to probe in the data following the methodology described in Section 2.4 pairing σ lim and t lim to motivate the choice of k ,lim .Because the FT is computationally inexpensive (see Section 9) compared to denoising, we recommend exploratory application of the FT method without denoising the data to judge the quality of f k ,lim low at the selected k ,lim .In our application to Spider with f k ,lim low = 0.12, denoising the data is not particularly beneficial for most of the map, except for areas such as the upper right where the column density is low or around the perimeter where the noise is larger.In those areas, image is less noisy and the contrast between green structure and blue is enhanced.

Results
The top panel of Figure 25 shows f 0.12 low (corresponding to k ,lim = 0.12 ( km s −1 ) −1 ) in DF using the apodized denoised data.The bottom panel of Figure 25 shows the corresponding lower limit on the column density of cold gas, calculated by multiplying the total column density (Figure 24, computed in the optically thin limit) by the cold gas mass fraction (top panel), i.e., f 0.12 low N * H I .Both panels reveal structure attributable to cold gas in the Spider, arranged in a complex network of "filaments." Because f 0.12 low is a lower limit, we cannot discuss the exact spatial distribution of the cold gas, even in projection let alone 3D.However, our tests in Section 3.3 n realistic simulations of HI gas show that information about the projected spatial structure is retained in the FT-based map.Here, applied to actual data, it seems unlikely that the structure that appears filamentary in projection is created at random; rather it reveals an important structural property of the thermal condensation of the H I gas.
In the top panel, the lower limit f 0.12 low varies considerably across the field, ranging from 0 (no CNM) up to 0.3 along the filamentary structures.In the bottom panel, the brightest part of the lower limit on cold gas column density is located near the core of the Spider identified in the total column density map, but the contrast is higher in this map and the structural details differ.
For completeness, we present lower-limit gas mass fraction maps and column density maps for the first nine k ,lim values obtained from the FT of the denoised data, in Figures A.1  and A.2, respectively. 11The PPK cube reveals increasingly colder structures as k ,lim increases and it can be appreciated visually that at high k ,lim there is more structure on small spatial scales.Note that we applied a Fourier transform only along the velocity axis of the PPV cube, with no explicit spatial filtering.The relationship between small spatial scale features and the intensity at high k ,lim can be understood as a direct consequence of the thermal condensation of H I gas.
The network of filaments seen in f 0.12 low N * H I is similar to the filamentary structures revealed by scanning through the PPV cube of the original data.These structures were first illustrated by Blagrave et al. (2017) in their RGB color image of the H I emission using three distinct channels separated by only 2.47 km s −1 (top panel of their figure 24).The dramatic changes in the filamentary structure of H I channel maps over a small velocity separation is captured at high k in the Fourier transformed data, hence the similarities between f 0.12 low N * H I and the structure of individual channel maps.The FT approach and the channel maps provide complementary perspectives on the nature of filaments (also called fibers in the context of the diffuse sky at high latitude, Clark et al. 2014) that are believed to be dominated by cold density structures (Clark et al. 2019;Peek & Clark 2019;Murray et al. 2020).
This qualitative result is consistent with the recent spatial wavelet scattering transform (ST) analysis performed by Lei & Clark (2023).Using a combination of GALFA-HI data (Peek et al. 2011(Peek et al. , 2018) ) and 21-SPONGE data, the authors found that ST-based metrics for small-scale linearity are predictive of the fraction of cold gas along the line of sight.It is noteworthy that the ST approach is based solely on analyzing the multi-scale information of H I emission maps while our FT method is solely applied on the spectral axis of the data cube.This complementarity opens up an interesting avenue for combining both the spatial and spectral information that can be extracted from 21 cm data.
A comprehensive statistical characterization of the correlation between f 0.12 low N * H I (or different k ,lim ) and the structure of individual channel maps is beyond the scope of this paper.This could be achieved using a cross correlation analysis.Similarly, a thorough analysis of the whole PPK cube of DF is beyond the scope of this paper but in the future will provide valuable information about the multiphase and multiscale structure of H I gas in the Spider that is part of the NCPL.Using the HI4PI data (Section 4.2), we computed f 0.12 low,HI4PI in the velocity range −90 < < 90 km s −1 over the full sky.
No denoising was applied to the dataset prior to the Fourier transform because the noise level of T b is fairly uniformly low, about σ rms ∼ 43 mK.Each spectrum was apodized with a cosine function at both ends of the spectral range.
Figure 26 shows Mollweide projections centered on the Galactic center of f 0.12 low,HI4PI (top) and f 0.12 low,HI4PI N * H I in units of 10 19 cm −2 (bottom).Annotations showing results from BIGHICAT are similar to those in Figure 20. 12Comparison of the top map of f 0.12 low,HI4PI with maps made with increasing values of k ,lim above 0.12 ( km s −1 ) −1 (Figure B.1 top in Appendix B) reveals the same morphology but with amplitude decreasing with k ,lim , showing that the features detecting CNM gas are real, not noise dominated.
At high latitudes, these full sky maps show a wide variety of known regions, e.g., the North Polar Spur (Tunmer 1958;Hanbury Brown et al. 1960;Das et al. 2020;West et al. 2021, and reference within), the compressed magnetized shells associated to the Corona Australis molecular cloud (Bracco et al. 2020), or the North Celestial Pole Loop (Meyerdierks et al. 1991;Taank et al. 2022;Marchal & Martin 2023).A comprehensive description is beyond the scope of this paper and will be explored in future works.
Toward the Galactic plane, f 0.12 low,HI4PI is low, but due to the high total column density (computed in the optically thin limit), f 0.12 low,HI4PI N * H I shows a variety of structures with column density comparable to that of the high latitude sky.At these latitudes, a sum of Gaussians as a model to describe the 21 cm line is not a good description of the data, due to the extreme effect of opacity and self-absorption.The Riegel-Crutcher cloud seen toward the Galactic center (Heeschen 1955;Riegel & Crutcher 1972;McClure-Griffiths et al. 2006) is a noteworthy example.H I self-absorption (HISA) seen in its emission spectrum (see, e.g., figure 2 in McClure-Griffiths et al. 2006) is most likely to impact the amplitude structure of Tb (k ).Narrow emission lines in the optically thin limit and HISA will both be highlighted at high k in the Fourier transform of the emission data but whether Equation 9 still provides a lower limit on the amount of cold gas remains unclear and should be investigated in future work.We therefore recommend that f 0.12 low,HI4PI at very low Galactic latitude be taken with caution, especially in places where HISA are known to impact strongly the emission lines.
Figure 27 shows masks of regions with N * H I < 3 × 10 20 cm −2 (yellow), 3 × 10 20 < N * H I < 5 × 10 20 cm −2 (green), and N * H I > 5×10 20 cm −2 (blue).In the green and yellow areas (mainly the high latitude sky), where f BIGHICAT and f 0.12 low,HI4PI Figure 26.Mollweide projection centered on the Galactic center of the lower limit on the cold gas mass fraction f 0.12 low,HI4PI (top) and corresponding lower limit on the cold gas column density f 0.12 low,HI4PI N * H I in units of 10 19 cm −2 (bottom) of the entire HI4PI dataset in the velocity range −90 < < 90 km s −1 .Similar to the annotations in Figure 20, the red crosses in the top panel show the positions of non-detections in the compilation of H I absorption spectra from the BIGHICAT meta-catalog (McClure-Griffiths et al. 2023) and the orange dots, whose size encodes the number of absorption components (ranging from 1 to 16) in their τ spectrum, show the detections in BIGHICAT.The green dots show where detections in BIGHICAT have f BIGHICAT − f 0.12 low,HI4PI < 0. are fairly similar, our lower limit estimate tends toward the absolute value of cold gas mass fraction probed by absorption line surveys (see discussion of Figure 21 in Section 4.4).In the blue areas, our lower limit should be treated as its definition, saying the cold gas mass fraction is at least f 0.12 low,HI4PI but its absolute value remains unknown from this method.We recommend using these column density cuts to determine regions of the sky where f 0.12 low,HI4PI tends towards the absolute value of the cold gas mass fraction and where f 0.12 low,HI4PI should be strictly used as a lower limit.2022) applied the CNN model developed by Murray et al. (2020) to HI4PI data in the same velocity range, −90 < < 90 km s −1 , but only in the high latitude sky (i.e., |b| > 30 • ), obtaining a map of f HI4PI CNN in both Galactic hemispherical caps.
The two estimates of the cold mass gas fraction are compared in Figure 28, an orthographic projection of the difference f HI4PI CNN − f 0.12 low,HI4PI in northern (left) and southern (right) Galactic hemispheres.Figure 29 shows the 2D distribution function of the difference vs. log N * H I .The two estimates are in fairly good agreement, with the mean and standard deviation of the difference f HI4PI CNN − f 0.12 low,HI4PI being −0.0065 and 0.033, respectively.However, there are coherent regions of positive and negative excursions around the mean, predominately in the range 30 < |b| • < 60.Positive values are predominantly in regions of higher column density, where the CNN, through its training, performs better at correcting for opacity effects (see figure 8 in Murray et al. 2020).
Negative values are found in more localized clouds, prominent at, e.g., (ℓ, b) = (75 • , 35 • ) (which corresponds to the region covered by the MACH survey and is discussed in Section 4.4) and (ℓ, b) = (135 • , 55 • ).The emission of the first cloud is in the low velocity range (LVC).The second cloud is a bright (T b ≈ 26 K) IVC at about −50 km s −1 . 13The CNN and the FT methods have both been applied to the same emission spectra at 16. ′ 2 resolution and so unlike for the comparison between our lower limit and the BIGHICAT estimate based on absorption, this discrepancy cannot be explained by the beam of the observation.
We have investigated the IVC cloud, where the difference between the two methods is the most negative.Its column density map N * HI in the velocity range −90 < < 90 km s −1 from HI4PI is shown in the left panel in Figure 30.Figure 30 (middle) shows a map of f 0.12 low,HI4PI .It has a structure similar to that of the total column density within the IVC cloud, a shape clearly recognized in the Figure 28.We have extracted a spectrum at (ℓ, b) = (136.3• , 54.7 • ) in a bright substructure of the cloud (annotated by the red star in Figure 30) where f 0.12 low,HI4PI is quite high (0.28). Figure 31 shows this T b spectrum in light blue.We have decomposed the spectral data with a Gaussian model, where the number of Gaussian (N = 10) was selected based on a reduced chi-square criterion.The total model is shown in black.Individual Gaussians with σ < 3 km s −1 (σ > 3 km s −1 ) are shown in blue (red).The gas mass fraction in the cold Gaussians (blue) evaluates to 0.67, higher than our lower limit f 0.12 low,HI4PI .On the other hand, the map of the estimate f HI4PI CNN (Figure 30, right) is overall very close to 0; while it also shows a structure of the same shape, its mean value is curiously negative, about -0.04, lower than that of the background of about 0.01.This failure of the CNN probably arises because the IVC-dominated spectra are far different than the training set used for the CNN.In support of this, if we translate the IVC peak in the spectrum in Figure 31 to be centered on 0 km s −1 , then the CNN method does detect CNM gas, with f HI4PI CNN = 0.22.

DISCUSSION
The FT method presented in this work has disadvantages as well as advantages.By focusing on only the amplitude information of the PPK cube, the phase information is inevitably lost and only the average information about the amount of cold gas remains available to some extent.The impact of the velocity field on the amplitude of the FT (i.e., the interference patterns) constrains us to evaluating a lower limit on the mass fraction of cold gas.
However, this partial loss of information of the velocity field, as well as the absence of modeling in comparison to Gaussian decomposition algorithms, makes our approach highly efficient in terms of computing time and so a unique tool to explore where the phase transition has taken place.Note that the computing time is not reduced in the case where ROHSA is used to denoise the data prior to the FT but, as discussed in Section 6.3 and Appendix A.2, denoising only becomes critical at high k ,lim and the cold gas mass fraction map inferred at k ,lim = 0.12 ( km s −1 ) −1 using the original data is still practicable for further analysis.As a reference, the PPK cube of the DF field with dimensions 60, 1680, and 1764 along the spectral and two spatial axes, respectively, was obtained in about a minute on a single CPU using the real fast Fourier transform algorithm imple-mented in NumPy.The lower resolution full sky HI4PI data were Fourier transformed in about ten minutes on a single CPU.The computation time scales linearly with the number of pixels because each line of sight is treated independently while applying the FT.This offers the possibility of straightforward parallelization on multiple CPUs.Furthermore, almost no additional memory is required as the operation can be performed independently for each spectrum.Both the low computing time and memory make this method easily applicable to massive data sets, pointing the way to new approaches suitable for the new generation of radio interferometers like the Australian Square Kilometer Array pathfinder (ASKAP, Dickey et al. 2013;Pingel et al. 2022), the Square Kilometer Array (SKA, McClure-Griffiths et al. 2015;de Blok et al. 2015), as well as the Next Generation Very Large Array (ngVLA, Pisano et al. 2018).
To overcome the loss of information of the velocity field and fully probe the fraction of cold gas as a function of velocity, it is necessary to go beyond the statistics encoded in the amplitude of the FT.Future research ought to examine the Short-Time Fourier Transform (STFT) and/or wavelet transform, which extract information from the k domain while preserving information in the time domain (i.e., here the velocity domain), to facilitate progress on understanding the multiphase (multi-scale in terms of velocity) nature of the medium traced by 21 cm data.

SUMMARY
We develop a new method for spatially mapping a lower limit on the mass fraction of the cold neutral medium by analyzing the amplitude structure of Tb (k ), the Fourier trans-  .Individual Gaussians with σ < 3 km s −1 (σ > 3 km s −1 ) are shown in blue (red).The gas mass fraction in cold Gaussians evaluates to 0.63 and our lower limit estimate f 0.12 low,HI4PI = 0.28.
form of T b ( ), the spectrum of the brightness temperature of H I 21 cm line emission with respect to the radial velocity .This development is supported by multiple experiments including a basic understanding of the FT spectrum using toy models, evaluation on a realistic simulation of the neutral ISM, validation using absorption line surveys, and application to real data.The main conclusions are as follows.
• Using toy models, we illustrate the origin of interference patterns seen in Tb (k ).Building on this, a lower limit on the cold gas mass fraction is obtained from the amplitude of Tb at high k .• Tested on a numerical simulation of thermally bi-stable turbulence, the lower limit from this method has a strong linear correlation with the "true" cold gas mass fraction from the simulation for relatively low cold gas mass fraction.At higher mass fraction, our lower limit is lower than the "true" value likely due to a combination of interference and opacity effects.
• Comparison with absorption surveys shows that our lower limit is close to the absolute value obtained from a combination of emission and absorption data for a total column density along the line of sight N H I ≲ 3 − 5 × 10 20 cm −2 , with a departure from linear correlation at higher column densities also due to a combination of interference and opacity effects.
• Application to the DRAO Deep Field (DF) from DHIGLS reveals a complex network of filaments in the Spider, an important structural property of the thermal condensation of the H I gas.Our estimator f 0.12 low varies considerably across the field, ranging from 0 (no CNM) up to 0.3 along the filamentary structures.The brightest parts in the map of the lower limit on the cold gas column density are located near the core of the Spider identified in the total column density map.
• Application to the HI4PI survey in the velocity range −90 < < 90 km s −1 provides a full sky map of a lower limit on the mass fraction of the cold neutral medium that is fairly consistent with results from the CNN model produced by Murray et al. (2020).
Although highly efficient in terms of computing time, the main limitation of our FT approach is the loss of information encoded in the velocity field.Future research ought to examine the Short-Time Fourier Transform and/or wavelet transform, which extract information from the k domain while preserving information in the velocity domain.
We acknowledge support from the Natural Sciences and Engineering Research Council (NSERC) of Canada.A.B. acknowledges support from the European Research Council through the Advanced Grant MIST (FP7/2017-2022, No.742719).This research has made use of the NASA Astrophysics Data System.This work began under the program "Grand Cascade" organized and hosted by Institut Pascal at Université Paris-Saclay and the Interstellar Institute.We appreciate enlightening conversations with J.E.G.Peek, C. Murray, N. Pingel, and other members of the Interstellar Institute.We appreciate enlightening conversations with Van Hiep Nguyen.We thank the anonymous referee whose comments and suggestions have improved this manuscript.
2. FOURIER TRANSFORM OF 21 CM DATA 2.1.Fourier transform and k = 0 normalization The discrete FT of brightness temperature data T b ( ) as a function of radial velocity is Tb (k ) = T b ( ) exp (−2 π j k ) d ,

Figure 1 .
Figure 1.Brightness temperature T b of the absorption source J2232 from the 21-SPONGE survey (black).Its corresponding optical depth τ, normalized at the peak of T b , is shown in red.

Figure 2 .
Figure 2. Amplitude of the normalized FT of T b (black) and τ (red) of the absorption source J2232 from the 21-SPONGE survey.The vertical dashed lines shows the position of dips observed in both amplitude spectra at similar k .

Figure 3 .
Figure 3. Brightness temperature spectrum T b of mock models.The black and blue curves show single-Gaussian models with no radial velocity (i.e., centered at = 0 km s −1 ) and σ = 2 km s −1 , and σ = 7 km s −1 , respectively.The red curve shows a double-Gaussian model with velocity separation ∆ = 12 km s −1 .

Figure 4 .
Figure 4. Normalized amplitude of the FT of the brightness temperature spectrum T b of toy models.Colors are the same as in Figure 3, with the addition of the cyan curve for the model with two centered Gaussians in Section 2.3.3.

Figure 5 .
Figure 5. Brightness temperature spectrum T b of six-Gaussian toy models of a multi-Gaussian component multiphase gas.The black dashed curve shows T b of the baseline model with µ n = 0, and randomly selected a n and σ n (see text).Colored lines show T b of five more models with the same a n and σ n but with µ n chosen randomly in the range −6 km s −1 < µ < 6 km s −1 .

Figure 6 .
Figure 6.Normalized amplitude of the FT of the brightness temperature spectrum T b of toy models of a multi-Gaussian component multiphase gas.Annotations are as in Figure 5.

Figure 7 .
Figure 7. Normalized amplitude of the FT spectrum of single Gaussians with various width σ.The horizontal dashed line denotes the mass fraction f = 0.05 and the vertical line its corresponding k for the Gaussian with σ = 3 km s −1 .

Figure 9 .
Figure9.2D histograms of the estimator f 0.12 low and the actual f (σ < 3 km s −1 ) for 2 18 six-Gaussian toy models with random amplitudes and dispersions in the parameter ranges used in Figure5(see text).Top: spectra modified to have all µ n = 0.The black star is for the baseline model in Figure5.Bottom: spectra retaining random µ n in the range −6 km s −1 < µ < 6 km s −1 .The colored stars are for the other five spectra that have interference in Figure6, which lowers the estimator.
, including cooling by [CII], [OI], recombination onto positively charged interstellar grains, and Lyman α, and heating dominated by the photo-electric effect on small dust grains and PAHs.In their simulation suite, Saury et al. (2014) considered a spatially uniform radiation field with the spectrum and intensity of the Habing field (G 0 /1.7,where G 0 is the Draine flux: Habing 1968; Draine 1978).

Figure 10 .
Figure 10.Pressure vs. volume density histogram weighted by volume density.The red, black, green, and blue lines indicate isothermal curves at 8000 K, 1000 K, 500 K, and 100 K, respectively.

Figure 11 .
Figure 11.Brightness temperature spectra of a randomly selected line of sight within the four variants of the simulated PPV cube: T * b ( z = 0, r) (dashed black), T * b ( z , r) (dashed blue), T b ( z = 0, r) (black), and T b ( z , r) (blue).

Figure 12 .
Figure 12.Cold gas mass fraction with T < T k,max obtained directly from the simulated density field.

Figure 13 .
Figure 13.Map of lower limit on the cold gas mass fraction f 0.12 low evaluated with Equation 9 for T * b ( z = 0, r) (top left), T * b ( z , r) (top right), T b ( z = 0, r) (bottom left), and T b ( z , r) (bottom right).Compare to f (T < T k,max ) in Figure 12.

Figure 14 .
Figure 14.2D distribution function of f 0.12 low evaluated with Equation 9 vs. f (T < T k,max ), for T * b ( z = 0, r) (top left), T * b ( z , r) (top right), T b ( z = 0, r) (bottom left), and T b ( z , r) (bottom right).The solid black line shows the 1:1 line.Dashed lines show linear fits using the bisector estimator.
Figure 16.Values of f 21 and f 0.12 low,Arecibo are fairly similar for N H I ≲ 3 − 5 × 10 20 cm −2 , which is close to the typical column density, N H I ≈ 5 × 10 20 cm −2 , where opacity effects start to become apparent in various absorption line surveys (see e.g., figure 2 in Nguyen et al. 2018, showing R vs. the opacitycorrected N H I for data in the Arecibo Millennium Survey (Heiles & Troland 2003) and 21-SPONGE).

Figure 15 .
Figure 15.Scatter plot of f 0.12 low,Arecibo vs. f 21 for all sources from the 21-SPONGE survey color coded by absolute Galactic latitude.The size of each dot encodes the number of absorption components (ranging from 0 to 13) identified in the Gaussian decomposition of the optical depth spectrum by Murray et al. (2018).The solid black line shows the 1:1 line.The dot with the red edges denotes the line of sight to J2232.

Figure 16 .
Figure 16.Scatter plot of f 21 − f 0.12 low,Arecibo vs. N H I color coded by absolute Galactic latitude.The dashed vertical line denotes the typical column density where the effects of opacity become significant.The size code is as in Figure 15.The dot with the red edges denotes the line of sight to J2232.

Figure 17 .
Figure 17.Comparison between the brightness temperature spectrum interpolated around 3C273 (source with the highest T c ) from 21-SPONGE (red) and the on-source spectrum extracted from the HI4PI survey (black).

Figure 19 .
Figure 19.Scatter plot of f CNN − f 0.12 low,Arecibo vs. N H I .Color code, size, and annotations are as in Figure 16.

Figure 20 .
Figure 20.Mollweide projection centered on the Galactic center of the total column density map N * H I (in units of 10 19 cm −2 , on a logarithmic scale) from the HI4PI survey over the velocity range −90 < < 90 km s −1 .The purple crosses show the positions of non-detections in the compilation of H I absorption spectra from the BIGHICAT meta-catalog (McClure-Griffiths et al. 2023).The white dots, whose size encodes the number of absorption components (ranging from 1 to 16) in their τ spectrum, show the detections in BIGHICAT.

Figure 21 .
Figure 21.Scatter plot of f BIGHICAT − f 0.12 low,HI4PI vs. N H I color coded by absolute latitude.Color code, size, and annotations are as in Figure 16.The dots with red edges show lines of sight from 21-SPONGE.The locations of lines of sight with a negative difference are shown with green dots in Figure 26.

Figure 22 .
Figure 22.Lower limit on the cold gas column density f 0.12 low N * H I of Low-Latitude Intermediate-Velocity Arch 1 from GHIGLS in the velocity range −81 < < −27 km s −1 .The color bar is in units 10 19 cm −2 .

Figure 23 .
Figure 23.2D histogram (logarithmic color bar) demonstrating the pixel by pixel correlation of the map of f 0.12 low N * H I with the CNM column density map obtained using ROHSA (figure 7 upper left of Vujeva et al. 2023).Both axes are in units 10 19 cm −2 .The values from the FT method are clearly below the 1:1 line, by amounts bounded by the other lines discussed in the text.

Figure 24 .
Figure 24.Total column density N * H I of the DF dataset from DHIGLS in the velocity range −15 < < 15 km s −1 , computed in the optically thin limit.The solid white and black lines show the N * H I = 3 × 10 20 cm −2 and N * H I = 5 × 10 20 cm −2 contours.

Figure 25 .
Figure 25.Lower limit on the cold gas mass fraction f 0.12 low (top) and corresponding lower limit on the cold gas column density f 0.12 low N * H I (bottom) of the DF dataset from DHIGLS in the velocity range −15 < < 15 km s −1 .
proportion of noise compared to the sky signal increases with k ,lim to a point where filamentary structures are hardly visible in Figure A.3.Decomposition with ROHSA concurrently reduces noise at high k ,lim , retaining spatial information about the very cold structures in the Spider in Figure A.2.
7. APPLICATION TO HI4PI DATA FOR THE WHOLE SKY11 Note that at high k ,lim , denoising the data prior to the FT becomes critical, as is evident in Figure A.3, which shows the same column density maps as Figure A.2 but computed with the original data.
Multiple k ,lim limitsWe present in Figures A.1 and A.2, respectively, the gas mass fraction maps and column density maps for the first nine k ,lim values obtained from the FT of the denoised data of the DF field from DHIGLS.A.2.Denoising the data prior to the FTTo provide a baseline for evaluating the impact of the denoising of the DF data using ROHSA on the PPK cube, we computed the FT of the original data.Figure A.3 shows the column density f k ,lim low N * H I as a function of k ,lim .This can be compared panel by panel with Figure A.2 for the denoised data, presented with consistent color bars.At low k ,lim , Figures A.2 and A.3 are very similar.However, proceeding toward higher k ,lim , the maps in Figure A.3 become increasingly noisier, up to a point where the filamentary structures are hard to distinguish from the ambient noise (see especially the map at the highest k ,lim ).B. HI4PI In Figure B.1 we present the same two quantities but for nine k ,lim values sampled in the range 0.01 < k ( km s −1 ) −1 < 0.27.

Figure A. 2 .
Figure A.2. Column density f k ,lim low N * H I as function of k ,lim computed from the denoised data of the DF field from DHIGLS.

Figure A. 3 .
Figure A.3.Column density f k ,lim low N * H I as function of k ,lim computed from the original data (no denoising) of the DF field from DHIGLS.

Figure B. 1 .
Figure B.1.Mollweide projection centered on the Galactic center of the lower limit on the cold gas mass fraction f k ,lim low,HI4PI (top) and the lower limit on the cold gas column density f k ,lim low,HI4PI N * H I in units of 10 19 cm −2 (bottom) of the entire HI4PI dataset in the velocity range −90 < < 90 km s −1 .fixme . As McClure-Griffiths et al. (2023) emphasize, BIGHICAT is heterogeneous in terms of sen-