Deep Learning for Line Intensity Mapping Observations: Information Extraction from Noisy Maps

Line intensity mapping (LIM) is a promising observational method to probe large-scale fluctuations of line emission from distant galaxies. Data from wide-field LIM observations allow us to study the large-scale structure of the universe as well as galaxy populations and their evolution. A serious problem with LIM is contamination by foreground/background sources and various noise contributions. We develop conditional generative adversarial networks (cGANs) that extract designated signals and information from noisy maps. We train the cGANs using 30,000 mock observation maps, assuming Gaussian noise matched to the expected noise level of NASA's SPHEREx mission. The trained cGANs successfully reconstruct Hα emission from galaxies at a target redshift from observed, noisy intensity maps. Intensity peaks with heights greater than 3.5σ noise are located with 60 % precision. The one-point probability distribution and the power spectrum are accurately recovered even in the noise-dominated regime. However, the overall reconstruction performance depends on the pixel size and on the survey volume assumed for the training data. It is necessary to generate training mock data with a sufficiently large volume in order to reconstruct the intensity power spectrum at large angular scales. Our deep-learning approach can be readily applied to observational data with line confusion and noise.


INTRODUCTION
The large-scale structure of the universe contains rich information on galaxy formation and on the nature of dark matter and dark energy. Line intensity mapping (LIM) is an emerging observational technique that measures the fluctuations of line emission from galaxies and the intergalactic medium. With typically low angular and spectral resolution, LIM can survey an extremely large volume. Future LIM observations aim to detect emission lines at various wavelengths: the H I 21-cm line (e.g., SKA, Koopmans et al. 2015), FIR/sub-millimeter lines such as [C II] and CO (e.g., TIME, Crites et al. 2014), and ultraviolet/optical lines such as Lyα and Hα (e.g., SPHEREx, Doré et al. 2014).
While LIM has the advantage of being able to detect all contributions including emission from faint, dwarf galaxies, there is a serious contamination problem, the so-called line confusion. Because individual line sources are not resolved in LIM observations, foreground/background contamination cannot be easily removed. So far, only a few practical methods have been proposed to extract designated signals. Statistics-based approaches include cross-correlation analysis with galaxies/emission sources at the same redshift (e.g., Visbal & Loeb 2010), and one that utilizes the anisotropic power spectrum shape (e.g., Cheng et al. 2016). Cheng et al. (2020) devise a method based on sparsity modeling that successfully reconstructs the positions and line luminosity functions of point sources from multi-frequency data.
Earlier, in Moriwaki et al. (2020), we proposed a deep-learning approach to solve the line confusion problem. We use conditional generative adversarial networks (cGANs), which are known to apply to a broad range of image-to-image translation problems. Our cGANs learn the clustering features of multiple emission sources and are trained to separate signals from different redshifts. That study showed that deep learning offers a promising method for analyzing data from LIM observations. However, in practice, various noise sources can cause a serious problem: faint emission-line signals from distant galaxies are likely overwhelmed by noise even at the typical level of next-generation observations.
In this Letter, we propose to use cGANs to effectively de-noise line intensity maps. We show that suitably trained cGANs successfully reconstruct the emission-line signals on a map and recover basic statistics of the intensity distribution. All such information extracted from noisy maps can be used for studies of cosmology and of galaxy population evolution.

METHODS
We consider Hα emission from galaxies at z = 1.3, observed at 1.5 µm. Hα is one of the major target lines of future satellite missions such as SPHEREx (Doré et al. 2014) and CDIM (Cooray et al. 2019). We develop cGANs that extract Hα signals from noisy observational data. We first describe how we generate the mock intensity maps used for training and testing. We then explain the basic architecture of our cGANs. Further technical details can be found in Moriwaki et al. (2020).

Training and test data
We prepare a large set of training and test data. We use the fast halo population code PINOCCHIO (Monaco et al. 2013), which populates a cosmological volume with dark matter halos in a manner consistent with the underlying linear density field. We generate 300 (1000) independent halo catalogs with a cubic box of 280 h⁻¹ Mpc on a side for training (test). The smallest halo mass is 3 × 10¹⁰ M☉. We then assign Hα luminosities to the individual halos to obtain a three-dimensional emissivity field. The halo mass-to-luminosity relation is derived using the result of the hydrodynamics simulation IllustrisTNG (Nelson et al. 2019). We assume that the line luminosity is given by a function of the star-formation rate SFR of the simulated galaxy as

L_line = 10^(−0.4 A_line) C_line SFR,

where we adopt A_line = 1.0 mag, and C_line is a coefficient computed using the photoionization simulation code Cloudy (Ferland et al. 2017). We generate two-dimensional Hα intensity maps by projecting the three-dimensional emissivity fields along one direction. For each realization of the training (test) data, 100 maps (1 map) with an area of (0.85 deg)² are generated by projecting random portions of an emissivity field. A total of 30,000 training data and 1000 test data are generated in this manner. The intensity maps are pixelized with the angular and spectral resolution of SPHEREx listed in Table 1.

Table 1. Observational parameters of SPHEREx deep (Doré et al. 2014).
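The projection step described above reduces to binning halo luminosities onto an angular grid. A minimal sketch in Python; the function name, its arguments, and the use of `numpy.histogram2d` are our own illustrative choices, not the authors' actual pipeline:

```python
import numpy as np

def project_to_map(positions, luminosities, box_size, n_pix, depth_range):
    """Project halos within a line-of-sight slab onto a 2D pixel grid.

    positions: (N, 3) comoving coordinates inside the box
    luminosities: (N,) H-alpha luminosities assigned to each halo
    depth_range: (z_min, z_max) slab to project along the third axis
    Returns the summed luminosity per pixel (the conversion to an
    intensity, involving distance and pixel solid angle, is omitted).
    """
    in_slab = (positions[:, 2] >= depth_range[0]) & (positions[:, 2] < depth_range[1])
    edges = np.linspace(0.0, box_size, n_pix + 1)
    grid, _, _ = np.histogram2d(
        positions[in_slab, 0], positions[in_slab, 1],
        bins=[edges, edges], weights=luminosities[in_slab],
    )
    return grid
```

Random portions of the emissivity field, as used in the text, would correspond to drawing random `depth_range` slabs and sub-regions of the grid.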
Finally, realistic mock observation maps are generated by adding Gaussian noise to the Hα intensity maps. We adopt the noise level of "SPHEREx deep", whose 5σ_n sensitivity per pixel at λ = 1.5 µm is 22 mag, corresponding to σ_n = 2.6 × 10⁻⁶ erg/s/cm²/sr. The maps are normalized by 1.0 × 10⁻⁴ erg/s/cm²/sr before being input to the networks.
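This noise model amounts to adding independent Gaussian noise of width σ_n to every pixel and rescaling. A sketch using the quoted SPHEREx-deep values (the constant and function names are ours):

```python
import numpy as np

SIGMA_N = 2.6e-6   # erg/s/cm^2/sr, SPHEREx deep (5 sigma_n = 22 mag at 1.5 um)
NORM = 1.0e-4      # erg/s/cm^2/sr, normalization applied before the network

def make_mock_observation(halpha_map, rng=None):
    """Add Gaussian pixel noise and normalize, per the procedure in the text."""
    rng = np.random.default_rng(rng)
    noisy = halpha_map + rng.normal(0.0, SIGMA_N, size=halpha_map.shape)
    return noisy / NORM
```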

Network architecture
We develop cGANs using the publicly available pix2pix code (Isola et al. 2016). The cGANs consist of two adversarial convolutional networks: a generator and a discriminator. The generator, consisting of 8 convolutional and 8 de-convolutional layers, outputs a map G(x) (the "reconstructed map") from an observed map x. The discriminator, consisting of 4 convolutional layers, returns a value D for the input (x, y) or (x, G(x)), with y denoting the true Hα map. The value D indicates the probability that the input is not (x, G(x)) but (x, y). During the training, the two networks are updated repeatedly in an adversarial way: the generator is updated so that it deceives the discriminator (i.e., D(x, G(x)) should get closer to 1), while the discriminator is updated so that it becomes more accurate (i.e., D(x, y) and D(x, G(x)) get closer to 1 and 0, respectively).
Specifically, the parameters in the generator (discriminator) are updated to decrease (increase) the loss function

L = E_{x,y}[log D(x, y)] + E_{x}[log(1 − D(x, G(x)))] + λ L_L1,

where

L_L1 = E_{x,y}[ ||y − G(x)||_1 ].

Note that we include the additional term L_L1, which is known to ensure better performance by imposing the condition that the values of the corresponding pixels in the true and reconstructed maps should be close (Isola et al. 2016). In each round of training, the loss function is computed with a mini-batch. After some experiments, we set λ = 1000 and a batch size of 4. The networks are trained for 8 epochs. We adopt these parameter values throughout the present study.
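The objective described here follows the standard pix2pix form. A numerical sketch, under our assumptions that the discriminator outputs lie in (0, 1) and that the generator descends while the discriminator ascends this quantity:

```python
import numpy as np

def cgan_loss(d_real, d_fake, y_true, y_gen, lam=1000.0):
    """pix2pix-style loss (sketch, sign convention assumed).

    d_real: discriminator outputs D(x, y) for true pairs, in (0, 1)
    d_fake: discriminator outputs D(x, G(x)) for generated pairs
    y_true, y_gen: true and reconstructed maps for the L1 term
    """
    eps = 1e-12  # guard against log(0)
    l_cgan = np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))
    l_l1 = np.mean(np.abs(y_true - y_gen))  # pixel-wise closeness term
    return l_cgan + lam * l_l1
```

Decreasing this quantity pushes D(x, G(x)) toward 1 and the L1 term toward 0, as the generator update requires; the L1 term does not depend on the discriminator, which only ascends the adversarial part.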
RESULTS
Figure 1 shows the reconstruction performance of our cGANs. Our networks reduce the noise and successfully extract the true Hα signals. It is remarkable that both the source positions and the intensities are reproduced well, even though the observed map is noise-dominated.
A simpler approach in such a noise-dominated case would be to select only pixels with high signal in an observed map. However, we find that, if we select pixels with signals greater than 3.5σ_n from the observed maps, only 20 % of them are true sources (see also Figure 2). With our networks, about 60 % of the reconstructed pixels with intensities greater than 3.5σ_n are real sources. Hence our method significantly outperforms simple signal selection based on the local intensity.
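The precision quoted above can be phrased as a simple mask operation. A sketch, in which the criterion for a "true source" (the noiseless map exceeding the same threshold) is our illustrative assumption:

```python
import numpy as np

def peak_precision(candidate_map, true_map, sigma_n, threshold=3.5):
    """Fraction of pixels above threshold*sigma_n in candidate_map that
    also exceed the threshold in the noiseless map (assumed definition
    of a 'true source')."""
    detected = candidate_map > threshold * sigma_n
    real = true_map > threshold * sigma_n
    n_det = detected.sum()
    if n_det == 0:
        return np.nan
    return (detected & real).sum() / n_det
```

Applied to the observed map this would give the ~20 % figure; applied to the cGAN-reconstructed map, the ~60 % figure.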

Probability distribution function
The probability distribution function (PDF) of line intensity is an excellent statistic that can constrain galaxy populations and their physical properties (e.g., Breysse et al. 2017). We test whether our networks also recover the PDF accurately.
We first note that, in general, a single set of networks does not reproduce the pixel statistics of images/maps robustly. We thus resort to training multiple networks and taking the mean of the statistics reconstructed by the ensemble of networks. This technique, called "bagging", is known to reduce generalization error (Goodfellow et al. 2016), and has been applied, for instance, to de-noising weak lensing convergence maps (Shirasaki et al. 2019). In practice, we average the PDFs reconstructed by 5 networks that are trained with different datasets. Figure 2 compares the PDFs of the true and reconstructed maps. The vertical dashed line indicates the 1σ noise level. Our cGANs are able to reconstruct the PDF of the Hα intensity above the 1σ noise level. Note also that the scatter of the averaged PDF lies within the intrinsic scatter of the true Hα maps, i.e., within the so-called cosmic variance. The networks tend to reconstruct PDFs close to the average; this is simply caused by the bagging procedure. We have checked and confirmed that the variance of the PDFs reconstructed by a single network is as large as the intrinsic one.
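The bagging procedure for the PDF amounts to histogramming each network's output and averaging the histograms. A minimal sketch (the binning choices are ours):

```python
import numpy as np

def intensity_pdf(map2d, bins):
    """Normalized one-point PDF (density histogram of pixel intensities)."""
    pdf, _ = np.histogram(map2d.ravel(), bins=bins, density=True)
    return pdf

def bagged_pdf(maps, bins):
    """Average the PDFs of maps reconstructed by an ensemble of
    networks ('bagging')."""
    return np.mean([intensity_pdf(m, bins) for m in maps], axis=0)
```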

Power spectrum
We further examine the ability of our cGANs to reconstruct the intensity power spectrum. To this end, we again adopt bagging of 5 networks trained with different datasets. The red error bars and shaded regions in Figure 3 show the power spectra of the reconstructed and true Hα maps, respectively. The dashed line indicates the noise power spectrum. We notice that the reconstructed power at large scales (k ≲ 0.5) is systematically underestimated. This might be owing to the finite box size of the training data.

To examine this, we train the cGANs with mock intensity maps with a larger area. For this test, we generate halo catalogs in a cubic volume of 700 h⁻¹ Mpc on a side (see Section 2.1). The minimum halo mass is then degraded to 3 × 10¹¹ M☉, but we have confirmed that the mean Hα intensity (i.e., the total luminosity density) is not significantly different from that with our default box size of 280 h⁻¹ Mpc. We set the side length of the pixel to l' = 2.0 arcmin and the spectral resolution to R' = 41.5. Each map is ten times larger on a side, with an area of (8.5 deg)². The noise level scales with the angular and spectral resolution as

σ'_n = σ_n (l/l') (R'/R)^{1/2},

where l, R, and σ_n are the original angular resolution, spectral resolution, and noise level of SPHEREx (Table 1). The resulting noise level of the wide maps is σ'_n = 1.3 × 10⁻⁷ erg/s/cm²/sr. We adopt the same hyperparameters in the cGANs as in our default case, except that we set λ = 200 and the normalization factor to 1.0 × 10⁻⁶ erg/s/cm²/sr for the low-resolution maps. The purple dots with error bars in Figure 3 show the power spectrum of the reconstructed wide maps. The light-purple shade indicates the 1σ dispersion of the true Hα power spectra. Clearly, the large-scale (low-k) power spectrum is reconstructed more accurately compared to our default case.
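Power spectra such as those compared in Figure 3 can be measured with a standard FFT-based estimator. A minimal 2D sketch; the normalization and binning conventions are our own assumptions, not necessarily those used in the Letter:

```python
import numpy as np

def power_spectrum_2d(map2d, box_size, n_bins=8):
    """Azimuthally averaged power spectrum P(k) of a square 2D map.

    box_size: physical side length of the map (sets the k units).
    Returns bin-averaged wavenumbers and powers.
    """
    n = map2d.shape[0]
    delta = map2d - map2d.mean()                # remove the mean (DC mode)
    fk = np.fft.fftn(delta)
    pk = (np.abs(fk) ** 2).ravel() * box_size**2 / n**4  # 2D P(k) normalization
    kfreq = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    kx, ky = np.meshgrid(kfreq, kfreq, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2).ravel()
    k_edges = np.linspace(kmag[kmag > 0].min(), kmag.max(), n_bins + 1)
    idx = np.digitize(kmag, k_edges) - 1        # DC mode falls outside the bins
    k_mean = np.array([kmag[idx == i].mean() for i in range(n_bins)])
    p_mean = np.array([pk[idx == i].mean() for i in range(n_bins)])
    return k_mean, p_mean
```

For a white-noise map of variance σ², this estimator returns a flat spectrum of amplitude σ²(box_size/n)², which is the expected behavior of the noise power spectrum shown as the dashed line.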
We note that the cGANs trained with the wider maps do not resolve point sources, but the peaks and voids in the reconstructed map correspond closely to the positions of groups/clusters and void regions.
Ideally, networks trained with maps that have fine pixels and a large box size would be able to reconstruct both the positions of point sources (galaxies) and their large-scale clustering. However, the required computational resources would be beyond what is currently available. We thus suggest that one should generate training data depending on the purpose. In order to detect point sources robustly, one needs to train the networks using fine-pixel maps. If the primary purpose is to reconstruct the large-scale power spectrum, for cosmology studies for instance, then one needs to generate maps with a sufficiently large area (volume) but with coarse pixels. The reconstructed power spectra shown in Figure 3 suggest that one should adopt a training-map area at least several times larger than the actual size of the observed maps.

CONCLUSION
We have developed cGANs that effectively reduce observational noise in line intensity maps. We train the cGANs using a large set of mock observations assuming a realistic noise level expected for the SPHEREx mission. Our cGANs can reconstruct the point-source positions and the PDF of the intensity maps. The power spectrum is also reconstructed remarkably well, but the accuracy depends on the area/volume assumed for the training data.
If combined with another set of networks that efficiently separates signals from different redshifts (Moriwaki et al. 2020), the cGANs developed in this study can extract the emission-line signal of galaxies at an arbitrarily specified redshift from noisy maps. Therefore, using data from multi-frequency, wide-field intensity mapping observations, we can reconstruct the three-dimensional distribution of emission-line galaxies. The intensity peaks detected by our cGANs correspond to bright galaxies and galaxy groups with high confidence, which will be promising targets for follow-up observations. Finally, the reconstructed line intensity map essentially traces the distribution of galaxies, and hence of the underlying matter, and thus it is well suited for cross-correlation analysis with other tracers. Accurate reconstruction of statistics such as the one-point PDF and the power spectrum, as shown in this Letter, will allow us to perform cosmological parameter inference and to study galaxy formation and evolution using data from future LIM observations.