Predicting the Radiation Field of Molecular Clouds Using Denoising Diffusion Probabilistic Models

Accurately quantifying the impact of radiation feedback in star formation is challenging. To address this complex problem, we employ deep-learning techniques known as denoising diffusion probabilistic models (DDPMs) to predict the interstellar radiation field (ISRF) strength based on three-band dust emission at 4.5, 24, and 250 μm. We adopt magnetohydrodynamic simulations from the STARFORGE project that model star formation and giant molecular cloud (GMC) evolution. We generate synthetic dust emission maps matching observed spectral energy distributions in the Monoceros R2 (MonR2) GMC. We train DDPMs to estimate the ISRF using synthetic three-band dust emission. The dispersion between the predictions and true values is within a factor of 0.1 for the test set. We extended our assessment of the diffusion model to include new simulations with varying physical parameters. While there is a consistent offset observed in these out-of-distribution simulations, the model effectively constrains the relative intensity to within a factor of 2. Meanwhile, our analysis reveals a weak correlation between the ISRF solely derived from dust temperature and the actual ISRF. We apply our trained model to predict the ISRF in MonR2, revealing a correspondence between intense ISRF, bright sources, and high dust emission, confirming the model’s ability to capture ISRF variations. Our model robustly predicts radiation feedback distribution, even in complex, poorly constrained ISRF environments like those influenced by nearby star clusters. However, precise ISRF predictions require an accurate training data set mirroring the target molecular cloud’s unique physical conditions.


INTRODUCTION
Stellar feedback plays a crucial role in the star formation process, manifesting in two main forms: mechanical feedback and radiative feedback (Fall et al. 2010;Girichidis et al. 2020).Mechanical feedback involves the injection of momentum and kinetic energy into the surrounding clouds through stellar winds, including protostellar outflows and isotropic stellar-wind-driven bubbles (Churchwell et al. 2007;Arce et al. 2007;Frank xuduo117@virginia.edu et al. 2014).Conversely, radiative feedback is associated with the dissociation and ionization of cold molecular gas by the intense radiation emitted by massive stars (Walch et al. 2012;Grudić & Hopkins 2019;Rosen & Krumholz 2020).This radiation also exerts pressure on the surrounding gas and dust, resulting in the formation of ionized bubbles known as H ii regions, which release a substantial amount of energy (Lopez et al. 2014).
Recent studies have highlighted the significant impact of stellar feedback on the star formation process.Simulations show that mechanical feedback, such as outflows and stellar winds, reduces protostellar masses and ac-cretion rates, as well as disperses surrounding gas, leading to a decrease in both the global star formation rate and efficiency (Matzner 2007;Federrath et al. 2014;Federrath 2015;Offner & Chaban 2017;Guszejnov et al. 2022).However, the energy injection from outflows is typically limited to smaller scales, ranging from subparsec to parsec scales (Wang et al. 2010;Xu et al. 2022a).In contrast, the combined effects of photoionization and radiation pressure from massive stars and their H ii regions result in the heating of the surrounding gas and efficient dispersal of the nearby cloud (Dale et al. 2012(Dale et al. , 2013)).The radiation feedback from massive stars can have a broad impact, spanning scales from a few parsecs to tens of parsecs (Walch et al. 2012;Lopez et al. 2014;Rosen & Krumholz 2020;Guszejnov et al. 2022;Grudić et al. 2022;Rosen 2022).
In order to gain a comprehensive understanding of how stellar feedback influences the star formation process, including star formation rate and efficiency, it is crucial to study feedback mechanisms across molecular clouds with varying physical and chemical conditions.However, accurately quantifying the impact of radiation from massive stars continues to present a challenge in observational studies.There are several current "classical" approaches to estimating the radiation field from observations.For example, the strength of the radiation field originating from massive stars is commonly estimated using dust emission (Bernard et al. 2010).However, the mean dust temperature as derived from long-wavelength emission gives an incomplete picture of local conditions.Pound & Wolfire (2023) developed a framework using the ratio between far-infrared (FIR) fine-structure lines, such as [O I], [C I], and [C II], to estimate the strength of the radiation field through photodissociation region (PDR) models.However, PDR models rely on simplified assumptions about the cloud geometry and density distribution, leading to uncertainties when applied to actual observational data.This approach also does not provide an accurate estimate of the radiation field within the cloud due to young embedded sources.Other PDR codes, such as 3D-PDR (Bisbas et al. 2012), offer the advantage of allowing for arbitrary density distributions and the ability to specify radiating sources within the cloud, addressing some of the uncertainties mentioned earlier.However, degeneracy due to physical conditions imposes a significant limitation in using line ratios to determine the strength of the radiation field, since different number densities and radiation field strengths can produce the same line ratio (Pound & Wolfire 2023).Additionally, mapping FIR lines across molecular clouds is time-consuming, especially in quies-cent regions where [O I], [C I], and [C II] emission is relatively faint.
By comparison, machine learning provides a promising avenue for improving the estimation of physical variables given relatively limited observational data.There has been a proliferation in machine learning-based approaches to predict physical quantities from observational data across various fields, including solar physics (Asensio Ramos & Díaz Baso 2019), interstellar medium (Peek & Burkhart 2019;Xu et al. 2020aXu et al. ,b, 2022a,b),b), and in the realm of galaxies and cosmology (Wu & Boada 2019;Neutsch et al. 2022).Machine learning provides a powerful tool to study mechanical stellar feedback as it enables complex morphological features, previously only detectable by visual inspection, to be identified quickly and reliably.Recent studies have developed and employed a deep learning method called casi (Convolutional Approach to Structure Identification) to systematically identify protostellar outflows and wind-driven bubbles in nearby molecular clouds using molecular line data cubes (Van Oort et al. 2019;Xu et al. 2020aXu et al. ,b, 2022a)).
Recently Denoising Diffusion Probabilistic Models (DDPMs) have emerged as powerful and reliable tools for image generation (Sohl-Dickstein et al. 2015;Ho et al. 2020), which have shown great potential in addressing prediction tasks within the field of astronomy.Smith et al. (2022) employed DDPMs to generate synthetic images resembling observed galaxies, achieving a high level of realism.In another study, Wang et al. (2023) utilized DDPMs to enhance image quality and suppress noise in interferometric observations.Furthermore, Xu et al. (2023) applied DDPMs to infer the number density of molecular clouds-a parameter notoriously challenging to measure based on readily obtainable column density maps.DDPMs exhibit superior accuracy in predicting molecular cloud number density, underscoring their effectiveness and reliability in the estimation task.
In this paper, we employ a deep learning approach based on DDPMs to estimate the radiation field strength induced by massive stars within molecular clouds.Specifically, we utilize multiple bands of dust emission to infer the radiation field strength.In Section 2, we elucidate the diffusion model utilized in our analysis and delineate the procedure employed to generate the training set from magnetohydrodynamic (MHD) simulations.Subsequently, in Section 3, we comprehensively evaluate the performance of our diffusion model in predicting the radiation field strength.Additionally, we apply our diffusion model to actual observational data, as detailed in Section 3. Finally, we consolidate our findings and draw conclusions in Section 4.

Magnetohydrodynamics Simulations
We employ MHD simulations acquired from the STARFORGE project (STAR FORmation in Gaseous Environments, Grudić et al. 2021).The project introduces a novel numerical framework for conducting threedimensional radiation MHD simulations of star formation, allowing for a comprehensive examination of multiple processes.These processes encompass the formation, accretion, evolution, and dynamics of individual stars within massive giant molecular clouds (GMCs), while considering the intricate effects of stellar feedback.Stellar feedback mechanisms taken into account include jets, radiative heating and momentum, stellar winds, and supernovae.
The simulations in the STARFORGE project utilize the GIZMO code (Hopkins 2015), which incorporates the mesh-free Lagrangian MHD (MFM) method.Specifically, the star cluster formation is simulated within a GMC characterized by an initial mass of 2×10 4 M ⊙ and a radius of 10 pc (Grudić et al. 2022;Guszejnov et al. 2022).The initial magnetic field strength is set to 2 µG, and the cloud possesses an initial virial parameter of 2. The simulations achieve a mass resolution of 10 −3 M ⊙ and span an evolutionary timeframe of approximately 9 Myr.The interstellar radiation field (ISRF) default configuration is scaled to the background SED of the Solar neighborhood, with the Draine (1978) value of G 0 = 1.7 in the FUV band.Additionally, we employ simulations where the ISRF is intensified by a factor of 10 and 100, corresponding to G 0 = 17 and G 0 = 170 (Guszejnov et al. 2022).This alternative setup enables us to assess the performance of the machine learning model under stronger radiation field conditions.Significantly, the simulations account for stellar feedback by incorporating accretion-and fusion-powered stellar radiation in five distinct frequency bins.These bins include H-ionizing (λ < 912 Å), far ultraviolet (912 Å< λ < 1550 Å), near ultraviolet (1550 Å< λ <3600 Å), optical-to-near infrared (3600 Å< λ < 3µm), and far infrared (λ > 3µm) ranges.It is important to highlight that our work incorporates simulations spanning different evolutionary stages, ranging from 2 Myr to 8 Myr.This wide temporal range encompasses both early and late stages of star formation as well as the evolution of GMCs.
In addition to the full physics simulations, we incorporate a specific simulation that emphasizes the impact of stellar winds and radiation feedback while deactivating the presence of jets (Guszejnov et al. 2022).This alter-native simulation configuration introduces slight variations in the physical setup and enhances the diversity of cloud morphologies within the simulations.By including this simulation in our analysis, we expand the range of training data and further enrich the training set for our machine learning model.We provide a summary of the adopted simulations in Table 1.For further comprehensive information regarding the STARFORGE project, additional details can be found in Grudić et al. (2021).

MonR2 Observations
We adopt the Monoceros R2 (MonR2) GMC as an observational test case.MonR2 is well observed at all bandpasses of interest for this project.MonR2 is located 860 pc away, is 33,000 M ⊙ , and hosts over 900 young stellar objects with excess IR emission indicative of dusty circumstellar material such as protoplanetary disks or protostellar envelopes (Pokhrel et al. 2020).Thus, our fiducial STARFORGE calculation provides a reasonable representation of the MonR2 region given its cloud mass, evolutionary stage and level of star formation activity.
We adopt SESNA (Spitzer Extended Solar Neighborhood Archive; Gutermuth et al. in prep.;Pokhrel et al. 2020) Spitzer (Werner et al. 2004) mosaics at 4.5 µm from the Infrared Array Camera (IRAC; Fazio et al. 2004) instrument and 24 µm from the Mid-Infrared Photometer for Spitzer (MIPS; Rieke et al. 2004) instrument.For the 250 µm image, we use the Herschel (Pilbratt et al. 2010) Spectral and Photometric Imaging REceiver (SPIRE; Griffin et al. 2010) image from Pokhrel et al. (2016) that included an absolute calibration correction to Planck High Frequency Instrument (HFI; Planck HFI Core Team et al. 2011) data of the same region of sky.
The trained model as described in §2.3 operates on physical scales of 1/8 parsec per pixel, which translates to 30 ′′ per pixel at MonR2's distance.For our analysis we resample all three infrared images to a common pixel grid set by the IRAC 4.5 µm image, the highest resolution data of the collection at 2. ′′ 2 beam width and 0. ′′ 87 pixel size (MIPS 24 µm is 6. ′′ 3 beam width and 1. ′′ 8 per pixel; SPIRE 250 µm is 18 ′′ beam width and 6 ′′ per pixel).Since the beam resolutions of all three images are less than our final pixel scale, we simply box-average and down-sample the flux into the desired 30 ′′ pixel size grid.We then apply a mask to limit consideration to those pixels with coverage in all three bandpasses.The resulting coverage spans an area of 5.23 deg 2 .This treatment was applied using standard routines for these tasks (e.g., hastrom, hrebin) from the IDL Astronomy User's Library (Landsman 1993).

Synthetic Dust Observations
To calculate the dust temperature and generate synthetic dust emission at multiple wavelengths, we employ the 3D radiative transfer code radmc-3d (Dullemond et al. 2012).STARFORGE uses a subgrid model for protostellar evolution (Offner et al. 2009) and stores the luminosity, radius, and effective temperature of each source.Due to computational constraints, it is not feasible to assign a unique stellar spectrum to each individual star.Instead, we categorize the stars into four groups based on their effective temperature: <2000 K, 2000-5000 K, 5000-10000 K, and >10000 K.For each category, we calculate the mean effective temperature by considering the mean luminosity and mean surface area of the stars within that category.The calculation of the mean effective temperature is solely based on the stars that are located within the domain where the radiative transfer is conducted.An example of star categorization in one simulation snapshot is presented in Appendix A.
We explore two different approaches for modeling the stellar spectrum, specifically the spectral energy distribution (SED), in our study: the "Blackbody SED" and the "YSO SED."The Blackbody SED assumes a blackbody spectrum based on the mean effective temperature of the stars.However, circumstellar disks play a significant role in shaping the SEDs of young sources (Whitney et al. 2003;Robitaille et al. 2007;Offner et al. 2012).In the STARFORGE simulations, the formation of circumstellar disks is suppressed due to strong magnetic braking.The YSO SED accounts for the emission reprocessing (e.g.extinction, absorption and remission, scattering) caused by these (missing) disks.We adopt the stellar spectra with disks from Robitaille (2017).For each category of stars, we retrieve the SED from the table in Robitaille (2017) by selecting the one that closely matches the effective temperature and stellar radius of the star.The table also includes different inclination angles for young stars with disks.In our approach, we adopt the spectrum with an inclination angle that is closest to 45 • .
In the radiative transfer calculation, we employ two different dust models depending on the gas number density.For gas number densities exceeding 10 5 cm −3 , we utilize the dust model proposed by Koepferl et al. (2017) for dense gas.This model consists of three dust compositions: 80.63% big grains (>200 Å), 13.51% very small grains (vsg, 20-200 Å), and 5.86% ultrasmall grains (usg, <20 Å) in the form of polycyclic aromatic hydrocarbon (PAH) molecules.On the other hand, for gas num-ber densities below 10 5 cm −3 , we adopt the dust model developed by Hensley & Draine (2023) specifically designed for diffuse gas.This model incorporates two dust components: astrodust (90.69%) and PAH (9.31%).In Appendix B, we investigate various cutoffs on gas number densities when selecting dust models, as well as explore different dust models.This exercise illustrates that the choice of dust model is crucial to reproduce the observed SEDs; our hybrid model reproduces the relative fluxes in the three bands significantly better than the canonical Draine & Lee (1984) model or either of the two models alone.
Dust heating in molecular clouds is influenced by multiple mechanisms, with radiation from stars and the ISRF playing dominant roles.It is important to mention that the simulation data used in this study does not include the saved dust temperature during the simulation runs.The gas temperature is not a good proxy for the dust temperature, as they differ by an order of magnitude in shocks and in lower density regions, where the dust are gas are not well-coupled.Consequently, using the gas temperature in place of the dust temperature in the radiative transfer would result in a substantial difference in the calculated dust emission, spanning several orders of magnitude.To address this, we utilize the radmc-3d package to calculate the dust temperature in post-processing.
Given that GIZMO utilizes a Lagrangian meshless finite mass method rather than a Cartesian grid, we employ the yt toolkit (Turk et al. 2011) to sample the simulation data and transform it into a uniform Cartesian grid.We use this processed data as the input for radmc-3d in our analysis.We assume a gas-todust ratio of 100 and incorporate the Henyey-Greenstein anisotropic scattering model in the radiative transfer calculation.
Due to computational constraints, we generate the synthetic dust images for each 10 × 10 × 10 pc 3 box with an image resolution of 80×80 pixels.We verify that the radiative transfer results remain robust regardless of the resolution.In Appendix B, we present the radiative transfer simulations with a resolution of 256×256, and we show that the resulting SEDs are consistent with those obtained at lower resolutions.
To account for stars located near the box boundaries, we apply an additional post-processing step.After generating the initial synthetic dust images with dimensions of 80×80 pixels, we crop a 1 pc boundary on all four sides of the image.This results in a final image size of 64×64 pixels, representing an 8 × 8 pc 2 sky area.
Figure 1 illustrates the synthetic dust emission at 4.5 µm, 24 µm, and 250 µm, considering the two differ-  ent treatments for the stellar spectrum, a blackbody SED and a YSO SED, at various evolutionary stages.The figure also presents the projected radiation field averaged by the radiation energy along the line-of-sight.The synthetic dust emission SEDs are sensitive to the choice of stellar spectrum.The SEDs generated with blackbody SEDs as the radiating sources exhibit two distinct peaks, representing the contributions of stellar radiation in the optical to near-infrared range and dust emission of the cloud material in the mid-to far-infrared range.In contrast, the SEDs generated with the radiating sources modeled as YSOs display an infrared excess from 1-10 µm.This difference arises because the YSO SEDs exhibit higher levels of infrared emission due to emission reprocessing by the dust in circumstellar disks.We note that neither set of synthetic images exhibit out- flow features, which often appear in these bands (Looney et al. 2007;Tobin et al. 2008;Takami et al. 2010).Some of the excess observed emission likely arises from shockexcited H 2 and CO lines (Cyganowski et al. 2008), which are not included in our radiative transfer modeling step.

Constructing the Training Set
In this study, we utilize three specific bands of dust emission, namely 4.5 µm, 24 µm, and 250 µm, as input for training the machine learning model to predict the radiation field at the pixel level.These bands cover both near-infrared and far-infrared dust emission and, importantly, encompass information that is well-modeled by our training set. Figure 2 presents a gallery of SEDs for all the synthetic data, with darker colors indicating a higher number of stacked SEDs.Observations from the MonR2 GMC are included in the figure for reference, showcasing the Spitzer bands (3.6 µm, 4.5 µm, 5.8 µm, 8.0 µm, 24 µm) and Herschel bands (250 µm, 350 µm, 500 µm).The synthetic SEDs demonstrate a broad range of coverage for the observed data points.Nonetheless, there is an evident discrepancy in the synthetic SEDs, particularly in the 8 µm emission, where it is noticeably underestimated.This discrepancy is caused by strong PAH emission, indicating that the adopted dust model does not replicate the observed PAH emission adequately.Part of the discrepancy is likely due to the absence of non-thermal excitation mechanisms, such as shocks, which are not included in post-processing.It is important to note that the presence of H 2 lines at 2.22 µm, 2.41 µm, 2.63 µm, and 3.00 µm may cause contamination in the 3.6 µm bandpass.Additionally, the strong aromatic infrared bands at 3.30 µm, 6.20 µm, 7.70 µm, 8.60 µm, 11.30 µm, and 12.70 µm can potentially contaminate the 3.6 µm, 5.8 µm, and 8 µm bands (Foschino et al. 2019).As our project aims to infer the ISRF using a limited amount of data, it is crucial to ensure that the synthetic data closely resembles the real data.To mitigate this issue and achieve better performance, we have excluded bandpasses that include H 2 lines and strong PAH feature emission (3.6 µm, 5.8 µm, 8.0 µm) in this study.
To enhance the model's capability to handle real observational data, which may include the presence of foreground and background stars, we randomly introduce bright false sources that simulate such stars in the 4.5 µm images.These sources exhibit a two-dimensional Gaussian intensity distribution, where the peak intensity is randomly selected as a fraction between 0.1 and 1 of the 99.5th percentile of the 4.5 µm images.For each synthetic dust map, we utilize the projected radiation field averaged along the line-of-sight, weighted by radiation energy, as the target for the machine learning training.In summary, the input consists of three-band dust emission maps (3 × 64 × 64), while the target or model output is the corresponding projected radiation field (64 × 64).
We generate a total of 9,750 synthetic dust maps, encompassing various evolutionary stages, feedback configurations, and treatments for the stellar spectrum.To evaluate the performance of the machine learning model, the data is randomly split into an 80% training set and a 20% test set.
To comprehensively assess the performance of the machine learning model across different environments, we generate an additional 162 synthetic dust observations.These synthetic observations are generated from simulations with an ISRF strength of G 0 = 17 and G 0 = 170.Importantly, these new synthetic dust observations are entirely distinct from the training data, allowing us to evaluate the model's performance on previously unseen data.

Denoising Diffusion Probabilistic Models
Diffusion models, also known as denoising diffusion probabilistic models (DDPMs), are state-of-the-art generative methods used in deep learning and computer vision research (Sohl-Dickstein et al. 2015;Ho et al. 2020;Rombach et al. 2022).These models leverage probability theory and stochastic processes to effectively model and reconstruct data, with a focus on modeling the conditional distribution of clean data given noisy observations.By estimating the underlying distribution of the data, DDPMs capture its statistical properties, patterns, variations, and complexities.
The primary objective of DDPMs is to denoise and reconstruct the original signal from noisy or corrupted data.By modeling the distribution of clean data and incorporating diffusion processes, DDPMs excel at recovering true underlying structure while suppressing noise.The diffusion process is a key component, governing the evolution of the data distribution over time.
The DDPM starts with a simple initial distribution and gradually transforms it into the target distribution, which represents the conditional distribution of clean data.This transformation occurs through a sequence of diffusion steps, involving diffusion and denoising operations.Controlled noise is introduced during diffusion to guide the data along a diffusion path, followed by a denoising step that estimates the clean data from the noisy observations.Typically, deep neural network architectures like CNNs are employed in the denoising step, training them to map noisy observations to clean data.Figure 3 illustrates an example of the reverse process applied to our test data, demonstrating the gradual conversion of Gaussian noise into our desired target over 1000 time steps.
In our work, we adopt the same diffusion model described in Xu et al. (2023), where a detailed mathematical explanation of DDPM formulation is provided.To train our diffusion model for the task of reconstructing the radiation fields based on three-band dust emission maps, we follow the training strategy outlined in Xu et al. (2023).

Assessing the Performance of the Diffusion Model
In this section, we assess the performance of the diffusion model in predicting the radiation field strength based on the three-band dust emission.We begin by evaluating the diffusion model's performance on the test set.Figure 4 illustrates the predicted radiation field, alongside the three-band dust emission and the ground truth radiation field.By visual comparison, the diffusion model accurately predicts the radiation field at the pixel level.
To further evaluate the performance, we present a 2D histogram depicting the correlation between the diffusion model's predictions and the ground truth values of the radiation fields in Figure 5.The histogram demonstrates a strong alignment between the diffusion model's predictions and the ground truth radiation field values.The deviation between the true value and the predicted value is within a factor of 0.1.To provide a more interpretable representation of the radiation energy, we convert the radiation field energy into ISRF luminosity in the solar neighborhood.This conversion is achieved by adopting the mean intensity integrated over frequency from Mathis et al. (1983), where 4πJ = 0.0217 erg cm −2 s −1 .
We further investigate the comparison between the traditional approach, which estimates the ISRF from dust temperature, and the actual ISRF in the test set.Bernard et al. (2010) proposed an analytical formula to estimate the ISRF based on dust temperature in GMCs, following a power law: where the dust emissivity index β is assumed to be 2, which is a good fit to our adopted dust model at long wavelengths.This equation follows from the balance of dust absorption and emission in the diffuse ISM, assuming an ambient ISRF with an SED like the ISRF in the Solar neighborhood.Therefore, it should describe the low-extinction parts of the cloud fairly well but not the inner parts subject to extinction and irradiation by protostars.
Figure 6 depicts the correlation between the true ISRF and the dust temperature calculated using radmc-3d.We observe a weak or even unclear linear trend between these two quantities.Since the plot is in log scale, the pattern appears similar to the ISRF T−Dust vs ISRF True plot, but with different magnitudes in their values.For reference, the one-to-one line of ISRF T−Dust and ISRF True is also shown in Figure 6.The inferred ISRF from the dust temperature exhibits a notable offset from the true ISRF, mainly due to the extinction of radiation from stars within the cloud by dust along the line-of-sight.This offset diminishes at low ISRF values, where line-of-sight extinction is minimal.Overall, this highlights a significant level of uncertainty, with scatter and offset of over a factor of 10, in the traditional approach for estimating the ISRF from dust emission as compared to our machine learning approach.

Testing on Out-of-Distribution Data
Although the test set was not included in the training process, both the synthetic dust observations in the test set and the training set originate from the same sequence of MHD simulations.As a result, the diffusion model is potentially capable of learning all the intricacies within the synthetic dust emission and achieving unfairly accurate predictions, which could be significantly less accurate when applied to more diverse data.To address the possible issue of over-fitting, we evaluate the performance of the diffusion model on unseen data.This includes synthetic images created using different dust models and entirely novel MHD simulations featuring diverse physical parameters.Since simulations can never perfectly model observations, even when great care is taken to match physical conditions and to include relevant physical effects, our training data is by definition out-of-distribution compared to the observational data.
The tests presented here thus provide a more realistic assessment of the prediction accuracy of the model applied to observations.

Different Dust Models
First, we evaluate the diffusion model's performance using synthetic dust images generated with alternative dust models, distinct from those used in the training set.It's important to note that the simulation data employed for this evaluation is identical to that used for generating the standard training and testing datasets.Therefore, the primary difference between the test data in this assessment and the standard training data lies in the selection of dust models.We consider two extreme scenarios, namely, the pure K17 model (Koepferl et al. 2017) and the pure HD23 model (Hensley & Draine 2023).
Figure 7 illustrates the correlation between the predictions of the diffusion model and the actual ground truth values of radiation fields in this evaluation.The histogram shows a strong alignment between the diffusion model's predictions and the actual radiation field values.The deviation between the true and predicted values is within a factor of 0.2, slightly larger than that observed in the fiducial test set described in Section 3.1.This suggests that the diffusion model can make robust predictions even when applied to the same sequence of MHD simulations with different dust model setups.Hence, the choice of a different dust model does not significantly impact the performance of the diffusion model during training and prediction.

Higher ISRF
Next, we assess the diffusion model's performance on new simulations characterized by significantly higher ISRF values.These simulations involve boosting the ISRF by factors of 10 and 100, resulting in ISRF intensities of G 0 = 17 and G 0 = 170, respectively.Figure 8 illustrates the radiation field predicted by the diffusion model on these simulations.Despite the large difference in the ISRF, the diffusion model still produces predictions similar to the ground truth.However, there are some discrepancies: for instance, in the third and fourth rows on the left, the presence of dotted strong radiation regions in the actual radiation field are not accurately recovered by the diffusion model.These spots are not traced by any band of the dust emission, which explains why the diffusion model may fail to recover them.the offset between the diffusion model's predictions and the ground truth values for the 10 times higher ISRF is approximately 0.25 dex, corresponding to an underestimation factor of 1.8.Similarly, the offset for the 100 times higher ISRF is about 0.43 dex, equivalent to an underestimation factor of 2.7.Nevertheless, the relative ISRF intensity is well constrained, as the predictions and the ground truth values still exhibit a logarithmic-linear correlation, with a dispersion of 0.5.Consequently, we conclude that the diffusion model is capable of providing a reasonably accurate estimation of the radiation field even when the true field is orders of magnitude different than that of the training set.However, if a more precise estimation of the radiation field in extremely high ISRF regions is desired, it is advisable to retrain the diffusion model using an appropriate synthetic dataset.

Higher Density & Updated Heating/Cooling
Lastly, we evaluate the diffusion model's performance using a novel set of MHD simulations featuring different initial conditions.These simulations involve driving turbulence for two crossing times as described in Lane et al. (2022), resulting in elevated gas density within the molecular cloud and well-developed turbulence throughout.In addition, these new simulations incorporate an updated radiative cooling and heating scheme, utilizing the cooling module shared with the FIRE-3 simu- lations (Hopkins et al. 2023).In contrast, the fiducial simulations used for training relied on a simpler fitting function based on tabulated CLOUDY results (Ferland et al. 1998), accounting for local density, temperature, and metallicity.The updated MHD simulation applied in this assessment offers a more detailed representation of heating and cooling processes, encompassing all major molecular, atomic, nebular, and continuum interactions, to better capture the thermal state of the cold ISM (Hopkins et al. 2023).This fresh batch of simulations allows us to assess the diffusion model's performance under extreme out-of-distribution conditions.
Figure 10 displays the radiation field predictions generated by the diffusion model for these updated simulations.Despite significant differences in initial conditions and heating/cooling approaches, the diffusion model still generates predictions that closely resemble the ground truth, albeit with some discernible variations.Notably, the global background ISRF values predicted by the diffusion model are notably elevated compared to the ground truth.
To more precisely assess this divergence, we present a 2D histogram in Figure 11, illustrating the correlation between the diffusion model's predictions and the actual radiation field values for these fresh simulations.The diffusion model still performs reasonably well but exhibits a systematic offset and some variability in its predictions.It's worth noting that in these new simulations, the diffusion model appears to consistently overestimate the ISRF by approximately a factor of 3.Moreover, the dispersion in the predictions is approximately a factor of 2.
It's possible that the updated heating and cooling methods and/or the increased density within the clumps and cores in the new simulations systematically lead to a decrease in the ISRF values.This is evident from the fact that the highest ground truth ISRF value in the new simulations is considerably smaller than that in the fiducial simulations, as observed in Figure 5 and 7.
Consequently, it is crucial to stress that for accurate ISRF predictions in a real molecular cloud require an appropriate training dataset reflecting the specific physical conditions of that cloud.One should exercise caution when interpreting machine learning model predictions, especially regarding their absolute values.However, the diffusion model is capable of correctly capturing relative intensity variations across various out-ofdistribution datasets.This capability provides a promis- ing avenue for assessing the relative ISRF strengths within observed molecular clouds.

Testing on MonR2
In this section, we employ our diffusion model to analyze the actual dust observations in MonR2.The input for the diffusion model consists of three-band dust observations: 4.5 µm, 24 µm, and 250 µm.To ensure consistency in physical scales, we convolve the MonR2 dust observation to a similar physical resolution as our training data.Due to the larger size of the MonR2 dust map in terms of pixel count on both dimensions and the fixed image size requirement of the diffusion model (64 × 64), we employ a strategy that involves cropping the large map into smaller postage stamps with dimensions of 64 × 64 and a step size of 2 pixels.After the prediction phase, these postage stamps are combined by averaging the predictions for each pixel, resulting in the reconstruction of the original large map. Figure 12 presents the diffusion model's prediction of the radiation field.The regions of strong radiation field predominantly align with areas of intense dust emission, which aligns with our intuitive understanding that dust is most heated in regions with relatively strong radiation fields.Notably, there is some blue dotted emission that is primarily highlighted in the 4.5 µm band.These bright dots likely represent foreground and/or background stars that are not associated with MonR2.The diffusion model's radiation field prediction appears to successfully exclude these contaminants.Our findings demonstrate that the average projected line-of-sight ISRF in MonR2 is notably higher than the fiducial value of 1 G 0 .This can be attributed to several factors.First, MonR2 is an active star-forming region where the radiation field is dominated by forming stars, leading to a significant increase in the level of ISRF.Additionally, our predicted ISRF is averaged along the line-of-sight into the molecular cloud without being extincted by dust, providing a more accurate estimation of the actual impact of radiation feedback across the cloud.This information is crucial for further analyses of molecular clouds, including for investigating the influence of radiation feedback on the core mass function or variation of turbulent properties.
We next perform statistical analyses to investigate the correlation between the predicted ISRF and dust emission.Figure 13 presents 2D histograms illustrating this correlation at 4.5 µm, 24 µm, and 250 µm in MonR2.The dust emission at 4.5 µm does not exhibit a clear overall trend with the predicted ISRF.The presence of numerous blue dots, likely foreground and/or background stars unrelated to MonR2, creates a branch in the middle of the 2D histogram where the dust emission increases while the ISRF remains relatively constant.Similarly, no distinct trend is observed between the dust emission at 24 µm and the predicted ISRF.However, a positive correlation is evident between the dust emission at 250 µm and the predicted ISRF.For comparison, we investigate the correlation between the predicted ISRF and dust emission in the synthetic test data, as detailed in Appendix C. We observe a similar positive trend between the dust emission at 250 µm and the predicted ISRF in the test data.
Typically, longer wavelength emission, such as observed by Herschel at 160 µm, 250 µm, 350 µm, and 500 µm, is commonly used to estimate the dust column density and temperature.The dust temperature, in turn, can be used to estimate the radiation field, as shown in Equation 1.However, as discussed in Section 3.1 and shown in Figure 6, there is only a limited correlation between the ISRF and the dust temperature in the synthetic test data.Our study extends beyond these longer wavelengths to include the analysis of shorter wavelength emission.By considering this broader wavelength range, we obtain a more accurate estimation of the stellar radiative feedback and dust emission.
Finally, we investigate the relationship between the predicted ISRF, the column density and the dust temperature, utilizing the column density map and the dust temperature maps derived by Pokhrel et al. (2016).Figure 14 showcases the column density map and the dust temperature map of MonR2.The correlation between the predicted ISRF and the column density, as well as the dust temperature, appears to be limited.While certain regions of strong ISRF coincide with areas of high column density and high dust temperature, this relationship is not consistent.Some regions with high column density and/or high dust temperature do not exhibit a strong ISRF.Figure 15 displays 2D histograms illustrating the correlation in MonR2.Notably, we do not observe a clear trend between the ISRF and the dust temperature in contrast to typical modeling assumptions (e.g., Bernard et al. 2010).Although a positive correlation can be discerned when the dust temperature exceeds 20 K (log T=1.3), it is accompanied by significant scatter.

Dust Emission
It is important to highlight that the black line, representing the ISRF inferred from the grey-body dust temperature, is significantly offset from the DDPMpredicted ISRF values.This discrepancy may suggest that radiation from embedded stars is highly attenudated, resulting in much cooler dust temperatures.Another possibility is that the dust consists of multiple temperature components, with the colder components dominating the emission in the Herschel band used to derive this temperature map.Similarly, no distinct pattern emerges between the predicted ISRF and the col-umn density.These findings suggest that the dust emission at 250 µm is influenced by factors beyond just column density and dust temperature, including the presence of radiation.

CONCLUSIONS
We produce synthetic dust observations of MHD simulations from the STARFORGE project, which incorporate various physical processes to simulate star formation and GMC evolution.Using these synthetic observations, we trained deep learning diffusion models to estimate the radiation field strength based on three-band dust emission at 4.5 µm, 24 µm, and 250 µm.We evaluated the performance of the diffusion model on both 2. We utilized deep learning diffusion models to estimate the strength of the radiation field and assessed its performance on the test set.The diffusion model successfully reconstructed the radiation field strength at the pixel level, generally recovering the true value within 10%.
3. We find that the relationship between ISRF and dust temperature exhibits a high degree of scatter, such that a simple grey-body model for dust emission does not accurately predict the radiation field.
4. We evaluated the performance of the diffusion model using synthetic dust images created with different dust models than those employed in the training set.The results indicate that the diffusion model can accurately predict the ISRF, with errors within a 20% range.This suggests that the diffusion model is not overly sensitive to our choice of dust model and can provide reliable predictions even when applied to MHD simulations with varying dust model configurations.
5. We assessed the performance of the diffusion model on new MHD simulations featuring an ISRF that is 10 and 100 times higher than that of the fiducial simulations.The diffusion model was still able to predict the ISRF reasonably accurately, although there was a systematic underestimation factor of 1.8 and 2.7 for the 10 and 100 times higher ISRF, respectively.
6.We assessed the performance of the diffusion model using new MHD simulations featuring updated heating/cooling and initial turbulent driving.The model performed satisfactorily but consistently overestimated the ISRF by a factor of 3, with a dispersion of approximately a factor of 2. This overestimation is likely due to the updated heating/cooling treatments, which systematically causes lower ISRF values in the simulations relative to simulations in the fiducial training set.
7. The evaluation on the out-of-distribution dataset underscores the resilience of the diffusion model in predicting the relative ISRF levels within a single molecular cloud.While there are systematic offsets in the absolute ISRF values, the relative intensity of the ISRF is accurately estimated with a dispersion of up to a factor of 2.
8. We employed the diffusion model to predict the radiation field in MonR2 using observed dust emission.The diffusion model successfully captures the locations of intense radiation field regions, which corresponded to areas with high dust emission.We find a positive correlation between the predicted ISRF and the dust emission at 250 µm with a large degree of scatter.
Although the model performs well overall, we stress that the test simulations, which represent out-ofdistribution data, produce significantly less precise model predictions.This suggests that the uncertainties associated with the predicted ISRF represent upper limits on the accuracy of the ISRF predicted from observational data, as by definition, our training set is out-ofdistribution compared to actual observations.Therefore it is important to adopt an appropriate training dataset that reflects the particular physical conditions of the targeted molecular cloud as accurately as possible to ensure the most precise ISRF prediction.
In future work, we plan to extend the application of the diffusion model to additional archived dust observations of nearby molecular clouds, as well as to nearby galaxies.This approach will allow us to study the im-pact of radiation fields on molecular clouds and star formation in a broader context.
We would like to express our gratitude to the anonymous referee for their invaluable suggestions, particularly those pertaining to the assessment of out-ofdistribution data.DX acknowledges support from the Virginia Initiative on Cosmic Origins (VICO).In this section, we provide an example of categorizing stars based on their effective temperature in radiative transfer.Due to computational limitations, it is not feasible to perform radiative transfer for each individual star in the simulation.Instead, we employ a binning approach to group stars into four categories based on their effective temperature.Figure 16 illustrates the projected stellar radiation density and corresponding SEDs for a snapshot of simulations at 4.5 My without jets, as depicted in the third row of Figure 1.The stellar field is represented by blob-like structures, deposited onto the grid with a full width at half maximum (FWHM) of 3 pixels.The SEDs for each star category are displayed, including the blackbody SED and the young stellar object (YSO) SED with disks from Robitaille (2017).

Figure 1 .
Figure 1.Synthetic dust observations, including images and SEDs, for simulations at various evolutionary stages and with different feedback configurations.The first column illustrates the SEDs of the synthetic observations.The second and third columns showcase the three-color synthetic dust images at 4.5 µm (blue), 24 µm (green), and 250 µm (red) wavelengths, utilizing different radiative transfer configurations.The fourth column presents the projected radiation field strength obtained from the simulations, measured in ergs/cm −3 .

Figure 3 .
Figure 3. Demonstration of the diffusion process (reverse) on a sample in the test set.In the upper row, the first panel represents the input (condition) for the diffusion model, while in the lower row, the first panel represents the corresponding target (ground truth).The initial status is denoted as T1000, which corresponds to the random Gaussian noise.The final states of the reverse Markov chain, representing the final predictions by the diffusion model, are indicated as T0.The intermediate steps of the reverse Markov chain, ranging from T = 0 to 1000, are depicted in the remaining panels.

Figure 9 Figure 5 .
Figure 5. 2D histogram illustrating the correlation between the predictions of the diffusion model and the ground truth values of the radiation fields.

Figure 6 .
Figure 6.2D histogram illustrating the correlation between the true ISRF and the dust temperature in the synthetic test data.The black dashed line represents the relationship predicted by Equation(Bernard et al. 2010), where the dust emissivity index β is assumed to be 2 and the incident SED is that of the unattenuated local ISRF.

Figure 7 .
Figure 7. Similar to Figure 5, but applied to synthetic dust images generated using different dust models.The black dotted line represents the one-to-one line, which represents perfect prediction.The blue dotted line represents the ten-to-one line, indicating underestimation by a factor of 10.The brown dotted line represents the one-to-ten line, indicating overestimation by a factor of 10.

Figure 8 .
Figure 8. Predicted radiation field generated by the diffusion model, along with the corresponding three-band dust emission and the ground truth radiation field.The left three panels correspond to new simulations with an ISRF of G0 = 17, while the right three panels represent simulations with an ISRF of G0 = 170.

Figure 9 .
Figure 9. Similar to Figure 5, but specifically for simulations with an ISRF of G0 = 17 (left panel) and G0 = 170 (right panel).The black dotted line represents the one-to-one line, which represents perfect prediction.The blue dotted line represents the ten-to-one line, indicating underestimation by a factor of 10.

Figure 10 .
Figure10.Predicted radiation field produced by the diffusion model, alongside the associated three-band dust emission and the actual radiation field for new simulations.These simulations incorporate updated heating and cooling treatments and experience two crossing times of turbulent driving, leading to well-developed turbulence and increased gas density before self-gravity is turned on.

Figure 13 .Figure 14 .
Figure 13.2D histograms illustrating the correlation between the predicted ISRF and the dust emission intensity (4.5 µm on the left, 24 µm in the middle, and 250 µm on the right) in MonR2.

Figure 15 .
Figure 15.2D histograms illustrating the correlation between the predicted ISRF and the dust temperature (left panel) and the column density (right panel) in MonR2.The black dashed line in the left panel represents the predicted relation between the ISRF and dust temperature given in Equation 1.

Figure 16 .
Figure16.Projected stellar radiation density and corresponding SEDs of different categories for a snapshot of simulations at 4.5 Myr without jets (the third row of Figure1).

Figure 19 .Figure 20 .
Figure 19.χ 2 values calculated between the synthetic SEDs and observed SEDs across various gas number density cutoffs during dust model selection and different evolutionary stages in various simulations.The upper panel displays the raw χ 2 values calculated between the synthetic SEDs and the observed SEDs.In the lower panel, we compute the χ 2 values while incorporating a free scaling parameter during the calculation.

Table 1 .
Summary of STARFORGE Simulations a Evolutionary time, ISRF, whether protostellar jets are included, and the number of image samples.b These simulations are initially subjected to turbulent driving for two crossing times, equivalent to 17.5 Myr.Furthermore, they incorporate the updated heating and cooling treatments. a