A Star-based Method for the Precise Flux Calibration of the Chinese Space Station Telescope Slitless Spectroscopic Survey

The upcoming Chinese Space Station Telescope (CSST) slitless spectroscopic survey poses a challenge of flux calibration, which requires a large number of flux-standard stars. In this work, we design an uncertainty-aware residual attention network, the UaRA-net, to derive the CSST spectral energy distributions (SEDs) with a resolution of R = 200 over the wavelength range of 2500–10000 Å using LAMOST normalized spectra with a resolution of R = 2000 over the wavelength range of 4000–7000 Å. With the special structure and training strategy, the proposed model provides accurate predictions not only of SEDs, but also of their corresponding errors. The precision of the predicted SEDs depends on the effective temperature (T eff), wavelength, and the LAMOST spectral signal-to-noise ratios (S/Ns), particularly in the GU band. For stars with T eff = 6000 K, the typical SED precisions in the GU band are 4.2%, 2.1%, and 1.5% at S/N values of 20, 40, and 80, respectively. As T eff increases to 8000 K, the precision improves to 1.2%, 0.6%, and 0.5%, respectively. The precision is higher at redder wavelengths. In the GI band, the typical SED precisions for stars with T eff = 6000 K improve to 0.3%, 0.1%, and 0.1% at S/N values of 20, 40, and 80, respectively. We further verify our model using empirical MILES spectra and find a good performance. The proposed method will open up new possibilities for the optimal use of slitless spectra of the CSST and other surveys.


Introduction
Slitless spectroscopy is a powerful tool in astronomy, with the capability of obtaining spectra of a huge number of objects that are free from target selection biases. Ground-based slitless spectroscopic surveys have a long history, such as the Curtis Schmidt-thin prism survey (MacAlpine et al. 1977), the Second Spectral Survey (Markarian et al. 1987), the APM QSO Survey (Foltz et al. 1989), the Hamburg Quasar Survey (Hagen et al. 1995), and the Quasars near Quasars survey (Worseck et al. 2008). However, these ground-based slitless spectroscopic surveys are susceptible to serious problems, including source contamination and bright sky background. Owing to its high spatial resolution and low sky background, space is the most natural and suitable place for slitless spectroscopic observations. Wide-field slitless spectroscopic surveys of space telescopes, such as the Chinese Space Station Telescope (CSST; Zhan 2011), the Nancy Grace Roman Space Telescope (Green et al. 2012; Akeson et al. 2019), and the Euclid Space Telescope (Laureijs et al. 2011, 2012; Amendola et al. 2013, 2018), can reach much fainter magnitudes and wider areas and thereby provide huge opportunities in astronomy.
The CSST is an upcoming 2 m space telescope with a large field of view (FOV) of 1.1 deg². In 10 years starting from 2025, the telescope will simultaneously conduct both imaging and slitless grating spectroscopic surveys, covering a large sky area of 17,500 deg² at a high spatial resolution of ∼0.15″ (Zhan 2011; Cao et al. 2018; Gong et al. 2019). With seven photometric filters (NUV, u, g, r, i, z, and y), the imaging survey spans a broad spectrum of wavelengths from 255 to 1000 nm. The slitless spectroscopic survey complements the imaging survey by offering high-quality spectra for hundreds of millions of stars and galaxies in three bands, GU (255-420 nm), GV (400-650 nm), and GI (620-1000 nm), within the same wavelength range. The spectra have a resolution higher than 200 and a magnitude limit around 23 mag in the GU, GV, and GI bands. The designed parameters and intrinsic transmission curves for the CSST photometric and spectroscopic surveys can be found in Table 1 and Figure 1 of Gong et al. (2019), respectively.
To make optimal use of the unique capability of wide-field slitless spectroscopic surveys, data reduction is very important and requires much effort. In the data reduction process, wavelength and flux calibrations are extremely challenging, particularly for the CSST.
The challenges of the wavelength calibration for the CSST were discussed in Yuan et al. (2021). Briefly, the number of specific emission line objects is not sufficient, and the calibration time to eliminate field effects and geometric distortions is not sufficient either. To solve this problem, Yuan et al. (2021) proposed a star-based method for the wavelength calibration of the CSST slitless spectroscopic survey. This method makes use of three prerequisites: (i) Over 10 million stars with precise radial velocities have been delivered by spectroscopic surveys such as LAMOST (Deng et al. 2012; Liu et al. 2014). (ii) The CSST can observe a large number of these stars in a short time thanks to its large FOV. (iii) A narrow segment of CSST spectra can provide reliable estimates of radial velocities (Sun et al. 2021). Taking advantage of the above prerequisites, the key idea of this method is to use enormous numbers of stars (absorption lines rather than emission lines) of known radial velocities observed during normal scientific observations as wavelength standards to monitor and correct for possible errors in wavelength calibration. Using only hundreds of velocity standard stars, they demonstrated that it is possible to achieve a wavelength calibration precision of a few kilometers per second for the GU band, and about 10-20 km s⁻¹ for the GV and GI bands.
In this work, we focus on achieving a precise flux calibration for the CSST slitless spectra. The goal of the flux calibration is to convert astronomical measurements from instrumental into physical units. The challenges of the flux calibration for slitless spectroscopic observations lie in the flat-fielding. A 3D flat-field cube must be used to perform the flat-field correction for each pixel in the 2D detector as a function of incident wavelength. To characterize the wavelength-dependent behavior, a series of direct-image flat fields taken at different central wavelengths can be used to construct the 3D flat-field cube for telescopes such as the Hubble Space Telescope (Momcheva et al. 2016; Pharo et al. 2020), the Roman Space Telescope, and the Euclid Space Telescope. In contrast to the conventional approach that relies on a filter wheel to switch between filters and gratings, the CSST employs a strategy in which each detector is dedicated to a specific filter or grating (see Figure 5 of Zhan 2021). As a result of this hardware design, direct-imaging observations are not feasible for the CSST slitless spectroscopic survey.
To perform high-precision photometric calibration of wide-field imaging surveys, Yuan et al. (2015) proposed a stellar color regression (SCR) method in the era of large-scale spectroscopic surveys. The method assumes that the intrinsic stellar colors can be precisely predicted by their atmospheric parameters, and thus, millions of stars targeted by spectroscopic surveys such as LAMOST can serve as excellent color standards. Combined with uniform photometric data from Gaia, the method has been applied to a number of surveys, including SDSS/Stripe 82 (Huang & Yuan 2022), SMSS DR2 (Huang et al. 2021), PS1 DR1 (Xiao & Yuan 2022; Xiao et al. 2023), and Gaia DR2 (Niu et al. 2021a) and EDR3 (Niu et al. 2021b; Yang et al. 2021). A precision of a few millimagnitudes is typically achieved (see the recent review by Huang et al. 2022).
Assuming that stars with the same normalized spectra have the same spectral energy distributions (SEDs), in this work, we further develop the SCR method to predict the CSST-like (R ∼ 200) SEDs for a huge number of stars from LAMOST-like (R ∼ 1800) normalized spectra. Then, with a large number of flux-standard stars, we can map the 2D variations of the large-scale flat in the detector for each wavelength. Combining the 2D large-scale flats together, we can obtain the 3D flat-field cube for an accurate flux calibration of the CSST slitless spectroscopic survey.
We use a neural network to predict the CSST-like SEDs from LAMOST-like normalized spectra. Due to the impact of observational noise on model performance, it is challenging to deal with uncertain data. Traditional denoising methods depend on user-defined filters (e.g., Gilda & Slepian 2019; Politsch et al. 2020). Without manual denoising, various deep-learning methods have been put forward to automatically learn important features from noisy observations (e.g., Zhao et al. 2020; Zhou et al. 2021, 2022). However, these methods depend on a sufficient training sample with high signal-to-noise ratios (S/Ns) and have difficulty in predicting flux errors. To address the above limitations, we propose an uncertainty-aware residual attention network, UaRA-net, to predict not only precise and robust CSST-like SEDs, but also their corresponding errors using LAMOST-like normalized spectra and their errors as input. Our model structure and training strategy make full use of the prior observational errors, thereby improving the flexibility on input spectra at different S/Ns.
The paper is organized as follows: Section 2 describes the data. Section 3 introduces the proposed UaRA-net in detail. Section 4 reports the results of the pretrained and fine-tuned models, as well as a verification of the proposed method. We summarize in Section 5 and discuss the challenges and potential future improvements.

Data
The data sets used in this work consist of theoretical spectra and empirical spectra. The former, for which we adopt the BOSZ models, are used to pretrain the UaRA-net model. The latter, including the NGSL and MILES spectra, are used to further fine-tune and verify the model.
During the pretraining stage of UaRA-net, two sets of spectra, one set with R = 2000 in the wavelength range of 4000-7000 Å and the other set with R = 200 in the wavelength range of 2500-10500 Å, are chosen to simulate the LAMOST low-resolution spectra and CSST slitless spectra, respectively. The LAMOST-like spectra serve as the inputs of the UaRA-net and are normalized by dividing by their continua, which are estimated using the moving-average method with a window size of 51 pixels. In cases where there are fewer than 25 pixels on one side of the boundary, all pixels on that side are used. The CSST-like spectra are used as the outputs of the UaRA-net and are scaled by dividing by the mean flux value in the wavelength range of 5526-5586 Å. The above wavelength range is adopted because it is free of strong stellar absorption features and close to the central wavelength of the V band.
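The two preprocessing steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function names are hypothetical, the moving average is assumed to be centered on each pixel, and the edge handling follows the stated rule of using all available pixels when fewer than 25 exist on one side.

```python
def moving_average_continuum(flux, window=51):
    """Estimate a pseudo-continuum with a centered moving average.

    Near the edges, where fewer than `window // 2` pixels exist on one
    side, the window is truncated to the pixels that are available.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    n = len(flux)
    # Prefix sums let each window mean be computed in constant time.
    prefix = [0.0]
    for value in flux:
        prefix.append(prefix[-1] + value)
    continuum = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        continuum.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return continuum


def normalize_spectrum(flux, window=51):
    """Divide a LAMOST-like spectrum by its moving-average continuum."""
    continuum = moving_average_continuum(flux, window)
    return [f / c for f, c in zip(flux, continuum)]


def scale_sed(wavelength, flux, lo=5526.0, hi=5586.0):
    """Scale a CSST-like SED by its mean flux in the 5526-5586 Angstrom window."""
    reference = [f for w, f in zip(wavelength, flux) if lo <= w <= hi]
    mean_reference = sum(reference) / len(reference)
    return [f / mean_reference for f in flux]
```

A featureless spectrum normalizes to unity everywhere, which gives a quick sanity check of the edge handling.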
Only stars or spectra with T eff ranging from 5000 to 9750 K are used. Hotter stars are excluded because there will be a limited number of such stars in the CSST data; cooler stars are also excluded because their scaled fluxes in the GU band are very low and difficult to predict. The selected stars are randomly divided into training and testing samples with a ratio of 3:1. The training set contains 41,219 stars. For each training star, we add noise to its clean LAMOST-like spectrum to generate spectra at different S/Ns ∈ {20, 25, 30, 35, 40}, resulting in 206,095 noisy training spectra. The testing set contains the other 13,740 stars, and their clean LAMOST-like spectra are mixed with noise at different S/Ns ∈ {20, 25, 30, 35, 40, 80}, resulting in 82,440 noisy testing spectra. It should be noted that the testing set also includes spectra with S/N = 80, which are used to examine the extrapolation ability of the proposed method to high-quality spectra. A noisy spectrum F with S/N = snr at wavelength λ can be written as
$$F_\lambda = F_{0,\lambda}\,(1 + r), \qquad (1)$$
where $F_{0,\lambda}$ is the clean flux and $r \sim \mathcal{N}\left(0, (1/\mathrm{snr})^2\right)$.
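The noise injection can be sketched as below, assuming that the noise is a fractional perturbation r drawn independently per pixel from a zero-mean Gaussian with standard deviation 1/snr. The function names and the seeding scheme are hypothetical and are only meant to illustrate how one clean spectrum yields several noisy copies.

```python
import random


def add_noise(clean_flux, snr, rng):
    """Return a noisy copy of `clean_flux` at the requested S/N.

    Assumption: each pixel is perturbed as F = F0 * (1 + r),
    with r ~ N(0, (1 / snr)^2) drawn independently per pixel.
    """
    sigma = 1.0 / snr
    return [f * (1.0 + rng.gauss(0.0, sigma)) for f in clean_flux]


def make_noisy_set(clean_spectra, snr_values, seed=0):
    """Generate one noisy spectrum per (clean spectrum, S/N) pair."""
    rng = random.Random(seed)
    noisy = []
    for spectrum in clean_spectra:
        for snr in snr_values:
            noisy.append((snr, add_noise(spectrum, snr, rng)))
    return noisy
```

With five S/N values per star, 41,219 training stars give 41,219 × 5 = 206,095 noisy spectra, and six S/N values for 13,740 testing stars give 82,440, matching the sample sizes quoted above.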

Empirical Spectra
The MILES spectral stellar library consists of 985 flux-calibrated and reddening-corrected spectra observed by the 2.5 m INT telescope, covering the wavelength range of 3500-7500 Å (Sánchez-Blázquez et al. 2006) at a spectral resolution of R = 2500 (Falcón-Barroso et al. 2011). The MILES catalog (Cenarro et al. 2007) provides stellar effective temperatures (T eff), surface gravities (log g), and metallicities ([Fe/H]). Similar to the BOSZ spectra, we require stars in the same T eff range of 5000-9750 K. To ensure the quality of the MILES spectra, we further require stars with E(B − V) < 0.1 mag. To exclude abnormal MILES spectra (see more details in Appendix B), we train a Gaussian process regression (GPR) model to predict SEDs at R = 200 from stellar parameters (T eff, log g, and [Fe/H]). In this paper, a relative residual r with n pixels is used to represent the difference between the predicted SEDs and the true SEDs, which is defined as
$$r = \frac{\hat{F} - F}{F}, \qquad (2)$$
where $\hat{F}$ is an n-dimensional vector that represents the predictive SED and $F$ is the true SED. Then, the root mean square relative error (RMSAE) that represents the overall difference between the predicted SEDs and the true SEDs is defined as
$$\mathrm{RMSAE} = \sqrt{\frac{1}{n}\sum_{j=1}^{n} r_j^2}. \qquad (3)$$
We regard MILES spectra of RMSAE < 0.05 as normal, and obtain 265 stars.
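The selection on RMSAE can be illustrated with a short sketch. It assumes that the relative residual is the per-pixel difference between predicted and true flux divided by the true flux, and that RMSAE is the root mean square of that residual over the n pixels. The helper names are hypothetical.

```python
import math


def relative_residual(predicted, true):
    """Per-pixel relative residual (predicted - true) / true."""
    return [(p - t) / t for p, t in zip(predicted, true)]


def rmsae(predicted, true):
    """Root mean square of the relative residual over all pixels."""
    r = relative_residual(predicted, true)
    return math.sqrt(sum(x * x for x in r) / len(r))


def select_normal(spectra, threshold=0.05):
    """Keep the names of spectra whose RMSAE is below `threshold`.

    `spectra` is an iterable of (name, predicted_sed, true_sed) tuples.
    """
    return [name for name, pred, true in spectra if rmsae(pred, true) < threshold]
```

For example, a prediction that is 10% too high in one pixel and 10% too low in another has RMSAE = 0.1 and would be rejected at the 0.05 threshold.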
The NGSL spectral stellar library (Heap 2008) comprises 378 stars observed by the Hubble Space Telescope Imaging Spectrograph using the three low-dispersion (R ∼ 1000) gratings, G230LB, G430L, and G750L, which overlap at 2990-3060 Å and 5500-5650 Å. The spectra cover the wavelength range of 1670-10250 Å. Reddening corrections and flux corrections were applied to their SEDs, and their zeropoint is the same as that of the BOSZ spectra, using the provided reddening values A V and calibration curves calculated by the difference from the CALSPEC reddening spectra of seven common stars. To avoid the spectrum overlapping effect, the calibration curve is only applied to the NGSL SED segment at λ > 5700 Å. Similar to the BOSZ spectra, we also require stars with T eff in the range of 5000-9750 K. To ensure the quality of the NGSL spectra, only stars with −0.05 < A V < 0.25 mag, err(log g) < 0.7 dex, and err([Fe/H]) < 0.25 are selected. Before using NGSL spectra to fine-tune the UaRA-net, we train two GPR models, one to exclude abnormal spectra (see more details in Appendix B), and the other to generate normalized spectra at R = 2000 (see more details in Appendix C) from stellar parameters (T eff, log g, and [Fe/H]). The first model takes the stellar parameters as inputs and returns the predicted SEDs at R = 200. By comparing the difference between the predicted SEDs and the true SEDs, the NGSL spectra of RMSAE < 0.05 are regarded as normal, and 104 stars are selected. The second model is trained on the MILES spectra and applied to the NGSL spectra to generate normalized NGSL-based spectra at R = 2000. During the fine-tuning stage of UaRA-net, the 104 generated NGSL-based spectra are randomly divided into training and testing samples with a ratio of 4:1.
During the fine-tuning stage of UaRA-net, the generated NGSL-based spectra are used as inputs, and the scaled NGSL observations degraded to R = 200 at 2500 < λ < 10250 Å are used as outputs. During the stage of verifying the UaRA-net, the normalized MILES spectra degraded to R = 2000 at 4000 < λ < 7000 Å are used as inputs, and the scaled MILES observations degraded to R = 200 at 3500 < λ < 7500 Å are used as outputs.

Method
Considering the distribution of the highly noisy input spectra, we propose the UaRA-net, which estimates CSST-like SEDs and their corresponding errors by returning a probability distribution with the mean and variance of a Gaussian. As shown in Figure 1, the UaRA-net is divided into two branches: the SED branch and the uncertainty branch. The SED branch takes normalized LAMOST-like spectra as the input vector X and returns a latent representation Z and the predicted means μ of the CSST-like SEDs. The uncertainty branch takes the concatenation of the corresponding spectrum errors X err and Z as the input vector and returns the predicted heteroscedastic uncertainties σ of the CSST-like SEDs.

Residual Attention Network
The residual network (ResNet; He et al. 2016) is a modularized architecture that uses stacked building blocks called residual units. One of the advantages of ResNet is the shortcut connection between neighboring residual units, which helps address the degradation problem in deep networks. To maintain the same connecting shape, additional zero entries are padded when the dimensions of the next residual unit increase. The residual unit in the SED branch mainly consists of convolutional layers, batch normalization (BN) layers, and rectified linear unit (ReLU) activation functions. The convolutional layers greatly reduce the number of trainable parameters by sharing parameters with the convolutional kernel. The BN layers prevent overfitting and speed up convergence by ensuring nonzero variances of the forward-propagated signals.
To improve the robustness of the network to noise, the attention map, which is generated by three sequential 1D convolutions with ReLU activation functions, first serves as the denoising operator on the feature maps M by determining flux-related soft thresholds τ. The threshold estimates are calculated from M using $\Phi_{\rm conv,ReLU}(\cdot)$, which denotes a linear combination of nonlinear 1D convolution functions, and ⊗, which denotes element-wise multiplication. Then, an intermediate feature map $M'$ is obtained by soft thresholding,
$$M' = \begin{cases} M - \tau, & M > \tau, \\ 0, & -\tau \le M \le \tau, \\ M + \tau, & \text{otherwise}, \end{cases}$$
where $\tau \ge 0$, $\tau \in \mathbb{R}^{C \times W \times 1}$, is the feature-wise threshold.
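The denoising step can be illustrated with a minimal sketch, assuming the standard soft-threshold operator used in deep residual shrinkage networks: features whose absolute value is below the threshold are set to zero, and the remaining features are shrunk toward zero by the threshold. The function names are hypothetical, and the thresholds are taken as given here rather than produced by the attention map.

```python
def soft_threshold(value, tau):
    """Shrink `value` toward zero by `tau`; values within [-tau, tau] become 0."""
    if tau < 0:
        raise ValueError("threshold must be non-negative")
    if value > tau:
        return value - tau
    if value < -tau:
        return value + tau
    return 0.0


def denoise_feature_map(feature_map, thresholds):
    """Apply element-wise soft thresholding with one threshold per feature."""
    return [soft_threshold(m, t) for m, t in zip(feature_map, thresholds)]
```

For instance, `denoise_feature_map([3.0, -0.5, -2.0], [1.0, 1.0, 0.5])` suppresses the weak middle feature entirely while keeping the sign of the two strong ones.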
The latent representation Z of $M'$ is thought to contain all information of stellar atmospheric parameters and elemental abundances (hereafter, stellar labels) and is calculated by a global average pooling (GAP) operation as
$$Z = \Phi_{\rm FC}\left(\mathrm{GAP}(M')\right),$$
where $\Phi_{\rm FC}(\cdot)$ denotes a linear combination of nonlinear functions from fully connected layers.

Uncertainty Quantification
The uncertainty of the SED estimates in the UaRA-net comes from two sources: the epistemic uncertainty and the aleatoric uncertainty. Given the large number of spectra in the training sample, the epistemic uncertainty is far smaller than the aleatoric uncertainty. In this work, we consider the aleatoric uncertainty, which depends on the observation error of the input spectra and stellar labels, as an approximation of the uncertainty of the SED estimates.
The uncertainty branch of the UaRA-net is used to estimate the aleatoric uncertainty.

Training Strategy
Gaussian maximum likelihood is used to model the loss function of the UaRA-net. We assume that the difference $\delta Y_i$ between the observed target $Y_i$ and the prediction $\hat{Y}_i$ follows a Gaussian distribution,
$$\delta Y_i = Y_i - \hat{Y}_i \sim \mathcal{N}\left(0, \sigma_i^2\right),$$
where $\sigma_i$ is the heteroscedastic uncertainty. In the probabilistic view, we assume that the predicted SED $\hat{Y}_i$ is the mean of a Gaussian predictive distribution with variance $\sigma_i^2$. The probability of $Y_i$ given an input spectrum $F_i$ can then be approximated by a Gaussian distribution,
$$p(Y_i \mid F_i) \approx \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[-\frac{(Y_i - \hat{Y}_i)^2}{2\sigma_i^2}\right].$$
In maximum-likelihood inference, we define the likelihood as
$$\mathcal{L} = \prod_i p(Y_i \mid F_i). \qquad (16)$$
To train $\sigma$ and $\hat{Y}$, we find the parameters by minimizing the following loss, which is the negative logarithm of the likelihood in Equation (16), up to an additive constant:
$$\mathrm{Loss} = \sum_i \left[\frac{(Y_i - \hat{Y}_i)^2}{2\sigma_i^2} + \frac{1}{2}\log \sigma_i^2\right]. \qquad (17)$$
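A minimal sketch of such a loss is given below, assuming the standard heteroscedastic Gaussian negative log-likelihood with the constant term dropped and averaged over pixels. The function name is hypothetical, and the exact weighting used in the paper may differ.

```python
import math


def gaussian_nll(y_true, y_pred, sigma):
    """Mean Gaussian negative log-likelihood, without the 0.5*log(2*pi) constant.

    Each pixel contributes (y - mu)^2 / (2 * sigma^2) + log(sigma), so the
    network is penalized both for large residuals and for inflated errors.
    """
    total = 0.0
    for y, mu, s in zip(y_true, y_pred, sigma):
        if s <= 0:
            raise ValueError("sigma must be positive")
        total += (y - mu) ** 2 / (2.0 * s * s) + math.log(s)
    return total / len(y_true)
```

For a fixed residual, this loss is minimized when the predicted σ equals the absolute residual, which is why the predicted errors are expected to track the actual scatter without any explicit error labels.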

Experimental Results
The UaRA-net was trained using Keras 2.2.4, Tensorflow 1.12.0, CUDA 9.2, and cuDNN 7.6, running on the GPUs for acceleration. Experiments were conducted on a computer with six Intel Xeon Gold 6230 CPUs and an NVIDIA Tesla V100 SXM2 GPU. The precision and adaptability of the model were tested in this section.

Hyperparameter Setup
Before pretraining and fine-tuning the UaRA-net, hyperparameters related to the architecture and optimization were set manually. Architecture-related hyperparameters include the size and number of convolutional kernels, the number of neurons and hidden layers of fully connected layers, the activation function, and the weight initialization method. To train CSST-like SEDs with the SED branch, the architecture-related hyperparameters are summarized in Table 1. They were set based on the hyperparameters used in cost-sensitive neural networks (CSNet; Yang et al. 2022), deep residual shrinkage networks (DRSN; Zhao et al. 2020), and ResNet (He et al. 2016). A subunit consists of four convolutional layers in 3D forms of (channels, size, and stride), where the three items represent the number, size, and moving stride of the convolutional kernel, respectively. To train errors of the predictive CSST-like SEDs through the uncertainty branch, three hidden layers with 16, 64, and 574 neurons and a …
The optimization-related hyperparameters include the optimizer, the learning rate η, the batch size of the training samples s, and the number of training epochs. The adaptive moment estimation (ADAM) was used as the optimizer due to its good robustness to the initial learning rate. The learning rate η was set to 0.001 following the recommendation of Kingma & Ba (2014). A large batch size of s = 2048 was set to accelerate convergence, while being limited by GPU memory. The loss function in Equation (17) converged at epoch = 3000. When fine-tuning the UaRA-net on NGSL spectra, the optimization-related hyperparameters η, s, and epochs were set to 0.001, 32, and 200, respectively, due to the small size of the training samples.

Performance of Predicting CSST-like Spectra
In this section, using the BOSZ and NGSL spectra, we evaluated the accuracy of the SEDs and uncertainties estimated from the pretrained and fine-tuned models, respectively.
For the BOSZ spectra, we evaluated the performance of the pretrained model on both the training and testing sets. To avoid crowding, we only show the results of the training set with S/N ∈ {20, 40} and the testing set with S/N ∈ {20, 40, 80}. Figure 2 shows the results of the training set. The medians of the relative residuals at both S/N = 20 and 40 are close to zero, demonstrating that our predicted SEDs agree well with the reference. As for the standard deviations of the relative residuals, dependences on the T eff, wavelength, and S/Ns are revealed, especially in the GU band. In this band, spectra of hot stars have higher fluxes and are less sensitive to the metallicity, leading to relatively lower errors. For stars with T eff = 6000 K, the typical errors in the GU, GV, and GI bands are (0.0409, 0.0024, 0.0026) and (0.0213, 0.0012, 0.0014) for the training set at S/N = 20 and 40, respectively. When T eff increases to 8000 K, the typical errors decrease to (0.0115, 0.0016, 0.0023) and (0.0064, 0.0008, 0.0013) for the training set at S/N = 20 and 40, respectively.
The results of the testing set shown in Figure 3 are similar to those of the training set. We note that the testing set at S/N = 80 performs well, although these samples were derived by extrapolation due to the limitation of the training set. For stars with T eff = 6000 K, the typical errors in the GU, GV, and GI bands are (0.0425, 0.0025, 0.0029), (0.0207, 0.0013, 0.0015), and (0.0151, 0.0011, 0.0011) for the testing set at S/N = 20, 40, and 80, respectively. When T eff increases to 8000 K, the typical errors decrease to (0.0117, 0.0016, 0.0024), (0.0064, 0.0008, 0.0014), and (0.0052, 0.0007, 0.0012) at S/N = 20, 40, and 80, respectively. The small errors demonstrate that the SEDs at R = 200 can be well recovered from the normalized spectra at R = 2000.
To estimate the errors, we divide the samples into different T eff bins. Figures 4 and 5 show comparisons of the medians of the predicted errors and the standard deviations of the relative residuals for the T eff bins of the training and testing set, respectively. Overall, the predicted errors agree well with the true values. However, we do not expect a perfect match, as the uncertainty branch of the UaRA-net for estimating errors is an unsupervised-learning algorithm. To observe the details of the relative residuals over the whole wavelength range, Figure 6 plots six representative spectra from the training and testing sets. It can be seen that the pretrained model performs well, and the predicted errors have a strong negative correlation with the S/Ns, as expected.
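The comparison between predicted errors and actual scatter can be sketched as follows for a single wavelength pixel. The 500 K bin width, the function name, and the minimum of two stars per bin are assumptions made for illustration; the source does not state the binning used.

```python
import statistics
from collections import defaultdict


def bin_comparison(teff, residuals, predicted_errors,
                   bin_width=500.0, teff_min=5000.0):
    """Compare empirical scatter with predicted errors in T_eff bins.

    Returns {bin lower edge: (standard deviation of the relative residuals,
    median of the predicted errors)}. Bins with fewer than two stars are skipped.
    """
    groups = defaultdict(lambda: ([], []))
    for t, r, e in zip(teff, residuals, predicted_errors):
        index = int((t - teff_min) // bin_width)
        groups[index][0].append(r)
        groups[index][1].append(e)
    summary = {}
    for index, (res, err) in sorted(groups.items()):
        if len(res) < 2:
            continue
        lower_edge = teff_min + index * bin_width
        summary[lower_edge] = (statistics.stdev(res), statistics.median(err))
    return summary
```

If the uncertainty branch is well calibrated, the two numbers in each bin should be close to each other, which is the agreement that Figures 4 and 5 are described as showing.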
For the NGSL spectra, we first evaluated the performance of the pretrained model. Figure 7 shows relative residuals of the pretrained model. It can be seen that the relative residuals have a systematic trend with T eff, which is caused by the systematic discrepancies between the BOSZ theoretical spectra and the NGSL empirical spectra. The discrepancies mainly come from … training and testing sets. Figure 10 shows the distribution of the relative residuals as a function of wavelength for the training and testing sets. The relative residuals show no systematic patterns with wavelength.

Verification with MILES Spectra
To demonstrate the reliability of the proposed method, we applied the fine-tuned UaRA-net to MILES spectra and obtained their predicted SEDs. Figure 11 shows the relative residuals of the 265 MILES spectra as a function of wavelength. The typical relative residuals over the whole wavelength range are lower than 0.02. The trend of the relative residuals with wavelength (red line) is consistent with the systematic difference between the NGSL spectra and MILES spectra (black line). Note that the discrepancy at λ > 6700 Å is attributed to the systematic errors (imperfect correction for the second-order contamination; Sánchez-Blázquez et al. 2006) in MILES. The errors at λ ∼ 3600, 4500, 5000, 6000, and 7000 Å are 0.070, 0.022, 0.013, 0.010, and 0.022, respectively. The small errors suggest that the proposed SED estimation pipeline can generally be applied to observed spectra.

Summary and Future Perspective
In this work, we designed and trained an uncertainty-aware residual attention network, the UaRA-net, to convert normalized spectra across the wavelength range of 4000-7000 Å into SEDs spanning 2500-10500 Å. The UaRA-net incorporates the effects of stellar parameters and S/N to estimate precise SEDs and their associated uncertainties. We first pretrained a baseline model using degraded BOSZ spectra, which simulate the CSST stellar spectra, and then further fine-tuned the model with other CSST-like spectra, such as the NGSL and MILES spectra. We found that the precision of the predicted SEDs depends on T eff, wavelength, and S/N. At S/N = 20, the maximum offsets and errors in the GU, GV, and GI bands are (0.072, 0.198), (0.007, 0.032), and (0.005, 0.014), respectively. Using millions of stellar spectra from LAMOST, this method can be applied to the flux calibration of CSST and other space-based spectroscopic surveys in the future.
To investigate the precision of the flux calibration by our method, following the procedure outlined in Yuan et al. (2021), we randomly selected 400 stars with S/N > 20 and G > 14 from LAMOST DR7. Assuming that these stars would be observed by the CSST, we estimated calibration errors of the UaRA-net using Monte Carlo simulations. We fit the differences in flux among the 400 spectra using a second-order 2D polynomial function to generate the flat-field cube. We achieved a typical flux calibration precision of 0.005, 0.0003, and 0.0005 for the GU, GV, and GI bands, respectively. Note that these numbers are too optimistic and only valid under ideal conditions.
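The polynomial fit at one wavelength can be sketched as an unweighted least-squares problem. This is only an illustration under stated assumptions: detector positions are taken to be normalized to roughly [-1, 1], the six-term basis (1, x, y, x², xy, y²) is assumed for the second-order 2D polynomial, and all function names are hypothetical. Repeating the fit for each wavelength slice would give the layers of a flat-field cube.

```python
def design_row(x, y):
    """Basis terms of a second-order 2D polynomial at position (x, y)."""
    return [1.0, x, y, x * x, x * y, y * y]


def solve_linear(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_poly2d(xs, ys, zs):
    """Least-squares coefficients of the polynomial via the normal equations."""
    rows = [design_row(x, y) for x, y in zip(xs, ys)]
    m = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(m)]
    return solve_linear(ata, atz)


def eval_poly2d(coeffs, x, y):
    """Evaluate the fitted large-scale flat at detector position (x, y)."""
    return sum(c * t for c, t in zip(coeffs, design_row(x, y)))
```

Here `zs` would hold, for each of the 400 stars, the ratio or difference between the observed flux and the predicted SED at the chosen wavelength. In a Monte Carlo test, one would perturb `zs` by the predicted SED errors, refit, and take the scatter of `eval_poly2d` over many realizations as the calibration error.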
The proposed UaRA-net assumes that the epistemic uncertainty is far smaller than the aleatoric uncertainty of the model. However, when the number of training samples is small, the epistemic uncertainty becomes significant. Therefore, the performance of the UaRA-net can be improved by fine-tuning with more well-calibrated empirical spectra. Furthermore, the UaRA-net can be directly trained with a number of well-calibrated CSST spectra in the future.
We assumed that we know the flux at 5526-5586 Å of our LAMOST stars well. However, it is important to note that these flux measurements have not been subject to an absolute flux calibration. In the future, we plan to determine the absolute flux at 5526-5586 Å using Gaia photometry and XP spectra. In addition, the flux in the UV band is sensitive to metallicity, leading to large uncertainties in the UV. We will improve the correction performance in the UV by incorporating white dwarfs in the future.
We assumed that the resolution (R = 200) of the CSST slitless spectra is uniform across the entire wavelength range. However, the actual resolution varies slightly across the GU, GV, and GI bands, which needs to be taken into account in the future. In the current method, we also ignored the effect of dust reddening. This effect should be taken into account in the future.
Gaia DR3 provides high-quality low-resolution (R ∼ 50) BP and RP spectra covering the wavelength range 3300-10500 Å. After a comprehensive correction for systematic errors in the flux calibration (Huang et al. 2024), it provides an alternative data set for the UaRA-net to deliver a large number of flux-standard stars.
… the observations. Then, the trained model was applied to the NGSL spectra to obtain the NGSL-based spectra at R = 2000. Figure 15 shows the relative residuals of 53 stars in common between MILES and NGSL-based spectra at R = 2000. The good agreement between MILES and NGSL demonstrates that the NGSL spectra of R = 2000 at 4000 < λ < 7000 Å are reliable.

Figure 1. (a) A subunit of the proposed network, where C, W, and 1 in C*W*1 are the indicators of the number of channels, width, and height of the feature map, respectively. (b) Overall architecture of the proposed network.
The feature map $M_e \in \mathbb{R}^{C \times 1}$ associated with the spectrum noise is generated by the fully connected layers, whereas the feature map in Equation (7) is associated with the stellar labels. The above two feature maps are then concatenated by the concatenation layer.

Figure 2. The medians (left column) and 1σ uncertainties (right column) of relative residuals of the training set with S/N ∈ {20, 40} for different T eff bins. The relative residuals are defined in Equation (2).

Figure 3. Similar to Figure 2, but for the testing set with S/N ∈ {20, 40, 80}. It is worth mentioning that the spectra with S/N = 80 are not in the training set.

Figure 5. Similar to Figure 4, but for the testing set with S/N ∈ {20, 40, 80}. It is worth mentioning that the spectra with S/N = 80 are not in the training set.

Figure 6. The SEDs from the UaRA-net and observations and their relative residuals for the training and testing sets at different S/Ns. A total of six representative stars of different spectral types are plotted.

Figure 7. Relative residuals for the predicted NGSL SEDs when directly applying the pretrained UaRA-net. The spectra are ranked by the T eff from 5000 to 9750 K.

Figure 8. Relative residuals for the training (top panel) and testing (bottom panel) sets of the predicted NGSL SEDs with the fine-tuned UaRA-net.

Figure 9. The RMSAE of the predicted NGSL SEDs as a function of A V for the training (top panel) and testing (bottom panel) set with the fine-tuned UaRA-net.

Figure 10. Relative residuals for the predicted NGSL SEDs with the fine-tuned UaRA-net. The blue lines denote the medians and the standard deviations of relative residuals as a function of wavelength. Top panel: training set. Bottom panel: testing set.

Figure 11. Similar to Figure 10, but for the MILES spectra. It is worth mentioning that the fine-tuned UaRA-net applied here uses the NGSL spectra rather than the MILES spectra. The red line denotes the systematic difference between the MILES spectra and NGSL spectra yielded by their common stars. The blue and red lines match well.

Figure 12. Relative residuals of the GPR training (top panel) and testing (bottom panel) set for MILES.

Table 1
Architecture-related Hyperparameters for the Training of CSST-like SEDs and Errors