An Explainable Deep-learning Model of Proton Auroras on Mars

Proton auroras are widely observed on the dayside of Mars, identified as a significant intensity enhancement in the hydrogen Lyman alpha (121.6 nm) emission at altitudes of 110-150 km. Solar wind protons penetrating as energetic neutral atoms into Mars' thermosphere are thought to be primarily responsible for these auroras. Recent observations of spatially localized (patchy) proton auroras suggest a possible direct deposition of protons into Mars' atmosphere during unstable solar wind conditions. Improving our understanding of proton auroras is therefore important for characterizing the solar wind interaction with Mars' atmosphere. Here, we develop a first purely data-driven model of proton auroras using Mars Atmosphere and Volatile EvolutioN (MAVEN) in situ observations and limb scans of Lyman alpha emissions from 2014 to 2022. We train an artificial neural network (ANN) that reproduces individual Lyman alpha intensities and relative Lyman alpha peak intensity enhancements with Pearson correlations of 0.94 and 0.60, respectively, for the test data, along with a faithful reconstruction of the shape of the observed Lyman alpha emission altitude profiles. By performing a SHapley Additive exPlanations (SHAP) analysis, we find that solar zenith angle, solar longitude, CO2 atmosphere variability, solar wind speed and temperature are the most important features for the modeled Lyman alpha peak intensity enhancements. Additionally, we find that the modeled peak intensity enhancements are high for early local time hours, particularly near polar latitudes, as well as for weaker induced magnetic fields. Through SHAP analysis, we also identify the influence of biases in the training data and interdependencies between the measurements used for the modeling; addressing these aspects could significantly improve the performance and applicability of the ANN model.


Introduction
Auroras on Mars are observed as enhancements in far-UV and EUV emissions on both the nightside and dayside of Mars (Bertaux et al. 2005; Schneider et al. 2015; Deighan et al. 2018). Three distinct types of such auroras have been observed. Discrete auroras were the first auroras observed on Mars (Bertaux et al. 2005). They are typically caused by electrons moving from the dayside to the nightside along closed crustal magnetic field lines and are highly localized. In contrast, diffuse electron auroras on Mars are observed during solar energetic particle events and are caused by higher-energy electrons penetrating along the open magnetic field lines across the planet (Schneider et al. 2015). In addition to discrete and diffuse auroras observed over the years, proton auroras are a relatively newly discovered phenomenon (Deighan et al. 2018; Ritter et al. 2018; Chaffin et al. 2022) and are observed mainly on the dayside of Mars. Both electron and proton auroras occur extremely frequently (Hughes et al. 2019; Lillis et al. 2022), and therefore studying them can provide new insights into the complex interactions between the solar wind and the weak crustal field of the planet and its surrounding plasma environment (see Atri et al. 2022 for a recent review).
Proton auroras are widely observed on Mars, identified in ∼14% of observations (Hughes et al. 2019) from the Imaging Ultraviolet Spectrograph (IUVS; McClintock et al. 2015) on board the Mars Atmosphere and Volatile Evolution (MAVEN) spacecraft (Jakosky et al. 2015). On Mars, these auroras are thought to be caused primarily by a population of solar wind protons penetrating the Martian magnetosphere as energetic neutral atoms (ENAs) of hydrogen (Deighan et al. 2018). These hydrogen ENAs are formed in the outer hydrogen corona of Mars via electron stripping and charge exchange. Once in the thermosphere, the ENAs undergo repeated charge exchange and collisions with neutrals, and can emit Lyα (121.6 nm) radiation that is seen as proton auroras.
Using Lyα emission profiles from MAVEN/IUVS, Deighan et al. (2018) first reported the observation of a proton aurora on Mars, characterized by a Lyα intensity enhancement in the altitude range 120-150 km. They showed that such emission enhancements are correlated with the observed penetrating proton flux (Halekas et al. 2015), and suggested that these auroras are thus triggered by hydrogen ENAs. Subsequently, Ritter et al. (2018) presented UV data from the Spectroscopy for the Investigation of the Characteristics of the Atmosphere of Mars (SPICAM; Bertaux et al. 2006) on board Mars Express to confirm the observations of proton auroras. Recently, Chaffin et al. (2022) reported observations of proton auroras in Lyα and Lyβ (102.6 nm) from the Emirates Mars Ultraviolet Spectrometer (EMUS; Holsclaw et al. 2021) on board the Emirates Mars Mission (Amiri et al. 2022). EMUS captures a global synoptic view of UV emissions from Mars and has revealed that the regions of proton auroras can be highly localized and "patchy." Since the solar wind conditions that are known to trigger proton auroras are uniform across the dayside, these EMUS observations of "patchy" proton auroras suggest the existence of other mechanisms responsible for proton auroras on Mars (see Chaffin et al. 2022).
Known physical processes involved in triggering proton auroras, namely the formation of ENAs and penetrating solar wind protons, form an important characteristic of the interaction of the solar wind with the Martian magnetosphere and the escape of Mars' atmosphere. A comprehensive understanding of the occurrence characteristics of proton auroras and consequences for the evolution of Mars' atmosphere requires a thorough analysis of the influence of solar wind properties and the subsequent response of Mars' magnetosphere. Hughes et al. (2019) conducted a first statistical study of proton auroras observed by MAVEN/IUVS, to understand how the occurrence rates and emission enhancements of proton auroras vary with the solar longitude (L s ), solar zenith angle (SZA), and local time (lt), among other factors. They found that proton auroras occur at altitudes of 110-150 km, and particularly at the lower altitudes around L s ∼ 180°. Their analysis showed that the primary factors affecting the occurrence rates of proton auroras are L s and SZA, with the highest occurrence rates and emission enhancements observed around southern summer solstice (L s ∼ 270°) and low SZAs. Hydrogen corona column densities above the bow shock and the solar wind flux increase during this season (L s ∼ 270°) and are accompanied by higher atmospheric temperatures and inflation of the lower atmosphere because of the increased dust activity. Hughes et al. (2019) suggested that these factors could contribute to a higher population of ENAs and their increased interaction with the lower atmosphere, and therefore increased occurrences and emission enhancement of proton auroras. Hughes (2021) further studied the influence of the interplanetary magnetic field (IMF) upstream of the Martian bow shock, flux of penetrating protons, dust, and extreme solar activity on the Lyα emission enhancements at the peak altitudes for proton auroras. They reported a possible preference for proton auroras to occur during radial IMF orientations. They also presented simple linear regression models of the orbit-averaged Lyα emission enhancements, separately accounting for the cases of high dust activity and extreme solar activity, using MAVEN in situ measurements of the orbit-averaged penetrating proton flux.
In this work, we attempt to explicitly model the influence of solar wind proton characteristics on Lyα intensity enhancements. We consider the measurements of the energy, density, temperature, and velocity of protons, and also in situ magnetic fields, obtained by MAVEN during its passage through the different regions of the magnetosphere in each orbit. We also consider the dependence on the density of atmospheric CO 2 and Mars' crustal magnetic fields. These MAVEN measurements, referred to as features in this manuscript, are numerous, may be interdependent, and may have a highly nonlinear relationship with Lyα emissions. We therefore develop an artificial neural network (ANN) model using these features as inputs to reproduce the observed Lyα altitude profiles. Deep neural networks are extremely efficient in leveraging correlations from complex, high-dimensional large data sets to perform challenging tasks such as classification, regression, and segmentation (Goodfellow et al. 2016). The development, i.e., training or learning, of an ANN is posed as an optimization problem to obtain the ANN parameters that minimize a loss function or misfit between the modeled output and ground truth. Here, we demonstrate that the ANN is trained to learn dependences between the input MAVEN observations and Lyα emissions to accurately model the observations of altitude profiles.
Modeling of the observed Lyα enhancements of proton auroras has been previously considered briefly in the original discovery paper by Deighan et al. (2018) and extensively in a recent "multi-model campaign" by Hughes et al. (2023). This physics-based modeling involved Monte Carlo simulations of proton/hydrogen precipitations in the Martian atmosphere, using the observed flux as input, and subsequent recreations of background-subtracted Lyα altitude profiles using a radiative transfer model. Hughes et al. (2023) recreated and compared such simulations for MAVEN orbit #4235 using four different Monte Carlo models and found the particle flux and velocity of the solar wind to be the primary variables affecting proton auroras. In contrast to these physics-based models, the ANN model considered in this work is purely data-driven and is used to recreate the Lyα altitude profiles from many (∼2000) MAVEN orbits simultaneously. However, unlike for a physics-based model, it is difficult to establish definitively how a trained ANN uses its inputs, and the reliability of an ANN model is therefore open to question. Here we carry out a Shapley value analysis (Lundberg & Lee 2017) of the trained ANN to explore correlations of the input features used for modeling the Lyα intensities. Through the Shapley analysis, we are able to identify possible biases and caveats in the data and modeling, recover the previously known dependence of proton auroras on L s and SZA, and also find some new patterns in the data.
This paper is organized as follows. In Section 2, we describe the selection and preprocessing of MAVEN data used for analysis. In Section 3 we explain the ANN architecture and training methodology. In Section 4 we report our findings. Section 4.1 outlines the accuracy of our model. Section 4.2 presents a detailed analysis of Shapley values for different input features. In Section 5 we summarize our results and discuss their implications, the scope of our model, and possible improvements.

Data
We use remote sensing and in situ data from MAVEN between 2014 October and 2022 April for developing an ANN model of the observed altitude profiles of Lyα emission. The data considered here cover almost four Martian years. Details of the analyzed data are as follows.

IUVS Limb-scan Observations
MAVEN/IUVS is a remote-sensing UV spectrograph monitoring the state of Mars' upper atmosphere (110-225 km). The IUVS wavelength range covers the far-UV (110-190 nm) and mid-UV (180-340 nm). IUVS is thus sensitive to, among other emissions, Lyα (121.6 nm), which is the focus of this study on proton auroras, as well as the CO 2 + UV doublet band (co2uvd; 288.3 nm and 289.6 nm), which we use as a proxy for the density of atmospheric CO 2 (Deighan et al. 2018). We use the publicly available Level 1C processed data products. During periapsis passes, IUVS operates in the limb-scan mode, in which it records altitude emission profiles between 100 and 220 km (McClintock et al. 2015). Typically, 12 such scans are recorded for each periapsis pass segment, lasting ∼23 minutes.
We only use limb-scan profiles that provide emissions for the full altitude range of 100-200 km for Lyα and 130-190 km for co2uvd. These altitude ranges are chosen to minimize the exclusion of limb-scan observations because of any missing data at lower or higher altitudes and to still cover the region of Mars' thermosphere relevant for this study. The proton auroras in Lyα altitude scans are identified as an enhancement of emission intensity between 110 and 150 km. Following Hughes et al. (2019), we quantify the emission enhancement using an enhancement measure (EM) defined as the difference between the second highest intensity in the peak altitude range and the median value in the 160-190 km range. Figure 1(a) shows a sample co2uvd emission profile and Figure 1(b) shows samples of Lyα profiles for observations of a nonproton aurora (blue) and a proton aurora (red). The latter shows a marked enhancement of Lyα emission in the peak altitude range compared to the characteristic flat dayglow profile in the former. Hughes et al. (2019) defined an observation to be a proton aurora if its EM exceeds the mean EM by more than 0.5σ, σ being the standard deviation of EMs in the data. Figure 1(c) shows the distribution of the enhancements across the data considered for the analysis, with the threshold highlighted by the dashed black line. Hughes et al. (2019) showed that the occurrence rate of proton auroras primarily depends on L s and SZA. We therefore include these in our analysis along with other measurements of latitude (lat), longitude (lon), and local time (lt) for each limb scan. The IUVS limb-scan data, with all altitude observations in the ranges specified above for Lyα and co2uvd profiles, are available for 4130 orbits during the observation period considered for this study, out of a total of 16,079 MAVEN orbits in this period.
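As an illustrative sketch (not the code used in this work), the EM and the reference threshold described above could be computed as follows. The function names, the inclusive altitude bounds, the use of the population standard deviation, and the synthetic profile values are all assumptions made for illustration.

```python
from statistics import fmean, median, pstdev


def enhancement_measure(altitudes_km, intensities,
                        peak_range=(110.0, 150.0),
                        background_range=(160.0, 190.0)):
    """Second-highest intensity in the peak range minus the median
    intensity in the background range (bounds assumed inclusive)."""
    pairs = list(zip(altitudes_km, intensities))
    peak = sorted(i for a, i in pairs if peak_range[0] <= a <= peak_range[1])
    background = [i for a, i in pairs
                  if background_range[0] <= a <= background_range[1]]
    if len(peak) < 2 or not background:
        raise ValueError("profile does not cover the required altitude ranges")
    return peak[-2] - median(background)


def is_proton_aurora(em, all_ems, n_sigma=0.5):
    """Reference threshold: EM above mean(EM) + n_sigma * sigma(EM).
    Using the population standard deviation is an assumption."""
    return em > fmean(all_ems) + n_sigma * pstdev(all_ems)


# Synthetic profile (hypothetical values): flat dayglow with a peak near 125 km.
altitudes = [100.0 + 5.0 * k for k in range(21)]      # 100, 105, ..., 200 km
profile = [5.0] * 21
profile[4], profile[5], profile[6] = 8.0, 9.0, 8.5    # 120, 125, 130 km
em = enhancement_measure(altitudes, profile)
```

Taking the second-highest rather than the highest value in the peak range makes the measure less sensitive to a single noisy altitude bin.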

MAVEN In Situ Measurements of Protons and Magnetic Field
MAVEN measures in situ properties of plasma and solar EUV flux it encounters in its flight. We use various in situ measurements from the Solar Wind Ion Analyzer (SWIA; Halekas et al. 2015) and Magnetometer (MAG; Connerney et al. 2015) on board MAVEN to characterize the influence of solar wind protons and magnetic fields on proton auroras. The details are as follows.
In its orbit, MAVEN samples plasma from different regions of the magnetosphere: the solar wind upstream of the bow shock (SW), the magnetosheath (MS), and the thermosphere (TH). We identify measurements within SW using the algorithm from Halekas et al. (2017). We use the positions of bow shock and magnetic pileup boundaries from Trotignon et al. (2006) to identify measurements within MS. Finally, we use observations taken below 250 km as in situ observations of TH. Depending on the altitude, MAVEN does not always sample plasma from SW in each orbit; we only include the orbits in our analysis for which MAVEN samples observations from SW. Of the 4130 orbits that pass the screening for availability of IUVS limb-scan data, SW sampling is available for 2211 MAVEN orbits. This study therefore uses all these 2211 orbits for analysis. A distribution of the selected orbits across the considered observation period is shown in Figures 9-11 in Appendix A. Figure 2(a) shows the identified positions of the bow shock, magnetic pileup boundary, and TH regions from the SWIA energy spectra for protons for a sample orbit.
The IUVS limb scans are remote sensing measurements and are obtained by accumulating the number of photons along the line of sight over a period of approximately 2 minutes. The in situ measurements obtained in the regions SW, MS, and TH correspond to the plasma properties at the location of the spacecraft. Thus, in general, these two measurements correspond to very different regions in the Martian atmosphere. Using the in situ properties in SW, MS, and TH regions allows us to characterize the variability of plasma in these regions to an extent. However, this characterization and therefore our analysis are limited by any local spatial and temporal variations between the locations and measurement times of these in situ properties and the recorded IUVS limb-scan observations.
Figure 1. The Lyα emissions show both a proton aurora case (red) and a nonproton aurora case, i.e., the background dayglow emission (blue). The proton aurora profile shows the characteristic enhancement around 110-150 km altitude. Note that the co2uvd emission profiles between 130 and 190 km are used in this study. The orbit number for each profile is shown. The distribution of the Lyα enhancements within the data considered is shown in (c) (after Hughes et al. 2019), with darker red also indicating increasing intensity of proton aurora enhancements. The dashed line marks a threshold for intensity enhancement, used only as a reference, for defining the proton auroras as per Hughes et al. (2019).

For SW and MS regions, we use the average values of proton and in situ measurements over an orbit. SW conditions are generally expected to be uniform and therefore orbit-averaged values are a good representative of the solar wind conditions (Ruhunusiri et al. 2018; Hughes 2021). The MS plasma environment, spanning a range of ∼1000 km, has a greater variability in plasma properties, and therefore averages may be biased by the presence of outliers. For the TH region, we divide all observations within an orbit into 12 equal parts (approximately one for each limb scan) and take the average values from each part. This facilitates characterization of local changes in proton aurora emissions, if present, within an orbit. Each of the 12 parts is marked by its corresponding average SZA and altitude. Note that the IUVS limb-scan measurements typically span the entire ∼24 minutes of the periapsis pass and therefore the corresponding in situ measurements are not always confined to the TH region defined above, and hence 12 equal-time bins are used instead.
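The equal-time binning of the periapsis measurements can be sketched as below. This is a minimal illustration under stated assumptions: the function name, the choice of bin edges spanning the first to the last timestamp, and the synthetic time series are hypothetical and are not taken from the processing pipeline of this work.

```python
def equal_time_bin_means(times, values, n_bins=12):
    """Average `values` in `n_bins` equal-width time bins that span
    [min(times), max(times)]. Empty bins yield NaN."""
    t0, t1 = min(times), max(times)
    if t1 <= t0:
        raise ValueError("need measurements spanning a nonzero time interval")
    width = (t1 - t0) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, v in zip(times, values):
        # The last timestamp falls on the upper edge, so clamp it into the final bin.
        k = min(int((t - t0) / width), n_bins - 1)
        sums[k] += v
        counts[k] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Synthetic example: 24 samples at one-minute cadence, value equal to the time.
times_min = list(range(24))
binned = equal_time_bin_means(times_min, [float(t) for t in times_min])
```

The same routine would be applied to each in situ quantity, and to SZA and altitude, so that every bin carries its own average geometry.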
Proton auroras are affected by the solar wind protons, which are monitored by MAVEN/SWIA. SWIA measures proton flux over the energy range 10 eV-5 keV, providing information about proton energies, velocities, and temperatures. We convert the velocities provided in the Mars-centered solar orbital (MSO) coordinate system to the Mars solar electrical (MSE) coordinate system. In MSE, the x-direction is antiparallel to the solar wind velocity (v SW ), the z-direction is the direction of the motional electric field E SW = −v SW × B SW , and the y-direction completes the right-handed system. This conversion from MSO to MSE, although not strictly necessary, is used in this case to facilitate the inference from our analysis in terms of the plasma flows and currents in the induced magnetosphere of Mars (Ramstad et al. 2020). For MS and TH regions, we use the velocities (converted to MSE), temperature, and densities given in the data of SWIA in situ key parameters. In the TH region, the velocity, temperature, and densities are likely to be overestimated owing to the presence of heavy ions in this region (Halekas et al. 2017; see Section 4.2 for a further explanation of consequences of this bias). Also, we explicitly include the spectrum of protons from the TH region as an additional input to the model. This spectrum typically shows a peak of flux at the characteristic solar wind energy of 1 keV when the ENAs converted to penetrating protons are observed (Halekas et al. 2015). An example of the spectrum is shown in Figure 2(c).
In order to study the influence of the magnetic field on proton auroras, we use in situ measurements of magnetic fields from MAG on board MAVEN (Connerney et al. 2015). The magnetic field measurements are provided in the MSO coordinate system. For SW measurements, we convert magnetic fields to solar wind clock angle (B clock ) and cone angle (B cone ) that characterize the direction of the IMF. For MS and TH regions, we decompose the measurements into magnitude, elevation angle, and azimuth angle (Hara et al. 2018). The elevation angle (θ) measures how near vertical (±90°) or horizontal (0°) the magnetic field is. The azimuth angle (φ) measures how far east (0°) or north (90°) the horizontal magnetic field is.
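A minimal sketch of the magnitude, elevation, and azimuth decomposition is given below. It assumes that the field has already been rotated from MSO into local east, north, and up components at the spacecraft location; that rotation, the function name, and the sign convention of positive elevation for upward fields are assumptions for illustration.

```python
import math


def magnitude_elevation_azimuth(b_east, b_north, b_up):
    """Decompose a field given in local (east, north, up) components.
    Elevation: +/-90 deg for vertical fields, 0 deg for horizontal fields.
    Azimuth: 0 deg for eastward and 90 deg for northward horizontal fields."""
    magnitude = math.sqrt(b_east ** 2 + b_north ** 2 + b_up ** 2)
    if magnitude == 0.0:
        raise ValueError("angles are undefined for a zero field")
    # Clamp guards against rounding pushing the ratio marginally outside [-1, 1].
    ratio = max(-1.0, min(1.0, b_up / magnitude))
    elevation_deg = math.degrees(math.asin(ratio))
    azimuth_deg = math.degrees(math.atan2(b_north, b_east))
    return magnitude, elevation_deg, azimuth_deg


northward = magnitude_elevation_azimuth(0.0, 1.0, 0.0)   # horizontal, pointing north
downward = magnitude_elevation_azimuth(0.0, 0.0, -2.0)   # vertical, pointing down
```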
Proton aurora occurrences may depend on crustal magnetic fields of Mars. These fields are extensively modeled using observations from MAVEN and Mars Global Surveyor (Acuña et al. 2001). Here, we use a publicly available model from Gao et al. (2021) that estimates the crustal field with a spatial resolution of ∼200 km and can model MAVEN observations based on these crustal fields within a few nanotesla of the true observations. We use total crustal magnetic field magnitude B tot,crustal and spherical coordinate system components B r,crustal , B θ,crustal , and B φ,crustal evaluated at the location of the spacecraft using the Gao et al. (2021) model for our analysis. The 200 km spatial resolution of the crustal field is comparable to the <300 km scale of the spatial variability observed in patchy proton auroras (Chaffin et al. 2022) and therefore may be appropriate for this study.
Figure 2. The region between the dashed yellow lines shows observations below 250 km identified as the thermosphere region in our analysis. (c) Energy spectra of protons within the thermosphere region as per Halekas et al. (2015).

All observations used as the ANN input are listed in Table 1. Since the MS and TH plasma environments are driven by the incident solar wind to a large extent, many measurements, such as magnetic fields, proton speed, density, and temperature, may be strongly correlated to the corresponding measurements in the SW regions. The preparation of input data for developing an efficient machine learning (ML) or ANN model typically involves selecting features that are independent and likely to contain the most information about the modeled phenomena. Therefore, it is generally desirable to reduce the inputs to an independent set using dimensional reduction techniques such as principal component analysis (Hastie et al. 2001). However, one of the key objectives of this study is to explore the possible correlations between various measurements considered in Table 1 and proton auroras. Hence, we include all features listed in Table 1 as inputs, without performing any dimensional reduction. Thus, one purpose of the Shapley value analysis presented in Section 4.2 is also feature selection, i.e., identifying and retaining the most important features, for improving the ANN model.

Methods
We build an ANN model for the altitude profiles of Lyα intensity (between 100 and 200 km) observed by MAVEN/IUVS in each periapsis limb scan using MAVEN/SWIA in situ observations of proton energy, density, velocity, temperature, and magnetic fields, modeled crustal magnetic fields, and MAVEN/IUVS observations of co2uvd altitude profiles as well as SZA, L s , latitude, longitude, and local time for each limb scan (see Table 1).

Artificial Neural Network Architecture
An ANN is a function $Y = f(X; \theta)$, where X are the inputs, Y is the modeled output, and θ are the parameters (weights and biases) of the artificial neurons in the network. Each artificial neuron outputs $\sigma(w \cdot x + b)$, where x is the output from the previous layer (or the input features), w and b are the weights and bias of the neuron, and σ is an activation function (Hastie et al. 2001). In a fully connected (FC) layer, every neuron is connected to all neurons (or the inputs) in the previous layer. Layers in a convolution neural network (CNN) are made up of convolution filters, comprising a set of neurons with the same weights and biases, which sequentially process outputs from the previous layer (or the inputs) to output feature maps. Convolutional layers are typically used to process images or tabular data.
In this work, the input features consist of different types of observations: orbit average values (SW and MS), average time series of measurements for each orbit (TH:insitu meas.), energy spectra (TH:en spec.), the co2uvd altitude profiles (TH:co2uvd), and other remote sensing measurements (TH:rs geom.). Hence, we use different subnetworks to individually process different inputs and obtain a homogeneous abstract representation that is further processed by several FC layers (hidden layers) to yield the output. All inputs except TH:insitu meas. are 1D features and are processed by an FC subnetwork with identical architecture. TH:insitu meas. are 2D tabular inputs with 12 average values of each feature spaced equally in time during the periapsis scan, and they are processed by a 1D-CNN. The details of subnetworks FC and 1D-CNN, and FC hidden layers are as follows.
1. FC subnetwork.This is made up of three FC layers including the output layer.neurons that models the observed Lyα emission intensity profile between 100 and 200 km binned into 20 equally spaced altitude bins.All neurons in the output layer also have the sigmoid activation function.
The ANN architecture is shown in Figure 3. The 1D-CNN subnetwork is used to process only the TH:insitu meas. inputs since they are in a 2D tabular form of 17 × 12, corresponding to each equal-time bin in the TH region. The 1D-CNN kernel spans one time bin of TH:insitu meas. and is designed to learn identical patterns across all time bins. All other inputs are 1D and are hence processed with the FC subnetwork. The number of neurons in the hidden layers (or of convolution filters) is typically a power of 2 and first increases and then decreases through the network. The layer with the highest number of neurons contains an abstract encoding of the information learned from inputs. This encoded information is decoded in steps via layers with a decreasing number of neurons to obtain the desired output. The number of neurons in the output layer is dictated by the number of altitude bins for the Lyα profile.

Training
We split the available MAVEN data between 2014 October and 2022 April into three parts for training (60%), validation (20%), and testing (20%). The data in the training set are used to train the ANN, i.e., obtain the values of weights and biases that accurately model the output. The data in the validation set are used to tune the hyperparameters (discussed below), to ensure that the model is not overfitting and that its performance generalizes. A well-trained ANN learns concrete patterns from the data to model the output and generalizes to yield good performance on the test set, which is a final test of an ANN model.
Observations of proton auroras in different limb scans from an orbit are correlated (Hughes 2021), and therefore mixing limb-scan observations from the same orbit in training, validation, and testing would result in an artificially high performance. Hence, we first randomly split the orbits in the given ratio and all limb scans from each orbit are then included in the respective set. The total numbers of orbits for training, validation, and test data are 1326, 442, and 443, respectively. The total numbers of limb scans for training, validation, and test data are 7178, 2382, and 2384, respectively. Note that a random splitting may result in observations from adjacent orbits in the same set. However, from the input features, only L s and latitude evolve slowly and have approximately identical values in adjacent orbits. Other input features, dependent on the solar wind, are expected to vary from orbit to orbit and are known to produce significantly different Lyα response in adjacent orbits (e.g., Deighan et al. 2018). Hence, the presence of a few samples from adjacent orbits in the same set due to a random splitting is not expected to bias the performance significantly.
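The orbit-level split can be sketched as follows, so that all limb scans from one orbit land in the same subset. This is an illustration only: the function names, the random seed, and the truncation of the subset sizes are assumptions, and the orbit identifiers in the example are placeholders rather than real MAVEN orbit numbers.

```python
import random


def split_orbits(orbit_ids, fractions=(0.6, 0.2), seed=0):
    """Randomly split unique orbits into train/validation/test sets.
    The test set receives whatever remains after the first two subsets."""
    orbits = sorted(set(orbit_ids))
    random.Random(seed).shuffle(orbits)
    n_train = int(fractions[0] * len(orbits))
    n_val = int(fractions[1] * len(orbits))
    train = set(orbits[:n_train])
    val = set(orbits[n_train:n_train + n_val])
    test = set(orbits[n_train + n_val:])
    return train, val, test


def assign_scans(scan_orbit_ids, train, val):
    """Label each limb scan with the subset of its parent orbit."""
    labels = []
    for orbit in scan_orbit_ids:
        if orbit in train:
            labels.append("train")
        elif orbit in val:
            labels.append("val")
        else:
            labels.append("test")
    return labels


# Placeholder orbit identifiers 0..2210 standing in for the 2211 selected orbits.
train, val, test = split_orbits(range(2211))
labels = assign_scans([0, 0, 1, 1, 1, 2], train, val)
```

With 2211 orbits and truncated subset sizes, this yields 1326, 442, and 443 orbits, the same counts as quoted above.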
During training, all samples from the training data are fed into the ANN. For each sample, the ANN returns an output altitude profile of Lyα emission intensity Y that is compared with the true observed profile Y true , and a loss function L(Y, Y true ) is computed. The loss function serves as an objective function for optimizing the ANN model parameters (Hastie et al. 2001). The ANN weights and biases are then modified using stochastic gradient descent to yield a minimum value for the loss function as the training progresses. For regression problems, such as the one considered here, the most commonly used loss function is a mean squared error (MSE) defined as

$$ L_{\mathrm{MSE}} = \frac{1}{N M} \sum_{i=1}^{N} \sum_{j=1}^{M} \left( Y_{ij} - Y^{\mathrm{true}}_{ij} \right)^{2}, $$

where N is the total number of samples, M = 20 is the number of altitude bins, and Y (Y true ) are the corresponding Lyα intensity values for the modeled (true) scans. The MSE loss function does not explicitly aid the ANN to accurately model the shape of the individual intensity profile, including the characteristic peak for proton auroras. Rather it only aids in reducing the overall mean square error in the intensities. To accurately model the shape, we additionally use the structural similarity index (SSIM) defined as

$$ \mathrm{SSIM}(Y, Y^{\mathrm{true}}) = \frac{\left( 2 \mu_{Y} \mu_{Y^{\mathrm{true}}} + C_1 \right) \left( 2 \sigma_{Y Y^{\mathrm{true}}} + C_2 \right)}{\left( \mu_{Y}^{2} + \mu_{Y^{\mathrm{true}}}^{2} + C_1 \right) \left( \sigma_{Y}^{2} + \sigma_{Y^{\mathrm{true}}}^{2} + C_2 \right)}, $$

where the μ are the means and the σ² are the variances of Y and Y true respectively, and $\sigma_{Y Y^{\mathrm{true}}}$ is the covariance. C 1 and C 2 are constants with small values used to ensure that the denominator is nonzero. The SSIM was originally designed for comparing images (Wang et al. 2004). Intuitively, the means quantify the brightness of images, standard deviations quantify the contrast of images, and the covariances are incorporated to compare the local structures of the image (Nilsson & Akenine-Möller 2020).
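For concreteness, the two measures can be evaluated on plain lists as in the sketch below. It is not the training code of this work: the function names are hypothetical, the SSIM is computed over the whole profile rather than in sliding windows, and the default values of C 1 and C 2 are assumptions (the values commonly used for data with unit dynamic range).

```python
from statistics import fmean


def mse_loss(modeled, observed):
    """Mean squared error over N profiles with M altitude bins each."""
    n, m = len(modeled), len(modeled[0])
    total = sum((y - t) ** 2
                for profile, truth in zip(modeled, observed)
                for y, t in zip(profile, truth))
    return total / (n * m)


def ssim(y, y_true, c1=1e-4, c2=9e-4):
    """Structural similarity of two 1D profiles from their means,
    variances, and covariance (population statistics)."""
    mu_y, mu_t = fmean(y), fmean(y_true)
    var_y = fmean([(a - mu_y) ** 2 for a in y])
    var_t = fmean([(b - mu_t) ** 2 for b in y_true])
    cov = fmean([(a - mu_y) * (b - mu_t) for a, b in zip(y, y_true)])
    numerator = (2 * mu_y * mu_t + c1) * (2 * cov + c2)
    denominator = (mu_y ** 2 + mu_t ** 2 + c1) * (var_y + var_t + c2)
    return numerator / denominator


# Hypothetical normalized profiles: a peaked profile and a flat one with equal mean.
peaked = [0.2, 0.2, 0.8, 0.2, 0.2]
flat = [0.32, 0.32, 0.32, 0.32, 0.32]
same_shape = ssim(peaked, peaked)
different_shape = ssim(peaked, flat)
example_mse = mse_loss([[0.0, 0.0], [1.0, 1.0]], [[0.0, 1.0], [1.0, 3.0]])
```

The flat profile has the same mean as the peaked one, so its brightness term matches, but its covariance with the peaked profile vanishes and the SSIM drops well below 1, which is the shape sensitivity that the MSE alone lacks.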
The SSIM value ranges between 0 and 1, with the latter meaning that the two profiles are identical. The SSIM loss is therefore defined as

$$ L_{\mathrm{SSIM}} = 1 - \mathrm{SSIM}(Y, Y^{\mathrm{true}}). $$

The EM for each profile provides an additional constraint that can be used for modeling. We use the mean squared error for EM as a third loss function for training. The complete loss function used for training is a weighted sum of the three loss functions, in which the weight coefficients λ 1 and λ 2 are hyperparameters used for balancing contributions of the different loss function terms.

Since the ANN learns empirically from the data, it is important to have the training data free from any biases with respect to the modeled phenomena, i.e., in this case Lyα intensity and proton aurora enhancements. As shown in Figure 1(c), the number of limb scan profiles decreases significantly with increasing enhancement. We bin the Lyα profiles in training data as per EM and repeat samples, i.e., we oversample the data in bins with high EM values to match the number in the lowest enhancement bin. Thus, the distribution of Lyα profiles to be modeled in the training data is uniform with respect to EM. Oversampling is a standard technique in ML for countering the inadequacy of the number of samples in the training data in extreme regimes of the modeled phenomena. However, this may introduce a bias with respect to the inputs to the model. Here, L s and SZA are known to be important for proton auroras, and the oversampled training data introduce a sampling bias whereby the duplicated samples of the higher EMs are dominated by observations from L s ∼ 270° and SZA ∼ 50° (see Figure 12 in Appendix A). Consequences of this bias in the oversampled training data are considered in Section 5.
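The oversampling step can be illustrated with the sketch below, which duplicates randomly chosen samples in sparsely populated EM bins until each bin matches the population of the lowest-EM bin. The function name, the bin edges, the seed, and the synthetic EM values are hypothetical; the actual binning used for training is not specified here.

```python
import random
from bisect import bisect_right


def oversample_by_em(ems, bin_edges, seed=0):
    """Return sample indices in which every nonempty EM bin is padded,
    by random duplication, up to the size of the lowest-EM bin."""
    rng = random.Random(seed)
    bins = [[] for _ in range(len(bin_edges) - 1)]
    for index, em in enumerate(ems):
        k = bisect_right(bin_edges, em) - 1
        k = min(max(k, 0), len(bins) - 1)   # clamp out-of-range EMs to the end bins
        bins[k].append(index)
    target = len(bins[0])
    selected = []
    for members in bins:
        selected.extend(members)
        if members and len(members) < target:
            selected.extend(rng.choices(members, k=target - len(members)))
    return selected


# Synthetic EMs (hypothetical): many weak, few strong enhancements.
ems = [0.1] * 6 + [1.5] * 3 + [2.5] * 1
indices = oversample_by_em(ems, bin_edges=[0.0, 1.0, 2.0, 3.0])
counts = [sum(1 for i in indices if lo <= ems[i] < hi)
          for lo, hi in [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]]
```

Because the padding only repeats existing samples, any concentration of the high-EM samples in particular input regimes is carried over into the duplicates, which is the origin of the sampling bias noted above.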
Each input is standardized, by subtracting the mean and dividing by the standard deviation, before feeding into the ANN. Sine and cosine values of all angular input variables and local time (with a period of 24 hr) are used to ensure continuity across branch cuts. Each true observed Lyα altitude profile, Y true , is normalized by the maximum Lyα emission value corresponding to each altitude. The sigmoid neurons in the ANN output layer yield a value between 0 and 1, which is reconverted back to the original units using the same observed maximum Lyα emission values.
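These preprocessing steps can be sketched as follows. The helper names are hypothetical, the population standard deviation is an assumed choice, and the example values are synthetic; the sketch also assumes that the maximum observed intensity at every altitude is nonzero.

```python
import math
from statistics import fmean, pstdev


def standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    mu, sigma = fmean(values), pstdev(values)
    if sigma == 0.0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mu) / sigma for v in values]


def encode_angle_deg(angle_deg):
    """Represent an angle by (sin, cos) so that 0 and 360 deg coincide."""
    radians = math.radians(angle_deg)
    return math.sin(radians), math.cos(radians)


def encode_local_time(hours):
    """Represent local time by (sin, cos) with a 24 hr period."""
    phase = 2.0 * math.pi * hours / 24.0
    return math.sin(phase), math.cos(phase)


def normalize_profiles(profiles):
    """Scale each altitude bin by its maximum over all profiles, giving
    targets in [0, 1] that a sigmoid output layer can reproduce."""
    max_per_altitude = [max(column) for column in zip(*profiles)]
    scaled = [[v / m for v, m in zip(profile, max_per_altitude)]
              for profile in profiles]
    return scaled, max_per_altitude


z = standardize([1.0, 2.0, 3.0])
midnight, next_midnight = encode_local_time(0.0), encode_local_time(24.0)
scaled, maxima = normalize_profiles([[1.0, 4.0], [2.0, 2.0]])
```

Multiplying a modeled profile by the stored per-altitude maxima converts the sigmoid outputs back to the original intensity units.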
Apart from weights and biases, the ANN model also has hyperparameters that are important for training the model and minimizing the loss function. These include the learning rate, i.e., the step size for the stochastic gradient descent, the batch size, i.e., the number of samples used to calculate the gradients for updating weights and biases, and the parameters λ 1 and λ 2 for the loss function. We monitor the performance of the model on the validation data to tune the values of these hyperparameters. We implement the ANN model using PyTorch (Paszke et al. 2019).

Hughes (2021) developed a simple linear regression fit for obtaining Lyα enhancement averaged over an orbit using only the flux of penetrating protons measured by SWIA. They developed three different models for three different scenarios: nominal conditions, extreme solar events, and events during high dust activity. They obtained R² values of 0.87, 0.61, and 0.43 for these three scenarios, respectively. Here, we consider this simple linear regression model by Hughes (2021) as a baseline for the performance of the ANN model. Compared to Hughes' (2021) simple regression models, the ANN model yields R² = 0.91 for the orbit-averaged EMs for the training data. Considering that only the training data are used for the optimization of the ANN model as well as its applicability to all scenarios specified above, the higher R² of 0.91 indicates an improved performance over the baseline model. The ANN model also generalizes reasonably well to the validation and test data, yielding R² of 0.65 and 0.55, respectively, for the orbit-averaged EMs.
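The orbit-averaged comparison can be sketched as below. The text does not state which definition of R² is used, so the coefficient of determination adopted here is an assumption (the squared Pearson correlation is another common choice); the function names and the numbers in the example are likewise hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def orbit_average(orbit_ids, values):
    """Average per-limb-scan values over each orbit."""
    grouped = defaultdict(list)
    for orbit, value in zip(orbit_ids, values):
        grouped[orbit].append(value)
    return {orbit: fmean(vals) for orbit, vals in grouped.items()}


def r_squared(observed, modeled):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = fmean(observed)
    ss_res = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("observed values are constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical EMs for three limb scans from two orbits.
averages = orbit_average(["a", "a", "b"], [1.0, 3.0, 5.0])
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Under this definition, a model that always predicts the mean observed EM scores 0 and a perfect model scores 1.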

Accuracy of the Modeled Proton Auroras
Figure 5 shows a direct comparison of mean Lyα altitude profiles, binned for each percentile population (after Hughes et al. 2019) as per EM, for the training, validation, and test data respectively. The signed difference between the true and modeled mean profiles for each percentile bin is also shown. Overall, the ANN model accurately reproduces the enhancement shape of the profiles between altitudes of 125 and 150 km, as expected. For the training data, the modeled and observed profiles conform closely to each other across all percentile bins. For the validation and test data, the modeled profiles produce lower peak intensities than the true observations for the highest percentile bin. Also, for the highest percentile bins, the width of the modeled peak is systematically smaller than in the true profiles. From the difference plots, the modeled intensities, particularly for the highest percentile bins, are lower between altitudes of 110 and 125 km. For the lower percentile bins, the observed and modeled profiles compare reasonably well.

SHAP Values: Identifying Important Features
Complex ML models such as neural networks, although highly efficient and accurate in learning from large high-dimensional data sets, are notoriously difficult to interpret. An interpretation or explanation of a trained ANN model, however, is highly desirable: first and foremost to gain confidence in applying the model to new data, and subsequently, if possible, to uncover new patterns in the data. Over the last decade, a number of such methods of interpretation/explanation have been developed (Simonyan et al. 2013; Zeiler & Fergus 2014; Sundararajan et al. 2017; Selvaraju et al. 2017; Shrikumar et al. 2017). Additive feature attribution methods are a class of explanation models that can be written in the form of a linear function of binary variables that indicate the presence/absence of the model inputs. Shapley values, from cooperative game theory, are useful for quantifying the impact of each feature on the model output. Lundberg & Lee (2017) developed SHapley Additive exPlanations (SHAP), based on Shapley values, as a unified measure of feature importance for the class of additive feature attribution methods that show a number of desirable properties. SHAP values have proven widely successful and are now the state of the art for explaining an ML model output in terms of contributions of its input features. We use the publicly [...]
For a given model output, the SHAP value for each input is the contribution of that input to the difference between the model output and an expected model output for a set of reference inputs. The SHAP values for all inputs thus add up to the difference between the given model output and the expectation value of the reference model output. Here, we use the training data inputs as a reference for calculating the SHAP values. For a given set of input measurements, SHAP values are obtained for the Lyα emission in each altitude bin of the output between 100 and 200 km (see Figure 14 in Appendix B). Since we are interested in understanding the relationship between the inputs and the Lyα enhancement during proton auroras, we consider mean SHAP values for the Lyα intensities between altitudes of 110 and 150 km. The Lyα emissions at these altitudes also change due to changes in the neutral H background emission. However, the ANN model is explicitly trained to reconstruct the shape of the Lyα emission at the peak altitudes characteristic of proton auroras via the SSIM (Equation (3)) and EM loss (Equation (4)) functions. Therefore, the reported SHAPs are expected to capture the contributions of the inputs primarily for proton aurora-related enhancements.
SHAP values here are measured in kR, the same as the intensities. Table 2 lists an average unsigned (absolute magnitude) SHAP value for each group of input features over the training and validation data. The unsigned SHAP values reflect the magnitude of the contribution, and therefore the importance, of the input feature toward the modeled Lyα peak altitude intensities. As the table shows, TH:rs geom. and the co2uvd intensity profiles, which serve as a proxy for the CO2 atmosphere (in the altitude range 130-190 km considered here; Deighan et al. 2018), contribute most to the model output on average, while the MS and SW measurements contribute the least. In the following, we analyze SHAP values of important features from each input group in detail.
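The additivity property stated above (the SHAP values of all inputs sum to the model output minus the expected output over the reference inputs) can be checked on a small example. The sketch below computes exact Shapley values by brute-force enumeration of coalitions, which is only feasible for a handful of inputs; it is not the approximation used for the ANN (the caption of Figure 14 refers to DeepSHAP). The toy model, the reference set, and all names are hypothetical.

```python
from itertools import combinations
from math import factorial

def coalition_value(model, x, reference, subset):
    """Mean model output when inputs in `subset` take their values from x
    and all other inputs take their values from each reference sample."""
    total = 0.0
    for ref in reference:
        mixed = [x[i] if i in subset else ref[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(reference)

def shapley_values(model, x, reference):
    """Exact Shapley values of each input of `model` at point x."""
    n = len(x)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                with_i = coalition_value(model, x, reference, members | {i})
                without_i = coalition_value(model, x, reference, members)
                contribution += weight * (with_i - without_i)
        phi.append(contribution)
    return phi

def toy_model(inputs):
    """Hypothetical three-input model with one interaction term."""
    a, b, c = inputs
    return 2.0 * a + a * b - 0.5 * c

reference = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 2.0, 0.0]]
x = [1.5, 2.0, 0.5]

phi = shapley_values(toy_model, x, reference)
expected_output = sum(toy_model(r) for r in reference) / len(reference)
# Additivity: sum(phi) equals toy_model(x) - expected_output.
additivity_gap = sum(phi) - (toy_model(x) - expected_output)
```

Because the third input enters the toy model without interactions, its Shapley value reduces to the change in its own term relative to the reference mean, which makes it an easy sanity check.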

In Situ Measurements
SHAP values for the TH remote sensing and in situ measurements are the highest and third highest respectively. Table 3 lists unsigned SHAP values for these features as well as for SW and MS in situ measurements. The SHAP values are normalized by the maximum unsigned value for the respective group. These unsigned SHAP values are an indicator of the contribution of the respective input feature to an increase or decrease in the modeled Lyα intensities. We identify a feature as among the most significant in its group when its normalized SHAP value exceeds an equal contribution of 1/N, where N is the total number of features in the respective group. We also calculate the Pearson correlation (r) between the SHAP values and the measured values of the features. A high magnitude of r indicates a strong linear relationship between the SHAP value and the feature, suggesting the existence of a definite pattern associated with the Lyα enhancements. Note that the SHAP values are calculated using Lyα intensities between 110 and 150 km only, where the peak enhancements during proton auroras are typically observed. The positive/negative sign of r indicates the direction of the relationship. Significant normalized SHAP values and/or a relatively high r help us identify input features from these groups with definite patterns correlated to the modeled Lyα intensities.
Figure 6 shows the distribution of SHAP values for these selected features. As shown in Figure 6(a), from L_s = 90°, i.e., the northern summer solstice, the SHAP values increase approximately linearly overall and reach a maximum of ∼1.0 kR at approximately L_s = 270°, close to the southern summer solstice. This is consistent with the highest occurrence rates of proton auroras during the southern summer season (Hughes et al. 2019). This is also consistent with the distribution of EM according to L_s, with the highest enhancements observed during the southern summer season. The SHAP values around L_s = 90° are negative, indicating that the ANN model negatively associates those L_s values with the Lyα enhancements. This may be a consequence of the bias owing to high occurrences and enhancements in the southern summer season.
From Figure 6(b), the SHAP values are highest for the early local times (∼2 hr), with a maximum of ∼0.8 kR, and decrease approximately linearly overall until they start increasing again in the evening hours after ∼18 hr. Panel (c) shows that the SHAP values are highest at ∼2.0 kR for low SZAs ∼0° and appear to decrease linearly on average with increasing SZA. The preference of the modeled Lyα enhancements for early hours is noteworthy. However, these values at early local times are dominantly sampled from around polar latitudes and periods close to L_s = 90° and L_s = 270°, during the northern and southern summers respectively (see Figure 16 in Appendix B), which may be a source of bias. The observed trend in the SHAP values with SZA is consistent with the known proton aurora mechanism; since this is a dayside phenomenon, the enhancements are expected to be maximum for the lowest SZA (Hughes et al. 2019). The proton aurora enhancements (also shown in the figure), however, are found to be highest for mid SZAs ∼45°. This discrepancy is mainly due to the observation bias of the occurrence of proton auroras being maximum during southern summer, for which there are few low-SZA Lyα observations (see Figure 17 in Appendix B). Nonetheless, the ANN model here learns the true expected relationship because the Lyα intensities for different limb scans within a single orbit are also expected to decrease with SZA (Hughes 2021; He et al. 2023).
Distributions of SHAP values for the most significant SW measurements, Vmse_x and T_SW, are shown in Figures 6(d) and (e) respectively. The variation of SHAP value with both these variables is similar, i.e., generally increasing with increasing solar wind speed and temperature. Figure 15(d) in Appendix B shows that the SHAP values also increase overall with increasing solar wind densities. Thus, the SHAP contributions for these SW measurements imply increasing modeled Lyα peak intensities with increasing solar wind proton flux, which is consistent with the known proton aurora mechanism.
Figures 6(f), (g), (h), and (i) show the SHAP values for TH proton velocities Vmse_x,TH and Vmse_y,TH, temperature T_TH, and in situ magnetic field magnitude B_tot,TH. We see an increasing trend in SHAPs with increasing proton speed in the solar wind direction and in the −mse_y direction. The former trend implies increasing Lyα peak intensities with an increasing downward (on the dayside) speed of protons, a relationship that is expected and consistent with the SHAP trends for solar wind proton speed and temperature. The latter relationship may be a result of a strong mutual correlation between the x and y MSE components of the proton speed (e.g., because of the global structure of induced currents; Ramstad et al. 2020). From panel (h), SHAP values increase with increasing TH proton temperatures overall, although there appears to be a strong variability at low temperatures of <100 eV. Proton aurora enhancements are also high in this region of low proton temperatures. The proton temperature values in the TH are likely to be overestimated because of the presence of heavy ions (Halekas et al. 2017), and an overall linearly increasing trend of SHAPs for T_TH above 160 eV, the maximum solar wind temperature in our data, may be an artifact. From panel (i), we find that the SHAP values slowly decrease overall with increasing magnitude of the TH in situ magnetic field, and this is also consistent with the distribution of enhancements with respect to these magnetic fields as shown.
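The feature-selection rule described at the start of this section (normalize the unsigned SHAP value of each feature by the group maximum, keep features above 1/N, and report the Pearson correlation between SHAP values and measurements) can be sketched as follows. This is an illustration under one reading of the text: the unsigned value of a feature is taken to be its mean absolute SHAP value over samples. The feature names and numbers are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def significant_features(shap_by_feature):
    """Normalize mean |SHAP| by the group maximum and keep features whose
    normalized value exceeds the equal-contribution threshold 1/N."""
    mean_abs = {
        name: sum(abs(v) for v in values) / len(values)
        for name, values in shap_by_feature.items()
    }
    largest = max(mean_abs.values())
    normalized = {name: m / largest for name, m in mean_abs.items()}
    threshold = 1.0 / len(shap_by_feature)
    selected = [name for name, v in normalized.items() if v > threshold]
    return normalized, selected

# Hypothetical SHAP values (kR) for a group of four features.
group = {
    "feature_a": [1.0, -1.0, 1.0, -1.0],
    "feature_b": [0.5, 0.5, -0.5, 0.5],
    "feature_c": [0.1, -0.1, 0.1, 0.1],
    "feature_d": [0.2, 0.2, -0.2, -0.2],
}
normalized, selected = significant_features(group)

# Hypothetical measurements of one feature and the matching SHAP values.
measurements = [300.0, 400.0, 500.0, 600.0]
shap_for_measurements = [-0.4, -0.1, 0.2, 0.5]
r = pearson_r(measurements, shap_for_measurements)
```

With four features the threshold is 0.25, so only the two features with the largest normalized values are retained in this toy group.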

co2uvd Altitude Profiles
The altitude profiles of co2uvd intensity make the second highest contribution to the modeled Lyα peak intensities as per the unsigned SHAPs in Table 2. The co2uvd intensities serve as a proxy for the density of atmospheric CO2 for altitudes above 130 km. In order to understand the relationship between the modeled Lyα peak intensities and the co2uvd intensities, we divide the co2uvd intensity profiles into three groups based on altitudes: 130-150 km, which overlaps with the Lyα peak altitudes, and higher altitudes of 150-170 km and 170-190 km. For all three groups, the SHAP values are higher for the higher intensities at the respective altitudes, as shown in Figure 7(a). The lowest altitude group yields the highest SHAP values, up to ∼4 kR, as expected because these include the altitudes for the peak Lyα intensities. The higher altitude groups, 150-170 km and 170-190 km, yield SHAP values up to ∼2 kR.
To understand the relevance of co2uvd intensity values at these different altitude ranges for the modeled Lyα emission, we examine the co2uvd intensity profiles that are most relevant, i.e., have the highest SHAP values, from each of the three altitude ranges. Figure 7(b) shows average altitude profiles for 100 percentile bins ordered by the SHAP values for the three altitude groups. The highest SHAP percentile bins for all groups show average co2uvd profiles with higher overall intensities at all altitudes, but particularly in the respective altitude range. The co2uvd intensities as well as the CO2 atmosphere are known to vary seasonally with L_s. The inset shows the fraction of samples within each percentile bin approximately corresponding to the southern summer solstice (L_s ∼ 270°) and the southern winter solstice (L_s ∼ 90°). The fraction of profiles from southern summer is higher for the highest SHAP percentile bins 75-100 in all three cases, whereas the fraction of profiles from southern winter systematically decreases to 0 for these SHAP bins. Thus, SHAP values for the co2uvd intensities at all altitudes are highest when these intensities are most inflated during the southern summer at L_s ∼ 270°. An increase in only the CO2 density, and therefore in the co2uvd intensities at these altitudes, however, is expected to increase the absorption of the proton aurora Lyα emission (Hughes 2021). Therefore, the relationship between the modeled Lyα peak intensities and the co2uvd intensities is mainly dictated by the covariant increase in the co2uvd intensities during the southern summer.

Proton Energy Spectra
The penetrating proton population is identified by a peak at the characteristic solar wind proton energy ∼1 keV (Halekas et al. 2015). We find that the proton fluxes at other energies are also strongly correlated with the flux at ∼1 keV, as shown in Table 4. We divide the energy spectra into six energy ranges depending on the degree of correlation with the typical solar wind energy. Table 5 lists the normalized SHAP contributions from these energy ranges as well as the Pearson correlation of SHAP with the proton flux values from these ranges. Figure 8 shows the distribution of SHAP values from the most significant energy ranges with the proton flux at the respective energies. We find that the SHAP values from the energy range 625-1288 eV, which contains the typical solar wind proton energy ∼1 keV, contribute the most, and SHAP values show an increasing trend with increasing proton flux in this range (panel (a)), yielding a Pearson correlation of 67%. SHAP values from higher energies, 1489-1720 eV and 1988-3068 eV, are also relatively strongly correlated with the proton flux from these ranges, with Pearson correlations of 68% and 47% respectively. From Figures 8(b) and (c), respectively, we find that the SHAP values corresponding to these energies increase slowly compared to panel (a) and are only significant for higher fluxes in these energy ranges. For other energy ranges, shown in Table 5, the SHAP values do not contribute as significantly and also have low Pearson correlation values. The modeled Lyα peak altitude intensities therefore are primarily dependent on the proton flux at ∼1 keV, the typical energy of the penetrating solar wind protons.

Discussion
In this work, we developed an ANN model to obtain the MAVEN/IUVS observed Lyα altitude profiles that show a marked enhancement at altitudes of 110-150 km during proton auroras on Mars. Our ANN model includes a comprehensive set of in situ measurements from MAVEN/SWIA and MAVEN/MAG to characterize the observed Lyα emissions in the thermosphere. These measurements included proton densities, temperatures, speeds, magnetic fields, and energies sampled in the upstream solar wind, within the magnetosheath, and within the thermosphere below an altitude of 250 km. These, along with a proxy for the CO2 atmosphere and crustal magnetic fields, served as inputs to the ANN model. The trained ANN model reproduces the Lyα intensities in the validation and test data sets with high Pearson correlations of 0.93 and 0.94 respectively and a mean absolute error <1 kR. Although we did not explicitly model the intensity enhancements (EM), the trained ANN yields Pearson correlations of 0.65 and 0.60 for the validation and test data respectively. The extreme enhancements are not reproduced particularly accurately by the model. These enhancements occur at a lower peak altitude, and typically outside the dominant proton aurora [...] measurements in the magnetosheath region do not yield significant SHAP values or patterns.
The SHAP value analysis solidifies previously known patterns and also uncovers trends that are not directly observed in the data. In light of the biases identified in the training data, further analysis and scrutiny are required to verify and establish the relationships presented by the ANN model and SHAP analysis. L_s and SZA are the most important contributors; however, they are not primary physical processes but only a proxy for conditions during which the interaction of the solar wind with Mars is more prominent. Moreover, the in situ proton measurements in the magnetosheath and thermosphere are expected to be strongly correlated with the solar wind measurements. Thus, exclusion of L_s and SZA, as well as the solar wind measurement inputs, may help the ANN learn subtle correlations of magnetosheath and thermosphere protons, magnetic fields, and crustal fields with the occurrences of proton auroras. Training data free of the aforementioned biases are a prerequisite to obtain robust statistical results from such an exercise. We defer these improvements to future work. An improved ANN model, for the Lyα intensities or other related phenomena (e.g., ion loss; He et al. 2023), can thus be reliably used to simulate, characterize, and model varied Mars-solar wind interactions for hand-tailored input conditions. Under the paradigm of physics-informed neural networks, ANNs can be designed to explicitly include the known physical processes, further improving their applicability. ANN models, once trained, are considerably less expensive than traditional numerical models and simulations. The plethora of data from Mars' magnetosphere, solar wind, and atmosphere available from past and current missions can thus be leveraged using the ever-advancing ML technology, for understanding the dynamics of Mars' magnetosphere and uncovering the mechanisms of the historical loss of Mars' atmosphere.
Figure 14 illustrates the decomposition of a sample modeled Lyα altitude profile into contributions from the input features in the different groups. Figure 15 shows the SHAP values for an additional set of in situ measurement inputs that contribute significantly toward the modeled output. Figure 16 shows the distribution of the number of samples with local time <6 hr with respect to L_s and latitude. Figure 17 shows the distribution of the number of samples with SZA < 30° with respect to L_s and SZA.

Figure 1. Examples of MAVEN/IUVS limb scans showing the thermospheric altitude profiles of co2uvd emission (a) and Lyα emissions (b). The Lyα emissions show both a proton aurora case (red) and a nonproton aurora case, i.e., the background dayglow emission (blue). The proton aurora profile shows the characteristic enhancement around 110-150 km altitude. Note that the co2uvd emission profiles between altitudes of 130 and 190 km are used in this study. The orbit number for each profile is shown. The distribution of the Lyα enhancements within the data considered is shown in (c) (after Hughes et al. 2019), with darker red also indicating increasing intensity of proton aurora enhancements. The dashed line marks a threshold for intensity enhancement, used only as a reference, for defining the proton auroras as per Hughes et al. (2019).

Figure 2. Example of MAVEN/SWIA measurements for a sample orbit showing the variation in the proton energy spectra (a) and the MAVEN altitude (b). The boundaries of the bow shock (black) and magnetic pileup (red) marked by the dashed lines are from Trotignon et al. (2006). The region between the dashed yellow lines shows observations below 250 km identified as the thermosphere region in our analysis. (c) Energy spectra of protons within the thermosphere region as per Halekas et al. (2015).

Figure 3. The artificial neural network architecture. The ANN takes in SWIA in situ measurements of proton properties and magnetic fields from the upstream solar wind (SW), magnetosheath (MS), and thermosphere (TH) regions, as well as remote sensing geometry measurements of each IUVS limb scan. These input features are summarized in Table 1. Fully connected or 1D-CNN subnetworks process the individual set of input features and yield an abstract representation. These representations are further processed by layers of fully connected neurons (hidden layers) to obtain the observed Lyα altitude profile of each IUVS limb scan as the output. The details of the FC and 1D-CNN subnetworks, hidden layers, and the output layer are given in the main text.
We use the mean absolute error (MAE) of the modeled Lyα intensities, the Pearson correlation between true and modeled intensities, as well as EM values obtained from true and modeled profiles for quantifying the accuracy and performance of the ANN model. The MAE and a comparison of true and modeled intensities for the training, validation, and test data are shown in Figure 4. Panel (a) shows the MAE in the modeled intensities, calculated for binned populations of the true intensity values. For the training data, the MAE is consistently low, <1 kR (kR denotes kilorayleigh), across the entire range. In the case of the validation and test data, for true intensities up to 9 kR, the MAE remains low, <1 kR. The MAE increases for true intensity values >9 kR to maxima of ∼2 kR and ∼1 kR for the validation and test data respectively. Figures 4(b), (c), and (d) show a heat-map comparison of true and modeled intensity values, with the highest population of samples (indicated in red on the color bar) lying adjacent to the diagonal. The modeled intensities in the training data are dominated by values up to 9 kR. Higher MAE in the validation and test data for intensities >9 kR is a consequence of this bias. We find no strong dependence of MAE on L_s or SZA, as shown in Figure 13 in Appendix A. The Pearson correlation between the true and modeled intensities yields high values: 0.96, 0.93, and 0.94 for the training, validation, and test sets respectively. As for the true Lyα altitude profiles, we obtain the EM values for the modeled Lyα altitude profiles. Pearson correlations between these EM values, corresponding to the true and modeled Lyα altitude profiles, are 0.92, 0.65, and 0.60 for the training, validation, and test sets respectively.
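The binned error metric described above can be sketched as follows. The caption of Figure 4 mentions 10 equal-sized bins; this sketch assumes equal-width bins over the range of the true intensities, which is one possible reading. The function name and the toy numbers are hypothetical.

```python
def binned_mae(y_true, y_pred, n_bins=10):
    """Mean absolute error of predictions, grouped into equal-width bins
    of the true value. Bins without samples are reported as None."""
    low, high = min(y_true), max(y_true)
    if high == low:
        raise ValueError("true values must span a nonzero range")
    width = (high - low) / n_bins
    error_sums = [0.0] * n_bins
    counts = [0] * n_bins
    for truth, pred in zip(y_true, y_pred):
        # The largest true value falls on the upper edge; clamp it
        # into the last bin.
        index = min(int((truth - low) / width), n_bins - 1)
        error_sums[index] += abs(truth - pred)
        counts[index] += 1
    return [s / c if c else None for s, c in zip(error_sums, counts)]

# Hypothetical intensities (kR): small errors at low intensity and
# larger underestimates at high intensity.
true_intensity = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
modeled_intensity = [0.5, 1.5, 2.5, 3.5, 4.5, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
mae_per_bin = binned_mae(true_intensity, modeled_intensity, n_bins=2)
```

Reporting the error per bin rather than as a single number is what exposes the growth of the error at the sparsely sampled high-intensity end.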

Figure 4. Performance of the ANN. (a) Mean absolute error in the predicted intensity as a function of true intensity binned in 10 equal-sized bins. The error bars indicate 1σ standard error. (b), (c), (d) Heat maps showing the population of predicted intensity samples binned in 2D as per true and predicted intensities for training, validation, and test sets respectively. The training data are used to obtain the model parameters that minimize the loss function (Equation (5)). The validation data are used to ensure that the model performance can generalize to new data and to obtain hyperparameters for the training. The test data are totally unseen by the model. Note that the bins lying closer to the diagonal (dashed line) indicate accurate predictions. The corresponding Pearson correlation values (r) are also noted. The observed intensities of Lyα emission below ∼9 kR are accurately reproduced by the ANN model.

Figure 5. Summary of reconstructed Lyα intensity altitude profiles for the training, validation, and test data. The training data are used to obtain the model parameters that minimize the loss function (Equation (5)). The validation data are used to ensure that the model performance can generalize to new data and to obtain hyperparameters for the training. The test data are totally unseen by the model. Each profile is a mean profile obtained from a population of profiles binned in percentile bins of the peak intensity enhancements. Darker color indicates increasing enhancement value. Each profile is normalized by mean intensity values in the altitude range 160-200 km (after Hughes et al. 2019). The characteristic shape of the observed Lyα intensity profiles of proton auroras is reproduced reasonably well by the ANN model, except for the cases of extreme intensity enhancements.

Figure 7. SHAP values for the co2uvd altitude profile input. (a) SHAP values for co2uvd intensity values grouped in three altitude regions, with the color bar showing the scale of intensities. (b) The mean co2uvd altitude profiles for 100 percentile bins ordered by SHAP values of each altitude range. The darker color corresponds to higher SHAP values. The insets show the fraction of observations in each percentile bin corresponding to L_s = 270° ± 25° and L_s = 90° ± 25°. The SHAP values for the co2uvd intensities are highest when the co2uvd intensities are most inflated during the southern summer at L_s ∼ 270°.

Figure 10. Number of samples in training, validation, and test data sets across the solar longitude L_s during the considered observation period. The size of each bin is 20°. The dashed lines show the fractional data count in each bin.

Figure 11. Number of samples in training, validation, and test data sets across the considered solar zenith angle SZA. The size of each bin is 15°. The dashed lines show the fractional data count in each bin.

Figure 12. Comparison of the fraction of samples in the training data set after oversampling with the validation and test data sets for the considered solar longitude L_s and solar zenith angle SZA.

Figure 13. Mean squared error of the modeled Lyα intensities in the training, validation, and test data as a function of the considered solar longitude L_s and solar zenith angle SZA.

Figure 14. An illustration of SHAPs. The SHAP (DeepSHAP) values of altitude profiles splitting the ANN outputs into contributions from the input feature groups. The altitude profiles are color-coded as per the legend.

Figure 15. Distribution of SHAP values (blue) vs. measurements of the important input features from in situ measurements of the magnetosheath (MS) and thermosphere (TH) as well as the TH remote sensing geometry as identified in Table 3. The distribution of Lyα enhancement (EM) corresponding to each measurement is also shown (red).

Figure 16. Distribution of counts for Lyα observations with local time <6 hr with respect to solar longitude L_s and latitude.

Table 1
MAVEN/SWIA, MAVEN/MAG, and MAVEN/IUVS Observations Used as Input Features for Modeling Lyα Altitude Profiles Observed by MAVEN/IUVS
IMF clock angle (IMF_clock); Radial crustal magnetic field (B_r,cr); co2uvd altitude profile
IMF cone angle (IMF_cone); Azimuthal crustal magnetic field (B_φ,cr); Thermosphere (TH:rs geom.)
MSE-X proton speed (Vmse_x,SW); Polar crustal magnetic field (B_θ,cr); Solar season (L_s)
Proton temperature (T_SW); Total magnetic field (B_tot,TH)

[...] (Hastie et al. 2001) layers have 128 and 256 neurons respectively, while the output layer has 64 neurons. The number of inputs is different for different input groups as listed in Table 1. All neurons have the sigmoid activation that gives an output value between 0 and 1 (Hastie et al. 2001).
[...] features are concatenated (vector with length 64 × 5 + 128 = 448) and fed into a network of three FC hidden layers with 1024, 512, and 256 neurons respectively. Each neuron in the hidden layers has the sigmoid activation.
4. Output layer. The output of the hidden layers (length = 256) is fed into the output layer with 20

Table 2
Ranking of Inputs

Table 4
Pearson Correlation (r) of Flux of Penetrating Protons (Observed within the Thermosphere, TH) at Different Energies Measured by MAVEN/SWIA, with the Flux Measured at Energy 964.61 eV that is Closest to the Typical Solar
Note. The energies are grouped as per their correlation values.