Inferring coupling strengths of mixed-mode oscillations in red-giant stars using deep learning

Asteroseismology is a powerful tool that may be applied to shed light on stellar interiors and stellar evolution. Mixed modes, behaving as acoustic waves in the envelope and buoyancy modes in the core, are remarkable because they allow for probing the radiative cores and evanescent zones of red-giant stars. Here, we have developed a neural network that can accurately infer the coupling strength, a parameter related to the size of the evanescent zone, of solar-like stars in $\sim$5 milliseconds. In comparison with existing methods, we found that only $\sim$43\% of the inferences agreed to within a difference of 0.03 on a sample of $\sim$1,700 \textit{Kepler} red giants. To understand the origin of these differences, we analyzed a few of these stars using independent techniques such as the Markov Chain Monte Carlo method and echelle diagrams. Through our analysis, we discovered that these alternate techniques support the neural-net inferences. We also demonstrate that the network can be used to yield estimates of coupling strength and large period separation in stars with structural discontinuities. Our findings suggest that the rate of decline in the coupling strength in the red-giant branch is greater than previously believed. These results are in closer agreement with calculations of stellar-evolution models than prior estimates, further underscoring the remarkable success of stellar-evolution theory and computation. Additionally, we show that the uncertainty in measuring large-period separation increases rapidly with diminishing coupling strength.


INTRODUCTION
Seismology is the most powerful tool with which to probe the interior structure of stars and understand their evolution. Missions such as CoRoT (Baglin et al. 2006), Kepler (Borucki et al. 2004, 2010), and TESS (Ricker et al. 2015) have detected stellar pulsations in hundreds of thousands of stars, providing valuable observations. More than twenty thousand of these pulsators are red giants (Mosser et al. 2010a; Yu et al. 2018). In the terminal stages of evolution, solar-like stars become red giants, in which hydrogen in the outer shell is depleted and core helium becomes concentrated. Oscillations in red giants are caused by the turbulent motion of gas in the external convection zone.
Seismology has enhanced our understanding of rotation profiles in red giants and clumps (Beck et al. 2012; Deheuvels, S. et al. 2014; Mauro et al. 2016), provided possible evidence for the prevalence of strong magnetic fields in the core (Fuller et al. 2015) and allowed for the seismic estimation of magnetic fields (Li et al. 2022). Combining asteroseismology with spectroscopic information of red giants enables us to carry out galactic archaeology (Hekker 2018; Anders, F. et al. 2017). It also sheds light on the long-standing question of core-envelope angular momentum transport in evolved stars (Aerts et al. 2019).
To analyse observations of the Kepler lightcurves, we follow the usual practice of transforming them into power spectra. These power spectra predominantly exhibit two kinds of resonant frequencies in solar-like stars: (a) pressure-driven modes (p-modes), which are typically spaced linearly, and (b) mixed modes, whose restoring force is pressure in the envelope and buoyancy in the core. Mixed modes propagate in both the core and the envelope but decay in the evanescent zone. They therefore provide the opportunity to analyze all three zones in red giants.
Mixed modes in solar-like stars follow a highly nonlinear pattern, so their identification and interpretation is particularly challenging. With exponentially growing datasets from ongoing and future missions such as PLATO (Rauer et al. 2014), traditional fitting methods (Benomar, O. et al. 2009; Handberg & Campante 2011; Corsaro & De Ridder 2014) become computationally challenging. While PLATO is primarily designed for main-sequence and subgiant stars, it is anticipated that there will be red giants in the observed field. With over a million light curves expected, PLATO could capture more than 200,000 red giants if the statistics are similar to those of the Kepler mission. Although there are a few semi-automated methods (Vrard et al. 2016; Gehan et al. 2018; Kallinger 2019), they still require some amount of visual inspection. It is therefore crucial to develop a fully automated, accurate and computationally cheap tool with which to analyse these stars.
Deep learning has been very successful in identifying complex patterns in a wide range of problems (Krizhevsky et al. 2017; Fawzi et al. 2022; Jiménez-Luna et al. 2020). It has also shown success in asteroseismology in scenarios such as classifying stars (Barbara et al. 2022), identifying red giants (Hon et al. 2019), measuring seismic parameters (Dhanpal et al. 2022) and measuring stellar parameters (Verma et al. 2016). In Dhanpal et al. (2022), we designed an algorithm to discern pulsations of red giants from noise and measure seismic parameters that broadly describe the structure of the core and envelope of a solar-like star. Hereafter, we refer to Dhanpal et al. (2022) as D22.
Mixed modes in red giants can be used to probe the core of the star, as shown in Bedding et al. (2011) and Beck et al. (2012). In D22, we developed a deep learning technique to measure, in one single step, the asymptotic frequency spacing (∆ν), the asymptotic period spacing (∆Π), which is related to the core size of the star, and the frequency of maximum amplitude (ν_max). In this paper, we extend the deep learning technique to measure the coupling strength (q). Jiang et al. (2020) have demonstrated that the asymptotic radial mode frequencies decrease as the coupling strength decreases during the evolutionary process of the star. While it has been acknowledged by Jiang et al. (2022) that a formalism to establish the relationship between coupling strength and physical quantities is lacking in cases of rapid structural variation, it has been established that the coupling strength is linked to the size of the evanescent zone in situations where the evanescent zones are thick (Shibahashi 1979; Unno et al. 1989; Jiang & Christensen-Dalsgaard 2014). Pinçon, C. et al. (2020) have shown that the coupling strength can serve as a probe to investigate the position of the base of the convection zone. Their study also demonstrates that the progressive migration of the evanescent region towards the convective zone in evolved red giants leads to a decrease in coupling strength.
The automated technique developed in Mosser, B. et al. (2017) enables the measurement of q in numerous stars. However, this technique involves multiple steps, such as removing ℓ = 0, 2 p-modes from the spectra, stretching the spectra, and estimating the qualitative regularity in the stretched spectra for various values of q, ∆Π, and the offset parameter ϵ_g. These steps require a considerable amount of time, taking at least ∼10s to complete. Furthermore, Ong & Gehan (2023) have provided analytical evidence indicating that this approach is susceptible to overestimation due to its sole dependence on frequencies. In contrast, the trained neural network developed in this paper yields significant advantages. The end-to-end neural network is capable of inferring all seismic parameters, such as ∆ν, ∆Π, ν_max, and q, for a given star in 5 milliseconds, offering a computational gain of at least 10^3 when compared to the aforementioned automated technique. In addition, the neural network developed in this study incorporates both frequencies and amplitudes to infer the coupling strength, thereby mitigating the potential bias of overestimation.

METHODS AND TECHNIQUE
To measure the global seismic parameters ∆ν, ∆Π, ν_max and the coupling strength, we designed a neural network and trained it on synthetic spectra. Following that, we used MCMC to verify the inferences for some specific stars. In this section, we describe the neural network, the training data, the results on synthetic spectra and the MCMC method used for the analysis.

Neural Network technique
Neural networks learn to execute a task by training on data. In this case, the network learns to retrieve seismic parameters Y from the corresponding power spectral data X. These networks use models with many neurons, which perform various non-linear operations to estimate the parameters. Each neuron comprises a set of tunable parameters (weights and biases) and a model typically contains millions of parameters. Neural networks learn to adjust these parameters by minimizing the error between the estimates and true labels.
Instead of directly predicting the real number associated with each parameter, we pose the task as a classification problem. To implement this, the ranges of the seismic parameters are divided into bins and the network estimates the probability score in each bin using a softmax function.
The advantage of classification is that it provides a Bayesian posterior distribution (Richard & Lippmann 1991) for a seismic parameter. Regression tasks typically do not offer this directly, but there are techniques like k-fold cross-validation that can estimate probability distributions (James et al. 2014). In regression with k-fold cross-validation, the network is trained multiple times on different data subsets, increasing training time. In contrast, classification provides a probability distribution over the classes by training the network only once, eliminating the need for additional iterations or folds and significantly reducing training time. Although mixture density networks (Bishop 1994; Hon et al. 2020) can be used to obtain probability distributions and train the network in a single pass, they require a predefined number of modes and are prone to mode collapse. A thorough investigation of the advantages of different methods is beyond the scope of this paper.
In the classification framework employed, the selection of bin size holds significant importance. It is essential to strike a balance where the bin size is sufficiently small to ensure accurate parameter resolution, while also being large enough to provide an ample number of training examples within each bin for effective network training. For the current study, bin sizes of 0.1µHz for ∆ν, 2.5s (in the range of 40-150s) and 7s (in the range of 150-500s) for ∆Π, 2.8µHz for ν_max and 0.02 for q were employed. These bin sizes were chosen so that the uncertainties in the parameters are approximately comparable to the bin size, and they are representative of the typical uncertainties encountered in classical methods, such as fitting methods. However, we recognize that the uncertainties obtained by classical methods, especially in cases with low coupling constants, may be larger than typical published values for ∆Π, as shown in section 3.3. In future experiments, we plan to increase the number of samples and decrease the bin size to improve precision in parameters.
We designed the neural network to estimate all four parameters at once. To carry out this task, we connected four different softmax layers representing the four seismic parameters. As shown in Figure 1, we built this network using convolutional layers, LSTM cells and dense layers, all connected to four output softmax layers. Initially, our experimentation with the network involved solely convolutional layers. We then incorporated LSTM cells and increased the kernel size from 5 to 50 to improve the network's performance, and devised the final design of the network based on these enhancements. To train this neural net, we optimize the cross-entropy loss on training data using the Adam optimizer. The network is considered trained if its performance on unseen data is as good as on training data. For each power spectrum, the trained neural network produces four probability vectors that correspond to the four parameters.
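The classification set-up described above can be illustrated with a minimal sketch. The bin width of 0.02 for q is the value quoted in this section; the range 0-0.6, the helper names `softmax` and `bin_index`, and the example scores are assumptions made purely for illustration and are not taken from the actual network.

```python
import math

def softmax(scores):
    """Convert raw network scores into probabilities that sum to 1."""
    peak = max(scores)                       # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def bin_index(value, lower, width, n_bins):
    """Map a continuous parameter value onto the index of its class (bin)."""
    idx = int((value - lower) // width)
    return min(max(idx, 0), n_bins - 1)      # clip values that fall outside the range

# Hypothetical q grid: 0.02-wide bins on an assumed range 0-0.6 (30 classes)
Q_LOWER, Q_WIDTH, Q_BINS = 0.0, 0.02, 30

label = bin_index(0.13, Q_LOWER, Q_WIDTH, Q_BINS)   # class index used as the training label
# Toy scores that favour the labelled bin; a real network would produce these
probs = softmax([0.0] * label + [2.0] + [0.0] * (Q_BINS - label - 1))
```

The probability vector `probs` plays the role of the per-bin score that the softmax layer returns for one seismic parameter.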
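As a worked example of the binning, the two ∆Π ranges and bin sizes quoted above imply a specific number of classes if the ranges are tiled exactly. The sketch below assumes such exact tiling; the function name `period_spacing_edges` is hypothetical and the actual class layout of the network may differ.

```python
def period_spacing_edges():
    """Bin edges (in s) for ∆Π: 2.5 s bins on 40-150 s, then 7 s bins on 150-500 s."""
    fine = [40.0 + 2.5 * i for i in range(44)]      # 40.0, 42.5, ..., 147.5
    coarse = [150.0 + 7.0 * i for i in range(51)]   # 150, 157, ..., 500
    return fine + coarse

edges = period_spacing_edges()
n_classes = len(edges) - 1    # (150-40)/2.5 = 44 fine bins plus (500-150)/7 = 50 coarse bins
```

Under this assumption the ∆Π output layer would have 94 classes, with finer resolution in the 40-150s range.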
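The training objective for a network with four softmax outputs can be sketched as follows. This sketch assumes that the four cross-entropy terms are simply added with equal weights, which the text does not specify; the names `cross_entropy` and `total_loss` and the dictionary keys are illustrative only.

```python
import math

def cross_entropy(probs, true_class, eps=1e-12):
    """Negative log-probability assigned to the true class of one output head."""
    return -math.log(max(probs[true_class], eps))

def total_loss(outputs, labels):
    """Sum the cross-entropy over all heads (equal weighting is an assumption)."""
    return sum(cross_entropy(outputs[name], labels[name]) for name in labels)

# Toy example: every head returns a uniform distribution over four classes
uniform = [0.25, 0.25, 0.25, 0.25]
outputs = {"dnu": uniform, "dpi": uniform, "numax": uniform, "q": uniform}
labels = {"dnu": 0, "dpi": 1, "numax": 2, "q": 3}
loss = total_loss(outputs, labels)
```

For this toy input each head contributes ln 4, so the total is 4 ln 4.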

Training data
To train the neural network, we constructed a synthetic training dataset based on the asymptotic theory of oscillations (García & Ballot 2019; Aerts et al. 2010). This theory incorporates the physics of structure, composition gradient, and rotation in red giants. The formulation applied in the simulations is described in Appendix A of Dhanpal et al. (2022). In D22, it is demonstrated that our simulations are of high quality, are realistic and cover a wide parameter space.

Modeling frequencies
Within the predictions of asymptotic theory, dipole mixed-mode frequencies in solar-like stars are given by the implicit equation 1. Although not in its current form, the equation was originally developed in Unno et al. (1989). It was further improved in Mosser et al. (2012, 2015); Farnir, M. et al. (2021); Lindsay et al. (2022) and Ong & Gehan (2023). In this study, we adopted the formalism presented in Mosser et al. (2015) to facilitate a direct comparison between our neural network inferences and the values provided by the same formulation.
where ν_p are the frequencies of pure p modes. In order to model the general oscillation pattern of p modes in red-giant stars, we have employed equation 1 of Mosser et al. (2012) for ν_p (equation 2), where ∆ν is the large-frequency separation, which gives the mean-frequency separation between two successive radial modes, ϵ_p(∆ν) is the offset parameter, d_0ℓ the small-frequency separation, and α_ℓ the degree-dependent gradient α_ℓ = (d log ∆ν/dn)_ℓ.
In equation 1, ν_g is given by 1/ν_g = (n + ϵ_g)∆Π, where ϵ_g is the offset parameter corresponding to g-modes. The relationship between the frequencies ν_g and the pure g-mode frequencies (ν_g,pure) is given in Ong & Gehan (2023) and Lindsay et al. (2022). To determine the frequencies of the mixed modes, we solve the implicit equation 1. By calculating the intersections between the left and right sides of the equation, we identify the mixed-mode frequencies within a range of approximately 1.2∆ν for each pure ℓ = 1 p mode. In this process, we remove any duplicate frequencies that lie within the frequency resolution of a 4-year observation of each other. Additional details are described in Benomar (2023): external/ARMM/solver mm.cpp.
Pure p modes oscillate predominantly in the envelope and pure g modes are restricted to the core region. In contrast, mixed modes oscillate in both the core and envelope, while decaying in the evanescent zone between the core and envelope. We predominantly observe ℓ = 1 modes to be mixed modes. The coupling strength is inversely related to the size of the evanescent zone (Takata 2016). The transmission factor T of the mixed mode from the g-mode to the p-mode cavity is related to the coupling strength through equation 3. The transmission factor of the mixed mode is proportional to exp(−∫κ dr).
The wave-number κ in the evanescent zone is expressed in terms of Ŝ_1 and N_BV, the modified versions of the Lamb and Brunt-Väisälä frequencies, respectively (Takata 2006).
The gradients of the Lamb and Brunt-Väisälä frequencies are proportional to the density contrast between the core and envelope. As the star evolves, the thickness of the intermediate evanescent zone increases, causing the coupling strength to decrease further.
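For orientation, the two relations referred to in this subsection are commonly written in the literature in the forms below. This is a sketch of the standard expressions associated with Mosser et al. (2012, 2015), not a transcription of the equations of this paper; in particular, the symbol $n_{\max} = \nu_{\max}/\Delta\nu$ is an assumption introduced here.

```latex
% Implicit mixed-mode relation (commonly quoted form of Mosser et al. 2015)
\tan\left[\pi\,\frac{\nu-\nu_p}{\Delta\nu}\right]
  = q\,\tan\left[\frac{\pi}{\Delta\Pi}\left(\frac{1}{\nu}-\frac{1}{\nu_g}\right)\right]

% Pure p-mode pattern (commonly quoted form of Mosser et al. 2012)
\nu_p = \left(n + \frac{\ell}{2} + \epsilon_p(\Delta\nu) - d_{0\ell}
  + \frac{\alpha_\ell}{2}\,\left(n - n_{\max}\right)^2\right)\Delta\nu
```

In the first relation the left-hand side carries the p-mode phase and the right-hand side the g-mode phase, so q controls how strongly the two cavities are coupled.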
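The root-finding step can be illustrated with a minimal sketch. It assumes that the implicit relation takes the commonly quoted form tan[π(ν − ν_p)/∆ν] = q tan[π(1/(∆Π ν) − ϵ_g)], rewritten with an arctangent, and it searches only within ±∆ν/2 of a single pure p mode rather than the ∼1.2∆ν window used in the simulations. The function name, the bisection strategy and the example parameter values are illustrative assumptions and do not reproduce the solver of Benomar (2023).

```python
import math

def mixed_mode_frequencies(nu_p, dnu, dpi, q, eps_g=0.0, tol=1e-9):
    """Dipole mixed-mode frequencies (µHz) around one pure p mode.

    nu_p and dnu are in µHz, dpi is in seconds.
    """
    def phase(nu):
        # Dimensionless g-mode phase 1/(∆Π ν) - ϵ_g, with ν converted from µHz to Hz
        return 1.0e6 / (dpi * nu) - eps_g

    def residual(nu):
        # Vanishes where tan(π(ν - ν_p)/∆ν) = q tan(π phase), written in arctan form
        return nu - nu_p - dnu / math.pi * math.atan(q * math.tan(math.pi * phase(nu)))

    j_high = int(round(phase(nu_p - dnu / 2))) + 1   # branch index at the low-frequency end
    j_low = int(round(phase(nu_p + dnu / 2))) - 1    # branch index at the high-frequency end
    roots = []
    for j in range(j_high, j_low - 1, -1):           # descending j means ascending frequency
        if j - 0.5 + eps_g <= 0:
            continue
        # tan(π phase) is singular at phase = j ± 1/2; the residual is monotonic in between
        a = 1.0e6 / (dpi * (j + 0.5 + eps_g))
        b = 1.0e6 / (dpi * (j - 0.5 + eps_g))
        pad = 1e-9 * (b - a)
        a, b = a + pad, b - pad
        if residual(a) >= 0.0 or residual(b) <= 0.0:
            continue                                 # no sign change on this branch
        while b - a > tol:                           # plain bisection
            mid = 0.5 * (a + b)
            if residual(mid) < 0.0:
                a = mid
            else:
                b = mid
        roots.append(0.5 * (a + b))
    return roots

# Hypothetical star: ν_p = 100 µHz, ∆ν = 8 µHz, ∆Π = 80 s, q = 0.15
modes = mixed_mode_frequencies(nu_p=100.0, dnu=8.0, dpi=80.0, q=0.15)
```

Because the residual is monotonic between consecutive singularities of the tangent, each g-mode branch contributes at most one mixed mode, which avoids the duplicate roots that a generic intersection search has to remove afterwards.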
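The link between the transmission factor and the coupling strength can be made concrete with a short sketch. It assumes the expression of Takata (2016), q = (1 − √(1 − T²))/(1 + √(1 − T²)), which is cited in the text but not reproduced in it; the function name is hypothetical.

```python
import math

def coupling_from_transmission(T):
    """Coupling strength q for a transmission factor 0 <= T <= 1 (assumed Takata 2016 form)."""
    root = math.sqrt(1.0 - T * T)
    return (1.0 - root) / (1.0 + root)

q_weak = coupling_from_transmission(0.1)    # thick evanescent zone, weak transmission
q_half = coupling_from_transmission(0.5)
q_full = coupling_from_transmission(1.0)    # full transmission
```

For small T this expression behaves as q ≈ T²/4, so a thicker evanescent zone, which lowers T exponentially, reduces q even faster.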

Modeling the power spectrum
The stellar power spectrum comprises the oscillation spectrum S(ν, Θ_S) and the noise profile N(ν, Θ_N). The power spectrum model M(ν, Θ) at frequency ν is computed using asymptotic theory as $M(\nu, \Theta) = S(\nu, \Theta_S) + N(\nu, \Theta_N)$, where Θ_S and Θ_N comprise the parameters of the oscillation model and the noise, respectively. The oscillation power spectrum is a sum of Lorentzians with heights H(n, ℓ, m) centered around ν(n, ℓ, m) with widths Γ(n, ℓ, m).
The frequencies ν(n, ℓ, m) are modeled using ℓ = 0, 2, 3 p-modes and dipole mixed modes, given in equations 2 and 1 respectively. The oscillation spectrum model is therefore the sum of these Lorentzian profiles over (n, ℓ, m). The heights H(n, ℓ, m) are modeled in terms of the mode visibility V(ℓ), a factor A(n) that depends on radial order, and the relative amplitude of the mode r_ℓ,m(ι), which depends on the inclination angle ι. The visibility function is influenced by both the limb-darkening function, which varies depending on the type of star, and the measurement technique employed. For example, in a star with depressed dipole modes, V(ℓ=1) tends to be small.
In addition to considering the visibility factor of ℓ = 1 modes, the heights and widths of the mixed modes are specified relative to those of the ℓ = 0 modes. Here Γ_1 and A_1(ν) are the width and amplitudes of the mixed modes, respectively, and Γ_0 and A_0 are the width and amplitudes of the ℓ = 0 modes, which are derived from a few templates using red-giant and subgiant stars, as described in section A.1.3 of Dhanpal et al. (2022). These expressions involve the function ζ: for a mode trapped as a p-dominated mixed mode, ζ tends to approach 0, whereas for a mode trapped as a g-dominated mixed mode, ζ tends to approach 1.
The noise profile is essentially due to the convective motions at the surface of the stars. The noise model is constructed as the sum of a low-frequency (lf) Harvey profile and a high-frequency (hf) Harvey profile, where H is the characteristic granulation amplitude, τ is the characteristic timescale of granulation, p is the characteristic power law and N_0 is the white noise level.

Figure 1. Architecture of the neural network used to infer seismic parameters. The network takes in the normalized power spectrum as input and outputs probability distributions of ∆Π, ∆ν, ν_max and q. The neural network comprises 1D convolutional layers, LSTM cells and a dense layer. We used dropout layers to prevent over-fitting.
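The structure of the power-spectrum model described in this subsection can be sketched in a few lines. The sketch assumes that Γ denotes the full width at half maximum of each Lorentzian and that each Harvey profile has the widely used form H/(1 + (τν)^p); the function names and all numerical values are illustrative assumptions rather than the parameters used to build the training set.

```python
def lorentzian(nu, nu0, height, width):
    """Lorentzian with peak `height` and full width at half maximum `width`."""
    return height / (1.0 + 4.0 * ((nu - nu0) / width) ** 2)

def harvey(nu, amplitude, tau, power):
    """Harvey-like background component (assumed form H / (1 + (τ ν)^p))."""
    return amplitude / (1.0 + (tau * nu) ** power)

def model_spectrum(nu, modes, lf, hf, white):
    """M(ν) = S(ν) + N(ν): sum of Lorentzians plus two Harvey profiles and white noise."""
    signal = sum(lorentzian(nu, nu0, h, w) for nu0, h, w in modes)
    noise = harvey(nu, *lf) + harvey(nu, *hf) + white
    return signal + noise

# Hypothetical single mode at 100 µHz with height 10 and width 0.2 µHz
modes = [(100.0, 10.0, 0.2)]
value = model_spectrum(100.0, modes, lf=(50.0, 0.05, 2.0), hf=(20.0, 0.01, 2.0), white=1.0)
```

Each mixed mode would enter `modes` with its own height and width, which is how the ζ-dependent scaling of g-dominated modes affects the spectrum.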

Description of Dataset
The asymptotic theory described thus far has been implemented in Benomar (2023), which generates spectra for a given set of seismic parameters. Using this theory, we built two large training datasets comprising 1 million high-frequency red-giant branch (RGB) oscillators, along with 4 million samples of low-frequency RGB and clump oscillators. The distributions of these two datasets are shown in Table 1. While generating these datasets, we used a uniform random distribution to sample ∆ν, q, ∆Π and most of the parameters, as shown in Table 1. However, we used an isotropic distribution (P(ι) ∝ sin ι) to sample the inclination angle, and the distribution of Kepler red giants to sample the signal-to-noise ratio (SNR) and observation time.
To create these datasets, we have treated ∆ν, ∆Π and q as independent parameters. Within this dataset, certain combinations of seismic parameters used for generating spectra may lack a corresponding theoretical model for stars. An example of such a combination is ∆ν of 18µHz, ∆Π of 41s and q of 0.5. Nonetheless, incorporating these examples in the training process helps the neural network to learn these seismic parameters by exposing it to the diverse patterns present in the data.
The red giants in our training data were categorized into two groups based on their ∆ν values: low-frequency and high-frequency RGB stars. An RGB star was classified as low frequency if its ∆ν was less than 9µHz, while those with ∆ν greater than or equal to 9µHz were classified as high frequency. This classification was heuristic, taking into account findings from Mosser et al. (2014), which indicated that a significant portion of clumps and several hydrogen shell-burning stars fall within the parameter range of ∆ν < 9.5µHz. Therefore, we set the separation point at 9µHz. The designation of high-frequency RGB stars for ∆ν ≥ 9µHz stems from their higher ν_max values compared to the other class, while the remaining class was referred to as low-frequency RGB stars.
Though there are two different datasets, we trained only one neural net, which works across both datasets. We made this choice because no prior information is available about a new star's power spectrum and because we want the analysis to be independent. The neural net takes in normalized power spectra and outputs marginal distributions of all four seismic parameters (∆ν, ν_max, ∆Π, q), as shown in Figure 1.
In order to optimize the utilization of our computational resources, we generated a dataset comprising 5 million examples and trained the neural network within this constraint. We anticipate that increasing the number of samples will improve the performance, and we have plans to expand the dataset in future research. However, for the purposes of this paper, we specifically focused on generating a larger number of samples for low-frequency stars. The proximity of peaks in these stars presents a more challenging task for the machine in learning the distinctive features. Consequently, we established a 4:1 ratio between the low- and high-frequency portions of the dataset. It is important to acknowledge that the current ratio is heuristic, and further experimentation is required to optimize the network. However, the exploration of this aspect is beyond the scope of the present paper.
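Sampling the inclination angle from the isotropic distribution P(ι) ∝ sin ι can be done by inverting the cumulative distribution, as in the sketch below. The range 0 to π/2 is an assumption made here (the sampled range is specified in Table 1, which is not reproduced), and the function name is hypothetical.

```python
import math
import random

def sample_inclination(rng):
    """Draw ι in [0, π/2] with P(ι) ∝ sin ι via inverse-transform sampling."""
    # The cumulative distribution is 1 - cos ι, so cos ι is uniform on [0, 1]
    return math.acos(rng.random())

rng = random.Random(0)                       # fixed seed for reproducibility
angles = [sample_inclination(rng) for _ in range(20000)]
mean_angle = sum(angles) / len(angles)       # the analytic mean of this distribution is 1 rad
```

A uniform draw in ι would over-represent pole-on stars, whereas this scheme reproduces the excess of near equator-on orientations expected for randomly oriented rotation axes.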

Results on Synthetic data
For a diverse set of 5 million simulated red-giant spectra, we train the machine to obtain probability distributions of ∆ν, ∆Π, ν_max and q, as shown in Figure 1. Although the machine can infer the offset parameters ϵ_p and ϵ_g, the accuracy of these parameters on our simulations is relatively low and these parameters are not the focus of this article. Hence, we do not report them.
Once trained, the network is evaluated over 30,000 unseen simulated spectra which span a large parameter space. We present the results of the estimates on these unseen spectra in Figure 2. Figure 2(a) plots coupling-constant predictions (q_pred) as a function of injected values (q_true). It shows that the predictions and true values are highly correlated. As the prediction uncertainty decreases, the extent of the correlation increases. Figure 2(b) graphs the distribution of errors in the network estimates. It shows that we can recover the coupling strength to within 0.02 for 67% of predictions with uncertainty less than 0.03. Hence, we label a prediction with uncertainty less than 0.03 as a confident prediction, which is a heuristic choice.
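The confidence criterion can be illustrated by summarizing a binned probability vector. The sketch below assumes that the point estimate is the median and that the uncertainty is the half-width of the central 68% interval; the text does not state how the uncertainty is computed, so this definition, the function name and the toy distribution are assumptions.

```python
def summarize(probs, centers, level=0.68):
    """Median and half-width of the central `level` interval of a binned distribution."""
    def quantile(target):
        cumulative = 0.0
        for p, c in zip(probs, centers):
            cumulative += p
            if cumulative >= target:
                return c
        return centers[-1]
    lo, med, hi = (quantile(t) for t in ((1 - level) / 2, 0.5, (1 + level) / 2))
    return med, 0.5 * (hi - lo)

# Toy network output on an assumed q grid with 0.02-wide bins centred at 0.01, 0.03, ...
centers = [0.01 + 0.02 * i for i in range(30)]
probs = [0.0] * 30
probs[5], probs[6], probs[7] = 0.2, 0.6, 0.2

q_med, q_err = summarize(probs, centers)
confident = q_err < 0.03      # threshold for a confident prediction quoted in the text
```

With bins of width 0.02, a distribution concentrated in about three adjacent bins sits right at the edge of this criterion, which shows how the bin size limits the attainable precision.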

Markov Chain Monte Carlo (MCMC)
Here, we briefly summarize the Markov-Chain Monte-Carlo (MCMC) fitting model that serves as a reference for our neural network technique. MCMC is a powerful algorithm to sample the underlying probability distribution. Compared to gradient-descent methods (MLE or MAP), MCMC is robust to local maxima and provides accurate parameter estimates. Therefore, it serves as an ideal choice for establishing a benchmark baseline in the field of machine learning. In this case, we sample the two-degrees-of-freedom χ²-likelihood distribution, which is calculated using the power spectral data of the star y and the power spectrum model M(ν, Θ) described in section 2.2.2. We follow the methodology described in Benomar, O. et al. (2009) to fit the model to the data and obtain the underlying parameter distribution. Here, we briefly recall the method described in that publication.

Bayesian formalism
To fit for seismic parameters from the given data, we use a Bayesian formalism, where we obtain the posterior distribution of the parameters given the observations, π(Θ|y, M_i, I). Using Bayes' theorem, the posterior distribution is given by $\pi(\Theta|y, M_i, I) = \pi(\Theta|M_i, I)\, L(y|\Theta, M_i, I) / \pi(y|M_i, I)$, where L(y|Θ, M_i, I) is the likelihood function, π(Θ|M_i, I) is the prior of the parameters and π(y|M_i, I) is the evidence.
As the observed power spectrum follows 2-dof χ² statistics, the likelihood function L(y|Θ, M_i, I) is given by $L(y|\Theta, M_i, I) = \prod_k \frac{1}{M(\nu_k, \Theta)} \exp\left[-\frac{y_k}{M(\nu_k, \Theta)}\right]$, where the product runs over the frequency bins ν_k of the spectrum.
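In practice this likelihood is usually evaluated in logarithmic form for numerical stability. The sketch below assumes the standard expression for power spectra following χ² statistics with two degrees of freedom, ln L = −Σ_k [ln M_k + y_k/M_k]; the function name and the toy numbers are illustrative.

```python
import math

def log_likelihood(power, model):
    """ln L = -sum(ln M_k + y_k / M_k) for chi-squared statistics with 2 dof."""
    return -sum(math.log(m) + y / m for y, m in zip(power, model))

# Toy two-bin spectrum: the model that matches the data has the highest likelihood
data = [1.0, 2.0]
ll_match = log_likelihood(data, [1.0, 2.0])
ll_low = log_likelihood(data, [0.5, 1.0])
ll_high = log_likelihood(data, [2.0, 4.0])
```

For each bin the term ln M + y/M is minimized at M = y, so both underestimating and overestimating the spectrum lower the likelihood.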

Metropolis Hastings algorithm
For a given model S, we find the best-fit parameters by maximizing π(Θ|y, S, I). This may be achieved by maximizing the likelihood function π(y|Θ, S, I) or, equivalently, minimizing −log π(y|Θ, S, I). However, in a Bayesian context, one needs to maximize the posterior distribution, as shown in equation 12.
We sample the posterior using the Metropolis-Hastings algorithm (Metropolis et al. 1953; Hastings 1970). To find the best-fit parameters, we sample these posterior distributions by drawing random points from the prior and accepting them at a ratio α. Further details about the entire formalism, including the acceptance ratio, can be found in Benomar, O. et al. (2009).
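The sampling loop can be sketched as follows. This is a generic one-dimensional Metropolis-Hastings sampler with a symmetric Gaussian random-walk proposal, for which the acceptance ratio reduces to α = min(1, π(Θ′)/π(Θ)); it is an assumption made for illustration and differs from the proposal scheme and multi-parameter set-up of Benomar, O. et al. (2009). The target used here is a toy Gaussian posterior.

```python
import math
import random

def metropolis_hastings(log_post, start, step, n_samples, rng):
    """Random-walk Metropolis sampler; returns the chain and the acceptance rate."""
    chain, current = [], start
    current_lp = log_post(current)
    accepted = 0
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio), compared in log space
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain.append(current)
    return chain, accepted / n_samples

rng = random.Random(1)
# Toy log-posterior: Gaussian centred on 2 with unit variance
chain, rate = metropolis_hastings(lambda t: -0.5 * (t - 2.0) ** 2, 0.0, 1.0, 20000, rng)
burned = chain[2000:]                        # discard the burn-in phase
mean = sum(burned) / len(burned)
```

The step size controls the acceptance rate, which is the quantity monitored in the text when assessing convergence of the chains.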

RESULTS ON KEPLER DATA
After training the network on synthetic examples and building trust based on the attendant predictions, we analysed the Kepler data. To corroborate the network performance on Kepler data, we compared the network inferences with those obtained from other methods. We used the Python package Lightkurve (Lightkurve Collaboration et al. 2018) to extract the Kepler lightcurves and construct the power spectra.
The marginal distributions of seismic parameters obtained from the network may take varied forms and have a range of uncertainties. To compare the predictions with other methods, we only selected confident predictions, i.e., where the uncertainty in q is ≤ 0.03 and the uncertainty in ∆ν is ≤ 0.1µHz. The reference values are taken from Mosser, B. et al. (2017), which consists of a catalogue of 5166 stars. As we are comparing only confident machine predictions, we selected stars common to the confident predictions and the catalogue presented in Mosser, B. et al. (2017). Among the 5166 stars, 53.8% of the network's measurements exhibit lower uncertainties than those reported in Mosser, B. et al. (2017), and this percentage remains at 47.4% among the subset of 3465 stars not meeting the confidence criteria for measurements.

Machine inferences on Kepler data
In Figure 3(a), we have plotted the neural net's predictions (q_pred) against published values (q_ref), and Figure 3(b) graphs the distribution of differences between the respective values. In 730 stars, the differences between predictions and published values are less than 0.03. In 309 stars, the published values are greater than predictions by at least 0.08. In 301 stars, the coupling strength measurements of the machine are ≤ 0.05. As the network's measurements in these 309 stars are lower than published values, we investigated some of these stars by constructing echelle diagrams. Figure 4 shows an example of echelle diagram analysis. For KIC 10157507, the network's measurement of q is 0.03±0.02, whereas Mosser, B. et al. (2017) measured the coupling strength to be ∼0.11. In Figures 4(b) and 4(c), we show synthetic echelle diagrams with injected q = 0.03 and q = 0.11 respectively. As q increases from 0.03 to 0.11, the number of distinct and apparent ℓ = 1 mixed modes increases. In fact, there is a conspicuous difference in the amplitude of g-dominated modes between the two cases. When we compare the echelle diagram of KIC 10157507 shown in Figure 4(a) to its synthetic analogues, the star is qualitatively closer to the echelle diagram with q = 0.03.
To establish additional evidence, we have also analysed this star using MCMC. We obtain the underlying posterior distributions for KIC 10157507 using MCMC with the priors shown in Table 2. The best-fit model is computed using the medians of the posterior distributions. We show a comparison of the best-fit model to smoothed data in Figure 5. We smoothed the data using a box-car of width 0.05µHz. We present the mixed-mode parameter distributions of q and ∆Π in Figure 6. In the context of a specific neural network, the training data serves as a prior. The neural network developed here outputs probability distributions that can be seen as analogous to marginal posterior distributions obtained through MCMC sampling. As the neural network outputs are Bayesian (Richard & Lippmann 1991), comparing them to MCMC is a like-to-like comparison. However, the uncertainties in their output distributions also depend on network complexity and the number of training samples. Therefore, contrasting these distributions to MCMC posterior distributions can provide validation for the uncertainties and distributions we have obtained. Figure 6(a) suggests that the distributions of coupling strength obtained by the neural net and MCMC are in statistical agreement, i.e., to within an errorbar. The current network cannot accurately replicate the MCMC posterior distribution due to its large bin size. But both these methods affirm that q < 0.05 for KIC 10157507, demonstrating that some red giants may have a measurable low coupling strength.

Figure 3. (a) Predicted value of q by the network at each value of published q in 1701 stars from Mosser, B. et al. (2017). These 1701 stars are selected such that the estimates of both ∆ν and q are precise to within 0.1µHz and 0.03 respectively. In this plot, the points marked with 'R' are RGB stars and the points marked with 'C' are clump stars. The grey lines associated with each point are the uncertainties and the black solid line corresponds to zero difference. The values of ∆ν and ∆Π allow us to differentiate between RGB stars and clump stars (Vrard et al. 2016; Stello et al. 2013). Specifically, a star is classified as a clump star if ∆ν is less than 10µHz and ∆Π is greater than 150s; otherwise it is an RGB star. (b) Distribution of the difference in q in all 1701 stars.
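An echelle diagram of the kind used in this comparison is built by folding the frequency axis with the large frequency separation. The sketch below shows only this folding step for a list of mode frequencies; the function name, the optional offset and the example frequencies are illustrative assumptions, and the diagrams in Figure 4 are constructed from the full power spectrum rather than from a frequency list.

```python
def echelle_coordinates(frequencies, dnu, offset=0.0):
    """Return (ν mod ∆ν, row index) for each frequency, i.e. the echelle coordinates."""
    return [((nu - offset) % dnu, int((nu - offset) // dnu)) for nu in frequencies]

# Hypothetical frequencies (µHz) with ∆ν = 8 µHz: three modes spaced by ∆ν and one in between
coords = echelle_coordinates([81.0, 89.0, 97.0, 85.5], dnu=8.0)
```

Modes separated by exactly ∆ν stack vertically at the same abscissa, so p modes form near-vertical ridges while the more densely and unevenly spaced mixed modes spread out around the ℓ = 1 ridge.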
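The RGB/clump criterion stated in the Figure 3 caption can be written as a one-line rule. The function name and the example values are hypothetical; the thresholds are those quoted in the caption.

```python
def evolutionary_class(dnu, dpi):
    """Return 'clump' if ∆ν < 10 µHz and ∆Π > 150 s, otherwise 'RGB'."""
    return "clump" if dnu < 10.0 and dpi > 150.0 else "RGB"

labels = [evolutionary_class(4.0, 300.0),    # low ∆ν and large ∆Π
          evolutionary_class(4.0, 80.0),     # low ∆ν but small ∆Π
          evolutionary_class(15.0, 300.0)]   # ∆ν above the threshold
```

Both conditions must hold for a clump classification, so a star with large ∆Π but ∆ν above 10µHz is still labelled RGB.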

Inferences on clumps with glitches
Figure 8 showcases the measurements of the coupling strength and period spacing on 23 stars from Vrard et al. (2022) that exhibit structural discontinuities¹. In Figure 8(a), we plot the neural network's estimates of the coupling strength (q_pred) against the reported values (q_ref), while Figure 8(b) displays the distribution of differences between the respective values. Similarly, in Figure 8(c), we plot the neural network's estimates of the period spacing (∆Π_pred) against the reported values (∆Π_ref), and Figure 8(d) illustrates the distribution of differences between these values.
It can be observed that the uncertainties in the neural network's estimates of q are larger than 0.03 for the majority of the stars, which does not meet our criterion for confident predictions. Despite the lack of precision, the medians of the distributions obtained from the neural network exhibit a strong correlation with the values reported in Vrard et al. (2022). The neural network measurements are compared with those of Vrard et al. (2022) using agreement thresholds of 0.05 for q and 10s for ∆Π; 17 stars agree within 10s for ∆Π. These thresholds were chosen based on the approximate mean uncertainty of the respective parameters for these stars. It is worth noting that larger uncertainties are expected since the training data does not encompass such variations.
¹ We have considered the stars given in https://www.nature.com/articles/s41467-022-34986-z/tables/1

Period-spacing measurements in low q stars
Figure 4. (b, c) Echelle diagrams of typical simulated stars with q = 0.03 and q = 0.11. In all these echelle diagrams, blue rectangles denote ℓ = 0 p-modes, green rectangles indicate ℓ = 2 p-modes and red rectangles mark ℓ = 1 mixed modes. The echelle diagram of KIC 10157507 is closer to the simulated echelle diagram associated with q = 0.03 when compared with that of q = 0.11, indicating that the star's q ≤ 0.05.

Figure caption: Panel (b) presents the MCMC distribution in the q-∆Π plane, which includes the 68% and 95% confidence intervals. The marginal distributions of q and ∆Π obtained through MCMC are shown in panels (a) and (c), respectively. The green dashed lines in both panels represent the distributions obtained by the neural network. Panel (d) compares the distributions of the coupling strength obtained from both methods, MCMC and the neural network. The blue line represents the initial value used in the MCMC run, with q initialized to a value larger than the neural network's prediction to avoid bias. The agreement between the two distributions supports our conclusion. In panel (e), the distributions of the period spacing (∆Π) from the MCMC and neural network methods are compared. The blue line represents the MCMC initialization. The two distributions do not converge to each other. The bin sizes in panels (d) and (e) correspond to the bin sizes of the neural network, for a better comparison with the network's outputs.

In D22, we showed that the machine-learning model is successful at measuring ∆Π in red giants across all evolutionary stages. We show measurements of ∆Π in stars with q less than 0.05. Figures 6(b) and 9(d) indicate that the network's ∆Π distributions do not exhibit any prominent peaks, and that precise estimation of ∆Π may not be possible. We established this also using MCMC, as seen in Figure 6(b). We find that ∆Π does not converge to a single value, instead possessing three distinct peaks in its distribution, thereby suggesting the possibility that ∆Π cannot be constrained in this star. It implies that q can be measured reliably even if ∆Π cannot be constrained.
While the neural net is unable to output precise estimates of ∆Π in stars with low q, it is able to return accurate estimates in stars with high q. For example, in KIC 10001994, q is well above 0.1 and the measurement of ∆Π is precise, as presented in figure 9(c).
From the three cases described so far, it is seen that the precision in measuring ∆Π is highly correlated with the coupling strength. To illustrate this, we studied the uncertainty in inferring ∆Π as a function of q. We present these results in figure 10, where we plot the relative uncertainty at each level of q. As q increases, the uncertainty in the ∆Π predictions decreases. This study establishes that, for a measurement of ∆Π with relative uncertainty less than 0.2, q needs to be greater than ∼0.05.
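The binning behind such a plot can be sketched as follows. This is a minimal illustration, not the pipeline used in this work: the function name, the bin edges, and the toy catalogue (in which the relative uncertainty is made to scale as 0.01/q) are all assumptions chosen only to show how a median and a 1-σ interval per q bin may be computed.

```python
import numpy as np

def binned_relative_uncertainty(q, dpi, dpi_sigma, q_edges):
    """Median and 16th/84th percentiles of sigma(dPi)/dPi in each q bin.

    Returns rows of (bin centre, median, 16th percentile, 84th percentile).
    """
    q = np.asarray(q, dtype=float)
    rel = np.asarray(dpi_sigma, dtype=float) / np.asarray(dpi, dtype=float)
    rows = []
    for lo, hi in zip(q_edges[:-1], q_edges[1:]):
        in_bin = (q >= lo) & (q < hi)
        if not in_bin.any():
            continue  # skip empty bins
        p16, p50, p84 = np.percentile(rel[in_bin], [16, 50, 84])
        rows.append((0.5 * (lo + hi), p50, p16, p84))
    return np.array(rows)

# Hypothetical toy catalogue (assumption): uncertainty shrinks as q grows.
rng = np.random.default_rng(0)
q_toy = rng.uniform(0.01, 0.15, 2000)
dpi_toy = np.full(q_toy.size, 80.0)  # period spacing in seconds
dpi_sigma_toy = dpi_toy * 0.01 / q_toy * rng.uniform(0.8, 1.2, q_toy.size)

table = binned_relative_uncertainty(
    q_toy, dpi_toy, dpi_sigma_toy, np.linspace(0.01, 0.15, 8)
)
for q_mid, med, lo, hi in table:
    print(f"q={q_mid:.3f}  median={med:.2f}  1-sigma=[{lo:.2f}, {hi:.2f}]")
```

With real network outputs, `dpi` and `dpi_sigma` would be the central value and width of each star's predicted ∆Π distribution.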
The correlation between the precision of the ∆Π measurement and the coupling strength q in figure 10 may also be understood from theory. At low values of the coupling strength, the transmission factor is low and the amplitudes of g-dominated mixed modes are tiny. Since the transmission factor is low, modes decay significantly in the evanescent region, implying that the Brunt-Väisälä cavity cannot be accurately probed. This further indicates that ∆Π cannot be measured precisely, as seen in our observations.
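To make the link between transmission and coupling concrete, the sketch below evaluates the relation q = (1 − √(1 − T²))/(1 + √(1 − T²)) between the coupling strength and the transmission factor T, which is the form usually quoted from the asymptotic analysis of Takata (2016) and which reduces to q ≈ T²/4 for weak coupling. We assume, rather than know, that this is the relation the argument above relies on; the function name is ours.

```python
import math

def coupling_from_transmission(T):
    """Coupling strength q for a transmission factor T in [0, 1].

    Assumed relation: q = (1 - sqrt(1 - T^2)) / (1 + sqrt(1 - T^2)).
    """
    if not 0.0 <= T <= 1.0:
        raise ValueError("T must lie in [0, 1]")
    root = math.sqrt(1.0 - T * T)
    return (1.0 - root) / (1.0 + root)

for T in (0.2, 0.4, 0.6, 0.8):
    q = coupling_from_transmission(T)
    # The weak-coupling limit T^2 / 4 is shown for comparison.
    print(f"T={T:.1f}  q={q:.4f}  T^2/4={T * T / 4:.4f}")
```

Under this relation, q ≲ 0.05 corresponds to T ≲ 0.4, so less than about a sixth of the wave energy flux (T²) is transmitted through the evanescent zone.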
The likelihood function for the 2-degree-of-freedom χ² distribution may be only weakly convex, with several local peaks arising from the high-dimensional parameter space, which incorporates parameters pertaining to period spacing, amplitudes, widths, and noise profile. It may therefore not lead to a unimodal distribution in the MCMC fits. As a result, we observe the bimodality in Figure 6(a). However, this does not mean that q is unconstrained. We verify convergence through analysis of the acceptance ratio of samples, which falls within the range 0.2-0.25, as per Benomar et al. (2009). Additionally, we assess convergence through the stability of the likelihood and posterior distributions across multiple parallel chains. As seen in Figure 6(a), q is constrained to lie within the range 0.01-0.03, as opposed to ∆Π, for which the uncertainty in its distribution is large. Each peak in the bimodal distribution of q leads to multiple solutions for ∆Π, which further supports the conclusion that ∆Π cannot be constrained for these low values of the coupling strength. Since ∆Π cannot be constrained, ϵ_g cannot be constrained either, as shown in Figure 7.
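The two ingredients mentioned above, the χ² (2 degrees of freedom) likelihood and the acceptance ratio, can be illustrated with a toy example. The sketch below is not the sampler used in this work: it fits a single hypothetical Lorentzian on a flat background with a plain random-walk Metropolis algorithm, and the model, priors, and step sizes are all assumptions. For power-spectrum bins distributed as χ² with 2 degrees of freedom about a model M, the log-likelihood is ln L = −Σ [ln M + P/M].

```python
import numpy as np

def log_likelihood(power, model):
    """ln L = -sum(ln M + P / M) for chi^2 (2 d.o.f.) distributed power."""
    return -np.sum(np.log(model) + power / model)

def lorentzian_model(freq, height, centre, width, noise):
    """Hypothetical single-mode model: one Lorentzian on a flat background."""
    return height / (1.0 + ((freq - centre) / (0.5 * width)) ** 2) + noise

def metropolis(log_post, start, step, n_steps, rng):
    """Random-walk Metropolis; returns the chain and the acceptance ratio."""
    chain = np.empty((n_steps, len(start)))
    current = np.asarray(start, dtype=float)
    current_lp = log_post(current)
    accepted = 0
    for i in range(n_steps):
        proposal = current + step * rng.standard_normal(current.size)
        proposal_lp = log_post(proposal)
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain[i] = current
    return chain, accepted / n_steps

rng = np.random.default_rng(1)
freq = np.linspace(99.0, 101.0, 400)  # frequency grid in muHz
truth = lorentzian_model(freq, 20.0, 100.0, 0.2, 1.0)
# chi^2 with 2 d.o.f., scaled to unit mean, is an exponential distribution.
power = truth * rng.exponential(1.0, freq.size)

def log_post(theta):
    height, centre = theta
    if height <= 0.0 or not 99.0 < centre < 101.0:
        return -np.inf  # flat priors with hard bounds (assumption)
    model = lorentzian_model(freq, height, centre, 0.2, 1.0)
    return log_likelihood(power, model)

chain, acc = metropolis(log_post, [10.0, 100.1], np.array([5.0, 0.02]), 5000, rng)
print(f"acceptance ratio = {acc:.2f}")
```

In practice the proposal step sizes would be tuned until the acceptance ratio settles in the target range, and several chains started from different points would be compared.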
The strong correlation between ∆Π and ϵ_g in their joint posterior distribution is expected, as they are both important in the determination of g-mode frequencies. However, the neural network struggles to accurately infer ϵ_g. In order to demonstrate that the uncertainty in ∆Π in stars with low coupling strength is not an artifact of determining ϵ_g, we present similar findings using MCMC. Figure 7(b) provides evidence in KIC 10157507 that identical values of ϵ_g around 0.4 can yield divergent solutions for ∆Π. This substantiates the assertion that, even when ϵ_g is successfully constrained, the capacity to accurately infer ∆Π in these particular stars remains limited.

Coupling strength as a function of large separation
We show the correlation between coupling strength and large separation in figure 11. For this study, we selected a sample of 5443 stars out of ∼21000 red giants in the Kepler catalog. We selected this sample because the neural net measured both q and ∆ν precisely for these stars, to within 0.03 and 0.1 µHz, respectively. It may be seen that there are two branches in the q-∆ν distribution, which split at ∆ν ∼ 8 µHz. The first branch contains low-mass RGB stars, with ∆ν spanning from 3 µHz to 19 µHz and q ≤ 0.21, approximately. The second branch comprises high-mass He-burning clump stars with q ≥ 0.15 and ∆ν < 10 µHz. These two branches reunite in their later evolved stages, when ∆ν ∼ 4 µHz. The study also shows a low density of stars in the region of parameter space with ∆ν ∈ [5, 6.5] µHz and q ∈ [0.08, 0.2]. This observation can potentially be attributed to a combination of the reduction in the coupling strength on the hydrogen shell-burning red-giant branch (Jiang et al. 2020) and higher coupling strengths in He-burning stars, leading to an absence of stars in this particular region.
To compare observations with theory, we have simulated a 1.2 M⊙ star using the stellar evolution code MESA (Paxton et al. 2010, 2013, 2015, 2018) with the inlist of Takata (2019) and calculated q at each evolutionary step. The coupling strength q is computed based on the asymptotic theory, which was originally developed in Takata (2016) and extended in Takata (2018). It is given by equation (15), where the quantity X is given by equation (16). In equation (16), κ is calculated using equation (4). Note that X_R is a correction term that is important when the evanescent region is thin. While equation (15) does not account for stars with structural glitches, as demonstrated in Jiang et al. (2022), we assumed that the stars follow asymptotic theory. The simulated spectra are based on asymptotic theory, assuming no variation in q and ∆Π as a function of frequency. Therefore, it is meaningful to compare the inferences derived from the network to theoretical simulations without any discontinuities. We have taken care to eliminate spikes from the model structure and calculate the coupling strength accordingly. Consequently, the calculation of q in this study differs from that of Jiang et al. (2022).
We calculate q using equation (15) for a 1.2 M⊙ star and select 664 hydrogen shell-burning red giants with masses in the range 1.15-1.25 M⊙ from our sample to compare with the theory. Figure 11(b,c) shows q as a function of ∆ν for the set of 664 red giants against the theoretical calculation of q. In our simulation, as the star evolves, ∆ν decreases from ∼20 µHz to ∼6 µHz, which is correlated with q dropping from ∼0.16 to ∼0.04. The figure indicates that observations match the theory statistically within 1σ intervals across evolutionary stages. Furthermore, in Section 3.2, we have demonstrated that even in the presence of glitches, our neural network is able to recover the correct value of q, with only increased uncertainty. Since the stars under consideration do not exhibit larger uncertainties, and most of these stars have ∆ν > 6 µHz, unlike those in Jiang et al. (2022), it is possible that they do not possess significant discontinuities.
Ong & Gehan (2023) claim that the measurements of q reported in Mosser et al. (2017) may have been overestimated. To explore the possible connection with this claim, we include in figures 11(b,c) a subset of 200 hydrogen shell-burning RGB stars from Mosser et al. (2017) with masses ranging from 1.15 to 1.25 M⊙. Notably, as the red giants evolve from approximately 18 µHz to 6 µHz, the average value of q in our study decreases from around 0.15 to 0.02, whereas the average value of q in Mosser et al. (2017) decreases from approximately 0.15 to 0.10. This analysis highlights that the overestimation in Mosser et al. (2017), relative to our measurements, increases from 0.02 at 11 µHz to 0.08 at 6 µHz.
It has been suggested by Ong & Gehan (2023) that the coupling strength can be overestimated if the sampling discriminant, D_samp = (2/π) q N_1, where N_1 = ∆ν/(ν_max² ∆Π), satisfies D_samp < 1. In Section 3.1, we demonstrate that, out of ∼1700 stars, the coupling strength measurements of the two methods (our work and Mosser et al. (2017)) agree to within a difference of 0.03 for 730 stars. However, among these 730 stars, 375 stars satisfy the condition D_samp < 1 according to the measurements from both our network and Mosser et al. (2017), yet the network does not perceive any significant overestimation. Moreover, these stars do not suffer from the large uncertainty issue illustrated in Section 3.3, and the estimates of ∆Π are reliable. Therefore, the explanation provided in Ong & Gehan (2023) may not be sufficient to account for the observed overestimation of the coupling strength in other stars, as perceived by the network.
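As a worked example of the criterion above, the sketch below evaluates D_samp = (2/π) q N_1 with N_1 = ∆ν/(ν_max² ∆Π). The function name and the input values are illustrative assumptions, not measurements from this work; the only subtlety is converting ∆ν and ν_max from µHz to Hz so that N_1 is dimensionless when ∆Π is in seconds.

```python
import math

def sampling_discriminant(q, dnu_uhz, numax_uhz, dpi_s):
    """D_samp = (2 / pi) * q * N_1, with N_1 = dnu / (numax^2 * dPi).

    dnu_uhz and numax_uhz are in muHz, dpi_s is in seconds.
    """
    dnu_hz = dnu_uhz * 1e-6
    numax_hz = numax_uhz * 1e-6
    n1 = dnu_hz / (numax_hz ** 2 * dpi_s)  # g-mode density per large separation
    return 2.0 / math.pi * q * n1

# Hypothetical red-giant-like values, for illustration only.
d_samp = sampling_discriminant(q=0.15, dnu_uhz=10.0, numax_uhz=120.0, dpi_s=80.0)
print(f"D_samp = {d_samp:.2f}, satisfies D_samp < 1: {d_samp < 1.0}")
```

For these inputs N_1 ≈ 8.7 and D_samp ≈ 0.83, so such a star would satisfy the condition D_samp < 1.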

CONCLUSION
We present a deep-learning algorithm to measure the coupling strength in red-giant stars. The neural network was trained on a large library of 5 million synthetic examples that resemble observations. These synthetic spectra were computed using an asymptotic theory of oscillations. We demonstrated that our neural network is calibrated and extracts the coupling strength accurately on synthetic spectra (Section 2.3).
In a dataset of 5166 stars, 53.8% of the network's measurements demonstrated reduced uncertainties compared to those reported in Mosser et al. (2017). A comparison of our confident estimates on 1701 stars with the method of Mosser et al. (2017) showed that 730 of these stars (42.9%) are in agreement with Mosser et al. (2017) to within a difference in the coupling strength of < 0.03. In 309 stars (18.2%), the neural network infers these measurements to be lower than previously known measurements by ∼0.08.
We observed that our neural network finds that ∼17.5% of these 1701 stars have a low coupling strength, i.e., q ≤ 0.05. We analysed an example, KIC 10157507, from this sample of stars using MCMC to validate the measurement. The distribution of q obtained from MCMC agrees with the network-predicted distribution. For this star, the MCMC distribution of ∆Π does not converge to one particular value, which has also been observed in the neural net's ∆Π distribution. Therefore, q is ∼0.03, at least statistically, and ∆Π cannot be determined for this star.

Figure 9. Probability distributions of q and ∆Π in two stars, KIC 10001994 and KIC 7672292. In the first star, q = 0.15 ± 0.02 and p_max of the ∆Π distribution is ∼0.7. In the second star, q = 0.01 ± 0.01 and p_max of the ∆Π distribution is ∼0.05. It is evident that ∆Π is well-constrained when q is larger (KIC 10001994, q = 0.15) and the distribution is flat when q is smaller (KIC 7672292, q = 0.01).
By analyzing a sample of 23 clump stars with structural discontinuities (Vrard et al. 2022), we observed that the neural network is capable of estimating the coupling strength to within 0.05 for 14 stars, and the period separation (∆Π) to within 10 seconds for 17 stars. The results of this study illustrate the potential use of neural networks for obtaining estimates of q and ∆Π, even in the presence of structural glitches.
One advantage of the deep-learning method is that q can be measured even if ∆Π cannot be constrained. In addition, we observed that the inference of ∆Π depends on the value of the coupling strength (cf. Figure 10). The ∆Π measurements obtained using the neural network are not precise when q < 0.05. Our study shows that the uncertainty in the measurement of ∆Π decreases as the coupling strength increases. These findings may also be explained by the fact that the coupling strength is proportional to the square of the transmission factor of the mixed mode originating from the g-mode cavity. As the transmission increases with q, the amplitudes of g-dominated mixed modes increase, which helps in constraining ∆Π.
Our deep learning technique, which incorporates both amplitudes and frequencies, may outperform other methods for constraining the coupling strength in stars.However, a comprehensive analysis is needed to fully understand the differences between our method and others, which is beyond the scope of this paper.
The correlation between the coupling strength (q) and large frequency separation (∆ν) in the red-giant branch (RGB) is well-established (Mosser et al. 2017). Remarkably, the neural network, trained on synthetic spectra devoid of artificial correlations, independently identifies a strong correlation between q and ∆ν in Kepler RGB stars. This observation serves as validation for both the neural network and the synthetic spectra developed in this study. While this relationship has been previously observed by Mosser et al. (2017), the network's inferences have a slightly steeper slope in the correlation.
Theoretical calculations of a 1.2 M⊙ star show that the change in coupling strength as a function of the large separation ∆ν agrees with observations of 1.2 M⊙ stars to within 1σ at various evolutionary stages. This study explains the marginally stronger correlation discovered by the neural net. Our results indicate that the measurements of Mosser et al. (2017) have been overestimated. Ong & Gehan (2023) propose an insightful explanation for the observed overestimation. Among the ∼1700 stars in the sample, the coupling strengths of ∼510 stars have been overestimated by at least 0.05, and these stars satisfy the condition that the sampling discriminant, D_samp = 2q∆ν/(πν_max² ∆Π), is less than 1, as specified in Ong & Gehan (2023). However, there are stars that meet this condition for which our network does not find the same overestimation.
The network is able to extract the coupling strength from ∼1,000 stars in under ∼5 seconds, enabling ensemble asteroseismology on vast data sets. The current network can only provide reliable estimates of the coupling strength for about a quarter of all Kepler red giants. It may encounter challenges in other samples due to low SNR, structural glitches, or inherent limitations of the network. In future work, we will improve this fraction and extend the method to infer all global seismic parameters, such as core and envelope rotation rates and inclination angle, by combining it with Monte-Carlo-based techniques.
Figure 11. (a) Network-predicted q vs. ∆ν in 5443 stars. The color of each point represents the ratio of the mass of the star to the solar mass, as calculated by a scaling relation (Kippenhahn et al. 2012; Brown et al. 1991; Mathur et al. 2012). Low-mass RGB stars and high-mass clump stars split into two branches around ∆ν of ∼8 µHz in this distribution and reunify in a later stage of evolution, i.e., around ∆ν of ∼4 µHz. (b) This plot compares the q-∆ν distributions obtained from our work and the catalogue of Mosser et al. (2017) for hydrogen shell-burning red giants within the mass range of 1.15-1.25 M⊙, along with the theoretical model of a 1.2 M⊙ star. The blue circular points represent data from the catalogue, while the dark square points represent data from our work. Each point is accompanied by grey lines that indicate the 1-σ errors in the value of q. (c) This plot presents a comparison between the distributions of (⟨∆ν⟩, ⟨q⟩) obtained from our work (dark inverted triangles) and the catalogue of Mosser et al. (2017) (blue circles). The vertical lines indicate the standard deviations of the q predictions within each ∆ν bin. Additionally, the green line represents the theoretical relationship between q and ∆ν derived from MESA.
Figure 1. Architecture of the neural network used to infer seismic parameters. The network takes in the normalized power spectrum as input and outputs probability distributions of ∆Π, ∆ν, ν_max and q. The neural network comprises 1D convolutional layers, LSTM cells and a dense layer. We used dropout layers to prevent over-fitting.

Figure 2. Results on synthetic spectra: (a) Predicted q (q_pred) vs. true q (injected q) in simulated stars, where the color of each point indicates the uncertainty in its measurement. (b) Distribution of δq, where δq = q_true − q_pred. These distributions show the error (δq) as a function of uncertainty. They show that 67% of predictions with uncertainty below 0.03 have |δq| < 0.02. Hence, we require confident predictions to have an uncertainty of less than 0.03 in the measurement of q.
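The confidence criterion described in this caption can be checked on any labelled synthetic set with a few lines of code. The sketch below is an assumption-laden illustration: the function name and the five toy values are ours, and only the thresholds (uncertainty below 0.03, |δq| below 0.02) come from the text.

```python
import numpy as np

def confident_fraction(q_true, q_pred, q_sigma, sigma_cut=0.03, err_tol=0.02):
    """Count confident predictions and the fraction of them with small error.

    A prediction is 'confident' when its uncertainty is below sigma_cut;
    it is 'accurate' when |q_true - q_pred| is below err_tol.
    """
    q_true = np.asarray(q_true, dtype=float)
    q_pred = np.asarray(q_pred, dtype=float)
    q_sigma = np.asarray(q_sigma, dtype=float)
    confident = q_sigma < sigma_cut
    n_confident = int(confident.sum())
    if n_confident == 0:
        return 0, float("nan")
    accurate = np.abs(q_true - q_pred) < err_tol
    return n_confident, float((accurate & confident).sum() / n_confident)

# Hypothetical toy values (assumption), not results from this work.
q_true = [0.10, 0.12, 0.05, 0.20, 0.15]
q_pred = [0.11, 0.15, 0.05, 0.21, 0.10]
q_sigma = [0.01, 0.02, 0.02, 0.05, 0.025]

n_conf, frac = confident_fraction(q_true, q_pred, q_sigma)
print(f"{n_conf} confident predictions, {frac:.0%} with |dq| < 0.02")
```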

Figure 3 illustrates measurements of coupling strength on 1701 stars from Mosser et al. (2017), which consists of a catalogue of 5166 stars. As we are comparing only confident machine predictions, we selected stars common to the confident predictions and the catalogue presented in Mosser et al. (2017). Among the 5166 stars, 53.8% of the network's measurements exhibit lower uncertainties than those reported in Mosser et al. (2017), and this percentage remains at 47.4% among the subset of 3465 stars not meeting the confidence criteria for measurements. In figure 3(a), we have plotted the neural net's predictions (q_pred) against published values (q_ref), and figure 3(b) graphs the distribution of differences between the respective values. In 730 stars, the differences between predictions and published values are less than 0.03. In 309 stars, the published values are greater than predictions by at least 0.08. In 301 stars, the coupling strength measurements of the machine are ≤ 0.05. As the network's measurements in these 309 stars are lower than published values, we investigated some of these stars by constructing echelle diagrams. Figure 4 shows an example of echelle diagram analysis. For KIC 10157507, the network's measurement of q is 0.03 ± 0.02, whereas Mosser et al. (2017) measured the coupling strength to be ∼0.11. In figures 4(b) and 4(c), we show synthetic echelle diagrams with injected q = 0.03 and q = 0.11, respectively. As q increases from 0.03 to 0.11, the number of distinct and apparent ℓ = 1 mixed modes increases. In fact, there is a conspicuous difference in the amplitude of g-dominated modes between the two cases. When we compare the echelle diagram of KIC 10157507 shown in figure 4(a) to synthetic analogues, the star is qualitatively closer to an echelle diagram with q = 0.03.
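The three groups of stars counted above can be obtained from paired measurements as in the sketch below. The function name and the toy arrays are hypothetical; the thresholds (0.03, 0.08 and 0.05) are those quoted in the text, and the last lines simply reproduce the percentages implied by the quoted counts.

```python
import numpy as np

def compare_with_catalogue(q_pred, q_ref, agree_tol=0.03, over_tol=0.08, low_q=0.05):
    """Count stars that agree, stars where the catalogue is higher, and low-q stars."""
    q_pred = np.asarray(q_pred, dtype=float)
    q_ref = np.asarray(q_ref, dtype=float)
    diff = q_ref - q_pred  # positive when the published value is larger
    return {
        "agree": int(np.sum(np.abs(diff) < agree_tol)),
        "catalogue_higher": int(np.sum(diff >= over_tol)),
        "low_q": int(np.sum(q_pred <= low_q)),
        "total": int(q_pred.size),
    }

# Hypothetical toy measurements (assumption), not values from either catalogue.
q_pred = [0.03, 0.12, 0.15, 0.02, 0.20]
q_ref = [0.12, 0.13, 0.14, 0.12, 0.15]
counts = compare_with_catalogue(q_pred, q_ref)
print(counts)

# Percentages implied by the counts quoted in the text (out of 1701 stars).
print(f"agreement: {730 / 1701:.1%}, catalogue higher: {309 / 1701:.1%}")
```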
Figure 5(a) shows the mode distribution of the smoothed data and the model and figure 5(b) shows the comparison of the mode distribution as an echelle diagram.The echelle diagram indicates that the mode distribution matches the smoothed data across all modes (ℓ = 0, 1, 2) statistically.
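For readers unfamiliar with the construction, an echelle diagram folds the frequency axis at the large separation ∆ν, so that modes of the same degree line up in near-vertical ridges. The sketch below shows the folding step only; the function name and the evenly spaced toy frequencies are assumptions for illustration.

```python
import numpy as np

def echelle_coordinates(freq, dnu):
    """Return (frequency modulo dnu, integer order index) for each frequency."""
    freq = np.asarray(freq, dtype=float)
    return np.mod(freq, dnu), np.floor(freq / dnu).astype(int)

# Hypothetical radial-mode frequencies spaced exactly by dnu = 10 muHz.
dnu = 10.0
radial = 3.0 + dnu * np.arange(8, 13)  # 83, 93, ..., 123 muHz
x, order = echelle_coordinates(radial, dnu)
print(x)      # every mode folds onto the same horizontal position
print(order)  # one row of the diagram per order
```

Mixed ℓ = 1 modes do not follow this regular spacing, which is why their number and visibility in the folded diagram change with q.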
(2022) have a 55% correlation, while the neural network measurements of ∆Π and the measurements of Vrard et al. (2022) exhibit a 92% correlation. Out of the 23 stars, 14 stars agree with the measurements of Vrard et al. (

Figure 4. (a) Echelle diagram associated with KIC 10157507.(b,c) Echelle diagrams of typical simulated stars with q = 0.03 and q = 0.11.In all these echelle diagrams, blue rectangles denote ℓ = 0 p-modes, green rectangles indicate ℓ = 2 p-modes and red rectangles mark ℓ = 1 mixed modes.The echelle diagram of KIC 10157507 is closer to the simulated echelle diagram associated with q = 0.03 when compared with that of q = 0.11, indicating that the star's q ≤ 0.05.

Figure 5. (left): Comparison of the best-fit model obtained from MCMC with the smoothed power spectrum of KIC 10157507. We smoothed the spectrum using a box-car of window size 0.05 µHz. The best-fit model was generated using q_fit = 0.027 and ∆Π_fit = 74 s, which are the median values of their respective distributions. (right): Comparison between the echelle diagram of the best-fit model and that of the smoothed data. These plots indicate that the best-fit model matches the observations.

Figure 6. Panels (a), (b), and (c) display the MCMC distributions in a corner plot for KIC 10157507. Panel (b) specifically presents the MCMC distribution in the q-∆Π plane, which includes the 68% and 95% confidence intervals. The marginal distributions of q and ∆Π obtained through MCMC are shown in panels (a) and (c), respectively. The green dashed lines in both panels represent the distributions obtained by the neural network. Panel (d) compares the distributions of the coupling strength obtained from both methods, MCMC and the neural network. The blue line represents the initial value used in the MCMC run, with q initialized to a value larger than the neural network's prediction to avoid bias. The agreement between the two distributions supports our conclusion. In panel (e), the distributions of the period spacing (∆Π) from the MCMC and neural network methods are compared. The blue line represents the MCMC initialization. The two distributions do not converge to each other. The bin sizes in panels (d) and (e) correspond to the bin sizes of the neural network, for a better comparison with the network's outputs.

Figure 7. Panels (a), (b), and (c) display the MCMC posterior distributions of ∆Π and ϵ_g in a corner plot for KIC 10157507. Panel (b) presents the MCMC distribution in the ∆Π-ϵ_g plane, which includes the 68% and 95% confidence intervals. The individual marginal distributions of ∆Π and ϵ_g obtained through MCMC are depicted in panels (a) and (c), respectively.

Figure 8. (a) Estimated values of q by the network (q_pred) corresponding to each reported value of q (q_ref) in 23 stars from Vrard et al. (2022). (b) Distribution of the differences between the estimated and reported values in these stars. (c) Estimated values of ∆Π by the network (∆Π_pred) corresponding to each reported value of ∆Π (∆Π_ref) in these stars. (d) Distribution of the differences between these two independent measurements of ∆Π in these stars. The red lines associated with each point in panels (a) and (c) are the 1-σ uncertainties.

Figure 10. Relative uncertainty in the measurement of ∆Π as a function of coupling strength (q). The black dot and red lines associated with each point indicate the median and 1-σ interval of the relative uncertainty distribution in ∆Π predictions for each value of q. As q increases from 0.01 to 0.15, the relative uncertainty in ∆Π decreases from ∼1 to ∼0.02.

Table 1. Range of seismic parameters for the preparation of synthetic data. The range of parameters is chosen so as to cover the space of long-cadence Kepler red-giant stars (Mosser et al. 2015; Vrard et al. 2016; Mosser et al. 2017).

Table 2. Priors of the seismic parameters used in the analysis of KIC 10157507.