Applications of different machine learning methods on nuclear charge radius estimations

Theoretical models come into play when the nuclear charge radius, one of the most fundamental properties of atomic nuclei, cannot be measured with the available experimental techniques. Machine learning (ML) can be considered as an alternative to these models. In this study, ML techniques were applied to the experimental charge radii of 933 atomic nuclei (A ≥ 40 and Z ≥ 20) available in the literature. Eight different approaches were considered, the outcomes were compared with the experimental data, and the success of each ML approach in estimating the charge radius was assessed. The Cubist model was found to be more successful than the others. It was also observed that the ML methods capture the distinct behavior of the charge radius in the magic number region.


Introduction
Nuclear radii are fundamental properties of atomic nuclei that have been studied extensively in nuclear physics [1,2]. The size and shape of a nucleus play an important role in determining its stability and its interactions with other particles and nuclei. Direct information about the Coulomb energy of nuclei can be obtained by examining nuclear charge radii and, more generally, the charge density distributions in atomic nuclei. For this reason, charge radii have long attracted attention in connection with nuclear mass formulas [3]. Charge radii can be measured by various methods based on the electromagnetic interaction between atomic nuclei and electrons or muons. Commonly used methods are measurements of transition energies in muonic atoms, elastic electron scattering experiments, Kα x-ray isotope shifts, and optical isotope shifts. Details of these techniques for measuring root mean square (RMS) charge radii of nuclei can be found in [4,5]. With the latest advances in experimental techniques, such as the use of radioactive ion beams, more nuclei away from the β-stability line have been reached, thereby providing access to their experimental nuclear charge radii.
The measurement of nuclear charge radii, which are related to exotic phenomena such as skins and halos, has been among the most interesting topics [6]. Studying the nuclear charge radius is important for a better understanding of the proton distribution in nuclei and of skin and halo structures. For these reasons, accurate and reliable estimation of nuclear charge radii in the absence of experimental data is important for studies on exotic nuclei and effective nucleon-nucleon interactions. Recently updated experimental data for the nuclear charge radii of over 1000 nuclei are available in the literature [4,5,7]. In our study, these experimental data were used to train different machine learning methods.
Machine learning methods have been used in many areas of nuclear physics, such as the development of nuclear mass systematics [8], the identification of impact parameters in heavy-ion collisions [9][10][11], the estimation of beta decay half-lives [12] and beta decay energies [13], the adjustment of non-linear interaction parameters for the relativistic mean field approach [14], predictions of α-decay half-lives of superheavy nuclei [15], estimations of fission barrier heights [16,17], studies of the ground-state energies of nuclei [18], the estimation of fusion reaction cross-sections [19], and shell-model calculations supported by artificial intelligence [20].
Several machine learning studies on the nuclear charge radius can be found in the literature. Our previous work used artificial neural networks to derive a charge radius formula for nuclei with A ≥ 40 and Z ≥ 20 [21]. Utama et al [22] improved the nuclear charge radius estimation performance by combining the Bayesian neural network method with density functional theory. The main motivation of that work was to develop a model that accurately predicts the charge radii of isotopes whose charge radius has never been measured. They considered atomic nuclei with A ≥ 40 and Z ≥ 20 and took the numbers A and Z as inputs to the network. By refining the results of the theoretical models with the neural network, they reduced the deviation between the model radii and the experimental values by up to a factor of 3.
Wu et al [23] obtained nuclear charge radii by using feed-forward neural networks with the Z, N, and electric quadrupole transition strength, B(E2), values of the nuclei as input parameters of the network. As a result of the training they performed by separating Ca, Sm, and Pb isotopes from the test data, they were able to observe the existence of magic numbers in the isotope chains of these isotopes. Therefore, besides estimating the nuclear charge radii, they were also able to observe the kink, which corresponds to the magic numbers, in the Sn, Sm, and Pb isotope chains. In their study, which also emphasized the importance of B(E2) in generating the kink, they proposed the existence of a new relationship between the symmetry energy and the charge radius by including the symmetry energy term in the inputs of the network. Recently, a relationship between nuclear quadrupole deformation and nuclear size for ⁹⁸⁻¹¹⁸Pd was pointed out in the study of Geldhof et al [24]. Furthermore, they showed that pairing correlations contribute to a more correct description of nuclear charge radii in density functional calculations.
Dong et al [25] successfully predicted nuclear charge radii with neural network calculations in the A ≥ 40 and Z ≥ 20 region. In that study, a Bayesian neural network was used, in which A, Z, the pairing term, and the promiscuity factor were taken as inputs. By combining the NP formula, which allows the calculation of the charge radius, with the Bayesian neural network, the accuracy of the radius calculations was improved by a factor of approximately 2.7. In the illustrations carried out on the Ca and K isotope chains, it was shown that the NP formula alone exhibits a linear behavior, whereas it agrees with the experimental data when supported by a Bayesian neural network. Later, the authors revisited their study [26] to reduce the difference (30%) in rms deviation between the validation set and the training set, which can indicate over-fitting. For this purpose, they added new features containing physical information.
In the study of Ma et al [27], nuclear charge radii were systematically estimated by the naive Bayesian probability (NBP) classifier, and the estimations were improved by a factor of up to about 1.7. By combining the raw results of the theoretical models with the residuals predicted by the NBP method, the theoretical charge radii from the HFB model and Sheng's semi-empirical formula were refined. The results were analyzed in the Ca and Bi isotope chains and the success of the NBP refinements was demonstrated.
The main purpose of this study is to apply different machine learning algorithms to the nuclear charge radius across the nuclidic chart (A ≥ 40 and Z ≥ 20) and to obtain the best results with simple variables. The deformation effect on nuclear charge radii can be taken into account through experimental quadrupole transition strength values, as in the study of Wu et al [23], to improve the predictive power of machine learning methods. However, this is possible only for even-even nuclei. In the present study, we have studied the global prediction of machine learning methods for nuclear charge radii covering even-even, odd-even, even-odd and odd-odd nuclei by using the same physical quantities as in the study of Dong et al [25], namely proton number (Z), mass number (A), pairing effects, shell closure effects, isospin dependence, and the abnormal behavior of some Hg isotopes. We have used eight different machine learning methods as an alternative approach to obtain a reliable model for the nuclear charge radius. Through detailed analysis of the results, the success of machine learning in obtaining the nuclear charge radius is highlighted. With this extensive study, which uses a larger number of machine learning methods than previous studies in the literature, the methods that reduce the deviations between the experimental data and the estimates have been identified. Thus, machine learning is shown to be a suitable and reliable tool for estimating nuclear charge radii in the absence of experimental data.
The paper is organized as follows. In section 2, the materials used in the study are described and the applied methods are briefly summarized with the relevant references. In section 3, the findings of the study are presented and discussed. Finally, section 4 presents the conclusions of the study.

The data structure and software resources
In this research, experimental data reported in the literature [4,5,7] were used to estimate the nuclear charge radii with different machine learning algorithms. These data include the radius information of 933 nuclei (A ≥ 40 and Z ≥ 20). In this study, 699 randomly selected nuclei were used as training data and the remaining 234 as testing data to evaluate the performance of the models. The predictive variables were taken as in the study of Dong et al [26]. These are the mass number (A), the proton number (Z), the isospin dependence (I²), the pairing term (δ), the promiscuity factor (P) related to shell closure effects [28,29], and a term (LI) related to the abnormal charge radius behavior in ¹⁸¹Hg, ¹⁸³Hg, and ¹⁸⁵Hg. The explicit forms of I², δ, P and LI are given by equations (1)-(4). The same input variables were used in all algorithms in the study to estimate the nuclear charge radius in the machine learning training process.
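As a rough illustration of how the six predictors could be assembled for one nucleus, the Python sketch below may help. It is not the code used in this study, and the explicit definitions are assumptions: I² = ((N − Z)/A)², δ = [(−1)^Z + (−1)^N]/2, P = ν_p ν_n/(ν_p + ν_n) with ν_p and ν_n taken as the distances of Z and N from the nearest magic number, and LI = 1 only for ¹⁸¹Hg, ¹⁸³Hg, and ¹⁸⁵Hg. The magic-number lists and all function names are hypothetical choices made for the example.

```python
# Assumed magic numbers (illustrative; the exact lists used in the study are not stated here).
PROTON_MAGIC = (2, 8, 20, 28, 50, 82, 114)
NEUTRON_MAGIC = (2, 8, 20, 28, 50, 82, 126, 184)


def valence(n, magic):
    """Distance from n to the nearest magic number (assumed definition of nu)."""
    return min(abs(n - m) for m in magic)


def features(Z, A):
    """Build the six predictors for a nucleus with proton number Z and mass number A."""
    N = A - Z
    isospin_sq = ((N - Z) / A) ** 2            # assumed form of I^2
    delta = ((-1) ** Z + (-1) ** N) / 2        # assumed pairing term: 1, 0 or -1
    vp = valence(Z, PROTON_MAGIC)
    vn = valence(N, NEUTRON_MAGIC)
    # Assumed promiscuity factor; set to 0 for doubly magic nuclei to avoid 0/0.
    P = vp * vn / (vp + vn) if (vp + vn) > 0 else 0.0
    # Flag for the three Hg isotopes (Z = 80) named in the text.
    LI = 1 if (Z == 80 and A in (181, 183, 185)) else 0
    return {"A": A, "Z": Z, "I2": isospin_sq, "delta": delta, "P": P, "LI": LI}
```

For example, `features(82, 208)` describes a doubly magic, even-even nucleus, so under these assumptions P is zero and δ is one.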

Machine learning algorithms
In this study, eight algorithms (artificial neural network, Cubist model, Gaussian process with polynomial kernel, multivariate adaptive regression splines, random forest, quantile random forest, support vector regression, and extreme gradient boosting) were used. These algorithms are summarized below. As will be seen in the next sections, the Cubist model performs better than the others for nuclear charge radius predictions, i.e. it yields the lowest error metrics. Therefore, more details are given for the Cubist model algorithm.
Gaussian Process with Polynomial Kernel (GPPK): Gaussian processes, developed to describe nonparametric relationships on a Bayesian basis, provide point estimates as well as confidence intervals for these estimates [51]. In this approach, the joint distribution of the training and test data is represented by a multidimensional Gaussian density function built on the polynomial kernel, so the predictive distribution for each test point is obtained by conditioning on the training data [51,52]. Detailed information on this algorithm can be found in the cited sources [53,54].
Multivariate Adaptive Regression Splines (MARS): This method, which is a nonparametric modeling technique, was developed by Friedman [55]. In the MARS approach, the entire data set is divided into sub-datasets and a separate linear model is established for each sub-dataset [56]. Thus, with the help of this piecewise linear model (splines), which is automatically adapted to the data, the nonlinear connection between the independent variables and the target variable can be estimated [57,58]. MARS is a very useful algorithm for problems where the relationships between the variables in the dataset may differ from region to region [59,60].
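The piecewise linear idea behind MARS can be illustrated with hinge basis functions of the form max(0, x − t) and max(0, t − x). The sketch below is not the MARS fitting procedure itself (there is no search over knots or terms); the knot position and the coefficients are hypothetical values chosen only to show how two hinges give different slopes on either side of a knot.

```python
def hinge_pos(x, knot):
    """Right-hand hinge: zero below the knot, linear above it."""
    return max(0.0, x - knot)


def hinge_neg(x, knot):
    """Left-hand hinge: linear below the knot, zero above it."""
    return max(0.0, knot - x)


def mars_like(x, intercept=1.0, knot=50.0, c_right=0.02, c_left=0.005):
    """Piecewise linear response built from one pair of hinges (illustrative values)."""
    return intercept + c_right * hinge_pos(x, knot) + c_left * hinge_neg(x, knot)
```

With these illustrative values the response has its minimum at the knot and rises with a different slope on each side, which is the kind of kink a single global linear model cannot represent.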
Random Forest (RF): This algorithm, developed by Breiman [61], has been used in many different fields in recent years due to its high performance in classification and regression problems [62][63][64][65][66][67]. In the RF approach, subsets are drawn from the entire dataset and a decision tree is built on each subset. Each tree is then trained with randomly selected features. As an ensemble learning technique, RF combines the outputs of the individual trees to produce the final model result [68]. In addition, training each tree with randomly selected features and on its own subset of the data helps to avoid the overfitting problem [63].
Quantile Random Forest (QRF): Unlike the traditional Random Forest algorithm, QRF uses only values from a certain percentile during splits in each tree [33]. Thus, it allows customizing each tree to include only values in a certain percentile. This method can show better results in some cases than a standard Random Forest algorithm in regression problems by reducing the effect of outliers in the data set and allowing the model to better explain the distribution of the target variable (predicted variable) in certain percentiles [69][70][71].
Support Vector Regression (SVR): This is a regression method that uses support vectors and the Lagrange multiplier approach for analyzing and predicting data [72]. The SVR algorithm is based on the Support Vector Machine (SVM) algorithm, which provides effective solutions to classification problems [73]. SVR is particularly useful when dealing with outliers and non-linearities in data [74]. It aims to obtain a regression estimate that accurately predicts the response values based on a subset of high-dimensional prediction variables [75], relying on the support vectors and Lagrange multipliers mentioned above to achieve this goal [76,77].
Extreme Gradient Boosting (XGBoost): This algorithm, developed by Chen and Guestrin [35], emerged by optimizing the Gradient Boosting Machine (GBM) [78] algorithm and extending it to a scalable level. XGBoost is particularly successful on large datasets and high-dimensional feature spaces. It models nonlinear relationships using multiple decision trees and is based on the principle of sequential error reduction [79,80]. In this method, decision trees are created sequentially, and each tree is built on the predictions of the previous trees with a focus on reducing their error. Thus, each new tree increases the accuracy of the model and reduces the error rate [81]. Furthermore, in the XGBoost model, a weight value is assigned to each tree so that the contribution of each tree is determined more clearly and the results achieve higher performance. In addition, the XGBoost algorithm has many regularization parameters to prevent overfitting [17].
Cubist model: The Cubist model is a rule-based model capable of handling both numerical and categorical variables [82,83]. It uses a 'separate and conquer' methodology to create a rule-based regression tree that identifies different paths by iteratively splitting on predictor variables [84,85]. The Cubist model results contain several sets of rules that break the entire dataset down into sub-datasets with similar characteristics. A multivariate linear regression model is fitted to the subsets of data generated by these rule sets to reveal the pattern of association between the predictor variables and the target variable [86]. Unlike other rule-based tree models, Cubist uses a smoothing coefficient to combine the linear models at each node of the tree, as expressed in equation (5) [85,87].
$$\hat{y}_{\mathrm{par}} = a \times \hat{y}_{(k)} + (1 - a) \times \hat{y}_{(p)} \qquad (5)$$

where $\hat{y}_{\mathrm{par}}$ is the smoothed (adjusted) prediction, $\hat{y}_{(k)}$ is the prediction from the current model (child model), $\hat{y}_{(p)}$ is the prediction from the parent model located above it in the tree, and $a$ is the smoothing coefficient. This coefficient can be determined as expressed in equation (6):

$$a = \frac{\mathrm{Var}[e_{(p)}] - \mathrm{Cov}[e_{(k)}, e_{(p)}]}{\mathrm{Var}[e_{(p)} - e_{(k)}]} \qquad (6)$$

where $\mathrm{Var}[e_{(p)}]$ is the variance of the errors of the parent model (i.e., $e_{(p)} = y - \hat{y}_{(p)}$, and analogously $e_{(k)} = y - \hat{y}_{(k)}$ for the child model), $\mathrm{Cov}[e_{(k)}, e_{(p)}]$ is the covariance of the residuals of the child and parent models, and $\mathrm{Var}[e_{(p)} - e_{(k)}]$ is the variance of the difference between the residuals. This process is based on the covariance of the residuals of the child and parent models. The covariance measures the extent to which the errors of the two models are linearly related. If the variance of the errors of the parent model is greater than the covariance, the child model is weighted more than the parent model. In the opposite case, Cubist gives more weight to the parent model. Thus, the model with the lowest error value is more dominant in the adjusted model. If the error values of the two models are the same, both models have equal weight [85]. Cubist combines the linear models at each node according to equation (5), creating a single linear model for each rule. This allows the models to be presented in a more organized and representative way [85]. As new regression models are developed at each tree node, branches with a high error are pruned, thus preventing overfitting of the model [88,89].
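The smoothing step of equations (5) and (6) can be sketched in a few lines of Python. This is an illustration rather than the Cubist implementation: it assumes that the smoothing coefficient has the form a = (Var[e(p)] − Cov[e(k), e(p)]) / Var[e(p) − e(k)], consistent with the quantities defined in the text, and it uses sample (n − 1) variances. The function names and the residual values in the usage note are hypothetical.

```python
from statistics import fmean


def variance(xs):
    """Sample variance with the n - 1 denominator."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def covariance(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    mx, my = fmean(xs), fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def smoothing_coefficient(e_child, e_parent):
    """Assumed form of equation (6), computed from child and parent residuals."""
    diff = [p - k for p, k in zip(e_parent, e_child)]
    return (variance(e_parent) - covariance(e_child, e_parent)) / variance(diff)


def smoothed_prediction(y_child, y_parent, a):
    """Equation (5): weighted combination of child and parent predictions."""
    return a * y_child + (1 - a) * y_parent
```

As a usage example, child residuals `[0.1, -0.1, 0.1, -0.1]` and parent residuals `[0.2, 0.2, -0.2, -0.2]` are uncorrelated, and the parent variance is four times the child variance, so the coefficient works out to 0.8 and the child model dominates the combined prediction.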
There are clear differences between the Cubist model and other tree-based models, such as committee models (a boosting-like procedure for building iterative model trees), the specific techniques used for model smoothing, rule generation, and pruning, and instance-based corrections (using nearby points from the training set to adjust predictions) [85,86]. In addition, the Cubist model is highly interpretable and can provide insights into the decision-making process without the need for techniques such as SHAP (SHapley Additive exPlanations) [90] and LIME (Local Interpretable Model-agnostic Explanations) [91]. Moreover, the ability to handle non-linear relationships between dependent and independent variables [17,92] and to use both numerical and categorical variables as model inputs gives the Cubist model significant advantages in terms of high forecasting performance [93].

Comparison of algorithms
Machine learning algorithms have their own strengths and weaknesses. The choice of algorithm depends on the specific problem and the characteristics of the dataset, so the advantages and disadvantages of each algorithm should be considered carefully before selecting the most appropriate one for a given task. For the algorithms used in this study, these can be summarized as follows.
Artificial neural networks are known for their ability to model complex relationships and handle large amounts of data. They can learn from examples and generalize well to unseen data. However, they can be computationally expensive to train and require a large amount of data to achieve good performance [94]. They can also exhibit illogical behavior when not well trained [95].

Cubist models are rule-based models that can handle both numerical and categorical variables. They are interpretable and can provide insights into the decision-making process. However, they may not perform as well as other algorithms when the relationships between variables are non-linear [93].

Gaussian processes with polynomial kernels are flexible models that can capture complex relationships between variables. They offer a probabilistic interpretation and can provide levels of uncertainty for predictions. However, Gaussian processes are computationally intensive and may not scale well to large datasets [96,97].

Multivariate adaptive regression splines are non-linear models that can capture complex relationships between variables, and they are useful when the dataset is large and computation time is not an issue [98].

Random forests are ensemble models that combine multiple decision trees. They are robust to overfitting and can handle high-dimensional data. They can also provide estimates of variable importance. However, they may not perform well when there are strong interactions between variables [99].

Quantile random forests are a variation of random forests that can model the conditional distribution of the response variable. They can provide more information about the uncertainty of predictions. However, they may require more computational resources and may not perform as well as other algorithms when the conditional distribution is highly skewed or heavy-tailed [100].

Support vector regression can handle non-linear relationships between variables and can provide good generalization performance. However, it can be sensitive to the choice of hyperparameters and may not perform well when the dataset is noisy or contains outliers [93].

Extreme gradient boosting (XGBoost) can handle high-dimensional data and capture complex relationships between variables. XGBoost is known for its high performance and scalability. However, it can be sensitive to outliers and may not perform well on unstructured and sparse data [101]. Additionally, XGBoost may have a slower prediction speed than the random forest due to the generation of sequential decision trees [102].

Model performance metrics
In this study, two basic metrics were used to evaluate the performance of the machine learning models: the mean absolute error (MAE) and the root mean square error (RMSE). The mathematical expressions of these performance measures are

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(A_i - P_i\right)^2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|A_i - P_i\right|$$

where $n$ is the total number of data points, and $A_i$ and $P_i$ indicate the actual and predicted values of the $i$th sample, respectively.
The RMSE and MAE values should be close to zero for a model with high predictive performance.
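For concreteness, the two metrics can be computed as in the following short Python sketch, which assumes the standard definitions RMSE = sqrt(Σ(A_i − P_i)²/n) and MAE = Σ|A_i − P_i|/n. The function names and the radii in the usage note are illustrative and are not values from this study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n
```

For the hypothetical radii `[3.50, 4.00, 5.50]` fm and predictions `[3.52, 3.99, 5.47]` fm, the absolute deviations are 0.02, 0.01 and 0.03 fm, so the MAE is 0.02 fm, while the RMSE is slightly larger because it weights the largest deviation more heavily.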

Training process
In this study, nuclear charge radius estimations for atomic nuclei were obtained from eight different machine learning methods. All of the data (933 nuclei) used in the machine learning process were separated into training (699 nuclei) and test (234 nuclei) data sets. The training of the machine learning algorithms was carried out with 10-fold cross-validation, and the model hyperparameters that gave the best results for each algorithm were determined. The aim was to increase each model's performance by ensuring that it is optimized within the training process. Table 1 shows the performance metrics and optimized model hyperparameter values determined as a result of the 10-fold cross-validation process. According to table 1, the learning performance of all models examined in the study is quite high, and the best-performing machine learning algorithm is the Cubist model (RMSE = 0.01199 fm and MAE = 0.0077 fm).
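The data partitioning described above can be sketched in plain Python as follows. This is only an illustration of a random 699/234 split followed by 10-fold cross-validation on the training part; the seed, the function names, and the way the folds are formed are assumptions and do not reproduce the actual split used in the study.

```python
import random


def train_test_split(n_total, n_train, seed=0):
    """Randomly split indices 0..n_total-1 into training and test index lists."""
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)  # fixed seed (hypothetical) for reproducibility
    return idx[:n_train], idx[n_train:]


def k_folds(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        training = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        yield training, validation
```

Calling `train_test_split(933, 699)` gives 699 training and 234 test indices; iterating over `k_folds(train, k=10)` then provides ten training/validation pairs on which a candidate hyperparameter setting can be scored and averaged, with the test indices left untouched until the final evaluation.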

Testing process
The performances of the models were evaluated on the test data, and the results are shown as discrepancy distributions in figure 1. The model with the lowest RMSE value among these methods is the Cubist model (figure 1(a)). In figure 1(b), the deviations of the XGBoost predictions from the experimental data are presented. The deviations are clearly concentrated around the horizontal zero line and lie in the range of about −0.05 fm to +0.05 fm. The RMSE and MAE values of the results obtained from the XGBoost model are 0.0125 fm and 0.0125 fm, respectively, making it the method with the best results after the Cubist model. The results of the RF model, the next best method after XGBoost, are given in figure 1(c). The distribution here is concentrated in the range of −0.03 fm to +0.06 fm, around the horizontal zero line. The RMSE and MAE values of the results obtained from this model are 0.0138 fm and 0.0102 fm, respectively. The next best results are obtained with the QRF model (figure 1(d)), followed by the GPPK, MARS, and SVR models (figures 1(e)-(g)). For the SVR model, the deviations of the estimations from the experimental data fluctuate around zero; however, the distribution was found to be in the range of −0.11 fm to +0.11 fm.

Finally, the results from the ANN model are presented in figure 1(h). In this graph, the distribution is clearly no longer centered on the zero line but is shifted down. It can also be seen from the figure that the deviations from the experimental values spread from about −0.05 fm to +0.12 fm. The RMSE and MAE values of the ANN model were found to be 0.0564 fm and 0.0494 fm, respectively. These results obtained by ANN are close to the findings of our previous study [21]. It should be noted that the RMSE and MAE results of the Cubist, XGBoost and RF models are better than those of [23,25,27]. As seen in this study, the Cubist model gives the most successful results in nuclear charge radius estimations. Its success is based on its algorithm, which partitions the data appropriately and generates the regression equations that best fit each subset; that is, it can model complex relationships using a combination of tree structure and regression equations.

This study used the bootstrap method to measure the uncertainty of the machine learning predictions [103]. In this method, a large number of models (1000 in this study) are trained on resamples of the original training set, and their predictions on the test data are recorded. The standard deviation of these predictions represents the uncertainty. For the Cubist model, which gave the best prediction results in our study, the minimum and maximum values of the uncertainty in the test data were determined as 0.00085 fm (for ¹⁹⁹Hg) and 0.05021 fm (for ⁷⁷Rb), respectively.
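The bootstrap procedure described above can be sketched as follows. To keep the example self-contained, a deliberately simple stand-in model (a least-squares fit of r0 in R = r0·A^(1/3)) replaces the Cubist model, so this illustrates only the resampling logic, not the method used in the study. The function names, the seed, and the toy (A, R) pairs in the usage note are hypothetical.

```python
import random
from statistics import stdev


def fit_r0(train):
    """Stand-in model: least-squares estimate of r0 in R = r0 * A**(1/3)."""
    num = sum(R * A ** (1 / 3) for A, R in train)
    den = sum(A ** (2 / 3) for A, _ in train)
    return num / den


def bootstrap_uncertainty(train, A_test, n_models=1000, seed=0):
    """Standard deviation of predictions from models fitted on bootstrap resamples."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        # Resample the training set with replacement, keeping its original size.
        sample = [rng.choice(train) for _ in train]
        preds.append(fit_r0(sample) * A_test ** (1 / 3))
    return stdev(preds)
```

With toy pairs such as `[(40, 3.48), (90, 4.27), (120, 4.65), (208, 5.50)]`, calling `bootstrap_uncertainty(train, 100)` returns one uncertainty value for a nucleus with A = 100; repeating this for every test nucleus gives the kind of per-nucleus spread quoted in the text.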

An application of machine learning models
In this subsection, we examine whether machine learning methods trained on experimental data reproduce the shell closure at N = 50 in the Kr and Sr isotopic chains. The shell structure of nuclei around N = 50 is clearly visible in the experimental nuclear charge radii data, where the lowest values are located at N = 50 for the Kr and Sr isotopic chains. (More details on the N and Z dependence of nuclear charge radii can be found in [4].) We also compare our results with the predictions of conventional semi-empirical charge radius formulas and mean field models. As is well known, the radius of the nucleus is proportional to the cube root of its mass number. However, the conventional A-dependent RMS charge radius formula in equation (9) is not globally valid for all nuclei on the nuclidic chart, because nuclei with the same mass number can have different numbers of neutrons and protons. Moreover, the experimental nuclear charge radii show that the R/A^(1/3) ratio is not constant across the nuclidic chart [104,105]. Therefore, Z- or isospin-dependent formulas have been developed, and they have been found to describe the nuclear charge radii much better (details and references can be found in [104,106]). Some well-known semi-empirical charge radius formulas are given in equations (9)-(14).
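The limitation of a purely A-dependent formula can be seen in a two-line calculation. The sketch below assumes that equation (9) has the common form R_c = r_A·A^(1/3); the value r_A = 0.95 fm is a hypothetical placeholder, not the fitted parameter from [6], and the function name is illustrative.

```python
def radius_a_only(Z, A, r_A=0.95):
    """A-dependent charge radius in fm; Z is accepted but deliberately unused."""
    return r_A * A ** (1 / 3)


# Two isobars with A = 96 (Kr, Z = 36 and Sr, Z = 38) receive the same prediction.
same_prediction = radius_a_only(36, 96) == radius_a_only(38, 96)
```

Because the proton number never enters, any two isobars are assigned identical radii, which is why Z-dependent or isospin-dependent terms are needed to separate them.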
In these formulas, R_c, Z, N, and A are the charge radius, proton number, neutron number, and mass number of the considered nucleus, respectively, and r_A, r_Z, r_p, b, and c are parameters fitted to experimental data. These parameters were fitted in [6] for a better description of nuclear charge radii in the A > 40 region. For ease of reference, the formulas given in equations (9), (10), (11), (12), (13) and (14) are named F1, F2, F3, F4, F5 and F6, respectively, in the text and graphics. The predictions of the Cubist model for the nuclear charge radii of ⁷²⁻⁹⁶Kr and ⁷⁸⁻¹⁰⁰Sr are compared with the experimental data [5] and with these semi-empirical formulas in figure 2.

Furthermore, phenomenological microscopic nuclear models based on the mean-field approach give charge radii close to the experimental data for nuclei across the nuclidic chart [107,108]. The kink in the nuclear charge radii around N = 50 in isotopic chains is reproduced very well by the relativistic mean field (RMF) model with the NL3* interaction [109]. For this reason, we have compared the charge radius predictions for the ⁷²⁻⁹⁶Kr and ⁷⁸⁻¹⁰⁰Sr isotopes obtained from the Cubist, MARS, SVR, GPPK, XGBoost, RF, QRF, and ANN models with the predictions of microscopic nuclear models and the experimental data. In figures 3 and 4, the predictions of the considered machine learning methods for the nuclear charge radii of ⁷²⁻⁹⁶Kr and ⁷⁸⁻¹⁰⁰Sr are shown, respectively. The experimental data [5] and calculated results of the HFB (Hartree-Fock-Bogoliubov)

Conclusions
The nuclear charge radii of atomic nuclei in the A ≥ 40 and Z ≥ 20 region were estimated with eight different ML models, as a strong alternative to theoretical models. Training the ML methods on the available experimental data has shown that ML is a good tool for this purpose, and it is concluded that the Cubist model produces the most successful results. After a detailed analysis of the radius estimates, the ML results in the Kr and Sr isotope chains were compared with the results of some available radius formulas. The Cubist, XGBoost, RF and QRF models were seen to detect unambiguously the kinks for isotopes corresponding to magic numbers. In the last stage, the radius values obtained from the microscopic models were compared with the ML results, again in the Kr and Sr isotope chains. It has been concluded that these four ML methods are successful in estimating the charge radii of atomic nuclei accurately and even in capturing the unusual behavior in the magic number region. Compared with the formulas obtained from different models and with the results produced by the microscopic models, ML proves to be a powerful alternative to theoretical models for estimating the charge radius.

The predictions of the Cubist model for the nuclear charge radii of ⁷²⁻⁹⁶Kr and ⁷⁸⁻¹⁰⁰Sr are shown in figures 2(a) and (b), respectively. The experimental data [5] and the predictions of the semi-empirical formulas are shown for comparison. As can be seen in the figures, the F1, F2, F3, and F6 formulas give results far from the experimental data. The F4 and F5 formulas give results close to the experimental data up to N = 50 and then begin to deviate. In the right panels of figures 2(a) and (b), kinks around N = 50 are seen, which means that the shell closure is visible in the experimental charge radii of the Kr and Sr isotopes. The Cubist model predicts the kinks around N = 50 for the Kr and Sr isotopic chains very well and reproduces the experimental data quite successfully.

Figure 2. Predicted charge radius values for ⁷²⁻⁹⁶Kr (a) and ⁷⁸⁻¹⁰⁰Sr (b) isotopes from the Cubist model together with the experimental data [5] and the predictions of the semi-empirical formulas given in the text. The units for RMSE and MAE are in fm.

Table 1. Performance metrics and optimized model hyperparameter values determined as a result of the 10-fold cross-validation process of the machine learning models.

For the Cubist model, the deviations of the estimation results from the experimental values on the test data are given in figure 1(a). This model's RMSE and MAE values are 0.0102 fm and 0.0075 fm, respectively. As clearly seen from the figure, the deviations from the experimental values are concentrated around the zero line except for a few atomic nuclei.

For the QRF model (figure 1(d)), the RMSE and MAE values are 0.0140 fm and 0.0104 fm, respectively. As can be seen from the figure, the deviations of this model from the experimental data are concentrated around the horizontal zero line, in the range of −0.05 fm to +0.06 fm. For the GPPK model, given in figure 1(e), the RMSE and MAE values are 0.0346 fm and 0.0273 fm, respectively. The distribution here lies in the range of −0.08 fm to +0.11 fm and is concentrated on the positive side of the zero line, indicating that the estimates are generally larger than the experimental data. The deviations of the MARS predictions from the experimental values are given in figure 1(f). The RMSE and MAE values of this model are 0.0346 fm and 0.0277 fm, respectively, and the deviations are concentrated in the range of −0.06 fm to +0.11 fm around the zero line. In figure 1(g), the distribution of the estimates obtained from the SVR model is given; for this model, the RMSE and MAE values were obtained as 0.0439 fm and 0.0336 fm, respectively.