Machine learning for efficient grazing-exit x-ray absorption near edge structure spectroscopy analysis: Bayesian optimization approach

In materials science, traditional techniques for analyzing layered structures are essential for obtaining information about local structure, electronic properties and chemical states. While valuable, these methods often require high vacuum environments and have limited depth profiling capabilities. The grazing exit x-ray absorption near-edge structure (GE-XANES) technique addresses these limitations by providing depth-resolved insight at ambient conditions, facilitating in situ material analysis without special sample preparation. However, GE-XANES is limited by long data acquisition times, which hinders its practicality for various applications. To overcome this, we have incorporated Bayesian optimization (BO) into the GE-XANES data acquisition process. This approach potentially reduces measurement time by a factor of 50. We have used a standard GE-XANES experiment, which serves as a reference, to validate the effectiveness and accuracy of the BO-informed experimental setup. Our results show that this optimized approach maintains data quality while significantly improving efficiency, making GE-XANES more accessible to a wider range of materials science applications.


Introduction
Layered material structures play a key role in various technological fields such as electronics, renewable energy and material degradation studies. For instance, thin film transistors (TFTs) in electronics utilize layered structures for enhanced charge transportation and flexibility [1], while thin-film photovoltaic devices in renewable energy employ strategic layering for optimized light absorption and electron-hole separation [2]. Similarly, the study of layered corrosion in compositionally complex alloys provides crucial insights into material degradation processes [3,4]. Numerous examples, including those mentioned above and others, highlight the crucial role of layered structures in a wide range of research and application areas [5–11].
The conventional study of these complex layered structures employs advanced surface analysis techniques such as x-ray photoelectron spectroscopy (XPS), secondary ion mass spectrometry (SIMS), and Auger electron spectroscopy (AES) [12–18]. While these methods provide detailed insights, they have limitations, such as the need for high vacuum conditions and limited depth analysis.
To address these challenges, x-ray absorption near-edge structure (XANES) spectroscopy has emerged as a promising alternative. XANES operates under ambient conditions and presents a broader scope of applicability across various research contexts. This extends the possibilities for material analysis, particularly for studies requiring more flexible experimental conditions. Additionally, XANES enables time-resolved measurements, providing dynamic insights into the evolving material structure and related properties under study [19,20].
Grazing exit x-ray fluorescence spectroscopy (GEXRF) emerges as a non-destructive, depth-resolved, element-specific characterization technique, enabling the study of materials on the nanometer scale. Depending on the grazing exit angle of the fluorescence radiation, the information depth of the experiment can vary from tens to hundreds of nanometers. Such adaptability is important for the analysis of thin films, corrosion layers, and interfaces in layered materials, where the complex relationship between structure and properties at the nanoscale is critical [4].
For the collection of emitted XRF intensity as a function of grazing exit angle, two methodologies are available: one uses an energy-sensitive 1D detector, where the range of grazing emission angles is captured by changing the detector's position; the other employs a position- and energy-sensitive area detector, which captures XRF intensities without detector scanning, as demonstrated in this study. Within the scanning-free approach, there is a trade-off between angular resolution and the intensity of the emitted fluorescence radiation. The angular resolution of a fixed area detector depends on the distance D between the sample and the detector. Increasing this distance improves the angular resolution but decreases the detected intensity in proportion to the square of the distance (D²), and as a result experiment times are extended. For greater XRF intensity, the sample-detector distance can be reduced, but this reduces angular resolution and can lead to overlapping fluorescence signals from layered materials due to the variance in emission angles from different depths [4].
It is well known in grazing exit setups that shallow emission angles carry information from the surface of the sample being analyzed, whereas as the emission angle increases, information from deeper levels of the material is obtained. The emission angle of the XRF mainly depends on the complex refractive index, and when different oxidation states are analyzed, the angular intensity profiles of layers with different states are close to each other. To resolve the angular intensity profiles of the different layers in such a scenario, good angular resolution is required [4].
In our study, we attempt to achieve high angular resolution and overcome long experimental times. Our study integrates machine learning with Grazing Exit XANES (GE-XANES) spectroscopy, exploiting the inherent strengths of XANES for an improved analytical approach [4]. This strategy aims to overcome the limitations of traditional methods, thereby increasing the speed and efficiency of data acquisition, particularly in depth-resolved analyses of layered materials. By incorporating active learning, a subset of machine learning, we seek to refine the data acquisition process in GE-XANES. This integration allows for a more efficient and streamlined methodology, enabling the acquisition of comprehensive insights into layered structures relevant to multiple scientific disciplines.
Building on its proven track record in data analysis, image processing, and materials synthesis [21–32], machine learning offers promising ways to overcome obstacles in the GE-XANES data collection process. This integration of machine learning represents a new approach, as its potential benefits in facilitating the data collection process are yet to be fully explored.
Active learning has the potential to revolutionize traditional static data collection methods used in x-ray absorption spectroscopy analysis and other research areas [33,34]. Active learning employs mathematical models to guide optimal data collection decisions. Bayesian optimization (BO), a particularly valuable technique when function evaluation is costly or time-consuming, can be embedded within an active learning framework [35–38]. BO incorporates a prior probability distribution over the objective function that can be updated as new data are collected [39]. This unique ability to quantify uncertainty assists in model interpretation and decision-making.
Advances in active learning are a good example of the broader trend of machine learning's transformative impact on experimental methodology. This shift towards more efficient, data-driven approaches is gaining considerable attention as it not only streamlines data collection processes but also increases the speed and effectiveness of research. Such advances are extremely useful in resource-constrained environments such as synchrotron facilities. The application of machine learning, and in particular its active learning subset, in the field of synchrotron radiation research has already yielded promising results, as evidenced by numerous studies [40–46].
Here, we present a novel integration of BO into GE-XANES analysis. This research aims to accelerate data collection, a much-needed development in the field. A layered reference sample consisting of a silicon wafer with a 500 nm layer of metallic chromium followed by a 300 nm layer of chromium oxide was used to test our approach. With this approach, it is potentially possible to reduce measurement time by a factor of 50 compared to conventional methodologies.
Our results suggest that this BO-based approach can be adapted to any sequential technique with prolonged measurement times, such as laboratory-based XANES, thereby potentially revolutionizing the field of experimental design. It also highlights the growing need for a faster, more efficient data acquisition method across different technological domains, which has important implications for future scientific discovery and research.

Methods
The present study proposes an optimized data collection strategy for novel, scanning-free GE-XANES measurements in the medium/hard x-ray range. We used a monochromatic beam and an energy- and position-sensitive pnCCD detector. By leveraging the BO process, we aim to reduce measurement times. Measurements were conducted at the µSpot (mySpot) beamline at the BESSY II electron storage ring, operated by the Helmholtz-Zentrum Berlin für Materialien und Energie. Further details about the experimental setup can be found in Cakir et al [4].

Beamline
The mySpot beamline, mounted on the 7 T wavelength shifter at the BESSY II synchrotron radiation ring, is our main instrument of use. The synchrotron beam is first focused using a toroidal total external reflection mirror placed in front of the main beam shutter. This mirror enables a high photon flux of 10¹⁰–10¹¹ ph s⁻¹ at 6 keV, with a functional energy range of 4–30 keV. The focused beam is then monochromatized by a double crystal monochromator in the Si (111) configuration, offering an energy resolution of 2 × 10⁻⁴ (ΔE/E) [47,48].
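To make the monochromator numbers concrete, the following minimal sketch converts a requested photon energy into the Bragg angle of a Si (111) crystal and evaluates the absolute bandwidth implied by ΔE/E = 2 × 10⁻⁴. The lattice constant, the constant h·c and the function names are assumptions introduced for illustration; they are not taken from the beamline control software.

```python
import math

HC_EV_ANGSTROM = 12398.42      # h*c in eV*Angstrom (assumed constant)
SI_LATTICE_ANGSTROM = 5.4310   # silicon lattice constant (assumed room-temperature value)
D_SI111 = SI_LATTICE_ANGSTROM / math.sqrt(3.0)  # spacing of the (111) planes

def bragg_angle_deg(energy_ev, d_spacing=D_SI111, order=1):
    """Bragg angle (degrees) that selects photons of energy_ev."""
    wavelength = HC_EV_ANGSTROM / energy_ev
    return math.degrees(math.asin(order * wavelength / (2.0 * d_spacing)))

def bandwidth_ev(energy_ev, relative_resolution=2e-4):
    """Absolute energy bandwidth for a given relative resolution dE/E."""
    return energy_ev * relative_resolution

# Hypothetical usage: the two ends of the scan range and the Cr K-edge.
for energy in (5975.0, 5989.0, 6035.0):
    print(f"{energy:.0f} eV -> theta = {bragg_angle_deg(energy):.3f} deg, "
          f"bandwidth = {bandwidth_ev(energy):.2f} eV")
```

Under these assumptions the bandwidth near the Cr K-edge is roughly 1.2 eV, which is comparable to the 1 eV step of the standard scan described in the results.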

Detector
For the GE-XANES setup, we employed an energy-dispersive pnCCD detector for selective analysis of specific emission lines, ensuring improved data interpretation accuracy and precision. The detector has a sensitive area of 12.7 × 12.7 mm², comprising 264 × 264 pixels with an individual area of 48 × 48 µm². The detector's quantum efficiency exceeds 95% between 3 keV and 10 keV, and its energy resolution is 152 eV for Mn-Kα (5.9 keV) at 450 kcps [49].

Sample description
In this study, a reference sample with a predetermined composition and layer thickness was employed to validate our results. The sample was prepared through a sequential magnetron sputtering technique, using high-purity chromium from an elemental target. The deposition of the metallic chromium and chromium-oxide layers occurred at room temperature, with the thickness controlled by the deposition rates. The metallic chromium layer was sputtered onto a Si wafer using a constant flow of Ar. The chromium-oxide layer was then deposited on top of the chromium layer using an Ar and O₂ mixture. For a detailed overview of the sample preparation process, see Cakir et al [4].

BO for active learning
BO is a probabilistic framework for optimizing functions that are considered 'black-box' and may be linked to expensive and possibly noisy objectives. These objectives typically carry high evaluation costs, whether in time, computational resources, or other expense types, such as long synchrotron radiation experiments [50]. BO requires a posterior distribution prediction, and a Gaussian process (GP) is a popular choice for this goal.

GP
GP models enable us to understand the relationship between system inputs and outputs. They assume that the objective function $f$ is a realization from a distribution over functions,

$$f(x) \sim \mathcal{GP}\left(m(x),\, K(x, x')\right).$$

A GP therefore represents an infinite collection of random variables, any finite subset of which follows a joint Gaussian distribution [51]. It follows that the marginal distribution of each function value is a univariate Gaussian,

$$f(x) \sim \mathcal{N}\left(\mu(x),\, \sigma^2(x)\right).$$

The mean $\mu(x)$ and the variance $\sigma^2(x)$ functions can be derived by updating our prior with the observed data $D = \{X, y\}$ and marginalizing it out [52]:

$$\mu(x) = m(x) + K(x, X)\, K(X, X)^{-1} \left(y - m(X)\right),$$

$$\sigma^2(x) = K(x, x) - K(x, X)\, K(X, X)^{-1} K(X, x).$$

Here, $y = f(X)$ are the observed data points, $m$ is a possibly parameterized mean function, which is set to zero throughout this paper, and $K$ is the covariance function, often chosen from the list of parameterized stationary kernel functions such as the Matérn kernel. The Matérn kernel is commonly employed by GPs to model a wide range of data relationships. It is flexible, capable of tracking stable, smooth functions as well as abrupt changes and discontinuities. However, depending on the problem and data at hand, other kernel functions may prove more effective. The Matérn kernel, a generalization of the radial basis function (RBF) kernel, introduces an additional hyperparameter, $\nu$, which governs the smoothness of the resulting function. A smaller value of $\nu$ yields a less smooth approximation, while a larger $\nu$ causes the kernel to behave similarly to the RBF kernel. In this work, we use the Matérn kernel $K_{\text{Matérn}}$ with $\nu = 1.5$.
Additionally, we define a scaling factor $s$ and homoscedastic noise $\sigma_n$, resulting in the following kernel function:

$$K(x, x') = s\, K_{\text{Matérn}}(x, x'; l) + \sigma_n^2 I.$$

Type-II maximum likelihood estimation was used to determine the hyperparameters $l$, $s$ and $\sigma_n$, where $l$ is the length scale of the function and $\sigma_n^2 I$ represents the White kernel. In this research, we estimated statistical errors as the square root of the total count and integrated this estimate into our model through the White kernel. Combining the Matérn and White kernels in this way enhances the model's ability to account for statistical errors.
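The posterior described in this subsection can be sketched in a few lines of NumPy. The example below is an illustration only: it assumes a zero prior mean, the closed form of the Matérn kernel for ν = 1.5, a scaled Matérn kernel with a white-noise term on the diagonal, and hand-picked hyperparameters instead of Type-II maximum likelihood estimates. The function names and the synthetic edge-like data are hypothetical.

```python
import numpy as np

def matern32(xa, xb, length_scale):
    """Matern kernel with nu = 1.5 between two 1D arrays of inputs."""
    z = np.sqrt(3.0) * np.abs(xa[:, None] - xb[None, :]) / length_scale
    return (1.0 + z) * np.exp(-z)

def gp_posterior(x_obs, y_obs, x_new, length_scale, s, sigma_n):
    """Zero-mean GP posterior mean and variance of f at x_new."""
    # Covariance of the observations: scaled Matern plus white-noise term.
    k_obs = s * matern32(x_obs, x_obs, length_scale)
    k_obs = k_obs + sigma_n**2 * np.eye(len(x_obs))
    k_cross = s * matern32(x_new, x_obs, length_scale)
    # Cholesky factorization avoids forming an explicit matrix inverse.
    chol = np.linalg.cholesky(k_obs)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_obs))
    mean = k_cross @ alpha
    v = np.linalg.solve(chol, k_cross.T)
    variance = s - np.sum(v**2, axis=0)  # prior variance of f is s
    return mean, np.maximum(variance, 0.0)

# Synthetic, noise-free edge sampled at seven evenly spaced energies (eV).
x_obs = np.linspace(5975.0, 6035.0, 7)
y_obs = 1.0 / (1.0 + np.exp(-(x_obs - 5995.0) / 2.0))
# Query two unmeasured energies (5980, 6000) and two measured ones.
x_new = np.array([5980.0, 5985.0, 6000.0, 6005.0])
mean, variance = gp_posterior(x_obs, y_obs, x_new,
                              length_scale=8.0, s=1.0, sigma_n=0.01)
print(mean, variance)
```

In this sketch the predicted variance collapses at the measured energies and stays large between them, which is the behaviour the acquisition function exploits.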

Acquisition function
The foundational strategy in the BO process involves deriving a new measurement point using the posterior distribution, which is used to construct the acquisition function. We employed the upper confidence bound (UCB) [53] sampling function to determine the subsequent sample points iteratively and to find the peaks of the intensity function. The UCB balances exploration and exploitation to probe unexplored regions within the input space efficiently. The next sample point is selected by maximizing the UCB function, which is defined as:

$$\mathrm{UCB}(x) = \mu(x) + \kappa\, \sigma(x).$$

The balance between exploitation ($\mu(x)$) and exploration ($\sigma(x)$) is set by weighting the uncertainty term, derived from the variance function, with a trade-off parameter $\kappa$. The impact of the variance function on the UCB is directly proportional to the trade-off factor, which remains constant throughout the optimization process. Nevertheless, as the sample size increases, the values derived from the variance function decrease due to the reduction in model uncertainty. This leads to an increased emphasis on the exploitation of the learned model, while the focus on exploration diminishes. Although the variance function could be employed directly as a utility function to predict the subsequent measurement points in maximum exploration mode [54], this might result in the omission of pre-peaks, a common feature in XANES analyses. To circumvent this issue, we use the UCB function to calculate the next point instead of relying solely on the maximum of the variance function. The effect of $\kappa$ on the UCB function is visualized in the supplementary information.
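The role of the trade-off parameter can be illustrated with a small numerical example. The sketch below evaluates the UCB on a handful of candidate energies; the candidate values, the posterior mean and standard deviation, and both values of κ are made up for illustration and do not come from the experiment.

```python
import numpy as np

def ucb(mean, std, kappa):
    """Upper confidence bound: posterior mean plus kappa times posterior std."""
    return mean + kappa * std

def next_energy(candidates, mean, std, kappa):
    """Candidate energy that maximizes the UCB."""
    return candidates[int(np.argmax(ucb(mean, std, kappa)))]

# Hypothetical posterior on four candidate energies (eV).
candidates = np.array([5980.0, 5990.0, 6000.0, 6010.0])
mean = np.array([0.10, 0.90, 1.00, 0.95])
std = np.array([0.30, 0.05, 0.01, 0.02])

exploit_choice = next_energy(candidates, mean, std, kappa=0.0)  # mean only
explore_choice = next_energy(candidates, mean, std, kappa=5.0)  # uncertainty dominates
print(exploit_choice, explore_choice)
```

With κ = 0 the selection follows the largest predicted intensity, whereas a large κ moves the next measurement to the most uncertain energy even though its predicted intensity is low.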

Results & discussion
First, it must be emphasized that the standard experiment serves as the reference or ground truth in this analysis, a status supported by prior research and crucial for the comparative approach employed. This standard measurement is the best result experimentally achieved, as it was previously compared to known XANES spectra for the materials in question [4]. This standard is vital for evaluating the effectiveness and accuracy of the BO-informed experimental method. The experimental setup and data acquisition processes for GE-XANES are illustrated in figure 1. The primary objective is to obtain, within the shortest possible time frame, acceptable data that closely resemble the data derived from a 20 min per point standard experiment.
The reference sample used in this study is a silicon wafer with a 500 nm layer of metallic chromium, followed by a 300 nm layer of chromium oxide. To begin the scanning process, we define an energy range around the absorption edge of the chromium atom. This range spans 60 eV, specifically from 5975 eV to 6035 eV, covering the Cr K absorption edge at 5989 eV.
To analyze the two distinct states of chromium (Cr), the sample-detector distance was set at 50 cm. This configuration offers a total angular range of 1.455° and a horizontal angular width of 0.0055° per pixel. The angular intensity profile of the chromium oxide layer is compressed into approximately 40 of the 264 pixels, resulting in low count statistics in the XANES spectrum obtained from the surface layer. Consequently, a longer measurement time is necessary to thoroughly analyze the surface layer. Two distinct data acquisition procedures are used to scan the defined energy range. In the standard procedure, a total of 60 points with 1 eV steps were scanned. The BO-informed method uses an algorithm to select the energy points to be measured, reducing the energy scan to 25 points. To obtain the desired energy values, the monochromator is aligned to the angle corresponding to the energy value sent from the code. The measurement is triggered once the monochromator is aligned. The data obtained with each method are compiled into a dataset. This process continues until a specific stopping criterion is reached, which is 60 measurement points in the case of the standard experiments and 25 measurement points in the experiments informed by BO. The decision to set the stopping criterion at 25 points in the BO-informed experiments will be discussed in the following paragraphs.
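The quoted angular values follow from the detector geometry. As a rough check, the sketch below assumes a flat detector facing the sample, uses the 12.7 mm sensitive width and 48 µm pixel size given in the detector description together with the 50 cm distance, and adds the inverse-square intensity scaling mentioned in the introduction. The function names are hypothetical.

```python
import math

def angular_range_deg(detector_width_mm, distance_mm):
    """Total angle subtended by the detector width, seen from the sample."""
    return math.degrees(math.atan(detector_width_mm / distance_mm))

def angle_per_pixel_deg(pixel_size_mm, distance_mm):
    """Angle subtended by a single pixel, seen from the sample."""
    return math.degrees(math.atan(pixel_size_mm / distance_mm))

def relative_intensity(distance_mm, reference_distance_mm):
    """Detected intensity relative to a reference distance (inverse-square law)."""
    return (reference_distance_mm / distance_mm) ** 2

total_range = angular_range_deg(12.7, 500.0)
per_pixel = angle_per_pixel_deg(0.048, 500.0)
print(f"total range: {total_range:.3f} deg, per pixel: {per_pixel:.4f} deg")
print(f"intensity at 100 cm relative to 50 cm: {relative_intensity(1000.0, 500.0):.2f}")
```

Doubling the distance in this picture halves the angle per pixel but leaves only a quarter of the intensity, which is the trade-off that motivates shortening the scan itself.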
As shown in figure 1, with its unique two-dimensional structure, our detector records signal intensities over a specific range of angles corresponding to the emission angle. This two-dimensional design eliminates the need to physically scan different emission angles by moving the detector. It provides a scanning-free approach, saving considerable measurement time. Furthermore, the detector's sensitivity to the energy of incoming photons allows it to distribute them into 1024 channels according to their energy. This results in a detailed energy spectrum, providing a comprehensive analysis.
Transforming the three-dimensional data (pixel × pixel × channels) obtained from the detector into an XANES spectrum requires a series of processing steps. The initial goal is to condense the three-dimensional data collected from the analyzed energy range into a one-dimensional angular intensity profile.
In this study, we have a Cr oxide layer on top of Cr metal, so we can measure the most intense x-ray fluorescence line (Cr-Kα) free from other undesired contributions, such as scattering. As a result, we achieve higher accuracy by selectively capturing chromium's unique energy signature. We do this by first defining a range of ±10 channels surrounding the Gaussian peak corresponding to the element under study, thus reducing the data from three to two dimensions. Given our setup, the data of interest lie in the horizontal direction of this two-dimensional data set. Consequently, we sum the data in the vertical direction. This process not only improves our statistics but also reduces our data from two dimensions to a one-dimensional angular intensity profile (see figure 2(a)).
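The reduction from the detector cube to an angular intensity profile can be sketched as follows. The axis order (vertical pixel, horizontal pixel, energy channel), the function name and the small synthetic cube are assumptions for illustration; a real frame would have 264 × 264 pixels and 1024 channels, and the peak channel would come from a fit to the fluorescence line.

```python
import numpy as np

def angular_profile(cube, peak_channel, half_window=10):
    """Reduce a (vertical, horizontal, channel) cube to a 1D angular profile."""
    lo = max(peak_channel - half_window, 0)
    hi = min(peak_channel + half_window + 1, cube.shape[2])
    # 3D -> 2D: keep only the channels around the fluorescence line.
    image = cube[:, :, lo:hi].sum(axis=2)
    # 2D -> 1D: sum along the vertical direction to improve statistics.
    return image.sum(axis=0)

# Small synthetic cube: 8 vertical pixels, 16 horizontal pixels, 64 channels.
cube = np.zeros((8, 16, 64), dtype=int)
cube[:, :, 28:33] = 1   # fluorescence line centred on channel 30
cube[:, 4, 30] += 3     # one horizontal position (emission angle) is brighter
cube[:, :, 55] = 7      # scattering peak outside the selected window

profile = angular_profile(cube, peak_channel=30)
print(profile)
```

The scattering peak at channel 55 does not contribute to the profile because it falls outside the ±10 channel window.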
The next step is to collect angular intensity profiles at various excitation energies, which are represented in figure 2(b). We can now combine all of this information as a 3D dataset, featuring emission angle, intensity and excitation energy (figure 2(c)). Following this transformation, we apply non-negative matrix factorization (NMF) to the three-dimensional dataset. The NMF process yields two matrices. The first matrix contains the angular intensity profiles of the two distinct layers within our sample. The second matrix contains the XANES spectra corresponding to the angular intensity profiles in the first matrix (see figure 2(d)). As mentioned above, the main advantage of BO-informed experiments lies in the intelligent selection of points in the energy range for analysis rather than statically scanning this range. We describe this process in detail in the section 'BO for active learning'. A flowchart of this process is shown in figure 3 (left panel).
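To illustrate the factorization step, the sketch below separates a synthetic two-component dataset with a basic NMF using multiplicative updates. This is a generic textbook variant chosen to keep the example self-contained, not necessarily the implementation used in this work; the profile and spectrum shapes, the iteration count and the random seed are arbitrary assumptions.

```python
import numpy as np

def nmf(v, n_components=2, n_iter=1000, seed=0, eps=1e-12):
    """Factorize a non-negative matrix v (angles x energies) as w @ h."""
    rng = np.random.default_rng(seed)
    w = rng.random((v.shape[0], n_components)) + eps  # angular profiles
    h = rng.random((n_components, v.shape[1])) + eps  # spectra
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative.
        h *= (w.T @ v) / (w.T @ w @ h + eps)
        w *= (v @ h.T) / (w @ h @ h.T + eps)
    return w, h

# Synthetic data: two layers with different angular profiles and edge positions.
angles = np.linspace(0.0, 1.455, 40)
energies = np.arange(5975.0, 6036.0)
profile_a = np.exp(-((angles - 0.3) / 0.1) ** 2)
profile_b = 1.0 / (1.0 + np.exp(-(angles - 0.6) / 0.05))
spectrum_a = 1.0 / (1.0 + np.exp(-(energies - 5999.0) / 2.0))
spectrum_b = 1.0 / (1.0 + np.exp(-(energies - 5989.0) / 2.0))
v = np.outer(profile_a, spectrum_a) + np.outer(profile_b, spectrum_b)

w, h = nmf(v, n_components=2)
relative_error = np.linalg.norm(v - w @ h) / np.linalg.norm(v)
print(f"relative reconstruction error: {relative_error:.4f}")
```

The columns of `w` play the role of the first matrix (angular intensity profiles) and the rows of `h` the role of the second matrix (spectra); their ordering and scaling are not unique, which is why the spectra are normalized before comparison.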
To further illustrate this process, we chose to highlight the 1st, 3rd, 15th, and 17th iterations in figure 3 (right panel). These iterations have been chosen to illustrate the general improvement in the overall scanning process. The progression of the GP model's mean function over these iterations is shown in the right panel of figure 3. The lower graph shows the evolution of the UCB function derived from the selected iterations, highlighting the intelligent selection process inherent in the BO-informed experiment. The gold star on the lower graphs represents the next sampling point.
In BO, a precondition is the establishment of a 'prior function', which forms the basis for the iterative optimization process. This function is essentially a statistical representation of our beliefs about the function before viewing any data. To establish this prior function, we initially measured seven energy points evenly distributed within our energy range (see figure 3(a)). The measurements are denoted by red dots in figure 3 (right panel, top).
The mean function derived from the GP model is represented by the blue curve in figure 3(a). This curve is shaped by the measurements from the initial seven energy points and serves as a probabilistic model for our function. The light green area surrounding the curve represents the 95% confidence interval.
The BO-informed experimental process proceeds in iterations following the initial seven-point measurements. It uses the UCB, a function calculated using the mean and variance functions from the GP model, to guide the selection of energy points for measurement.
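By the third iteration (figure 3(b)), the UCB function values at previously measured points have decreased. This reduction in UCB values and the associated confidence interval indicates an improvement in the GP model's prediction accuracy due to the additional data. The narrowing of the confidence interval suggests a reduction in the uncertainty of the predicted values, ultimately leading to a more precise estimation of the target variable values. This iterative improvement continues, with data points and model parameters updating each cycle. By the 17th iteration (figure 3(d)), the confidence interval has narrowed considerably for all but one point. Once this last point is measured, the maximum standard deviation between the points reduces to 0.005, which is a 20-fold reduction from the first iteration. Our empirical observations indicate that such a low standard deviation between points signifies that the model has already explored the given space and attained a sufficient level of precision. In our setup, we set the stopping criterion for the BO-informed experiments at 18 iterations after the first seven measurements, resulting in a total of 25 points measured. However, the standard deviation between points could also serve as an alternative stopping criterion.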
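The iteration scheme described above (seven evenly spaced starting points, 18 UCB-guided iterations, 25 points in total) can be summarized in a short simulation. The sketch below is not the acquisition code used at the beamline: the measurement is replaced by a synthetic noise-free function, the GP hyperparameters and κ are fixed by hand, candidates are restricted to a 1 eV grid, and already measured energies are excluded. All of these are assumptions, as is the optional `std_tolerance` argument that illustrates the alternative stopping criterion.

```python
import numpy as np

def matern32(xa, xb, length_scale):
    """Matern kernel with nu = 1.5."""
    z = np.sqrt(3.0) * np.abs(xa[:, None] - xb[None, :]) / length_scale
    return (1.0 + z) * np.exp(-z)

def gp_predict(x_obs, y_obs, x_new, length_scale=8.0, s=1.0, sigma_n=0.02):
    """Zero-mean GP posterior mean and standard deviation."""
    k = s * matern32(x_obs, x_obs, length_scale) + sigma_n**2 * np.eye(len(x_obs))
    k_star = s * matern32(x_new, x_obs, length_scale)
    mean = k_star @ np.linalg.solve(k, y_obs)
    var = s - np.einsum("ij,ji->i", k_star, np.linalg.solve(k, k_star.T))
    return mean, np.sqrt(np.maximum(var, 0.0))

def fake_measurement(energy_ev):
    """Synthetic stand-in for one beamline measurement (not real data)."""
    edge = 1.0 / (1.0 + np.exp(-(energy_ev - 5995.0) / 2.0))
    pre_peak = 0.15 * np.exp(-((energy_ev - 5990.0) / 1.5) ** 2)
    return edge + pre_peak

def run_bo(measure, grid, n_initial=7, n_iterations=18, kappa=5.0,
           std_tolerance=None):
    """UCB-guided selection of measurement energies on a fixed grid."""
    x_obs = np.linspace(grid[0], grid[-1], n_initial)
    y_obs = np.array([measure(e) for e in x_obs])
    max_std_history = []
    for _ in range(n_iterations):
        mean, std = gp_predict(x_obs, y_obs, grid)
        max_std_history.append(float(std.max()))
        # Optional alternative stopping rule based on model uncertainty.
        if std_tolerance is not None and std.max() < std_tolerance:
            break
        score = mean + kappa * std
        score[np.isin(grid, x_obs)] = -np.inf  # do not repeat measured energies
        e_next = grid[int(np.argmax(score))]
        x_obs = np.append(x_obs, e_next)
        y_obs = np.append(y_obs, measure(e_next))
    return x_obs, y_obs, max_std_history

grid = np.arange(5975.0, 6036.0, 1.0)
energies, intensities, max_std_history = run_bo(fake_measurement, grid)
print(len(energies), max_std_history[0], max_std_history[-1])
```

Tracking the largest posterior standard deviation per iteration, as in `max_std_history`, gives the quantity on which an uncertainty-based stopping rule would act.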
In conventional XANES analyses, a total count of at least 10⁶ in the analyzed range is a common requirement. Given that the setup inherently involves a low detection angle, meeting this condition for the first layer required extended experiment times. In the standard experimental approach, we more than fulfill this requirement (≈7.5 × 10⁶) with a total scan time of 20 h for the first layer. This was achieved by measuring 60 points, each lasting 20 min.
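The time saving follows directly from the number of points and the exposure per point. The short calculation below uses only the values stated in the text (60 points at 20 min for the standard scan, 25 points at 1 min or 5 min for the BO-informed scans) and ignores overheads such as monochromator movement, which is an assumption.

```python
def total_minutes(n_points, minutes_per_point):
    """Total exposure time of a scan, ignoring motor and readout overheads."""
    return n_points * minutes_per_point

standard_minutes = total_minutes(60, 20)  # standard scan
bo_1min_minutes = total_minutes(25, 1)    # BO-informed, 1 min per point
bo_5min_minutes = total_minutes(25, 5)    # BO-informed, 5 min per point

speedup_1min = standard_minutes / bo_1min_minutes
speedup_5min = standard_minutes / bo_5min_minutes
print(standard_minutes / 60, speedup_1min, speedup_5min)
```

The 1 min per point case gives a ratio of 48, consistent with the reduction by a factor of about 50 quoted in this work, while the 5 min per point case corresponds to a factor of about 10.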
In particular, we ran two different BO-informed experiments with point measurement durations of 5 min and 1 min. The results of the standard experiment and the two BO-informed experiments are shown in figure 4.
The left side of figure 4 shows the results of the standard experiment, while the right side shows the results of the BO-informed experiments performed with 1 min and 5 min per point measurements. The top graph shows the combined spectra, summed over all emission angles, before applying NMF analysis.
NMF analysis, which we previously detailed, was then applied to the data obtained from the standard and BO-informed experiments. The resulting first matrix, representing the angular intensity profiles of the two distinct layers, is presented in figure 4(b). Notably, even though the noise level increases as the measurement time decreases, the two distinct components are still separated efficiently. Figure 4(c) displays the XANES spectra of the two layers derived from these angular intensity profiles. To facilitate a comprehensive comparison, we normalized the XANES spectra obtained for the different layers, and these results are presented in figure 5.
Figure 5, left panel, presents the results obtained from the first layer for the standard, BO-informed 5 min per point and 1 min per point experiments in the top graph. To assess the quality of the optimization process, the root mean square error (RMSE) was calculated between the outcomes of the standard experiment, which involved acquiring 60 equidistant points, and the predictions made by the GP model trained with data from 25 points collected during BO-informed experiments lasting 5 min per point and 1 min per point, respectively. This resulted in a value of 0.016 a.u. eV⁻¹ for the 1 min per point BO-informed experiment. This value was reduced to 0.0135 a.u. eV⁻¹ for the 5 min per point BO-informed experiment, which can be attributed to the beneficial effect of a longer exposure time on the signal-to-noise ratio. When analyzing the percentage absolute error between the spectra from the standard experiment and the BO-informed experiments, the maximum deviation from the standard experiment in the pre-peak region reached 7.78% and 6.85% in the 1 min and 5 min per point BO-informed experiments, respectively. However, for regions outside the pre-peak area, the 1 min and 5 min per point measurements showed an average maximum error of 1%.
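The two error measures can be written compactly. In the sketch below, the RMSE follows its standard definition, while the percentage absolute error is assumed to be the absolute difference divided by the reference value at each energy; the exact normalization used for figure 5 is not stated in the text, so this choice is an assumption. The arrays are toy values, not measured spectra.

```python
import numpy as np

def rmse(reference, prediction):
    """Root mean square error between two spectra on the same energy grid."""
    return float(np.sqrt(np.mean((reference - prediction) ** 2)))

def max_percent_abs_error(reference, prediction):
    """Largest point-wise absolute error, as a percentage of the reference."""
    return float(np.max(np.abs(reference - prediction) / np.abs(reference)) * 100.0)

# Toy example standing in for a standard spectrum and a GP prediction.
reference = np.array([1.0, 2.0, 4.0])
prediction = np.array([1.1, 1.9, 4.0])

print(rmse(reference, prediction))
print(max_percent_abs_error(reference, prediction))
```

A point-wise percentage defined this way becomes unstable where the reference spectrum approaches zero, so in practice it is only meaningful on normalized spectra away from a vanishing baseline.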
The outcomes observed when a similar error analysis was applied to the second layer are encouraging (see figure 5, right panel). The RMSE between the standard experiment and the BO-informed 1 min per point experiment for the second layer is reduced to 0.011 a.u. eV⁻¹, with the maximum percentage difference also diminishing to 2.56%. When comparing the standard experiment with the BO-informed 5 min per point experiment, values close to those of the BO-informed 1 min per point experiment are obtained. Specifically, the RMSE for the BO-informed 5 min per point experiment in the second layer is 0.009 a.u. eV⁻¹, while the maximum difference is 2.63%. Although the maximum difference is slightly higher in the BO-informed 5 min per point experiment, the RMSE suggests that the BO-informed 5 min per point experiment aligns more closely with the standard experiment than the BO-informed 1 min per point experiment. This close alignment is a result of the high count statistics of the deeper levels. It is important to recognize that these deviations are due to both the measurement and the optimization process.
An experiment was conducted to determine if the observed close alignment was due to BO or a GP. Figure 6(a) shows the comparison of normalized XANES spectra from the first layer of the reference sample, based on the exposure duration per point of the standard experiment, with data collected from 60 equidistant points as previously mentioned. The 1 min per point standard experiment captured the initial part of the spectrum (pre-peak region at 5975–6005 eV) but showed significant differences from the 20 min per point standard experiment in the region following the absorption edge (post-edge at 6005–6035 eV). To quantify these differences, the RMSE was used, resulting in a value of 0.0157 a.u. eV⁻¹ between the 1 min and 20 min per point standard experiments, whereas the RMSE in the post-edge region between the prediction of the GP trained with 25 points from the 1 min BO-informed experiment and the 20 min per point standard experiment is 0.005 a.u. eV⁻¹.
Figure 6(b) represents a comparison between two experimental procedures: a 1 min per point standard experiment and a 1 min per point BO-informed experiment. This comparison aimed to discern whether the close alignment of the BO-informed experiment's results with those of the established 20 min per point standard experiment was solely due to the application of a GP or an intrinsic aspect of the BO-informed methodology. For the 1 min per point standard experiment, a GP model was trained using the 60 points collected in that experiment to assess its impact on handling noise and characterizing the data. However, even after applying the GP to the 1 min per point standard experiment data, the features of the curve do not match those from the 20 min per point standard experiment, while those of the 1 min per point BO-informed experiment do. The RMSE calculated between the 20 min per point standard experiment and the prediction obtained from the GP after training it with 60 points from the 1 min per point standard experiment was 0.0124 a.u. eV⁻¹, indicating that while the GP contributes to data interpretation, it does not fully account for the features observed in the BO-informed experiment. The BO-informed approach is more effective at managing noise and preserving the features of the data from the 20 min per point standard experiment.
Despite using far fewer measurement points with shorter exposure time, the BO-informed experiments demonstrate their effectiveness. This suggests a more efficient method of data acquisition for GE-XANES and possibly conventional XANES applications. Our comparison with the reference standard experiment highlights the potential and limitations of the BO-informed approach and provides insight into areas for further optimization and refinement.
However, as the results for the first layer indicate, the algorithm could benefit from further tuning to account for local maxima in the pre-peak region more effectively. One possible solution is to refine the optimization process by having the algorithm perform an NMF analysis on the initial angular intensity profile, determine the emission angles for the layers, and then focus on the angle range of the desired layer during the optimization. Another option is to use adaptive sampling functions such as the mixed adaptive sampling algorithm (MASA).
Looking ahead, we will initiate further research to refine the BO-informed experimental process, focusing on the algorithm's sensitivity to local variations in the data. With iterative improvements, we anticipate that this method could become even more accurate and efficient in producing XANES spectra. Further studies could also explore the effect of different kernel functions in the GP model on the scanning results, in order to handle noisy and irregular patterns optimally and further improve the efficiency of BO-informed experiments.

Conclusion
The GE-XANES technique offers crucial insights into the local structure, oxidation states, and interlayers of thin films under ambient conditions. Nevertheless, the extensive duration required for experimental procedures has significantly limited its application within the field of materials science.
In response to this challenge, our study has showcased the effectiveness of BO in enhancing the data acquisition process for GE-XANES. By integrating BO, we have effectively reduced the number of necessary data points and decreased the exposure time per point. In the example presented here, this represents a potential reduction in measurement time by a factor of 50. These improvements have enabled more efficient conduct of XANES experiments, particularly in photon-hungry scenarios. Such advancements address the issue of lengthy acquisition times, which previously rendered GE-XANES analyses impractical for various applications.
The benefits of this methodological advancement are not restricted to GE-XANES but are also applicable to other laboratory-based x-ray absorption fine structure (XAFS) techniques and high-throughput experiments. This includes experiments associated with materials accelerator platforms, indicating a broad potential for impact across materials science research.

Figure 1. Overview of the BO-informed experimental process used in the grazing exit x-ray absorption near edge structure (GE-XANES) technique. This illustration highlights the key steps of our approach.

Figure 2. Overview of data processing. (a) Dimensional reduction process, culminating in a one-dimensional angular intensity profile. (b) Angular intensity profiles at varied excitation energies. (c) Three-dimensional data cube formed from these profiles. (d) Outcomes of the applied non-negative matrix factorization (NMF): the corresponding x-ray absorption near edge structure (XANES) spectrum of two distinct components.

Figure 3. The BO-informed process and its iterative refinement. The left panel shows the flowchart of the BO-informed process, illustrating the intelligent selection of points for analysis within the energy domain. The right panel shows the progression of the GP model mean function and UCB function over the 1st, 3rd, 15th and 17th iterations (top and bottom graphs respectively). These selected iterations are an example of the continuous improvement and intelligent selection process inherent in a BO-informed experiment.

Figure 4. Comparative results of standard and BO-informed experiments. (a) Combined spectra from the standard experiment (left) and BO-informed experiments with point measurement times of 1 min per point and 5 min per point (right) prior to NMF analysis. (b) The first matrix from the NMF analysis shows the angular intensity profiles of the two different layers. (c) The second matrix from the NMF analysis shows the XANES spectra of the two different layers.

Figure 5. (a) The left panel presents the results of the standard experiment, BO-informed 5 min, and 1 min per point experiments for the first layer, alongside the calculated root mean square error and percentage absolute error values. (b) The right panel details similar error analysis outcomes for the second layer. The observed errors are attributed to both the measurement and the optimization process.

Figure 6. Comparison of normalized XANES spectra from the first layer of the reference sample. (a) Standard experiments conducted with durations of 1 min and 20 min per point, with data collected from 60 equidistant points. (b) Comparison of a 1 min per point BO-informed experiment, a 1 min per point standard experiment, and the prediction obtained from the GP after training it with 60 points from the 1 min per point standard experiment.