A machine learning based Bayesian optimization solution to nonlinear responses in dusty plasmas

Nonlinear frequency response analysis is a widely used method for determining system dynamics in the presence of nonlinearities. In dusty plasmas, the plasma-grain interaction (e.g., grain charging fluctuations) can be characterized by a single particle nonlinear response analysis, while grain-grain nonlinear interactions can be determined by a multi-particle nonlinear response analysis. Here, a machine learning-based method to determine the equation of motion in the nonlinear response analysis for dust particles in plasmas is presented. Searching the parameter space in a Bayesian manner allows an efficient optimization of the parameters needed to match simulated nonlinear response curves to experimentally measured nonlinear response curves.


I. INTRODUCTION
Machine learning (or deep learning) has recently become one of the hottest analysis techniques in the scientific world as application of this powerful numerical method has proven useful in solving problems across a wide range of fields. For example, convolutional neural networks (LeNet, AlexNet) [1,2] now fulfill object recognition tasks to a high degree of accuracy, recurrent neural networks (LSTM) [3] are improving computational understanding of natural language, reinforcement learning agents are out-performing human experts in strategic decision making (AlphaGo) [4,5], and Generative Adversarial Networks (GANs) [6] are showing the ability to create music, paintings and dialogue in a human manner. In addition to industrial applications, machine learning techniques are now also being applied to solve physics problems. Examples include the prediction of molecular atomization energies by employing regression models [7], the application of a neural network to solve quantum many-body problems [8], and crystallization recognition through the use of a shallow neural network [9].
In this paper, a machine learning-based method is applied to nonlinear response problems in dusty plasmas. Dusty plasmas [10][11][12] are systems containing both weakly ionized gas and charged micron-sized dust particles. Due to the higher thermal velocities of electrons compared to ions, dust particles in a dusty plasma become negatively charged [13] in response to the frequent collisions between the plasma particles and the dust grain's surface. Dust particle behavior in plasmas is determined by many factors, with the restoring confinement caused by the balance between the electrostatic force and gravity, the neutral gas drag, and particleparticle interactions between dust particles among the primary of these. Understanding the physics behind dust particle behavior (i.e., investigating these factors) is one of the most important tasks in dusty plasmas. One of the ways in which this can be fulfilled is by studying the response of the particles to external excitations [14,15]. This is known as the nonlinear frequency response analysis, which has a wide application in mechanics, material science and nano science [16][17][18].
Here, a Bayesian optimization framework [19] is used to resolve a nonlinear response analysis [20][21][22][23] in a numerical manner in a dusty plasma. The undetermined coefficients in a dust grain's equation of motion will be derived by optimizing the simulated motion to match that obtained from experimental results which will be compared to the analytic results from a multiple-scale perturbation method. Also, we will measure the nonlinearity to the 4th order in displacement, which helps correctly characterize the potential energy of particle in the plasma sheath but has not been investigated before .
It is important to note that this framework is not limited to nonlinear response analysis, but can also be applied to the more general case of physics problems where the experimental results can be reproduced by simulations. In these cases, undetermined physics quantities can be revealed efficiently (especially when the simulation is very computational expensive) by optimizing the simulations to experimental results in this Bayesian manner.

II. EXPERIMENT AND BAYESIAN OPTIMIZATION
The experiment which will be discussed in this paper was conducted in a modified Gaseous Electronics Conference (GEC) RF reference cell (see Fig. 1) filled with argon gas. A single melamine formaldehyde (MF) particle having a diameter of 8.89 ± 0.09 µm was inserted into a glass box (height: 20 mm, length: 18 mm, width: 18 mm) placed on the lower electrode which was powered at 13.56 MHz. The plasma power and pressure were fixed at 1.68 W and 40 mTorr, respectively. The MF particle was levitated in the plasma sheath region due to the balance between gravity and the electrostatic force produced by the negatively charged lower electrode. The dust particle was illuminated by a laser sheet (wavelength of 660 nm) with the resulting motion recorded by a high speed camera mounted at the side port of the cell at a rate of 500 fps. A primary amplitude-frequency response curve was measured by applying a sinusoidal excitation signal to the lower electrode with a fixed amplitude at various frequencies. Particle motion was recorded and then transformed into the frequency domain (FFT spec-trum) using a Fourier transform at each value of the excitation frequency. The peak height of the FFT spectrum at the excitation frequency was measured, providing the primary response at this excitation. The secondary (super-harmonic) response to the excitation (a nonlinear response), can also be measured from the peak height of the FFT spectrum at twice the excitation frequency. The motion of a single such particle levitating inside the plasma sheath under a vertical sinusoidal excitation can be modeled as a forced oscillator [20], x + µẋ + ω 2 x + αx 2 + βx 3 = F exp(iΩt) + c.c., (1) where µ is the neutral drag coefficient, ω is the restoring constant, α and β are the second and third order derivatives of the restoring field, Ω is the frequency of the sinusoidal excitation, F is the amplitude of the excitation (in units of acceleration) and c.c. stands for the complex conjugate. Usually, the effective restoring force experienced by the particle at equilibrium can be approximated as a linear function in displacement (i.e., −ω 2 x where ω is considered the natural resonance frequency) under the assumption that the particle is levitating in a region where the sheath can be considered to exhibit a perfect parabolic sheath potential [24,25]. Unfortunately, this approximation becomes invalid in most realistic situations, such as where charge fluctuations are considered [21,22], or when the oscillation of the dust particle is so large that the sheath potential can no longer be considered parabolic. In this case, it is necessary to extend the restoring force to the nonlinear regime as −ω 2 x − αx 2 − βx 3 with terms higher than O(x 3 ) ignored for simplicity.
Based on Eq. 1, the particle motion x(t) as a function of time under an excitation with frequency Ω can be simulated (or numerically solved) given a set of known parameters θ = {µ, ω, α, β, F }. In this case, the particle motion was simulated employing the velocity Verlet algorithm, which updates the position and velocity in each iteration as where dt is the time step of the simulation, v(t) is the velocity at time t and a(t) is the acceleration normalized by the particle mass at time t as determined by Eq. 1 Following the same approach described above for measuring the amplitude-frequency response curves from experiment, the primary and secondary amplitudefrequency response curves can also be measured from the simulated particle motion x(t) by varying the excitation frequency Ω over the range of excitain frequencies used in the experiment.
This allows a parameter set θ * = {µ * , ω * , a * , b * , F * } characterizing the properties of the dust motion (which depend on properties of the nearby plasma environment) to be determined by searching the parameter space for the optimal set of parameters that generates a simulated amplitude-frequency response curve resembling the experimentally measured amplitude-frequency response curve. This process can be quantified as where R e represents the experimentally measured response curves and R s (θ) represents the simulated response curves for a given set of parameters {µ, ω, α, β, F }. L(R e , R s ) is a measure of the difference between the experimentally measured and simulated response curves. In order to quantify this difference, we define L as a function L : θ = {µ, ω, α, β, F } → R that maps a set of parameters to a real value which measures the 'distance' between the experimentally measured and the simulated response curve as where r e (Ω i ) and r s (Ω i ) are the experimentally measured and simulated response amplitudes at the excitation frequency Ω i , respectively. The summation is carried out over the excitation frequency span conducted in the experiment. The square operation is taken to ensure that L(θ) does not yield negative values, which guarantees the existence of minimal points. It is important to mention that this type of loss function L(θ) may not be welldefined everywhere. For unrealistic parameters sets, i.e., sets that either have no physical meaning or are not suitable for describing the condition of the plasma sheath, this simulation of nonlinear response curves diverges, resulting in an undefined distance function. In these cases, a large value is assigned to the distance function (e.g., L = 10 5 ) in order to ensure optimization success.
As can be seen from Eq. 5, calculation of the loss function L(θ) for even one set of parameters requires multiple simulations of the particle's motion, i.e., one for each excitation frequency used for measuring the response curve in the experiment. For example, the response curve shown in Fig. 2 requires 71 independent simulations to calculate the distance function for just one set of parameters. This is computationally expensive and, as such, a minimization of the distance function L(θ) based on a random search of the parameter space θ is infeasible.
Therefore, this loss function is minimized employing a Bayesian optimization. This technique has shown great promise in machine learning, especially for the fine tuning of neural networks for model selection. Now, instead of randomly searching the parameter space and then conducting simulations for each set, only those parameter sets selected in a Bayesian manner are simulated. A surrogate function f is introduced to model the distribution of the value of the loss function L(θ). The posterior distribution of this surrogate function at the parameter θ given the data observed D 1:t = {θ 1:t , L(θ 1:t )} can now be derived using the Bayes law where Tree-structured Parzen density estimators [26] (a generative model) are used to model the likelihood function p(θ|f ; D 1:t ) defined as In this likelihood function, l(θ) and g(θ) are nonparametric Parzen density estimators (i.e., Gaussian mixtures) of the likelihood for the data to have a value of the loss function L(θ) smaller and greater than a threshold f * , respectively. As such, the marginal distribution of the parameter set given the observed data set D 1:t (the denominator of Eq. 6) can in turn be calculated as The criteria for exploring the overall parameter space is to choose the next set of simulation parameters that maximizes the expected improvement E[max(f * − f, 0)] [27] as where the last equation holds since the cumulative distribution f * −∞ p(f ; D 1:t )df is strictly less than 1. As shown, this result is not affected by the exact form of the prior p(f ; D 1:t ). As such, the next parameter set whose distance function will be simulated is chosen to maximize the quotient of the Parzen density estimators l(θ)/g(θ). As each new simulation is conducted, the data set D will be updated with the new simulated data points (e.g., the data set D 1:t is updated to D 1:t+1 by adding a new simulated point {θ t+1 , L(θ t+1 )}). The posterior distribution of the surrogate function Eq. 6 will also be updated accordingly, eventually resembling the behavior of the real distance function Eq. 5. Notice that the threshold f * in Eqs. 7 and 9 are the up-to-date optima (i.e., the minima of the loss function updated to the data D).
Since secondary responses (as nonlinear responses) are very sensitive to nonlinear terms, i.e., αx 2 and βx 3 , while the primary responses are more sensitive to the linear terms, it is necessary to minimize the loss functions for both primary and secondary responses simultaneously. One simple way of achieving this is to minimize a weighted sum of these two loss functions rather than minimizing them individually (e.g., L = L p + 0.05L s ). Figure. 2 shows the Bayesian-optimized simulated primary response curves (dashed red curves) of a single dust particle levitated in the plasma sheath in the GEC RF reference cell at a plasma power of 1.68 Watts and pressure of 40 mTorr. The corresponding secondary response curves are shown in the subplots. Particles excited under excitation amplitudes of 1.0 V and 1.5 V are plotted in Fig. 2a and Fig. 2b, respectively. As shown, the optimized response curves (dashed red curves simulated according to Eq. 1) resemble the experimentally measured responses curves (solid black curves) in both the primary and secondary regions. Also, note that the spring softening phenomenon (i.e., the nonlinear phenomenon that results in the primary resonance peak being 'bent' in the low frequency direction) becomes more obvious as the excitation amplitude increases (Fig. 2a).

III. RESULTS
The corresponding optimized parameters obtained from model 1 (Eq. 1) are calculated as the average over five independent trials of the optimizing experiment and their values are listed in Table I ( the corresponding Coefficient of Variations (CV) shown in parentheses. Notice that the sign of the coefficient of the quadratic nonlinearity α is irrelvant in this response analysis since it only changes the direction of the asymmetric motion of the dust particle. Even though there is a large variation in randomness in this Bayesian search of the parameter space, the optimizing experiment converges to yield consistent results as evidenced by the low CV. The relatively high CV for the parameter β (the coefficient of the cubic nonlinearity) is due to the fact that the response curves are robust in responding to variation of nonlinearities of higher order. As such, a small variation in β will not significantly perturb the entire response curve. Fig. 3 shows the loss (Eq. 5) as a function of the number of iterations for the dust particle excited at both 1.0 V (a) and 1.5 V (b), each of which has five independent experimental trials. As shown, the loss values decrease rapidly after the first several iterations, reaching convergence after a few hundred iterations. (This again indicates the efficiency of the presented Bayesian optimization method in exploitation of the parameter space.) However, in order to boost overall accuracy and ensure wider exploration of the parameter space, we conducted a large number of iterations. The observed higher convergency loss value for the dust particle under 1.5 V excitation (≈ 4.3×10 −3 ) as compared to that under 1.0 V excitation (≈ 2.2×10 −3 ) can be attributed to the increased difficulty of capturing the spring softening phenomenon as the excitation amplitude becomes larger. (Compare Fig. 2a and Fig. 2b.)

IV. MULTIPLE-SCALE PERTURBATION METHOD
The parameters can also be derived analytically by solving the equation of motion (Eq. 1) employing the multiple-scale perturbation method. The details of this method are given in refs [23,28], with the main results needed for the analysis given below.
Assuming an external excitation at a frequency of approximately half that of the oscillator resonance frequency ω, i.e., Ω ≈ 1 2 ω, the solution to Eq. 1, to first order, yields where φ is the shifted phase which is dependent on the excitation frequency as φ = arctan( 4Ω−2ω µ ). The parameters ω, µ and α are determined using the experimentally measured secondary response curve fitted to the steady state theoretical secondary response αF 2 /4ω(ω 2 − Considering an external excitation having a frequency approximately equal to the oscillator resonant frequency, i.e., Ω ≈ ω, the solution to Eq. 1, to first order of approximation yields where the shifted phase is φ = Ω−ω −β. By eliminating the secular term appearing in the equation of motion to second order of approximation, the steady state theoretical primary response A(Ω) can now be derived as By fitting the corresponding experimentally measured primary response curve to Eq. 12, the parameter space for the cubic nonlinearities β and F can now be investigated. The parameters obtained in this way are shown in Table I (method 'Multiple-scale'). As shown, the parameters measured from the Bayesian optimization are consistent with those measured from the multiple-scale perturbation, except for the value of β under 1.5 V excitation (with approximately 57.6% percent difference). Notice that the measurements of the parameters from the multiple-scale perturbation serve as a benchmark and should not be considered as true values since they are derived and are precise only to the first order of approximation.
Due to limitations of the multiple-scale perturbation method, any extension of the model (Eq. 1) (e.g., including higher order nonlinearities of either displacement x or velocityẋ) and derivation of the corresponding approximate solutions would be tediously complicated. However, the Bayesian optimization scheme described here allows this process to be simplified greatly. As an example, the model is extended to include an additional nonlinearity of higher order in displacement x, Applying the Bayesian method, the optimized parameters are measured, with the results shown in Table I (with method 'Model 2') and the corresponding simulated response curves shown in Fig. 2 (dashed blue curves). Again, the corresponding CV for the five independent trials are shown in parentheses.

V. DISCUSSION AND CONCLUSION
By considering nonlinearities to fourth order, the primary response curves (dashed blue line) in Fig. 2a more closely resemble the spring softening behavior than do those considering nonlinearities to third order (dashed red line). Also, the loss is further reduced, reaching 1.6 × 10 −3 for a 1.0-V excitation and 2.3 × 10 −3 for a 1.5-V excitation, as compared to values based on the model with third order nonlinearities (Eq. 1) which results in loss values of 2.2×10 −3 and 4.3×10 −3 , respectively. This indicates a closer match of the simulated response curves to the experimentally measured ones. After introducing nonlinearities to the fourth order, the measured drag coefficient µ, excitation amplitude F , and the coefficient of the quadratic nonlinearity α more closely approach the values measured from the multiple-scale perturbation. However, it is noted that the coefficient for the cubic nonlinearities, β exhibits a large deviation. Considering the conditions for the existence of the spring softening effect (seen from Eq. 12) the critical value of β for the existence of the spring softening phenomenon can be derived as β < β c ≈ 8.1×10 −4 (by taking into consideration the fact that the measured values for the coefficient of the quadratic nonlinearity α are consistent in both models given by Eq. 1 and Eq. 13). In this case, a large value of β as measured for both a 1.0-V excitation and a 1.5-V excitation, based on the model represented by Eq. 13, seems to violate the condition of the existence of the spring softening phenomenon (Eq. 14). However, Eq. 14 is derived from only the first order of approximation in the multiple-scale perturbation. Thus, although a response curve simulated with a parameter β violating the condition given by Eq. 14 still reveals the spring softening phenomenon, this indicates a limited ability and accuracy of the multiple-scale perturbation method to explain nonlinear responses since it ignores higher order nonlinearities. In order to accurately determine the coefficient of the cubic nonlinearities β (an important factor characterizing the nonlinearity of the plasma sheath [20,21]), the effects from higher order nonlinearities are important and should not be ignored. Fig. 4a shows the effective restoring potential energy Φ of the particle (divided by the particle mass) in the vicinity of its equilibrium position for both Model 1 (red) and Model 2 (blue). The difference in restoring potential energy is observable for these two models.
With the coefficient for the fourth order nonlinearity available, the change in the electric field and the grain charge at varying levitation positions can be further investigated. By considering an expansion in the electric field E and grain charge Q the electrostatic force can be written as The coefficents of the polynomial in this expansion are related to the corresponding coefficents in Eq. 13. By assuming a linear electric field (i.e., E 2 = E 3 = 0), the nonlinearities in charge Q can be explored to the third order (Q 3 ) with γ provided by the Bayesian optimzation approach (Model 2), while without γ, the nonlinearities in charge can only be explored up to the second order (Q 2 ). Fig. 4b shows the grain charge in the vicinity of the equilibrium position for the third order polymonial (Q = Q 0 + Q 1 x + Q 2 x 2 + Q 3 x 3 ) and second order polymonial (Q = Q 0 + Q 1 x + Q 2 x 2 ) expansion in red and blue, respectively. As a reference, a linear charge model (Q = Q 0 + Q 1 x) is also shown in black. The corresponding linear electric field (divided by the particle mass) for each charge expansion are shown in Fig. 4c. The equilibrium charge Q 0 ≈ 1.4 × 10 4 e was estimated by using the levitation position comparison method [29].
In this method, a vertically aligned two particle pair is formed, and the difference in the levitation position for the upstream particle with and without the presence of the downstream particle (where the downstream particle is knocked out of the system using a laser pulse), is measured. As shown, the third order polymonial charge model predicts a weaker charge reduction in the downstream region, but a stronger charge accumulation in the region above the equilibrium position. Beyond assuming a linear electric field, we can also investigate the nonlinear expansions for the E-field and the grain charge simultaneously (i.e., E = E 0 + E 1 x + E 2 x 2 and Q = Q 0 + Q 1 x + Q 2 x 2 ). However, due to a lack of constraints, this investigation can only be explored to the second order for both electric field and grain charge even with γ known. Fig. 4b and Fig. 4c show the result of this charge model and the corresponding nonlinear electric field as dashed lines. As shown, both the grain charge and the electric field can be very different from the cases where the electric field is assumed to be linear in the downstream region. This indicates a resonable assumption of a linear electric field in a close vicinity of the equilibrium position. It is instructive to compare these results against the usual linear models for both the particle charge and electric field. With the assumption of a linear electric field, the charge varies considerably from the linear charge model in the upstream direction. Conversely, with the assumption of a quadratic electric field, it is seen that the charge varies significantly from the linear charge model in the downstream direction. Future experiments may be designed to determine which of these models is correct.
In conclusion, a nonlinear response analysis for dust particles in plasma was provided employing a machine learning based method. An efficient technique for optimizing the comparison between numerically simulated and experimentally measured response curves by searching the parameter space in a Bayesian manner was described. Using this approach, the physical parameters characterizing the plasma conditions can be derived. The nonlinearity of the response was determined to the fourth order, which is necessary in order to accurately determine the the coefficients for lower-order nonlinearities, as well as to correctly characterize the potential energy of the particle in the sheath. Beyond the field of dusty plasmas, the proposed framework provides a general method for measuring physical quantities by optimizing simulation parameters to match experimental observations in an efficient manner, especially when the simulation is computationally expensive.

VI. ACKNOWLEDGEMENTS
Support from NASA grant number 1571701, NSF grant number 1740203 and NSF grant number 1707215 is gratefully acknowledged.