Comparison of machine learning systems trained to detect Alfvén eigenmodes using the CO2 interferometer on DIII-D

A Machine-Learning (ML) based detection scheme that automatically detects Alfvén Eigenmodes (AE) in a labelled DIII-D database is presented here. Controlling AEs is important for the success of planned burning plasma devices such as ITER, since resonant fast ions can drive AEs unstable and degrade the performance of the plasma or damage the first walls of the machine vessel. Artificial Intelligence could be useful for real-time detection and control of AEs in steady-state plasma scenarios by implementing ML-based models into control algorithms that drive actuators for mitigation of AE impacts. Thus, the objective is to compare differences in performance between using two different recurrent neural network systems (Reservoir Computing Network and Long Short Term Memory Network) and two different representations of the CO2 phase data (simple and crosspower spectrograms). All CO2 interferometer chords are used to train both models, but only one is processed during each training step. The results from the model and data comparison show higher performance for the RCN model (True Positive Rate = 90% and False Positive Rate = 14%), and that using simple magnitude spectrograms is sufficient to detect AEs. Also, the vertical CO2 interferometer chord passing near the center is better for ML-based detection of AEs.

Introduction
The successful operation of planned nuclear fusion devices such as ITER depends on confined populations of superthermal particles that heat fuel ions for a self-sustaining plasma burn [1]. If confined, alpha particles born from fusion reactions can provide the heating required to reach ignition. If these alphas become unconfined, they can carry away fusion power to the inner walls of the vessel and damage the first walls [2,3]. The heat loss can be replaced using auxiliary heating mechanisms such as Neutral Beam Injection (NBI) or Radio Frequency (RF) waves, and both of these methods can create populations of fast ions that are useful for momentum transfer and current drive [4]. Fast ions born from fusion reactions or external heating can resonate with special types of plasma waves called Alfvén Eigenmodes (AEs) [5][6][7], transfer energy to the wave, drive the plasma unstable and degrade energy confinement [8,9]. Also, particle redistribution can expel fast ions from the plasma [8][9][10][11][12][13][14] and damage the inner walls of the vessel [15,16]. Therefore, studying fast ions and controlling AEs is imperative for the realization of controlled nuclear fusion.
Real-time control of AEs in high performance burning plasmas without damage to the inner walls is a high priority for the Plasma Control System (PCS) at ITER [17,18]. It is currently an important goal to determine the best set of external actuators in order to control AEs and alpha losses [19]. Suitable techniques include Electron Cyclotron Resonance Heating (ECRH) and current drive, and NBI. Since AEs can appear for short time scales on the order of milliseconds, simple feed-forward physics models are used to detect and control AEs. There is a need in the community for models with quick response times that could accurately detect the presence of AEs in real-time experiments.
Machine Learning (ML) applications in magnetic confinement fusion energy are growing and exciting opportunities exist in the fast-ion physics research field. Currently, the largest application of ML is in the area of disruption mitigation, where models are trained to prevent the rapid loss of thermal and magnetic energy during a quench of the plasma [20][21][22][23][24][25][26][27]. Surrogate model generation and experimental planning also benefit from data-driven methods [28]. On the other hand, ML in fast-ion research is a relatively new field. For example, Alfvénic and magnetohydrodynamic modes were detected using deep learning-based models, manually-labeled targets and magnetics on TJ-II [29] and COMPASS [30]. More examples used supervised learning to detect AEs [29][30][31], and data mining techniques combined with clustering for extraction of plasma fluctuations [32,33].
In recent years, significant advancements have been made in detecting and controlling AEs using Electron-Cyclotron Emission (ECE) data on DIII-D. Originally, in-shot variation of neutral beam energy showed promise for AE control [34], then the first active real-time control of AEs in a tokamak utilized modulated beams to tune the drive for AEs using feedback from high resolution ECE signals [15]. Shortly after, the Large 2009-2017 DIII-D AE Energetic Particle Database [35] was created to better understand low frequency AEs and was later used for ML analysis in two papers [36,37]. Deep Neural Networks were trained using ECE data in both studies. Reservoir Computing Networks (RCN) and Multi-layer Perceptron (MLP) Networks were trained in the former and latter study, respectively, and both achieved high performance. Section 2 of this paper discusses the Large AE-EP Database in more detail.
In this work, we focus on training Recurrent Neural Networks with labels created from the Large AE-EP Database, but use CO2 interferometer data instead of ECE since there are several advantages: (1) calculating crosspower spectrograms between two chords is common in the fast-ion physics community since AE patterns can be highly visible in this representation of the data, (2) the 1D phase signals are routinely processed by the PCS for nearly every discharge and can be used for real-time control in future DIII-D experiments, and (3) although ECE measurements are high resolution and can measure AE fluctuations with good signal-to-noise, issues associated with resonances and cutoff frequencies pose challenges for AE detection. Using the CO2 interferometer for AE identification is useful for reliably detecting AEs since it does not have limitations with cutoffs. For reasons 2 and 3, more shots are available in the Large AE-EP Database to train data-driven models. The baseline technique was initially trained to detect AEs using CO2 interferometer data in a conference paper [38], and we report significant advancements here.
Building from our prior work, the primary objective of this paper is to study the performance by comparing the following: (1) different feature sets (simple magnitude and crosspower spectrograms), (2) recurrent neural networks (RCN vs LSTM), and (3) stacking outputs vs individual crosspower. The State-Of-The-Art (SOTA) technique in [36] used 40 stacked time-domain signals of ECE data and created labels from the Large AE-EP Database to train an RCN. We instead use spectrograms of CO2 interferometer data, and individually forward pass each chord through both Neural Networks one at a time and compare the results. The aim is to match our prior performance by training with CO2 interferometer data for the potential long-term goal of creating an ML detector that could be useful in real-time AE control.
This paper is organized as follows: the CO2 interferometer on DIII-D, labels from the Large AE-EP Database and important challenges are discussed in section 2. The results of model and feature comparison are shown in section 3. Correlation analysis between predictions and metadata (equilibrium, beam, etc) is reported in section 4. Our conclusions appear in section 5.

Experimental data
DIII-D is a well diagnosed tokamak housing many diagnostic systems that measure the effects of AEs, with large amounts of available data from decades of experimental campaigns. Electron cyclotron emission [39], CO2 interferometry [40], beam emission spectroscopy [41], and magnetic fluctuation diagnostic systems [42] can be used to study the effects of fast-ion driven instabilities. Diagnostic and plasma information can be relayed to actuators for real-time control of AEs in DIII-D experiments [19,34,43,44].
The two-color vibration compensated CO2 interferometer is a real-time system routinely used for feedback control of the plasma state at DIII-D. Additionally, it can provide useful information about the internal mode structure of AEs since it observes the AE induced density perturbations with a resolution in the ∆(nL)/nL ∼ 10⁻⁵ range at the frequencies of interest. A layout of the CO2 interferometer for an example equilibrium is displayed in figure 1. All four chords (three vertical and one horizontal) are digitized for 9 s per shot at a rate of 1.67 MS s⁻¹, and the CO2 phase data are studied in this work since AE frequencies are well above typical mechanical vibration frequencies. Also, the phase data are processed in real time by the PCS at DIII-D, making the AE detector in this work applicable for actuator driven mitigation of AE impacts.
In the past, identification of AEs was usually done in a post-shot framework using crosspower spectrograms of CO2 interferometer data and other AE fluctuation diagnostics (or plasma parameters). Spectral analysis is useful since generating spectrograms can remove low frequency noise and machine vibrations seen in the 1D signals. Although this method works, it can be time consuming and requires extensive domain knowledge. In this work, we automate the identification process by training RCNs and LSTMs using simple and crosspower spectrograms of CO2 interferometer data.
The original curated Large AE-EP Database was created to investigate the dependence of AE stability on plasma parameters in over 1139 shots [35]. It includes the occurrences of six plasma instabilities: Ellipticity (EAE), Toroidal (TAE), Reversed-Shear (RSAE), Beta-Induced (BAE), Low-Frequency Mode (LFM), and Energetic Particle-Induced Geodesic Acoustic Mode (EGAM) [35]. Table I of [36] shows a description of these modes. Times were selected when the various AEs were stable, marginal, or unstable. The number of time stamps per discharge was chosen to sample changes in plasma parameters and mode activity in a representative fashion. Time stamps usually appear in the middle of a type of activity, and many occur during the first 1.9 s since some AEs depend on the q profile and q steadily evolves during that phase of the discharge.
There are several challenges in using the Large AE-EP Database and they are addressed here. The time stamps need to first be made binary and we adopt the one-hot encoding method described in table II of [36]. We consider AEs originally marked unstable as being present in the discharge and mark them as 1, otherwise flags are reassigned to 0. Figure 2 shows the presence of AEs over the selected 1069 shots studied in this paper. Since predicting single time stamps is a challenge for ML-based methods, the reassigned flags for all AEs are widened over ±125 ms. This completes the creation of the labels used to train the RCN and LSTM in this work. The third challenge is shown in figure 3, where the distribution of labels is imbalanced and heavily skewed towards TAE and RSAE. This imbalance motivates using True Positive Rate (TPR) and False Positive Rate (FPR) as the metrics of success since the accuracy metric would be 94% if a model always predicted 0. TPR and FPR are defined as follows:

$$\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}},$$

where TP = true positive, FN = false negative, FP = false positive, and TN = true negative. Although the ML classifiers train using information over the entire discharge and the original curated label is only available at discrete random times, TP and FP are modified such that a given prediction is reassigned only if an AE label is nearby within a window of ±140 ms. Lastly, CO2 interferometer, ECE and magnetics were all used to originally classify AEs in the Large AE-EP Database, which creates a classification challenge since a certain mode might show up more clearly in a different diagnostic than the CO2 interferometer.
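As a minimal sketch of the label widening and the two metrics described above, the following Python example spreads sparse time-stamp flags over ±125 ms and evaluates TPR and FPR on a synthetic time base. The function names, the 10 ms time base and the toy prediction are hypothetical and only illustrate the definitions; the ±140 ms reassignment of TP and FP is not reproduced here.

```python
import numpy as np

def widen_labels(times, stamp_times, stamp_flags, half_width=0.125):
    """Spread sparse time-stamp flags (1 = unstable) over +/- half_width seconds."""
    times = np.asarray(times, dtype=float)
    labels = np.zeros(times.size, dtype=int)
    for t0, flag in zip(stamp_times, stamp_flags):
        if flag == 1:
            labels[np.abs(times - t0) <= half_width] = 1
    return labels

def tpr_fpr(labels, preds):
    """Return (TPR, FPR) with TPR = TP/(TP+FN) and FPR = FP/(FP+TN)."""
    labels = np.asarray(labels).astype(bool)
    preds = np.asarray(preds).astype(bool)
    tp = np.sum(preds & labels)
    fn = np.sum(~preds & labels)
    fp = np.sum(preds & ~labels)
    tn = np.sum(~preds & ~labels)
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr

# Hypothetical 2 s discharge sampled every 10 ms with two curated time stamps:
# one marked unstable (1) at 0.5 s and one marked stable (0) at 1.5 s.
times = np.arange(200) * 0.01
labels = widen_labels(times, stamp_times=[0.5, 1.5], stamp_flags=[1, 0])

# Hypothetical thresholded prediction that fires slightly late and too long.
preds = ((times > 0.395) & (times < 0.705)).astype(int)

tpr, fpr = tpr_fpr(labels, preds)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.3f}")
```

Because the labels are heavily imbalanced, a detector that never fires would score a high accuracy while having TPR = 0, which is why the two rates are reported separately.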
Given these challenges, our prior work using ECE data accomplished TPR = 91% and FPR = 7% (table III of [36]). Our aim here is to match or improve these results using different feature sets of a new diagnostic system (CO2 interferometer) and recurrent neural networks (RCN vs. LSTM).

Comparisons and results
In an effort to discover the best performing ML-based model for this new CO2 interferometer project, several methods are explored in the following order: (1) linear regression, (2) MLPs, (3) convolutional neural networks (convnet), and (4) recurrent neural networks. A brief, qualitative summary of the regression, MLP and convnet classification appears in the appendix. Here, three major goals are addressed:
1. Compare the features of different inputs, i.e. simple magnitude and advanced crosspower spectrograms. The extraction of these different feature sets is discussed in section 3.1.
2. Determine the best performing recurrent neural network (RCN or LSTM) for this study. The different models are introduced in section 3.2.
3. Compare the performance of stacking outputs vs. crosspower combinations (2 simple spectrogram chords and 1 crosspower calculation). Results are shown in section 3.3.

Inputs
The inputs for both recurrent neural networks are simple magnitude and advanced crosspower spectrograms. These are windowed Fourier transform calculations using a window length of 4.9 ms and overlap of 80%. The spectrograms are downsampled using a maxpool function and the final input shapes are (time, frequency) = (140, 508). Maxpooling is commonly used in computer vision tasks and produced good results in this work. For the LSTM model, spectrograms are 'cut' into 280 ms windows, concatenated and fed into the model. For the RCN case, windowing is not implemented and the model processes 1D vectors of frequencies per training step. More details about the input preparation for the LSTM model can be viewed in section III of [38].
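The two feature sets can be illustrated with a short SciPy sketch. It assumes that the simple spectrogram is the magnitude of the windowed Fourier transform of one chord and that the crosspower spectrogram is the magnitude of the product of one chord's transform with the complex conjugate of another's; the function names, the reduced sampling rate, the synthetic 50 kHz mode and the pooling factor are all hypothetical choices made to keep the example small.

```python
import numpy as np
from scipy import signal

def chord_stfts(x, y, fs, window_s=4.9e-3, overlap=0.8):
    """Windowed Fourier transforms of two chords (4.9 ms window, 80% overlap)."""
    nperseg = int(round(window_s * fs))
    noverlap = int(round(overlap * nperseg))
    f, t, X = signal.stft(x, fs=fs, nperseg=nperseg, noverlap=noverlap)
    _, _, Y = signal.stft(y, fs=fs, nperseg=nperseg, noverlap=noverlap)
    return f, t, X, Y

def maxpool_freq(spec, pool):
    """Max-pool a (frequency, time) array along frequency in blocks of `pool` bins."""
    n = (spec.shape[0] // pool) * pool
    return spec[:n].reshape(n // pool, pool, spec.shape[1]).max(axis=1)

# Synthetic stand-in for two phase signals: a shared 50 kHz mode plus
# independent noise. fs is reduced from 1.67 MS/s to keep the demo light.
fs = 200e3
rng = np.random.default_rng(0)
n = int(0.1 * fs)
tt = np.arange(n) / fs
mode = np.sin(2 * np.pi * 50e3 * tt)
v2 = mode + rng.normal(size=n)
r0 = 0.5 * mode + rng.normal(size=n)

f, t, X, Y = chord_stfts(v2, r0, fs)
simple = np.abs(X)              # simple magnitude spectrogram of one chord
cross = np.abs(X * np.conj(Y))  # crosspower magnitude between two chords
pooled = maxpool_freq(simple, pool=4)

print(simple.shape, pooled.shape)
print("peak frequency (Hz):", f[np.argmax(cross.mean(axis=1))])
```

The arrays here are ordered (frequency, time); transposing them gives the (time, frequency) layout quoted in the text. Uncorrelated noise is suppressed more strongly in the crosspower, which is one reason this representation is popular for visual inspection.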

Architectures
The Python toolbox PyRCN (Python Reservoir Computing Networks) [45] is used for optimizing and training the RCN in this classification project. We utilize the more common RCN architecture, Echo State Networks (ESNs) [46], to perform the classification of AEs. Also, the hyper-parameter optimization routine is handled within the PyRCN framework and is based on the search strategy introduced in [47].
A two-layer RCN is developed by sequentially stacking two RCNs [48] on top of each other. It has been shown in [47][48][49] that stacking RCNs increases the temporal model capacity and reduces errors learned in early layers by rectifying their outputs in the subsequent layers. Larger capacity (more temporal information) can improve the RCN's performance in detecting specific AEs such as LFM. Figure 4 shows a diagram of this architecture. The RCN processes a timestamp vector of frequency values (N_freq × 1) and uses the provided labels (N_modes × 1) to train the readout layer. The outputs of the first RCN are scores for each of the five AEs. This output vector is then fed into the second RCN as input and the second readout layer is trained using the same labels. The final outputs are rectified scores for each AE.
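The stacking idea can be sketched with a small echo state network written in plain NumPy. This is not the PyRCN API and not the network trained in this work: the class name, reservoir sizes, hyperparameter values and toy data are hypothetical, and the readout is a generic ridge regression.

```python
import numpy as np

class TinyESN:
    """Minimal echo state network with a ridge-regression readout (illustrative only)."""

    def __init__(self, n_in, n_res, input_scaling=1.0, spectral_radius=0.9,
                 bias_scaling=0.0, leakage=1.0, alpha=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = input_scaling * rng.uniform(-1, 1, (n_res, n_in))
        W = rng.uniform(-1, 1, (n_res, n_res))
        # Rescale the random recurrent weights to the requested spectral radius.
        self.W = W * spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.b = bias_scaling * rng.uniform(-1, 1, n_res)
        self.leakage = leakage
        self.alpha = alpha
        self.W_out = None

    def _states(self, U):
        """Run the reservoir over a sequence U of shape (time, n_in)."""
        r = np.zeros(self.W.shape[0])
        R = np.ones((len(U), r.size + 1))  # last column stays 1 (intercept)
        for k, u in enumerate(U):
            pre = self.W_in @ u + self.W @ r + self.b
            r = (1 - self.leakage) * r + self.leakage * np.tanh(pre)
            R[k, :-1] = r
        return R

    def fit(self, U, D):
        """Train only the linear readout; the reservoir weights stay fixed."""
        R = self._states(U)
        A = R.T @ R + self.alpha * np.eye(R.shape[1])
        self.W_out = np.linalg.solve(A, R.T @ D)
        return self

    def predict(self, U):
        return self._states(U) @ self.W_out

# Toy data: 300 time steps, 8 "frequency bins", 5 binary mode labels.
rng = np.random.default_rng(1)
U = rng.random((300, 8))
D = (U[:, :5] > 0.7).astype(float)

layer1 = TinyESN(n_in=8, n_res=100).fit(U, D)
scores1 = layer1.predict(U)              # first-layer scores per mode

# The second, smaller reservoir takes the first-layer scores as its input
# and is trained against the same labels.
layer2 = TinyESN(n_in=5, n_res=20, leakage=0.5, seed=2).fit(scores1, D)
scores2 = layer2.predict(scores1)        # final scores per mode

print(scores1.shape, scores2.shape)
```

Only the readout weights are fitted in each layer, which keeps training to a single linear solve per layer.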
The hyperparameter optimization strategy closely follows the method described in section 3c of [47]. Table 1 shows the results from the hyperparameter optimization routine for both layers. The process uses a three-step sequence of searches for the hyperparameters input scaling, spectral radius, bias scaling and leakage. The steps of the method are as follows:
1. Perform a random search for the input scaling and spectral radius while the bias scaling and leakage terms are held constant.
2. Fix the leakage to 1 and search for the bias scaling.
3. Search for the leakage term.
Step 1 determines the balance between forward and recurrent connections, step 2 can introduce more non-linearity into the system, and step 3 determines the attention the network gives to temporal information in the inputs. The hyperparameters at each step are selected based on the minimization of the Mean Squared Error (MSE) curve.
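The three-step sequence can be sketched generically as follows. The search ranges, the number of random samples and the grid spacings are hypothetical, and `toy_mse` is a stand-in objective; in practice the callable would train a network with the given hyperparameters and return its validation MSE.

```python
import numpy as np

def sequential_search(mse, n_random=200, seed=0):
    """Three-step sequential search; `mse` maps a hyperparameter dict to a score."""
    rng = np.random.default_rng(seed)
    base = {"input_scaling": 1.0, "spectral_radius": 1.0,
            "bias_scaling": 0.0, "leakage": 1.0}

    # Step 1: random search over input scaling and spectral radius,
    # with bias scaling and leakage held constant.
    candidates = [dict(base,
                       input_scaling=rng.uniform(0.01, 2.0),
                       spectral_radius=rng.uniform(0.0, 1.5))
                  for _ in range(n_random)]
    best = min(candidates, key=mse)

    # Step 2: leakage fixed to 1, scan the bias scaling.
    best = min((dict(best, leakage=1.0, bias_scaling=b)
                for b in np.linspace(0.0, 2.0, 21)), key=mse)

    # Step 3: scan the leakage with everything else fixed.
    best = min((dict(best, leakage=a)
                for a in np.linspace(0.05, 1.0, 20)), key=mse)
    return best

def toy_mse(p):
    """Hypothetical smooth objective with a known minimum, used for the demo only."""
    return ((p["input_scaling"] - 0.4) ** 2 + (p["spectral_radius"] - 0.9) ** 2
            + (p["bias_scaling"] - 0.5) ** 2 + (p["leakage"] - 0.3) ** 2)

best = sequential_search(toy_mse)
print(best)
```

Searching the hyperparameters in sequence rather than jointly keeps the number of trained networks small, at the cost of possibly missing interactions between parameters fixed in earlier steps.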
The LSTM model consists of three layers using 64 Long-Short Term Memory cells, one dropout layer with dropout probability of 0.5, and four layers using MLPs with 128 nodes per layer. All hidden layers utilize ReLU activation functions, and the weights are initialized using uniform variance scaling [35]. The prediction layer consists of 5 nodes with sigmoid activations and outputs scores for each AE. Binary cross-entropy is used to evaluate the loss score, and the Adam optimizer with a learning rate of 0.0001 tunes the weights. The hyperparameters for this network are shown in table 2, and they are optimized by sequentially scanning values and analyzing predictions over three selected discharges.

Results
A data-driven convention is initially implemented to evaluate the performance of the model, and a detailed analysis of the prediction errors follows. Thus, the TP and FP metrics are modified such that the time slice of each prediction is reassigned if any AE label information is available within a window of ±140 ms of the predicted time slice. Two examples at the end of this section show a few observed errors. Also, comparisons here evaluate performance over all four CO2 interferometer chords. The classification results that compare the RCN and LSTM model for simple and crosspower spectrograms are summarized in table 3. The results for the simple spectrograms are nearly equal to or better than those for crosspower spectrograms for the RCN and LSTM model. Also, the RCN performs slightly better than the LSTM model when using simple spectrograms. The LSTM can trigger slightly stronger predictions than the RCN. This is visible in the slightly higher TPR for EAE and FPR for TAE and RSAE. Figure 5 is a specific example with a lot of AE activity that demonstrates the feature set comparison using the RCN and LSTM models.
Both models might be slightly overestimating detection of TAE and RSAE since the FPR is relatively higher for these AEs. Since the RCN triggers slightly lower, the overestimation effect is smaller. In regions where the AI failed, this is likely due to several reasons: (1) models are overfitting to training data, (2) noise in the CO2 interferometer spectrograms, (3) latency associated with sparse time stamps, and (4) general AI error. However, there are cases where the AI is working well, but an error is assigned. Possible reasons for this are the following: an incorrect value is assigned to the curated database through (A) the ∆t label extension or (B) calling no label stable, and (C) some cases can be ambiguous. Figure 6 illustrates some of these points. Despite these issues, both models are capable of learning the patterns associated with AEs in this database and achieve high performance.
The RCN model demonstrates better results, and additional advantages include finer resolved predictions and a training speed that can be faster than for an LSTM. The RCN substantially improved on the baseline technique using only linear regression since the memory of the model is higher with the addition of reservoirs containing random recurrent connections. Figure 7 shows the effect of adding a second layer to the RCN model. This effect is similarly observed when adding a third layer and increasing the number of nodes to 64 for the LSTM model.
In an effort to determine the set of chords with the highest AE detection, we check the performance for one chord, two-stacked chords or one crosspower combination using the F2 score. This metric is a weighted harmonic mean of the recall (TPR) and precision (TP/(TP + FP)) metrics, where β = 2 in the following equation:

$$F_\beta = (1 + \beta^2)\,\frac{\mathrm{precision} \cdot \mathrm{recall}}{\beta^2\,\mathrm{precision} + \mathrm{recall}}.$$

The metrics TP and FP are further modified here by an additional ∆t = ±71 ms in the calculation of the F2 score to capture more information per chord from the discharges. These values are collected into a confusion matrix shown in figure 8.
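A small worked example may help to show how the choice β = 2 weights recall more heavily than precision. The counts below are hypothetical and are not taken from the results of this paper.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from raw counts; beta > 1 weights recall above precision."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical counts: recall = 0.90, precision = 0.75.
f2 = f_beta(tp=90, fp=30, fn=10, beta=2.0)
f1 = f_beta(tp=90, fp=30, fn=10, beta=1.0)
print(f"F2 = {f2:.3f}, F1 = {f1:.3f}")
```

With these counts the F2 score lies closer to the recall than the F1 score does, so a detector that misses few modes is rewarded even if it raises some false alarms.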
For the upper diagonal (crosspower), the best performing combinations are V2R0, V3R0 and V2V2 (autopower). The anecdotal favorite combination in the control room during experiments at DIII-D is V2R0 and the RCN model scores highest for this combination. For the lower diagonal (stacking outputs), adding predictions from V2 to V1 and any chord to R0 slightly improves the performance of V1 and R0, respectively. Although these differences are small, additional AE information from different chords might be needed when predicting using chords V1 or R0. Lastly, the darkest shaded region indicates that predictions for chord V2 achieve the best performance.

Analysis of metadata
Additional information in the Large AE-EP Database can be used to study model interpretability, and correlations between misclassification and operating regime parameters. The following inferred and experimental data are available [35]: (1) EFIT equilibrium reconstructions [50] provide plasma shape, magnetic field, and beta information, (2) kinetic temperature, plasma rotation, and electron and impurity densities from between-shot profile fitting algorithms, and (3) information about neutral beams such as injected power, energy, voltage and orientation. The goal is to determine if there are any tendencies with misclassification by calculating Pearson correlation coefficients, r, between TPR and FPR with all 68 parameters in the database for the validation set. Although many parameters have coefficient values near zero for both TPR and FPR, we report parameters with the highest values here. For the AE labels, BAE has the strongest Pearson correlation coefficient with values of −0.22 and −0.21 for TPR and FPR, respectively. For the plasma parameters, the strongest correlation for TPR is with Pitch-Angle-Scattering (PAS) time on axis, and for FPR is with an analytical calculation of the BAE frequency; see figure 9. The r for PAS with TPR and BAE frequency with FPR are 0.20 and −0.17, respectively. In both cases, there are about 500 points used in the comparison, and an |r| ≃ 0.20 indicates that the dependence is either non-existent or very weak. Thus, there is no evidence of any dependence on the operating regime, suggesting that we could safely use the identifier throughout this parameter range and likely somewhat beyond.
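To give a feeling for how weak a correlation of |r| ≃ 0.20 is, the following sketch computes the Pearson coefficient for about 500 synthetic points constructed to have a population correlation of 0.20. The data are synthetic and the variable names are hypothetical; no database values are used.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

# Synthetic stand-ins: x plays the role of a metadata parameter and
# y the role of a per-shot metric, built so that the true correlation is 0.20.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = 0.2 * x + np.sqrt(1 - 0.2 ** 2) * rng.normal(size=n)

r = pearson_r(x, y)
print(f"r = {r:.2f}, r^2 = {r * r:.3f}")
```

Since r² is the fraction of variance explained by a linear fit, r = 0.20 corresponds to only about 4% of the variance, which is consistent with describing the dependence as very weak.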

Conclusion
Recurrent neural networks are trained using CO2 interferometer data and labels from the Large AE-EP Database on the DIII-D tokamak. Two models (RCN and LSTM) are trained separately using simple and crosspower spectrograms. The additional steps required to calculate crosspower are unnecessary since the predictions are similar for both types of inputs. Both models are trained using one CO2 chord per training step and achieve high performance. The RCN performance is slightly higher with TPR = 90% and FPR = 14%. Detection using any single chord is feasible (V2 is slightly better than the other three). Since the model is primarily trained using labels marked during the current ramp phase, more cases labelling the steady-state portion of the discharge would improve generalizability. Lastly, analysis of the metadata demonstrates that the RCN model still works at the limits of the experimental parameter ranges.
The CO2 interferometer is commonly used in fluctuation analysis, acquires data for nearly every DIII-D experiment, is available in the PCS in real time, and does not have issues with cutoff frequencies. Given these results and advantages, it is strongly recommended to detect AEs using RCNs trained with simple magnitude spectrograms calculated using the vertical chord passing near center (Rm = 1.94 m at DIII-D).
Future work would consist of implementing the RCN reported in this paper into real-time control algorithms to detect AEs at DIII-D. The SOTA detector currently installed on the PCS is an RCN that was trained using ECE data with 8000 and 500 nodes for layers 1 and 2, respectively. It processes time-domain signals and makes predictions in approximately 400 microseconds for each time step. The RCN developed in this work is smaller for both layers (4000 and 50 nodes). Although there is an additional step of calculating spectrograms, the RCN trained using CO2 interferometer data could have a similar or faster response time during real-time experiments. Implementation of Fast Fourier Transforms into the PCS is currently under consideration and we plan to test it in the near future.

Disclaimer
This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

Appendix. Linear regression, MLP and Convnets
The crosspower spectrograms were initially used to train Linear Regression, Multi-Layer Perceptron (MLP) and Convolutional Neural Networks. For regression, Tikhonov regularization is used as follows (written here in its standard closed form, with samples along the rows of R and D):

$$W = \left(R^{\mathsf{T}} R + \alpha I\right)^{-1} R^{\mathsf{T}} D,$$

where W denotes the regression weights, R is the data, D are the labels and α is the regularization parameter. For the MLP network, the models are nearly the same as the LSTM model described in section 3.2, only without the LSTM block. Lastly, the convnet contained a sequence of 5 convolutional and max pooling layers followed by a small MLP block for the final classification. All models were capable of detecting the most common modes (TAE and RSAE) and struggled to detect the other three modes (BAE, EAE and LFM). This is likely due to the lack of memory in the models. However, the convnet was capable of performing slightly better when trained over the entire discharge (similar to the popular cat-dog classification problem) instead of the windowing technique described in section 3.2. Although temporal information is lost when training over the entire discharge, the convnet is capable of predicting the presence of these modes within approximately 10% of the results reported in this paper.
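For reference, a Tikhonov-regularized (ridge) regression can be solved in closed form in a few lines of NumPy. The sketch assumes the common convention with one sample per row of the data matrix R and of the label matrix D; the function name and the toy data are hypothetical.

```python
import numpy as np

def tikhonov_fit(R, D, alpha):
    """Solve min_W ||R W - D||^2 + alpha ||W||^2 via the normal equations."""
    n_features = R.shape[1]
    return np.linalg.solve(R.T @ R + alpha * np.eye(n_features), R.T @ D)

# Toy problem: 50 samples, 4 features, 2 outputs, noiseless labels.
rng = np.random.default_rng(0)
R = rng.normal(size=(50, 4))
W_true = rng.normal(size=(4, 2))
D = R @ W_true

W_plain = tikhonov_fit(R, D, alpha=0.0)    # ordinary least squares limit
W_ridge = tikhonov_fit(R, D, alpha=10.0)   # regularized weights are shrunk

print(np.linalg.norm(W_plain), np.linalg.norm(W_ridge))
```

Increasing α shrinks the weights towards zero, trading a slightly worse fit on the training data for better conditioning of the solve.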

Figure 1. The elevation view of the CO2 interferometer installed on DIII-D for shot #178631. Three vertical chords are located at Rm of 1.48 m, 1.94 m and 2.10 m, and the radial chord is horizontal on the midplane. The black curves are the magnetic flux surfaces (the last closed flux surface is in blue). The magnetic axis is denoted by the blue × symbol.

Figure 2. The CO2 interferometer part of the AE-EP database begins and ends with DIII-D shot #132215 and #178880, respectively. The presence of AEs is plotted against these chronological shot numbers. In this figure, we show that TAE and RSAE are labelled frequently across many experimental campaigns. LFM, BAE and EAE have relatively sparse representation in the database.

Figure 3. The occurrences of labels for the training set (801 shots) and validation set (268 shots) are skewed towards TAE and RSAE. The sets are randomly shuffled to preserve distribution shape. In comparison, there are barely any LFM or EAE instances throughout the database.

Figure 4. Schematic of the stacked two-layer RCN used to classify AEs trained with simple and crosspower spectrograms. The input layer of the first RCN is connected to a reservoir of nonlinear neurons and gets mapped to a higher dimensional space, where the data are more separable. The readout layer of the first RCN is trained using linear regression and its outputs are processed as inputs for the second RCN. The second reservoir consists of fewer neurons since less model capacity is needed to rectify the mistakes of the first layer. The final outputs are AE scores.

Figure 5. A comparison of the raw RCN and LSTM predictions using simple magnitude (panel (a)) and advanced crosspower (panel (b)) spectrograms for shot #178631. The simple spectrogram is calculated for chord V2 and the crosspower is between chords V2 and R0. The red vertical ticks and horizontal strikethroughs indicate the curated time stamp and label, respectively. The purple pixels are raw predictions for the RCN and LSTM models. Regions where the purple pixels overlap the red strikethroughs are considered good agreement. The dotted regions are times where the curated database does not indicate anything, yet the model is robust enough to capture the AE activity observed in the spectrograms.

Figure 6. AE labels, thresholded predictions and simple magnitude spectrograms for shot #170669. The colored predictions are denoted as follows: TP = green, FP = orange, FN = red, and TN = black. White vertical lines in the spectrograms indicate the original timestamp. Error type 1 is due to effects from overfitting, since the model could be triggering scores for LFM due to the overall pattern of the discharge. Error type 2 occurs due to noise in the spectrograms. Error type 3 is attributed to time delays for predictions. Error type 4 is categorized as a general AI error, where the model failed to predict correctly. Letter A indicates an incorrectly assigned error since there is still activity but the ∆t extension of the label is too short. Letter C also indicates an incorrectly assigned error due to ambiguity in the discharge.

Figure 7. LFM and EAE predictions using the RCN model for shots #178636 and #175985 for CO2 chord V2. A second reservoir rectifies the mistakes made by the first layer and produces better predictions for the least common modes in the database.

Figure 8. F2 scores for the crosspower (upper diagonal), stacked chords (lower diagonal) and single chord (right vertical bar) comparison using the RCN model. Stacking chords can perform better than crosspower, and chord V2 performs slightly better than the other three chords.

Figure 9. Points for the strongest Pearson correlation coefficient, r, in the comparison between AE metrics (TPR and FPR) and metadata are shown here. In panel a, the pitch-angle scattering (PAS) time is the 90-degree scattering time in the NRL Formulary [51]. The r between PAS and TPR is 0.20. In panel b, the BAE frequency is from equation (1) of [52], and the r with FPR is −0.17. Since most of the analysis shows low correlation values, concerns regarding the RCN model failing to predict AEs at the limits of the parameter range are alleviated.

Table 1. The results from the hyperparameter optimization routines used to train the RCN network. A sequential search hyperparameter optimization strategy is used to train the readout layers of the stacked two-layer RCN. Final values for each hyperparameter and each layer are reported in the final two columns.

Table 2. Similar to table 1, only for the LSTM network. A simple sequential scan is implemented here. Final values are listed in the last column.

Table 3. Comparison of results using simple and crosspower spectrograms for the RCN and LSTM models. RCN predictions are made binary (0 and 1) using AE score thresholds of 0.05, 0.15, 0.11, 0.07 and 0.08 for the five AEs listed in the left column. Similarly, the LSTM threshold values are 0.06, 0.13, 0.13, 0.10 and 0.07. The RCN trained using simple spectrograms is the top performer.