Data mining techniques for quality improvement of electron beam welding process

Besides fulfilling the technological requirements for the geometry of welded joints obtained by electron beam welding, it is necessary to avoid process conditions that are more likely to lead to defects. It is assumed that defects are more probable under some regime conditions. To model the dependence of a binary quality characteristic (the appearance of defects) on the process parameters, two modelling approaches are applied and compared: logistic regression and neural networks. The resulting models are used to predict the appearance of defects as the electron beam process parameters vary.


Introduction
The multicriterial optimization tasks concerning the quality characteristics of welded joints obtained by the electron beam welding (EBW) process include requirements for the geometry of the molten zone and/or the heat-affected zone (if present), as well as requirements for the absence of defects [1] or standard requirements and guidelines [2].
Different types of defects, formed during the deep penetration of the electron beam into the processed material(s), can be defined [3][4][5]: internal root defects, voids or cavities caused by spiking, gas porosity, lack of, incomplete or excess penetration, cavities, electron beam gun discharge defects, slag inclusions, spatter, undercut/overlap, warpage, etc. ISO 6520-1:2007 [6] gives the basis for a precise classification and description of weld imperfections, and ISO 13919-2:2001 [7] gives guidance on quality levels for imperfections. The appearance of defects is strongly influenced by the shape of the welded joints (depth and width) [3][4][5], which results from the choice of operating EBW process parameters.
Several nondestructive testing (NDT) methods can be applied to estimate defectiveness: ultrasonic testing, radiographic testing, eddy current testing and visual testing [3], magnetic particle inspection, acoustic emission, etc. The implementation of each NDT method should be combined with determination of the probability of detection (POD) of the defects and of the image quality indicators (IQI). Such NDT systems can be integrated into automatic production lines, e.g. in the automotive industry.
In our previous research papers, different modelling approaches were implemented to estimate the dependence of the appearance of defects in the produced welds on the EBW process parameters (electron beam power, welding velocity, and the distances from the main surface of the magnetic lens of the electron gun to the beam focusing plane and to the sample surface): discriminant analysis [1], neural networks [8,9] (implementing different structures and training methods) and regression analysis [1].
To model the dependence of a binary quality characteristic (the appearance of defects) on the process parameters, two data mining methodologies are applied and compared in this paper: logistic regression and neural networks. The influence on the appearance of defects of the choice of the EBW process parameters (electron beam power, welding velocity, and the distances from the magnetic lens to the surface of the treated stainless-steel samples and to the focus of the electron beam) is considered. Identifying and avoiding the unfavorable regions of the process parameters is a necessary condition for improving the quality of the electron beam welding process.

EBW classification models
Industrial digitalization based on Digital Twins (DT) aims at virtual representations of real physical objects, manufacturing processes or services. The DT should be highly integrated with the physical objects through dynamic two-way real-time communication. This includes the implementation of analytical, simulation, management and decision-making tools, as well as access to and storage of numerous sources of data and information (including historical data) [10,11]. To develop a fully functional DT for the EBW manufacturing process, different data mining methodologies, tools and techniques should be implemented for analyzing the experimental and simulated data stored in EBW databases, with the aim of extracting new useful information by revealing deep and hidden relationships and patterns.
Two classification data mining modeling approaches are implemented and compared: the logistic regression classification algorithm with ridge (L2) regularization, and neural network models with a single hidden layer and with three hidden layers, trained by a multi-layer perceptron (MLP) algorithm with backpropagation.
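As an illustration of the first approach, ridge (L2) regularized logistic regression can be sketched in a few lines of plain Python. This is only a minimal sketch under stated assumptions: the data are synthetic, the penalty weight `lam`, the learning rate and the number of epochs are arbitrary illustrative choices, and batch gradient descent stands in for whatever solver the data mining software actually uses.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_ridge_logistic(X, y, lam=0.1, lr=0.1, epochs=2000):
    """Minimize mean log-loss + (lam / 2) * ||w||^2 by batch gradient descent.
    The bias b is not penalized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        # The ridge term adds lam * w to the gradient of the weights.
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict_proba(w, b, x):
    # Estimated probability of class 1 for one input vector x.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

# Synthetic, made-up data: two features scaled to [-1, 1] (not EBW measurements).
random.seed(0)
X = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(60)]
y = [1 if x[0] + 0.5 * x[1] > 0 else 0 for x in X]

w, b = fit_ridge_logistic(X, y, lam=0.1)   # with ridge penalty
w0, b0 = fit_ridge_logistic(X, y, lam=0.0)  # without penalty, for comparison
accuracy = sum((predict_proba(w, b, x) > 0.5) == bool(t) for x, t in zip(X, y)) / len(X)
print("ridge weights:", w, "unpenalized weights:", w0, "training accuracy:", accuracy)
```

The penalty shrinks the weights toward zero, which limits overfitting when the number of experiments is small.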
The analyzed experiment is EBW of 1H18NT stainless steel samples [12] at an accelerating voltage of 70 kV. In total, 81 weld cross-sections are investigated. The varied process parameters are the electron beam power P, the welding velocity v, and the distances from the main surface of the magnetic lens of the electron gun to the beam focusing plane zo and to the sample surface zp. The experimental dataset was split into two parts: 85% for model training (80%) and validation (20%), and 15% held out for independent testing of the estimated models.
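Binary logistic regression (LR) transforms the dichotomous variable $y$ ($y = 0$ for absence of defects, $y = 1$ for the presence of one or more defects) into a latent continuous variable $\hat{z}$ (figure 1). The estimated first-order logit model has the general form [13]:
$$\hat{z}(\mathbf{x}) = \ln\frac{\hat{P}(y = 1 \mid \mathbf{x})}{\hat{P}(y = 0 \mid \mathbf{x})} = \hat{\theta}_0 + \sum_{i=1}^{4} \hat{\theta}_i x_i ,$$
where $\mathbf{x} = (P, v, z_o, z_p)$ is the vector of process parameters. Then the probabilities $\hat{P}(y = 0 \mid \mathbf{x})$ or $\hat{P}(y = 1 \mid \mathbf{x})$ can be estimated by:
$$\hat{P}(y = 1 \mid \mathbf{x}) = \frac{1}{1 + e^{-\hat{z}(\mathbf{x})}}, \qquad \hat{P}(y = 0 \mid \mathbf{x}) = 1 - \hat{P}(y = 1 \mid \mathbf{x}).$$
The corresponding predicted value is $\hat{y}(\mathbf{x}) = 0$ if $\hat{P}(y = 0 \mid \mathbf{x})$ exceeds the accepted threshold value of 0.5, or $\hat{y}(\mathbf{x}) = 1$ if it does not exceed it.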
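The step from the latent variable to a class prediction can be illustrated with a short sketch. The coefficients and the input vector below are hypothetical placeholders, not the estimated model, and the sign convention (the logit refers to the class y = 1) is an assumption.

```python
import math

def defect_probability(theta0, theta, x):
    """P(y = 1 | x) for a first-order logit z = theta0 + sum(theta_i * x_i)."""
    z = theta0 + sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(p_defect, threshold=0.5):
    """Return 0 (no defects) if P(y = 0 | x) exceeds the threshold, else 1."""
    p_no_defect = 1.0 - p_defect
    return 0 if p_no_defect > threshold else 1

# Hypothetical coefficients and regime: P [kW], v [cm/min], zo [mm], zp [mm].
theta0 = 0.0
theta = [0.1, -0.01, 0.002, -0.003]
x = [6.0, 50.0, 226.0, 226.0]

p1 = defect_probability(theta0, theta, x)
print(f"P(y=1|x) = {p1:.3f}, P(y=0|x) = {1 - p1:.3f}, class = {predict_class(p1)}")
```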
Different structures of neural networks (NN) are considered and the best trained and tested NN models are obtained for the NN structures with one and three hidden layers, which are presented in figure 2.
Several prediction accuracy measures are used to compare the estimated classification models: area under the receiver operating characteristic curve (AUC), classification accuracy (CA), F1 score, Precision (Pr) and Recall (Re) [10].
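For reference, CA, Pr, Re and F1 follow directly from the confusion counts (TP, FP, TN, FN). The sketch below uses the standard definitions with the class y = 1 (defects) treated as positive; the counts are made up for illustration and are not taken from the tables of this paper. Software packages may report values averaged over both classes, which can differ from these single-class values.

```python
def classification_measures(tp, fp, tn, fn):
    """Standard measures for the positive class from confusion counts."""
    total = tp + fp + tn + fn
    ca = (tp + tn) / total                       # classification accuracy
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"CA": ca, "Pr": precision, "Re": recall, "F1": f1}

# Made-up counts, for illustration only.
m = classification_measures(tp=8, fp=2, tn=9, fn=1)
print({k: round(v, 3) for k, v in m.items()})
```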
The LR model can be used to determine the probability of the appearance of defects. The nomograms in figure 3 are obtained with the Orange data mining software [14] and make it possible to calculate quickly the probabilities of obtaining defect-free welds (0) or joints with one or more defects (1), based on the first-order logistic regression models. An analogous nomogram can be built for the case y = 1 to estimate the probability of having defect(s) in the weld under given process parameter conditions.
Table 2 presents the calculation for arbitrary EBW process parameter values, with the focus of the electron beam at the surface of the welded sample. The total sum of points is 92.09, corresponding to a probability of 82%, and the prediction is for a working regime that results in a defect-free weld. The best trained, validated and tested NNs were found for structures with one (NN1) and three (NN3) hidden layers, presented in figure 2. These NNs can be applied to predict the appearance of defects. Figure 4 shows the predictions from NN1 for the distance to the focus zo = 176 mm and the distance to the sample surface zp = 326 mm ('o': no defects, '*': with defects). This operating regime corresponds to a focus position high above the surface of the welded sample. The other classification models, LR and NN3, predict no defects at the chosen focus position and distance to the sample for all values of the electron beam power and the welding velocity.
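A prediction map of this kind can be produced by scanning a grid of power and velocity values at fixed zo and zp and marking each regime with the predicted class. The sketch below shows only the scanning logic; `toy_classifier` is a hypothetical stand-in rule, not the trained NN1, and the grid points are arbitrary values inside the investigated ranges.

```python
def scan_regimes(classify, powers, velocities, zo, zp):
    """Return (power, row) pairs; each row marks 'o' (no defects) or '*' (defects)."""
    rows = []
    for p in powers:
        row = "".join("*" if classify(p, v, zo, zp) == 1 else "o" for v in velocities)
        rows.append((p, row))
    return rows

def toy_classifier(p, v, zo, zp):
    # Hypothetical rule for illustration only; NOT a fitted model.
    return 1 if p / v > 0.2 else 0

powers = [4.2, 5.6, 7.0, 8.4]      # kW
velocities = [20, 40, 60, 80]      # cm/min
rows = scan_regimes(toy_classifier, powers, velocities, zo=176, zp=326)
for p, row in rows:
    print(f"P = {p:4.1f} kW  {row}")
```

In practice `classify` would wrap the prediction call of the trained model.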
The prediction accuracy of all estimated models is compared in table 3, using several measures: area under the receiver operating characteristic curve (AUC), classification accuracy (CA), F1 score, Precision (Pr) and Recall (Re). For training and validation of the models, 85% of the initial experimental set (69 experiments) is used. These experiments were divided into two parts: 80% for training the models and 20% for validation. The validation set is used during training to select the model according to how well it performs on this dataset. The remaining 15% of the initial experimental set (12 experiments) was held out and used for testing the estimated (trained and validated) models on this independent set of data. The prediction accuracy results from the testing are also given in table 3 and measure the generalization accuracy of the selected models on data not seen during training and validation.
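The two-stage split can be sketched as follows. The 81/69/12 counts come from the text; the 55/14 division of the 69 experiments is what rounding 80%/20% gives and is an assumption, as are the random seed and the use of a simple shuffle rather than a stratified split.

```python
import random

def split_dataset(n_total=81, holdout_frac=0.15, val_frac=0.20, seed=42):
    """Shuffle experiment indices, hold out a test set, then split the rest
    into training and validation subsets."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    n_test = round(n_total * holdout_frac)   # 81 * 0.15 = 12.15 -> 12
    test = indices[:n_test]
    rest = indices[n_test:]                  # 69 experiments
    n_val = round(len(rest) * val_frac)      # 69 * 0.20 = 13.8 -> 14
    val = rest[:n_val]
    train = rest[n_val:]                     # 55 experiments
    return train, val, test

train, val, test = split_dataset()
print(len(train), len(val), len(test))
```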
Figure 5 presents the ROC analysis of the classification models graphically; the area under the receiver operating characteristic curve (AUC) is largest for NN3. This can also be seen from the AUC values in table 3 for training and validation of NN3.
Table 4 presents the classification results (TP, FP, TN, FN) for NN3 and LR, which coincide. The results show that the multilayer neural network NN3 and the logistic regression model give better results than NN1 during training and validation, as well as on the independent testing set, which is not used during training and validation of the classification models.
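The AUC equals the probability that a randomly chosen defective weld receives a higher predicted score than a randomly chosen defect-free one (ties count one half). A minimal sketch of this rank-based computation, with made-up labels and scores, is:

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up labels and predicted probabilities of class 1, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly -> 0.75
```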

Conclusion
Different machine learning methodologies were implemented and compared for obtaining defect-free welds by selection of appropriate electron beam welding process parameters: electron beam power, welding velocity, the distances from the magnetic lens to the surface of the treated stainless-steel samples and to the focus of the electron beam.
Among the considered process parameters, the distance from the magnetic lens to the sample surface contributes most to obtaining defect-free welds, followed by the electron beam power. A first-order logistic regression cannot account for interactions between the process parameters. For that purpose, a second- or higher-order model can be applied [13], which may improve the prediction properties of the classification model. The multi-layer NN can also be improved further by considering more complex NN structures, which requires new experimental data for the investigated stainless steel.
The optimization of the electron beam welding process can be done by integrating different technological requirements for the geometry of the welded joints and minimization of the variation of the quality indicators under production conditions, with the additional constraint of the absence of defects. Multicriterial parameter optimization methods can be applied for the simultaneous fulfillment of all requirements for all quality characteristics.
Another data mining approach to the classification of defects is to train the classification models on images of weld cross-sections without and with observed defects. This approach requires an Image Embedding tool, which takes the loaded images as input and applies a pre-trained neural network (Inception v3) to extract meaningful features from the images [14]. These features are then used as inputs for further analysis, clustering (unsupervised machine learning) or classification (supervised machine learning) tasks.
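To make the difference between first- and second-order models concrete: with four process parameters, a full second-order model adds four squared terms and six pairwise interaction terms to the four linear ones. The sketch below builds these 14 terms for one regime; the parameter values are arbitrary points inside the investigated ranges, and the function name is illustrative.

```python
from itertools import combinations

def second_order_terms(x, names):
    """Linear, squared and pairwise interaction terms for one regime."""
    features = dict(zip(names, x))
    for name, value in zip(names, x):
        features[f"{name}^2"] = value * value
    for (n1, v1), (n2, v2) in combinations(zip(names, x), 2):
        features[f"{n1}*{n2}"] = v1 * v2
    return features

names = ["P", "v", "zo", "zp"]
x = [6.3, 50.0, 226.0, 226.0]   # kW, cm/min, mm, mm (arbitrary example values)
terms = second_order_terms(x, names)
print(len(terms), sorted(terms))
```

A logistic regression fitted on these expanded terms is still linear in its coefficients, so the same estimation procedure applies; the larger number of coefficients is one reason why additional experimental data may be needed.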
The process parameters are varied in the following ranges: electron beam power P: [4.2, 8.4] kW; welding velocity v: [20, 80] cm/min; the distances from the main surface of the magnetic lens of the electron gun to the beam focusing plane zo: [176, 276] mm and to the sample surface zp: [126, 326] mm. The experimental observations are separated into two groups (classes), without considering the type and the number of the defects: y = 0 (without defects) and y = 1 (with defects).

Table 2. Prediction of the probability for defect appearance by LR classification model.