Process identification of laser powder bed fusion anomalies based on process-monitored melt pool radiation intensity data

Laser powder bed fusion (LPBF) has great application prospects in aerospace and other fields, but low process stability and difficult quality assurance are the key problems that currently restrict its wider adoption. One important way to address these problems is online monitoring of the molten pool combined with closed-loop quality control. In this paper, a photodiode sensor is used to collect the radiation signal of the molten pool in real time, and time-domain and frequency-domain features of the radiation intensity signal are extracted. The most appropriate modeling features are selected by comparing different feature sets, and the XGBoost ensemble algorithm is used to model the relationship between the molten pool radiation intensity signal and the process parameters, which provides a new method for abnormal process identification and quality control.


Introduction
Laser Powder Bed Fusion (LPBF) is a powder bed fusion process that uses a high-intensity laser as the energy source to selectively melt and fuse metal powder [1]. It can quickly and economically manufacture metal parts with complex geometries, and it shows good application prospects in the aerospace, automotive, biomedical and shipbuilding fields [2]. LPBF has become one of the important processes for producing metal components for aerospace equipment. However, during LPBF forming the metal powder melts and solidifies at a very high rate under the action of the laser, and the temperature and phase of the molten pool change drastically, which makes it easy for defects to form inside the parts and difficult to guarantee quality [3].
At present, pyrometers, photodiodes, CCD cameras and other sensors are widely used at home and abroad to monitor the manufacturing process online and thereby ensure the process stability and quality consistency of additive manufacturing [4]. Gunenthiram et al. [5,6] monitored the LPBF process with a high-speed camera, studied the generation mechanisms of spatter, powder denudation and molten pool fluctuation based on image characteristics, and proposed manufacturing in the heat conduction mode to reduce spatter and improve product quality. Kriczky et al. [7] used a dual-wavelength pyrometer to obtain coaxial thermal images of the molten pool, extracted temperature-related features such as temperature gradient, aspect ratio, peak temperature and molten pool area, and carried out spatial reconstruction to evaluate the forming quality. Stutzman et al. [8] set up a multi-sensor system that combines a spectrometer with a CCD camera to capture plume radiation for real-time defect detection and quality evaluation. Shevchik et al. [9] used a fiber Bragg grating (FBG) sensor to collect acoustic signals during the LPBF process, established the relationship between the acoustic signals and porosity, and thereby monitored the porosity of components during the process [10]. Deep learning was used to identify defects, and the quality of each layer was predicted from the acoustic signals.
Among the various monitoring methods, photodiode-based online monitoring equipment is better suited to large-scale application in additive manufacturing because of its high sensitivity, good robustness, small data volume, short response time and fast data acquisition [11]. The basic principle of photodiode-based online monitoring in LPBF [12,13] is that the photodiode converts the molten pool light radiation detected in real time into a voltage signal. Because this signal carries a large amount of real-time information about the molten pool, monitoring and analyzing it can reflect the internal dynamic state of the molten pool accurately and in real time, which helps in understanding the LPBF forming process and further optimizing the process parameters [14]. In this paper, photodiode-based online monitoring equipment is used to collect the radiation signal of the molten pool during LPBF, features are extracted in the time domain and frequency domain, and a feature combination is selected. The XGBoost ensemble algorithm is then used to identify the LPBF forming process, so that abnormal process identification and closed-loop quality control of the LPBF process can be realized.

Experimental design
The experiments were carried out on an EP-M250 laser powder bed fusion machine produced by Beijing E-plus 3D Technology Co., Ltd. The material is K438 superalloy powder, which has excellent hot corrosion resistance and creep resistance, good high-temperature strength and microstructural stability, and is widely used in hot-end parts of gas turbines and aero-engines. Fifteen sets of printing process parameters were designed, and two samples were printed for each set, giving 30 samples in total. Each sample measured 10 × 10 × 40 mm, and the number of printed layers was 100. The process parameters are shown in the following table. The test samples are shown in Figure 1. According to the parameters, sample blocks 3-1, 3-2, 8-1, 8-2, 13-1 and 13-2 lie within the optimal parameter window, while the parameters of the other samples deviate from it.

Signal acquisition
During LPBF forming, a PDA10A2 silicon photodiode sensor produced by Thorlabs (USA) was used to collect the light radiated by the molten pool. The wavelength range of the photodiode was 200-1100 nm, and the sampling frequency was 50 kHz. Fig. 2 is a schematic diagram of the online monitoring system for the powder bed molten pool. The monitoring system uses a 700-950 nm band-pass filter to remove interfering light such as illumination and reflected laser light, and then directs the near-infrared light around the radiation peak onto the photodiode, which converts the molten pool radiation signal into a voltage signal and uploads it to a computer for storage. During forming, the molten pool radiation signal and the process parameters are collected in real time by an acquisition card, and xml and txt files are generated layer by layer.
Each file stores information such as laser power, scanning speed, start and end coordinates of each scanning line, intensity at each acquisition point, and the laser on/off switching value.

Data segmentation
The molten pool radiation intensity data are stored by layer, with the data of all samples in a layer concatenated together. To facilitate later analysis, the intensity data in each layer must be split by test block and matched to the test blocks one by one.
Using the laser switch information, the data of each layer can further be divided into melting channels: because the laser-off interval between different test blocks is larger than the interval between two melting channels within a block, the radiation intensity data can be matched to each test block and each melting channel one by one. Python was used to divide the collected radiation intensity data into 615,816 samples.
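The segmentation rule described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the per-sample on/off representation of the laser switch signal, and the `block_gap` threshold are all assumptions introduced here.

```python
def split_tracks(laser_on):
    """Return (start, end) index pairs, end exclusive, one per laser-on run.

    Each run is treated as one melting channel. `laser_on` is assumed to be
    a sequence of 0/1 switching values, one per acquisition point.
    """
    runs, start = [], None
    for i, on in enumerate(laser_on):
        if on and start is None:
            start = i                      # laser switched on: a channel begins
        elif not on and start is not None:
            runs.append((start, i))        # laser switched off: the channel ends
            start = None
    if start is not None:                  # signal ended while the laser was on
        runs.append((start, len(laser_on)))
    return runs


def group_into_blocks(runs, block_gap):
    """Group consecutive channels into test blocks.

    A laser-off gap longer than `block_gap` samples (a hypothetical threshold
    that would have to be tuned on real data) starts a new test block.
    """
    blocks = []
    for run in runs:
        if blocks and run[0] - blocks[-1][-1][1] <= block_gap:
            blocks[-1].append(run)
        else:
            blocks.append([run])
    return blocks
```

The intensity samples of one melting channel are then simply `intensity[start:end]` for each returned pair.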

Feature extraction
By extracting features of the molten pool radiation intensity signal in both the time domain and the frequency domain, almost complete information about the signal can be obtained. In this paper, the data of one melting channel are used as the unit for feature extraction. In the time domain, the extracted features include the maximum (max), minimum (min), mean (Mean), peak-to-peak value, variance (var), standard deviation (Std), root mean square (Rms), skewness, kurtosis, shape factor, crest factor, impulse factor, etc. In the frequency domain, the molten pool radiation intensity signal is random and cannot be expressed by an explicit formula as a deterministic signal can, so statistical methods are needed to characterize it. First, the power spectral density of the radiation intensity data of each melting channel is calculated. Then the maximum power (f_max), minimum power (f_min), power range (f_pk), median power (f_median), average power (f_mf), center of gravity frequency (f_fc), root mean square frequency (f_rmsf), standard deviation of frequency (f_rvf) and power spectral entropy (psdE) are extracted. The specific meaning and physical significance of the features are shown in the following table:

Time domain features
Rms: The root mean square value is averaged over time to reflect the energy of the signal.
Skew: It is a measure of the direction and degree of statistical data distribution deviation.
Kurtosis: Measures the peak state of the probability distribution of real random variables.
Shapefactor: This is the ratio of root mean square value to absolute mean value.
Crest-factor: The peak value factor is the ratio of the signal extreme value to the effective value (RMS), which represents the extreme degree of the peak value in the waveform.
Impulsefactor: The ratio of the peak value to the rectified average value (the average value of absolute values).
Ratio of peak value to square root amplitude.

Frequency domain features
f_rmsf: The weighted average of the signal power square is calculated, and then its square root is taken.
f_rvf: The inertia radius centered on the frequency of the center of gravity describes the energy dispersion degree of the power spectrum.
psdE: Based on information entropy, the spectral structure of the power spectrum is described, and the uncertainty and complexity of the signal power spectrum are described.
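As an illustration of the features listed above, the following sketch computes them for one melting channel using only the Python standard library. Several choices here are assumptions not stated in the paper: population (not sample) variance, non-excess kurtosis, the name `clearance_factor` for the peak-to-square-root-amplitude ratio, the dictionary keys, and a plain periodogram computed with a naive DFT as the power spectral density estimate. For real signals an FFT such as `numpy.fft.rfft` would be the practical choice.

```python
import cmath
import math
import statistics


def time_domain_features(x):
    """Thirteen time-domain features of one channel (x must not be all zeros)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n          # population variance
    std = math.sqrt(var)
    rms = math.sqrt(sum(v * v for v in x) / n)
    abs_mean = sum(abs(v) for v in x) / n              # rectified average
    sqrt_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    peak = max(abs(v) for v in x)
    skew = sum((v - mean) ** 3 for v in x) / n / std ** 3 if std else 0.0
    kurt = sum((v - mean) ** 4 for v in x) / n / std ** 4 if std else 0.0
    return {
        "max": max(x), "min": min(x), "mean": mean,
        "peak_to_peak": max(x) - min(x),
        "var": var, "std": std, "rms": rms,
        "skewness": skew, "kurtosis": kurt,
        "shape_factor": rms / abs_mean,
        "crest_factor": peak / rms,
        "impulse_factor": peak / abs_mean,
        "clearance_factor": peak / sqrt_amp,           # assumed name
    }


def frequency_domain_features(x, fs):
    """Nine features of the periodogram of x sampled at fs (in Hz)."""
    n = len(x)
    freqs, psd = [], []
    for k in range(n // 2 + 1):                        # non-negative frequencies
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        psd.append(abs(s) ** 2 / (fs * n))
    total = sum(psd)
    fc = sum(f * p for f, p in zip(freqs, psd)) / total        # spectral centroid
    msf = sum(f * f * p for f, p in zip(freqs, psd)) / total   # mean square freq.
    rvf = math.sqrt(sum((f - fc) ** 2 * p for f, p in zip(freqs, psd)) / total)
    probs = [p / total for p in psd if p > 0]                  # normalized spectrum
    return {
        "f_max": max(psd), "f_min": min(psd), "f_pk": max(psd) - min(psd),
        "f_median": statistics.median(psd), "f_mf": total / len(psd),
        "f_fc": fc, "f_rmsf": math.sqrt(msf), "f_rvf": rvf,
        "psdE": -sum(q * math.log(q) for q in probs),
    }
```

Calling both functions on the intensity samples of one channel, with `fs=50_000` for the 50 kHz acquisition rate, yields the 13 + 9 = 22 features per sample.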
Besides a proper algorithm and reasonable parameter configuration, choosing proper features is very important for obtaining the best model performance. In this paper, a total of 22 features are constructed, 13 in the time domain and 9 in the frequency domain. The features most conducive to model performance often do not come from a single feature type, and for data sets whose effective feature dimension is too low, new features need to be constructed by combining existing ones. Therefore, the model inputs are divided into three classes: Class A contains all time-domain features, Class B contains all frequency-domain features, and Class C combines the time-domain and frequency-domain features. The three classes are input into the model separately, and the features most beneficial to process identification are determined from the final results. Because the dimensions of the feature sets in this paper are all low, all features are retained for subsequent modeling.

Data set calibration and division
The 615,816 samples produced by the segmentation are collected into a data set. Samples produced under the optimal process parameters are marked as "normal parameter"; samples whose laser power is lower or higher than the optimal value are marked as "low laser power" and "high laser power" respectively; samples whose scanning speed is lower or higher than the optimal value are marked as "low scanning speed" and "high scanning speed" respectively; and samples whose scanning spacing is smaller or larger than the optimal value are marked as "low scanning spacing" and "high scanning spacing" respectively. The labels 0, 1, 2, 3, 4, 5 and 6 are assigned to these categories in that order to facilitate model identification.
The whole data set is divided into three groups, A, B and C, according to feature type. Data set A consists of the time-domain features and the labels, data set B consists of the frequency-domain features and the labels, and data set C consists of the combined time-domain and frequency-domain features and the labels. For each of the three data sets, 4/5 of the data are randomly selected as the training set. During training, the training data are used to tune the hyperparameters of the model, and the best hyperparameters are found by grid search. The performance of the prediction model with the best hyperparameters is then evaluated and verified on the remaining 1/5 of the data.
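The labeling rule and the 4/5 to 1/5 split can be sketched as below. This is an illustration under stated assumptions: each sample is taken to deviate from the optimum in at most one parameter, and the function names, argument order and random seed are introduced here and are not from the paper.

```python
import random

LABELS = {
    0: "normal parameter",
    1: "low laser power", 2: "high laser power",
    3: "low scanning speed", 4: "high scanning speed",
    5: "low scanning spacing", 6: "high scanning spacing",
}


def label_sample(power, speed, spacing, optimum):
    """Return the class label 0-6 for one sample.

    `optimum` is a (power, speed, spacing) tuple for the optimal window.
    Assumes only one parameter deviates at a time; power is checked first.
    """
    opt_power, opt_speed, opt_spacing = optimum
    if power != opt_power:
        return 1 if power < opt_power else 2
    if speed != opt_speed:
        return 3 if speed < opt_speed else 4
    if spacing != opt_spacing:
        return 5 if spacing < opt_spacing else 6
    return 0


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Randomly hold out `test_fraction` of the samples for final evaluation."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(samples) * test_fraction)
    test = [samples[i] for i in indices[:n_test]]
    train = [samples[i] for i in indices[n_test:]]
    return train, test
```

The hyperparameter grid search mentioned in the text would then run on the training part only, leaving the held-out 1/5 untouched until the final evaluation.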

XGBoost model building
Extreme Gradient Boosting (XGBoost) is a boosting algorithm developed by Tianqi Chen of the University of Washington in 2016 [15]. The idea of XGBoost is to keep adding new decision trees, each built by splitting on features to fit the residual of the previous prediction, so that the residual between the predicted value and the true value is continuously reduced and the prediction accuracy improves.
As one of the most classical ensemble algorithms, XGBoost has attracted more and more attention in recent years. In essence, an ensemble algorithm combines multiple weak learners into a strong learner whose accuracy and generalization ability are better than those of a single learner. Mutual error correction among the learners improves generalization ability and robustness, and thus the final accuracy. In this paper, the XGBoost ensemble algorithm is used to train and predict on data sets A, B and C respectively. The training data of sets A, B and C are input into the XGBoost model one by one, and the abnormal types of process parameters are identified and classified. To reduce over-fitting and improve the generalization ability of the model, 5-fold cross-validation is adopted during training. The steps are as follows:
1) Divide the data randomly into 5 parts by sampling without replacement.
2) Each time, select one part as the validation set and use the other four parts as the training set to train the model.
3) Repeat step 2 five times, so that each subset is used once as the validation set while the rest serve as the training set.
4) Save the validation results of the five training runs and take the average of the five results as the final precision, which serves as the performance index of the model under 5-fold cross-validation.
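The cross-validation steps above can be sketched in plain Python as follows. The classifier is deliberately abstracted away: `fit_predict` and `score` are hypothetical callables introduced for this illustration, and in the setting of this paper `fit_predict` would wrap training an XGBoost classifier and predicting on the validation fold.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Step 1: shuffle once and split the indices into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(X, y, fit_predict, score, k=5, seed=0):
    """Steps 2-4: rotate the validation fold and average the k scores.

    fit_predict(train_X, train_y, val_X) -> predicted labels for val_X
    score(true_labels, predicted_labels) -> float
    """
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, val_idx in enumerate(folds):
        # All folds except fold i form the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        predictions = fit_predict(
            [X[j] for j in train_idx],
            [y[j] for j in train_idx],
            [X[j] for j in val_idx],
        )
        scores.append(score([y[j] for j in val_idx], predictions))
    return sum(scores) / k
```

Passing the model as a callable keeps the fold logic independent of any particular library.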

Result analysis
Fig. 3 shows the confusion matrix. Each entry in the matrix gives the percentage of samples of true class i that are classified as class j, so the higher the diagonal values, the higher the accuracy of the classification model. For example, in Fig. 3a, the first diagonal value of 0.85 indicates that 85% of the data whose true category is "high laser power" are correctly predicted as "high laser power".
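A row-normalized confusion matrix of the kind described here can be computed as in the following sketch. The function name and the integer label encoding are assumptions made for illustration.

```python
def normalized_confusion_matrix(y_true, y_pred, n_classes):
    """Entry [i][j] is the fraction of true class i predicted as class j.

    Labels are assumed to be integers in range(n_classes). Rows for classes
    that never occur in y_true are left as all zeros.
    """
    counts = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        counts[true_label][predicted_label] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

Each row sums to 1, so a diagonal entry is the recall of the corresponding class.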

Conclusion
In this paper, the radiation intensity data of the molten pool in the LPBF process are extracted, analyzed and applied. The main work and conclusions are as follows:
(1) Process monitoring data (radiation intensity of the molten pool) are collected from sample forming experiments with different process parameters, and the molten pool radiation intensity data of each layer are segmented, subjected to feature extraction, and calibrated based on the optimal process parameters.
(2) Three data sets (time-domain features, frequency-domain features and time-frequency combination features) are input into the XGBoost model separately, using 5-fold cross-validation, for model training and final classification. The F1-score of the time-frequency combination feature data set is 83.69%, which is the best classification result of the XGBoost model among the three data sets.
(3) The classification model based on the time-frequency combination feature data set is scored and analyzed. max, Mean and Rms among the time-domain features and f_median, f_fc, f_rmsf and psdE among the frequency-domain features have higher contribution weights to the classification and are important features in the process of process parameter identification.

Figure 2. On-line monitoring system of LPBF molten pool.
In multi-classification problems, the commonly used evaluation indicators for measuring the generalization ability of a model are the following: Accuracy is the proportion of correctly classified samples in the total number of samples, Precision is the proportion of correctly predicted positive examples among all predicted positive examples, Recall is the proportion of correctly predicted positive examples among all actual positive examples, and F1-score is the harmonic mean of Precision and Recall. For a binary classification problem (a multi-classification problem can be reduced to binary classification problems), the samples are grouped according to their true categories and the categories predicted by the classifier, which gives four situations: true positive (TP), false positive (FP), true negative (TN) and false negative (FN).
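Written with the standard counts of true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN), the four indicators take the following form. These are the usual textbook definitions, given here for reference; the paper does not state whether per-class values are averaged with equal or sample-weighted weights in the multi-class case.

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + FP + TN + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align}
```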

Figure 3. Model classification confusion matrix.
The F1-scores of the models based on data sets A, B and C are calculated respectively. The results show that the XGBoost model based on the combined time-domain and frequency-domain features has the best classification performance, with an F1-score of 83.69%. The model based on time-domain features comes second, with an F1-score of 70.93%, and the model based on frequency-domain features performs worst, with an F1-score of 44.68%. Since the XGBoost model based on the combined time-domain and frequency-domain features classifies best, the features of the classification model based on data set C are scored and analyzed, as shown in Figure 4. It can be seen that "max", "Mean" and "Rms" in the time domain and "f_median", "f_fc", "f_rmsf" and "psdE" in the frequency domain contribute more to the classification and are very important features for process parameter classification.

Figure 4. Feature weight of XGBoost model.

Table 2. Physical significance of time domain and frequency domain characteristics.

As an evaluation index, accuracy has some drawbacks. When the samples are unbalanced, the category with a large proportion dominates the accuracy, which then cannot reasonably reflect the predictive ability of the model. Precision and Recall are generally a pair of conflicting indicators: when Precision is high, Recall is often low. The F1-score takes both indicators into account, so this paper chooses the F1-score as the indicator for measuring model performance.
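The drawback of accuracy on unbalanced data can be seen in a small worked sketch. The macro (unweighted) averaging of per-class F1-scores used here is an assumption, since the paper does not specify how the multi-class F1-score is averaged, and the function names are introduced for illustration only.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the one-vs-rest F1-scores of all classes."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)             # class never predicted correctly
    return sum(scores) / len(scores)


# Nine samples of class 0, one of class 1, and a model that always predicts 0.
y_true = [0] * 9 + [1]
y_pred = [0] * 10
print(accuracy(y_true, y_pred))            # high despite missing class 1 entirely
print(macro_f1(y_true, y_pred, [0, 1]))    # penalized for the missed class
```

In this toy case the accuracy is 0.9 while the macro F1-score is below 0.5, because the F1-score of the minority class is zero.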