Detection of balling levels on the surface of SLM formed parts based on finite depth separable convolution network

To meet the real-time requirements of balling levels detection in selective laser melting processes, a modified detection model, called the Finite Depth Separable Convolution Network (F-DSCNet), is proposed by optimizing an existing benchmark model (BM) with two lightweight structures: Depth Separable Convolution (DSC) and Global Average Pooling (GAP). The model balances the two competing effects that DSC has on computation and convergence speed, reducing the number of parameters while increasing structural complexity, and therefore introduces DSC only in the higher-level convolutional layers of the BM. In addition, the GAP structure is adopted in place of the fully connected layer to further reduce the number of parameters and accelerate model training and convergence. The experimental results show that the F-DSCNet model not only maintains high recognition accuracy but also significantly improves the model's computation and convergence speed, as well as the recognition response time for a single image, exhibiting strong practicality for engineering applications.


Introduction
Selective Laser Melting (SLM) is a highly promising additive manufacturing technology that has revolutionized the production of complex, fully functional 3D components, surpassing the geometric limitations of traditional manufacturing methods. Real-time detection of balling defects during SLM with convolutional neural networks (CNNs), however, is constrained by the computational cost of the network. In fact, the computational burden of a CNN lies primarily in the convolutional and fully connected layers, as the number of parameters and the computational complexity of these two layer types determine the model's computation and convergence speed. To address these challenges, various model compression and acceleration techniques have been proposed, such as model pruning [15], model quantization [16], and low-rank decomposition [17], which balance model performance against parameter reduction. However, these techniques are often complex and challenging to apply to large networks. With a better understanding of neural network principles and their validation in practice, a more efficient and simpler lightweight network design methodology has emerged. Its main idea is to design networks with different convolutional kernels and convolution modes, replacing larger kernels with smaller ones. Moreover, techniques such as grouped convolution, depth separable convolution, and transposed convolution are employed instead of standard convolution to reduce computational complexity and accelerate network computation. Typical lightweight CNN architectures include GoogLeNet [18], MobileNet [19], SqueezeNet [20], and ShuffleNet [21]. These architectures have been applied in lightweight semantic segmentation, object detection, and classification models and have achieved good performance. Additionally, replacing the fully connected layers of a CNN with GAP [22] and introducing batch normalization (BN) [23] can also speed up network computation.
It can be seen that the excellent performance of lightweight convolutional neural networks can be attributed to the introduction of lightweight structures. Considering the slow computation and convergence caused by the complex structure and large number of parameters of the benchmark model for balling levels detection, this paper introduces two lightweight structures, DSC and GAP, to construct an improved convolutional neural network model, F-DSCNet, for balling levels detection. The improved model combines the advantages of DSC and GAP, improving the model's computation and convergence speed, as well as the unit image recognition response time, while maintaining high recognition accuracy.

Benchmark model for balling levels detection
In reality, balling levels refer to the overall assessment of the number and size of balling particles, which directly reflects the severity of the balling phenomenon within the forming layer during the SLM process. Rapid and accurate recognition of balling levels within the forming layer is therefore the fundamental guarantee for improving the quality of the final part. On this basis, we have studied the application of a deep CNN framework to the automatic detection of balling levels during the SLM process. The specific implementation process is illustrated in figure 1, and further details can be found in reference [14]. Figure 2 presents the structure of the benchmark model depicted in figure 1. Its basic idea is based on the LeNet-5 network, which has low structural complexity. By reducing the size of the convolutional kernels, increasing the depth and width of the network, and incorporating BN, L2 regularization, and dropout terms, the network's feature extraction capability is improved while overfitting is prevented, thereby enhancing the network's generalization performance.

Note:
Conv k-m refers to a convolution layer with a kernel size of k × k and m channels; Maxpool/2 represents a max pooling layer with a stride of two pixels; BN stands for the batch normalization layer; Relu and Softmax are the activation functions; Dense denotes a fully connected layer.
Among them, BN is applied after each convolutional layer and before the activation layer in the network. Its purpose is to map the feature data extracted by the convolutional layer to a distribution with a mean of 0 and a variance of 1, thereby improving the data distribution in the middle layers of the network. This spares the network from having to adapt to a different data distribution in each iteration, reducing the learning adaptation time during the model's iterative process and accelerating the training of the network. The normalization is implemented as

y_i = (x_i − μ) / √(σ² + ε),

where x_i represents the data to be normalized (the output of the convolutional layer); μ and σ² are its mean and variance, respectively; ε is a small constant for numerical stability; and y_i is the normalized output. Although the introduction of BN prevents vanishing and exploding gradients during training and improves the generalization performance of the network, it also increases the model's parameters and computational complexity. Additionally, the presence of fully connected layers further increases the parameters and computational complexity of the model, severely impacting the computation and convergence speed of the model, as well as the response time for unit image recognition.
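Numerically, the BN transform above can be sketched as follows (a minimal NumPy illustration; the trainable scale and shift parameters of a full BN layer are omitted for brevity):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature of a batch to zero mean and unit variance."""
    mu = x.mean(axis=0)            # per-feature mean over the batch
    var = x.var(axis=0)            # per-feature variance over the batch
    return (x - mu) / np.sqrt(var + eps)

x = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
y = batch_norm(x)
# Each column of y now has (approximately) mean 0 and variance 1.
```

Because every layer then sees inputs on a comparable scale, the network no longer has to readjust to shifting distributions between iterations, which is the speed-up the text describes.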

Improved convolutional neural network model for balling levels detection
To address the issues of the complex structure and large number of parameters in the benchmark model for balling levels detection, we replaced the standard convolution and fully connected layers with DSC and GAP strategies, respectively. In addition, the impact of the DSC structure on the model's computation and convergence speed, in terms of reducing the number of parameters while increasing the structural complexity of the model, was also considered.

Depth separable convolution
Unlike standard convolution, DSC involves a two-step convolution process: depthwise convolution (DC) and pointwise convolution (PC). In DC, the number of convolution kernels equals the number of input channels, and each kernel convolves with only one channel of the input. PC, on the other hand, is similar to standard convolution but with a kernel size of 1 × 1 × (number of input channels); it combines the features obtained from DC along the depth dimension to generate new feature maps, and the number of output feature maps depends on the number of its kernels. Figure 3 illustrates the operation of traditional standard convolution and depth separable convolution.
As illustrated in figure 3, when the input is of size D_I × D_I × M, standard convolution applies N kernels of size D_K × D_K × M, producing an output of size D_F × D_F × N; its computational complexity is D_K · D_K · M · N · D_F · D_F, and its number of parameters is D_K · D_K · M · N. In contrast, DSC first performs DC using M convolution kernels of size D_K × D_K, resulting in M feature maps of size D_F × D_F; then PC is performed using N kernels of size 1 × 1 × M to achieve information interaction between channels. The computational complexity of DSC is (D_K · D_K · M + M · N) · D_F · D_F, and its number of parameters is D_K · D_K · M + M · N. The ratios of computational complexity (P) and of the number of parameters (λ) between DSC and standard convolution can therefore be calculated as

P = λ = (D_K · D_K · M + M · N) / (D_K · D_K · M · N) = 1/N + 1/(D_K · D_K).

As can be seen from this equation, DSC significantly reduces the number of parameters and the computational complexity of the network. Moreover, as the kernel size D_K increases, DSC has even lower computational complexity relative to standard convolution, resulting in faster network computation.
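As a quick sanity check on these counts, the cost formulas can be coded directly. The sketch below ignores bias terms, as the derivation above does, and uses arbitrary example sizes (D_K = 3, D_F = 32, M = 64, N = 128); it confirms that the complexity and parameter ratios coincide at 1/N + 1/D_K²:

```python
def standard_conv_cost(dk, df, m, n):
    """FLOP and parameter counts of a standard convolution layer."""
    flops = dk * dk * m * n * df * df
    params = dk * dk * m * n
    return flops, params

def dsc_cost(dk, df, m, n):
    """Depthwise (M kernels of DK x DK) plus pointwise (N kernels of 1 x 1 x M)."""
    flops = dk * dk * m * df * df + m * n * df * df   # DC cost + PC cost
    params = dk * dk * m + m * n
    return flops, params

std_f, std_p = standard_conv_cost(3, 32, 64, 128)
dsc_f, dsc_p = dsc_cost(3, 32, 64, 128)
ratio = dsc_f / std_f   # equals 1/N + 1/DK**2 = 1/128 + 1/9
```

With these sizes the ratio is about 0.119, i.e. DSC needs roughly one ninth of the computation of standard convolution, dominated by the 1/D_K² term.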

Global average pooling
GAP is commonly used to replace the fully connected layer found in traditional convolutional neural networks. Its purpose is to merge each feature map output by the last convolutional layer into a single feature node. These feature nodes are then input directly into the Softmax function for classification. By eliminating the fully connected layer, GAP greatly reduces the number of parameters and the computational complexity of the network, which helps to address problems such as slow training and overfitting.
Assume that x_ij^k is the element at position (i, j) of the k-th feature map output by the last convolutional layer, and that each feature map has size m × n. The GAP process can then be expressed as

y_GAP^k = (1 / (m · n)) · Σ_{i=1}^{m} Σ_{j=1}^{n} x_ij^k,

where y_GAP^k is the k-th component of the one-dimensional feature vector output by the GAP layer.
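In code, GAP amounts to one mean per feature map; a minimal NumPy sketch (the array shapes are illustrative):

```python
import numpy as np

def global_average_pool(feature_maps):
    """Collapse each m x n feature map to a single node.

    feature_maps: array of shape (m, n, k), the k maps from the last conv layer.
    Returns a length-k vector with one average per map.
    """
    return feature_maps.mean(axis=(0, 1))

maps = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)  # m=2, n=3, k=4
vec = global_average_pool(maps)   # shape (4,); fed to Softmax for classification
```

Note that the pooling itself has no trainable parameters, which is why it removes the fully connected layer's parameter burden entirely.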

Structure of the improved CNN model
The DSC structure accelerates the computation and convergence process of the model by reducing the model's parameters and computational complexity. However, it also increases the depth of the network due to its two-stage convolution process. As a result, the structural complexity of the network increases, indirectly inhibiting the computation and convergence speed of the model. It can be inferred that the actual improvement or reduction in computation and convergence speed during training depends on the trade-off between these two factors. Therefore, when introducing the DSC structure, it is necessary to carefully consider its impact on the computation and convergence speed of the model in terms of both reducing the model's parameters and increasing the model's structural complexity.
According to Section 3.1, the ratio of the number of parameters between DSC and standard convolution is

λ = 1/N + 1/(D_K · D_K).

In this work, the kernel size is D_K = 3, and the number of kernels in a single convolutional layer is twice the number of input channels, i.e. N = 2M (except in the first convolutional layer). The above equation therefore simplifies to

λ = 1/N + 1/9,

and the same holds for the computational-complexity ratio, P = 1/N + 1/9. It follows that as the number of kernels N increases, the values of λ and P both decrease.
Therefore, when the DSC structure is introduced, the computational complexity and the number of parameters are reduced more at the higher-level convolutional layers. It can thus be inferred that introducing the DSC structure in the higher-level convolutional layers of the model lets the reduction in parameters dominate, accelerating computation and convergence. To validate this, we conducted several comparative experiments and propose an improved CNN structure, shown in figure 4, in which the DSC structure is used only in the sixth convolutional layer of the BM and the fully connected layer is replaced with GAP. Given the limited use of the DSC structure in the improved structure, this improved CNN model is named F-DSCNet (Finite Depth Separable Convolution Network).
Note: Conv k-m refers to a convolution layer with a kernel size of k × k and m channels; Maxpool/2 represents a max pooling layer with a stride of two pixels; BN stands for the batch normalization layer; Relu and Softmax are the activation functions; DSC and GAP are the depth separable convolution layer and the global average pooling layer, respectively.
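To make the trade-off concrete, the parameter savings from the two replacements can be tallied with the counting formulas of Section 3.1. The layer widths below are hypothetical placeholders (the actual widths are those of figure 4, which is not reproduced here), chosen only to respect the N = 2M rule stated above:

```python
def conv_params(dk, m, n):
    return dk * dk * m * n           # standard convolution, bias ignored

def dsc_params(dk, m, n):
    return dk * dk * m + m * n       # depthwise + pointwise, bias ignored

# Hypothetical sixth conv layer: 3x3 kernels, M=128 input, N=256 output channels.
std6 = conv_params(3, 128, 256)      # standard version
f6 = dsc_params(3, 128, 256)         # DSC version

# Hypothetical classifier head: flattened 4*4*256 features -> 128-unit dense
# -> 3 classes, versus GAP (parameter-free) -> 3-class Softmax weights only.
dense_head = 4 * 4 * 256 * 128 + 128 * 3
gap_head = 256 * 3

saving = (std6 - f6) + (dense_head - gap_head)
```

Even with these made-up widths, the head replacement dominates the saving, which is consistent with the paper's observation that the fully connected layer holds most of the benchmark model's parameters.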

Data set preparation
In this work, the data set is composed of the surface microscopic images of nine specimens obtained through orthogonal experiments, and each individual microscopic image has a resolution of 2592×1944 pixels. In these experiments, the varied process parameters mainly include laser power (P), scan speed (V), hatch space (H), and layer thickness (D), and different combinations of these process parameters affect the balling phenomenon through the key factor of laser energy density (J/mm³). Table 1 presents the parameter settings and results of these orthogonal experiments [14]. Based on these results, the severity of the balling phenomenon can be classified into three levels: slight balling (specimens 1, 2, and 3), moderate balling (specimens 4, 6, and 7), and severe balling (specimens 5, 8, and 9). Figure 5 shows the microscopic images of the part surface with different balling levels. It is evident that the balling level of the part surface is determined by its microscopic image, as the balling phenomenon exhibits significant variations under different process parameter conditions. Although training on high-resolution images could provide better recognition accuracy, considering the GPU performance and time cost of the computing platform, the collected microscopic images were segmented and converted into small grayscale image blocks of 600×600 pixels to generate the final sample data set. The data set was then divided into training, validation, and testing sets at a ratio of 80%:10%:10%, in which the labels of the collected individual microscopic images were consistent with the balling levels of the part surface, and the labels of the small segmented image blocks were consistent with the labels of the microscopic images from which they were cut.
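The segmentation step can be sketched as a non-overlapping tiling; a 2592 × 1944 image yields 4 × 3 = 12 blocks of 600 × 600 pixels (how the leftover right and bottom margins were handled is not stated in the paper, so this sketch simply discards them):

```python
import numpy as np

def tile_image(img, block=600):
    """Cut a grayscale image into non-overlapping block x block patches."""
    h, w = img.shape
    return [img[r:r + block, c:c + block]
            for r in range(0, h - block + 1, block)
            for c in range(0, w - block + 1, block)]

img = np.zeros((1944, 2592), dtype=np.uint8)   # one microscope image (H x W)
tiles = tile_image(img)                         # 3 rows x 4 cols = 12 blocks
```

Each block inherits the balling-level label of its parent image, matching the labeling scheme described above.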

Experimental platform and parameter settings
In this work, all of the experiments were conducted on the Windows platform with a 2.2 GHz Intel i7-8750H processor (12 logical cores) and an NVIDIA GTX 1050Ti GPU. The code was developed in Python 3.6.8 using TensorFlow (version 1.13). The cross-entropy loss function was used to evaluate the CNN output error, and the Adam optimization algorithm was employed to train the network. In addition, the batch size was set to 8, the number of iterations to 300, the learning rate to 0.0001, and the dropout rate to 0.25.

Experimental results and analysis
In this section, the improved convolutional neural network F-DSCNet is applied to the detection of balling levels on the surfaces of SLM-formed parts, aiming to ensure recognition accuracy while improving the computation and convergence speed of the model, as well as the response time for unit image recognition. The experiments were therefore divided into two parts: a comparison of detection accuracy against other combined optimization structures, and a comparison of computation speed, convergence speed, and inference speed (unit image recognition response time). To better analyze the impact of DSC and GAP, as well as of the number of DSC layers, on model accuracy, several further comparative experiments were also conducted. Table 2 shows the comparative test accuracies of the different improvement strategies and their combinations, where the model BM+DSC(m, n, k)+GAP indicates that only the m-th, n-th, and k-th convolutional layers of the model use the DSC structure and the fully connected layer is replaced with GAP.

Comparative analysis of the model's accuracy.
From table 2, it can be observed that improving all the standard convolutional layers and the fully connected layer of the BM model, respectively, significantly improves the model's performance, with the former bringing the more obvious improvement. This can mainly be attributed to the two-stage convolution process of the DSC structure, which allows the convolutional layers to extract more diverse balling features. When the fully connected layer of the BM+DSC model is further improved, the test accuracy of the BM+DSC+GAP model rises again, indicating that the BM+DSC model overfits and that the overfitting is mainly caused by the large number of parameters in the fully connected layer. Optimizing this part better addresses the overfitting issue, which also demonstrates the feasibility and effectiveness of combining these two structures in this study. Furthermore, changing the number of DSC layers, that is, optimizing only the higher-level convolutional layers of the model, produces only a small fluctuation in test accuracy. This suggests that optimizing the lower-level convolutional layers, which extract lower-level balling features, contributes little to the improvement of model accuracy. However, when the DSC and GAP structures replace the sixth convolutional layer and the fully connected layer of the BM model, respectively, the resulting BM+DSC(6)+GAP (F-DSCNet) model achieves the highest test accuracy and the lowest loss, further validating the reliability of the proposed method.

Comparative analysis of the model's computation, convergence, and inference speed.
This section analyzes the impact of the improvement strategies from three aspects: the computation speed, convergence speed, and inference speed of the model. The detailed results are shown in table 3 and figure 7.
(1) Computation speed. As can be seen from table 3, although improving all the standard convolutional layers and the fully connected layer of the BM model can reduce the model's parameters to some extent, the training time of the BM+DSC model actually increases rather than decreases at the same number of iterations. This indicates that when all the standard convolutional layers of the BM model are replaced with DSC, the increase in the model's structural complexity dominates and inhibits the model's computation speed. Additionally, when the advantages of DSC and GAP are combined, the training time of the BM+DSC+GAP model decreases only slightly, indirectly confirming this point. On the other hand, when the DSC structure is implemented only in the higher-level convolutional layers of the BM model and GAP replaces the fully connected layer, the training time decreases significantly for models such as BM+DSC(4,5,6)+GAP, BM+DSC(5,6)+GAP, and BM+DSC(6)+GAP. In this case, the reduction in the model's parameters plays the dominant role in accelerating computation. Furthermore, among the various improvement strategies, the proposed method has a relatively short training time of 35.15 hours, second only to the 34.78 hours of the BM+DSC(5,6)+GAP model; compared with the 48.03 hours of the BM model, it is reduced by 26.82%, indicating a significant improvement in computation speed and further validating the superiority of the proposed method.
(2) Convergence speed. As shown in figure 7, among all the improved models, only BM+GAP, BM+DSC(5,6)+GAP, and the BM+DSC(6)+GAP model proposed in this paper converged before the end of training. Combining the training times of 36.50 h, 34.78 h, and 35.15 h for these three models at 300 iterations (table 3), the BM+DSC(6)+GAP model obtained the fastest convergence in practice. It achieved its best recognition performance at 100 iterations, with an actual effective training time of 11.72 h, significantly lower than the 12.16 h of the BM+GAP model at 100 iterations and the 17.39 h of the BM+DSC(5,6)+GAP model at 200 iterations. The proposed method therefore has a significant advantage in accelerating the model's convergence speed.
(3) Inference speed. The inference speed of the models is reflected by the unit image recognition response time of the trained models. As can be seen from table 3, the inference speed is closely related to the model's number of parameters (model size), and the introduction of the GAP structure has the more significant effect on accelerating inference; after all, the parameters of the fully connected layer account for about 80% of the model's total parameters. When all the standard convolutional layers and the fully connected layer of the model were improved simultaneously, the inference speed of the BM+DSC+GAP model increased by 11.76% compared with the BM model. However, when the implementation layers of the DSC structure were changed, the actual inference speeds of the models did not differ significantly, although the number of parameters increased to some extent. Furthermore, the proposed method achieves a fast inference time of 0.0165 s, comparable to the fastest response time of 0.0161 s obtained by the BM+DSC(5,6)+GAP model.

The structure and performance of the improved convolutional neural network
Taking into account the model's test accuracy, computation and convergence speed, and unit image recognition response time, the method proposed in this study performs best among all the improvement strategies. Its network structure is illustrated in figure 4. Apart from the input and output layers, it consists of six convolutional and pooling layers as well as a global average pooling layer. The sixth convolutional layer is a depth separable convolutional layer, while the other convolutional layers are standard convolutional layers. In addition, a batch normalization (BN) layer is added after every convolutional layer and before the activation layer. Table 4 shows the confusion matrix of this model on the test set, where the overall recognition accuracy reaches 96.1%; moreover, the recognition accuracy for each balling level is higher than 95%, indicating that the model generalizes well on the test set. Although the improved CNN model F-DSCNet achieved high recognition accuracy on the test set, considering the influence of image segmentation on the extraction of balling features by the network, we examined all the microscopic images collected from each specimen and judged its balling level from the global image blocks. As shown in table 5, the global prediction results for the nine specimens show that the improved CNN model achieved a recognition accuracy of over 93.2% on the small image blocks of each specimen; moreover, the balling levels of all nine specimens were correctly recognized, an accuracy of 100%. It is evident that the improved convolutional neural network model proposed in this study exhibits excellent performance in recognizing the balling levels on the surfaces of SLM-formed components.
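The specimen-level judgment from global image blocks can be sketched as a vote over the per-block predictions (the paper does not state its exact aggregation rule, so majority voting is an assumption here):

```python
from collections import Counter

def specimen_level(block_predictions):
    """Decide a specimen's balling level from its image-block predictions."""
    return Counter(block_predictions).most_common(1)[0][0]

# Hypothetical specimen: most blocks predicted "moderate", one misclassified.
blocks = ["moderate"] * 11 + ["severe"]
level = specimen_level(blocks)
```

This illustrates why block-level accuracies above 93.2% suffice for 100% specimen-level accuracy: a minority of misclassified blocks cannot flip the majority decision.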

Conclusion
In this paper, in view of the real-time requirements of balling levels detection in SLM production and experimental processes, a surface balling levels detection method for SLM-formed parts based on an improved convolutional neural network was proposed.This method combined the advantages of DSC and GAP, and also took into account the influence of the number of DSC layers on the model's performance.Based on the experimental results and analysis, the following conclusions can be drawn.
(1) The number of DSC layers has a significant impact on the model's performance.In particular, the recognition accuracy, as well as the computation and convergence speed of the model, were significantly improved when the DSC structure was introduced in the higher-level convolutional layers.However, the change in the number of DSC layers has a limited impact on the model size, resulting in little improvement in inference speed.
(2) The introduction of the GAP structure is beneficial for accelerating the model's inference speed.By combining the advantages of DSC and GAP, the improved CNN model F-DSCNet has demonstrated outstanding performance in terms of computational speed, convergence speed, and inference speed, exhibiting a strong practicality for balling levels detection in the field of additive manufacturing.
(3) Although image segmentation can affect the extraction of balling features and reduce the recognition accuracy of the network, it does not affect the overall judgment of balling levels of the specimen surface. Global image block testing showed that the improved CNN model obtained in this study can accurately recognize the balling levels of the specimen surface, with an accuracy of 100%.

Figure 3 .
Figure 3. Schematic of the operation of the two different convolution structures: (a) standard convolution and (b) depth separable convolution. Concretely, when the input image size is D_I × D_I × M, standard convolution uses N kernels of size D_K × D_K × M to perform convolution, resulting in a feature map of size D_F × D_F × N. The computational complexity of standard convolution is D_F · D_F · D_K · D_K · M · N, and its number of parameters is D_K · D_K · M · N.

Figure 5 .
Figure 5. Local surface microscopic images of the parts with different balling levels: (a) slight balling, (b) moderate balling, and (c) severe balling.

Figure 6 .
Figure 6. Comparison of the model's training accuracy between BM and F-DSCNet.

Figure 7 .
Figure 7. Comparison results of the model's convergence speed under different improvement strategies.

Table 2 .
Comparison results of the model's test performance under different improvement strategies.

Table 3 .
Comparison results of the model's computation and inference speed under different improvement strategies.

Table 4 .
Confusion matrix of F-DSCNet on the test set.

Table 5 .
Prediction results of nine specimens on the global image blocks.