An Ensemble Learning Algorithm for Machinery Fault Diagnosis Based on Convolutional Neural Network and Gradient Boosting Decision Tree

In massive mechanical fault data, the value density of fault information is low and the data quality is uneven. Moreover, multi-source signals collected by different sampling methods differ from one another. Expert systems and shallow neural network models, which have weak self-learning ability, cannot meet these requirements. Therefore, targeting the coupling, uncertainty and concurrency of mechanical faults, this paper constructs two kinds of 2D-CNN fault feature data sets and uses the Convolutional Neural Network (CNN), which has strong self-learning ability, to build two fault diagnosis models: CNN-Z and CNN-F. With CNN-Z and CNN-F as base learners, this paper uses the ensemble algorithm Gradient Boosting Decision Tree (GBDT) to combine the base learners. Compared with a single base learner, the ensemble achieves higher accuracy and lower generalization error. An analysis and comparison of the performance indicators shows that the fault diagnosis algorithm based on CNN and GBDT has the lowest diagnosis error, 1.79%, which verifies the effectiveness, reliability and accuracy of the proposed algorithm in mining hidden fault state information.


Introduction
Mechanical equipment is an important carrier of national economy and national defence construction.
With the advancement of the information age, its degree of automation, real-time performance and intelligence is higher than before. Much equipment works in harsh environments involving high temperature, high pressure and enemy threat, all of which place higher requirements on its operation quality and performance. During operation, mechanical failure is the "potential killer" of the safe service of mechanical equipment [1]. Once a failure occurs, it affects the function of the equipment and, in serious cases, leads to safety accidents. Therefore, fault diagnosis research, which mines hidden fault state information from multi-source signals and promptly locates and confirms the fault position and severity, provides accurate information for equipment maintenance and fault prediction and plays a key role in ensuring the normal working performance of equipment.
IOP Publishing doi: 10.1088/1742-6596/2025/1/012041
Mechanical equipment has a long service life and many data acquisition points, so massive data can be obtained, which makes artificial intelligence methods such as machine learning and deep learning suitable for fault diagnosis [2]. Traditional machine learning algorithms such as Support Vector Machines (SVM) [3], Probabilistic Neural Networks (PNN) [4], Back Propagation Neural Networks (BP Neural Networks) [5] and Radial Basis Function (RBF) Networks [6] are suitable for processing small-scale data samples. Because of the influence of the input data dimension, traditional algorithms have limited learning ability and are sensitive to samples, which easily leads to overfitting. Therefore, researchers have gradually introduced deep learning methods such as RBM, CNN and RNN [7][8][9][10]. Given the randomness, discreteness, periodicity and structure of fault data, RBM is not suitable for supervised learning and has no scale invariance. RNN is used to process continuous data and cannot identify faults with high accuracy. There has been much research on using the CNN algorithm for fault diagnosis. Chen et al. use a deep neural network model to identify the fault state of rolling bearings, which shows that the method has high accuracy and reliability [11]. Chen et al. propose a deep learning method based on the convolutional neural network, which brings the advantages of image recognition and visual perception to bearing fault diagnosis and achieves high accuracy by simulating the cognitive process of the cerebral cortex [12]. Zhang et al. propose a CNN model with a kernel in the input layer [13]. The CNN algorithm accepts many forms of input data: O. Janssens et al. process the original signal by EEMD and then feed the one-dimensional signal into a CNN model [14]. The diagnosis accuracy is higher than that in reference [15].
However, this method not only takes a long time but also ignores the internal correlation of the bearing vibration signal, resulting in low fault diagnosis accuracy. To improve the accuracy, M. Zhang et al. use a CNN algorithm based on two-dimensional samples [16]. Hoang et al. also use a CNN algorithm based on two-dimensional data, which obtains better accuracy but takes longer [17]. In addition, improving the diagnosis accuracy of a single classifier requires deepening its network layers, which increases the number of parameters, makes the model prone to falling into a local optimum, and makes it prone to overfitting. To improve generalization performance, an ensemble algorithm synthesizes the results of each base classifier and thus reduces the dependence on a single classifier. For example, J. Wang et al. use ensemble learning for Autism Spectrum Disorder (ASD) diagnosis [18]. Lango et al. use an ensemble algorithm for unbalanced data processing [19] and obtain a system with low generalization error. In summary, ensemble algorithms can solve this kind of problem very well.
Therefore, for mechanical fault sample data in which feature similarities and differences are mixed together, this paper constructs 2D-CNN data sets, establishes CNN based classifiers, and finally proposes an ensemble learning algorithm for mechanical fault diagnosis based on CNN and GBDT. The model trains multiple classifiers in place of a single classifier, which improves the accuracy of fault diagnosis and narrows the generalization gap of the model. The implementation flow chart of the proposed algorithm is shown in figure 1.

Intercepting The Tested Data Samples
Targeting generally unbalanced fault data, handled by undersampling negative examples and oversampling positive examples, this paper assumes that the tested object has $k$ ($k \geq 1$) fault types. A time series transformation is performed on each sample in the tested sample set, in which the data characteristic period T is the width of the time series image.
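The interception step can be illustrated with a minimal sketch. It assumes, beyond what the text states, that each image is built by stacking consecutive segments of length T as rows; the function name `to_time_series_image` and the `rows` parameter are hypothetical and chosen only for illustration.

```python
def to_time_series_image(signal, period, rows):
    """Stack `rows` consecutive segments of length `period` into 2D images.

    Assumption: one image row per characteristic period T, so the image
    width equals T as in the text. Trailing samples that do not fill a
    complete image are discarded.
    """
    if period <= 0 or rows <= 0:
        raise ValueError("period and rows must be positive")
    needed = period * rows
    if len(signal) < needed:
        raise ValueError("signal is shorter than one image")
    images = []
    for start in range(0, len(signal) - needed + 1, needed):
        block = signal[start:start + needed]
        image = [block[r * period:(r + 1) * period] for r in range(rows)]
        images.append(image)
    return images


# Toy signal of 20 samples, period T = 4, 2 rows per image.
toy_signal = list(range(20))
toy_images = to_time_series_image(toy_signal, period=4, rows=2)
```

With these toy values, two complete images fit into the signal and the last four samples are dropped. Whether segments should instead overlap is a design choice that the text does not settle.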

Obtaining Time-Frequency Image
For non-stationary and nonlinear mechanical fault signals, we perform the Continuous Wavelet Transform (CWT) on the data set to produce the time-frequency image. Here a is the scale, with $a \in \mathbb{R}$ and $a \neq 0$, which indicates the frequency-related scaling; the translation value shifts the wavelet in time; and the wavelet basis function is obtained by stretching and translating the mother wavelet. Applying the CWT to each sample yields the time-frequency image.
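For reference, the textbook form of the CWT is sketched below. The symbols $x(t)$ for the signal, $\tau$ for the translation value, $\psi$ for the mother wavelet and $W_x(a,\tau)$ for the coefficients are assumptions introduced here and may differ from the notation used in the original equations.

```latex
\psi_{a,\tau}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-\tau}{a}\right),
\qquad
W_x(a,\tau) = \int_{-\infty}^{+\infty} x(t)\,\psi_{a,\tau}^{*}(t)\,\mathrm{d}t
```

Under this convention, a time-frequency image is commonly taken as the magnitude $|W_x(a,\tau)|$ evaluated over a grid of scales and translations.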

CNN
To reduce memory consumption and computation, the classical LeNet5 Convolutional Neural Network (LeNet5 CNN) model is used as the base classification model. The LeNet5 CNN model consists of the input layer, the convolutional layer C1, the sampling layer S1, the fully connected layer and the output layer. The framework of the classic LeNet5 CNN model is shown in figure 3. As shown in figure 3, an input image of size $W_1 \times H_1$ enters through the input layer. After that, the resulting feature maps are stretched and expanded into a vector, which finally passes through the fully connected layer to the output layer.
In figure 3, the CNN output layer applies the softmax function to the outputs of the fully connected layer and takes the class with the maximum value as the output prediction. In the softmax expression, P is the pooling feature matrix, and the fully connected layer contributes its weight matrix and the bias b. Remark for figure 3. There are two kinds of convolution: strict convolution and convolution after padding. Dimensions decrease after strict convolution, while they remain unchanged after padded convolution. To avoid reducing the dimensionality of the input data during convolution, this paper adopts convolution after padding.
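The remark on padding and the softmax prediction can both be checked with a short sketch. The image size 32, the kernel size 5 and the three class scores are illustrative assumptions, not values taken from this paper, and the function names are hypothetical.

```python
import math


def conv_output_size(n, kernel, padding=0, stride=1):
    """Output length along one axis of a convolution."""
    return (n + 2 * padding - kernel) // stride + 1


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Strict (unpadded) 5x5 convolution shrinks a 32-pixel axis.
strict_size = conv_output_size(32, kernel=5)
# Padding of (kernel - 1) // 2 keeps the axis length for stride 1.
padded_size = conv_output_size(32, kernel=5, padding=2)

# Hypothetical fully connected outputs for three fault classes.
probs = softmax([2.0, 1.0, 0.1])
predicted_class = probs.index(max(probs))
```

Here the strict convolution gives an axis of 28 while the padded one keeps 32, and the predicted class is the index of the largest softmax probability.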

Setting up CNN Based Classifier Model Based on Time Series Image (CNN-Z)
The time series images are divided into groups, and the balanced data are input into the LeNet5 CNN model as the first training data set.

Setting up CNN Based Classifier Model Based on Time-Frequency Image (CNN-F)
In the same way, the time-frequency images are divided into groups, and the balanced data are input into the LeNet5 CNN model as the second training data set.

Setting up Fault Diagnosis Model Based on CNN and GBDT
In this paper, the CNN-Z model and the CNN-F model are serially integrated by the GBDT algorithm, which is used here to handle the multi-class classification problem. GBDT therefore adopts the log-likelihood loss function. The classification tree is initialized with this loss function, the best residual fitting value of each leaf node is calculated, and the tree is then updated. The feasibility index F1 is $F_1 = 2PR/(P+R)$, where P is the precision rate and R is the recall rate.
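As a rough illustration of the loss and residual steps, the sketch below uses the standard multi-class GBDT formulation, in which class probabilities are the softmax of per-class scores, the loss is $-\sum_k y_k \log p_k(x)$, and each tree fits the negative gradient $y_k - p_k(x)$. This formulation is assumed here and is not confirmed to match the paper's own equations; all function names are hypothetical.

```python
import math


def class_probabilities(scores):
    """Softmax of the per-class scores f_k(x)."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood_loss(one_hot, scores):
    """Multi-class log-likelihood loss: -sum_k y_k * log p_k(x)."""
    probs = class_probabilities(scores)
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs))


def pseudo_residuals(one_hot, scores):
    """Negative gradient of the loss w.r.t. each score: y_k - p_k(x)."""
    probs = class_probabilities(scores)
    return [y - p for y, p in zip(one_hot, probs)]


# One sample whose true class is 1 (of 3), with all scores starting at zero.
label = [0, 1, 0]
initial_scores = [0.0, 0.0, 0.0]
loss = log_likelihood_loss(label, initial_scores)
residuals = pseudo_residuals(label, initial_scores)
```

With equal initial scores, every class has probability 1/3, so the loss is log 3 and the residual is positive only for the true class. Each boosting round would then fit one regression tree per class to these residuals.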

Model Validation
To verify the proposed CNN and GBDT based fault diagnosis algorithm, this paper uses the rolling bearing fault data of Case Western Reserve University, specifically the signals collected at the motor drive end (DE) with a sampling frequency of 12 kHz. Bearing faults occur at the inner race, ball and outer race. The model is evaluated using equations (10) and (11), and the results are shown in table 1. Table 1 shows that the diagnosis accuracy of the GBDT model based on CNN is high, ranging from 97.98% to 99.44%, and that the F1 metric reaches 0.976, which demonstrates the feasibility of the proposed method.
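The sketch below shows how accuracy and a per-class F1 can be computed from predicted labels, assuming these are the metrics reported in table 1. The label names and the six toy predictions are invented for illustration and are not results from this paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, positive):
    """Precision P, recall R and F1 = 2PR / (P + R) for one fault class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy labels for three fault positions (illustrative only).
y_true = ["inner", "inner", "ball", "outer", "ball", "outer"]
y_pred = ["inner", "ball", "ball", "outer", "ball", "outer"]

acc = accuracy(y_true, y_pred)
p_ball, r_ball, f1_ball = precision_recall_f1(y_true, y_pred, "ball")
```

In this toy case one inner-race sample is misclassified as a ball fault, which lowers the precision of the ball class while its recall stays at 1. Averaging the per-class F1 values would give a macro F1 for the multi-class problem.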
To demonstrate the effectiveness and high precision of the model, this paper compares it with the two base models. Figure 8 shows the diagnosis errors of the CNN model based on time series images (CNN-Z), the CNN model based on time-frequency images (CNN-F) and the GBDT model based on CNN. As can be seen from figure 8, the average diagnostic errors of the three models level off at 50 training epochs, and at 55 training epochs their diagnostic errors are 8.83%, 2.89% and 1.79% respectively. The integrated GBDT model based on CNN therefore has the smallest diagnosis error, which improves the fault diagnosis accuracy.

Conclusion
In this paper, the fault data of bearings are transformed into different data feature sets to improve the value density of fault data and enlarge the fault database. The GBDT algorithm is then used to integrate the results of multiple base learners. The fault diagnosis algorithm based on CNN and GBDT has the highest accuracy. The related research of this paper has been converted into invention patents. In addition, the research results provide data and a fault diagnosis model for follow-up research on data fusion, intelligent fault diagnosis and fault-tolerant control.