A novel motor fault diagnosis method based on principal component analysis (PCA) with a discrete belief rule base (DBRB) system

Hang Yu; Haibo Gao; Yelan He; Zhiguo Lin; Xiaobin Xu

doi:10.1088/1361-6501/aca2ce

1. Introduction

New permanent magnet synchronous motors are smaller than traditional synchronous motors but have high efficiency and high power density, so they are used in many fields. Failure of these motors can seriously affect various production processes and safety of equipment, so it is particularly necessary carry out research into early fault diagnosis. The vibration signal of a permanent magnet motor contains a lot of characteristic information, and fault diagnosis using motor vibration signals has been a research focus for decades [1]. However, usually the collected signals cannot be used directly as input data for pattern recognition tools, because these original signals are not only large in data volume but also sensitive to the working conditions of the motor. The collected data therefore requires further processing. Commonly used methods for doing this can be divided into time domain methods, frequency domain methods [2].

The more common feature vectors are composed of time and frequency domain features. The authors in [3, 4] analyzed the characteristics of vibration signals in the time and frequency domains, which were used for fault diagnosis of bearings and planetary gearboxes respectively. In [5], 11 time domain and frequency domain features are extracted as input for bearing fault diagnosis [6]. Extracted 256 statistical features, which were used for gear fault classification [7]; constructed a feature space containing 52 features by time and frequency domain analysis and wavelet decomposition, which enabled fault diagnosis for gear boxes.

Compared with time domain and frequency domain analysis, time-frequency analysis more clearly describes the relationship between signal frequency and time change. The most popular of these is probably the Short Time Fourier Transform (STFT) [8], which mainly filters the signal. Another popular time-frequency analysis method is the continuous wavelet transform [9], which differs from STFT in frequency resolution. According to the Heisenberg uncertainty principle [10], a smaller time window results in a poorer frequency resolution. Since the selection of window or wavelet function will affect the time-frequency resolution, the adaptability and readability of the time-frequency analysis is limited. In the following decades, to solve this limitation, many scholars proposed the Wigner–Ville distribution [11–13], but this quadratic time-frequency analysis method hindered the time-frequency representation due to strong interference. Synchrosqueezing transform (SST) [14–17] is another post-processing method, which redistributes coefficients in scale or frequency to avoid the uncertainty of linear transformation. To analyze multi-component signals in strong frequency mode, the authors in [18, 19] proposed a second-order SST based on phase approximation.

Although traditional time-frequency analysis methods can fully extract features from vibration signals, they are also affected by the cross terms and Heisenberg uncertainty principle. To accelerate the development of the fourth industrial revolution, industry 4.0, intelligent fault detection methods are increasingly emerging in academic and industrial circles. To diagnose bearing faults [20], constructed an integrated weighted arithmetic mean-convolutional neural network model. Based on prototype learning network, adversarial semi-supervision (DASS) has been proposed in [21], which was used for fault diagnosis of rolling bearing vibration signals [22]. Used variational mode decomposition to decompose vibration signals and used genetic algorithms to optimize support vector machine parameters to give an improved classification performance [23]. Extracted the fuzzy entropy of vibration signals by fine composite multi-scale analysis and inputted it into the random forest for fault diagnosis. Experiments verified the usefulness of this method.

However, during the operation of the motor, the characteristic information is difficult to be completely and clearly obtained from the diagnostic signal. Traditional intelligent models have poor processing ability for uncertain and incomplete information and cannot further distinguish the degree of failure. Early detection of potential failure is essential for device maintenance. Compared with the traditional IF-THEN rule base, the belief rule base (BRB) system [24–26] has more advantages in dealing with various uncertainties. BRB can handle problems qualitatively and quantitatively, including fuzzy uncertainty, incompleteness, and probability uncertainty; unlike other black-box models, the reasoning process is transparent and easy for users to understand. The output results are in the form of confidence. BRB has been effectively applied in safety analysis [27, 28], production management [29, 30], behavior prediction [31, 32] and medical diagnosis [33].

Fault diagnosis is also one of the main application areas of the BRB method, which is currently widely used in aerospace, machinery manufacturing, power electronics, and other fields [34]. Used the pipeline flow difference and pipeline pressure difference as BRB inputs to detect the degree of pipeline leakage, which became a classic case study of the use of BRB based models [35]. Used the BRB method to establish the nonlinear relationship between risk factors and risk levels, using ocean-going fishing vessels and determined the priority of failure in using Failure Mode and Effects Analysis. The authors in [36, 37] developed a real-time diagnosis and prediction model combining qualitative expert knowledge and quantitative information using BRB [38]. Further proposed an updating modeling of implicit variables based on implicit Markov chain and BRB, which solved the important prediction problem of dynamic implicit variables with environmental influence.

However, current BRB systems can only deal effectively with 2D or 3D inputs. When the external variables or reference values of practical problems increase, the 'combination explosion' problem of rules is inevitable, which leads to a performance decline of the system and consumption of large amounts of computing power [39]. This issue can be alleviated using principal component analysis (PCA), the features are reduced from high to low dimensions. And the variance contribution rate is used as the antecedent feature weight to restore the original feature, it not only amplifies the qualitative and quantitative advantages of BRB, but also makes up for the defects of PCA method. In addition, input data is transformed from continuous to discrete, the new proposed discrete BRB (DBRB) reduce the size of the initial rule base and the complexity of the model. Compared with previous studies, this paper is the first to propose PCA-DBRB in the field of fault diagnosis. Here we have developed the use of the DBRB approach to deal with the uncertainty of the motor vibration signal and show that this allows high confidence fault diagnosis with improved classification accuracy. The emergence of DBRB is a new variant of BRB and fills the gap in the field of fault diagnosis.

The remainder of this study is organized as follows. Section 2 introduces the algorithms used for data processing and feature extraction. Section 3 introduces the basic parameters and challenges of the BRB approach. Section 4 introduces the overall algorithm flow. In section 5, the algorithm is verified using experimental data, while section 6 draws out the main conclusions.

2. Data preprocessing and feature dimension reduction algorithm

2.1. Wavelet threshold denoising

The working environment of the motor is diverse and the vibration signal is typically non-stationary. Traditional filtering methods cannot be used to effectively analyze non-stationary signals which often also suffer from interference noise during the acquisition process, making fault feature extraction very difficult. Thus, it is important to eliminate noise from the signal. Wavelet threshold denoising [40] has been widely applied to signal denoising (figure 1).

**Figure 1.** Flow chart of wavelet threshold denoising.
Download figure:
Standard image High-resolution image

The threshold function is the key to the effect of wavelet threshold denoising. Traditional threshold functions include both soft and hard threshold functions, as follows:

$\begin{equation}{\mathop w\limits^ \wedge} _{j,k} = \left\{ {\begin{array}{*{20}{l}} {sgn({w_{j,k}})(\left| {{w_{j,k}}} \right| - \tau ){} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} \left| {{w_{j,k}}} \right| \geqslant \tau } \\ {0{} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} \left| {{w_{j,k}}} \right| \geqslant \tau } \end{array}} \right.\end{equation} \tag{ 1 }$

where ${w_{j,k}}$ is the wavelet coefficient; ${\mathop w\limits^ \wedge} _{j,k}$ is the wavelet coefficient after processing; τ is the threshold. sgn is a sign function, when ${w_{j,k}}$ >0, sgn( ${w_{j,k}}$ ) = 1; when ${w_{j,k}}$ = 0, sgn( ${w_{j,k}}$ ) = 0; when ${w_{j,k}}$ <0, sgn( ${w_{j,k}}$ ) = −1.

The mathematical model of the hard threshold function is as follows:

$\begin{equation}{\mathop w\limits^ \wedge} _{j,k} = \left\{ {\begin{array}{*{20}{l}} {{w_{j,k}}{} {} {} {} {} {} \left| {{w_{j,k}}} \right| \geqslant \tau } \\ {0{} {} {} {} {} {} {} {} {} {} {} {} {} {} {} \left| {{w_{j,k}}} \right| < \tau } \end{array}} \right..\end{equation} \tag{ 2 }$

2.2. Principal component analysis (PCA)

When statistical methods face multivariate problems, the complexity of the algorithm will be greatly improved. In many cases, if two variables are correlated, it means that there is overlapping information between them. To solve this problem, PCA recombines the original variables into a group of independent comprehensive variables. The central idea is to reduce the high-dimensional features to a few unrelated features and reflect the original information as much as possible [41–43]. Specific methods are as follows:

Assuming that the input array has m features and n sets of data, the output results in a two-dimensional topological structure with j neurons. The specific vector process is as follows:

(a)
Input (n,m) eigenmatrix
$\begin{equation}X = ({x_{ij}}) = \left( {\begin{array}{*{20}{c}} {{x_{11}},}&{{x_{12}}, \cdots }&{{x_{1m}}} \\ {{x_{21}},}&{{x_{22}}, \cdots }&{{x_{2m}}} \\ \begin{gathered} \vdots \\ {x_{n1}}, \\ \end{gathered} &\begin{gathered} \\ {x_{n2,}} \cdots \\ \end{gathered} &\begin{gathered} \cdots \\ {x_{nm}} \\ \end{gathered} \end{array}} \right)\end{equation} \tag{ 3 }$
where sample feature ${x_{ij}}$ represents the jth feature of the ith sample in the sample.
(b)
Correlation analysis
$\begin{align}\rho = \frac{\sum\limits_{i = 1}^n \left( x_{ij} - \frac{1}{n}\sum\limits_{i = 1}^n x_{ij}\right)\left(x_{ij + \tau } - \frac{1}{n}\sum\limits_{i = 1}^n x_{ij + \tau }\right) }{\sqrt {\sum\limits_{i = 1}^n {\left(x_{ij}^{} - \frac{1}{n}\sum\limits_{i = 1}^n x_{ij}\right)^2} } \sqrt {\sum\limits_{i = 1}^n {\left(x_{ij + \tau } - \frac{1}{n}\sum\limits_{i = 1}^n x_{ij + \tau }\right)^2} } }\end{align} \tag{ 4 }$
where $x_{ij + \tau }^{}$ represents the $j + \tau$ th feature of the ith sample in the sample data, and $\rho$ is [−1,1], so as to represent the correlation of the two features. $\rho$ = −1 means that the features of the two vectors are negatively correlated, $\rho$ = 0 means that the two vectors are unrelated, and $\rho$ = 1 means that the two vectors are positively correlated.
(c)
Standardized data
$\begin{equation}y_{ij}^{} = \frac{{nx_{ij}^{} - \sum\limits_{i = 0}^n {x_{ij}^{}} }}{{\sqrt {\frac{1}{{n - 1}}\sum\limits_{i = 1}^n {{{\left(nx_{ij} - \sum\limits_{i = 1}^n x_{ij} \right)}^2}} } }}\end{equation} \tag{ 5 }$
where $y_{ij}^{}$ is the standardized data sample, i is the sample number of input data, and j represents the characteristic dimension of input data.
(d)
Normalized processing
$\begin{equation}Y = ({y_{ij}}) = \left( {\begin{array}{*{20}{c}} {{y_{11}},}&{{y_{12}}, \cdots ,}&{{y_{1m}}} \\ {{y_{21}},}&{{y_{22}}, \cdots ,}&{{y_{2m}}} \\ \begin{gathered} \vdots \\ {y_{n1}}, \\ \end{gathered} &\begin{gathered} \cdots \\ {y_{n2}}, \cdots ,\\ \end{gathered} &\begin{gathered} \vdots \\ {y_{nm}} \\ \end{gathered} \end{array}} \right)\end{equation} \tag{ 6 }$
where n and m are the corresponding dimensions of normalized data samples in equation (6), and $Y$ is the normalized feature matrix, aiming to map data results between (0,1).
(e)
Solve the covariance matrix
$\begin{equation}S = \frac{1}{{n - 1}}Y{Y^T}\end{equation} \tag{ 7 }$
where, ${Y^T}$ is the transpose matrix of $Y$ , S is the covariance matrix, and its eigenvalues and eigenvectors are solved to reduce its dimension. The non-negative eigenvalue is ${\lambda _1} \geqslant {\lambda _2} \geqslant \ldots \geqslant {\lambda _m} \geqslant 0$ , and the orthogonal unit eigenvector is ${u_k}$ .
(f)
Principal component calculation
$\begin{equation}{Z_k} = {({u_k})^T}Y\end{equation} \tag{ 8 }$

$\begin{equation}{v_k} = \frac{{{\lambda _k}}}{{\sum\limits_{k = 1}^m {{\lambda _k}} }}\end{equation} \tag{ 9 }$
where, ${Z_k}$ is the kth principal component (k ⩽ m), and ${v_k}$ is the variance contribution rate, which represents the proportion of the variance value of each principal component after dimension reduction to the total variance value. The larger the proportion, the more important the principal component is, and their sum is 1.

3. Belief rule base

BRB is a new fusion reasoning method which combines fuzzy expert systems, evidential reasoning (ER), multi-attribute decision making and utility theory. BRB reasoning models mainly consists of knowledge base (belief rule base) and fusion reasoning model (evidence reasoning). BRB system makes full use of objective sample data and subjective empirical expert knowledge to deduce nonlinear causality. The model parameters can be adjusted, and the physical meaning is clear, which makes the model more applicable to engineering problems. Therefore, in this paper, BRB is used to process vibration signals with complex and strong randomness, to obtain high confidence diagnosis results.

3.1. BRB

The core of BRB is ER, and the belief rule is expressed as follows:

$\begin{align} & {\text{IF}} \left({x_1}is{A^k}_1\right){} {} {} {} {} {\text{and}}{} {} {} {} \left({x_2}is{A^k}_2\right)\ldots \left({x_{Tk}}is{A^k}_{Tk}\right)\nonumber\\ & {\text{THEN}}\left\{ {(Di{s_1},{\beta _{1k}}),(Di{s_2},{\beta _{2k}}),(Di{s_3},{\beta _{3k}})\ldots(Di{s_N},{\beta _{Nk}})} \right\},\nonumber\\ & \quad \left( {\sum\limits_{i = 1}^N {{\beta _{ik}} \leqslant 1} } \right)\nonumber\\ & {\text{with a rule weight}}{} {} {} {} {} {} {\theta _{\text{k}}}(k = 1,\ldots ,L)\nonumber\\ & {\text{and attribute weights}}{} {} {} {} {} {} {\delta _{\text{1}}},{\delta _2},\ldots ,{\delta _{Tk}} \end{align} \tag{ 10 }$

where ${x_i}(i = 1,2,\ldots ,{T_k})$ represents the antecedent attribute, and $A_i^k$ represents the ith antecedent attribute's reference value, T_k is the antecedent attribute's number. $Di{s_n}(n = 1,2,\ldots ,N)$ is the consequent attribute's reference value, N is the reference value's number of the consequent attribute; ${\beta _{Nk}}$ is the reference value's confidence degree of consequent attribute; ${\theta _k}(k = 1,\ldots ,L)$ is the kth rule's weight; in the entire rule base, L is the total number of rules, ${\delta _i}$ is the antecedent attribute's relative importance.

3.2. Inference process of the BRB

The fusion method of BRB is based on ER rule, which mainly includes calculation of activation weight and output ER fusion. The overall algorithm flow is as follows:

(a)
Determine the reference pointsThe linear piecewise function method is used to determine the degree of similarity between the input and the reference point. The formula is shown below
$\begin{align} {\alpha _{ij}} & = \frac{{u({a_{i(j + 1)}}) - x_i^{}}}{{u({a_{i(j + 1)}}) - u({a_{ij}})}}\nonumber\\[4pt] {\alpha _{i(j + 1)}} & = 1 - {\alpha _{ij}}{} {} {} {} {} {} \,\,\,if{} {} {} {} {} {} {} {} {} {} u({a_{ij}}) \leqslant x_i^{} \leqslant u({a_{i(j + 1)}})\nonumber\\[4pt] {\alpha _{ik}} & = 0{} {} {} {} {} {} {} {} {} {} {} for{} {} {} {} {} {} {} k = 1,\ldots ,{J_i},k \ne j,j + 1 . \end{align} \tag{ 11 }$
(b)
Calculate the each rule's activation weight
$\begin{equation}{w_k} = \frac{{{\theta _k} \times \prod _{i = 1}^{{T_k}}{{\left( {{\alpha _{ik}}} \right)}^{\overline {{\delta _i}} }}}}{{\sum\nolimits_{j = 1}^L {\left[ {{\theta _j} \times \prod\nolimits_{l = 1}^{{T_k}} {{{\left( {{\alpha _{lj}}} \right)}^{\overline {{\delta _l}} }}} } \,\right]} }}\end{equation} \tag{ 12 }$
where ${\bar \delta _{\text{i}}} = \frac{{{\delta _i}}}{\begin{gathered} \max \{ {\delta _i}\} \\ i = 1,\ldots {T_k} \\ \end{gathered} }$ represents the relative weight of the antecedent attribute, ${\alpha _{ik}}(i = 1,\ldots {T_k})$ represents the matching degree between ${x_i}$ and each reference value. Here, ${\alpha _{ik}}$ ≧ 0, $\sum\nolimits_{i = 1}^{{T_k}} {{\alpha _{ik}}} \leqslant 1$ , ${\alpha _k} = \prod \sum\nolimits_{i = 1}^{{T_k}} {({\alpha _{ik}}} {)^{{{\bar \delta }_i}}}$ is the joint matching degree. ${w_k}$ is the each rule's activation weight.
(c)
ER fusionIn the previous step, the activation weight of each rule is calculated, and then the ER rule is used to combine multiple rules to generate the final conclusion, which refers to the confidence of the BRB model for each consequent attribute.

The confidence formula is as follows:

$\begin{align} {\beta _j} & = \frac{{\mu \times \left[ {\prod\nolimits_{k = 1}^L \left({w_k}\beta _{jk} + 1 - {w_k}\sum\nolimits_{j = 1}^N \beta _{jk}\right) } \right] - \prod\nolimits_{k = 1}^L \left(1 - {w_k}\sum\nolimits_{j = 1}^N \beta _{jk}\right) }}{{1 - \mu \times \left[ {\prod\nolimits_{K = 1}^L {(1 - {w_k})} } \right]}}\nonumber\\ \mu & = \left[ \sum\nolimits_{j = 1}^N {} \prod\nolimits_{k = 1}^L \left({w_k}\beta _{jk} + 1 - {w_k}\sum\nolimits_{j = 1}^N \beta _{jk}\right) - (N - 1) \times \prod\nolimits_{k = 1}^L \left(1 - {w_k}\sum\nolimits_{j = 1}^N \beta _{jk}\right) \right]^{ - 1} .\end{align} \tag{ 13 }$

Then the final fusion result is as follows:

$\begin{equation}O(U) = \left\{ {(Di{s_j},{\beta _j}),{} {} {} {} j = 1,\ldots ,N} \right\}\end{equation} \tag{ 14 }$

where $Di{s_j}$ is the output attribute of the jth consequent attribute, ${\beta _j}$ is the output confidence of the jth consequent attribute, k is the kth rule.

3.3. The challenge of BRB

The establishment of the BRB needs to cover all the antecedent attributes and the each attribute's reference points. Therefore, in the BRB model, when T antecedent attributes exist and the number of reference points of the ith antecedent attribute is Ji, the BRB's rules can be represented as $\prod\limits_{i = 1}^T {{J_i}}$ . For example, when the antecedent attribute's number is 5 and the number of reference points of each antecedent is 3, the rule numbers (RNs) of the BRB can be counted as 3⁵ = 243. Obviously, the RN in the BRB shows exponential growth. Therefore, a large number of antecedent attributes and reference points will lead to the 'combination explosion' problem, the most direct and effective solution is to reduce the number of antecedent attributes. To solve this problem, numerous studies have been carried out previously [44]. Reduced the number of antecedent attributes through grey target theory, multidimensional scaling, isomap and PCA. By evaluating the overall capability of armored vehicles, it was found that the BRB model based on PCA attribute reduction gave the best performance. However, when screening key attributes, the above method only considers the contribution degree of principal components and ignores the relative importance of each principal component. The amount of information of each principal component is different, so the importance of different principal components is different. The researchers used them as inputs for equally important traits, which distorted the original traits and rendered the diagnosis invalid.

Therefore, in this study it was proposed that the variance contribution rate of each principal component was taken as the relative weight of DBRB antecedent attributes, and the original characteristic information was restored. This not only makes up for the shortcomings of PCA, but also gives full play to the advantages of BRB, which can combine historical data and expert experience to deal with uncertainties and improve the operation speed and efficiency of the BRB model.

4. Proposed method of fault diagnosis based on PCA-DBRB

Data preprocessing, feature extraction, feature dimension reduction and complete BRB diagnostic algorithm flow are shown in figure 2. Firstly, the vibration acceleration signals are collected, then the wavelet threshold denoising algorithm is applied. Secondly, six typical time domain features and seven digital statistical features are extracted from each time window. Thirdly, because the multi-feature input requires too much computing power, PCA is used to reduce the extracted feature's dimension to condense into three principal components, which are used as inputs to the BRB model. Finally, the variance contribution rate of each principal element is chosen as the attribute weight of the antecedent feature, K-means the algorithm is used to divide the feature clustering centers to determine the referential point. Additionally, the DBRB replaces the original continuous BRB (OBRB) to reduce the size of the rule base. Then, fmincon function in Matlab toolbox is used to optimize the parameters, and the optimized BRB model is obtained. ER algorithm is used for outputs fusion to obtain a result with high confidence.

4.1. Data denoising

Wavelet soft threshold and wavelet hard threshold denoising are used to process the motor vibration signal respectively. The wavelet base is a db3 wavelet, the wavedec function in Matlab is used for multi-scale one-dimensional wavelet decomposition, and the wdencmp function is used for denoising. To compare the denoising effects of two different thresholds, Signal to Noise Ratio (SNR) and Mean Square Error (MSE) are used as the criteria to evaluate the denoising performance.

$\begin{equation}{\text{SNR}} = 10\lg \frac{{\sum {f^2}(t)}}{{\sum {{(\mathop f\limits^ \wedge (t) - f(t))^2}}}}\end{equation} \tag{ 15 }$

$\begin{equation}{\text{MSE}} = \frac{1}{N}\sum {(\mathop f\limits^ \wedge (t) - f(t))^2}\end{equation} \tag{ 16 }$

where $f(t)$ is the original signal, $\mathop f\limits^ \wedge (t)$ is the estimated signal after wavelet denoising, N is the length of the signal, t is the variable time, the sum operation along variable t. The larger the SNR and the smaller the MSE after denoising, the better the denoising effect.

4.2. Feature extraction

Previous studies focused more on feature extraction of typical time domain features. In this study, typical time domain feature parameters and mathematical statistics feature parameters are used as dimension reduction and denoising feature parameters of PCA, and one-dimensional vibration signals are treated as a large number of values to carry out information mining using mathematical statistics to extract features of the data more effectively. The main feature parameters are shown in tables 1 and 2.

Table 1. Typical time domain feature parameters.

Parameter	Meaning
Min	Minimum
Max	Maximum
Std	Standard deviation
Var	Variance
Skew	Skewness
Kurt	Kurtosis
Mean	Mean

Table 2. Typical mathematical statistics feature parameters.

Parameter	Meaning
Per5	The value of the top 5%
Per95	The value of the top 95%
Per99	The value of the top 99%
Sum	Sum of data
Abs-sum	Sum of absolute values
Median	The median value

4.3. Dimension reduction by PCA

PCA is a useful statistical analysis method, which allows dimension reduction and denoising for the above 13 time domain mathematical statistical features. It reduces the above 13 features to three principal components with a high degree of combinatorial information.

4.4. BRB training process

Step 1: Determine the referential points

In order to avoid a large-scale rule base and a subsequently high calculation cost of the BRB, there should not be too many features corresponding to antecedent attributes. The referential points determined by expert experience are stochastic, and the division of referential points greatly affects the calculation speed and accuracy, so it is necessary to find the appropriate referential values. Here, a K-means algorithm [45] is used to determine the cluster centers, which are taken as the referential points, and the maximum and minimum values of features are taken as the reference maximum and minimum boundaries. The specific principle of the K-means algorithm is as follows:

(a)
Cluster centers $\left\{ {{C_1},{C_2},{C_3}, \ldots ,{C_k}} \right\}$ is initialized, then the Euclidean distance from each object to the center of each cluster is calculated, its representation is as follows:
$\begin{equation}dis({X_i},{C_j}) = \sqrt {\sum\limits_{t = 1}^m {{{({X_{it}} - {C_{jt}})}^2}} } \end{equation} \tag{ 17 }$
where ${X_i}$ denotes the ith object, $1 \leqslant t \leqslant m$ ; ${C_j}$ denotes the jth clustering center; ${X_{it}}$ denotes the tth attribute of the ith object; ${C_{jt}}$ denotes the tth attribute of the jth cluster center; $1 < k \leqslant n$ , $1 \leqslant j \leqslant k$ , $1 \leqslant t \leqslant m$ .
(b)
The distances between each object and each cluster center are compared in turn, then the object is assigned to the class family of the closest cluster center to obtain K class groups, $\left\{ {{S_1},{S_2},{S_3}, \ldots ,{S_k}} \right\}$
(c)
In each dimension of a class family, the average of all its objects is called the class center, and the formula is as follows:
$\begin{equation}{C_l} = \frac{{\sum\nolimits_{{X_i} \in {S_l}} {{X_i}} }}{{\left| {{S_l}} \right|}}\end{equation} \tag{ 18 }$
where ${C_l}$ denotes the lth cluster center, $\left| {{S_l}} \right|$ denotes the object's number in the lth family of classes, $1 \leqslant l \leqslant k$ , $1 \leqslant i \leqslant \left| {{S_l}} \right|$ .

Step 2: Establish discrete belief rule base (DBRB) model

After the reference values are determined, the OBRB calculates the match degree between the two reference values. To simplify the rule base, speed up diagnosis and reduce over-fitting, the DBRB is proposed. When the input is discrete, the interval of every two reference values is classified as one, and the similarity is directly assigned to 1, which produces a reference point.

Step 3: Calculate the matching degree of input data

The confidence degree of each category into which the feature falls is directly assigned as 1, and that of the others is assigned as 0. The corresponding confidence degree of the three categories are [1 0 0], [0 1 0], [0 0 1] respectively.

Step 4: Calculate the activation weights of input samples obtained online for corresponding rules.

Step 5: The analytic ER algorithm is used to fuse the activated rules.

Step 6: Optimize the established model

Through steps 1–5, the DBRB model is established. However, as the initial parameters are determined by experts, they have certain randomness and limitations, so optimal diagnosis results cannot be obtained. Therefore, we use historical data to train parameters and introduce an optimization method to optimize the DBRB model. Typically, parameters that can be optimized are output belief degree ${\beta _{jk}}$ , initial rule weight ${\theta _k}$ , and input weight ${\delta _T}$ . To optimize the model, the objective function should be built first. The optimization system is shown in figure 3.

**Figure 3.** System optimization of the BRB model.
Download figure:
Standard image High-resolution image

P is the parameter to be optimized, which is the vector of the above three adjustable parameters:

$\begin{equation}P = ({\beta _{ik}},{\theta _k},{\delta _j};{} {} {} {} {} {} {} i = 1,\ldots ,N;{} {} {} {} {} {} k = 1,\ldots ,L;{} {} {} {} {} {} {} j = 1,\ldots ,T).\end{equation} \tag{ 19 }$

The number of optimization parameters p is (N *L +L +T). The fmincon function in MATLAB is selected to seek the optimal parameters' value in the specified interval. $\mathop {({\textrm O})}\limits^ \wedge$ is the form of observed belief degree for the real system; $({\textrm O})$ is the estimated belief degree set of system which is obtained through simulation. The minimum Euclidean distance (MED) of the observed belief degree $\mathop {({\textrm O})}\limits^ \wedge$ and the estimated belief degree $({\textrm O})$ are selected as the objective function of optimization. Where $\zeta (p)$ is the objective function to be optimized.

$\begin{align} & \mathop {\min }\limits_p {} {} {} {} {} {} {} \left\{ {\zeta (p)} \right\}\nonumber\\ & \zeta (p)\, = \,\frac{1}{n} \times \sum\limits_{i = 1}^n {\sqrt {{{(\mathop {\textrm O}\limits^ \wedge - {\textrm O})}^{\text{2}}}} }\nonumber\\ & s.t.{} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} {} 0 \leqslant {\beta _{ik}} \leqslant 1 \nonumber\\ & \sum\limits_{i = 1}^3 {{\beta _{ik}}} = 1 \nonumber\\ & {\text{0}} \leqslant {\theta _k} \leqslant 1 \nonumber\\ & 0 \leqslant {\delta _j} \leqslant 1 . \end{align} \tag{ 20 }$

5. Experimental analysis and performance comparison

5.1. Experimental bench test for collection of motor vibration data

To verify the ability of the proposed algorithm to diagnose faults, real fault data was collected. The relevant data samples of the vibration signals in this paper were collected from a permanent magnet motor TZ205XS70K01 provided by UNIlink Energy Innovation company, (figure 4). The motor parameters are shown in table 3.

**Figure 4.** Test bench.
Download figure:
Standard image High-resolution image

Table 3. Main technical parameters.

Parameter	Value
Rated voltage (v)	DC350
Input voltage (v)	DC240-420
Rated speed (r min⁻¹)	3000
Rated power (kW)	55
Efficiency	⩾95%
Overload capacity 1 Overload capacity2 Insulation and protection class Cooling type Rotation direction Control method Flow (l min⁻¹) Net quality (kg) Sensor type Sample frequency (kHz)	110% rated current 60 s/300 s 150% rated current 10 s/300 s IP44 Water cooling Counter clockwise Vector control 8–12 70 Piezoelectric sensor 20

Three one-dimensional vibration acceleration signals were collected at a sampling frequency of 20 kHz. The three working conditions were: G1—bearing inner ring fault, G2—rotor eccentricity, G3—stator short circuit. A total of 300 000 data samples for these three conditions were extracted separately from the test bench.

5.2. Denoising

Wavelet soft threshold denoising and wavelet hard threshold denoising methods are used to obtain an improved denoising signal. For intuitive display, the results of 6000 data points in a time window are shown in figure 5. The abscissa is the sample, and the ordinate is the vibration acceleration.

**Figure 5.** Signal denoising of bearing inner ring fault.
Download figure:
Standard image High-resolution image

Obviously, the vibration signal has different degrees of improvement after the wavelet threshold denoising. The amplitude fluctuation becomes smaller and the curve is smooth. To compare the denoising effect, SNR and RMSE were used as reference indexes. With the wavelet soft threshold algorithm, SNR = 12.5701 and RMES = 0.2182; With the wavelet hard threshold algorithm, SNR = 10.4368 and RMES = 0.2847.

The rated speed of the motor is 3000 r min⁻¹ and the sampling frequency is 20 kHz. Each revolution is treated as a cycle, and each cycle contains 400 data samples. In order to select an appropriate time length to characterize each time domain information and avoid the randomness of small-period signal fluctuations, 15 cycles were selected as a data sample. Every 6000 data points are taken as a time window in chronological order, there are 50 data samples in each working condition data, a total of 6000 × 50 data sets. Therefore, three working conditions constitute a total of 150 data features, thus forming a data matrix of (150,6000). Each working condition is named label 1, label 2, and label 3 respectively.

5.3. Feature extraction

After the original one-dimensional data was segmented into time series, a time matrix of 150 rows and 6000 columns was formed, and features were extracted from each row of data. A total of seven typical time domain features and six digital statistical features were extracted.

5.4. Feature dimension reduction

The 13 extracted features are reduced to three principal components, and the feature visualization is shown in figure 6 below. After PCA dimension reduction, all features are separated, with only one feature of G2 and G3 slightly overlapping, which reduces the complexity of fault identification. The contribution rate of variance of the first component is about 82%, that of variance of the second component is about 12%, and that of variance of the third component is 6%. The complete data processing flow diagram is shown in figure 7:

**Figure 7.** Data processing flow chart for feature dimension reduction.
Download figure:
Standard image High-resolution image

5.5. DBRB model

Before establishing the DBRB model, the precursor properties are set, ${\delta _{\text{1}}} = {v_1} = 0.82$ , ${\delta _2} = {v_2} = 0.12$ , ${\delta _3} = {v_3} = 0.06$ . The variance contribution rate is calculated according to Formula 9. In order to simplify the rule base, the clustering quantity K is set as 2,1,1 respectively, so that the first feature clustering center is −9.014, 51.242; The clustering center of the second feature was 11.470; The clustering center of the third feature was 0.661, and the maximum and minimum of each feature is taken as the reference value boundary. The referential points are as follows:

$\begin{equation*}\begin{array}{*{20}{l}} {{A_1} = \left\{ { - 93.323, - 9.014,51.242,98.85} \right\}} \\ {{A_2} = \left\{ { - 52.646,11.470,50.833} \right\}} \\ {{A_3} = \left\{ { - 4.151,0.661,5.776} \right\}} \end{array}.\end{equation*}$

If traditional continuous BRB (OBRB) is used here, the initial rule base should include 36 (4 × 3 × 3) rules. In contrast, DBRB is introduced to reduce the referential points, where Ll means Low level, ml means Middle level, Hl means High level. So the discrete interval is divided as follows:

$\begin{equation*}\begin{array}{*{20}{l}} {{A_1} = \left\{ {Ll = 1;Ml = 2;Hl = 3} \right\}} \\ {{A_2} = \left\{ {Ll = 1,Hl = 2} \right\}} \\ {{A_3} = \left\{ {Ll = 1,Hl = 2} \right\}} \end{array}.\end{equation*}$

According to the interval form of 3, 2, and 2, the initial rule base of vector form is established as shown in table 4. It includes 12 rules (3 × 2 × 2), the size of rule base is reduced by 2/3. The initial rule weights are all set to 1.

Table 4. Initial DBRB.

No.	θ_k	X₁ and X₂ and X₃	Confidence distribution structure
1	1	Ll and Ll and Ll	$\left\{ {\left( {Di{s_1},1} \right),\left( {Di{s_2},0} \right),\left( {Di{s_3},0} \right)} \right\}$
2	1	Ll and Ll and Hl	$\left\{ {\left( {Di{s_1},0} \right),\left( {Di{s_2},1} \right),\left( {Di{s_3},0} \right)} \right\}$
3	1	Ll and Hl and Ll	$\left\{ {\left( {Di{s_1},0} \right),\left( {Di{s_2},0} \right),\left( {Di{s_3},1} \right)} \right\}$
4	1	Ll and Hl and Hl	$\left\{ {\left( {Di{s_1},0} \right),\left( {Di{s_2},1} \right),\left( {Di{s_3},0} \right)} \right\}$
5	1	Ml and Ll and Ll	$\left\{ {\left( {Di{s_1},1} \right),\left( {Di{s_2},0} \right),\left( {Di{s_3},0} \right)} \right\}$
6	1	Ml and Ll and Hl	$\left\{ {\left( {Di{s_1},0} \right),\left( {Di{s_2},0} \right),\left( {Di{s_3},1} \right)} \right\}$
7	1	Ml and Hl and Ll	$\left\{ {\left( {Di{s_1},1} \right),\left( {Di{s_2},0} \right),\left( {Di{s_3},0} \right)} \right\}$
8	1	Ml and Hl and Hl	$\left\{ {\left( {Di{s_1},0} \right),\left( {Di{s_2},1} \right),\left( {Di{s_3},0} \right)} \right\}$
9	1	Hl and Ll and Ll	$\left\{ {\left( {Di{s_1},0} \right),\left( {Di{s_2},0} \right),\left( {Di{s_3},1} \right)} \right\}$
10	1	Hl and Ll and Hl	$\left\{ {\left( {Di{s_1},0} \right),\left( {Di{s_2},{\text{1}}} \right),\left( {Di{s_3},{\text{0}}} \right)} \right\}$
11	1	Hl and Hl and Ll	$\left\{ {\left( {Di{s_1},1} \right),\left( {Di{s_2},0} \right),\left( {Di{s_3},0} \right)} \right\}$
12	1	Hl and Hl and Hl	$\left\{ {\left( {Di{s_1},0} \right),\left( {Di{s_2},1} \right),\left( {Di{s_3},0} \right)} \right\}$

5.6. Comparison and analysis of diagnostic results

5.6.1. Fault diagnosis using the DBRB model.

In this case, three principal components are taken as three features, each fault has 300 000 vibration datasets, and 6000 datasets are taken as a time window. Then each working condition has 50 data segments, that is, each feature has 50 data segments, and each feature has a total of 150 samples for three working conditions. Here, 100 samples and 50 samples are randomly selected as the training set and the test set, respectively. After optimization of the DBRB model, the results of the training set are shown in figure 8(a) and the results of the test set are shown in figure 8(b).

In the figure 8(a), the accuracy of the training set is 92%. The accuracy of the test set is up to 96%, and the MED is 0.7146. The optimized D-BRB structure is shown in table 5. Attribute weights before and after optimization are shown in table 6.

Table 5. Optimized DBRB structure.

No.	θ_k	X₁ and X₂ and X₃	Confidence distribution structure
1	1	Ll and Ll and Ll	$\left\{ {\left( {Di{s_1},0.024} \right),\left( {Di{s_2},0.953} \right),\left( {Di{s_3},0.023} \right)} \right\}$
2	1	Ll and Ll and Hl	$\left\{ {\left( {Di{s_1},0.971} \right),\left( {Di{s_2},0.012} \right),\left( {Di{s_3},0.017} \right)} \right\}$
3	1	Ll and Hl and Ll	$\left\{ {\left( {Di{s_1},0.013} \right),\left( {Di{s_2},0.013} \right),\left( {Di{s_3},0.974} \right)} \right\}$
4	1	Ll and Hl and Hl	$\left\{ {\left( {Di{s_1},0.013} \right),\left( {Di{s_2},0.973} \right),\left( {Di{s_3},0.014} \right)} \right\}$
5	1	Ml and Ll and Ll	$\left\{ {\left( {Di{s_1},{\text{0}}{\text{.941}}} \right),\left( {Di{s_2},0.{\text{029}}} \right),\left( {Di{s_3},0.{\text{030}}} \right)} \right\}$
6	1	Ml and Ll and Hl	$\left\{ {\left( {Di{s_1},0.017} \right),\left( {Di{s_2},0.017} \right),\left( {Di{s_3},0.966} \right)} \right\}$
7	1	Ml and Hl and Ll	$\left\{ {\left( {Di{s_1},{\text{0}}{\text{.012}}} \right),\left( {Di{s_2},0.{\text{975}}} \right),\left( {Di{s_3},0.{\text{013}}} \right)} \right\}$
8	1	Ml and Hl and Hl	$\left\{ {\left( {Di{s_1},0.955} \right),\left( {Di{s_2},0.019} \right),\left( {Di{s_3},0.026} \right)} \right\}$
9	1	Hl and Ll and Ll	$\left\{ {\left( {Di{s_1},0.012} \right),\left( {Di{s_2},0.012} \right),\left( {Di{s_3},0.976} \right)} \right\}$
10	1	Hl and Ll and Hl	$\left\{ {\left( {Di{s_1},0.023} \right),\left( {Di{s_2},0.953} \right),\left( {Di{s_3},0.024} \right)} \right\}$
11	1	Hl and Hl and Ll	$\left\{ {\left( {Di{s_1},0.954} \right),\left( {Di{s_2},0.021} \right),\left( {Di{s_3},0.025} \right)} \right\}$
12	1	Hl and Hl and Hl	$\left\{ {\left( {Di{s_1},0.{\text{019}}} \right),\left( {Di{s_2},{\text{0}}{\text{.013}}} \right),\left( {Di{s_3},0.{\text{968}}} \right)} \right\}$

Table 6. Attribute weights comparison.

	Before optimization			After optimization
Weight	${\delta _{\text{1}}}$	${\delta _{\text{2}}}$	${\delta _{\text{3}}}$	${\delta _{\text{1}}}$	${\delta _{\text{2}}}$	${\delta _{\text{3}}}$
Value	0.82	0.12	0.06	0.82	0.26	0.13

Interestingly, the first antecedent weight ${\delta _{\text{1}}}$ was the same before and after optimization, the first principal component information is preserved completely, which further illustrates the superiority of the proposed method in defining the weight factor.

Here, the variance contribution rate is set as the initial attribute weight in order to better express and restore the initial feature information, so as to better show the real operation state of the motor. In this paper, the initial attribute weights are optimized by DBRB model in order to find the parameters that can best fit the model, so as to get the optimal diagnosis results.

5.6.2. The influence of feature number on diagnosis result.

To explore the influence of feature numbers on diagnosis accuracy, mathematical statistical features were gradually added while traditional time domain features were retained. 12, 11, 10, 9, 8 and 7 features were selected respectively and inputted into DBRB model after PCA dimension reduction. The average value was taken after 20 experiments. The results of diagnosis accuracy and MED are shown in figures 9(a) and (b).

**Figure 9.** Influence of the number of original features: (a) for accuracy and (b) for minimum Euclidean distance (MED).
Download figure:
Standard image High-resolution image

As can be seen from the figures above, when the feature numbers are 10 and 11, although the diagnostic accuracy is the same, the MED increases. With a decrease in feature number, the diagnosis accuracy gradually decreases and the MED gradually increases. Therefore, it is necessary to select mathematical statistical features as the initial feature extraction parameters, which increases the amount of information containing data to deliver improved fault diagnosis.

5.6.3. DBRB compared with OBRB.

To further verify the fault diagnosis abilities of the improved DBRB algorithm, a comparison with OBRB was made. For the first group: the three features of PCA after For made. For the first group: the three features of PCA after dimension reduction were used as the input of the OBRB, and their antecedent attributes is set as 1. For the second group: the features after dimension reduction are used as the input of the OBRB, and the corresponding weights of antecedent attributes are set to the variance contribution rate; For the third group: features after dimension reduction are used as input of the DBRB, and their antecedent attributes is the third group: features after dimension reduction are set as 1; For the fourth group: the weights of antecedent attributes are set to the variance contribution rate and features are used as input of the DBRB. To evaluate the diagnostic effect of the model, the RN, optimized parameter number (OPN), MED, time (T/S) and Akaike Information Criterion (AIC) were used as evaluation indexes as follows:

(a)
The number of rules refers to the RNs in the initial rule base; the smaller the RNs, the simpler the model.
(b)
The fewer the number of optimization parameters, the less computing power.
(c)
MED indicates better diagnostic accuracy.
(d)
The smaller the time cost, the lower the computer calculation power requirements.
(e)
AIC refers to an indicator under the combined consideration of model fitting data and complexity. It helps to find models that best handle the data while containing relatively few parameters. A smaller AIC value indicates that the model is optimal under comprehensive consideration. The smaller the value of AIC, indicates that the model is optimal in all aspects. It is defined as follows:
$\begin{equation}{\text{AI}}{{\text{C}}_{{\text{BRB}}}} = P{\text{ln}}(P \cdot {\text{MED}}) + 2{\text{OPN}}\end{equation} \tag{ 21 }$

where P denotes the amount of data in the original data set, which is 150 × 3 = 450. The evaluation indexes of the four models are shown in table 7 below:

Table 7. Comparison of OBRB and DBRB experimental results.

Model	RN	OPN	MED	T(s)	AIC
OBRB	36	147	0.6726	96.38	2868.68
OBRB (weight)	36	147	0.5364	67.19	2762.86
DBRB	12	51	0.7146	12.86	2699.94
DBRB (weight)	12	51	0.6872	10.27	2682.35

Among these models, the OBRB rule base includes 36 rules, whereas the DBRB rule base includes 12 rules, size of the initial rule base being reduced by 2/3. The number of OBRB optimization parameters is 36 + 3 + 36 × 3 = 147, while the number of DBRB optimization parameters is 12 + 3 + 12 × 3 = 5, a large decrease. The MED of OBRB with weight is 0.5364, and that of DBRB with weight is 0.6872. Interestingly, the MED of the OBRB diagnostic model in this case is slightly smaller than that of the DBRB model,which has a better diagnostic effect. However, this is due to the relatively large initial rule base and more parameters of the OBRB, which is normal, but it is not entirely desirable to consider model accuracy in isolation. From the perspective of the AIC index, DBRB is slightly less than OBRB. In this case, the diagnosis time was short due to a small number of classification categories and data volume, which were 67.19 s for OBRB and 10.27 s for DBRB, respectively, although both were not long enough to be acceptable. However, when extended to other fields, large data volumes or a large initial rule base will lead to exponential growth in model training time.

In addition, comparing OBRB with OBRB (weight), the MED decreased slightly, and diagnosis time decreases by nearly 30 s. When DBRB is compared with DBRB (weight), both MED and diagnosis time decrease slightly, indicating that the setting of weight factor has an impact on diagnosis accuracy and diagnosis time, and diagnosis time will increase to a certain extent, with an increase in model complexity.

Overall, DBRB has a number of key advantages compared to the OBRB where classification problems are concerned. At the same time, the setting of the weight of the initial antecedent attributes is extremely important, which not only preserves information in the data sets, but also trains the model in advance, reducing the training time and improving overall robustness of the model.

5.6.4. DBRB compared with other popular classification algorithms.

In a similar way, the DBRB model developed here was compared with several traditional mainstream classifiers, the three features after feature dimension reduction are taken as input attributes, 50, 40 and 30 samples of 150 data samples are selected as test sets, and the average value of 30 experiments is taken respectively. The Naive Bayes (NB), Bayes Net (BN), Random Forest (RF) and Support Vector Machine (SVM) classifiers were selected for comparison. Their diagnostic accuracy is summarized in figures 10(a)–(c). The upper and lower edges of the box are the upper quartile (Q25%) and the lower quartile (Q75%) of the accuracy, the upper and lower edges represent the maximum and minimum values of the data, the small rectangle is the average value of the accuracy, and the black diamond outside the box is the outlier.

As can be seen from figure 10, the DBRB has the highest diagnostic accuracy compared to other algorithms, with no outliers and a more concentrated accuracy distribution, indicating that the DBRB model is more robust. To further verify the reliability of experimental results, 40 test samples and 30 test samples were used in the training set and test set. Again, similar conclusions can be drawn, confirming the superior accuracy and robustness of the presented model for fault diagnosis from inherently uncertain motor vibration signal data.

6. Conclusion

To effectively process uncertain motor vibration signal data effectively, a novel fault diagnosis method using a combined PCA-DBRB approach was presented. Wavelet threshold algorithm was used to reduce vibration interference, PCA was used to reduce the number of antecedents, and the combination explosion of rules problem was avoided. The k-means algorithm was used to determine the referential points, and the DBRB was used to replace the OBRB, which also reduced the number of initial rules. Due to the uneven distribution of principal component feature information after PCA dimensionality reduction, the introduction of variance contribution rate as feature weight not only amplifies the qualitative and quantitative advantages of BRB, but also makes up for the defects of PCA method. The approach proposed can reduce the complexity of the model while ensuring accuracy and it represents an effective support tool for fault diagnosis and maintenance of motors.

Acknowledgments

The authors gratefully acknowledge financial support provided by the National Natural Science Foundation of China (NSFC)-Zhejiang Joint Fund for the Integration of Industrialization and Informatization Project No. U1709215, Zhejiang outstanding youth fund (LR21F030001) and Zhejiang Province Key R&D projects (No. 2021C03015).

Data availability statement

The data generated and/or analyzed during the current study are not publicly available for legal/ethical reasons but are available from the corresponding author on reasonable request.

Conflict of interest

The authors declare that they have no conflicts of interest.

CRediT authorship contribution statement

Hang Yu: Methodology, Writing—original draft, Haibo Gao: Conceptualization, Supervision, Yelan He: Supervision, Suggestions, Zhiguo Lin: Suggestions, Xiaobin Xu: Methodology.

A novel motor fault diagnosis method based on principal component analysis (PCA) with a discrete belief rule base (DBRB) system

Article metrics

Submit

Permissions

Author e-mails

Author affiliations

Author notes

ORCID iDs

Dates

Peer review information

Abstract

1. Introduction

2. Data preprocessing and feature dimension reduction algorithm

2.1. Wavelet threshold denoising

2.2. Principal component analysis (PCA)

3. Belief rule base

3.1. BRB

3.2. Inference process of the BRB

3.3. The challenge of BRB

4. Proposed method of fault diagnosis based on PCA-DBRB

4.1. Data denoising

4.2. Feature extraction

4.3. Dimension reduction by PCA

4.4. BRB training process

5. Experimental analysis and performance comparison

5.1. Experimental bench test for collection of motor vibration data

5.2. Denoising

5.3. Feature extraction

5.4. Feature dimension reduction

5.5. DBRB model

5.6. Comparison and analysis of diagnostic results

5.6.1. Fault diagnosis using the DBRB model.

5.6.2. The influence of feature number on diagnosis result.

5.6.3. DBRB compared with OBRB.

5.6.4. DBRB compared with other popular classification algorithms.

6. Conclusion

Acknowledgments

Data availability statement

Conflict of interest

CRediT authorship contribution statement

A novel motor fault diagnosis method based on principal component analysis (PCA) with a discrete belief rule base (DBRB) system

Article metrics

Submit

Permissions

Share this article

Author e-mails

Author affiliations

Author notes

ORCID iDs

Dates

Peer review information

Abstract

1. Introduction

2. Data preprocessing and feature dimension reduction algorithm

2.1. Wavelet threshold denoising

2.2. Principal component analysis (PCA)

3. Belief rule base

3.1. BRB

3.2. Inference process of the BRB

3.3. The challenge of BRB

4. Proposed method of fault diagnosis based on PCA-DBRB

4.1. Data denoising

4.2. Feature extraction

4.3. Dimension reduction by PCA

4.4. BRB training process

5. Experimental analysis and performance comparison

5.1. Experimental bench test for collection of motor vibration data

5.2. Denoising

5.3. Feature extraction

5.4. Feature dimension reduction

5.5. DBRB model

5.6. Comparison and analysis of diagnostic results

5.6.1. Fault diagnosis using the DBRB model.

5.6.2. The influence of feature number on diagnosis result.

5.6.3. DBRB compared with OBRB.

5.6.4. DBRB compared with other popular classification algorithms.

6. Conclusion

Acknowledgments

Data availability statement

Conflict of interest

CRediT authorship contribution statement