Study on cavitation fault features of centrifugal pumps based on principal component analysis

The centrifugal pump plays a key role in ship water supply systems, and cavitation is a common fault mode that reduces pump efficiency. To select valid features as criteria for cavitation fault diagnosis, we calculate the RMS, peak factor, kurtosis factor, and wave factor of the raw data and of the data after empirical mode decomposition (EMD). The raw features obtained in this way still lie in a high-dimensional space and contain a lot of redundant information, so principal component analysis (PCA) is used to reduce the dimension and extract sensitive features. As a case study, data were obtained from a centrifugal pump fault simulation bench, on which a variety of cavitation states were observed. A six-dimensional sensitive feature vector was determined through data analysis.


Introduction
Cavitation occurs in a centrifugal pump when the pump inlet pipe is blocked or the pump deviates from its rated working condition. Cavitation can lead to pitting or cracking of the pump blades, and even to blade fracture. As a result, pump vibration intensifies, and the operating efficiency and useful life of the centrifugal pump are reduced dramatically. Therefore, it is important to prevent and reduce cavitation damage in centrifugal pumps. Extracting and analyzing the features of the cavitation failure mode is the first step in detecting a cavitation fault [1][2][3][4].
Three types of features can be extracted from raw vibration data: time-domain, frequency-domain, and time-frequency-domain features. In this paper, the root mean square value, kurtosis factor, and peak factor of the raw signal, together with the features of the signal after EMD, were extracted. Although these raw features can be obtained, they still lie in a high-dimensional space and contain a lot of redundant information [1]. Therefore, principal component analysis (PCA) is suitable for processing the raw features to extract the sensitive ones.

Method of Signal Feature Extraction
This section discusses the methods used for signal feature extraction: time-domain feature extraction, empirical mode decomposition, and principal component analysis. The feature extraction procedure is shown in Figure 1.

Empirical Mode Decomposition
EMD is a time-frequency-domain signal analysis method suitable for non-stationary and non-linear signals. It decomposes a signal into several intrinsic mode function (IMF) components. Each IMF component must meet two requirements: (1) over the whole component, the number of extreme points and the number of zero crossings must be equal or differ by at most one; (2) at any point, the mean of the upper and lower envelopes is equal to zero [2]. The decomposition steps are as follows:
(1) Based on the local maxima and minima of the raw signal $x(t)$, we draw the upper and lower envelope curves.
(2) We compute the mean of the upper and lower envelopes, $m_1(t)$.
(3) We obtain the first candidate component: $h_1'(t) = x(t) - m_1(t)$.
(4) We check whether the candidate meets the two IMF conditions. If it does, it is the first IMF component $h_1(t)$. Otherwise, we repeat Steps (1)~(4) on $h_1'(t)$ until an IMF component is obtained.
(5) We compute the remainder signal: $r_1(t) = r_0(t) - h_1(t)$, where $r_0(t) = x(t)$.
(6) We use $r_1(t)$ as a new signal and repeat Steps (1)~(5) until the EMD decomposition is completed.
Finally, the signal can be expressed as: $x(t) = \sum_{i=1}^{n} h_i(t) + r_n(t)$. This paper used the first five IMF components $h_i(t)$, $i = 1, \dots, 5$, to extract features; different IMF components contain different frequency information.
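The sifting steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the envelopes are built by linear interpolation between extrema (`np.interp`) instead of the spline interpolation that is usually used, and the check of the two IMF conditions is replaced by a simple relative-change threshold. The function names and the parameters `max_iter` and `tol` are hypothetical.

```python
import numpy as np


def local_extrema(x):
    """Return indices of strict local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima


def envelope_mean(x):
    """Steps (1)-(2): mean of the upper and lower envelopes.

    Returns None when there are too few extrema to build envelopes.
    Assumption: linear envelopes; np.interp holds the end values constant
    outside the first and last extremum.
    """
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        return None
    t = np.arange(len(x))
    upper = np.interp(t, maxima, x[maxima])
    lower = np.interp(t, minima, x[minima])
    return (upper + lower) / 2.0


def sift(x, max_iter=50, tol=0.05):
    """Steps (3)-(4): subtract the envelope mean until the change is small.

    Assumption: the relative-change threshold `tol` stands in for checking
    the two IMF conditions explicitly.
    """
    h = x.copy()
    for _ in range(max_iter):
        m = envelope_mean(h)
        if m is None:
            break
        h_new = h - m
        change = np.sum((h - h_new) ** 2) / (np.sum(h ** 2) + 1e-12)
        h = h_new
        if change < tol:
            break
    return h


def emd(x, n_imfs=5):
    """Steps (5)-(6): peel off up to n_imfs components, return (imfs, residue)."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(n_imfs):
        if envelope_mean(residue) is None:
            break  # residue has too few extrema to decompose further
        imf = sift(residue)
        imfs.append(imf)
        residue = residue - imf
    return imfs, residue
```

By construction, the extracted components and the residue add back up to the input signal, which is a convenient sanity check for any EMD implementation.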

Feature extraction
Time-domain features are important indexes of signal characteristics, and they are usually divided into dimensional and dimensionless indexes. Dimensional indexes are sensitive to signal characteristics, but they are unstable under changing working conditions [5][6][7]. In contrast, dimensionless indexes can exclude such disturbances, so they are widely used in signal feature extraction. In this paper, we used both kinds of indexes: root mean square (RMS), standard deviation, kurtosis factor, peak factor (based on the peak value of the signal), and wave factor.
The five indexes were computed for the raw data and for each of the 5 IMFs obtained by EMD, which gives a 30-dimensional feature vector for each sample.
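As an illustration, the five indexes and the 30-dimensional vector can be computed as in the sketch below. The definitions are commonly used conventions and are assumptions here: the standard deviation uses the $1/(N-1)$ normalization, the kurtosis factor is the mean fourth power divided by $\mathrm{RMS}^4$, the peak factor is the maximum absolute value divided by the RMS, and the wave factor is the RMS divided by the mean absolute value. The function names are hypothetical.

```python
import math


def time_domain_features(x):
    """Return [RMS, standard deviation, kurtosis factor, peak factor, wave factor].

    Assumed conventions (not confirmed by the source):
      RMS             = sqrt(mean(x^2))
      std             = sqrt(sum((x - mean)^2) / (N - 1))
      kurtosis factor = mean(x^4) / RMS^4
      peak factor     = max(|x|) / RMS
      wave factor     = RMS / mean(|x|)
    """
    n = len(x)
    mean = sum(x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    peak = max(abs(v) for v in x)
    abs_mean = sum(abs(v) for v in x) / n
    kurtosis_factor = (sum(v ** 4 for v in x) / n) / rms ** 4
    peak_factor = peak / rms
    wave_factor = rms / abs_mean
    return [rms, std, kurtosis_factor, peak_factor, wave_factor]


def feature_vector(raw, imfs):
    """Concatenate the five indexes of the raw signal and of each IMF.

    With 5 IMFs this gives 6 signals x 5 indexes = 30 features.
    """
    vec = []
    for sig in [raw, *imfs]:
        vec.extend(time_domain_features(sig))
    return vec
```

The RMS and standard deviation are the dimensional indexes; the remaining three are ratios and therefore dimensionless.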

Data acquisition
Experiments were carried out on a water supply system fault simulation bench at Shanghai Ship Equipment Research Institute. The bench can simulate multiple system faults, including valve faults, centrifugal pump faults, and pipe faults. The water pump speed was measured by a Hall sensor.
The simulator provides a cavitation fault kit covering four working conditions: normal, mild cavitation, moderate cavitation, and severe cavitation. A data acquisition system was used to gather data at a sampling frequency of 131072 Hz, and each working condition was recorded for 120 s. Each recording was then split into 1920 samples, of which 1000 were used for training and 920 for testing.
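The arithmetic behind the splitting can be checked with a short sketch. It assumes that each 120 s recording is cut into non-overlapping segments of equal length, which the text does not state explicitly; the function names are hypothetical.

```python
FS_HZ = 131072      # sampling frequency from the text
DURATION_S = 120    # recording length per working condition
N_SEGMENTS = 1920   # samples per working condition


def segment_length(fs=FS_HZ, duration_s=DURATION_S, n_segments=N_SEGMENTS):
    """Points per sample, assuming equal-length, non-overlapping segments."""
    total_points = fs * duration_s
    return total_points // n_segments


def split_signal(signal, seg_len):
    """Cut a sequence into consecutive segments of seg_len points.

    Any leftover points at the end that do not fill a segment are dropped.
    """
    n = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n)]


print(segment_length())  # 8192 points, i.e. 1/16 s per sample
```

Under this assumption each sample contains 8192 points, a power of two.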

Feature analysis
The vibration signal of severe cavitation and its five IMF components are shown in Figure 2. The EMD method was used to decompose all 1000 training samples of each working condition, and each sample produced five IMFs and one residue component. The features were then obtained from the raw signal and the 5 IMF components, which together form one 30-dimensional feature vector containing frequency information.
Figure 2: The EMD decomposition of the cavitation signal
With four working conditions in the fault simulation experiment, there are 4000 training samples and 30 features for each sample, so the feature dataset is a 4000 × 30 matrix $S_{4000 \times 30}$. For example, the kurtosis factor is shown in Figure 3, and the normal distribution model of the sample kurtosis factor in each working condition is shown in Figure 4. The model parameters are listed in Table 1. Although the normal condition is separated from the other conditions, the three cavitation conditions overlap. The next step is therefore to use the PCA method to extract sensitive features while reducing the feature dimension.
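One way to quantify how strongly two working conditions are mixed in a single feature is to fit a normal distribution to each condition and compute the overlap of the fitted densities. The sketch below uses `statistics.NormalDist` from the Python standard library; the function names are hypothetical, and the overlap coefficient is an illustrative measure that the paper itself does not report.

```python
from statistics import NormalDist


def fit_conditions(samples_by_condition):
    """Fit one normal distribution (mean, sample standard deviation) per condition."""
    return {
        name: NormalDist.from_samples(values)
        for name, values in samples_by_condition.items()
    }


def pairwise_overlap(models):
    """Overlap coefficient (0 = fully separated, 1 = identical) for each pair."""
    names = list(models)
    return {
        (a, b): models[a].overlap(models[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

Here `samples_by_condition` would map each working condition to the values of one feature, for example the kurtosis factor of its training samples. A pair of conditions with an overlap close to 1 cannot be separated by that feature alone.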

Dimensional reduction based on PCA
First, the feature dataset matrix $S_{4000 \times 30}$ was pre-processed, and then the covariance matrix and its eigenvalues and eigenvectors were computed. The eigenvalues, sorted in descending order, are shown in Table 2. The first 6 eigenvalues are much larger than the others, so 6 sensitive features were retained, and the projection matrix $U_{30 \times 6}$ was composed of the corresponding eigenvectors. The new sensitive features $S_{3680 \times 6}$ of the test samples were obtained by multiplying the test sample matrix $S_{3680 \times 30}$ by $U_{30 \times 6}$. A 3D scatter plot based on the first three columns of $S_{3680 \times 6}$, the principal features, is shown in Figure 5. The normal distribution model of the four working conditions for principal feature 1 is shown in Figure 6. Normal, mild, and moderate cavitation are well discriminated along the axis of feature 1, and severe cavitation is separated from the other three along the axis of feature 3. In short, the principal features separate the testing samples of the 4 working conditions well. Meanwhile, the standard deviations of the samples in the new feature space also tend to be consistent, which is an improvement over the raw features.
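The projection described above can be sketched with NumPy as follows. The pre-processing step is not specified in the text; z-score standardization with the training statistics is assumed here. The function names are hypothetical.

```python
import numpy as np


def fit_pca(train, n_components):
    """Fit PCA on the training features.

    Assumption: pre-processing is z-score standardization per feature.
    Returns the training mean and standard deviation, all eigenvalues in
    descending order, and the projection matrix U (n_features x n_components).
    """
    mean = train.mean(axis=0)
    std = train.std(axis=0, ddof=1)
    std[std == 0] = 1.0                      # guard against constant features
    z = (train - mean) / std
    cov = np.cov(z, rowvar=False)            # n_features x n_features
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort to descending
    U = eigvecs[:, order[:n_components]]
    return mean, std, eigvals[order], U


def project(data, mean, std, U):
    """Standardize with the training statistics, then project onto U."""
    return ((data - mean) / std) @ U
```

With a 4000 × 30 training matrix and `n_components=6`, `U` has shape 30 × 6, and projecting a 3680 × 30 test matrix yields a 3680 × 6 matrix of sensitive features. The test samples are standardized with the training mean and standard deviation so that no information from the test set enters the fit.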

Conclusion
In this paper, a sensitive feature analysis procedure for the cavitation fault of a water pump was proposed. First, the time-domain features were calculated for the raw data and the IMFs to constitute the raw feature vector, and the normal distribution model of the features was studied. Then the PCA method was applied to extract sensitive features. Finally, the validity of the new features for separating cavitation faults of different degrees was verified through analysis of the testing samples, which lays a good foundation for the future analysis and diagnosis of cavitation failure in water pumps.

Figure 1: Procedure of feature extraction

Figure 5: Scatter plots of the principal features

Figure 6: The normal distribution model of principal feature 1

Table 1: Normal distribution parameters of kurtosis factors