New statistics for simultaneous machine incipient fault detection and monotonic degradation assessment

Machine condition monitoring (MCM) has become an important tool for avoiding sudden machine breakdowns and improving economic returns. Early fault detection and monotonic degradation assessment are two important MCM tasks. For incipient fault detection, statistics such as kurtosis and the Gini index are widely used, but they cannot give an accurate incipient fault detection time and often fluctuate considerably. For monotonic degradation assessment, the root-mean-square (RMS) is commonly used; however, it is sensitive to signal energy and cannot show a distinct degradation tendency in an early fault state. These drawbacks have limited the development of practical MCM algorithms. To address these issues, this paper proposes four parameterized statistics for simultaneous early fault detection and monotonic degradation assessment. The four parameterized statistics can serve as health indicators and simplify MCM algorithms, which benefits practical MCM applications.


Introduction
Machine condition monitoring (MCM) [1] has become an important tool for avoiding sudden machine breakdowns and improving economic returns, and it has attracted much attention from academia and industry. A clear and accurate early fault detection time can serve as an alarm for fault diagnosis and indicate a start time for remaining useful life (RUL) prediction [2]. Monotonic degradation assessment can be used to establish a model for RUL prediction. Therefore, early fault detection and monotonic degradation assessment are two major prerequisite tasks for obtaining good MCM results.
Faults of bearings and gears usually exhibit non-Gaussian and non-stationary features [3], i.e., impulsiveness and cyclo-stationarity. Many statistics are used to characterize these two features and to distinguish different health states. Antoni et al. [4] proposed quantifying different bandpass-filtered signals with kurtosis to locate the frequency band in which fault signatures exist, and they further constructed the fast Kurtogram [5] as an optimal frequency band selection tool. The fast Kurtogram was widely used and gradually became a classic fault diagnosis method [6]. Since kurtosis is sensitive to non-Gaussian behavior, Li et al. [2] employed it to determine the first predicting time in a bearing RUL prediction algorithm. However, kurtosis cannot exhibit a monotonic degradation tendency. Since the root-mean-square (RMS) indicates a signal's energy, Li et al. [2] used it to predict bearing RUL. Kong et al. [7] constructed a modified RMS-based model to predict bearing RUL. Ahmad et al. [8] developed a hybrid technique to predict bearing RUL, in which the RMS was also used.
Machine learning-based methods have also received considerable attention for MCM [9,10]. Feature extraction is one of the most important steps in most machine learning-based methods, and basic statistical parameters such as kurtosis, skewness, and entropy usually play a significant role in this step. Guo et al. [9] established a recurrent neural network-based method to predict bearing RUL, in which the entropy, kurtosis, crest factor, and energy ratios of wavelet packets are used. Based on statistical features including RMS, crest factor, and kurtosis, Liao [10] used genetic programming to generate more effective features for predicting bearing RUL.
Sparsity measures such as the Gini index, kurtosis, and smoothness index are also essentially statistical parameters. The best-known application of kurtosis is the spectral kurtosis developed by Antoni et al. [4]. The smoothness index was constructed by Liang and Bozchalooi [11] to determine wavelet parameters for bearing fault diagnosis. Recently, Wang et al. [12] found that kurtosis can be reformulated as the square of the $L_2/L_1$ norm and that the smoothness index is equal to the $L_0/L_1$ norm. In this way, a connection between kurtosis, the smoothness index, and the $L_p/L_q$ norm is established. The Gini index, which originated in economics, was also used by Miao et al. [13] to replace kurtosis in the Kurtogram for fault diagnosis. Antoni et al. [14] studied the negative entropy, originally developed in thermodynamics, and put forward spectral negative entropy as a new tool that aims to characterize the cyclo-stationarity and impulsiveness of fault signatures simultaneously.
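The link between kurtosis, the smoothness index, and norm ratios can be made explicit. The following is a short worked sketch, assuming both statistics are computed on a positive sequence $s(n)$, $n = 1, \dots, N$; the symbol $s$ and the explicit factor $N$ are notational assumptions made for this illustration and may differ from the normalization used in [12].

```latex
\begin{align}
\text{kurtosis:}\quad
\frac{\langle s^{2}\rangle}{\langle s\rangle^{2}}
  &= \frac{\frac{1}{N}\sum_{n=1}^{N} s(n)^{2}}
          {\left(\frac{1}{N}\sum_{n=1}^{N} s(n)\right)^{2}}
   = N\left(\frac{\lVert s\rVert_{2}}{\lVert s\rVert_{1}}\right)^{2},\\[4pt]
\text{smoothness index:}\quad
\frac{\exp\langle \ln s\rangle}{\langle s\rangle}
  &= \lim_{p\to 0^{+}}
     \frac{\left(\frac{1}{N}\sum_{n=1}^{N} s(n)^{p}\right)^{1/p}}
          {\frac{1}{N}\sum_{n=1}^{N} s(n)}.
\end{align}
```

The first line shows that kurtosis is, up to the factor $N$, the square of an $L_2/L_1$ ratio. The second line uses the fact that the geometric mean is the limit of the power mean as $p \to 0^{+}$, which is the sense in which the smoothness index behaves like an $L_0/L_1$ ratio.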
Although many statistics have been developed for MCM applications, few of them can be used for simultaneous early fault detection and monotonic degradation assessment. To improve on this, four parameterized statistics are proposed as health indicators to achieve accurate early fault detection and monotonic degradation assessment. The four proposed statistics are promising for making MCM algorithms simpler and more reliable.
The remainder of this article is structured as follows: Section 2 presents the four proposed statistics, and Section 3 verifies their effectiveness with two public bearing run-to-failure datasets. Conclusions are drawn in Section 4.

New statistics for simultaneous early fault detection and monotonic degradation assessment
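A sampled signal is denoted as $x(n)$, where $n = 1, 2, 3, \dots, N$ and $N$ denotes the total number of sample points. Using the Hilbert transform $H(\cdot)$, the analytic signal is obtained as $\tilde{x}(n) = x(n) + jH(x(n))$, and the square envelope signal is $SE(n) = |\tilde{x}(n)|^2$. Statistics are the basis of many MCM algorithms, and four classic sparsity measures used in the field of MCM, namely kurtosis, negative entropy, smoothness index, and Gini index, are defined on the square envelope. In these definitions, $\langle\cdot\rangle$ denotes the arithmetic mean operation, and the Gini index uses the square envelope samples ordered from the smallest to the largest.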
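As a minimal sketch of the quantities in this section, the Python code below computes a square envelope with an FFT-based Hilbert transform and evaluates kurtosis, negative entropy, smoothness index, and Gini index on it. The formulas are the forms commonly given in the sparsity-measure literature and are assumed, not confirmed, to match the definitions intended here. The function names and the small constant `EPS` are illustrative choices and are not part of the paper.

```python
import numpy as np

EPS = 1e-12  # assumption: small guard against log(0); not part of the paper


def square_envelope(x):
    """Square envelope |x + j*H(x)|^2 using an FFT-based Hilbert transform."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # One-sided multiplier: keep DC (and Nyquist for even n), double the
    # positive frequencies, and zero the negative frequencies.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(spectrum * h)
    return np.abs(analytic) ** 2


def kurtosis(se):
    """<se^2> / <se>^2, where <.> is the arithmetic mean."""
    se = np.asarray(se, dtype=float)
    return np.mean(se ** 2) / np.mean(se) ** 2


def negative_entropy(se):
    """<(se/<se>) * ln(se/<se>)>."""
    se = np.asarray(se, dtype=float)
    r = se / np.mean(se)
    return np.mean(r * np.log(r + EPS))


def smoothness_index(se):
    """Geometric mean divided by arithmetic mean."""
    se = np.asarray(se, dtype=float)
    return np.exp(np.mean(np.log(se + EPS))) / np.mean(se)


def gini_index(se):
    """1 - 2 * sum_k (s_(k) / ||s||_1) * (N - k + 0.5) / N, s sorted ascending."""
    s = np.sort(np.asarray(se, dtype=float))
    n = s.size
    k = np.arange(1, n + 1)
    return 1.0 - 2.0 * np.sum(s / np.sum(s) * (n - k + 0.5) / n)
```

With these forms, kurtosis, negative entropy, and the Gini index increase as the square envelope becomes more impulsive, whereas the smoothness index decreases, which is why a sign flip is convenient when an increasing curve is wanted.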
Recently, Hou et al. [15] found that six classic sparsity measures, namely the Gini index, $L_p/L_q$ norm, pq-mean, negative entropy, smoothness index, and kurtosis, can be redefined as ratios of different quasi-arithmetic means. Based on this finding [15], the ratio of different quasi-arithmetic means (RQAM) was put forward as a framework for constructing new statistics for machine condition monitoring. Because RQAM statistics take the form of a ratio, they are naturally dimensionless and may be insensitive to varying operating conditions. Moreover, RQAM introduces machine condition-related parameters into the definition of health indicators, which may benefit monotonic degradation assessment.
Based on the four sparsity measures and RQAM, Hou et al. [16] further proposed an adaptive weighted signal preprocessing technique (AWSPT) to improve their performance for MCM. Since sparsity measures are intrinsically affected by impulsive noise, a monotonically decreasing function is designed to weight the square envelope signal so that impulsive signal points are adaptively damped. The sparsity measures are then used to quantify the weighted square envelope signal. The weighting function contains a constant that is defined by the variance of white Gaussian noise under the healthy state and a shifted percentage of the probability density curve of the square envelope. Substituting the weighted square envelope into the definitions of the four original sparsity measures yields four new parameterized statistics, named PS1, PS2, PS3, and PS4. Because the constant is parameterized by the variance of white Gaussian noise under the healthy state, and the variance of the vibration signal differs between health states, the four parameterized statistics are promising for showing a monotonic degradation tendency. Therefore, the four statistics are used as health indicators for MCM in Section 3.
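The RQAM idea described above can be illustrated with two of the measures. The sketch below assumes the usual definition of a quasi-arithmetic mean, $f^{-1}(\langle f(\cdot)\rangle)$, and writes kurtosis as the square of a quadratic-to-arithmetic mean ratio and the smoothness index as a geometric-to-arithmetic mean ratio. The helper names are hypothetical, and the code does not implement the weighting function of [16] or the statistics PS1 to PS4.

```python
import numpy as np


def quasi_arithmetic_mean(values, f, f_inv):
    """Quasi-arithmetic mean f_inv(mean(f(values)))."""
    return f_inv(np.mean(f(np.asarray(values, dtype=float))))


def identity(v):
    return v


def kurtosis_as_rqam(se):
    # <se^2> / <se>^2 equals (quadratic mean / arithmetic mean)^2.
    quadratic = quasi_arithmetic_mean(se, np.square, np.sqrt)
    arithmetic = quasi_arithmetic_mean(se, identity, identity)
    return (quadratic / arithmetic) ** 2


def smoothness_index_as_rqam(se):
    # exp(<ln se>) / <se> equals geometric mean / arithmetic mean.
    geometric = quasi_arithmetic_mean(se, np.log, np.exp)
    arithmetic = quasi_arithmetic_mean(se, identity, identity)
    return geometric / arithmetic


se = np.array([1.0, 2.0, 3.0, 4.0])
# Scaling the input by any positive factor leaves both ratios unchanged,
# which is the sense in which ratio-type statistics are dimensionless.
print(kurtosis_as_rqam(se), kurtosis_as_rqam(10.0 * se))
print(smoothness_index_as_rqam(se), smoothness_index_as_rqam(10.0 * se))
```

Both numerator and denominator scale linearly with the input, so a change of amplitude units cancels in the ratio.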

Experimental verification with two bearing run-to-failure datasets
Figure 1. Experimental platform of NASA IMS bearing dataset (labelled components: accelerometers, thermocouples).
Two public bearing run-to-failure datasets, released by the NASA Center for Intelligent Maintenance Systems (NASA IMS) [17] and Xi'an Jiaotong University (XJTU) [18], respectively, are used in this section to demonstrate the effectiveness of the proposed statistics. According to Reference [16], the percentage parameter should be larger than 15%.

Verification with NASA IMS bearing run-to-failure dataset
The experimental platform of the NASA IMS dataset is shown in Figure 1. Every 10 min, a signal of 20480 sample points was collected at a sampling frequency of 20 kHz. During the run-to-failure test, 984 signals were collected and stored in 984 data files. A previous study [19] confirmed that the incipient fault occurred at file number 532. An outer race failure was finally confirmed.
To monitor the incipient fault time and degradation tendency, the four proposed parameterized statistics (PS1, PS2, PS3, and PS4) were employed, and the four sparsity measures (negative entropy, smoothness index, kurtosis, and Gini index) were used for comparison. The first healthy signal was used to calculate the variance parameter that defines the constant. With the percentage set to 15%, the variance was 0.0054 and the constant was 0.0018. To obtain an increasing tendency, the smoothness index and PS2 were multiplied by -1 and named the negative smoothness index and NPS2, respectively.
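The formula that maps the variance and the percentage to the constant is not reproduced in this text. One reading that is consistent with the values reported for this dataset is sketched below; it is an assumption, not the definition from [16]. If the healthy signal is white Gaussian noise with variance $\sigma^2$, its square envelope is exponentially distributed with mean $2\sigma^2$, so the value below which a fraction $p$ of the square envelope falls is $-2\sigma^2 \ln(1-p)$. The function name is hypothetical.

```python
import math


def envelope_quantile_constant(variance, percentage):
    """p-quantile of the square envelope of white Gaussian noise.

    Assumption (not taken from the paper): the square envelope of white
    Gaussian noise with the given variance is exponentially distributed
    with mean 2 * variance.
    """
    if not 0.0 <= percentage < 1.0:
        raise ValueError("percentage must lie in [0, 1)")
    return -2.0 * variance * math.log(1.0 - percentage)


# Values reported for the NASA IMS dataset: variance 0.0054, percentage 15%.
c_ims = envelope_quantile_constant(0.0054, 0.15)
print(round(c_ims, 4))  # 0.0018
```

Under this assumption the constant scales linearly with the healthy-state variance, so it adapts automatically to the noise level of each test rig.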
The monitoring curves of the classic sparsity measures and the four proposed parameterized statistics are given in Figure 2(a) to 2(d) and Figure 2(e) to 2(h), respectively. Figure 2 shows that although both the sparsity measures and the proposed parameterized statistics indicate an incipient fault around file number 532, only the proposed parameterized statistics (i.e., PS1 to PS4) show a monotonic degradation tendency. The four proposed statistics also indicate the incipient fault time around file number 532 more clearly.
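The monotonic tendency above is judged visually from the curves. As a hedged illustration of how such a tendency could be summarized numerically, the sketch below uses a sign-of-differences score that is common in the prognostics literature. This metric is an assumption added for illustration; it is not used in this paper, and the function name is hypothetical.

```python
import numpy as np


def monotonicity(indicator):
    """Sign-of-differences score in [0, 1]; 1 means strictly monotonic."""
    d = np.diff(np.asarray(indicator, dtype=float))
    if d.size == 0:
        raise ValueError("need at least two samples")
    # Absolute difference between the counts of increasing and decreasing
    # steps, normalized by the number of steps.
    return abs(int(np.sum(d > 0)) - int(np.sum(d < 0))) / d.size


print(monotonicity([1.0, 2.0, 3.0, 4.0]))       # strictly increasing curve
print(monotonicity([1.0, 2.0, 1.0, 2.0, 1.0]))  # oscillating curve
```

A health indicator sampled once per data file could be passed directly to this function; smoothing beforehand would reduce the influence of file-to-file noise on the score.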

Verification with XJTU bearing run-to-failure dataset
The bearing run-to-failure dataset released by XJTU [18] was also used to further validate the effectiveness of the new parameterized statistics. The experimental platform of the XJTU bearing run-to-failure dataset is shown in Figure 3. Signals were acquired every 1 min at a sampling frequency of 25.6 kHz, and each sample had a length of 1.28 s. A total of 491 samples were collected during the test and are denoted as the bearing 2-1 sub-dataset.
Figure 3. An experimental platform of the XJTU bearing run-to-failure dataset.
The four sparsity measures and the four proposed parameterized statistics were used to monitor the bearing degradation process. With the variance of the first healthy signal equal to 0.1936 and the percentage set to 30%, the constant is 0.0629. The monitoring curves are given in Figure 4(a) to 4(h). Figure 4 shows that although both the four classic sparsity measures and the proposed parameterized statistics (i.e., PS1 to PS4) detect an early fault around file number 450, the sparsity measures cannot show a monotonic degradation tendency. Only the four parameterized statistics simultaneously show a clear early fault time and a monotonic degradation trend.

Conclusions
To simplify MCM algorithms and make them more reliable for practical applications, this paper proposed four new parameterized statistics (i.e., PS1 to PS4) as health indicators that simultaneously achieve accurate early fault detection and monotonic degradation assessment. Experiments with two bearing run-to-failure datasets have verified their effectiveness in simultaneously detecting incipient faults and monitoring the degradation trend. As a result, the four parameterized statistics are promising for making MCM algorithms simpler and more reliable.
There are two reasons why the four proposed parameterized statistics can simultaneously show the early fault time and a monotonic degradation tendency. First, because the parameterized statistics take the form of a ratio, they are sensitive to non-Gaussian behaviour and can detect the incipient fault time. Second, because the four statistics are parameterized by the variance of a vibration signal collected under the healthy state, and the variance of the signals changes continually during degradation, the four parameterized statistics can achieve monotonic degradation assessment. Since the two run-to-failure experiments have verified these properties empirically, future work will focus on a rigorous mathematical proof of them.