Topic Correlation Analysis for Bearing Fault Diagnosis Under Variable Operating Conditions

This paper presents a Topic Correlation Analysis (TCA) based approach for bearing fault diagnosis. In TCA, a Joint Mixture Model (JMM), which adapts Probabilistic Latent Semantic Analysis (PLSA), is constructed first. The JMM then models the shared and domain-specific topics using a “fault vocabulary”. Next, the correlations between the two kinds of topics are computed and used to build a mapping matrix. A new shared space, spanned by the shared topics and the mapped domain-specific topics, is then set up, in which the distribution gap between different domains is reduced. Finally, a classifier is trained on mapped features from the source domain, which follows a different distribution from the target domain, and the trained classifier is tested on the target bearing data. Experimental results show that the proposed approach outperforms state-of-the-art baselines and can diagnose bearing faults efficiently and effectively under variable operating conditions.


Introduction
Rolling bearings, among the most important components in rotary machines, are widely used in modern manufacturing systems. However, bearing faults can have severe consequences such as performance deterioration, costly downtime, production delays or even personnel casualties [1]. Detecting bearing faults as early as possible therefore not only ensures the reliability, safety and proper operation of mechanical equipment, but also avoids unnecessary losses. As fault diagnosis for rolling bearings has developed, various methods [2][3][4] have been employed to measure physical quantities, including temperature monitoring, oil analysis and vibration signal analysis, which serve as a crucial preprocessing step for fault diagnosis. Among them, vibration signal analysis has proven to be common, practical and effective. In many existing studies [5][6], bearing fault diagnosis is treated as a pattern recognition problem solved by machine learning algorithms using features extracted from vibration data. However, in many applications bearings operate under complex working conditions [7], such as high speed, variable load and high temperature, which makes effective fault diagnosis challenging. Most current studies of bearing fault diagnosis focus on a specific working condition (e.g., in the laboratory or under controlled conditions), where traditional machine learning methods (e.g., back-propagation neural networks, transductive support vector machines, K-means) have proven effective using labeled data, unlabeled data or both [8][9][10]. However, all of these methods require that the training and testing data be drawn from the same distribution (e.g., the vibration data for training and testing are measured in the same set of experiments) [11].
In many real-world applications, bearings work under variable operating conditions, with varying speeds, changing loads and unknown interference, so there is no guarantee that the training and testing data are drawn from the same distribution. When the distribution changes, the usual remedy is to rebuild the diagnostic model using sufficient newly labeled data. However, obtaining vibration data from the target bearing to train a precise classifier can be time-consuming or expensive, for example when the bearing is installed inside parts that cannot be dismantled. As a result, traditional machine learning methods tend to perform poorly for bearing fault diagnosis under complex working conditions. In contrast, transfer learning, a machine learning approach that recognizes and applies knowledge and skills learned in previous tasks to new tasks, has attracted increasing attention since 1995 [12][13]. It is well suited to fault diagnosis in complex situations where the training and test data are drawn from different distributions.
Inspired by this, the paper proposes a novel bearing fault diagnosis strategy based on Topic Correlation Analysis (TCA). In TCA, a Joint Mixture Model (JMM), which adapts Probabilistic Latent Semantic Analysis (PLSA), is constructed first. Because JMM was originally designed for text classification, the following modifications are made to apply it to fault diagnosis: (1) a time series of vibration data is viewed as a document consisting of a number of "fault words", and the energy distribution, load, speed and other parameters related to the vibration data are taken as the topics embedded in the document; (2) frequency-domain and time-domain features are extracted from the vibration data; (3) K-means is applied to the extracted features to generate a set of "fault words". The JMM then models the shared and domain-specific topics using the "fault words". Next, the correlations between the two kinds of topics are computed and used to construct a mapping matrix. A new shared space, spanned by the shared topics and the mapped domain-specific topics, is then built, in which the distribution gap between different domains is reduced. Finally, a classifier is trained on mapped features from the source domain, which follows a different distribution from the target domain, and the trained classifier is tested on the target bearing.
The rest of the paper is organized as follows. Section 2 introduces the principle of Topic Correlation Analysis (TCA) and its modification for bearing fault diagnosis. Section 3 presents an experimental study using a bearing data set, followed by the conclusions in Section 4.

TCA-based transfer learning approach for bearing fault diagnosis
In practice, classification algorithms based on machine learning play an important role in fault diagnosis. A basic assumption of traditional machine learning, however, is that the training and testing data follow the same distribution. For bearing fault diagnosis under variable conditions, the distributions of the training and testing data do not always satisfy this assumption, and transfer learning becomes a natural tool.
Here an improved algorithm, named enhanced TCA, which reduces the distribution difference between the two data sets, is put forward for fault diagnosis. TCA is based on the model of Probabilistic Latent Semantic Analysis (PLSA) [14], so the principle of PLSA is introduced first.

A Brief Review of PLSA
PLSA, which originates from latent semantic analysis (LSA), has been widely used for document clustering and related tasks. Consider a set of documents $D=\{d_1,d_2,\dots,d_N\}$, words $W=\{w_1,w_2,\dots,w_M\}$ and latent topic variables $Z=\{z_1,z_2,\dots,z_K\}$. A generative model for word-document co-occurrences, as shown in Figure 1, can be defined by the following scheme:
$$P(d_i, w_j) = P(d_i)\,P(w_j \mid d_i), \qquad P(w_j \mid d_i) = \sum_{k=1}^{K} P(w_j \mid z_k)\,P(z_k \mid d_i),$$
where $P(d_i)$ denotes the probability that a word occurrence will be observed in a particular document $d_i$, $P(w_j \mid z_k)$ denotes the class-conditional probability of a specific word conditioned on the latent topic variable $z_k$, and $P(z_k \mid d_i)$ represents a document-specific probability distribution over the latent variable space.
Then the Expectation-Maximization (EM) algorithm, which is suited to estimating the parameters of latent variable models [15], is applied. The aim of EM is to find suitable $P(z_k \mid d_i)$ and $P(w_j \mid z_k)$ by maximizing the log-likelihood objective function
$$\mathcal{L} = \sum_{i=1}^{N}\sum_{j=1}^{M} n(d_i, w_j)\,\log P(d_i, w_j),$$
where $n(d_i, w_j)$ denotes the number of times that word $w_j$ occurs in document $d_i$. The PLSA model was originally designed to address the semantic problems caused by synonymy and polysemy in document clustering. In fault diagnosis, fault features play the role of words in document categorization and show the same two characteristics. Features extracted from the same fault type under different working conditions share a number of commonalities because the failures have the same root cause, which is analogous to synonymy in text. Because of these commonalities, the extracted features not only represent their own fault category but also contribute to clustering the same fault type under another working condition; in this sense, the fault features can also be regarded as polysemous.
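The EM procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration of standard PLSA under stated assumptions, not the authors' implementation: the function names `plsa_em` and `log_likelihood`, the random initialisation and the fixed iteration count are all choices made here for illustration. The constant $P(d_i)$ term is dropped from the likelihood because it does not depend on the two parameters being estimated.

```python
import numpy as np

def plsa_em(counts, n_topics, n_iter=100, seed=0):
    """Fit PLSA by EM on a co-occurrence matrix n(d_i, w_j) of shape (N, M).

    Returns P(z|d) with shape (N, K) and P(w|z) with shape (K, M).
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random initialisation; each row is normalised to a probability distribution.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: posterior P(z_k | d_i, w_j), array of shape (N, M, K).
        joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]
        post = joint / (joint.sum(axis=2, keepdims=True) + eps)

        # M-step: re-estimate both factors from the expected counts.
        expected = counts[:, :, None] * post
        p_w_z = expected.sum(axis=0).T
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=1)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

def log_likelihood(counts, p_z_d, p_w_z):
    """Sum of n(d_i, w_j) * log P(w_j | d_i), omitting the constant P(d_i) term."""
    p_w_d = p_z_d @ p_w_z
    return float(np.sum(np.asarray(counts, dtype=float) * np.log(p_w_d + 1e-12)))
```

Because each EM iteration cannot decrease the likelihood, comparing `log_likelihood` after a few and after many iterations is a simple sanity check on an implementation like this one.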
In document clustering, documents are represented by word-frequency vectors, and this 'bag-of-words' representation is used to discover the latent topics of the documents. Thus, in fault diagnosis, if one fault type is viewed as a document consisting of a number of "fault words", and the energy distribution, load, speed and other parameters related to the vibration data are taken as the latent topics embedded in the document, then the probability distribution of the latent topics can be found with PLSA. Specifically, given a set of vibration signals $D=\{d_1,d_2,\dots,d_N\}$, the modifications are as follows: (1) time-domain and frequency-domain features are extracted from the vibration signals, and K-means is then used to generate the $M$ fault words; (2) the similarity between each fault word and the features extracted from each vibration signal is computed to obtain the co-occurrence matrix $n(d_i, w_j)$, where $n(d_i, w_j)$ now denotes the similarity between fault word $w_j$ and vibration signal $d_i$.
The remaining steps are the same as in traditional PLSA for document clustering.
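The two modifications above can be sketched as follows. This is only an illustration under assumptions: the paper does not specify the similarity measure, so a Gaussian kernel on Euclidean distance is assumed here, and the helper names `kmeans` and `cooccurrence` are hypothetical. A plain Lloyd's K-means is written out so the example has no dependency beyond NumPy.

```python
import numpy as np

def kmeans(features, n_words, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns the centres, shape (n_words, n_features).

    Each centre plays the role of one "fault word".
    """
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centres with distinct, randomly chosen samples.
    centres = features[rng.choice(len(features), size=n_words, replace=False)].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_words):
            members = features[labels == k]
            if len(members) > 0:  # keep the old centre if a cluster empties
                centres[k] = members.mean(axis=0)
    return centres

def cooccurrence(features, centres):
    """Similarity n(d_i, w_j) between each signal's features and each fault word.

    Assumption: Gaussian kernel on Euclidean distance, scaled by the mean
    distance; the source text does not state which similarity is used.
    """
    features = np.asarray(features, dtype=float)
    dists = np.linalg.norm(features[:, None, :] - centres[None, :, :], axis=2)
    scale = dists.mean() + 1e-12
    return np.exp(-(dists / scale) ** 2)
```

The resulting matrix has one row per vibration signal and one column per fault word, so it can take the place of the word-count matrix in PLSA. In practice the features would probably need to be standardised first, since time-domain and frequency-domain quantities live on very different scales.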
The main idea of TCA can be summarized in three parts:

Part I. Modelling both Shared and Domain-Specific Topics
In this part, the Joint Mixture Model (JMM), which adapts PLSA, is used to extract both shared and domain-specific topics. In JMM, as is shown in Figure 2  (3) The joint probability model for the data generative process is described as: The equation (5) related matrix decomposition is illustrated in Figure 2 (

Overview of Bearing Fault Diagnosis Based on Transfer Learning
The enhanced TCA algorithm maps feature data collected under different working conditions into a new shared space, where the difference between the data is reduced. The bearing fault diagnosis problem under variable conditions can therefore be addressed with this transfer learning technique. The complete process of the proposed method is summarized in Algorithm 1.

Experimental setup
To verify the effectiveness of the presented approach in bearing fault diagnosis, the bearing data set provided by the Case Western Reserve University Bearing Data Center was analyzed. As shown in Figure 3, the experimental system consists of a 2 hp motor (left), a torque transducer/encoder (center), a dynamometer (right), and control electronics (not shown). The drive end bearing is an SKF6205, and the fan end bearing is an SKF6203.
Vibration signals were measured from the test system under variable speed and load conditions using a 16-channel DAT recorder at a sampling frequency of 12 kHz (12,000 samples per second). Speed and horsepower data were collected using the torque transducer/encoder and were recorded by hand. The different bearing working conditions are listed in Table 2.

Experimental results and analysis
First, 8 time-domain features and 8 frequency-domain features were extracted, giving a total of 16 features. In this study, 200 data samples with 16 features each were obtained from each fault type under each working condition; thus, the original target and source sample feature matrices have size 1600 × 16. It should be noted that the training and testing samples belong to different working conditions.
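To make the 16-dimensional feature vector concrete, the sketch below computes 8 time-domain and 8 frequency-domain statistics from one signal segment. The particular statistics are an assumption: the text does not list which 16 features were used, so commonly used ones are shown here, and the function names are hypothetical. The default sampling rate of 12,000 Hz is the one reported in the experimental setup.

```python
import numpy as np

def time_features(x):
    """Eight common time-domain statistics (an assumed set, not the paper's list)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std()
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    centred = x - mean
    skew = np.mean(centred ** 3) / (std ** 3 + 1e-12)
    kurt = np.mean(centred ** 4) / (std ** 4 + 1e-12)
    crest = peak / (rms + 1e-12)                 # crest factor
    shape = rms / (np.mean(np.abs(x)) + 1e-12)   # shape factor
    return np.array([mean, std, rms, peak, skew, kurt, crest, shape])

def freq_features(x, fs):
    """Eight common spectral statistics (also an assumed set)."""
    x = np.asarray(x, dtype=float)
    amp = np.abs(np.fft.rfft(x - x.mean()))       # one-sided amplitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)   # bin frequencies in Hz
    total = amp.sum() + 1e-12
    centroid = np.sum(freqs * amp) / total
    spread = np.sqrt(np.sum((freqs - centroid) ** 2 * amp) / total)
    rms_freq = np.sqrt(np.sum(freqs ** 2 * amp) / total)
    peak_freq = freqs[np.argmax(amp)]
    amp_mean = amp.mean()
    amp_std = amp.std()
    amp_skew = np.mean((amp - amp_mean) ** 3) / (amp_std ** 3 + 1e-12)
    amp_kurt = np.mean((amp - amp_mean) ** 4) / (amp_std ** 4 + 1e-12)
    return np.array([amp_mean, amp_std, amp_skew, amp_kurt,
                     centroid, spread, rms_freq, peak_freq])

def extract_features(x, fs=12_000):
    """Concatenate both groups into one 16-dimensional feature vector."""
    return np.concatenate([time_features(x), freq_features(x, fs)])
```

Applying `extract_features` to each segment and stacking the results row by row would give a feature matrix with 16 columns, matching the layout described above.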
To test the effectiveness of enhanced TCA for bearing fault diagnosis, enhanced TCA is compared with TCA_share, which uses only the shared topics to represent fault samples for classifier training. Enhanced TCA is also compared with principal component analysis (PCA), kernel principal component analysis (KPCA), factor analysis (FA) and locally linear embedding (LLE) for feature selection. In each case, the mapped source data set is fed into an SVM classifier for training, and the target data are then used to test the trained model.
Two cases are studied in this paper: transfer among drive end bearings, and transfer between a fan end bearing and a drive end bearing. The classification results are shown in Tables 3 and 4. In the notation used there, for example D1797R1-D1797R2, the left part denotes the source data (used for training only) and the right part denotes the target data (used for testing); 1797 is the speed; and R1, R2, R3 and R4 stand for fault diameters of 0.1778 mm, 0.3556 mm, 0.5334 mm and 0.7112 mm respectively. The symbols F and D denote the fan end bearing and the drive end bearing respectively. The following conclusions can be drawn from Tables 3 and 4:
1) Feeding the original features into the SVM without any feature selection technique gives the lowest performance.
2) The classification accuracy of the traditional machine learning algorithms (PCA, KPCA, LLE and FA) is only 78.88% on average, which means they are not suitable for training a precise diagnostic model for bearings under different conditions.
3) Compared with the machine learning methods, the classification accuracy of the transfer learning methods (TCA_share and enhanced TCA) is improved by about 8.31% and 14.47% respectively, and the best classification accuracy of enhanced TCA in both cases is 96.25%. Thus, enhanced TCA has the best classification performance.
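The task labels used in Tables 3 and 4 follow a fixed pattern, so they can be decoded mechanically. The sketch below is a small hypothetical helper (the names `Domain`, `parse_domain` and `parse_task` are not from the paper) that splits a label such as D1797R1-D1797R2 into its source and target descriptions, using the fault diameters given in the text.

```python
import re
from dataclasses import dataclass

# Fault diameters in millimetres, as given in the text.
FAULT_DIAMETER_MM = {"R1": 0.1778, "R2": 0.3556, "R3": 0.5334, "R4": 0.7112}
POSITION = {"D": "drive end", "F": "fan end"}

@dataclass(frozen=True)
class Domain:
    position: str             # "drive end" or "fan end"
    speed: int                # speed value as written in the label
    fault_diameter_mm: float

_PATTERN = re.compile(r"([DF])(\d+)(R[1-4])")

def parse_domain(code):
    """Decode one side of a task label, e.g. 'D1797R1'."""
    match = _PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"unrecognised domain code: {code!r}")
    pos, speed, size = match.groups()
    return Domain(POSITION[pos], int(speed), FAULT_DIAMETER_MM[size])

def parse_task(label):
    """Split 'SOURCE-TARGET' into (source, target) Domain objects."""
    source, target = label.split("-")
    return parse_domain(source), parse_domain(target)
```

For instance, `parse_task("F1797R1-D1797R1")` would describe a transfer from a fan end bearing to a drive end bearing at the same speed and fault diameter, which corresponds to the second case studied.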

Conclusion
A fault diagnosis method based on the enhanced TCA strategy is proposed for bearing fault diagnosis under different working conditions. Vibration data are viewed as a document consisting of a number of "fault words", which are generated with the K-means method. The enhanced TCA outperforms both the traditional machine learning algorithms and TCA_share.