Hydraulic fault prediction of integrated transmission system based on LSTM

Integrated drive-line fluids transmit torque, provide control pressure, and lubricate and clean the integrated drive-train. An increase in abrasive particles in the oil accelerates the wear of rotating parts, while insufficient oil leads to rapid ablation and gluing of rotating parts. According to statistics, more than 75% of hydraulic equipment failures are caused by oil pollution. Oil pollution leads to filter blockage, which can cause a fatal failure that disables the entire transmission system. Fault prediction and health management of the hydraulic system are therefore extremely important. This paper studies the oil fault of the integrated transmission system and proposes a data preprocessing method that uses an information fusion algorithm and a mean smoothing algorithm to address the difficulty of extracting a deterioration index from the multichannel, strongly noisy signals of the hydraulic system. An abnormal data judgment criterion based on a sliding window identifies the starting point of early faults in the hydraulic system, which provides a basis for starting fault prediction and for dividing normal and degraded states. A multi-channel fusion fault prediction algorithm for hydraulic systems based on a Long Short-Term Memory (LSTM) deep network is proposed, which realizes long-term and high-precision fault prediction of hydraulic systems. Finally, the oil failure prediction algorithm of the integrated transmission system is verified using real vehicle fault data.


Introduction
In the integrated transmission hydraulic system, both the coupling among the pressure parameters and the failure degradation mechanism are complex. It is difficult to establish a failure model, so the failure cannot be predicted with a model-based method [1][2][3][4]. A data-driven approach mines equipment degradation information directly from historical data to predict deterioration trends and remaining service life. This approach makes full use of the data: it does not require analysis of the complex mechanical model and transmission relationships, and it reduces the dependence on knowledge of the equipment degradation mechanism. We therefore use a data-driven fault prediction method to realize fault prediction of the integrated transmission hydraulic system [5][6].
Data-driven online fault prediction generally predicts only a single channel, whereas an integrated transmission hydraulic system has many hydraulic parameters, and fusing multiple hydraulic parameters yields more accurate fault prediction results. The Long Short-Term Memory (LSTM) deep network is used for long-sequence natural language and text processing and has strong long-sequence data processing capabilities. When the network structure is reasonably designed, its processing power increases with the amount of training data. The LSTM deep network fault prediction algorithm has been successfully used for multi-parameter fault prediction on the NASA aero-engine simulation fault dataset. In this paper, the network is used for fault prediction of the integrated transmission hydraulic system. High-precision fault prediction of hydraulic systems can be achieved by setting a reasonable network depth, number of neurons per network layer, and training duration.
In summary, because a failure model cannot be constructed for the integrated transmission hydraulic system, it is necessary to build a learning model from historical data when offline training is possible. In this model, the multi-channel parameters of the hydraulic system are fused to train a powerful fault prediction model that realizes failure prediction of the integrated transmission hydraulic system.

Data preprocessing
Elimination of the outlier.
Outliers of this kind are caused by the acquisition system rather than by a failure of the transmission system. Stable degradation of the oil pressure is a slow process, so sudden jumps in the hydraulic data should be eliminated as outliers through iterative comparison of the data at the current moment with the data collected at the previous moment, with a scaling ratio applied to the previous value. When the fluctuation exceeds the set range, the data point is considered a system abnormal point, and the value at the current moment is replaced with the sampling point of the previous moment. Here $\lambda$ is the scaling factor, with $\lambda = 0.8$.
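The outlier replacement rule can be illustrated with a minimal Python sketch. The exact comparison used by the authors is not recoverable from the text, so the test below (a point is rejected when it differs from the previous accepted value by more than `scale` times that value's magnitude) is an assumption, as are the function name `remove_outliers` and the sample values.

```python
def remove_outliers(samples, scale=0.8):
    """Replace acquisition outliers with the previous accepted value.

    Assumption (not confirmed by the source): a point is an outlier when
    it differs from the previous accepted value by more than `scale`
    times the magnitude of that value.
    """
    if not samples:
        return []
    cleaned = [samples[0]]
    for value in samples[1:]:
        previous = cleaned[-1]
        if abs(value - previous) > scale * abs(previous):
            cleaned.append(previous)  # hold the last accepted sample
        else:
            cleaned.append(value)
    return cleaned


# Illustrative oil pressure samples with two acquisition glitches.
pressures = [1.00, 1.01, 0.99, 5.00, 1.00, 0.98, 0.00, 0.97]
print(remove_outliers(pressures))
```

Because each point is compared with the last accepted value rather than the last raw value, a run of consecutive glitches is held at the last good sample. The sketch assumes pressures stay away from zero, since the allowed range shrinks with the previous value.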

Sliding window smoothing.
Because the median effectively suppresses abrupt jumps in the data, the data are smoothed by taking the median of the data points in a sliding window of length slidewidth:

$$\tilde{x} = \mathrm{median}(x) \tag{2}$$

where $\mathrm{median}(\cdot)$ takes the median of the data series. We take slidewidth = 100, so one median point is taken for every 100 sampling points.
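As a minimal sketch of this step, the Python function below returns one median per block of `slidewidth` samples. It assumes non-overlapping windows and drops a trailing partial window; neither detail is stated in the text, and the name `window_median` is hypothetical.

```python
from statistics import median


def window_median(samples, slidewidth=100):
    """Return one median per consecutive block of `slidewidth` samples.

    Assumptions: windows do not overlap, and a final block shorter than
    `slidewidth` is discarded.
    """
    return [
        median(samples[start:start + slidewidth])
        for start in range(0, len(samples) - slidewidth + 1, slidewidth)
    ]


# A single spike inside a window does not affect that window's median.
signal = [1.0] * 99 + [50.0] + [2.0] * 100
print(window_median(signal))
```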

Mean smoothing.
We then smooth the data using the mean, where $\mathrm{mean}(\cdot)$ denotes averaging over all samples from the start of acquisition to the current moment.
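Read literally, this is a cumulative (running) mean. A small Python sketch under that reading follows; the function name `running_mean` is illustrative.

```python
def running_mean(samples):
    """Mean of all samples from the start of acquisition up to each moment."""
    means = []
    total = 0.0
    for count, value in enumerate(samples, start=1):
        total += value
        means.append(total / count)
    return means


print(running_mean([2, 4, 6, 8]))  # [2.0, 3.0, 4.0, 5.0]
```

A cumulative mean weights old and new samples equally, so it responds slowly to late changes. That suits a slowly varying deterioration index but would blur fast transients.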

Hydraulic signal deterioration index normalization
To improve the versatility of model processing, the data of each channel are normalized so that the distribution range of every channel is consistent. During training of the predictive model, each channel is then equally important to the model parameters. Data normalization can improve the feature mining ability, prediction ability, and generalization ability of the model. We denote the acquired signal of channel $i$ as $x_i(0), x_i(1), \ldots, x_i(t), \ldots, x_i(N)$, where $N$ is the number of samples acquired for each channel up to the current time. Because the numerical range varies greatly from channel to channel, each channel is normalized separately, and all channels are normalized to the same range, as shown below:

$$\hat{x}_i(t) = \frac{x_i(t) - \mu_i}{\sigma_i}$$

where $\mu_i$ is the mean and $\sigma_i$ is the standard deviation of the $i$-th channel.
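A per-channel normalization of this kind can be sketched in Python as follows. The sketch assumes standard z-score scaling (subtract the channel mean, divide by the channel's population standard deviation); the channel names and values are invented for illustration.

```python
from statistics import fmean, pstdev


def normalize_channels(channels):
    """Z-score each channel independently.

    `channels` maps a channel name to its list of samples. A constant
    channel (zero standard deviation) is mapped to all zeros.
    """
    normalized = {}
    for name, samples in channels.items():
        mu = fmean(samples)
        sigma = pstdev(samples)
        if sigma == 0:
            normalized[name] = [0.0 for _ in samples]
        else:
            normalized[name] = [(x - mu) / sigma for x in samples]
    return normalized


# Hypothetical channels with very different numerical ranges.
raw = {"p_main": [1.0, 2.0, 3.0], "p_lube": [100.0, 200.0, 300.0]}
print(normalize_channels(raw))
```

After scaling, both channels have zero mean and unit spread, so neither dominates the loss purely because of its units.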

Early fault detection
Correctly determining the starting point of a fault helps to discover early faults, segment the training samples reasonably, and trigger the predictive model. The $3\sigma$ principle is often used to monitor anomalies, and here a sliding-window $3\sigma$ principle is used to detect abnormal oil pressure data. When the deterioration index at a given moment deviates from the mean of the previous $L$ feature values by more than three times their standard deviation $\sigma$, the oil pressure parameter is considered abnormal.

Fault prediction algorithm of LSTM hydraulic system multi-channel fusion

LSTM algorithm architecture.
Compared with an RNN, LSTM adds the control logic of a forget gate, a memory (input) gate, and an output gate. The LSTM RNN replaces the hidden layer of a traditional RNN with memory units, in which a node represents the state through a correction coefficient [7][8][9][10].
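(1) Forget gate. The first step of LSTM is to decide which information to keep and which to discard. This is implemented by a sigmoid function called the "forget gate". Its inputs are $x_t$ and $h_{t-1}$, and its output lies between 0 and 1, where 0 means that all information is discarded and 1 means that all information is retained. The forget gate can be expressed as:

$$f_t = \mathrm{sigmoid}(W_f [h_{t-1}, x_t] + b_f)$$

where $W_f$ and $b_f$ are the weight matrix and bias of the forget gate, and $[h_{t-1}, x_t]$ is the concatenation of the two inputs.

(2) Input gate and input node. The next step is to choose which new information to introduce and store in the internal state. This step consists of two parts: first, a sigmoid function acts as the "input gate" to decide which information is updated; second, a tanh function acts as the input node and generates a vector of new candidate values for updating the state. The input gate and input node can be expressed as:

$$i_t = \mathrm{sigmoid}(W_i [h_{t-1}, x_t] + b_i), \qquad \tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)$$

The internal state is then updated as $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$, where $\odot$ denotes element-wise multiplication.

(3) Output gate. Finally, an "output gate" is formed by a sigmoid function and determines the output information. The internal state is limited to the range of -1 to 1 by the tanh function, and the output is obtained by multiplying it by the output gate, as shown below:

$$o_t = \mathrm{sigmoid}(W_o [h_{t-1}, x_t] + b_o), \qquad h_t = o_t \odot \tanh(c_t)$$

We take the mean squared error between the predicted value and the true value as the loss function, that is:

$$\mathrm{Loss} = \frac{1}{N} \sum_{n=1}^{N} (\hat{y}_n - y_n)^2$$

where $\hat{y}_n$ and $y_n$ are the predicted and true remaining life of the $n$-th sample and $N$ is the number of samples.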
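To make the three gates concrete, the following pure-Python sketch performs one time step of a standard LSTM cell. It follows the textbook formulation rather than any implementation detail given by the authors; the parameter layout (`params` mapping a gate name to a weight matrix and bias) and all names are assumptions chosen for readability.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Compute W @ vector + b for a matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; `params` maps gate name -> (W, b)."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(*params["forget"], z)]   # forget gate
    i = [sigmoid(a) for a in affine(*params["input"], z)]    # input gate
    g = [math.tanh(a) for a in affine(*params["node"], z)]   # input node
    o = [sigmoid(a) for a in affine(*params["output"], z)]   # output gate
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t


# Hidden size 1, input size 1, all weights zero: every gate outputs 0.5.
zero = ([[0.0, 0.0]], [0.0])
params = {"forget": zero, "input": zero, "node": zero, "output": zero}
h, c = lstm_step([1.0], [0.0], [1.0], params)
print(h, c)
```

With all-zero weights the forget gate keeps exactly half of the previous state, which makes the role of each gate easy to check by hand.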

Experiments
This fault occurred in a real integrated transmission system. During vehicle driving, the oil pressures before and after the control filter became imbalanced, which caused the hydraulic system to fail. The raw data of the hydraulic system failure are shown in Figure 1. The data sampling period is 50 ms, the number of sampling points is about $2.8 \times 10^5$, and the length of the collected data is 3.85 h.

Figure 1. Integrated transmission system and raw data of hydraulic system failures.

In this paper, the deterioration degree index is first extracted, and the fault starting point is determined by the $3\sigma$ principle. Finally, the remaining life of the multi-channel hydraulic system is predicted using LSTM.
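The fault starting point detection mentioned above can be sketched as a sliding-window three-sigma test. The Python function below is one plausible reading: it flags the first point that lies more than `k` standard deviations from the mean of the preceding `window` values. The window length is not given in the text, so the default of 50 is a placeholder, and the function name is hypothetical.

```python
from statistics import fmean, pstdev


def first_fault_index(index_values, window=50, k=3.0):
    """Return the first position whose value lies more than `k` standard
    deviations from the mean of the preceding `window` values, or None.

    `window` is an assumed placeholder; the source does not state it.
    """
    for t in range(window, len(index_values)):
        history = index_values[t - window:t]
        mu = fmean(history)
        sigma = pstdev(history)
        if sigma > 0 and abs(index_values[t] - mu) > k * sigma:
            return t
    return None


# Stationary noise followed by a jump in the deterioration index.
series = [0.0, 0.1] * 10 + [2.0]
print(first_fault_index(series, window=10))
```

In practice a single exceedance may still be noise, so a deployed detector would likely require several consecutive exceedances before declaring a fault; that refinement is omitted here.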

Construction of dataset
Using a reasonable time interval reduces the amount of calculation and improves the prediction accuracy. To thin out the data sampled at 50 ms, we take one point every 600 points, so the time interval between two consecutive points is 30 s. After data preprocessing and fault point identification with the $3\sigma$ principle, the deterioration degree index is shown in Figure 2, where the red dotted line is the fault starting point determined by the $3\sigma$ principle. Once the fault appears, the remaining useful life (RUL) begins to degrade linearly. The resulting remaining life values are shown in Figure 3. We input $4 \times 10$ oil pressure parameters to the LSTM RNN, which predicts the remaining life value corresponding to the current oil pressure. The prediction results of the training set are shown in Figure 5; the predicted values are in good agreement with the real values. The trained model is then applied to the test set, and the predicted and real values are shown in Figure 6. The test set also shows a good prediction effect. From normal data to the beginning of degradation, the LSTM RNN fault prediction model gives an accurate remaining life value, and its prediction ability is not limited by the prediction time. The remaining life can be obtained from the degradation starting point, which is 104 minutes before the final failure. The LSTM RNN fault prediction model can dig deep into the degradation information contained in the oil pressure data, which is of great significance for multi-channel remaining life prediction of the integrated transmission hydraulic system.
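The dataset construction described above can be sketched in three small Python helpers. Several details are assumptions rather than facts from the paper: the RUL label is held constant before the fault starting point and then falls linearly to zero at the last sample, and the $4 \times 10$ input is read as 4 channels by 10 consecutive time steps. All function names are illustrative.

```python
def downsample(samples, step=600):
    """Keep one point every `step` points (600 x 50 ms = 30 s)."""
    return samples[::step]


def rul_labels(n_points, fault_start, interval_s=30.0):
    """Piecewise-linear RUL in seconds.

    Assumption: RUL is constant before `fault_start`, then decreases
    linearly to zero at the last point, taken as the failure time.
    """
    failure = n_points - 1
    max_rul = (failure - fault_start) * interval_s
    return [min(max_rul, (failure - t) * interval_s) for t in range(n_points)]


def make_windows(channels, labels, length=10):
    """Pair each block of `length` consecutive time steps (all channels)
    with the RUL label of the block's final time step."""
    inputs, targets = [], []
    for end in range(length, len(labels) + 1):
        inputs.append([channel[end - length:end] for channel in channels])
        targets.append(labels[end - 1])
    return inputs, targets


labels = rul_labels(n_points=6, fault_start=2)
print(labels)  # [90.0, 90.0, 90.0, 60.0, 30.0, 0.0]
```

At a 30 s interval, the 104 minutes between the fault starting point and the final failure correspond to 208 downsampled steps.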
SVR and gray prediction models are used to predict the oil pressure change trend of a single channel. The prediction accuracy of the SVR and gray prediction models is significantly lower than that of LSTM, as shown in Figure 7. These two methods can only learn and predict for a single channel, which limits their ability to model the failure process.

Conclusion
In this paper, a multi-channel LSTM prediction model is established, which realizes offline training and online fault prediction for the hydraulic system of the integrated transmission system. The prediction model is verified and analyzed using actual fault data collected during real-vehicle driving. LSTM achieves good prediction accuracy, benefiting from its deep learning architecture. The prediction ability of the LSTM model improves as the amount of sample data increases. With sufficient data, the LSTM trained offline can achieve a longer prediction horizon and better prediction accuracy. The hydraulic fault prediction method of the integrated transmission system can better support maintenance decisions.
With the mean squared error as the objective function of the entire network architecture, the entire network is trained by backpropagation. The training process adopts the gradient descent optimization algorithm for backpropagation training and stops when the training results converge or the maximum number of training rounds is reached.
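The loss and the two stopping conditions can be illustrated without a deep learning framework. The Python sketch below fits a one-variable linear model by gradient descent on the mean squared error; the linear model is only a stand-in for the LSTM, and the learning rate, tolerance, and epoch limit are arbitrary illustrative values.

```python
def mse(predictions, targets):
    """Mean squared error between predicted and true values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def train(xs, ys, lr=0.05, max_epochs=5000, tol=1e-12):
    """Fit y ~ w*x + b by gradient descent on the MSE.

    Training stops when the loss change falls below `tol` (convergence)
    or when `max_epochs` is reached.
    """
    w, b = 0.0, 0.0
    previous = float("inf")
    n = len(xs)
    for epoch in range(1, max_epochs + 1):
        preds = [w * x + b for x in xs]
        loss = mse(preds, ys)
        if abs(previous - loss) < tol:  # converged
            break
        previous = loss
        grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss, epoch


w, b, loss, epochs = train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(round(w, 3), round(b, 3), epochs)
```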

Training process
In this paper, the LSTM RNN model is implemented with the PyTorch framework. The LSTM RNN model has three layers. The number of neurons is set to 50, the dropout rate is 0.5, and the learning rate is set to 0.001. 80% of the sample points are randomly selected as the training set, and the remaining 20%, which do not participate in training, are used to test the prediction accuracy of the trained model. Training is accelerated with a GPU. The training platform is the same as the machine used in the second chapter of this article. The convergence curve of the loss function value over the iterations is shown in Figure 4. During training, the loss decreases rapidly, and training is completed after 100 rounds.
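A configuration matching these hyperparameters might look like the PyTorch sketch below. It is not the authors' code: the class name `RulLSTM`, the use of the last time step as the regression input, the Adam optimizer, full-batch updates, and the synthetic random data are all assumptions made so that the example runs on its own.

```python
import torch
from torch import nn


class RulLSTM(nn.Module):
    """Three-layer LSTM with 50 hidden units and a linear RUL head."""

    def __init__(self, n_channels=4, hidden_size=50, num_layers=3, dropout=0.5):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_channels, hidden_size=hidden_size,
                            num_layers=num_layers, dropout=dropout,
                            batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x has shape (batch, time_steps, n_channels)
        output, _ = self.lstm(x)
        return self.head(output[:, -1, :]).squeeze(-1)


torch.manual_seed(0)
features = torch.randn(100, 10, 4)  # synthetic stand-in for pressure windows
targets = torch.rand(100)           # synthetic stand-in for normalized RUL

# Random 80/20 split into training and test samples.
order = torch.randperm(len(features))
n_train = int(0.8 * len(features))
train_idx, test_idx = order[:n_train], order[n_train:]

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = RulLSTM().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # assumed optimizer
loss_fn = nn.MSELoss()

model.train()
for epoch in range(100):  # 100 training rounds, full batch for brevity
    optimizer.zero_grad()
    loss = loss_fn(model(features[train_idx].to(device)),
                   targets[train_idx].to(device))
    loss.backward()
    optimizer.step()

model.eval()  # disables dropout for evaluation
with torch.no_grad():
    test_loss = loss_fn(model(features[test_idx].to(device)),
                        targets[test_idx].to(device)).item()
print(f"test MSE: {test_loss:.4f}")
```

The text only says that a gradient descent algorithm is used, so the optimizer here could equally be replaced by `torch.optim.SGD`. Real training would also iterate over mini-batches rather than the full training set.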

Figure 5. Prediction results of the training set.
Figure 6. Prediction results of the test set.
Loss function.
By constructing the objective function to train the LSTM, we set the true remaining lifetime as: