A remaining useful life prediction method based on LSTM-DCGAN for aero-engines

The turbofan engine is a key piece of equipment in aerospace. Its health condition determines whether an aircraft can operate reliably. However, it is difficult to predict the remaining useful life (RUL) precisely because of complex operating conditions and various failure modes. To predict the RUL more accurately and make full use of the advantages of neural networks, a RUL prediction model based on a long short-term memory (LSTM) network and a deep convolutional generative adversarial network (DCGAN), called LSTM-DCGAN, is proposed in this paper. In the proposed LSTM-DCGAN, the DCGAN is first used to learn the distribution of the training dataset; the pretrained generator of the DCGAN is then attached behind an LSTM network for further feature extraction. The effectiveness of the proposed LSTM-DCGAN is validated on the C-MAPSS aero-engine degradation dataset and compared with other methods.


Introduction
The turbofan engine is one of the core pieces of equipment in aerospace, and its failure may cause incalculable losses, making it important to predict the remaining useful life (RUL) of an engine before a catastrophic failure occurs. By analyzing the condition monitoring data of equipment, its RUL or degradation trend can be predicted. A large number of RUL prediction models have been proposed, which are roughly categorized into model-based approaches [1][2][3] and data-driven approaches [4][5][6][7][8][9]. Model-based RUL prediction methods capture the degradation process of a component by modeling the component failure mechanisms. Common model-based methods include the Eyring model [1], the Wiener process model [2], and the Weibull distribution [3]. Model-based methods often require a large amount of prior knowledge, which restricts their application to complex nonlinear systems.
Data-driven RUL prediction approaches focus on establishing the relationship between the historical condition monitoring data and RULs. Among the data-driven methods, deep learning-based RUL prediction methods attract much attention. Li et al. [10] improved the feature extraction ability of a deep convolutional neural network (CNN) and developed a RUL prediction method for turbofan engines based on the improved multi-scale CNN. Given the advantages of long short-term memory (LSTM) networks in processing time series and extracting features from them, Song et al. [11] and Li et al. [12] predicted the RUL of aero-engines based on LSTM networks. Ruan et al. [13] proposed a framework based on LSTM and dense layers for task transfer learning between fault diagnosis and RUL prediction.
Although significant progress has been made with deep learning-based RUL prediction methods, their prediction performance depends heavily on the quality and quantity of training samples. In fact, collecting large amounts of run-to-failure training data is expensive and time-consuming. How to mine more useful information from a limited number of training samples remains a challenge. A generative adversarial network (GAN) addresses the problem of training-sample scarcity by combining generative modeling with adversarial training to generate new samples that follow the data distribution [14]. Usually, the data used in RUL prediction tasks are multivariate and contain complex operating conditions and failure modes, which makes the data distribution hard to capture. If a GAN is used to handle such a problem, it has to overcome unstable convergence, mode collapse, and vanishing gradients [15]. The deep convolutional generative adversarial network (DCGAN) [16] improves training stability and data generation capability, but its application to RUL prediction is limited. Zhang et al. [17] adopted DCGAN to generate fake samples for missing data. Hou et al. [18] integrated DCGAN with an autoencoder to extract features in the RUL prediction task.
Different from [17,18], in this paper, DCGAN is used to pre-learn the knowledge of the training data, and the generator of the well-trained DCGAN is involved in the RUL prediction task. Based on the above analysis, a new RUL prediction method is proposed in this paper. The proposed method exploits the advantages of DCGAN in knowledge mining and of LSTM in time-series data processing. More details about the proposed method are given in Section 2. The proposed method is validated in Section 3, and the paper is concluded in Section 4.

Proposed method
Figure 1 shows the flowchart of the proposed RUL prediction method, which consists of two stages: the generator training stage and the RUL prediction stage.

The generator training stage
In the generator training stage, random noise and real sensor data after preprocessing are fed into a DCGAN. The noise is randomly generated subject to a uniform distribution. The preprocessing of the real data includes data removal, data normalization, and time window sliding. Sensor channels with low sensitivity to time and degradation trends are removed because such data is ineffective and may increase the training burden of a model. The remaining sensor data is normalized by Z-score normalization to decrease the negative impact caused by different magnitudes and units. After normalization, sliding window processing is adopted to encapsulate data at adjacent time points and enrich the samples [19]. The RUL at the last time point of a window is taken as the RUL value of that window. During training, batches of random noise are sampled and passed through the generator. The generated data, mixed with the preprocessed real data, is input into the discriminator, which is responsible for distinguishing the real data from the fake data produced by the generator. The generator and discriminator are alternately optimized by adversarial training, improving the quality of the generated data until the discriminator cannot distinguish real from generated data. The Adam optimizer is used to perform gradient descent and update the parameters during training, so that the discriminator's objective is maximized and the generator's loss is minimized. The aim of training the DCGAN is to make the data generated from random noise follow a distribution similar to that of the real data.
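The preprocessing steps described above (Z-score normalization followed by sliding a fixed-length window over each engine's run, with the window's RUL taken from its last time point) can be sketched in NumPy. This is a minimal illustration, not the paper's implementation; the function names are illustrative, and the window length of 30 with step 1 matches the settings reported later.

```python
import numpy as np

def zscore_normalize(data):
    """Z-score normalization per sensor channel: (x - mean) / std."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    return (data - mean) / std

def sliding_windows(data, rul, window=30, step=1):
    """Cut a (time, sensors) series into overlapping windows.
    The RUL of a window is the RUL at its last time step."""
    xs, ys = [], []
    for start in range(0, len(data) - window + 1, step):
        xs.append(data[start:start + window])
        ys.append(rul[start + window - 1])
    return np.stack(xs), np.array(ys)

# Toy example: 100 cycles of 15 retained sensors, RUL decreasing to 0
series = np.random.randn(100, 15)
rul = np.arange(99, -1, -1, dtype=float)
x, y = sliding_windows(zscore_normalize(series), rul, window=30, step=1)
print(x.shape, y.shape)  # (71, 30, 15) (71,)
```

Note how the overlap between windows enriches the sample count: 100 cycles yield 71 training windows.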

The RUL prediction stage
In this stage, the raw sensor data is also preprocessed by data removal, data normalization, and time window sliding. The preprocessed data is imported into an LSTM network to capture long-distance dependencies and extract time-series features. The features extracted by the LSTM network are imported into a generator. The generator is initialized with the parameters obtained in the generator training stage, but it is fine-tuned during training. Following the generator, two fully-connected (FC) layers map the features extracted by the generator to the RULs. After training the LSTM network, generator, and two FC layers with the Adam optimizer, the preprocessed test data is imported to obtain the RUL.

Dataset description and evaluation indicators
The C-MAPSS simulated turbofan engine dataset [20] is adopted to validate the proposed method. The dataset contains 4 sub-datasets with different fault modes and operating conditions, as shown in Table 1. Each sub-dataset consists of 26 dimensions: the engine number, operating cycles, 3 operating settings, and measurements from 21 sensors installed at different components. Two widely used evaluation indicators, i.e., the root mean square error (E RMSE) and the scoring function (E score) [20], as shown in (1) and (2), are adopted to evaluate the proposed method. The smaller the values of the two indicators, the better the prediction accuracy. E RMSE gives an equal penalty to early and late predictions. E score penalizes late predictions more than early ones: when the predicted RUL is higher than the actual RUL, the consequences may be more serious than when the RUL is underestimated. In (1) and (2), J denotes the total number of testing samples, and R'_j and R_j denote the predicted and real RUL values of the j-th engine.

E RMSE = sqrt( (1/J) Σ_{j=1}^{J} (R'_j − R_j)² ) (1)

E score = Σ_{j=1}^{J} s_j, where s_j = exp(−(R'_j − R_j)/13) − 1 if R'_j < R_j, and s_j = exp((R'_j − R_j)/10) − 1 otherwise (2)
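A NumPy sketch of the two indicators, following the standard C-MAPSS definitions from [20], where the scoring function uses time constants 13 for early predictions and 10 for late ones so that late predictions are penalized more heavily:

```python
import numpy as np

def e_rmse(pred, true):
    """Root mean square error over all J test engines, eq. (1)."""
    return float(np.sqrt(np.mean((pred - true) ** 2)))

def e_score(pred, true):
    """Asymmetric scoring function [20], eq. (2): late predictions
    (pred > true) are penalized more heavily than early ones."""
    d = pred - true  # positive d means a late prediction
    s = np.where(d < 0, np.exp(-d / 13.0) - 1.0, np.exp(d / 10.0) - 1.0)
    return float(np.sum(s))

pred = np.array([110.0, 90.0])
true = np.array([100.0, 100.0])
print(e_rmse(pred, true))  # 10.0
```

With the same absolute error of 10 cycles, the late prediction (110 vs 100) contributes more to E score than the early one (90 vs 100), reflecting the asymmetry described above.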

The adjustment of parameters
Parameters are important to a deep learning-based RUL prediction method. In the generator training stage, the number of random noise samples input to the generator is set to 128. The learning rates of the generator and discriminator of the DCGAN are set to 0.0004. The batch size is set to 512. The number of iterations is 10000. These parameter settings follow [16]. In the training process of the RUL prediction stage, the maximum number of epochs is set to 100 to balance training efficiency and loss convergence. The Adam optimizer is used to improve training efficiency. Dropout and early stopping are adopted to prevent overfitting [21,22]. To achieve better prediction performance, several commonly tuned parameters, including the batch size, the learning rate, and the number of hidden units in the LSTM network, are evaluated on the training sets of the 4 sub-datasets. Single-factor control is adopted in the evaluation. To make a fair comparison, ten experiments were conducted on each sub-dataset. Due to length limitations, only the results on FD001 are given as an example in Table 2. Overall, the learning rate and the number of hidden units in the LSTM network are set to 0.001 and 64, respectively, for the 4 sub-datasets. The proper batch size is 256 for FD001 and FD003, and 512 for FD002 and FD004. The parameters of the proposed method are given in Table 3.
BN, Conv, and ConvTran denote batch normalization, convolution, and transposed convolution, respectively.

Validations
After data removal, only 15-dimensional measurements remain among the original 26 dimensions; the retained measurements are from Sensors 2, 3, 4, 7, 8, 9, 11, 12, 13, 14, 15, 17, 20 and 21. These data are then preprocessed by Z-score normalization and time window sliding. The length of a time window is set to 30 and the sliding step is set to 1. The numbers of time windows are shown in Table 4, where the sample numbers are enriched by the time window sliding. A piecewise function is adopted to relabel the RUL tags of the dataset, where the initial RUL is set to 125 and then decreases linearly [23]. To visualize the prediction performance, one engine is randomly selected from each sub-dataset to indicate the difference between the predicted RUL values and the real values, as shown in Figure 2. From Figure 2, the predicted RULs basically follow the decreasing trend of the real RULs. For the sub-datasets FD002 and FD004, the inherent complexity of the data increases the difficulty of extracting abstract features, so the prediction stability remains to be improved. To illustrate the effectiveness of the DCGAN, a comparison is conducted and also shown in Figure 2. In the comparative method, the RUL is predicted by only an LSTM network with the same parameters as the proposed LSTM-DCGAN. The RULs predicted by the LSTM-DCGAN are closer to the real RULs than those from the LSTM network, which indicates that the pre-trained DCGAN can capture the degradation trend and features and improve the prediction performance.
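The piecewise relabeling described above (RUL capped at an initial value of 125, then decreasing linearly to 0 at the final cycle) can be sketched as follows; the function name is illustrative only.

```python
import numpy as np

def piecewise_rul(num_cycles, max_rul=125):
    """Piecewise-linear RUL labels: capped at max_rul early in life,
    then decreasing linearly to 0 at the final cycle."""
    linear = np.arange(num_cycles - 1, -1, -1, dtype=float)
    return np.minimum(linear, max_rul)

labels = piecewise_rul(200)
print(labels[:3], labels[-3:])  # [125. 125. 125.] [2. 1. 0.]
```

Capping the early-life labels reflects the assumption that degradation is negligible at first, so predicting an exact (very large) RUL there is neither possible nor useful.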

Comparisons with other methods
To make more comparisons, the LSTM network whose results are shown in Figure 2, along with several state-of-the-art methods, is compared with the proposed LSTM-DCGAN-based RUL prediction method in Table 5. The results of the compared methods are drawn directly from the corresponding works [4][5][6][7][8][9]. The results of the LSTM-DCGAN are averages over ten runs. The values in parentheses denote the standard deviations over the runs; a smaller standard deviation indicates better stability of the RUL prediction method. Here, "-" indicates that a value is not given in the associated work. From Table 5, most results obtained by the proposed method show higher accuracy and stability than the other methods, especially on FD002 and FD004, which involve more complex working conditions. Multiobjective deep belief networks are ensembled in [4]; only its E RMSE on FD003 is slightly lower than that of the proposed LSTM-DCGAN, while the rest of its results are higher. In [5,6,9], the RUL prediction models are based on improved LSTM networks without DCGAN. In other words, the proposed LSTM-DCGAN performs better in dealing with data from more complex working conditions, which also illustrates the effectiveness of DCGAN in mining more of the information underlying the data. All the results of [5] are higher than the values of the proposed method. Some results on FD001 and FD003 from [6,8,9] are slightly lower than those of the proposed method; the working conditions of FD001 and FD003 are relatively less complex. In [8], a bidirectional gated recurrent unit with a temporal self-attention mechanism is adopted to predict the RUL. In [7], a recurrent neural network (RNN) is adopted as a feature extractor and the prediction is based on similarity; only its results on FD001 are slightly lower than those of the proposed method.
For a fairer comparison, the overall E RMSE and E score are calculated and listed in Table 6. The overall E RMSE and E score (i.e., E RMSE_overall and E score_overall) are given in (3) and (4), which consider the overall prediction performance on the four sub-datasets.

E RMSE_overall = sqrt( (J FD001 × E²RMSE_FD001 + J FD002 × E²RMSE_FD002 + J FD003 × E²RMSE_FD003 + J FD004 × E²RMSE_FD004) / (J FD001 + J FD002 + J FD003 + J FD004) ) (3)

E score_overall = E score_FD001 + E score_FD002 + E score_FD003 + E score_FD004 (4)
In (3) and (4), E RMSE_FD001 and E score_FD001 denote the E RMSE and E score of a method on FD001, respectively, and J FD001 denotes the total number of testing samples in FD001. From Table 6, the overall E RMSE and E score of the proposed method are superior to those of the other methods. Compared with these methods, the improvements in overall E RMSE and E score are at least 12.90% and 60.38%, respectively, where (17.42 − 15.43)/15.43 = 12.90% and (5.10×10³ − 3.18×10³)/(3.18×10³) = 60.38%.
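A sketch of one natural way to aggregate the per-sub-dataset results: the overall RMSE weights each sub-dataset's squared error by its test-sample count J, and since the scoring function is a sum over samples, the overall score is simply the sum of the per-sub-dataset scores. This interpretation of (3) and (4) is an assumption, and the function names below are illustrative.

```python
import numpy as np

def overall_rmse(rmses, counts):
    """Combine per-sub-dataset RMSEs into a single RMSE over all test
    samples by weighting each squared error with its sample count
    (assumed form of eq. (3))."""
    rmses = np.asarray(rmses, dtype=float)
    counts = np.asarray(counts, dtype=float)
    return float(np.sqrt(np.sum(counts * rmses ** 2) / np.sum(counts)))

def overall_score(scores):
    """The score is a sum over samples, so the overall score is the
    sum of the per-sub-dataset scores (assumed form of eq. (4))."""
    return float(np.sum(scores))

# Two sub-datasets with equal sample counts: the overall RMSE is the
# root of the mean of the squared per-sub-dataset RMSEs.
print(overall_rmse([3.0, 4.0], [100, 100]))  # 3.5355...
```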

Conclusions
In this paper, a new two-stage RUL prediction method for aero-engines is proposed. The proposed method consists of an LSTM network and a DCGAN. The DCGAN is first trained with random noise and preprocessed real sensor data to mine more of the useful knowledge underlying the training data. The well-trained generator of the DCGAN is retained and attached behind an LSTM network in the second stage, where the LSTM network exploits its merits in processing time-series data. In the second stage, the preprocessed real sensor data is imported into the LSTM network, which is combined with the well-trained generator and two FC layers for training. During this training process, the parameters of the generator are fine-tuned along with the parameters of the LSTM network and the FC layers. Finally, the well-trained LSTM-DCGAN is used to obtain the RULs for given test data. The proposed method is validated on the C-MAPSS dataset and outperforms the compared methods. The proposed method performs better in dealing with data from more complex working conditions but worse for relatively simple working conditions. In future work, more generalized and robust RUL prediction methods should be developed to adapt to different working conditions and failure modes. Besides, how to use complex operating conditions to improve the accuracy of RUL prediction remains to be further studied.

Figure 1 .
Figure 1. The data flowchart of the proposed approach.
[1] Jouin M, Gouriveau R, Hissel D, Péra M C and Zerhouni N 2016 Particle filter-based prognostics: Review, discussion and perspectives Mechanical Systems and Signal Processing 72 2-31
[2] Yu W, Tu W, Kim I Y and Mechefske C 2021 A nonlinear-drift-driven Wiener process model for remaining useful life estimation considering three sources of variability Reliability Engineering & System Safety 212 107631
[3] Ali J B, Chebel-Morello B, Saidi L, Malinowski S and Fnaiech F 2015 Accurate bearing remaining useful life prediction based on Weibull distribution and artificial neural network Mechanical Systems and Signal Processing 56 150-172
[4] Zhang C, Lim P, Qin A K and Tan K C 2016 Multiobjective deep belief networks ensemble for remaining useful life estimation in prognostics IEEE Transactions on Neural Networks and Learning Systems

Table 1 .
Details of the C-MAPSS dataset.

Table 2 .
The adjustment results of parameters of LSTM-DCGAN on FD001.

Table 3 .
The parameters of the proposed LSTM-DCGAN.

Table 4 .
Information of time windows.

Table 5 .
E RMSE , E score and the standard deviations of existing methods on sub-datasets.

Table 6 .
Comparison of overall E RMSE and E score on the C-MAPSS dataset.
[8] Zhang J, Jiang Y, Wu S, Li X, Luo H and Yin S 2022 Prediction of remaining useful life based on bidirectional gated recurrent unit with temporal self-attention mechanism Reliability Engineering & System Safety 221 108297
[9] Wang T, Guo D and Sun X M 2022 Remaining useful life predictions for turbofan engine degradation based on concurrent semi-supervised model Neural Computing and Applications 34 5151-5160
[10] Li H, Zhao W, Zhang Y and Zio E 2020 Remaining useful life prediction using multi-scale deep convolutional neural network Applied Soft Computing 89 106113
[11] Song T, Liu C, Wu R, Jin Y and Jiang D 2022 A hierarchical scheme for remaining useful life prediction with long short-term memory networks Neurocomputing 487 22-33
[12] Li J, Jia Y, Niu M, Zhu W and Meng F 2023 Remaining useful life prediction of turbofan engines using CNN-LSTM-SAM approach IEEE Sensors Journal 23 10241-10251
[13] Ruan D, Wu Y, Yan J and Gühmann C 2022 Fuzzy-membership-based framework for task transfer learning between fault diagnosis and RUL prediction IEEE Transactions on Reliability 1-14
[14] Alipour-Fard T and Arefi H 2020 Structure aware generative adversarial networks for hyperspectral image classification IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 13 5424-5438
[15] Ruan D, Chen X, Gühmann C and Yan J 2023 Improvement of generative adversarial network and its application in bearing fault diagnosis: A review Lubricants 11 74
[16] Radford A, Metz L and Chintala S 2015 Unsupervised representation learning with deep convolutional generative adversarial networks arXiv:1511.06434
[17] Zhang S, Li T, Si X, Hu C, Zhang H and Ma Y 2021 A new missing data generation method based on an improved DCGAN with application to RUL prediction 2021 CAA Symposium on Fault Detection, Supervision, and Safety for Technical Processes 1-6
[18] Hou G, Xu S, Zhou N, Yang L and Fu Q 2020 Remaining useful life estimation using deep convolutional generative adversarial networks based on an autoencoder scheme Computational Intelligence and Neuroscience 2020 9601389
[19] Liu L, Song X and Zhou Z 2022 Aircraft engine remaining useful life estimation via a double attention-based data-driven architecture Reliability Engineering & System Safety 221 108330
[20] Saxena A, Goebel K, Simon D and Eklund N 2008 Damage propagation modeling for aircraft engine run-to-failure simulation 2008 International Conference on Prognostics and Health Management 1-9
[21] Liu Z, Liu H, Jia W, Zhang D and Tan J 2021 A multi-head neural network with unsymmetrical constraints for remaining useful life prediction Advanced Engineering Informatics 50 101396
[22] Cheng Y, Wang C, Wu J, Zhu H and Lee C K 2022 Multi-dimensional recurrent neural network for remaining useful life prediction under variable operating conditions and multiple fault modes Applied Soft Computing 118 108507
[23] Yan J, He Z and He S 2023 Multitask learning of health state assessment and remaining useful life prediction for sensor-equipped machines Reliability Engineering & System Safety 234 109141