Remaining useful life prediction method for lithium-ion batteries based on KPCA-IVMD-SE-DNN

Aiming at the problems of feature processing and capacity regeneration in predicting the remaining useful life (RUL) of lithium-ion batteries, this paper proposes an RUL prediction method based on kernel principal component analysis (KPCA), improved variational mode decomposition (IVMD), sample entropy (SE), and a deep neural network (DNN). Firstly, six health indicators (HIs) are extracted by analyzing the characteristics of the battery charging and discharging process, and their correlation with capacity is calculated. Secondly, KPCA is used to denoise the HI set and reduce its dimension while ensuring that the retained components fully contain the degradation information. Thirdly, the battery capacity is decomposed into trend and interference components by VMD improved with the central frequency method (CFM), and the components are reconstructed according to their SE to increase the efficiency and accuracy of prediction. Finally, the prediction model is constructed based on the DNN. The experimental analysis on the NASA battery data sets shows that the proposed method has better prediction accuracy, efficiency, and robustness than DNN, KPCA-DNN, KPCA-EMD-DNN, KPCA-VMD-SE-DNN, and other methods.


Introduction
With the arrival of the new energy era, lithium-ion batteries have become application and research hotspots in electronic equipment, aerospace, new energy vehicles, and other domains because of their advantages of strong power storage, safety, no pollution, long cycle life, and so on [1]. However, the degradation of a battery's chemical composition and changes in its environment can cause the battery to fail and lead to safety accidents. Therefore, it is essential to accurately forecast the RUL of batteries.
At present, model-driven and data-driven methods are the two commonly used approaches to lithium battery RUL prediction [2]. Unlike model-driven methods, data-driven methods require neither prior knowledge nor complex mathematical models; they only analyze the relationship between battery capacity and the historical data that characterize battery performance degradation. A data-driven method mainly consists of two steps: extracting HIs and constructing the prediction model.
For feature extraction and processing: capacity and internal resistance are direct HIs, but they are demanding to acquire and are therefore less used. Zhou et al. [3] calculated the mean voltage fading (MVF) between 500 s and 1500 s during battery decay as an HI, but a single HI captures insufficient aging information. Guo et al. [4] extracted 14 HIs based on charging current, temperature, and voltage curves, and used principal component analysis (PCA) to denoise the HIs and reduce their dimension. However, PCA is a linear method, which limits it in dealing with nonlinear capacity degradation data.
For the establishment of prediction models: conventional machine learning models have disadvantages such as poor data fitting and difficulty in optimizing hyperparameters, so DNNs are often used. Khumprom & Yodo [5] compared a DNN with other machine learning methods, such as linear regression (LR), k-nearest neighbor (k-NN), and support vector machine (SVM), and verified the superiority of the DNN. Shen et al. [6] proposed a deep convolutional neural network (DCNN) method and compared it with relevance vector machines (RVM), shallow neural networks, and other methods to verify that it has higher robustness and precision in the online estimation of battery capacity.
In addition, the capacity regeneration caused by battery resting and the random noise caused by environmental changes affect the prediction accuracy. At present, signal decomposition methods are mainly used to handle them, such as discrete wavelet decomposition [7] and empirical mode decomposition (EMD) [8]. The VMD method not only avoids the modal aliasing and endpoint effects introduced by EMD but also greatly reduces the residual noise left by other methods when reconstructing the signal [9]. It also allows the number of modes to be customized, improves the prediction accuracy, and shows better robustness. However, this introduces the problem of optimizing the number of modes.
Based on the above analysis, this paper proposes the KPCA-IVMD-SE-DNN model. Firstly, the KPCA method is used to nonlinearly transform the high-dimensional features extracted from the battery current, temperature, and voltage curves into low-dimensional features. Then, the CFM is used to optimize the decomposition number of the VMD algorithm, and the components with similar SE are merged to increase the prediction accuracy and efficiency. Finally, the open-source data set provided by NASA is used to prove the validity and superiority of the model.

KPCA principle
The KPCA method introduces a kernel function, uses a nonlinear transformation to project the multi-dimensional data of the original space into a high-dimensional feature space, and then applies the standard PCA method there. The specific calculation process is as follows:
Step 1: The multiple HIs that characterize the capacity degradation of the batteries are assembled into the original input variable matrix $X$ and standardized. It is defined as:
$$X = [x_1, x_2, \ldots, x_m]$$
where $x_i$ denotes the $i$th HI and $n$ denotes the number of data points in a single HI. The space of the $n \times m$ matrix $X$ is the original input space.
Step 2: By introducing the kernel function $\phi(\cdot)$, the single-column vectors in $X$ are mapped to the high-dimensional space, and a new matrix $\phi(X)$ is obtained:
$$\phi(X) = [\phi(x_1), \phi(x_2), \ldots, \phi(x_m)]$$
Step 3: The new matrix $\phi(X)$ is centralized, that is:
$$\tilde{\phi}(x_i) = \phi(x_i) - \frac{1}{m}\sum_{j=1}^{m}\phi(x_j)$$
Step 4: The correlation matrix, that is, the covariance matrix $C$, is calculated:
$$C = \frac{1}{m}\sum_{i=1}^{m}\tilde{\phi}(x_i)\,\tilde{\phi}(x_i)^{T}$$
Step 5: The eigenvalues $\lambda_k$ of the covariance matrix and the corresponding eigenvectors $v_k$ are calculated:
$$C v_k = \lambda_k v_k$$
Step 6: The contribution rate $\eta_k$ of each eigenvalue and the cumulative contribution rate $\eta_{\mathrm{sum}}(p)$ of the first $p$ eigenvalues are calculated:
$$\eta_k = \frac{\lambda_k}{\sum_{j}\lambda_j}, \qquad \eta_{\mathrm{sum}}(p) = \frac{\sum_{k=1}^{p}\lambda_k}{\sum_{j}\lambda_j}$$
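The cumulative contribution rate of Step 6 is what later decides how many dimensions are kept. The following minimal sketch shows how the number of retained components could be chosen from a set of eigenvalues. The function names, the default threshold of 0.95, and the eigenvalues are illustrative assumptions, not values taken from the paper.

```python
def contribution_rates(eigenvalues):
    """Return (individual, cumulative) contribution rates, largest eigenvalue first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    individual = [value / total for value in ordered]
    cumulative = []
    running = 0.0
    for rate in individual:
        running += rate
        cumulative.append(running)
    return individual, cumulative


def components_needed(eigenvalues, threshold=0.95):
    """Smallest number of components whose cumulative rate reaches the threshold."""
    _, cumulative = contribution_rates(eigenvalues)
    for count, rate in enumerate(cumulative, start=1):
        if rate >= threshold:
            return count
    # Floating-point rounding can leave the last cumulative rate just under 1.0.
    return len(cumulative)


# Hypothetical eigenvalues for six HIs (illustrative only, not from the paper).
example_eigenvalues = [3.6, 1.5, 0.66, 0.12, 0.08, 0.04]
individual, cumulative = contribution_rates(example_eigenvalues)
print(components_needed(example_eigenvalues))
```

With these assumed eigenvalues the first three components carry about 96% of the variance, so three components are retained.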

VMD principle
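The VMD algorithm is a signal-processing method proposed by Dragomiretskiy in 2014. It seeks the optimal solution of a constrained variational model. The algorithm decomposes a multi-frequency signal into $K$ intrinsic mode functions (IMFs) $u_k(t)$. Each mode function has a center frequency $\omega_k$. The method minimizes the sum of the bandwidths of the modes, subject to the sum of the mode functions being equal to the original input signal $f(t)$. The specific calculation process is:
Step 1: The Hilbert transform is used to calculate the analytic signal of each mode function to obtain its single-sided spectrum:
$$\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)$$
Step 2: The spectrum of each $u_k(t)$ is shifted to the baseband by mixing with the exponential term of its center frequency:
$$\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t}$$
Step 3: The bandwidth of each IMF is estimated by the Gaussian smoothness of the demodulated analytic signal. Therefore, the constrained variational problem to be constructed is:
$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \tag{10}$$
where $\partial_t$ represents the partial derivative with respect to $t$, $\delta(t)$ is the Dirac function, and $\{\omega_k\} = \{\omega_1, \ldots, \omega_K\}$ represents the center frequency of each mode.
Step 4: By bringing in a quadratic penalty factor $\alpha$ and a Lagrange multiplier $\lambda(t)$, the constrained problem of Equation (10) is converted to an unconstrained variational problem:
$$L(\{u_k\}, \{\omega_k\}, \lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle$$
Step 5: $\{u_k\}$, $\{\omega_k\}$, and $\lambda$ are updated by the alternating direction method of multipliers in the Fourier domain to obtain the optimal solution of Equation (13). The update equations are:
$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \frac{\hat{\lambda}(\omega)}{2}}{1 + 2\alpha(\omega - \omega_k)^2}$$
$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k^{n+1}(\omega) \right|^2 d\omega}$$
$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)$$
Step 6: The updating stops when the following condition is met, where $\varepsilon$ is the convergence tolerance ($\tau$ is the degree of noise tolerance in the above equations):
$$\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon$$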
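As an illustration of Step 5, the sketch below applies the two per-mode updates commonly given for VMD to a sampled one-sided spectrum. The mode is obtained by dividing the residual spectrum by 1 + 2·alpha·(omega − omega_k)², and the center frequency is then the power-weighted mean frequency of that mode. This is not a full VMD implementation; the function names and the sample spectrum are hypothetical.

```python
def wiener_update(residual_spectrum, freqs, omega_k, alpha):
    """Mode update: filter the residual spectrum around the center frequency omega_k."""
    return [
        value / (1.0 + 2.0 * alpha * (freq - omega_k) ** 2)
        for value, freq in zip(residual_spectrum, freqs)
    ]


def center_frequency(mode_spectrum, freqs):
    """Center-frequency update: power-weighted mean frequency of the mode."""
    power = [abs(value) ** 2 for value in mode_spectrum]
    total = sum(power)
    if total == 0.0:
        raise ValueError("mode spectrum has no energy")
    return sum(freq * p for freq, p in zip(freqs, power)) / total


# Hypothetical one-sided spectrum sampled at five normalized frequencies.
freqs = [0.0, 0.1, 0.2, 0.3, 0.4]
# Residual = signal spectrum minus the other modes, plus half the multiplier spectrum.
residual = [0.2, 1.0, 0.2, 0.0, 0.0]
mode = wiener_update(residual, freqs, omega_k=0.1, alpha=2000)
new_omega = center_frequency(mode, freqs)
print(mode, new_omega)
```

A large alpha makes the filter narrow, so content away from the current center frequency is strongly attenuated and assigned to other modes.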

SE principle
Sample entropy is an improvement on Approximate Entropy (AE) and is often used to measure the complexity of sequence data. Compared with AE, it is faster to calculate, depends less on the length of the data, and is affected consistently by the window size m and the distance threshold r. The larger the entropy is, the more unstable the sequence data is. The specific calculation is shown in [10].
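For readers without access to [10], the following sketch follows the usual definition of sample entropy: count pairs of length-m templates whose Chebyshev distance is at most r, count the same for length m + 1, and take the negative logarithm of the ratio. Setting r to 0.2 times the standard deviation is a common convention and is an assumption here, as are the function name and the two test sequences.

```python
import math


def sample_entropy(series, m=2, r_factor=0.2):
    """Sample entropy with window size m and threshold r = r_factor * std(series)."""
    n = len(series)
    if n <= m + 1:
        raise ValueError("series is too short for the chosen window size m")
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    r = r_factor * std

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches are excluded
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    matches += 1
        return matches

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined ratio: treated here as maximal irregularity
    return -math.log(a / b)


# A strictly periodic sequence and a chaotic logistic-map sequence (both illustrative).
periodic = [0.0, 1.0] * 20
chaotic = [0.4]
for _ in range(99):
    chaotic.append(3.9 * chaotic[-1] * (1.0 - chaotic[-1]))

print(sample_entropy(periodic), sample_entropy(chaotic))
```

The periodic sequence is fully predictable and gives an entropy of zero, while the chaotic sequence gives a larger value, in line with the statement that larger entropy indicates a less stable sequence.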

DNN principle
A DNN is composed of input, hidden, and output layers. Adjacent layers are fully connected, while neurons in the same layer are not connected. The input layer receives the feature data. The hidden layers apply a nonlinear transformation to the output of the preceding layer. The output layer outputs the prediction results.
The training process of a DNN is divided into two parts: forward propagation and backward update. In the first step, the prediction result is obtained by the nonlinear transformation of the input feature data and compared with the real data to obtain the error. In the second step, the network parameters are adjusted by back-propagation. The error is continuously reduced until the stopping requirement is met. Finally, prediction is achieved by saving the trained network. The specific calculation equation is:
$$y_j = \sigma\left(\sum_{i} w_{ij} x_i + b_j\right)$$
where $i$ denotes the $i$th neuron of the previous layer, $j$ denotes the current $j$th neuron, and $x$ and $y$ denote the input and output of the current layer, respectively; $w$ is the weight of the hidden layer, $b$ is the bias value of the hidden layer, and $\sigma$ is the activation function, which is the key to the nonlinear transformation. Common activation functions include the ReLU, Tanh, and Sigmoid functions.
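The forward-propagation step can be illustrated with a small fully connected network written in plain Python. Each neuron computes a weighted sum of the previous layer's outputs plus a bias and passes it through an activation function. The layer sizes, weights, and function names are hypothetical and much smaller than the networks used in the experiments; training by back-propagation is not shown.

```python
def relu(value):
    """Rectified linear unit."""
    return value if value > 0.0 else 0.0


def identity(value):
    """Linear output, typical for a regression target such as capacity."""
    return value


def dense_forward(inputs, weights, biases, activation=relu):
    """One fully connected layer; weights[j][i] links input neuron i to output neuron j."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        pre_activation = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(pre_activation))
    return outputs


def network_forward(inputs, layers):
    """Apply the layers in order; each layer is (weights, biases, activation)."""
    signal = inputs
    for weights, biases, activation in layers:
        signal = dense_forward(signal, weights, biases, activation)
    return signal


# Hypothetical network: 3 inputs -> 2 ReLU hidden neurons -> 1 linear output.
hidden = ([[0.5, -0.25, 0.5], [-1.0, 0.5, 0.25]], [0.0, 0.5], relu)
output = ([[1.0, -2.0]], [0.25], identity)
hidden_values = dense_forward([1.0, 2.0, 3.0], hidden[0], hidden[1], relu)
prediction = network_forward([1.0, 2.0, 3.0], [hidden, output])
print(hidden_values, prediction)  # [1.5, 1.25] [-0.75]
```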

Modeling Process
The KPCA-IVMD-SE-DNN modeling flow is shown in Figure 1. The specific process is:
Step 1: Based on various curves of the battery, six HIs related to voltage, current, temperature, and time are extracted, and the Pearson and Spearman coefficients are calculated to analyze their correlation with capacity. Then, a high-dimensional feature set of the batteries is constructed. Using KPCA with the sigmoid kernel function, it is reduced to three dimensions according to the cumulative contribution rate of variance.
Step 2: CFM is used to seek the optimal decomposition number of the VMD algorithm.
Step 3: IVMD decomposes the training set into multiple components and a trend. The SE of each component is calculated, and the components with similar SE are merged into NewImf and NewRes.
Step 4: Cycle T is used as the starting point of prediction; the data of cycles 1 to T are regarded as the training set, and the remaining part is regarded as the testing set. Training-set proportions of 40% and 50% are used.
Step 5: The low-dimensional feature set is divided into the training and prediction sets. The former, together with the merged components, is used as input for DNN training, and the latter is fed to the trained network to obtain the prediction result of each component. The component results are combined into the final result.
Step 6: The final prediction results are compared with the actual values, and multiple classes of error indicators are used to evaluate the method proposed in the paper.
(Note: For the specific calculation equations of the algorithms applied in the above steps, refer to the principle sections.)
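Step 3 merges components with similar SE, but the similarity rule is not spelled out. The sketch below assumes one simple possibility: sort the components by SE, start a new group whenever the gap to the previous component exceeds a tolerance, and sum the components of each group. The tolerance of 0.1, the function name, and the sample numbers are hypothetical.

```python
def merge_by_entropy(components, entropies, tolerance=0.1):
    """Group components with neighbouring SE values (gap <= tolerance) and sum each group."""
    if len(components) != len(entropies):
        raise ValueError("one entropy value per component is required")
    order = sorted(range(len(components)), key=lambda index: entropies[index])
    groups = []
    for index in order:
        # Compare with the last member of the current group (chain-style grouping).
        if groups and entropies[index] - entropies[groups[-1][-1]] <= tolerance:
            groups[-1].append(index)
        else:
            groups.append([index])
    merged = []
    for group in groups:
        summed = [sum(values) for values in zip(*(components[i] for i in group))]
        merged.append(summed)
    return groups, merged


# Hypothetical decomposition of a four-point capacity series into three components.
components = [
    [1.80, 1.75, 1.70, 1.65],    # smooth trend, low SE
    [0.02, -0.01, 0.03, -0.02],  # fluctuation, high SE
    [0.01, 0.02, -0.01, 0.00],   # fluctuation, high SE
]
entropies = [0.05, 0.62, 0.58]
groups, merged = merge_by_entropy(components, entropies)
print(groups)  # [[0], [2, 1]]
```

Because merging only sums components, the merged series still add up to the original signal, while the DNN has fewer series to predict.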

Experimental data description
The experimental data in this paper are from the NASA PCoE. Four lithium-ion batteries, B0005, B0006, B0007, and B0018, are selected for experiments at 25°C. The experimental process is:
(1) Charging process: the battery was charged at a constant current (CC) of 1.5 A until the voltage reached 4.2 V, and then charged at a constant voltage (CV) until the charging current dropped to 0.02 A.
(2) Discharge process: the battery was discharged at a constant current (CC) of 2 A until the voltages of the four batteries fell to 2.7 V, 2.5 V, 2.2 V, and 2.5 V, respectively.
(3) Failure criterion: the battery is regarded as failed when its capacity decays to 70% of the rated capacity (1.40 Ahr), which is taken as the failure threshold. Since the capacity of the B0007 battery does not reach this threshold during degradation, its threshold is set to 1.42 Ahr.
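The failure threshold turns a capacity curve into an end-of-life cycle and hence into an RUL. The sketch below assumes that end of life is the first cycle at or below the threshold and that RUL is counted from the prediction starting cycle; both conventions, the function names, and the capacity values are assumptions for illustration. Only the thresholds come from the description above.

```python
FAILURE_THRESHOLDS_AHR = {"B0005": 1.40, "B0006": 1.40, "B0007": 1.42, "B0018": 1.40}


def end_of_life_cycle(capacities, threshold):
    """1-based index of the first cycle at or below the threshold, or None if never reached."""
    for cycle, capacity in enumerate(capacities, start=1):
        if capacity <= threshold:
            return cycle
    return None


def remaining_useful_life(capacities, start_cycle, threshold):
    """Cycles left from start_cycle until end of life (assumed definition)."""
    eol = end_of_life_cycle(capacities, threshold)
    if eol is None:
        return None
    return max(eol - start_cycle, 0)


# Hypothetical capacity fade curve in Ahr (illustrative only).
capacities = [1.86, 1.80, 1.71, 1.60, 1.52, 1.45, 1.39, 1.35]
eol = end_of_life_cycle(capacities, FAILURE_THRESHOLDS_AHR["B0005"])
rul = remaining_useful_life(capacities, start_cycle=3, threshold=FAILURE_THRESHOLDS_AHR["B0005"])
print(eol, rul)  # 7 4
```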

Feature extraction fusion
To fully describe the degradation process, taking B0005 as an example, the curves of charge and discharge current, temperature, and voltage are analyzed, as shown in Figures 2, 3, 4, and 5. In the following, $I_n(t)$, $T_n(t)$, $V_n(t)$, and $V_n^{\mathrm{load}}(t)$ denote the measured current, measured temperature, measured voltage, and load voltage of cycle $n$. According to Figure 2, as the cycle number increases, the measured current at 4000 s in the charging stage gradually decreases, which reflects the gradual decline of the battery's storage capacity. Therefore, it is used as the first HI ($n$ represents the cycle):
$$HI_1(n) = I_n(t)\big|_{t = 4000\,\mathrm{s}}$$
According to Figure 3, the highest measured temperature in the discharge stage gradually increases, which reflects the occurrence of some irreversible chemical reactions inside the battery, the reduction of active substances, and the aging of the battery. Therefore, it is used as the second HI:
$$HI_2(n) = \max_{t} T_n(t)$$
The time at which the highest measured temperature is reached also becomes shorter, so it is used as the third HI:
$$HI_3(n) = \arg\max_{t} T_n(t)$$
According to Figure 4, the distance between the real-time voltage and the nominal voltage $V_{\mathrm{nom}}$ gradually increases in the interval 500 s-1500 s. Therefore, its average value over the $N$ samples in this interval is taken as the fourth HI:
$$HI_4(n) = \frac{1}{N}\sum_{t_i \in [500\,\mathrm{s},\,1500\,\mathrm{s}]} \left| V_{\mathrm{nom}} - V_n(t_i) \right|$$
The integral of the measured voltage curve over time in a whole single cycle, that is, the area enclosed by the curve and the coordinate axis, becomes smaller, so it is taken as the fifth HI:
$$HI_5(n) = \int V_n(t)\,dt$$
According to Figure 5, the time at which the lowest load voltage is reached in the discharge stage also becomes shorter, so it is taken as the sixth HI:
$$HI_6(n) = \arg\min_{t} V_n^{\mathrm{load}}(t)$$
Pearson and Spearman correlation coefficients are calculated to make a preliminary screening of the extracted health indicators. The specific calculation results are shown in Table 1. According to Table 1, the extracted HIs have a high correlation with capacity (the Pearson minimum is -0.6952 and the Spearman minimum is -0.7087), so they can be used as the input values of the prediction model.
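The two screening coefficients can be computed as follows. Pearson measures linear association, and Spearman is the Pearson coefficient of the ranks, so it measures monotonic association. This is a plain-Python sketch; the function names and the sample HI and capacity values are hypothetical and are not the data behind Table 1.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical HI (peak discharge temperature) and capacity over five cycles.
hi_values = [38.1, 38.6, 39.4, 39.9, 41.0]
capacity = [1.86, 1.80, 1.71, 1.60, 1.52]
print(pearson(hi_values, capacity), spearman(hi_values, capacity))
```

In this example the HI rises while the capacity falls in every cycle, so the Spearman coefficient is -1 and the Pearson coefficient is close to -1.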
With the requirement that the cumulative variance contribution rate reach about 95%, the features are reduced to three dimensions; the calculation result for 40% of the data of the B0005 battery is taken as an example. The reduced components and their variance contribution rates are shown in Figure 6. The correlation coefficient between the fused HI and the battery capacity is then calculated for each battery. The results are -0.9521 (B0005), -0.9859 (B0006), -0.9623 (B0007), and -0.9756 (B0018). Because of this high correlation, the fused HI processed by KPCA can be used as the input of the DNN.

Improved VMD processing
In this paper, the VMD algorithm is applied to decompose the battery capacity data into multiple IMFs and a residual (RES); the relevant VMD parameters are set as alpha=2000, DC=0, init=1, tol=1e-7, and tau=0. Taking 40% of the capacity data of the B0005 battery as an example, the results are shown in Figure 7 (a) and (b).
According to Figure 7, the RES obtained by VMD is more stable than that obtained by EMD, IMF1 corresponds better to the random noise, and IMF2 corresponds better to the local regeneration.
Too many IMFs will affect the prediction. Therefore, this paper calculates the sample entropy of each component, as shown in Figure 8. For the improved VMD, this paper uses the CFM: the capacity is decomposed with different mode numbers K, and the iterated center frequencies are calculated. Once similar center frequencies occur, that K is taken as the best number. Taking the calculation on 40% of the data of B0005 as an example, the results are shown in Table 2.
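The CFM selection rule can be sketched as a scan over increasing K that stops at the first K whose center frequencies contain a similar pair. How "similar" is measured is not stated, so the sketch assumes a relative gap of at most 10% between neighbouring center frequencies. The function names and the frequencies are hypothetical and are not the values of Table 2; the VMD runs that would produce them are not shown.

```python
def has_similar_frequencies(center_frequencies, relative_tolerance=0.1):
    """True if two neighbouring center frequencies differ by at most the relative tolerance."""
    ordered = sorted(center_frequencies)
    for lower, upper in zip(ordered, ordered[1:]):
        scale = max(abs(upper), 1e-12)  # guard against division by zero
        if (upper - lower) / scale <= relative_tolerance:
            return True
    return False


def select_mode_number(frequencies_by_k, relative_tolerance=0.1):
    """First K, in increasing order, at which similar center frequencies occur."""
    for k in sorted(frequencies_by_k):
        if has_similar_frequencies(frequencies_by_k[k], relative_tolerance):
            return k
    return None


# Hypothetical converged center frequencies for K = 2..5.
frequencies_by_k = {
    2: [0.0001, 0.21],
    3: [0.0001, 0.12, 0.31],
    4: [0.0001, 0.12, 0.30, 0.32],
    5: [0.0001, 0.11, 0.12, 0.30, 0.32],
}
best_k = select_mode_number(frequencies_by_k)
print(best_k)  # 4
```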

Experimental analysis
In terms of experimental methods, to verify the reliability of the KPCA-IVMD-SE-DNN method (M7), several comparison methods are used: DNN (M1), KPCA-DNN (M2), KPCA-EMD-DNN (M3), KPCA-EMD-SE-DNN (M4), KPCA-VMD-DNN (M5), and KPCA-VMD-SE-DNN (M6). In terms of experimental parameters, M1 is set to 2 hidden layers, the number of neurons is 32, the batch_size is 20, and the number of epochs is 1000; its other parameter settings are consistent with the other methods. For M2 to M7, the number of neurons is 100, the batch_size is 25, the number of epochs is 500-800, the activation function is the ReLU function, and the learning rate is 0.01.
In terms of qualitative evaluation, the analysis of the prediction results in Figures 10 and 11 shows that:
(1) The prediction curves of the M7 method are very close to the actual capacity curve throughout the degradation process, indicating that M7 has a better ability to track battery capacity degradation.
(2) Compared with the M1 and M2 methods, M3-M7, which use various modal decomposition algorithms, can better capture the local regeneration and random noise in the degradation process.
(3) When the M2 method is applied, the prediction curves in (b) and (d) of Figure 10 and (b) and (c) of Figure 11 fit poorly and fluctuate considerably. This shows that the prediction result of a single component after the modal decomposition has a great impact on the final prediction result, and it also shows the importance of combining the components according to the SE for prediction.
In terms of quantitative evaluation, Mean Absolute Error (MAE), R-Squared ($R^2$), and Root Mean Square Error (RMSE) are chosen to evaluate the prediction results. The specific calculation results are shown in Figures 12 and 13. Absolute Error (AE) is selected to evaluate the RUL prediction results, and running time is selected to evaluate the prediction efficiency. The results are shown in Table 3. By comparing the simulation results and performance indicators, it can be concluded that:
(1) According to Figure 12, the errors in the component predictions obtained by the M3 method cause large errors in the final prediction results. For example, the MAE of B0006 is 32.82% and the RMSE is 35.73%. The MAE of B0018 is 6.94% and the RMSE is 7.6%, indicating that the reconstruction error of VMD is smaller than that of EMD.
(2) According to Figures 12 and 13, the MAE and RMSE of M7 stay at about 1% with the 40% training set, and the lowest values are 0.46% and 0.61%. With the 50% training set, they stay at about 0.5%, and the lowest values are 0.33% and 0.43%. The error of M7 is the lowest across the different batteries and training sets, indicating that M7 has higher prediction accuracy.
(3) Comparing Figures 12 and 13 shows that when the prediction starting point moves backward, the model learns from more cycle data and the prediction result is more accurate. This has less influence on M7, which shows that the M7 method has better stability.
(4) According to Table 3, the $R^2$ of the M7 method is about 99%. Its minimum is 97.46%, in the case of the 40% training set of B0018, but this is still the maximum among the methods for the same battery and data set. Its maximum is 99.76%, indicating that M7 fits well.
(5) According to Table 3, the absolute error of the RUL predicted by M7 is basically 0, and its maximum is 2, which is the minimum within the same prediction group. This shows that M7 has high prediction accuracy.
(6) According to Table 3, the maximum running time of M2 is 19.38 s, which is lower than the minimum of M1 (25.09 s). Compared with M1, the maximum MAE of M2 is reduced by about 5% and the minimum by about 22%, indicating that KPCA can improve both prediction efficiency and prediction accuracy. Comparing M4 with M3 and M6 with M5 shows that using SE also significantly reduces the running time. The prediction running time of M7 is about 20-25 s, which shows that M7 can ensure efficient, real-time dynamic monitoring of RUL.
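The evaluation indicators used above have standard definitions, sketched here in plain Python. Whether the reported percentages are absolute or relative to capacity is not stated, so the sketch returns raw values; the function names and the sample capacities are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    total = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - residual / total


def rul_absolute_error(true_rul, predicted_rul):
    """Absolute error between the true and the predicted RUL, in cycles."""
    return abs(true_rul - predicted_rul)


# Hypothetical actual and predicted capacities in Ahr (illustrative only).
actual = [1.50, 1.46, 1.43, 1.39]
predicted = [1.51, 1.45, 1.44, 1.38]
print(mae(actual, predicted), rmse(actual, predicted), r_squared(actual, predicted))
print(rul_absolute_error(58, 60))  # 2
```

MAE and RMSE are better when lower, while $R^2$ is better when closer to 1, which is why the two groups of indicators are read in opposite directions.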

Conclusion
To address the problems of feature processing, capacity regeneration, and parameter optimization, this paper proposes an RUL prediction model based on KPCA-IVMD-SE-DNN. Experiments on NASA data with different training-set proportions and different batteries lead to the following conclusions:
(1) Firstly, KPCA is used to process six HIs, such as current, voltage, time, and shape, which greatly improves the prediction efficiency; the prediction error is also reduced through denoising.
(2) Secondly, the optimal decomposition number of VMD is determined by CFM, and the local regeneration and random noise components in the original capacity data are decomposed efficiently, while the errors caused by the modal aliasing and endpoint effects of EMD are avoided.
(3) Thirdly, merging the components with similar SE not only improves the prediction efficiency but also avoids the impact of multiple single-component prediction errors on the final results.
In summary, the KPCA-IVMD-SE-DNN method proposed in this paper has better robustness and generalization and higher RUL prediction accuracy, and it has certain reference significance for future RUL prediction.

Figure 2. Change diagram of measured current.
Figure 3. Change diagram of measured temperature.
Figure 4. Change diagram of measured voltage.
Figure 5. Change diagram of load voltage.
Figure 6. Explained variance of the principal components.

Table 1. The correlation analysis of battery degradation features.
Table 2. The center frequency based on different decomposition numbers of VMD. Note: the second column in Table 2 uses scientific notation.
Table 3. Battery life prediction error and other performance indicators.