Short-term Load Forecasting Based on CEEMDAN-LSTM-AdaBoost

High-precision load forecasting is of great significance for allocating resources effectively in energy systems and improving energy utilization efficiency. To improve the accuracy of load forecasting, this paper proposes a new load forecasting model that combines complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), a long short-term memory (LSTM) neural network, and the adaptive boosting (AdaBoost) algorithm. First, CEEMDAN decomposes the original load sequence into a series of intrinsic mode function (IMF) components, which reduces the impact of the non-stationary nature of the data. Then, these components are input into LSTM for prediction, and an ensemble learning model based on AdaBoost is introduced, in which a strong predictor is constructed from several weak predictors to increase the prediction accuracy. Finally, the prediction results of the components are superimposed and reconstructed to obtain the final prediction result. Compared with a single LSTM model, the two evaluation indicators of the proposed model, MAPE and RMSE, decrease by 35.52% and 76.61% respectively, indicating the good prediction accuracy and generalization performance of the proposed method.


Introduction
With the large-scale integration of distributed power sources, the popularization of electric vehicles, and the growing number of electricity customers, the power system has become more complicated. Accurately grasping the law of load changes can improve the safety and stability of system operation [1]. In addition, scientific and reasonable power planning and management depend on accurate short-term load forecasting, which also provides data support for arranging the number and capacity of units in the power system. Even a small forecast error can cause large operating costs for the power system, so highly accurate short-term load forecasts can improve energy utilization [2].
The traditional methods of load forecasting include the Kalman filter method, the regression analysis method, the time series method, etc. [3]. These methods place strict requirements on historical data and cannot obtain accurate load forecast values when processing large amounts of nonlinear data. With the rapid development of artificial neural networks, more and more neural network algorithms are applied to the problem of power system load forecasting [4].
LSTM is an improved recurrent neural network (RNN) [5], which can not only approximate any nonlinear function, but also accurately learn the long-term dependence in a time series. Therefore, load forecasting with LSTM is more accurate. In [6], a framework based on the LSTM neural network and the Prophet time series forecasting algorithm is proposed to handle the volatility and uncertainty of short-term power load prediction. Although LSTM can effectively capture the relationship between nonlinear data and the load time series, the non-stationary part of the load data may degrade the prediction results. CEEMDAN can separate non-stationary load signals into a number of smoother subsequences [7]. Combination prediction methods are also widely applied to various prediction problems; for example, a combination model using AdaBoost and Bi-LSTM is created in [8].
This paper proposes a hybrid model based on CEEMDAN-LSTM-AdaBoost. First, the original load data is decomposed into several subsequences through CEEMDAN. Then, each component is predicted by several LSTM weak predictors, which AdaBoost combines into a strong predictor to obtain the prediction result of each subsequence. Finally, the final load prediction result is obtained by summing the prediction results of the subsequences. Experiments show that the proposed model gives better prediction results than a single LSTM model and can be applied to actual load forecasting.

CEEMDAN
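CEEMDAN mitigates the mode mixing of the EMD and EEMD methods and the residual noise left in the reconstructed sequence, and also improves the computational efficiency [7]. The specific algorithm is as follows:
(1) Add a limited amount of adaptive white noise to the original signal $x(t)$ to obtain $I$ noisy sequences:
$$x^{i}(t) = x(t) + \varepsilon_0 w^{i}(t), \quad i = 1, 2, \ldots, I$$
(2) Decompose each $x^{i}(t)$ by EMD and average the first modes to obtain the first modal component and the first margin:
$$IMF_1(t) = \frac{1}{I}\sum_{i=1}^{I} IMF_1^{i}(t)$$
$$r_1(t) = x(t) - IMF_1(t)$$
Where $IMF_1^{i}(t)$ is the result of EMD decomposition in the i-th test.
(3) Add white noise to the margin to form $r_1(t) + \varepsilon_1 E_1(w^{i}(t))$, where $E_k(\cdot)$ is the operator that extracts the k-th mode by EMD decomposition, and then perform EMD decomposition to obtain the second modal component $IMF_2$ and margin $r_2(t)$:
$$IMF_2(t) = \frac{1}{I}\sum_{i=1}^{I} E_1\left(r_1(t) + \varepsilon_1 E_1(w^{i}(t))\right)$$
$$r_2(t) = r_1(t) - IMF_2(t)$$
(4) For each subsequent stage $k$, calculate the k-th margin:
$$r_k(t) = r_{k-1}(t) - IMF_k(t)$$
(5) Add white noise to each margin and decompose it to obtain the (k+1)-th order modal component of CEEMDAN:
$$IMF_{k+1}(t) = \frac{1}{I}\sum_{i=1}^{I} E_1\left(r_k(t) + \varepsilon_k E_k(w^{i}(t))\right)$$
Repeat (4) and (5) until all intrinsic mode function (IMF) components are found.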
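The recursion of the CEEMDAN steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the EMD sifting operator is replaced by a toy stand-in (the signal minus a moving average), the noise amplitude `eps` is kept constant over all stages, and the names `first_mode`, `kth_mode` and `ceemdan_sketch` are hypothetical. It shows the ensemble averaging and the margin update, not a production decomposition.

```python
import math
import random


def moving_average(x, half_window=2):
    """Centered moving average with shrinking windows at the edges."""
    n = len(x)
    out = []
    for t in range(n):
        lo, hi = max(0, t - half_window), min(n, t + half_window + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def first_mode(x):
    """Toy stand-in for E_1: fast detail left after removing a moving average.
    A real implementation would run EMD sifting here."""
    trend = moving_average(x)
    return [a - b for a, b in zip(x, trend)]


def kth_mode(x, k):
    """Toy stand-in for E_k: peel off k-1 modes, then return the next one."""
    rest = list(x)
    for _ in range(k - 1):
        detail = first_mode(rest)
        rest = [a - b for a, b in zip(rest, detail)]
    return first_mode(rest)


def ceemdan_sketch(x, n_modes=3, n_trials=20, eps=0.2, seed=0):
    """Return (imfs, residual) following the CEEMDAN recursion."""
    rng = random.Random(seed)
    noises = [[rng.gauss(0.0, 1.0) for _ in x] for _ in range(n_trials)]
    imfs, residual = [], list(x)
    for k in range(1, n_modes + 1):
        total = [0.0] * len(x)
        for w in noises:
            # Stage 1 adds raw noise; later stages add the (k-1)-th noise mode.
            noise = w if k == 1 else kth_mode(w, k - 1)
            noisy = [r + eps * v for r, v in zip(residual, noise)]
            total = [s + m for s, m in zip(total, first_mode(noisy))]
        imf = [s / n_trials for s in total]  # ensemble average over trials
        imfs.append(imf)
        residual = [r - c for r, c in zip(residual, imf)]  # margin update
    return imfs, residual


signal = [math.sin(0.3 * t) + 0.3 * math.sin(2.5 * t) for t in range(60)]
imfs, residual = ceemdan_sketch(signal)
rebuilt = [sum(column) + r for column, r in zip(zip(*imfs), residual)]
```

Because each margin is defined as the previous margin minus the new component, the components plus the final margin add back to the input signal, which is the property the final reconstruction step of the forecasting model relies on.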

LSTM
The principle of the LSTM unit structure is shown in figure 1. The LSTM unit uses a forget gate, an input gate, and an output gate to control the information to be forgotten, the information to be remembered, and the update of the information [6]. The three gating mechanisms keep their output values between 0 and 1 through the sigmoid activation function σ. The forget gate decides which part of the previous cell information $C_{t-1}$ is kept and which part is filtered out at the current time step. The calculation formula is as follows:
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$$
Where $W_f$ and $b_f$ are the weight and bias of the forget gate, $h_{t-1}$ is the output of the previous time step, and $x_t$ is the current input. For the input gate, the expression is:
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$$
$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
The output gate and the output of the unit are:
$$o_t = \sigma\left(W_O \cdot [h_{t-1}, x_t] + b_O\right)$$
$$h_t = o_t \odot \tanh(C_t)$$
Where $W_O$ and $b_O$ are the weight and bias of the output gate. By combining LSTM units, an LSTM network with a more complex structure and more powerful functions can be formed.
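As a minimal sketch of how the three gates interact, the following Python function performs one LSTM time step for a scalar input and a scalar hidden state. The scalar simplification, the function name `lstm_step`, and the parameter layout (one `(w_h, w_x, b)` triple per gate) are assumptions made for illustration; a practical network uses weight matrices and a deep learning framework.

```python
import math


def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step with scalar input and scalar state.

    params maps a gate name ("f", "i", "C", "O") to (w_h, w_x, b),
    the weights on h_prev and x_t and the bias of that gate.
    """
    def pre_activation(name):
        w_h, w_x, b = params[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre_activation("f"))       # forget gate
    i_t = sigmoid(pre_activation("i"))       # input gate
    c_hat = math.tanh(pre_activation("C"))   # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat         # new cell state
    o_t = sigmoid(pre_activation("O"))       # output gate
    h_t = o_t * math.tanh(c_t)               # new output
    return h_t, c_t


# With all weights and biases at zero every gate outputs 0.5,
# so half of the previous cell state is kept.
zero_params = {name: (0.0, 0.0, 0.0) for name in ("f", "i", "C", "O")}
h1, c1 = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

The zero-parameter call is a quick sanity check: the forget gate passes exactly half of `c_prev`, and the candidate state contributes nothing because tanh(0) = 0.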

AdaBoost
The idea of the AdaBoost algorithm is to constantly update its weights for the same sample point, train multiple weak regression models, and aggregate the weak regression models with different weights to form a strong regression model [8,9].The specific steps of the AdaBoost algorithm are as follows.
(1) Input the training set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$.
(5) To prevent the AdaBoost algorithm from overfitting, a regularization term β is added. With the regularization term, the iteration of the weak learners is as follows:
$$Y_m(x) = Y_{m-1}(x) + \beta \alpha_m y_m(x)$$
Where $Y_m(x)$ is the strong learner of the m-th round; $y_m(x)$ is the weak learner of the m-th round; $\alpha_m$ is the weight of the weak learner in the final strong regressor.
(6) Output the final strong regressor $Y(X)$, where $Y(X)$ is the output predicted value of the AdaBoost model and $g(x)$ is the median of all $\alpha_m y_m(x)$.
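The reweighting and median-based combination can be illustrated with a short Python sketch. The loss and update formulas below follow the common AdaBoost.R2 variant with a linear loss, which is an assumption here because the exact expressions are not preserved in the text; the function names are hypothetical, and the weak learners are represented only by their errors and predictions.

```python
import math


def update_sample_weights(weights, abs_errors):
    """One AdaBoost.R2-style round with a linear loss (assumed variant).

    Returns (new_weights, alpha). A smaller alpha means a better weak
    learner; samples with large errors gain weight for the next round.
    """
    max_err = max(abs_errors)
    if max_err == 0.0:
        return list(weights), 0.0  # perfect fit: nothing to reweight
    losses = [e / max_err for e in abs_errors]          # scaled to [0, 1]
    avg_loss = sum(w * l for w, l in zip(weights, losses))
    if avg_loss >= 0.5:
        raise ValueError("weak learner too inaccurate; boosting stops here")
    alpha = avg_loss / (1.0 - avg_loss)
    raw = [w * alpha ** (1.0 - l) for w, l in zip(weights, losses)]
    total = sum(raw)
    return [r / total for r in raw], alpha


def weighted_median(predictions, learner_weights):
    """Prediction at which the cumulative learner weight reaches one half."""
    pairs = sorted(zip(predictions, learner_weights))
    half = 0.5 * sum(learner_weights)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value


# Four equally weighted samples; only the last one is predicted badly.
new_weights, alpha = update_sample_weights([0.25] * 4, [0.0, 0.0, 0.0, 2.0])

# Three weak predictions combined with weights ln(1 / alpha_m).
alphas = [0.2, 0.3, 0.45]
combined = weighted_median([9.0, 10.0, 30.0],
                           [math.log(1.0 / a) for a in alphas])
```

In the first example the badly predicted sample ends up with half of the total weight, so the next weak learner concentrates on it. In the second, the outlying prediction of the least reliable learner is ignored by the weighted median.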

Predictive Model Establishment
The hybrid CEEMDAN-LSTM-AdaBoost model consists of two stages, the decomposition of the original load sequence and the prediction by the LSTM-AdaBoost model, as shown in figure 2. To evaluate the performance of the combined CEEMDAN-LSTM-AdaBoost model, the paper adopts two indicators, mean absolute percentage error (MAPE) and root mean square error (RMSE) [10], as shown in equations (18) and (19).
$$MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{x_{predicted,i} - x_{real,i}}{x_{real,i}}\right| \times 100\% \qquad (18)$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_{predicted,i} - x_{real,i}\right)^{2}} \qquad (19)$$
Where $x_{predicted}$ is the load forecast value, $x_{real}$ is the true value of the load, and $n$ is the number of forecast points.
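The two indicators are straightforward to compute. The sketch below uses the standard definitions of MAPE and RMSE; the function names and the two-point example values are illustrative and are not taken from the experiment.

```python
import math


def mape(predicted, real):
    """Mean absolute percentage error, in percent."""
    ratios = [abs((p - r) / r) for p, r in zip(predicted, real)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(predicted, real):
    """Root mean square error, in the unit of the load (e.g. MW)."""
    squares = [(p - r) ** 2 for p, r in zip(predicted, real)]
    return math.sqrt(sum(squares) / len(squares))


# Illustrative values only: two forecast points with errors of +10 and -20.
real_load = [100.0, 200.0]
forecast = [110.0, 180.0]
example_mape = mape(forecast, real_load)   # both points are off by 10 %
example_rmse = rmse(forecast, real_load)   # sqrt((100 + 400) / 2)
```

MAPE is scale-free, while RMSE keeps the unit of the load and penalizes large individual errors more strongly, which is why the two indicators can change by different percentages for the same model.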

Example Analysis
To verify the accuracy of the model, the load data of a substation in a certain city in 2016 and 2017, covering 24 hours a day, was selected, and wind speed, temperature, precipitation, sunshine, time type and date type were added as input data. RMSE and MAPE are the indicators for evaluating the prediction performance of the model, and the prediction results of the LSTM model and the CEEMDAN-LSTM-AdaBoost model are compared and analysed. For the CEEMDAN algorithm, the noise standard deviation Nstd is set to 0.2, the number of realizations NR to 100, and the maximum number of sifting iterations MaxIter to 5000. For LSTM, the maximum number of iterations is set to 200, the learning rate to 0.01, and the gradient threshold to 1. The number of weak predictors is set to 5.
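For reference, the settings above can be collected in one configuration object. The class and field names below are hypothetical. The clipping function assumes that the gradient threshold acts on the L2 norm of the gradient, which is a common convention but is not stated in the text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    # CEEMDAN settings
    noise_std: float = 0.2            # Nstd
    n_realizations: int = 100         # NR
    max_sift_iterations: int = 5000   # MaxIter
    # LSTM settings
    lstm_max_iterations: int = 200
    learning_rate: float = 0.01
    gradient_threshold: float = 1.0
    # AdaBoost setting
    n_weak_predictors: int = 5


def clip_by_norm(gradient, threshold):
    """Rescale the gradient if its L2 norm exceeds the threshold (assumed rule)."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm <= threshold:
        return list(gradient)
    return [g * threshold / norm for g in gradient]


config = ExperimentConfig()
# A gradient of norm 5 is scaled down to norm 1 without changing direction.
clipped = clip_by_norm([3.0, 4.0], config.gradient_threshold)
```

Keeping the settings in a frozen dataclass makes an experiment reproducible from a single object and prevents accidental changes during training.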

CEEMDAN Power Load Decomposition
CEEMDAN decomposes the pre-processed load data step by step into 10 IMF components ordered from high frequency to low frequency. The decomposition results are shown in figure 3. Compared with the original electrical load sequence, the decomposed components become progressively more stable.

Comparison of Experimental Results
To verify the prediction accuracy of the proposed model, the LSTM model and the CEEMDAN-LSTM-AdaBoost model were used to predict the weekly load on the same data set, and RMSE and MAPE were used to evaluate the prediction results of each model. The comparison of prediction errors is shown in table 1. Compared with the single LSTM model, the MAPE of the CEEMDAN-LSTM-AdaBoost model at 168 h decreased by 35.52%, and the RMSE decreased by 76.61%. The prediction results and errors of the two models are shown in figure 4 and figure 5.
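The percentage decreases quoted here are relative reductions with respect to the baseline error. A minimal sketch of the arithmetic follows; the function name and the numbers in the example are hypothetical and are not the values of table 1.

```python
def relative_reduction(baseline_error, proposed_error):
    """Percentage by which the proposed model lowers the baseline error."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Hypothetical errors: a baseline of 4.0 reduced to 3.0 is a 25 % decrease.
example_reduction = relative_reduction(baseline_error=4.0, proposed_error=3.0)
```

The same formula applies to both MAPE and RMSE, with the single LSTM model as the baseline.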
As shown in figure 4, the proposed CEEMDAN-LSTM-AdaBoost approach follows the actual load curve closely and achieves higher prediction accuracy. As shown in figure 5, the prediction error of CEEMDAN-LSTM-AdaBoost is significantly smaller than that of the LSTM model. Most of the IMF modes decomposed by CEEMDAN have obvious regularity, while the rest have smaller amplitudes. By training a separate model on each IMF mode, the obtained models adapt better to each mode and give more accurate prediction results.
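The per-component strategy can be summarized as: predict each IMF component separately, then sum the component predictions. The sketch below shows only this structure. The function names are hypothetical, and the persistence predictor is a placeholder for the LSTM-AdaBoost strong predictor.

```python
def forecast_by_components(components, predict_next):
    """Predict the next value of each component and sum the predictions."""
    return sum(predict_next(series) for series in components)


def persistence(series):
    """Placeholder predictor: repeat the last observed value.
    In the proposed model this role is played by LSTM-AdaBoost."""
    return series[-1]


# Two toy components: a fast oscillation and a slow trend.
components = [[0.5, -0.5, 0.5], [10.0, 11.0, 12.0]]
next_load = forecast_by_components(components, persistence)
```

Passing the predictor as an argument keeps the decomposition stage and the prediction stage independent, so a different model could in principle be trained for each component.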

Conclusions
At the day-ahead time scale, this paper proposes a short-term load forecasting method based on CEEMDAN-LSTM-AdaBoost to improve the accuracy of load forecasting. CEEMDAN decomposes the load data into several more regular IMF components, thereby reducing the influence of the non-stationarity and complexity of the original electric load. Then, the LSTM-AdaBoost strong prediction model is used to predict each subsequence, and the final load prediction is obtained by summing the prediction results of the subsequences.

In the input gate expressions of the LSTM unit, $W_i$ and $b_i$ are the weight and bias of the input gate, and $W_C$ and $b_C$ are the weight and bias of the current unit state.

In step (1) of the AdaBoost algorithm, $x_i$ and $y_i$ are the input and output of the training set.
(2) Initialize the weight distribution of the training data. If there are N samples, assign the same weight to each sample:
$$w_{1i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N$$
Where T is the training set; the subscript l of $w_{li}$ increases as the number of iterations increases; $w_{li}$ is the weight of each sample point in the data.
(3) After setting the initial state, train on the sample set T with the current weight distribution; the m-th round of training gives the weak learner $y_m(x)$. Update the weight distribution of each sample in the training data set and use it for the next iteration.

* Note: The horizontal axis represents data points, and the vertical axis is measured in MW.

Figure 3. Decomposition result of original load series.

Figure 4. Comparison of prediction results.

Figure 5. Error comparison of different models.

Table 1. Comparison of prediction errors of each model.

It can be seen from table 1 that, for both the 24 h and the 168 h load forecasts, the MAPE and the RMSE of the CEEMDAN-LSTM-AdaBoost model are smaller than those of the LSTM model, indicating that the proposed model has high prediction accuracy and fits the load curve better.