Short-Term Load Forecasting Based on Improved VMD and KELM

To address the low accuracy of existing short-term load forecasting methods, this paper builds a prediction model around the LSTM neural network and constructs an adaptive, dynamically adjusted rolling load forecasting model using VMD decomposition, the kernel extreme learning machine (KELM), and correlation analysis. To reduce both the maximum load error and the overall error, a correction algorithm is proposed that improves VMD and minimizes the impact of load decomposition. The model is then applied to one day of load data from a region to verify its performance. The example shows that the proposed method yields more accurate short-term load forecasts.


Introduction
Short-term load forecasting is an essential part of power system operation and dispatching. It underpins the safe and economic operation of the power system and the scientific management and dispatching of the power grid, and it is an integral part of the energy management system and a prerequisite for future commercial operation of the power grid [1].
There has been a great deal of research on improving the accuracy of load forecasting. In [2], the influence of key factors is highlighted by adjusting the attention weights, and a kernel extreme learning machine combined with meteorological data is used for error prediction and correction. In [3], the VMD parameters are optimized with the PSO algorithm to obtain higher accuracy. In [4], a dynamic adaptive prediction model that uses human comfort as a meteorological factor is proposed. In [5], a temperature correction model is proposed that handles the cumulative temperature effect by fitting a temperature rise curve. In [6], a prediction method based on fitting body temperature to load values is proposed. In [7], a segmented prediction model is developed by improving the particle swarm algorithm, which increases the degree of data convergence. In [8], an attention-based LSTM method for ultra-short-term load prediction is proposed, which analyzes the autocorrelation of the load data and trains the network with the Adam algorithm. In [9], a self-adaptive, updatable instantaneous learning algorithm is proposed that uses mutual information to calculate the correlation among meteorological data, individual meteorological factors, and other variables.
In this paper, several load-influencing factors such as temperature, humidity, and pressure are considered, and the concept of human comfort is introduced. The data decomposition used in previous literature often relies on information from the data to be predicted. To address this, a pseudo-prediction method first produces an initial estimate of the data to be predicted, which is then merged into the original data set and decomposed by VMD. The number of pseudo-predicted data points also affects the VMD decomposition results: more pseudo-predicted points do not necessarily bring the decomposition closer to that of the original values, so the best number of pseudo-prediction points is found by PSO/SSA search. Finally, the LSTM algorithm, which captures the load development trend more accurately, is chosen for prediction; the VMD-KELM-LSTM prediction model is proposed, and its accuracy is verified.

Data correction
Maintenance or substation failures can make load values inaccurate for short periods. Because the load at one moment normally does not differ much from the load at the adjacent moment, a threshold can be derived: when the difference between the loads at two adjacent moments is greater than this threshold, the point is identified as a data anomaly and discarded.
In Equation (1), $load_t$ indicates the value of the load at time $t$, and the threshold value is a function of $t$. When the above equation is satisfied, the data point is anomalous and needs to be replaced. The replacement load is a weighted combination of the loads at the four adjacent moments.
In Equation (2), the coefficients are the weight values. The weight values are determined from the most similar day in the historical load.
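As a minimal sketch of the correction step, the Python below flags a point when its jump from the previous sample exceeds a threshold and replaces it with a weighted sum of four neighbouring loads. The constant threshold, the choice of neighbours (t-2, t-1, t+1, t+2), the equal default weights, and the name `correct_anomalies` are assumptions for illustration; the paper derives the weights from the most similar historical day.

```python
def correct_anomalies(load, threshold, weights=(0.25, 0.25, 0.25, 0.25)):
    """Replace anomalous samples with a weighted sum of four neighbours.

    Assumptions (not from the paper): a constant threshold, neighbours at
    t-2, t-1, t+1, t+2, and equal weights unless others are supplied.
    """
    cleaned = list(load)
    offsets = (-2, -1, 1, 2)
    # The first and last two samples lack a full neighbourhood and are skipped.
    for t in range(2, len(cleaned) - 2):
        # Compare with the previous sample, which has already been cleaned.
        if abs(cleaned[t] - cleaned[t - 1]) > threshold:
            cleaned[t] = sum(w * cleaned[t + k] for w, k in zip(weights, offsets))
    return cleaned


series = [100, 102, 101, 500, 103, 104, 102]
print(correct_anomalies(series, threshold=50))
```

Because each sample is compared with the already corrected previous value, a single spike does not cause the following normal sample to be flagged as well.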

Weather factors
Pearson correlation analysis was applied to the factors affecting load change: historical load, temperature, humidity, barometric pressure, wind, precipitation probability, sunshine intensity, and wind direction. Weights were assigned according to the magnitude of the correlation and input into the model. To facilitate data processing, the date and moment types are quantified, with the days from Monday to Sunday expressed as binary variables, and human comfort is introduced as an important reference index.
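The feature preparation described above can be sketched as follows. The Pearson coefficient is standard; normalising the absolute coefficients so that the weights sum to one is an assumption, since the paper only states that weights follow the magnitude of the correlation. The function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def correlation_weights(features, load):
    """Weight each factor by |r|, normalised to sum to 1 (an assumption)."""
    strength = {name: abs(pearson(values, load)) for name, values in features.items()}
    total = sum(strength.values())
    return {name: value / total for name, value in strength.items()}


def one_hot_weekday(weekday):
    """Binary day-type variables; weekday 0 is Monday and 6 is Sunday."""
    return [1 if i == weekday else 0 for i in range(7)]


load = [1.0, 2.0, 3.0, 4.0]
features = {"temperature": [2.0, 4.0, 6.0, 8.0], "humidity": [4.0, 3.0, 2.0, 1.0]}
print(correlation_weights(features, load))
print(one_hot_weekday(2))
```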

In Equation (3), $DI$ refers to human comfort, $T_a$ indicates temperature, $RH$ indicates humidity, and $u$ indicates wind speed. To reduce the impact of weather lag on accuracy, the weather factors of the previous day are added as feature variables for the day to be predicted.
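For illustration, the sketch below computes a comfort index from the three variables named above. It assumes Equation (3) takes a widely used form of the human comfort index, DI = 1.8 T_a - 0.55 (1.8 T_a - 26)(1 - RH) - 3.2 sqrt(u) + 32, with temperature in degrees Celsius, relative humidity as a fraction between 0 and 1, and wind speed in m/s. Both the formula and the units are assumptions and are not confirmed by the text here.

```python
import math


def comfort_index(temp_c, rel_humidity, wind_speed):
    """Human comfort index under an assumed, commonly used formula.

    temp_c: air temperature in degrees Celsius (assumed unit)
    rel_humidity: relative humidity as a fraction in [0, 1] (assumed unit)
    wind_speed: wind speed in m/s (assumed unit)
    """
    return (1.8 * temp_c
            - 0.55 * (1.8 * temp_c - 26.0) * (1.0 - rel_humidity)
            - 3.2 * math.sqrt(wind_speed)
            + 32.0)


print(comfort_index(25.0, 0.6, 4.0))
```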

Error evaluation index
In addition to finding the causes of errors in load forecasting, it is also necessary to quantify the errors. In this paper, the mean absolute percentage error (MAPE) and the root mean square error (RMSE) are used in power load forecasting:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right| \times 100\% \qquad (4)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \qquad (5)$$

In Equations (4) and (5), $\hat{y}_i$ indicates the predicted value, $y_i$ indicates the actual value, and $n$ is the number of samples.

VMD is a time-frequency analysis method. It establishes a constrained optimization problem based on the narrow-band condition of the components in order to estimate the center frequencies of the signal components and reconstruct the corresponding components. Assuming that the original signal is decomposed into $K$ components, under the constraints, the expression can be obtained as Equation (6).
In Equation (6), $K$ is a positive integer denoting the number of modes to be decomposed, $\{u_k\}$ denotes the $k$-th modal component, and $\{\omega_k\}$ denotes the center frequency of the $k$-th component.
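For reference, the constrained problem in the standard VMD formulation, which the description of Equation (6) appears to match, can be written as below. The symbols $f(t)$ for the original signal, $\delta(t)$ for the Dirac impulse, and $*$ for convolution are taken from the standard formulation and are assumptions about the notation used here.

```latex
\min_{\{u_k\},\{\omega_k\}}
\left\{ \sum_{k=1}^{K}
\left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right]
e^{-j\omega_k t} \right\|_2^2 \right\}
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)
```

Each term measures the bandwidth of one mode after it has been shifted to baseband around its center frequency, and the constraint requires the modes to sum to the original signal.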

Particle swarm algorithm
The particle swarm algorithm uses a particle to simulate an individual bird and adjusts the velocity of the particle according to the particle's historical optimal position and the population's historical optimal position. The velocity and position updates are given in Equations (7) and (8). In the equations, $\omega$ denotes the inertia factor.
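A minimal sketch of the update rule described above, assuming the textbook PSO form in which the velocity combines an inertia term with attraction toward the particle's own best position and the swarm's best position, each scaled by a random number on the interval 0 to 1. The learning factors `c1` and `c2`, the default parameter values, the bound clamping, and the function name are assumptions for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [xi[:] for xi in x]                 # individual best positions
    p_val = [objective(xi) for xi in x]
    g_idx = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g_idx][:], p_val[g_idx]   # global best position

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive term + social term.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                # Position update, clamped to the search bounds.
                lo, hi = bounds[d]
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)
            val = objective(x[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = x[i][:], val
                if val < g_val:
                    g_best, g_val = x[i][:], val
    return g_best, g_val


best, value = pso_minimize(lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2,
                           [(-5, 5), (-5, 5)])
print(best, value)
```

In the setting of this paper the objective would be the fitness of a VMD parameter choice; the quadratic used here is only a stand-in.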

Kernel Extreme Learning Machine
The extreme learning machine (ELM) is an algorithm that learns quickly. For single-hidden-layer neural networks, ELM randomly initializes the input weights and biases and then obtains the corresponding output weights, as shown in Figure 1. KELM is an improved variant of ELM that retains the advantages of ELM while significantly improving prediction performance. The connection weights between the input and hidden layers and the thresholds of the hidden layer can be set randomly. The connection weights β between the hidden layer and the output layer are determined by solving a system of equations.
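As a sketch of how the output weights can be obtained by solving a system of equations, the code below uses the commonly cited KELM closed form, in which β solves (Ω + I/C) β = T for a kernel matrix Ω and targets T. The RBF kernel, the regularisation constant `C`, the kernel width `gamma`, and the class and function names are assumptions for illustration.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


class KELM:
    """Kernel extreme learning machine for scalar regression (sketch)."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        n = len(X)
        # Kernel matrix plus the ridge term I / C on the diagonal.
        A = [[rbf_kernel(X[i], X[j], self.gamma) + (1.0 / self.C if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        self.X = [list(row) for row in X]
        self.beta = solve_linear(A, list(y))
        return self

    def predict(self, X):
        return [sum(b * rbf_kernel(x, xi, self.gamma)
                    for b, xi in zip(self.beta, self.X)) for x in X]


X_train = [[i / 10.0] for i in range(21)]
y_train = [math.sin(row[0]) for row in X_train]
model = KELM().fit(X_train, y_train)
print(model.predict([[1.05]]))
```

The kernel replaces the random hidden layer of ELM, so no hidden weights need to be drawn and the only training step is the linear solve.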

Improving the VMD algorithm with PSO and KELM
When the PSO algorithm is used to search for the parameters that influence the VMD algorithm, the energy tracking method is chosen as the fitness function [10]. To avoid unsatisfactory decomposition caused by the lack of information about the day to be predicted, KELM is used in this paper to pre-correct the VMD sequence. Twenty-six days are taken as the training set and the 27th day as the test set. The experiment finds that the decomposition sequence obtained from all 27 days of data is far more accurate than the one obtained from 26 days, which ultimately leads to more accurate prediction results. Therefore, the BP neural network, the ELM model, and the KELM model were each used to first predict α points of the load sequence, which was then decomposed; the decomposition diagram is shown in Figure 2.

Figure 2 Prediction Comparison Chart
Across several experiments, the BP algorithm was found to be less effective than the KELM algorithm. A new training set was therefore formed from the KELM-predicted 27th-day data and the original training set, and VMD decomposition was applied to it to achieve higher accuracy.
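The pre-correction idea, appending pseudo-predicted points to the history before decomposing, can be sketched as follows. The `persistence_forecaster` is a hypothetical stand-in for the KELM predictor, and the period of 96 samples assumes 15-minute data; the extended series would then be passed to the VMD routine.

```python
def extend_with_pseudo_forecast(history, forecaster, alpha):
    """Append `alpha` pseudo-predicted points to the history.

    `forecaster(history, alpha)` must return exactly `alpha` values.
    """
    pseudo = list(forecaster(history, alpha))
    if len(pseudo) != alpha:
        raise ValueError("forecaster must return exactly alpha points")
    return list(history) + pseudo


def persistence_forecaster(history, alpha, period=96):
    """Hypothetical stand-in for KELM: repeat the value one period earlier."""
    series = list(history)
    out = []
    for _ in range(alpha):
        out.append(series[-period])
        series.append(out[-1])
    return out


history = list(range(200))
extended = extend_with_pseudo_forecast(history, persistence_forecaster, alpha=4)
print(extended[-4:])
```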

LSTM memory neural network
The LSTM is an optimized variant of the RNN designed to solve the problems caused by long time series. It adds an input gate, an output gate, and a forget gate to filter the computational results, and it uses the forget gate to remove inconsistent results so that only valid results are retained. The architecture of the standard LSTM consists of a memory cell state and three gates, as shown in Figure 3.

Figure 3 LSTM topology diagram
The equations for the forget gate $f_t$, input gate $i_t$, output gate $o_t$, and outputs $l_t$ and $h_t$ are given in Equation (9).
In Equation (9), the additive constants in each gate are the bias terms; $\sigma$ and $\tanh$ are the activation functions [11].
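One time step of a standard LSTM cell can be sketched as below. It follows the textbook gate equations and is not necessarily term-by-term identical to Equation (9). The parameter names (`W_f`, `b_f`, and so on), the use of `c` for the cell state (which may correspond to $l_t$ above), and the concatenation order of the previous hidden state and the input are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """Advance a standard LSTM cell by one time step."""
    z = list(h_prev) + list(x_t)    # concatenated [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]    # forget gate
    i = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]    # input gate
    o = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]    # output gate
    g = [math.tanh(a) for a in affine(params["W_c"], z, params["b_c"])]  # candidate state
    # New cell state: keep part of the old state and add part of the candidate.
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    # New hidden state: gated, squashed cell state.
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


zeros = {"W_f": [[0.0, 0.0]], "b_f": [0.0], "W_i": [[0.0, 0.0]], "b_i": [0.0],
         "W_o": [[0.0, 0.0]], "b_o": [0.0], "W_c": [[0.0, 0.0]], "b_c": [0.0]}
print(lstm_step([1.0], [0.0], [2.0], zeros))
```

With all weights and biases at zero, every gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved, which makes the role of the forget gate easy to see.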

Introduction to the algorithm
To further study the accuracy of this combined prediction model, actual load data and actual weather data (including humidity, temperature, precipitation probability, barometric pressure, and other factors) from a western region, sampled at 15-min intervals from August 10, 2022, to 8:00 on February 7, 2023, were selected as the historical data for predicting the following day. The 96 points from 8:00 on February 7 to 8:00 on February 8 were used as the test set for short-term load forecasting.

Analysis of prediction results of different models
This paper uses the proposed model, the LSTM model, the BP model, and the VMD-LSTM model to predict one day of load for this northwest region. Table 1 shows the comparison results. The proposed model achieves the best one-day prediction result among the compared models, which indicates higher accuracy and stronger robustness. Compared with the VMD-LSTM model, adding the KELM correction step improves the overall prediction accuracy of the VMD-based model by 4.7%. Compared with the LSTM and BP neural networks, VMD increases the prediction accuracy for the first half of the day (data points 0-48), which better highlights the influence of key features on the prediction results. This indicates that the LSTM neural network can better exploit the development pattern of load data and thus improve prediction accuracy. To show the prediction effect of the different models more intuitively, the predicted and actual load values for one day are compared across models, as shown in Figure 4.
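The model comparison relies on the MAPE and RMSE indices defined earlier. A minimal sketch of computing them is given below; the function names and the two-point toy data are hypothetical and are not results from the paper.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / n


def rmse(actual, predicted):
    """Root mean square error, in the units of the load."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


actual = [100.0, 200.0]
predicted = [110.0, 190.0]
print(mape(actual, predicted), rmse(actual, predicted))
```

MAPE is scale-free and convenient for comparing days with different load levels, while RMSE penalises large individual errors such as a missed peak more heavily.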

Conclusion
This paper analyzed the data, considered factors such as temperature, humidity, and air pressure, experimented with various correction algorithms to correct the VMD decomposition results, and proposed a combined VMD-KELM-LSTM prediction model that can improve load prediction accuracy. After analyzing the actual electricity consumption, the following conclusions were obtained:
1) The concept of human comfort is introduced, and this mechanism highlights the factors that have a greater impact on load prediction accuracy by assigning different weights to weather factors such as temperature and humidity.
2) Several experiments show that the LSTM neural network model is more practical and adaptable in power system load forecasting than neural networks such as BP, and that it gives more accurate predicted values and prediction trends.
3) When VMD decomposition is performed on the load sequence, the gap between the decomposition series and the decomposition of the real data can be too large and the intrinsic regularity weak, and the lack of information about the day to be predicted leads to an unsatisfactory decomposition. To address these problems, we introduce the KELM correction algorithm and establish the combined VMD-KELM-LSTM prediction model, which has strong nonlinear mapping ability. The prediction accuracy is greatly improved.
4) The combined method proposed in this paper is not very sensitive to small changes in load, and the prediction accuracy needs to be improved.In the future, we will continue to optimize the parameters and introduce clustering algorithms and new weather impact factors to improve prediction accuracy.

In the particle swarm equations, the random factor is a number on the interval 0 to 1. $P_{id}$ denotes the $d$-th dimension of the individual extreme value of the $i$-th particle, and $P_{gd}$ denotes the $d$-th dimension of the global optimal solution.

Figure 1 ELM topology diagram

Figure 4 Results Comparison Chart