Application of Gray Wolf Optimization Algorithm in Urban Electricity Load Forecasting Model

A combined prediction model based on the long short-term memory (LSTM) neural network and the convolutional neural network (CNN) is proposed to increase the accuracy of short-term load forecasting. To address the tendency of the grey wolf optimization (GWO) search process to fall into local optima, an improved grey wolf optimization algorithm (IGWO) is proposed that updates the convergence factor using the lower incomplete gamma function, improving global optimization performance. The Dropout technique is used to improve the generalization ability of the model; the network layers are improved through better weight initialization; the model is trained with the adaptive moment estimation (Adam) optimizer; test data are then input to the trained neural network, and the optimized model is used for prediction. Experiments demonstrate the high prediction accuracy of the proposed method.


Introduction
With the power sector expanding quickly and smart grid technology becoming more widely used, power load forecasting plays a vital role in energy planning tasks such as power generation and distribution, and short-term load forecasting is the basis for efficient operation and analysis of power systems [1]. Load forecasting methods can be roughly divided into statistical methods, artificial intelligence methods, and combined methods. Statistical methods include the time series method, the regression analysis method, etc. Artificial intelligence methods include the fuzzy forecasting method, the expert system method, the grey model method, etc. The combined method combines the above two classes of methods [2]. Nonlinearity and dynamic uncertainty in the smart grid environment are major barriers to prediction accuracy [3]. Recent years have seen a significant increase in interest in deep learning as a machine learning technique in the field of short-term power load forecasting. In addition to its advantages in processing high-dimensional and nonlinear data, it has strong automatic feature extraction capabilities.

LSTM
The long short-term memory network is based on the RNN and has three gating units: the forget gate, the input gate, and the output gate [4]. Thanks to its special gating units and memory cell, the LSTM can process long time sequences effectively and overcome the problem of long-term dependence on the input sequence. The neuron structure distributed in the hidden layer is the main difference between the RNN and the LSTM; the LSTM model adds a cell state to preserve the long-term state, as shown in Figure 1.
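The standard LSTM gate computations described above can be sketched in plain numpy. This is a minimal illustration of one time step, not the paper's implementation; the weight layout (the four gate pre-activations stacked in one matrix) and the sizes are our own choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W maps [h_prev; x] to the four stacked gate
    pre-activations: forget, input, candidate, output."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0:H])           # forget gate
    i = sigmoid(z[H:2 * H])       # input gate
    g = np.tanh(z[2 * H:3 * H])   # candidate cell state
    o = sigmoid(z[3 * H:4 * H])   # output gate
    c = f * c_prev + i * g        # cell state carries long-term memory
    h = o * np.tanh(c)            # hidden state (short-term output)
    return h, c

rng = np.random.default_rng(0)
H, D = 4, 3                       # hidden size, input size (arbitrary)
W = rng.standard_normal((4 * H, H + D)) * 0.1
b = np.zeros(4 * H)
h, c = lstm_step(rng.standard_normal(D), np.zeros(H), np.zeros(H), W, b)
print(h.shape, c.shape)
```

The cell state `c` is the added long-term memory path mentioned above: it is updated additively, which is what lets gradients survive over long sequences.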

Gray Wolf Optimization Algorithm
The grey wolf optimization algorithm, proposed by Seyedali Mirjalili, is a swarm intelligence optimization algorithm inspired by the social hierarchy and hunting behavior of grey wolves. It is distinguished by straightforward operation, a small number of adjustable parameters, and simple programming implementation. Grey wolves are social predators with a strict social hierarchy, and a pack can be divided into four social levels. As shown in Figure 3, the first level is the head wolf, denoted α, which is responsible for decisions such as hunting and resting; the wolves in the second level are subordinate to the α wolf, are denoted β, and assist the α in making decisions; the wolves in the third level are subordinate to the upper two levels and are denoted δ; the last level, denoted ω, is subordinate to the wolves of all upper levels. The mathematical model of the GWO algorithm is expressed as follows:

B = |C · X_p(t) − X(t)|
X(t + 1) = X_p(t) − A · B

where B is the distance between X_p (the prey) and X (the grey wolf), and A and C (the coefficient vectors) are expressed as:

A = 2a · r_1 − a, C = 2 · r_2

where r_1 and r_2 are random vectors in [0, 1] and a is the convergence factor.
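The position-update equations above can be sketched as follows. In the full GWO algorithm each wolf is guided by the α, β, and δ leaders and the results are averaged; the function below is an illustrative sketch of one such update step, with names and the averaging detail stated as assumptions rather than taken from the paper.

```python
import numpy as np

def gwo_step(wolves, alpha, beta, delta, a, rng):
    """One GWO position update: move every wolf toward the three leaders."""
    new = np.empty_like(wolves)
    for i, X in enumerate(wolves):
        candidates = []
        for leader in (alpha, beta, delta):
            r1, r2 = rng.random(X.shape), rng.random(X.shape)
            A = 2 * a * r1 - a           # A = 2a*r1 - a
            C = 2 * r2                   # C = 2*r2
            B = np.abs(C * leader - X)   # B = |C*X_p - X|
            candidates.append(leader - A * B)  # X(t+1) = X_p - A*B
        new[i] = np.mean(candidates, axis=0)   # average of the three guides
    return new

rng = np.random.default_rng(1)
wolves = rng.random((5, 2))  # 5 wolves in a 2-D search space
# treat the first three wolves as the current alpha/beta/delta for illustration
out = gwo_step(wolves, wolves[0], wolves[1], wolves[2], a=1.0, rng=rng)
print(out.shape)
```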

Nonlinear Adaptive Convergence Factor
The standard GWO algorithm's convergence factor a decreases linearly, which significantly reduces the algorithm's later-stage capacity for local search, while the algorithm's effectiveness also depends on its capacity for global search. The lower incomplete gamma function effectively balances the algorithm's local and global search abilities and is used to update the convergence factor a, where T_max denotes the maximum number of iterations allowed, r is a random number, and the upper and lower bounds of a are denoted a_max and a_min, respectively. The α, β, and δ wolves are in charge of guiding the population's overall hunting behavior. The algorithm is sped up by adding a random perturbation and increasing the weight of the α wolf, which decreases algorithm stability but helps the search jump out of local optima.
The precise update formula is as follows, where Random is a random number drawn from the standard normal distribution.
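The paper's exact convergence-factor expression is not reproduced in the text above, so the following is only a plausible illustration of the idea: a nonlinear schedule for a driven by the regularized lower incomplete gamma function. The shape parameter k and the scaling of the iteration index are our assumptions, not the paper's values.

```python
import numpy as np
from scipy.special import gammainc  # regularized lower incomplete gamma P(k, x)

def convergence_factor(t, t_max, a_max=2.0, a_min=0.0, k=2.0, scale=6.0):
    """Illustrative nonlinear decay of a from a_max toward a_min.
    k and scale are assumed values chosen for demonstration only."""
    return a_max - (a_max - a_min) * gammainc(k, scale * t / t_max)

T = 100
a_vals = [convergence_factor(t, T) for t in range(T + 1)]
print(round(a_vals[0], 3), round(a_vals[-1], 3))
```

Compared with the standard linear schedule, this curve decays slowly at first (favoring global search) and faster later (favoring local search), which is the balance the section describes.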

IGWO-CNN-LSTM Load Prediction Modeling
In this paper, the CNN-LSTM model optimized by the improved grey wolf algorithm is called the IGWO-CNN-LSTM model.
The LSTM has a two-layer structure, and the hyperparameters involved are mainly the numbers of neurons in the hidden layers L1 and L2 and the learning rate. These hyperparameters are used as the search dimensions over which the IGWO algorithm finds the best configuration.
The Dropout technique is used to increase the generalization capability of the model, and the network layers are enhanced through improved weight initialization. The model is trained with the adaptive moment estimation (Adam) optimizer, the test data are fed into the trained neural network model, and finally the optimized model is used for prediction.
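The two regularization ideas mentioned above, Dropout and improved weight initialization, can be sketched in numpy. This is a generic illustration (inverted dropout and He initialization are common choices), not the paper's specific settings; the layer sizes are arbitrary examples.

```python
import numpy as np

def he_init(fan_in, fan_out, rng):
    """He (Kaiming) initialization: one common improved weight scheme,
    scaling by sqrt(2/fan_in) to keep activation variance stable."""
    return rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in)

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero units with probability `rate` and rescale
    the survivors so the expected activation is unchanged."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

rng = np.random.default_rng(0)
W = he_init(64, 32, rng)          # example layer: 64 inputs, 32 units
h = dropout(np.ones(32), rate=0.5, rng=rng)
print(W.shape, float(h.max()))
```

At inference time `training=False`, so the layer passes activations through unchanged, which is why the rescaling is done during training.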

Load forecast data processing
The dataset uses annual electricity load, electricity price, and climate data from Queensland, Australia. The load data have a 30-minute granularity, with 48 data points per day. The test set consists of the last 20 days of the year, while the training set includes the remaining 345 days of data.
Data gaps that may be caused by external factors are deleted, and the data are sorted by time. The data must be normalized to keep the dimensions of the data consistent and to make the model simpler to train. The normalization formula is:

X_n = (X − X_min) / (X_max − X_min)

where X is the true data, X_n is the normalized data, X_max is the maximum value, and X_min is the minimum value.
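The min-max normalization formula above can be applied directly; the sample load values below are made-up numbers for illustration.

```python
import numpy as np

def min_max_normalize(x):
    """X_n = (X - X_min) / (X_max - X_min), mapping the series into [0, 1]."""
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min)

load = np.array([4200.0, 5100.0, 6300.0, 5800.0])  # example MW values
scaled = min_max_normalize(load)
print(scaled.min(), scaled.max())  # 0.0 1.0
```

To recover real-unit predictions, the inverse transform X = X_n · (X_max − X_min) + X_min is applied to the model output using the training-set extrema.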

Evaluation indicators
To evaluate the precision of the model's prediction results, this paper uses MAPE, RMSE, MAE, and the relative error curve as evaluation criteria. The lower the values of RMSE, MAE, and MAPE, and the closer the relative error is to zero, the more accurate the model. The metrics are defined as [10]:

RMSE = sqrt((1/n) Σ_{t=1}^{n} (y_t − p_t)^2)
MAE = (1/n) Σ_{t=1}^{n} |y_t − p_t|
MAPE = (100%/n) Σ_{t=1}^{n} |(y_t − p_t) / y_t|

where y_t is the true value and p_t is the predicted value.
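The three metrics above are standard and translate directly to code; the tiny arrays below are example values only.

```python
import numpy as np

def rmse(y, p):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y - p) ** 2)))

def mae(y, p):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - p)))

def mape(y, p):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs((y - p) / y)) * 100.0)

y = np.array([2.0, 4.0])  # true values (example)
p = np.array([1.0, 5.0])  # predicted values (example)
print(rmse(y, p), mae(y, p), mape(y, p))  # 1.0 1.0 37.5
```

Note that MAPE is undefined when a true value y_t is zero, so near-zero load points should be excluded or handled separately.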

Analysis of forecast results
To confirm the model's reliability and accuracy, the LSTM, the CNN-LSTM, and the IGWO-CNN-LSTM method used in this paper were selected for comparison. The values of the assessment indicators for the prediction results of each model, and the corresponding improvements, are listed in Table 1. The comparison shows that the assessment indicators of the optimized model improve on the baselines to some extent.

Figure 2. Structure diagram of CNN-LSTM
According to Figure 2, the CNN-LSTM model, which combines CNN and LSTM, maximizes the ability of CNN to extract feature information and the sensitivity of LSTM to time series data in order to improve the prediction of electric load.

Table 1. Comparison of assessment parameters
The short-term load forecasting model proposed in this paper is based on the IGWO-CNN-LSTM and has been found to have high accuracy and applicability through comparative experimental simulations of different models. The problem that the grey wolf optimization algorithm easily falls into local optima with low convergence accuracy is addressed by improving the convergence factor. The comparative simulation experiments show that the IGWO offers a significant improvement in both its capacity for seeking out optima and its rate of convergence. According to the experimental findings, the improved grey wolf optimization algorithm gives the load forecasting model a significant advantage.