A combined wind-PV power prediction method based on clustering prescreening

The high uncertainty of renewable energy makes its output volatile, so the large-scale integration of renewable energy into the electric power grid poses major challenges to the scheduling and operation of the power system. Effective power prediction can not only relieve the pressure of peaking and frequency regulation but also improve the accuracy of decision-making. To make the most of the information shared between wind and photovoltaic (PV) power sources, this paper puts forward a wind-PV joint power forecasting model. First, the raw power data are pre-screened by clustering to select the wind and PV power sources that are more closely connected. Next, the screened data are pre-processed and passed separately through the upper CNN structures. Then, the outputs of the CNN modules are fed together into the prediction layer, which includes LSTM layers and fully connected layers. Finally, the proposed wind-PV joint prediction model is verified by simulation experiments.


Introduction
Driven by resource scarcity and environmental degradation, renewable energy is becoming increasingly important because of its zero pollution and low cost. However, the intermittence and uncertainty of wind and photovoltaic power pose a serious threat to the large-scale integration of variable renewable power. To improve the security and stability of electric power systems in dispatching problems, many scholars have conducted a series of studies.
Yang et al. [1] proposed a probabilistic interval prediction method for wind power based on a two-layer CNN. Wang et al. [2] put forward a new deep Asymmetric Labyrinthian Neural Network, which determines the optimal input by describing the linear and nonlinear relationships between the targeted and trained data. Lee et al. [3] proposed an LSTM-based PV power prediction model, which uses only the PV power obtained before meteorological, seasonal and peak time. Hussein Sharadga et al. [4] proposed a new Bi-LSTM deep learning algorithm for power prediction of massive PV power plants after comparing different prediction methods. Wang et al. [5] proposed a partial daily pattern prediction (PDPP) framework that uses an LSTM-RNN model based on the time correlation principle to offer accurate daily pattern information for a specific date.
Many of these papers target only one power source, and few combine wind power and PV power for power prediction. In this paper, we construct a joint power forecasting model that includes both wind and photovoltaic power sources in order to fully utilize the information between them.

Methods
Because wind and photovoltaic power are closely connected, this paper puts forward a joint power prediction model on the basis of pre-screening. The model includes two parts. The first part pre-screens the power sources by clustering and selects the more correlated power sources as the input of the prediction layer. The second part is the prediction layer, which includes convolutional neural network (CNN) layers, long short-term memory (LSTM) network layers, and fully connected layers. The pre-screened and pre-processed wind power and photovoltaic power are passed through the CNN structure, where they are fitted, parameter-optimized, and transformed to a suitable dimension. The output of this structure is then fed into the LSTM network layers and the fully connected layers.

Feature calculation
To improve prediction accuracy, wind power and PV power are predicted simultaneously to make full use of the historical information. Figure 1 shows the power of four wind turbines and PV panels with different installed capacities. As the figure shows, the power trend lines are tightly interwoven between 8 am and 4 pm, which indicates a close connection between them.

Figure 1. Wind power and PV power trend graph
In this paper, the power of multiple wind and photovoltaic power sources over the past 24 hours, sampled at 5-minute intervals, is used as input to predict the power for the next 15 minutes.
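The input and target construction described above can be sketched as a sliding window over one power series. This is a minimal sketch rather than the authors' code: the function name `make_windows`, the reading of the 15-minute horizon as three 5-minute target steps, and the one-step stride are all assumptions.

```python
def make_windows(series, input_len=288, horizon=3):
    """Slice one power series into (input, target) pairs.

    input_len=288 corresponds to 24 hours at 5-minute resolution.
    horizon=3 is an assumed reading of "the next 15 minutes" as
    three 5-minute steps.
    """
    pairs = []
    last_start = len(series) - input_len - horizon
    # Slide the window forward one step at a time (assumed stride of 1).
    for start in range(last_start + 1):
        x = series[start:start + input_len]
        y = series[start + input_len:start + input_len + horizon]
        pairs.append((x, y))
    return pairs


# Toy series of 300 readings (illustrative values, not real power data).
series = [float(t) for t in range(300)]
pairs = make_windows(series)
print(len(pairs), len(pairs[0][0]), pairs[0][1])
```

With several wind and PV sources, the same slicing would be applied to each series so that all inputs share the same time index.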
To improve the prediction accuracy of the model, this paper first normalizes the raw data with the StandardScaler function. StandardScaler rescales the data to zero mean and unit variance. The process is shown in Equation (1).
$$\hat{x} = \frac{x - \mu}{\sigma} \qquad (1)$$
where $x$ is a raw data value, $\hat{x}$ is its normalized value, $\mu$ is the mean value of the selected data, and $\sigma$ is the standard deviation of the selected data.
The model is trained by minimizing the mean squared error (MSE) between the predicted and true values to find the optimal model parameters. The loss function is shown in Equation (2).
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2 \qquad (2)$$
where $\hat{y}_i$ is the predicted wind power, $y_i$ is the actual value of wind power, and $n$ is the number of samples.
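The normalization of Equation (1) and the loss of Equation (2) can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions: the helper names `standardize` and `mse` are hypothetical, the sample values are toy numbers, and the population standard deviation (division by n) is used because that is what scikit-learn's StandardScaler computes.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Equation (1): subtract the mean, divide by the standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation (divides by n)
    if sigma == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(v - mu) / sigma for v in values], mu, sigma


def mse(predicted, actual):
    """Equation (2): mean of the squared prediction errors."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Toy power readings (illustrative values only, not from the data set).
power = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled, mu, sigma = standardize(power)
print(mu, sigma)
print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

In practice, the mean and standard deviation would normally be estimated on the training portion only and reused for the test portion, which is why the sketch returns them alongside the scaled values.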

Model description
In this paper, the raw power data is first pre-screened by clustering and then power prediction is performed on the selected power sources.
Xu [6] first applied a loss function from the speech domain to power source clustering. This paper uses that loss function to cluster the power sources and to select the more closely connected wind and PV power sources for power prediction. The model structure is shown in Figure 2.
where $M$ is the similarity matrix and $v$ is the embedding vector obtained by training.
CNN module: The module consists of a series of CNN blocks, as shown in Equation (6). Convolution blocks 1 to 3 have 128, 16, and 1 channels, respectively, and the kernel size is 3×3.
where $x_j^l$ is the $j$th feature of the $l$th layer, $k$ is the convolution kernel, $M$ is the feature map, $b$ is the bias, $*$ is the convolution operation, and $f(\cdot)$ is the activation function.
LSTM layer: An LSTM layer usually consists of multiple memory cells and three control gates. The iterative process of each state of the LSTM network is shown in Equation (7).
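For orientation, the standard textbook forms of a convolution layer and of an LSTM cell that match the symbols defined in this section are sketched below. The exact indexing, the concatenated input $[h_{t-1}, x_t]$, and the symbols $i_t$, $f_t$, $o_t$, $\tilde{c}_t$, $c_t$, $h_t$, $W_c$, and $b_c$ are assumptions of this sketch and may differ from the notation of Equations (6) and (7).

```latex
% Convolution layer: feature j of layer l (assumed standard form)
x_j^{l} = f\Big( \sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l} \Big)

% LSTM cell: input, forget, and output gates, then state updates
\begin{aligned}
i_t &= \sigma\big(W_i [h_{t-1}, x_t] + b_i\big) \\
f_t &= \sigma\big(W_f [h_{t-1}, x_t] + b_f\big) \\
o_t &= \sigma\big(W_o [h_{t-1}, x_t] + b_o\big) \\
\tilde{c}_t &= \tanh\big(W_c [h_{t-1}, x_t] + b_c\big) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

Here $\odot$ denotes element-wise multiplication, $c_t$ is the memory cell state, and $h_t$ is the hidden state passed to the next time step.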

Data and Parameters
The data for this experiment were obtained from RTS-GMLC. From this data set, 4 wind power sources and 4 photovoltaic power sources at the same location are selected. The 288 points of each 24-hour period are used as input, and the training samples are selected by rolling this window along the time axis.
The clustering effect of the pre-screening is shown in Figure 4. Here, rho is the threshold value of the clustering pre-screening, applied to the similarity of the embedding vectors of two power sources.
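The thresholding step can be illustrated with a small sketch. The text does not state which similarity measure is used, so cosine similarity is an assumption here, as are the helper names `cosine_similarity` and `screen_pairs` and the toy two-dimensional embeddings; only the threshold rho = 0.5 comes from the parameter settings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def screen_pairs(wind, pv, rho=0.5):
    """Return (wind, PV) name pairs whose embedding similarity exceeds rho."""
    selected = []
    for wind_name, wind_vec in wind.items():
        for pv_name, pv_vec in pv.items():
            if cosine_similarity(wind_vec, pv_vec) > rho:
                selected.append((wind_name, pv_name))
    return selected


# Toy embeddings (hypothetical values, not trained vectors).
wind = {"wind_1": [1.0, 0.0], "wind_2": [0.0, 1.0]}
pv = {"pv_1": [0.8, 0.6], "pv_2": [-1.0, 0.0]}
print(screen_pairs(wind, pv))
```

Raising rho keeps only the most similar pairs, while lowering it admits more sources into the joint prediction. The sketch does not guard against zero-length embedding vectors.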

Experimental results
To verify the validity and stability of the proposed model, different models were constructed in this experiment. The results are shown in Table 2 and Figure 5. As can be seen, the prediction results for wind power are more accurate. In particular, when the power data are pre-screened by clustering before the joint prediction, the prediction accuracy for wind power improves markedly and the deviation decreases by about half. These results support the validity of the joint prediction model proposed in this paper.

Conclusion
In the context of building a new type of electric power system, variable renewable energy, represented by wind and other power sources, is receiving more and more attention. Considering the close connection between the two power generation methods, this paper proposes a joint wind-PV power prediction model based on pre-screening by clustering. The results show that the proposed model can learn the implied information between the two sources and predict the wind and PV power accurately.

Figure 2. Wind-PV joint power prediction model diagram
Screening: The raw data are passed through a series of LSTM layers and trained to obtain a series of embedding vectors. The power sources that are more closely connected are then selected. Figure 3 depicts the procedure of data clustering.

Figure 4. Power t-SNE clustering diagram
The power sources that lie closer together in the diagram are selected as the input power sources for joint prediction before the experiment. The specific parameters of the training model are shown in Table 1.

Table 1. Parameter settings
Parameter name    Parameter value
batch_size        64
learning_rate     0.001
Hidden_size       64
optimizer         Adam
activation        ReLU
rho               0.5

$W_i$, $W_f$, $W_o$ and $b_i$, $b_f$, $b_o$ denote the weights and biases of the input, forget, and output gates, respectively; $\sigma(\cdot)$ and $\tanh(\cdot)$ are both activation functions.

Table 2. Comparison of MSE for different models
Figure 5. Comparison of actual power and forecast power