Multi-Task neural network model considering low power output risk for short-term photovoltaic forecasting

As the proportion of installed photovoltaic power generation continues to increase, low output under the influence of large-scale weather systems has an increasingly significant impact on the power grid. There is an urgent need to improve the accuracy of photovoltaic power generation forecasting and to reduce the risk of insufficient output in forecast results. To this end, a multi-task neural network model considering low power output risk (LPOMTN) for short-term photovoltaic forecasting is proposed. First, a numerical weather forecast encoder based on multi-scale CNN-Attention is established to extract photovoltaic output characteristics at multiple time scales. Then, a low-output day prediction module based on parallel CNN decoding and a photovoltaic low-output process prediction module based on the clear-sky model and LSTM are established. The latter takes the prediction results of the former as input, and the two modules are trained jointly in a multi-task learning manner to strengthen the model's sensitivity to low-output states and improve the accuracy of low-output process predictions. The case study results show that LPOMTN achieves higher average forecasting accuracy for power output processes than methods such as XGBoost.


Introduction
Driven by the "14th Five-Year Plan" and the "double carbon" goal, China's new power system structure continues to develop, and the installed capacity of photovoltaic power generation continues to reach new highs [1]. As of September 2023, the country's installed renewable energy capacity was approximately 1.384 billion kilowatts, a year-on-year increase of 20%, accounting for approximately 49.6% of the country's total installed capacity, of which installed photovoltaic capacity was 521 million kilowatts. This makes it more difficult for the grid to maintain a stable supply of electricity. Unlike traditional power generation, which is continuous, controllable, and adjustable, photovoltaic output power is affected by external meteorological conditions and by the electrical properties of the power station itself. It is intermittent, variable, and random [2][3]. As the proportion of grid-connected photovoltaic power generation continues to increase, the uncertainty of photovoltaics has an increasingly serious impact on the stability of the power system. In particular, when output is insufficient and a large amount of grid backup resources is occupied, the grid is likely to suffer insufficient capacity, a tight balance, or even power shortages, resulting in large economic losses. Accurate photovoltaic power prediction can guide grid operators in formulating accurate and timely dispatch plans, thereby reducing the impact of photovoltaic uncertainty to a certain extent.

At present, photovoltaic power prediction technology can be divided into two categories according to the prediction mode: numerical prediction and probabilistic prediction. Numerical prediction represents the expected output of the station in the future. According to the prediction method, photovoltaic output prediction is mainly divided into physical models and statistical models. The physical model studies the mechanism of photovoltaic power generation and combines the component parameters of the photovoltaic power station, solar radiation data, and geographical location information to establish physical equations, such as the solar radiation transfer equation, to predict photovoltaic power generation. The statistical model mainly uses numerical weather prediction (NWP) data and the operating data of photovoltaic stations to establish a linear or nonlinear mapping between the NWP and power. Related studies also combine physical and statistical models, linear and nonlinear models, or several nonlinear models [4][5][6]. Such combined methods can integrate the advantages of the individual methods to improve overall prediction performance. Scholars from the University of Nevada first optimized and improved the ARMA model and then performed power prediction on the photovoltaic power generation system of NV Energy Company. They found that the prediction error on sunny days was 23% to 43% [7]. Antonello Rosato et al. used the ADMM algorithm and an Echo State Neural Network (ESN) to predict distributed photovoltaic power [8].
To further characterize the uncertainty of photovoltaic output, some scholars use probabilistic analysis. Probability-based methods usually describe the prediction error of photovoltaic output with a specific distribution, such as the Gaussian (normal) distribution, the beta distribution, or the t location-scale distribution (t distribution). For the probabilistic prediction of photovoltaic power, Zhao et al. [9] introduced the t-distributed stochastic neighbor embedding (t-SNE) algorithm for feature dimensionality reduction and proposed a probabilistic prediction method based on an improved Bayesian neural network, which obtains a prediction interval with a certain accuracy even under sudden changes in photovoltaic power. Another approach generates massive weather scenarios and uses them as input variables of a machine learning model to obtain probabilistic predictions of photovoltaic power generation. For photovoltaic output interval prediction, Li et al. [10] obtained ultrashort-term prediction intervals of PV output directly and quickly by optimizing the output weights based on boundary valuation theory. Ogliari et al. [11] used artificial neural networks and intelligent algorithms to analyze the meteorological factors that affect photovoltaic output.
In summary, existing research studies photovoltaic output prediction mainly from two aspects: improving the average accuracy of prediction results and characterizing output uncertainty. However, the low-output phenomenon is still rarely considered, and there is a considerable risk that low output is underreported. This makes it difficult to rely on photovoltaic power generation to secure power supply as the proportion of installed photovoltaic capacity gradually increases. We therefore propose a multi-task neural network model considering low power output risk for short-term photovoltaic forecasting. The main innovations are as follows:
(1) A numerical weather forecast encoder based on multi-scale CNN-Attention and a multi-task learning neural network architecture for low-output day and power sequence prediction are proposed. The model is trained in a multi-task learning manner to strengthen its sensitivity to low-output states and improve the accuracy of low-output process predictions.
(2) A power sequence reconstruction model based on a clear-sky model embedded in a neural network is proposed to enhance the learning effects for the diurnal variation curve of photovoltaic output.

Overall structure
The structure of the proposed LPOMTN is illustrated in Figure 1. The LPOMTN has three parts: the input information encoder module, the low power output day (LPOD) prediction module, and the output process prediction module. The input information encoder module is mainly composed of position encoding, a linear layer, an attention mechanism, and layer normalization. It preprocesses the numerical weather forecast information and time tag information and extracts the features used for LPOD and power sequence prediction. The LPOD prediction module consists of three parallel convolutional neural networks, each of which predicts whether one of the forecast days is an LPOD. The power sequence prediction module consists of a clear-sky model, a feature fusion module, and three parallel LSTM neural network modules, each of which predicts the power sequence for one day.
IOP Publishing doi:10.1088/1742-6596/2771/1/0120233

Encoder
The inputs of the input information encoder are the NWP data and timestamps [12]. First, position encoding is performed through PositionalEncoding. Taking the i-th dimension and t-th position of the input data as an example, the time series encoding is as follows.
$$
PE(t, i) = \begin{cases} \sin\left(t / 10000^{2k / d_{model}}\right), & i = 2k \\ \cos\left(t / 10000^{2k / d_{model}}\right), & i = 2k + 1 \end{cases}
$$
where $d_{model}$ is the dimension. The result is then passed through a linear layer and an activation function to expand the dimension of the input data.
$$
y = \sigma\left(\sum_i w_i x_i + b\right)
$$
where $\sigma(\cdot)$, $w_i$, and $b$ are the activation function, neuron weight, and bias, respectively. The data is then processed by the layer normalization layer, which is calculated as follows.
$$
y = \frac{x - \mathrm{E}(x)}{\sqrt{\mathrm{Var}(x) + \epsilon}} \cdot \gamma + \beta
$$
where $\gamma$ is the weight coefficient and $\beta$ is the bias parameter, both of which are learnable; $\epsilon$ is a fixed parameter with a default value of 1e-5; $\mathrm{E}(x)$ is the mean of the input tensor $x$; and $\mathrm{Var}(x)$ is its variance.
Taking a time series with span n as an example, the calculation rules of the attention weight vector $\alpha_t$ and the weighted vector $x_t$ are as shown in Equations (6)-(8), where $W_\alpha$ is the weight matrix, $\odot$ is the Hadamard product, and $\sigma$ is the Sigmoid activation function.
The features obtained from the above modules are passed as shared features to the subsequent low-output day prediction decoder and low-output process decoder. Through joint learning of the two tasks, the feature extraction capability of the encoder network is adjusted by both tasks at the same time, thereby improving prediction accuracy.
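As an illustration of the two preprocessing steps described in this section, the following is a minimal plain-Python sketch of sinusoidal position encoding and layer normalization. The function names, the base 10000, and the default scale and shift of 1 and 0 are assumptions made for the example; the proposed model itself is built in PyTorch and its exact settings are not given here.

```python
import math

def positional_encoding(t, d_model):
    """Sinusoidal encoding of position t as a vector of length d_model."""
    pe = []
    for i in range(d_model):
        k = i // 2  # even index i = 2k uses sin, odd index i = 2k + 1 uses cos
        angle = t / (10000 ** (2 * k / d_model))
        pe.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return pe

def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one feature vector, then apply a learnable scale and shift."""
    n = len(x)
    gamma = gamma or [1.0] * n  # assumed initial scale
    beta = beta or [0.0] * n    # assumed initial shift
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n  # biased variance over features
    return [(v - mean) / math.sqrt(var + eps) * g + b
            for v, g, b in zip(x, gamma, beta)]

# Example: encode position 3 of a sequence and normalize the result.
encoded = layer_norm(positional_encoding(3, 8))
```

The normalized vector has approximately zero mean and unit variance, which keeps the inputs to the attention mechanism on a comparable scale.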

Decoder for forecasting LPOD
The proposed low-output day prediction decoder is mainly composed of a CNN and the corresponding pooling layer. Convolutional neural networks are designed to process grid-structured data and can flexibly rescale data while retaining globally effective information. Therefore, a CNN is established to downsample the NWP and reduce its time resolution.
$$
x^{l} = \sum_{i \in M} x_i^{l-1} * k^{l} + b^{l}
$$
where $M$, $l$, $k$, $b$, $x^{l-1}$, and $x^{l}$ are the input feature vector, the l-th network layer, the convolution kernel, the network bias, the input from layer l-1, and the output of layer l, respectively.
After the three parallel low-output day predictions (whether each day is a low-output day) are obtained, they are input into the feature fusion module and spliced as follows.
$$
X = [X_1, X_2, X_3]
$$
where $X_1$, $X_2$, and $X_3$ are the LPOD prediction results for the three days, respectively.
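The decoder described above can be sketched in plain Python as three parallel branches, each applying a 1-D convolution, pooling, and a final probability output, followed by splicing. This is only a sketch under stated assumptions: the ReLU activation, the pooling size of 2, the linear output layer with a sigmoid, and all parameter values are hypothetical choices for illustration, not the configuration used in the paper.

```python
import math

def conv1d(x, kernel, bias=0.0, stride=1):
    """Valid 1-D convolution (cross-correlation) of a sequence with one kernel."""
    k = len(kernel)
    return [sum(x[s + j] * kernel[j] for j in range(k)) + bias
            for s in range(0, len(x) - k + 1, stride)]

def max_pool1d(x, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[s:s + size]) for s in range(0, len(x) - size + 1, size)]

def lpod_head(features, kernel, bias, w_out, b_out):
    """One branch: conv -> ReLU -> pool -> linear -> sigmoid probability."""
    h = [max(0.0, v) for v in conv1d(features, kernel, bias)]
    h = max_pool1d(h, 2)  # halves the time resolution
    z = sum(w * v for w, v in zip(w_out, h)) + b_out
    return 1.0 / (1.0 + math.exp(-z))

def predict_lpod(features, branches):
    """Run the parallel branches and splice the results: X = [X1, X2, X3]."""
    return [lpod_head(features, *params) for params in branches]

# Hypothetical shared features and identical parameters for the three branches.
features = [0.1 * i for i in range(8)]
branches = [([1.0, 0.0, -1.0], 0.0, [0.5, 0.5, 0.5], 0.0)] * 3
X = predict_lpod(features, branches)
```

Each entry of `X` is a probability that the corresponding day is a low-output day; the spliced list is what the power sequence decoder would receive as an additional input.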

Decoder for predicting power sequences
The proposed power sequence decoder mainly includes a clear-sky model and a power sequence reconstruction decoder.The former constructs the theoretical photovoltaic output curve under clear-sky mode based on time tags, while the latter combines the shared features obtained by the encoder model, the prediction results of the clear-sky model, and the low-output day prediction results to reconstruct the intraday power sequence curve to achieve prediction.
(1) Clear-sky model
The clear-sky model is derived from physical and empirical equations for clear-sky weather conditions. It does not consider complex weather conditions and characterizes the daily variation of photovoltaic output well. We use the ASHRAE model to decode and predict intraday power sequences. ASHRAE is a general prediction model of surface solar irradiance under clear-sky conditions based on the research of Threlkeld and Jordan. The ASHRAE model assumes that under clear-sky conditions there are no or almost no clouds in the sky and the air quality is very good, so the irradiance calculated by the ASHRAE model is the theoretical clear-sky irradiance.
In the calculation of the total surface radiation $I$, $A$ is the extra-atmospheric radiation flux on the surface; $k$ is the dimensionless optical depth factor; $\theta_z$ is the solar zenith angle; and $n$ is the n-th day of the year counted from January 1. $A$ and $k$ are functions that account for the distance between the sun and the earth and the revolution of the earth. They are obtained by iterative regression on actual irradiance, with $a_1$, $a_2$, $a_3$, $k_1$, $k_2$, and $k_3$ as the regression coefficients. The ASHRAE model mainly relies on the sine function in the parameters $A$ and $k$ to reflect the annual and intraday variation of solar irradiance.
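To make the clear-sky calculation concrete, the following plain-Python sketch evaluates an ASHRAE-style clear-sky irradiance from the solar zenith angle and the day of the year. It rests on several assumptions: the sinusoidal form `c1 + c2 * sin(360/365 * (n - c3))` for A and k, the placeholder coefficient values, and the constant diffuse fraction are illustrative and are not the regression results of the paper.

```python
import math

# Placeholder coefficients (assumed, not from the paper): the paper fits
# a1..a3 and k1..k3 by iterative regression on measured irradiance.
A_COEF = (1160.0, 75.0, 275.0)  # a1 [W/m^2], a2 [W/m^2], a3 [day offset]
K_COEF = (0.174, 0.035, 100.0)  # k1, k2 [dimensionless], k3 [day offset]
C_DIFFUSE = 0.1                 # assumed constant diffuse fraction

def seasonal(coef, n):
    """Evaluate c1 + c2 * sin(360/365 * (n - c3)) for day of year n."""
    c1, c2, c3 = coef
    return c1 + c2 * math.sin(math.radians(360.0 / 365.0 * (n - c3)))

def clear_sky_irradiance(zenith_deg, n):
    """Total clear-sky irradiance on a horizontal surface in W/m^2."""
    cos_z = math.cos(math.radians(zenith_deg))
    if cos_z <= 0.0:  # sun at or below the horizon
        return 0.0
    a = seasonal(A_COEF, n)         # extra-atmospheric radiation flux A
    k = seasonal(K_COEF, n)         # optical depth factor k
    i_b = a * math.exp(-k / cos_z)  # normal direct radiation
    i_d = C_DIFFUSE * i_b           # diffuse radiation (assumed form)
    return i_b * cos_z + i_d

# Example: irradiance at several zenith angles on day 172 of the year.
profile = [clear_sky_irradiance(z, 172) for z in (80, 60, 30, 0)]
```

Evaluating the function over the zenith angles of one day yields the bell-shaped theoretical curve that the power sequence decoder takes as an input.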
(2) Decoder for power sequence reconstruction
Based on the irradiance obtained from the clear-sky model, together with the shared features and the low-output day prediction results obtained from the preceding modules, the features are first spliced by the concatenation module and then fed into three parallel LSTM modules for power sequence prediction. The LSTM works as follows. First, the important historical information is screened by the forget gate, whose degree of opening determines how much historical memory passes through. Then, the update gate controls how strongly new information flows into the memory cell, so that information is weighted reasonably when the cell state is updated. The LSTM is trained repeatedly with the error backpropagation algorithm until the model metric reaches its expected value.

The forget gate is calculated as follows:
$$
f_l = \sigma(W_f \cdot [h_{l-1}, x_l] + b_f)
$$
where $\sigma$ is the sigmoid function; $W_f$ is the weight of the forget gate; $h_{l-1}$ is the output of the unit at the previous moment; $x_l$ is the input at the current moment; and $b_f$ is the bias of the forget gate. The update gate, also called the input gate, processes the information input at the current moment, identifies the information to be updated, and updates the state of the memory cell. The update gate is calculated as follows:
$$
i_l = \sigma(W_i \cdot [h_{l-1}, x_l] + b_i)
$$
The output gate passes the input at the current moment and the hidden state at the previous moment to the sigmoid function. The sigmoid output is then combined with the tanh of the cell state to determine the state of the hidden layer and hence the output at the next moment. The output gate is calculated as follows:
$$
o_l = \sigma(W_o \cdot [h_{l-1}, x_l] + b_o)
$$
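The gate computations described above can be sketched as a single LSTM time step in plain Python. This is a textbook formulation given for illustration: the candidate state `g`, the dictionary layout of `params`, and the helper names are assumptions of the example, and the model in the paper relies on the PyTorch implementation rather than on code like this.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W @ v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps gate name -> (W, b) acting on [h_prev, x]."""
    z = h_prev + x  # list concatenation, i.e. [h_{l-1}, x_l]

    def gate(name, act):
        W, b = params[name]
        return [act(v) for v in affine(W, z, b)]

    f = gate("f", sigmoid)    # forget gate: how much old memory is kept
    i = gate("i", sigmoid)    # update (input) gate: how much new input enters
    g = gate("g", math.tanh)  # candidate cell state (assumed standard form)
    o = gate("o", sigmoid)    # output gate
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c

# Example with hidden size 2, input size 1, and all-zero parameters.
params = {n: ([[0.0] * 3, [0.0] * 3], [0.0, 0.0]) for n in "figo"}
h, c = lstm_step([0.3], [0.0, 0.0], [1.0, -2.0], params)
```

With all-zero parameters every sigmoid gate equals 0.5 and the candidate state is 0, so the cell state is simply halved, which makes the role of the forget gate easy to see.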

Evaluation indicators for prediction results
We use the normalized root mean square error (RMSE) as the measure of power prediction accuracy for each prediction model.
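A minimal sketch of this indicator is given below. Normalizing by the rated capacity of the station and reporting the result in percent are assumptions of the example, since the normalization constant is not stated here.

```python
import math

def nrmse(predicted, actual, capacity):
    """RMSE of a power sequence, normalized by rated capacity, in percent."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    return 100.0 * math.sqrt(mse) / capacity

# Example: two samples from a hypothetical station rated at 10 MW.
error = nrmse([0.0, 0.0], [3.0, 4.0], 10.0)
```

A lower value indicates a more accurate forecast, and a perfect forecast gives 0.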

Initial setup
The experiment uses Python (version 3.9.10) as the development language and builds the proposed model on the PyTorch framework. The learning rate is 0.001 and the batch size is 32.
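Because the two prediction tasks are trained jointly, the training objective has to combine a classification error for the LPOD labels with a regression error for the power sequences. The sketch below shows one common way to do this in plain Python. The binary cross-entropy and mean squared error terms and the weight `lam` are assumptions for illustration; the loss actually used for LPOMTN is not specified here.

```python
import math

LEARNING_RATE = 0.001  # from the setup above
BATCH_SIZE = 32        # from the setup above

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one LPOD probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def mse(pred, target):
    """Mean squared error of one daily power sequence."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(target)

def joint_loss(lpod_prob, lpod_label, power_pred, power_true, lam=0.5):
    """Weighted sum of the classification and regression losses over the days."""
    cls = sum(bce(p, y) for p, y in zip(lpod_prob, lpod_label)) / len(lpod_label)
    reg = sum(mse(p, t) for p, t in zip(power_pred, power_true)) / len(power_true)
    return lam * cls + (1.0 - lam) * reg

# Example: three forecast days with hypothetical one-point power sequences.
good = joint_loss([1.0, 0.0, 1.0], [1, 0, 1], [[0.1], [0.2], [0.3]], [[0.1], [0.2], [0.3]])
bad = joint_loss([0.2, 0.9, 0.4], [1, 0, 1], [[0.5], [0.2], [0.3]], [[0.1], [0.2], [0.3]])
```

Since both terms are backpropagated through the shared encoder, errors on low-output day classification also shape the features used for power sequence prediction.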

Forecast results and analysis
This section presents a case study based on data from a photovoltaic power generation cluster in northern China, with XGBoost, Attention, and LSTM as comparison methods. For the power prediction results, the predictions of each model under four typical output scenarios are selected. In Figure 3, the red curve is the prediction obtained by the proposed LPOMTN. As Figure 3 shows, LPOMTN tracks the trend better in the various weather scenarios and agrees more closely with the actual power fluctuation curve. The accuracy of LPOMTN and the comparison methods is further compared by calculating the daily prediction accuracy for each of the next three days, as shown in Table 2. As Table 2 shows, LPOMTN has the smallest prediction error. On average, the prediction accuracy improves by 2.59% on the first day, 2.75% on the second day, and 2.48% on the third day.
To further analyze the prediction accuracy under low photovoltaic output conditions, the RMSE of the above four methods at different output levels is plotted in Figure 4. The prediction error increases with the output level, but in the low output state (0%-20% rated power) and the medium output state (50%-80% rated power), the accuracy of the proposed LPOMTN is significantly higher than that of the existing methods, which verifies the effectiveness of LPOMTN.
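The analysis by output level can be sketched as follows: samples are grouped by the actual output as a fraction of rated power, and the RMSE is computed per group. The bin edges other than the 0%-20% and 50%-80% ranges named above, and the normalization by rated capacity, are assumptions of this example.

```python
import math

def rmse_by_output_level(predicted, actual, capacity,
                         edges=(0.0, 0.2, 0.5, 0.8, 1.0)):
    """Group samples by actual output level (fraction of rated power) and
    return {(low, high): RMSE in percent of capacity} for non-empty bins."""
    sums = {}
    for p, a in zip(predicted, actual):
        level = min(max(a / capacity, 0.0), 1.0)
        for low, high in zip(edges, edges[1:]):
            # the last bin is closed on the right so that level == 1.0 counts
            if low <= level < high or (high == edges[-1] and level == high):
                sq, cnt = sums.get((low, high), (0.0, 0))
                sums[(low, high)] = (sq + (p - a) ** 2, cnt + 1)
                break
    return {b: 100.0 * math.sqrt(sq / cnt) / capacity
            for b, (sq, cnt) in sums.items()}

# Example with a hypothetical station rated at 100 MW.
levels = rmse_by_output_level([12, 60, 100], [10, 60, 100], 100)
```

Comparing the per-bin values of several models side by side gives the kind of breakdown plotted in Figure 4.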

Conclusion
As the grid-connected capacity of photovoltaic power generation continues to increase, the risk of insufficient power supply caused by its random output gradually increases. To reduce the risk of low-output false alarms in existing forecasting technologies, this paper proposes a multi-task neural network model considering low power output risk for short-term photovoltaic forecasting. The conclusions are as follows:
(1) A numerical weather forecast encoder based on multi-scale CNN-Attention and a multi-task learning neural network architecture for low-output day and power sequence prediction are proposed. This enhances the sensitivity of the model to low-output states, thereby improving the accuracy of low-output process predictions.
(2) A power sequence reconstruction model based on a clear-sky model embedded in a neural network is proposed, which can improve the model's learning ability for the diurnal variation curve of photovoltaic output.
(3) The case study shows that the proposed method has lower prediction errors in the low output state of 0%-20% rated power and the medium output state of 50%-80% rated power, and that the average prediction accuracy for the first day ahead is increased by 2.59%.

The schematic diagram of the relationship between the solar irradiance components is shown in Figure 2. According to the ASHRAE model, the normal direct radiation $I_b$ can be expressed as follows. $I_d$ can be expressed approximately as follows.

Figure 2. Schematic diagram of irradiance source of PV station.

Figure 4. Forecast error under different output levels.
The preprocessed data is further input into the attention mechanism to extract features, taking the time series $[x_1, x_2, \ldots, x_n]$ as input.

Table 1 below lists the main parameter values.
Table 1. Main parameter settings of the model.

Table 2. RMSE errors (%) of various methods in the next 3 days.