Ultra-short-term forecast of distributed photovoltaic power based on satellite cloud image and LSTM model

Photovoltaic (PV) power generation is volatile, intermittent, and random, and large-scale PV integration poses a threat to grid stability; accurate prediction of PV output therefore helps keep the grid safe and stable. Building on the influence of cloud clusters on solar radiation, this paper uses satellite cloud image information for ultra-short-term forecasting of distributed PV power to improve forecast accuracy. A forecasting method is proposed that combines satellite cloud images with a Long Short-Term Memory (LSTM) network model. First, meteorological and PV power data from the forecast area are extracted as training samples, and the abnormal part of the samples is cleaned with the isolated forest algorithm. Second, cloud obscuration features are extracted from satellite cloud images of the same period. Finally, the measured solar irradiance, meteorological information, and obscuration features are fed into the LSTM network, yielding a PV power prediction for the next 4 hours. Measured PV power from the Jinghai Guangfu Power Station in Hefei, Anhui Province over the first five days served as the training sample for predicting PV power on the sixth day. The results show a prediction error of 2.73% when satellite cloud images are added and 16.15% when they are not, a reduction of 13.42 percentage points.


Introduction
As the share of PV installations has increased, the model of large-scale distributed development, low-voltage access, and local consumption [1] has become more prominent. Compared with centralized photovoltaics, distributed photovoltaics can be installed on carriers such as buildings, offering high flexibility and greatly improving PV utilization. However, the installation locations of distributed PV are relatively scattered and the installed capacity is small. Considering cost, there is generally no meteorological measurement device or power prediction model based on a meteorological acquisition system [2]. At the same time, real-time power statistics are generally not collected, and only the total power generation data are uploaded to the grid. In addition, distributed PV has developed over a relatively short time, so historical samples of distributed photovoltaic power stations are few and statistics are not standardized. Little historical data is available for modeling, making it difficult to build a predictive model on the basis of historical data alone [3]. PV power output depends strongly on the magnitude of solar irradiance, which is therefore very important for forecasting distributed photovoltaic power production [4].
Currently, research on solar irradiance prediction for centralized PV systems relies primarily on two approaches. The first is the clear-sky model derived from empirical formulas for surface solar irradiance; the second uses numerical weather prediction (NWP) information and historical data to build a machine learning model that predicts solar irradiance directly. The first class comprises models derived from physical and empirical formulas under the assumption of a clear sky, such as the ASHRAE, Hottel, REST2, and IQBAL-T models [5][6]. The inputs of such models are simple, and the predicted irradiance can be obtained knowing only the longitude and latitude of the station to be measured. Because complex weather conditions are not considered, the forecasts are generally higher than the real values and perform poorly in complex weather, which makes it difficult to meet practical demands. The second class comprises machine learning models built from historical measured data and NWP data, which generally achieve higher prediction accuracy than clear-sky models. Commonly used models include support vector machines and neural networks [7][8], but the inputs of such methods are complex, and the prediction accuracy depends heavily on NWP information. Moreover, as research has deepened, more and more meteorological information and historical data have been fed into such models in pursuit of higher accuracy [9]. Distributed photovoltaic power stations, limited by cost, lack high-precision NWP information and sufficient historical samples, so directly applying a machine learning model rarely gives good results.
Since clouds are one of the main factors affecting solar irradiance, many scholars have begun to model the characteristics of cloud clusters [10] and use them together with conventional meteorological information to predict photovoltaic power generation and solar irradiance. At present, cloud characteristics are obtained mainly from ground-based cloud images [11] and satellite cloud images. A ground-based cloud image is a real-time picture of the clouds above a measurement site, while a satellite cloud image is obtained by observing the Earth from a meteorological satellite. By comparison, the ground-based cloud image has higher spatiotemporal resolution and characterizes the clouds over the test site more accurately, but the all-sky imager is expensive and unsuitable for distributed photovoltaic sites with small investment budgets. The cloud thickness features extracted from satellite cloud images in [4] have been used to predict photovoltaic power generation with good results. However, the shielding effect of clouds is related not only to their thickness but also to their type [11].
Starting from the type and thickness of the cloud cluster, this paper analyzes the satellite cloud image of the area over the test site, extracts the cloud shielding characteristics, supplements the NWP information of distributed photovoltaic power stations, and proposes a new forecasting technique based on satellite cloud images and an LSTM network. A time-series forecast of photovoltaic power over the next 1-4 hours is performed, and the method is applied to the power forecast of a distributed PV power plant in China to verify its forecasting effect.

Data cleaning based on the iForest algorithm
Due to machine failure, poor communication, and abnormal operation in the distribution network, measurement data will contain anomalies. The isolated forest (iForest) algorithm is well suited to anomaly detection and has clear advantages on large data sets. Its basic principle is to isolate abnormal data points by repeatedly partitioning the data set, thereby completing the data cleaning. The central idea is that outliers are easier to isolate than normal points, which stems from the assumption that outliers are more sparsely distributed in the feature space. Based on this assumption, the isolated forest finds outliers by constructing random binary search trees.

The training process for isolated forests
For a given data set, a feature and a feature value are randomly selected as the segmentation point, and the data set will be segmented into a sub-tree on the left and a sub-tree on the right.
Repeat the previous step, recursively building the binary search tree until a termination condition is reached, such as the height of the tree reaching a predetermined maximum height, or the number of samples in the node is less than a certain threshold.
The degree of anomaly of a sample is evaluated by calculating its path length in the tree (the number of edges from the root node to the leaf node containing the sample). The shorter the path length, the more easily the sample is isolated and therefore the more likely it is to be an outlier.
In anomaly detection, the isolated forest averages the path lengths over the randomly constructed trees to obtain each sample's anomaly score. The higher the anomaly score, the more likely it is that the sample is an anomaly.

Anomaly detection
After t trees are obtained, the iForest is constructed, and each sample point x is passed through every tree. The path length h(x) of x in each tree is computed and substituted into formula (1) to obtain the anomaly index; the shorter the path length, the higher the degree of anomaly.

The anomaly index of a sample is defined as

S(x) = 2^(−E(h(x)) / c(n))    (1)

where S(x) ranges over [0, 1] and E(h(x)) is the mean path length of x over all trees. The normalization term c(n), the mean path length over all sample points in a tree built from n samples, is

c(n) = 2H(n − 1) − 2(n − 1)/n    (2)

where H(i) ≈ ln(i) + ξ, with ξ being Euler's constant. Anomalies are determined from formula (1) as follows:
(1) S(x) → 1: the meteorological data in the sample are abnormal;
(2) S(x) → 0: there are no meteorological data anomalies in the sample;
(3) S(x) → 0.5: there is no prominent anomaly of meteorological data in the sample.
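As a concrete illustration, the cleaning step can be sketched with scikit-learn's IsolationForest, which implements the scoring above. The sample data, tree count, and contamination rate below are illustrative assumptions, not values from the paper:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Hypothetical sample: temperature-like readings clustered around 25 degrees,
# plus two obviously corrupted measurements appended at the end.
normal = rng.normal(loc=25.0, scale=2.0, size=(100, 1))
corrupt = np.array([[80.0], [-40.0]])
data = np.vstack([normal, corrupt])

# Grow t = 100 random trees; internally each sample's anomaly index follows
# S(x) = 2^(-E(h(x))/c(n)) as in formula (1).
forest = IsolationForest(n_estimators=100, contamination=0.02, random_state=0)
labels = forest.fit_predict(data)   # -1 = anomaly, 1 = normal
cleaned = data[labels == 1]         # keep only the normal samples
```

The cleaned array would then serve as the training sample for the downstream model.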

Cloud obscuration feature extraction of satellite cloud images
Cloud movement in the atmosphere can be random, rapid, and violent, so point statistics describe the clouds over an experimental area poorly. For this reason, regional statistics on the clouds over the forecast location are used in this paper. Figure 1 shows the modeling process for cloud obscuration feature extraction. First, texture detail is enhanced by pre-processing the original satellite cloud image. Next, the angular second moment (ASM), entropy, contrast, and correlation of the cloud image are extracted from the grey co-occurrence matrix. Combined with the power prediction results, entropy and correlation show the most significant influence on solar irradiance and are therefore selected as the cloud obscuration features. Homomorphic filtering is a method that compresses an image's brightness range while enhancing its contrast in the frequency domain, producing more even illumination with higher contrast; removing the multiplicative noise enhances the detail of dark areas without losing the detail of bright areas [12].
An image f(x,y) with pixel coordinates (x,y) is usually modeled as the product of two components, an illumination function i(x,y) and a reflection function r(x,y):

f(x,y) = i(x,y) r(x,y)    (3)

i(x,y) describes the illumination of the image and is concentrated at low frequencies; r(x,y) describes the texture detail of the image and is concentrated at high frequencies. Because the two components enter the image as a product, they cannot be processed separately; homomorphic filtering therefore takes the logarithm to transform the product into a sum:

ln f(x,y) = ln i(x,y) + ln r(x,y)    (4)

Applying the Fourier transform gives the frequency-domain image Z(u,v) as the sum of the transformed illumination function I(u,v) and reflection function R(u,v):

Z(u,v) = I(u,v) + R(u,v)    (5)

A filter H(u,v) is then applied in the frequency domain:

S(u,v) = H(u,v) Z(u,v)    (6)

Because the reflection component is concentrated at high frequencies, H(u,v) is chosen as a high-pass filter in order to retain more texture detail; its expression is

H(u,v) = (γH − γL)[1 − exp(−c D²(u,v) / D0²)] + γL    (7)

where γH and γL are the high-frequency and low-frequency gain values respectively (choosing γH > 1 and γL < 1 attenuates the low frequencies and enhances the high frequencies); c controls the sharpness of the filter slope; D(u,v) is the distance to the center of the spectrum and D0 is the cutoff frequency.
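The homomorphic filtering chain (logarithm, Fourier transform, high-pass emphasis, inverse transform) can be sketched in NumPy as follows; the gain values, slope constant, and cutoff frequency are illustrative assumptions, not values from the paper:

```python
import numpy as np

def homomorphic_filter(img, gamma_h=1.5, gamma_l=0.5, c=1.0, d0=30.0):
    """Log -> FFT -> high-pass emphasis H(u,v) -> inverse FFT -> exp.
    Parameter values are illustrative, not from the paper."""
    z = np.log1p(img.astype(np.float64))              # product -> sum (log domain)
    Z = np.fft.fftshift(np.fft.fft2(z))               # frequency-domain image Z(u,v)
    rows, cols = img.shape
    u = np.arange(rows) - rows / 2.0
    v = np.arange(cols) - cols / 2.0
    D2 = u[:, None] ** 2 + v[None, :] ** 2            # squared distance to the centre
    # High-pass emphasis filter: attenuate low frequencies, boost high ones.
    H = (gamma_h - gamma_l) * (1.0 - np.exp(-c * D2 / d0 ** 2)) + gamma_l
    S = H * Z                                         # apply the filter
    out = np.expm1(np.real(np.fft.ifft2(np.fft.ifftshift(S))))
    return np.clip(out, 0.0, None)                    # back to non-negative intensities
```

Applied to a regional cloud image, this suppresses the slowly varying illumination while sharpening the cloud texture.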
Then the regional cloud image is standardized. Image standardization centers the data by subtracting the mean; according to convex optimization theory and knowledge of the data's probability distribution, standardizing the data in this way helps the model generalize.

Gray co-occurrence matrix
A grey co-occurrence matrix is a tool for statistically analyzing the spatial grey-level relationships between pixels [13]. Select a point (x,y) in the image and a second point (x+a, y+b) offset from it, and let the pair of grey values at the two points be (g1, g2). Moving (x,y) across the whole image yields different (g1, g2) pairs. If the number of grey levels is L, there are L² possible combinations of (g1, g2). Counting, over the whole image, the number of times each (g1, g2) pair occurs and arranging the counts in a square matrix gives the grey-level co-occurrence matrix; see Figure 2 for details.
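The counting procedure above can be sketched directly in NumPy. The quantization scheme and the default horizontal offset (a=1, b=0) below are illustrative choices:

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Count co-occurring grey-level pairs (g1, g2) for the offset (dx, dy)."""
    # Quantize the image to `levels` grey levels (illustrative scheme).
    q = (img.astype(np.float64) / (img.max() + 1.0) * levels).astype(int)
    m = np.zeros((levels, levels), dtype=np.int64)
    h, w = q.shape
    for y in range(h - dy):
        for x in range(w - dx):
            m[q[y, x], q[y + dy, x + dx]] += 1   # tally the (g1, g2) pair
    return m
```

Each entry m[g1, g2] is the number of pixel pairs at the chosen offset whose grey levels are g1 and g2.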

Cloud shielding characteristics
There are distinct differences between cloud types. Cirrus clouds are silky and dispersed; stratus clouds resemble fog, with a coarse but evenly distributed texture; cumulus clouds are thick with a uniform texture; nimbus clouds are structurally complex, dense, and thick. Because the masking effect of a cloud cluster depends on its thickness and type, cloud clusters can be classified well on the basis of their thickness and texture, and these characteristics are therefore chosen to describe the masking effect. In image space, the thickness and texture of a cloud are reflected in the values and arrangement of the pixels. The pixel value shows the brightness of the image at that point, which reflects the intensity of the cloud albedo; the average pixel value of the cloud image is therefore used to describe the thickness of the cloud cluster above the measured station, i.e. the cloud thickness feature. In addition, cloud texture features are obtained with the help of the grey co-occurrence matrix. In view of the independence of the features, the angular second moment (ASM), entropy, contrast, and correlation are commonly applied in practice to describe image textures.
ASM describes how evenly grey tones are distributed across an image and how coarse the texture is. If the values of the components of the grey-level co-occurrence matrix vary widely, the value is large; otherwise it is small. The angular second moment FASM is calculated as

FASM = Σᵢ Σⱼ p(i,j,s,θ)²

Entropy is a measure of the degree of dispersion of the grey values in an image. The more complex the structure of the image, the higher the value; otherwise the value is lower. The entropy Fent is calculated as

Fent = − Σᵢ Σⱼ p(i,j,s,θ) ln p(i,j,s,θ)

Contrast indicates the depth of the grooves in the image texture. The deeper the texture grooves, the higher the value, and conversely, the lower the value. The contrast Fcon is calculated as

Fcon = Σᵢ Σⱼ (i − j)² p(i,j,s,θ)

Correlation is a measure of the regularity of the image texture in a given orientation. If the image has structure in a certain orientation, the matrix correlation in that orientation will be greater; if not, it will be smaller. The correlation Fcorr is calculated as

Fcorr = [ Σᵢ Σⱼ i·j·p(i,j,s,θ) − μx μy ] / (σx σy)

where the sums run over i, j = 1, …, L; L is the number of grey levels of the quantized image; p(i,j,s,θ) is the probability that a pair of points with grey levels i and j appears simultaneously under scan step s and scan orientation θ; μx and σx are the mean and standard deviation of the horizontal pixel probabilities, and μy and σy are the mean and standard deviation of the vertical pixel probabilities:

μx = Σᵢ i Σⱼ p(i,j,s,θ),  μy = Σⱼ j Σᵢ p(i,j,s,θ)
σx² = Σᵢ (i − μx)² Σⱼ p(i,j,s,θ),  σy² = Σⱼ (j − μy)² Σᵢ p(i,j,s,θ)

The entropy and correlation obtained from the historical satellite cloud images of the station are taken as the cloud masking features.
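The four texture statistics can be computed from a normalized co-occurrence matrix as in the following sketch (the helper name and the small regularizer in the correlation denominator are illustrative choices):

```python
import numpy as np

def texture_features(m):
    """ASM, entropy, contrast and correlation of a grey co-occurrence matrix m."""
    p = m / m.sum()                                   # normalize counts to p(i, j)
    i, j = np.indices(p.shape)
    f_asm = np.sum(p ** 2)                            # angular second moment
    f_ent = -np.sum(p[p > 0] * np.log(p[p > 0]))      # entropy (0 ln 0 treated as 0)
    f_con = np.sum((i - j) ** 2 * p)                  # contrast
    mu_x, mu_y = np.sum(i * p), np.sum(j * p)         # marginal means
    sig_x = np.sqrt(np.sum((i - mu_x) ** 2 * p))      # marginal std deviations
    sig_y = np.sqrt(np.sum((j - mu_y) ** 2 * p))
    # Correlation; small epsilon avoids division by zero for flat matrices.
    f_corr = np.sum((i - mu_x) * (j - mu_y) * p) / (sig_x * sig_y + 1e-12)
    return f_asm, f_ent, f_con, f_corr
```

For a perfectly uniform matrix every grey-level pair is equally likely, so the entropy reaches its maximum and the correlation vanishes.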

Ultra-short-term forecast model of distributed PV power based on satellite cloud image-LSTM model
Combining the LSTM network, this paper verifies the validity of the model from the perspective of data selection, and the ultra-short-term photovoltaic output is predicted as shown in Figure 3. The specific forecasting steps are as follows:
(1) Randomly select the weather data and PV power generation data of a certain city, take the forecast date as reference, and divide the data into training and test samples.
(2) From a data-mining perspective, treat the anomalous data in the sample: apply the isolated forest algorithm to clean them, calculating the anomaly index to extract the anomalous points from the data.
(3) Establish the long short-term memory network and define its overall framework.
(4) Finally, compare the results with predictions made without satellite cloud images to check the accuracy of the proposed prediction method.
After the above four steps are performed, the overall model of ultra-short-term PV power forecasting is established.
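Step (3) can be sketched with PyTorch; the feature count, hidden size, and window length below are illustrative assumptions rather than the paper's configuration:

```python
import torch
import torch.nn as nn

class PVForecastLSTM(nn.Module):
    """Maps a window of meteorological + cloud-feature inputs to a PV power value.
    n_features and hidden_size are illustrative, not the paper's settings."""
    def __init__(self, n_features=5, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                  # x: (batch, time steps, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])    # predict power from the last hidden state

model = PVForecastLSTM()
x = torch.randn(8, 21, 5)   # e.g. 8 windows of 21 daily sampling points, 5 inputs
y = model(x)                # one power value per window
```

In practice the five input channels would be the measured irradiance, temperature, humidity, and the entropy and correlation features from the cloud images.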

Extraction of satellite cloud image data
The satellite cloud images used in this paper are the infrared cloud images provided by the FY-4A satellite of the National Satellite Meteorological Centre. The measured power plant is the Jinghai station, located in Hefei City, Anhui Province. Weather samples are randomly selected from Hefei City: the period from August 1 to 5, 2018 is taken as the training sample, and an ultra-short-term forecast is made for the next four hours of August 6. Eight sets of real-time observations are selected, including temperature, humidity, and irradiance, together with the entropy and correlation features extracted from the satellite cloud images. In view of the discontinuous nature of satellite cloud image data, the period from 6 a.m. to 6 p.m. is used, i.e. 21 sampling points per day. The training days and forecast day are shown in Table 2.

Data Preprocessing
The training days are 5 days in total, yielding 105 training samples; the selected forecast day is 1 day, yielding 21 test samples. As described above, the outliers in the similar-day data are removed with the isolated forest algorithm, finally leaving 100 training samples. Then four hours of measured data on the forecast day are added, for a total of 107 data sets.

Prediction results and analysis
Before training the LSTM network, the meteorological data of the similar days and the forecast day must be normalized so that each parameter lies uniformly between 0 and 1, which improves the prediction accuracy of the LSTM network model. The following formula is used:

zᵢ = (yᵢ − ymin) / (ymax − ymin)

where yᵢ is the initial data, ymax and ymin are the maximum and minimum data values, and zᵢ is the result. Two evaluation indices, the mean absolute percentage error (MAPE) and the root mean square error (RMSE), are used to analyze the model. They are calculated as follows:

MAPE = (1/K) Σᵢ |(Pti − Pprei) / Pti| × 100%

RMSE = √[ (1/K) Σᵢ (Pti − Pprei)² ]

where Pti is the real power, Pprei is the forecast power, and K is the number of data points.
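The normalization and the two evaluation indices can be written compactly in NumPy (the function names are illustrative):

```python
import numpy as np

def min_max_scale(y):
    """z_i = (y_i - y_min) / (y_max - y_min), mapping values into [0, 1]."""
    y = np.asarray(y, dtype=np.float64)
    return (y - y.min()) / (y.max() - y.min())

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=np.float64)
    predicted = np.asarray(predicted, dtype=np.float64)
    return np.mean(np.abs((actual - predicted) / actual)) * 100.0

def rmse(actual, predicted):
    """Root mean square error, in the units of the power data."""
    actual = np.asarray(actual, dtype=np.float64)
    predicted = np.asarray(predicted, dtype=np.float64)
    return np.sqrt(np.mean((actual - predicted) ** 2))
```

Note that MAPE is undefined wherever the actual power is zero, so in practice night-time points are excluded before evaluation.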

Comparative analysis of numerical examples
From the above analysis, the training days and the forecast day are randomly selected, and new data samples are obtained through the abnormal-data cleaning operation. Next, the LSTM network prediction model is built, and the meteorological data of the similar days and the forecast day are taken as model input. To confirm the gain in accuracy from adding cloud image data, an LSTM network prediction model without cloud images is used as the comparison experiment (Model B) against the prediction model proposed in this paper (Model C). The prediction results of the two models on 6 August 2018 are shown in Figure 5.

Figure 5. Power prediction results
As shown in the figure, the predicted and actual values are relatively close, and the predictive performance is improved under this weather. In addition, the proposed forecasting model is more accurate because changes in photovoltaic power generation follow certain patterns. However, there are a few points where the forecast is not sufficiently accurate, possibly due to the instability of the weather and insufficient sampling. As shown in Table 3 (Model C is the ultra-short-term forecast model for distributed PV power based on satellite cloud images and the LSTM, and Model B is the comparison model without cloud images), the mean MAPE of Model C and Model B on the sunny day is 2.73% and 16.15% respectively, and the mean RMSE of Model C is 0.57 MW against 1.93 MW for Model B. Compared with Model B, Model C is more accurate. In addition, the MAPE of Model B at 7:00 is large, exceeding 10%, which may be due to large changes caused by additional water vapor and carbon-oxygen compounds in the morning air. In summary, the prediction results of the model presented in this paper are more accurate than those of the model without satellite cloud images.

Figure 1. Feature extraction of cloud obscuration. Because the resolution of the satellite cloud image is limited, it is not sufficient on its own to depict the cloud texture over the test site; the local cloud image over the site is therefore first transformed into the frequency domain with a Fourier transform, and a homomorphic filter is then applied to enhance its features.

Figure 3. Ultra-short-term forecast model of distributed PV power based on the satellite cloud image-LSTM model. Specific forecasting steps: (1) randomly select the weather data and PV power generation data of a certain city and divide them into training and test samples; (2) treat the anomalous data in the sample by applying the isolated forest algorithm and calculating the anomaly index to extract the anomalous points; (3) establish the long short-term memory network and define its overall framework; (4) compare the results with predictions made without satellite cloud images to check the accuracy of the proposed method.

Figure 4. Pre-processing result of the regional cloud image. The regional cloud image of Hefei City is cropped from the full cloud image for image processing; the specific steps are as shown in Figure 4. The extracted cloud shielding features are listed in Table 1.

Table 1 .
Cloud shielding characteristics and their values

Table 2 .
Training days and forecast days

Table 3 .
Prediction error estimates of the two prediction models