Regional photovoltaic grid power generation output prediction based on SVM and autoencoder

Photovoltaic power generation is an important component of clean energy and is widely used around the world. However, predicting the output of photovoltaic power generation systems remains a challenge because the output depends on meteorological conditions and other external factors. Accurate output prediction is therefore crucial for the stability and reliability of the power system. This work uses two machine learning techniques, the support vector machine (SVM) and the autoencoder, to improve the prediction performance of regional photovoltaic grid power generation output. First, this article introduces the autoencoder and the SVM, and then uses the SVM algorithm to analyze and model the collected data. With SVM, we build a photovoltaic power generation output prediction model that can predict on different time scales, from the hourly level to the daily level. Next, the principle and working mechanism of the SVM are introduced in detail, and its advantages in processing complex data and high-dimensional features are explained. The SVM model is then optimized with the grid search method. A new cluster performance index based on electrical distance and regional voltage regulation capability is proposed, which divides the distribution network into multiple clusters. The voltage control strategy combines cluster autonomous optimization over time scales with distributed inter-cluster coordination optimization. We then select evaluation indicators to evaluate the obtained model. The results show that the model can predict power generation output and provide a reference for power system operation. In terms of model building, this article explains how to select appropriate features and parameters and how to use the collected data to train the SVM model.
We also perform a detailed evaluation and analysis of the model's performance, using a variety of evaluation metrics to measure accuracy and stability. Experimental results show that the hybrid SVM and autoencoder model has higher prediction accuracy and better generalization performance than traditional methods.


Introduction
Photovoltaic power generation converts the light energy of the sun into electrical energy, usually through photovoltaic cells (also known as solar cells). A photovoltaic cell is an electrical device that directly converts the photon energy in sunlight into electric current. This conversion relies on the photoelectric effect: photons strike the semiconductor material on the surface of the cell, exciting electrons and generating an electric current within the cell. Photovoltaic power generation matters to society in several ways.
Photovoltaic power is a renewable energy source because solar energy is effectively unlimited, unlike finite fossil fuels. This means that in the long term, photovoltaic power generation systems can provide a sustainable energy supply and reduce dependence on limited resources. Photovoltaic power generation systems do not emit greenhouse gases or other pollutants during operation, so they have less negative impact on the environment. Compared with traditional coal-fired or oil-fired power generation, photovoltaic power generation helps reduce air pollution and mitigate climate change. Photovoltaics can be built in a variety of locations, including rooftops, solar farms, and industrial sites. This distributed power generation model can reduce power transmission losses and improve the stability of the power system. Photovoltaics allow many regions to reduce their dependence on external energy supplies, thereby increasing energy security. This is particularly important for energy supplies in remote areas, islands, or unstable areas. Although the initial investment cost is high, the cost of photovoltaic power generation has dropped significantly with the advancement of technology and the realization of economies of scale. This makes solar energy more affordable and attracts greater investment and adoption. The growth of the photovoltaic industry has created a large number of jobs involving a variety of skills and positions, from the manufacturing of photovoltaic cells to the installation and maintenance of systems, which supports economic growth. The distributed nature of photovoltaic power generation can also provide additional power during periods of peak demand, relieving load pressure.
Photovoltaic power generation is an environmentally friendly, renewable, and distributed energy solution that is of great significance for reducing greenhouse gas emissions, improving energy independence, creating jobs, and improving the reliability of the power system. Therefore, photovoltaic power generation plays a crucial role in addressing climate change and achieving circular development goals and is gradually becoming a major component of future electricity supply [1][2][3][4].
SVM is a powerful learning algorithm and a useful tool in the field of power grid output prediction [5]. Its performance in processing complex data and high-dimensional features has attracted much attention. SVM works by mapping data into a high-dimensional space and finding an optimal hyperplane in that space that effectively separates different categories of data [6][7]. This makes SVM well suited to nonlinear problems and complex classification tasks.
In the field of power grid output prediction, SVM is widely used to build prediction models [8][9]. First, a data set for training and testing is established by collecting a large amount of data related to power grid operations, including meteorological data and power demand. SVM can use these data to learn the features of different patterns and make accurate predictions on unseen data. In power systems, the factors that affect grid output are diverse and complex, so the data are often high-dimensional, which plays to the strength of SVM [10]. It can process large-scale data sets and find the best separating hyperplane in high-dimensional space to achieve more accurate predictions.
In addition, SVM can also be applied to actual power grid output prediction systems. By collecting data related to grid operation in real time, the SVM model can be continuously updated and optimized to maintain sensitivity to different factors. This real-time and adaptive nature makes SVM a powerful tool in power grid output prediction. Therefore, the application of SVM in power grid output prediction can not only improve the prediction accuracy but also optimize the operation of the power system, thereby achieving effective utilization of electric energy resources.
Therefore, this study aims to propose an "SVM-based grid output prediction" method that fully utilizes the advantages of support vector machines (SVM) to provide reliable and intelligent prediction solutions for grid output. The study emphasizes the importance of accurate grid output prediction, details the principles and working mechanisms of SVM, and proposes a new cluster performance metric together with a cluster autonomous optimization (CAO) control strategy. By optimizing the active power curtailment (APC) and reactive power compensation (RPC) of photovoltaic units under voltage constraints, CAO control aims to minimize photovoltaic APC and network active power loss. The CAO control scheme achieves independent autonomy for each cluster by alternately updating the optimal solution within the cluster and the virtual slack bus voltage, without the need for inter-cluster communication. It not only decreases the voltage control complexity and communication pressure but also improves the speed of voltage regulation, while avoiding excessive APC and RPC in photovoltaic units. This research offers the power industry a new approach to growing power supply challenges.
In this study, we first annotated the data set, including information on parameter thresholds and maintenance records. Next, we apply support vector machine classification techniques and analyze with the help of other relevant data to identify whether there is a fault in the grid output. Ultimately, we will extend this approach to predict and diagnose specific types of failures before they occur, thereby improving the availability and safety of power equipment. Through this approach, we aim to provide the power industry with an innovative and feasible solution to better manage and optimize the power supply and ensure the stability of the power system.

Related work
Power output prediction here refers to planning in advance based on forecast information and then making the necessary adjustments during actual operation according to power balance requirements. The literature on power output prediction is summarized as follows.
One line of work studies an economic model for power output forecasting that integrates day-ahead planning dispatch and real-time dispatch. Especially when a high proportion of renewable energy is involved, this model can effectively deal with power fluctuations caused by prediction errors. A static power output prediction strategy based on mixed integer linear programming (MILP) has been proposed for microgrids and shown to perform well in operating efficiency. A multi-objective power output prediction model based on the NSGA-II algorithm has been proposed to jointly optimize the economic and environmental performance of power output. Since day-ahead forecasts of loads and of wind and solar power output may deviate from actual values, some researchers have proposed power output forecasting methods based on stochastic programming theory, using random scenarios or chance constraints to deal with these uncertainties. In addition, a power output prediction model based on stochastic programming theory has been proposed for household microgrids, using a multi-time-scale optimization method to reduce the impact of prediction uncertainty. Similarly, taking time-of-use electricity prices into account, chance-constrained programming has been used to deal with the uncontrollability of distributed generation output and load, with a hierarchical genetic algorithm used to solve the resulting model; a simulation of an actual microgrid demonstrates the validity of the approach. Chance-constrained optimal dispatch of power output has also been studied in detail for power systems containing wind turbines. In addition, to minimize operational risks, robust optimization methods are widely used in power output prediction. A robust optimization model of a Combined Cooling, Heating and Power (CCHP) microgrid with the lowest operating cost has been established, and the negative impact of uncontrollable factors on microgrid performance optimization has been studied. A robust optimization method has also been combined with the Lyapunov optimization method in grid-connected microgrids to reduce the complexity of operation optimization, and the superiority of this method has been verified through simulation.

Support vector machine
To improve learning ability, the SVM was proposed as a machine-learning method that minimizes structural risk. It is a linear classifier that divides sample data into two classes by constructing a separating hyperplane. For sample data that are not linearly separable, SVM maps the original data to a higher-dimensional space in which they become linearly separable [12]. Suppose the sample data are represented as follows:

$$(x_i, y_i), \quad i = 1, 2, \dots, R, \quad x_i \in \mathbb{R}^n, \quad y_i \in \{-1, 1\}$$

where $x_i$ and $y_i$ are the sample space vector and the class label, respectively; $R$ is the number of training samples; and $n$ is the dimension of the sample space. The general form of the discriminant function is $f(x) = w \cdot x + b$, and the corresponding classification equation is $w \cdot x + b = 0$, as shown in Figure 1, where $x_1$ and $x_2$ are the values of the first and second dimensions of $x$, $w$ is the normal vector of the optimal hyperplane, and $b$ is the offset.
The function is normalized so that all samples in the training set satisfy $|f(x)| \geq 1$ and the samples nearest to the classification plane satisfy $|f(x)| = 1$. The classification plane then classifies all samples correctly, namely:

$$y_i (w \cdot x_i + b) - 1 \geq 0, \quad i = 1, 2, \dots, R \tag{2}$$

The data samples lying on the margin hyperplanes in Figure 1 satisfy

$$y_i (w \cdot x_i + b) - 1 = 0$$

These data samples are called support vectors. We transform the problem of solving the optimal classification plane into a constrained optimization problem: under the constraint of Equation (2), the minimum value of the following function is found.

$$\phi(w) = \frac{1}{2} \|w\|^2$$

The Lagrange multiplier method is used to solve this problem. The Lagrangian is

$$L(w, b, a) = \frac{1}{2} \|w\|^2 - \sum_{i=1}^{R} a_i \left[ y_i (w \cdot x_i + b) - 1 \right]$$

where $a_i \geq 0$ is a Lagrange multiplier. Setting the partial derivatives with respect to $w$ and $b$ to zero gives the following results.

$$w = \sum_{i=1}^{R} a_i y_i x_i, \qquad \sum_{i=1}^{R} a_i y_i = 0$$

Finally, the optimal classification surface function can be obtained as follows:

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{R} a_i^{*} y_i (x_i \cdot x) + b^{*} \right)$$

where $a_i^{*}$ and $b^{*}$ denote the optimal multipliers and offset.
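As a rough illustration of the linear decision function, the margin constraint, and the notion of support vectors described in this section, the following Python sketch evaluates a hand-picked hyperplane on toy data. The data, the values of `w` and `b`, and all function names are assumptions chosen for illustration; nothing here is fitted or taken from the paper's data set.

```python
import math

def decision_value(w, b, x):
    """Return f(x) = w . x + b for a linear classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return the predicted class label, +1 or -1."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def support_vectors(w, b, samples, tol=1e-9):
    """Return the samples lying on the margin, i.e. y * f(x) == 1."""
    return [(x, y) for x, y in samples
            if abs(y * decision_value(w, b, x) - 1.0) <= tol]

def margin_width(w):
    """Width of the margin between the two classes, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical toy data: two features per sample, labels in {-1, +1}.
samples = [((2.0, 2.0), 1), ((3.0, 3.0), 1),
           ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

# Hand-picked (assumed) hyperplane, scaled so that the nearest samples
# of each class give |f(x)| = 1.
w, b = (0.5, 0.5), -1.0

constraints_ok = all(y * decision_value(w, b, x) >= 1.0 for x, y in samples)
on_margin = support_vectors(w, b, samples)
width = margin_width(w)
```

Minimizing the norm of `w` under the constraint is equivalent to maximizing `margin_width(w)`, which is why the optimization problem in this section targets the squared norm.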

K-fold cross-validation
Parameter selection is a vital factor affecting how well the model performs in application. To avoid overfitting in the actual analysis and to increase the accuracy of the model's predictions on test data, K-fold cross-validation is used to assess the generalization ability of the model. The K-fold cross-validation method shown in Figure 2 can be summed up in the following steps:
1. We divide the dataset into K equal parts. If the number of cases is not divisible by K, we distribute them as evenly as possible.
2. We select one of the K parts as the validation set and use the remaining K-1 parts as the training set.
3. We use the training set to train the model and record the training results.
4. We use the validation set to analyze and verify the model's performance.
5. We repeat Steps 2-4 K times, using each part once as the validation set.
6. We calculate the average of the evaluation metrics from the K iterations to obtain the overall cross-validated performance of the model.
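The K-fold procedure described above can be sketched in plain Python as follows. This is a minimal illustration, not the paper's implementation: the folds are contiguous and unshuffled, and the toy "model" (the mean of the training targets) and the score (negative mean squared error) are stand-ins chosen only to keep the example self-contained. All function names are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, k, train_fn, score_fn):
    """Run K-fold cross-validation and return the mean validation score."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for i, val_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = train_fn([data[j] for j in train_idx])
        scores.append(score_fn(model, [data[j] for j in val_idx]))
    return sum(scores) / k

# Toy stand-ins (assumptions): the "model" is the mean training target,
# and the score is the negative mean squared error on the validation fold.
def train_mean(rows):
    return sum(y for _, y in rows) / len(rows)

def neg_mse(model, rows):
    return -sum((y - model) ** 2 for _, y in rows) / len(rows)

data = [(i, float(i % 3)) for i in range(10)]
folds = k_fold_indices(len(data), 3)
mean_score = cross_validate(data, 5, train_mean, neg_mse)
```

In a parameter search, `cross_validate` would be called once per candidate parameter setting and the setting with the best mean score kept.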

Metrics and partitioning
The cluster metric consists of two parts: a modularity metric based on electrical distance and a voltage regulation capability metric for each cluster. In this paper, the tabu search algorithm is applied to search for the optimal partition. The network partitioning process based on tabu search is shown in Figure 3. As a basis for CAO control, this paper realizes network separation by using a decomposition-coordination method. The nodes of the upstream clusters act as slack buses for the downstream clusters, and the inter-cluster power flows are treated as virtual load power at the upstream nodes. Considering the network separation and the LDF approximation, the CAO problem of each cluster can be rewritten as below.
Equation (7) states this problem as a minimization over the decision variables of the cluster, with the following definitions. The objective is the objective function of the cluster. The set of all inter-cluster branches is used to identify the branches between the cluster and its downstream clusters. The active and reactive power flows along such an inter-cluster branch are treated as load power at the boundary node j. The voltage amplitude of node a also enters the formulation. A node that lies outside the cluster, in the upstream cluster, is treated as the slack bus of the cluster, as distinct from its real bus. An approximate equality is assumed in Equation (7).
The CAO control process: CAO control suppresses excessive APC and RPC of the PV units by alternately updating the virtual slack bus voltages and the optimal solution within the cluster. CAO control is activated only when a voltage violation occurs in the cluster, so that repeated actions do not cause voltage fluctuations.
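To make the tabu-search partitioning mentioned in this section more concrete, the following Python sketch searches over node-to-cluster assignments. It is an illustration under strong assumptions: the score used here (negative sum of within-cluster pairwise distances on a toy six-node feeder) is only a stand-in for the paper's cluster performance index, which is not reproduced, and the function names, tabu tenure, and iteration count are hypothetical.

```python
def within_cluster_distance(assignment, distance):
    """Sum of pairwise distances between nodes that share a cluster."""
    total = 0.0
    n = len(assignment)
    for i in range(n):
        for j in range(i + 1, n):
            if assignment[i] == assignment[j]:
                total += distance[i][j]
    return total

def tabu_search(n_nodes, n_clusters, score, iters=50, tenure=3):
    """Maximise score(assignment) by moving one node between clusters per step."""
    current = [i % n_clusters for i in range(n_nodes)]
    best, best_score = current[:], score(current)
    tabu = {}  # (node, cluster) -> last iteration at which the move is forbidden
    for it in range(iters):
        candidates = []
        for node in range(n_nodes):
            for c in range(n_clusters):
                if c == current[node]:
                    continue
                trial = current[:]
                trial[node] = c
                s = score(trial)
                is_tabu = tabu.get((node, c), -1) >= it
                # Aspiration: a tabu move is allowed if it beats the best so far.
                if not is_tabu or s > best_score:
                    candidates.append((s, node, c))
        if not candidates:
            break
        s, node, c = max(candidates)  # best admissible move, even if worsening
        # Forbid moving the node back to its old cluster for `tenure` iterations.
        tabu[(node, current[node])] = it + tenure
        current[node] = c
        if s > best_score:
            best, best_score = current[:], s
    return best, best_score

# Toy feeder (assumption): six nodes on a line, "electrical distance" |i - j|.
distance = [[abs(i - j) for j in range(6)] for i in range(6)]

def score(assignment):
    return -within_cluster_distance(assignment, distance)

best, best_score = tabu_search(6, 2, score)
```

The tabu list is what distinguishes this from plain hill climbing: the search accepts the best admissible move even when it worsens the score, and the short-term memory keeps it from immediately undoing that move.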

Self-coding neural networks
The autoencoder (AE) is an unsupervised neural network structure that maps inputs to outputs and usually consists of fully connected input, hidden, and output layers. Its complete network structure is shown in Figure 4, where L1 represents the input layer, L2 represents the hidden layer, and L3 represents the output layer. The forward pass from the input layer to the hidden layer is called encoding; the forward pass from the hidden layer to the output layer is called decoding; the hidden layer is called the feature extraction layer. The encoding process takes the input data $x = (x_1, x_2, x_3, \dots, x_n)$ through Equation (12) to obtain a new feature expression $z = (z_1, z_2, z_3, \dots, z_r)$, and the function expression is shown in the following equation.

$$z = f(x) = s(Wx + b) \tag{12}$$

where $x \in \mathbb{R}^{n \times 1}$ is the input data; $n$ denotes the dimension of the data; $z \in \mathbb{R}^{r \times 1}$ denotes the feature expression of the hidden layer; $r$ denotes the number of neurons in the hidden layer; $W \in \mathbb{R}^{r \times n}$ denotes the input weights of the hidden layer; $b \in \mathbb{R}^{r \times 1}$ denotes the input bias of the hidden layer; and $s$ denotes the ReLU activation function.
The decoding process maps the hidden-layer expression $z$ back to a reconstruction $\hat{x}$ of the data $x$ through Equation (13), and the function expression is shown in the following equation.

$$\hat{x} = g(z) = s(W'z + b') \tag{13}$$

where $W' \in \mathbb{R}^{n \times r}$ and $b' \in \mathbb{R}^{n \times 1}$ denote the weights and bias of the output layer.
The mean square error (MSE) function is a key metric for evaluating the performance of the network. In autoencoders it measures the squared difference between the reconstructed output and the original input and is used to judge the quality of the data reconstruction. Its mathematical expression is as follows.

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2$$
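The encoding, decoding, and reconstruction-error steps described in this section can be traced with a small forward pass in plain Python. This is a sketch under stated assumptions: the weights are hand-picked rather than trained, the decoder is assumed to use the same ReLU activation as the encoder, and the names `W_dec` and `b_dec` for the decoder parameters are hypothetical.

```python
def relu(v):
    """Element-wise ReLU activation."""
    return [max(0.0, a) for a in v]

def affine(W, b, v):
    """Compute W v + b for a matrix W given as a list of rows."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi
            for row, bi in zip(W, b)]

def encode(W, b, x):
    """Map the n-dimensional input x to the r-dimensional hidden code z."""
    return relu(affine(W, b, x))

def decode(W_dec, b_dec, z):
    """Map the hidden code z back to an n-dimensional reconstruction."""
    return relu(affine(W_dec, b_dec, z))

def mse(x, x_hat):
    """Mean squared reconstruction error between x and x_hat."""
    return sum((a - c) ** 2 for a, c in zip(x, x_hat)) / len(x)

# Hypothetical, untrained weights with n = 3 inputs and r = 2 hidden units.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
b = [0.0, 0.0]
W_dec = [[1.0, 0.0],
         [0.0, 1.0],
         [0.0, 0.0]]
b_dec = [0.0, 0.0, 0.0]

x = [0.5, 2.0, 1.0]
z = encode(W, b, x)              # keeps the first two features
x_hat = decode(W_dec, b_dec, z)  # third feature cannot be recovered
error = mse(x, x_hat)
```

Because the hidden layer here has fewer units than the input, the third feature is lost and shows up as reconstruction error; training would adjust the weights to reduce this error over the data set.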

Evaluation criteria
In current research, prediction accuracy is commonly used to evaluate the performance of a model. It is a quantitative indicator that directly reflects the quality of the model through a specific value. However, analysis of the obtained data on wind turbines shows that the fault data are far fewer than the non-fault data. Therefore, using accuracy as the only index of model performance may bias the evaluation results.
The confusion matrix describes the relationship between the actual categories of the sample data and the recognition results of the classifier and can be used to calculate performance. It is a simple square matrix divided into four cells: true positive, false positive, false negative, and true negative, each holding the number of predictions of that type. It directly shows how often the classifier recognizes patterns correctly and how often it makes errors. Therefore, this chapter uses the confusion matrix as the basis for calculating the effect of the model. For classification problems with an unbalanced number of categories, precision, recall, and the F1 score are used as performance indicators alongside accuracy, where F1 is the harmonic mean of precision and recall.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}, \quad P = \frac{TP}{TP + FP}, \quad R = \frac{TP}{TP + FN}, \quad F1 = \frac{2PR}{P + R}$$

where $TP$ is the number of true positives (i.e., correctly predicted fault samples), $FP$ is the number of false positives, $FN$ is the number of false negatives (i.e., fault samples incorrectly labeled as trouble-free), $TN$ is the number of true negatives, $P$ stands for precision, and $R$ stands for recall. $F1$ ranges from 0 to 1, with 1 representing the model's best output and 0 representing the model's worst output.
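The following Python sketch shows how the four confusion-matrix counts and the derived indicators can be computed, and why accuracy alone can mislead on unbalanced data. The labels are invented toy values (1 marks a fault sample), and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for binary labels."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1, guarding against zero division."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy unbalanced labels (assumption): 2 fault samples out of 10.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(*counts)
```

In this toy case accuracy is high because the non-fault class dominates, while precision, recall, and F1 reveal that only half of the fault predictions and half of the actual faults are handled correctly.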

Experimental results
We calculated the values of the four indicators above for the model, and the results are shown in Table 1. As can be seen from Table 1, our model achieves a high score on all four indicators, indicating good performance in terms of accuracy. To verify the reliability of the model, we conducted experiments with training sets of different sizes on three data sets: Cora, Citeseer, and Pubmed. We use 20%, 40%, 60%, and 80% of the data as the training set and observe the accuracy on each data set. The experimental results are shown in Figure 5. When the training-set proportion is 60% or 80%, the difference between the two is not significant, although the 80% training set performs better. Therefore, for some large-scale data sets, especially when the performance requirements are not very strict, reducing the training-set proportion from 80% to 60% can be considered in order to increase the training speed. However, when the training set accounts for only 20% or 40% of the data, we observed a significant decline in model performance. The main reason is that insufficient training data prevents effective feature learning.
Further, we plotted the values predicted by the model against the true values, as shown in Figure 6. As can be seen from Figure 6, the predicted values have a small error relative to the true values and follow their fluctuations, which shows that our model has strong prediction ability and is well suited to power output forecasting.

Summary
This study aims to improve the prediction performance of regional photovoltaic power grid generation output, to cope with the challenge that photovoltaic power systems are influenced by external factors such as meteorological conditions. By combining two machine learning techniques, support vector machines (SVMs) and autoencoders, we achieved good results. First, the basic principles of autoencoders and SVMs are introduced, and the SVM algorithm is applied to analyze and model the samples. SVM has strong generalization ability, especially in processing nonlinear data. Using SVM, we established a photovoltaic power generation output prediction model that can predict on different time scales, from the hourly level to the daily level. We describe in detail the advantages of support vector machines in processing complex data and high-dimensional features. The SVM model is optimized through the grid search method and evaluated with the selected evaluation indicators. In addition, a new cluster metric based on electrical distance and voltage regulation capability is put forward to guide the partitioning of the distribution network. Then, a voltage control scheme incorporating cluster autonomous optimization is proposed to achieve combined active and reactive power optimal control of PV units. The results show that the model can provide a reliable reference for power system operation.

Figure 3. Flowchart of network partition based on tabu search.

Figure 5. Results of training data with different proportions on different data sets.

Figure 6. The relationship between model predictions and true values.