Filling of missing values of voltage data based on ResIOFNN

With the increasing scale of power systems, the amount of data that needs to be collected increases exponentially, and data collection equipment inevitably experiences varying degrees of data loss. Traditional missing-data filling algorithms, such as the Expectation Maximization (EM) algorithm and K-Nearest Neighbors (KNN), have low accuracy when dealing with missing data. Given the limitations of current time series data-filling models, this paper draws on deep learning and proposes an improved voltage missing-value filling algorithm based on the Fourier neural network model. The model combines the past and future information around the missing data to fill the missing data set, which improves the precision of voltage data filling. The calculation example uses data from a real power grid for simulation analysis, and the results show that the proposed data-filling method achieves high filling accuracy.


INTRODUCTION
In the modern power system, power data is critical for guaranteeing the safe and reliable functioning of the power system. However, due to data acquisition equipment failure, data transmission errors, human error, and other causes, some power data may be missing [1], which poses safety hazards to the operation and control of the power grid. At present, there are two main approaches to filling in missing data. The first is the mathematical statistics approach, which usually uses interpolation or mean value filling [2]. This strategy focuses on the statistical characteristics of the missing data without considering the correlation between the data or their timing characteristics. The second is the machine learning approach, such as the K-nearest neighbors algorithm [3], random forest [4], and multi-dimensional correlation analysis [5].
In recent years, deep learning has made significant advances in missing data filling [7]. In [8], a missing data completion method was proposed based on an improved deep convolutional neural network. In [9], a missing data-filling model was proposed based on self-attention generative adversarial networks. On the one hand, applying deep learning to data filling can alleviate the difficulties of mining and extracting features from high-dimensional, complex power grid data; on the other hand, it can compensate for the insufficient training data and poor generalization ability of traditional machine learning methods in practical applications [10]. This paper proposes a method for filling in missing voltage data based on the Residual Oscillatory Fourier Neural Network (ResIOFNN) model.
First, the historical data of the power grid is obtained and preprocessed. Then K-means clustering is used to divide the preprocessed data into scenes. Residual connections are added to the Fourier neural network [6] to improve the network structure and its generalization ability. Finally, data imputation on the missing dataset is completed.

Missing data preprocessing
Before data preprocessing, missing data and abnormal data must first be detected. Missing data are generally handled either by direct deletion or by filling. Direct deletion discards a great deal of useful information, which is detrimental to the subsequent mining and analysis of the time series. Here, the Newton interpolation method is used to fill in the missing data initially, replacing missing terms in the original data with appropriate values. The Newton interpolation formula, Eq. (1), is as follows.

$$N(x) = f(x_0) + f[x_0, x_1](x - x_0) + \cdots + f[x_0, x_1, \ldots, x_n](x - x_0)(x - x_1)\cdots(x - x_{n-1}) \qquad (1)$$

where $f[x_0, \ldots, x_k]$ denotes the $k$-th order divided difference of the known data points. When using the Newton interpolation method, five known data points before and after the missing data are selected for polynomial fitting, which gives the preliminary filling result for the missing data.
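The preliminary filling step can be illustrated with a minimal Python sketch of Newton divided-difference interpolation. The function name `newton_fill` and the use of sample indices as the x-coordinates are assumptions made for illustration; the paper does not specify an implementation.

```python
import numpy as np

def newton_fill(x_known, y_known, x_missing):
    """Evaluate the Newton divided-difference polynomial at x_missing."""
    x = np.asarray(x_known, dtype=float)
    coef = np.array(y_known, dtype=float)  # becomes f[x0], f[x0,x1], ...
    n = len(x)
    # Build the divided differences in place, one order at a time.
    for order in range(1, n):
        coef[order:] = (coef[order:] - coef[order - 1:-1]) / (x[order:] - x[:n - order])
    # Horner-style evaluation of the Newton form.
    result = coef[-1]
    for k in range(n - 2, -1, -1):
        result = result * (x_missing - x[k]) + coef[k]
    return float(result)

# Hypothetical usage: the sample at index 3 is missing, neighbors are known.
estimate = newton_fill([0, 1, 2, 4, 5], [1.0, 2.0, 5.0, 17.0, 26.0], 3)
```

In practice the known neighbors on each side of the gap (five in the paper's setup) would be passed as `x_known` and `y_known`.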

Scene partitioning based on K-means clustering
Analysis of a large quantity of historical data shows that the load changes of the power grid on different dates follow certain regular patterns. Therefore, by clustering and analyzing the power grid data during preprocessing, the neural network for a specific scenario can learn more intrinsic features. K-means clustering is a commonly used clustering algorithm [11]. The algorithm assigns data points to the nearest cluster center through iterative calculation and continuously updates the center point of each cluster to optimize the clustering result. The loss function is defined as follows:

$$J = \sum_{i=1}^{m} \left\| x_i - \mu_{c_i} \right\|^2 \qquad (2)$$

where $x_i$ represents the $i$-th sample, $c_i$ is the cluster to which $x_i$ belongs, $\mu_{c_i}$ represents that cluster's center point, and $m$ is the number of samples.
The load scene analysis based on K-means clustering proceeds as follows. First, k centers are randomly selected. The active power of the load is taken as the reference quantity, the active power curves are drawn, and clustering is performed. Through iterative calculation, the loss function corresponding to the clustering result converges. The calculation yields three classes (which can be interpreted as working days, rest days, and statutory holidays), and the scenes are numbered accordingly. Data containing missing values are then assigned to the appropriate scene by computing the Euclidean distance between them and the three cluster centers.
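The clustering and scene assignment steps can be sketched in plain NumPy as follows. This is an illustrative version of the standard K-means (Lloyd) iteration, not the authors' code; the function names, the fixed random seed, and the layout of one daily active power curve per row are assumptions.

```python
import numpy as np

def nearest_center(curves, centers):
    """Index of the closest center (Euclidean distance) for every curve."""
    dists = np.linalg.norm(curves[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(curves, k, n_iter=100, seed=0):
    """Cluster daily load curves (one row per day) into k scenes."""
    curves = np.asarray(curves, dtype=float)
    rng = np.random.default_rng(seed)
    # Randomly pick k distinct curves as the initial centers.
    centers = curves[rng.choice(len(curves), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_center(curves, centers)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            curves[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # centers stopped moving, so the loss has converged
        centers = new_centers
    return centers, nearest_center(curves, centers)

def assign_scene(curve, centers):
    """Scene number of a new curve: the nearest cluster center."""
    curve = np.asarray(curve, dtype=float)[None, :]
    return int(nearest_center(curve, centers)[0])
```

With `k=3`, the returned labels play the role of the scene numbers, and `assign_scene` places a day with missing values into one of the three scenes. Because the initial centers are random, different seeds may converge to different partitions.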

RESIOFNN MODEL
The ResIOFNN model is a fusion of the Oscillatory Fourier Neural Network [11] and the Residual Network (ResNet). The input layer of the IOFNN network uses a fully connected layer, the hidden layer uses specialized neurons activated by a cosine function, and the output layer uses a fully connected layer. The input layer can alternatively use a 1D convolutional layer.

IOFNN model
The neuron is equivalent to projecting the input data onto sine and cosine oscillation functions. Each neuron oscillates at a specific frequency $\omega_i$, where $i$ indexes the frequency channels. The forward propagation of IOFNN can be viewed as a discrete Fourier transform (DFT), and its structure is shown in Figure 1.
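The DFT view can be checked numerically with a small sketch. It is not the IOFNN neuron itself; it only shows that summing a sequence against a cosine oscillation reproduces the real part of the corresponding DFT coefficient. The frequency grid of 2πi/T for channel i and the function name `cosine_projection` are assumptions made for illustration.

```python
import numpy as np

def cosine_projection(x, i):
    """Project a length-T real sequence onto a cosine at frequency channel i."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    t = np.arange(T)
    omega_i = 2.0 * np.pi * i / T  # assumed frequency of channel i
    return float(np.sum(x * np.cos(omega_i * t)))

# For a real sequence, the projection equals the real part of the DFT bin.
x = np.array([0.5, 1.0, -0.3, 2.0, 0.0, 1.5, -1.0, 0.7])
matches_dft = np.isclose(cosine_projection(x, 2), np.fft.fft(x)[2].real)
```

The sine projection would recover the imaginary part in the same way, up to sign.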
At every time step $t$, the input vector is calculated by Eq. (3). The input to the DC neuron equals this input vector minus a constant value, as shown in Eq. (4). The input to each AC neuron equals the input vector minus the time-driven phase modulation term $\omega_i t$, as shown in Eq. (5). The parameter $\omega_i$, a hyperparameter that defines the clock speed of each AC neuron, is calculated for the various AC neurons by Eq. (6).

[Eqs. (3)-(6): input vector, DC neuron input, AC neuron input, and AC neuron clock speed]

The forward propagation of IOFNN is similar to that of a recurrent neural network. First, the states of the intermediate hidden layer neurons are initialized. Next, the input at each time step is sent to the parallel architecture, and the activation at each step is calculated according to Eqs. (3)-(6). Then, the final state of each neuron in IOFNN is calculated from the hidden layer outputs over all time steps by Eqs. (7) and (8). Finally, the hidden states of the neurons are concatenated and connected to the output layer via Eqs. (9) and (10).

[Eqs. (7)-(10): final neuron states and output layer]

As described in Eq. (11), the transitions are computed for each time step.

[Eq. (11): per-step transition]
Figure 2 shows the forward propagation process of IOFNN.

Residual network
ResNet is constructed from a series of residual blocks, which can be expressed by Eqs. (12) and (13), where $F(x_l, w_l)$ is the residual function, $h(\cdot)$ is the direct mapping, and $f(\cdot)$ is the activation function. For a deeper layer $L$, when $h(\cdot)$ and $f(\cdot)$ are identity maps, its relationship to the $l$-th layer can be expressed according to Eq. (14). Eqs. (12)-(14) are as follows.

$$y_l = h(x_l) + F(x_l, w_l) \qquad (12)$$

$$x_{l+1} = f(y_l) \qquad (13)$$

$$x_L = x_l + \sum_{i=l}^{L-1} F(x_i, w_i) \qquad (14)$$

In ResNet, data can propagate directly between layers during both forward propagation and backpropagation. If the features output by the previous layer do not contribute to the expressive ability of the entire network, the residual block becomes an identity map, which does not affect the expressive ability of the network [12]. This can effectively mitigate the overfitting phenomenon of deep neural networks. Stacking multiple layers of IOFNN also leads to network degradation. Therefore, the IOFNN layers can be stacked in the form of a residual network to construct a residual Fourier neural network. The structure of the ResIOFNN model is shown in Figure 3. Each unit in the ResIOFNN network can be expressed by residual equations in which $F$ represents a residual function and the direct mapping is an identity map.

The scenarios in which voltage data in the validation set are missing are fed into the ResIOFNN network for iterative calculation, and the filled data are checked against the accuracy requirements. If the requirements are not met, the training parameters are adjusted and the network is retrained. During filling, the ResIOFNN model checks whether the current data point is present. If it is missing, the data are passed to the filling model and the missing voltage value is filled. The filled value is then fed back into the model as part of the input vector, and the next missing value is filled in the same way until all missing data are filled.
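The sequential filling loop described here can be sketched as follows. The callable `predict_next` is a hypothetical stand-in for the trained ResIOFNN model, and `window` is an assumed input length; the sketch uses only past values for simplicity, whereas the paper's model also draws on future information. It further assumes the first `window` samples are present, for example after the preliminary interpolation step.

```python
import numpy as np

def fill_sequentially(series, predict_next, window):
    """Fill NaN gaps left to right, feeding each filled value back as input."""
    filled = np.array(series, dtype=float)  # work on a copy
    for t in range(window, len(filled)):
        if np.isnan(filled[t]):
            # Earlier gaps are already filled, so the window contains no NaN.
            filled[t] = predict_next(filled[t - window:t])
    return filled

# Hypothetical usage with a toy linear extrapolator in place of the model.
toy_model = lambda w: 2.0 * w[-1] - w[-2]
result = fill_sequentially([0.0, 1.0, np.nan, np.nan, 4.0], toy_model, window=2)
```

Because each filled value becomes part of the next input window, errors can accumulate across long gaps, which is one reason the filling accuracy is evaluated at several missing rates.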

EXAMPLE ANALYSIS
In this paper, the missing voltage values of a regional power grid are filled in using historical data from that grid as the sample set. The sampling period of the voltage is 5 minutes, and the data-filling object is the missing voltage value of the 10 kV bus. The KNN algorithm, the Random Forest (RF) algorithm, and the algorithm proposed in this study are selected to compare the imputation results. The root mean square error (RMSE) is used to measure the deviation between the real values and the filling values. RMSE can be calculated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

where $y_i$ and $\hat{y}_i$ are the real and filling values, and $n$ is the total number of missing values.
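In its standard form, the RMSE is the square root of the mean squared deviation between the real and filled values at the missing positions. A minimal Python sketch, with the function name `rmse` chosen for illustration:

```python
import numpy as np

def rmse(real, filled):
    """Root mean square error between real and filled values."""
    real = np.asarray(real, dtype=float)
    filled = np.asarray(filled, dtype=float)
    return float(np.sqrt(np.mean((real - filled) ** 2)))
```

Only the positions that were actually missing should be passed in, so that values that were never missing do not dilute the error.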

Filling result analysis
The accuracy of the algorithms under different missing rates is studied. Random missing data sets with voltage data missing rates between 1% and 30% are constructed. Experiments are conducted using ResIOFNN, RF, and KNN to compare the RMSE and accuracy. Using the missing voltage values of a specific node in the actual regional power system as the filling target, the filling error of the missing voltage data under different missing conditions is analyzed, as shown in Figure 4. The chart shows that the errors of RF and KNN increase considerably as the missing rate rises. The ResIOFNN algorithm is relatively stable; in particular, when the missing rate exceeds 15%, its error changes little. To further illustrate the advantages of ResIOFNN, this paper analyzes the case where the missing rate is 15%. Figure 6 compares 30 consecutive sets of data under a given missing condition. The curve produced by the ResIOFNN method proposed in this study matches the real value curve well, and the filling values are close to the real values.
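A random missing data set at a given missing rate can be constructed as in the sketch below. The function name `make_missing`, the fixed seed, and the choice of uniformly random positions are assumptions; the paper does not state how the missing positions were drawn.

```python
import numpy as np

def make_missing(series, missing_rate, seed=0):
    """Return a copy of `series` with a random fraction set to NaN, plus the mask."""
    rng = np.random.default_rng(seed)
    values = np.array(series, dtype=float)
    n_missing = int(round(missing_rate * len(values)))
    # Choose distinct positions to blank out.
    idx = rng.choice(len(values), size=n_missing, replace=False)
    mask = np.zeros(len(values), dtype=bool)
    mask[idx] = True
    values[mask] = np.nan
    return values, mask
```

The returned mask marks the positions where the filled values should be compared with the real values when computing the error.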

CONCLUSION
This paper proposes an improved deep learning filling method to deal with the problem of missing grid voltage data. The improved Fourier neural network provides an efficient way to extract frequency-domain information from sequence data and can learn complex spatiotemporal relationships in power grid data, such as the correlation of voltage data and the periodicity of load changes. Introducing a residual structure into the network speeds up training while avoiding the over-fitting problem. The ability to handle complex models is enhanced, and a good filling effect is obtained.

Figure 2. The forward propagation process of IOFNN

Figure 3. ResIOFNN model structure diagram

the real and filling values; $n$ is the total number of missing values. Accuracy: filling precision. $n_r$ is the number of correctly estimated values,

Figure 4. Comparison of the RMSE of the filling results of the three algorithms

Figure 5. Comparison of filling accuracy of three algorithms

Figure 5 shows the filling accuracy of the three algorithms under different degrees of missing data. The figure demonstrates that ResIOFNN has a better filling effect than RF and KNN.

Figure 6. ResIOFNN filling results compared to true values