Error state prediction of capacitor voltage transformer based on TimesNet and gate control unit

The error state of capacitor voltage transformers (CVTs) has poor stability and reliability, which affects the fairness of electricity trade settlement and the safe operation of the power grid. Moreover, CVT error data superimposes several periodic components, such as daily, monthly and quarterly periods, which makes the prediction of the transformer error state challenging. In this paper, a CVT error state prediction method based on TimesNet and the gated recurrent unit (GRU) is proposed. The TimesNet network is employed to capture the intraperiod-variation and interperiod-variation characteristics of the ratio difference data of phases a, b and c, and a GRU model then predicts the output from the extracted features. The simulation results demonstrate that the mean square error (MSE) and mean absolute error (MAE) of the proposed model are 0.0002 and 0.0101, respectively, indicating lower prediction error and higher prediction accuracy than the comparison models.


Introduction
The measurement error state of a capacitor voltage transformer (CVT) is easily affected by environmental factors, and its stability is poor [1]. Evaluating the CVT error state through periodic outage inspection is difficult to carry out and costly, and cannot meet the operation requirements of intelligent substations [2]. Therefore, it is urgent to study methods that predict the CVT measurement error state without a power outage, so that the measurement error state can be tracked in time and the operation and maintenance of CVTs can be guided in a targeted manner [3].
In 2018, Zhenhua Li first explored the prediction of the CVT measurement error state and proposed a Q-ARMA (autoregressive moving average) model [4]. In the single-step prediction of this model, the error between the predicted value and the actual value of the CVT measurement error is about 5%. In 2023, Guanghua Yang designed a CVT error state prediction method based on a WHO-RNN structure. This method employed the wild horse optimizer (WHO) to optimize the parameters of a recurrent neural network (RNN) in order to mine historical information more deeply [5]. In May of this year, Wanli Zhang proposed an improved multi-task learning (MTL) deep framework based on the gated recurrent unit (GRU) [6]. This framework was used to mine the coupling among six ratio error and phase error data sets, so that only one model needs to be trained to obtain six prediction outputs, which greatly improves the training efficiency.
The studies above represent recent research by domestic scholars on CVT error state prediction methods. To the best of our knowledge, no comparable literature has been published abroad. In these studies, the error prediction of CVT is usually modeled with time series analysis methods. Based on this, this paper proposes a CVT error state prediction method based on TimesNet-GRU. The method first transforms the one-dimensional temporal variation into two-dimensional space, extracts features of the time series data in that space, and then constructs a GRU model for prediction.

Overall architecture
The overall network framework of the proposed model, named TimesNet-GRU, is shown in Figure 1. In the proposed method, the historical three-phase difference data of the CVT are first processed by a combination of standardization and one-hot coding to obtain the input matrix. Then, TimesNet is adopted to convert the 1D time series into 2D space and capture the dependencies of the data within and across different time scales [7]. Finally, the GRU further extracts temporal characteristics and long-term dependencies to obtain more global temporal information, and then predicts the future three-phase difference data [5].

Data pre-processing
Data standardization unifies the dimensions of the data, smooths the gradients between different batches and different layers, reduces the negative impact of singular data on model training, and accelerates convergence. Therefore, this paper uses the Z-score normalization method, which normalizes the dataset to a distribution with a mean of 0 and a variance of 1. Let the three-phase error data of the CVT be $x = (x_a, x_b, x_c)$ and the time scale be $T$; then the standardized processing formula is:
$$X = \frac{x - \mu}{\sigma} \tag{1}$$
where $\mu$ is the mean value and $\sigma$ is the standard deviation of $x$. Before $X$ is input to the TimesNet network, we use the one-hot coding algorithm to map $X$ to the deep feature $X_{\mathrm{1D}}$, which extends the values of discrete features to Euclidean space and makes the distance calculation between features more reasonable. At the same time, one-hot coding also expands the feature set to a certain extent.
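As an illustration only, the Z-score step can be sketched in NumPy as follows. The function names, the synthetic three-phase data and the per-phase (column-wise) statistics are assumptions made for this sketch; the paper does not state whether $\mu$ and $\sigma$ are computed per phase or on the training split only.

```python
import numpy as np

def zscore_fit(data):
    """Return the per-phase mean and standard deviation of a (T, 3) array."""
    return data.mean(axis=0), data.std(axis=0)

def zscore_apply(data, mu, sigma):
    """Standardize the data: X = (x - mu) / sigma."""
    return (data - mu) / sigma

def zscore_invert(z, mu, sigma):
    """Map standardized values back to the original scale."""
    return z * sigma + mu

# Synthetic stand-in for the ratio-difference data of phases a, b, c
# (hypothetical values, not real CVT measurements).
rng = np.random.default_rng(0)
x = rng.normal(loc=[0.05, -0.02, 0.10], scale=[0.01, 0.02, 0.03], size=(200, 3))

mu, sigma = zscore_fit(x)
z = zscore_apply(x, mu, sigma)
print(z.mean(axis=0), z.std(axis=0))
```

Keeping `mu` and `sigma` allows predictions to be mapped back to the original error scale with `zscore_invert` before they are interpreted.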

Feature extraction based on TimesNet
The variation at each time point in a time series is affected not only by the adjacent time points, but is also highly correlated with the time points at the same phase in other periods. These two types of temporal variation are called intraperiod-variation and interperiod-variation, respectively. The intraperiod-variation represents the short-term trend within a period, and the interperiod-variation reflects the long-term trend across different periods. However, it is difficult for a one-dimensional time series to display these two types of variation explicitly at the same time [8]. Therefore, the one-dimensional time series is reshaped into two-dimensional tensors, where each column contains the time points within one period, and each row contains the time points at the same phase in different periods. Based on these considerations, we select the relatively new TimesNet network to capture the intraperiod-variation and interperiod-variation of the time series.
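In order to investigate the periodicities of the time series, the Fast Fourier Transform (FFT) is first employed in TimesNet to process the one-dimensional data $X_{\mathrm{1D}}$ in the frequency domain:
$$A = \mathrm{Avg}\left(\mathrm{Amp}\left(\mathrm{FFT}(X_{\mathrm{1D}})\right)\right) \tag{2}$$
$$\{f_1, \dots, f_k\} = \underset{f_* \in \{1, \dots, [T/2]\}}{\arg\mathrm{Topk}}(A) \tag{3}$$
$$p_i = \left\lceil \frac{T}{f_i} \right\rceil, \quad i \in \{1, \dots, k\} \tag{4}$$
where $\mathrm{FFT}(\cdot)$, $\mathrm{Amp}(\cdot)$, $\mathrm{Avg}(\cdot)$ and $\arg\mathrm{Topk}(\cdot)$ represent the Fast Fourier Transform, the amplitude calculation, the mean calculation, and the selection of the top $k$ maximum values, respectively.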
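The period-detection step can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function name `detect_periods` and the synthetic two-period signal are assumptions, and the zero-frequency (DC) bin is excluded so that every selected frequency yields a finite period.

```python
import numpy as np

def detect_periods(x, k):
    """Select the k dominant frequencies of a (T, C) series and their periods.

    Returns (freqs, periods, amp), where amp is the amplitude spectrum
    averaged over the C variables.
    """
    T = x.shape[0]
    spectrum = np.fft.rfft(x, axis=0)        # shape (T // 2 + 1, C)
    amp = np.abs(spectrum).mean(axis=1)      # average amplitude over variables
    # Rank frequencies 1..T//2 by amplitude, skipping the DC bin at index 0.
    freqs = np.argsort(amp[1:])[::-1][:k] + 1
    periods = np.ceil(T / freqs).astype(int)
    return freqs, periods, amp

# Synthetic three-phase signal with periods of 30 and 12 samples
# (hypothetical data, chosen so that both periods divide T exactly).
t = np.arange(120)
base = np.sin(2 * np.pi * t / 30) + 0.5 * np.sin(2 * np.pi * t / 12)
x = np.stack([1.0 * base, 0.8 * base, 1.2 * base], axis=1)

freqs, periods, amp = detect_periods(x, k=2)
print(freqs, periods)
```

In this synthetic case the two strongest frequencies correspond to the two periods built into the signal; real CVT data would show spectral leakage and noise, which is why only the top $k$ frequencies are retained.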
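$A$ represents the mean amplitude of each frequency across the different variables. Considering the sparsity of the frequency domain, and to avoid the noise introduced by meaningless high frequencies, TimesNet selects only the $k$ most significant frequencies $\{f_1, \dots, f_k\}$ with the largest amplitudes. The non-normalized amplitudes and period lengths corresponding to these frequencies are $\{A_{f_1}, \dots, A_{f_k}\}$ and $\{p_1, \dots, p_k\}$, respectively, where $k$ is a hyperparameter. Based on the above frequencies and period lengths, the one-dimensional variable $X_{\mathrm{1D}}$ is reconstructed into multiple two-dimensional tensors:
$$X_{\mathrm{2D}}^{i} = \mathrm{Reshape}_{p_i, f_i}\left(\mathrm{Padding}(X_{\mathrm{1D}})\right), \quad i \in \{1, \dots, k\} \tag{5}$$
where $\mathrm{Padding}(\cdot)$ means that the time series is zero-padded along the time dimension so that it can be reshaped into two-dimensional data; $X_{\mathrm{2D}}^{i}$ represents the $i$-th group of reconstructed data after reshaping; and $p_i$ and $f_i$ are the number of rows and columns of the two-dimensional tensor, respectively.
Then, in TimesNet, the inception module is utilized to extract features from the two-dimensional data. It captures details at different scales and levels, thus improving the model's ability to understand two-dimensional data. We adopt the following formulas to represent this process:
$$\hat{X}_{\mathrm{2D}}^{i} = \mathrm{Inception}\left(X_{\mathrm{2D}}^{i}\right) \tag{6}$$
$$\hat{X}_{\mathrm{1D}}^{i} = \mathrm{Trunc}\left(\mathrm{Reshape}_{1, (p_i \times f_i)}\left(\hat{X}_{\mathrm{2D}}^{i}\right)\right) \tag{7}$$
where $\mathrm{Trunc}(\cdot)$ in equation (7) is a truncation function, which truncates the sequence of length $p_i \times f_i$ back to the original length $T$. The $k$ one-dimensional representations are then fused together using the amplitude means $A$ of the different periods, because the amplitude $A$ reflects the relative importance of the selected frequency and period:
$$\hat{A}_{f_1}, \dots, \hat{A}_{f_k} = \mathrm{Softmax}\left(A_{f_1}, \dots, A_{f_k}\right) \tag{8}$$
$$\hat{X}_{\mathrm{1D}} = \sum_{i=1}^{k} \hat{A}_{f_i} \times \hat{X}_{\mathrm{1D}}^{i} \tag{9}$$
Finally, TimesNet organizes its output in a residual way.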
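The padding, reshaping, truncation and amplitude-weighted fusion described above can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the periods and amplitudes are made-up values, and the learned inception module is replaced by an identity mapping so that only the tensor bookkeeping is shown.

```python
import numpy as np

def to_2d(x, period):
    """Zero-pad a (T, C) series and fold it into (period, n_periods, C).

    Rows index the phase within a period; columns index successive periods.
    """
    T, C = x.shape
    n_periods = -(-T // period)                  # ceil(T / period)
    padded = np.zeros((n_periods * period, C))
    padded[:T] = x
    return padded.reshape(n_periods, period, C).transpose(1, 0, 2)

def to_1d(x2d, T):
    """Unfold a (period, n_periods, C) tensor and truncate it to length T."""
    period, n_periods, C = x2d.shape
    return x2d.transpose(1, 0, 2).reshape(n_periods * period, C)[:T]

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()

def aggregate(branches, amps):
    """Fuse the k one-dimensional branches with softmax-normalized amplitudes."""
    weights = softmax(np.asarray(amps, dtype=float))
    return sum(w * b for w, b in zip(weights, branches))

x = np.arange(20, dtype=float).reshape(10, 2)    # toy series: T = 10, C = 2
periods, amps = [4, 5], [2.0, 1.0]               # hypothetical periods and amplitudes

# Identity stands in for the inception block (assumption for illustration).
branches = [to_1d(to_2d(x, p), T=len(x)) for p in periods]
fused = aggregate(branches, amps)
print(to_2d(x, 4).shape, fused.shape)
```

Because the stand-in for the inception block is the identity and the fusion weights sum to one, `fused` reproduces `x` here, which gives a quick check that padding, folding and truncation are mutually consistent.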

Prediction model based on GRU
GRU is a variant of the recurrent neural network that contains two gating units: the update gate and the reset gate [9]. The update gate determines the weighting between the new input and the previous hidden state, and thus controls how information is updated. The reset gate determines how the previous hidden state is combined with the current input, which helps capture the long-term dependencies in the time series. Therefore, GRU is employed to further extract the long-term and short-term dependencies, enabling a more global pattern to be obtained from the temporal information. The process can be formalized as $Y' = \mathrm{GRU}(\hat{X}_{\mathrm{1D}})$, where $\hat{X}_{\mathrm{1D}}$ is the feature output by TimesNet and $Y'$ is the predicted error value.
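For reference, a single GRU step with its update and reset gates can be written in NumPy as below. This is a generic, untrained sketch rather than the network used in the paper: the hidden size of 8, the random weights, the linear readout `W_out` and the random stand-in for the TimesNet features are all assumptions. Note that two conventions exist for the role of the update gate in the final interpolation; the one used here weights the candidate state by $z$.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def init_gru(n_in, n_hidden, rng):
    """Randomly initialized GRU weights (illustrative and untrained)."""
    p = {}
    for g in ("z", "r", "h"):
        p["W" + g] = rng.normal(scale=0.1, size=(n_hidden, n_in))
        p["U" + g] = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
        p["b" + g] = np.zeros(n_hidden)
    return p

def gru_step(x_t, h_prev, p):
    """One GRU update for input x_t and previous hidden state h_prev."""
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])   # update gate
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])   # reset gate
    h_cand = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h_prev) + p["bh"])
    return (1.0 - z) * h_prev + z * h_cand

rng = np.random.default_rng(0)
n_in, n_hidden, time_step = 3, 8, 5
params = init_gru(n_in, n_hidden, rng)
W_out = rng.normal(scale=0.1, size=(n_in, n_hidden))  # hypothetical linear readout

features = rng.normal(size=(time_step, n_in))  # stand-in for TimesNet features
h = np.zeros(n_hidden)
for x_t in features:
    h = gru_step(x_t, h, params)
y_pred = W_out @ h                              # single-step, three-phase output
print(y_pred.shape)
```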

Experiment
To verify the effectiveness and superiority of the TimesNet-GRU model, we also ran experiments with three comparison models: LSTM, GRU and TimesNet. The hyperparameters of the comparison models were adjusted according to previous experience to optimize their performance. For the proposed TimesNet-GRU model, the learning rate, batch size, time step, input feature dimension, and the value of k were 0.0001, 16, 5, 3 and 6, respectively. TimesNet-GRU is a multi-input, multi-output, single-step prediction model. The data set used in our experiments consists of the CVT three-phase error data calculated every day from 2020.12.20 to 2021.10.31 at Taizhou Xiashi Substation. When constructing the model, the training set, validation set and test set were divided in the ratio 6 : 2 : 2.
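The data preparation implied by this setup can be sketched as follows. Several points are assumptions, since the paper does not specify them: that exactly one sample exists for every calendar day in the stated range, that the split is chronological, and that windows are built before splitting. The random array is only a placeholder for the real measurements.

```python
import numpy as np
from datetime import date

def make_windows(series, time_step):
    """Build single-step samples: X has shape (N, time_step, C), y has shape (N, C)."""
    n = len(series) - time_step
    X = np.stack([series[i:i + time_step] for i in range(n)])
    y = series[time_step:]
    return X, y

def chrono_split(X, y, ratios=(0.6, 0.2, 0.2)):
    """Split samples in time order into training, validation and test sets."""
    n = len(X)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = (X[:n_train], y[:n_train])
    val = (X[n_train:n_train + n_val], y[n_train:n_train + n_val])
    test = (X[n_train + n_val:], y[n_train + n_val:])
    return train, val, test

# Number of daily samples in the stated range, assuming no missing days.
n_days = (date(2021, 10, 31) - date(2020, 12, 20)).days + 1

rng = np.random.default_rng(0)
series = rng.normal(size=(n_days, 3))      # placeholder for three-phase error data

X, y = make_windows(series, time_step=5)   # time step of 5, as stated in the text
train, val, test = chrono_split(X, y)
print(X.shape, len(train[0]), len(val[0]), len(test[0]))
```

A chronological split avoids leaking future samples into training, which matters for time series even when the paper's exact splitting procedure is unknown.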

Evaluation indicators
In this paper, the classical evaluation indexes for regression are used to evaluate the performance of the model [1], namely the mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). The smaller the values of these indicators, the higher the prediction accuracy of the model [10]. The evaluation indexes are calculated from the predicted values and the corresponding true values.
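The four indexes follow their standard definitions, which can be sketched in NumPy as below. The function name and the toy values are illustrative only, and MAPE is expressed in percent here, which is an assumption about the paper's convention.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and MAPE (in percent) for two equal-shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = np.mean(err ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        # MAPE is undefined when a true value is zero.
        "MAPE": np.mean(np.abs(err / y_true)) * 100.0,
    }

# Toy values for illustration (not experimental data).
metrics = regression_metrics([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])
print(metrics)
```

Since ratio errors can lie close to zero, MAPE may become unstable on such data, so it is usually read alongside MSE and MAE rather than on its own.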

Experimental results
After 200 iterations, the TimesNet-GRU model has basically converged. Figure 2 shows the prediction results of the TimesNet-GRU model on the test set, where a_error_true, b_error_true and c_error_true are the three-phase error data of the CVT, and a_error_pre, b_error_pre and c_error_pre are the corresponding predicted values. It can be seen from Figure 2 that TimesNet-GRU learns the timing information in the original data and fits the real data well. The evaluation indexes calculated for LSTM, GRU, TimesNet and TimesNet-GRU on the test set are shown in Table 1, where the best results are in bold. The results show that the MSE and MAE of TimesNet-GRU are 0.001, 0.0041, 0.0022 and 0.0081, 0.0049, 0.0239 lower than those of LSTM, GRU and TimesNet, respectively. In general, the TimesNet-GRU proposed in this paper improves the prediction accuracy and shows strong generalization ability on CVT error data. Therefore, this model is beneficial to the stable operation of the power system and helps improve economic and social benefits.

Conclusion
This paper proposed a novel CVT error state prediction method based on TimesNet and GRU. The intraperiod-variation and interperiod-variation features of the CVT error data were extracted by the TimesNet network, and the GRU was then employed to predict future errors. TimesNet-GRU can extract both the different periodic information and the long-term dependence information in the data, so its prediction accuracy was higher than that of a separate TimesNet network or a separate GRU network.
The simulation results demonstrated that the MSE, RMSE, MAE and MAPE of the proposed method were significantly better than those of the comparison models in the single-step prediction scenario. Thus, the proposed model can provide a reference for the operation, maintenance and performance verification of CVTs and electric energy measurement systems.

Figure 2. The prediction results of the TimesNet-GRU model on the test set.

Table 1. The evaluation index values of different models.