Deep-Adversarial-Transfer Learning Based Fault Classification of Power Lines in Smart Grid

Because actual fault samples are scarce, machine-learning-based fault classification models for power lines in the smart grid are generally trained on simulated fault samples generated by software such as Matlab/Simulink. However, the fault features of actual and simulated samples differ, so existing methods might not be valid or accurate enough when classifying the fault types of actual power lines. This paper therefore proposes a new fault classification model for actual power lines in the smart grid based on deep-adversarial-transfer learning. First, a conditional generative adversarial network (CGAN) is applied to augment the actual fault samples and thereby increase the amount of data to some extent. Then, the loss function of the convolutional neural network (CNN) is redesigned based on transfer learning, and a new fault classification framework based on the improved CNN (I-CNN) is proposed. The I-CNN-based model is trained using both the adversarially augmented and the simulated samples, and it brings the feature distributions of the two categories of samples closer together, which enables it to classify the fault types of actual power lines. To verify the validity of the method, a real-world power line is used as a case study. The results show the effectiveness of the proposed method.


Introduction
Faults occur frequently in power lines [1], so fault classification of power lines in the smart grid is extremely important. Many methods have been studied, and they fall into two categories: 1) methods based on signal transformation; and 2) methods based on machine learning.
For the second category, machine learning, and deep learning in particular, is the most frequently applied approach because of its powerful feature extraction and nonlinear mapping abilities. Among deep networks, the convolutional neural network (CNN) might be the most effective [15][16]. CNN-based fault classification for transmission lines is studied in [17], and the accuracy and robustness of the CNN-based algorithm are improved in [18][19].
Although the above methods achieve favorable results, some limitations remain. Specifically, they require many data samples to train their models, and ideally enough real-world samples would be collected from power systems for this purpose. However, real-world samples are costly to acquire and therefore scarce, and a well-trained model cannot be obtained from such a small set alone. For this reason, the training samples of the above methods are generated by software such as Matlab/Simulink rather than collected from power systems. The problem is that software can hardly reproduce the full operating details of a power system, so the fault features of simulated and real-world samples differ noticeably. A model trained on simulated samples might therefore suit the fault classification of simulated power lines only, and might not be appropriate or accurate enough for real-world power lines. In other words, the practical value of the existing methods is somewhat limited.
To address the above limitations and increase the practical value of the fault classification model, a deep-adversarial-transfer learning based model is proposed. This model combines simulated and actual data samples to achieve accurate fault classification of real-world power lines. The new contributions are as follows.
1) To address the scarcity of real-world data samples, this paper expands them through a conditional generative adversarial network (CGAN) [20] and generates an augmented sample set. This data augmentation increases the amount of data and benefits the training of the fault classification model.
2) This paper designs an improved convolutional neural network (I-CNN), which achieves accurate fault classification of real-world power lines using the augmented and simulated fault samples and therefore has high practical value. The loss function of the I-CNN is designed based on transfer learning, which brings the feature distributions of the two categories of samples closer together and thereby enables accurate classification of the fault types of actual power lines.

Comparison of simulated and actual fault samples of power lines
The system in Figure 1 is studied in this paper. The power line has a voltage level of 220 kV, a length of 100 km, a frequency of 50 Hz and a load of 100 MW. The system is modeled in MATLAB/Simulink. It should be noted that the parameters of the simulated power line in Simulink are exactly the same as those of the actual power line to be diagnosed. A fault that occurred on this power line in the actual system is reproduced in Simulink under the same fault condition, and Figure 2 compares the two. As Figure 2 shows, the simulated and actual results are not identical under the same fault condition. Therefore, a fault classification model trained only on simulated samples might not be useful for diagnosing faults on actual power lines, and a model with engineering application value needs to be studied.

Data Augmentation Based on CGAN
CGAN adds a condition to GAN, which extends the unsupervised model to a supervised one. If the generator and discriminator are conditioned on some additional information c, here the fault type, then c can be attached to both of them to guide the data augmentation process. The generator takes the fault type c and noise z, which obeys the normal distribution N(0,1), as input, while the discriminator takes the fault type c and a fault sample x as input. The optimization goals of the discriminator and generator are given by (1) and (2), respectively, where θD denotes the parameters of the discriminator and θG the parameters of the generator.
When training the discriminator, b noise vectors sampled from N(0,1) and b fault samples form one training batch, and the discriminator loss and gradient update direction are calculated according to (1). The RMSProp optimization algorithm is used for the gradient update. The gradient is clipped to a small interval [-p, p] to satisfy the Lipschitz continuity condition of the discriminator. When training the generator, the parameters of the discriminator are fixed. Then, b noise vectors sampled from N(0,1) form one training batch, and the generator loss is calculated according to (2). RMSProp is also used to update the generator parameters. In general, the better the discriminator, the more accurate the gradient information the generator receives. Therefore, the discriminator is updated k times before each update of the generator, so that the discriminator converges faster. The pseudocode of the CGAN training process is shown in Figure 3.
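The bookkeeping around the CGAN training described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the training procedure of Figure 3: the function names, the noise dimension and the choice to concatenate z with a one-hot encoding of c are hypothetical, and the networks and the losses (1) and (2) are deliberately left out.

```python
import random


def one_hot(fault_type, num_types=10):
    """Encode the fault type c as a one-hot list (10 fault types, as in the paper)."""
    vec = [0.0] * num_types
    vec[fault_type] = 1.0
    return vec


def generator_input(fault_type, noise_dim, rng):
    """Build one generator input: noise z ~ N(0, 1) followed by the condition c.

    Concatenating z and c is an assumption made for this sketch.
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return z + one_hot(fault_type)


def clip(values, p):
    """Clip every value to the interval [-p, p]."""
    return [max(-p, min(p, v)) for v in values]


def update_schedule(generator_steps, k):
    """List the update order: k discriminator updates before each generator update."""
    schedule = []
    for _ in range(generator_steps):
        schedule.extend(["D"] * k)
        schedule.append("G")
    return schedule


rng = random.Random(0)
g_in = generator_input(fault_type=3, noise_dim=5, rng=rng)  # noise_dim is hypothetical
clipped = clip([-0.5, 0.005, 0.2], p=0.01)
order = update_schedule(generator_steps=2, k=5)
```

With k=5, `order` contains five "D" entries before every "G" entry, which mirrors the rule that the discriminator is updated k times before each generator update.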

Framework of the I-CNN based fault classification model
The fault classification framework is shown in Figure 4. The input of the framework is the voltage and current time series at both ends of the power line, arranged as a 12×d matrix, where d is the length of the time series. Each convolution kernel (lc×lc) moves across this matrix one column at a time, so each convolution layer matrix has size (12−lc+1)×(d−lc+1). With t convolution kernels, the convolution process yields t convolution layer matrices, c1 to ct. The pooling layer then compresses the convolution layer matrices into t pooled feature matrices, p1 to pt, which are stacked into a feature vector. Softmax is generally used as the classifier in CNN. Its input is a vector of real numbers, which are normalized in the exponential domain so that they sum to 1. The number of softmax outputs equals the number of classes. There are 10 types of power line faults, so softmax has 10 outputs.
Figure 4. The framework of the CNN-based fault classification model.
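The size arithmetic of the framework can be checked with a short Python sketch. It uses the values reported in the Samples Preparation section (d=60, lc=7, lp=2, t=6); the function names are hypothetical, and non-overlapping pooling with exact division is an assumption of this sketch.

```python
import math


def conv_output_shape(rows, cols, lc):
    """Size of one convolution layer matrix for an lc x lc kernel, stride 1, no padding."""
    return rows - lc + 1, cols - lc + 1


def pool_output_shape(rows, cols, lp):
    """Size after non-overlapping lp x lp pooling (assumes exact division)."""
    return rows // lp, cols // lp


def softmax(scores):
    """Normalize real-valued scores in the exponential domain so that they sum to 1."""
    m = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


conv_shape = conv_output_shape(12, 60, 7)        # 12 x d input with d = 60, lc = 7
pool_shape = pool_output_shape(*conv_shape, 2)   # lp = 2
feature_length = 6 * pool_shape[0] * pool_shape[1]  # t = 6 pooled matrices, stacked
probs = softmax([0.0] * 10)                      # 10 fault types, equal scores
```

Here `conv_shape` is (6, 54) and `pool_shape` is (3, 27), matching the sizes stated in the paper, and the stacked feature vector has 6×3×27 = 486 entries.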

Loss function design based on transfer learning
Transfer learning can make the problems to be solved in the two domains more similar [15][21]. To obtain a model suited to fault classification of actual power lines, the loss function of the I-CNN in this paper is redesigned and consists of the following two parts.
1) The loss term, where M is the number of samples and hij is a coefficient.
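The loss equations themselves are not reproduced in this text. The parameter names λMMD and λWD given in the Samples Preparation section suggest a maximum mean discrepancy (MMD) penalty and a weight decay penalty, so the following Python sketch shows one generic way such a combined loss could look. Everything in it is an assumption for illustration: the linear-kernel MMD estimate, the squared-norm weight decay, and all function names are hypothetical and should not be read as the paper's formulas.

```python
def mean_vector(features):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[j] for f in features) / n for j in range(dim)]


def linear_mmd(source_features, target_features):
    """Linear-kernel MMD estimate: squared distance between the two domain means."""
    mu_s = mean_vector(source_features)
    mu_t = mean_vector(target_features)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def total_loss(classification_loss, mmd, weights, lambda_mmd=1.0, lambda_wd=1.0):
    """Classification loss plus an MMD penalty and a squared-norm weight decay penalty.

    The default values follow the reported settings (lambda_MMD = 1, lambda_WD = 1);
    the additive form is an assumption of this sketch.
    """
    weight_decay = sum(w * w for w in weights)
    return classification_loss + lambda_mmd * mmd + lambda_wd * weight_decay


# Hypothetical features: identical domain means give zero discrepancy.
same = linear_mmd([[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]])
apart = linear_mmd([[0.0, 0.0]], [[3.0, 4.0]])
loss = total_loss(0.5, apart, [1.0, 2.0], lambda_mmd=0.1, lambda_wd=0.01)
```

Minimizing a penalty of this kind pulls the feature means of the simulated and augmented samples together, which is the general mechanism by which a transfer-learning loss can bring the two feature distributions closer.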

Samples Preparation
In total, 27000 simulated samples and 375 actual samples are collected, so the actual data set used in the data augmentation process contains 375 samples. When training the CGAN, the condition variable is converted into one-hot coded form with dimension dc=10 (10 types of fault). The input parameters of the CGAN are b=64, η=0.01, p=0.01 and k=5. ReLU is used as the activation function in the generator.
For the structure parameters of the neural network, the length of the time series is d=60; the size of the convolution kernel is lc=7, so each convolution layer matrix has size 6×54; the number of convolution kernels is t=6; and the size of the pooling kernel is lp=2, so each pooling layer matrix has size 3×27. ReLU is selected as the activation function for all the convolution layers and the fully connected layer. The batch size (training pairs) is N=50, the learning rate α is 1, and the number of epochs is 15. In addition, the weight decay parameter λWD is 1, and the trade-off parameter λMMD is 1.

Comparison with the existing diagnosis models
In this part, the test samples include two parts, i.e., 1) the samples in ds, and 2) the samples in dt.
Besides the proposed fault diagnosis model, which introduces transfer learning and is trained on the merged simulated and augmented fault samples (namely I-CNN), the comparison includes the traditional CNN-based fault diagnosis model trained directly on the merged simulated and augmented fault samples (namely T-CNN) and the methods in [1], [22] and [23]. In [1], the summation-wavelet extreme learning machine and the summation-Gaussian extreme learning machine are proposed and applied to fault diagnosis (namely SW-ELM and SG-ELM, respectively). In [22], the wavelet packet method is used to extract information from non-stationary signals, and the resulting entropies are fed to neural networks for fault classification and location (namely WE-NN). In [23], an SVM classifier is used to classify the fault type based on the discrete WT, and the wavelet coefficients are used to locate the fault (namely WT-SVM). It should be noted that the merged simulated and augmented fault samples are also used to train the models in [1], [22] and [23].
Two accuracy measures are considered: 1) the percentage of source domain samples of all fault types that are correctly classified (namely ps,typ), and 2) the percentage of target domain testing samples of all fault types that are correctly classified (namely pt,typ). The values of ps,typ and pt,typ are listed in Table 1. All six models perform well when classifying the source domain samples. This is because the source domain samples far outnumber the target domain samples, so the training of every model except the proposed one is mainly influenced by the source domain samples.
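The two accuracy measures are plain percentages of correctly classified samples, computed separately per domain. A minimal Python sketch follows; the function name and the label lists are hypothetical and are not data from the paper.

```python
def accuracy_percent(true_labels, predicted_labels):
    """Percentage of samples whose predicted fault type equals the true fault type."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)


# Hypothetical labels, with fault types encoded as integers 0..9.
source_true, source_pred = [0, 1, 2, 3], [0, 1, 2, 3]
target_true, target_pred = [0, 1, 2, 3], [0, 1, 2, 9]

p_s_typ = accuracy_percent(source_true, source_pred)  # source domain accuracy
p_t_typ = accuracy_percent(target_true, target_pred)  # target domain accuracy
```

Reporting the two values separately matters because a model dominated by the source domain samples can score well on ps,typ while still misclassifying target domain samples.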

Validation of the proposed model with noise
The performance of the proposed method and of several other models under noise is studied, and the comparison is given in Table 2. The accuracies of the proposed model stay above 98%, while those of the compared models drop much faster as the SNR decreases. For example, the accuracies of the proposed model decrease by about 1% when the SNR falls from 40 dB to 20 dB, whereas the accuracies of kNN fall by more than 10%. These results indicate that the strong generalization ability of the neural networks greatly reduces the impact of noise.
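The text does not state how the noisy test samples were produced. One common approach, shown here purely as an assumption, is to add white Gaussian noise whose power is set from the signal power and the target SNR in dB. The function names, the 50 Hz test waveform and its 5 kHz sampling rate are hypothetical.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean white Gaussian noise so that the expected SNR equals snr_db."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB, computed from the clean signal and the added noise."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


rng = random.Random(0)
# Hypothetical 50 Hz waveform sampled at 5 kHz (the sampling rate is an assumption).
clean = [math.sin(2 * math.pi * 50 * n / 5000) for n in range(20000)]
noisy = add_gaussian_noise(clean, 20.0, rng)
snr = measured_snr_db(clean, noisy)
```

A lower SNR value means stronger noise relative to the signal, so a sweep from 40 dB down to 20 dB corresponds to progressively harder test conditions.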

Conclusion
A deep-adversarial-transfer learning based power line fault classification method that accounts for insufficient fault samples is proposed. Several conclusions can be drawn. First, with the introduction of data augmentation and transfer learning (TL), the proposed model is suitable for the fault classification of actual power lines. Second, redesigning the loss function on the basis of TL considerably improves the accuracy of the proposed model in fault classification.