SVM-based Partial Discharge Pattern Classification for GIS

Partial discharges (PD) occur when there are localized dielectric breakdowns in small regions of the insulation of gas insulated substations (GIS). Recognizing PD patterns is important because they allow defects caused by different sources to be diagnosed, so that predictive maintenance can be conducted to prevent unplanned power outages. In this paper, we propose an approach to partial discharge pattern classification. It first recovers the PRPD matrix from a PRPD2D image; statistical features are then extracted from the recovered PRPD matrix and fed into an SVM for classification. Experiments conducted on a dataset containing thousands of images demonstrate the effectiveness of the method.


Introduction
Gas insulated substations (GIS) are extensively used in the electrical industry [1] due to their compactness and insensitivity to the environment. To ensure the safety of electric power systems, predictive maintenance of GIS is routinely performed by experienced personnel. They measure partial discharges (PD), which are localized dielectric breakdowns in small regions of insulating systems [2], to diagnose the condition of the GIS. The partial discharge patterns can reveal the defects caused by different sources.
Traditionally, PD pattern classification was conducted manually by experts. As the volume of sensing data grows, automatic classification and diagnosis become increasingly desirable. For this purpose, a number of methods have been developed [3][4][5][6][7]. For instance, James and Phung calculated the IEC-270 integrated quantities, statistical moments and other fingerprints to recognize the PD patterns within the power frequency cycle [4]. Gao et al. investigated the frequency characteristics of PD for classification [6]. As summarized in [7], the common framework shared by most of the methods is composed of two stages: feature extraction and pattern classification. Besides these two stages, there is another problem to face: the partial discharges collected by PD sensors are stored in the form of graphs, and the corresponding raw data is unavailable due to the lack of standard protocols from manufacturers. This paper proposes an approach to classify the PD patterns captured by a PD-Detector. Our method first parses the raw information from PD graphs via image processing techniques. Then, statistical features are calculated and fed into a support vector machine (SVM) [8] for classification. Our approach is validated on a dataset containing thousands of PD graphs and demonstrates promising results.

The proposed method
There are two ways to perform PD pattern classification. One extracts features from PD graphs directly; the other recovers the raw information first and then extracts features. Our approach takes the latter way. We first design a procedure to parse raw data from PD graphs collected by a PD-Detector. Then, we extract statistical features from the recovered information and use SVM for classification. Each step is introduced in the following subsections.

Graph parsing
A PD-Detector gathers the phase-resolved partial discharge (PRPD) spectrum. The PRPD spectrum consists of three-dimensional discharge patterns (φ, q, n), in which φ stands for the phase angle, q is the discharge magnitude and n is the discharge rate [7]. The patterns are stored in either 2D or 3D form, as shown in Figure 1(a). Graph parsing in this work aims to recover the raw PRPD matrix when a PRPD2D image is given. To this end, we design a parsing procedure based on image processing techniques. The procedure consists of several steps, as presented in Figure 2.

Extract coordinate axes and axes' corners.
When a PRPD2D image is input, we first filter out the blue phase line and convert the remaining color image to gray level. The Canny operator is then applied to detect edges, which gives the edge detection result shown in Figure 1(b). Based on this result, the Hough transform is performed to extract the horizontal axis (φ), the vertical axis (q) and the grid lines, as presented in Figure 1.
Based on the axes' corners detected in the previous step, we can estimate the size of each bin on both the phase and the magnitude axes, where the discharge magnitude axis Y is divided into 100 bins and the phase axis X is divided into 72 bins, each of which spans 5°.
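The bin-size estimate can be sketched as follows. This is a minimal illustration under assumptions: the corner names (`origin`, `x_end`, `y_end`), the pixel coordinates and the helper functions are hypothetical, and the sketch simply divides each axis length in pixels by its number of bins, which may differ in detail from the equations used in this work.

```python
PHASE_BINS = 72       # phase axis X: 72 bins of 5 degrees each (from the text)
MAGNITUDE_BINS = 100  # discharge magnitude axis Y: 100 bins (from the text)


def bin_sizes(origin, x_end, y_end):
    """Return (bin width, bin height) in pixels.

    origin, x_end and y_end are hypothetical (column, row) pixel positions of
    the axis origin, the right end of the phase axis and the top end of the
    magnitude axis. Image rows grow downwards, hence the abs().
    """
    width = abs(x_end[0] - origin[0]) / PHASE_BINS
    height = abs(origin[1] - y_end[1]) / MAGNITUDE_BINS
    return width, height


def phase_of_bin(index):
    """Start phase in degrees of the phase bin with the given 0-based index."""
    return index * 360 / PHASE_BINS


# Made-up corner positions for a 720 x 400 pixel plotting area.
w, h = bin_sizes(origin=(50, 450), x_end=(770, 450), y_end=(50, 50))
print(w, h)              # pixel size of one bin
print(phase_of_bin(18))  # start phase of bin 18
```

With these made-up corners, one bin covers 10 by 4 pixels, and bin 18 starts at a phase of 90°.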
Once the scale of the axes is determined, we recover the PRPD matrix. We first initialize a 2D matrix with one entry per bin, and the position (x, y) of each bin on the image is estimated from the axes' corners and the bin sizes. The RGB value of each grid cell on the PRPD2D image represents the discharge rate. Using a color table constructed beforehand, we can approximately determine the discharge rate of each cell and thereby reconstruct the PRPD matrix. Figure 1(d) visualizes the recovered PRPD matrix, which validates the effectiveness of our graph parsing method.
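The color lookup can be illustrated with a short sketch. Everything specific in it is an assumption made for illustration: the entries of `COLOR_TABLE`, the choice of sampling the center pixel of each bin, and the nearest-color matching in RGB space are not taken from the paper, and the image is a plain nested list rather than a real PRPD2D image.

```python
# Hypothetical color table: RGB color -> discharge rate. The real table must be
# built beforehand from the color bar of the PD-Detector images.
COLOR_TABLE = {
    (255, 255, 255): 0,   # background: no discharge
    (0, 0, 255): 1,
    (0, 255, 0): 5,
    (255, 0, 0): 10,
}


def nearest_rate(rgb):
    """Map an RGB value to the rate of the closest color in COLOR_TABLE."""
    def dist(color):
        return sum((a - b) ** 2 for a, b in zip(color, rgb))
    return COLOR_TABLE[min(COLOR_TABLE, key=dist)]


def recover_prpd(image, origin, bin_w, bin_h, n_phase, n_mag):
    """Rebuild the PRPD matrix (n_mag rows x n_phase columns).

    image[row][col] is an RGB tuple; origin is the (col, row) pixel of the
    axis origin. Row 0 of the result is the lowest magnitude bin.
    """
    matrix = [[0] * n_phase for _ in range(n_mag)]
    for m in range(n_mag):
        for p in range(n_phase):
            # Sample the pixel at the center of the bin.
            col = int(origin[0] + (p + 0.5) * bin_w)
            row = int(origin[1] - (m + 0.5) * bin_h)
            matrix[m][p] = nearest_rate(image[row][col])
    return matrix


# Toy image of 4 x 4 pixels holding 2 x 2 bins, origin at the bottom-left.
W, B, R = (255, 255, 255), (0, 0, 250), (250, 5, 5)
image = [
    [W, W, R, R],
    [W, W, R, R],
    [B, B, W, W],
    [B, B, W, W],
]
prpd = recover_prpd(image, origin=(0, 4), bin_w=2, bin_h=2, n_phase=2, n_mag=2)
print(prpd)
```

Nearest-color matching tolerates small color deviations (for example from image compression), which is one reason the recovered rate is only approximate.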

Feature extraction
In this work, the features used for PRPD pattern classification are extracted from the reconstructed PRPD matrix. In total, 18 statistical features are extracted: skewness, kurtosis, cross correlation factor (CC) and asymmetry, calculated from three distributions. The procedure is as follows:
1. Derive the three distributions from the reconstructed PRPD matrix;
2. Separate each distribution into two parts, one of positive phase and the other of negative phase;
3. Calculate the mean μ and the standard deviation σ for both the positive and negative parts of all distributions;
4. Calculate the skewness and kurtosis for each part of the three distributions, which gives 12 features in total;
5. Calculate the cross correlation factor and asymmetry for each distribution, which gives 6 features in total.
The above-mentioned features are widely used in PRPD classification. For completeness, we briefly introduce them as follows.
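The per-half statistics can be sketched in a few lines. The sketch rests on assumptions that are not confirmed by the text: the first half of the phase windows is treated as the positive part and the second half as the negative part, the window index plays the role of the random variable, and kurtosis is reported as excess kurtosis (a normal distribution gives 0). The function names are hypothetical.

```python
import math


def split_halves(dist):
    """Split a phase distribution into its positive and negative halves."""
    half = len(dist) // 2
    return dist[:half], dist[half:]


def moments(values):
    """Return (mean, std, skewness, kurtosis) of a discrete distribution.

    values[i] is the distribution value in phase window i; it is normalized
    to a probability p_i and the window index i plays the role of x_i.
    """
    total = sum(values)
    if total == 0:
        return 0.0, 0.0, 0.0, 0.0   # empty half: no discharge recorded
    probs = [v / total for v in values]
    mean = sum(i * p for i, p in enumerate(probs))
    var = sum((i - mean) ** 2 * p for i, p in enumerate(probs))
    std = math.sqrt(var)
    if std == 0:
        return mean, 0.0, 0.0, 0.0  # all mass in a single window
    skew = sum((i - mean) ** 3 * p for i, p in enumerate(probs)) / std ** 3
    kurt = sum((i - mean) ** 4 * p for i, p in enumerate(probs)) / std ** 4 - 3
    return mean, std, skew, kurt


# Toy distribution over 8 phase windows (the text uses 72).
dist = [1, 2, 1, 0, 0, 3, 0, 0]
positive, negative = split_halves(dist)
mean, std, skew, kurt = moments(positive)
print(mean, std, skew, kurt)
```

Applying `moments` to both halves of each of the three distributions and keeping the skewness and kurtosis yields the 12 features of step 4.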

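Kurtosis. Kurtosis measures the peakedness of a distribution. It is defined through the fourth central moment of the distribution, normalized by the fourth power of the standard deviation σ.
Cross correlation factor. The cross correlation factor (CC) measures the similarity in shape between the positive and negative halves of a distribution. When CC equals 1, the two halves are highly similar; the further CC falls below 1, the less similar they are.
Asymmetry. The asymmetry compares the discharge levels of the positive and negative halves of a distribution and indicates whether the two levels are equal.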
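For reference, forms of these four quantities that are commonly used in the PD literature are given below. They are stated as assumptions, and the exact definitions used in this work may differ; in particular, whether kurtosis carries the offset of 3 is a matter of convention. Here x_i is the value associated with phase window i, p_i is its probability, μ and σ are the mean and standard deviation, u_i and v_i are the values of the positive and negative halves in window i, n is the number of windows per half, and Q_s and N denote the summed discharge magnitude and the number of discharges of a half (superscript + or −).

```latex
\begin{align}
S_k &= \frac{\sum_i (x_i - \mu)^3 \, p_i}{\sigma^3}, \\
K_u &= \frac{\sum_i (x_i - \mu)^4 \, p_i}{\sigma^4} - 3, \\
CC  &= \frac{\sum_i u_i v_i - \frac{1}{n} \sum_i u_i \sum_i v_i}
            {\sqrt{\left[\sum_i u_i^2 - \frac{1}{n}\left(\sum_i u_i\right)^2\right]
                   \left[\sum_i v_i^2 - \frac{1}{n}\left(\sum_i v_i\right)^2\right]}}, \\
A   &= \frac{Q_s^- / N^-}{Q_s^+ / N^+}.
\end{align}
```

Under these forms, CC = 1 corresponds to two halves of identical shape, and A = 1 corresponds to equal mean discharge levels in the two halves.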

PD pattern classification
When electrical insulation works normally, no discharge should be detected; otherwise, PD detectors capture partial discharges. According to the sources of the defects, we categorize the PD patterns into four major classes: 1) corona discharge, which is an electrical discharge caused by the ionization of air surrounding an electrically charged conductor; 2) surface discharge, which occurs along the surface of solid insulation when the surface tangential electric field is high enough to cause a breakdown; 3) floating electrode partial discharge, which happens when there is an ungrounded conductor within the electric field between conductor and ground; and 4) particle discharge.
In this work, we employ the support vector machine (SVM) [8] for classification. SVM constructs hyperplanes in the high-dimensional feature space that have the largest distance to the nearest training data points of each class. It is extensively used in supervised machine learning for classification when the dimensionality of the features is low. With only 18 features per sample, SVM fits our task well.
Given a set of training data (x_i, y_i), SVM determines the hyperplane (w, b) by minimizing a regularized objective function, in which x_i is the i-th feature vector, y_i is its labelled class, and a scaling factor balances the terms of the objective. This function can be optimized via quadratic programming algorithms.
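To make the optimization concrete, the sketch below trains a binary linear SVM on toy data. It assumes the common soft-margin objective (λ/2)‖w‖² + (1/n) Σ max(0, 1 − y_i (w·x_i + b)), which may not be the exact objective used in this work, and it swaps quadratic programming for plain full-batch subgradient descent to stay short. The function names, the hyperparameter values and the data are hypothetical, and a four-class problem would additionally need a scheme such as one-vs-rest.

```python
def train_linear_svm(xs, ys, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y_i * (w . x_i + b))).

    xs: list of feature vectors, ys: labels in {-1, +1}. Uses plain
    full-batch subgradient descent instead of quadratic programming.
    """
    n, dim = len(xs), len(xs[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]   # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(xs, ys):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:                # sample violates the margin
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return the predicted label (+1 or -1) for feature vector x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data with two features per sample.
xs = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.5],
      [-2.0, -2.0], [-3.0, -1.0], [-2.5, -3.0]]
ys = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(xs, ys)
print([predict(w, b, x) for x in xs])
```

In practice an established SVM library with kernel support would normally be preferred over a hand-written solver; the sketch only shows what the objective penalizes.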

Experiments
To validate the proposed method, we build a dataset that contains 4600 PRPD2D images collected by PD-Detectors during routine maintenance. Each of the four categories contains 1150 images. In our experiment, we randomly take 60% of the images for training and use the remaining 40% for testing. Table 1 lists the classification accuracy for each class; the mean accuracy reaches 0.9158. The experiment validates the effectiveness of our proposed method.
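The evaluation protocol can be sketched as follows. Two choices in the sketch are assumptions rather than details from the text: the random 60/40 split is drawn class by class, and the mean accuracy is taken as the average of the per-class accuracies. The function names, the short class names and the seed are hypothetical.

```python
import random


def split_per_class(labels, train_fraction=0.6, seed=0):
    """Randomly assign sample indices to train/test sets, class by class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


def per_class_accuracy(true_labels, predicted_labels):
    """Return {class: accuracy} and the mean of the per-class accuracies."""
    correct, total = {}, {}
    for t, p in zip(true_labels, predicted_labels):
        total[t] = total.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (t == p)
    accuracy = {c: correct[c] / total[c] for c in total}
    return accuracy, sum(accuracy.values()) / len(accuracy)


# Four classes with 1150 samples each, as in the dataset description.
CLASSES = ["corona", "surface", "floating", "particle"]
labels = [c for c in CLASSES for _ in range(1150)]
train_idx, test_idx = split_per_class(labels)
print(len(train_idx), len(test_idx))
```

With 1150 images per class, this split gives 690 training and 460 test images per class.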

Conclusions
In this paper, we have presented an approach to partial discharge pattern classification. Our approach first reconstructs the PRPD matrix from the PRPD2D images collected by PD-Detectors; then 18 statistical features are extracted from the recovered PRPD matrix and fed into an SVM for classification. We have validated our approach on a dataset containing 4600 images, on which it reaches a mean classification accuracy of 0.9158.