Modeling and Algorithm Research on Identification of Wrong Wiring Power Supply Region based on Classification Analysis

Wrong wiring in a power supply region directly affects the accuracy of its metering data. Based on the existing data of power supply regions, this paper defines four indicators: the minimum negative line loss rate, the maximum positive line loss rate, the minimum power factor and the maximum number of negative power values. An identification model for wrong wiring regions is then built by applying the decision tree algorithm to the four indicators of historical wrong wiring regions before and after rectification. The model was used to screen all regions of a power company, and the regions it identified as wrong wiring were inspected on-site. The inspection results show a precision between 72.40% and 96.02%, increasing with the wrong wiring probability output by the model, so the model can effectively improve the efficiency of inspecting and rectifying wrong wiring regions.


Introduction
Electric power data of a power supply region, such as current, voltage, power, electrical energy supply and electricity consumption, are collected by the electric energy metering device. These data are the foundation for power supply quality monitoring, line loss assessment and other management work of the power company, and they have strongly supported the improvement of its service level. With the rapid development of the economy and technology, and in order to meet the increasing demands of power supply service and power company management, more and more data indicators are collected in each power supply region, and the collection accuracy is constantly improving. As a result, a power supply region contains more and more components, its lines become more and more complex, and the lines of different components are interlaced, which makes the region prone to wrong wiring. Wrong wiring directly affects the accuracy of the metering data of the region, causes property losses to the power company, and degrades its power supply service level and management level [1][2]. As China has a vast territory and a large population, there are a massive number of power supply regions and they are widely distributed. Inspecting the wiring of the electric energy metering devices of all these regions therefore requires a lot of manpower and material resources. At the same time, as a power supply region develops, its connections may change and a normal region may become a wrong wiring region, so wrong wiring inspections need to be carried out regularly.

Related Work
Electric power companies and related scientific research institutions have studied the remote and automatic identification of wrong wiring of electric energy metering devices. Chen Xiao et al. judge the wiring relationship of the metering device by rotating the phasor diagram between the metering elements of the electricity meter, so as to identify wrong wiring of the electricity meter rapidly and accurately [3]. Zhang Qiang et al. identify wrong wiring by calculating the probability distribution of the phase difference angle between voltage and current, and describe the construction of a wrong wiring identification software system [4]. By summarizing the data needed to generate the vector graph of the metering device in a power supply region, Zhang Sheguo et al. put forward a method for remote diagnosis of wrong wiring that draws the vector graph from the data of the power supply region [5]. The methods in [3]-[5] identify wrong wiring mainly from the index characteristics that metering devices show under theoretical wrong wiring conditions. They require many indicators to be collected and involve complicated calculation, and they have difficulty finding wrong wiring situations that fall outside the theory. Yang Pei et al. put forward a correlation analysis method for abnormal metering events, which identifies wrong wiring by analyzing the abnormal events related to it, but this method can only analyze abnormal events that have historically occurred together with wrong wiring [6]. Based on ADSP-BF537 logic control and pattern recognition, Xu Jinliang et al. designed an on-site automatic judgment system for power grid wrong wiring; this system has to be used on site in the power supply region and cannot be used for remote identification [7]. Wang Hongxi et al. proposed a method to identify wrong wiring of electricity meters based on HPLC technology, but HPLC electricity meters are still being rolled out and the method requires additional data collection, so it is difficult to apply in practice at present [8].
In view of these problems in the research on wrong wiring identification of electric energy metering devices, this paper proposes an identification method for wrong wiring regions based on classification analysis. First, a modeling data set is constructed from the daily line loss rate, daily power factor and daily power curve data collected and calculated by the existing power information acquisition system. Based on this data set, a decision tree classification algorithm is used to construct the identification model of wrong wiring regions. The model is then used to identify the wrong wiring regions of a certain power company, and on-site inspection is carried out according to the identification result. The on-site inspection result shows that the proposed method has high precision and can effectively support the power company in inspecting and rectifying wrong wiring regions. Moreover, the method relies only on the existing data of the power information acquisition system, and the resulting identification model is easy to understand.

Algorithm Introduction
The identification of wrong wiring regions is a typical binary classification problem. Commonly used classification algorithms include decision tree classification, Bayesian classification, support vector machine classification, logistic regression and so on. Decision tree classification is one of the most widely used classification algorithms: it is easy to understand, interpretable, easy to visualize, and works with small training samples. Because the number of wrong wiring regions detected historically is small, and because the identification model should be easy for front-line business personnel to understand and use, this paper uses the decision tree algorithm to build the identification model of wrong wiring regions. The concept of the decision tree first appeared in CLS (Concept Learning System) [9]. Among decision tree classification algorithms, the most influential is ID3, presented by Quinlan in 1986 [10]. Based on ID3, Quinlan presented the C4.5 algorithm in 1993 [11], which became the most commonly used decision tree algorithm. C4.5 uses the information gain ratio to choose the splitting attribute of each node, which overcomes the tendency of ID3 to favor attributes with many values. The construction procedure of the C4.5 algorithm is as follows:
1. The given sample set is taken as the root node of the decision tree.
2. The information gain ratio of every attribute in the sample set is calculated, and the attribute with the largest ratio is chosen as the partition attribute of the current node.
3. A branch is established for every value of the partition attribute, the sample set is partitioned into subsets accordingly, and a new node is established for each subset.
Steps 2 and 3 are repeated recursively for each new node until every node meets one of the following conditions: 1) all samples in the training sample set corresponding to the node belong to the same category; 2) the attribute values of all samples in the training sample set corresponding to the node are identical.
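The attribute selection in step 2 can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: it handles discrete attributes only (C4.5 treats a continuous attribute by first turning it into a threshold test), and the function names and the toy attributes "neg_power" and "id" are hypothetical.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain ratio of one discrete attribute.

    values[i] is the attribute value of sample i, labels[i] is its class.
    """
    total = len(labels)
    # Group the class labels by attribute value (one group per branch).
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information grows with the number of distinct attribute values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, labels):
    """Return the attribute name with the largest gain ratio (step 2)."""
    return max(rows[0], key=lambda a: gain_ratio([r[a] for r in rows], labels))


# Toy data (hypothetical): "id" is unique per sample, "neg_power" is informative.
rows = [
    {"neg_power": "many", "id": "a"},
    {"neg_power": "many", "id": "b"},
    {"neg_power": "few", "id": "c"},
    {"neg_power": "few", "id": "d"},
]
labels = ["Yes", "Yes", "No", "No"]
chosen = best_attribute(rows, labels)
```

Both toy attributes separate the classes perfectly, so plain information gain cannot tell them apart. The gain ratio divides by the split information, which is larger for the four-valued "id" attribute, so "neg_power" is chosen.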

Data Sets Building Processes
The influence of wrong wiring on the indexes of a power supply region is mainly reflected in abnormal line loss, a low power factor and negative power values in the power curve. These effects on the line loss rate, power factor and power curve may not appear on the same day. Moreover, because of the complex site environment and human factors, a region may not be in the wrong wiring state continuously, that is, not on every day. To reduce the impact of these unsynchronized effects and of the fluctuating wrong wiring state, this paper uses the line loss rate, power factor and number of negative power values in the power curve of each region for the 10 days before the wrong wiring rectification (i.e. in the wrong wiring state) and the 10 days after it (i.e. in the non-wrong wiring state) to construct the modeling data set of the classified identification model. Define the set of historical wrong wiring power supply regions as $A = \{a_1, a_2, \ldots, a_k\}$, where k is the number of historical wrong wiring regions. The line loss rate, power factor and number of negative power values in the power curve of the wrong wiring region $a_i$ in the 10 days before rectification are defined as in equations (1), (2) and (3). The same three quantities of region $a_i$ in the 10 days after rectification are defined as in equations (4), (5) and (6). According to the data and business meaning of these quantities, the line loss rate can be positive or negative, and a larger absolute value of the line loss rate, a smaller power factor, and a larger number of negative power values each indicate that the region is more abnormal.
In this paper, the modeling data set of the classified identification model is constructed from four indicators: the minimum negative line loss rate, the maximum positive line loss rate, the minimum power factor and the maximum number of negative power values. The four indicators are calculated from the data of the historical wrong wiring regions in the 10 days before and the 10 days after rectification. For the region $a_i$ before rectification, the four indicators are calculated as follows:
1. the minimum negative line loss rate $minLL_{a_i}$, as in equation (7);
2. the maximum positive line loss rate, as in equation (8);
3. the minimum power factor $minPF_{a_i}$, as in equation (9);
4. the maximum number of negative power values, as in equation (10).
From the four indicators of the k historical wrong wiring regions before and after rectification, a modeling data set containing 2k records is obtained, as shown in Table 1. A record computed from the days before rectification is labeled "Yes", i.e. the region is in the wrong wiring state. A record computed from the days after rectification is labeled "No", i.e. the region is in the non-wrong wiring state.
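The construction of the modeling data set can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's implementation: the field names are hypothetical, the sample values are made up, the windows hold 3 days instead of 10 for brevity, and 0.0 is assumed as the indicator value when no negative (or no positive) line loss rate occurs in the window, a case the text does not specify.

```python
def region_indicators(line_loss, power_factor, neg_power_counts):
    """Compute the four indicators of one region over a window of days.

    Each argument is a list with one value per day:
    line_loss in percent, power_factor in [0, 1],
    neg_power_counts = number of negative values in that day's power curve.
    """
    negatives = [x for x in line_loss if x < 0]
    positives = [x for x in line_loss if x > 0]
    return {
        # Assumption: 0.0 is used when the window has no negative/positive rate.
        "min_neg_ll": min(negatives, default=0.0),
        "max_pos_ll": max(positives, default=0.0),
        "min_pf": min(power_factor),
        "max_neg_cnt": max(neg_power_counts),
    }


def build_records(regions):
    """Turn k rectified regions into 2k labeled records."""
    records = []
    for region in regions:
        # "before" rectification -> wrong wiring ("Yes"); "after" -> "No".
        for state, label in (("before", "Yes"), ("after", "No")):
            row = region_indicators(*region[state])
            row["label"] = label
            records.append(row)
    return records


# One made-up region with 3-day windows (the paper uses 10 days).
regions = [
    {
        "before": ([-120.0, 5.0, -3.0], [0.62, 0.91, 0.88], [4, 0, 1]),
        "after": ([2.1, 3.4, 1.9], [0.95, 0.97, 0.96], [0, 0, 0]),
    }
]
records = build_records(regions)
```

Each region contributes exactly two records, so k regions yield the 2k records described above.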

Model Building Processes
In this paper, the four indicators before and after rectification (the minimum negative line loss rate, maximum positive line loss rate, minimum power factor and maximum number of negative power values) are calculated from the line loss rates, power factors and power curves of the historical wrong wiring regions in the 10 days before and the 10 days after rectification. The modeling data set is constructed from these indicators and is input into the C4.5 algorithm to obtain the decision tree.

Model Building
In this paper, the data of the 10 days before and the 10 days after rectification of 1200 historical wrong wiring regions in a certain power company are used to construct a modeling data set with 2400 records for the classified identification model. The decision tree classification model for wrong wiring region identification is then built from the modeling data set with the C4.5 algorithm. As the resulting decision tree is large and inconvenient to display, it is sorted into the identification rules of wrong wiring regions shown in Table 2. Table 2 shows that the line loss rate plays the greatest role in identifying whether a region is a wrong wiring region. When the minimum negative line loss rate is less than -100%, all regions are wrong wiring regions; when the maximum positive line loss rate is more than 100%, the probability of the region being a wrong wiring region is 85.71%; and when the minimum negative line loss rate is >= -7.16% and the maximum positive line loss rate is <= 9.13%, the region is basically not a wrong wiring region. In addition, a region is more likely to be a wrong wiring region when the maximum number of negative power values in its power curve is more than 2, or when its minimum power factor is less than 0.8575. When the share of before-rectification regions under a rule exceeds 50%, the model classifies the regions meeting the rule as wrong wiring regions; otherwise it classifies them as non-wrong wiring regions. Under the model in Table 2, the 1200 before-rectification records are classified as 1065 wrong wiring regions (classified correctly) and 135 non-wrong wiring regions (classified incorrectly), and the 1200 after-rectification records are classified as 124 wrong wiring regions (classified incorrectly) and 1076 non-wrong wiring regions (classified correctly). Therefore, on the modeling data set, the accuracy of the model is 89.21%, the precision is 89.57%, the recall is 88.75% and the F1-score is 89.16%, indicating that the model has good classification performance.
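The four reported metrics follow from the classification counts above. A short Python check, using the standard definitions of the metrics (the function name is illustrative):

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Before rectification: 1065 classified as wrong wiring (TP), 135 missed (FN).
# After rectification: 124 classified as wrong wiring (FP), 1076 correct (TN).
acc, prec, rec, f1 = binary_metrics(tp=1065, fn=135, fp=124, tn=1076)
print(f"{acc:.2%} {prec:.2%} {rec:.2%} {f1:.2%}")
# prints: 89.21% 89.57% 88.75% 89.16%
```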

Model Application
The line loss rate, active power curve and power factor data of 55,814 power supply regions in a power company from May 1 to 10 were processed with the four indicator equations given in the Data Sets Building Processes section (minimum negative line loss rate, maximum positive line loss rate, minimum power factor and maximum number of negative power values), giving an input data set of 55,814 records for the classified identification model. The decision tree classification model obtained in the Model Building section was then applied to identify wrong wiring regions, and the identification result for the 55,814 regions is shown in Table 3. The wrong wiring probability in Table 3 is the share of before-rectification regions under the matching rule in Table 2.
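How a rule yields a wrong wiring probability and a class can be sketched as follows. The counts in the example are hypothetical (Table 2 is not reproduced here), and the function names are illustrative only.

```python
def rule_probability(n_before, n_after):
    """Share of before-rectification records among the records matching a rule."""
    return n_before / (n_before + n_after)


def rule_class(n_before, n_after, threshold=0.5):
    """Classify as wrong wiring ("Yes") when the share exceeds the threshold."""
    return "Yes" if rule_probability(n_before, n_after) > threshold else "No"


# Hypothetical rule matched by 6 before-rectification and 1 after-rectification records.
p = rule_probability(6, 1)
label = rule_class(6, 1)
print(f"{p:.2%} {label}")
# prints: 85.71% Yes
```

A rule matched by equal numbers of before- and after-rectification records has a share of exactly 50%, which does not exceed the threshold, so its regions are classified as non-wrong wiring.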
According to the identification result in Table 3, the power company conducted on-site inspection of the 2709 regions whose wrong wiring probability is more than 75%, and the on-site inspection result is shown in Table 4. Table 4 shows that 314 of the 327 regions with a wrong wiring probability equal to 100% were verified as wrong wiring regions (a precision of 96.02%), 1520 of the 1661 regions with a wrong wiring probability exceeding 90% but less than 100% were verified as wrong wiring regions (a precision of 91.51%), and 522 of the 721 regions with a wrong wiring probability greater than or equal to 75% but less than 90% were verified as wrong wiring regions (a precision of 72.40%). The on-site inspection result shows that the identification method proposed in this paper has high precision. It raises the hit rate of on-site inspection, and therefore improves the efficiency of inspecting and rectifying wrong wiring regions and significantly reduces the manpower and material resources required. In addition, the higher the wrong wiring probability output by the identification model, the higher the precision of on-site inspection. Therefore, given the wrong wiring probability output by the model and the manpower and material resources available to the power company, it is reasonable to schedule the inspection and rectification of wrong wiring regions in descending order of probability.
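The precision of each probability band and a probability-ordered inspection plan can be sketched in Python. The band counts are those reported above; the region identifiers, probabilities and capacity in the planning example are hypothetical.

```python
# (probability band, regions inspected, regions verified as wrong wiring)
bands = [
    ("= 100%", 327, 314),
    ("> 90% and < 100%", 1661, 1520),
    (">= 75% and < 90%", 721, 522),
]

band_precision = {name: verified / inspected for name, inspected, verified in bands}
for name, value in band_precision.items():
    print(f"{name}: {value:.2%}")
# prints:
# = 100%: 96.02%
# > 90% and < 100%: 91.51%
# >= 75% and < 90%: 72.40%


def inspection_plan(regions, capacity):
    """Pick up to `capacity` regions, highest wrong wiring probability first."""
    ranked = sorted(regions, key=lambda r: r["probability"], reverse=True)
    return [r["region_id"] for r in ranked[:capacity]]


# Hypothetical model output for three regions, with capacity to inspect two.
plan = inspection_plan(
    [
        {"region_id": "R1", "probability": 0.80},
        {"region_id": "R2", "probability": 1.00},
        {"region_id": "R3", "probability": 0.93},
    ],
    capacity=2,
)
```

With limited inspection capacity, the plan visits the regions most likely to be confirmed first, which is where the reported precision is highest.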

Conclusion
In this paper, four indicators are defined by analyzing the business meaning of the line loss rate, power factor and power curve data of power supply regions collected and calculated by the power information acquisition system: the minimum negative line loss rate, the maximum positive line loss rate, the minimum power factor and the maximum number of negative power values. The classified identification model for wrong wiring regions is built from these four indicators with the decision tree algorithm. All the regions of a power company are screened by the model, and the regions with a high wrong wiring probability are inspected on-site. The on-site inspection result shows that the identification method proposed in this paper has high precision and can effectively improve the efficiency of inspecting and rectifying wrong wiring regions, thus reducing the manpower and material resources required by the power company. In addition, the region data used by the method are simple and require no new data collection, and the identification model is easy to understand, deploy and popularize.