Power User Sensitivity Analysis and Power Outage Complaint Prediction

The sensitivity of power users is an important basis for improving the quality of power customer service and refining its content. To improve the accuracy of power user sensitivity classification, this paper improves the decision tree algorithm with the ant colony algorithm, builds a power user sensitivity analysis model, and verifies its effectiveness through simulation experiments, providing a stronger data reference for work such as improving the quality of power user services. Building on the sensitivity analysis and on characteristic data related to power outages, the paper then predicts the probability of outage complaints from power users, with the aim of providing a data reference for improving user satisfaction and reducing outage complaints.


Introduction
In recent years, as the reform of the national power system has continued to advance, the requirements on the service level for power customers have also risen. Power user sensitivity is important supporting data for raising that service level. With a grasp of user sensitivity, plans such as distribution network architecture optimization can be formulated more rationally and in a more targeted way, and the service content for power users can be refined.
Current research on the sensitivity of power users mainly builds computational models for analysis, using methods such as logistic regression and entropy-based methods. To improve the accuracy of power user sensitivity classification, this paper improves the decision tree algorithm with the ant colony algorithm and uses it to construct a power user sensitivity analysis model [1][2][3].

Data screening
Feedback on power-related services in the country currently comes mainly from the customer service call system. However, the number of power users is very large, the customer service call data differ considerably from user to user, and most user-related data are unevenly distributed. Considering these characteristics, this paper selects data from the marketing management system of a power supply bureau for 2018 to 2019 for analysis. In addition, the collected high-dimensional raw business data contain many attributes that have little or no relevance to the sensitivity of power users, and may also contain duplicate, incomplete, or invalid records. The raw data therefore need to be filtered to ensure the efficiency of analysis and prediction [4][5][6].
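The screening step can be illustrated with a minimal sketch. The field names (`user_id`, `calls`, `arrears`) and the rule that a record is incomplete when a required field is missing or empty are assumptions made for illustration; the paper does not specify the attributes or the filtering rules it used.

```python
def screen_records(records, required_fields):
    """Drop incomplete and duplicate records, keeping the first occurrence."""
    seen = set()
    cleaned = []
    for rec in records:
        # Incomplete: a required field is absent, None, or an empty string.
        if any(rec.get(f) in (None, "") for f in required_fields):
            continue
        # Duplicate: same values on all required fields as an earlier record.
        key = tuple(rec[f] for f in required_fields)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Hypothetical raw records, not taken from the paper's data set.
raw = [
    {"user_id": "U1", "calls": 3, "arrears": 0},
    {"user_id": "U1", "calls": 3, "arrears": 0},     # duplicate
    {"user_id": "U2", "calls": None, "arrears": 1},  # incomplete
    {"user_id": "U3", "calls": 7, "arrears": 2},
]
cleaned = screen_records(raw, ["user_id", "calls", "arrears"])
```

Attributes with little relevance to sensitivity would be dropped in a separate feature-selection step, which this sketch does not cover.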

Data standardization
The filtered data still differ in dimension and order of magnitude and must be standardized so that variables with large numeric ranges do not dominate the analysis. Common standardization methods include min-max standardization, Z-score standardization, and decimal scaling. Considering the advantages of Z-score standardization for multi-dimensional data, this paper uses the Z-score method, which is based on the mean and standard deviation of the original data. The calculation formula is as follows:

$$x' = \frac{x - \mu}{\sigma} \tag{1}$$

where $x$ is an original value, and $\mu$ and $\sigma$ are the mean and standard deviation of the corresponding variable in the original data.

After standardization, a new combined data set containing several variables is generated, one of which is the label variable. The label takes the values 0 and 1, indicating insensitive and sensitive users respectively.
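As a minimal sketch of Z-score standardization for a single variable: the population standard deviation is assumed here (the paper does not say whether the population or sample form was used), and the variable name `monthly_kwh` and its values are purely illustrative.

```python
import math


def z_score(values):
    """Standardize a list of numbers to zero mean and unit standard deviation."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation (divides by n); an assumption of this sketch.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        # A constant variable carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical monthly consumption values for five users.
monthly_kwh = [120.0, 150.0, 180.0, 210.0, 240.0]
standardized = z_score(monthly_kwh)
```

After the transformation, each variable is expressed in units of its own standard deviation, so attributes measured on different scales become comparable.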

Principle of decision tree
The tree model is a widely used machine learning model, and the decision tree is its core classifier. It has two basic structures, the binary tree and the multiway tree, as shown in Figure 1. The decision tree represents the mapping between object attributes and object values. In data mining, the decision tree serves as an analysis and prediction model.
Decision tree classification starts from the root node. Each internal node represents an object attribute: during classification, a variable in the data is compared against the feature at that node. Each branch represents a possible attribute value, that is, one outcome of the comparison between the variable and the feature node. Each leaf node represents the object value corresponding to the path from the root node to that leaf, and the final result of classification is the leaf node reached, which represents a class distribution [7][8][9].
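The root-to-leaf classification process can be sketched as follows. The nested-dictionary tree layout and the attribute names (`complaint_history`, `calls_last_year`) are hypothetical and chosen only to illustrate the traversal; they are not the paper's model.

```python
def classify(node, sample):
    """Walk from the root to a leaf, following the branch that matches each tested attribute."""
    while "label" not in node:
        value = sample[node["attribute"]]  # compare the sample with the node's feature
        node = node["branches"][value]     # follow the branch for that attribute value
    return node["label"]


# A small hand-built multiway tree: 1 = sensitive, 0 = insensitive.
tree = {
    "attribute": "complaint_history",
    "branches": {
        "yes": {"label": 1},
        "no": {
            "attribute": "calls_last_year",
            "branches": {"high": {"label": 1}, "low": {"label": 0}},
        },
    },
}

result = classify(tree, {"complaint_history": "no", "calls_last_year": "low"})
```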
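The decision tree classification process consists of two parts: building the classification and regression tree, and pruning. The tree is built as follows. For a data set $D$ containing $m$ attributes, let the data fall into categories $C_i$, and let $P_i$ be the probability that a sample belongs to category $C_i$. Then, in the standard CART form:

$$Gini(D) = 1 - \sum_{i} P_i^{2}$$

After $D$ is split on attribute $A$ into subsets $D_1$ and $D_2$, the Gini index is:

$$Gini_A(D) = \frac{|D_1|}{|D|}\,Gini(D_1) + \frac{|D_2|}{|D|}\,Gini(D_2)$$

The model needs both low bias and low variance, which usually means a compromise between the two, so the tree is pruned to improve the classification effect. In the standard cost-complexity form:

$$\alpha = \frac{R(t) - R(T_t)}{|N_{T_t}| - 1}$$

In the formula, $R(t)$ is the error cost of pruning node $t$, and $R(T_t)$ is the error cost of the subtree $T_t$ rooted at $t$ when it is not pruned; $|N_{T_t}|$ is the number of leaf nodes of $T_t$.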
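A minimal sketch of the Gini computation, assuming the standard CART definitions (impurity of a node as one minus the sum of squared class proportions, and the impurity of a binary split as the size-weighted average of the two children). The label lists are made-up examples.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a set of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(left, right):
    """Size-weighted Gini impurity after a binary split into `left` and `right`."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical labels: 1 = sensitive, 0 = insensitive.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
parent = gini(labels)                           # impurity before the split
child = gini_split([1, 1, 1, 0], [0, 0, 0, 0])  # impurity after the split
```

A split is worth making when `child` is lower than `parent`; the attribute and threshold giving the lowest post-split impurity are chosen at each node.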

Ant colony algorithm
The ant colony algorithm simulates how ants searching for food find an optimal path. Its mathematical model is as follows:
(1) Before starting, parameters such as the task array and the node array are initialized. Suppose the target ant colony $Y(0)$ contains $M$ ants, the maximum number of iterations is $maxgen$, and the number of nodes is $n$. The fitness $fit_i$ of the $i$-th ant is evaluated with a fitness function of $O_j$ and $T_j$, where $O_j$ is the predicted output value of the $j$-th sample and $T_j$ is the target output value of the $j$-th sample.
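(2) Each ant selects its next node according to the transition probability:

$$p_{ij}^{k} = \begin{cases} \dfrac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{s \in J_k(i)} \tau_{is}^{\alpha}\,\eta_{is}^{\beta}}, & j \in J_k(i) \\ 0, & \text{otherwise} \end{cases} \tag{10}$$

where $\tau_{ij}$ is the pheromone between $i$ and $j$, $\eta_{ij}$ is the heuristic factor, $\alpha$ is the weight of the pheromone, $\beta$ is the weight of the heuristic factor, and $J_k(i)$ is the set of nodes selectable by ant $k$.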
Record the location and path of the ants, and select the optimal path. When the maximum number of iterations is reached or the error threshold is reached, the iteration is stopped and the result is output, otherwise the loop continues.
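The node-selection step can be sketched as below, assuming the standard ant colony transition rule in which the probability of moving from node i to node j is proportional to pheromone raised to the power alpha times the heuristic factor raised to the power beta. The function names, the nested-dictionary layout of `tau` and `eta`, and the numeric values are illustrative assumptions, not the paper's implementation.

```python
import random


def transition_probabilities(i, allowed, tau, eta, alpha=1.0, beta=2.0):
    """Probability of moving from node i to each node in `allowed`."""
    weights = {j: (tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in allowed}
    total = sum(weights.values())
    if total == 0:
        # No pheromone or heuristic signal: fall back to a uniform choice.
        return {j: 1.0 / len(weights) for j in weights}
    return {j: w / total for j, w in weights.items()}


def choose_next(probs, rng=random):
    """Roulette-wheel selection of the next node from a probability table."""
    r = rng.random()
    cumulative = 0.0
    chosen = None
    for node, p in probs.items():
        cumulative += p
        chosen = node
        if r <= cumulative:
            break
    return chosen  # the last node is returned if round-off leaves r above the sum


# Hypothetical pheromone and heuristic values for moves out of node 0.
tau = {0: {1: 1.0, 2: 2.0}}
eta = {0: {1: 1.0, 2: 1.0}}
probs = transition_probabilities(0, [1, 2], tau, eta, alpha=1.0, beta=2.0)
next_node = choose_next(probs, random.Random(0))
```

In a full run, each of the M ants would repeat this choice until its path is complete, after which pheromone is updated and the loop continues until `maxgen` or the error threshold is reached.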

Improved decision tree based on ant colony algorithm
To improve the accuracy of power user sensitivity classification, this paper uses the ant colony algorithm to improve the decision tree algorithm. The improvement covers the discretization of attributes, the heuristic information, and the pheromone.
(1) Discretization of attributes: Take the minimum value $MIN$ and the maximum value $MAX$ of the attribute, and divide the interval $[MIN, MAX]$ into $N$ equal parts, which gives the cut points $MIN + k\,(MAX - MIN)/N$, $k = 1, \ldots, N-1$. Repeat the segmentation until the optimal global threshold is obtained.
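(2) Heuristic information: pheromone and information gain are combined as the heuristic information. In its standard form, the information gain is expressed as:

$$Gain(D, A) = H(D) - \sum_{v} \frac{|D_v|}{|D|}\,H(D_v), \qquad H(D) = -\sum_{i} P_i \log_2 P_i$$

where $H(D)$ is the entropy of data set $D$, $P_i$ is the proportion of samples in category $C_i$, and $D_v$ is the subset of $D$ in which attribute $A$ takes its $v$-th value.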
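The discretization and information-gain steps can be sketched together. This is a simplified illustration under stated assumptions: cut points are the equal-width boundaries between MIN and MAX, entropy is the usual base-2 Shannon entropy, and the threshold is picked by information gain alone. How the paper combines information gain with pheromone is not recoverable from the text, so that part is omitted. The sample values are hypothetical.

```python
import math
from collections import Counter


def equal_width_cut_points(values, n_parts):
    """Interior boundaries when [MIN, MAX] is divided into n_parts equal parts."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_parts
    return [lo + k * width for k in range(1, n_parts)]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a continuous attribute at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty gains nothing
    n = len(labels)
    return (entropy(labels)
            - len(left) / n * entropy(left)
            - len(right) / n * entropy(right))


# Hypothetical attribute values and sensitivity labels.
values = [0.0, 1.0, 2.0, 3.0, 5.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
cuts = equal_width_cut_points(values, 4)
best = max(cuts, key=lambda t: information_gain(values, labels, t))
```

Here the cut at 4.0 separates the two classes completely, so it has the highest gain among the candidates.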

Algorithm verification
In order to verify the effectiveness of the improved decision tree model proposed in this paper, some data in the marketing system of a power supply bureau were selected for statistical and simulation comparison experiments. The experimental results are shown in Figure 2.

Figure 2. Simulation and comparison experiment results
The simulation and comparison results show that, compared with the support vector machine (SVM) and the traditional decision tree algorithm, the improved decision tree algorithm based on the ant colony algorithm proposed in this paper achieves good classification accuracy, which supports the effectiveness of the method.

Sensitivity analysis of power users
In this paper, residential users in a certain area are taken as the object of study, the corresponding power supply data are used as sample data, and the improved decision tree method based on the ant colony algorithm is compared experimentally with other methods for power user sensitivity analysis. The evaluation criteria are precision P, recall R, their harmonic mean F, the false positive rate FPR, and the false negative rate FNR. The experimental results are shown in Table 1. According to Table 1, compared with the support vector machine (SVM) and the traditional decision tree algorithm, the improved decision tree method based on the ant colony algorithm not only achieves a high recall rate but also keeps both the false positive rate and the false negative rate relatively low.
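The five evaluation criteria can be computed from the confusion matrix of a binary classifier, as in the sketch below. It assumes the usual definitions, with label 1 (sensitive) as the positive class and F taken as the harmonic mean of P and R. The label vectors are invented for illustration and are not the paper's results.

```python
def evaluate(y_true, y_pred):
    """Precision, recall, F (harmonic mean), FPR and FNR for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    fpr = fp / (fp + tn) if fp + tn else 0.0  # insensitive users flagged as sensitive
    fnr = fn / (fn + tp) if fn + tp else 0.0  # sensitive users that were missed
    return {"P": precision, "R": recall, "F": f, "FPR": fpr, "FNR": fnr}


# Hypothetical ground truth and predictions for ten users.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = evaluate(y_true, y_pred)
```

Note that FNR equals 1 minus R, so a high recall and a low false negative rate are two views of the same quantity.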

Forecast of power outage complaints by power users
After statistically analyzing complaint data from power users in a certain area, this paper found that complaints related to power outages, such as frequent planned outages and early or delayed restoration of power supply, accounted for about 45% of all complaints. In addition to complaints made during an outage, some complaints are made in the period after power is restored, including complaints about customer service attitude and dissatisfaction with the explanation of the outage. The specific statistical results are shown in Figure 3. As Figure 3 shows, about 80% of the related complaints that occurred after power was restored in this area were concentrated in the first 10 hours after restoration. Based on this, the paper screened out non-standard data, selected the outage data of the area for the past two years as the training set, used whether a related complaint occurred in the first 10 hours after power was restored as the label, and determined the other relevant feature values, in order to forecast outage complaints from residents in the area. The prediction results are shown in Table 2. According to Table 2, the improved decision tree algorithm proposed in this paper shows good performance in the prediction of power outage complaints.
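The labeling rule described above, whether a related complaint occurred within the first 10 hours after power was restored, can be sketched as follows. The function name, the inclusive window boundaries, and the timestamps are assumptions made for illustration; the paper does not describe its data schema.

```python
from datetime import datetime, timedelta

# Window taken from the text: the first 10 hours after power is restored.
WINDOW = timedelta(hours=10)


def label_outage(restored_at, complaint_times, window=WINDOW):
    """Return 1 if any complaint falls within `window` after restoration, else 0."""
    deadline = restored_at + window
    return int(any(restored_at <= t <= deadline for t in complaint_times))


# Hypothetical outage record: power restored at 14:00, one complaint 2.5 hours later.
restored = datetime(2019, 7, 1, 14, 0)
complaints = [datetime(2019, 7, 1, 16, 30), datetime(2019, 7, 3, 9, 0)]
label = label_outage(restored, complaints)
```

Each outage event labeled this way, together with its other feature values, would form one training sample for the complaint prediction model.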

Conclusions
This paper studies the classification model of power customer sensitivity. Starting from the need to improve the performance of the algorithm, a new classification model and calculation method are proposed, and the validity is confirmed through algorithm verification experiments [10].
In addition, the data analysis shows that the period shortly after power is restored sees a high incidence of outage-related complaints, and the features of the outage complaint prediction data are refined on that basis. The comparison of experimental results shows that the improved decision tree proposed in this paper can meet the requirements of power outage complaint prediction and has better performance.