Anomaly attribute mining method for topological nodes of power system based on graph theory

Power system data are large in scale, high in dimension and time-series in nature, and normal samples far outnumber abnormal ones. The resulting sample imbalance makes it difficult to mine the abnormal attributes of nodes. To address this, a graph-theory-based method for mining abnormal attributes of topological nodes in power systems is developed. Graph theory is used to analyze the logical relations between nodes, from which directed and undirected graphs are obtained. The topology of the power system is then constructed, noise in the data is removed by an adaptive clustering algorithm, the weights of topological nodes are set, and abnormal or maliciously attacking topological nodes are discovered by the double threshold method. The experimental results show that the proposed method accurately detects abnormal topological nodes in the network and greatly reduces the energy consumption of neighbor topological nodes during mining.


Introduction
A power system consists of a large number of nodes, each of which represents a power device or process. Abnormal node attributes may cause failures, upset the balance of the system, and threaten its security and stability. It is therefore of great significance to study methods for mining the abnormal attributes of topological nodes in power systems. Traditional power system anomaly detection methods mainly monitor the physical parameters of the equipment, such as current and voltage. However, these methods cannot fully cover the complex topological relationships of the system or the interactions between nodes [1][2].
To solve this problem, a number of methods based on big data and regression equations have emerged in recent years. For example, Zhang et al. [3] use big data technology to design a data mining method for anomalies in power communication signals. Data association detection rules are introduced to complete the cyclic processing of power node big data coding and to detect nodes with abnormal characteristics. However, big data may contain noise, missing values or wrong data, which affects the accuracy of the mining results. Big data also often involves a large amount of sensitive information, such as personal identity and financial data, so protecting data privacy and ensuring data security is a challenge. Yang et al. [4] propose a method based on power big data distribution to dynamically detect abnormal loads in power grids. A nonlinear regression equation is used to estimate the central load weight and segment the detection area, which completes abnormal data mining and detection. However, regression equations usually rest on certain assumptions, such as linear correlation, whereas data in power systems often have complex nonlinear relationships that regression models may not capture accurately.
Therefore, a new method for mining abnormal attributes of topological nodes of power systems based on graph theory is proposed. The innovation mainly includes the following aspects:
(1) Graph theory is used to analyze the logical relationships between nodes of the power system, and directed and undirected graphs are used to build the network topology of the power system. Graph analysis of the connections between nodes captures abnormal properties and behaviors between nodes more comprehensively.
(2) An adaptive clustering algorithm is used to remove noise from power system data, thereby improving the accuracy of anomaly attribute mining. The adaptive clustering algorithm automatically adjusts the clustering parameters according to the distribution characteristics of the data, which helps it deal with sample imbalance and reduce misjudgment.
(3) Node weights are considered when constructing the network topology of the power system. Assigning different weights to nodes reflects their importance and their likelihood of being abnormal more accurately, which helps to improve the detection efficiency for abnormal topological nodes.
(4) A double threshold method is adopted to mine abnormal and maliciously attacking topological nodes. Setting two thresholds allows normal nodes and abnormal nodes to be clearly distinguished, and it balances the accuracy and efficiency of detection.

Analysis of logical relations between nodes based on graph theory
Graph theory provides a comprehensive and systematic way to analyze and describe the relationships between nodes, and it helps detect the abnormal properties of topological nodes. Graph theory is the branch of mathematics that deals with graphs and their properties, characteristics, and applications. In a power system, graph theory can be used to represent the connections between nodes and the transmission paths, so that the power system is abstracted into a graph. The nodes of the graph represent the topological nodes of the power system, while the edges represent the connections or transmission channels between them.
Graph-theory-based methods help to understand and analyze the topology of power systems from a global perspective, rather than focusing only on the properties of individual nodes. Analyzing the logical relationships between nodes can reveal hidden patterns, rules, and anomalies. This helps detect abnormal attributes such as node tampering, connection errors, and disconnections, as well as potential problems and risks between nodes.
The significance of graph theory here is that it provides a quantitative and visual way to explain and analyze the complexity of power system topology. In addition, graph algorithms (such as shortest path, connectivity, and cut set) can be applied to analyze the relationships between nodes and the reliability and robustness of paths. For mining the abnormal attributes of topological nodes of a power system, a graph-theory-based method models and analyzes the topological structure in order to identify possible abnormal conditions. For example, detecting changes in node degree, changes in connectivity, path length anomalies, and changes in network clustering can reveal abnormal node attributes that may cause faults or security risks through node tampering, connection errors, or disconnections.
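Two of the indicators named above, node degree changes and connectivity changes, can be illustrated with a minimal Python sketch. The four-bus ring, the function names, and the choice of breadth-first search are assumptions made for illustration and are not taken from the paper.

```python
from collections import deque


def node_degrees(nodes, edges):
    """Return the degree of every node in an undirected graph."""
    degree = {node: 0 for node in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def is_connected(nodes, edges):
    """Breadth-first search from one node; True if it reaches all nodes."""
    neighbors = {node: set() for node in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors[current] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return len(seen) == len(nodes)


def degree_changes(nodes, edges_before, edges_after):
    """Return {node: (old_degree, new_degree)} for nodes whose degree changed."""
    before = node_degrees(nodes, edges_before)
    after = node_degrees(nodes, edges_after)
    return {n: (before[n], after[n]) for n in nodes if before[n] != after[n]}


# Hypothetical four-bus ring; the line between bus 3 and bus 4 is then lost.
buses = [1, 2, 3, 4]
healthy = [(1, 2), (2, 3), (3, 4), (4, 1)]
faulted = [(1, 2), (2, 3), (4, 1)]
changed = degree_changes(buses, healthy, faulted)
still_connected = is_connected(buses, faulted)
```

In this toy case the degree of buses 3 and 4 drops while the network stays connected, so the degree indicator reacts to the lost line before the connectivity indicator does.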
This paper uses graph theory to express the logical relationships between nodes of the power system [5][6]. First, the graph is defined as a mathematical representation of the relationships between electrical nodes.
Let $R$ and $G$ denote the non-empty node set and the finite edge set, respectively; together they form a graph. The two sets are expressed as:
$$R = \{u_1, u_2, \ldots, u_n\}, \qquad G = \{e_1, e_2, \ldots, e_m\} \tag{1}$$
where $n$ and $m$ are the number of nodes and the number of edges, respectively.
Each finite edge is a two-element subset of the node set in Formula (1), written $\{u_i, u_j\}$. Associating the nodes of the non-empty node set in this way forms a set of interconnected edges, and hence a graph.
According to the attributes of these edges, graphs are divided into directed and undirected graphs. When node $u_i$ is the predecessor of node $u_j$ and the two-element subset forming the edge is ordered, the edge between the two nodes is directed. When all edges in the graph satisfy this relation, the graph is called a directed graph [7]. If the directed graph contains 3 nodes, each edge in the edge set is written as an ordered pair $\langle u_i, u_j \rangle$ with $i, j \in \{1, 2, 3\}$ and $i \neq j$. The definitions above (the graph, the difference between directed and undirected graphs, and the form of an edge) provide the basis for the graph-based equivalent representation of the power system that follows [8].
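The difference between an unordered two-element subset and an ordered pair can be made concrete with a short Python sketch. The node labels `u1` to `u3` and the helper names are illustrative assumptions; the sketch only mirrors the definitions in this section.

```python
# Node set R with n = 3 nodes (labels are illustrative).
R = ["u1", "u2", "u3"]

# Undirected edges: two-element subsets, so order does not matter.
undirected_edges = {frozenset(("u1", "u2")), frozenset(("u2", "u3"))}

# Directed edges: ordered pairs (predecessor, successor).
directed_edges = {("u1", "u2"), ("u2", "u3")}


def has_undirected_edge(a, b):
    """True if {a, b} is an edge, regardless of argument order."""
    return frozenset((a, b)) in undirected_edges


def has_directed_edge(pred, succ):
    """True only if the edge runs from pred to succ."""
    return (pred, succ) in directed_edges


def predecessors(node):
    """Nodes that have a directed edge pointing to `node`."""
    return sorted(p for p, s in directed_edges if s == node)
```

Here `has_undirected_edge("u2", "u1")` holds because the subset is unordered, whereas `has_directed_edge("u2", "u1")` does not, since only the pair from `u1` to `u2` is in the directed edge set.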

Topological structure
The points where lines converge in the power system are regarded as vertices, and the graph is constructed from these vertices and the edges that connect them. Among the elements of the graph, distinct points are set as vertices, and edges are ordered pairs of vertices. Any two points in the power system that are joined by an edge are adjacent [9]. In general, the power system network topology $G$ can be expressed as:
$$G = (V, E)$$
where $V$ is the set of all vertices and $E$ is the set of all edges. The adjacency matrix describes the adjacency relation $d_{ij}$ between vertices:
$$d_{ij} = \begin{cases} 1, & v_i \text{ and } v_j \text{ are adjacent} \\ 0, & \text{otherwise} \end{cases}$$
where $v_i$ and $v_j$ are the $i$-th and $j$-th vertices in the vertex set.
Power systems take many topologies, but whatever the network structure, each is composed of switches, overhead lines and other devices. It is therefore necessary to simplify the topology of the power system and obtain its directed topology [10].
Combining the above analysis, all intersection points in the power system are set as vertices, each edge is the feeder line between two adjacent vertices, and the whole power system is a graph. Introducing the adjacency matrix then describes the topology of the power system accurately.
In practice, the current direction is obtained by combining a current transformer and an ammeter. The initial value of each element in the topology matrix $D$ of the power system is set under normal working conditions. Current transformers are then added at both ends of the cross-domain grounding line of the power system to ensure that the elements of the topology matrix are symmetrical.
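The construction of the topology matrix and the symmetry check can be sketched in a few lines of Python. The sketch assumes a plain 0/1 adjacency matrix with 0-based vertex indices and a hypothetical four-vertex feeder; the actual element values used by the authors may differ.

```python
def adjacency_matrix(n_vertices, feeders, directed=False):
    """Build an n x n 0/1 matrix D from (i, j) feeder pairs (0-based indices)."""
    d = [[0] * n_vertices for _ in range(n_vertices)]
    for i, j in feeders:
        d[i][j] = 1
        if not directed:
            d[j][i] = 1  # an undirected feeder is recorded in both directions
    return d


def is_symmetric(matrix):
    """True if matrix[i][j] == matrix[j][i] for every pair of indices."""
    n = len(matrix)
    return all(
        matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i + 1, n)
    )


# Hypothetical radial feeder: v0 - v1 - v2, with a branch v1 - v3.
feeders = [(0, 1), (1, 2), (1, 3)]
D_undirected = adjacency_matrix(4, feeders)
D_directed = adjacency_matrix(4, feeders, directed=True)
```

The undirected matrix is symmetric by construction, while the directed one is not, which is one way to see why a symmetry requirement on the topology matrix has to be enforced explicitly once current direction is taken into account.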

Power system topology node adaptive clustering
In this study, an adaptive clustering algorithm based on set conversion is adopted to preprocess the data and remove the noise in the original data at its source, which also improves the efficiency of subsequent mining [11]. Suppose the filtering mechanism in the power system is 2 X , where a set of data consists of hash functions containing $n$ attribute features. The noise-filtering process for a topological node is shown in Figure 1.

Figure 1. Noise-filtering process for a topological node. Diagram labels: power node; a value of 1 is left unchanged; a value of 0 is replaced by 1.
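The text does not spell out the set-conversion step. One reading that is consistent with the rule "a value of 1 is left unchanged; a value of 0 is replaced by 1" is a Bloom-filter-style bit array driven by several hash functions. The Python sketch below illustrates only that reading: the class name, the array size, the number of hash functions, and the use of SHA-256 are all assumptions, and the clustering stage itself is not shown.

```python
import hashlib


class BitArrayFilter:
    """Bloom-filter-style membership filter (an assumed reading of the text)."""

    def __init__(self, size=256, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size

    def _positions(self, record):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        key = repr(tuple(record)).encode("utf-8")
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(bytes([salt]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, record):
        for pos in self._positions(record):
            # A bit that is 0 becomes 1; a bit that is already 1 stays 1.
            self.bits[pos] = 1

    def might_contain(self, record):
        # No false negatives; false positives are possible.
        return all(self.bits[pos] for pos in self._positions(record))


# Hypothetical record with n = 3 attribute features, registered as known noise.
noise_filter = BitArrayFilter()
noise_filter.add((230.0, 50.0, 1))
seen_before = noise_filter.might_contain((230.0, 50.0, 1))
```

Under this reading, records that match a registered pattern can be dropped before mining. Because such a filter can return false positives, a record flagged here would normally be confirmed by a second check before it is discarded.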

Abnormal topology node mining
A power system is exposed to many kinds of anomalies and malicious attacks. The double threshold algorithm is used to examine the network topology nodes of the power system and identify potentially abnormal or malicious behavior. The algorithm decides whether a node is abnormal by setting two thresholds, which screens out the nodes that may have problems so that they can be analyzed and processed further.
To ensure that normal topological nodes are not misjudged as abnormal or treated as attackers without cause, the concept of confidence is introduced. The reliability of a node can be evaluated from factors such as the historical behavior of normal nodes, the relationships between nodes, and the consistency of the data. A node with high confidence behaves according to the normal pattern and is unlikely to be abnormal or the object of an attack. Taking node reliability into account therefore improves the protection of normal nodes and reduces the misjudgment rate of the double threshold algorithm.
$w_{ij}^{0}$ and $w_{ij}^{1}$ denote the reliability weights of two related topological nodes: they represent the reliability that $v_i$ assigns to node $v_j$ in the transmission topology when there is no anomaly and when there is an anomaly, respectively. This weight prevents the transmitted signal from switching between the two states, which eliminates signal interference, and it is updated on the basis of the detection results [12]. If no anomaly occurs, a topology node that is abnormal, or that reports 1, loses weight. The updated weight is expressed in terms of a weight redundancy parameter and a weight compensation parameter.
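The interplay between the two thresholds and the reliability weights can be sketched in Python as follows. Because the weight formula is not available in the text, the additive update rule, the parameter names `penalty` and `reward`, the threshold values, and the three-way outcome are assumptions chosen only to show the mechanism.

```python
def weighted_score(reports, weights):
    """Weight-normalised share of neighbours reporting an anomaly (report == 1)."""
    total = sum(weights[n] for n in reports)
    if total == 0:
        return 0.0
    return sum(weights[n] * r for n, r in reports.items()) / total


def double_threshold(score, low, high):
    """Classify a score with two thresholds, where low < high."""
    if score >= high:
        return "abnormal"
    if score <= low:
        return "normal"
    return "uncertain"


def update_weights(reports, weights, decision, penalty=0.2, reward=0.1):
    """Lower the weight of reporters that contradict the decision, raise the rest."""
    if decision == "uncertain":
        return dict(weights)
    truth = 1 if decision == "abnormal" else 0
    updated = dict(weights)
    for node, report in reports.items():
        if report == truth:
            updated[node] = min(1.0, weights[node] + reward)
        else:
            updated[node] = max(0.0, weights[node] - penalty)
    return updated


# Hypothetical reports from four neighbours; node "d" reports 0 against the rest.
reports = {"a": 1, "b": 1, "c": 1, "d": 0}
weights = {"a": 0.5, "b": 0.5, "c": 0.5, "d": 0.5}
score = weighted_score(reports, weights)
decision = double_threshold(score, low=0.3, high=0.7)
new_weights = update_weights(reports, weights, decision)
```

With these assumed values the decision is "abnormal", and the node that reported 0 loses weight, so a node that repeatedly reports against the consensus has progressively less influence on later decisions.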
When an anomaly occurs, the transmission topology nodes within the affected range should report it as 1. Within that range, abnormal topology nodes and maliciously attacking topology nodes deliberately report the detection result as 0, which causes detection errors. By updating the trust degree of the topology on this basis, normal and abnormal nodes in the region are not easily confused, and the loss of normal transmission topology nodes is avoided. The weights of the nodes in the topology are updated accordingly.
In the detection of malicious attacks on topology nodes, the average period when no anomaly occurs is the average running period of topology nodes in the normal working state. A period is the time in which a node runs one complete loop, finishing a specific task or operation before the next cycle starts. Monitoring and analyzing the running period of a node makes it possible to determine whether the node behaves abnormally or is under malicious attack: if the running period deviates significantly from the normal average period, the node may have been interfered with or attacked, and further investigation and troubleshooting are required. Let the average period be $T$; $T$ can be derived from the given parameter values. Given the node weight reduction coefficient, the weight increase coefficient, and the node attack probability $p_{inv}$, the corresponding period under node attack, $T'$, is then computed.
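The period comparison described above can be illustrated with a minimal Python sketch. The relative tolerance of 20% and the sample periods are hypothetical; the text only states that a significant deviation from the average period indicates possible interference or attack.

```python
from statistics import mean


def average_period(normal_periods):
    """Average running period T observed while no anomaly is present."""
    return mean(normal_periods)


def period_deviates(observed, t_avg, tolerance=0.2):
    """True if the observed period differs from T by more than tolerance * T."""
    return abs(observed - t_avg) > tolerance * t_avg


# Hypothetical running periods (in seconds) recorded under normal operation.
T = average_period([10.0, 10.2, 9.8])
suspicious = period_deviates(13.0, T)
ordinary = period_deviates(10.5, T)
```

A node flagged by this check is a candidate for further investigation rather than a confirmed attacker, in line with the text.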

Validity of abnormal topology node determination
The experimental network consists of 3 clusters with 30 topological nodes in each cluster, and three types of anomaly are simulated. Type 1 anomalies are load anomalies: the load of the power system changes constantly, and when it increases or decreases abruptly, node attributes may become abnormal, for example through current overload or frequency fluctuation. Type 2 anomalies are device anomalies, such as excessive device temperature, current overload, or voltage fluctuation. Type 3 anomalies are line anomalies, in which the transmission lines of the power system are affected by damage or other problems; a broken line, damaged insulation, or abnormal grounding may make node attributes abnormal. During the experiment, the different types of anomaly are added to a specific topological node in the power system for a certain period of time to verify the feasibility of the method. The proposed method is applied to these abnormal data sets, and the three abnormal conditions are compared with the case without anomalies. The anomalies are injected during the 500 s to 800 s interval, and a series of simulations is conducted. The simulation results are shown in Figure 2.

Topological node eigenvector value analysis
To further verify how well the proposed method mines and locates abnormal topological nodes, two types of abnormal data are injected into the 15th topological node of the second cluster. Based on the abnormal vector features and topological node eigenvalues extracted by the proposed method, the eigenvectors of the three clusters are analyzed separately, and the characteristic quantities of abnormal topological nodes are examined to identify the abnormal nodes. The eigenvector results for the topological nodes are shown in Figures 3, 4 and 5. According to these results, the peak for cluster 2 in Figure 4 shows a higher correlation and differs significantly from the peaks in Figures 3 and 5. The peak eigenvector value at the 15th topology node in Figure 4 is markedly higher than the rest, so the 15th topology node is identified as an abnormal node.
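Locating the node whose eigenvector value stands out from the rest of its cluster can be sketched in Python as follows. The leave-one-out rule (mean plus `k` standard deviations of the other nodes) and the synthetic values are assumptions for illustration; they are not the detection rule or the data used in the experiment.

```python
from statistics import mean, pstdev


def peak_nodes(values, k=3.0):
    """Return 1-based node indices whose value exceeds mean + k * std of the others."""
    flagged = []
    for idx, value in enumerate(values):
        others = values[:idx] + values[idx + 1:]
        mu, sigma = mean(others), pstdev(others)
        if value > mu + k * sigma:
            flagged.append(idx + 1)
    return flagged


# Synthetic values for a 30-node cluster, made up purely for illustration.
cluster = [0.10 + 0.01 * (i % 3) for i in range(30)]
cluster[14] = 0.90  # node 15 carries the injected anomaly
abnormal = peak_nodes(cluster)
```

Comparing each node against the statistics of the other nodes keeps a single large peak from inflating its own threshold, which is why the 15th node is the only one flagged in this toy cluster.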

Energy consumption of neighbor topology nodes under the mining method
The neighbor-node energy consumption indicator is the energy consumed by the nodes adjacent to a given topology node. In a power system, the energy required by each topological node is closely related to the efficiency of its normal operation and of power transmission, so the energy consumption of a node's neighbors can serve as one important indicator of whether the node has abnormal attributes. When the performance of the abnormal attribute mining method is verified, the energy consumption of neighbor topology nodes is taken as the indicator of the power load distribution around a node: it reflects both the power load of the node itself and the load state of the surrounding nodes. Abnormal nodes tend to affect the energy consumption of neighboring nodes, so observing the energy consumption of the neighbors makes it possible to infer indirectly whether a node has abnormal attributes.
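As a minimal illustration of the indicator, the Python sketch below sums the energy consumed by the neighbors of each node. The adjacency lists, the energy values and their units are hypothetical, and the function name is not taken from the paper.

```python
def neighbor_energy(adjacency, energy):
    """Total energy consumed by the neighbours of each node."""
    return {
        node: sum(energy[n] for n in neighbors)
        for node, neighbors in adjacency.items()
    }


# Hypothetical three-node star: node 1 is connected to nodes 2 and 3.
adjacency = {1: [2, 3], 2: [1], 3: [1]}
energy = {1: 5.0, 2: 2.0, 3: 3.0}  # arbitrary units
indicator = neighbor_energy(adjacency, energy)
```

Tracking this quantity over time for each node gives a simple view of how the load around the node shifts when one of its neighbors becomes abnormal.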
The comparison methods are the big-data-based power anomaly node mining method proposed by Zhang et al. [3] and the power anomaly node mining method based on the nonlinear regression equation proposed by Yang et al. [4]. The test results are shown in Figure 6. As can be seen from Figure 6, the proposed method achieves lower energy consumption than the other two methods. This is because, when the proposed method is applied, the power load is distributed relatively evenly among neighbor topological nodes under normal operation. When one or more nodes are abnormal, the load distribution may become unbalanced, so that some nodes take on too much load and their energy consumption increases. Therefore, when detecting the abnormal attributes of power topology nodes, observing the energy consumption of the neighbor topology nodes helps determine whether a node is abnormal.

Conclusion
Nodes in a power system can become abnormal for various reasons, such as malicious attacks, equipment failures, and overload. These abnormal nodes may adversely affect the normal operation of the power system and can even lead to system failure and power outage. Therefore, an anomaly attribute mining method for topological nodes of power systems based on graph theory is proposed. The logical relations between nodes are analyzed, and the definition of a graph and the difference between directed and undirected graphs are established according to the attributes of the associated edges. Taking the points where lines converge as vertices, the network topology of the power system is obtained. An adaptive clustering algorithm based on set conversion is adopted to filter the noise of topological nodes, and abnormal and maliciously attacking topological nodes in the power system are detected with the double threshold method. Experimental verification shows that abnormal nodes in the power system can be found and analyzed in time to diagnose system faults and guide maintenance, which reduces the risk of power failure and equipment damage and improves the reliability and stability of the power system.

Figure 2. Plot of mean spectral radius with or without anomalies.

Figure 3. Topological node eigenvector corresponding to the first cluster.

Figure 4. Topological node eigenvector corresponding to the second cluster.

Figure 5. Topological node eigenvector corresponding to the third cluster.

Figure 6. Energy consumption of neighbor topology nodes under different mining methods.