Fault analysis and evaluation of power transformers based on family defects

Family defects in transformers are equipment defects caused by common factors such as design, material, and manufacturing process. This article applies data discretization and other preprocessing to the collected data and improves the original Apriori algorithm to establish a family defect model. The correlation between the family defect indicators and the fault feature quantities is analyzed with the block-array Apriori algorithm, and each defect indicator is found to reflect the state of the power transformer. The association rules are then evaluated to address the problem of misleadingly high confidence, the family defects are scored using family defect scoring criteria, and the impact on the status of the power transformer is judged from the scores.


Introduction
Power transformers have diverse structures, and their actual operating environments are complex and varied. They are subject to electrical, thermal, mechanical, and environmental stresses, and their performance gradually deteriorates. Once a fault occurs, it has adverse effects on society and causes huge economic losses. A considerable portion of these faults stem from defects in the transformer itself, including quality issues in manufacturing process, design, and materials. If such defects can be detected in time and corresponding anti-accident measures taken, the safe and stable operation of the power system will be greatly improved.
Xian Daily proposed that "equipment defects confirmed to be caused by common elements such as design structure, material or raw materials, and process control are called familial defects. If such defects are present, other equipment with the same design, material, and process, regardless of whether similar defects can be detected at present, will be called familial defect equipment until this type of defect is eliminated or determined." In power transformers, if a defect occurs in a piece of equipment and analysis determines that it is caused by a design structure, material or raw material, or process control issue, then equipment sharing the same design structure, material or raw material, or process control may have the same defect or be at risk of developing it. The family quality defects of power transformers can be divided into three categories: design structure, material components, and process control.
• Design structure defects, including abnormal sound, vibration, excessively fast heating during operation, abnormal temperature, oversized or undersized volume, abnormal heat dissipation rate, and the effect of the transformer design on its own losses.
• Material and component defects, that is, defects in design materials and components such as electromagnetic wires, casing materials, iron core materials, and insulation media.
• Process control defects, which tend to appear across the same batch of equipment, such as abnormalities in tank manufacturing, iron core manufacturing, coil manufacturing, and insulation assembly.
Li Aidong first proposed and described in detail the concept of familial quality defects of electrical equipment in 2009 and pointed out that familial defects are related to factors such as the material, process, and design of equipment, but did not propose technical approaches for discovering and resolving them. Li Xinye proposed an improved hierarchical clustering algorithm that clusters the point spacing and slope spacing of transformer attributes and analyzes the health status of power transformers under familial defects [1]. Rao Wei proposed another algorithm based on improved hierarchical clustering, which further quantifies the attribute factors and evaluates the health status of electrical equipment through clustering, so as to provide a basis for power operation inspection [2]. Zhang Yanxu used the Apriori algorithm to mine secondary equipment defect data, which can, to a certain extent, be used to analyze familial quality defects [3]. Pang-Ning Tan used the matrix-based Apriori algorithm to mine transformer defect data, which can locate faults at the component level and can, to a certain extent, be used to analyze the specific factors behind family quality defects of electrical equipment.
However, there is currently limited research on the objective analysis of familial defect data, and the relationship between familial defects and failure rates has also received little analysis. Evaluating the level of familial defects from existing data would provide greater assistance for the healthy operation of power transformers [4][5][6]. It is therefore very important to analyze familial defect data of power transformers. This article analyzes and mines the collected familial defect data of power transformers, seeks efficient and applicable methods for such data, and provides effective support for equipment operation and maintenance.

Familial defect model
This article first introduces the relevant algorithms for association rule mining. Secondly, the existing data is preprocessed, and the processed data is used to build an improved Apriori algorithm model, in order to find a more suitable way of identifying the causes and weak links of family defects and to analyze the key indicators that affect them. The article uses 1429 records of text data and dissolved-gas-in-oil data collected by a power company of State Grid.

Overview of association rule mining
Association rules are used to analyze the correlation between attributes and indicator values, that is, to find conditions in the data under which the two are correlated. Information generally takes the form of rules, concepts, and patterns, and data mining is the process of extracting such useful information hidden in data. Association rule mining is an important data mining method for discovering meaningful connections hidden in a dataset, which can be represented as rules or as frequent itemsets [7][8][9].
There are various algorithms for mining association rules, and the choice depends on the properties of the data and the ultimate purpose of the analysis. The most classic is the Apriori algorithm, and many other algorithms are improvements derived from it [10].
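As a minimal illustration of the two quantities that association rule mining relies on, the sketch below computes support and confidence over a handful of records. The records and the item labels such as `maker=bd` are hypothetical and are not taken from the collected data.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical defect records, one set of items per record.
records = [
    frozenset({"maker=bd", "voltage=110kV", "fault=oil_leak"}),
    frozenset({"maker=bd", "voltage=110kV", "fault=oil_leak"}),
    frozenset({"maker=bd", "voltage=220kV", "fault=winding"}),
    frozenset({"maker=xa", "voltage=110kV", "fault=casing"}),
]

sup = support(records, {"maker=bd", "fault=oil_leak"})         # 2 of 4 records
conf = confidence(records, {"maker=bd"}, {"fault=oil_leak"})   # 2 of the 3 bd records
```

A rule is normally kept only when both values reach user-chosen minimum thresholds.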

Data processing formatting
Because data in production and engineering (including multivalued data, continuous monitoring data, and textual data) do not necessarily meet the data definitions required by the algorithms, this section focuses on standardizing these diverse data.

Text data standardization
Family defect data contains many textual fields, such as manufacturer, equipment type, components, raw materials, defective equipment, and voltage level. Used directly, these fields cause inconvenience in association rule mining, including parsing problems, reduced time efficiency, and reduced accuracy. They are therefore replaced by corresponding category codes. The specific indicators involved are shown in Table 1.
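Replacing text fields with category codes can be sketched as follows. The helper names, the `C` prefix, and the sample component values are assumptions made for illustration; the actual coding scheme is the one given in Table 1.

```python
def build_codebook(values):
    """Assign consecutive integer codes to distinct values in order of first appearance."""
    codebook = {}
    for value in values:
        if value not in codebook:
            codebook[value] = len(codebook) + 1
    return codebook


def encode_column(values, prefix):
    """Replace each text value with a short category label such as 'C1'."""
    codebook = build_codebook(values)
    encoded = [f"{prefix}{codebook[value]}" for value in values]
    return encoded, codebook


# Hypothetical 'defective component' column.
components = ["casing", "winding", "casing", "iron core"]
encoded, codebook = encode_column(components, "C")
```

Keeping the codebook alongside the encoded column lets mined rules be translated back into readable text.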

Standardize according to scope
In transformer defect data, many indicators are graded according to whether their values fall within a defined range. Values within the normal range are mapped to 0, and values in the abnormal range are mapped to 1. In this way, these data are standardized against reference standards into a normal or abnormal class.
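The range-based mapping amounts to a simple threshold test, sketched below. The indicator name and its limits are placeholders chosen for illustration and do not come from any reference standard.

```python
def flag_out_of_range(value, low, high):
    """Return 0 if value lies in the normal range [low, high], otherwise 1."""
    return 0 if low <= value <= high else 1


# Hypothetical normal range for one monitored indicator.
normal_range = {"indicator_a": (10.0, 50.0)}

readings = [12.5, 50.0, 63.2, 4.1]
low, high = normal_range["indicator_a"]
flags = [flag_out_of_range(x, low, high) for x in readings]
```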

Discretization of numerical data
The data is discretized using the ChiMerge algorithm. The ChiMerge discretization method is derived from the chi-square test and is a bottom-up algorithm. It repeatedly finds the pair of adjacent intervals with the smallest chi-square value and merges them, until a stopping condition is met. Numerical indicators that require discretization are shown in Table 2.
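The bottom-up merging can be sketched as below. This is a simplified variant and not necessarily the configuration used in the article: it assumes the stopping condition is a maximum number of intervals (`max_intervals`, a hypothetical parameter) instead of a chi-square significance threshold, and it skips cells whose expected count is zero.

```python
from collections import Counter


def chi2_pair(a, b, classes):
    """Chi-square statistic of two adjacent intervals given their class counts."""
    total = sum(a.values()) + sum(b.values())
    chi2 = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            expected = row_total * (a[c] + b[c]) / total
            if expected > 0:  # cells with zero expected count are skipped
                chi2 += (row[c] - expected) ** 2 / expected
    return chi2


def chimerge(values, labels, max_intervals):
    """Return the lower bounds of the intervals left after merging."""
    classes = sorted(set(labels))
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    # Start with one interval per distinct value: [lower bound, class counts].
    intervals = [[v, counts[v]] for v in sorted(counts)]
    while len(intervals) > max_intervals:
        scores = [
            chi2_pair(intervals[i][1], intervals[i + 1][1], classes)
            for i in range(len(intervals) - 1)
        ]
        i = scores.index(min(scores))  # most similar adjacent pair
        intervals[i][1] = intervals[i][1] + intervals[i + 1][1]
        del intervals[i + 1]
    return [lower for lower, _ in intervals]


# Hypothetical gas readings with a 0/1 fault label.
gas = [1, 2, 3, 10, 11, 12]
fault = [0, 0, 0, 1, 1, 1]
cut_points = chimerge(gas, fault, max_intervals=2)
```

In this toy example the low readings all carry label 0 and the high readings label 1, so the merging stops with two intervals starting at 1 and at 10.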

Fault characteristic quantity corresponding to defects
The purpose of identifying defect manifestations in the transformer data by objective methods is to prevent transformer failures. Each transformer defect may lead to a corresponding fault, so the corresponding fault types are quantified. The specific fault types are shown in Table 3. Each fault type is a Boolean variable taking the value 0 or 1; for example, it is 1 if a winding fault occurs and 0 if it does not. After the data has been standardized, a family defect model is established using the association rule algorithm.

Family defect model based on the improved Apriori algorithm

Improved Array Based Apriori Algorithm
Given the limitations of the classic Apriori algorithm and the array-based Apriori algorithm, the family quality defect association rule mining proposed in this article adopts an improved array-based Apriori algorithm, as follows:
• Partition the data by manufacturer. The transactions in the database are divided into several disjoint blocks according to the defects of the different partitions. Only one block is considered at a time, and all frequent itemsets are generated for it. The frequent itemsets generated for the blocks are then combined, without loss, to obtain all possible frequent itemsets.
• Following the array-based frequent itemset mining algorithm, all attribute values are labeled uniformly, and the database is then scanned. All data relevant to the association analysis is saved in a predefined two-dimensional array. During frequent itemset generation only the two-dimensional array is scanned, so the database does not need to be scanned again, which avoids the time cost of repeated database scans.
In summary, the familial quality defect data is divided into multiple blocks by manufacturer, and each block is stored in a predefined array. Before scanning, the data is uniformly labeled; during scanning, each item is checked: items contained in a transaction are represented by "1", and items not contained are represented by "0". The scanned data is counted, and its support is calculated and compared with the minimum support min_sup. Itemsets whose support is less than min_sup are deleted, and only itemsets whose support reaches min_sup are retained. The itemsets that remain stored are the frequent itemsets.
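The two ideas described here, partitioning by manufacturer and scanning a 0/1 array instead of the database, can be sketched as follows. This is an illustrative reading under stated assumptions and not the authors' implementation: each block is mined with the same relative threshold, the local results are pooled as candidates, and a final pass over the arrays confirms global support. All function names and the sample records are hypothetical.

```python
from itertools import combinations


def to_array(transactions, items):
    """Encode transactions as rows of 0/1 flags, one column per item."""
    return [[1 if item in t else 0 for item in items] for t in transactions]


def count(rows, cols):
    """Number of rows that contain every column index in cols."""
    return sum(1 for row in rows if all(row[c] for c in cols))


def mine_block(rows, n_items, min_count):
    """Level-wise Apriori that scans only the 0/1 array of one block."""
    level = sorted((c,) for c in range(n_items) if count(rows, (c,)) >= min_count)
    frequent = set(level)
    while level:
        prev = set(level)
        candidates = set()
        for a, b in combinations(level, 2):
            if a[:-1] == b[:-1]:  # join step: itemsets sharing a prefix
                cand = a + (b[-1],)
                # prune step: every subset one item smaller must be frequent
                if all(s in prev for s in combinations(cand, len(cand) - 1)):
                    candidates.add(cand)
        level = sorted(c for c in candidates if count(rows, c) >= min_count)
        frequent.update(level)
    return frequent


def partitioned_apriori(blocks, min_sup):
    """blocks maps a block key (here a manufacturer) to its list of transactions."""
    items = sorted({i for ts in blocks.values() for t in ts for i in t})
    arrays = {key: to_array(ts, items) for key, ts in blocks.items()}
    eps = 1e-9  # tolerance for floating-point threshold comparisons
    candidates = set()
    for rows in arrays.values():  # phase 1: frequent itemsets of each block
        candidates |= mine_block(rows, len(items), min_sup * len(rows) - eps)
    total = sum(len(rows) for rows in arrays.values())
    result = {}
    for cols in candidates:  # phase 2: confirm support over all blocks
        n = sum(count(rows, cols) for rows in arrays.values())
        if n >= min_sup * total - eps:
            result[frozenset(items[c] for c in cols)] = n / total
    return result


# Hypothetical records grouped by manufacturer.
blocks = {
    "bd": [
        {"maker=bd", "fault=oil_leak", "material=oil_seal"},
        {"maker=bd", "fault=oil_leak", "material=oil_seal"},
        {"maker=bd", "fault=winding"},
    ],
    "xa": [
        {"maker=xa", "fault=casing"},
        {"maker=xa", "fault=casing"},
    ],
}
frequent = partitioned_apriori(blocks, min_sup=0.4)
```

Pooling the local results is safe because any itemset that is frequent over the whole dataset must be frequent in at least one block when the same relative threshold is used.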

Experimental analysis
The data used in this article comprises 1429 records from 10 transformers, covering defective 110 kV, 220 kV, and 500 kV power transformers. The data is first roughly processed to remove obvious outliers, and the improved algorithm is then used to mine association rules from the processed data.

Determination of association relationship
The minimum support and minimum confidence for this experiment are set to 1.5% and 60%, respectively. After mining, there are 41 frequent 1-itemsets, 162 frequent 2-itemsets, 41 frequent 3-itemsets, 8 frequent 4-itemsets, and 2 frequent 5-itemsets; no larger frequent itemsets exist. Twelve specific frequent itemsets are selected, as shown in Table 4.
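To make the support threshold concrete, the short calculation below converts it into a record count, assuming that support is measured against all 1429 records. Which records enter the denominator after outlier removal is not stated, so this is an assumption.

```python
from math import ceil

n_records = 1429      # total records reported for the experiment
min_support = 0.015   # 1.5%

# 1429 * 0.015 = 21.435, so an itemset must occur in at least 22 records.
min_count = ceil(n_records * min_support)
```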

Analysis of Strong Association Rules
The faults and defects that stand out in the experimental results are oil leakage faults, casing design, raw material issues, partial discharge, and winding faults. The prominent family defects are raw material issues and casing process design, as shown in Figure 1. Combining Figure 1 with Association Rule 7, it can be concluded that manufacturer tb exhibits obvious familial defect characteristics in the selection of insulation medium raw materials; the yellow area in Figure 1 has a high confidence level, which also indicates significant familial defect characteristics. Similarly, the figure shows that manufacturer bd has obvious oil leakage faults, while for manufacturer xa the design process of the casing component is the more prominent problem. Applying the improved array-based Apriori algorithm to the processed, discretized data yields the following correlation analysis:
• According to Rule 1, the 110 kV transformers produced by manufacturer bd experienced oil leakage faults, which Rule 4 shows to be related to the raw material of the oil seal glass, with a confidence of up to 92%. In production, transformer failures may occur because of material issues. Such a problem may occur not only on one transformer but also on other transformers of the same batch. The manufacturer should inspect and replace the material of the component to remedy the material problem in time.
• According to Association Rule 9, there is a strong correlation between the acetylene content and partial discharge displayed by Sensor 2 and insulation medium faults, and according to Rule 5, the confidence that the insulation medium causes partial discharge is as high as 98%. This rule shows a strong correlation through its confidence and can, to some extent, detect familial defects in power transformers.
• According to Association Rules 2 and 3, the likelihood of casing failure is high because of process design defects in the casings produced by manufacturer xa. This is a process control defect in a small component of the casing itself. Association rules can identify potential problems in the production process so that timely anti-accident measures can be taken.
• According to Association Rule 10, prolonged service may have a significant impact on winding deformation. When components exceed their service life, the winding components need to be replaced in time. To guard against equipment family defects, strict control should be exercised over structural design, raw material selection, and process control during production. Control should start at the source, with protection measures strengthened at every stage of design, production, storage and transportation, and commissioning, so as to extend the service life of each component.
• According to Association Rules 11 and 12, a component mounted on the main body of power transformers produced by manufacturer SD suffers from pipeline corrosion, which is mainly reflected on the transformer body. Manufacturer SD needs to start with the small components on the body and investigate why the pipeline material differs from other pipeline materials. Power transformers produced in the same batch should be inspected in detail and anti-accident measures taken.
On the basis of the original association rule algorithm, this section divides the data into blocks and stores them in arrays according to the fault characteristics and defect quantities of the data to establish a family defect model. The improved family defect model has advantages in both time and space efficiency, and because it uses array storage, it helps avoid association rules that are strong only on the surface. Whether the association rules found are significant, and how much familial defects affect the operating status of power transformers, still requires objective quantitative analysis.

Simulation Analysis and Evaluation
To further evaluate the reliability of the information obtained from the experiment, this section applies the evaluation criteria for family defects to the experimental results. The health status of power transformer equipment is evaluated through key indicators of familial defects, so as to reflect reasonably and objectively the impact of familial defects on equipment health and to provide a sound basis for maintenance.

Evaluation and Calculation of Familial Defects in Power Equipment
The degree of impact on power transformers varies depending on the nature of the defective equipment and the familial relationships. Based on the data indicators mentioned in section 2.3, the impact of these familial defect data on the health status of power transformers is evaluated. The impact of familial defects in power transformers is evaluated using r, whose expression is given in equation (1).
$$ r = \frac{\sum_{i=1}^{m} s_i w_i}{\sum_{i=1}^{m} w_i} \qquad (1) $$
Here, s_i represents the score of familial defects, w_i represents the scoring weight of familial defects, and m represents the total number of classes of familial defect devices and devices with familial defect risk. The equipment rating lies between 0 and 100. If there are defects that cannot be resolved at the moment and may cause serious consequences, the rating is 0. If the equipment is intact and has no defects, the rating is 100. The specific scoring details are shown in Table 5. In addition to the recorded ratings, rating weights are needed to distinguish the impact of different indicator attributes on the health status of power transformer equipment. The specific weight scores are shown in Table 6, where n represents the number of occurrences of the same or similar defects on different devices. The recorded scores and scoring weights above are used to calculate the impact of familial quality defects on the health status of power transformers. The closer a defect is to defects with the same manufacturer, model, and drawing, the higher the likelihood that it is a familial defect.
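If equation (1) is read as a weighted average of the per-class scores, which is an assumption based on the symbol definitions given for it, the calculation can be sketched as follows. The function name and the sample scores and weights are hypothetical and are not values from Tables 5 and 6.

```python
def family_defect_score(scores, weights):
    """Weighted average of class scores (0 to 100) using the scoring weights."""
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and of equal length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Hypothetical values for m = 3 classes of equipment.
r = family_defect_score([40, 70, 100], [3, 2, 1])  # (120 + 140 + 100) / 6
```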

Construction of familial defect evaluation index system
Similar familial defect indicators in the collected data are evaluated to determine the health status of power transformers. Taking the data in 2.3 as an example, the fault type set and the defect indicator set are extracted as follows: Fault type = {winding fault, bushing fault, oil leakage fault, heat release fault, tap changer fault, insulation medium fault, lead wire fault, iron core fault}; Defect indicator set = {manufacturer, equipment type, voltage level, defective equipment, defective components, component types, component materials, hydrogen gas content, acetylene gas content, methane gas content, total hydrocarbon gas content, ethane gas content, carbon monoxide gas content, ethylene gas content, carbon dioxide gas content}. Because the assessment concerns the impact of familial defects on equipment health, equipment attributes such as the manufacturer must be included.

Evaluation of Association Rule Models
Evaluating the quality of an association rule model with a data-driven approach is an objective method, since in practice it lets objective data speak for the validity of a rule. Evaluating association rules therefore requires an objective measure of interestingness. An "interest factor" is introduced to take into account the support of the itemset that appears in the rule consequent; this interest factor is called lift, as shown in equation (2).
$$ \mathrm{Lift}(A \rightarrow B) = \frac{c(A \rightarrow B)}{s(B)} \qquad (2) $$
Here, s(B) is the support of the itemset in the rule consequent, and c(A → B) is the confidence of the rule. This measure addresses the problem of rules whose confidence is high only because the consequent itself is frequent.
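Lift, in its standard form of rule confidence divided by the support of the consequent, can be computed as in the sketch below. The numeric inputs are hypothetical and are not values from Table 7.

```python
def lift(rule_confidence, consequent_support):
    """Lift of a rule A -> B: confidence(A -> B) divided by support(B)."""
    if consequent_support <= 0:
        raise ValueError("consequent support must be positive")
    return rule_confidence / consequent_support


# Hypothetical rule: confidence 0.75, consequent present in 25% of all records.
value = lift(0.75, 0.25)

# A consequent that is very common anyway gives a lift near 1 even at high confidence.
weak = lift(0.9, 0.9)
```

A lift above 1 indicates that antecedent and consequent occur together more often than they would if they were independent, a lift of 1 indicates independence, and a lift below 1 indicates a negative association.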

Instance evaluation
Based on the algorithm in 2.3, the confidence between each fault type and the variables is calculated as the scoring basis for evaluation. The data is substituted into equation (2) to calculate the lift values, which are shown in Table 7. Based on the lift values for the gas contents, familial defects are identified and their correlation with the fault types is further established. Based on the data provided in 2.3, the impact of familial defects on the health status of power transformers is then determined. The calculated lift is substituted into the evaluation criteria for familial defects, and the number of equipment failures is calculated as an example. Table 8 shows the family quality defect record form. Calculation and evaluation give a score of 55.39 for the factors influencing family defects.
Based on the above evaluation, a method for objectively evaluating familial quality defects is obtained. The familial quality defect evaluation guidelines and the association rule model jointly evaluate the factors influencing familial defects. The higher the evaluation score, the greater the impact on the status of power transformers, which is of great significance for the maintenance and operation of power transformers.

Conclusion
The familial quality defects of power transformer equipment play an important role in industrial applications. If problems with the design, manufacturing process, materials, and other aspects of the equipment can be identified in time during production, the operation of the power system will be made more secure. This article studies the characteristics of the data by establishing a family defect model based on an improved Apriori algorithm and applies the method, through case analysis, to mining and analyzing the collected family defect data. The results are then evaluated using an association rule evaluation method. By evaluating the correlation and reliability of defect indicators through familial defect assessment, the attribute and dependency relationships between indicators can be determined, providing an objective methodological basis for the analysis of familial defects in power transformers. The impact of familial defects on transformers is evaluated from the data obtained through these association relationships.

Table 1. Family Defect Indicators of Transformers.

Table 2. Numerical indicators that require discretization.

Table 3. Table of Fault Types.

Table 4. Association Rules Based on Frequent Itemsets.

Table 5. Familial Quality Defect Record Score.

Table 6. Weights of Familial Quality Defect Scores.

Table 7. Fault Components and Gas Content Lift Values in Oil.

Table 8. Familial Quality Defect Record Form.