Transmission line fault detection based on anchor frame a priori and attention mechanism

The YOLO target detection algorithm is the mainstream method for image-based defect detection of transmission line insulators. However, existing models are highly complex, and a reasonable and effective parameter compression method is urgently needed to lay the foundation for deployment on UAV edge equipment. At the same time, insulator defect images taken by UAVs have complex backgrounds and small defect sizes, which makes false detections and missed detections likely. To address this, the channel domain of the CBAM (convolution block attention module) attention mechanism is improved to solve its loss of channel information caused by dimensionality reduction, and the improved CBAM is added to the backbone network of YOLOv5s so that the model can identify and localize the targets of interest more precisely. The model is trained with a weighted combination of Cross Entropy Loss and Lovasz-Softmax Loss, which makes the network converge more stably during training and also improves accuracy somewhat.


Introduction
As China's electric power industry continues to grow at an accelerated pace, the area covered by transmission lines is expanding, and guaranteeing the safe and reliable operation of the power transmission network has become more important. Insulators in transmission and distribution circuits support the wires and provide electrical insulation. However, because they work for long periods in harsh outdoor environments, insulators are prone to self-explosion, breakage, flashover, and other defects. These defects degrade the insulating properties of the insulators, which in turn seriously threatens the safety and reliability of transmission lines. Therefore, regular inspection of insulators, timely detection of insulator defects, and appropriate countermeasures are crucial. The traditional manual field survey method is inefficient and cannot meet the needs of insulator defect detection; manned helicopter inspection is limited by high cost and low flexibility and cannot be applied on a large scale. The introduction of drones and artificial intelligence technology, building on the significant accomplishments of deep learning algorithms [1][2], effectively reduces the difficulty of insulator inspection, protects personnel safety, and greatly improves inspection efficiency. In image recognition and detection, target detection algorithms based on deep learning offer high detection accuracy and robustness, and they are increasingly widely used in insulator defect detection.
In this paper, we improve the YOLOv5s algorithm for insulator defect image processing. First, the K-means algorithm is used to improve the matching of the a priori anchor box sizes. Then the improved CBAM [3] (convolution block attention module) attention mechanism is added to the backbone network to strengthen the relationship between the spatial information and the channel information of the features, which improves the accuracy of transmission line fault recognition. Finally, Cross-Entropy Loss [4] and Lovasz-Softmax Loss [5] are combined by weighting so that the network converges more stably during model training, with the aim of addressing the challenges of practical situations.
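The weighted combination of the two losses can be written compactly. The form below is only a sketch: the source does not state the weights, so the single mixing coefficient \lambda is an assumption introduced here for illustration, and \mathcal{L}_{\mathrm{LS}} simply denotes the Lovasz-Softmax Loss of [5].

```latex
\mathcal{L}_{\mathrm{total}}
  = \lambda \, \mathcal{L}_{\mathrm{CE}} + (1 - \lambda) \, \mathcal{L}_{\mathrm{LS}},
  \qquad 0 \le \lambda \le 1,
\qquad
\mathcal{L}_{\mathrm{CE}} = - \sum_{c=1}^{C} y_c \log p_c
```

Here C is the number of classes, y_c is the one-hot ground-truth label, and p_c is the predicted probability of class c.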

Object detection algorithms
In recent years, improvements in computing power and AI methodologies have led to substantial progress in target detection. The development of target detection algorithms has gone through several important stages. The earliest method is the sliding window-based algorithm, which slides windows of different sizes over the image and uses a classifier to determine whether a target object lies inside each window. However, this method is computationally intensive and inefficient. Then an image pyramid-based approach emerged, which achieves multi-scale target detection by processing the image at different scales. However, this method is also computationally expensive. Later, the candidate region-based method was proposed. It generates candidate regions and uses a classifier to decide whether a target is present in each one. This method improved accuracy significantly, but the computational effort also increased. It was not until the rise of deep learning that target detection algorithms saw a major breakthrough. Deep learning-based target detection algorithms achieve higher accuracy and efficiency by using convolutional neural networks (CNNs) for feature extraction and object recognition. Many successful target detection techniques, such as Faster R-CNN [6], YOLO, and SSD, are based on deep learning.
As shown in Figure 1, the model structure of YOLOv5 is divided into two main parts: the backbone network and the detection head. The backbone network uses CSPDarknet53 [7] as the main feature extraction network. CSPDarknet53 is a lightweight convolutional neural network consisting of multiple convolutional layers and residual blocks. It extracts information from images and constructs effective feature maps. CSPDarknet53 has good parameter efficiency and receptive field size, which can improve the accuracy of target detection. The detection head is the core component of the YOLOv5 model and is used to predict the location, class, and confidence of the target object. YOLOv5 uses three detection heads, each for target detection at a different scale. Each detection head consists of convolutional layers that predict the position and category of the bounding boxes. The detection head generates a series of anchor boxes and then determines the final detection result based on confidence scores and non-maximum suppression. The model structure of YOLOv5 is simple and efficient, enabling fast and accurate target detection.
All of the above methods have improved the detection of insulators and their defects to a certain extent. However, the types of insulator defects they detect are limited: they generally focus only on insulator self-explosion defects and pay less attention to other defects. In addition, the inspection images collected by UAVs contain a variety of insulator defects and defect targets, with small defect scales and complex background environments. When the target lies in a complex background, the above algorithms struggle to extract effective features for multiple defect targets and adapt poorly to the environment, which can easily cause missed detections and false detections of insulator defects. Existing models are also highly complex, and a reasonable and effective parameter compression method is urgently needed to lay the foundation for deployment on UAV edge equipment. To this end, this paper proposes an improved algorithm based on the YOLOv5 network for insulator self-explosion, breakage, and flashover defects in transmission lines.
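The final step performed by the detection head, filtering by confidence score and then applying the suppression step commonly called non-maximum suppression (NMS), can be sketched as follows. This is a minimal illustration in plain Python, not the YOLOv5 implementation: the box format (x1, y1, x2, y2), the function names, the IoU threshold of 0.45, and the toy boxes are all assumptions made here; only the confidence threshold of 0.5 comes from the experimental settings.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf_thres=0.5, iou_thres=0.45):
    """Return the indices of the boxes that survive, highest score first."""
    # Drop low-confidence boxes, then visit the rest in descending score order.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_thres),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in keep):
            keep.append(i)
    return keep


# Toy example (hypothetical boxes): two overlapping boxes, one separate box,
# and one low-confidence box.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 150), (0, 0, 5, 5)]
scores = [0.90, 0.80, 0.75, 0.30]
kept = nms(boxes, scores)
```

In the toy example, the second box overlaps the first heavily and is suppressed, and the last box falls below the confidence threshold, so only the first and third boxes remain.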

Prior anchor box matching based on K-means
The K-means algorithm can automatically learn adaptive prior anchor boxes from the different shapes and sizes of the defect targets in the insulator dataset. This adaptivity addresses the problem that traditional fixed-size prior boxes cannot adapt to variation in target shape, increasing the robustness and accuracy of the algorithm. It brings two benefits. First, detection accuracy improves: prior boxes set by the K-means algorithm adapt better to the features of the defects and locate the target and its bounding box more precisely. Second, computational complexity is reduced because the priors are learned from the object data: by clustering the annotated boxes of the dataset, the K-means algorithm determines suitable prior boxes, which are then used as input to the target detection algorithm. The nine prior anchor boxes listed in Table 1 differ considerably from one another, and they are applied to the different scale detection layers of the network, giving a better clustering result.
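The clustering of box sizes described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' procedure: the source does not give the distance measure, so the commonly used d = 1 - IoU on (width, height) pairs is assumed, and the function names, the random seed, and the synthetic box sizes are hypothetical. Splitting the nine sorted anchors into three groups for the three detection scales is likewise an assumption.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes aligned at the same corner; both are (width, height)."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    return inter / (box[0] * box[1] + anchor[0] * anchor[1] - inter)


def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (w, h) pairs with distance 1 - IoU; return anchors sorted by area."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # initialise centres from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda j: wh_iou(box, anchors[j]))
            clusters[best].append(box)
        # Move each centre to the mean of its cluster; keep it if the cluster is empty.
        new_anchors = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else anchors[j]
            for j, c in enumerate(clusters)
        ]
        if new_anchors == anchors:  # converged
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Synthetic box sizes in pixels (hypothetical, not the paper's data).
data_rng = random.Random(1)
boxes = [(data_rng.uniform(8, 200), data_rng.uniform(8, 200)) for _ in range(300)]
anchors = kmeans_anchors(boxes)
# Assumed assignment: smallest three anchors to the finest detection scale, and so on.
small, medium, large = anchors[0:3], anchors[3:6], anchors[6:9]
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which matters when most defect targets are small.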

Spatial and channel attention mechanisms
Input insulator images contain complex background information in addition to the defect target. During convolution, background information accumulates continuously and builds up redundant content; excessive accumulation can obscure the target and reduce detection performance. In this article, we improve the CBAM attention mechanism and add it at effective locations in the YOLOv5s network model for feature fusion so that the model can localize and identify the target of interest more accurately. An attention mechanism acquires crucial information by concentrating on the key regions of the input. CBAM is a simple but effective attention mechanism that combines channel attention and spatial attention. In CBAM, channel attention learns a weight for each channel and uses these weights to rescale the channels, which increases the network's attention to the important channels. Figure 2 illustrates a schematic of the attention mechanism.
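The idea of channel attention without dimensionality reduction can be illustrated with a small sketch. The source does not specify how the channel branch is modified, so the code below assumes one possible choice: the bottleneck of the original CBAM channel branch is replaced by a small 1D convolution across neighbouring channels, which keeps one value per channel throughout. The function names, the kernel values, and the toy feature map are hypothetical, and plain Python lists are used only to keep the example self-contained.

```python
import math


def channel_descriptors(feature):
    """Global average and max pooling; feature is C channels of H x W nested lists."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature]
    mx = [max(map(max, ch)) for ch in feature]
    return avg, mx


def conv1d_same(values, kernel):
    """1D cross-correlation along the channel axis with zero padding (odd kernel)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(values) + [0.0] * pad
    return [
        sum(kernel[t] * padded[i + t] for t in range(len(kernel)))
        for i in range(len(values))
    ]


def channel_attention(feature, kernel):
    """Rescale each channel by a weight in (0, 1) computed without a bottleneck."""
    avg, mx = channel_descriptors(feature)
    # The same kernel is shared by the average-pooled and max-pooled descriptors.
    logits = [a + m for a, m in zip(conv1d_same(avg, kernel), conv1d_same(mx, kernel))]
    weights = [1.0 / (1.0 + math.exp(-z)) for z in logits]  # sigmoid
    scaled = [[[w * v for v in row] for row in ch] for w, ch in zip(weights, feature)]
    return scaled, weights


# Toy feature map with 4 channels of size 2 x 2 (hypothetical values).
feature = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 0.0], [0.0, 2.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
kernel = [0.25, 0.5, 0.25]  # in a real network these values would be learned
scaled, weights = channel_attention(feature, kernel)
```

In CBAM the channel-refined feature map is then passed to the spatial attention branch, which is omitted here.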

Data sets and experimental data
The dataset is a real collected transmission line dataset containing 720 aerial images of transmission lines. The data are split at a ratio of 9:1, with nine parts used for training. In model training, the parameters are optimized with the Adam optimizer [8], and the class confidence threshold of the target is set to 0.5. The initial learning rate is set to 0.001, and the weight decay coefficient is set to 0.0005. The GPU is an RTX 4090 with 64 GB of video memory. The dataset labels fall into three categories: insulator self-explosion, breakage, and flashover.

Evaluation indicators
Target detection uses specific metrics to evaluate the detection performance of models.
To measure model complexity, this paper uses the number of parameters and the number of floating-point operations as the evaluation indexes of space complexity and time complexity, respectively. To evaluate the effectiveness of the algorithm, this paper adopts the average precision (AP) and the mean average precision (mAP) over classes as the evaluation criteria. The average precision is determined by the recall and the precision, and it is an intuitive measure of how well the model detects each category.
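How the average precision follows from precision and recall can be sketched in a few lines. The code below computes the area under the precision-recall curve with all-point interpolation, which is one common convention; the source does not say which interpolation it uses, so this choice, the function names, and the toy detections are assumptions. Whether a detection counts as a true positive is taken as given here (for mAP50, a detection is usually matched to a ground-truth box at an IoU of at least 0.5).

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one class (all-point interpolation)."""
    # Visit detections from the most confident to the least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [0.0], [1.0]
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the maximum precision at any higher recall,
    # so the curve never increases as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the rectangles between consecutive recall values.
    return sum(
        (recalls[k] - recalls[k - 1]) * precisions[k] for k in range(1, len(recalls))
    )


def mean_average_precision(ap_per_class):
    """Mean of the per-class average precisions."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Toy detections for a single class (hypothetical, not results from the paper):
# four detections, three of them correct, and four ground-truth objects.
ap = average_precision(
    scores=[0.9, 0.8, 0.7, 0.6],
    is_true_positive=[True, False, True, True],
    num_ground_truth=4,
)
```

For the toy input the function returns 0.625. mAP50:95 repeats the same computation at IoU thresholds from 0.5 to 0.95 in steps of 0.05 and averages the results.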

Experimental analysis
To demonstrate the effectiveness of the proposed method, we compare it with widely used detection methods, including Faster R-CNN, SSD [9], YOLOv5, YOLOv7 [10], and YOLOv8 [11]. The experimental results are presented in Table 2. Table 2 shows that, compared with current small target detection algorithms such as YOLOv7 and YOLOv8 and with other mainstream algorithms, the mAP50 and mAP50:95 values of the proposed algorithm are higher for insulator self-explosion, breakage, and flashover defects. From Figure 3, it can be seen that in self-explosion scene 1 and self-explosion scene 2 the background is similar in color to the insulator and the defect scale is small; the other algorithms miss detections, whereas the proposed algorithm does not. This is because the attention mechanism helps the model focus accurately on the key information in the features and reduces the impact of irrelevant background noise. The model therefore adapts well to the task of locating insulator defects and can effectively reduce missed detections. In breakage scene 1 and scene 2, the other algorithms mistakenly detect insulator breakage defects as self-explosion defects. This is because, when the damaged region is large, a breakage defect resembles a self-explosion defect and is easily misdetected. As the visualization shows, the proposed algorithm does not produce such misdetections. In flashover scene 1 and scene 2, the other algorithms either miss insulator flashover defects or mistake the background for a target. The proposed algorithm suppresses this background noise to a certain extent.

Conclusion
To realize intelligent detection of insulator defects in transmission lines, this paper proposes a multi-defect detection method for transmission line insulators that addresses the high complexity of existing detection models, the complex backgrounds of insulator defect images, the many types of defects, and the small size of the defects. Experimental verification shows that the attention mechanism strengthens the recognition precision of the detection network on the region of interest, and that the K-means prior anchor boxes enable the model to obtain better detection results.

Figure 1. YOLOv5 network structure.

Figure 3. Comparative results of visualization of different models.

Table 1. Prior anchor box scales.

Table 2. Comparison of performance of different models.