Insulator Defect Detection Method upon Fused Attention Mechanism and Bidirectional Feature Fusion

Insulators are important components for electrical insulation and mechanical support, but they are prone to various defects in harsh operating environments, which can damage their mechanical strength and insulation performance. This article proposes the Shuffle YOLOv7 model, based on the YOLOv7 algorithm, for insulator defect detection, aiming to overcome the low precision of traditional object detection algorithms when facing complex backgrounds and small defects. To address the low attention that traditional algorithms pay to flashover faults, the ShuffleAttention fused attention mechanism is applied to concentrate on both intra-channel and inter-channel deep features, and the original PANet structure is replaced with a pyramid that has a bidirectional feature fusion structure to enhance the network's feature extraction ability. The Focal-EIOU loss focuses on high-quality prior boxes to improve model accuracy, and the effectiveness of each optimization method is verified through ablation experiments. The experimental results show that the proposed algorithm achieves varying degrees of improvement in precision, recall, average precision, and overall loss compared to mainstream object detection algorithms when detecting insulator damage and flashover.


Introduction
As important components of power transmission lines, insulators [1] are exposed to complex environments for long periods, making them susceptible to physical damage and chemical corrosion. This can degrade their performance and lead to unstable operation of the lines. Traditional manual inspections are costly and often lack precision and speed. Therefore, research on more intelligent methods for detecting defects in line insulators is of great significance.
At present, with the popularization of drones, inspection methods that use intelligent algorithms to diagnose and analyze collected images have been adopted. Yan et al. [2] achieved higher accuracy in classifying and locating insulator self-burst defects than conventional Faster R-CNN, support vector machine, and VGG algorithms by serially connecting two-stage Faster R-CNN networks and combining feature pyramids for feature extraction. Li et al. [3] replaced the original VGG structure in the SSD network with ResNet, enhancing overall feature extraction capability. Tian et al. [4] combined the YOLOv5S network with the SE attention mechanism, used the K-means clustering algorithm to construct prior boxes, and achieved higher detection accuracy, recall rate, and detection speed. However, these algorithms still struggle to detect defects efficiently when the defect is small, the background is complex, and boundaries are indistinct, because they neglect deep features, fuse features inefficiently, and use inappropriate loss calculations.
This paper proposes an insulator defect detection method called Shuffle YOLOv7, which is based on the YOLOv7 [5] network and incorporates a fused attention mechanism and bidirectional feature fusion. The contributions of this paper are as follows: (1) A dataset containing insulator damage and flashover faults is created, and data augmentation techniques are applied to expand the dataset for general insulator defect detection tasks; (2) ShuffleAttention mechanisms are added to the three dimensions of the YOLOv7 backbone network to capture both intra-channel and inter-channel features. The PANet structure in the head is replaced with the BiFPN structure to enhance the fusion capability of features in different dimensions; (3) The original C-IOU loss function used in YOLOv7 is replaced by the Efficient IOU (EIOU) loss, which helps optimize the model by reflecting the differences between predicted and ground-truth bounding boxes in terms of width, height, and confidence. The focal mechanism is introduced to focus on high-quality anchor boxes; (4) Comparative experiments and ablation experiments are conducted on the insulator defect dataset to validate the superiority of the proposed model and the feasibility and necessity of each optimization method. Experimental results demonstrate that the Shuffle YOLOv7 algorithm has certain advantages in terms of detection accuracy.

As an important component in network design, the attention mechanism allows the network to focus on selected feature information. In this paper, the ShuffleAttention (SA) module [6] is integrated with the YOLOv7 backbone network. The module captures the pairwise pixel-level relationships in the spatial domain and the mutual dependencies between channels while keeping the model complexity low. The SA structure is illustrated in Figure 1. We apply GroupNorm [7] to obtain spatial attention, and SE [8] is applied to obtain channel attention.

Figure 1. Illustration of ShuffleAttention structure
The output features Y of each group are computed from that group's input feature map as shown in Equation (1), where F1 and F2 represent the channel attention mechanism and the spatial attention mechanism, respectively, and the two operator symbols denote the tensor product and the concatenation operation.
By dividing the feature tensor into n groups, the computational complexity can be effectively reduced. Within each group, the SA module is used to process and focus on both intra-channel and inter-channel features. After fusing the two types of features through concatenation, the Channel Shuffle operation is performed to rearrange the inter-group features, ensuring the communication of feature information between different groups.
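The Channel Shuffle step can be illustrated with a minimal sketch. It assumes the channels are held in a flat list ordered group by group; the function name `channel_shuffle` and the toy channel labels are hypothetical and are not taken from the paper's implementation, which operates on feature tensors.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (reshape -> transpose -> flatten)."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # View the flat list as a (groups x per_group) grid and read it column by
    # column, so that neighbouring outputs come from different groups.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Two groups ("a" and "b") of three channels each.
shuffled = channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], groups=2)
print(shuffled)  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

After the shuffle, every consecutive pair of channels mixes information from both groups, which is what lets a later grouped operation see features from all groups.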

Weighted bidirectional feature pyramid
YOLOv7 applies the classic path aggregation network (PANet) [9] as its feature fusion network. Compared to FPN, PANet can better integrate low-level positional features and high-level semantic features. However, it still suffers from issues such as inconsistent multi-scale features, insufficient fusion of useful information, and increased computational complexity. In this paper, we adopt BiFPN [10] to replace PANet, with the core idea being efficient bidirectional cross-scale connections and weighted feature fusion to achieve an improved balance between precision and efficiency. By introducing residual connections, the representation capability of features at the same depth is enhanced, allowing for the extraction of richer semantic information. To reduce computational complexity, BiFPN removes edge nodes that have only one input, as they contribute less to feature fusion. This simplifies the network structure and improves computational efficiency. Additionally, BiFPN assigns different weights to features at different scales, aiming to adapt to different levels of features and improve detection speed. Figure 2 shows the illustration of BiFPN.

EIOU retains the beneficial features of the CIOU loss calculation while directly minimizing the difference in width and height between the predicted box and the ground-truth box, leading to faster convergence and improved localization accuracy. EIOU divides the loss into three parts: intersection over union loss, distance loss, and width-height loss, as defined in Equation (4).

Due to the sparsity of the target objects in the detection task, there is a severe imbalance, with far more low-quality regression samples than high-quality regression samples. To focus the EIOU loss on high-quality samples for regression, we integrate Focal loss and EIOU loss, with the specific expression shown in Equation (5).
where γ is a parameter that controls the degree of suppression for outliers.
With Focal-EIOU, the loss directly reduces the discrepancy in width and height between the predicted box and the ground-truth box while focusing the regression on high-quality predicted boxes. This approach achieves better localization accuracy.
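The three-part structure of the EIOU loss and its focal weighting can be sketched for a single pair of boxes. This is a minimal illustration that follows the commonly published form of these losses rather than the paper's own equations, which did not survive in this text. It assumes boxes given as `(x1, y1, x2, y2)` with positive width and height; the function names are hypothetical, and `gamma` stands for the parameter that controls the degree of suppression for outliers (its default of 0.5 is an assumption, not a value reported here).

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def eiou_loss(pred, gt):
    """EIOU loss = IoU loss + center-distance loss + width-height loss."""
    # Width and height of the smallest box enclosing both boxes.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    # Offsets between the two box centers.
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2
    # Differences in width and height.
    dw = (pred[2] - pred[0]) - (gt[2] - gt[0])
    dh = (pred[3] - pred[1]) - (gt[3] - gt[1])
    iou_term = 1.0 - iou(pred, gt)
    distance_term = (dx * dx + dy * dy) / (cw * cw + ch * ch)
    size_term = dw * dw / (cw * cw) + dh * dh / (ch * ch)
    return iou_term + distance_term + size_term


def focal_eiou_loss(pred, gt, gamma=0.5):
    """Weight the EIOU loss by IoU**gamma so well-aligned boxes dominate."""
    return iou(pred, gt) ** gamma * eiou_loss(pred, gt)


pred_box = (0.0, 0.0, 2.0, 2.0)
gt_box = (1.0, 1.0, 3.0, 3.0)
print(eiou_loss(pred_box, gt_box), focal_eiou_loss(pred_box, gt_box))
```

In this form, a predicted box with low overlap receives a small weight, so its gradient contribution is suppressed relative to boxes that already overlap the ground truth well.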

Data preprocessing
This paper uses the transmission line insulator defect dataset released by the Electric Power Research Institute (EPRI) of the United States, which we annotated with a labeling tool. In the training set, the insulators were categorized into normal, damaged, and flashover fault types based on their shell defects. We discarded some low-quality insulator images and applied operations such as rotation, flipping, cropping, and scaling. We also added salt-and-pepper noise and random noise to simulate adverse detection conditions. As a result, we obtained a total of 1,596 images of insulator defects. The quantities of each defect category are shown in Table 1. The training and validation sets were split in a 9:1 ratio. The test set consists of 88 insulator defect images with complex backgrounds and different materials.
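Two of the preprocessing steps, salt-and-pepper noise and the 9:1 split, can be sketched as follows. This is an illustrative sketch only: it assumes a grayscale image stored as rows of 0-255 integers, and the function names, the noise fraction `amount`, and the fixed seed are hypothetical choices that the paper does not specify.

```python
import random


def add_salt_and_pepper(image, amount, rng):
    """Return a copy of the image in which each pixel is, with probability
    `amount`, replaced by white (salt) or black (pepper)."""
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < amount:
                new_row.append(255 if rng.random() < 0.5 else 0)
            else:
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy


def split_train_val(items, train_fraction, rng):
    """Shuffle the items and split them into training and validation lists."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = [[128] * 4 for _ in range(4)]
noisy = add_salt_and_pepper(image, 0.1, rng)
train, val = split_train_val(range(10), 0.9, rng)
print(len(train), len(val))  # 9 1
```

Shuffling before the cut avoids putting all augmented copies of one source image at the same end of the list, although a real pipeline would usually split by source image before augmenting.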

Experimental environment and evaluation metrics
The environment configuration is listed in Table 2. In object detection, precision (P), recall (R), average precision (AP), and mean average precision (mAP) are commonly used evaluation metrics. These metrics are defined in Equations (6)-(9), where TP denotes the number of true positive samples; FN denotes the number of false negative samples, in which positive instances are incorrectly predicted as negative; FP denotes the number of false positive samples, in which negative instances are incorrectly predicted as positive; R (Recall) refers to recall; P (Precision) refers to precision; and N represents the total number of categories for object detection.
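The metrics can be made concrete with a short sketch using their standard definitions: precision is TP / (TP + FP), recall is TP / (TP + FN), AP is the area under the precision-recall curve, and mAP is the mean of AP over the N categories. The function names are hypothetical, and the all-point interpolation used for AP is an assumption, since the paper's exact AP computation is not recoverable from this text.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are correct."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Fraction of actual positives that are detected."""
    return tp / (tp + fn) if tp + fn else 0.0


def average_precision(recalls, precisions):
    """Area under the precision-recall curve (all-point interpolation).
    `recalls` must be sorted in ascending order."""
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Make precision non-increasing from right to left (the PR envelope).
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_per_class):
    """Mean of the per-category AP values (N = number of categories)."""
    return sum(ap_per_class) / len(ap_per_class)


print(precision(8, 2), recall(8, 8))                # 0.8 0.5
print(average_precision([0.5, 1.0], [1.0, 0.5]))    # 0.75
print(mean_average_precision([0.75, 0.25]))         # 0.5
```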

Model training
To validate the performance of the improved Shuffle YOLOv7 model on the insulator defect dataset, the model was trained with the following parameter settings: 100 training epochs, a batch size of 32, and all other parameters left at their defaults.

Model performance comparison
To validate the effectiveness of the proposed enhancements in object detection tasks, comparative experiments on the insulator dataset were conducted against two alternative object detection algorithms. We selected P, R, mAP, and loss as the main evaluation metrics. The model performance parameters at convergence are recorded in Table 3. The training process is visualized as a line graph in Figure 4, where the x-axis represents the number of iterations. Over the entire training process, the improved YOLOv7 model converges more efficiently and more stably. However, because of the limited size and complexity of the insulator dataset, a high number of iterations results in similar average precision at convergence for both YOLOv7 models. Nevertheless, the improved YOLOv7 model still maintains a certain advantage. The performance comparison shows that, under the same number of iterations and training conditions, the improved YOLOv7 model improves average precision by 3.6% and 5.8% compared to YOLOv7 and YOLOv5, respectively. The precision and recall rates are also improved to varying degrees.

Ablation experiment
To determine whether the proposed enhancements are effective in insulator defect detection, and to investigate any inhibitory effects that may arise from combining them, ablation experiments were conducted. The ablation experiment data are presented in Table 4. Without the ShuffleAttention mechanism, the average precision of the network decreased by 1.2%. The loss also increased when Focal optimization and EIOU loss were not employed. In the table, "-" indicates the absence of an improvement strategy, and a mark represents its inclusion.

Conclusion
The insulator defect detection method discussed in this paper, based on Shuffle YOLOv7, is capable of successfully detecting insulator damage and flashover discharge defects under complex background conditions. Comparative experiments verify that the method achieves better detection performance and lower detection loss than the original YOLOv7 and YOLOv5 networks.

Figure 2. Illustration of the bidirectional feature fusion network

Focal E-IOU loss function
The C-IOU loss function used in YOLOv7 selects overlap area, center point distance, and aspect ratio as the geometric factors for optimization. It is defined specifically as Equation (2).

Figure 3 shows examples of broken and flashover insulator defects in the dataset.

Figure 3. Examples of broken and flashover insulator defects

Figure 4. Training curves of the compared models (x-axis: number of iterations)

Table 2. Configuration of the experimental environment

Table 3. Model performance comparison

Table 4. Result of the ablation experiment