Adaptive defogging method for transmission line inspection images based on multilayer perceptual fusion

Existing image defogging methods commonly suffer from incomplete defogging and color distortion. To address these problems, this paper proposes an adaptive defogging method for transmission line images based on multilayer perceptual fusion. Using dynamic convolution, dense residuals, and attention mechanisms, an adaptive feature enhancement network is designed, composed of six Dynamic Residual Components (DRC) and two Dynamic Skip-Connected Feature Fusion Components (DSCFF); it prevents features from being forgotten in the early stages of the network and enhances the expressive ability of the model. In the decoding network, a decoder module based on the SOS enhancement model is introduced to further strengthen the defogging effect. Finally, comparison experiments with current defogging methods of more advanced performance show that the proposed method achieves a good defogging effect, retains image details well, and preserves color faithfully.


Introduction
With the vigorous development of industry and the occurrence of bad weather, the acquisition of transmission line inspection images under fog, haze, smog, and similar conditions is seriously affected, producing images with blurred details, distortion, and low contrast; such low-quality images severely hinder subsequent line defect identification. Therefore, research on defogging technology for transmission line inspection images has wide practical value and significance. Convolutional neural networks, one of the most common deep learning models, are widely used throughout computer vision, and many scholars have developed image defogging methods based on them. Shu-yun Liu, Ziyi Sun, et al. proposed an end-to-end defogging network, DehazeNet, which estimates the transmittance of foggy images and then restores fog-free images through an atmospheric scattering model; however, this method has a long computational period and is not applicable to regions with complex backgrounds [1][2]. Fan Guo, et al. improved the atmospheric scattering model by combining the two unknown parameters, transmittance and atmospheric illumination, into a single parameter, and then built this model into an end-to-end convolutional neural network for image defogging, but this method is prone to grid artifacts [3].
In response, Javed Iqbal, et al. combined attention with local residual learning and, by adjusting the weights of the learned features, solved the grid artifact problem and improved the quality of image recovery [4]. The above methods still struggle with uneven haze distribution and severe color bias in real foggy scenes, which greatly reduces defogging performance. To address this problem, this paper proposes an adaptive defogging method for transmission line inspection images based on multilayer perceptual fusion that recovers fog-free images end-to-end. An adaptive feature enhancement network is designed that, at inference time, adaptively generates convolution kernel parameters according to each input sample. Meanwhile, a decoder module based on the SOS enhancement model is introduced into the decoding network to ensure that the recovered image is visually close to the original clear image.

Algorithm design
In this paper, we propose an adaptive image defogging method based on multilayer perceptual fusion that realizes an end-to-end, adaptively learned defogging process. The network contains three parts: a coding network, an adaptive enhancement network, and a decoding network; its structure is shown in Fig. 1.

Dynamic Residual Components (DRC)
In this paper, Dynamic Residual Components, composed mainly of a dynamic dense residual module, convolutional layers, a channel attention module, and a pixel attention module, are designed to solve the problem of uneven haze distribution in different scenes.

Weights dynamic aggregation module
In this paper, a weight dynamic aggregation module is designed to dynamically generate convolution kernel weights and improve the expressive power of the convolutional neural network. The process can be described as follows: the module feeds the input image into an average pooling layer, passes the pooled output through a fully connected layer and then a softmax layer to obtain a set of attention weights {α1, α2, ⋯, αn}, and uses these weights to aggregate the n parallel convolution kernels into the final dynamic kernel.
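The aggregation described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the fully connected layer is reduced to a single weight matrix, and `fc_weights` and the kernel bank shapes are assumptions for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def aggregate_kernels(feature_map, kernels, fc_weights):
    """Aggregate n parallel convolution kernels into one dynamic kernel.

    feature_map: (C, H, W) input features
    kernels:     (n, k, k) bank of n candidate kernels
    fc_weights:  (n, C) hypothetical fully connected layer weights
    """
    # Global average pooling over the spatial dimensions.
    pooled = feature_map.mean(axis=(1, 2))         # (C,)
    # Fully connected layer followed by softmax -> attention weights.
    attn = softmax(fc_weights @ pooled)            # (n,), sums to 1
    # Weighted sum of the kernel bank: W = sum_i attn_i * W_i
    return np.tensordot(attn, kernels, axes=1)     # (k, k)
```

Because the softmax weights sum to 1, the aggregated kernel is a convex combination of the bank, generated anew for every input sample, which is what gives the module its sample-adaptive behavior.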

Dynamic dense residual module
In this paper, we design a dynamic dense residual module, which reduces the loss of shallow-layer information in the network, retains multi-level information, enhances the performance of the model, and ensures the flow of information. Its structure is shown in Fig. 3. The input foggy feature image passes through three densely connected dynamic convolutional layers; since every dynamic convolutional layer is densely connected to the others, the information in each layer of the network is largely retained, substantially improving the characterization ability of the defogging model.
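A minimal sketch of the dense-plus-residual wiring follows. It is an assumption-laden stand-in: the dynamic convolutions are simplified to 1x1 channel mixes (plain matrix multiplies), and the weight shapes are hypothetical; only the connectivity pattern mirrors the module described above.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def dense_residual_block(x, layers):
    """Dense connections among three simplified 1x1 'conv' layers.

    x:      (C, H, W) feature map
    layers: list of weight matrices; layer i maps the channel-wise
            concatenation of all earlier outputs back to C channels.
    """
    feats = [x]
    for w in layers:
        inp = np.concatenate(feats, axis=0)        # dense connection
        out = relu(np.tensordot(w, inp, axes=1))   # 1x1 conv as channel matmul
        feats.append(out)
    return feats[-1] + x                           # residual connection
```

Each layer sees the concatenation of every earlier output, so shallow features reach the deep layers directly; the final residual add preserves the original input path.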

Dual Attention Module
Most image defogging networks do not clearly distinguish between image features, resulting in poor defogging results. Here, the combination of channel attention [5] and pixel attention [6] extends the expressive ability of the convolutional neural network and provides additional flexibility to deal with different fog concentrations; its structure is shown in Fig. 4.
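The two attention mechanisms can be sketched as follows. This is an illustrative NumPy reduction, assuming the common forms of channel and pixel attention (global-pool + FC + sigmoid, and 1x1 conv + sigmoid respectively); the weight shapes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w):
    """Scale each channel by a weight learned from its global average.
    x: (C, H, W); w: (C, C) hypothetical FC weights."""
    pooled = x.mean(axis=(1, 2))                   # (C,)
    scale = sigmoid(w @ pooled)                    # (C,), each in (0, 1)
    return x * scale[:, None, None]

def pixel_attention(x, w):
    """Scale each spatial position by a per-pixel weight.
    x: (C, H, W); w: (1, C) hypothetical 1x1 conv weights."""
    mask = sigmoid(np.tensordot(w, x, axes=1))     # (1, H, W)
    return x * mask

def dual_attention(x, w_c, w_p):
    # Channel attention first, then pixel attention.
    return pixel_attention(channel_attention(x, w_c), w_p)
```

Channel attention reweights whole feature channels, while pixel attention reweights each spatial location, so regions of different fog concentration receive different gains.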

Dynamic Skip-Connected Feature Fusion Component (DSCFF)
To reduce the model's information loss in the coding stage, this paper designs the Dynamic Skip-Connected Feature Fusion Component shown in Fig. 5. It adaptively fuses the foggy image features and combines the fused features with the output of the sampling layer, reducing the loss of texture and detail in the output image.
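One simple way such an adaptive fusion can be realized is a learnable gate that mixes the skip feature with the sampling-layer output. This gated form is an assumption for illustration only; the paper does not specify the exact fusion operator.

```python
import numpy as np

def fuse_skip(encoder_feat, upsampled_feat, alpha):
    """Adaptively fuse an encoder-stage feature with the sampling-layer
    output via a learnable scalar gate (hypothetical form).

    encoder_feat, upsampled_feat: (C, H, W) feature maps
    alpha: unconstrained learnable parameter, squashed to (0, 1)
    """
    a = 1.0 / (1.0 + np.exp(-alpha))   # keep the gate in (0, 1)
    return a * encoder_feat + (1.0 - a) * upsampled_feat
```

During training the gate can drift toward whichever source carries more useful texture, which is one way to recover detail otherwise lost in downsampling.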

Decoder Module Based on SOS Enhanced Modeling
The SOS enhancement model is a variant of the boosting algorithm with a better signal-to-noise ratio in image processing. Therefore, this paper introduces it into the decoding network to refine the output fog-containing feature images, so that the fog concentration in the final output defogging map is lower and image details are more obvious. The model can be expressed as:

J^(n+1) = g(I + J^n) − J^n    (2)

where J^n denotes the estimated image at the n-th iteration, g denotes the defogging method, and I + J^n denotes the input image I strengthened by the current estimate.
From equation (2), it can be seen that the fog concentration of J^(n+1) is lower than that of J^n. Therefore, a decoder module based on the SOS enhancement model is designed at the last convolutional layer of the decoding network, which improves the image defogging effect. The structure of this module is shown in Fig. 6.
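The SOS iteration is compact enough to state directly in code. The sketch below takes the defogging operator g as an arbitrary function argument, since in the paper g is realized by the learned network.

```python
import numpy as np

def sos_step(I, J_n, g):
    """One SOS (strengthen-operate-subtract) boosting step:

        J^(n+1) = g(I + J^n) - J^n

    I:   hazy input image
    J_n: current dehazed estimate
    g:   the defogging operator (a stand-in callable here)
    """
    return g(I + J_n) - J_n
```

A quick sanity check: if g is the identity, the step returns exactly I, i.e. a do-nothing operator neither helps nor accumulates error across iterations.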

Loss function
In this paper, the network model is trained with the L2 loss (MSE), which computes the sum of squared differences between the clear fog-free image and the defogged image produced by the network. L2 loss is commonly used as an image reconstruction loss; compared with L1 loss, it is more sensitive to outliers, converges faster, and finds a closer and more stable solution. The L2 loss is defined as:

L2 = (1/n) Σ_i (f(x_i) − y_i)²

where n is the total number of samples in the training set, f(x_i) is the clear fog-free image, and y_i is the defogged image produced by the network; the L2 loss supervises model optimization through the sum of squared differences between f(x_i) and y_i. In this paper, we adopt a strategy of dynamically adjusting the learning rate to ensure that the model reaches the optimal solution: the initial learning rate is set to 0.0001 and is adjusted dynamically by monitoring changes in the validation loss. The dynamic learning rate effectively prevents the network from oscillating near the optimum and accelerates convergence in the right direction.
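The loss and the plateau-based learning-rate schedule described above can be sketched as follows. The reduction factor and patience values are assumptions (the paper only states that the rate is adjusted when validation loss stops improving), and the scheduler is a minimal stand-in rather than the authors' implementation.

```python
import numpy as np

def l2_loss(clear, dehazed):
    """Mean squared error between the clear image f(x) and the network output y."""
    return np.mean((clear - dehazed) ** 2)

class PlateauLR:
    """Reduce the learning rate when validation loss stops improving.

    factor and patience are illustrative assumptions.
    """
    def __init__(self, lr=1e-4, factor=0.5, patience=2):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best, self.bad = float("inf"), 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.bad = val_loss, 0   # improvement: reset counter
        else:
            self.bad += 1
            if self.bad > self.patience:        # plateau: shrink the rate
                self.lr *= self.factor
                self.bad = 0
        return self.lr
```

Starting from 1e-4 as in the paper, a sustained plateau in validation loss halves the rate, damping oscillation near the optimum.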

Data sets and evaluation indicators
In this paper, we use the public RESIDE dataset to train the network: 15,000 outdoor hazy/clear image pairs from RESIDE are selected as the training set, and 2 real fog-containing transmission line images are selected as the test set.
To evaluate the defogging performance of the proposed method, peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are selected for quantitative, objective evaluation. The experimental environment parameters are shown in Table 1.
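Of the two metrics, PSNR has a closed form worth stating. The sketch below assumes 8-bit images (peak value 255); SSIM is omitted since it involves windowed statistics beyond a short example.

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference (clear)
    image and a dehazed result; higher means closer to the reference."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

For example, a uniform error of 10 gray levels gives MSE = 100 and hence PSNR = 10·log10(255²/100) ≈ 28.1 dB.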

Experimental results
In this paper, comparison experiments are conducted against current representative or better-performing methods; the experimental results are shown in Table 2, and the output images are shown in Fig. 7. The comparison shows that the image generated by the DCP algorithm differs greatly from the target, with large color deviation. The CAP algorithm causes the image to lose part of its original color, with mediocre rendering and relatively low contrast. The Retinex algorithm darkens image colors and turns darker regions gray. The DehazeNet algorithm gives the foggy image darker tones and loses part of the original image information. The method proposed in this paper improves contrast, has stronger robustness, and recovers the image better.

Conclusion
In this paper, an adaptive defogging method for transmission line images based on multilayer perceptual fusion is proposed, which outputs fog-free, clear images end-to-end. The structure of the convolutional neural network is improved with Dynamic Residual Components (DRC) and Dynamic Skip-Connected Feature Fusion Components (DSCFF), which adaptively adjust the model parameters according to different foggy conditions and improve robustness. The foggy image features are dynamically fused, retaining the effective components of each layer and enhancing the expressive ability of the model. A decoder module based on the SOS enhancement model is introduced into the decoding network to further improve the defogging effect. Finally, experiments verify that the proposed method outperforms current defogging methods of more advanced performance, with stronger robustness and better image recovery.

Fig. 1.
The coding network contains four convolutional layers with a feature dimension of 128, performing initial feature extraction and 4x downsampling on the foggy image. The adaptive enhancement network consists of six Dynamic Residual Components (DRC) and two Dynamic Skip-Connected Feature Fusion Components (DSCFF) to adaptively enhance the image features. The decoder consists of three convolutional layers, the first two performing deconvolution, while the last is a decoder based on the SOS enhancement model, which outputs the final defogged transmission line image. This structure largely retains the effective information in the image, accelerates model convergence, lowers the fog concentration in the final defogged image, and effectively solves the defogging problem.

Figure 1. Image adaptive fog removal model based on multi-layer perception fusion.

Figure 6. Decoder module structure diagram based on the SOS enhancement model.
Figure 7. Fog removal effect of different methods.