Logo Image Super-Resolution Reconstruction Method Based on an Improved Generative Adversarial Network

Existing learning-based super-resolution reconstruction methods rely heavily on a known degradation model. When the degradation model is unknown, the reconstructed image suffers from noise and jagged edges. To solve this problem, an image super-resolution reconstruction method based on an improved generative adversarial network is proposed. The generator of the network consists of an upsampling module, a denoising module and an anti-aliasing module. First, the input image is upsampled by the upsampling module to generate an initial super-resolution image; then, the denoising module and the anti-aliasing module are used to reconstruct a clear super-resolution image. To reconstruct a better logo image, a joint training loss is introduced that includes a content loss and an adversarial loss, where the content loss comprises a perceptual loss and an edge loss. The experimental results show that, compared with the existing super-resolution reconstruction method based on a generative adversarial network (SRGAN), the peak signal-to-noise ratio (PSNR) of the reconstructed image is improved by 0.2675 dB and the structural similarity (SSIM) is improved by 0.035, which effectively improves the quality of logo image reconstruction.


Introduction
In recent years, the field of image super-resolution has developed rapidly. Image super-resolution reconstruction based on deep learning has become the mainstream approach because it greatly improves both quality and speed compared with traditional methods [1]. Deep learning provides an end-to-end way of learning the mapping needed for super-resolution. Dong et al. [2] were the first to apply a deep convolutional neural network to image super-resolution (SRCNN), which brought a qualitative leap in the reconstruction of color images. Kim et al. [9] introduced the residual network into super-resolution and used the residual information of the image for reconstruction (VDSR). Ledig et al. [10] applied the generative adversarial network (GAN) to image super-resolution, which greatly improved the reconstruction effect. Muhammad et al. [11] implemented the traditional super-resolution method IBP (iterative back projection) as a deep network, the deep back-projection network (DBPN). Yong et al. [26] applied a closed-loop dual regression network (DRN) to image super-resolution and addressed the problem of the unknown degradation model. However, these methods target the super-resolution reconstruction of large-scale color images and are not specifically designed for logo images.
A logo image differs from an ordinary color image in that, after reconstruction, its colors must be uniform and free of color-block noise, and its edges must be smooth and sharp enough for later reuse. The reconstruction of an ordinary color image is judged mainly by the subjective perception of the human eye, so it tolerates some noise and edge problems. At present, most super-resolution reconstruction networks rely on the degradation model; because the degradation model of a logo image is unknown, the reconstruction effect cannot be guaranteed [3], and the noise and edge problems are difficult to solve. To address this, this paper proposes a logo image super-resolution reconstruction method based on an improved generative adversarial network. The generator of the network consists of an upsampling module, a denoising module and an anti-aliasing module. First, the input low-resolution image is upsampled by a factor of 4 by the upsampling module to generate an initial super-resolution image; then, a clear super-resolution image is reconstructed by the denoising module and the anti-aliasing module. Although the degradation model of the low-resolution image is unknown, the noise and jagged edges that result from it can be handled directly by these two modules. In the discriminator, a joint training loss is introduced, including a perceptual loss, an edge loss and an adversarial loss [5]. These losses constrain the final image at the pixel level and at the level of high-frequency detail, respectively, so that the reconstructed image is closer to the real high-resolution image. The innovations of this paper are as follows: 1) to address the noise and edge problems caused by the unknown degradation model of logo images, denoising and anti-aliasing modules are introduced into the generator of the adversarial network; 2) a joint training loss is used in the discriminator so that the reconstructed image has better quality.

Related Work
In recent years, image super-resolution based on deep learning has received great attention. Dong et al. [12] first proposed image super-resolution reconstruction using a three-layer convolutional network. Kim et al. [9] deepened this three-layer network and learned the residual information of the image, which improved the performance of super-resolution reconstruction. Ledig et al. [10] proposed using generative adversarial networks to reconstruct images through adversarial learning. These networks target ordinary color images. For logo images, they do not reconstruct well because the degradation model is unknown.
For image denoising, traditional methods such as the Gaussian filter, median filter and bilateral filter are each tailored to particular types of noise, whereas CNN-based denoising can handle many types of noise, which is of great help in reconstructing high-quality logo images.
Anti-aliasing mainly involves edge smoothing [4]. Super-resolution reconstruction tends to produce jagged edges because of the inherent nature of pixel-wise reconstruction. Traditional methods mostly smooth by filtering, but this filters out many important details of the image. CNN-based anti-aliasing can avoid this and make the edges of the image smoother and clearer.
Little attention has been paid to combining super-resolution, denoising and anti-aliasing. Xiao et al. [16] proposed a deep fully convolutional encoder-decoder framework to solve the problem of denoising and super-resolution image restoration. However, the network has many layers, its running time is high for logo images, and it is not specifically suited to them. Yang [19] proposed a super-resolution method based on sparse transformation. Traditional methods such as Gaussian blur and bilinear enlargement are also used for image anti-aliasing [20]. However, such multi-stage methods increase the error of the super-resolution image and are not accurate enough for the reconstruction of logo images.

Generator network
The generator network is composed of an upsampling module, a denoising module and an anti-aliasing module. The upsampling module is shown in Fig. 2, and the denoising and anti-aliasing modules are shown in Fig. 3 and Fig. 4. In the first convolution layer of the upsampling module, a 5 × 5 convolution kernel is used to extract a large number of rich features, because the size of the convolution kernel determines the size of the receptive field and a large kernel gives a large receptive field. To extract enough effective features while reducing the computational cost of the network [8], a residual network structure is used. Each residual block contains two 3 × 3 convolution layers, and ReLU is used as the activation function. After that, two 6 × 6 deconvolution layers are used to enlarge the feature size. Finally, a 3 × 3 convolution layer produces the final 3-channel upsampled image.
The denoising module uses a convolution-deconvolution network structure similar to that in [21]. The difference is that this network has 10 layers: 5 convolution layers and 5 deconvolution layers. Logo images are not very rich in detail, so compared with the 20-layer convolution in [22], the 10-layer structure achieves a better effect with a module of moderate size. A 3 × 3 convolution kernel is used in both the convolution and deconvolution layers, and the number of channels is 64. Skip connections link the convolution layers and the deconvolution layers symmetrically, which lets the signal propagate back to the lower layers, avoids vanishing gradients and speeds up training. At the same time, the network preserves as much important information as possible [23].
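As a quick check of the upsampling path described above, the following sketch traces the spatial size through two 6 × 6 deconvolution layers. The stride of 2 and padding of 2 are assumptions, chosen so that each layer doubles the size; the text does not state them. The function names are hypothetical.

```python
def deconv_output_size(size, kernel, stride, padding):
    """Output size of a transposed convolution (no output padding, dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel


def upsampled_size(lr_size, kernel=6, stride=2, padding=2, num_layers=2):
    """Trace the spatial size through the stacked deconvolution layers.

    stride=2 and padding=2 are assumed values, not taken from the paper.
    """
    size = lr_size
    for _ in range(num_layers):
        size = deconv_output_size(size, kernel, stride, padding)
    return size


print(upsampled_size(100))  # 100 -> 200 -> 400
```

Under these assumed settings, a 100 × 100 input reaches 400 × 400, which matches the 4× factor used in the paper.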
The anti-aliasing module adopts a two-layer convolution structure with 3 × 3 convolution kernels and 3 channels. The first convolution layer uses a Gaussian filter as its kernel, and the second uses a Butterworth high-pass filter as its kernel. The anti-aliased output is the final image reconstructed by the generator.
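The two filters named above can be illustrated with a small sketch. It builds a normalized 3 × 3 Gaussian kernel, applies it to a single-channel image, and evaluates the standard Butterworth high-pass frequency response. The values of sigma, the cutoff and the order are assumptions, and the text does not say how the Butterworth response is turned into a 3 × 3 spatial kernel, so only the response itself is shown.

```python
import math


def gaussian_kernel(size=3, sigma=1.0):
    """Return a normalized size x size Gaussian kernel as a list of rows."""
    half = size // 2
    raw = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-half, half + 1)]
           for y in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def butterworth_highpass(distance, cutoff, order=2):
    """Butterworth high-pass response at a distance from the zero frequency.

    Written as 1 - lowpass so that distance = 0 needs no division by zero.
    """
    return 1.0 - 1.0 / (1.0 + (distance / cutoff) ** (2 * order))


def convolve_same(image, kernel):
    """Filter a 2-D list with an odd-sized kernel, replicating border pixels.

    This is a correlation; for a symmetric kernel it equals convolution.
    """
    h, w, half = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    acc += image[ii][jj] * kernel[di + half][dj + half]
            out[i][j] = acc
    return out


# A hard vertical step edge becomes a gradual ramp after Gaussian smoothing.
step = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
smoothed = convolve_same(step, gaussian_kernel())
```

The Gaussian stage softens jagged transitions, while a high-pass stage responds to the remaining edge detail; how the paper combines them inside the module is not specified beyond the layer order.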

Discriminator network
The discriminator network is composed of two sub-networks, an upper network and a lower network. The high-resolution image reconstructed by the generator and the real high-resolution image are used as input. The upper network consists of five convolution layers with 4 × 4 convolution kernels and 64, 128, 256 and 512 channels, and LeakyReLU is used as the activation function. The lower network consists of three convolution layers that use the Sobel operator as the convolution kernel, with a kernel size of 3 × 3 and 3 channels [24]. At the end of the network, a dense (fully connected) layer and a sigmoid activation function output the similarity between the reconstructed high-resolution image and the real high-resolution image.

Loss Function
The loss function of this method is a joint training loss that includes a content loss and an adversarial loss. The content loss comprises a perceptual loss and an edge loss. The perceptual loss corresponds to the loss function of the generator. Most learning-based super-resolution reconstruction networks are constrained only at the output layer of the network; here, each of the three modules of the generator is constrained using the mean squared error as the loss function [25]. In this way, the upsampling module provides a better starting point for the denoising module, and the denoising module in turn provides a better input for anti-aliasing. The modules constrain one another to achieve a better reconstruction effect. The perceptual loss is calculated as shown in formula (1).
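The idea of constraining every generator module, rather than only the final output, can be sketched as follows. This is a minimal illustration that assumes each module output is compared with the same HR target and that the three terms are weighted equally; neither detail is confirmed by the text, and the function names are hypothetical.

```python
def mse(pred, target):
    """Mean squared error between two equally sized 2-D lists."""
    diffs = [(p - t) ** 2
             for row_p, row_t in zip(pred, target)
             for p, t in zip(row_p, row_t)]
    return sum(diffs) / len(diffs)


def modulewise_loss(upsampled, denoised, antialiased, target):
    """Sum of the MSE terms of the three generator modules.

    Equal weighting and a shared HR target are assumptions of this sketch.
    """
    return (mse(upsampled, target)
            + mse(denoised, target)
            + mse(antialiased, target))
```

Because every intermediate output contributes a term, an error made by the upsampling module is penalized directly instead of being left for the later modules to absorb.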

Edge loss
The edge loss corresponds to the lower network of the discriminator, the Sobel-operator convolution layers [13]. Compared with a conventional adversarial network, an additional network is added to extract the edge images of the reconstructed high-resolution image and the real high-resolution image. Calculating the edge loss between them further improves the accuracy of training. The edge loss is calculated as shown in formula (2).
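A minimal sketch of an edge loss of this kind is given below. It extracts Sobel gradient magnitudes from a single-channel reconstructed image and from the real image, then takes the mean squared difference of the two edge maps. The use of the gradient magnitude and of a squared-error distance are assumptions; the exact form of formula (2) may differ.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(image):
    """Gradient magnitude of a 2-D list on its interior pixels only."""
    h, w = len(image), len(image[0])
    edges = []
    for i in range(1, h - 1):
        row = []
        for j in range(1, w - 1):
            gx = sum(SOBEL_X[a][b] * image[i - 1 + a][j - 1 + b]
                     for a in range(3) for b in range(3))
            gy = sum(SOBEL_Y[a][b] * image[i - 1 + a][j - 1 + b]
                     for a in range(3) for b in range(3))
            row.append(math.hypot(gx, gy))
        edges.append(row)
    return edges


def edge_loss(sr, hr):
    """Mean squared difference between the Sobel edge maps of SR and HR."""
    e_sr, e_hr = sobel_edges(sr), sobel_edges(hr)
    diffs = [(a - b) ** 2
             for r1, r2 in zip(e_sr, e_hr)
             for a, b in zip(r1, r2)]
    return sum(diffs) / len(diffs)
```

A reconstruction whose edges are blurred or displaced relative to the real image produces a different edge map and therefore a larger loss, even when the pixel-wise error is small.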

Adversarial loss
In the adversarial loss, $D(\cdot)$ represents the probability that the discriminator judges the image generated by the generator to be a real high-resolution image; the discriminator and the generator each have their own network parameters, with $w$ denoting those of the generator; $I_t^{LR}$ represents the $t$-th low-resolution image; and $N$ represents the total number of input images [14].
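For illustration, the sketch below assumes that the adversarial loss takes the SRGAN-style form, the sum over the N inputs of -log D(G(I_t^LR)). This form is an assumption based on the symbols described above, not a reproduction of the paper's formula; the function name and the clamping constant eps are hypothetical.

```python
import math


def adversarial_loss(d_probs, eps=1e-12):
    """Sum of -log D(G(I_t^LR)) over the discriminator outputs.

    d_probs holds, for each of the N inputs, the probability that the
    discriminator assigns to the generated image being real. Probabilities
    are clamped at eps so that log(0) is never evaluated.
    """
    return sum(-math.log(max(p, eps)) for p in d_probs)
```

The loss falls toward zero as the discriminator is increasingly convinced that the generated images are real. Dividing by N would give the mean instead of the sum.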

Objective function
To sum up, the objective function of this method is the weighted combination of the perceptual loss, the edge loss and the adversarial loss. The calculation method is shown in formula (4).
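A weighted combination of this kind can be sketched as a single function. The default weights 0.001, 0.01 and 0.01 are the values reported in the experimental section; assigning them to the perceptual, edge and adversarial terms in that order is an assumption, and the parameter names are hypothetical.

```python
def total_loss(perceptual, edge, adversarial,
               w_perceptual=0.001, w_edge=0.01, w_adversarial=0.01):
    """Weighted sum of the three loss terms.

    The weight-to-term mapping is assumed, not stated in the paper.
    """
    return (w_perceptual * perceptual
            + w_edge * edge
            + w_adversarial * adversarial)
```

Small weights of this magnitude suggest that the raw loss terms differ in scale, so the weights mainly serve to balance their contributions.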

Data Set
This paper uses the FlickrLogos-32 data set, which contains 8240 images. From it, 500 HR images with a white background are selected, 400 as the training set and 100 as the test set. The HR images are downsampled to LR images by three different downsampling methods: bicubic, bilinear and nearest neighbor. The downsampling factor is 4, so each 400 × 400 HR image yields a 100 × 100 LR image [15]. In addition, to make the results more convincing, besides the test set extracted from FlickrLogos-32, a crawler is used to collect a large number of logo images with unknown degradation processes from the Baidu and Google platforms, forming three test sets, Set1, Set2 and Set3, with 100 images each.
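The simplest of the three downsampling methods, nearest neighbor, can be sketched in a few lines. This version keeps every fourth pixel starting from the top-left corner; image libraries may sample at slightly different positions, so it is an illustration rather than a reproduction of the paper's preprocessing. Bicubic and bilinear downsampling would normally come from an image library and are not shown.

```python
def downsample_nearest(image, factor=4):
    """Keep every factor-th pixel in both directions of a 2-D list."""
    return [row[::factor] for row in image[::factor]]


# A synthetic 400 x 400 "HR" image whose pixel value encodes its position.
hr = [[r * 400 + c for c in range(400)] for r in range(400)]
lr = downsample_nearest(hr)
print(len(lr), len(lr[0]))  # 100 100
```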

Experimental Process
The experimental environment of this paper is as follows: the operating system is Windows 10, the processor is a Core(TM) i7-7700K, the graphics card is an NVIDIA GTX 1080 Ti, and the experimental platform is CUDA 9.2, PyTorch 1.0.1 and Python 3.6.5. The generator network is first trained on the FlickrLogos-32 training set. The input is the LR image obtained by downsampling the HR image [16], and the output is the SR image reconstructed from the LR image by the generator network. Then all parameters of the generator network are retained, and the discriminator network parameters are initialized to train the discriminator network [17]. The weights in formula (4) are set to 0.001, 0.01 and 0.01. Adagrad is used as the optimizer for gradient updating. The mini-batch size is set to 128. The network parameters of the generator and the discriminator are updated alternately during training.
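For readers unfamiliar with Adagrad, the update rule can be written out directly. The sketch below implements one standard Adagrad step on a list of parameters and uses it to minimize a toy function; the learning rate is a hypothetical value, since the paper does not report one.

```python
import math


def adagrad_step(params, grads, accum, lr=0.01, eps=1e-10):
    """One Adagrad update.

    Returns the new parameters and the updated running sums of squared
    gradients. Each parameter gets its own effective step size, which
    shrinks as its squared gradients accumulate.
    """
    new_accum = [a + g * g for a, g in zip(accum, grads)]
    new_params = [p - lr * g / (math.sqrt(a) + eps)
                  for p, g, a in zip(params, grads, new_accum)]
    return new_params, new_accum


# Toy usage: minimize f(x) = x^2, whose gradient is 2x.
x, acc = [1.0], [0.0]
for _ in range(50):
    x, acc = adagrad_step(x, [2.0 * x[0]], acc, lr=0.1)
```

In alternating GAN training, the generator and the discriminator would each keep their own accumulator and take such steps in turn.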

Analysis of Experimental Results
To show the effect of this method more intuitively, four related deep-learning-based super-resolution methods are selected for comparison: SRCNN, SRResNet, SRGAN and ESRGAN. The source code of the comparison methods is obtained from projects published on GitHub: the SRCNN project is at https://github.com/fuyongXu/SRCNN_Pytorch_1.0, and the other three methods are taken from https://github.com/xinntao/BasicSR [18].
Fig. 5 compares the different methods on the same data set. The proposed method is significantly better than the other four methods, mainly in the obvious reduction of noise and the smoothing of edges. The other four networks only consider the point-to-point mapping between low-resolution and high-resolution images and do not consider the effect of the unknown degradation model [19]. As a result, they are not specifically suited to logo image reconstruction and cannot meet the need to reuse the logo image after reconstruction. In this paper, the added denoising and anti-aliasing modules remove the visual defects of the reconstructed logo image, so that it can be reused after reconstruction.
Table 1 reports the average PSNR and SSIM values over 200 test images, together with the PSNR and SSIM values of 4 randomly selected images. The results show that the reconstructed images of the proposed method achieve better PSNR and SSIM values than those of SRCNN, SRResNet, SRGAN and ESRGAN. Compared with SRGAN, the proposed method improves PSNR by 0.2675 dB and SSIM by 0.035 on average, a significant improvement in logo image reconstruction quality.
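To make the PSNR figures concrete, the metric can be computed as in the sketch below, which uses the standard definition 10 · log10(MAX² / MSE). An 8-bit value range (MAX = 255) and single-channel images are assumed; the paper does not state which color space or range its evaluation uses.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D lists."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(reference, reconstructed)
             for a, b in zip(row_a, row_b)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of a fraction of a decibel corresponds to a small but measurable reduction in mean squared error.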
In terms of efficiency, the proposed method also reaches the same level as SRGAN and the other methods, or an even better one. The training trends of the PSNR and SSIM of SRCNN, SRResNet, SRGAN, ESRGAN [21] and the proposed method are analyzed under different numbers of iterations, and the results are shown in Fig. 6. At low iteration counts, the curve of the proposed method has a larger slope than those of the other four methods, and its values are updated quickly. As the number of iterations increases [22], the PSNR and SSIM values of the proposed method tend to converge, and its convergence speed remains superior to that of the other four methods. The test results show that the proposed method has lower time complexity, faster convergence speed and better reconstruction ability for logo images.

Ablation Experiment
To verify the effectiveness of the improvements to the adversarial network, ablation experiments are carried out for the denoising module and the anti-aliasing module in the generator network, and for the fusion of the perceptual loss, edge loss and adversarial loss [24]. As shown in Table 2, after the denoising module is removed, the PSNR and SSIM of the reconstructed image drop considerably, because the upsampling module treats the noise in the LR image as image detail and reconstructs it, resulting in a poor reconstruction effect [25]. The added denoising module solves this problem and improves the quality of the reconstructed image. The same phenomenon is observed for the anti-aliasing module. For the generator, adding these two modules therefore gives the network better reconstruction ability for logo images. For the loss fusion, the PSNR and SSIM of the reconstructed image are significantly reduced after the perceptual loss, edge loss or adversarial loss is removed [26]. The fused loss in this method thus plays a large role in the quality of the reconstructed image.

Conclusion
Building on deep neural networks, this paper proposes a super-resolution reconstruction method for logo images based on an improved generative adversarial network. The main