Fisheye Rectified Image Super-resolution Using Deep Symmetrical Neural Network

Driven by the rapid development of imaging technologies, fisheye cameras have been widely adopted. However, the resolution of images captured by fisheye cameras is limited due to lens distortion. In this work, we use a deep symmetrical network model to address the task of single-image super-resolution for fisheye rectified images. Experiments show that the proposed method achieves satisfactory results for fisheye rectified image super-resolution.


Introduction
Image super-resolution is an active research area with applications in safety surveillance [1], remote sensing [2], and image inpainting [3], among others. Fisheye cameras are often used to obtain panoramic images with a large field of view. However, the resolution of fisheye images is relatively low due to lens distortion. Little research has been conducted on improving the resolution of fisheye images or similar types of images [4][5][6]. In a fisheye image, the resolution varies across regions, which increases the difficulty of super-resolution. Recently, deep-learning-based approaches have proved effective for image processing tasks, which provides new ideas for fisheye image super-resolution. A convolutional neural network can be trained on a large number of samples so that it learns the mapping from an input fisheye rectified image to the corresponding high-resolution result; the trained model can then be applied to new images.
In this paper, we propose a deep symmetrical network model to improve the resolution of fisheye rectified images. Experiments show that our approach generates satisfactory results. The proposed method is described below.

Method
Our model is built as a symmetrical deep network with an encoding process and a decoding process. The number of feature maps and the kernel size for each network layer are shown in Table 1. The input of the model is the original fisheye rectified image, and the output is the corresponding super-resolution result.
The encoding process consists of convolution and pooling layers, which process the input fisheye rectified image and generate feature maps at various resolutions. As the receptive field grows with depth, higher layers contain more global information, while lower layers retain detailed local information. The decoding process, in turn, takes the feature maps generated by the encoding process, raises their resolution, and produces the super-resolution result. Because the input of the network is similar to the output, we add a residual link from the input to the output to accelerate convergence. The loss function used in training is the L2 loss between the output image and the corresponding ground-truth image.
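The resolution bookkeeping, the residual link, and the L2 loss can be illustrated with a minimal pure-Python sketch. It is not the trained network: the convolution layers are omitted, the helper names (max_pool2, upsample2, add_residual, l2_loss) are hypothetical, and two details are assumptions not stated in the text, namely that the input is brought to the output size by nearest-neighbour upsampling before the residual is added, and that the L2 loss is normalized by the number of pixels.

```python
def max_pool2(img):
    """Encoder step: 2x2 max pooling with stride 2, halving each dimension."""
    h, w = len(img), len(img[0])
    return [[max(img[i][j], img[i][j + 1], img[i + 1][j], img[i + 1][j + 1])
             for j in range(0, w - 1, 2)]
            for i in range(0, h - 1, 2)]


def upsample2(img):
    """Decoder step: nearest-neighbour 2x upsampling, doubling each dimension."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]  # repeat each pixel horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def add_residual(base, residual):
    """Residual link: the network predicts a correction that is added to the base."""
    return [[b + r for b, r in zip(b_row, r_row)]
            for b_row, r_row in zip(base, residual)]


def l2_loss(output, target):
    """Mean of squared pixel differences (normalization is an assumption)."""
    total, count = 0.0, 0
    for o_row, t_row in zip(output, target):
        for o, t in zip(o_row, t_row):
            total += (o - t) ** 2
            count += 1
    return total / count


# Toy 2x2 "low-resolution" input and a 4x4 "ground truth".
low_res = [[1.0, 2.0], [3.0, 4.0]]
base = upsample2(low_res)                       # assumed skip path, 4x4
zero_residual = [[0.0] * 4 for _ in range(4)]   # stand-in for the decoder output
prediction = add_residual(base, zero_residual)
ground_truth = [[v + 0.5 for v in row] for row in base]
print(l2_loss(prediction, ground_truth))
```

With a zero residual the prediction equals the upsampled input, so the network only has to learn the difference from the ground truth, which is the usual motivation for a residual link.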

Experiments
First, fisheye rectified images are generated using the general fisheye model [7] with various parameter settings, from which simulated training and testing sets are built. The training and testing sets contain 20,000 and 1,000 samples, respectively. The resolutions of the input fisheye rectified image and the output are 128×128 and 256×256, respectively. Each training sample consists of a fisheye rectified image and the corresponding high-resolution ground truth.
Our experiments are implemented in Caffe [8]. The initial learning rate is 1e-4 and training runs for 300K iterations. After training, we evaluate the proposed model on the testing set. Figure 1 illustrates the performance of the proposed method (from left to right: low-resolution fisheye rectified image, corresponding high-resolution ground truth, and super-resolution result of the proposed method). To further assess our method, we compare the proposed model with the widely used super-resolution algorithm DRCN [9]. DRCN is not designed to handle the loss of resolution caused by lens distortion. The comparison shows that the proposed model recovers details better and effectively improves the quality of fisheye rectified images.
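To make the role of the fisheye model concrete, the following sketch maps points of a rectified (perspective) image back to fisheye image coordinates and measures how strongly rectification stretches the image along the radius. It assumes a polynomial radial model r(θ) = f (k1 θ + k2 θ³), which is one common way to parameterize a generic fisheye lens; the exact form and parameter ranges used with [7] are not given in the text, and the function names and default values here are hypothetical.

```python
import math


def fisheye_radius_from_rectified(r_p, f=1.0, k1=1.0, k2=0.0):
    """Fisheye radius for a rectified-image radius r_p (assumed polynomial model).

    In the rectified (perspective) image r_p = f * tan(theta), so the
    incidence angle is theta = atan(r_p / f).
    """
    theta = math.atan(r_p / f)
    return f * (k1 * theta + k2 * theta ** 3)


def rectified_to_fisheye(x, y, f=1.0, k1=1.0, k2=0.0):
    """Map a rectified-image point to the fisheye image.

    Coordinates are relative to the principal point; the direction from the
    centre is preserved and only the radius changes.
    """
    r_p = math.hypot(x, y)
    if r_p == 0.0:
        return 0.0, 0.0
    scale = fisheye_radius_from_rectified(r_p, f, k1, k2) / r_p
    return x * scale, y * scale


def radial_stretch(r_p, f=1.0, k1=1.0, k2=0.0, eps=1e-5):
    """Rectified pixels per fisheye pixel along the radius (d r_p / d r_f).

    Estimated with a central finite difference; larger values mean the
    rectified image is interpolated from fewer fisheye pixels.
    """
    hi = fisheye_radius_from_rectified(r_p + eps, f, k1, k2)
    lo = fisheye_radius_from_rectified(r_p - eps, f, k1, k2)
    return (2.0 * eps) / (hi - lo)


# Stretch at the image centre and at increasing distances from it.
for r in (0.0, 0.5, 1.0, 2.0):
    print(r, round(radial_stretch(r), 3))
```

Under the default parameters (an equidistant lens, k1 = 1 and k2 = 0) the stretch equals 1 + (r_p / f)², so it is about 1 at the centre and about 2 at r_p = f. This is consistent with the observation that the effective resolution of a fisheye rectified image varies across regions.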

Conclusions
In this paper, we design a deep symmetrical network to improve the resolution of fisheye rectified images. Experiments show that our model achieves satisfactory super-resolution results. In future work, we will improve the network architecture and the training strategy.