Color Difference Optimization Method for Multi-source Remote Sensing Image Processing

This paper proposes an approach to the color inconsistency that appears after multispectral images from multiple domestic satellites are fused and mosaicked. The approach divides the high-resolution remote sensing images into regular blocks, computes a multiple linear regression block by block, and then applies a percent clip stretch. First, the remote sensing image is divided into blocks and multiple linear regression is carried out separately in each block. Second, starting from the regressed image to be corrected, percentage truncation and stretching are applied to any band whose gray values are over-concentrated, so that the gray histogram of that band resembles that of the reference image. This scheme ensures that all pixels participate in the calculation, which improves the accuracy of the model while avoiding the long processing time and memory overflow caused by the large data volume of high-resolution remote sensing images. In addition, an appropriate truncation percentage is selected for each scene according to evaluation indices, to address the excessive color difference caused by over-concentrated band gray values. The experimental results show that the proposed scheme effectively reduces both the excessive color difference and the over-concentration of gray values in image bands. Compared with traditional methods, it performs better in model accuracy, color retention and time efficiency.


Introduction
With the rapid development of satellite remote sensing technology, the problem of insufficient information from a single signal source can be overcome by using multi-source and multi-temporal satellite remote sensing images for large-scale comprehensive application and extracting comprehensive information of various remote sensing images. Radiation normalization and color correction are required before comprehensive application, so as to facilitate subsequent image splicing, fusion and comprehensive application. However, the imaging effect of remote sensing images is affected by many factors such as satellite camera quality, weather environment, lighting conditions, surface undulation, underlying surface reflection and so on, resulting in certain differences in spectral reflectance and spectral characteristics of the same features on different images [1].
The research background of this paper is that multispectral images from different satellites show large color differences when they are directly fused and mosaicked, so the color of the images needs further correction. The conventional approach first applies radiation normalization to the image; radiation normalization is divided into absolute normalization and relative normalization. Absolute normalization requires historical data such as satellite observation data and atmospheric properties, so it is difficult to apply widely. Relative normalization selects a reference image and then normalizes the gray values of the image to be processed to the reference image band by band, so that the same ground object has similar gray values in both images [2]. Compared with absolute normalization, the data needed for relative normalization are easy to obtain and the procedure is simple and efficient, so it is widely used [3]. Common relative radiation normalization methods fall into two groups: distribution-based and pixel-pair-based. Distribution-based methods perform regression according to the distribution characteristics of the whole image. Their advantage is that they start from the whole image and are highly objective. Their disadvantage is that, because normalization is performed from a global perspective, the details of local areas are not well expressed and the original spectral characteristics are easily altered, which hinders subsequent processing. Pixel-pair-based methods use various algorithms to determine specific pixel pairs in the two images, and the screened pixel pairs serve as the data for establishing the gray mapping relationship between the images. The advantage of this approach is that different pixel pairs can be selected by different algorithms, giving a better enhancement effect on specific areas.
The disadvantage is that, because gray values are calculated from pixel pairs, the method requires accurate geometric registration of the images, and the selection of pixel pairs is subjective. In addition, the selection method and the resulting set of pixel pairs directly affect the quality of the final mapping relationship [4]. With the development of domestic satellite remote sensing and the expansion of application requirements, more and more multi-source and multi-temporal remote sensing data from domestic satellites are used together. However, most current research and applications of radiation normalization target a single sensor or specific areas covered by multi-source sensors, and little research has addressed large-scale radiation normalization across multi-source sensors.
This paper studies color correction and radiation normalization of multispectral images for the color problems that arise in image fusion and mosaicking, and addresses the lack of tonal levels in image bands, over-concentrated gray distributions, and large-scale radiation normalization of different images. A relative radiation normalization method for multi-source images based on block regression is proposed, together with a color correction workflow for different images. All pixel pairs participate in the calculation of the model, which improves its accuracy. At the same time, the resulting relative radiation normalization process for multi-source remote sensing images is accurate and automated, and provides a methodological reference for the collaborative application of multi-source remote sensing images.

Radiation Normalization Method for Satellite Remote Sensing Images
Before image fusion, the two images to be fused generally need some gray-level processing so that they fuse and mosaic well. When the images are mosaicked and fused directly, large differences between the two images become apparent. It is therefore necessary to find the cause of these differences and apply a color conversion that addresses it. Many factors cause gray values to change between multi-temporal images. The common ones are: differences in relative radiometric response between sensors; changes in satellite sensor calibration over time; differences in solar illumination and observation angle; reflectance anisotropy; topography; and real changes in ground object reflectivity [5].
The technical route of image color correction in this paper has two parts: the first establishes a multiple linear regression model for relative radiation normalization between images; the second, building on the regression result, addresses the over-concentration of gray values and the low contrast in individual bands of the remote sensing images. Linear regression is a statistical analysis method that uses regression analysis to determine the quantitative relationship of interdependence between two or more variables. The specific flow is given in the flow chart.
For the first problem, conventional relative radiation normalization applies least squares linear regression directly to the image as a whole, which leads to problems such as over-fitting of the model and large noise, and the large data volume of a single scene is difficult to process at once. Therefore, in this paper the image is divided into blocks of equal width to reduce the amount of data handled at a time, and a multiple linear regression model is then established to normalize the relative radiation. Block processing has several advantages. First, noise and blocks with poor data, such as no-data blocks, can be excluded from fitting; such blocks are fitted with the model of an adjacent block, which also avoids the problem of having too few samples to fit after data removal. Second, block processing improves the correlation coefficient of the model, since each block is fitted better by its own regression model. Third, the over-fitting caused by an excessive amount of data, which would otherwise reduce precision, is eliminated.
For the second problem, the gray values of the image band are too concentrated, so the image fitted by the multiple regression model is still too concentrated, lacks tonal hierarchy and is difficult to interpret. An appropriate stretching method is therefore needed to stretch the image to a moderate contrast, which facilitates subsequent operations on the corrected satellite image such as light and color balancing, image mosaicking and image fusion.

Block Processing of Remote Sensing Images
High-resolution multispectral images are large and difficult to process: each scene exceeds 1 GB. The multiple linear regression that follows generally has to be modelled over several images in the region [6]. The amount of data is then too large, and processing it all at once requires more memory than an ordinary computer provides, resulting in memory overflow. In addition, pixel-pair-based relative radiation normalization applies simple linear regression to the whole scene; applying least squares directly to the whole scene easily causes over-fitting and noise because of the excessive amount of data, which affects the regression coefficients.
Therefore, this paper divides the image to be analysed into blocks of equal size, using equidistant partitioning with a block size of 128 x 128 pixels.
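The equidistant partitioning described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `iter_blocks` and the 300 x 260 test image are hypothetical, and it assumes that edge blocks are simply allowed to be smaller when the image size is not a multiple of the block size.

```python
import numpy as np

def iter_blocks(height, width, block=128):
    """Yield (row_slice, col_slice) pairs that tile a height x width grid.

    Assumption: blocks on the right and bottom edges are truncated when the
    image size is not a multiple of `block`.
    """
    for r0 in range(0, height, block):
        for c0 in range(0, width, block):
            yield (slice(r0, min(r0 + block, height)),
                   slice(c0, min(c0 + block, width)))

# Hypothetical single-band image of 300 x 260 pixels.
image = np.arange(300 * 260, dtype=np.float32).reshape(300, 260)
blocks = list(iter_blocks(*image.shape))
print(len(blocks))  # 3 block rows x 3 block columns = 9
```

Working with slices rather than copies keeps memory use low, since each block is a view into the original array.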

Establishment of Multiple Linear Regression Model
Multiple linear regression models the relationship between one dependent variable and multiple independent variables. The general multiple linear regression model is usually expressed as [7][8]:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon$$
where n is the number of independent variables, \beta_0, ..., \beta_n are the regression coefficients and \varepsilon is the random error, which is divided into explainable error and unexplained error. When a given band of the reference image is fitted against the R, G and B bands of the image to be corrected, a separate regression model is established for each band.
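As an illustration of how such a model can be estimated, the sketch below fits the coefficients by ordinary least squares with NumPy. The function names `fit_mlr` and `predict_mlr` and the synthetic samples are assumptions made for the example; the paper does not specify its solver.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares estimate of [b0, b1, ..., bn] for y = b0 + X @ b + e.

    X has shape (samples, n); y has shape (samples,).
    """
    A = np.column_stack([np.ones(X.shape[0]), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict_mlr(beta, X):
    """Apply a fitted model to new samples."""
    return beta[0] + X @ beta[1:]

# Synthetic example: three predictor bands and a noise-free target band.
rng = np.random.default_rng(0)
X = rng.uniform(0, 255, size=(500, 3))
y = 5.0 + 0.2 * X[:, 0] + 0.7 * X[:, 1] + 0.1 * X[:, 2]
beta = fit_mlr(X, y)  # approximately [5.0, 0.2, 0.7, 0.1]
```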

Linear Stretching of Image Gray Band
For the image after regression, the gray values of the fitted result are still concentrated to some extent, because the gray values of a certain band of the image to be corrected are too concentrated, so stretching is needed. When the same area of the two images is observed, a certain band of the image to be corrected is clearly more concentrated than in the reference image. When the two bands are compared separately, the image to be corrected shows weaker color contrast than the reference image, and its histogram shows more concentrated gray values.
Different situations call for different stretching methods.
Maximum-value stretching: a linear stretch based on the maximum and minimum gray values of the current histogram, mapped onto the minimum and maximum output pixel values. It is suitable for raster images with a dense pixel value distribution.
Standard deviation stretching: linearly stretches the gray values that remain after the extreme values of the image are removed, similar to percent clip stretching. The difference is that the cut-off is set from the standard deviation: the larger the standard deviation, the greater the dispersion of the histogram; the smaller the standard deviation, the closer most gray values lie to the mean and the lower the dispersion. It works well on raster datasets with a darker overall tone.
Percent clip stretching (percentage truncation): applies a linear stretch between the pixel values at the minimum and maximum clipping percentages. It is suitable for enhancing remote sensing images that are dark overall.
Histogram equalization: maps the gray values of the original image to the gray range of the new image through a transformation function, which essentially reduces the number of gray levels in exchange for higher contrast. Its advantage is a better overall contrast enhancement.
Each of the above methods has its own advantages and disadvantages, so the most suitable stretching method has to be selected through experiments. In this paper, four commonly used gray stretching methods are selected for the experiments: 2% linear stretching, percentage truncation stretching, histogram matching and histogram equalization.
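The stretching methods listed above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the output range is 8-bit (0 to 255), the standard deviation stretch cuts at the mean plus or minus `k` standard deviations with `k=2` chosen arbitrarily, and `hist_equalize` expects a non-negative integer band. All function names are hypothetical.

```python
import numpy as np

def linear_stretch(band, lo, hi, out_max=255.0):
    """Map [lo, hi] linearly onto [0, out_max]; values outside are clipped."""
    lo, hi = float(lo), float(hi)
    band = band.astype(np.float64)
    if hi <= lo:  # flat band: nothing to stretch
        return np.zeros_like(band)
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0) * out_max

def minmax_stretch(band):
    """Maximum-value stretch between the band minimum and maximum."""
    return linear_stretch(band, band.min(), band.max())

def std_stretch(band, k=2.0):
    """Stretch between mean - k*std and mean + k*std (k is an assumption)."""
    m, s = band.mean(), band.std()
    return linear_stretch(band, m - k * s, m + k * s)

def percent_clip_stretch(band, pct=2.0):
    """Stretch between the pct-th and (100 - pct)-th percentiles."""
    lo, hi = np.percentile(band, [pct, 100.0 - pct])
    return linear_stretch(band, lo, hi)

def hist_equalize(band, levels=256):
    """Histogram equalization for an integer band with values in [0, levels)."""
    hist = np.bincount(band.ravel(), minlength=levels)
    cdf = hist.cumsum() / band.size          # cumulative distribution in [0, 1]
    return cdf[band] * (levels - 1)

# Hypothetical low-contrast band with gray values concentrated in 20..59.
band = np.random.default_rng(0).integers(20, 60, size=(64, 64)).astype(np.uint8)
stretched = percent_clip_stretch(band, pct=2.0)
```

All three linear variants share `linear_stretch` and differ only in how the cut-off values are chosen, which makes them easy to compare on the same band.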

Experiment and Analysis
In this paper, the GF-2 image of October 29, 2017 and the BJ-2 image of October 22, 2017 along Poyang Lake in Jiangxi Province are selected as the reference image and the image to be corrected, respectively. The scene includes water, forest, arable land and construction land, and most of it is covered by vegetation.

Multi-band Block Multiple Linear Regression
Remote sensing images are generated by multispectral cameras carried by satellites, and the quality and spectral range of the cameras affect the final imaging [10]. Since the images selected in this paper are remote sensing images of the same area acquired by different satellites at close dates, sensor differences between the satellites are the main reason for the difference in image radiation. The band ranges of the two satellite sensors differ, as shown in Table 1. The main difference lies in the red and near-infrared bands; the shorter and narrower near-infrared band of the GF-2 satellite image is the main reason for the difference in radiation. In addition, the histogram parameters of the two images in Table 2, such as gray mean, mode and standard deviation, show a large gap between the two images. It is therefore necessary to adjust the color of the images first and then perform image fusion and mosaicking.
Multi-band regression fitting was carried out on the experimental images using the block-based multiple linear regression model. The blue and green bands need only be fitted against the corresponding blue and green bands of GF-2, because the wavelength ranges of GF-2 and BJ-2 are basically the same in these bands. The red band, however, differs considerably from that of BJ-2 in wavelength range and is mixed with the adjacent green and near-infrared bands. It is therefore necessary to determine which bands to use according to the correlation coefficient R. Table 3 lists the normalization equations of the band combinations for the study area as a whole. According to Table 3, multiple linear regression using the GF-2 green, red and near-infrared bands gives the best fit, so the red band images are regressed on the green, red and near-infrared bands.
To avoid the over-fitting caused by too much data when the image is regressed directly, the image is divided into uniform blocks, and the number of blocks depends on the size of the image. The minimum block size is 64x64 pixels, and the maximum, which can be adjusted as needed, is 512x512 pixels. Too much data leads to over-fitting and reduces the correlation of the regression models [11]. In noisy areas where the noise is obvious, the noisy data can be removed before fitting. In areas where the noise is indistinct and hard to separate, regression can be fitted on subdivided inner areas, and the regression equation of the inner areas is then used as the regression equation of the enclosing block. Generally, a noise-free region is divided evenly into four blocks, linear regression and fitting are carried out separately in each block, and finally the blocks are merged into a complete image.
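Choosing a band combination by the correlation coefficient R can be sketched as below. The helper names `multiple_r` and `rank_band_combinations`, the band labels and the synthetic data are all hypothetical; R is computed here as the square root of the coefficient of determination of an ordinary least-squares fit, which is an assumption about how the paper defines it.

```python
from itertools import combinations

import numpy as np

def multiple_r(X, y):
    """Multiple correlation coefficient R of an ordinary least-squares fit."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    ss_res = ((y - A @ beta) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return float(np.sqrt(max(0.0, 1.0 - ss_res / ss_tot)))

def rank_band_combinations(bands, target):
    """Fit every non-empty combination of predictor bands against `target`.

    bands: dict mapping a band name to a 1-D array of pixel values.
    Returns a list of (R, band_names) sorted from best to worst.
    """
    results = []
    for k in range(1, len(bands) + 1):
        for names in combinations(bands, k):
            X = np.column_stack([bands[n] for n in names])
            results.append((multiple_r(X, target), names))
    return sorted(results, reverse=True)

# Synthetic pixels: the target depends on all three candidate bands.
rng = np.random.default_rng(0)
green, red, nir = (rng.uniform(0, 255, 1000) for _ in range(3))
target = 0.3 * green + 0.5 * red + 0.2 * nir + rng.normal(0, 1, 1000)
ranked = rank_band_combinations({"green": green, "red": red, "nir": nir}, target)
```

Note that R cannot decrease when a predictor is added to a least-squares model, so the size of the gain between combinations is more informative than the ranking alone.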
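The block-wise fitting and merging described above might look like the following sketch. It rests on several assumptions that are not taken from the paper: reference pixels equal to `nodata` mark invalid data, a block needs at least `min_valid` valid pixels to get its own model, and a block without a model borrows the model of the nearest fitted block (measured by block origin distance). The function name `blockwise_regression` is hypothetical.

```python
import numpy as np

def blockwise_regression(src, ref, block=128, min_valid=32, nodata=0):
    """Fit one multiple linear regression per block and merge the results.

    src: (bands, H, W) image to be corrected; ref: (H, W) reference band.
    Returns the fitted (H, W) band.
    """
    bands, h, w = src.shape
    src = src.astype(np.float64)
    ref = ref.astype(np.float64)
    origins = [(r, c) for r in range(0, h, block) for c in range(0, w, block)]

    # Pass 1: fit a model in every block that has enough valid pixels.
    models = {}
    for r, c in origins:
        y = ref[r:r + block, c:c + block].ravel()
        X = src[:, r:r + block, c:c + block].reshape(bands, -1).T
        ok = y != nodata
        if ok.sum() >= min_valid:
            A = np.column_stack([np.ones(ok.sum()), X[ok]])
            models[(r, c)] = np.linalg.lstsq(A, y[ok], rcond=None)[0]
    if not models:
        raise ValueError("no block has enough valid pixels to fit a model")

    # Pass 2: apply each block's own model, or the nearest fitted block's model.
    out = np.empty((h, w), dtype=np.float64)
    for r, c in origins:
        nearest = min(models, key=lambda k: abs(k[0] - r) + abs(k[1] - c))
        beta = models[nearest]
        tile = src[:, r:r + block, c:c + block]
        out[r:r + block, c:c + block] = beta[0] + np.tensordot(beta[1:], tile, axes=1)
    return out
```

Because only one block is held in the design matrix at a time, peak memory depends on the block size rather than on the size of the scene.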

Band Stretch
To address the insufficient layering in the green band of BJ-2, the images fitted by multiple regression need to be stretched appropriately so that the layering of vegetated forest land is visible. The gray histogram of the experimental area shows that the gray levels are all concentrated in the low gray value range. The image is stretched with conventional maximum-value stretching, standard deviation stretching, histogram equalization and percentage truncation stretching respectively.
The gray values of the images in the experimental area are too concentrated, so different stretching methods are applied and compared. For the percentage truncation method, the most appropriate truncation percentage has to be determined, usually by exhaustive search combined with the correlation coefficient. The lower and upper truncation points start at 0% and 100% and move toward the middle in steps of 0.1%; at each step the correlation coefficient between the newly generated image and the original image is calculated, and the percentage range with the largest correlation coefficient is selected as the truncation value. To select an appropriate stretching ratio, the histogram features can be inspected and the mean, standard deviation, information entropy, correlation coefficient and Bhattacharyya distance after stretching can be compared as evaluation criteria. The coefficients that both preserve the spectral characteristics of the image and produce a noticeable stretching effect are selected as application parameters. An appropriate percentage is chosen with reference to the parameters in Table 4 and the image histogram. For the experimental area, the mean and standard deviation increase as the truncation percentage increases, indicating that the contrast and layering of the image are enhanced. The smaller the information entropy and correlation coefficient, the smaller the correlation between the stretched image and the original image. The differing deviation coefficients indicate that different stretching coefficients preserve the spectra to different degrees. Therefore, a stretching ratio of 1%, which has a small deviation coefficient and an appropriate mean and standard deviation, is selected as the best ratio. Finally, the percentage truncation method with a truncation ratio of 1% is used to stretch the green band of the BJ-2 image, and the result is saved as the band to be synthesized in the next step.
For the remaining bands, the same method is used to select the ratio for stretching.
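The evaluation of candidate truncation percentages can be sketched as a small table-building routine. This is an assumption-laden illustration rather than the paper's procedure: bands are taken to be 8-bit, the Bhattacharyya distance is computed between normalized 256-bin histograms, and the function names are hypothetical. The sketch only tabulates the indices; the final choice of percentage is left to inspection, as in the text.

```python
import numpy as np

def percent_clip(band, pct):
    """Linear stretch to 0..255 between the pct-th and (100-pct)-th percentiles."""
    lo, hi = np.percentile(band, [pct, 100.0 - pct])
    if hi <= lo:
        return np.zeros(band.shape, dtype=np.float64)
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0) * 255.0

def _hist(band, levels=256):
    """Normalized histogram of a band with values in [0, levels)."""
    counts = np.bincount(np.round(band).astype(np.int64).ravel(), minlength=levels)
    return counts / band.size

def entropy(band):
    """Shannon information entropy of the gray histogram, in bits."""
    p = _hist(band)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def bhattacharyya(a, b):
    """Bhattacharyya distance between the gray histograms of two bands."""
    bc = min(float(np.sqrt(_hist(a) * _hist(b)).sum()), 1.0)
    return float(-np.log(bc)) if bc > 0 else float("inf")

def evaluate_percentages(band, percents):
    """Return one dict of evaluation indices per candidate percentage."""
    rows = []
    for pct in percents:
        s = percent_clip(band, pct)
        rows.append({
            "pct": pct,
            "mean": float(s.mean()),
            "std": float(s.std()),
            "entropy": entropy(s),
            "corr": float(np.corrcoef(band.ravel(), s.ravel())[0, 1]),
            "bhattacharyya": bhattacharyya(band, s),
        })
    return rows
```

A call such as `evaluate_percentages(band, np.arange(0.0, 5.0, 0.1))` would reproduce the 0.1% step described above for a hypothetical `band` array.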

Band Synthesis and Result Comparison
After the processed bands are combined, the generated image is very similar to the reference image GF-2. Compared with the traditional overall regression method and the unstretched result, the regression effect of this method is better. The new method expresses color more accurately on cultivated land and buildings, shows clearer and more layered green tones than the input image in areas such as forests and grasslands, and is more consistent with the visual effect of the reference image. As a quantitative comparison, the method in this paper is compared with the traditional global regression method on unstretched images. The residual images and multiple correlation coefficients of the two methods are calculated, and the results are shown in Table 5. Table 5 shows that the residual error obtained by this method is smaller and the average multiple correlation coefficient is higher. The block multiple regression modeling algorithm proposed in this paper therefore has higher calculation accuracy and regression correlation than the traditional modeling algorithm.
In addition, residual images are computed between the result of each method and the reference image. The residual image of the improved model is darker than that of the traditional method and is close to fully black. All of these results show that this method is superior to the traditional method.
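A residual comparison of this kind can be sketched as follows. The function name `residual_report` is hypothetical, and the correlation measure is assumed to be the square root of 1 - SS_res / SS_tot computed between the fitted band and the reference band; the paper does not state its exact formula.

```python
import numpy as np

def residual_report(fitted, reference):
    """Return (residual image, mean absolute residual, correlation R).

    A darker residual image (values near 0) means the fitted band is closer
    to the reference band.
    """
    fitted = fitted.astype(np.float64)
    reference = reference.astype(np.float64)
    residual = np.abs(reference - fitted)
    ss_res = ((reference - fitted) ** 2).sum()
    ss_tot = ((reference - reference.mean()) ** 2).sum()
    r = np.sqrt(max(0.0, 1.0 - ss_res / ss_tot)) if ss_tot > 0 else 0.0
    return residual, float(residual.mean()), float(r)

# Synthetic comparison: one result close to the reference, one further away.
rng = np.random.default_rng(0)
reference = rng.uniform(0, 255, size=(64, 64))
close = reference + rng.normal(0, 2, size=reference.shape)
far = reference + rng.normal(0, 20, size=reference.shape)
_, mean_close, r_close = residual_report(close, reference)
_, mean_far, r_far = residual_report(far, reference)
```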

Conclusion
Based on an analysis of the radiation characteristics of multi-source remote sensing images and of the satellite sensors, this paper proposes a color difference correction method that combines block multiple regression with histogram truncation. The results show that the method corrects the color difference of BJ-2 images relative to GF-2 images well, and it can serve as a useful reference for color difference correction of other multi-source images. The method has the following advantages: (1) All pixels enter the calculation through the block method, which avoids the high subjectivity, low automation and insufficient regression precision caused by manual point selection in methods such as the pseudo-invariant feature method. At the same time, blocking avoids the over-fitting caused by too much data. (2) By comparing stretching methods and selecting the percentage truncation method with an appropriate stretching ratio for the regressed image, the display of an over-concentrated gray band can be effectively enhanced, contrast and layering are improved, and the color inconsistency caused by that band in the original image is reduced or eliminated.