Study on Hyperspectral Characteristics and Estimation Model of Soil Mercury Content

Abstract. In this study, the mercury (Hg) content of 44 soil samples from the Guan Zhong area of Shaanxi Province was used as the data source, and the soil reflectance spectra were acquired with an ASD Field Spec HR spectrometer (350-2500 nm). The reflectance characteristics of soils with different Hg contents were compared, together with the effect of different pre-treatment methods on the soil heavy metal spectral inversion model. After pre-treatment with NOR, MSC and SNV, first order differential, second order differential and logarithmic transformations of the reflectance were carried out, and the bands sensitive to mercury content were selected under each mathematical transformation. A hyperspectral estimation model was then established by regression. Chemical analysis shows that there is serious Hg pollution in the study area. The results show that: (1) reflectance decreases as mercury content increases, and the sensitive regions of mercury are located at 392 ~ 455 nm, 923 ~ 1040 nm and 1806 ~ 1969 nm. (2) Combining the NOR, MSC and SNV pre-treatments with differential transformations enhances the spectral information on heavy metal elements in the soil, and using high-correlation bands improves the stability and prediction ability of the model. (3) The partial least squares regression model based on the logarithm of the original reflectance performs best and has the highest precision, with Rc2 = 0.9912, RMSEC = 0.665, Rv2 = 0.9506 and RMSEP = 1.93, and it allows rapid prediction of the mercury content in this region.

Introduction
In recent years, with the continuing development of industrialization and urbanization, heavy metal enrichment in soil has increased, seriously affecting crop yield (C.Y. HUANG, 2011; X.Z. LI, 2002). At present, the heavy metal content of soil is determined mainly by field sampling followed by laboratory chemical analysis for each heavy metal. These traditional measurement methods are costly and inefficient, and different elements need different chemical treatments, so they cannot meet the needs of large-area monitoring of soil heavy metal pollution. Hyperspectral technology has been widely used to predict heavy metal content because it is rich in information, fast and non-destructive to the sample structure, but there has been less research on predicting mercury (Hg) content. A partial least squares regression model has been used to estimate the Cr and Zn contents in the Rhine basin (KOOISTRA L, 2001). Gr, Cu, Pb, Ni and Zn in the soil of Tarnowskie Gory, Poland, have been studied in the near infrared and mid-infrared diffuse reflectance regions (GRZEGORZ S, 2004). The partial least squares method has also been used to establish a hyperspectral estimation model of soil mercury content in the Zhundong coalfield, and the results show that the model based on the first order differential of reflectance is the best (Y. Because pollution sources differ between regions, the soils and their spectral characteristics differ as well, and so do the pre-processing and modeling methods used to establish a prediction model. An established inversion model is therefore limited to the research area for which it was built. Consequently, when predicting the heavy metal content of soil in a small area, an accurate estimation model can be established by applying a variety of pre-treatments suited to the soil reflectance spectra of the study area and comparing the resulting inversion models.
In this study, the mercury contents of soils from Fufeng County, Yangling County and Wugong County of Shaanxi Province were used as the data source. The spectral curves were smoothed and denoised with the Savitzky-Golay convolution smoothing method, and Normalization (NOR), Multiplicative Scatter Correction (MSC), Standard Normal Variate (SNV), first order differential (FD), second order differential (SD) and reciprocal logarithm (LOG) transformations were carried out. The hyperspectral estimation model of Hg was then established using partial least squares regression (PLSR).

Sample collection and determination of elemental content
In this study, soil samples were taken from the main soil type of Fufeng County, Yangling County and Wugong County, Shaanxi Province. The samples were collected according to the "S"-shaped sampling method. The sampling depth was the thickness of the tillage layer, usually 0-30 cm. A total of 44 soil samples were collected. The samples were air-dried and passed through a 2-mm sieve. A 200-g subsample was mixed and passed through a 100-mesh sieve for laboratory determination of heavy metal content, and another subsample was used for soil reflectance spectroscopy. The content of mercury (Hg) was determined by inductively coupled plasma mass spectrometry (ICP-MS, Agilent 7700). The statistical characteristics are shown in Table 1.

Hyperspectral data determination
The soil reflectance spectra were measured in the field with an ASD Field Spec HR terrestrial spectrometer equipped with a high-density reflective probe. The wavelength range of the spectrometer is 350 ~ 2500 nm, the sampling bandwidth is 1.3 nm (350 ~ 1000 nm) and 2 nm (1000 ~ 2500 nm), and the sampling interval is 1 nm. The high-density reflective probe effectively avoids the effects of stray light on the soil measurement and eliminates the effects of weather. Its 2 cm field of view makes it possible to avoid stones, crop roots and similar objects in the soil. The instrument parameters are shown in Table 2.

Spectral data pre-processing
ViewSpec Pro software was used to remove abnormal (jumping) spectral curves, and the average of the remaining curves was taken as the actual reflectance spectrum of each soil sample. Errors of varying degree are introduced while soil samples are collected, processed and analyzed, and these errors affect the accuracy of the later data analysis and modeling. The Mahalanobis distance, which is based on the multivariate normal distribution and takes the covariance, mean and variance into account, provides a comprehensive indicator for the soil samples (SHI. Z, 2014). Normalization (NOR), Multiplicative Scatter Correction (MSC) and Standard Normal Variate (SNV) were applied in TQ Analyst to eliminate the effects caused by scattering between soil samples.
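The three scatter corrections can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the TQ Analyst implementation: the function names are hypothetical, NOR is assumed to mean min-max scaling of each spectrum, MSC is assumed to use the mean spectrum as its reference, and the spectra are synthetic.

```python
import numpy as np

def normalize(spectra):
    """NOR, assumed here to be min-max scaling of each spectrum to [0, 1]."""
    lo = spectra.min(axis=1, keepdims=True)
    hi = spectra.max(axis=1, keepdims=True)
    return (spectra - lo) / (hi - lo)

def snv(spectra):
    """SNV: centre each spectrum on its own mean and divide by its own std."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """MSC: regress each spectrum on a reference and undo offset and gain."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, spectrum in enumerate(spectra):
        # Fit spectrum ~ intercept + slope * ref (polyfit returns slope first).
        slope, intercept = np.polyfit(ref, spectrum, 1)
        corrected[i] = (spectrum - intercept) / slope
    return corrected

# Synthetic example: one base curve seen with three different
# additive offsets and multiplicative gains (hypothetical values).
wavelengths = np.arange(350, 2501)
base = 0.2 + 0.1 * np.log(wavelengths / 350.0)
offsets = np.array([0.00, 0.02, -0.01])
gains = np.array([1.0, 1.2, 0.8])
spectra = offsets[:, None] + gains[:, None] * base

spectra_nor = normalize(spectra)
spectra_snv = snv(spectra)
spectra_msc = msc(spectra)
```

In this synthetic case the three rows differ only by offset and gain, so MSC and SNV each map them onto a single common curve, which is the scattering effect the corrections are meant to remove.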

Spectral transformation
The original reflectance spectrum was smoothed and denoised by Savitzky-Golay convolution smoothing (Savitzky, A., Golay, J. E. 1964), and its first order differential, second order differential and reciprocal logarithmic transformations were then computed. Thus, in addition to direct analysis of the soil spectral reflectance, three transformations were used to find the response regions of different heavy metal elements. First order and second order differential transformations can increase the correlation between reflectance and heavy metal elements while eliminating or limiting the influence of a linear or near-linear background.
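A minimal NumPy sketch of these transformations follows. The nine-point window echoes the Savitzky-Golay nine-point smoothing named in the model comparison; the polynomial order of 2, the base-10 logarithm, the function names and the synthetic reflectance curve are all assumptions made for illustration.

```python
import math
import numpy as np

def savgol_coefficients(window=9, polyorder=2, deriv=0):
    """Savitzky-Golay weights for the centre point of an odd-length window."""
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse gives the fitted coefficient of
    # offsets**deriv; multiplying by deriv! turns it into the derivative.
    return math.factorial(deriv) * np.linalg.pinv(design)[deriv]

def savgol_transform(spectrum, window=9, polyorder=2, deriv=0):
    """Smooth (deriv=0) or differentiate a spectrum sampled every 1 nm.

    Edge points without a full window are dropped, so the output is
    window - 1 samples shorter than the input.
    """
    weights = savgol_coefficients(window, polyorder, deriv)
    # Reversing the weights makes np.convolve act as a sliding dot product.
    return np.convolve(spectrum, weights[::-1], mode="valid")

# Synthetic reflectance curve (hypothetical values), sampled every 1 nm.
wavelengths = np.arange(350.0, 451.0)
reflectance = 0.10 + 1e-6 * (wavelengths - 350.0) ** 2

smoothed = savgol_transform(reflectance)        # smoothed reflectance, R
fd = savgol_transform(reflectance, deriv=1)     # first order differential, FD
sd = savgol_transform(reflectance, deriv=2)     # second order differential, SD
log_inv = np.log10(1.0 / smoothed)              # reciprocal logarithm, LOG
```

In practice `scipy.signal.savgol_filter` offers the same smoothing and derivative filters and also handles the edge points; the hand-written version is used here only to keep the sketch dependent on NumPy alone.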

Data modeling and verification
Partial least squares regression (PLSR) is a multivariate statistical analysis method that handles the case where the number of samples is smaller than the number of variables and where the variables are highly linearly correlated, which makes it well suited to this kind of data. The samples were divided into a modeling set and a verification set, and the prediction model was established by the PLSR method. The model results were verified by the coefficient of determination R2 and the root mean square error RMSE.
R2 and RMSE are computed from the measured values, the predicted values and the mean value of the sample observations; m is the number of samples in the calibration set, and n is the number of samples in the verification set.
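Under the standard definitions, which are assumed here because the expressions are not written out in the text, the two statistics can be stated as follows. The symbols y_i (measured Hg content), \hat{y}_i (predicted Hg content) and \bar{y} (mean of the measured values in the set concerned) are notation introduced for this sketch; m and n are the calibration and verification set sizes.

```latex
R_c^2 = 1 - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{m} (y_i - \bar{y})^2},
\qquad
R_v^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}

\mathrm{RMSEC} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2},
\qquad
\mathrm{RMSEP} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
```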

Overview of the study area
A total of 44 soil samples were collected in the study area. The soil pH ranges from 7.47 to 8.38, so the soils are alkaline. The soil Hg content is between 6.5 and 23.9 mg kg-1, which indicates strong enrichment, and the content of every sample exceeds the secondary soil quality standard.
The spectral curves of five samples with different mercury contents were randomly selected from the 44 soil samples, and the trends of the curves are similar. Comparing the curves for different mercury contents, the higher the mercury content, the lower the spectral reflectance. In the range of 350 ~ 543 nm, the reflectance increases rapidly with wavelength and the curve is steep. In the range of 535 ~ 729 nm, the growth of the curve slows slightly. In the ranges of 729 ~ 1350 nm and 1495 ~ 1797 nm, the reflectance continues to increase, but relatively slowly. Three pronounced absorption features appear at about 1400 nm, 1900 nm and 2200 nm and are caused by soil moisture.

Correlation analysis
In order to examine the correlation between mercury content and spectral reflectance under different mathematical transformations, the first order differential, second order differential and reciprocal logarithmic transformations of the reflectance were each correlated with mercury content. The results are shown in Fig. 2. Compared with the original reflectance spectrum, the correlation improves significantly after differential transformation. Among the differential transformations, the first order differential of the reflectance spectrum gives the best result, followed by the second order differential.
The correlation between the original reflectance and the mercury content is the weakest. It is most pronounced in the ranges of 1350 ~ 1440 nm and 1813 ~ 1969 nm, but these intervals are influenced by water. The correlation coefficient peaks at 450 nm in the visible region and at 935 nm, 1039 nm and 1141 nm, indicating that these bands are affected by mercury content and that including them is beneficial for model construction. After differential transformation, the correlation between the reflectance spectrum and the mercury content is clearly enhanced, with strong correlations in both the visible and near infrared regions. The correlation is significant in the ranges of 392 ~ 407 nm, 693 ~ 785 nm, 923 ~ 929 nm and 1806 ~ 1897 nm, and the positive correlation coefficient reaches its maximum of 0.81 at 2331 nm. In the visible region, the positive correlation coefficients are larger than the negative ones, indicating that the mercury content is mainly positively correlated with the first order differential of the reflectance. After the second order differential transformation of the reflectance, the correlation coefficient at 760 nm and 1363 nm and in the ranges of 1846 ~ 1898 nm and 2409 ~ 2468 nm is above 0.4, which is a significant correlation. The correlation coefficient curves also show that, for both the first order and the second order differential, many bands have a correlation coefficient of 0 with mercury content, which shows that although the differential transformation enhances the signal, it also suppresses useful information in the spectrum itself. After the reciprocal logarithmic transformation, the reflectance is negatively correlated with the mercury content. There are four inflection points at 433 nm, 935 nm, 1145 nm and 1868 nm, and the correlation is strongest at 1868 nm, where the coefficient is -0.5.
Based on the above results, it can be concluded that the sensitive regions of mercury are located at 392 ~ 455nm, 923nm ~ 1040nm and 1806nm ~ 1969nm.
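The band-by-band correlation analysis described in this section can be sketched as below. The data are synthetic, the function names are hypothetical, and the 0.4 threshold is borrowed from the second order differential discussion as an illustrative cut-off; the significance test actually applied is not stated in the text.

```python
import numpy as np

def band_correlations(spectra, hg):
    """Pearson r between every spectral band (column) and Hg content."""
    xc = spectra - spectra.mean(axis=0)
    yc = hg - hg.mean()
    covariance = xc.T @ yc
    scale = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return covariance / scale

def sensitive_bands(wavelengths, r, threshold=0.4):
    """Wavelengths whose absolute correlation reaches the threshold."""
    return wavelengths[np.abs(r) >= threshold]

# Synthetic example with 44 samples and 3 bands (hypothetical values).
rng = np.random.default_rng(0)
hg = rng.uniform(6.5, 23.9, size=44)           # Hg content, mg/kg
wavelengths = np.array([450, 935, 1868])
spectra = np.column_stack([
    0.50 - 0.01 * hg,                          # falls linearly with Hg
    rng.normal(0.30, 0.01, size=44),           # unrelated noise band
    0.40 - 0.005 * hg + rng.normal(0.0, 0.02, size=44),
])

r = band_correlations(spectra, hg)
selected = sensitive_bands(wavelengths, r)
```

The same two functions can be applied unchanged to the FD, SD and reciprocal-logarithm spectra, since each transformation only changes the columns passed in as `spectra`.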

Model establishment and comparison
Since the number of samples in this study is only 44, the partial least squares regression method was used to establish the estimation model of mercury content in order to reduce multicollinearity in the data. Firstly, the three pre-treatments NOR, MSC and SNV were carried out. Partial least squares regression models were then established using the first order differential, second order differential and reciprocal logarithmic transformations of the reflectance. The coefficient of determination (R2) and the root mean square error (RMSE) were used for testing. The larger the modeling coefficient of determination and the smaller the root mean square error, the better the stability of the model. The larger the prediction coefficient of determination and the smaller the root mean square error, the stronger the predictive ability of the model. At the same time, in order to avoid over-fitting, the dimension of the model's independent variables should be as small as possible.
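The modeling and testing procedure can be illustrated with a small single-response PLS (PLS1, NIPALS) routine written in NumPy. This is a sketch under stated assumptions and not the software used in the study: the function names, the synthetic spectra, the 30/14 split into modeling and verification sets and the choice of three components are all hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 via NIPALS; returns regression coefficients and intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                    # direction of maximum covariance
        w = w / np.linalg.norm(w)
        t = Xk @ w                       # scores
        tt = t @ t
        p = Xk.T @ t / tt                # X loadings
        q = yk @ t / tt                  # y loading
        Xk = Xk - np.outer(t, p)         # deflate X
        yk = yk - q * t                  # deflate y
        weights.append(w)
        loadings.append(p)
        y_loadings.append(q)
    W = np.column_stack(weights)
    P = np.column_stack(loadings)
    beta = W @ np.linalg.solve(P.T @ W, np.array(y_loadings))
    return beta, y_mean - x_mean @ beta

def r_squared(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic data: 44 samples, 200 bands, reflectance falling with Hg.
rng = np.random.default_rng(0)
hg = rng.uniform(6.5, 23.9, size=44)
band_response = np.abs(rng.normal(size=200))
X = 0.3 - 0.002 * np.outer(hg, band_response)
X = X + rng.normal(scale=0.002, size=X.shape)

cal, val = np.arange(30), np.arange(30, 44)    # hypothetical split
beta, intercept = pls1_fit(X[cal], hg[cal], n_components=3)
pred_cal = X[cal] @ beta + intercept
pred_val = X[val] @ beta + intercept

r2_c, rmsec = r_squared(hg[cal], pred_cal), rmse(hg[cal], pred_cal)
r2_v, rmsep = r_squared(hg[val], pred_val), rmse(hg[val], pred_val)
print(f"Rc2={r2_c:.4f} RMSEC={rmsec:.3f} Rv2={r2_v:.4f} RMSEP={rmsep:.3f}")
```

Keeping `n_components` small corresponds to the requirement above that the model dimension stay low to avoid over-fitting; in practice the number is usually chosen by cross-validation.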
The partial least squares regression results for the different spectral indices R, FD, SD and log (1 / R) are shown in Table 3. Comparing the established regression models shows that, in the modeling group, the model correlation coefficient for all 16 transformations is 0.8 or higher, so the modeling effect is good. In the prediction group, except for S + C and SNV + LOG, the prediction correlation coefficient reaches 0.8 or higher, and the inversion effect is good. With the exception of MSC, the PLSR models based on the NOR and SNV pre-treatments perform better than those based on the original data. Comparing the modeling and prediction results of the different transformation methods shows that a model with a good modeling effect does not necessarily give the best prediction. In the modeling group, the model established by combining the second order differential transformation with SNV treatment is the best, with Rc2 = 0.9932 and RMSEC = 0.585, but its inversion result is only Rv2 = 0.8876 and RMSEP = 3.17. Judging by Rc2, Rv2, RMSEC and RMSEP together, the best of all the PLSR models is the one based on the reciprocal logarithmic transformation of the original reflectance spectrum after Savitzky-Golay nine-point smoothing: Rc2 = 0.9912, Rv2 = 0.9506, RMSEC = 0.665, RMSEP = 1.93. For this model, the measured and predicted values are evenly distributed on both sides of the regression lines of the modeling set and the verification set, and the two regression lines tend to be parallel (Fig. 3). The results show that the inversion model established by combining the reciprocal logarithmic transformation of the reflectance with partial least squares regression is stable, and that rapid detection of field mercury content can be achieved.
Figure 3. Comparison between measured and predicted soil Hg content

Conclusion
In this study, hyperspectral estimation models of Hg were established by applying the first order differential, second order differential and reciprocal logarithmic transformations to the original reflectance and using the partial least squares regression method. By comparing the effects of different pre-treatment methods on the soil heavy metal spectral inversion model, the following conclusions are obtained:
(1) The Hg content in the study area seriously exceeds the standard, which poses a serious threat to the normal growth of animals and plants in the area.
(2) The reflectance spectra were processed by NOR, MSC and SNV respectively, and the first order differential, second order differential and reciprocal logarithmic transformations were then applied to reduce the influence of external factors such as surface particle size. The differential transformations also help to improve the correlation between the heavy metal elements in the soil and the reflectance spectrum, and using the bands with higher correlation can significantly improve the stability and prediction ability of the model. After differential transformation, it can be concluded that the sensitive regions of mercury are located at 392 ~ 455 nm, 923 ~ 1040 nm and 1806 ~ 1969 nm.
(3) The modeling and prediction results show that a model with a good modeling effect does not necessarily give the best prediction. The partial least squares regression model established from the reciprocal logarithmic transformation after Savitzky-Golay nine-point smoothing has the highest accuracy, with a prediction Rv2 above 0.95: Rc2 = 0.9912, Rv2 = 0.9506, RMSEC = 0.665, RMSEP = 1.93. It can achieve rapid prediction of mercury content in the region.