Global Sea Surface Temperature Analysis Based on Domestic Ocean Satellite Data. Part I: Methods and Results

Sea surface temperature (SST) is widely used in research and applications such as upper-ocean processes, air-sea heat exchange, and numerical simulation and prediction of the ocean and atmosphere. In this article, a global gap-free SST fusion data set is developed using the optimal interpolation (OI) method, which is commonly used by international operational institutions, merging satellite remote sensing data from H1C, H2B, AVHRR and AMSR with GTS in-situ data. Three data fusion experiments show that, during the test period, the fusion results based on domestic satellite data are qualitatively better than those based on foreign satellite data in the Northwest Pacific region. In a further quantitative comparison with Argo surface SST data, a total of 41842 data pairs are matched in 2022, with a deviation of -0.0756 and a root mean square error of 0.4283.


Introduction
Sea surface temperature (SST), one of the most important parameters for describing the thermal state of the ocean surface [1], is widely used in research and applications such as upper-ocean processes, air-sea heat exchange, and numerical simulation and prediction of the ocean and atmosphere [2]. There are two main ways to obtain SST: in-situ measurement and satellite remote sensing. In-situ data have serious limitations in both spatial continuity and temporal synchronization. Satellite remote sensing of SST relies mainly on infrared and microwave sensors [3]. Infrared sensors offer high spatiotemporal resolution, but they are susceptible to clouds and aerosols, which lowers their precision. Microwave sensors work in all weather and have strong penetrability, but their spatial resolution is relatively low and they are affected by sea-land boundaries and precipitation [4]. SST fusion merges multiple sources, such as in-situ and satellite remote sensing SST data sets, to obtain gap-free, high-precision data that meet scientific research and operational needs. With the increasing variety of ocean data and the development of data fusion technology, research institutions and operational departments around the world have continued to develop global gap-free SST products that merge in-situ and multi-satellite remote sensing SST data.
SST fusion adopts the following assumption: the SST of a given research area on the previous day is a good representation of the actual SST on the current day, that is, the change in SST between two adjacent days is relatively small [5]. SST fusion is essentially ocean data assimilation. Currently, the SST fusion products of international operational institutions are mainly based on the optimal interpolation (OI) or variational methods [1,2,6].
Since the 1970s, China has built three major series of ocean satellites in the field of ocean satellite remote sensing technology and applications: ocean color, ocean dynamic environment, and ocean monitoring. A marine satellite remote sensing operational system led by China's own satellites has gradually taken shape [7,8,9]. In September and October 2018, China launched its first batch of operational ocean satellites, HY-1C and HY-2B, respectively, and developed a ground application system for operations. China subsequently launched the HY-1D and HY-2C satellites, which form constellations with HY-1C and HY-2B for operational ocean observation. These satellites provide important and reliable data on marine environmental factors [10], with data accuracy reaching the level of comparable foreign satellites [11,12,13].
In this article, the optimal interpolation (OI) method, which is commonly adopted by international operational institutions, is used to obtain global gap-free SST fusion data by merging multi-satellite remote sensing data and GTS in-situ data. Section 2 describes the data and methods, Section 3 presents the analysis and results, and Section 4 gives the conclusions.

Data
2.1.1. Satellite data. The infrared satellite SST data used in this article include the foreign AVHRR and the domestic HY-1C, while the microwave satellite data include the foreign AMSR and the domestic HY-2B (Table 1). The data selected for analysis, usually in the form of anomalies, are arranged into a data matrix $X_{m \times n}$. The product of $X$ and its transpose $X^{\mathsf T}$ then gives the square matrix $X X^{\mathsf T}$, whose leading eigenvectors are the EOF modes. Based on 20 years of daily SST data from OSTIA SST (2002-2021), the top 25 global SST modes are obtained. According to equation (2), the satellite remote sensing SST data are adjusted for the large-scale deviation relative to in-situ data [1].
In equation (2), the observation flag is 1 where observation data exist and 0 otherwise, the latitude weight is the cosine of latitude, and the judgment condition is that the criterion value is greater than 15%.
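The EOF step described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the function name `leading_eofs`, the layout of the data matrix (rows as spatial points, columns as time steps), and the omission of any normalisation factor or latitude weighting are all assumptions.

```python
import numpy as np


def leading_eofs(sst, n_modes):
    """Return the leading EOF spatial modes and their explained-variance fractions.

    sst: array of shape (m, n), m spatial points by n time steps (assumed layout).
    """
    # Anomalies: remove the time mean at every spatial point.
    X = sst - sst.mean(axis=1, keepdims=True)
    # Square (m x m) matrix from the product of X and its transpose.
    C = X @ X.T
    # C is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1][:n_modes]
    modes = eigvecs[:, order]                    # columns are spatial modes
    explained = eigvals[order] / eigvals.sum()   # fraction of total variance
    return modes, explained


# Synthetic example: 6 spatial points, 40 time steps, one dominant pattern plus noise.
rng = np.random.default_rng(0)
pattern = np.array([1.0, 0.8, 0.5, -0.5, -0.8, -1.0])
amplitude = rng.standard_normal(40)
sst = 15.0 + np.outer(pattern, amplitude) + 0.05 * rng.standard_normal((6, 40))

modes, explained = leading_eofs(sst, n_modes=2)
```

In this synthetic case the first mode should align closely with `pattern` (up to sign), which is the behaviour one would expect from the leading global SST modes on a much larger grid.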

Optimal Interpolation.
The optimal interpolation (OI) algorithm [5] is an analysis method that minimizes the analysis error variance under the assumption that the background, observation, and analysis values are unbiased estimates. The algorithm is formulated relative to the background field: both the observed and the analysed values are expressed as increments relative to the background field. The analysis increment $r$ is the difference between the analysed and background values, and the observation increment $q$ is the difference between the observed and background values. The analysis increment $r_k$ at grid point $k$ can be expressed as

$$r_k = \sum_{i=1}^{N} w_{ik}\, q_i \qquad (3)$$

where $N$ is the number of valid points and $w_{ik}$ is the least-squares weight, obtained by minimizing the sum of the analysis error variances of the SST field. The subscripts $i$ and $j$ (used below) denote any observation point, the subscript $k$ denotes any point at which the analysis is required, and $q_i$ is the observation increment at point $i$.
In the optimal interpolation algorithm, the analysis value at a spatial grid point is obtained by adding an increment to the background value at that grid point. The increment is a weighted sum of the deviations between the observed values at the surrounding observation points and the background field. The weight coefficient (i.e. the optimal interpolation coefficient) is not selected arbitrarily; it is chosen so that the analysis error at the grid point is minimized. For a single variable $v$,

$$v_g^a = v_g^b + K\,(v_o - v_o^b) \qquad (4)$$

where $v_g^a$ is the analysis value of the variable at the spatial grid points, $v_g^b$ is the background value at the grid points, $K$ is the weight coefficient, $v_o$ is the observed value at the observation points, and $v_o^b$ is the background value at the observation points. When the observation points and grid points do not coincide, $v_o^b$ is determined by

$$v_o^b = H\,v_g^b \qquad (5)$$

where $H$ is a bilinear interpolation operator that interpolates the background field from the grid points onto the observation points. Combining equations (4) and (5) yields

$$v_g^a = v_g^b + K\,(v_o - H\,v_g^b) \qquad (6)$$

The weight coefficient matrix $K$ is

$$K = B H^{\mathsf T} \left(H B H^{\mathsf T} + R\right)^{-1} \qquad (7)$$

where $B$ is the background error covariance matrix and $R$ is the observation error covariance matrix. To calculate $K$, it is first necessary to estimate $B$ and $R$.
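The analysis update described above can be sketched in a few lines. This is a minimal sketch under stated assumptions rather than the operational code: the function name `oi_analysis`, the symbol `R` for the observation error covariance, the one-dimensional five-point grid, the linear interpolation used in place of bilinear interpolation, and all numerical values are hypothetical.

```python
import numpy as np


def oi_analysis(v_b, v_o, H, B, R):
    """One optimal-interpolation update on a small grid.

    v_b: background values at the grid points, shape (n,)
    v_o: observed values at the observation points, shape (p,)
    H:   interpolation operator from grid to observation points, shape (p, n)
    B:   background error covariance, shape (n, n)
    R:   observation error covariance, shape (p, p)
    """
    # Observation increment: observation minus background at the observation points.
    q = v_o - H @ v_b
    # Weight matrix K = B H^T (H B H^T + R)^-1, applied through a linear solve
    # instead of forming the explicit inverse.
    S = H @ B @ H.T + R
    r = B @ H.T @ np.linalg.solve(S, q)   # analysis increment at the grid points
    return v_b + r


# Hypothetical 1-D example: 5 grid points, one observation midway between points 1 and 2.
n = 5
idx = np.arange(n)
v_b = np.full(n, 20.0)                                            # background SST
B = 0.25 * np.exp(-np.abs(idx[:, None] - idx[None, :]) / 2.0)     # assumed covariance
H = np.zeros((1, n))
H[0, 1] = H[0, 2] = 0.5                                           # linear interpolation weights
v_o = np.array([21.0])                                            # observed SST
R = np.array([[0.09]])                                            # assumed observation error variance

v_a = oi_analysis(v_b, v_o, H, B, R)
```

The analysis is pulled toward the observation most strongly at the two neighbouring grid points, and the increment decays with distance according to the background error correlation.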
The determination of $B$ is based on two assumptions: (1) $B$ is stationary; (2) the horizontal correlation of the background error decreases exponentially as the horizontal distance increases. Then

$$B = D^{0.5}\,\rho\,D^{0.5} \qquad (8)$$

where $\rho$ is the horizontal correlation matrix of the background field and $D$ is the diagonal matrix composed of the background error variances.
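As a small illustration of equation (8), the covariance matrix can be assembled from a correlation matrix and a set of error variances. The function name `background_covariance` and the three-point values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np


def background_covariance(variances, rho):
    """Assemble B = D^0.5 rho D^0.5 from error variances and a correlation matrix."""
    sigma = np.sqrt(np.asarray(variances, dtype=float))   # diagonal of D^0.5
    # Multiplying by a diagonal matrix on both sides scales entry (i, j)
    # by sigma[i] * sigma[j], so an outer product does the same job.
    return np.outer(sigma, sigma) * np.asarray(rho, dtype=float)


# Hypothetical values for three grid points.
variances = [0.25, 0.36, 0.16]
rho = np.array([[1.0, 0.6, 0.2],
                [0.6, 1.0, 0.6],
                [0.2, 0.6, 1.0]])
B = background_covariance(variances, rho)
```

The diagonal of `B` reproduces the variances, and each off-diagonal entry is the correlation scaled by the two standard deviations involved.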
For the correlation matrix $\rho$, based on past experience and on the sparsity of the horizontal distribution of observation stations compared with the vertical distribution, the correlation coefficient has been modified to make it more consistent with the physical laws of the relevant scales.
The correlation range of this formula is elliptical in shape. The correlation scales in the longitude and latitude directions are taken as 200 km and 150 km, respectively, and the correlation distances are likewise measured separately in the longitude and latitude directions. The differences in correlation scales between coastal regions and the open ocean will be discussed in other articles.
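To make the elliptical correlation range concrete, the sketch below evaluates a correlation between two points using the 200 km and 150 km scales from the text. The Gaussian-type functional form, the function name `elliptical_correlation`, the Earth radius value, and the small-distance conversion from degrees to kilometres are assumptions made only for illustration; the formula actually used in the analysis may differ.

```python
import math

L_LON_KM = 200.0          # correlation scale in the longitude direction (from the text)
L_LAT_KM = 150.0          # correlation scale in the latitude direction (from the text)
EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def elliptical_correlation(lon1, lat1, lon2, lat2):
    """Assumed Gaussian-type correlation with an elliptical footprint.

    Inputs are in degrees. Wrapping across the date line is not handled.
    """
    mean_lat = math.radians(0.5 * (lat1 + lat2))
    # Separation in km along each direction (small-distance approximation).
    dx = EARTH_RADIUS_KM * math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = EARTH_RADIUS_KM * math.radians(lat2 - lat1)
    return math.exp(-((dx / L_LON_KM) ** 2 + (dy / L_LAT_KM) ** 2))


# One degree of separation at the equator, along longitude and along latitude.
corr_east = elliptical_correlation(150.0, 0.0, 151.0, 0.0)
corr_north = elliptical_correlation(150.0, 0.0, 150.0, 1.0)
```

Because the longitude scale is larger than the latitude scale, `corr_east` is larger than `corr_north` for the same separation, which is what gives the correlation range its elliptical shape.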

Analysis of fusion data from different satellites
The large-scale deviation of the remote sensing data is corrected against in-situ SST using the EOF method. The optimal interpolation method is used to merge SST data from the foreign satellites AVHRR and AMSR, the domestic satellites H1C and H2B, and GTS (including drifting buoy data, equatorial moored buoy data, coastal moored buoy data, and volunteer ship data), at a horizontal resolution of 1/4 degree.
Three data fusion experiments are set up, with December 2021 as the test period: (1) the foreign satellites AVHRR and AMSR with GTS; (2) the domestic satellites H1C and H2B with GTS; and (3) both the domestic and foreign satellites (AVHRR, AMSR, H1C and H2B) with GTS. The results for December 31, 2021 are taken as an example. Figure 1(a) shows the fusion results of the foreign satellites AVHRR and AMSR with GTS in-situ data, and Figure 1(b) shows the fusion results of the domestic satellites H1C and H2B with GTS in-situ data. The comparison shows that the fusion results of the foreign satellites have a smaller deviation from OSTIA SST, because OSTIA SST is itself mainly composed of AVHRR and AMSR satellite remote sensing data fused with GTS data. The main deviations are distributed in the strong-current regions of the Northern Hemisphere and in the Southern Ocean, which is due to differences in resolution and fusion methods.

Verification results of fusion data
The fusion data are verified against Argo sea surface temperature data obtained from iQuam (https://www.star.nesdis.noaa.gov/socd/sst/iquam/). The fusion data are the merged results of the foreign satellites AVHRR and AMSR, the domestic satellites H1C and H2B, and GTS data (referred to as NMEFC), spanning January to August 2022, with a horizontal resolution of 1/4 degree and a daily temporal resolution. The mean distribution (Figure 3) shows that the spatial distribution of the NMEFC fusion data is basically consistent with results in the literature, with a mean of 20.11 [14]. Compared with Argo surface SST data (Figure 4), the fusion results of domestic and foreign satellite and GTS data yield a total of 41842 matched data pairs in 2022, with a deviation of -0.0756 and a root mean square error of 0.4283.
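The two verification statistics quoted above can be computed from matched pairs as in the following sketch. The function name `bias_and_rmse`, the sign convention (fusion minus Argo), and the four sample pairs are assumptions for illustration only; they are not values from the study.

```python
import math


def bias_and_rmse(analysis, reference):
    """Mean difference (analysis minus reference) and root mean square error."""
    if len(analysis) != len(reference) or not analysis:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [a - r for a, r in zip(analysis, reference)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


# Hypothetical matched pairs (fusion SST, Argo SST), not values from the study.
fusion = [20.1, 18.4, 25.0, 10.2]
argo = [20.3, 18.4, 24.8, 10.6]
bias, rmse = bias_and_rmse(fusion, argo)
```

With this convention, a negative bias means the fusion data are on average cooler than the Argo reference at the matched points.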

Conclusion
Sea surface temperature is one of the most important parameters for research on the global ocean-atmosphere system. In this article, global gap-free SST fusion data have been developed with the optimal interpolation (OI) method, which is commonly used by international operational institutions, using the foreign satellites AVHRR and AMSR, the domestic satellites H1C and H2B, and GTS in-situ data. Three data fusion experiments were carried out with December 2021 as the test period: the foreign satellites AVHRR and AMSR with GTS; the domestic satellites H1C and H2B with GTS; and the domestic and foreign satellites AVHRR, AMSR, H1C and H2B with GTS. They show that, during the test period, the fusion results of the domestic satellites are qualitatively better than those of the foreign satellites in the Northwest Pacific region. In a further quantitative comparison with Argo in-situ SST data, a total of 41842 data pairs are matched in 2022, with a deviation of -0.0756 and a root mean square error of 0.4283. The next step is to quantitatively evaluate the differences between the domestic and foreign satellite data fusion results, and the reasons for them, based on multi-year fusion results.

Figure 3. Mean value distribution map of NMEFC fusion data (fusion results of domestic and foreign satellites and in-situ data: AVHRR, AMSR, H1C, H2B and GTS).

Figure 4. Comparative analysis of NMEFC fusion data and Argo data (fusion results of domestic and foreign satellites and in-situ data: AVHRR, AMSR, H1C, H2B and GTS).

Table 1. List of satellite remote sensing data of sea surface temperature. OSI SAF: Satellite Application Facility on Ocean and Sea Ice; REMSS: Remote Sensing Systems.

2.1.2. In-situ data. The Global Telecommunications System (GTS) is a data communication system established by the World Meteorological Organization (WMO) to transmit meteorological observation data worldwide quickly and accurately. After approval, any hydrological or meteorological observation point can use a unified encoding format to enter GTS for data transmission or sharing with other users. The in-situ data include drifting buoy data (Drifter), equatorial moored buoy data (T-Mooring), coastal moored buoy data (C-Mooring), and volunteer ship data (Ship). The statistics of the various in-situ observation data in 2022 are shown in Table 2.

Table 2. Statistical list of various in-situ observation data in 2022.