Prediction of the sea surface temperature in Perhentian Islands by EMD-LSTM model

The increase in sea surface temperature (SST) is currently an important factor in the decline of coral reef ecosystems worldwide, and SST prediction has always been an important research direction in operational oceanography. This paper collects and analyzes data from a buoy deployed at the Perhentian Islands, Malaysia, and combines them with NOAA Coral Reef Watch (CRW) data to develop a coral bleaching warning product for the Perhentian Islands. The Long Short-Term Memory network (LSTM) and Empirical Mode Decomposition (EMD)-LSTM methods are used to predict SST, and the two prediction methods are compared. The results show that both the LSTM and the EMD-LSTM prediction models can accurately predict SST, with root mean square errors (RMSE) of 0.11°C and 0.01°C, respectively.


Introduction
Rising temperature, ultraviolet radiation, and other factors such as low temperature, pollution, bacterial infections, and viruses are usually considered the main contributors to coral bleaching. The increase in sea surface temperature (SST) is an essential cause of the decline of coral reef ecosystems worldwide. Corals are very sensitive to changes in SST. When the SST is 1-2°C higher than the coral's tolerance temperature, corals become stressed and the symbiotic zooxanthellae in their tissues are gradually expelled, leading to bleaching. According to historical records (Ove, 1999), large-scale bleaching events have occurred on coral reefs worldwide from time to time due to abnormal seawater temperatures, and the intervals between these events have been steadily decreasing from an average of 25 years to less than 6 years [1]. In 1998, the rise in SST caused about 16% of corals worldwide to bleach, affecting almost all coral reef areas, and this is considered one of the most significant coral bleaching events in the world. According to statistics from the International Coral Reef Society, coral bleaching has occurred on a large number of coral reefs in at least 50 countries around the world (Leggat et al., 2021) [2]. The extent of coral bleaching is very large and includes major coral reef areas in the Pacific, Indian, and Atlantic Oceans. Heat-induced large-scale coral bleaching has become a serious threat to coral reefs worldwide, with the most common cause being an abnormal rise in SST. In recent decades, coral reef ecosystems have been declining at an alarming rate globally, and thermal coral bleaching is one of the most essential factors exacerbating this decline.

The forecasting of SST has therefore always been an important research direction in operational oceanography. In recent years, the development of artificial intelligence oceanography has opened up new ideas for improving the efficiency of SST forecasting (Spillman et al., 2009; Bethel et al., 2021; Zhang et al., 2021; Teleki, 2022) [3][4][5][6]. Over the past few years, deep learning has also been applied to SST prediction. Zhang Q. et al. (2017) used the Long Short-Term Memory network (LSTM) to forecast SST [7]. Xiao C. et al. (2019) combined the AdaBoost iterative algorithm with LSTM to improve the efficiency of SST prediction [8]. Sarkar et al. (2020) used LSTM to improve the accuracy of the model output, outperforming linear regression and traditional neural network models [9]. Wei L. et al. (2020) used LSTM to forecast the monthly average SST with a root mean square error of 0.5°C [10].

This paper collects and analyzes data from a buoy deployed at the Perhentian Islands, Malaysia, and combines them with satellite data products from the Coral Reef Watch (CRW) program of the National Oceanic and Atmospheric Administration (NOAA) to develop a coral bleaching warning product for the Perhentian Islands. At the same time, the LSTM and Empirical Mode Decomposition (EMD)-LSTM methods are used to study SST prediction, and the two prediction methods are compared. The results can provide technical support for coral bleaching warnings.

Data introduction
The sea surface temperature data used in this study were observed at the Perhentian Islands, Malaysia, at a depth of about 20 m (Figure 1). Observations have been conducted since April 2022 and include sea surface temperature, pH, chlorophyll, etc. Figure 2 shows the time series of SST. From mid-April to early May 2022, the SST exceeded 31.5°C and coral bleaching occurred at the Perhentian Islands.

The NOAA Coral Reef Watch (CRW) program uses near-real-time satellite measurements of SST to monitor thermal stress on coral reefs, providing up-to-date measurements that pinpoint areas currently at risk of thermally induced coral bleaching. We combined our in situ SST observations and the CRW data with empirical analysis to conduct a coral bleaching warning analysis for the Perhentian Islands. The Thermal Stress Level data from CRW are used to assess the level of coral reef bleaching. Thermal stress is divided into five levels: No Stress means no thermal stress, Bleaching Watch means slight thermal stress, Bleaching Warning means accumulating thermal stress, Bleaching Alert Level 1 means strong bleaching, and Bleaching Alert Level 2 means severe bleaching (Table 1).

Figure 3 shows that SST ranged from 28.9 to 31.2°C over the last four weeks (17 July to 17 August 2023). During the most recent week (11 August to 17 August 2023), the SST rose above the maximum monthly mean SST, and the thermal stress level of the coral reefs is categorized as Bleaching Watch. The SST around the reef region is predicted to decrease during the following week (18 August to 24 August 2023), with a thermal stress level of No Stress (Figure 4).
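As a rough illustration of how the five thermal stress levels can be assigned, the sketch below classifies a reef location from its SST, the maximum monthly mean (MMM) SST, and the accumulated Degree Heating Weeks (DHW). The HotSpot and DHW thresholds follow the five-level CRW scheme as commonly published; they are not stated in this paper and are an assumption here, as are the function name and the example values.

```python
def thermal_stress_level(sst_c: float, mmm_c: float, dhw: float) -> str:
    """Classify thermal stress into five levels.

    sst_c: current sea surface temperature (deg C)
    mmm_c: maximum monthly mean SST climatology (deg C)
    dhw:   accumulated Degree Heating Weeks (deg C-weeks)

    Thresholds are assumed from the five-level CRW scheme, not from the paper.
    """
    hotspot = sst_c - mmm_c  # positive only when SST exceeds the MMM
    if hotspot <= 0:
        return "No Stress"
    if hotspot < 1:
        return "Bleaching Watch"
    # From here on the HotSpot is at least 1 deg C; DHW decides the level.
    if dhw < 4:
        return "Bleaching Warning"
    if dhw < 8:
        return "Bleaching Alert Level 1"
    return "Bleaching Alert Level 2"


# Hypothetical values: SST slightly above an assumed MMM of 30.0 deg C.
level = thermal_stress_level(sst_c=30.4, mmm_c=30.0, dhw=0.0)
```

With these assumed thresholds, an SST that only slightly exceeds the MMM falls in the Bleaching Watch category, which matches the situation described for the week of 11 to 17 August 2023.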

Methods introduction
Observational SST and CRW data can be used for coral bleaching warning research, but because of the limited duration of the CRW data, only a one-week effective coral bleaching warning product can be developed. This study therefore uses machine learning and other algorithms to predict SST, which could provide data support for longer-term coral bleaching warnings in the future.

The LSTM network, proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1996 [11], was originally designed to solve the problems of vanishing gradients, exploding gradients, and long-term dependence when training recurrent neural networks on long sequences. It is well suited to predicting time series data. All recurrent neural networks have the form of a chain of repeating neural network modules. In standard RNNs (Figure 5), this repeating module has a very simple structure, such as a single tanh layer. LSTMs also have this chain-like structure, but the repeating module is different: instead of a single neural network layer, there are four, interacting in a very special way.
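The four interacting layers of the LSTM repeating module can be written out explicitly. The equations below give the commonly used formulation with a forget gate; the symbols (weights W, biases b, cell state C_t, hidden state h_t, input x_t) are notation introduced here for illustration and do not appear in the original text.

```latex
\[
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(candidate cell state)} \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
h_t &= o_t \odot \tanh(C_t)
\end{aligned}
\]
```

Here σ is the logistic sigmoid, ⊙ is element-wise multiplication, and [h_{t-1}, x_t] is the concatenation of the previous hidden state and the current input. The additive update of C_t is what lets information and gradients persist over long sequences.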

Single LSTM training model prediction
An LSTM prediction model is trained on the first 90% of the observed temperature series, using Adam as the optimization algorithm. The network consists of a sequence input layer, an LSTM layer with 200 hidden units, a fully connected layer, and a regression layer; that is, the model uses a single hidden layer with 200 neurons. During prediction, the network state is updated with the observed value at time t-1, and the temperature at time t is then predicted. The first 90% of the observations and the predictions for the last 10% are concatenated to produce Figure 6. To analyze the prediction results quantitatively, the root mean square error (RMSE) between the last 10% of the observations and the corresponding predictions is used as the evaluation indicator. Figure 6 shows that the LSTM prediction model performs well at the site, with an RMSE of 0.11°C. When the temperature changes drastically, the prediction error becomes large, which is consistent with previous research.
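The evaluation procedure described in this section (train on the first 90% of the series, predict one step ahead while updating with each observed value, and score the last 10% with the RMSE) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and a simple persistence predictor stands in for the trained LSTM.

```python
import math
from typing import Callable, List, Sequence, Tuple


def split_series(series: Sequence[float],
                 train_fraction: float = 0.9) -> Tuple[List[float], List[float]]:
    """Split a time series chronologically into training and test parts."""
    n_train = int(len(series) * train_fraction)
    return list(series[:n_train]), list(series[n_train:])


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def one_step_forecasts(train: Sequence[float], test: Sequence[float],
                       predict_next: Callable[[Sequence[float]], float]) -> List[float]:
    """Predict each test value from all observations up to time t-1."""
    history = list(train)
    forecasts = []
    for actual in test:
        forecasts.append(predict_next(history))
        history.append(actual)  # update with the observed value before the next step
    return forecasts


def persistence(history: Sequence[float]) -> float:
    """Placeholder predictor (not an LSTM): repeat the last observed value."""
    return history[-1]


# Hypothetical SST values in deg C, for illustration only.
sst = [29.0, 29.2, 29.1, 29.4, 29.6, 29.5, 29.7, 29.9, 30.0, 30.2]
train, test = split_series(sst)
predictions = one_step_forecasts(train, test, persistence)
score = rmse(test, predictions)
```

Replacing `persistence` with a wrapper around a trained network would reproduce the same evaluation loop; a persistence baseline is also a useful reference point when judging whether a small RMSE reflects model skill.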

Conclusion
The increase in SST is an important factor in the decline of coral reef ecosystems worldwide, and SST prediction has always been an important research direction in operational oceanography. In recent years, the development of artificial intelligence oceanography has opened up new ideas for improving the efficiency of SST prediction. This paper collects and analyzes data from a buoy deployed at the Perhentian Islands, Malaysia, and combines them with CRW data to develop a coral bleaching warning product for the Perhentian Islands. The LSTM and EMD-LSTM methods are used to predict SST, and the two prediction methods are compared. The results show that both the LSTM and the EMD-LSTM prediction models can accurately predict SST, with RMSEs of 0.11°C and 0.01°C, respectively. This confirms the feasibility and accuracy of both models for SST prediction. EMD depends more on the completeness of the data. LSTM depends more on the reliability of the input data once the model has been established, whereas abnormal data have little effect on model training. Based on the characteristics of the actual data and the advantages and disadvantages of the two algorithms, the prediction models can be applied flexibly in practice.

Figure 1. Location of the observation site (the map is from Ocean Data View).

3.1 Long Short-Term Memory Network (LSTM)
Deep learning is a new research direction in machine learning, which has the advantage of learning the inherent laws and representation levels of sample data. Deep learning architectures mainly include the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), and the Deep Belief Network (DBN). Long Short-Term Memory (LSTM) is a special recurrent neural network proposed by Sepp Hochreiter and Jürgen Schmidhuber in 1996 [11].

Figure 5. Standard structure of the RNN and LSTM: (a) RNN, (b) LSTM (http://colah.github.io/posts/2015-08-Understanding-LSTMs/).

3.2 Empirical Mode Decomposition (EMD)
Empirical Mode Decomposition (EMD) is a signal analysis method developed by N. E. Huang and colleagues at the National Aeronautics and Space Administration (NASA) in the late 20th century. EMD decomposes a signal into several sub-signals, called Intrinsic Mode Functions (IMFs), and a residual. EMD linearizes and stabilizes data from nonlinear and nonstationary processes while preserving the characteristics of the data itself during decomposition. It therefore has advantages in analyzing complex and non-stationary signals, and it has been widely used with good results in signal analysis fields such as oceanography, climate, seismology, and acoustics.
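The decomposition can be stated compactly. The formulas below follow the standard description of EMD sifting; the symbols (envelopes e_max and e_min, components c_i, residual r_n) are notation introduced here and are not taken from the original text.

```latex
\[
m_k(t) = \tfrac{1}{2}\left[e_{\max}(t) + e_{\min}(t)\right], \qquad
h_k(t) = h_{k-1}(t) - m_k(t), \qquad h_0(t) = x(t)
\]
\[
x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t)
\]
```

In each sifting pass, e_max and e_min are the upper and lower envelopes of h_{k-1}, usually built by cubic-spline interpolation through its local maxima and minima, and m_k is their mean. Sifting repeats until h_k meets the IMF criteria and is taken as the component c_i; the component is subtracted from the signal and the procedure restarts on the remainder. The final line states that the n IMFs plus the residual r_n reconstruct the original signal x(t) exactly.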

Figure 6. Prediction comparison for the single LSTM training model.

4.2. EMD-LSTM training model prediction
The SST data are decomposed by EMD into 8 intrinsic mode functions (IMFs) and one residual component (the high-frequency IMF1, IMF2 and IMF3 components have been omitted). The residual represents the overall trend. The IMF components and the residual obtained from the SST after EMD are more stable, with IMF amplitudes mainly in the range of -0.5°C to 0.5°C. At some moments, the IMF components still show significant fluctuations, with the maximum IMF amplitude reaching 1.2°C. This strong fluctuation is probably caused by the large amount of missing data and the heavy use of interpolated data. The results show that the EMD-LSTM prediction model also predicts SST well (Figure 7), with an RMSE of 0.01°C. Both the LSTM prediction model and the EMD-LSTM prediction model can accurately predict the SST. Although the EMD-LSTM prediction model performs slightly better than the LSTM prediction model, this is not sufficient to prove that the SST predictions obtained with the EMD-LSTM model are better than those obtained with a single LSTM training model. In the future, the two methods can be compared at multiple observation stations to study the differences between the LSTM and EMD-LSTM methods.
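The text does not spell out how the EMD-LSTM forecast is assembled from the decomposed components. A common arrangement, assumed here, is to forecast each IMF and the residual separately and then sum the component forecasts. The sketch below illustrates only that recombination step: the function names and component values are hypothetical, and a persistence predictor stands in for the per-component LSTM.

```python
from typing import Callable, List, Sequence


def persistence(history: Sequence[float]) -> float:
    """Placeholder predictor (not an LSTM): repeat the last observed value."""
    return history[-1]


def component_sum_forecast(
    components: Sequence[Sequence[float]],
    n_train: int,
    predict_next: Callable[[Sequence[float]], float] = persistence,
) -> List[float]:
    """One-step-ahead forecasts of a signal from its additive components.

    components: IMFs and residual, all of equal length, summing to the signal
    n_train:    number of leading samples treated as the training period
    Each forecast at time t uses every component only up to time t-1.
    """
    lengths = {len(c) for c in components}
    if len(lengths) != 1:
        raise ValueError("components must be non-empty and of equal length")
    n_total = lengths.pop()
    if not 1 <= n_train <= n_total:
        raise ValueError("n_train must lie between 1 and the series length")

    forecasts = []
    for t in range(n_train, n_total):
        # Forecast every component separately, then add the forecasts.
        forecasts.append(sum(predict_next(c[:t]) for c in components))
    return forecasts


# Hypothetical components: one oscillating "IMF" and a slowly rising "residual".
imf = [0.1, -0.1, 0.1, -0.1, 0.1]
residual = [29.0, 29.5, 30.0, 30.5, 31.0]
sst_forecast = component_sum_forecast([imf, residual], n_train=3)
```

Because each component is smoother than the original series, a separate model per component has an easier target; this is the usual motivation for the decomposition, although the component-wise design is an assumption here rather than a detail confirmed by the text.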

Table 1. Warning levels of the Coral Reef Watch (CRW) program of the U.S. National Oceanic and Atmospheric Administration (NOAA).