Research on Geomagnetic Reconstruction Based on LSTM Neural Network

Geomagnetic observation data has unique academic research significance and is an important national basic strategic resource. Exploiting the correlation between neighboring stations and the temporal characteristics of geomagnetic observation data, this paper proposes using a long short-term memory (LSTM) neural network to predict the missing data of a target station from the data of reference stations, and compares it with the traditional BP neural network and CNN neural network methods. The results show that the LSTM neural network outperforms the BP and CNN neural networks in geomagnetic data reconstruction and predicts geomagnetic data with higher accuracy.


Introduction
The geomagnetic field is composed of four parts: the main magnetic field, the crustal magnetic field, the varying magnetic field, and the induced magnetic field. It is one of the important geophysical fields [1]. The mission of a geomagnetic station is to record the variation of the geomagnetic field continuously at a fixed position over a long period [2]. Geomagnetic observation data is mainly used to monitor and study changes in the Earth's magnetic field, to support research on seismomagnetic relationships and earthquake prediction [3], and to provide a basis for other geomagnetic measurements (such as navigation and geophysical exploration) [4]. Therefore, geomagnetic observation data has unique academic research significance and is an important national basic strategic resource [5].
At present, instrument failures, power supply failures, lightning strikes, and ferromagnetic interference cause instrument data to be lost or unusable. Accurately reconstructing the missing data, and thereby ensuring data continuity and integrity, is therefore of great significance. Among studies of geomagnetic data reconstruction methods, Zhu Zhaocai used the correlation between the data of adjacent stations and the station with missing data to fill the gaps through regression equations [6]. Lockwood used least squares fitting to calculate missing data from adjacent stations and achieved good results [7]. Yao Xiuyi combined spatial weighting with a BP neural network to reconstruct the geomagnetic data of the station with missing data from adjacent reference stations [8]. Based on deep learning theory, this paper proposes using an advanced recurrent neural network, the long short-term memory (LSTM) neural network, to predict missing geomagnetic data, and compares it with the BP neural network and the CNN neural network.

BP neural network
The BP (back-propagation) neural network is a typical multi-layer feed-forward neural network proposed by Rumelhart and McClelland (1986). It is one of the most widely used neural networks [9]. A BP network works by propagating signals forward and propagating errors backward, and the back-propagation of errors is the core of its training. During training, the error between the value predicted by the network and the actual value is computed repeatedly. If the error does not meet the required accuracy, it is back-propagated to adjust the weights of the neurons. This training loop continues until the output error meets the accuracy requirement or a preset number of training iterations is reached.
The BP neural network prediction idea is to train on data from the four reference stations and the target station, and then to predict the missing data of the target station from the reference-station data recorded during the missing period. During training, the values of the four reference stations at a given second are the input and the value of the target station at the same second is the output. During prediction, the reference-station values for each missing second are fed to the trained network, which outputs the target-station value. The prediction therefore relies on the relationship between stations learned from the previous day's data still holding at the same times on the next day.
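The training loop described above can be sketched in plain Python. This is a minimal illustration, not the configuration used in this paper: the single hidden layer of 8 tanh units, the learning rate, the tolerance, the epoch limit, and the function name train_bp are all assumptions, and the data are synthetic, standardized stand-ins for the four reference stations.

```python
import math
import random


def train_bp(samples, targets, hidden=8, lr=0.05, max_epochs=200, tol=1e-4, seed=0):
    """Train a one-hidden-layer network with error back-propagation.

    samples: list of input vectors (here 4 values, one per reference station)
    targets: list of floats (target-station value at the same second)
    Returns (predict, history), where history holds the MSE of every epoch.
    All hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        return h, sum(w * hj for w, hj in zip(w2, h)) + b2

    history = []
    for _ in range(max_epochs):            # preset number of training iterations
        squared_error = 0.0
        for x, y in zip(samples, targets):
            h, y_hat = forward(x)          # forward propagation of the signal
            err = y_hat - y
            squared_error += err * err
            for j in range(hidden):        # backward propagation of the error
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                b1[j] -= lr * grad_h
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
            b2 -= lr * err
        history.append(squared_error / len(samples))
        if history[-1] < tol:              # accuracy requirement met
            break

    return (lambda x: forward(x)[1]), history


# Synthetic stand-in data: the target is the mean of the four inputs.
demo_rng = random.Random(1)
demo_x = [[demo_rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(50)]
demo_y = [sum(row) / 4.0 for row in demo_x]
predict, history = train_bp(demo_x, demo_y)
```

Real geomagnetic values are tens of thousands of nT, so in practice the inputs and the target would presumably be standardized before training and the predictions converted back afterwards.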

LSTM neural network
LSTM (long short-term memory network) is an improvement on the recurrent neural network (RNN) introduced by Sepp Hochreiter and Jürgen Schmidhuber. The LSTM neural network adds a memory module to the RNN hidden layer. The module is composed of a memory cell, an input gate, an output gate, and a forget gate, and the gate structure provides continuous analogues of write, read, and reset operations. The LSTM network inherits most of the characteristics of the RNN model while addressing the vanishing gradient problem, in which the gradient shrinks progressively during back-propagation. It is therefore suited to problems that depend strongly on time series. Because geomagnetic field data is closely tied to its time series, the LSTM neural network is well suited to predicting missing geomagnetic data.
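To make the gate structure concrete, the following sketch implements one time step of a standard LSTM cell in plain Python. It follows the common textbook formulation rather than any implementation detail given in this paper; the function names, the parameter dictionary layout, and the all-zero demo weights are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Return weights @ vector + bias for plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step. p maps gate names to (weights, bias) pairs."""
    v = list(h_prev) + list(x)                              # previous hidden state joined with input
    i = [sigmoid(z) for z in affine(*p["input"], v)]        # input gate: how much to write
    f = [sigmoid(z) for z in affine(*p["forget"], v)]       # forget gate: how much to keep (reset)
    o = [sigmoid(z) for z in affine(*p["output"], v)]       # output gate: how much to read
    g = [math.tanh(z) for z in affine(*p["candidate"], v)]  # candidate cell content
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


def zero_params(n_hidden, n_input):
    """Hypothetical helper: all-zero weights so the arithmetic is easy to follow."""
    size = n_hidden + n_input
    return {name: ([[0.0] * size for _ in range(n_hidden)], [0.0] * n_hidden)
            for name in ("input", "forget", "output", "candidate")}


params = zero_params(n_hidden=1, n_input=4)
h, c = lstm_step([0.1, 0.2, 0.3, 0.4], h_prev=[0.0], c_prev=[2.0], p=params)
```

With all-zero weights every gate evaluates to 0.5 and the candidate to 0, so the cell keeps half of its previous state. The additive update of c is what lets gradients pass through many time steps without shrinking as quickly as in a plain RNN.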
The LSTM neural network prediction idea is to use the data of the four reference stations from t-255 seconds to t seconds as input to predict the missing data at second t. The prediction is based on the correlation, as the geomagnetic field changes, between this 256-second window of reference data and the target-station value at second t.
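The windowing described above can be sketched as follows. The function name make_windows and the synthetic series are assumptions; only the window length of 256 seconds and the four reference stations come from the text.

```python
def make_windows(reference, target, window=256):
    """Pair each reference window with the target value at its last second.

    reference: list of per-second rows, each holding one value per reference station
    target:    list of per-second target-station values, same length as reference
    """
    if len(reference) != len(target):
        raise ValueError("reference and target must cover the same seconds")
    inputs, outputs = [], []
    for t in range(window - 1, len(reference)):
        inputs.append(reference[t - window + 1:t + 1])  # seconds t-255 .. t
        outputs.append(target[t])                       # target value at second t
    return inputs, outputs


# Synthetic example: 300 seconds of data from four reference stations.
reference = [[float(s + k) for k in range(4)] for s in range(300)]
target = [float(s) for s in range(300)]
X, y = make_windows(reference, target)
```

A series of N seconds yields N - 255 complete windows, so a full day of 86400 samples gives 86145 windows unless the first 255 seconds are covered with data from the preceding day. How that edge case was handled is not stated in the text.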

CNN neural network
CNN (convolutional neural network) is a feed-forward neural network. It consists of three kinds of layers: convolution, activation, and pooling. Each artificial neuron responds only to the units within a local region of its input, its receptive field.
When the CNN neural network makes predictions, assuming that the data at second t is missing, it uses the data of the four reference stations from t-255 seconds to t seconds as input and predicts the second-sampled data of the target station at time t.
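The three building blocks named above (convolution, activation, pooling) can be illustrated on a short one-dimensional sequence. This is a toy sketch under assumptions: the kernel values, the ReLU activation, the pooling size, and the function names are chosen for illustration and are not taken from the network used in this paper.

```python
def conv1d(sequence, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation) over a multi-channel sequence.

    sequence: list of time steps, each a list with one value per channel
    kernel:   list of kernel taps, each a list with one weight per channel
    """
    k = len(kernel)
    out = []
    for start in range(len(sequence) - k + 1):
        total = bias
        for row, weights in zip(sequence[start:start + k], kernel):
            total += sum(v * w for v, w in zip(row, weights))
        out.append(total)
    return out


def relu(values):
    """Activation: negative responses are set to zero."""
    return [max(0.0, v) for v in values]


def max_pool(values, size=2):
    """Non-overlapping max pooling; a trailing partial block is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# One channel, five time steps; the kernel responds to increases between neighbors.
sequence = [[1.0], [3.0], [2.0], [5.0], [4.0]]
kernel = [[-1.0], [1.0]]
features = max_pool(relu(conv1d(sequence, kernel)))  # [2.0, 3.0]
```

For the reconstruction task the sequence would have 256 time steps and four channels, one per reference station, and each kernel tap would carry four weights.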

Experiment
When the distance between geomagnetic stations is less than or equal to 1000 km, their geomagnetic records have a high spatial correlation; as the distance increases, the correlation decreases [8]. The target station selected in this paper is Hongshan Station. With the target station as the center and a radius of 750 km, four stations are selected as reference stations: Tai'an, Mengcheng, Yulin and Qianling. Assuming that the 86400 seconds of data recorded at Hongshan on April 19, 2013 (00:00:00-23:59:59) are missing, the reference-station data from the previous day are used for training, and the LSTM, CNN, and BP neural networks are then used to reconstruct the missing Hongshan data for April 19. The reconstruction results and errors of the D, H, and Z components are shown in the following figures. In each reconstruction plot, blue is the actual data curve and orange is the predicted data curve.
Figure 1 (a) is the reconstruction result of the D component using the LSTM neural network. Figure 1 (b) shows the error analysis between the actual data and the predicted data in Figure 1 (a), with MAE of 0.180 nT and MSE of 0.068 nT². Figure 1 (c) is the reconstruction result of the D component using the CNN neural network. Figure 1 (d) shows the error analysis between the actual data and the predicted data in Figure 1 (c), with MAE of 0.270 nT and MSE of 0.135 nT². Figure 1 (e) is the reconstruction result of the D component using the BP neural network. Figure 1 (f) shows the error analysis between the actual data and the predicted data in Figure 1 (e), with MAE of 0.260 nT and MSE of 0.117 nT².
Figure 2 (a) is the reconstruction result of the H component using the LSTM neural network. Figure 2 (b) shows the error analysis between the actual data and the predicted data in Figure 2 (a), with MAE of 0.147 nT and MSE of 0.050 nT². Figure 2 (c) is the reconstruction result of the H component using the CNN neural network. Figure 2 (d) shows the error analysis between the actual data and the predicted data in Figure 2 (c), with MAE of 0.373 nT and MSE of 0.216 nT². Figure 2 (e) is the reconstruction result of the H component using the BP neural network. Figure 2 (f) shows the error analysis between the actual data and the predicted data in Figure 2 (e) [...].
Figure 3 (a) is the reconstruction result of the Z component using the LSTM neural network. Figure 3 (b) shows the error analysis between the actual data and the predicted data in Figure 3 (a), with MAE of 0.209 nT and MSE of 0.085 nT². Figure 3 (c) is the reconstruction result of the Z component using the CNN neural network. Figure 3 (d) shows the error analysis between the actual data and the predicted data in Figure 3 (c), with MAE of 0.395 nT and MSE of 0.228 nT². Figure 3 (e) is the reconstruction result of the Z component using the BP neural network. Figure 3 (f) shows the error analysis between the actual data and the predicted data in Figure 3 (e), with MAE of 0.351 nT and MSE of 0.203 nT².
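For reference, the two error measures quoted above can be computed as in the sketch below. The function names and the three-sample series are illustrative assumptions, not data from the experiment.

```python
def mae(actual, predicted):
    """Mean absolute error, in the same unit as the data (nT here)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error, in the squared unit of the data."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Toy series: absolute errors are 0.5, 0.0 and 1.0.
actual = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.0]
print(mae(actual, predicted))  # 0.5
print(mse(actual, predicted))
```

Because MSE squares each residual, it penalizes occasional large deviations more heavily than MAE, which is why the two measures are reported together.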

Figure 2. Geomagnetic H component data reconstruction results and errors
Based on the three-component prediction results, Table 1 shows the effects of different neural networks on the prediction results. As can be seen from the

Conclusions
This paper uses the LSTM, CNN, and BP neural networks to reconstruct geomagnetic data and compares the reconstructions with the actual data. The LSTM neural network yields the smallest MAE and MSE and therefore the highest reconstruction accuracy. Among the three geomagnetic components, the reconstruction error of the Z component is greater than that of the D and H components, because the Z component is more closely related to the underground medium structure in the area.