Federated Algorithm Design and Numerical Experiments for the Ensemble Kalman Filter Data Assimilation Analysis Process

Building on the federated Kalman filter proposed by Carlson for linear systems, we propose a federated computing scheme for the ensemble Kalman filter (EnKF) assimilation analysis process for nonlinear systems. We give an optimal information fusion estimation algorithm weighted by diagonal matrices under the linear minimum variance criterion: the assimilation analysis value of each variable in the global estimate is a linear combination of the analysis values of the corresponding variable in the local estimates of the sub-filters, and we give the calculation of the combination coefficients. The federated algorithm for the EnKF assimilation analysis process for nonlinear systems is verified with the Lorenz (1963) system.


Introduction
With the development of science and technology, a variety of observation instruments usually provide information on atmospheric and oceanic processes. If the standard EnKF [1][2][3][4] data assimilation method is adopted, the amount of computation is very large because the measurement data of all observation instruments must be fused centrally. In addition, this centralized EnKF method offers no pluggability in the design of assimilation algorithms for different types of observation data: once an observation instrument fails or an observation is wrong, the entire data assimilation process no longer works properly. If different types of observation data can instead be processed separately and combined in a federated design, the computational burden of centralized assimilation is reduced and the scalability and fault tolerance of the assimilation system are improved.
Following the principle of the federated Kalman filter [5][6], a federated computational algorithm based on matrix-weighted optimal fusion can be constructed for the EnKF assimilation analysis process for nonlinear systems. However, this federated method for the EnKF has two disadvantages. The first is the need to calculate the inverse of the covariance matrix of the state variable analysis values after the assimilation analysis of each type of observation data [7]. Because the covariance matrix in the EnKF is estimated statistically from the ensemble, its inverse does not necessarily exist, especially when the number of ensemble members is smaller than the number of variables. The second is that matrix inversion carries a large computational burden.
In view of these shortcomings, we propose an optimal information fusion estimation algorithm weighted by diagonal matrices under the linear minimum variance criterion. The diagonal-matrix-weighted optimal fusion formula is derived by the Lagrange multiplier method, and the calculation of the optimal weighting matrix is replaced by the calculation of optimal weighting coefficients, which resolves the difficulty of federating the EnKF. Finally, a numerical experiment is carried out to test the proposed algorithm.
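The singularity of an ensemble-estimated covariance matrix can be illustrated with a minimal sketch. The sizes (10 variables, 4 members), the random seed and the helper name `ensemble_covariance` are illustrative assumptions, not values from the experiments.

```python
import numpy as np

def ensemble_covariance(ensemble):
    """Sample covariance of an ensemble stored as (n_variables, n_members)."""
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    return anomalies @ anomalies.T / (ensemble.shape[1] - 1)

rng = np.random.default_rng(0)   # fixed seed, illustrative only
n_variables, n_members = 10, 4   # hypothetical: fewer members than variables
ensemble = rng.normal(size=(n_variables, n_members))

P = ensemble_covariance(ensemble)
rank = np.linalg.matrix_rank(P)

# Removing the ensemble mean costs one degree of freedom, so the rank
# cannot exceed n_members - 1 and the 10 x 10 matrix P has no inverse.
print(f"shape = {P.shape}, rank = {rank}")
```

In this situation any fusion rule that requires the inverse of the analysis covariance matrix cannot be evaluated directly, whereas the individual variances on the diagonal remain usable.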

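Let $\hat{x}_i$ and $P_{i,i}$ ($i = 1, 2, \ldots, N$) denote the state estimate of the $i$th sub-filter and its variance-covariance matrix, and let $\hat{x}_{N+1}$ and $P_{N+1,N+1}$ represent the variable value of the main filter and its variance-covariance matrix, respectively; $N$ is the number of types of observation data, which is also the number of sub-filters. The core of the federated filtering algorithm is to fuse the results of all filters according to the following formulas:
$$P_g^{-1} = \sum_{i=1}^{N+1} P_{i,i}^{-1}, \qquad (1)$$
$$\hat{x}_g = P_g \sum_{i=1}^{N+1} P_{i,i}^{-1}\, \hat{x}_i, \qquad (2)$$
which yield the global optimal estimate $\hat{x}_g$ and its covariance matrix $P_g$. Carlson [8] proved Eq. (1) and Eq. (2) for the case in which the errors of the local estimates are uncorrelated.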
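As a minimal sketch, the inverse-covariance-weighted fusion commonly used in federated Kalman filtering for uncorrelated local errors can be written as follows. The function name `information_fusion` and the two-filter numbers are hypothetical and serve only as an illustration; every local covariance matrix is assumed to be invertible.

```python
import numpy as np

def information_fusion(estimates, covariances):
    """Fuse uncorrelated local estimates by inverse-covariance weighting.

    estimates   : list of state vectors, each of shape (n,)
    covariances : list of invertible (n, n) error covariance matrices
    """
    information = [np.linalg.inv(P) for P in covariances]
    P_g = np.linalg.inv(sum(information))          # fused covariance
    x_g = P_g @ sum(Y @ x for Y, x in zip(information, estimates))
    return x_g, P_g

# Hypothetical two-filter example with a two-variable state.
x1, P1 = np.array([1.0, 2.0]), np.diag([1.0, 4.0])
x2, P2 = np.array([3.0, 2.0]), np.diag([1.0, 4.0])
x_g, P_g = information_fusion([x1, x2], [P1, P2])
print(x_g)   # fused state
print(P_g)   # fused covariance
```

Each call to `np.linalg.inv` fails or becomes numerically unreliable when a local covariance matrix is singular, which is the situation that arises with small ensembles.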
Each sub-filter then uses the EnKF calculation method proposed by Evensen [2]. The main filter has two functions: first, it performs the time update; second, it fuses the estimation results of the sub-filters globally. The main filter performs no measurement update.
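For orientation, the analysis step that a sub-filter performs can be sketched as a stochastic EnKF update with perturbed observations. This is a simplified illustration under stated assumptions, not the exact configuration used here: the observation operator `H` is taken to be linear, and the function name `enkf_analysis` and all numbers in the usage lines are hypothetical.

```python
import numpy as np

def enkf_analysis(ensemble, obs, H, R, rng):
    """Stochastic EnKF analysis step with perturbed observations.

    ensemble : (n, m) forecast ensemble, one member per column
    obs      : (p,) observation vector
    H        : (p, n) linear observation operator (assumption of this sketch)
    R        : (p, p) observation error covariance
    """
    n, m = ensemble.shape
    A = ensemble - ensemble.mean(axis=1, keepdims=True)   # forecast anomalies
    HA = H @ A
    P_HT = A @ HA.T / (m - 1)            # P_f H^T estimated from the ensemble
    S = HA @ HA.T / (m - 1) + R          # H P_f H^T + R
    K = np.linalg.solve(S, P_HT.T).T     # Kalman gain P_f H^T S^{-1} (S symmetric)
    noise = rng.multivariate_normal(np.zeros(len(obs)), R, size=m).T
    perturbed = obs[:, None] + noise     # one perturbed observation per member
    return ensemble + K @ (perturbed - H @ ensemble)

# Hypothetical usage: three directly observed variables, 100 members.
rng = np.random.default_rng(1)
truth = np.array([1.0, -2.0, 0.5])
forecast = truth[:, None] + rng.normal(scale=2.0, size=(3, 100))
H, R = np.eye(3), 0.01 * np.eye(3)
analysis = enkf_analysis(forecast, truth.copy(), H, R, rng)
```

The gain is obtained with a linear solve against the small innovation covariance `S`, so the state covariance matrix is never formed or inverted inside the sub-filter.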

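Global Optimal Estimation Federated Filtering Algorithm Weighted by Diagonal Matrices under the Linear Minimum Variance Criterion
For the EnKF federated assimilation analysis scheme for nonlinear systems, the problem is how to use the local estimates given by all filters, including the main filter, to obtain the global optimal estimate. If the state variable analysis values estimated by the sub-filters and the main filter are fused according to Eq. (1) and Eq. (2), two difficulties are encountered. First, in the EnKF the analysis covariance matrix $P_{i,i}$ ($i = 1, 2, \ldots, N, N+1$) of each filter is obtained statistically from that filter's ensemble of estimated state variables, so its inverse is not guaranteed to exist, especially when the number of analysis state variables is greater than the number of ensemble members; this causes trouble in the calculation of the global optimal estimate in Eq. (2). Second, the cost of computing the covariance matrices of the analysis variables statistically and then inverting them is very large [2][7].
To overcome these two difficulties, we propose a global optimal estimation federated filtering algorithm weighted by diagonal matrices under the linear minimum variance criterion. In this algorithm, the $k$th element $\hat{x}_g(k)$ of the global estimate $\hat{x}_g$ is assumed to be a linear combination of only the $k$th elements $\hat{x}_i(k)$ of the local estimates:
$$\hat{x}_g(k) = \sum_{i=1}^{N+1} w_{i,k}\, \hat{x}_i(k), \qquad k = 1, 2, \ldots, n,$$
where $w_{i,k}$ is the weight coefficient and $n$ is the number of control variables. Then
$$\hat{x}_g = \sum_{i=1}^{N+1} W_i\, \hat{x}_i, \qquad W_i = \mathrm{diag}(w_{i,1}, w_{i,2}, \ldots, w_{i,n}).$$
The global estimate should satisfy the following conditions: ① $\hat{x}_g$ is unbiased, that is, $E[\hat{x}_g] = E[x]$; ② the trace of the error covariance matrix $P_g = E[(\hat{x}_g - x)(\hat{x}_g - x)^T]$ of the global estimate takes its minimum value, where $x$ is the true value. Because the local estimates are not correlated, that is,
$$P_{i,j} = E[(\hat{x}_i - x)(\hat{x}_j - x)^T] = 0, \qquad i \neq j,$$
and because each local estimate is unbiased, condition ① gives
$$\sum_{i=1}^{N+1} W_i = I,$$
where $I \in R^{n \times n}$ is the identity matrix, and the covariance matrix of the global estimate is
$$P_g = \sum_{i=1}^{N+1} W_i P_{i,i} W_i^T.$$
Since every $W_i$ is diagonal, the trace of $P_g$ is
$$\mathrm{tr}(P_g) = \sum_{i=1}^{N+1} w_i^T D_{i,i}\, w_i,$$
where $w_i = (w_{i,1}, w_{i,2}, \ldots, w_{i,n})^T$ and $D_{i,i} = \mathrm{diag}(d_{i,1}, d_{i,2}, \ldots, d_{i,n})$, with $d_{i,k}$ the $k$th diagonal element of $P_{i,i}$. The objective function is defined as follows:
$$J = \sum_{i=1}^{N+1} w_i^T D_{i,i}\, w_i + 2 \Lambda^T \left( \sum_{i=1}^{N+1} w_i - e \right),$$
where $\Lambda = (\lambda_1, \lambda_2, \ldots, \lambda_n)^T$ is the Lagrange multiplier vector and $e = (1, 1, \ldots, 1)^T \in R^n$. The parameter vectors $w_i$ and $\Lambda$ must be determined so that the objective function $J$ is minimized; for this, let
$$\frac{\partial J}{\partial w_i} = 2 D_{i,i}\, w_i + 2 \Lambda = 0, \qquad i = 1, 2, \ldots, N+1,$$
$$\frac{\partial J}{\partial \Lambda} = 2 \left( \sum_{i=1}^{N+1} w_i - e \right) = 0.$$
The above two equations can be written in the following form:
$$\begin{pmatrix} D_{1,1} & & & I \\ & \ddots & & \vdots \\ & & D_{N+1,N+1} & I \\ I & \cdots & I & 0 \end{pmatrix} \begin{pmatrix} w_1 \\ \vdots \\ w_{N+1} \\ \Lambda \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ e \end{pmatrix}. \qquad (11)$$
Since the matrices $D_{i,i}$ and $I$ are diagonal, Eq. (11) can be decomposed into the following $n$ independent linear systems of order $N+2$, one for each $k = 1, 2, \ldots, n$:
$$\begin{pmatrix} d_{1,k} & & & 1 \\ & \ddots & & \vdots \\ & & d_{N+1,k} & 1 \\ 1 & \cdots & 1 & 0 \end{pmatrix} \begin{pmatrix} w_{1,k} \\ \vdots \\ w_{N+1,k} \\ \lambda_k \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}.$$
Provided that every variance $d_{i,k}$ is positive, each system has a unique solution:
$$w_{i,k} = \frac{1 / d_{i,k}}{\sum_{j=1}^{N+1} 1 / d_{j,k}}, \qquad \lambda_k = -\left( \sum_{j=1}^{N+1} \frac{1}{d_{j,k}} \right)^{-1}, \qquad i = 1, 2, \ldots, N+1.$$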
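Minimizing the trace of the fused covariance under the constraint that the weights of each component sum to one gives weights proportional to the inverse local error variances. The following minimal sketch implements this component-wise fusion; the function name `diagonal_weighted_fusion` and the three-filter numbers are hypothetical, and all variances are assumed to be positive.

```python
import numpy as np

def diagonal_weighted_fusion(estimates, variances):
    """Fuse local estimates component by component.

    estimates : (n_filters, n) array, one local analysis per filter
    variances : (n_filters, n) array of positive error variances, i.e. the
                diagonals of the local analysis covariance matrices
    Returns the fused estimate (n,), the weights (n_filters, n) and the
    fused error variances (n,).
    """
    estimates = np.asarray(estimates, dtype=float)
    precision = 1.0 / np.asarray(variances, dtype=float)
    weights = precision / precision.sum(axis=0)    # each column sums to 1
    fused = (weights * estimates).sum(axis=0)
    fused_variance = 1.0 / precision.sum(axis=0)
    return fused, weights, fused_variance

# Hypothetical example: three filters, two state variables.
x_local = [[1.0, 4.0], [2.0, 4.0], [4.0, 1.0]]
var_local = [[1.0, 2.0], [1.0, 2.0], [2.0, 1.0]]
x_g, W, var_g = diagonal_weighted_fusion(x_local, var_local)
print(x_g)    # [2.  2.5]
print(var_g)  # [0.4 0.5]
```

In an EnKF setting the variances could be taken from the analysis ensemble of each filter, for example with `ensemble.var(axis=1, ddof=1)`, so that no full covariance matrix has to be formed or inverted.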

Numerical Experiments
To test the feasibility and effectiveness of the proposed EnKF federated design scheme and of the global optimal estimation algorithm weighted by diagonal matrices under the linear minimum variance criterion, we use the Lorenz (1963) [23] system to carry out numerical experiments. The system control equations are:
$$\frac{dx}{dt} = \sigma (y - x), \qquad \frac{dy}{dt} = r x - y - x z, \qquad \frac{dz}{dt} = x y - b z,$$
where $\sigma$, $r$ and $b$ are the model parameters; the model error is not considered. The fourth-order Runge-Kutta method is used for the numerical solution of the equations. The time integration step is 0.001, and the total integration length is 800 steps. The three initial perturbations follow a Gaussian distribution with a mean of 0 and a variance of 0.4, and the number of perturbations (ensemble members) is 100.
The first kind of observation data, $O(k)$, is constructed from $x(k)$, $y(k)$, $z(k)$ and the observation perturbation $\varepsilon(k)$, where $O(k)$ denotes the observation at time $k$; $x(k)$, $y(k)$ and $z(k)$ are the values at the corresponding time calculated from the true initial value; and $\varepsilon(k)$ is Gaussian white noise with a mean of 0 and a variance of 0.6. The second kind of observation data is:
$$x^o(k) = x(k) + w_x(k), \qquad y^o(k) = y(k) + w_y(k), \qquad z^o(k) = z(k) + w_z(k),$$
where the superscript $o$ denotes observation data; $x(k)$, $y(k)$ and $z(k)$ are again the values at the corresponding time calculated from the true initial value; and $w_x(k)$, $w_y(k)$ and $w_z(k)$ are observation disturbances, each Gaussian white noise with a mean of 0 and a variance of 0.9. Both kinds of observation data are assimilated every 10 steps, that is, 80 times in total.
To compare the assimilation effect of the various EnKF test schemes, the error function is defined as $J = |\varphi^a - \varphi^t|$, where $\varphi$ represents $x$, $y$ or $z$; the superscript $a$ denotes the estimate obtained by an EnKF assimilation scheme, and the superscript $t$ denotes the true value, that is, the quantity calculated from the true initial value.
First, the assimilation effect of the EnKF assimilation analysis process using the federated scheme is tested. The first group of numerical experiments includes three test schemes. In the first experiment, the two kinds of observation data are assimilated by federated processing, and the optimal solution is then obtained by global fusion; the information factors are distributed equally, that is, β1 = β2 = β3 = 1/3. The second experiment assimilates only the first kind of observation data, and the third experiment assimilates only the second kind. Fig. 2 shows the variation of the error function with time in the three experiments of the first group. The assimilation analysis values of all three experiments gradually approach the true value as the number of assimilated observations increases. In experiments 2 and 3, which assimilate only one kind of observation data, the analysis value approaches the true value noticeably more slowly than in experiment 1, which uses federated processing to assimilate both kinds of observation data at the same time. These results show that the proposed federated scheme and the global optimal estimation method weighted by diagonal matrices under the linear minimum variance criterion are feasible, and that assimilating both kinds of observation data with the federated scheme is clearly more accurate than assimilating a single kind.
To further analyze the assimilation accuracy of the federated EnKF processing, we carried out a second group of numerical experiments. The first experiment is the same as the first experiment in the first group. The second experiment assimilates the two kinds of observation data with the traditional centralized processing method. The third experiment uses the federated processing method for the two kinds of observation data with the information factors β1 = 0.4, β2 = 0.1 and β3 = 0.5. In addition, we carried out multiple sets of assimilation experiments with different information distribution factors.
Fig. 3 shows that the federated EnKF using the global optimal estimation algorithm weighted by diagonal matrices under the linear minimum variance criterion is effective. The accuracy of its results is the same as that of centralized data assimilation, and differences in the information distribution factors do not affect the accuracy of the global filtering.
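The data-generation part of such a twin experiment can be sketched as follows. Several choices are assumptions made for illustration only: the classical Lorenz (1963) parameter values, the true initial state, the random seed, and the additive form of the second kind of observation. The first kind of observation is omitted because its exact form is not restated here.

```python
import numpy as np

# Classical Lorenz (1963) parameter values: an assumption of this sketch.
SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0

def lorenz63(state):
    """Right-hand side of the Lorenz (1963) equations."""
    x, y, z = state
    return np.array([SIGMA * (y - x), R * x - y - x * z, x * y - B * z])

def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step."""
    k1 = lorenz63(state)
    k2 = lorenz63(state + 0.5 * dt * k1)
    k3 = lorenz63(state + 0.5 * dt * k2)
    k4 = lorenz63(state + dt * k3)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def integrate(state, dt, n_steps):
    """Return the trajectory including the initial state, shape (n_steps + 1, 3)."""
    trajectory = np.empty((n_steps + 1, 3))
    trajectory[0] = state
    for k in range(n_steps):
        trajectory[k + 1] = rk4_step(trajectory[k], dt)
    return trajectory

def error_function(analysis, true_value):
    """Absolute error |phi^a - phi^t| for each variable."""
    return np.abs(analysis - true_value)

rng = np.random.default_rng(42)             # illustrative seed
dt, n_steps, obs_interval = 0.001, 800, 10  # settings stated in the text
truth0 = np.array([1.0, 1.0, 1.0])          # hypothetical true initial state
truth = integrate(truth0, dt, n_steps)

# 100 initial perturbations with mean 0 and variance 0.4
ensemble0 = truth0 + rng.normal(0.0, np.sqrt(0.4), size=(100, 3))

# Second kind of observation: each variable plus noise of variance 0.9
# (additive form assumed), available every 10 steps.
obs_steps = np.arange(obs_interval, n_steps + 1, obs_interval)
obs = truth[obs_steps] + rng.normal(0.0, np.sqrt(0.9), size=(len(obs_steps), 3))
```

With these settings `obs_steps` contains 80 assimilation times, which matches the count given for the experiments.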

Conclusions
A federated calculation scheme is proposed for the EnKF applied to nonlinear systems. Its optimal information fusion is weighted by diagonal matrices under the linear minimum variance criterion, so that the calculation of an optimal weighting matrix is replaced by the calculation of optimal weighting coefficients.
The federated EnKF calculation scheme and the fusion estimation algorithm are tested on the Lorenz (1963) system with two kinds of observation data. The numerical results show that the federated EnKF calculation scheme and the optimal information fusion estimation algorithm weighted by diagonal matrices are feasible and effective.
The system used in the numerical experiments is relatively simple, and the observation data are simulated. The performance of this method with actual observation data and in complex numerical models needs to be tested further.

Acknowledgment
This work was supported by the National Natural Science Foundation of China (Grant No. 42075080).

Figure 1. The federated design diagram of the EnKF assimilation analysis process.

Figure 2. The variation of the error function with time in the first group of three experiments (solid line: Experiment 1; dotted line: Experiment 2; dashed line: Experiment 3).

Figure 3.