Trainable Kalman Filter Based on Recurrent Neural Network and its Application in Aviation Surveillance

The Kalman filter is the most commonly used algorithm for aviation surveillance. Its accuracy depends on the accuracy of its parameters. When the parameters change with the environment, the accuracy of the traditional linear Kalman filter degrades, and in severe cases the filter diverges. This paper proposes an aviation surveillance filtering model that treats the Kalman filter as the kernel of a Recurrent Neural Network and uses Back Propagation Neural Networks to predict the parameters of the Kalman filter. This makes the Kalman filter trainable and able to estimate its parameters dynamically. Actual radar measurement data is used in a radar filtering experiment, and the results show the feasibility of this model and show that it has better accuracy and adaptability than the traditional Kalman filter.


Introduction
Since the beginning of the 21st century, civil aviation has developed rapidly. The increase in the number of civil aircraft has brought great challenges to aviation safety. Existing aviation surveillance systems usually use secondary surveillance radar to monitor aircraft. However, during radar monitoring, accuracy is reduced by the refraction of electromagnetic waves as they propagate through the atmosphere [1]. Therefore, a filtering algorithm is usually applied to radar measurement data to reduce the errors caused by the environment or by the radar itself. A good filtering algorithm yields a more accurate estimate of aircraft position, which has a great effect on the safety of the aviation surveillance system.
At present, aviation surveillance systems mainly use the Kalman filter algorithm to filter data, and the Kalman filter has been studied extensively. Wang et al. addressed the poor robustness and adaptability of Kalman filtering for radar target tracking in complex environments and studied a new adaptive robust radar target tracking algorithm [2]. Gao et al. analysed the problems of applying the traditional Kalman filter in complex data environments and presented a synthesis estimation of target state components based on a weighted linear track parameter estimation model of a single radar [3]. Yang et al. proposed a cascaded Kalman filter algorithm [4].
The Kalman filter is a recursive linear minimum variance estimation algorithm. Its advantage is that each time a new measurement is added, only the estimate of the previous moment is needed to calculate the estimate of the current moment, so there is no need to store past measurements. As long as the dynamic noise and measurement noise parameters used by the algorithm are accurate, the Kalman filter is an optimal estimator: the mean of the estimation error is 0 and the error variance is minimal. However, when the parameters are inaccurate, the filtering accuracy of the Kalman filter degrades.
Based on the structural similarity between the Kalman filter and the Recurrent Neural Network (RNN), this paper proposes a filtering algorithm that treats the Kalman filter as the core of an RNN. The Kalman filter can then be used as an RNN layer, and the back propagation algorithm of the RNN can train the Kalman filter directly and learn its parameters. This enables the Kalman filter to obtain real-time estimates of its parameters, which improves the adaptability of the traditional Kalman filter in a dynamic environment and reduces the workload of manual parameter adjustment.

The introduction of BPNN
Back propagation neural network (BPNN) is one of the most widely used neural networks; its structure is shown in Figure 1. A BPNN is composed of multiple layers of perceptron neurons. Each neuron accepts the output vector of the previous layer as its input and associates a weight with each input component. Each neuron calculates the weighted sum of all input vector components and applies a nonlinear function to the sum to get its output. The network fully connects the output of each layer to all neurons of the next layer, and the output of the last layer is the output of the entire network. The input and output of each layer of neurons satisfy equation (1).
BPNN is widely used because of its excellent generalization ability and simple principle, and it has also been applied in aviation surveillance. Xia and Peng, addressing the low recognition rate of template matching in aircraft target recognition, proposed an aircraft target recognition algorithm based on a multi-layer BP neural network [5]. Qian et al., addressing the problem of target track prediction in hotspot areas, proposed an aircraft target track prediction model based on a BP neural network [6].
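The layer computation described above (a weighted sum of the inputs followed by a nonlinear function) can be sketched as follows. This is a minimal illustration, not the paper's network: the sigmoid activation, the bias term b, the 2-3-1 layer sizes and the random weights are all assumptions made for the example.

```python
import numpy as np

def sigmoid(s):
    """Logistic nonlinearity (an assumed choice of activation)."""
    return 1.0 / (1.0 + np.exp(-s))

def layer_forward(x, W, b):
    """One fully connected layer: weighted sum of the inputs, then a nonlinearity."""
    return sigmoid(W @ x + b)

def bpnn_forward(x, layers):
    """Feed x through every layer; the last layer's output is the network output."""
    for W, b in layers:
        x = layer_forward(x, W, b)
    return x

# Hypothetical 2-3-1 network with random weights, for illustration only.
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(3, 2)), np.zeros(3)),
    (rng.normal(size=(1, 3)), np.zeros(1)),
]
y = bpnn_forward(np.array([0.5, -1.0]), layers)
```

Each row of W holds the weights of one neuron, so the matrix product computes the weighted sums of a whole layer at once.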

The introduction of RNN
RNN is a neural network used to analyse time series data. Unlike in a BPNN, data in an RNN does not flow only from the input layer to the output layer: each layer of the network has a connection pointing to itself. The simplest RNN neuron is shown in Figure 2. At each moment, a neuron of an RNN receives not only the external input but also its own output from the previous moment, and the output at the current moment is calculated from these two inputs. If the neuron is unrolled over time steps, its structure looks like Figure 2.
Figure 2. Structure of RNN.
At each time step t, this recurrent neuron receives an external input and its own output from the previous time step. The output of a single neuron is given by equation (2).
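The recurrence described above, where the current output depends on the current input and the previous output, can be sketched as follows. The tanh activation and the names W_x, W_y and b are assumptions of this sketch and are not taken from the paper.

```python
import numpy as np

def rnn_step(x_t, y_prev, W_x, W_y, b):
    """One time step: combine the external input with the previous output."""
    return np.tanh(W_x @ x_t + W_y @ y_prev + b)

def rnn_unroll(xs, y0, W_x, W_y, b):
    """Unroll the neuron layer over a sequence of inputs xs."""
    y = y0
    outputs = []
    for x_t in xs:
        y = rnn_step(x_t, y, W_x, W_y, b)
        outputs.append(y)
    return np.stack(outputs)

# Hypothetical sizes: 2 inputs, 3 recurrent units, 5 time steps.
rng = np.random.default_rng(0)
W_x = rng.normal(size=(3, 2))
W_y = rng.normal(size=(3, 3))
b = np.zeros(3)
xs = rng.normal(size=(5, 2))
ys = rnn_unroll(xs, np.zeros(3), W_x, W_y, b)
```

The loop in rnn_unroll is the unrolled form of the neuron: the same weights are applied at every time step, and only y is carried forward.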
In 1989, Ronald Williams and David Zipser proposed Real-Time Recurrent Learning (RTRL) [7] for Recurrent Neural Networks, and in 1990 Paul Werbos proposed back propagation through time (BPTT) [8]. These training algorithms made RNNs practical.
Long short-term memory (LSTM) is an improvement on the basic unit of the RNN. It was first proposed by S. Hochreiter and J. Schmidhuber in 1997 [9] to address the vanishing and exploding gradient problems that occur when a traditional RNN learns sequences. The structure of the LSTM unit is shown in Figure 3.
Figure 3. Structure of LSTM.
Viewed as a black box, an LSTM unit resembles the basic RNN neuron from the outside, but it performs better than the basic unit, converges faster, and is able to capture long-term dependencies in data. Hu et al. proposed a method for vessel trajectory prediction based on RNN in order to improve the accuracy and efficiency of the prediction [10], and Quan et al. proposed a prediction method combining ship automatic identification system (AIS) trajectory data with LSTM [11].

Experiment data and preprocessing
The experiment data for this paper comes from messages that comply with the ASTERIX standard [12]. ASTERIX is a presentation layer protocol developed for the transmission and exchange of surveillance data. The message categories used in this paper are 21 and 48: category 21 carries the ADSB measurement data of a flight target, and category 48 carries its radar measurement data. These two messages are referred to below as the ADSB message and the radar message. Among all data items parsed from these two messages, the fields mainly used in this paper are shown in Table 1 and Table 2.
The aircraft position parsed from the radar message is in radar-local polar coordinates, whereas the position in the ADSB message is in WGS-84 latitude and longitude, so the two messages use different coordinate systems. In addition, the units of the data items are not uniform. The two messages are therefore preprocessed before the experiment: the units of corresponding data items are unified, the polar coordinates of the radar message are converted to WGS-84 latitude and longitude, and the latitude and longitude of both messages are then projected to a terrestrial coordinate system using the Gauss-Krüger projection.
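To illustrate the unit and coordinate unification step, the sketch below converts a radar-local polar position to planar offsets from the radar site. It is a deliberately simplified flat-earth approximation and not the WGS-84 and Gauss-Krüger pipeline used in the paper: it assumes the range is a ground range in nautical miles, that azimuth is measured clockwise from north in degrees, and it ignores altitude and earth curvature. The function name is hypothetical.

```python
import math

NM_TO_M = 1852.0  # one nautical mile in metres (exact by definition)

def radar_polar_to_local_xy(range_nm, azimuth_deg):
    """Convert (range, azimuth) to (east, north) offsets in metres.

    Assumes azimuth is clockwise from north and treats the range as a
    ground range on a flat plane (a simplification, see the text above).
    """
    r = range_nm * NM_TO_M
    theta = math.radians(azimuth_deg)
    east = r * math.sin(theta)
    north = r * math.cos(theta)
    return east, north

# A target 10 NM due east of the radar.
east, north = radar_polar_to_local_xy(10.0, 90.0)
```

A production pipeline would instead use a geodetic library to go through WGS-84 coordinates and a map projection, as the preprocessing description states.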
After preprocessing, each radar record and each ADSB record of the aircraft state contains the following quantities: x and y are the horizontal coordinates of the aircraft in the terrestrial coordinate system, t is the Unix timestamp, and the remaining two components are the components of the aircraft's instantaneous ground speed along the x and y axes. In the follow-up experiments, the ADSB record is taken as the real state of the aircraft and the radar record as the measured state, because the ADSB record is more accurate.

Kalman filter structure
In this paper, a linear Kalman filter with white noise is used to filter the radar record. The general linear system with white noise is described by a state equation and a measurement equation. In the measurement equation, the measurement matrix converts the state vector into the measurement vector; the mean measurement error at each time step is, like the mean error term of the state equation, a non-random sequence; and the measurement noise is a zero-mean white noise vector sequence. In the filter recursion applied at each time step, equations (7) and (12) are the state prediction equations and the others are the variance prediction equations.
In the experimental scenario of this paper, the ADSB data of the aircraft is used as its actual position and the radar data as its measured position. According to the preprocessing results in Section 2, the state vector and the measurement vector at time k have the same dimensions, and their corresponding components have the same meanings, so the measurement matrix is the identity matrix at all times. Because two adjacent radar measurements are separated by about 4 seconds, the aircraft can be approximated as moving in uniform linear motion between two adjacent times, at the speed of the previous moment. The state transition matrix then depends only on the time difference between the two moments.
To make up for the error caused by the linear model itself, the state equation contains two error vectors: one whose components are the average errors of the state components at time k, and one whose components are the random noise of the state components, which are mutually independent. The matrices that multiply these two vectors are therefore both identity matrices. The systematic error variance matrix is the covariance matrix of the random noise vector, and its diagonal elements are the variances of the individual noise components. Likewise, the measurement equation contains the fixed error of the radar observation at time k and the random noise at time k, whose components are also mutually independent. The measurement error variance matrix is the covariance matrix of the measurement noise. To reduce the complexity of the model, the motion noise and the measurement noise are further assumed to be uncorrelated, that is, their cross-covariance matrix is 0. Under these assumptions, equations (6), (7) and (8) simplify to equations (14) and (15).
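The model described above (identity measurement matrix, uniform linear motion between measurements, a mean error and a random noise in both the state and the measurement, and uncorrelated noises) can be sketched as one predict-and-update step. This is a sketch under stated assumptions, not the paper's exact equations: the state ordering (x, y, vx, vy) and the names F, u, Q, y and R are chosen for this example.

```python
import numpy as np

def transition_matrix(dt):
    """Constant-velocity transition for the assumed state (x, y, vx, vy)."""
    F = np.eye(4)
    F[0, 2] = dt  # x advances by vx * dt
    F[1, 3] = dt  # y advances by vy * dt
    return F

def kalman_step(x_est, P, z, dt, u, Q, y, R):
    """One filter step with identity measurement matrix.

    u: mean error of the state model, Q: process noise covariance,
    y: fixed measurement error,      R: measurement noise covariance.
    """
    F = transition_matrix(dt)
    # Prediction of the state and of its error covariance.
    x_pred = F @ x_est + u
    P_pred = F @ P @ F.T + Q
    # Update; with H = I the innovation covariance is P_pred + R.
    K = P_pred @ np.linalg.inv(P_pred + R)
    x_new = x_pred + K @ (z - y - x_pred)
    P_new = (np.eye(4) - K) @ P_pred
    return x_new, P_new

# Illustrative numbers only (not from the paper's data).
x0 = np.array([0.0, 0.0, 10.0, 5.0])
P0 = np.eye(4)
z = np.array([41.0, 19.0, 10.0, 5.0])
x1, P1 = kalman_step(x0, P0, z, dt=4.0, u=np.zeros(4),
                     Q=0.1 * np.eye(4), y=np.zeros(4), R=np.eye(4))
```

With a 4 second step the prediction moves the position from (0, 0) to (40, 20), and the update pulls it toward the measurement (41, 19) by an amount set by the gain K.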
The algorithm structure of the Kalman filter is shown in Figure 4. Comparing the structure of the Kalman filter with that of the LSTM (Section 2.2) shows many similarities. First, both receive an input vector at every time step; in the Kalman filter, it is the measurement vector. Second, both produce an output at every time step; in the Kalman filter, it is the predicted state vector. Third, both have two states that propagate between time steps; in the Kalman filter, these are the state estimate and its error variance matrix. The Kalman filter therefore has a structure like that of an RNN neuron, so it is possible to treat the Kalman filter as the core of an RNN neuron and train it with the BPTT algorithm.
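The structural analogy can be made concrete by writing the Kalman filter in the shape of a recurrent cell: a function that takes the carried states and one input and returns the new carried states and one output. The sketch below is only illustrative. The names kalman_cell, unroll and params are assumptions, the measurement matrix is the identity, and no training is performed; training with BPTT would additionally need an automatic differentiation framework.

```python
import numpy as np

def kalman_cell(carry, z, params):
    """Recurrent cell: carry = (state estimate, error covariance), input = measurement."""
    x_est, P = carry
    F, u, Q, y, R = params
    x_pred = F @ x_est + u
    P_pred = F @ P @ F.T + Q
    K = P_pred @ np.linalg.inv(P_pred + R)
    x_new = x_pred + K @ (z - y - x_pred)
    P_new = (np.eye(len(x_est)) - K) @ P_pred
    return (x_new, P_new), x_new  # new carry, output of this time step

def unroll(cell, carry, inputs, params):
    """Apply the same cell at every time step, as an unrolled RNN does."""
    outputs = []
    for z in inputs:
        carry, out = cell(carry, z, params)
        outputs.append(out)
    return carry, np.stack(outputs)

# Toy example: a static 2-D state observed 50 times at the same value.
n = 2
params = (np.eye(n), np.zeros(n), 0.01 * np.eye(n), np.zeros(n), np.eye(n))
measurements = np.tile(np.array([1.0, 2.0]), (50, 1))
carry, outputs = unroll(kalman_cell, (np.zeros(n), np.eye(n)), measurements, params)
```

The two carried quantities play the role that the two internal states play in an LSTM unit, and outputs collects the per-step estimates.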
Figure 4. Algorithm structure of Kalman filter (surviving panel label: Variance Prediction).

Error prediction based on BP neural network
As shown in Section 3.1, the parameters that need to be estimated dynamically during filtering are the error vectors and the error variance matrices of the state equation and of the measurement equation. To make them trainable, each parameter vector or matrix is predicted by a group of BPNNs, called a BPG. The structure of the network is shown in Figure 5. Each BPNN in a BPG predicts one component of the output. Because the parameters of the state equation are related to the state and those of the measurement equation are related to the measurement, the BPGs for the former take the state vector as input and the BPGs for the latter take the measurement vector as input. If the output is a vector, the BPG simply concatenates the output values of all its BPNNs; otherwise the BPG builds a matrix with the outputs as its diagonal elements. The final model structure, with BPGs predicting the parameters, is shown in Figure 6.
Figure 6. Trainable Kalman filter model.
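The way a BPG assembles its output can be sketched as follows: one small network per output component, with the scalar outputs either concatenated into a vector or placed on the diagonal of a matrix. The network size, the tanh hidden layer, the random weights and the softplus used to keep the predicted variances positive are assumptions of this sketch and are not stated in the paper.

```python
import numpy as np

def make_bpnn(rng, n_in, n_hidden=8):
    """A hypothetical one-hidden-layer network with a scalar output."""
    return {
        "W1": rng.normal(scale=0.1, size=(n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(scale=0.1, size=n_hidden),
        "b2": 0.0,
    }

def bpnn_scalar(net, x):
    """Forward pass of one BPNN: predicts one component of the parameter."""
    hidden = np.tanh(net["W1"] @ x + net["b1"])
    return float(net["w2"] @ hidden + net["b2"])

def bpg_vector(nets, x):
    """BPG with vector output: concatenate the outputs of all BPNNs."""
    return np.array([bpnn_scalar(net, x) for net in nets])

def bpg_diagonal(nets, x):
    """BPG with matrix output: the outputs become the diagonal elements."""
    raw = bpg_vector(nets, x)
    variances = np.log1p(np.exp(raw))  # softplus (assumption) keeps variances > 0
    return np.diag(variances)

rng = np.random.default_rng(0)
state = np.array([0.4, -0.2, 0.1, 0.05])  # hypothetical normalised state vector
mean_error_bpg = [make_bpnn(rng, 4) for _ in range(4)]
variance_bpg = [make_bpnn(rng, 4) for _ in range(4)]
u = bpg_vector(mean_error_bpg, state)
Q = bpg_diagonal(variance_bpg, state)
```

Predicting only the diagonal matches the independence assumption made for the noise components, under which the variance matrices have no off-diagonal terms.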

Experiment preparation
To verify the performance of the trainable Kalman filter on the radar tracking problem, an experiment was designed to compare the traditional Kalman filter with the trainable Kalman filter. Radar data and ADSB data of the same aircraft over 3 days were selected; the data of the first two days form the training set and the data of the third day form the testing set. To build the training set, the first of every two adjacent records is used as a training sample and the second as its label, which simulates the process of state transition. The resulting training set contains 70000 samples.
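The split and pairing procedure can be sketched as follows. The record layout (t, x, y, vx, vy), the 86400 second day boundary counted from the first timestamp, and the function names are assumptions of this sketch. The sketch also pairs records regardless of the time gap between them, whereas a real pipeline would probably need to break pairs across interruptions in the track.

```python
SECONDS_PER_DAY = 86400

def split_by_day(records, t0):
    """First two days after t0 go to training, the third day to testing."""
    train, test = [], []
    for rec in records:
        day = int((rec[0] - t0) // SECONDS_PER_DAY)
        (train if day < 2 else test).append(rec)
    return train, test

def make_pairs(records):
    """Pair each record (sample) with the record that follows it (label)."""
    return list(zip(records[:-1], records[1:]))

# Hypothetical records in the assumed layout (t, x, y, vx, vy).
records = [
    (0, 0.0, 0.0, 10.0, 5.0),
    (4, 40.0, 20.0, 10.0, 5.0),
    (8, 80.0, 40.0, 10.0, 5.0),
    (2 * SECONDS_PER_DAY + 4, 0.0, 0.0, 10.0, 5.0),
]
train, test = split_by_day(records, t0=0)
pairs = make_pairs(train)
```

Splitting before pairing keeps every sample and its label on the same side of the split.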

Result comparison and analysis
The coordinates of the radar measurement data, the coordinates of the ADSB measurement data, the filtering result of the traditional Kalman filter and the result of the trainable Kalman filter are compared. Part of the track comparison is shown in Figure 7. Intuitively, the filtering accuracy of the trainable Kalman filter is higher than that of the traditional Kalman filter, and its result is closer to the coordinates of the ADSB data. The filtering errors in Figure 8 show that the traditional Kalman filter filters better in the early stage and produces a smoother result. However, as the system parameters change with the aircraft position, parameter errors cause the traditional Kalman filter to accumulate error, which ultimately leads to filter divergence. The error of the trainable Kalman filter is less stable in the initial stage. However, because it dynamically estimates the system parameters during filtering, it adapts better to a dynamic environment, achieves an average error close to 0 after filtering, and reduces the random noise.
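One way to quantify such a comparison is to compute the per-axis mean error and the root mean square position error of each filter output against the ADSB reference. The sketch below shows this on made-up numbers, not on the paper's data; the function name and the choice of metrics are assumptions.

```python
import numpy as np

def position_errors(estimates, reference):
    """Per-axis mean error and Euclidean RMSE against the reference track."""
    diff = np.asarray(estimates, dtype=float) - np.asarray(reference, dtype=float)
    mean_error = diff.mean(axis=0)
    rmse = float(np.sqrt((diff ** 2).sum(axis=1).mean()))
    return mean_error, rmse

# Made-up 2-D positions for illustration only.
reference = np.array([[0.0, 0.0], [10.0, 0.0], [20.0, 0.0]])
estimates = np.array([[1.0, 0.0], [9.0, 0.0], [20.0, 2.0]])
mean_error, rmse = position_errors(estimates, reference)
```

A mean error near 0 indicates that systematic bias has been removed, while the RMSE also reflects the remaining random noise.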