Dynamic Object Tracking Based on Triplet Relationship Guided Sampling Consensus Algorithm

Compared with traditional methods that predict the trajectory of a moving object, methods based on image registration have the advantage of tracking the object without relying on its motion equation. However, when there are many outliers, image matching methods either filter outliers inadequately or erroneously filter inliers, because the spatial constraints on feature matching pairs are limited. This makes the tracking of flying objects inaccurate or impossible. During flight, the object may appear against a similar background, and changes in the shooting angle change the imaging angle of the object, which produces a large number of interfering outliers. To address this situation, this paper adopts TRESAC, an improved algorithm based on RANSAC, which uses a feature matching method based on triplet relationships to filter outliers effectively and then applies an initial data subset selection strategy to increase robustness. Experimental results show that the TRESAC algorithm filters outliers quickly and accurately, and that the object is still tracked effectively under similar backgrounds and pose changes.


Introduction
Object tracking can be based on trajectory prediction for moving objects. A motion model of the object is obtained through mathematical analysis of its trajectory and is used to predict the position of the object at the next moment, thereby achieving tracking [1]. However, when a UAV is used to track dynamic objects, the inertia and external disturbances during the flight of the UAV cause large position changes of the object between adjacent frames, so no regular motion information is available for prediction. The predicted object position then becomes inaccurate and the object is lost. To address such problems, researchers have combined image registration with the traditional trajectory prediction method to locate the moving object more accurately [2][3]. However, in airborne aerial photography tasks with disturbance, continuously switching between the two methods may impose an unnecessary computational burden. Image registration technology, such as the SIFT (scale-invariant feature transform) algorithm [4], can be used to solve this problem. However, the SIFT algorithm produces many matching errors, resulting in inaccurate object localization [5]. In response, researchers have adopted improved SIFT algorithms, which effectively improve the positioning accuracy for static objects but do not consider the impact of object pose changes [6][7]. Besides improving the SIFT algorithm itself, its results can also be used as initial data and processed further to improve the correct matching rate. For example, the RANSAC algorithm has been used to detect damage to highway billboards and achieved a good matching rate [8].
The methods above give good experimental results in scenes with few outliers, but UAV remote sensing tasks may contain a large number of outliers. Aiming at the low tracking accuracy caused by similar backgrounds and object pose changes in UAV dynamic object tracking tasks, this paper takes the feature pairs matched by the SIFT algorithm as the initial data set and adopts an improved triplet relation feature matching method based on the RANSAC algorithm [9]. The main contributions are as follows:
(1) A triple relation matching point strategy is adopted to make the paired feature points more robust and to improve the accuracy of both feature point matching and object tracking.
(2) The proposed method is verified on datasets with similar backgrounds and object pose changes, which shows that it is effective.

The TRESAC Algorithm
TRESAC (Triplet Relationship Guided Sampling Consensus) is an improvement on RANSAC. The disadvantages of the RANSAC algorithm [10] are that it must iterate repeatedly to obtain the best model and that, due to the lack of spatial constraints on feature matching pairs, it takes a long time and has a low accuracy rate when dealing with a large number of outliers [11]. It therefore cannot meet the application scenario of this paper. TRESAC keeps the basic principle of RANSAC but imposes constraints on the spatial relationship of feature matching pairs and samples the data subsets under the guidance of triplet relations involving high correlation and geometric consistency. Then, according to the initial data subset selection strategy used in this paper, the initial data subset is sampled effectively. In this way, TRESAC can identify and filter outliers and improve the correct rate of matching pairs.
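The cost of plain RANSAC under heavy contamination can be illustrated with the standard iteration-count estimate $k = \lceil \log(1 - p) / \log(1 - w^m) \rceil$, where $w$ is the inlier ratio, $m$ the minimal sample size and $p$ the desired confidence. The sketch below is not part of TRESAC itself. The function name `ransac_iterations`, the sample size of 4 (as for a homography model) and the confidence of 0.99 are assumptions made for illustration, while the outlier ratios are those reported for images A and B in Trial 1.

```python
import math


def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Number of random samples needed so that, with probability
    `confidence`, at least one sample contains only inliers."""
    if not 0.0 < inlier_ratio < 1.0:
        raise ValueError("inlier_ratio must lie strictly between 0 and 1")
    p_clean_sample = inlier_ratio ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_clean_sample))


# Outlier ratios of images A and B (Trial 1). The sample size of 4 is an
# assumption for illustration, not a value stated in the paper.
for outlier_ratio in (0.222, 0.533):
    needed = ransac_iterations(1.0 - outlier_ratio, sample_size=4)
    print(f"outlier ratio {outlier_ratio:.3f}: about {needed} samples")
```

Under these assumptions the required number of uniform random samples grows quickly with the outlier ratio, which is the behaviour that guided sampling is meant to avoid.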

The Principle of Triple Relations
The key idea of TRESAC is to analyze the triplet relations between the input feature matches, that is, the input data. Each triplet consists of three data whose mutual relations satisfy the spatial consistency constraint, which yields a series of triplet sets. As shown in Figure 1, in (a) $s_1$ and $s_2$ are spatially close to each other, so the spatial consistency constraint is satisfied and both are judged to be interior points. In (b), the triple relation adds multiple constraints, so the outliers in the three sets of matches can be effectively detected.
For the feature points $x_i$ and $y_i$ in a pair of images, the k-nearest neighbours $N_{x_i}$ and $N_{y_i}$ are searched according to their spatial relationships, respectively. For a feature point $x_i$, the triplet relation between it and other feature points in the set $S$ is defined as follows.
Definition 1: A triple contains three feature points if and only if the relation between them satisfies Equation (1), where $\mathbb{1}(x \in N)$ denotes an indicator function that takes the value 1 if $x \in N$ and 0 otherwise. For a feature point, there may be more than three feature points satisfying Equation (1). Thus, $x_i$ is associated with a set of triples, and likewise $y_i$.
Subsequently, through an initial data subset selection strategy, the triplet relationship is used to capture the sampling weight of the data, and the sensitivity to outliers can be significantly reduced by sampling promising subsets as initial data.
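One possible reading of the triplet relation can be sketched in code. Because the neighbourhood condition is only described in words here, the sketch assumes that two matches form a triple with a match $s_i$ when both are among the k-nearest neighbours of $s_i$ in both images. This criterion, the function names `knn_indices` and `find_triples`, and the representation of a match as a pair of 2-D points are assumptions for illustration, not the exact definition used by TRESAC.

```python
import math
from itertools import combinations


def knn_indices(points, k):
    """For each point, return the index set of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(others[:k]))
    return neighbours


def find_triples(matches, k=3):
    """Return index triples (i, a, b) such that matches a and b are
    neighbours of match i in BOTH images (assumed triplet criterion)."""
    src = [m[0] for m in matches]   # feature points x_i in the first image
    dst = [m[1] for m in matches]   # feature points y_i in the second image
    n_src, n_dst = knn_indices(src, k), knn_indices(dst, k)
    triples = []
    for i in range(len(matches)):
        shared = sorted(n_src[i] & n_dst[i])
        for a, b in combinations(shared, 2):
            triples.append((i, a, b))
    return triples


# Four matches related by a pure translation plus one wrong match (index 4).
matches = [((0, 0), (10, 0)), ((1, 0), (11, 0)), ((0, 1), (10, 1)),
           ((1, 1), (11, 1)), ((5, 5), (100, 100))]
print(find_triples(matches, k=2))
```

In this toy example the wrong match takes part in no triple, because its neighbourhoods in the two images do not agree.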

Initial data subset selection strategy
Given two feature matches $s_i$ and $s_j$, their compatibility score $c(s_i, s_j)$ can be calculated through the pairwise relationship, as shown in Equation (2). By Equation (2), a pair of feature matches that satisfies the spatial consistency constraint obtains a high compatibility score, that is, it has a high probability of being an interior point. For a triple $(s_i, s_j, s_k)$, the compatibility score $c(s_i, s_j, s_k)$ of the feature matches within the triple is defined in Equation (3), which combines the pairwise scores $c(s_i, s_j)$, $c(s_j, s_k)$ and $c(s_i, s_k)$ of Equation (2).
According to Equation (3), if the three feature matches in the triple are interior points, the compatibility score is high. Otherwise, the compatibility score is low. Based on the compatibility scores obtained from the triplet set $A$, the sampling weight of each of the $N$ data, written here as $w(s_i)$, is defined as shown in Equation (4):
$$w(s_i) = \begin{cases} \max\{c(s_i, s_j, s_k)\}, & \text{if } s_i \text{ has an associated triple in } A \\ 0, & \text{otherwise} \end{cases} \quad (4)$$
In Equation (4), the sampling weight of $s_i$ is the maximum value $\max\{c(s_i, s_j, s_k)\}$ of the compatibility score over the triples associated with $s_i$, and 0 otherwise. Selecting an initial subset of data according to these sampling weights therefore prevents most outliers from being selected during the sampling process.
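The selection strategy can be illustrated with a short sketch. Several choices in it are assumptions rather than the paper's definitions: the pairwise score is taken to be a Gaussian of the difference between the distances of two matches in the two images (which ignores scale change), the triple score is taken to be the product of the three pairwise scores, and every member of a triple is treated as associated with it. Only the weight rule (maximum triple score, or 0 without a triple) follows the description of Equation (4). All function names are hypothetical.

```python
import math
import random


def pair_score(m_a, m_b, sigma=1.0):
    """Assumed pairwise compatibility: near 1 when the distance between the
    two matches is preserved across the images."""
    d_src = math.dist(m_a[0], m_b[0])
    d_dst = math.dist(m_a[1], m_b[1])
    return math.exp(-((d_src - d_dst) ** 2) / (2.0 * sigma ** 2))


def triple_score(matches, triple, sigma=1.0):
    """Assumed triple compatibility: product of the three pairwise scores."""
    i, j, k = triple
    return (pair_score(matches[i], matches[j], sigma)
            * pair_score(matches[j], matches[k], sigma)
            * pair_score(matches[i], matches[k], sigma))


def sampling_weights(matches, triples, sigma=1.0):
    """Weight of a match = maximum score of its triples, 0 if it has none."""
    weights = [0.0] * len(matches)
    for triple in triples:
        score = triple_score(matches, triple, sigma)
        for idx in triple:
            weights[idx] = max(weights[idx], score)
    return weights


def sample_initial_subset(weights, size, rng=random):
    """Draw `size` distinct indices with probability proportional to weight."""
    candidates = [i for i, w in enumerate(weights) if w > 0.0]
    if len(candidates) < size:
        raise ValueError("not enough matches with non-zero weight")
    chosen = []
    while len(chosen) < size:
        pool_weights = [weights[i] for i in candidates]
        pick = rng.choices(candidates, weights=pool_weights)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return chosen


# Toy data: matches 0 to 3 share a translation, match 4 is wrong.
matches = [((0, 0), (10, 0)), ((1, 0), (11, 0)), ((0, 1), (10, 1)),
           ((1, 1), (11, 1)), ((5, 5), (100, 100))]
triples = [(0, 1, 2), (1, 2, 3)]
weights = sampling_weights(matches, triples)
print(weights, sample_initial_subset(weights, size=3))
```

Because a match without any triple receives weight 0, it can never enter the initial subset, which is how the weighting keeps most outliers out of the sampling step.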

Object localization
The object tracking in this paper is based on the features of image registration. The image of the object to be tracked is taken as the template image and each video frame is taken as the image to be matched. The feature matching pairs between the image to be matched and the template image are then screened to realize object tracking in the scene. The basic process is shown in Figure 3.
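The per-frame flow described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions: the matches between the template and each frame are given as lists of point pairs, `filter_outliers` is a hypothetical placeholder for any outlier filter (such as RANSAC or TRESAC), and the object position is approximated by the bounding box of the surviving frame-side points, which is an assumed localization rule rather than the one used in the paper.

```python
def bounding_box(points):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around 2-D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def track(matches_per_frame, filter_outliers, min_inliers=4):
    """Locate the object in every frame from template-to-frame matches.

    matches_per_frame: one list of (template_point, frame_point) per frame.
    filter_outliers:   callable returning the matches kept as inliers.
    Returns one bounding box per frame, or None when the object is lost.
    """
    boxes = []
    for matches in matches_per_frame:
        inliers = filter_outliers(matches)
        if len(inliers) < min_inliers:
            boxes.append(None)  # too few reliable matches in this frame
        else:
            boxes.append(bounding_box([frame_pt for _, frame_pt in inliers]))
    return boxes


# Usage with a trivial filter that keeps every match (illustration only).
frames = [
    [((0, 0), (5, 5)), ((1, 0), (6, 5)), ((0, 1), (5, 6)), ((1, 1), (6, 6))],
    [],
]
print(track(frames, filter_outliers=lambda m: m))
```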

Experimental results and analysis
To verify the effectiveness of the algorithm used in this paper, experiments are carried out under different working conditions. The running time, correct matching rate, and effective tracking rate of the algorithm are used as the evaluation criteria. The formula for calculating the correct matching rate is shown in Equation (5).
$$\text{Correct matching rate} = \frac{TP + TN}{TP + TN + FP + FN} \quad (5)$$
In Equation (5), TP (True Positive) means correctly classified as an inlier, FN (False Negative) means incorrectly classified as an outlier, FP (False Positive) means incorrectly classified as an inlier, and TN (True Negative) means correctly classified as an outlier.
The effective tracking rate formula [12] is shown in Equation (6):
$$\text{Effective tracking rate} = \frac{\text{Number of successfully tracked target frames}}{\text{Number of target frames detected in a single frame}} \times 100\% \quad (6)$$
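Both evaluation criteria are simple ratios and can be computed as in the sketch below. The function names are chosen for illustration, and the counts in the usage lines are made-up values, not results from the experiments.

```python
def correct_matching_rate(tp, tn, fp, fn):
    """Equation (5): share of matches classified correctly (as a fraction)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one match is required")
    return (tp + tn) / total


def effective_tracking_rate(tracked_frames, detected_frames):
    """Equation (6): successfully tracked frames over detected frames, in %."""
    if detected_frames <= 0:
        raise ValueError("detected_frames must be positive")
    return 100.0 * tracked_frames / detected_frames


# Hypothetical counts, for illustration only.
print(correct_matching_rate(tp=40, tn=50, fp=6, fn=4))
print(effective_tracking_rate(tracked_frames=45, detected_frames=50))
```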

Trial 1-Object tracking with different outliers
The object tracked in this experiment is an aircraft in flight. The aircraft image is taken as the input image and the flight scene as the matching image. The SIFT algorithm is used to obtain preliminary matches for the two images A and B, whose outlier ratios are 22.2% and 53.3%, respectively. RANSAC and TRESAC were then used to process the data, and their outlier filtering and tracking effects are compared in Figure 4. With a small number of outliers (22.2%), both the RANSAC algorithm and the TRESAC algorithm can filter out the outliers and track the object accurately. With a large number of outliers (53.3%), however, the RANSAC algorithm tracks poorly because it cannot match correctly, while the TRESAC algorithm can still track accurately.
To further compare the algorithms, their running time and correct matching rate are listed in Table 1. Table 1 shows that for image A, the correct matching rates of the RANSAC algorithm and the TRESAC algorithm reach 91.72% and 99.7%, respectively. In terms of running time, the TRESAC algorithm used in this paper is much faster than the RANSAC algorithm, taking only 0.2070 seconds while maintaining a high matching rate. For image B, which contains a large number of outliers, the correct matching rate of the RANSAC algorithm drops to 80.1%, while that of the TRESAC algorithm stays above 90%, and the TRESAC algorithm is also more efficient than the RANSAC algorithm.

Trial 2-TRESAC tracking effect
To better verify the applicability of the algorithm adopted in this paper, this test uses video data pre-processed by the SIFT algorithm. The data set information is shown in Table 2. The images of this data set contain similar backgrounds, and the attitude of the object changes during the flight. There are also disturbances, which cause a large movement range of the object between adjacent frames. The experimental results show that TRESAC based on image registration is suited to this task. Figure 5 shows the tracking effect of TRESAC on the aircraft in frames 7, 40, and 83 of the dataset. The algorithm used in this paper tracks the object well under shooting conditions with disturbance, which meets actual user requirements.

Experimental analysis
To assess the robustness of the algorithm, tracking tests were run on data sets with a small number and a large number of outliers in the image registration, and the test results are shown in Table 3. The results verify the robustness and feasibility of the TRESAC algorithm and show that the correct registration rate of image registration is improved. The average running time per frame in processing the dataset is 0.058 seconds, and the average correct matching rate is 91.36%. The proposed method tracks the object effectively, with an effective tracking rate of 96%, which makes up for the inability of the traditional method based on motion prediction of the object position to track effectively under disturbance. However, the method matches inaccurately when the object pose changes greatly. This problem will be addressed in subsequent work.

Summary
In the task of photographing and tracking a moving object with a camera mounted on a UAV, the RANSAC algorithm tracks the object inaccurately because of the large number of outliers caused by similar image backgrounds and changes in the object pose. To address this problem, the TRESAC algorithm is used to identify and track the object, and it adapts robustly to imaging backgrounds with different numbers of outliers. The results show that the algorithm can support the effective implementation of tracking tasks with a UAV-mounted camera. Compared with the RANSAC algorithm, it provides more accurate tracking and a shorter running time.

Figure 1. Comparison of robust model estimates for (a) pairwise and (b) triplet relationships

If the feature points $x_i$ and $y_i$ of a feature match have corresponding triples, the three feature matches $s_i$, $s_j$, and $s_k$ are regarded as a matching group with the associated triple feature, as shown in Figure 2.

Figure 2. Illustration of identifying triples for feature matching: (a) three feature matches $(s_1, s_2, s_3)$

Figure 3. The basic flow of object localization: (a) original image, where the left image is the image to be matched and the right image is the input tracking object; (b) feature matching pairs between the images, obtained with the SIFT algorithm.

Figure 4. Feature point pairing and tracking effect of different algorithms: (a) results of Image A; (b) results of Image B

Figure 5. Effect of object tracking

Table 1. Comparison of the algorithm running time and correct matching rate

Table 2. Information on the data set in this paper