Sparse optical flow outlier elimination method based on Borda stochastic neighborhood graph

During the tracking of moving targets in dynamic scenes, efficiently handling outliers in the optical flow and maintaining robustness across various motion amplitudes is a critical challenge. To date, studies have used thresholding and local-consistency approaches to deal with optical flow outliers. However, expert-defined thresholds and delineated regions introduce subjectivity, so these methods do not perform consistently under different target motion amplitudes. Other studies have focused on complex statistical-mathematical modeling which, although theoretically valid, requires significant computational resources. To address these problems, this paper proposes a new method for identifying optical flow outliers using a stochastic neighborhood graph combined with the Borda counting method, which reduces the computational cost while eliminating outliers objectively. Sparse optical flow (SOF) values serve as the overall population, with outlier and inlier SOF values as samples. The dissimilarity between SOF data points is analyzed to obtain a dissimilarity matrix; a Gaussian function is introduced to smooth and reduce the dimensionality of this matrix, and the smoothed matrix is then normalized to generate a binding matrix, in which the probabilities from each node to all other nodes sum to 1. Stochastic neighborhood graphs are then generated from the binding matrix to obtain the outlier probability of each data point in different neighborhood graphs, and outlier samples are identified from these probabilities. To avoid the subjectivity of expert thresholds, the outlier probabilities are weighted and ranked to calculate each data point's Borda score, yielding accurate optical flow outliers.
The experimental results show that the proposed method is robust to motions of different amplitudes and to real scenarios, and that its accuracy, precision, and recall for outlier elimination are better than those of current mainstream algorithms.


Introduction
Camera movement tracking technology (CMTT) [1] has a wide range of applications, such as video surveillance [2], robot vision [3], unmanned vehicles [4], and medical image analysis [5]. Video surveillance systems typically require object tracking to ensure security [2]. Robot vision systems rely on precise location information to perform various tasks [3], while autonomous driving technology depends on CMTT for environmental perception and navigation, ensuring safety and efficiency during vehicle operation [4]. In the field of medical image analysis, CMTT is used to stabilize and calibrate medical images, such as x-rays, CT scans, MRI, and ultrasound images. By tracking camera motion, doctors can more accurately locate and diagnose a patient's condition, reducing errors and improving the success rate of surgeries. However, there are still challenges in the application of this technique, owing to complex environmental changes and the diversity of target movements. Precise optical flow values are the key to achieving CMTT. To improve the accuracy of optical flow data, many researchers have obtained precise optical flow values by improving the robustness of the optical flow method.
Sparse optical flow (SOF) [6] computes a sparse optical flow field from the main features of a moving target (corner points, contours, texture features, etc) and estimates the position of those features in the next image frame by comparing the pixels of two adjacent frames. SOF outliers generated in special scenes usually refer to errors in optical flow values caused by factors such as occlusion and illumination changes due to large movements of the target, or the loss of feature pixels due to small-amplitude movements. These outlier optical flow values seriously affect the analysis results and need to be eliminated. Currently, researchers have widely used methods based on thresholds [7-9], local area consistency [10, 11], and statistics [12] to eliminate outliers in optical flow values.
The threshold-based method operates on the magnitude of the optical flow value at each pixel in the optical flow field. A threshold is set, and pixels whose optical flow value exceeds the threshold are defined as outliers. Zheng et al [7] combined median filtering and total variational regularization to remove outliers and noise from optical flow values, improving the robustness and precision of optical flow estimation. Silva et al [8] utilized a wavelet transform to decompose the signal into different frequency components and performed threshold processing based on the statistical characteristics of each component to eliminate optical flow errors caused by noise. Doshi and Kiran [9] combined a total variational energy function with an optical flow estimation model; this method is able to suppress noise while preserving image edge information. However, threshold-based methods handle outliers significantly larger than the threshold well but fail for outliers close to the threshold.
The local area consistency method calculates the average optical flow value of the pixels within a local area, assumes that the optical flow is approximately uniform there, and treats pixels whose optical flow value is larger than the average as outliers. In [10], the optical flow field was estimated using an optical flow estimator based on a local direction filter. The local direction filter filters the optical flow field and protects edge information to eliminate outliers, thereby improving the precision of the optical flow method. Rao et al [11] adopted an algorithm based on a minimum-value filter to reduce the optical flow errors introduced by image noise and distortion.
Statistical methods model the optical flow field using distributions such as the Gaussian or Laplace distribution [12] and use the fitted model to detect and eliminate outliers.
In summary, threshold-based methods are generally not suitable for handling non-uniformly distributed outliers since they cannot adapt to varying outlier density across different regions. Local area consistency-based methods miss some true outliers when the defined local regions are too small, and they may include non-outliers when the regions are too large. Statistical methods require complex mathematical computations, which can lead to high computational complexity, especially when dealing with large-scale image data.
To address the limitations of existing methods, this paper proposes a SOF outlier elimination method based on the Borda stochastic neighborhood graph (B-SNG). The proposed approach employs a stochastic neighborhood graph to calculate the outlier probability of each optical flow value, avoiding the excessive computational complexity of statistical methods and eliminating restrictions related to the data distribution. Additionally, it uses the Borda counting method to prevent the non-uniform density distribution and increased false elimination rates caused by manually set thresholds and local area consistency methods.
The overall logic of the B-SNG algorithm proposed in this study is similar to that of the SOS algorithm [13]. The difference is that SOS obtains the affinity matrix A by calculating the variance between eigenvalues for dimensionality reduction, whereas this study uses a Gaussian function for smoothing and dimensionality reduction, removing the variance calculation and improving the speed of the algorithm. The SNGs constructed by SOS and the directed edges of the bound nodes are all random processes, so the resulting values differ between runs, and an expert threshold is used to decide whether a point is an outlier, which is subjective. In this study, we instead use the Borda counting method for multi-weighted outlier voting and eliminate outliers by ranking Borda scores. B-SNG is an unsupervised learning method that is insensitive to noise, automatically eliminates outliers, and removes erroneous vectors caused by motion magnitude and similar factors.
In this study, the outlier elimination mechanism is used for robust SOF to ensure high recall while improving the accuracy and precision of the SOF under different motion modes. As shown in figure 1, depth image information was obtained using charge-coupled device (CCD) and velodyne sensor (VS) techniques. Using perspective transformation, the motion of an object in the 3D video stream captured by a moving camera is transformed into the changing relationship between pixels in a 2D image sequence over time. The Lucas-Kanade SOF method was used to obtain the optical flow values of feature points in the image sequence as the SOF dataset. SOF values follow a clustered distribution, and this study introduces stochastic neighborhood graphs and the Borda counting method to eliminate optical flow outliers. First, we used the Euclidean distance to calculate the dissimilarity between each pair of data points. A Gaussian function was used to smooth and reduce the dimensionality of the dissimilarity matrix, yielding a smoothing matrix. The smoothing matrix was then normalized to obtain a binding matrix in which the elements of each row sum to a probability of 1. A stochastic neighborhood graph is generated from the binding matrix, and the outlier probability of a data point x_a is obtained by calculating the probability that x_a is selected as a neighbor by a stochastic data point x_b. The outlier probabilities of the data points differ across stochastic neighborhood graphs. The Borda counting method assigns adaptive weights to the data points, and the Borda scores of the data points are calculated as outlier probability scores after multiple iterations, reducing the subjective influence of expert thresholds.
The main contributions of this study are as follows. (1) This study constructs an accurate SOF feature dataset, Odataset, based on feature point detection, providing strong data support for research on optical flow outlier elimination. (2) It is the first to use the concepts of stochastic neighborhood graphs and Borda counting for SOF outlier elimination. Using stochastic neighborhood graphs to calculate the outlier probability of optical flow values avoids the large computational volume of statistics-based methods. The Borda counting method replaces expert thresholds, solving the uneven density distribution and false elimination problems of threshold and local region consistency based methods. (3) An unsupervised outlier elimination method called B-SNG is proposed. This method calculates the outlier probability of optical flow values based on stochastic neighborhood graphs and adaptively determines the weight of the outlier probability using the Borda counting method. Optical flow outliers can be effectively eliminated, and the accuracy and precision of the SOF improved while maintaining a high recall rate, making the method applicable to different motion modes.
The study is structured as follows. In section 1, problems with existing outlier elimination mechanisms for robust optical flow estimation are cited, and the structure and innovations of the study are presented. In section 2, an overview of existing outlier elimination algorithms is provided, and the method for obtaining SOF in this study is presented. Section 3 describes the main techniques applied in the proposed method. In section 4, the model training and evaluation metrics are identified, and experimental validation and data analyses are completed. Section 5 summarizes the entire text.

Related work
The introduction reviewed and analyzed the strengths and weaknesses of the methods that have been used to eliminate outlier optical flow values. This section further surveys popular outlier elimination algorithms that have not yet been applied to optical flow estimation, and describes the method used to convert the dense optical flow in a given dataset into SOF values.

Outlier elimination algorithms
Popular outlier elimination algorithms include methods based on distribution, distance, dimension reduction, density, clustering, and trees. Choosing a suitable optical flow outlier elimination algorithm must be analyzed on a case-by-case basis.
The distribution-based 3σ [14], Z-score [15], and Boxplot [16] methods are simple and easy to use, but they cannot accurately output the normal range, and their one-by-one judgment mechanism is not suitable for models with large amounts of data.
The distance-based K-nearest neighbor (KNN) method [17] does not require assumptions about the data distribution and can find global outliers, but it cannot find local outliers. Principal component analysis (PCA) [18] and AutoEncoder [19], which are based on dimensionality reduction, can simultaneously process multiple features and transform them into a new set of uncorrelated variables. LUNAR (unifying local outlier detection methods via graph neural networks) [20] trains a graph neural network (GNN) on a graphical representation of the data, where each data point is a node and the edges between nodes represent distances between data points; different outlier detection techniques are then applied to each node to identify outliers. However, distance-based approaches are ineffective for nonlinear data.
Density-based approaches can effectively detect multidimensional outliers without restricting the data distribution. The gaussian mixture model (GMM) [21] algorithm assumes that the dataset is a mixture of multiple Gaussian distributions and estimates the model parameters by maximizing the likelihood function. The probability density of each data point under the model is used as its GMM score, and a threshold determines whether the point is an outlier. For high-dimensional datasets and datasets with many outliers, the selection of appropriate model parameters is not sufficiently robust. The SOS algorithm [13] constructs multiple SNGs, with each neighborhood graph randomly selecting a data point from the dataset as its center. The variance between each data point and the other points in its neighborhood graph is calculated for dimension reduction and then normalized to obtain the outlier probability of each data point across multiple neighborhood graphs; whether this probability exceeds an artificial threshold determines outlier elimination. Copula-based outlier detection (COPOD) [22] uses kernel density estimation or histogram-based methods to estimate the marginal distributions. A copula function then models the dependency structure between variables, and a threshold on the degree of dependency is set to eliminate outliers. Density-based methods require a preset density threshold, are sensitive to the data distribution, and are susceptible to interference from noise.
The clustering-based local subspace clustering and pruning (LSCP) method [23] clusters features in subspaces, with each subspace representing a cluster. Pruning removes unnecessary or redundant features from the dataset, reduces the dimensionality of the feature space, and improves the efficiency of the clustering algorithm. The local correlation integral (LOCI) method [24] constructs a distance graph by calculating the distance between each data point and the other points in its neighborhood. The local correlation integral (LCI) between each data point and the other points in its neighborhood is then calculated, and the average LCI is used as that point's LOCI score. Finally, a threshold determines whether a point is an outlier. Clustering-based algorithms have high computational complexity, require calculating the distance between each data point and the other points in its neighborhood, and require tuning several parameters.
The tree-based isolation nearest-neighbor ensembles (INNE) method [25] randomly selects a subset of the data to construct a decision tree [26]. The decision tree recursively divides the data into subsets based on randomly selected features and thresholds, and assigns each data point an outlier score based on the depth of the leaf node in which the point falls. Data points with high anomaly scores are regarded as outliers. Tree-based approaches are relatively ineffective for detecting anomalies in low-dimensional data.

Obtaining SOF values
As shown by the green dashed box in figure 1, this section uses the CCD and VS to acquire 3D scene information and then converts it into 2D image sequences using perspective transformation. The Lucas-Kanade SOF method based on Shi-Tomasi corner detection is used to obtain the SOF values in the 2D image sequences.

Obtaining 2D images
In this section, a CCD [27] and VS [28] were used for image acquisition. As shown in figure 2, to obtain the 3D coordinate displacement offsets (u, v), a perspective transformation in the global coordinate system was used to transform two adjacent frames of 3D feature points, from P_0 to P, into a 2D camera image dataset.
All positions and orientations of the CCD with respect to the global coordinate system are computed by means of a perspective transformation [27]. The CCD coordinate system takes the direction of vehicle travel as the Xc axis, and Xc, Yc, and Zc are mutually perpendicular, forming a three-dimensional coordinate system. The global coordinate system is XYZ. [R | t] is the transformation between the camera's current coordinate system and the camera's initial coordinate system. Assuming that the first ground-truth pose is (0, 0, 0) and the input is in the 3D global coordinate system, the output determines the displacement of the target between frames via the camera coordinate system and the transformation matrix [R | t].

Shi-Tomasi-based SOF estimation
The (x, y) obtained in section 2.2.1 are 2D coordinate points. The corner detection method of Kaur et al [29] was used to extract the corner points with significant motion in the image as SOF estimates, stored in an N × 2 matrix. The corner response value R introduced in the Harris algorithm [30] is used to determine whether feature points are corners or boundaries, as in equation (1):

R = det(M) − α(trace(M))²,     (1)

where M is the 2 × 2 structure matrix of image gradients. The Shi-Tomasi corner detection algorithm optimizes this R-value equation. Because α in the Harris corner detector is an empirical value, it is difficult to set optimally. Shi et al found that corner stability is related to the smaller eigenvalue of matrix M, so the R-value formula can be rewritten as

R = min(λ₁, λ₂) > λ,     (2)

where the threshold λ is set to a minimum value of 0.01 to ensure that all corner points satisfying the decision condition are detected. In this study, all pixels were sorted according to the magnitude of their smaller eigenvalue, and the top N pixels were selected as corner points.
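The Shi-Tomasi response of equation (2) can be sketched in a few lines. This is an illustrative pure-Python version (the function names and argument layout are ours, not the paper's); the structure matrix M is passed as its three distinct entries, and the smaller eigenvalue of the symmetric 2 × 2 matrix is computed in closed form.

```python
import math

def shi_tomasi_response(sum_ix2, sum_iy2, sum_ixiy):
    """Shi-Tomasi corner response: the smaller eigenvalue of the 2x2
    structure matrix M = [[sum Ix^2, sum IxIy], [sum IxIy, sum Iy^2]]."""
    trace = sum_ix2 + sum_iy2
    det = sum_ix2 * sum_iy2 - sum_ixiy ** 2
    # Eigenvalues of a symmetric 2x2 matrix: trace/2 +/- sqrt(trace^2/4 - det)
    disc = math.sqrt(max(trace * trace / 4.0 - det, 0.0))
    return trace / 2.0 - disc  # min(lambda1, lambda2)

def is_corner(R, lam=0.01):
    """Keep a pixel as a corner when R exceeds the threshold lam (0.01 here)."""
    return R > lam
```

In a full detector, this response would be evaluated per pixel over a gradient window, then the strongest N responses kept.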
The Lucas-Kanade SOF method was used to obtain the correspondence of feature points between adjacent frames. Let I(x, y, t) be the grayscale value of the feature point (x, y) at time t, and let u and v be the x and y velocity components of the optical flow at that point. Assume the pixel moves to (x + ∆x, y + ∆y) after time ∆t, where ∆x = u∆t and ∆y = v∆t. Comparing the pixels of feature points in two adjacent frames using equation (3) determines whether two feature points correspond:

I_x u + I_y v + I_t = 0,     (3)

where I_x and I_y represent the grayscale gradients in the x and y directions, and I_t represents the rate of change of grayscale over time. Feature points satisfying the correspondence condition of equation (3) are denoted P_0 → P, and the (u, v) values are saved.
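Equation (3) gives one constraint per pixel, so Lucas-Kanade solves a least-squares system over a window of gradients. The following sketch (our own illustrative code, not the paper's implementation; `Ix`, `Iy`, `It` are flat lists of gradients over one window) solves the 2 × 2 normal equations for (u, v):

```python
def lk_flow(Ix, Iy, It):
    """Least-squares Lucas-Kanade flow for one window: minimize
    sum (Ix*u + Iy*v + It)^2 over the window's pixel gradients."""
    # Normal equations: [[S Ix^2, S IxIy], [S IxIy, S Iy^2]] [u v]' = -[S IxIt, S IyIt]
    a = sum(ix * ix for ix in Ix)
    b = sum(ix * iy for ix, iy in zip(Ix, Iy))
    c = sum(iy * iy for iy in Iy)
    p = -sum(ix * it for ix, it in zip(Ix, It))
    q = -sum(iy * it for iy, it in zip(Iy, It))
    det = a * c - b * b
    if abs(det) < 1e-12:
        raise ValueError("aperture problem: gradient matrix is singular")
    u = (c * p - b * q) / det
    v = (a * q - b * p) / det
    return u, v
```

When the window's gradients all point the same way the determinant vanishes, which is the aperture problem the spatial-consistency assumption is meant to avoid.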
In CMTT, given the Lucas-Kanade assumptions of constant brightness, small motion, and spatial consistency, the optical flow estimates (u, v) within the same region, where time and velocity are consistent, tend to appear in clusters. In view of the advantages

The proposed approach
The SOF outlier elimination method based on B-SNG is the core of this study. As shown in the brown dashed box in figure 1, the optical flow estimates corresponding to the pixel positions of the corner points are used as the input data points (u, v). The dissimilarity matrix D is obtained by calculating the dissimilarity between data points using the Euclidean distance. The dissimilarity matrix is then smoothed using a Gaussian function to obtain the smoothing matrix S. Because the elements of S do not form a probability distribution, the smoothed matrix is normalized to obtain the binding matrix B. The binding matrix is then combined with the SNG to calculate the outlier probabilities. Finally, the Borda score is obtained from the outlier probability ranking and the score correspondence table; when multiple SNGs are generated, the Borda counting method assigns a ranking to each data point, with higher rankings indicating higher outlier probability.

Dissimilarity matrix
The SOF between two frames obtained using the Lucas-Kanade SOF method is saved in the flow variable with dimension (n_samples, 2), where n_samples is the number of samples, and each sample has two features representing the offset (u, v) in the x and y directions, respectively.The accuracy of the optical flow values is improved by treating each sample as a data point and using B-SNG to detect and filter out data points that may be outliers.
In the B-SNG algorithm, the SOF dataset X is taken as input; each data point is x = [u, v], there are n points, and X is an n × 2 matrix. The input is thus a two-dimensional array X, where each row represents a data point and each column represents a feature of the data point. The point-to-point dissimilarity matrix D is calculated using the Euclidean distance, where d_ab denotes the difference between points x_a and x_b:

d_ab = sqrt( Σ_k (x_ak − x_bk)² ),     (4)

where x_ak denotes the kth feature of the ath data point; in this study k = 1, 2. The value in row a and column b represents the Euclidean distance between the ath and bth data points. Equation (4) satisfies non-negativity, symmetry, and the triangle inequality, so all elements of the matrix are non-negative and all diagonal elements are 0.
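The dissimilarity matrix of equation (4) can be sketched in pure Python (the function name is ours; a NumPy implementation would vectorize the double loop). Symmetry lets us fill both triangles from one distance computation:

```python
import math

def dissimilarity_matrix(X):
    """Pairwise Euclidean dissimilarity matrix D for a list of [u, v] flow vectors."""
    n = len(X)
    D = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            d = math.dist(X[a], X[b])  # sqrt of summed squared feature differences
            D[a][b] = D[b][a] = d      # symmetry; the diagonal stays 0
    return D
```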
Figure 3(a) shows an example of randomly selected data points, where u and v represent the offsets on the x-axis and y-axis, respectively, i.e. the SOF. In figure 3(b), the length of each red line represents the difference between point X_2 and the other 9 data points; the longer the red line, the greater the difference. Figure 4(a) shows a heatmap of the dissimilarity matrix D for 10 sets of optical flow estimates, with colors ranging from dark to light indicating dissimilarity from high to low. To represent the deviation of the data points from each other more clearly, as shown in figure 3(c), each data point is set as a node, and the distance from the other data points to that node is calculated. The variance of each data point, calculated from these distances, indicates its degree of deviation. In the figure, circles are drawn with a radius equal to half the variance of each data point, showing that point X_4 has the greatest deviation.

Smoothing matrix
To reduce the dimensionality of the data while retaining the main features of the data points and reflecting the similarity relationships between them, the matrix D is smoothed with a Gaussian function to obtain the smoothing matrix S, where the dissimilarity from data point a to itself is 0. The smoothing value from data point a to data point b is

s_ab = exp( −d_ab² / (2σ²) ),     (5)

where σ (sigma) is the smoothing factor, which controls the Gaussian function so that its values are smoother after dimensionality reduction; the smaller σ is, the smaller the edge weights, which can easily distort the data values. In this study σ = 9 (the best value obtained through empirical adjustment), which ensures the accuracy of the data after dimensionality reduction while reducing distortion as much as possible.
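As a minimal sketch of the Gaussian smoothing step (function name ours; the paper gives only the formula), each off-diagonal distance is mapped through the Gaussian kernel with σ = 9 while the diagonal is kept at zero:

```python
import math

def smoothing_matrix(D, sigma=9.0):
    """Gaussian smoothing of the dissimilarity matrix D; diagonal kept at 0.
    Larger distances map to smaller affinities in (0, 1)."""
    n = len(D)
    S = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            if a != b:
                S[a][b] = math.exp(-D[a][b] ** 2 / (2.0 * sigma ** 2))
    return S
```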
In equation (5), s_a denotes the similarity distribution of the ath row of the smoothing matrix for data point x_a, where s_aa is 0. Figure 4(b) shows the heat map of the smoothed matrix S for the example data. The smoothing matrix S highlights the affinity between data points more than the dissimilarity matrix D. A darker color in figure 4(a) indicates greater dissimilarity and distance between data points, whereas a darker color in figure 4(b) indicates less dissimilarity between data points.

Binding matrix
The probabilities from any node to the other nodes in the smoothing matrix S do not sum to one, so S is not a probability distribution. The smoothing matrix S is therefore normalized. The dissimilarity between x_a and x_b is proportional to the probability density of x_b. The conditional binding matrix B is computed such that

b_ab = s_ab / Σ_{k≠a} s_ak.     (8)

In equation (8), b_a denotes the binding probability distribution of the ath row of the binding matrix for data point x_a, where b_aa is 0. Figure 4(c) shows the heat map of the binding matrix B for the example data; the shades of color represent the distribution of the binding probability, with darker colors indicating higher binding probability.
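The normalization of equation (8) is a plain row normalization; since the diagonal of S is zero, dividing each row by its sum gives a per-node probability distribution over the other nodes. A sketch (function name ours):

```python
def binding_matrix(S):
    """Row-normalize the smoothing matrix so each row sums to 1.
    Row a is then the binding probability distribution b_a of node a."""
    B = []
    for row in S:
        total = sum(row)  # the diagonal is 0, so this covers all other nodes
        B.append([v / total for v in row])
    return B
```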

Binding probability
The binding probabilities define an SNG with n nodes, where each node V_a corresponds to a data point x_a ∈ X, V_a ∈ V, and has a random set of neighboring nodes. There is a directed edge in E_G from node V_a to node V_b when data point x_a selects data point x_b as a neighbor in graph G. If node V_b is not bound by any node, i.e., its in-degree in G is zero, its data point belongs to the outlier class Co. The only values at this point are outlier (Co = 0) and non-outlier (Co = 1), as shown in equation (9). According to equation (9), the probability that a node's in-degree in G is zero is equal to the probability that the node is not bound by any node. It is assumed that the nodes V are fixed and B is not uniformly distributed; therefore, the probability of generating a particular G depends only on the binding probabilities, p(G). As shown in figure 5(a), there are 10 nodes in the example with directed edges from node V_10 to each point, and the sum of the binding probabilities from V_10 to the remaining 9 nodes is 1. Figure 5(b) shows that the total probability of binding each vertex to V_10 is not equal to 1. Each vertex V_a in the graph is associated with the binding probability distribution b_a. The binding probability of V_10 → V_4 is relatively low, and the binding probability of V_4 → V_10 is relatively low.
Figure 6 shows a schematic diagram of graph G, in which the nodes are fixed and a node can be bound to multiple nodes. It can be observed that the binding probability calculation is not a closed-loop calculation. After obtaining the binding probabilities, the neighbor nodes of each node V_a in graph G are chosen randomly, giving the graph G = (V, E_G). In figure 6, the SNG is a stochastically generated neighborhood graph; in the example with n = 10 nodes, there are (n − 1)^n possible SNGs. In the two cases shown in figure 6, when {X_1, X_4, X_6, X_10} are outliers, graph G_a has a binding probability of 2.85 × 10^−5; when {X_2, X_3, X_7} are outliers, graph G_b has a binding probability of 2.4345 × 10^−10. The higher the binding probability, the smaller the dissimilarity between the nodes.

Outlier probability
To reduce the computational effort while guaranteeing accuracy, in this study each node in a given SNG is allowed to bind to only one node, forming a closed loop for the outlier probability. The outlier probability of equation (11) is calculated from a given graph G as the probability that a randomly selected node is not bound by any other node:

p(x_a ∈ Co) = Π_{b≠a} (1 − b_ba),     (11)

where Co denotes the outlier class. In figure 9, node V_1 is randomly selected; each node can bind to only one node, and a closed-loop operation is then performed along the nearest neighbors of that point. When {X_4, X_5, X_8, X_9} are outliers, the outlier probability of X_1 is 0.48921; when {X_4} is an outlier, the outlier probability of X_1 is 0.30502. The greater the outlier probability, the greater the dissimilarity between nodes.
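The graph-sampling view of this step can be sketched by Monte Carlo simulation (our own illustrative approximation, not the paper's exact closed-loop procedure): in each sampled SNG every node binds to exactly one neighbor drawn from its row of B, and a node's outlier probability is estimated as the fraction of graphs in which its in-degree is zero:

```python
import random

def sample_outlier_prob(B, gn=200, seed=0):
    """Estimate per-node outlier probabilities over gn stochastic neighborhood
    graphs. In each graph every node binds to one neighbor drawn from its row
    of the binding matrix B; a node nobody binds to (in-degree 0) counts as an
    outlier in that graph."""
    rng = random.Random(seed)
    n = len(B)
    zero_indegree = [0] * n
    for _ in range(gn):
        bound = set()
        for a in range(n):
            # choose one neighbor of node a according to its binding probabilities
            neighbor = rng.choices(range(n), weights=B[a])[0]
            bound.add(neighbor)
        for a in range(n):
            if a not in bound:
                zero_indegree[a] += 1
    return [c / gn for c in zero_indegree]
```

Nodes that are far from the cluster receive tiny binding probabilities from the others, so they are rarely bound and their estimated outlier probability is high.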
A data point x_a is an outlier when p(x_a ∈ Co) > θ. As shown in figure 7, with the same threshold θ, the output outlier probability of X_1 will not be exactly the same across runs, even when the nodes are the same, because the randomly bound nodes differ each time. As shown in figure 8(a), a graph G is returned using the binding matrix B as input, where 0 or 1 indicates whether there is an edge from node a to node b. As shown in figure 8(b), matrix B is combined with matrix S: the zero entries in S are kept, and the remaining entries are replaced with the values in matrix B.
In this section, the Borda count [31] was used to obtain data outliers.The Borda count algorithm considers the ranking of each data point instead of simply focusing on the vote share, considers all candidates for each threshold, effectively avoids vote scattering for data values, and assigns a certain number of points to each candidate.
The value of θ is not fixed in this study. The outlier probabilities are calculated by generating gn graphs G, and the output outliers are sorted and labeled L = {L1, L2, …, Ln} each time. Figure 8(c) shows the outlier probability values of the data points when gn = 10.
Assuming that there are N data points in the dataset X and k% of outliers are excluded, the ranking and scores of the Borda count are shown in table 1.
The total outlier score of each data point is obtained by equation (12):

Score(x_a) = Σ_{g=1}^{gn} Borda_g(x_a),     (12)

where gn is the number of graphs G and Borda_g(x_a) is the Borda score of x_a in graph g according to table 1. The total outlier scores are then ranked; the vectors in the top k% of occurrence frequency, or with Score(s) greater than a threshold, are marked as outliers. These outliers can then be eliminated, yielding a valid optical flow estimate. The B-SNG pseudocode proposed in this study is presented in table 2.
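The Borda aggregation of equation (12) can be sketched as follows (function names and the exact point scheme are illustrative; table 1 in the paper defines the actual rank-to-score mapping). Each run's outlier probabilities are turned into a ranking, points are awarded by rank, and the per-run points are summed:

```python
def borda_scores(prob_runs, n):
    """Aggregate outlier-probability rankings from gn graphs into Borda scores:
    in each run, the most outlying point earns n-1 points, the next n-2, etc."""
    total = [0] * n
    for probs in prob_runs:
        ranking = sorted(range(n), key=lambda a: probs[a], reverse=True)
        for rank, a in enumerate(ranking):
            total[a] += (n - 1) - rank
    return total

def top_k_outliers(scores, k_percent=10):
    """Mark the top k% of Borda scores as outliers (at least one point)."""
    n = len(scores)
    k = max(1, round(n * k_percent / 100))
    return sorted(range(n), key=lambda a: scores[a], reverse=True)[:k]
```

Because only rankings are aggregated, no absolute probability threshold θ is needed, which is the point of replacing the expert threshold with Borda counting.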
In this section, we set the points in the top 10% of occurrence frequency as outliers. As shown in figure 8(c), the data points {X_1, X_2, X_3, X_5, X_6, X_7, X_8, X_9, X_10} are classified as inliers, whereas data point {X_4} is classified as an outlier.

Experiment and performance analysis
To evaluate the performance of the B-SNG algorithm, this section demonstrates that B-SNG is effective for outlier elimination in the SOF estimation domain by comparing it with a variety of popular outlier elimination algorithms. All experiments were implemented on NVIDIA Quadro P5000 GPUs on the PyTorch platform.

Evaluation methods
The experimental results of the proposed B-SNG are compared with those of several state-of-the-art algorithms to provide quantitative metrics demonstrating the superiority of the method in processing optical flow outliers, on the n_datas [34] and s_datas [34-36] datasets. (Data reproduced from [32], with permission from Springer Nature; from [33], CC BY 3.0; and with permission from [34].) All methods were implemented in Visual Studio Code and tested in the same Python 3.9.7 environment, with default values chosen for all parameters of each method.

Evaluation indicators
In the outlier elimination comparison experiments, accuracy, precision, recall, FPR, F1 score [37], and running time are used as evaluation indexes to assess the performance of the outlier elimination algorithms in multiple dimensions.
The accuracy, precision and recall rate emphasize the performance of the model in different aspects, including the overall accuracy, the prediction accuracy of the positive samples and the capture degree of the positive samples.The F1 score combines the precision rate and recall rate, provides a balance index, and considers the trade-off between these two performance aspects.The FPR reflects the model 's ability to prevent misclassification of negative samples into positive ones.Finally, the running time is used as an additional indicator to evaluate the efficiency of the algorithm, so as to understand its feasibility in practical applications.
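For reference, these indexes can be computed directly from the binary confusion counts, where the positive class marks an outlier. This is a standard sketch with an illustrative function name, not the evaluation code used in the experiments:

```python
def eval_metrics(y_true, y_pred):
    """Outlier-elimination quality indexes; the positive class (1)
    marks an outlier, the negative class (0) an inlier."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        # FPR: share of true inliers wrongly flagged as outliers.
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
        # F1 balances the precision/recall trade-off.
        "f1": 2 * precision * recall / (precision + recall)
              if precision + recall else 0.0,
    }
```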

Performance comparison of outlier elimination algorithms
The results of B-SNG are compared with those of popular elimination algorithms from recent years. Figures 10-16 show the comparison results of seven algorithms (INNE (2018), GMM (2019), COPOD (2022), LSCP (2019), LOCI (2022), LUNAR (2021), SOS (2012)) and B-SNG for optical flow estimation outliers on the normal motion scene, small amplitude motion scene and large amplitude motion scene datasets, respectively, where accuracy, precision, recall, FPR, F1 score and runtime (s) are reported.

Normal amplitude motion scene
Figures 10 and 11 show the comparison results of the seven algorithms for eliminating optical flow estimation outliers on the normal motion scene dataset n_datas. The first column in figure 10 shows the number of inlier points as a proportion of the entire dataset. The inlier ratios in the bamboo_1, bamboo_2, temple_2.1, and temple_2.2 datasets ranged from 13%-85%, 18%-92%, 40%-95%, and 56%-96%, respectively, with average inlier ratios of 42.5%, 64.2%, 72.4%, and 76.6%, respectively.
Accuracy. The second column of figure 10 reports the accuracy on the different datasets at different inlier ratios. In normal amplitude motion scenarios, the accuracy is not affected by the inlier ratio when the inlier ratio is less than 30%. LSCP is the most accurate when the inlier ratio is less than 30%: it improved by 9.832% compared with the third-best LOCI algorithm and by 18.578% compared with the worst COPOD.
Precision. The third column of figure 10 shows the precision for the different datasets at different inlier ratios. When the inlier ratio is less than 30%, there are too many negative samples, and similar negative samples may exist near the positive samples, resulting in a precision lower than the accuracy. The B-SNG algorithm had the best precision when the inlier ratio is less than 30%.
Recall. The fourth column of figure 10 shows the recall for the different datasets at different inlier ratios. The recall of all algorithms showed an overall increasing trend as the inlier ratio increased. The B-SNG algorithm outperformed the other five algorithms in terms of average recall and is inferior only to the LSCP algorithm.
FPR. The fifth column of figure 10 shows the FPR for the different datasets at different inlier ratios. The B-SNG algorithm outperformed the other six algorithms in terms of average FPR: it is 0.1544% lower than that of the second-best LOCI algorithm and 19.12% lower than that of the worst COPOD.
F1 scores. Figure 11 shows the F1 scores for the different datasets at different inlier ratios. The B-SNG algorithm outperformed the other five algorithms in terms of average F1 score and is inferior only to the LSCP algorithm, with a 2.69% decrease compared with LSCP. It improved by 2.604% compared with the third-best algorithm and by 20.508% compared with the worst LOCI.
The precision and FPR of this algorithm are better than those of the other algorithms in normal motion scenarios. The accuracy, recall, and F1 score are not as good as those of the LSCP algorithm, but the algorithm outperforms LSCP in terms of running time.
Small amplitude motion scene
Accuracy. The second column of figure 12 shows the accuracy for the different datasets at different inlier ratios. In small-amplitude motion scenes, the accuracy is not affected by the inlier ratio when the inlier ratio is less than approximately 25%. When the inlier ratio is greater than approximately 25%, the accuracy tends to increase with the inlier ratio. SOS has the best accuracy when the inlier ratio is less than 25%. The B-SNG algorithm outperformed the other six algorithms in terms of average accuracy. It improved by 0.013% compared with the second-best LSCP algorithm and by 13.242% compared with the worst LOCI.
Precision. The third column of figure 12 shows the precision for the different datasets at different inlier ratios. The B-SNG algorithm outperformed the remaining five algorithms in terms of average precision and is inferior only to the LSCP algorithm, with a 3.6836% reduction. It improved by 0.053% compared with the third-best LOCI algorithm and by 7.821% compared with the worst GMM.
Recall. The fourth column of figure 12 shows the recall for the different datasets at different inlier ratios. There is an overall increasing trend in recall for all algorithms as the inlier ratio increases. The B-SNG algorithm outperformed the remaining six algorithms in terms of average recall. It improved by 0.076% compared with the second-best LSCP algorithm and by 9.16% compared with the worst LOCI.

FPR. The fifth column of figure 12 shows the FPR for the different datasets at different inlier ratios. The B-SNG algorithm outperforms the remaining four algorithms in terms of average FPR and is inferior to the LSCP and LOCI algorithms, with increases of 44.879% and 20.885%, respectively. It decreased by 36.549% compared with the worst SOS.
F1 scores. Figure 13 shows the F1 scores for the different datasets at different inlier ratios. The B-SNG algorithm outperformed the other five algorithms in terms of average F1 score and is inferior only to the LSCP algorithm, with a 3.701% decrease compared with LSCP. It improved by 4.997% compared with the third-best INNE algorithm and by 9.248% compared with the worst LOCI.
Running times. The average running times for SOS, INNE, GMM, LSCP, LOCI, COPOD, and B-SNG on the small amplitude motion scene dataset are 0.92765, 0.12856, 0.10194, 2.47087, 3.80080, 0.22141, and 0.98124 s, respectively. Compared with the best GMM algorithm, the algorithm in this study increased by 86.2609%, and by 5.777% compared with the SOS algorithm. Compared with the worst LOCI algorithm, it decreased by 74.1834%.
The accuracy and recall of this study's algorithm outperformed those of the other algorithms in small-amplitude motion scenarios. The precision and F1 score are inferior to those of the LSCP algorithm, and the FPR is inferior to those of the LSCP and LOCI algorithms, but the algorithm is better than LSCP and LOCI in terms of running time.

Large amplitude motion scene
Figures 14 and 15 show the comparison results of the seven algorithms for eliminating optical flow estimation outliers on l_datas, a dataset of large amplitude motion scenes. The first column in figure 14 shows the inlier ratios for the entire dataset. The inlier ratios in the ambush_2.1 [34] and ambush_2.2 [34] datasets ranged from 10%-66% and 5%-89%, with average inlier ratios of 38.4% and 48.3%, respectively. The inlier ratios in the market_5.1 [34] and market_5.2 [34] datasets ranged from 17%-87% and 35%-93%, with average inlier ratios of 57.5% and 70.3%, respectively.
Accuracy. The second column of figure 14 shows the accuracy for the different datasets at different inlier ratios. In large motion scenarios, the accuracy is not affected by the inlier ratio when the inlier ratio is less than approximately 25%. When the inlier ratio is greater than 25%, the accuracy improves as the inlier ratio increases. The B-SNG algorithm remained consistently near the top across the different inlier ratios on the different datasets. LSCP has the best accuracy when the inlier ratio is less than 10%. The B-SNG algorithm outperformed the other six algorithms in terms of average accuracy. It improved by 0.265% compared with the second-best SOS algorithm and by 13.411% compared with the worst LOCI.
Precision. The third column of figure 14 shows the precision for the different datasets at different inlier ratios. When the inlier ratio is less than approximately 25%, there are too many negative samples and similar negative samples may exist near the positive samples, resulting in a precision lower than the accuracy. The SOS algorithm had the best precision when the inlier ratio is less than 25%. The B-SNG algorithm outperformed the other six algorithms in terms of average precision. It improved by 0.120% compared with the second-best SOS algorithm and by 16.272% compared with the worst LOCI.
Recall. The fourth column of figure 14 shows the recall for the different datasets at different inlier ratios. The recall of all algorithms tended to increase as the inlier ratio increased. The recall is generally consistent with the precision, and the SOS algorithm has the best recall when the inlier ratio is less than 25%.
FPR. The fifth column of figure 14 shows the FPR for the different datasets at different inlier ratios. The B-SNG algorithm outperformed the other five algorithms in terms of average FPR and is inferior only to the LOCI algorithm, with a 21.489% increase compared with LOCI. It decreased by 0.22% compared with the third-best GMM algorithm and by 15.91% compared with the worst COPOD.
F1 scores. Figure 15 shows the F1 scores for the different datasets at different inlier ratios. The B-SNG algorithm is inferior to the SOS algorithm in terms of F1 score, with a 2.69% decrease compared with SOS. It improved by 2.604% compared with the third-best LSCP algorithm and by 20.508% compared with the worst LOCI.
Running times. The average running times for SOS, INNE, GMM, LSCP, LOCI, COPOD and B-SNG on the large amplitude motion scene dataset are 0.9224, 0.13117, 0.10428, 2.38619, 3.58897, 0.14784, and 1.001 s, respectively. Compared with the best GMM algorithm, the algorithm in this study increased by 85.99%, and by 8.522% compared with the SOS algorithm. Compared with the worst LOCI algorithm, it decreased by 72.109%.
The accuracy, precision, and recall of this algorithm are better than those of the other algorithms in large-amplitude motion scenarios. The FPR is not as good as that of the LOCI algorithm and the F1 score is not as good as that of the SOS algorithm, but the algorithm is better than LOCI in terms of running time.

Comparison of the overall performance in different motion modes
Figure 16 shows the average outlier elimination results on the Odataset (s_datas, l_datas, and n_datas) for the seven algorithms. The B-SNG algorithm in this study outperformed the other six algorithms in terms of average accuracy, precision, and recall. The average FPR is inferior to that of the LSCP and LOCI algorithms, with increases of 8.903% and 14.268%, respectively. The average F1 score is lower than that of the LSCP algorithm, with a decrease of 1.1928%. Although the method in this study is inferior to the LSCP and LOCI algorithms in terms of the average FPR and F1 score, it outperforms them in terms of running time, with reductions of 59.3909% and 73.1489%, respectively.

Real scenario application
This section discusses the application of B-SNG to eliminating optical flow estimation outliers in real scenes and evaluates the performance of SOS, INNE, GMM, LSCP, LOCI, COPOD, and B-SNG on real video-stream datasets. Figure 17 shows a sequence of 10 consecutive frames captured by a moving camera on a provincial road, with an image size of 1050 × 480 pixels; the red dashed boxes mark the starting frames of the two datasets t_data1 and t_data2.
After acquiring the image sequences, the Lucas-Kanade SOF method is used to obtain the optical flow values (each set contains 100 optical flow values). The outliers in the optical flow dataset are calibrated by experts, as shown in figures 18 and 19. To show the performance of the comparison algorithms more clearly, this section reorders the proportions of true outliers obtained from the real image sequences from lowest to highest.
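For illustration, the least-squares core of a Lucas-Kanade step can be written as below. This is a minimal single-level sketch (practical trackers, such as OpenCV's pyramidal calcOpticalFlowPyrLK, iterate over image pyramids); the function name and the (column, row) point convention are assumptions for this example:

```python
import numpy as np

def lk_flow(im0, im1, pts, win=9):
    """Single-level Lucas-Kanade: solve the windowed least-squares
    brightness-constancy system Ix*u + Iy*v = -It around each point."""
    im0 = im0.astype(float)
    im1 = im1.astype(float)
    Iy, Ix = np.gradient(im0)        # spatial gradients of the first frame
    It = im1 - im0                   # temporal difference
    r = win // 2
    flows = []
    for x, y in pts:                 # pts are (column, row) pixel coordinates
        sl = (slice(y - r, y + r + 1), slice(x - r, x + r + 1))
        A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)
        v, *_ = np.linalg.lstsq(A, -It[sl].ravel(), rcond=None)
        flows.append(v)              # (u, v) displacement in pixels
    return np.array(flows)
```

For a small pure translation of a smooth patch, the recovered (u, v) approximates the true shift; the SOF values produced this way are the population that the outlier elimination then operates on.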
Figure 18 shows the comparison results of the seven algorithms in eliminating optical flow estimation outliers on the real scene datasets. The first column in figure 18 shows the outlier ratios for the entire dataset. The outlier ratios ranged from 2%-22% and 8.6%-35% for the t_data1 and t_data2 datasets, with average outlier ratios of 10.11% and 19.40%, respectively. The bar chart in figure 19 shows the F1 scores and average metrics of the algorithms on the t_data1 and t_data2 datasets. Figure 20 shows the outlier elimination effect plots for the four optical flow datasets.
Accuracy. The second column of figure 18 shows the accuracy for the different datasets at different outlier ratios. The histogram in figure 21 shows that the B-SNG algorithm outperformed the remaining six algorithms in terms of average accuracy. The improvement is 1.013% compared with the second-best LSCP algorithm and 14.2119% compared with the worst SOS.
Precision. The third column of figure 18 shows the precision for the different datasets at different outlier ratios. The B-SNG algorithm outperformed the other six algorithms in terms of average precision. The improvement is 0.4953% compared with the second-best LSCP algorithm and 7.953% compared with the worst SOS.
Recall. The fourth column of figure 18 shows the recall for the different datasets at different outlier ratios. The histogram in figure 20 shows that the B-SNG algorithm outperformed the remaining six algorithms in terms of average recall. The improvement is 1.675% compared with the second-best LSCP algorithm and 8.644% compared with the worst SOS.
FPR. The fifth column of figure 18 shows the FPR for the different datasets at different outlier ratios. The B-SNG algorithm outperforms the remaining three algorithms in terms of average FPR and is inferior to the INNE, GMM, and LSCP algorithms: an increase of 32.185% compared with the INNE algorithm, 3.261% compared with the third-best GMM algorithm, and 66.18% compared with LSCP. It decreased by 32.6565% compared with the worst SOS.
F1 scores. Figure 19 shows the F1 scores for the different datasets at different outlier ratios. The B-SNG algorithm outperformed the remaining six algorithms in terms of average F1 score. The improvement is 0.8% compared with the second-best LSCP algorithm and 8.024% compared with the worst SOS.
Running times. The average running times of SOS, INNE, GMM, LSCP, LOCI, COPOD, and B-SNG on the real datasets t_data1 and t_data2 are 1.15454, 0.1371, 0.1267, 2.5656, 3.9473, 0.1551, and 1.0967 s, respectively. Compared with the best GMM algorithm, this study's algorithm increased by 76.5589%, and by 5.00985% compared with the SOS algorithm. Compared with the worst LOCI algorithm, it decreased by 72.2165%.
The accuracy, precision, recall, and F1 score of this study's algorithm outperformed those of the other algorithms in real motion scenarios. The FPR is inferior to those of the INNE, GMM, and LSCP algorithms, but the algorithm outperforms LSCP in terms of running time.

Conclusion
In this study, we propose a method for outlier elimination of SOF values based on the B-SNG, which can effectively improve the accuracy and precision of SOF values. First, 3D video stream acquisition is performed using CCD and VS, and the 3D video stream information is converted into 2D image sequence information using a perspective transformation technique. Then, the Lucas-Kanade SOF method is used to extract the optical flow values of feature points between adjacent images in the sequence. Next, the SOF values are used as the input dataset, the phase dissimilarity matrix between the data values is calculated, and the smoothing matrix is obtained by smoothing the reduced-dimensional phase dissimilarity matrix with a Gaussian filter. The smoothing matrix is normalized to obtain the binding matrix, and the outlier probability of the data values is calculated using the binding matrix and the stochastic neighborhood graph. Finally, the outlier probabilities are ranked, weights are assigned to each data value, the Borda scores of the data values are calculated, and the outliers with high scores are obtained. Experiments are carried out on the Odataset (n_datas, s_datas and l_datas) and the real scenario datasets (t_data1 and t_data2), and the experimental results show that the method in this study achieves the highest average accuracy, precision and recall when compared with state-of-the-art methods. The running time increases compared with tree- and density-based algorithms, mainly because the method in this study requires multiple clustering and iterative computations. However, compared with clustering-based outlier elimination methods, the operational efficiency of the algorithm in this study is improved. Considering all the indicators, the method in this study has strong robustness and reliability for optical flow outlier elimination in real scenes. In future research, more attention should be paid to the efficiency and practicality of the algorithm, and the running time should be reduced as much as possible to meet the needs of a wider range of applications.
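The core computation summarized above (dissimilarity matrix, Gaussian smoothing, row-normalized binding matrix, sampled stochastic neighborhood graphs, outlier probabilities) can be sketched as follows. This is one plausible reading under simplifying assumptions, with hypothetical names and a "a point is outlying in a graph when no other node binds to it" rule consistent with p(x_a ∈ C_O) = ∏_{b≠a}(1 − b_ba); it is not the authors' implementation:

```python
import numpy as np

def binding_matrix(X, sigma=1.0):
    """Dissimilarity -> Gaussian smoothing -> row-stochastic binding matrix."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared dissimilarities
    S = np.exp(-D2 / (2.0 * sigma ** 2))                  # Gaussian smoothing
    np.fill_diagonal(S, 0.0)                              # a node never binds itself
    return S / S.sum(axis=1, keepdims=True)               # each row sums to 1

def outlier_prob(B, gn=100, seed=None):
    """Estimate outlier probabilities over gn sampled stochastic
    neighborhood graphs: a point counts as outlying in a graph when
    no other node binds to it."""
    rng = np.random.default_rng(seed)
    n = B.shape[0]
    unbound = np.zeros(n)
    for _ in range(gn):
        # Each node samples exactly one neighbor from its binding row.
        chosen = np.array([rng.choice(n, p=B[i]) for i in range(n)])
        bound = np.zeros(n, dtype=bool)
        bound[chosen] = True
        unbound += ~bound
    return unbound / gn
```

A point far from all others receives near-zero binding probability from its neighbors and therefore an outlier probability close to 1; the resulting probabilities can then be fed into the Borda ranking step.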

Figure 1. Flow chart of the method in this article.

Figure 2. Schematic diagram of perspective transformation of 3D motion in a 2D plane.

Figure 3. Schematic diagram of the phase dissimilarity matrix: (a) example data points; (b) dissimilarity matrix D; (c) degree of deviation.

Figure 4. Heat maps of the three matrices: (a) phase dissimilarity matrix D; (b) smoothing matrix S; (c) binding matrix B.

Figure 5. Illustration of binding probability: (a) binding probability of V10 to the rest of the nodes; (b) binding probability of the rest of the nodes to V10.

Figure 6. Stochastic neighborhood graph (one node can be bound by multiple nodes). Data point x_b chooses data point x_a as its neighbor, and the probability that x_a belongs to the outlier class is p(x_a ∈ C_O) = ∏_{b≠a} (1 − b_ba).

Figure 7. Stochastic neighborhood graph (a node can only be bound to one node).

Figure 8. Schematic diagram of outlier probabilities: (a) a stochastically generated graph G; (b) the binding matrix corresponding to a graph G; (c) the outlier probabilities of the data points for 10 graphs G, where each row represents the data point outlier probabilities of one graph G.

Figure 16. Average performance comparison of the 7 algorithms in different motion modes.

Figure 17. Real video stream datasets for evaluation.

Figure 18. Performance comparison of the seven algorithms in real scenarios. From top to bottom: the t_data1 and t_data2 datasets, respectively.

Figure 19. F1 score and the average value of each index.

Figure 20. Outlier elimination effect of B-SNG in different motion modes.

Table 1. Correspondence table between Borda count sequence and score of data points.