Intelligent Selection Method for Gravity-suitable Area Based on Improved Support Vector Machine by the Genetic Algorithm

The gravity-aided inertial navigation system (GAINS) can achieve precise positioning of underwater vehicles in a gravity-suitable area. However, existing intelligent suitable area selection methods generally have shortcomings in the selection of gravity feature parameters and learning parameters. In this paper, an intelligent suitable area selection method based on a support vector machine improved by the genetic algorithm (GA-SVM) is proposed to address these problems. Firstly, the genetic algorithm (GA) is utilized to automatically select the optimal feature subset from 15 existing gravity feature parameters and to obtain the optimal support vector machine (SVM) learning parameters while eliminating irrelevant and redundant features, thus improving the classifier’s performance and generalization ability. Then, the SVM classifier is trained according to the optimal information output by GA, and the accuracy on the test set is 95.5%. Finally, the classifier is utilized to distinguish suitable and unsuitable areas in the application region to evaluate the proposed method’s performance. The TERCOM experiment in the suitable area resulted in an average positioning error of less than 170 m.


Introduction
Underwater vehicles, as an important tool for exploring the ocean, have the problem of accumulating positioning errors over time [1]. Currently, combining the inertial navigation system (INS) with gravity anomaly information of the Earth is the prevailing method for addressing the limitations of underwater positioning [2]. However, the matching accuracy of GAINS is significantly affected by the suitability of the matching area. Research shows that, under the same experimental conditions, areas with abundant gravity feature information (suitable areas) yield higher-precision matching results [3]. Therefore, selecting an area with excellent suitability is crucial for improving the matching positioning accuracy of GAINS.
To date, a vast body of research has been devoted to selection methods for suitable areas. The current mainstream approach is to select the suitable area from the perspective of multi-feature fusion. Cai et al. [4], Ma et al. [5], and Li et al. [6] used various multi-feature fusion methods to select suitable areas. However, there is still room for improvement in the intelligence of these methods. Therefore, in recent years, machine learning with excellent classification ability has been introduced into suitability analysis to realize the intelligent selection of the suitable area. The authors of [7] used the BP neural network algorithm to divide terrain into suitable and unsuitable areas. In 2021, Wang et al. [8] proposed an underwater gravity-aided inertial navigation matching area optimization method based on SVM. Although machine-learning-based methods achieve better suitable area selection results, the optimal classification model cannot be obtained because the optimal subset of the gravity anomaly feature set and the optimal learning parameters are not available.
To address the problems above, this paper proposes an intelligent gravity-suitable area selection method based on GA-SVM. GA is used to derive, from the set of 15 existing gravity feature parameters, the optimal feature subset that is highly correlated with the suitable area, together with the optimal learning parameters of SVM. Then, the SVM classification model is trained and its effectiveness is verified.

Method
The method is shown in Figure 1. First, the gravity anomaly feature set is prepared. Then, it is used as the input of GA-SVM to obtain the best suitable area classification model. Finally, the trained optimal classification model is applied to divide the application area into suitable and unsuitable areas. The gravity anomaly feature set and GA-SVM are the main focus and are elaborated on in the following two sections.
Figure 1. The selection method based on GA-SVM.

The gravity anomaly feature set
Various gravity anomaly features provide diverse perspectives on the suitability of the area. To investigate the relationship between gravity anomaly features and suitable areas, 15 existing gravity features are used to construct a gravity anomaly feature set. The feature set includes standard deviation (F1), roughness (F2), slope (F3), slope standard deviation (F4), correlation coefficient (F5), kurtosis (F6), skewness (F7), range (F8), and mean (F9), as well as the following features from the field of image processing: gray histogram complexity (F10), the sum of gravity anomaly gradient values (F11), energy (F12), contrast (F13), inverse differential moment (F14), and correlation (F15) [9,10]. In addition, the sample data is standardized by z-score standardization to eliminate the differences caused by the different value ranges or dimensions of the feature parameters.
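The z-score standardization mentioned above can be illustrated with a minimal sketch. It assumes that the population standard deviation is used and that a constant feature column is mapped to zero; neither choice is stated in the text. The function name and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(samples):
    """Standardize each feature column to zero mean and unit standard deviation.

    `samples` is a list of rows; each row holds the feature values of one area.
    Assumption: the population standard deviation (pstdev) is used.
    """
    columns = list(zip(*samples))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    standardized = []
    for row in samples:
        standardized.append([
            # A constant column carries no information; map it to 0.0.
            (value - mu) / sigma if sigma > 0 else 0.0
            for value, mu, sigma in zip(row, means, stds)
        ])
    return standardized


# Hypothetical raw values: three sample areas, two features on different scales.
raw = [[2.0, 100.0], [4.0, 300.0], [6.0, 500.0]]
scaled = zscore_columns(raw)
```

After standardization, both columns have the same scale, so a feature with a large numeric range no longer dominates the distance computations inside the SVM kernel.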

GA-SVM
GA-SVM is composed of GA and SVM. The GA part outputs the optimal subset of the gravity anomaly feature set and the optimal learning parameters of SVM. The SVM part uses the output of GA to obtain the best SVM classifier. Because the chromosome encoding and decoding and the fitness function in GA must be designed specifically for the problem at hand, these two aspects are studied here for gravity-suitable area selection.

Chromosome encoding and decoding
First, binary encoding is chosen from the various encoding methods. Second, to obtain the optimal subset of the gravity anomaly feature set and the SVM learning parameters, the chromosome needs to contain three kinds of information: the subset of the gravity anomaly feature set, the penalty parameter C, and the kernel function parameter γ. The specific chromosome encoding and decoding are shown in Figure 2.
During encoding, the chromosome is constructed as depicted in Figure 2. The values of k, j, and l are 15, 7, and 8, respectively: k follows from the 15 features of the gravity anomaly feature set, and j and l are set according to the requirements of the SVM learning parameters.
Figure 2. Chromosome encoding and decoding.
During decoding, if $b_i^F = 0$, $F_i$ is cleared so that this feature is removed from the input parameters of the SVM. If $b_i^F = 1$, $F_i$ is retained as one of the input parameters of the SVM, as shown in Formula (1).
The decoding formula of the penalty parameter C is shown in Formula (2).
$$C = C_{\min} + \frac{C_{\max} - C_{\min}}{2^{j} - 1}\, C_d \qquad (2)$$
where $C_d$ is the decimal value converted from $b_1^C \ldots b_j^C$; $C_{\max}$ and $C_{\min}$ are the maximum and minimum values of C, respectively.
The decoding formula of the kernel function parameter γ is shown in Formula (3).
$$\gamma = \gamma_{\min} + \frac{\gamma_{\max} - \gamma_{\min}}{2^{l} - 1}\, \gamma_d \qquad (3)$$
where $\gamma_d$ is the decimal value converted from $b_1^{\gamma} \ldots b_l^{\gamma}$; $\gamma_{\max}$ and $\gamma_{\min}$ are the maximum and minimum values of γ, respectively.
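The decoding step can be sketched as follows, assuming a chromosome laid out as k = 15 feature bits, then j = 7 bits for C, then l = 8 bits for γ, with the most significant bit first. The bit order, the function names, and the parameter bounds in the example are assumptions made for illustration; the text does not state them.

```python
K_FEATURES, J_BITS_C, L_BITS_GAMMA = 15, 7, 8


def bits_to_int(bits):
    """Convert a list of 0/1 values to a decimal integer (MSB first, assumed)."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


def decode_chromosome(chromosome, c_min, c_max, gamma_min, gamma_max):
    """Return (selected feature numbers, C, gamma) for one binary chromosome."""
    expected = K_FEATURES + J_BITS_C + L_BITS_GAMMA
    if len(chromosome) != expected:
        raise ValueError(f"expected {expected} bits, got {len(chromosome)}")
    feature_bits = chromosome[:K_FEATURES]
    c_bits = chromosome[K_FEATURES:K_FEATURES + J_BITS_C]
    gamma_bits = chromosome[K_FEATURES + J_BITS_C:]
    # A feature F_i is kept only when its bit is 1 (features numbered from 1).
    selected = [i + 1 for i, bit in enumerate(feature_bits) if bit == 1]
    # Linear mapping of the decimal value onto [min, max], as in (2) and (3).
    c = c_min + (c_max - c_min) * bits_to_int(c_bits) / (2 ** J_BITS_C - 1)
    gamma = gamma_min + (gamma_max - gamma_min) * bits_to_int(gamma_bits) / (
        2 ** L_BITS_GAMMA - 1)
    return selected, c, gamma


# Example: keep F3, F4, F9, F10, F11; all C bits set; all gamma bits cleared.
feature_part = [0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
chromosome = feature_part + [1] * 7 + [0] * 8
# Hypothetical bounds: C in [0.1, 100], gamma in [0.01, 10].
selected, c, gamma = decode_chromosome(chromosome, 0.1, 100.0, 0.01, 10.0)
```

With all C bits set, the decoded C equals the upper bound, and with all γ bits cleared, the decoded γ equals the lower bound, which is a quick check that the mapping covers the whole range.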

Fitness function
In the iterative process, the next genetic operation is determined by the fitness of each individual. To obtain the optimal SVM classification model, the fitness of an individual in GA-SVM is computed in the following steps.
First, each individual in the population is decoded to obtain a subset of the gravity anomaly feature set and the values of C and γ. These are combined with the training set to train the SVM classifier.
Then, the test set samples are input into the SVM classifier, the classification results y(x i ) are compared with the pre-calibration results g(x i ), and the number of correctly classified test set samples is recorded.

$$N_c = \sum_{i=1}^{N} P\big(y(x_i), g(x_i)\big) \qquad (4)$$
where $N_c$ is the number of test samples correctly classified by the SVM classification model obtained from the c-th individual in the population; $N$ is the number of test set samples; $x_i$ represents the sample vector to be tested; and $P(y(x_i), g(x_i))$ is given by Formula (5), which outputs 1 if $y(x_i) = g(x_i)$ and 0 if $y(x_i) \neq g(x_i)$.
$$P\big(y(x_i), g(x_i)\big) = \begin{cases} 1, & y(x_i) = g(x_i) \\ 0, & y(x_i) \neq g(x_i) \end{cases} \qquad (5)$$
Finally, the fitness of each individual is calculated by Formula (6), where $fitness_c$ represents the fitness of the c-th individual in the population and $m$ represents the population size.

ICAITA-2023
Journal of Physics: Conference Series 2637 (2023) 012005 IOP Publishing doi:10.1088/1742-6596/2637/1/012005
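The counting step of the fitness evaluation can be sketched as below. The sketch covers only the indicator P and the count of correctly classified test samples; the normalization by N in `accuracy` is an assumption for illustration and is not claimed to be Formula (6). All function names and label values are hypothetical.

```python
def indicator(predicted, calibrated):
    """P(y(x_i), g(x_i)): 1 when the prediction matches the pre-calibrated label."""
    return 1 if predicted == calibrated else 0


def count_correct(predictions, labels):
    """N_c: number of test samples whose predicted label equals the calibrated one."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(indicator(y, g) for y, g in zip(predictions, labels))


def accuracy(predictions, labels):
    """Share of correctly classified test samples (assumed normalization N_c / N)."""
    return count_correct(predictions, labels) / len(labels)


# Hypothetical labels: 1 = suitable area, 0 = unsuitable area.
predicted = [1, 0, 1, 1]
calibrated = [1, 1, 1, 0]
n_correct = count_correct(predicted, calibrated)
```

In a full implementation, `predicted` would come from the SVM classifier trained with the decoded feature subset, C, and γ of one individual.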

The schema of GA-SVM
The overall flowchart of GA-SVM is shown in Figure 3.
Step 1. The sample data is standardized and then split into a training set and a test set. Next, the initial population of GA is generated.
Step 2. According to the fitness, the genetic operations iteratively update the population until the iteration ends, and the optimal individual is output.
Step 3. The optimal individual is decoded to obtain the optimal feature subset, penalty parameter C, and kernel function parameter γ, which are used to train the SVM so that the optimal SVM classification model is acquired.
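The iteration in Steps 1 to 3 can be sketched as a generic binary GA loop. The text does not specify the genetic operators, so tournament selection, single-point crossover, bit-flip mutation, and elitism are assumptions, as are all rates and sizes. The fitness is passed in as a function; the toy fitness in the example (the number of 1 bits) is only a stand-in for the SVM-based fitness.

```python
import random


def run_ga(fitness, n_bits, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Evolve binary chromosomes; return (best individual, best fitness history)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        best_index = max(range(pop_size), key=scores.__getitem__)
        history.append(scores[best_index])

        def tournament():
            # Binary tournament: the fitter of two random individuals wins.
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[a] if scores[a] >= scores[b] else population[b]

        # Elitism: the best individual survives unchanged.
        next_population = [population[best_index][:]]
        while len(next_population) < pop_size:
            p1, p2 = tournament()[:], tournament()[:]
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_bits)  # single-point crossover
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                for i in range(n_bits):
                    if rng.random() < mutation_rate:
                        child[i] = 1 - child[i]  # bit-flip mutation
                if len(next_population) < pop_size:
                    next_population.append(child)
        population = next_population
    scores = [fitness(ind) for ind in population]
    best_index = max(range(pop_size), key=scores.__getitem__)
    history.append(scores[best_index])
    return population[best_index], history


# Toy stand-in fitness on a 30-bit chromosome (15 + 7 + 8 bits).
best, history = run_ga(sum, n_bits=30)
```

Because the best individual is always carried over, the recorded best fitness never decreases from one generation to the next. In GA-SVM, the returned individual would then be decoded and used to train the final classifier, as in Step 3.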

Verifications
The experimental area ranges from 113°E to 115°E in longitude and from 10°N to 12°N in latitude, with a resolution of 1'×1'. Through Matlab interpolation, the resolution of the experimental area is converted to 100 m × 100 m. The distribution of the gravity anomaly is shown in Figure 4. To obtain sufficient data, the Monte Carlo method is used to randomly select 5000 sample areas of 30 km × 30 km in the experimental area, as shown in Figure 5, and TERCOM is used to pre-calibrate each sample. If the average positioning error of a trajectory is less than or equal to 200 m, it is regarded as a successfully matched trajectory; otherwise, it is regarded as a failed one. The matching rate is the ratio of the number of successful trajectories to the total number of trajectories. An area with a matching rate greater than or equal to 0.8 is regarded as a suitable area; otherwise, it is regarded as an unsuitable area. Moreover, the sample data is balanced by under-sampling to avoid over-fitting.
The GA-SVM-based selection method is compared with the SVM-based selection method on the test set, as shown in Table 1. The classification accuracy of the SVM-based method is 93.5%, which indicates that a machine-learning-based gravity-suitable area selection method can effectively distinguish suitable areas from unsuitable areas. The classification accuracy of the GA-SVM-based method is 95.5%, which is 2.0 percentage points higher than that of the SVM-based method (the reported relative improvement is 2.13%). Furthermore, the GA-SVM-based method needs only 5 gravity anomaly features (F3, F4, F9, F10, F11) as input parameters of the SVM, whereas the SVM-based method uses all 15 gravity anomaly features. Reducing the number of input features from 15 to 5 lowers the computational load and complexity and improves the training efficiency.
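The pre-calibration rule above can be written as a short sketch. The thresholds of 200 m and 0.8 come from the text; the function names and the error values in the example are hypothetical.

```python
def matching_rate(avg_errors_m, error_threshold_m=200.0):
    """Share of trajectories whose average positioning error is within the threshold."""
    if not avg_errors_m:
        raise ValueError("at least one trajectory is required")
    successes = sum(1 for error in avg_errors_m if error <= error_threshold_m)
    return successes / len(avg_errors_m)


def label_area(avg_errors_m, rate_threshold=0.8):
    """Return 1 (suitable) if the matching rate reaches the threshold, else 0."""
    return 1 if matching_rate(avg_errors_m) >= rate_threshold else 0


# Hypothetical average positioning errors (in metres) of five trajectories.
errors = [120.0, 150.0, 200.0, 90.0, 260.0]
rate = matching_rate(errors)   # four of five trajectories succeed
label = label_area(errors)
```

Both comparisons are inclusive, so a trajectory with an error of exactly 200 m counts as a success and an area with a matching rate of exactly 0.8 counts as suitable.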
The performance of the trained classification model is verified by applying it to classify suitable and unsuitable areas in the application area. The application area ranges from 111°E to 113°E in longitude and from 9°N to 11°N in latitude, with a resolution of 1'×1'. Through Matlab interpolation, the resolution of the application area is converted to 100 m × 100 m. The distribution of the gravity anomaly in the application area is shown in Figure 6. The sliding window method, with a step size of 15 km and a window of 30 km × 30 km, is used to divide the area into suitable and unsuitable areas, and the outcome is displayed in Figure 7. From Figures 6 and 7, it can be seen that the gravity anomaly changes notably on the lower and left sides of this area, and the suitable areas are chiefly located there. To verify the capability of the GA-SVM-based method, 100 TERCOM matching simulation positioning experiments were carried out in 4 randomly selected suitable areas. The statistics of the average positioning error are shown in Table 2. The average positioning error of the selected samples is less than 170 m, which confirms the capability of the selection method based on GA-SVM.
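The sliding window division can be sketched as an enumeration of window origins on the 100 m grid, where a 30 km window spans 300 cells and a 15 km step spans 150 cells. The grid size in the example is hypothetical and smaller than the real application area, and windows that would extend past the grid edge are skipped, which is an assumption.

```python
CELL_SIZE_M = 100
WINDOW_CELLS = 30_000 // CELL_SIZE_M   # 30 km window -> 300 cells
STEP_CELLS = 15_000 // CELL_SIZE_M     # 15 km step   -> 150 cells


def window_origins(n_rows, n_cols, window=WINDOW_CELLS, step=STEP_CELLS):
    """Return the top-left (row, col) index of every window that fits in the grid."""
    origins = []
    for row in range(0, n_rows - window + 1, step):
        for col in range(0, n_cols - window + 1, step):
            origins.append((row, col))
    return origins


# Hypothetical grid of 900 x 1200 cells (90 km x 120 km).
origins = window_origins(900, 1200)
```

Each origin identifies one 30 km × 30 km candidate area; its feature vector would be computed from the grid cells inside the window and passed to the trained classifier. Since the step is half the window size, adjacent windows overlap by 50%.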

Conclusions
In this paper, an intelligent gravity-suitable area selection method based on GA-SVM is proposed. The classification accuracy on the test set is 95.5%, which indicates that GA-SVM can effectively obtain the optimal classification model. The results of the TERCOM matching positioning simulation experiments in the suitable areas selected by the optimal classification model support the validity of the gravity-suitable area selection method based on GA-SVM.

Figure 4. Distribution of gravity anomaly in the experimental area.

Figure 5. Random sampling in the experimental area.

Figure 6. Distribution of gravity anomaly in the application area.

Figure 7. The suitable areas divided by the classification model in the application area.

Table 1. Statistics of classification results.

Table 2. Statistics of average positioning error.