Research and application of bolt defects detection technology based on machine learning

With the development of computer technology, machine learning plays an important role in many fields. Collecting image data with drones and analyzing it with machine learning has become a common approach to intelligent inspection. Deep learning, the dominant machine learning method for image analysis, requires many high-quality training samples and a high-performance embedded system. In engineering quality and safety inspection, where training samples are scarce, its detection performance is unsatisfactory. To solve this problem, computer vision and machine learning techniques are applied to the analysis of bolt images: historical image samples are analyzed and mined, and newly collected images are recognized by matching them against the historical samples. Taking the bolts on a steel structure bridge as an example, the method is used to recognize bolt appearance images collected by an unmanned aerial vehicle (UAV). The results show that the method identifies the appearance state of bolts effectively, with fast computation and high recognition accuracy.


Introduction
Bolted connections in steel structures have clear advantages in fabrication and installation. They are widely used in steel frame structures at home and abroad and are one of the main connection methods for large steel structures such as bridges [1][2]. However, as a bridge operates, its bolted connections are subjected to vibration, impact and fatigue over long periods, which causes bolts to loosen or even fall off and affects the safety of the entire structure [3][4]. Real-time detection of bolt status therefore has important engineering significance. Current bolt appearance inspection relies mainly on manual operation of hand-held digital cameras or hand-held crack observation instruments. As technology has developed, inspection with a camera mounted on an auxiliary mechanical extension arm, which collects images for human-computer interactive analysis, has gradually been adopted [5], but its depth, breadth and level of automation are generally low. With the development of steel structure bridge construction technology in China, the scale of structures has increased year by year, and so has the number of bolts they use. For example, the Hong Kong-Zhuhai-Macao Bridge has a total length of 55 kilometers and uses more than 800,000 sets of high-strength bolts [6], distributed throughout the bridge structure. Traditional manual inspection methods can no longer handle appearance inspection of so many bolts, and the inspection process places high demands on safety and timeliness. Bolt appearance inspection therefore urgently needs intelligent inspection technology that is deeply integrated with advanced techniques such as precise target positioning, image correction and high-precision recognition, so as to raise the automation level of inspection.
In recent years, with the development of drone technology, drones equipped with cameras have been used in surveying and mapping [7], building safety inspection [8][9], resource investigation [10][11] and traffic planning [12]. This technology can also be applied to the appearance inspection of bolts on steel structure bridges. By analyzing, identifying and judging the bolt image data collected by drones, the state of the inspected bolts can be determined and countermeasures can be taken, as shown in Fig.1. The analysis, judgment and recognition of the collected image data is therefore a key technology in UAV-based bolt detection. Deep learning is the current mainstream technology for image recognition. It extracts and learns image features from a large number of image samples and uses them to judge the category and status of the detected target [13][14]. However, this approach first requires a large number of high-quality image samples for training. Second, the resulting deep learning model is large, with millions of trainable parameters, so it occupies a large amount of system resources and places high demands on the performance of the embedded system.
In this context, this article applies machine learning algorithms to bolt defect detection. The method needs few training samples, recognizes quickly and uses few system resources. It can quickly and effectively identify and judge the image data transmitted by the drone, thereby determining the appearance state of the bolts and assisting staff in deciding on countermeasures. The test results show that the method has high recognition accuracy and can meet the needs of daily bolt safety inspection.

Method and Process
First, bolt samples are collected by the drone. The collected image samples are automatically cleaned, screened, corrected and cropped, and then classified and labeled to obtain the learning sample library. A machine learning algorithm then performs dimensionality reduction and feature extraction on the sample library to obtain the low-dimensional feature subspace of the bolt learning sample library. During identification, the drone collects new image samples, which pass through the same steps and are projected into the low-dimensional feature subspace. The point in that subspace closest to the sample to be identified is then found, that is, the learning sample most similar to the sample to be identified, and the state of the new bolt sample is determined from the class and state of that historical sample. The main steps are as follows:
(1) Learning sample set construction. This includes bolt sample data collection, cleaning and screening, foreground/background segmentation of the cleaned image samples, image cropping, and labeling of bolts that are offset, rusted, fallen off or missing, to construct a learning sample set Q.
(2) Machine learning algorithm analysis. The principal component analysis (PCA) algorithm is used to reduce the dimensionality of the learning sample data set. While preserving the features the samples have in the high-dimensional space, it projects them to a low-dimensional space and performs feature extraction, constructing the low-dimensional feature subspace Φ in which the characteristics of the learning samples are extracted.
(3) Identification. When a new bolt sample to be identified appears, it is also projected to the low-dimensional feature subspace Φ, and the closest point, which is the learning sample most similar to the new sample, is found in the feature subspace. The type and state of the bolt sample to be identified are determined from the state of this most similar point in the learning sample library, which completes the identification process.

Learning sample set construction
This article takes a steel structure bridge as a case and uses drones to photograph bolts on various parts of the bridge. Because of the drone's cruise route, the shooting angle and the lighting during collection, the collected image samples cannot be analyzed directly and must first be processed. This article mainly uses computer vision algorithms for image pre-processing, including filtering and denoising of the sample images, erosion, dilation, opening and closing operations, foreground/background segmentation, and target edge extraction [15]. Finally, the region of interest (ROI) where the bolt is located is determined and marked with a red frame. Image denoising uses the Gaussian filtering algorithm [16] and edge extraction uses the Sobel algorithm [17]. These algorithms are not described in detail in this article.
After an image sample has been processed by the above algorithms, the bolt shape is automatically located, recognized and marked in the image. With the identified bolt as the center, the image sample is then automatically cropped into single-bolt samples with a resolution of 100×100, and the state of each is labeled one by one as a positive training sample. Because of the limitations of the collection objects, samples of bolts that have fallen off, are missing or are rusted are severely lacking. To ensure the diversity of the sample library, this paper collects pictures of fallen-off, missing and rusted bolts from multiple channels. After the same data processing they are cut into single-bolt samples with a resolution of 100×100 and labeled one by one as negative training samples, to expand and improve the professional sample library.
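The Gaussian filtering and Sobel edge extraction steps can be illustrated with a minimal NumPy sketch. This is only an illustration under stated assumptions, not the implementation used in this article: the 5×5 kernel size, `sigma = 1.0`, the edge-replicating padding and the function names (`gaussian_kernel`, `filter2d`, `sobel_edges`) are all chosen here for the example, and the input is a synthetic step image rather than a bolt photograph.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Return a normalized size x size Gaussian kernel (assumed parameters)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def filter2d(image, kernel):
    """Correlate a 2-D image with a kernel, replicating border pixels."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros(image.shape, dtype=float)
    rows, cols = image.shape
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_edges(gray, sigma=1.0):
    """Smooth a grayscale image, then return the Sobel gradient magnitude."""
    smoothed = filter2d(gray, gaussian_kernel(5, sigma))
    gx = filter2d(smoothed, SOBEL_X)
    gy = filter2d(smoothed, SOBEL_Y)
    return np.hypot(gx, gy)

# Synthetic test image: dark left half, bright right half (a vertical edge).
step_image = np.zeros((20, 20))
step_image[:, 10:] = 255.0
edge_map = sobel_edges(step_image)
```

On the synthetic image the gradient magnitude is zero in the flat regions and peaks at the boundary between the two halves. Thresholding such a map is one possible way to obtain the target edges from which an ROI is derived.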
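The cropping step can be sketched as follows. This is a hedged illustration: the function name `crop_centered`, the edge-replicating padding used when a bolt lies near the image border, and the example state label are assumptions made for the example, and the bolt center is taken as already known from the ROI detection.

```python
import numpy as np

def crop_centered(image, center, size=100):
    """Crop a size x size patch centered on (row, col).

    The image is padded by replicating border pixels (an assumption of this
    sketch) so that bolts near the border still yield a full-size patch.
    """
    half = size // 2
    pad = ((half, half), (half, half)) + ((0, 0),) * (image.ndim - 2)
    padded = np.pad(image, pad, mode="edge")
    r, c = int(center[0]), int(center[1])
    # After padding, original pixel (r, c) sits at (r + half, c + half),
    # so the patch starting at (r, c) has that pixel at index (half, half).
    return padded[r:r + size, c:c + size]

# Synthetic 300 x 400 grayscale image with a hypothetical bolt center.
frame = np.arange(300 * 400, dtype=np.int64).reshape(300, 400)
patch = crop_centered(frame, center=(150, 200))

# Each patch is stored with its labeled state (label name is illustrative).
sample = {"image": patch, "state": "normal"}
```

Keeping every sample at the same 100×100 size matters because the later dimensionality reduction operates on matrices of identical shape.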

Dimensionality reduction and feature extraction
The learning sample library is stored in the high-dimensional image space. Recognition performed directly in that space not only involves a large amount of calculation and consumes many resources, but also has low accuracy [18]. Therefore, before recognition, the dimensionality of the learning sample library must be reduced: the original image vectors in the high-dimensional image space are transformed into feature vectors in a low-dimensional subspace, and classification and recognition are then carried out on the low-dimensional feature vectors [19]. Through dimensionality reduction, the original high-dimensional image data is represented by effective feature data with fewer dimensions, and the main features of the original data are extracted without reducing the inherent information it contains [20]. This paper uses the two-dimensional principal component analysis (2DPCA) algorithm [21] to reduce the dimensionality of the sample data. Compared with the more commonly used 1DPCA method, 2DPCA performs better in feature extraction: it saves computing time and significantly improves the recognition rate [22].
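A rough worked comparison suggests where the saving in computing time comes from. It assumes an $m \times n$ sample with $m = n = 100$, the resolution used in this article, and a conventional PCA applied to images flattened into vectors of length $mn$. The covariance matrices whose eigenvectors must be computed then have the sizes

$$
\text{1DPCA: } mn \times mn = 10^{4} \times 10^{4}, \qquad \text{2DPCA: } n \times n = 100 \times 100 .
$$

The 2DPCA covariance matrix therefore has $10^{4}$ entries instead of $10^{8}$, which is also easier to estimate reliably from a small number of learning samples.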
In the calculation, all bolt learning samples are assumed to lie in a low-dimensional linear space in which different bolt samples are separable. Dimensionality reduction is performed on the bolt sample space in both the row and column directions: the samples are projected onto the directions along which the rows and columns vary the most, that is, the directions of largest variance, which realizes spatial feature extraction and feature compression. The calculation steps are as follows:
(1) Let $X$ be an $n$-dimensional column vector. A bolt sample matrix $Q$ of size $m \times n$ is projected directly onto $X$ through the linear transformation
$$Y = QX \qquad (1)$$
$Y$ is called the projected feature vector of the matrix $Q$, and the optimal projection axis $X$ is determined from the scatter of the feature vectors $Y$.
(2) The ideal projection axis $X$ should separate the projected results as much as possible, that is, maximize their scatter, so that the mapping retains the maximum amount of information. The following criterion is therefore used as the objective function to measure the performance of $X$:
$$J(X) = \mathrm{tr}(S_x) \qquad (2)$$
where $S_x$ is the covariance matrix of the projected feature vectors $Y$ of the training samples, and its trace $\mathrm{tr}(S_x)$ represents the degree of scatter of $Y$. The matrix $S_x$ is defined as
$$S_x = E\left[(Y - EY)(Y - EY)^{T}\right] = E\left[\big((Q - EQ)X\big)\big((Q - EQ)X\big)^{T}\right] \qquad (3)$$
(3) Therefore
$$\mathrm{tr}(S_x) = X^{T}\, E\left[(Q - EQ)^{T}(Q - EQ)\right] X \qquad (4)$$
The matrix $E\left[(Q - EQ)^{T}(Q - EQ)\right]$ is the covariance matrix of the learning sample matrix $Q$, so it is defined separately as $G_t$. With $M$ image learning samples $Q_j$ $(j = 1, \dots, M)$ and sample mean $\bar{Q}$, $G_t$ is defined as
$$G_t = \frac{1}{M}\sum_{j=1}^{M}(Q_j - \bar{Q})^{T}(Q_j - \bar{Q}) \qquad (5)$$
(4) The eigenvectors of $G_t$ are calculated, and the eigenvectors $X_1, \dots, X_d$ of the largest eigenvalues, up to a cumulative contribution rate of α=0.9, are selected to form the projection matrix
$$U = [X_1, X_2, \dots, X_d] \qquad (6)$$
Then $Y_j = Q_j U$ is the projection of sample $Q_j$ in the $U$ direction. After this feature extraction, only the number of columns of the learning sample matrix has been compressed (from $n$ to $d$); the number of rows $m$ remains unchanged.
(5) Taking the new samples $Y_j$ projected in the column direction as the object, a second covariance matrix is constructed:
$$G_r = \frac{1}{M}\sum_{j=1}^{M}(Y_j - \bar{Y})(Y_j - \bar{Y})^{T} \qquad (7)$$
where $\bar{Y} = \frac{1}{M}\sum_{j=1}^{M} Y_j$. The eigenvalues and eigenvectors of $G_r$ are calculated, and the eigenvectors $Z_1, \dots, Z_q$ whose cumulative eigenvalue contribution rate is α=0.9 form the projection matrix $V = [Z_1, Z_2, \dots, Z_q]$, so that $C_j = V^{T} Y_j$. The optimal projection matrices $U$ and $V$ for the two projection directions are thus obtained, and the final reduced matrix of each bolt sample is $C_j = V^{T} Q_j U$, of size $q \times d$. Each sample is projected into the feature subspace Φ through these optimal projection axes. On this basis, the minimum distance is defined to realize the identification of new samples.
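The two-directional reduction can be sketched in NumPy. This is a minimal illustration under stated assumptions rather than the code used in this article: the function names (`components_for_rate`, `top_eigvecs`, `fit_2dpca`, `project`) are hypothetical, the second covariance matrix is assumed to be built from the centered column-reduced samples multiplied by their transposes, and the data are random matrices standing in for bolt images.

```python
import numpy as np

def components_for_rate(eigvals, alpha=0.9):
    """Smallest k whose k largest eigenvalues reach cumulative share alpha.

    eigvals must be sorted in descending order.
    """
    ratios = np.cumsum(eigvals) / np.sum(eigvals)
    k = int(np.searchsorted(ratios, alpha)) + 1
    return min(k, len(eigvals))

def top_eigvecs(sym_matrix, alpha):
    """Leading eigenvectors of a symmetric matrix, chosen by contribution rate."""
    vals, vecs = np.linalg.eigh(sym_matrix)      # eigh returns ascending order
    order = np.argsort(vals)[::-1]               # re-sort to descending
    vals = np.clip(vals[order], 0.0, None)       # drop tiny negative round-off
    vecs = vecs[:, order]
    return vecs[:, :components_for_rate(vals, alpha)]

def fit_2dpca(samples, alpha=0.9):
    """samples has shape (M, m, n). Returns U (n x d) and V (m x q)."""
    count = len(samples)
    centered = samples - samples.mean(axis=0)
    # n x n covariance: average of (Q_j - mean)^T (Q_j - mean).
    g_t = np.einsum("jki,jkl->il", centered, centered) / count
    U = top_eigvecs(g_t, alpha)
    # Column-reduced samples, shape (M, m, d), centered for the second step.
    reduced = samples @ U
    reduced_centered = reduced - reduced.mean(axis=0)
    # m x m covariance: average of (Y_j - mean)(Y_j - mean)^T.
    g_r = np.einsum("jik,jlk->il", reduced_centered, reduced_centered) / count
    V = top_eigvecs(g_r, alpha)
    return U, V

def project(sample, U, V):
    """Reduce one m x n sample to a q x d feature matrix."""
    return V.T @ sample @ U

# Random stand-in data: 12 "images" of size 8 x 6.
rng = np.random.default_rng(0)
library = rng.random((12, 8, 6))
U, V = fit_2dpca(library, alpha=0.9)
features = np.stack([project(q, U, V) for q in library])
```

`numpy.linalg.eigh` is used because both covariance matrices are symmetric. The contribution rate α controls how many axes are kept in each direction, so a larger α keeps more detail at the cost of larger feature matrices.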

Identification
To identify a new sample, the bolt sample $Q$ to be identified is projected into the low-dimensional feature subspace Φ of the bolt sample library through the optimal projection matrices $U$ and $V$, giving its point $C = V^{T} Q U$ in the low-dimensional feature subspace. Following the minimum-distance principle, the learning sample most similar to the new sample, that is, the one at the smallest distance, is then found in the low-dimensional feature subspace. The labeled state of that known learning sample is the recognition result for the new sample.
$$d(C, C_j) = \lVert C - C_j \rVert, \qquad j^{*} = \arg\min_{j}\, d(C, C_j) \qquad (8)$$
In the formula, $C$ is the feature matrix of the bolt sample to be identified, and $C_j$ is the feature matrix of the $j$-th sample in the bolt learning sample library. Finally, the appearance state of the new bolt sample (normal, falling off, offset, rust, etc.) is judged from the appearance state of the bolt in the learning sample library that is most similar to it.
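The minimum-distance rule can be sketched as a nearest-neighbor search over the feature matrices of the learning sample library. The sketch assumes the Frobenius norm as the distance, which is a choice made here for illustration; the function name `identify`, the tiny 2×3 feature matrices and the state labels are likewise hypothetical.

```python
import numpy as np

def identify(feature, library_features, library_states):
    """Return the state of the closest learning sample and its distance.

    feature:          (q, d) feature matrix of the sample to be identified
    library_features: (M, q, d) feature matrices of the learning samples
    library_states:   sequence of M labeled states
    """
    differences = library_features - feature
    # With a pair of axes and no ord argument, norm gives the Frobenius norm
    # of each (q, d) difference matrix.
    distances = np.linalg.norm(differences, axis=(1, 2))
    best = int(np.argmin(distances))
    return library_states[best], float(distances[best])

# Toy library of three feature matrices with illustrative states.
toy_features = np.stack([
    np.zeros((2, 3)),
    np.ones((2, 3)),
    5.0 * np.ones((2, 3)),
])
toy_states = ["normal", "offset", "rust"]

state, distance = identify(np.full((2, 3), 0.9), toy_features, toy_states)
```

Because the search compares small feature matrices rather than full images, the cost per query grows only linearly with the number of learning samples.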

Test Results and Discussions
This article takes the high-strength bolts on a steel structure bridge as an example. The bolt samples are collected by drones, and the data are standardized to form a bolt sample library. The method introduced in this article is then used to identify and judge the state of the bolts.
The steel structure bridge is a river-crossing railway bridge in the form of a riveted steel truss girder bridge. A DJI M300 drone with positioning and re-shooting functions is used for image data collection, and 130 clear bolt images are collected, as shown in Fig.2. The original image samples collected by the drone are standardized so that the foreground is prominent and the target is clear. Then, using the sample pre-processing method in this article, the shape and position of each bolt in the image are automatically identified and marked with a red frame, as shown in Fig.3. On this basis, the images are automatically processed into learning samples with a resolution of 100×100 to form the learning sample library.

Fig.3 Bolt sample identification
After the learning samples are constructed, the new bolt image samples collected by the drone are identified using the identification method in this article. The identification rate is shown in Table 1, and the identification results are shown in Table 2.
As shown in Table 1, the method has a high recognition rate of 98.23% for bolts with normal appearance. The recognition accuracy rates for dropped bolts and offset bolts are 91.12% and 87.38%, respectively, while the rate for corroded bolts is lower, only 72.14%. Table 2 shows the correct recognition results for bolts in each appearance state. It can be seen from Figure 4 that the proposed method determines the state of a newly collected image sample by finding the learning sample of known state that is most similar to it. The method can effectively identify and judge unknown new samples, which makes automatic identification of bolt appearance status more efficient. Even when the number of learning samples is small, the appearance state of the bolt can be recognized effectively.
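A per-state recognition rate of the kind reported in Table 1 can be computed as in the sketch below. The function name `per_state_rate` and the label lists are hypothetical toy values used only to show the calculation; they are not the test data of this article.

```python
from collections import Counter

def per_state_rate(true_states, predicted_states):
    """Fraction of samples of each true state that were identified correctly."""
    totals = Counter(true_states)
    correct = Counter(
        truth for truth, predicted in zip(true_states, predicted_states)
        if truth == predicted
    )
    return {state: correct[state] / totals[state] for state in totals}

# Toy labels (illustrative only).
truth = ["normal", "normal", "normal", "rust", "rust", "offset"]
predicted = ["normal", "normal", "normal", "rust", "normal", "offset"]
rates = per_state_rate(truth, predicted)
```

Reporting the rate separately for each state, rather than a single overall figure, makes it visible when a state with few learning samples is recognized less reliably.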

Conclusion
The identification method proposed in this paper accomplishes the identification task well. The recognition results show that even with a small number of learning samples (130 learning samples), the algorithm identifies the shape, position and state of the bolt well, which makes it very suitable for situations where few samples of high-risk bolts are available. The method is also fast in recognizing new samples: the processing, cropping and recognition of a new sample can be completed within 0.01 s. It places low demands on the performance of the UAV embedded system, and the recognition results can be transmitted in real time, so it can serve as a routine method for inspecting the appearance of bolts.
The recognition results also show that the algorithm achieves higher recognition accuracy for bolt states with more samples. The largest number of samples was collected for normal bolts, so their recognition accuracy is the highest. Because steel structure bridges are well maintained in daily operation, there are fewer negative samples of bolts that have fallen off, shifted or corroded. In particular, the number of learning samples of rusty bolts is the smallest, so the error in recognizing their surface texture is relatively large, which results in the lowest accuracy for corrosion recognition. The negative samples should therefore be supplemented through collection from multiple sources, enriching and improving the learning sample library and raising the accuracy of bolt recognition.