An Analysis Method of Ship Traffic Section Flow Based on Big Data

Traffic flow has important research value for traffic planning and traffic management. In waterway traffic, the cross-sectional flow of ships also has important application value in ship safety management, navigation rule making, bridge and wharf construction, and ship driver decision-making. To fully explore the characteristics of ship traffic flow in a big data environment, this paper studies an analysis method for ship cross-section flow based on big data. First, to ensure calculation accuracy and reduce calculation time, the data processing method is optimized for AIS big data, a data storage strategy is completed, and a data management mechanism is constructed. Then, a ship section flow analysis algorithm is proposed and implemented. Finally, based on this algorithm, a ship section flow analysis and statistics system is constructed to complete the flow calculation and information statistics. The results make the analysis of ship traffic flow more convenient and faster, and have application value for waterway traffic planning and management.


Introduction
With the development of water transportation, ship traffic accidents occur frequently, causing unpredictable economic losses and loss of life. Traffic accidents are often accompanied by management factors. For example, waters with heavy traffic flow also have a higher density of ships, and the risk of accidents there is usually greater. By understanding the traffic flow and carrying out corresponding traffic management or control, the occurrence of traffic accidents can be effectively reduced. Manual observation of ship cross-section flow is currently the most widely used flow observation method in our country: the channel management department arranges for observers to observe the ship cross-section on the most typical days of each month. However, manual observation has many shortcomings, such as strong interference from human factors and a limited sample size. Therefore, it is necessary to study a feasible monitoring method for cross-section flow.
Qualitative and quantitative analysis methods are currently the most used methods for ship traffic flow forecasting. This article mainly uses quantitative analysis methods. Tian Yanhua et al. [1] selected the traffic flow data of the Yangtze River Estuary for model training, and the results showed that the prediction model was effective in predicting traffic flow. Liu Jingxian et al. [2] used the nonlinear fitting ability of neural networks to solve the small sample problem in the prediction process, and ensured the accuracy and stability of the prediction. Feng Hongxiang et al. [3] verified the feasibility of the prediction model in short-term prediction of ship flow. Wu Baoquan [4] borrowed the statistical method used for road traffic flow and proposed a method for monitoring the flow of ships in the channel section between ship locks: two sets of monitoring equipment are set up in a segment, and when a ship passes the two sets in order, the cross-sectional flow of the channel can be measured. Hussein Dia [5] proposed that dynamic neural networks have superior prediction performance compared with static classifiers, and proved the feasibility of using object-oriented neural network methods to predict short-term traffic. Yisheng Lv et al. [6] proposed a deep learning method based on the stacked autoencoder (SAE) model to predict traffic flow, and successfully discovered latent traffic characteristics.
Building on the research above, this paper proposes an analysis method for ship cross-sectional flow based on big data from the ship Automatic Identification System (AIS). The method can quickly obtain the cross-sectional flow of a channel or route, which reduces the work intensity of observers, captures the characteristics of traffic flow elements in real time, and improves the accuracy of ship flow statistics. It further supports waterway traffic risk analysis and traffic control, helping to reduce ship traffic accidents and ensure water safety and environmental safety.

AIS message analysis
In terms of logical function, AIS message analysis can be divided into 5 parts.
(1) Reading part: read the original AIS message.
(2) Check part: the AIS message is verified by comparing the computed cyclic redundancy check (CRC) value with the check value carried in the message.
(3) Effective extraction part: extract the AIS message data that has passed verification.
(4) Decoding part: decode the AIS message data.
(5) Storage part for the parsed AIS information:
① The data after preliminary analysis is stored as a text (Notepad) file.
② Java code is then written as needed to extract the data from the text file, store it in Excel, and finally import it into the database for storage and management.
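The check step in part (2) can be illustrated with a short sketch. It assumes the raw messages arrive as NMEA 0183 `!AIVDM` sentences, in which the two hexadecimal digits after `*` are the XOR of all characters between `!` and `*`. That sentence-level checksum stands in here for the check described above; the class and method names are hypothetical, and this is not necessarily the routine used in this paper.

```java
final class NmeaChecksum {
    private NmeaChecksum() {}

    /** XOR of every character in the sentence body, as two upper-case hex digits. */
    static String compute(String body) {
        int sum = 0;
        for (int i = 0; i < body.length(); i++) {
            sum ^= body.charAt(i);
        }
        return String.format("%02X", sum & 0xFF);
    }

    /** True when the sentence looks like "!BODY*HH" and HH equals the XOR of BODY. */
    static boolean isValid(String sentence) {
        if (sentence == null || sentence.length() < 4) {
            return false;
        }
        char start = sentence.charAt(0);
        int star = sentence.lastIndexOf('*');
        // Need a start delimiter, a '*', and two checksum characters after it.
        if ((start != '!' && start != '$') || star < 1 || star + 3 > sentence.length()) {
            return false;
        }
        String body = sentence.substring(1, star);
        String declared = sentence.substring(star + 1, star + 3);
        return compute(body).equalsIgnoreCase(declared);
    }
}
```

Sentences for which `isValid` returns false would be dropped before the extraction and decoding parts run.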

Information error type
① Static information error
Typical static errors include: IMO number unavailable (zero), MMSI input error, ship name missing, ship name input error, call sign missing, ship type unavailable, ship draught data unavailable, ship length data missing (0), distance from the bow to the antenna missing, ship width missing (zero), call sign format error, ship length less than ship width, a bow-to-antenna distance that is inconsistent with the ship length, a starboard-to-antenna distance that is inconsistent with the ship width, etc.
② Dynamic information error
Dynamic errors arise when the AIS device is faulty, when the AIS device has not been connected to other devices (such as GPS), or when a device connected to the AIS device is faulty, so that the collected AIS information is wrong. They also arise when the pilot on duty or the responsible crew member has not updated the ship's navigation status; for example, the ship has left the berth and is sailing normally while the equipment still reports it as at anchor.
Data cleaning refers to removing the abnormal part of the data. In static data, this means records in which the IMO number is unavailable (zero) or the field is empty, the MMSI is incorrect, the ship name is missing or wrongly entered, the call sign is missing or wrongly formatted, the ship type or draught is unavailable, the ship length is missing (0), the bow-to-antenna distance is missing, the ship width is missing (zero), the ship length is less than the ship width, the bow-to-antenna distance is greater than or equal to the ship length, or the starboard-to-antenna distance is greater than or equal to the ship width.
In dynamic data, the abnormal records are: records whose longitude and latitude are obviously inconsistent with the GPS-measured position, that is, outside the normal range (longitude beyond 180° or latitude beyond 90°); records whose heading exceeds 360°; records whose course over ground deviates from the course calculated from consecutive latitude and longitude points; records with negative speed over ground; and records whose speed over ground deviates greatly from the speed calculated from consecutive latitude and longitude points.
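The range-based cleaning rules listed above can be expressed as simple predicates. The following is a minimal sketch, assuming each record has already been decoded into numeric fields; the class name, method names and parameter names are hypothetical. The rules that compare reported course and speed with values derived from consecutive positions are left out, because they need a deviation threshold that is not stated here.

```java
final class AisRecordFilter {
    private AisRecordFilter() {}

    /**
     * Static-data rules: length and width must be present (non-zero), length must not be
     * less than width, and the antenna offsets must be smaller than the hull dimensions.
     */
    static boolean hasValidDimensions(double lengthM, double widthM,
                                      double bowToAntennaM, double starboardToAntennaM) {
        if (lengthM <= 0 || widthM <= 0) {
            return false;               // missing (zero) length or width
        }
        if (lengthM < widthM) {
            return false;               // length less than width
        }
        if (bowToAntennaM >= lengthM) {
            return false;               // bow-to-antenna distance >= ship length
        }
        return starboardToAntennaM < widthM;
    }

    /** Dynamic-data rules: position inside the valid range, heading <= 360, speed >= 0. */
    static boolean hasValidDynamics(double lonDeg, double latDeg,
                                    double headingDeg, double sogKnots) {
        if (Math.abs(lonDeg) > 180.0) {
            return false;
        }
        if (Math.abs(latDeg) > 90.0) {
            return false;
        }
        if (headingDeg > 360.0) {
            return false;
        }
        return sogKnots >= 0.0;
    }
}
```

A record would be kept only when both predicates return true.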

Data error correction method
② Data repair
To ensure the accuracy and completeness of the data, the parts of the AIS data that went missing during collection and transmission need to be repaired. For example, if the trajectory data sent by a ship is interrupted and a position point jumps, the abnormal jump point must first be cleared. After the abnormal point is cleared, the original trajectory point at that position needs to be repaired.
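One possible way to clear jump points and repair the gap is sketched below. The criterion (a point is a jump if the speed implied by the distance from the last accepted point exceeds a limit) and the repair by linear interpolation are assumptions made for illustration, since the specific rule is not given here; the names `TrackRepair`, `dropJumps`, `interpolate` and the speed limit are hypothetical. The sketch also assumes that the first point of the track is valid.

```java
import java.util.ArrayList;
import java.util.List;

final class TrackRepair {
    private TrackRepair() {}

    static final double EARTH_RADIUS_M = 6_371_000.0;

    /** Great-circle distance in metres between two positions (haversine formula). */
    static double distanceM(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /**
     * Each point is {unixTimeSeconds, latDeg, lonDeg}, sorted by time.
     * A point is dropped when reaching it from the last kept point would need
     * a speed above maxSpeedMps (metres per second).
     */
    static List<double[]> dropJumps(List<double[]> track, double maxSpeedMps) {
        List<double[]> kept = new ArrayList<>();
        for (double[] p : track) {
            if (kept.isEmpty()) {
                kept.add(p);
                continue;
            }
            double[] last = kept.get(kept.size() - 1);
            double dt = p[0] - last[0];
            if (dt <= 0) {
                continue;               // duplicate or out-of-order timestamp
            }
            double speed = distanceM(last[1], last[2], p[1], p[2]) / dt;
            if (speed <= maxSpeedMps) {
                kept.add(p);
            }
        }
        return kept;
    }

    /** Repairs a gap: linearly interpolated position at time t between kept points a and b. */
    static double[] interpolate(double[] a, double[] b, double t) {
        double f = (t - a[0]) / (b[0] - a[0]);
        return new double[] { t, a[1] + f * (b[1] - a[1]), a[2] + f * (b[2] - a[2]) };
    }
}
```

Linear interpolation is only reasonable over short gaps; over long interruptions the repaired point may not lie on the route the ship actually took.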

Storage and management of trajectory data
Database design
Trajectory data includes MMSI code, ship name, call sign, ship length, ship width, ship type, ship latitude and longitude, course, speed, draft, time stamp, etc. In order to achieve data storage management and effective query, the data is imported into a table in the database.

Static information table design
This article takes the data collected for Fujian in January as an example. A static information table is created in the MySQL database to store the AIS data of the ship's original trajectory, including MMSI, Shipname_EN, Shipname_CN, Callsign, IMO, Shiptype_EN, Shiptype_CN, Navi_status_EN, Navi_status_CN, Length_d, Width_d, Destination and other attribute fields. Among them, MMSI, Length_d and Width_d are of Double type, with no data length assigned; Shipname_EN, Shipname_CN, Callsign, IMO, Shiptype_EN, Shiptype_CN, Navi_status_EN, Navi_status_CN and Destination are of Varchar type, with a data length of 255. The specific table creation information is shown in Table 1.
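The field names and types described above can be turned into a table creation statement. The sketch below builds such a statement as a string; it covers only the fields named in the text (the table may hold others), and the class name and the table name passed in are hypothetical.

```java
final class StaticTableDdl {
    private StaticTableDdl() {}

    // Column name and MySQL type, in the order the fields are listed in the text.
    static final String[][] COLUMNS = {
        { "MMSI", "DOUBLE" },
        { "Shipname_EN", "VARCHAR(255)" },
        { "Shipname_CN", "VARCHAR(255)" },
        { "Callsign", "VARCHAR(255)" },
        { "IMO", "VARCHAR(255)" },
        { "Shiptype_EN", "VARCHAR(255)" },
        { "Shiptype_CN", "VARCHAR(255)" },
        { "Navi_status_EN", "VARCHAR(255)" },
        { "Navi_status_CN", "VARCHAR(255)" },
        { "Length_d", "DOUBLE" },
        { "Width_d", "DOUBLE" },
        { "Destination", "VARCHAR(255)" }
    };

    /** Builds a CREATE TABLE statement for the given (trusted) table name. */
    static String createTable(String tableName) {
        StringBuilder sql = new StringBuilder("CREATE TABLE `" + tableName + "` (\n");
        for (int i = 0; i < COLUMNS.length; i++) {
            sql.append("  `").append(COLUMNS[i][0]).append("` ").append(COLUMNS[i][1]);
            sql.append(i < COLUMNS.length - 1 ? ",\n" : "\n");
        }
        return sql.append(")").toString();
    }
}
```

The table name is concatenated directly into the statement, so it should come from the program itself rather than from user input.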

Realization of Section Analysis Algorithm
Java Topology Suite (JTS) is a Java API for processing spatial (geometric) data. This article uses the GeometryFactory, Geometry and Coordinate classes in JTS, together with the Intersects spatial operation and the LineString spatial data type, to implement the flow algorithm.
First, extract the coordinate data of each ship (same MMSI) in chronological order, as shown in Figure 6 (coordinates of ship).
② Time selection for the flow calculation area. This paper covers four areas (Wuhan, Nanjing, Fujian and Shandong), each with 12 months of data, so the time period of the calculation area must be selected.
③ Selection of the flow information database table. After the section flow calculation, the information of the ships passing through the section is imported into a database table. A flow information table for each section and month to be calculated has been established in the database, so the table to import into must be selected in order to obtain and count the information of the ships passing through the section.
④ Selection of the flow calculation area: select the area for which the flow is to be calculated.
⑤ Output of the cross-section flow value: the final flow value is shown in the output box.

MySQL database and Java connection
Based on MyEclipse, the Java program accesses the MySQL database through a JDBC (Java DataBase Connectivity) connection. The steps are:
① Start the MySQL server
② Import the jar package into MyEclipse
③ Use the MySQL client SQLyog to establish a database
④ Load the data access driver
⑤ Configure the database link address
⑥ Configure the user name and password
⑦ Establish the connection
⑧ Execute the SQL statement
⑨ Process the result set
⑩ Close the database connection
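Steps ⑤ to ⑩ can be sketched with the standard `java.sql` API as follows. The host, port, database name and credentials are placeholders, and the class and method names are hypothetical. With a JDBC 4 driver jar on the classpath, the driver is normally registered automatically, so step ④ needs no explicit code in this sketch.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class MysqlAccess {
    private MysqlAccess() {}

    /** Step 5: build the database link address. */
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    /**
     * Steps 6 to 10: connect with user name and password, execute the SQL statement,
     * read the first column of every row, and close all resources.
     */
    static List<String> queryFirstColumn(String url, String user, String password, String sql)
            throws SQLException {
        List<String> values = new ArrayList<>();
        // try-with-resources closes the result set, statement and connection in reverse order.
        try (Connection conn = DriverManager.getConnection(url, user, password);
             PreparedStatement stmt = conn.prepareStatement(sql);
             ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                values.add(rs.getString(1));
            }
        }
        return values;
    }
}
```

Calling `queryFirstColumn` requires a running MySQL server and the MySQL driver jar; `jdbcUrl` on its own has no such requirement.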

Java extracts data from MySQL
The trajectory data of a ship is associated with its MMSI code. Every trajectory message sent by a ship carries an MMSI code, which distinguishes the trajectory information of different ships so that the trajectory of each ship can be generated. Therefore, this article uses the MMSI code to access the other track information of a ship. Part of the track information is shown in Table 3. First, the MMSI is processed in the MySQL database, and the processing results are shown in Table 4. [...] code table name;" Because the amount of actual data is large, the number of rows returned from the MySQL database after MyEclipse executes the SQL statement cannot be determined in advance, so a list container is defined first and the data queried from the MySQL database is stored in it. Then a one-dimensional array is defined whose length equals the size of the list container, and the data in the list container is copied into the array in order. Finally, the MMSI code is used in MyEclipse to access the other track information in the MySQL database, and the track information is stored in multiple arrays. The executed SQL statements are as follows: "SELECT track information FROM '
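The list-then-array pattern described above can be sketched as follows. The values are appended to a list while the result set is read, and are copied into a fixed-length array once the row count is known. The class and method names are hypothetical, and the database read is replaced by an in-memory list so that the fragment stands alone.

```java
import java.util.ArrayList;
import java.util.List;

final class ResultBuffers {
    private ResultBuffers() {}

    /** Copies MMSI values from the list container into an array of the same length, in order. */
    static long[] toLongArray(List<Long> values) {
        long[] out = new long[values.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = values.get(i);
        }
        return out;
    }

    /** Same pattern for latitude, longitude and other numeric track fields. */
    static double[] toDoubleArray(List<Double> values) {
        double[] out = new double[values.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = values.get(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for rows read from the database; the row count is not known in advance.
        List<Long> mmsiList = new ArrayList<>();
        mmsiList.add(200023823L);
        long[] mmsi = toLongArray(mmsiList);
        System.out.println("MMSI count: " + mmsi.length);
    }
}
```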

Realization of flow calculation
Taking the ship with MMSI=200023823 in Table 5 as an example, the array ARR2[i] stores the latitude data and the array ARR3[i] stores the longitude data, both in chronological order. Adjacent latitude and longitude pairs are then taken in turn, as shown in Figure 3.
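The counting step can be sketched without external libraries. Each pair of adjacent track points forms a segment, and the segment is tested against the section AB. The hand-written orientation test below is a dependency-free stand-in for the JTS `intersects` operation on two `LineString` objects, not the JTS implementation itself. Latitude and longitude are treated as planar coordinates, and the class and method names are hypothetical.

```java
final class SectionFlow {
    private SectionFlow() {}

    /** Sign of the cross product (b - a) x (c - a): 1, -1, or 0 when collinear. */
    static int orientation(double ax, double ay, double bx, double by, double cx, double cy) {
        double cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
        return cross > 0 ? 1 : (cross < 0 ? -1 : 0);
    }

    /** For a point c collinear with ab: true when c lies within the bounding box of ab. */
    static boolean onSegment(double ax, double ay, double bx, double by, double cx, double cy) {
        return Math.min(ax, bx) <= cx && cx <= Math.max(ax, bx)
            && Math.min(ay, by) <= cy && cy <= Math.max(ay, by);
    }

    /** True when segment p1p2 and segment q1q2 share at least one point. */
    static boolean segmentsIntersect(double p1x, double p1y, double p2x, double p2y,
                                     double q1x, double q1y, double q2x, double q2y) {
        int o1 = orientation(p1x, p1y, p2x, p2y, q1x, q1y);
        int o2 = orientation(p1x, p1y, p2x, p2y, q2x, q2y);
        int o3 = orientation(q1x, q1y, q2x, q2y, p1x, p1y);
        int o4 = orientation(q1x, q1y, q2x, q2y, p2x, p2y);
        if (o1 != o2 && o3 != o4) {
            return true;                // proper crossing
        }
        // Collinear special cases: an endpoint of one segment lies on the other.
        if (o1 == 0 && onSegment(p1x, p1y, p2x, p2y, q1x, q1y)) return true;
        if (o2 == 0 && onSegment(p1x, p1y, p2x, p2y, q2x, q2y)) return true;
        if (o3 == 0 && onSegment(q1x, q1y, q2x, q2y, p1x, p1y)) return true;
        if (o4 == 0 && onSegment(q1x, q1y, q2x, q2y, p2x, p2y)) return true;
        return false;
    }

    /** Counts how many adjacent-point segments of one ship's track intersect section AB. */
    static int countCrossings(double[] lat, double[] lon,
                              double aLat, double aLon, double bLat, double bLon) {
        int count = 0;
        for (int i = 0; i + 1 < lat.length; i++) {
            if (segmentsIntersect(lon[i], lat[i], lon[i + 1], lat[i + 1],
                                  aLon, aLat, bLon, bLat)) {
                count++;
            }
        }
        return count;
    }
}
```

If a track point falls exactly on the section, both segments that share that point are counted, so such a passage is counted twice in this sketch.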

Implementation of Traffic Information Classification
The graphical interface for cross-section flow calculation can only obtain the number of times ships have passed through the cross-section, and cannot obtain the detailed information of those ships. For this reason, the code is optimized. If the latitude and longitude data of the ship with MMSI=MMSI[i] is judged as intersecting by the algorithm, then that ship's Unixtime, Lat_d, Lon_d, Shipname_EN, Shipname_CN, Shiptype_EN, Shiptype_CN, Length_d and Width_d are imported into the database. Fujian's January trajectory information and section AB are taken as an example, as shown in Figures 4 and 5 and Table 6.
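Once the information of the crossing ships is available, it can be grouped for statistics. The sketch below counts crossings per ship type. Grouping by the Shiptype_EN value is an assumption made for illustration, and the class name, method name and sample type labels are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CrossingStats {
    private CrossingStats() {}

    /** Counts crossing records per ship type; records without a type are grouped as "Unknown". */
    static Map<String, Integer> countByType(List<String> shipTypes) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String type : shipTypes) {
            String key = (type == null || type.isEmpty()) ? "Unknown" : type;
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

A `TreeMap` keeps the ship types in alphabetical order, which gives a stable order when the counts are listed.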

Conclusion
Research on cross-section ship flow statistics based on AIS big data is growing and maturing, but the implementation process is usually complicated. This article proposes a relatively simple implementation process and a more convenient algorithm, and uses the Java language together with the JTS toolkit and a MySQL database to realize the flow statistics. A graphical interface for flow calculation and statistics is designed, in which the flow calculation area and calculation period can be selected, and the statistics of the ships passing through the section are classified. Compared with previous traffic statistics methods, the system is easy to operate and more practical, and it can conveniently count ship traffic flow information. This is of great significance for maritime administration agencies formulating ship navigation management rules, for siting bridge and wharf construction away from waters with large flows, and for ship drivers who need to understand the flow of navigable waterways.