Traffic accident prediction based on Markov chain cloud model

A dynamic prediction model combining a Markov chain with the cloud model is established to predict the volume of road traffic accidents. The model is guided by stochastic process theory and cloud theory and reflects the characteristics of road traffic accidents and time series data. First, an evaluation model based on the cloud model is established, and the range of the relative error between the observed value and the predicted value is obtained. A Markov chain is then used to correct the relative error. The results show that this method accounts for both randomness and fuzziness and that the correction improves prediction accuracy. It can therefore be used to analyse the trend of road traffic accidents under different traffic conditions and to provide evidence for safety early warning and for formulating accident prevention countermeasures on the relevant road segments.


Introduction
Many prediction models, such as the Time Series Model [1,2], the Gray Model [3,4] and the Neural Network Model [5,6], have been widely used to predict road traffic accidents and have achieved good prediction results, but each has limitations. The Time Series Model can describe the periodic law of change in the data; however, it is unrealistic to record and predict all the data for the relevant road segments, so it is difficult to establish a training and learning model. The neural network method needs a large amount of training data and considerable effort, so it is difficult to apply in practice and its prediction accuracy is hard to guarantee. It is also prone to overfitting when there are many learning samples. GM(1,1) needs only a small number of samples to establish the model [7,8], but it requires the original data to be smooth and its prediction horizon is short, so it cannot reveal the periodic law of change in the data. By contrast, the qualitative concepts of the cloud model capture the overall law: specific algorithms transform the quantitative data into qualitative knowledge divided by historical period. The prediction rules of the corresponding qualitative knowledge are then activated, according to the membership degree of the prediction time in the different periods, to realize uncertainty prediction of the observed value, which gives a better prediction effect. The Markov chain can predict the development trend of the error from the known error states and thereby correct the error. The combination therefore has certain advantages for predicting road traffic accident data series, which are affected by various factors. Based on the above analysis, this paper proposes a road traffic accident prediction model based on the cloud model and the Markov chain.

A cloud is an uncertain transformation model between a qualitative concept and a numerical representation of that concept.
Definition 1: Let X be a quantitative domain expressed by precise numerical values, and let T be a qualitative concept related to this domain. The membership degree of an element x in X to the qualitative concept expressed by T is not an exact number but a random number with a stable tendency. The distribution of this random membership degree over the domain is called a membership cloud, or cloud for short, and each x is called a cloud droplet [3].
Definition 2: Let U be a quantitative domain expressed by precise numerical values, and let C be a qualitative concept on U. If a quantitative value $x \in U$ is a random realization of the qualitative concept C such that $x \sim N(Ex, En'^2)$ with $En' \sim N(En, He^2)$, and the certainty degree of x with respect to C satisfies $\mu(x) = \exp\left(-\frac{(x - Ex)^2}{2 En'^2}\right)$, then the distribution of x on the domain U is called a normal cloud.
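Definition 2 can be read as a sampling procedure. The following is a minimal sketch of such a forward normal cloud generator, assuming only the Python standard library; the function name `normal_cloud` and the parameter values in the usage note are illustrative and do not come from the paper.

```python
import math
import random


def normal_cloud(ex, en, he, n, seed=None):
    """Return n cloud droplets (x, mu) of the normal cloud C(Ex, En, He)."""
    rng = random.Random(seed)
    droplets = []
    while len(droplets) < n:
        # En' ~ N(En, He^2): a fresh perturbed entropy for every droplet
        en_prime = rng.gauss(en, he)
        if en_prime == 0.0:
            continue  # degenerate draw; resample to avoid division by zero
        # x ~ N(Ex, En'^2)
        x = rng.gauss(ex, abs(en_prime))
        # certainty degree mu(x) = exp(-(x - Ex)^2 / (2 En'^2))
        mu = math.exp(-(x - ex) ** 2 / (2.0 * en_prime ** 2))
        droplets.append((x, mu))
    return droplets
```

For example, `normal_cloud(30.0, 3.0, 0.3, 500, seed=1)` returns 500 droplets scattered around the hypothetical expectation 30, each with a certainty degree between 0 and 1. A larger He thickens the cloud because En' varies more from droplet to droplet.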

Prediction method based on cloud model
To capture the distributional law and the uncertainty of the current sequence, the trend is mined from the existing data. On this basis, the observed values of the time series are divided into two parts: the historical data set H and the current data set C. In the time series data set H, ai is a value of the time attribute A (such as the ai-th study period), and bi is the value of the numerical attribute B at the time point ai. The aim is to predict the value bt of the numerical attribute B at a future time point, denoted at.
It is therefore necessary first to mine prediction knowledge from the database D. The implementation mechanism of the prediction rule is shown in Figure 1, and the algorithm proceeds as follows:

Figure 1. The implementation mechanism of predicted rule

(1) According to the established cloud model parameters, construct the x-conditional cloud generator of A and the y-conditional cloud generator of B (i = 1, 2, …, n).
(2) Input the prediction-time antecedent value ai into the x-conditional cloud generator of A to produce the cloud droplet (ai, ui) (i = 1, 2, …, n).
(3) Input the membership degree ui of the cloud droplet (ai, ui) into the y-conditional cloud generator of B to produce the cloud droplet (bi, ui).
(4) Output the prediction value bi.

Note that the cloud droplets and output values in the algorithm are neither unique nor deterministic, so this rule based on the cloud model realizes uncertain reasoning. The cloud B with the maximum membership degree ui is taken as the corresponding prediction knowledge and is called the historical cloud.
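Steps (1) to (4) can be sketched as a single-rule cloud generator. The sketch below assumes normal clouds for both A and B and the usual convention that an input on the rising side of A maps to the rising side of B; the function names and the parameter triples in the usage note are illustrative assumptions, not the paper's implementation.

```python
import math
import random


def x_conditional(a, ex, en, he, rng):
    """Certainty degree u of the input a under the antecedent cloud A."""
    en_prime = rng.gauss(en, he)
    while en_prime == 0.0:  # resample a degenerate entropy
        en_prime = rng.gauss(en, he)
    return math.exp(-(a - ex) ** 2 / (2.0 * en_prime ** 2))


def y_conditional(u, ex, en, he, rng, rising):
    """Droplet value b of the consequent cloud B at certainty degree u."""
    en_prime = rng.gauss(en, he)
    spread = abs(en_prime) * math.sqrt(-2.0 * math.log(u))
    # Assumed convention: rising side of A maps to rising side of B.
    return ex - spread if rising else ex + spread


def predict(a, cloud_a, cloud_b, n_droplets=1000, seed=None):
    """Average of n_droplets outputs of the rule 'if A then B' for input a."""
    rng = random.Random(seed)
    rising = a <= cloud_a[0]
    values = []
    for _ in range(n_droplets):
        u = x_conditional(a, *cloud_a, rng)
        u = max(u, 1e-300)  # keep log(u) finite when a is far from Ex
        values.append(y_conditional(u, *cloud_b, rng, rising))
    return sum(values) / len(values)
```

With hypothetical parameters `cloud_a = (10.0, 2.0, 0.1)` and `cloud_b = (30.0, 3.0, 0.3)`, the call `predict(12.0, cloud_a, cloud_b, seed=0)` averages 1000 droplets of B. Repeated runs with different seeds give different values, which is the uncertain reasoning the rule is meant to provide.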
Inertia weighting of the historical cloud is carried out on the basis of the current cloud to obtain the integrated cloud S(Et, En, He). The definition of the integrated cloud is as follows:
The integrated cloud S(Et, En, He) integrates the two different kinds of prediction knowledge. Replacing the historical cloud B with the integrated cloud S then generates the traffic accident prediction rules.

Markov chain modified model
The Markov chain prediction method is based on state transitions: the random sequence is divided into several states in order to predict future states. The relative error between the measured value and the predicted value is therefore divided into n state intervals, denoted as $S_1, S_2, \ldots, S_n$. The error interval most likely to contain the value predicted by the cloud model can be judged and corrected according to formula (3). First, the states of the n known relative errors nearest to the prediction time are selected, and the number of transition steps from each of them to the prediction time is obtained. The row vector corresponding to each initial state is then extracted from the state transition matrix for the corresponding number of steps, and these row vectors form a new probability matrix. The sum of the i-th column of the new probability matrix represents the probability that the predicted value is in state $S_i$. The midpoint of the state $S_i$ with the maximum probability is taken as the most probable relative error, as given by formula (2).

If two or more probabilities in the row vector $p_{kj}$ ($j = 1, 2, 3, \ldots, n$) are equal and are all the maximum value, the transition direction of the next step cannot be determined. In this case, the state transition probability matrix of the second or a higher step must be calculated. Generally, the midpoint of the state interval can be used as the predicted value, so a single predicted value of the accident volume for the corresponding future period can be obtained from this state. The predicted values of the combined model are as follows:
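The state-selection procedure described in this section can be sketched as follows. Two assumptions are made that the text does not confirm: the k-step transition matrix is taken as the k-th power of the one-step matrix estimated from counts, and states are indexed from 0. All function names are illustrative, and the combined-model formula itself is not reproduced here.

```python
def transition_matrix(states, n_states):
    """One-step transition probabilities estimated from a state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # Assumption: a state never left in the sample gets a uniform row.
            matrix.append([1.0 / n_states] * n_states)
    return matrix


def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def state_probabilities(states, n_states, n_recent):
    """Column sums of the probability matrix built from the recent states."""
    p1 = transition_matrix(states, n_states)
    totals = [0.0] * n_states
    pk = p1  # k-step matrix; k = 1 for the most recent period
    for s in reversed(states[-n_recent:]):
        for j in range(n_states):
            totals[j] += pk[s][j]
        pk = mat_mul(pk, p1)  # assumption: k-step matrix taken as P^k
    return totals


def most_probable_error(states, intervals, n_recent):
    """Index and midpoint of the state with the largest column sum."""
    totals = state_probabilities(states, len(intervals), n_recent)
    # Ties are broken by the first index here; the text instead calls for
    # higher-step matrices when the maximum is not unique.
    best = max(range(len(intervals)), key=lambda j: totals[j])
    low, high = intervals[best]
    return best, (low + high) / 2.0
```

For a hypothetical alternating sequence `[0, 1, 0, 1, 0, 1, 0, 1]` with intervals `[(-1.0, 0.0), (0.0, 1.0)]` and `n_recent=2`, both recent states point to state 0, so the most probable relative error is the midpoint -0.5.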

Accuracy test of combined forecasting model
(2) Mean square error ratio C
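The formula for C does not survive in this section. As a sketch only, assuming C is the posterior variance ratio commonly used with gray-model accuracy tests (the standard deviation of the residuals divided by the standard deviation of the observed series, both computed with 1/n), it could be evaluated as below. The function name and the sample numbers in the usage note are hypothetical.

```python
import statistics


def mean_square_error_ratio(observed, predicted):
    """Assumed definition: C = S2 / S1 (posterior variance ratio)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    residuals = [o - p for o, p in zip(observed, predicted)]
    s1 = statistics.pstdev(observed)   # spread of the observed series
    s2 = statistics.pstdev(residuals)  # spread of the residuals
    if s1 == 0.0:
        raise ValueError("observed series is constant; C is undefined")
    return s2 / s1
```

For the made-up series `observed = [10, 12, 14, 16]` and `predicted = [11, 11, 15, 15]`, the residuals have standard deviation 1 and the observations have standard deviation sqrt(5), so C is about 0.447. A smaller C means the residuals vary little compared with the data.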

Case analysis
To study the trend of road traffic accidents, the observed values from January 1, 2018 to January 1, 2019 in a subordinate district (county) of Jinan City are chosen as the original sample, as shown in Figure 2 (each week serves as an observation period). The data from January 2, 2019 to September 1, 2019 (each week serves as an observation period) are used for prediction based on the cloud model. Of these data, the first 31 periods are used as fitting samples to obtain the relative error data, and the last 4 periods are used as prediction samples for the combined model.

The accident volume in each period is taken as the historical data set H, and the historical cloud B(Ex, En, He) is constructed. The membership interval of the road traffic accidents at each historical moment is then identified and its membership degree is calculated. Taking the prediction of the road traffic accident volume in period 32 of 2019 as an example, the historical cloud is inertia-weighted on the basis of the current cloud to obtain the characteristic parameters and thus the integrated cloud S(Et, En, He), as shown in Table 2. Prediction is then carried out with the integrated cloud and repeated several times to obtain enough cloud droplets. The prediction set of the accident volume at the prediction time is determined from the region where the cloud droplets are located. All prediction values in the prediction period are obtained, and the predicted value is taken as the average of the cloud droplets. The predicted value is then introduced into the current cloud, and the next prediction is made by continuously updating the current cloud model.
A Markov chain is used to correct the error of the prediction data. The relative error is divided into four state intervals, as shown in Table 3.

Table 3. State intervals of the relative error
State    Relative error interval (%)
S1       [-2, -1)
S2       [-1, 0)
S3       [0, 1)
S4       [1, 2)

The relative errors of the cloud model prediction values are input into the Markov chain. The relative error states of the four research periods closest to the prediction time are taken, and the number of steps required for each of the four known states to transfer to the prediction period is calculated. Four state transition matrices are calculated as follows.

It can be seen from Table 4 that the relative error of the first prediction period in 2019 is most likely in state S2. From this, the corrected value is calculated to be 31, which belongs to interval 2. The prediction values of the other prediction periods are obtained in the same way, giving the final prediction results of road traffic accidents based on the cloud model corrected by the Markov chain, as shown in Table 5. According to the prediction results, the C value is less than 0.5, and the prediction fitting accuracy is good. This indicates that the cloud model corrected by the Markov chain is effective and can be used for subsequent prediction to reflect the development trend of road traffic accidents.
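Assigning a relative error to one of the four states is a lookup against the interval boundaries of Table 3. A minimal sketch is given below; the boundaries are taken from the table, while the function names and the choice to reject values outside [-2, 2) are assumptions.

```python
from bisect import bisect_right

# Boundaries of the half-open intervals S1..S4 in Table 3 (percent).
EDGES = [-2.0, -1.0, 0.0, 1.0, 2.0]


def error_state(rel_error_pct):
    """Return the state number (1 for S1 ... 4 for S4) of a relative error."""
    if not EDGES[0] <= rel_error_pct < EDGES[-1]:
        raise ValueError("relative error lies outside the Table 3 intervals")
    # bisect_right counts the boundaries <= value, which matches [low, high).
    return bisect_right(EDGES, rel_error_pct)


def state_midpoint(state):
    """Midpoint (percent) of the interval belonging to the given state."""
    return (EDGES[state - 1] + EDGES[state]) / 2.0
```

A relative error of -0.4 % falls in S2, whose midpoint is -0.5 %. That midpoint is the value the correction step would use as the most probable relative error for a period judged to be in S2.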

Conclusion
Drawing on the existing literature and on relevant research results from the field of econometrics, and taking into account the characteristics of road traffic accidents and time series data, a dynamic prediction model based on the cloud model and the Markov chain is established to predict the volume of road traffic accidents. The prediction accuracy is improved after correction by the Markov chain. The method provides a new idea for road traffic accident prediction.