Indirect determination of the measurement accuracy of the FMK-004 track geometry measuring car used on the Hungarian rail network

During the examination of the Hungarian national network for the period from the second half of 1999 to the second half of 2013, we had to explain why geometrical errors can decrease even though no regulating intervention was carried out on the line section during the study period. This anomaly made it important to determine the accuracy of the FMK-004 measuring car (FMK in English: railway superstructure measuring car). With the VBA code written for the research task, the data set was optimized, deterioration functions were fitted to the data, and where outliers were detected, their extent was analysed on a statistical basis. This yields the measurement accuracy for each of the three measuring numbers separately. Furthermore, the examination of the national network clearly demonstrates that in 90% of cases at least one measuring number decreases compared to the previous half-year, while the other measuring numbers are still increasing.


Introduction
Our age is characterized by major changes that affect not only our everyday lives and our tools but also our machine analysis and data processing. In scientific life, a significant part of the published results is produced by some method assisted by artificial intelligence and then analysed [1]. It had to be examined whether higher-level mathematical models describe railway track degradation more effectively than the various simpler prediction procedures used so far [2], [3]. Designing such a model requires a procedure that predicts with good certainty the change in the state of the track geometry under a given traffic load [4]. This is necessary because intervention timing based on correct and reliable data would significantly improve cost-effectiveness at the national level. Several Hungarian [5] and international [6] publications have long drawn attention to the fact that the cost of construction works, be it new construction or maintenance, can be reduced with thorough preparation and proper organization. For this prediction to be made with sufficient certainty, however, reliable data is needed: data whose accuracy and magnitude of error are known. This short article summarizes the procedure used to determine the measurement accuracy of a measuring train.

Theoretical background
The deterioration process of railway track geometry is influenced by several factors, mainly the traffic passing over the track and non-negligible environmental impacts. Deterioration of a track without intervention can be described by an exponential function of the quantity αmv² (see Figure 1), where:
• C is the general parameter characterizing the geometric condition of the railway track,
• C₀ is the general parameter characterizing the initial geometric condition of the railway track,
• α is the "track sizing" factor depending on the track structure.
In the deterioration process shown in Figure 1, the values of C are considered over the range 0 to 1 of the αmv² axis. According to the Hungarian regulations, αmv² = 0 is the construction state, and the geometry cannot reach the state 1 value because an intervention must precede it.

Figure 1: The exponential deterioration of the rail geometry according to Dr. Pál Vaszary [7]

From a technical point of view, the deterioration of the general geometric quality can be attributed primarily to the effect of railway trains on the track, and secondly to the effects of weather and on-site circumstances. The condition of the railway substructure and superstructure deteriorates continuously, both geometrically and structurally, from the date of construction as a result of rolling traffic loading. Rails are worn by wheels; fastenings are loosened by the downward and upward movements in the load cycles. The sleepers are increasingly pressed into the ballast; the crushed stone bedding grains are pressed into the additional layer or into the earthwork; and until the subsoil reaches final consolidation, the earthwork also sinks. As all these effects superimpose, different dimensional deviations and then position errors develop in the track. The relative geometry of the railway track is characterized by three quantities: longitudinal level, alignment and track twist. The units of measurement of these numbers are the distortion-free measurement results calculated in the original wavelength range according to MSZ EN 13848-2.
The general track geometry classification is based on qualification lengths; station spacings, lines and line networks are classified on the same basis. Several evaluation methods use the measured data, which are collected every 0.25 m: area-based evaluation, error-maximum (peak-to-peak) evaluation, and projection-based evaluation. During the general qualification, the measuring car forms measuring numbers and qualification numbers for the qualification lengths, which are 500 m in this work. In general, the symbol of the longitudinal level is Z (the average of the values measured on the left and right rails), that of the alignment is Y (the average of the values measured on the left and right rails), and that of the track twist is X (see fig. 2). The measuring numbers are: (2) • Alignment =

Analytical program
Besides the evaluation graphs and tables, railway engineers can also use MS Excel to analyse all the geometrical evaluation data of the railway track. It was therefore clear that any new program had to be created in Visual Basic for Applications. Other programs were also used, such as MATLAB, which is both a program system developed for numerical calculations and a programming language. The software system, developed by The MathWorks, is capable of performing matrix calculations, representing functions and data, implementing algorithms, and creating user interfaces.
Although the software is purely numeric, it can also display mathematical expressions graphically by adding the MuPAD package. It also assisted in the artificial neural network procedure. MATLAB is able to work with and use data from a well-prepared Excel workbook. A procedure (see fig. 3) was developed with which the existing data table can be easily analysed and modelled, and whose results can also be clearly interpreted by others. The basic idea was to find characteristic sections suitable for the purpose of the study and then to analyse the geometric deterioration by comparing different measuring and qualification numbers. As the number of subtasks to be solved increased, a whole set of procedures was finally born as a result of our work. A computer procedure was created that automatically processes, analyses and uses the measuring and qualification numbers provided by the FMK-004 measuring car. It:
• collects the specified number of semi-annual data sets from sections which have not been repaired,
• fits linear, exponential, logarithmic and power regression functions to the collected data sets,
• collects and organizes the function parameters and the coefficient of determination of each regression equation, and writes the equation specific to the given line for all four types, which is plotted on a graph,
• plots the parameters of the equations separately on a histogram, where new ranges can be selected according to the distribution, for which it recalculates the parameters of the given function type,
• calculates the correlation between the selected measuring and qualification numbers, i.e. the magnitude of the linear relationship between two values,
• calculates the deviation of the missing values or outliers from the regression equation fitted to the data set,
• calculates the time difference function and spare time function of the selected lines,
• summarizes the predictions made by the seven models [8] and analyses their accuracy based on the root mean square deviation.
This program can serve engineers with programming knowledge as a real processing system for industrial use. Because it is open-source, there is scope for development and customization.
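The regression step of the procedure listed above can be illustrated with a minimal Python sketch. It is not the VBA program described in this article: the function names, the linearising log transforms and the sample data are assumptions chosen for illustration, and the coefficient of determination is computed here on the original scale of the data.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(ys, preds):
    """Coefficient of determination of predictions against observations."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def fit_all(xs, ys):
    """Fit four function families via linearising transforms (xs, ys > 0)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    models = {}
    a, b = linear_fit(xs, ys)            # y = a + b*x
    models["linear"] = ((a, b), lambda x, a=a, b=b: a + b * x)
    a, b = linear_fit(xs, ly)            # ln y = a + b*x
    models["exponential"] = ((math.exp(a), b),
                             lambda x, a=a, b=b: math.exp(a + b * x))
    a, b = linear_fit(lx, ys)            # y = a + b*ln x
    models["logarithmic"] = ((a, b), lambda x, a=a, b=b: a + b * math.log(x))
    a, b = linear_fit(lx, ly)            # ln y = a + b*ln x
    models["power"] = ((math.exp(a), b),
                       lambda x, a=a, b=b: math.exp(a) * x ** b)
    return {name: {"params": params,
                   "r2": r_squared(ys, [f(x) for x in xs])}
            for name, (params, f) in models.items()}


# Hypothetical qualification numbers for six half-years (not measured data).
half_years = [1, 2, 3, 4, 5, 6]
sad_values = [40.0 * math.exp(0.05 * t) for t in half_years]
fits = fit_all(half_years, sad_values)
```

Each entry of `fits` holds the two parameters of the fitted function and its coefficient of determination, which is the information the procedure collects for every line before plotting.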
The running of the program can be observed at slow speed, for demonstration purposes, at the links given as QR codes in fig. 4.
It was necessary to determine the extent to which alignment, level and twist determine the magnitude of the SAD rating in these cases. This issue was analysed on the national network with another program written for this purpose. After the examination, 1402 of the 14434 sections of 500 m length remained in which both the measuring and the qualification numbers increased for 6 consecutive half-years, i.e. on these sections the track deterioration is monotonic in the given period.
This article attempts to answer the question of why the value of one of the three measuring numbers decreases in the remaining 13032 sections of 500 m.
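The selection described above can be sketched in Python as follows. The function names, the use of a strict inequality and the sample series are assumptions made for illustration; the actual program was written in VBA and may apply different rules.

```python
def strictly_increasing(series):
    """True if every value is larger than the one before it."""
    return all(b > a for a, b in zip(series, series[1:]))


def classify_section(measuring, qualification):
    """Classify one 500 m section from six half-yearly values.

    measuring: dict mapping "Z", "Y", "X" to lists of measuring numbers.
    qualification: list of qualification (SAD) numbers.
    """
    if not strictly_increasing(qualification):
        return "excluded"      # intervention cannot be ruled out
    if all(strictly_increasing(v) for v in measuring.values()):
        return "monotonic"     # all measuring numbers deteriorate together
    return "anomalous"         # SAD increases, a measuring number decreases


# Hypothetical section: Y drops in the fourth half-year while SAD keeps rising.
section = {
    "Z": [2.0, 2.1, 2.3, 2.4, 2.6, 2.7],
    "Y": [1.5, 1.6, 1.8, 1.7, 1.9, 2.0],
    "X": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5],
}
sad = [40, 42, 45, 46, 49, 51]
label = classify_section(section, sad)

# Share of sections in which a measuring number decreases, from the
# section counts reported in the text.
anomalous_share = (14434 - 1402) / 14434
```

With the counts given in the text, 14434 - 1402 = 13032 sections fall into the anomalous group, which is about 90.3% of the examined sections.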
Programming was again used to conduct the study. The first really interesting question was how much weight the different measuring numbers carry in the rating. Then the anomaly of time series showing geometrical improvement without regulating intervention had to be analysed, since the geometric condition of the track cannot improve "by itself".
The calculations of the dominance studies did not explain why the measuring numbers are able to decrease even though no work was done on the line section during the study period.
It is important to note that the measuring car is calibrated as required in all cases. Furthermore, the measurement inaccuracies published in this article are determined from the data of the entire national network and do not take into account the differences in the condition of the individual railway sections. On the other hand, it can be clearly demonstrated at the national network level that in 90% of cases, for some reason, there is a series of measuring numbers whose value decreases compared to the previous half-year. It is important to emphasize that the examined data belong to the whole national network, and each of the 13032 data series consists of six half-yearly measuring and qualification numbers for a 500 m section, where the SAD number always increases while one of the measuring numbers decreases compared to the previous half-year.
Furthermore, it was important to determine how much the improving value deviated from the value calculated by linear regression. For this reason another program was written that calculates the difference between the measuring number and the linear regression value from the averages before and after the decreasing measuring number. Where the decreasing measuring number is at the beginning or end of the timeline, the expected rate of deterioration is extrapolated backwards from the next two values or forwards from the two preceding values. This is how the difference could be calculated. It should be noted that the outliers were not considered when constructing the descriptive equation by the regression procedure.
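The deviation calculation can be illustrated with a short Python sketch. It simplifies the procedure described above: a single least-squares line is fitted to all points except the outlier, so the special handling at the beginning and end of the timeline is replaced by ordinary extrapolation. The function names, the sample series and the definition of the relative deviation (deviation divided by the observed value) are assumptions.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def deviation_from_trend(values, outlier_index):
    """Compare one value with the trend line fitted to all other values.

    Returns (expected, deviation, relative_deviation).
    """
    xs = [i for i in range(len(values)) if i != outlier_index]
    ys = [values[i] for i in xs]
    intercept, slope = linear_fit(xs, ys)   # the outlier is left out of the fit
    expected = intercept + slope * outlier_index
    observed = values[outlier_index]
    deviation = expected - observed
    return expected, deviation, deviation / observed


# Hypothetical six half-year series; the fourth value decreases
# although no intervention took place.
series = [10.0, 11.0, 12.0, 11.7, 14.0, 15.0]
expected, deviation, relative = deviation_from_trend(series, 3)
```

In this constructed example the remaining five points lie on a straight line, so the expected value at the fourth half-year is 13.0 and the deviation of the observed 11.7 is 1.3.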
Further studies then clearly showed at the national network level that in 90% of cases there is, for some reason, a series of measuring numbers whose value declines compared to the previous half-year without intervention. For the examined data sets, see Figure 5 and Table 1. It is worth dwelling on the result that one of the measuring numbers (Y, X or Z) decreases without any intervention in altogether 90% of these 500 m sections. Thus, fewer than 10% of the data sets without intervention show all measuring numbers increasing together.
The multiplicity and the ratio have already been shown; now let us look at the magnitude of the difference. The difference of the improving value from the value calculated by linear regression had to be determined. In contrast to the theoretical exponential model, linear regression was used because the periods are short (max. 6 half-years), where the linear approximation can be quite accurate. For this reason the program that calculates the difference from the average before and after the decreasing measuring number was written. The expected degree of deterioration is given in advance, and the difference is calculated against it. The outlier was not considered when creating the descriptive equation generated by the regression procedure.
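The argument that a straight line approximates an exponential trend well over six half-years can be checked numerically. The following Python sketch uses a generic exponential of the form c0·exp(k·t) with hypothetical parameters; it is not the deterioration function of this article, and the growth rate k = 0.05 per half-year is an assumption.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def max_relative_gap(c0, k, n_points=6):
    """Largest relative gap between an exponential series and its linear fit."""
    ts = list(range(n_points))
    cs = [c0 * math.exp(k * t) for t in ts]
    a, b = linear_fit(ts, cs)
    return max(abs(a + b * t - c) / c for t, c in zip(ts, cs))


# Hypothetical parameters: initial value 40, growth rate 0.05 per half-year.
gap_short = max_relative_gap(40.0, 0.05, n_points=6)
# The same trend over 28 half-years, for comparison.
gap_long = max_relative_gap(40.0, 0.05, n_points=28)
```

Under these assumptions the gap over six points stays below one percent, while over a much longer series the linear fit departs visibly from the exponential, which is consistent with restricting the linear regression to short periods.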
Figures 6-8 show the difference between the given measuring number and the linear regression calculation. The diagrams show the values of the measuring number on the horizontal axis and the determined deviation from the regression approximation on the vertical axis. The histograms show the number of occurrences of each error size. The results for all three measuring numbers are reported in this way.

Figures 9-11 show the results plotted according to a different idea. The deviation is expressed as a percentage of the given measuring number, because this makes the data easier to analyse. Furthermore, the percentage error rates can thus be compared to the condition of a given 500 m section of a given railway line.

The deviation of the alignment (Y) from the regression fit shows a lognormal distribution, with an expected value of 10%.

Figure 11: Track twist (X) deviation from the regression fit, with descriptive statistics

The deviation of the track twist (X) from the regression fit shows a lognormal distribution, with an expected value of 9% (expected value: 1.093, median: 1.075, standard deviation: 0.067).

In this part of the research, examining the entire national network over the 27 half-years, it turns out that in almost 90% of the cases in which the absence of intervention is assumed on the basis of a monotonic increase in the qualification number, at least one of the indicators shows improvement. The measurement accuracies were determined by processing the measuring-number changes of this improving nature.

Conclusion and comparison of results with the literature
The measurement accuracies determined are:
• Longitudinal level (Z) accuracy: 11.7%
• Alignment (Y) accuracy: 10.3%
• Track twist (X) accuracy: 9.3%
These numbers characterize the geometrical measurement accuracy of the FMK-004 measuring car.
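The track twist accuracy of 9.3% agrees with the expected value of 1.093 reported for the deviation ratios, which suggests that the accuracy is the mean ratio minus one. Under that assumption, the descriptive statistics can be sketched in Python as follows; the function name and the sample ratios are hypothetical and are not data from the study.

```python
import math
import statistics


def describe_ratios(ratios):
    """Descriptive statistics of deviation ratios (all ratios must be > 0)."""
    logs = [math.log(r) for r in ratios]
    mu = statistics.fmean(logs)          # mean of the log-ratios
    sigma = statistics.stdev(logs)       # sample std. dev. of the log-ratios
    mean = statistics.fmean(ratios)
    return {
        "expected_value": mean,
        "median": statistics.median(ratios),
        "standard_deviation": statistics.stdev(ratios),
        # Mean of a lognormal distribution with parameters mu and sigma.
        "lognormal_mean": math.exp(mu + sigma ** 2 / 2),
        # Assumed reading: accuracy is the mean ratio minus one, in percent.
        "accuracy_percent": (mean - 1.0) * 100.0,
    }


# Hypothetical deviation ratios for a handful of sections.
sample_ratios = [1.05, 1.07, 1.08, 1.10, 1.15]
summary = describe_ratios(sample_ratios)
```

Comparing `expected_value` with `lognormal_mean` gives a quick indication of how well a lognormal description matches the sample.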
In the light of the results, more accurate and reliable data will be available from the FMK-004 measuring car, which may also improve the accuracy of track deterioration forecasts. No research was found in the literature that tries to answer the question of the weight of the measuring numbers in the SAD rating by processing such a large amount of data. There are articles in the international literature, including [9], [10], [11], which deal with similar topics and calculations. However, because each country evaluates and analyses its own railway network differently, although with similar evaluation data (alignment, longitudinal level, twist), it was not possible to find any research comparable to our results.
Track twist (X) deviation values from the regression fit, descriptive statistics