Research on Physical Fitness Test Data Mining and Analysis Based on Apriori Algorithm

Based on an analysis of the characteristics of the Apriori algorithm, this paper uses an iterative, layer-by-layer search to find the relationships among item sets in the students’ body measurement database and to form a mining model of association rules. The model maximizes the generation of candidate sets, reduces the size of frequent sets, and achieves good performance. At the same time, the mean GM(1,1) model (EGM) is constructed from the accumulation of the original sequence, the least-squares estimation of the parameters, and the whitening differential equation form, in order to predict the future trend of college students’ physique. The research shows that the error of the model is small and its precision is high.


Introduction
College students are the main force of social development. In-depth study and analysis of data mining results gives a more comprehensive understanding of the correlations among the items of college students' physique and health tests. This is of great practical significance for improving the physical health of college students, promoting physical education in colleges, improving the quality of teaching, and guiding the construction of indoor and outdoor venues.

Index system construction
Based on the 2015-2018 physical test data of Chongqing University of Education, 12255 samples were taken to construct the physical test index system. Invalid data were eliminated and descriptive statistics were computed on the remaining data. From table 1, the trends in the yearly averages of the female students' indices are: the 50m run and the sit-and-reach gradually decrease; the sit-up gradually increases; height, weight, vital capacity, standing long jump and 800m run oscillate. From table 2, the trends for the male students are: the standing long jump gradually decreases; the pull-up and vital capacity generally increase; height, weight, 50m run, sit-and-reach and 1000m run oscillate, with no clear trend.

Apriori algorithm of association rules data analysis
The Apriori algorithm uses candidate item sets to find frequent item sets and is the most influential algorithm for mining the frequent item sets of Boolean association rules. For the female students, the minimum rule confidence is set to 80%, and rules with a gain less than 1 are eliminated. Finally, the results are visualized as a network graph for overall analysis, as shown in table 3 and figure 1. The purpose of this paper is to study how the indices promote one another, in order to improve the physical quality of students. For example, rule 3 states that, given a good 800m run grade, the probability of passing the 50m run is 91.95%, which shows that the 800m run can promote the 50m run. The thickness and darkness of the connecting lines in the figure represent the strength of the connection. It can be seen intuitively that, apart from the sit-and-reach, the girls' indices are strongly connected. Repeating the above operations gives the associations among the boys' indices, shown in table 4 and figure 2. It can be seen intuitively that the boys' standing long jump, 50m run and 1000m run are strongly connected.
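The layer-by-layer search and the confidence and gain filters described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool used in the paper: the function names `apriori` and `derive_rules`, the grade labels, the toy records and the `min_support` value are all hypothetical, and the paper's "gain" is assumed here to mean the usual lift, confidence(A → B) / support(B).

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent item sets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        # Join step: unions of frequent (k-1)-item sets that have k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def derive_rules(frequent, min_confidence=0.8, min_lift=1.0):
    """Return (antecedent, consequent, confidence, lift) tuples."""
    found = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                consequent = itemset - antecedent
                confidence = s / frequent[antecedent]
                # Assumption: "gain" is read as lift.
                lift = confidence / frequent[consequent]
                if confidence >= min_confidence and lift >= min_lift:
                    found.append((antecedent, consequent, confidence, lift))
    return found


# Hypothetical toy records, NOT the paper's data: one set of grade labels per student.
records = (
    [{"800m=good", "50m=pass"}] * 5
    + [{"800m=good", "50m=fail"}]
    + [{"800m=pass", "50m=fail"}] * 2
)

frequent_sets = apriori(records, min_support=0.3)
for ante, cons, conf, lift in derive_rules(frequent_sets):
    print(sorted(ante), "->", sorted(cons), f"confidence={conf:.2%}", f"lift={lift:.2f}")
```

The lookups `frequent[antecedent]` and `frequent[consequent]` are safe because every subset of a frequent item set is itself frequent, which is the same property the prune step relies on.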

Grey prediction model
Because the body measurement data of male and female students are characterized by small samples, poor information and many indices, the grey prediction EGM model is more suitable. Take the girls' 50m run as an example:
• Restore the calculated simulated values.
• Calculate: 9.640, 9.784, 9.930, 10.074
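As a rough illustration of the steps named in the abstract (accumulation of the original sequence, least-squares estimation of the parameters, the whitening equation, and restoration of the simulated values), the following Python sketch fits a mean-form GM(1,1) model. It is not the paper's implementation: the function names are hypothetical, the `times_50m` series is invented sample input rather than the paper's data, a nonzero development coefficient `a` is assumed, and the mean relative error is averaged over the second point onward, which is one common convention.

```python
import math


def fit_gm11(x0):
    """Estimate (a, b) in x0(k) + a*z(k) = b by least squares."""
    if len(x0) < 3:
        raise ValueError("need at least three observations")
    # Accumulated sequence x1(k) = x0(1) + ... + x0(k).
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    # Ordinary least squares for y = slope * z + b, with slope = -a.
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    a = -slope
    b = (sy - slope * sz) / m
    return a, b


def predict_gm11(x0, a, b, steps=0):
    """Return simulated values for the sample plus `steps` forecasts."""
    n = len(x0) + steps
    # Time response of the whitening equation dx1/dt + a*x1 = b (assumes a != 0).
    x1_hat = [(x0[0] - b / a) * math.exp(-a * k) + b / a for k in range(n)]
    # Restore the simulated values by inverse accumulation.
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n)]


def mean_relative_error(actual, fitted):
    """Mean relative error over k >= 2 (the first point is fitted exactly)."""
    errors = [abs(x - f) / abs(x) for x, f in zip(actual[1:], fitted[1:])]
    return sum(errors) / len(errors)


# Hypothetical yearly mean 50m times in seconds, NOT taken from the paper.
times_50m = [9.80, 9.70, 9.65, 9.50]

a, b = fit_gm11(times_50m)
simulated = predict_gm11(times_50m, a, b, steps=3)
print("a =", round(a, 5), "b =", round(b, 4))
print("simulated and forecast values:", [round(v, 3) for v in simulated])
print("mean relative error:", round(mean_relative_error(times_50m, simulated), 5))
```

With four observations and `steps=3`, the last three entries of `simulated` play the role of a three-year forecast, matching the 2019-2021 horizon used for the tables in the next section.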

Prediction analysis
Repeating the above steps gives the predicted values of the other indicators and the corresponding mean relative errors, as shown in table 5 and table 6. Height, weight, vital capacity and sit-ups show a slow growth trend; the standing long jump and sit-and-reach gradually decrease; the 50m and 800m times gradually shorten. From table 6, it can be seen that, except for vital capacity, whose mean relative error is larger than 1%, the mean relative errors of the index predictions are smaller than 1%. This shows that the EGM model is well suited to predicting the development trend of physical fitness. By analogy, the predicted values and mean relative errors of the boys' indicators are shown in table 7 and table 8:
Table 7. 2019-2021 prediction of the average value of each physical measurement index (boys)
Height, weight, vital capacity and sit-and-reach show a slow growth trend; the pull-up increase is large; the standing long jump is gradually reduced; the time of 50m is gradually shortened, but the