Optimization of Decision Tree Machine Learning Strategy in Data Analysis

To address the overly coarse matching and low prediction accuracy of machine learning decision tree models in data mining, corresponding improved optimization strategies are proposed. First, the field matching degree of the data is improved by discretizing continuous attributes into multiple intervals. Then, the selection of business attributes during the model's downward splitting is made more reasonable by compensating the weights of feature attributes with a business sensitivity indicator. Finally, data classification rule transformation is used to further improve the model's prediction accuracy. The experimental results show that the tree model generated with the business sensitivity index is more concise and has stronger business pertinence and data classification capability. The results also show that, compared with traditional optimization algorithms, the transformed and upgraded data classification rules effectively improve the accuracy of data prediction.


Introduction
Data mining and data analysis based on the decision tree model are an important application direction of current big data [1][2][3][4][5]. Most algorithms and improvement strategies in existing decision tree research are limited to specific scenarios; their scope of application is narrow, and many theories remain immature [6][7][8][9], with shortcomings in practical applications. Moreover, in many research theories the accuracy of data prediction is generally not high and the data classification ability is low, so data mining cannot truly grasp the essence of things [10][11][12][13][14]. Finding a mature, reliable, high-performance data prediction algorithm is therefore an important research direction in the field of data applications.
In discussing the discretization of feature attributes, this paper proposes a probabilistic method for discretizing continuous attributes that divides each continuous attribute into multiple reasonable sub-intervals, making up for the shortcomings of existing discretization algorithms. By analyzing the core degree of feature attributes, a calculation method is proposed that measures the weight of each feature attribute through a correlation index.
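The paper does not reproduce its probabilistic partitioning formula, but the idea of splitting a continuous attribute into several sub-intervals can be illustrated with a common stand-in: equal-frequency (quantile) binning, where each sub-interval holds roughly the same share of the observed data. The function name and bin count below are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def discretize_multi_interval(values, n_bins=4):
    """Split a continuous attribute into n_bins sub-intervals using
    equal-frequency (quantile) cut points, so each sub-interval holds
    roughly the same fraction of the observed data."""
    values = np.asarray(values, dtype=float)
    # Interior quantile cut points; the 0% and 100% quantiles are
    # dropped so the outermost sub-intervals stay open-ended.
    edges = np.quantile(values, np.linspace(0, 1, n_bins + 1))[1:-1]
    # searchsorted maps each value to the index of its sub-interval.
    return np.searchsorted(edges, values, side="right")

# Example: discretize a continuous "age" attribute into 4 intervals.
ages = [18, 22, 25, 31, 35, 42, 50, 61]
bins = discretize_multi_interval(ages, n_bins=4)
```

Each value is replaced by a small integer interval label, turning the continuous attribute into a discrete one that a decision tree can split on directly.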
In the remainder of this paper, we first introduce decision tree data classification and feature attribute discretization, and then analyze the attribute sensitivity of the experimental data. Finally, two typical tests are carried out to validate the performance of the proposed method.

Decision tree data classification and feature attribute discretization
The decision tree originated from the computer programming structure and is an excellent classification model in the field of data analysis. The model can be used for machine learning on native data to obtain the laws and characteristics of the data, and in this way it can predict the outcome and trend of unknown data. The decision process of tree creation represents the analysis and processing of the problem. In the process of creating a decision tree, the concepts of entropy and information gain from information management are used as the basis for tree generation. In essence, the decision tree data classification process proceeds through induction, learning, and reasoning over the training data set, and the tree model grows through classification prediction.
To determine whether a decision tree model truly has data prediction ability, it must be tested with unknown data. Generally speaking, the decision tree model generation process includes training samples and verification samples. The verification samples are mainly used to test the prediction ability of the model; if the prediction performance is poor, the prediction rules must be corrected. A good decision model must not only discover the data rules of the training examples but also accurately classify the test examples.
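The training/verification split described above can be sketched with a deliberately tiny learner: a one-level rule table (a decision stump) trained on one sample set and scored on a held-out set. This is a simplified stand-in for the paper's full tree model; all names are illustrative.

```python
from collections import Counter, defaultdict

def learn_rules(rows, labels, attr_index=0):
    """Learn one-level classification rules: map each attribute value
    to the majority class observed for it in the training samples."""
    by_value = defaultdict(list)
    for row, label in zip(rows, labels):
        by_value[row[attr_index]].append(label)
    return {v: Counter(ls).most_common(1)[0][0] for v, ls in by_value.items()}

def validation_accuracy(rules, rows, labels, attr_index=0, default="?"):
    """Fraction of held-out verification samples the rules classify
    correctly -- the test of real prediction ability."""
    hits = sum(rules.get(row[attr_index], default) == label
               for row, label in zip(rows, labels))
    return hits / len(labels)

# Training samples define the rules ...
train_rows = [("A",), ("A",), ("B",), ("B",)]
train_labels = ["yes", "yes", "no", "no"]
rules = learn_rules(train_rows, train_labels)

# ... verification samples measure how well they generalize.
test_rows = [("A",), ("B",), ("B",)]
test_labels = ["yes", "no", "yes"]   # one noisy held-out sample
acc = validation_accuracy(rules, test_rows, test_labels)
```

A rule set that fits the training data perfectly can still misclassify verification samples, which is precisely the signal that the prediction rules need correction.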

Attribute sensitivity analysis of experimental data
Using information gain to select attributes for downward splitting tends to favor attributes with more values. In many cases, attributes with more values are not the core attributes, which causes the predictions of the generated decision tree to differ from the actual results. In response to this situation, literature [8] describes a method of using the information gain rate to select attributes; however, it is not efficient when processing large data sets. In this paper, we consider introducing business sensitivity indicators to measure each characteristic attribute.
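The information gain rate mentioned above (the C4.5-style correction) divides an attribute's gain by its intrinsic "split information", which is large for many-valued attributes. A minimal sketch of that correction, with illustrative function names:

```python
from collections import Counter
from math import log2

def split_info(values):
    """Intrinsic information of a split: the entropy of the attribute's
    own value distribution (large when the attribute has many values)."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def gain_ratio(gain, values):
    """C4.5-style gain rate: information gain divided by split info,
    penalising attributes that fragment the data into many values."""
    si = split_info(values)
    return gain / si if si > 0 else 0.0

# Two attributes with identical raw gain: the 4-valued one is
# penalised relative to the 2-valued one.
two_valued = gain_ratio(1.0, ["a", "a", "b", "b"])   # split info = 1 bit
four_valued = gain_ratio(1.0, ["a", "b", "c", "d"])  # split info = 2 bits
```

The correction counters the bias toward many-valued attributes, at the cost of the extra split-info computation per candidate attribute, which is the efficiency concern the text raises for large data sets.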
The business sensitivity index s is a variable in the interval [0,1]. Its main function is to serve as a correction value that strengthens the weight of important attributes and reduces the weight of non-important attributes, so that during decision tree generation, many-valued non-important attributes do not cover up important attributes with fewer values. The value of this variable mainly depends on whether the split attribute is a core attribute: if it is a core attribute, its value is relatively small, and if it is a non-core attribute, its value is relatively large. Hence, the weight of an attribute is inversely proportional to this variable. The concept of association degree can be used to measure whether a split attribute is a core attribute: the greater the association degree, the greater the core degree of the split attribute. The association degree r is measured over the attribute's value range v_i; the greater r is, the stronger the relationship between the attribute and the employment outcome. Based on this correlation calculation r, we can define the core correlation value of a split attribute k, which gives the business sensitivity index s.
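The paper's exact correction formula for s is not reproduced in the extracted text, but the stated properties (s in [0,1], small for core attributes, weight inversely proportional to s) can be sketched with an assumed (1 - s) weighting factor. The factor and all names below are assumptions for illustration only.

```python
def sensitivity_adjusted_gain(gain, s):
    """Weight an attribute's information gain by its business
    sensitivity index s in [0, 1]: core attributes have small s
    (weight kept high), non-core attributes have large s (weight cut).
    The (1 - s) factor is an assumed form of the correction -- the
    paper's exact formula is not reproduced here."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("sensitivity index must lie in [0, 1]")
    return gain * (1.0 - s)

def pick_split(candidates):
    """candidates: (attribute_name, raw_gain, s) tuples.
    Returns the attribute with the best sensitivity-adjusted score."""
    return max(candidates,
               key=lambda c: sensitivity_adjusted_gain(c[1], c[2]))[0]

# A many-valued non-core attribute ("id_number", high raw gain but
# high s) loses to a core business attribute ("salary", lower gain
# but small s).
best = pick_split([("salary", 0.8, 0.2), ("id_number", 0.95, 0.9)])
```

This reproduces the behavior the text describes: the correction prevents a high-gain but non-core attribute from winning the split purely because it has many values.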

Experiments
The self-coded decision tree model is used as a prototype tool to classify and test sample data of job applicants. In the experiment, three model structures are used for classification testing: (1) the original model, without any optimization processing; (2) an improved model incorporating the business sensitivity indicators; (3) the ultimate model incorporating both the business sensitivity indicators and the modified data classification rules. The classification output of the test samples on the three models is shown in Fig. 1 and Fig. 2. It can be seen from the output that when the sample size is still relatively small, the classification accuracy of the three models fluctuates greatly and cannot reflect their actual classification ability. As the number of experimental samples increases, the data classification capabilities of the three models gradually emerge; in particular, once the sample volume reaches a certain level, the accuracy of each model stabilizes within a certain range, and the ranking of classification ability becomes clear: ultimate model > improved model > original model. Therefore, the experimental results prove the effectiveness of the proposed optimization strategy for the machine learning decision tree model.

Conclusion
This paper improves the data prediction ability of the decision tree in several respects: data classification, sensitivity index analysis, and classification rule modification. Experimental analysis has verified the effectiveness of the optimization and improvement strategy. Compared with the traditional decision tree ID3 algorithm, the classification rules are improved and wrong classification rules are discarded, so the classification ability is strengthened and the data classification accuracy is higher. Compared with the traditional decision tree C4.5 algorithm, which can only divide a continuous attribute at a single threshold, this optimization scheme discretizes continuous attributes into multiple data intervals, so its discretization of continuous attributes is more reasonable and accurate and can meet broader data processing needs. How to further improve the accuracy of data prediction is the direction of future research on this subject.