Machine Learning-based Investigation of the Influencing Factors on the Hardness of Steel Materials

The hardness of steel is often regarded as a crucial indicator for its production and application. The influences of chemical composition and process parameters on the hardness of steel are strongly non-linear, making it challenging to predict hardness accurately through traditional multivariate regression or orthogonal experiments. Although machine learning has achieved notable success in diverse applications, its use in studying metal materials has emerged only recently. Inspired by Sunčana et al.'s work, we select the Jominy distance as an input variable representing the chemical composition, and instead of their elaborate search for the best artificial neural network architecture, we simply use a Generalized Regression Neural Network (GRNN); owing to the heterogeneity of the data, a two-step clustering method is employed as a data pre-processing step. The predicted hardness values were obtained using leave-one-out cross-validation, and the optimal smoothing factor (spread) was selected based on the Root Mean Square Error (RMSE) criterion. With the optimal settings, the RMSEs of the two types of data were 62.41 HV and 20.51 HV for Configuration I, and 66.38 HV and 29.51 HV for Configuration II, showing that Configuration I was more successful than Configuration II. This research presents a novel approach for predicting the hardness of metal materials using GRNN, designed to address the need for quick adaptation to changes in design methods and the increasing demand for high-quality manufactured products.


Introduction
A precise understanding of the relationship between composition, processing, microstructure, and performance has become a fundamental challenge in the field of metal structural materials science and engineering. Composition and processing dictate the microstructure, which in turn determines the performance. This golden rule connects materials science with materials processing science, leading to the development of new processes and theories. In recent years, machine learning has emerged as a powerful tool, advancing from its theoretical foundations in the 1950s to the current level of deep learning. It is widely applied across various fields, including the prediction of steel hardness in this study [1-3].
Steel is one of the most commonly used materials worldwide, and China is currently the largest producer of steel globally [4]. Hardness is a fundamental mechanical property of materials with significant application value. From a microstructural perspective, the hardness of steel depends primarily on its chemical composition and is determined by the heat treatment structure of the alloy. Quenching, a crucial step in heat treatment, plays a key role in the formation of the martensitic structure, which is closely related to steel hardness. The martensite content is determined by several factors, including the cooling rate during heat treatment and the temperature at which austenite transforms to martensite. Initial studies on predicting steel hardness used the composition as the main input variable and primarily employed mathematical statistical methods such as multiple regression analysis. Hollomon and Jaffe [5] were the first to propose an equation for predicting tempered hardness, involving tempering time, temperature, and a coefficient C determined by chemical composition. The value of C, typically considered a constant, was obtained through numerous experiments and mathematical statistics based on experimental data. For example, Säglitz et al. [6] suggested a C value of 11.8 for low carbon steel, while Kamp et al. [7] proposed a value of 20. Building on these approaches, Kang and Lee [8] introduced the Composition-Dependent Tempering Parameter, in which the C value is no longer considered a constant but is jointly determined by the chemical composition of the alloy steel, improving the prediction accuracy of the tempering parameter (TP) equation. This indicates that the effect of the chemical composition of the alloy steel is essential in accurately predicting steel hardness.
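As a worked illustration (our own sketch, not code from the cited works; it follows the standard Hollomon-Jaffe convention of temperature in kelvin and time in hours), the tempering parameter can be computed as:

```python
import math

def hollomon_jaffe(temp_celsius, time_hours, c=20.0):
    """Hollomon-Jaffe tempering parameter: HP = T_K * (C + log10(t)).

    temp_celsius: tempering temperature in deg C
    time_hours:   tempering time in hours
    c:            composition-dependent coefficient, e.g. ~20 (Kamp et al.)
                  or ~11.8 for low carbon steel (Saeglitz et al.)
    """
    t_kelvin = temp_celsius + 273.15
    return t_kelvin * (c + math.log10(time_hours))

# A 1-hour temper at 600 deg C with C = 20: HP = 873.15 * (20 + 0) = 17463.0
hp = hollomon_jaffe(600.0, 1.0, c=20.0)
```

Kang and Lee's refinement amounts to replacing the fixed `c` with a function of the alloy composition.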
Machine learning in the form of artificial neural networks (ANNs) has become increasingly popular for predicting the composition-hardness relationship in steel [9-12]. It is worth noting that the experimental data obtained from previous works may vary owing to a lack of detailed processing parameters in modern steel manufacturing. Sunčana et al. [13] sought to improve the accuracy of steel hardness predictions using ANNs by incorporating a new input variable, the Jominy distance. Their study demonstrated that accurate predictions of steel hardness could be made using four variables: austenitizing temperature, austenitizing time, cooling time to 500 °C, and Jominy distance. By considering these variables, ANNs can effectively model the complex relationship between steel composition, heat treatment, and resulting hardness, helping optimize the heat treatment process to achieve the desired mechanical properties in the final product.
Vermeulen et al. [14] used feed-forward neural networks; the prediction model obtained through training with 4,000 data sets had an average error of only 2 HRC across the entire sample prediction compared to the actual values. ANNs have been widely applied as high-performance classifiers in the field of recognition due to their adaptive capabilities. The Generalized Regression Neural Network (GRNN), an essential branch of Radial Basis Function (RBF) neural networks, is a typical feed-forward local-approximation neural network based on nonlinear regression theory that approximates target functions by activating neurons.
Heretofore, numerous cases have demonstrated improving the performance of metal materials by adjusting composition or processing. However, experimental data reveal that cooling rates can differ by orders of magnitude, particularly when different processing techniques are used. When the heterogeneity of the raw data is not taken into account, directly applying machine learning methods such as neural networks may result in time-consuming and inaccurate predictions. For this reason, this paper employs a two-step clustering method to divide the original data into two categories, and uses GRNN for the prediction of the grouped data.
The paper is organized as follows. In Section 2, we introduce the data source and present the data pre-processing algorithm motivated by the heterogeneity of the raw data. Additionally, two configurations of input variables are adopted for each data set: Configuration I's input variables include element composition and process parameters, while Configuration II introduces the Jominy distance proposed by Sunčana et al. [13], replacing the element composition in Configuration I. In Section 3, after briefly introducing ANN model construction and GRNN, we present an improved GRNN to study the two types of data, and in Section 4 we compare Configuration I and Configuration II. In Section 5, we demonstrate the convenience, efficiency, and ease of use of the improved GRNN with its specific data pre-processing algorithm. Finally, the conclusion is given in Section 6.

Data source and preprocessing method
This study utilizes a total of 423 experimental data samples from Sunčana et al.'s work [13]; visualization of the data reveals an abnormal relationship between the variable t500 and hardness. Figure 1 shows data for one type of steel (15CrNi6), with the variable t500 representing the time required for the steel to cool down to 500 °C, indicative of the cooling rate. As depicted in the figure, with the other components unchanged, the order of magnitude of t500 varies significantly. When t500 reaches 85,621.78 s, the experimental hardness is 151.00 HV; when t500 reaches 41,493.26 s, the experimental hardness is 152.00 HV. Despite the considerable difference in t500 values, the change in experimental hardness remains insignificant.
Quenching is a common heat treatment process used to produce reliable steel components. Control of the cooling rate during quenching is crucial, especially for regulating hardness. In the experiments, the reason for the small change in steel hardness is that steel transforms to non-martensitic structures such as pearlite or bainite when the cooling rate is below the critical cooling rate, and these non-martensitic structures have lower hardness. Therefore, even over a large range of t500 values, slow cooling rates have only a slight effect on the amount of non-martensitic structure and hence on the steel hardness. When the cooling rate exceeds the critical cooling rate, steel transforms into the harder martensitic structure, significantly increasing its hardness. However, if the cooling rate is too high, uneven thermal expansion and contraction inside the workpiece may cause deformation or cracking. Therefore, a suitable cooling rate should be selected to achieve the desired performance of the steel.
In this scenario, direct prediction on all the data would produce significant errors for the GRNN. To address this, a two-step clustering approach (the SPSS TwoStep algorithm) was adopted for clustering analysis of the t500 data. Two-step clustering involves two stages. First, a clustering feature (CF) tree is built, with all the data initially placed in the root node; a distance measure serves as the similarity criterion to group similar records in the same tree node, and new nodes are generated for records with low similarity. Second, a merging (agglomerative) clustering algorithm combines the leaf nodes. This produces a range of candidate clustering schemes, and the Bayesian Information Criterion (BIC) is used to compare them and automatically determine the optimal number of clusters. The SPSS clustering results in Table 1 show that the t500 data can be divided into two categories with an average silhouette of 0.7, indicating good clustering quality. The first category contains 272 samples and the second 151 samples. Predicting these two categories separately reduces within-group differences relative to the original data, avoiding the influence of very large t500 values on GRNN training.
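The clustering above was run in SPSS; as a rough, illustrative analogue in Python (not SPSS's exact algorithm, and all names here are ours), one can compress the data with a CF-tree-based method (Birch) and use BIC over Gaussian mixtures to choose the number of clusters:

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.mixture import GaussianMixture

def two_step_cluster(t500_values, max_k=5):
    """Illustrative two-step-style clustering of 1-D t500 data.

    Stage 1 (inside Birch): build a CF tree of sub-clusters.
    Stage 2: merge leaf sub-clusters into the number of clusters that
    minimizes BIC (a stand-in for SPSS's BIC-based model selection).
    """
    # Log-scale, since t500 spans several orders of magnitude.
    x = np.log10(np.asarray(t500_values, dtype=float)).reshape(-1, 1)
    # Choose the number of clusters by BIC.
    bics = [GaussianMixture(n_components=k, random_state=0).fit(x).bic(x)
            for k in range(1, max_k + 1)]
    best_k = int(np.argmin(bics)) + 1
    # Birch builds the CF tree and merges leaf sub-clusters into best_k groups.
    labels = Birch(n_clusters=best_k, threshold=0.3).fit_predict(x)
    return best_k, labels
```

On data with a fast-cooling and a slow-cooling regime, this recovers two groups, mirroring the 272/151 split reported in Table 1.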

ANN model construction
The ANN models currently used in research are mainly BP neural networks, which suffer from several drawbacks, such as slow convergence, susceptibility to local minima, and insufficient accuracy. Overcoming these issues requires various techniques, such as increasing the number of hidden layers, adjusting the initial parameters, or exploring alternative methods. The RBF neural network alleviates these issues and has a strong ability to describe nonlinearity [15]. An RBF network consists of three layers: the first is the input layer, composed of signal source nodes; the second is the non-linear hidden layer, whose centers are determined by unsupervised learning and whose outputs are generated by radial basis functions (when the number of hidden-layer nodes is less than the number of training samples, the hidden layer forms a regularized network); the third is the output layer, which responds to input patterns. The GRNN is an improvement on the RBF network, as depicted in Figure 2: the final feed-forward layer of the RBF network is discarded and a summation layer is added, which performs both an arithmetic summation and a weighted summation, the output layer being the ratio of the two summation results [16]. GRNN is more efficient and simpler to implement than RBF.
The GRNN output is

Y^(X) = Σᵢ Yᵢ F(X, Xᵢ) / Σᵢ F(X, Xᵢ),  with F(X, Xᵢ) = exp(−‖X − Xᵢ‖² / (2σ²)),

where Yᵢ is the hardness of the i-th sample, Xᵢ is the corresponding feature vector, F is the radial basis function (in Gaussian form), and σ is the smoothing factor (spread).
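The formula above can be sketched in a few lines of NumPy (an illustrative implementation, not the exact code used in this study):

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, spread):
    """GRNN prediction: a Gaussian-kernel-weighted average of training targets.

    y_hat(x) = sum_i y_i * F(x, X_i) / sum_i F(x, X_i),
    F(x, X_i) = exp(-||x - X_i||^2 / (2 * spread^2)).
    """
    x_train = np.atleast_2d(np.asarray(x_train, dtype=float))
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))
    # Squared Euclidean distances between every query and training sample.
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * spread ** 2))              # pattern-layer activations
    # Weighted summation divided by arithmetic summation (the output layer).
    return (w @ np.asarray(y_train, dtype=float)) / w.sum(axis=1)
```

Note there is no iterative training: the network simply stores the training samples, so the only free parameter is the spread σ.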

Normalization processing
Due to the significant variations in chemical composition, processing methods, and specific Jominy distance ranges of steel, data normalization is necessary.
X′ = (X − X_min) / (X_max − X_min), where X represents the original value, and X_max and X_min are the maximum and minimum values of the data, respectively.
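A minimal sketch of this column-wise min-max scaling (our own helper, assuming each column is one variable):

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column to [0, 1]: X' = (X - X_min) / (X_max - X_min)."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return (x - x_min) / (x_max - x_min)
```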

Improved GRNN
GRNN is a non-parametric model with a single smoothing factor, the spread parameter σ, so a one-dimensional search is sufficient to optimize its value. In this article, we employ leave-one-out cross-validation for verification, and the approach proposed by Lin et al. [17] is adopted to find the optimal value of σ. The optimization problem is to minimize the root mean square error (RMSE), defined as

RMSE = sqrt( (1/n) Σᵢ (ŷᵢ − yᵢ)² ),

where ŷᵢ and yᵢ are the predicted and true values, respectively, and n is the number of data points. Leave-one-out (LOO) is a cross-validation method in which each sample is held out once: sample 1 is removed from the dataset, the model is trained on the remaining samples and predicts the output of sample 1, and the prediction is compared with the held-out value; sample 1 is then returned, sample 2 is removed, and so on until every sample has been predicted. Because it involves no random partitioning and makes maximal use of the samples, LOO evaluation is often accurate, and the resulting model is comparable to one trained on the entire dataset; its drawback is computational cost, which makes it impractical for large datasets. In this article, the dataset is divided into two categories of 272 and 151 samples, a moderate size for which LOO validation is well suited.
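The one-dimensional spread search under LOO RMSE can be sketched as follows (illustrative only; function names and the candidate grid are ours):

```python
import numpy as np

def loo_rmse(x, y, spread):
    """Leave-one-out RMSE of a GRNN for a given smoothing factor (spread)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    y = np.asarray(y, dtype=float)
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i                      # hold out sample i
        d2 = ((x[i] - x[mask]) ** 2).sum(axis=1)      # distances to the rest
        w = np.exp(-d2 / (2.0 * spread ** 2))
        preds[i] = (w @ y[mask]) / w.sum()            # GRNN prediction for i
    return float(np.sqrt(np.mean((preds - y) ** 2)))

def best_spread(x, y, candidates):
    """One-dimensional grid search: the spread with minimum LOO RMSE."""
    return min(candidates, key=lambda s: loo_rmse(x, y, s))
```

Evaluating `loo_rmse` over a grid of candidate spreads produces curves like those in Figure 3, from which the minimum is read off.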

Figure 3. RMSE graph of the two types of data
As shown in Figure 3, the RMSE for the first class of data reaches its minimum at a smoothing coefficient of 0.1, while for the second class the minimum occurs at 0.16. These two smoothing coefficients therefore give the optimal values for the training samples.

Results and analysis
This study investigated two different groups of parameters (defined as Configurations I and II) drawn from six chemical components, three process parameters, and the specific Jominy distance, the same parameter sets as in Sunčana et al.'s work [13]. Table 2 displays the input variables used for the two configurations. The input variables for Configuration 1 comprise the primary alloy elements (C, Si, Mn, Cr, Mo, Ni), austenitizing temperature, austenitizing time, and time to cool to 500 °C. In Configuration 2, the alloy elements are replaced by the specific Jominy distance, while the other inputs remain identical to those in Configuration 1.
Table 2. Variables for the two configurations

With a smoothing factor σ of 0.16, the predicted results for the second type of sample are shown in Figure 7.

Figure 7. Predicted results for the second type of sample when the smoothing factor is 0.16.
The scatter plots above are obtained from the generalized regression neural network predictions. Figures 6(a) and 6(b) are derived from the first category of data for Configurations 1 and 2, while Figures 7(c) and 7(d) are derived from the second category of data for Configurations 1 and 2. The scatter points of both configurations show no significant deviation from the ideal regression line, indicating that the two configurations have similar performance. Table 3 displays three metrics, the correlation coefficient (r), root mean square error (RMSE), and mean absolute percentage error (MAPE), calculated for the first and second types of data under the two configurations. MAPE is the primary evaluation criterion, as its absolute values prevent positive and negative errors from cancelling each other out. The formula for MAPE is

MAPE = (100% / n) Σᵢ |yᵢ − ŷᵢ| / |yᵢ|,

where n is the number of samples, yᵢ is the true value, and ŷᵢ is the predicted value. Comparing the three indicators, Configuration 1 on the first category of data (r = 0.95, RMSE = 62.41, MAPE = 9.53%) outperforms Configuration 2 (r = 0.94, RMSE = 66.38, MAPE = 16.53%). For the second category of data, Configuration 1 (r = 0.93, RMSE = 20.51, MAPE = 6.94%) is superior to Configuration 2 (r = 0.84, RMSE = 29.51, MAPE = 10.31%).
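The three metrics reported in Table 3 can be computed as follows (a minimal sketch with our own function name):

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return (r, RMSE, MAPE) for predicted vs. experimental hardness.

    MAPE = (100/n) * sum_i |y_i - y_hat_i| / |y_i|.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    r = np.corrcoef(y_true, y_pred)[0, 1]              # correlation coefficient
    rmse = float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
    mape = float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))
    return r, rmse, mape
```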
For further evaluation of the neural network predictions of the two configurations, regression lines with a ±20% deviation are plotted in Figures 6 and 7; most of the data fall within these deviation lines. Figure 8 is a bar chart of the corresponding statistics. For the first category of data, Configurations 1 and 2 have 13.24% and 14.34% of the data deviating from the experimental values by more than ±20%, respectively; for the second category, the corresponding figures are 4.64% and 6.99%.
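The percentages above (the fraction of predictions outside the ±20% bands) can be computed as in this sketch (our own helper):

```python
import numpy as np

def frac_outside_band(y_true, y_pred, tol=0.20):
    """Fraction of predictions deviating from the experimental values by
    more than +/- tol (the 20% bands drawn in Figures 6 and 7)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel_err = np.abs(y_pred - y_true) / np.abs(y_true)
    return float(np.mean(rel_err > tol))
```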

Discussion
This study primarily investigates the prediction of steel hardness using Generalized Regression Neural Network (GRNN) models. A review of the relevant literature indicates that neural network predictions of steel hardness mostly use chemical composition, heat treatment parameters, and cooling time as input variables. However, this requires a large number of input variables, and the chemical composition may not be specified accurately. Therefore, this paper chose the specific Jominy distance proposed by Sunčana et al. [13] as one of the input variables, establishing Configuration 2. The two-step clustering method was used for data classification preprocessing, and the comparison focuses on the fitting performance of the two configurations. Configuration 1 uses the main alloying elements (C, Si, Mn, Cr, Mo, Ni), austenitizing temperature, austenitizing time, and cooling time to 500 °C as input variables, while Configuration 2 replaces the main alloying elements with the specific Jominy distance. Sunčana et al. [13] used an iterative algorithm to fine-tune parameters (number of neurons in each hidden layer, number of layers, etc.) to find the best artificial neural network structure. This research instead used leave-one-out cross-validation with RMSE as the evaluation metric to conduct a one-dimensional search for the optimal smoothing coefficient, ultimately achieving the highest possible predictive capability of the GRNN.
From an implementation point of view, the GRNN employed in this paper to predict the hardness of steel is simpler and faster than the approach in Sunčana et al.'s work [13]. Moreover, we propose a data pre-processing step that contributes to greater accuracy than the models of previous researchers, providing new insights for the prediction of hardness in metallic materials.

Conclusions
The following three conclusions can be drawn from the above analysis.
(1) The optimal smoothing coefficient (spread) in the GRNN was obtained by leave-one-out cross-validation using RMSE as the evaluation criterion, which removes the arbitrariness in designing this parameter and reduces the impact of human factors on the prediction results.
(2) The prediction performance of the two configurations was similar, but the prediction accuracy of Configuration 2, which introduced the Jominy distance, was slightly lower than that of Configuration 1.
(3) For the first category of data, within a ±20% bias, Configuration 2 demonstrates a predictive ability similar to that of Configuration 1. For the second category of data, characterized by extremely slow cooling rates, Configuration 2's predictive ability is somewhat lower than that of Configuration 1, though still with only 6.99% of the data deviating from the experimental values by more than ±20%.
This research introduces a novel approach for predicting the hardness of metallic materials using GRNN.The findings suggest that using a specific Jominy distance as a substitute input variable is accurate enough when the detailed chemical composition is unknown.The aim of this method is to address the growing demand for high-quality manufactured products and the need for rapid adaptive design methods.

Figure 1. t500 for one type of steel.

Figure 4 illustrates the structure of the neural network used in this paper. Figure 5 presents the nine input variables for Configuration 1 and the four input variables for Configuration 2. The output layer predicts the hardness.

Table 1. SPSS Clustering Distribution Results

Table 3. Prediction metrics (r, RMSE, MAPE) for the two configurations