Unlocking the strength: the predictions of silicon carbide fracture toughness revealed through data-driven approach

Silicon carbide ceramics are widely used in mechanical, chemical, aerospace and military applications, where fracture toughness plays a crucial role. From a processing perspective, fracture toughness is controlled by the combination of starting phases and sintering conditions (including additives, atmosphere, temperature and pressure). However, the interplay of these factors makes forward prediction of fracture toughness intractable through either experimentation or physical modeling, not to mention the reverse estimation of optimal processing parameters. In this work, a data-driven strategy is proposed that first predicts fracture toughness from processing parameters and then identifies the parameters that have a large impact on it. Four machine learning (ML) algorithms were run on a well-established dataset of SiC sintering recipes, and the eXtreme Gradient Boosting (XGBoost) model was found to perform best, with an accuracy of up to 88%. The feature importance scores further revealed that the sintering temperature and the types of sintering additives have a significant influence on fracture toughness. The sintering temperature is the most critical factor affecting the fracture toughness of SiC, with an optimum range of 1800 °C–2000 °C; the sintering additives Al and Al2O3 also strongly influence the fracture toughness, with an optimum mass fraction within the whole additives of 3–8 wt%. Finally, the developed model shows the capability to propose sintering strategies for preparing SiC ceramics with a target fracture toughness.


Introduction
Silicon carbide (SiC), with the merits of low density, high strength, high hardness and excellent high-temperature oxidation resistance [1,2], is widely used in machinery, the chemical industry, aerospace, military and other fields, especially in harsh environments involving high temperature and severe corrosion [3][4][5], which makes it a very attractive structural material. Fracture, in its various modes, is a common failure manner of brittle materials (such as concrete composites and rocks) under tensile loads; it has been assessed by different experimental approaches and mitigated through composite and/or reinforced forms [6,7]. Micro-defects, either initiated near the macro-crack tip owing to the high local stress concentration or generated during material preparation and processing, play a significant role in the crack growth behavior that governs the fracture properties [8]. Because SiC is a brittle material, fracture toughness is one of the most fundamental mechanical properties for ensuring its reliability in structural applications [9][10][11].
In recent years, many works have explored strategies for improving the fracture toughness of SiC during synthesis, such as tuning the sintering additives and sintering conditions [12][13][14]. Regarding sintering additives, after studying the effects of additive components on the mechanical properties of pressureless sintered SiC ceramics, Jung-Hye Eom et al [15] found that the Al2O3-Y2O3-AlN combination could accelerate the β→α phase transition of SiC, which refines the microstructure and improves the flexural strength and fracture toughness. Regarding the sintering atmosphere, Suzuki et al [16] studied the effects of Ar and N2 atmospheres on the grain morphology of SiC and found that, under otherwise identical conditions, the N2 atmosphere controls the SiC microstructure more effectively than Ar and yields a finer grain morphology, with the potential for better mechanical properties. Kim [17] studied the effect of temperature on the mechanical properties of sintered SiC: the fracture toughness increased monotonically from 4.8 to 6.0 MPa·m^1/2 as the sintering temperature increased from 1750 to 1900 °C. Owing to the low efficiency and high cost of experimental strategies, it is challenging to quantitatively analyze the influence of the various factors on the fracture toughness of SiC and, further, to propose optimized process parameters.
Within the framework of the Materials Genome Initiative, data-driven methods are increasingly used to explore potential materials with desired properties. Xu et al [18] identified material systems with high interfacial thermal resistance from 692 data sets using machine learning methods. Yang et al [19] used the extreme gradient boosting machine learning method to predict and analyze the bending strength of Si3N4 ceramics, quantitatively evaluating the factors (i.e. sintering temperature and total content of sintering additive) that affect the bending strength, and further provided a suitable addition sequence of sintering additives for obtaining superior bending strength. Furushim [9] selected 330 data sets from a database covering different additives and various process conditions; a Convolutional Neural Network model was constructed and applied to evaluate the fracture toughness and bending strength of Si3N4 ceramics, and the results show that the trained model can correctly recognize the two mechanical properties. Conventionally, machine learning strategies rely heavily on data-driven pattern recognition and statistical analysis, making predictions with models learned from large amounts of data. However, data-driven approaches alone can be challenging when data are scarce. In fields with abundant domain knowledge, theory-guided machine learning has been proposed as an approach that fuses domain knowledge and theory with the capabilities of machine learning algorithms, improving the performance, interpretability and reliability of the models. Within materials science, theory-guided machine learning could play an important role by combining the advantages of data-driven methods with domain expertise to achieve the desired performance.
In this work, based on a well-established dataset of SiC sintering recipes, a machine learning method is employed to predict the fracture toughness of SiC. The established model can provide initial guidance for selecting appropriate sintering additives and sintering conditions for developing SiC with high fracture toughness. The focus of this study is to investigate the importance of the sintering parameters in determining the fracture toughness of SiC.

Data set construction and data processing
Data acquisition has always been a laborious task in materials science. Many experimental data are scattered across the literature and various databases, which makes them difficult to acquire and use. In order to construct a database containing the fracture toughness of SiC synthesized with different sintering additives and under different sintering conditions, relevant experimental data were collected from the literature (see appendix).

Feature value selection and data processing
The selection of features determines the efficiency and accuracy of machine learning prediction models. The types of features selected in this work are: (a) sintering raw materials; (b) the content ratio of α-SiC and β-SiC; (c) the content ratio of sintering additives, including Al, Al2O3, B4C, C, Y2O3, AlN, B, CaO and Al3BC3; (d) atmosphere; (e) temperature, T; (f) pressure, P; and (g) sintering time, t. In order to reduce the risk of overfitting due to the great differences within features and to improve the generalization ability of the model, a Pearson correlation coefficient diagram was adopted to evaluate whether the selection of features is reasonable and to avoid redundant features.
The Pearson correlation coefficients are shown in figure 1, from which it can be seen that most of the feature pairs do not have high correlations, so that these features can be employed to predict fracture toughness.
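As a minimal sketch of this redundancy check, the snippet below computes a Pearson correlation matrix with NumPy and lists the feature pairs whose absolute correlation exceeds a cut-off. The data are synthetic placeholders rather than the SiC dataset, and the feature names and the 0.9 threshold are assumptions made only for illustration.

```python
import numpy as np

def redundant_pairs(X, names, threshold=0.9):
    """Return (name_i, name_j, r) for feature pairs with |r| above threshold."""
    r = np.corrcoef(X, rowvar=False)  # columns of X are features
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) > threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs

# Synthetic placeholder data (not the dataset used in this work)
rng = np.random.default_rng(0)
temperature_c = rng.uniform(1700, 2100, size=50)
pressure = rng.uniform(0, 50, size=50)
temperature_k = temperature_c + 273.15  # deliberately redundant feature

X = np.column_stack([temperature_c, pressure, temperature_k])
names = ["T_C", "P_MPa", "T_K"]  # hypothetical feature names
pairs = redundant_pairs(X, names)
print(pairs)
```

Here only the pair (T_C, T_K) is flagged, since one is a shifted copy of the other; in practice one member of such a pair would be dropped before training.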

Machine learning models
Various machine learning algorithms are currently available for handling different types of problems. In order to evaluate the performance of different machine learning models, this paper adopted four of them (i.e. Bagging, Random Forest, eXtreme Gradient Boosting (XGBoost) and Gradient Boosting) and compared their performance.

Bagging
Bagging is a technique that reduces the generalization error by combining several models [20]; it improves the performance and generalization ability of the overall model by combining the prediction results of multiple base models. The main idea is to train several different models separately and then let all of the models vote on the outputs for the test samples. This is an example of a general strategy in machine learning called model averaging.
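The idea can be sketched for regression as follows: each base model is trained on a bootstrap resample of the data, and the "vote" becomes an average of the individual predictions. This is an illustrative sketch only; the choice of decision trees as base models, the number of models and the synthetic data are assumptions, not the configuration used in this work.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_bagging(X, y, n_models=25, seed=0):
    """Train n_models trees, each on a bootstrap resample of (X, y)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # sample n rows with replacement
        models.append(DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx]))
    return models

def predict_bagging(models, X):
    """Model averaging: mean of the individual predictions."""
    return np.mean([m.predict(X) for m in models], axis=0)

# Synthetic placeholder data
rng = np.random.default_rng(1)
X = rng.uniform(size=(80, 2))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=80)

models = fit_bagging(X, y)
pred = predict_bagging(models, X)
```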

Random Forest
Random Forest is a common ensemble learning algorithm based on the idea of Bagging [21]. It performs prediction and classification tasks by constructing multiple decision trees and integrating them. Two sources of randomness are used in training: 'data random' refers to randomly drawing samples from the whole dataset as the training data for each decision tree [22], and random attribute selection is added on top of this. Unlike traditional decision tree training, a subset of attributes is selected first and the optimal attribute is then chosen from this subset, which makes the training process faster.

XGBoost
XGBoost is a machine learning algorithm based on gradient boosting decision trees, originally proposed by Chen and Guestrin [23]. Its core idea is to build a series of decision tree models step by step while optimizing a loss function. It combines the gradient boosting algorithm with the decision tree method, improves on standard gradient boosting, and iteratively trains multiple decision tree models to enhance the prediction performance.

Gradient boosting
Gradient Boosting is a regression model based on the gradient boosting algorithm, which iteratively trains a number of weak learners to construct a predictive model. A series of regression models is constructed by following the gradient of the loss function: each iteration focuses on the prediction errors of the preceding model and corrects them by training a new model, and the predictions of all the learners are weighted and summed to obtain the final predictive model.
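The residual-correcting loop can be sketched as below for the squared-error loss, where the negative gradient is simply the current residual. This is a simplified illustration, not the library implementation used in this work; the learning rate, tree depth, number of rounds and synthetic data are all assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.1, max_depth=2):
    """Fit shallow trees to successive residuals; return model and MSE history."""
    base = float(np.mean(y))            # initial constant prediction
    pred = np.full(len(y), base)
    trees, history = [], []
    for _ in range(n_rounds):
        residual = y - pred             # negative gradient of 0.5 * (y - pred)^2
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residual)
        pred = pred + learning_rate * tree.predict(X)
        trees.append(tree)
        history.append(float(np.mean((y - pred) ** 2)))
    return base, trees, history

def predict_gradient_boosting(base, trees, X, learning_rate=0.1):
    """Weighted sum of all weak learners added to the base prediction."""
    return base + learning_rate * sum(t.predict(X) for t in trees)

# Synthetic placeholder data
rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 2))
y = 4.0 + 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.05 * rng.normal(size=100)

base, trees, history = fit_gradient_boosting(X, y)
fitted = predict_gradient_boosting(base, trees, X)
```

The training error stored in `history` does not increase from one round to the next, which reflects each new tree correcting part of the remaining error.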

Parameter selection
The machine learning parameters are chosen to optimize the performance and generalization of the model, ensuring that it performs well on unseen data. This work employed a sequential grid search to find the optimal hyper-parameter combination. With GridSearchCV, multiple hyper-parameter combinations can be tried automatically to find the optimal configuration, thereby improving the performance and generalization of the machine learning model. In order to maximize the performance of the model, the parameters shown in table 1 must be adjusted:
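A minimal sketch of a grid search with scikit-learn's GridSearchCV is given below. The estimator (scikit-learn's GradientBoostingRegressor), the parameter grid and the synthetic data are assumptions chosen for illustration; they are not the grids of table 1. A sequential search would repeat this step, fixing the best values found so far and refining the remaining parameters.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic placeholder data with the same number of samples as the dataset
rng = np.random.default_rng(0)
X = rng.uniform(size=(104, 4))
y = 3.0 + 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.1 * rng.normal(size=104)

# Hypothetical grid: 2 x 2 x 2 = 8 candidate combinations
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    cv=5,                                # 5-fold cross-validation per candidate
    scoring="neg_mean_squared_error",    # higher (less negative) is better
)
search.fit(X, y)
best_params = search.best_params_
print(best_params)
```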

Evaluation indicators
Mean absolute error (MAE), mean square error (MSE) and the coefficient of determination (R^2) were employed to evaluate the results. MAE evaluates the average absolute difference between the predicted and actual values. MSE is the mean of the squared differences between the predicted and actual values; it emphasizes samples with large differences, making the evaluation more sensitive to them. The coefficient of determination R^2 is a statistic that measures how well the regression model fits the observed data. For a reasonable model it lies between 0 and 1, with a value closer to 1 indicating better performance (it can become negative if the model predicts worse than the mean of the actual values). In their standard forms, the three indicators are
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where n denotes the number of samples, ȳ denotes the mean of the actual values, and ŷ_i and y_i denote the predicted and true values, respectively.
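The three indicators can be computed directly from their standard definitions, as in the short sketch below. The function names and the three-point example are illustrative assumptions, not values from the dataset.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true, y_pred):
    """Mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up example values (MPa·m^1/2), for illustration only
y_true = [4.0, 5.0, 6.0]
y_pred = [4.5, 5.0, 5.5]
print(mae(y_true, y_pred))  # 1/3
print(mse(y_true, y_pred))  # 1/6
print(r2(y_true, y_pred))   # 0.75
```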
Based on the error evaluation results, the best model was selected as the machine learning prediction model for the relationship between the SiC sintering preparation parameters and fracture toughness. 5-fold cross-validation was also used to guard against overfitting and to determine the model performance on out-of-sample data not included in the training samples. 5-fold cross-validation makes full use of the limited data set to provide a comprehensive evaluation of model performance; it reduces the variance of the evaluation results and reveals how stable the model is across different data subsets.
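A minimal sketch of 5-fold cross-validation with scikit-learn is shown below. The data are synthetic placeholders with the same number of samples as the dataset, and the Random Forest estimator with its settings is an assumption used only to make the example concrete.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic placeholder data (104 samples, 3 features)
rng = np.random.default_rng(0)
X = rng.uniform(size=(104, 3))
y = 4.0 + 2.0 * X[:, 0] + 0.1 * rng.normal(size=104)

# Each sample is used for validation exactly once across the five folds
cv = KFold(n_splits=5, shuffle=True, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0)

scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
fold_sizes = [len(test_idx) for _, test_idx in cv.split(X)]

print(fold_sizes)
print(scores.mean(), scores.std())
```

The spread of the per-fold scores gives a rough indication of how stable the model is across data subsets.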

The performance evaluation and model validation
A total of 104 data points were collected, covering the fracture toughness of SiC synthesized with different sintering additives and under different sintering conditions. The dataset was divided into a training set and a testing set at a ratio of 7:3, on which the Bagging, Random Forest, XGBoost and Gradient Boosting models were trained. The optimal parameter combinations were chosen by grid search based on the performance of the different models on the validation set, as shown in table 2. Figure 2 compares the four selected models in terms of MAE, MSE and R^2.
All four models predict with satisfactory accuracy, and the XGBoost model is the best one; its parameters are shown in table 3. In order to reduce the risk of overfitting and improve the generalization ability of the model, the gradient boosting algorithm and regularization techniques were used during the training of the XGBoost model. In addition, XGBoost can output importance scores for the features.
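The split-and-compare workflow can be sketched as follows. The data are synthetic, the hyper-parameters are defaults rather than the tuned values, and only the three ensembles shipped with scikit-learn are compared; XGBoost lives in the separate xgboost package, whose XGBRegressor follows the same fit/predict interface and could be added to the dictionary if that package is installed.

```python
import numpy as np
from sklearn.ensemble import (
    BaggingRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder data (104 samples, 4 features)
rng = np.random.default_rng(0)
X = rng.uniform(size=(104, 4))
y = 4.0 + 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.1 * rng.normal(size=104)

# 7:3 split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

models = {
    "Bagging": BaggingRegressor(n_estimators=50, random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}

results = {}
for name, model in models.items():
    y_pred = model.fit(X_train, y_train).predict(X_test)
    results[name] = {
        "MAE": mean_absolute_error(y_test, y_pred),
        "MSE": mean_squared_error(y_test, y_pred),
        "R2": r2_score(y_test, y_pred),
    }

for name, metrics in results.items():
    print(name, metrics)
```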

The feature importance ranking
By utilizing the feature importance ranking of XGBoost, which makes the model interpretable [24], the importance score of each feature was analyzed (figure 3). The most critical features for predicting fracture toughness are the sintering temperature and the content of the Al/Al2O3 sintering additives. It is acknowledged that both the sintering time and the additives improve the fracture toughness of ceramic materials by controlling the microstructure (i.e. modulating the grain size and distribution, as well as controlling the porosity and pore size distribution) [25]. These microstructural changes affect the crack path and fracture behavior, and thereby the fracture toughness. Further details are given in the next section.
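As an illustration of how such a ranking is read off a trained tree ensemble, the sketch below uses the `feature_importances_` attribute of scikit-learn's GradientBoostingRegressor as a stand-in for XGBoost (XGBRegressor exposes an attribute of the same name). The feature names and the synthetic relationship, in which the first feature is made dominant on purpose, are assumptions and carry no physical meaning.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical feature names; the data below are synthetic
names = ["temperature", "al_al2o3_wt", "pressure", "time"]
rng = np.random.default_rng(0)
X = rng.uniform(size=(104, 4))
# The first feature dominates the target by construction
y = 5.0 * X[:, 0] + 0.5 * X[:, 1] + 0.05 * rng.normal(size=104)

model = GradientBoostingRegressor(random_state=0).fit(X, y)
importances = model.feature_importances_   # normalized to sum to 1

order = np.argsort(importances)[::-1]      # most important first
ranking = [names[i] for i in order]
for i in order:
    print(f"{names[i]}: {importances[i]:.3f}")
```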

The effects of sintering temperature and sintering additives on fracture toughness
The combined effects of the types and contents of sintering additives and the sintering temperature on fracture toughness are shown in figure 4, where the x-axis represents the mass fraction of the Al and Al2O3 additives in figure 4(a) and the mass fraction of all additives in figure 4(b). A sintering temperature of 1800 °C–2000 °C and a total sintering-additive mass fraction of 5–10 wt% are beneficial for improving the fracture toughness. For the sintering aids Al and Al2O3, the optimal mass-fraction range is 3–8 wt%.
Sintering temperature is acknowledged to have an indispensable effect on grain growth and on the compactness of the sample [26]. Raising the sintering temperature gradually activates the densification mechanisms. When the temperature is below a specific value, the resulting sample is porous, with low density and poor mechanical properties, so the fracture toughness cannot reach high values. As the sintering temperature increases, the fracture toughness of the resulting SiC improves owing to the introduction of a liquid phase: a weak oxide phase remains at the SiC grain boundaries, which regulates the grain boundaries and changes the fracture mechanism of the sample [27]. As the sintering temperature continues to rise, the fracture toughness decreases because of abnormal grain growth [17]. In this work, the grains are proposed to grow quickly at sintering temperatures above the 1800 °C–2000 °C range, leading to a decrease in fracture toughness. Conversely, at sintering temperatures below this range, the sintering additive cannot form enough liquid phase and the driving force for densification is too small, which also impairs the fracture toughness. A sintering temperature between 1800 °C and 2000 °C can facilitate densification of the SiC while keeping the grain size within an appropriate range.
The strong influence of sintering additives in promoting the sintering of SiC particles has been studied extensively [27][28][29][30][31]. Owing to its highly covalent bonding, SiC is difficult to densify without sintering additives. Densification is achieved either by solid-state sintering using B and C as additives or by liquid-phase sintering using metal oxides such as Al2O3 and Y2O3 [32]. The kinds and amounts of additives act not only as densification aids but also as a means of tuning the microstructure and the chemistry of the grain-boundary phase. An excessive concentration of sintering additive increases the liquid-phase content, which raises the driving force for densification and promotes grain growth; consequently, it is detrimental to fracture toughness. A low concentration, in turn, leads to a low driving force for densification. In this work, a sintering-additive concentration of 5–10 wt% is appropriate for high fracture toughness. Specifically, for the sintering aids Al and Al2O3, the optimal mass-fraction range is 3–8 wt%. The effect of Al is attributed to Al atoms dissolved in 6H-SiC grains, which may have stabilized 4H-SiC, partially transformed 6H-SiC to 4H-SiC and accelerated elongated grain growth [33]. The Al2O3 additive, in turn, provides the conditions for the formation of a rod-like structure with greatly enhanced resistance to crack propagation [34].

Conclusions
In this study, based on the constructed database of sintering recipes for SiC, different machine learning algorithms were employed and compared for evaluating the fracture toughness of SiC, and the XGBoost algorithm was finally adopted to predict the fracture toughness. The factors affecting the fracture toughness were then analyzed, and it was found that: (1) the sintering temperature is the most critical factor affecting the fracture toughness of SiC, with an optimum range of 1800 °C–2000 °C; (2) the sintering additives Al and Al2O3 strongly influence the fracture toughness, with an optimum mass fraction within the whole additives of 3–8 wt%; (3) machine learning methods can be used to predict the fracture toughness of SiC ceramic materials, and the desired properties can be obtained by designing the processing parameters and the additive types and contents. In summary, this study provides theoretical guidance for optimizing the fracture toughness of SiC ceramics, and the proposed workflow could be employed to guide the design of other target properties for specific applications.

Figure 1. Heat map of Pearson's correlation between fracture toughness and descriptors.

Figure 2. Evaluation of the ML models on the training and test sets: fracture toughness predicted from the experimental dataset by the XGBoost, Bagging, Random Forest and Gradient Boosting models, with y = x denoting the ideal prediction without any bias.

Figure 4. Scatterplot visualizing the data distribution. The x-axis and y-axis denote sintering additive content and sintering temperature, respectively, and color denotes the fracture toughness value. (a) Mass fraction of the sintering additives Al and Al2O3; (b) mass fraction of all additives.

Table 1. The machine learning model parameters.

Table 2. The evaluation indicators for different models.