Zero-shot learning for property prediction of wear-resistant steel based on multiple sources

To address the scarcity of C-Cr-V-Mo steel samples, a zero-shot transfer component analysis (TCA) method based on multiple sources is proposed. TCA maps the features of multiple source domains, composed of different kinds of wear-resistant steels, and of the target domain to a reproducing kernel Hilbert space (RKHS). The proposed fitness parameter α, derived from the maximum mean discrepancy (MMD), allows the sources to affect the prediction to varying degrees. The support vector regression (SVR) model established after TCA can then predict the hardness without homologous samples. The matrix is rapidly predicted from the minimum distance between the sample and the cluster centers of the matrix classes. (Cr+V)/C, V/Cr and the predicted hardness are added to the feature space, and the abrasion loss of samples quenched at high temperature is predicted using those quenched at low temperature. Experiments show that the multi-source based TCA+SVR model improves the prediction accuracy of the hardness of C-Cr-V-Mo steel, with R of 0.98 and MAE less than 1.4 HRC under the zero-shot condition. The primary matrix is quickly identified as martensite. The abrasion loss, which is mostly affected by hardness, (Cr+V)/C and V/Cr, is predicted with R of 0.95 and MAE of 5.23 mg.


Introduction
Owing to its excellent properties, wear-resistant steel containing V and Cr is often used under long-life abrasive wear conditions [1]. The traditional design method is time-consuming and costly. The efficiency of selecting target samples can be greatly improved by preparing and testing candidate samples based on property predictions [2]. At present, models for property prediction are mostly established directly by machine learning (ML) [3][4][5], which requires similar marginal probability distributions of the source and target domains; otherwise the model performs poorly or even learns negatively [6]. However, it is difficult to obtain a large number of high-quality homologous samples of high-V, high-Cr steel in a short period of time. Therefore, it is important to investigate how the knowledge learned from other types of steels can be used to predict the properties of the target wear-resistant steel.
Transfer learning (TL), which transfers knowledge learned from other source domains to solve tasks in the target domain [7], is often used when homologous samples are insufficient. A feature-based transfer learning model is built in two steps. First, the features of the samples in the source and target domains are mapped to a reproducing kernel Hilbert space (RKHS), narrowing the discrepancy between their marginal probability distributions. Then, based on the mapped source and target domains, the prediction model is built with conventional machine learning (ML) methods. In the absence of homologous samples, TL can obtain knowledge from non-homologous samples and perform target domain tasks that were not possible with direct ML modeling in previous studies. Hao et al [8] designed the hot zone of a G8 ingot furnace using the hot zone evaluation parameters and geometric parameters of a G7 ingot furnace, and successfully transferred the growth law of ingot crystalline silicon from small-size crystals to large-size ones. Bouguettaya et al [9] classified six common strip surface defects with a deep learning model taking NEU-CLS as the source domain, and the classification accuracy was higher than 98%. The classification accuracy could be further improved by adding a data set with fewer unbalanced defect samples to NEU-CLS [10]. Gong et al [11], aiming at the scarcity of defect samples of aeronautics composite materials (ACM), learned the characteristics of defect shape from a welding database and transferred them to realize defect recognition and classification for ACM. Zhu et al [12] learned the creep fracture life rule of GH4169 alloy from the NIMS nickel-based superalloy database and transferred it to predict the creep fracture life of GH4169D alloy; the high-temperature creep fracture life was predicted based on that at low temperature. Wei et al [13] used rotational bending S-N curves to predict the anti-twist S-N curves of low alloy steel.
Research shows that it is crucial to find an appropriate source domain for TL. The greater the similarity between the source and target domains after transfer, the better the learning effect [14]. A suitable source domain therefore has to be chosen for each task, which means that the generalization ability of a single-source model is poor. In addition, knowledge from the abandoned source domains is also worth learning, since it can supplement the limited knowledge from a single source domain [15]. Incorporating knowledge learned from multiple sources leads to more comprehensive knowledge and helps improve the accuracy and generalization capability of the model.
In the design of wear-resistant steel, CALPHAD and the phase-field method are commonly used to model the microstructure evolution and to judge which matrix and carbides can form. Wang et al [16,17] simulated the phase transition based on CALPHAD in the variation process of isomorphic phases by using single-phase configuration entropy. Roberto et al [18] established a phase-field model to predict the evolution of microstructure by inputting electronic-scale parameters from first principles and single-phase-scale parameters from CALPHAD. However, this approach is complicated and highly dependent on first-principles calculations. At present, TL models are mainly focused on the recognition and classification of steel microstructure based on the ImageNet data set [19][20][21], and few studies have been reported on solving regression problems related to microstructure prediction for steel. The matrix of wear-resistant steel is mainly divided into ferrite, pearlite, bainite and martensite [22], which are closely related to hardness. With hardness as one of the features, the matrix class can be quickly predicted by clustering.
Models for predicting wear resistance typically use compositions and heat treatment to form the feature space [23,24]. However, the carbides formed in C-Cr-V-Mo steel and the proportion of VC are closely related to the wear resistance [2]. The feature space should therefore include (Cr+V)/C, which reflects the characteristics of the ideal carbides, and V/Cr, which reflects the proportion of VC in the total carbides. In addition, previous studies have shown that there is no linear relationship between hardness and wear resistance [25,26], but both are macroscopic properties determined by the microscopic matrix and carbides; that is, the two are closely related at the microscopic scale. Therefore, hardness should also be considered as one of the features.
Therefore, a multi-source based TCA + SVR model is proposed to predict the properties of C-Cr-V-Mo steel under the zero-shot condition. The original data are split into multiple sources, each of which is transferred together with the target domain by TCA. The fitness parameter α is proposed to integrate the knowledge obtained from the multiple sources for predicting the hardness of samples in the target domain. The distance between the sample and the center of each matrix class is calculated to determine the matrix type. The abrasion loss of C-Cr-V-Mo steel quenched at high temperature is predicted using samples quenched at low temperature, and the effects of hardness, (Cr+V)/C and V/Cr on wear resistance are analyzed. Then, the optimal values of (Cr+V)/C and V/Cr are found, which is intended to provide a reference for the design of C-Cr-V-Mo steel with high hardness and good wear resistance.
Transfer component analysis (TCA) is used to map features from the multiple source domains and the target domain into a new feature space where the marginal probability distribution discrepancy, measured by the maximum mean discrepancy (MMD), is reduced. However, most of the samples in the source domains are heterogeneous samples with different feature dimensions, which also differ from those of the samples in the target domain. The features of the samples therefore have to be enlarged and aligned before calculating the MMD.

Method
Starting from the features of each source domain i, the features of the multi-source domains are expanded into a common feature space Γ_S. The feature space of the target domain is denoted by Γ_T ⊆ Γ_S, and the missing features in Γ_T are filled with zeros. Then, the features are projected onto the reproducing kernel Hilbert space (RKHS) by minimizing the MMD distance using a mapping function φ(·):

$$\mathrm{MMD}\left(X_{S_i}, X_T\right) = \left\| \frac{1}{N_i}\sum_{j=1}^{N_i}\phi\left(x^{S_i}_j\right) - \frac{1}{N_t}\sum_{k=1}^{N_t}\phi\left(x^{T}_k\right) \right\|_{\mathcal{H}}^{2}$$

where N_i is the number of samples in the ith source domain, and N_t is the number of samples in the target domain. By introducing a matrix W_i with lower dimensions than the kernel matrix K, the problem of minimizing the MMD is transformed into an optimization problem over W_i. W_i is the transformation matrix mapping the source and target domains to the RKHS. For l source domains, there are l corresponding transformation matrices to implement the mapping.
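The alignment and MMD steps can be illustrated with a minimal sketch. It assumes zero-filling for missing features, an RBF kernel, and the biased empirical estimate of the squared MMD; the helper names (`align`, `rbf`, `mmd2`) and the toy compositions are hypothetical and not taken from the study.

```python
import math

def align(sample, all_features):
    """Zero-fill features that a sample does not have (assumed alignment rule)."""
    return [float(sample.get(name, 0.0)) for name in all_features]

def rbf(x, y, gamma):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(xs, xt, gamma=1.0):
    """Biased empirical estimate of the squared MMD between two sample sets."""
    def mean_k(a, b):
        return sum(rbf(p, q, gamma) for p in a for q in b) / (len(a) * len(b))
    return mean_k(xs, xs) - 2.0 * mean_k(xs, xt) + mean_k(xt, xt)

# Toy compositions in wt%, for illustration only.
features = ["C", "Cr", "V", "Mo"]
source = [align(s, features) for s in ({"C": 0.4, "Cr": 1.0}, {"C": 0.9, "Cr": 3.5})]
target = [align(t, features) for t in ({"C": 1.5, "Cr": 4.0, "V": 2.0, "Mo": 0.5},)]

print(mmd2(source, target))  # positive: the two sets differ
print(mmd2(source, source))  # identical sets give zero
```

A smaller value after mapping would indicate that the source and target distributions have been brought closer, which is the quantity TCA is meant to reduce.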

Fitness of multiple-source domains
In the RKHS, the marginal probability distributions of the source and target domains are closer, that is, $P_S(X'_{S_i} W_i) \approx P_T(X'_{T_i} W_i)$.
However, the source domains are not equally close to the target domain. Therefore, the fitness parameter α is proposed to measure the similarity between each source domain and the target domain.
The larger α_i is, the closer the marginal probability distribution of the ith source domain is to that of the target domain. A source domain with high fitness plays a more important role in model training. When i → ∞, the multiple sources are aggregated into massive data. In this situation, the distribution is more consistent with the statistical law (figure 2), so ML models trained on these source domains will have higher accuracy and better generalization.
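As a hedged illustration of how a fitness value could be derived from the MMD, the sketch below normalizes the inverse of each post-transfer MMD so that the weights sum to one. This specific formula is an assumption, since the exact definition is not reproduced here; it is, however, consistent with the values reported later for the cross-transfer task (MMD of 0.99 and 0.15 giving fitness of 0.13 and 0.87). The function name `fitness_from_mmd` is hypothetical.

```python
def fitness_from_mmd(mmd_values):
    """Assumed rule: fitness is the normalized inverse of the post-transfer MMD."""
    if any(m <= 0 for m in mmd_values):
        raise ValueError("MMD values must be positive")
    inverse = [1.0 / m for m in mmd_values]
    total = sum(inverse)
    return [v / total for v in inverse]

# Post-transfer MMD of S1 and S3 with respect to S2, as reported in the text.
alphas = fitness_from_mmd([0.99, 0.15])
print([round(a, 2) for a in alphas])  # [0.13, 0.87]
```

With only two sources this rule coincides with 1 − MMD_i / Σ MMD_j, so the reported numbers alone cannot distinguish between the two forms.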

Multi-source based TCA + SVR model
After the multi-source based TCA, the mapped features of the source and target domains are obtained, on which the SVR model is built. To solve the SVR problem in the multi-source setting, an objective function is constructed that includes the ε-insensitive loss and the L2 regularization loss for the multiple sources, in which α_i is the fitness of the ith source domain. The penalty factor G and the slack variable ξ take the same values for every source domain. The objective function is then solved through Lagrange duality under the KKT conditions.
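The structure of such an objective can be sketched as follows. This is only an illustrative primal form under stated assumptions: a linear model instead of the kernel model, and each source's ε-insensitive loss simply scaled by its fitness α_i. The function name and all numbers are hypothetical.

```python
def weighted_svr_objective(w, b, sources, alphas, penalty=1.0, eps=0.1):
    """L2 regularization plus fitness-weighted epsilon-insensitive loss.

    sources: list of (X, y) pairs, one per source domain.
    alphas:  fitness of each source domain (assumed to scale its loss term).
    """
    reg = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for (xs, ys), alpha in zip(sources, alphas):
        for x, y in zip(xs, ys):
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b
            # Residuals inside the epsilon tube cost nothing.
            loss += alpha * max(0.0, abs(y - pred) - eps)
    return reg + penalty * loss

# Two toy source domains with one feature each.
sources = [([[1.0], [2.0]], [1.0, 3.0]), ([[1.0]], [2.0])]
value = weighted_svr_objective([1.0], 0.0, sources, alphas=[0.13, 0.87])
print(round(value, 3))  # 1.4
```

Under this weighting, an error on a sample from the high-fitness source costs more than the same error on a sample from the low-fitness source, which is how the sources influence training to different degrees.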

Matrix classification by weighted Mahalanobis distance
There are four classes of matrix: ferrite, pearlite, bainite and martensite. The cluster center C_i is calculated for each class of matrix. Taking the compositions and heat treatment process as features, the weighted Mahalanobis distance from the target domain sample to each cluster center is calculated [2]. According to the maximum membership principle, the matrix type of the target sample is that of the cluster center with the minimum distance to it.
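The minimum-distance rule can be sketched as below. For brevity the sketch assumes a diagonal covariance, so the weighted Mahalanobis distance reduces to a weighted, variance-scaled Euclidean distance; the cluster centers, variances and weights are toy values for illustration and are not taken from the study.

```python
import math

def weighted_distance(x, center, variances, weights):
    """Weighted Mahalanobis distance under an assumed diagonal covariance."""
    return math.sqrt(sum(w * (a - c) ** 2 / v
                         for a, c, v, w in zip(x, center, variances, weights)))

def classify_matrix(x, centers, variances, weights):
    """Return the class whose cluster center is closest to sample x."""
    return min(centers,
               key=lambda name: weighted_distance(x, centers[name], variances, weights))

# Toy two-feature example: (C in wt%, hardness in HRC).
centers = {
    "ferrite": (0.1, 10.0),
    "pearlite": (0.6, 25.0),
    "bainite": (0.5, 40.0),
    "martensite": (1.0, 60.0),
}
variances = (0.25, 100.0)
weights = (0.5, 0.5)

label = classify_matrix((1.8, 62.0), centers, variances, weights)
print(label)  # martensite
```

A full covariance matrix would require a matrix inverse in place of the per-feature variances; the decision rule itself stays the same.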

Procedure of prediction
The transfer learning methodology introduced in this study for wear-resistant steel property prediction is illustrated in figure 1. First, the multi-source based TCA is used to obtain the mapped features of the multiple source domains and the target domain. Next, feature engineering constructs new property-related features that are added to the feature space. Then, the SVR model is trained on the mapped source domains, and the property predictions are obtained by feeding the target domain samples into the trained model. Key features and the extent to which design variations affect properties are obtained by feeding target domain samples into the RF model. Finally, R and MAE are used to evaluate the model performance.
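The two evaluation metrics can be computed as in the following sketch, which takes R to be the Pearson correlation coefficient between measured and predicted values. The hardness values are made up for illustration.

```python
import math

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between measured and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    st = math.sqrt(sum((t - mt) ** 2 for t in y_true))
    sp = math.sqrt(sum((p - mp) ** 2 for p in y_pred))
    if st == 0.0 or sp == 0.0:
        raise ValueError("R is undefined for constant inputs")
    return cov / (st * sp)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

measured = [60.0, 62.0, 58.0, 64.0]   # toy hardness values, HRC
predicted = [61.0, 61.0, 59.0, 63.0]
print(round(pearson_r(measured, predicted), 3), mae(measured, predicted))
```

R measures how well the predictions follow the trend of the measurements, while MAE reports the typical error in the unit of the property (HRC or mg), so the two are complementary.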

Results and discussion
3.1. Experimental data for zero-shot TCA problem
3.1.1. Multi-source domains
The steels in the national standard are taken as the source domain S, a total of 349 samples covering low-alloy wear-resistant steel, medium-alloy wear-resistant steel, high-alloy wear-resistant steel, tool steel and wear-resistant cast iron [27,28]. The original source domain is divided into three subsets S1, S2 and S3 by WFCM [6], as shown in figure 2(a). Most of the samples in S1 are not tempered and their hardness is mainly distributed in a higher range. The samples in S2 are concentrated in the low-temperature tempering region, and their hardness is spread mainly over the middle to high range. The samples in S3 are high-temperature tempered, with hardness scattered from low to high. The mass fraction dispersion of the compositions in each source domain is compared in figure 2(b). The C content in S1 is high and its dispersion is small. Cr, V, Mo, Ni and W are the major alloying elements, of which Cr has a large dispersion. The dispersion of V is also large and its median tends to 0, indicating that V is added to only a small number of samples. Mo, Ni and W are all present in trace amounts with small dispersion. The C content in S2 is lower. Cr is the dominant alloying element, with few other alloying elements. The average C content is the highest in S3, with Cr, V, W, Mo, Co and Ni being the main alloying elements. Among them, the contents of V, Mo, W and Co are significantly higher and have a larger dispersion.

Target domain
) was designed with C, Cr, V and Mo as four variables. The 9 groups of designed experimental samples form the target domain T, as shown in table 1. The samples in T were quenched at 850 °C, 900 °C or 950 °C (oil cooling) and tempered at 250 °C, giving 27 samples in total. An HR-150A Rockwell hardness tester was used to measure the hardness. A scanning electron microscope (VEGA3-TESCAN-SBH) was used to observe the microstructure. The wear test was carried out on a pin-disk abrasive wear testing machine (ML-100). The sample size was ϕ6 × 20 mm, the applied load was 100 N, and 280 mesh sandpaper was used. The sample moved back and forth between the center and the edge of the disk at a speed of 6 mm s⁻¹ (the maximum radius was 120 mm and the minimum radius was 30 mm), performing a reciprocating circular motion relative to the disk. Each sample test was repeated 15 times and the weight loss was measured using a TG328B analytical balance.

Analysis of the samples in source and target domain
The distributions of C-Cr, C-V and C-Mo in the source domains are shown in figures 3(a)∼(c). The distribution density of C-Cr is largest in the region where the mass fraction of C is less than 1.0% and that of Cr is less than 4.0%. The distribution density of C-V is highest in the region where the mass fraction of C is less than 1.0% and that of V is less than 1.0%, while the distribution density of C-Mo is highest in the region where the mass fraction of C is less than 1.0% and that of Mo is less than 0.5%. Moreover, figure 3(d) shows that Cr and V are added in an unbalanced way: when one is added in large amounts, the other is added only slightly or not at all.
The target domain T contains a series of newly designed wear-resistant steels, whose C content is specified in [1.5, 2.1] wt%, Cr in [4, 12] wt%, V in [2, 4] wt% and Mo in [0.5, 2.5] wt%, as shown by the yellow areas in figures 3(a)∼(d). Only a very small fraction of the samples in S fall within the design space of T. So the samples in S obey the marginal probability distribution P = {X_S}, the samples in T obey the marginal probability distribution Q = {X_T}, and P ≠ Q. That is, the source domain does not contain samples that are strongly correlated with the target domain, which is a typical zero-shot transfer problem.
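One simple way to check how many source samples fall inside the target design space is sketched below. The design ranges are those stated above; the helper names and the two example compositions are hypothetical.

```python
# Design ranges of the target domain in wt%, as stated in the text.
DESIGN_SPACE = {"C": (1.5, 2.1), "Cr": (4.0, 12.0), "V": (2.0, 4.0), "Mo": (0.5, 2.5)}

def in_design_space(sample, space=DESIGN_SPACE):
    """True if every design variable lies in its range (missing elements count as 0)."""
    return all(lo <= sample.get(el, 0.0) <= hi for el, (lo, hi) in space.items())

def coverage(samples, space=DESIGN_SPACE):
    """Fraction of samples that fall inside the design space."""
    return sum(in_design_space(s, space) for s in samples) / len(samples)

# Toy compositions, for illustration only.
low_alloy = {"C": 0.4, "Cr": 1.0}
high_alloy = {"C": 1.8, "Cr": 8.0, "V": 3.0, "Mo": 1.5}
print(in_design_space(low_alloy), in_design_space(high_alloy))
print(coverage([low_alloy, high_alloy]))
```

A coverage close to zero over the whole source domain is what makes the task zero-shot in the sense used here.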

Model training by cross-transfer task
To validate the proposed multi-source based TCA, the SVR model is established after transfer and trained on several cross-transfer tasks. The predicted results of the training set with S1 and S3 as source domains and S2 as target domain are illustrated in figure 4. In the previous literature, transfer learning models were based on a single source [8][9][10][11][12][13][14]. Therefore, single-source cross-transfer experiments were performed for comparison with the method proposed in this paper. The parameters of the multi-source based TCA model are set as kernel_type = 'rbf', dim = 23, lamb = 1, gamma = 6. TCA narrows the marginal probability distribution discrepancy between the source and target domains, reducing the MMD between S1 and S2 from 1.14 to 0.99, and that between S3 and S2 from 0.42 to 0.15. However, the different MMD values indicate that S1 and S3 are not equally similar to S2. When only S1 is used as the source domain to predict the hardness of S2, the predicted values differ significantly from the measured values and the prediction accuracy is low. The reasons are mainly twofold. On the one hand, the large MMD indicates that the marginal distribution difference is still large after transfer, which does not satisfy the machine learning modeling condition well. On the other hand, the samples in S2 are low-temperature tempered, which decomposes the residual austenite and reduces the hardness, whereas most of the samples in S1 are not tempered, which prevents the model from fully learning the effect of tempering on hardness from the source domain. In particular, the deviation is larger in the high hardness range. When only S3 is used as the source domain to predict the hardness of S2, the predictions are significantly better than those based on S1. This is because the MMD between S3 and S2 is significantly smaller, indicating that the marginal probability distributions are more similar, which yields higher prediction accuracy. For samples with measured values above 55 HRC, the model predictions shift significantly toward lower hardness. Since most of the samples in S3 are high-temperature tempered, the model learns from them that the higher the tempering temperature, the lower the hardness. Some samples with large amounts of Cr, V and other elements have good tempering resistance and maintain high hardness at high temperatures, leading to significantly higher predicted values for low-hardness samples tempered at low temperatures.
The method in this paper combines S1 and S3 as source domains, with different fitness values α_S1 = 0.13 and α_S3 = 0.87. The predictions achieve higher accuracy, with a correlation coefficient (R) of 0.96 and a mean absolute error (MAE) lower than 2.4 HRC, which is at least a 30% increase in R and a 69% decrease in MAE compared with the single-source models.
Then, to further demonstrate the effect of the proposed method, kernel ridge regression (KRR), multilayer perceptron regression (MLPR) and random forest (RF) models are built for comparison with the SVR model (the parameters are shown in table 2). The results are shown in figure 5. Comparing the predictions of the six single-source experiments and the three multi-source experiments shows that the multi-source based models have significantly improved accuracy. Since the two source domains with different fitness are jointly transferred to the target domain, the predictions are improved to different degrees. In all multi-source experiments, the SVR models show the highest accuracy. Therefore, multi-source based TCA+SVR is chosen as an effective method for property prediction.
Models based on single and multiple sources are used to predict the hardness of samples 1# to 9# quenched at 950 °C and tempered at 250 °C. The compositions, heat treatment process and (Cr+V)/C form the feature space that is input to the model. Among the three single-source experiments, the model based on S3 has the highest prediction accuracy, with the highest R and the lowest MAE, while the model based on S2 has the lowest prediction accuracy, as shown in figure 6(a). This shows that reducing the marginal probability distribution discrepancy by TCA can improve the prediction accuracy of models with smaller MMD. As shown in figures 6(b)∼(d), Cr, V and Mo are the main alloying elements in S3, and the effect of these elements on hardness can be learned. However, the model learns mostly the rules of hardness variation under high-temperature tempering from the samples in S3, while the samples in the target domain are tempered at low temperature. During knowledge transfer, the predictions of some samples are biased due to the different conditional probability distributions. Most of the samples in S1 are not tempered, so the model learns too little about the effect of the tempering temperature on hardness from S1, leading to errors. The tempering temperature of the samples in S2 is the closest to that of the samples in the target domain, but the major alloying elements differ significantly. So the effects of Cr, V and Mo on hardness learned from S2 are insufficient, resulting in the largest errors.
The proposed multi-source based model can synthesize knowledge from different domains. The knowledge learned from S3, which has the highest fitness, contributes the most to the prediction. To further improve the prediction accuracy, the rules of low-temperature tempering learned from S2 and the effects of alloying elements on hardness learned from S1 are used to complement the deficiencies of the knowledge learned from S3. Therefore, the method proposed in this paper improves the prediction accuracy of the model, with R of 0.98 and MAE reduced to 1.4 HRC (a 39% reduction). Moreover, the deviation of the prediction on each sample is small.
Experiments show that the proposed multi-source based TCA+SVR model can predict the samples in the target domain with high accuracy even when the source domain does not contain samples related to the target domain.

Importance analysis of design variables
To judge the influence of C, Cr, V and Mo on the hardness of the C-Cr-V-Mo steel, an RF model is developed to rank the feature importance. The predicted values are assigned to the samples as pseudo-labels, and then the importance is calculated, as shown in figure 7(a). The effect of the Si, Mn, S and P alloying elements on hardness is small enough to be ignored. The importance order is Cr (0.24709) > C (0.240052) > V (0.183548) > Mo (0.180870).
This indicates that Cr and C have the greatest influence on the hardness of the experimental samples, together accounting for close to half of the total importance. C is the essential element for carbide formation and solution strengthening. As the C content increases, the amount of carbide formed increases, and so does the hardness. However, excessive Cr hinders C diffusion and stabilizes the austenite, resulting in more residual austenite and lower hardness after quenching, so the hardness decreases with increasing Cr content. In addition, the design range of C in this group of samples is relatively small ([1.5, 2.1] wt%), while the range of Cr is relatively large ([4, 12] wt%), so the RF model is more sensitive to changes in Cr and the importance value of Cr is slightly higher than that of C. V and Mo have a slightly lower influence on the hardness than the former two. In the design range (V [2, 4] wt%, Mo [0.5, 2.5] wt%), the hardness decreases slightly with increasing V owing to insufficient hardenability. The change with increasing Mo is not obvious, and the highest hardness occurs at 1.5% Mo.
With increasing total amount of alloying elements, the hardness generally shows a downward trend, as shown in figure 7(b). When the C content is 1.5% or 1.8%, the hardness decreases rapidly once the total amount of alloying elements exceeds 12%. This is because the mass fraction ratio of Cr, V and Mo (after equivalent conversion) to C within the design range fluctuates in the range [8.4, 11.1], which deviates significantly from the design sweet zone ([3, 6]) [29] and does not meet the conditions for obtaining higher hardness. However, for the samples with 2.1% C, even when the total amount of alloying elements exceeds 12%, the mass fraction ratio of Cr, V and Mo (after equivalent conversion) to C lies in the range [4.8, 6.8], which is still in the design sweet zone, so the hardness is maintained at a high level.

Matrix prediction
The matrix of the samples in the source domain can be classified as ferrite, pearlite, bainite or martensite, of which martensite has a higher hardness and is the ideal matrix for the design of wear-resistant steel, as shown in figure 8(a). The predicted values are used as the pseudo-labels assigned to samples 1#∼9#, and then the weighted Mahalanobis distance between each sample and the cluster centers of ferrite, pearlite, bainite and martensite is calculated, as shown in figure 8(b). According to the maximum membership principle, the matrix is predicted to be predominantly martensite for samples 1# to 9#; that is, all samples have the smallest distance to the cluster center of martensite. This indicates that the matrix of the designed C-Cr-V-Mo steels is predominantly martensite after quenching and tempering. Therefore, the differences in hardness and wear resistance between samples are mainly due to the type, size and quantity of carbides.

Abrasion loss prediction of C-Cr-V-Mo steel
3.4.1. Abrasion loss prediction for different tempering temperature task
The SVR model for abrasion loss prediction is trained on the samples quenched at 850 °C and 900 °C and tempered at 250 °C, and is used to predict the samples quenched at 950 °C and tempered at 250 °C.
The samples are all obtained under the same experimental conditions, so there is no need to consider the effect of experimental conditions in the selection of features. In addition to the effects of compositions and heat treatment process on wear resistance, the hard carbide phases formed in C-Cr-V-Mo alloy steel and the proportion of VC are closely related to wear resistance. Therefore, (Cr+V)/C, which reflects the characteristics of the ideal carbides, and V/Cr, which reflects the proportion of VC in the total carbides, should be included in the feature space. Although there is no linear relationship between hardness and wear resistance, both reflect the macroscopic properties of the microscopic matrix and carbides; that is, the two are closely related at the microscopic scale. Hence, hardness should also be considered as one of the features. A 13-dimensional feature space is formed by 8 compositions, 2 heat treatment parameters, (Cr+V)/C, V/Cr and hardness, which does not require further dimension reduction. To verify the effect of adding (Cr+V)/C, V/Cr and hardness to the feature space on the performance of the model, a model taking only the compositions and heat treatment process as input, as in the literature [2][3][4][5][6], is built for comparison with the proposed method. The model is trained and validated using the leave-one-out method, and the trained model is used to predict the abrasion loss of the samples in the test set, as shown in figure 9.
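A minimal sketch of how the 13-dimensional feature vector could be assembled is given below. The element list and its order, the function name `build_features`, and the example values are assumptions for illustration; only the composition of the feature space (8 compositions, 2 heat treatment parameters, two ratios and hardness) follows the text.

```python
# Assumed order of the 8 composition features (wt%).
COMPOSITION = ["C", "Cr", "V", "Mo", "Si", "Mn", "S", "P"]

def build_features(comp, quench_t, temper_t, hardness):
    """Return [8 compositions, QT, TT, (Cr+V)/C, V/Cr, hardness]."""
    if comp["C"] <= 0 or comp["Cr"] <= 0:
        raise ValueError("C and Cr must be positive to form the ratios")
    base = [comp[el] for el in COMPOSITION]
    carbide_ratio = (comp["Cr"] + comp["V"]) / comp["C"]   # (Cr+V)/C
    vc_ratio = comp["V"] / comp["Cr"]                      # V/Cr
    return base + [quench_t, temper_t, carbide_ratio, vc_ratio, hardness]

# Toy sample, for illustration only.
comp = {"C": 2.1, "Cr": 4.0, "V": 3.0, "Mo": 1.5,
        "Si": 0.5, "Mn": 0.5, "S": 0.01, "P": 0.01}
x = build_features(comp, quench_t=950.0, temper_t=250.0, hardness=62.0)
print(len(x), round(x[10], 3), x[11])
```

In the zero-shot setting the hardness entry would be the value predicted by the hardness model rather than a measurement.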
After hardness, (Cr+V)/C and V/Cr are added to the feature space, the training accuracy of the SVR model improves to R of 0.95 and MAE of 7.12 mg, which is better than that of the model taking only compositions and heat treatment process as features (R = 0.85, MAE = 22.35 mg). This shows that hardness, (Cr+V)/C and V/Cr are features closely related to abrasion loss, and that the accuracy of the model can be improved by adding them to the feature space. As can be seen from figure 9(a), the abrasion loss of most samples is lower than 130 mg, and the predicted values in this range are very close to the measured values. For samples with an abrasion loss of more than 130 mg, the deviation between the predicted and measured values is slightly larger. This is because the number of samples with large abrasion loss is too small to train the model well. With more training samples, the prediction accuracy is expected to improve further.
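The leave-one-out procedure used for training and validation can be sketched as follows. To keep the example self-contained, a 1-nearest-neighbour regressor stands in for the SVR model; this substitution, the function names and the toy data are assumptions for illustration.

```python
def loo_predictions(xs, ys, fit_predict):
    """Predict each sample with a model fitted on all the other samples."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        preds.append(fit_predict(train_x, train_y, xs[i]))
    return preds

def nearest_neighbour(train_x, train_y, x):
    """Stand-in regressor (not SVR): return the label of the closest training sample."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_x]
    return train_y[dists.index(min(dists))]

# Toy one-feature data set.
xs = [[1.0], [2.0], [3.0], [10.0]]
ys = [10.0, 20.0, 30.0, 100.0]
preds = loo_predictions(xs, ys, nearest_neighbour)
print(preds)  # [20.0, 10.0, 20.0, 30.0]
```

The isolated sample at x = 10 is predicted worst, which mirrors the observation that the few samples with large abrasion loss show the largest deviations.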
As shown in figure 9(b), when the trained SVR model is used to predict the abrasion loss of the test set, R is 0.95 and MAE is 5.23 mg. This shows that the trained model has good generalization ability and high prediction accuracy on the test set, and can predict the abrasion loss of samples quenched at high temperature using samples quenched at low temperature.

Importance analysis of features
The importance of each feature for abrasion loss is obtained based on RF. Since Mn, Si, S, P and the tempering temperature (TT) do not change numerically across the samples, their importance values are zero. The S, P, Mn and Si contents are too small to tune the properties and these elements are not main design elements, so they have little effect on the abrasion loss. Although tempering affects the decomposition of martensite and the precipitation of secondary carbides, and thus changes the wear resistance, a uniform tempering temperature of 250 °C is adopted for all samples in this example, so the effect of TT on abrasion loss is not considered. Figure 10 shows the importance values of the other features. The importance of the four design variables from high to low is Cr > V > C > Mo. The smallest value, that of the quenching temperature (QT), indicates that the matrix and carbides do not change obviously when quenching in the interval 850 °C ∼ 950 °C, so QT does not significantly affect the abrasion loss.
In general, the sum of the importance of hardness, (Cr+V)/C, V/Cr, Cr and V is more than 90%, so the abrasion loss of C-Cr-V-Mo steel is mainly affected by these five features. Among them, hardness has the highest importance value, indicating that hardness is the most relevant feature for the abrasion loss and that the machine learning model can fit the nonlinear mapping between hardness and abrasion loss well. It is found that hardness, (Cr+V)/C, V/Cr, Cr and V are mutually coupled cross-scale features. Cr and V are the design variables and the smallest-scale features; after they combine with C to form carbides, the microstructure-scale features (Cr+V)/C and V/Cr are derived. The microstructure affects the macroscopic hardness, and the relationship between hardness and alloying elements has been analyzed in section 3.3.2. When one of these features changes, it causes the others to change. (Cr+V)/C and V/Cr, as intermediate bridges connecting the three scales, not only contain the information of the design variables Cr, V and C, but also reflect their mass fraction proportions. They are the most informative integrated features and are closely related to abrasion loss.
(Cr+V)/C is a feature reflecting the characteristics of the carbides in C-Cr-V-Mo steel and has a design sweet zone: when (Cr+V)/C is near the optimal value (≈4.2), ideal (Fe,Cr)₇C₃ and VC phases can be obtained. When it deviates from the optimal value, the type, size and distribution of the carbides change and affect the abrasion loss. The microstructures with (Cr+V)/C (excluding the small amount of C consumed by Mo) in the sweet zone are shown in figures 11(a)∼(c) and the elements of the precipitated phases are shown in table 3. Carbides with large sizes and distinct grid-like growth directions precipitate in the a and e regions, and their Cr content is significantly higher than that of the other alloying elements. Excluding the small amount of C consumed by V and Mo, the Cr/C in the a region is about 3.8, which is close to the mass fraction ratio of the chemical formula (Fe,Cr)₇C₃. The Cr/C of the e region is about 2.4; the carbides in this region are mainly (Fe,Cr)₇C₃, and a certain amount of C is dissolved in the matrix. The carbides precipitated in the b and f regions are short rods, and V is clearly dominant in these regions. Excluding the small amount of C consumed by Cr and Mo, the V/C is 4.3 in the b region and 4.9 in the f region, which is close to the mass fraction ratio of the VC chemical formula, so it can be inferred that the carbides are VC. The d and g regions are dominated by Mo, indicating that the precipitation of Mo₂C is more pronounced when the Mo content is above 1.5%.
V/Cr reflects the ratio of V-rich carbides to Cr-rich carbides. As V/Cr decreases, so does the proportion of VC in the carbides. As shown in figures 11(d)∼(f), the amount and area of VC in the sample with V/Cr of 0.5 are significantly higher than in the sample with V/Cr of 0.33 for the same C content.
The hardness of VC is higher than that of (Fe,Cr)7C3, and the two carbides differ in shape, so different proportions of the two produce differences in abrasion loss, as shown in figure 12. The samples are divided into three groups, Y1, Y2 and Y3, based on the mass fraction of C. The C content of group Y1 (1#∼3#) is 1.5%, that of group Y2 (4#∼6#) is 1.8%, and that of group Y3 (7#∼9#) is 2.1%. The lowest value of (Cr+V)/C among the samples of each group is close to the optimal value. The C content of group Y1 is low, so when the total amount of alloying elements is large, the maximum (Cr+V)/C of this group is much higher than the optimal value and deviates from the design sweet zone. The C content of group Y2 is slightly higher than that of group Y1, so the deviation of the maximum (Cr+V)/C from the optimal value is smaller. Group Y3 has the highest C content, and the deviation of its maximum (Cr+V)/C from the optimal value is the smallest, but it also exceeds the design sweet zone. From the perspective of wear resistance, the samples in each group whose (Cr+V)/C is closest to the optimal value (1#, 4#, 7#) have the smallest abrasion loss, while samples 3#, 6# and 9#, which lie beyond the design sweet zone, show significantly increased abrasion loss.
In each group, the abrasion loss gradually increased as V/Cr gradually decreased. Moreover, the proportion of V in sample 1# is lower than that in samples 4# and 7#, so its abrasion loss is larger than theirs, indicating that appropriately increasing the V content is conducive to improving the wear resistance.
Therefore, it can be concluded that C-Cr-V-Mo steel with good wear resistance should be designed with a high C content, a (Cr+V)/C close to the optimal value of 4.2, and a larger V/Cr. Accordingly, the sample with 2.1% C, (Cr+V)/C of 3.6 and V/Cr of 0.75 should have the best wear resistance.

Conclusions
In the present study, the hardness of C-Cr-V-Mo steel was predicted by a multi-source based TCA + SVR model, and its matrix was classified by weighted Mahalanobis distance. Then, the abrasion loss of samples quenched at high temperature was predicted, and the vital features were analyzed. The following conclusions are obtained:
1. Owing to the differences in MMD after transfer, the proposed fitness parameter α made the multiple sources affect the prediction to varying degrees. The multi-source based zero-shot TCA + SVR model achieved high accuracy in hardness prediction for C-Cr-V-Mo steel, with R of 0.98 and MAE less than 1.4 HRC.
2. The importance of the design variables for hardness, as ranked by RF, was Cr > C > V > Mo in descending order. The hardness increased with increasing C content and decreased with increasing total content of the alloying elements Cr, V and Mo. When the (Cr+V)/C of a sample stayed within the design sweet zone [3, 6], the hardness could be maintained at a high level.
3. The matrix was quickly classified as martensite according to the maximum membership derived from the distance between the sample and the cluster centers of the matrix.
4. The abrasion loss of the samples quenched at high temperature was predicted using the samples quenched at low temperature, with R of 0.95 and MAE of 5.23 mg. Hardness had the greatest effect on the abrasion loss of C-Cr-V-Mo steel and was coupled with (Cr+V)/C and V/Cr. A (Cr+V)/C near the optimal value together with a higher V/Cr was conducive to good wear resistance.
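The distance-based matrix classification summarized in conclusion 3 can be sketched as a nearest-cluster-center rule. The snippet below is only an illustration under stated assumptions: it uses a weighted Euclidean distance (a Mahalanobis distance with a diagonal covariance) as a stand-in for the weighted Mahalanobis distance, treats the minimum distance as equivalent to the maximum membership, and uses hypothetical features, weights, cluster centers and the placeholder label `other_matrix`.

```python
import math


def weighted_distance(sample, center, weights):
    """Weighted Euclidean distance: each squared feature difference is scaled by its weight."""
    return math.sqrt(sum(w * (s - c) ** 2 for s, c, w in zip(sample, center, weights)))


def classify_matrix(sample, centers, weights):
    """Return the label of the closest cluster center and all distances."""
    distances = {
        label: weighted_distance(sample, center, weights)
        for label, center in centers.items()
    }
    closest = min(distances, key=distances.get)
    return closest, distances


# Hypothetical two-feature space: (hardness in HRC, (Cr+V)/C).
# Centers and weights are placeholders, not values from the study.
centers = {"martensite": (60.0, 4.0), "other_matrix": (35.0, 8.0)}
weights = (0.04, 1.0)

label, distances = classify_matrix((58.0, 4.5), centers, weights)
print(label, distances)
```

A distance table of this kind, computed for every sample and every cluster center, is what a distance heat map would visualize.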

Figure 1. Procedure of multi-source based TCA + SVR modeling and prediction.

Figure 2. (a) Three source domains divided by clustering; (b) The dispersion of features in S1, S2 and S3. In order to compare the dispersion of features on the same order of magnitude, the mass fractions of S, P, Ca, Mg and Al are magnified by 10 times, that of Ce by 100 times, and those of B and Y by 1000 times.

Figure 3. Distribution density of alloy elements in the source domain. (a) Distribution density of C-Cr; (b) Distribution density of C-V; (c) Distribution density of C-Mo; (d) Distribution density of Cr-V.

Figure 4. Predicted results of cross-transfer task.

Figure 5. Experimental results for TCA with single and multiple sources. (a)∼(c) The R of four ML models; (d)∼(f) The MAE of four ML models.

Figure 7. (a) Regulation of hardness with various contents of C, Cr, Mo, and V, and importance values. (b) The change of hardness with the total amount of Cr, V and Mo.

Figure 8. (a) Relations between hardness and matrix of the samples in source domains; (b) Distance heat map from the 1# to 9# samples to the cluster centers of the four matrices.

Figure 9. (a) The predicted abrasion loss on the training set with hardness, (Cr+V)/C and V/Cr added to the features; (b) The predicted abrasion loss on the training set taking compositions and heat treatment process as features; (c) The predicted abrasion loss on the test set with hardness, (Cr+V)/C and V/Cr added to the features;

Figure 10. The importance of features for abrasion loss.

Figure 12. Group analysis according to C content. (a) (Cr,V,Mo)/C in the 1#∼9# samples; (b) Weight loss in the abrasive wear test.

Table 1. Designed compositions for orthogonal experiment.