Investigation of effect of processing parameters of 3D printed NHS/EDC crosslinked carboxy methyl cellulose/gelatin hydrogels with machine learning techniques

This study examines how the preparation conditions of gelatin/carboxymethyl cellulose (CMC) composites affect their mechanical properties, using the extreme gradient boosting (XGB) machine learning algorithm. The effects of the weight fraction of CMC and graphene oxide (GO), as well as the concentration of ethyl(dimethylaminopropyl)carbodiimide (EDC)/N-hydroxysuccinimide (NHS), on modulus, % strain at break and ultimate tensile strength (UTS) were studied. The dataset is analyzed with a correlation heatmap, feature importance assessment, model performance evaluation, and the Shapley Additive Explanation (SHAP) technique. The relationship between the independent parameters and the mechanical properties gives insight into the material's ductility, flexibility, and modulus. Feature importance demonstrates that NHS/EDC concentration has the highest impact on the mechanical properties. Increasing the EDC/NHS concentration drastically raises the modulus and UTS but reduces the flexibility of the nanocomposites. CMC improves flexibility but reduces UTS and modulus. GO improves % strain at break, UTS and modulus up to 1 wt% GO; higher GO loadings reduce the mechanical performance. With lower concentrations of NHS/EDC, the mechanical properties can be tailored for soft tissue engineering applications. The study highlights the importance of optimizing material compositions for tissue engineering applications.


Introduction
Machine learning is a branch of artificial intelligence [1-3]. The generally accepted definition of machine learning is that computers gain the ability to perform specific tasks without the need for explicit programming dedicated to those tasks [2]. According to Tom Mitchell, machine learning involves the development of computer programs that enhance their performance automatically through experience [3]. Some of the most commonly used machine learning algorithms are linear regression, artificial neural networks, support vector machines, random forest and extreme gradient boosting (XGB) [1].
XGB is a relatively new ensemble learning method [4]. The XGB regressor is an ensemble gradient boosting algorithm developed by Chen and Guestrin in 2016 [5,6]. The model adds trees sequentially, each fitted after evaluating the previous trees, to build a strong learner from weak learners [7]. Predictions are made by summing the scores of the corresponding leaf nodes across the trees [8]. XGB is known for its efficiency and prediction accuracy, making it a common choice in regression tasks. To prevent over-fitting to outliers, XGB applies a second-order Taylor expansion to the loss function and adds a regularization term to the objective function [5-7,9].
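For reference, the second-order expansion and the regularized objective mentioned above can be written out as follows. This is the standard textbook form of the XGB objective; the symbols (g_i, h_i for the first and second derivatives of the loss, T for the number of leaves, w_j for leaf scores, γ and λ for the regularization strengths, I_j for the samples in leaf j) are notation assumed here and do not appear elsewhere in this study.

```latex
\begin{aligned}
\mathcal{L}^{(t)} &= \sum_{i=1}^{n} l\big(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t),
\qquad \Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2} \\
\mathcal{L}^{(t)} &\approx \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \Big] + \Omega(f_t) \\
g_i &= \partial_{\hat{y}^{(t-1)}}\, l\big(y_i, \hat{y}_i^{(t-1)}\big), \qquad
h_i = \partial^{2}_{\hat{y}^{(t-1)}}\, l\big(y_i, \hat{y}_i^{(t-1)}\big) \\
w_j^{*} &= -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}
\end{aligned}
```

The last line gives the optimal score of leaf j for a fixed tree structure; a larger λ shrinks the leaf scores, which is how the regularization term limits the influence of individual samples.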
3D printing has emerged as a revolutionary technology with vast potential in the field of tissue engineering [10,11]. This innovative approach enables the precise fabrication of complex three-dimensional structures, offering the capability to mimic the intricate microarchitecture of biological tissues. By integrating biomaterials, cells, and growth factors, 3D printing holds the promise of revolutionizing regenerative medicine, ushering in a new era of customized and functional tissue constructs for transplantation and therapeutic purposes [10,12].
Gelatin, a natural biopolymer derived from collagen, has garnered significant attention in tissue engineering due to its biocompatibility and biodegradability [13,14]. This versatile biomaterial offers a promising scaffold for regenerative applications, as its composition closely resembles that of native extracellular matrices. However, to enhance the mechanical stability and long-term performance of gelatin scaffolds, crosslinking techniques such as N-Hydroxysuccinimide (NHS) and N-(3-Dimethylaminopropyl)-N'-ethylcarbodiimide (EDC) activation are employed. These crosslinkers facilitate the formation of covalent bonds, reinforcing the gelatin structure and enabling the creation of robust tissue-engineered constructs with tailored properties for regenerative medicine [15].
Carboxymethyl cellulose (CMC) has emerged as a valuable biomaterial in tissue engineering due to its biocompatibility and versatile properties [16]. Derived from cellulose, CMC offers a biodegradable, water-soluble matrix that can be tailored to mimic the extracellular environment of various tissues. Its ability to control drug release, support cell adhesion, and promote tissue regeneration makes CMC a promising candidate for scaffolds and drug delivery systems in tissue engineering applications [17,18].
Graphene oxide, a derivative of graphene, has garnered significant interest in the field of tissue engineering due to its unique combination of properties [19,20]. Its two-dimensional, nanoscale structure offers a high surface area for cell adhesion and growth, making it an attractive material for scaffolds [20]. Additionally, graphene oxide's versatility in drug delivery, bioimaging, and biofunctionalization enhances its potential as a multifunctional component in tissue engineering applications, with the ability to support regenerative processes and improve therapeutic outcomes [16].
Composite materials are extensively used in tissue engineering applications, with their mechanical properties, notably modulus, ultimate tensile strength and ductility, playing a critical role in determining their suitability for specific uses. This study explores the mechanical properties of 3D printed gelatin/carboxymethyl cellulose composites, offering insights into the factors that influence modulus, % strain at break and ultimate tensile strength (UTS). Our investigation involves collecting data on the concentrations of the key components, namely NHS/EDC, carboxymethyl cellulose (CMC) and graphene oxide (GO), and their impact on mechanical properties.
The core of our analysis lies in the application of an XGB regression model within the Python programming framework [21]. This approach reveals the intricate relationship between EDC/NHS, CMC, and GO concentrations and mechanical properties (modulus, % strain at break and UTS), which is difficult to fully comprehend solely by experiments or by statistical approaches such as the Taguchi method or design of experiments (DOE) because of its complexity [22]. In this study, the feature importance of each independent variable for the mechanical properties is determined, and the impact of varying their values is studied by SHAP analysis. Predictions are then made to comprehensively evaluate the effect of each parameter on the mechanical responses. These findings are crucial for optimizing composite materials to meet specific application requirements. By tailoring the processing parameters according to the results of this study, the samples may be made suitable for soft tissue engineering applications such as cardiac, cartilage or skin tissue engineering.

Data collection
In this study, gelatin/CMC composites were prepared by 3D printing. GO was also added to some formulations. All prepared composites were covalently crosslinked with various concentrations of NHS/EDC. The independent variables in this study are the concentrations of CMC (5-15 wt%), GO (1-1.5 wt%) and NHS/EDC (5-20 mM). The dependent variables are modulus, % strain at break and ultimate tensile strength (UTS); the data were collected from experiments. Detailed parameters are provided in table S1 in the supplementary file.

Sample preparation
To create GEL-CMC polymeric inks, gelatin was dissolved in distilled water at 50 °C for 20 min. Subsequently, CMC powder was gradually added to the gelatin solution, and the mixture was stirred with a magnetic stirrer at 50 °C until a homogeneous solution was achieved. For groups involving GO, GO was added to the gelatin solution based on the gelatin concentration, followed by sonication for 2 h. CMC was then added to the solution. The prepared compositions were 7.5 wt% of GEL with 7.5, 10 and 15 wt% CMC. GO was also added at 0.5, 1 and 1.5 wt% to the 7.5 wt% GEL, 10 wt% CMC hydrogels. Prior to printing, GEL-CMC polymeric inks underwent centrifugation at 4000 rpm for 4 min to eliminate bubbles, ensuring a homogeneous polymeric solution for a smooth printing process [16].

3D printing of the scaffolds
Cardiac scaffolds were produced using the Axolotl Biosystem A1 Bioprinter. The square 3D GEL-CMC scaffolds were designed in Solidworks computer-aided design (CAD) software with standardized dimensions of 10 mm × 10 mm × 2 mm. Printing parameters included a speed of 4 mm s⁻¹ and a 21 G needle. The printing temperature of the samples was optimized using a rheometer (Anton Paar, MCR302, Graz, Austria) [23]. The extrusion pressures were 20, 45, 82 and 38 psi and the printing temperatures were 36 °C, 50 °C, 75 °C and 50 °C for the 7.5GEL-7.5CMC, 7.5GEL-10CMC, 7.5GEL-15CMC and 7.5GEL-10CMC-1GO inks, respectively.

Crosslinking with EDC/NHS
After printing, scaffolds were crosslinked by immersion in EDC:NHS solutions at a 5:1 ratio (in 90% ethanol) containing 100 mM:20 mM, 50 mM:10 mM, or 25 mM:5 mM EDC:NHS for 1 h at 4 °C. To halt the crosslinking process, scaffolds were immersed in a 0.1 M disodium hydrogen phosphate (Na₂HPO₄, Merck) solution for 2 h. Subsequently, they underwent three washes with PBS and deionized water [15].

Mechanical testing
The mechanical properties of the scaffolds were evaluated using a Universal Test Machine (Zwick/Roell Z 100) equipped with a 500-N load cell. Tensile properties were assessed at a crosshead speed of 5 mm min⁻¹ on specimens with dimensions of 10 mm × 10 mm × 2 mm and an infill density of 30%. Young's modulus was calculated from the linear segment of the stress-strain curves. Each experimental group consisted of five repeated samples.
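As an illustration of how the three responses can be read off a single stress-strain curve, the following is a minimal, dependency-free sketch. It assumes the modulus is the least-squares slope over an analyst-chosen linear region; the function name, the `linear_limit` cut-off, the units and the sample curve are hypothetical and are not taken from the study.

```python
def tensile_summary(strain_pct, stress, linear_limit):
    """Return (modulus, uts, strain_at_break) for one stress-strain curve.

    strain_pct: strain values in %, in ascending order.
    stress: stress values (any consistent unit, e.g. kPa).
    linear_limit: largest strain (%) treated as the linear region; this
    cut-off is an assumption of the sketch, not a value from the study.
    """
    # Keep only points in the assumed linear region, strain as a fraction.
    pts = [(e / 100.0, s) for e, s in zip(strain_pct, stress) if e <= linear_limit]
    if len(pts) < 2:
        raise ValueError("need at least two points in the linear region")
    n = len(pts)
    mean_e = sum(e for e, _ in pts) / n
    mean_s = sum(s for _, s in pts) / n
    sxx = sum((e - mean_e) ** 2 for e, _ in pts)
    sxy = sum((e - mean_e) * (s - mean_s) for e, s in pts)
    if sxx == 0:
        raise ValueError("strain values in the linear region are all equal")
    modulus = sxy / sxx              # least-squares slope of stress vs. strain
    uts = max(stress)                # ultimate tensile strength
    strain_at_break = strain_pct[-1] # last recorded strain before failure
    return modulus, uts, strain_at_break


# Hypothetical curve: linear up to 6 % strain, then yielding and failure.
strain = [0, 2, 4, 6, 8, 10, 12]
stress = [0, 4, 8, 12, 15, 17, 16]
modulus, uts, strain_at_break = tensile_summary(strain, stress, linear_limit=6)
```

With five repeated samples per group, the same function would be applied to each curve and the results averaged.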

Computational modeling
All machine learning algorithms were implemented using Python with Pandas, Numpy, Scipy, Matplotlib, Seaborn, and Scikit-learn. The data analysis employed an XGB regression model [24]. The Python codes used in this study can be found at https://github.com/duyguege/machine-learning.git.

XGB regressor
The dataset was fitted to the XGB regressor model using the Scikit-learn interface in Python.

Training, hyper-tuning, and validation processes
The dataset was split into training (80%) and test (20%) sets. The training set was used for model development, while the test set was used for evaluation. The Python Scikit-learn library was utilized to apply the models. Optimal parameters for the XGB model were determined by hyperparameter tuning of the subsample ratio of columns, number of estimators, maximum depth, and learning rate (shrinkage factor). To prevent underfitting and overfitting, the parameters were selected so that the model performed well on both the training and test sets [25]. For performance evaluation, 10-fold cross-validation was employed, randomly dividing the dataset into ten equal folds [26]. The model was then used for the final prediction of modulus values [27].
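The splitting scheme described above can be sketched without any library as follows. This is only an illustration of the index bookkeeping behind an 80/20 split and 10-fold cross-validation; the study itself used Scikit-learn, and the function names, the seed and the sample count of 50 are hypothetical.

```python
import random


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle the sample indices and return (train, test) index lists."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n_samples * test_fraction))
    return idx[n_test:], idx[:n_test]


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for k roughly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Fold sizes differ by at most one when n_samples is not a multiple of k.
    folds = [idx[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val


# Hypothetical dataset of 50 samples.
train_idx, test_idx = train_test_split_indices(50, test_fraction=0.2)
folds = list(k_fold_indices(50, k=10))
```

Each sample appears in exactly one validation fold, so every observation is used for validation once and for training nine times.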

Correlation heatmap
A correlation heatmap was generated using the Seaborn module in Python to assess the strength of the relationship between the three independent factors (NHS/EDC, CMC and GO concentration) and the dependent variables (modulus, % strain at break and UTS). A high absolute correlation coefficient (|r|) between two independent variables would indicate multicollinearity.
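Each cell of such a heatmap is a Pearson correlation coefficient between two columns. A minimal sketch of that calculation is given below, assuming Pearson's r is the coefficient plotted; the column names and values are hypothetical and only serve to show the sign convention.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """columns: dict of name -> list of values. Returns a nested dict of r."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical values: strain at break falls as crosslinker concentration rises.
data = {
    "edc_mM": [5, 5, 10, 10, 20, 20],
    "strain_pct": [60, 58, 45, 47, 30, 28],
}
matrix = correlation_matrix(data)
```

A value close to −1 or +1 marks a strong linear relationship, while a value near 0 marks a weak one.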

Feature importance
The importance of the features was calculated using an integrated function in the Scikit-learn implementation of the XGB model. Features were ranked based on their importance [28,29].
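To give a feel for what a feature ranking measures, the sketch below uses permutation importance, a model-agnostic technique that is swapped in here for illustration. It is not the importance score built into the XGB model used in the study. The toy model, its coefficients and the data rows are hypothetical.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def permutation_importance(model, X, y, seed=0):
    """Increase in RMSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    base = rmse(y, [model(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)
        # Rebuild the rows with only this column permuted.
        X_perm = [list(row[:col]) + [v] + list(row[col + 1:]) for row, v in zip(X, shuffled)]
        importances.append(rmse(y, [model(row) for row in X_perm]) - base)
    return importances


def toy_model(row):
    """Hypothetical stand-in for a fitted regressor; it ignores the GO column."""
    edc, cmc, go = row
    return 2.0 * edc - 0.5 * cmc


# Hypothetical rows of [EDC/NHS, CMC, GO] values.
X = [[5, 7.5, 0.0], [10, 10, 0.5], [20, 15, 1.0], [25, 5, 1.5],
     [40, 12, 2.0], [50, 8, 2.5], [75, 6, 3.0], [100, 9, 3.5]]
y = [toy_model(row) for row in X]
importances = permutation_importance(toy_model, X, y)
```

A feature the model does not use gets an importance of zero, while shuffling a feature the model relies on increases the error.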

Model performance assessment
Model performance was evaluated using the coefficient of determination (R²) and the root mean square error (RMSE); a higher R² together with a lower RMSE indicates better performance [30-32].
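The two metrics can be computed directly from their definitions, as in the short sketch below. The function names and the sample values are hypothetical; in practice the corresponding Scikit-learn functions would normally be used.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured values and predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]
mean_only = [2.5, 2.5, 2.5, 2.5]
```

A perfect prediction gives R² = 1 and RMSE = 0, whereas always predicting the mean gives R² = 0. Comparing the RMSE on the training and test sets is one simple way to spot overfitting: a clearly higher test RMSE suggests the model has adapted too closely to the training data.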

Shapley additive explanation
The Shapley Additive Explanation (SHAP) technique, introduced by Lundberg and Lee in 2017, is employed to understand complex relationships in machine learning models [33]. SHAP, a Python model interpretation tool, is used in this study to analyze the marginal relationship between predicted modulus values and each feature. SHAP values indicate the contribution of each feature value to the modulus prediction, with negative and positive values representing negative and positive contributions, respectively. A SHAP summary plot displays the impact of each parameter on modulus, with the primary y-axis showing SHAP values and the secondary y-axis displaying a color bar that indicates the feature value from low to high.
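The idea behind SHAP values can be illustrated by computing exact Shapley values for a model with three features. The sketch below makes a simplifying assumption: features missing from a coalition are set to a single baseline row, whereas the SHAP library averages over a background dataset. The toy model, its coefficients and the input values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition take their baseline value; using one
    baseline row is a simplification made for this sketch.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_modulus(features):
    """Hypothetical linear stand-in for a fitted modulus model."""
    edc, cmc, go = features
    return 10.0 + 2.0 * edc - 0.5 * cmc + 3.0 * go


x = [20, 10, 1]        # hypothetical sample: EDC/NHS, CMC, GO
baseline = [5, 10, 0]  # hypothetical reference sample
phi = shapley_values(toy_modulus, x, baseline)
```

The contributions add up to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that gives the method its name. A positive value pushes the prediction up and a negative value pushes it down.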

Results and discussion
Figure 1 shows the correlation heatmap for UTS, modulus, % strain at break and the independent parameters (EDC/NHS, CMC and GO concentrations), that is, the relationship between each pair of parameters for the CMC/gelatin/GO composites. According to the correlation heatmap, an increase of EDC/NHS reduces % strain at break, which is evident from the strongly negative correlation coefficient of −0.72. For GO, the correlation coefficient is positive but quite small (0.11). The correlation coefficient between CMC and % strain at break is very low (−0.059), which indicates that the effect of CMC is not as significant as that of NHS/EDC and GO.
For CMC, a negative correlation with modulus and UTS is observed. This probably indicates that as the wt% of CMC increased, UTS and modulus decreased. CMC has a lower UTS and modulus, and it may also disrupt the amide bonds between gelatin chains [16]. GO had a positive correlation coefficient with the modulus and UTS due to hydrogen bonding. This positive effect is much stronger for EDC/NHS due to the introduction of covalent bonds between gelatin chains [16].
Figure 2 shows (a) predicted and test % strain values, (b) feature importance from the XGB model, (c) RMSE of train and test for % strain at break and (d) SHAP values for the independent variables on prediction of % strain at break. Figure 2(a) shows an R² value of 0.72 for the fit to the test dataset. This value is relatively high and is considered a relatively good fit. According to figure 2(b), among the three independent parameters, EDC/NHS has the highest effect on % strain at break, followed by GO and CMC concentrations. This indicates that EDC/NHS reduces the flexibility of the composites due to the formation of amide bonds between gelatin chains [16]. GO and CMC usually improve the flexibility of the composites due to the introduction of hydrogen bonding. According to figure 2(c), a higher RMSE is observed for the test data than for the training data. This implies that the XGB model is overfitting, as it worked more successfully for the training data than for the test data. Figure 2(d) shows the SHAP values for the independent variables. Higher NHS/EDC values have lower SHAP values, which indicates that higher concentrations of NHS/EDC reduced the predicted % strain at break. On the other hand, lower concentrations of NHS/EDC improved the % strain at break. The GO and CMC inputs did not have such a distinct effect on the predictions. In general, the figure illustrates that higher concentrations of GO and CMC improved the % strain at break.
Figure 3 shows (a) predicted and test values of UTS, (b) feature importance from the XGB model, (c) RMSE of train and test for UTS and (d) SHAP values for the independent variables on the prediction of UTS. In figure 3(a), the fit to the training dataset yields an R² value of 0.70, indicating a relatively high and satisfactory fit. According to figure 3(b), among the three independent parameters, EDC/NHS has the highest effect on UTS. EDC/NHS improves the UTS due to the formation of amide bonds between gelatin chains [34]. This is followed by CMC and GO. CMC is observed to reduce the UTS. This is possibly because of interruption of the strong amide bonds in gelatin by the introduction of CMC or GO in the polymeric matrix [16]. According to figure 3(c), a higher RMSE is observed for the test data than for the training data. This implies that the XGB model is overfitting, as it worked more successfully for the training data than for the test data. Figure 3(d) shows SHAP values for CMC, EDC/NHS and GO for predicting UTS. The figure shows that low values of CMC lead to higher UTS values. This further supports the view that CMC may be disrupting the amide bonding between gelatin chains and reducing the UTS of the composite. On the other hand, as expected, high values of GO and EDC/NHS increased the predicted UTS.
Figure 4(a) shows an R² value of 0.88 for the fit to the training dataset. This value is quite high and is considered a good fit of the dataset. According to figure 4(b), among the three independent parameters, EDC/NHS has the highest effect on modulus. This is again due to the formation of strong amide bonds at higher concentrations of EDC/NHS [16,35]. This is followed by the effect of CMC and GO concentration. According to figure 4(c), a higher RMSE is observed for the test data than for the training data. This suggests that the XGB model is overfitting, given its superior performance on the training data compared to the test data. According to figure 4(d), high values of EDC/NHS and GO increased the predicted modulus. Higher CMC concentrations reduced the modulus, which is again due to disruption of amide bonding between the gelatin chains. The effect of GO on the SHAP value is not as distinct for modulus as that of NHS/EDC and CMC.
Figure 5 shows the predicted mechanical properties (UTS, % strain at break and modulus) with the variation of the independent parameters (EDC/NHS, CMC and GO concentration).
From figure 5, it is observed that as EDC/NHS concentration increases, both UTS and modulus increase, whereas % strain at break is reduced. A 26% reduction of % strain at break is observed when EDC/NHS increases from 5 to 20 mM. This shows that the material's ductility and flexibility drastically reduce with increasing EDC/NHS. This is because of the increased concentration of rigid covalent crosslinks between gelatin chains [16,35,36]. Therefore, for applications which require flexibility, the material becomes less suitable at high concentrations (20 mM) of EDC/NHS. For applications which require a high modulus without flexibility, such as bone regeneration, 20 mM EDC/NHS may be more suitable. Treatment of CMC/gelatin with lower concentrations of EDC/NHS would lead to softer scaffolds with mechanical properties suitable for soft tissue engineering applications such as cartilage, cardiac or skin regeneration [16].
CMC also has a significant role in mechanical properties. With increasing CMC concentration, % strain at break increases. This is possibly due to the increase of reversible hydrogen bonding, both between gelatin and CMC and between CMC chains, which improves flexibility [37]. However, both UTS and modulus decrease with increasing CMC concentration. This is due to the relatively lower modulus of CMC compared with gelatin. Additionally, CMC may be disrupting the strong amide bonds between gelatin chains, reducing the rigidity. Moreover, the formation of hydrogen bonds increases water uptake, which ultimately reduces modulus and UTS [16].
GO is observed to have a significant effect on mechanical performance. Up to 1 wt% of GO, % strain at break of the composites increases. This is because of the increase of hydrogen bonding between components in the presence of GO, which is rich in hydroxyl and carboxyl groups. However, a further increase of GO reduces % strain at break. This is possibly because of agglomeration of GO at this higher loading. Both UTS and modulus increase with increasing GO, owing to the inherently higher modulus and tensile strength of GO compared to both gelatin and CMC. By varying the CMC, NHS/EDC and GO concentrations, scaffolds with various mechanical properties could be produced for tissue engineering applications ranging from bone to skin tissue regeneration.
Moreover, although the XGB model overfits, some important information could be gathered from the dataset provided for CMC/gelatin composites. The overfitting could be reduced if more data were collected for the analysis of the mechanical properties of CMC/gelatin composites. This model helps to analyze the complex relationship of many independent variables with the mechanical properties of the scaffolds, which is not easily possible with experimental work alone. By altering the composition and crosslinker concentration, the prepared biomaterials may be tailored for soft tissue engineering applications. In the future, this kind of approach can also be used for cutting-edge applications such as 4D printing in cancer therapeutics or vascular sutures [38,39]. Overall, this approach shows promise as a powerful tool for optimizing physical and biological properties of biomaterials for tissue engineering applications.

Conclusion
In conclusion, our in-depth exploration of gelatin/CMC/GO composites through the XGB machine learning algorithm has revealed valuable insights into their mechanical properties. We have demonstrated the intricate relationships between CMC, NHS/EDC, and GO concentrations and the resulting effects on modulus, % strain at break, and ultimate tensile strength (UTS). High NHS/EDC leads to brittleness of the composites but improves UTS and modulus. CMC improves flexibility while reducing UTS and modulus. GO improves UTS, flexibility and modulus up to 1% concentration. While our study illustrates potential overfitting in the model, it highlights the need for additional data to refine our understanding of these composites. These findings offer a foundation for tailoring material compositions to soft tissue engineering applications such as cartilage, cardiac or skin regeneration.

ORCID iDs
Duygu Ege https://orcid.org/0000-0002-9922-6995

Figure 2. Analysis of the model on % strain data: (a) predicted and test % strain values, (b) feature importance from the XGB model, (c) RMSE of train and test for % strain at break, (d) SHAP values for independent variables on prediction of % strain at break.

Figure 3. Analysis of the model on UTS: (a) predicted and test values for UTS, (b) feature importance from the XGB model, (c) RMSE of train and test for UTS, (d) SHAP values for independent variables on prediction of UTS.

Figure 4. Analysis of the model on modulus data: (a) predicted and test values for modulus, (b) feature importance from the XGB model, (c) RMSE of train and test for modulus, (d) SHAP values for independent variables on prediction of modulus.

Figure 5. Predicted mechanical properties of the composites: (a) UTS versus EDC concentration, (b) % strain at break versus EDC concentration, (c) modulus versus EDC concentration, (d) UTS versus CMC concentration, (e) % strain at break versus CMC concentration, (f) modulus versus CMC concentration, (g) UTS versus GO concentration, (h) % strain at break versus GO concentration, (i) modulus versus GO concentration.