Cell viability prediction and optimization in extrusion-based bioprinting via neural network-based Bayesian optimization models

The fields of regenerative medicine and cancer modeling have witnessed tremendous growth in the application of 3D bioprinting. Maintaining high cell viability throughout the bioprinting process is crucial for the success of this technology, as it directly affects the accuracy of the 3D bioprinted models, the validity of experimental results, and the discovery of new therapeutic approaches. Therefore, optimizing bioprinting conditions, which include numerous variables influencing cell viability during and after the procedure, is of utmost importance to achieve desirable results. So far, these optimizations have been accomplished primarily through trial and error, by repeating multiple time-consuming and costly experiments. To address this challenge, we first created a dataset of these parameters for gelatin- and alginate-based bioinks and the corresponding cell viability by integrating data obtained in our laboratory with data derived from the literature. Then, we developed machine learning models to predict cell viability based on different bioprinting variables. The trained neural network yielded a regression R² value of 0.71 and a classification accuracy of 0.86. Compared with models developed so far, our models show superior predictive performance. The study further introduces a novel optimization strategy that employs a Bayesian optimization model in combination with the developed regression neural network to determine the optimal combination of the selected bioprinting parameters, maximizing cell viability and eliminating trial-and-error experiments. Finally, we experimentally validated the optimization model's performance.


Introduction
In the domains of regenerative medicine and cancer modeling, 3D bioprinting has witnessed rapid growth and is rising in popularity [1]. 3D bioprinting is a biofabrication process, a form of computer-guided additive manufacturing, that allows the development of extremely accurate and complex 3D constructs utilizing biological elements in a pre-designed configuration [2][3][4]. In extrusion-based bioprinting, the bioink, composed of cells and materials, is a key component of the process, as it serves as the building block for the final construct. In addition to the numerous studies conducted to explore various aspects of 3D bioprinting, such as bioink formulations with different materials, printability [5][6][7][8][9], structural integrity, and mechanical strength [10][11][12][13], several studies have focused specifically on the role of cells as an essential component of the bioink in the bioprinting process [9][14][15][16].
However, preserving high cell viability throughout this process is a significant challenge. Cell viability and functionality can be negatively impacted by the stress created in bioprinting procedures, which can alter protein expression levels and disrupt cell signaling networks [17]. This challenge is particularly critical in applications such as cancer research, as cell viability directly affects the accuracy of the 3D bioprinted model, the reliability of the printing results, and the development of new therapies using the 3D model. Consequently, it is important to address the potential damage to and loss of cells that may occur during bioprinting. This involves optimizing the bioprinting conditions to maintain high cell viability, which is essential for achieving successful bioprinting outcomes.
Several efforts have been made to produce biomimetic scaffolds with high cell viability using an extrusion-based bioprinting technique [12]. Cell survival in the 3D bioprinting process can be affected by different variables, including cell type, bioink formulation, 3D printer parameters, and post-printing treatments. These categories may be subdivided into distinct parameters: bioink formulation, consisting of the type and concentration of biomaterials; 3D printing parameters, including cartridge temperature, bed temperature, nozzle size, printing pressure, and printing speed; and post-printing treatment, involving a crosslinking process that can be adjusted by choosing a suitable crosslinking method and fine-tuning the amount of crosslinker and the crosslinking time.
Until now, optimization of the extrusion-based bioprinting settings has been mainly conducted using time-consuming and costly trial-and-error experimentation. This optimization process can become even more challenging when various biomaterial- and printer-related factors are incorporated into the process [13]. As a solution, computational techniques have been introduced in 3D bioprinting. Machine learning (ML), as one of the recent computational techniques, has provided a new perspective for numerous sectors of science and engineering, such as biofabrication. It is anticipated that ML can speed up and improve the bioprinting process by optimizing research design and findings, reducing trial-and-error experiments, and decreasing fabrication time and expenses [12,14].
ML is able to find connections between input parameters and forecast expected results based on these connections. The majority of ML applications in 3D printing have focused on improving printability [15], optimizing material utilization [16], and architectural design factors to enhance material characteristics [18,19]. However, only a few studies have applied ML algorithms to study the effect of 3D bioprinting parameters on cell viability. Therefore, the relationship between the aforementioned bioprinting parameters and the cell survival rate is not yet well studied. In one of these studies, Xu et al [20] created an ensemble learning model to predict cell survival in stereolithography-based bioprinting by considering the impact of four crucial parameters: UV intensity, length of exposure, GelMA content, and layer thickness. The model achieved a high level of precision, as evidenced by an R² of 0.953 when tested on 10% of the dataset. In another recent study, Tian et al [13] studied the capabilities of ML regression and classification approaches in predicting the survival of cells and the printability of cell-containing alginate- and gelatin-based bioinks in extrusion-based bioprinting. To do this, they collected a dataset on bioink material content, solvent utilized, crosslinking data, printing settings, survival rate, and printability results from various bioprinting laboratories. Their results showed that the random forest models yielded the highest accuracy in both regression and classification.
In addition to studying extrusion parameters, the study by Reina-Romo et al [23] investigated the geometry of the nozzle and its impact on shear stress and, consequently, cell viability. Utilizing Gaussian processes, this research aimed to pinpoint the significant geometric characteristics, including the radius of the middle and exit portions of the nozzle and the nozzle's length. This investigation enabled a quantitative evaluation of how nozzle geometry affects cell viability, achieved with a limited number of initial experiments during the extrusion of bioink.
So far, the existing studies have focused on optimizing different bioprinting parameters without taking the cell type into account. However, it is essential to note that different cell types have different optimal bioprinting parameters and different characteristics that affect their viability during the bioprinting process, such as their size, shape, and susceptibility to mechanical stress. Therefore, considering cell type when predicting cell viability and optimizing bioprinting parameters is crucial for the development of effective bioprinting strategies.
In this study, we aim to achieve two objectives. First, we aim to develop predictive regression and classification neural networks for the bioprinting process. To accomplish this, we gathered 92 data points in our laboratory under a variety of experimental conditions, including bioink formulations incorporating gelatin and alginate, 3D printing parameters of extrusion-based bioprinting, and crosslinking conditions. To train and test our ML model, we also acquired data from previously published bioprinting research and combined it with our own data to form a dataset of 591 instances. We evaluated the capability of ML regression and classification algorithms to accurately predict cell survival depending on different bioprinting conditions. We also identified the most important parameters influencing viability prediction through the Permutation Importance technique.
Second, we integrated a Bayesian optimization model with the regression neural network and a subset of bioprinting experimental data for the first time in the bioprinting field. This method enabled us to inversely predict the optimal values of selected bioprinting parameters that yield the highest cell viability. Bayesian optimization is a powerful tool used in ML to optimize complex systems [24]. This study used the developed regression neural network as the objective function of the system in the Bayesian optimization model, focusing on fine-tuning the crosslinking conditions, namely the crosslinker concentration and crosslinking time, while holding all other parameters constant.
The key idea of this novel optimization is the integration of a Bayesian model with a regression neural network. The Bayesian model, with its probabilistic approach and Gaussian process prior distribution, provides a robust framework for estimating uncertainty in cell viability prediction. Additionally, using the trained neural network to make predictions in the Bayesian model removes the need to conduct laboratory experiments at each iteration, thereby saving significant money and time. Hence, this integration could lead to faster convergence and improved system performance, making it an efficient method for obtaining more accurate predictions with limited data.

In-vitro studies

Materials
Sodium alginate, gelatin from bovine skin (type B), sodium chloride, calcium chloride (CaCl₂), and dimethyl sulfoxide were all purchased from Sigma-Aldrich. The MDA-MB-231 cell line was obtained from ATCC; the cell culture medium and supplements needed for the study, including Dulbecco's Modified Eagle Medium (DMEM), Trypsin/EDTA solution (0.25% w/v), penicillin/streptomycin (pen/strep), phosphate-buffered saline (PBS), and fetal bovine serum (FBS), were purchased from Wisent Bioproducts. The Live-Dead Cell Viability Assay Kit was supplied by Sigma-Aldrich (Canada).

Cell culture
T-75 flasks were used to cultivate MDA-MB-231 cells in complete media containing DMEM with 10% FBS and 1% 100 U/ml pen/strep. The cells were incubated in a 37 °C, 5% CO₂ incubator, with the media replaced every other day. Once the cells reached around 80% confluency, they were washed twice with DPBS and then detached using trypsin/EDTA.

Bioprinting
To prepare the bioink, similar to our previous work [22], alginic acid and gelatin were added at quantities ranging from 1% (w/v) to 8% (w/v), and the mixture was stirred using a magnetic stirrer for 1 h. Before use, the hydrogel solution was sterilized by heating it to 70 °C and cooling for 30 min (3×). To create a bioink with a final cell number of around 2 × 10⁶ MDA-MB-231 cells ml⁻¹, the cell suspension in the medium was gradually and consistently mixed with the hydrogel solution using a mixing syringe. The CELLINK INCREDIBLE+ bioprinter, which uses an extrusion method, was used for the bioprinting. To make the 3D cell-laden structures, the prepared bioink was extruded through a needle. Experiments used typical conical printing nozzles (25G and 22G), and the needle movement speed was set to 20 mm s⁻¹. The printing was completed at room temperature using a variety of acceptable pressures. Using Solidworks for the design and slicer software for the layering, we obtained ten layers with a size of 15 × 15 × 3 mm³ and a rectilinear filling pattern. We opted for the grid structure due to its popularity in bioprinting research, ensuring that our study aligns with common practice and contributes to a consistent dataset. The 3D structures were then immersed in sterile CaCl₂ solutions (3% and 5% (w/v)) for varying durations (5, 10, 15, or 20 min) to crosslink the alginate with calcium ions [25]. Each structure was then rinsed three times in PBS before being placed in a 12-well plate with complete media and kept in an incubator.

Live-dead assay
Live-dead assays were conducted immediately after bioprinting. To do this, each structure was given three PBS washes. Cells within the structure were then stained with 1 µM Calcein-AM (CA) and 2 µM propidium iodide (PI) before being incubated in the dark. A laser scanning confocal microscope (Zeiss LSM 700) was used to capture images of living (green) and dead (red) cells in various locations within the 3D construct. The images were analyzed with the ImageJ program. The viability percentage was determined by dividing the total number of green cells in each image by the sum of the two colors (green cells + red cells).
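The viability calculation described above is a simple ratio. A minimal sketch, assuming the green and red cell counts have already been extracted from each image (the function name and the example counts are hypothetical, not taken from the study):

```python
def viability_percent(live_count: int, dead_count: int) -> float:
    """Return live / (live + dead) expressed as a percentage."""
    total = live_count + dead_count
    if total == 0:
        raise ValueError("no cells counted in this image")
    return 100.0 * live_count / total


# Hypothetical counts from one confocal image.
print(viability_percent(82, 18))
```

Guarding against an empty image avoids a division by zero when a field of view contains no stained cells.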

Computational studies

Software and tools
• Python (version 3.9): Python served as the foundational programming language for our project, offering a versatile ecosystem for data analysis, modeling, and visualization.

Dataset preparation
The dataset developed for this research consists of two sections. The first section provides 92 data points collected in our laboratory under a range of experimental settings, including several bioink formulations, 3D printing parameters, and crosslinking conditions. The second section comprises data from 63 previously reported bioprinting studies covering 14 distinct cell types. This section was integrated with our own data to create a comprehensive dataset of 591 instances, each with a distinct cell viability value.
To select articles, we conducted an electronic search focused on extrusion bioprinting of cells, using the following search terms: Extrusion AND (Bioprinting OR Printing) AND (Gelatin OR Alginate) AND (Viability OR Live-Dead OR CA OR PI). Our electronic database for the search was MEDLINE (NCBI PubMed and PMC). Our inquiry was restricted to English-language research published between January 2009 and July 2023. In the selected papers, cell viability was consistently assessed using the live-dead assay, which aligns with the methodology employed in our experiments. Finally, we excluded papers whose cell types occurred fewer than five times across all of the retrieved papers.
As various studies measured cell viability at different time points, a new variable, 'Time of measurement (day)', was added to the dataset to indicate how long after bioprinting the cells were evaluated. Cell survival at the measurement time is represented by the target variable 'Cell Viability (%)'. Some of the numerical variables in the final dataset are summarized in table 1, including their maximum and minimum values. Based on this table, for instance, the data points cover various bioink formulations with gelatin and alginate concentrations ranging from 0% to 20% (w/v), and various printing settings, such as extrusion pressures ranging from 5 kPa for low-viscosity bioinks to 300 kPa for highly viscous ones.

Data preprocessing
Preprocessing is a critical step in ML that involves transforming raw data into a format that can be effectively used by ML models. It typically involves cleaning the data, scaling the data, encoding categorical variables, and handling missing values. During the preprocessing stage of this research, we first removed all the rows where the 'Alginate_Concentration (%w/v)' column and the 'Gelatin_Concentration (%w/v)' column were both 0, because we wanted to focus on gelatin- and alginate-based bioinks containing at least one of these materials. Then, numerical data were normalized using the MinMaxScaler function for better model evaluation. This function scales the values of each feature between 0 and 1 to ensure that each feature contributes equally to the model. One-hot encoding was utilized to convert categorical data ('Cell type') into numerical data. This technique involves creating a binary column for each category of a categorical variable. Additionally, the instances in which cartridge and substrate temperatures were null were assigned a value of 22.5 °C, as it was assumed that the experiments were performed at room temperature. In addition, k-nearest-neighbors imputation was applied to impute the missing values of the other variables, with a neighbor range of 10. The dataset was split into a train set and a test set using the train_test_split() function from the scikit-learn library, with a test size of 0.15.
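The scaling, encoding, and room-temperature fill described above can be illustrated without any library. The following is a minimal pure-Python sketch of what those steps compute, not the scikit-learn pipeline used in the study; the helper names and the three-row mini-dataset are hypothetical:

```python
ROOM_TEMPERATURE_C = 22.5  # value the text assigns to missing temperatures


def fill_temperature(value):
    """Replace a missing (None) temperature with the assumed room temperature."""
    return ROOM_TEMPERATURE_C if value is None else value


def min_max_scale(values):
    """Scale one numeric column to [0, 1], as min-max scaling does per feature."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Return (sorted categories, one binary row per label)."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical mini-dataset; the values are illustrative only.
pressure_kpa = [5, 100, 300]
cartridge_temp = [37.0, None, 25.0]
cell_type = ["MDA-MB-231", "other_cell_type", "MDA-MB-231"]

print(min_max_scale(pressure_kpa))
print([fill_temperature(t) for t in cartridge_temp])
print(one_hot(cell_type))
```

In practice the scaling range should be learned from the training portion only and then reused for validation and test data, so that no information from held-out samples reaches the model.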
The final architecture and configuration of the classification network used L2 regularization at strength 0.05, one hidden layer of 200 hidden units, dropout with a factor of 0.5 before the final fully-connected layer, ReLU activation throughout, and an Adam optimizer learning rate of 0.01.

Regression neural networks
For the regression target labels, the cell viability for each print is between 0% and 100%. We have computed some summary statistics, such as the mean, for these labels. The neural network includes three fully-connected layers: a layer containing 44 neurons with a hyperbolic tangent (tanh) activation function; a layer with 120 neurons, a tanh activation function, and L2 regularization with a weight of 0.01; and a layer containing one neuron with a linear activation function. The model is compiled with the MSE loss function and the Adam optimizer (learning rate = 0.01). The neural network used in this study is based on the Keras library, a high-level API that allows the creation of neural network models with minimal code. K-fold cross-validation with 10 folds is applied across the train set to evaluate the model's performance. For each fold, the dataset was scaled using the MinMaxScaler, and the model was trained on the train set for 1000 epochs with a batch size of 5. The training and validation set MSE and R² scores are evaluated for each fold, and the mean scores are calculated across all folds. Finally, the developed model is applied to the test set, and the predicted values for cell viability are compared with the actual values.
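The per-fold scaling in the cross-validation procedure can be sketched as follows. This is a minimal pure-Python illustration, assuming a single numeric feature and hypothetical helper names; it shows the fold bookkeeping only and stands in for, rather than reproduces, the scikit-learn and Keras code used in the study:

```python
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle indices once and yield (train_idx, val_idx) for each fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, val_idx


def fit_min_max(column):
    """Learn the scaling range from the training fold only."""
    return min(column), max(column)


def apply_min_max(column, lo, hi):
    """Apply a previously learned range; validation values may leave [0, 1]."""
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in column]


# Hypothetical single feature (for example, extrusion pressure in kPa).
pressure = [5, 20, 35, 50, 80, 100, 120, 150, 200, 300, 60, 90]

for train_idx, val_idx in k_fold_indices(len(pressure), n_folds=4):
    lo, hi = fit_min_max([pressure[i] for i in train_idx])
    train_scaled = apply_min_max([pressure[i] for i in train_idx], lo, hi)
    val_scaled = apply_min_max([pressure[i] for i in val_idx], lo, hi)
    # A model would be trained on train_scaled and scored on val_scaled here.
    print(len(train_scaled), len(val_scaled))
```

Fitting the range on the training fold and reusing it on the validation fold keeps the validation score an honest estimate of performance on unseen data.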

Classification neural network
Using the cell viability data, a binary target was constructed for the classification model by assigning 1 to values greater than 70 and 0 to values less than or equal to 70. Table 3 shows the number of samples that have an acceptable viability ('Yes' label) and an unacceptable viability ('No' label). The classification neural network was trained to classify the cell viability as acceptable or unacceptable based on the printing parameter configuration.
The neural network model has three fully-connected layers: a layer with 44 neurons and a ReLU activation function; a layer with 200 neurons, a ReLU activation function, and L2 regularization with a lambda value of 0.01; and finally a layer with one neuron and a sigmoid activation function to produce the binary classification output. The second layer is followed by a dropout layer with a rate of 0.7. The model is compiled with a binary cross-entropy loss function, the accuracy metric, and the Adam optimizer (learning rate = 0.01, beta_1 = 0.9, beta_2 = 0.999, and epsilon = 1 × 10⁻¹). All other Adam optimizer parameters were set to the default values specified in the Keras documentation. K-fold cross-validation with 10 folds on the train set is used in training the model. The data is scaled using MinMaxScaler, the training and validation set accuracy, precision, and recall are recorded for each fold, and the averages across all folds are computed. As with the regression model, the developed model is finally applied to the test set, and the predicted values for cell viability are compared with the actual values.
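The labelling rule and the three metrics recorded for each fold can be written out explicitly. A minimal sketch, assuming the "greater than 70" rule stated above; the function names, the measured viabilities, and the model outputs are hypothetical:

```python
def to_binary_label(viability_percent, threshold=70.0):
    """1 = acceptable viability (> threshold), 0 = unacceptable."""
    return 1 if viability_percent > threshold else 0


def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


measured = [82.0, 57.0, 68.0, 91.0, 70.0, 75.0]  # hypothetical viabilities (%)
y_true = [to_binary_label(v) for v in measured]
y_pred = [1, 0, 1, 1, 0, 0]  # hypothetical classifier outputs

print(y_true)
print(classification_metrics(y_true, y_pred))
```

Note that a viability of exactly 70% falls in the unacceptable class under this rule; moving the boundary case to the other class would only require changing the comparison operator.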

Performance metrics
In this section, we provide a detailed overview of the performance metrics utilized for the evaluation of our predictive models.

Classification metrics
Accuracy: Accuracy quantifies the proportion of instances that were predicted correctly out of the entire set of instances in a classification problem. It is a fundamental metric to assess the overall correctness of predictions. In the context of our study, accuracy quantifies how well our classification model predicts cell viability categories.
Precision: Precision evaluates the proportion of true positive predictions (correctly predicted positive instances) among all positive predictions. It reflects the model's capacity to minimize false positives. In our case, it indicates how precise our model is when predicting positive cell viability outcomes.
Recall: Recall assesses the proportion of true positive predictions among all actual positive instances. It quantifies the model's ability to capture all positive cases, minimizing false negatives. In our context, it signifies the model's effectiveness in identifying cells with high viability.

Regression metrics
Mean absolute error (MAE): The MAE computes the average absolute difference between predicted and actual values with the following formula:

$$ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| $$

where $n$ is the number of samples, $y_i$ is the actual value, and $\hat{y}_i$ is the predicted value. Unlike MSE, it treats all errors with equal weight. MAE assesses the model's general accuracy, where smaller values signify higher accuracy.
Coefficient of determination (R²): The coefficient of determination, often referred to as R², evaluates the portion of the variability in the dependent variable (cell viability) that can be explained by the independent variables used in the model. R² typically ranges from 0 to 1, with higher values indicating a better fit of the model to the data. It represents the goodness of fit and quantifies how well the model explains the observed variability.
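The three regression metrics can be computed directly from their standard definitions. A minimal sketch with hypothetical function names and made-up values; it illustrates the definitions and is not the evaluation code used in the study:

```python
def mse(y_true, y_pred):
    """Mean squared error: large errors are penalized quadratically."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: every error contributes in proportion to its size."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical scaled viabilities (fractions between 0 and 1).
actual = [0.82, 0.61, 0.57, 0.68]
predicted = [0.80, 0.65, 0.60, 0.65]

print(mse(actual, predicted), mae(actual, predicted), r2(actual, predicted))
```

A perfect prediction gives an R² of 1, while a model that does worse than always predicting the mean gives a negative value, which can happen on held-out data.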

Hyperparameter fine-tuning
For the choice of the number of epochs, we employed early stopping with respect to the validation set, which helped us strike a balance between avoiding overfitting and underfitting. The specific decision to train the models for 1000 epochs was based on empirical evidence from this approach, ensuring that our model converged to the desired level of accuracy.
As for the neural network configuration, we conducted an extensive grid search, exploring various hyperparameters including the number of hidden layers, the number of hidden units per layer, the activation function, dropout, learning rate, and regularization. To do so, we utilized the GridSearchCV module from the scikit-learn library. This module allowed us to systematically search for the optimal hyperparameter combination within a given range by conducting a grid search over a specific hyperparameter space. To ensure the robustness of our hyperparameter choices, we employed 5-fold cross-validation and used relevant evaluation metrics. For the classification model, accuracy was the key metric, while for the regression model, we used negative MSE. These metrics allowed us to systematically assess the performance of different hyperparameter combinations and select the most suitable configuration.
The hyperparameter combinations that were tested for the classification and regression models can be found in tables 4 and 5, respectively. The total number of hyperparameter combinations explored was 1728 (classification) and 972 (regression), since one value was chosen from every row. The score for each combination was determined via 5-fold cross-validation.
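The combination counts follow from multiplying the number of candidate values in each row of the search table. A minimal sketch of that enumeration, using a deliberately small, hypothetical search space (the values actually tested are those in tables 4 and 5, which are not reproduced here):

```python
from itertools import product
from math import prod

# Hypothetical search space for illustration only.
param_grid = {
    "hidden_units": [50, 100, 200],
    "activation": ["relu", "tanh"],
    "learning_rate": [0.001, 0.01],
}

# One value is taken from every row, so the grid is a Cartesian product.
combinations = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]
expected = prod(len(values) for values in param_grid.values())

print(len(combinations), expected)
print(combinations[0])
```

With 5-fold cross-validation, each combination is fitted five times, so the number of model fits is five times the number of combinations.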

Permutation importance
We used the permutation importance method to rank the relative importance of the features in the constructed regression model in terms of their ability to predict cell viability. To do this, we utilized the 'permutation_importance' utility from the scikit-learn library. This method involves randomly permuting the values of a single feature and measuring the resulting decrease in the model's performance or score. This procedure is repeated $K$ times to estimate feature significance accurately. The importance $i_j$ of each feature $j$ over $K$ repetitions is calculated as:

$$ i_j = s - \frac{1}{K} \sum_{k=1}^{K} s_{k,j} $$

where $s$ is the reference score of the model, and $s_{k,j}$ is the score of the model on the data in which feature $j$ has been permuted, on the $k$th repeat. In this study, $K = 10$ and, since the neural network's loss function is MSE, the Permutation Importance score is also determined by the change in MSE when each feature is permuted.
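The procedure can be sketched in a few lines of plain Python. This is an illustrative re-implementation of the idea under stated assumptions, not the scikit-learn utility itself: the score is taken to be the negative MSE (so that higher is better), and `toy_model` is a hypothetical stand-in for the trained network that depends on the first feature only:

```python
import random


def neg_mse(model, rows, targets):
    """Score where higher is better: the negative mean squared error."""
    errors = [(model(r) - t) ** 2 for r, t in zip(rows, targets)]
    return -sum(errors) / len(errors)


def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Reference score minus the mean score after shuffling each feature."""
    rng = random.Random(seed)
    reference = neg_mse(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        permuted_scores = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            permuted_scores.append(neg_mse(model, permuted, targets))
        importances.append(reference - sum(permuted_scores) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical model: uses feature 0 and ignores feature 1."""
    return 2.0 * row[0]


rows = [[float(i), float(i % 3)] for i in range(8)]
targets = [2.0 * r[0] for r in rows]

print(permutation_importance(toy_model, rows, targets))
```

Shuffling the ignored feature leaves the score unchanged, so its importance is zero, whereas shuffling the feature the model relies on lowers the score and yields a positive importance.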

Bayesian optimization model
We employed a Bayesian optimization model based on the constructed regression neural network in order to optimize the bioprinting variables for maximizing cell viability. We aimed to determine the optimal 'physical-crosslinking time' and 'crosslinker (CaCl₂) concentration' for a given set of bioprinting parameters to achieve the highest possible level of cell viability. The algorithm of this neural network-based Bayesian optimization model is depicted in figure 1.
Here, the base Bayesian optimization framework started by constructing the Gaussian Process model using the created dataset of variables in the bioprinting process and their associated cell viability. The Gaussian Process model consists of mean and covariance functions. The covariance function in this study is the Matern kernel with a smoothness parameter equal to 2.5, which conveys the smoothness of the function and determines how the cell viability percentage at one point affects the prediction of nearby values [26]. Moreover, in the training of the Gaussian Process model, a small positive number, α = 10⁻³, was added to each element on the diagonal of the covariance matrix to improve the numerical stability of this method.
Using the Gaussian Process model's mean and covariance, we developed an acquisition function that suggests the next values of the bioprinting variables to probe. The suggested parameters were chosen to maximize the probability of achieving the highest cell viability while maintaining a balance between exploration and exploitation. In this study, the GP-UCB acquisition function, with κ = 2.5, was applied [27]. New suggestions for the parameters under study were chosen within the ranges specified for them. Some of these ranges in our bioprinting dataset can be found in table 1. Then, in each iteration, the newly recommended crosslinking time and crosslinker amount, together with the other constant variables, were fed into the developed regression neural network. Utilizing the developed neural network, we were able to predict cell viability without having to conduct time-consuming and laborious laboratory experiments. Then, using the data collected so far, the Gaussian process model was updated to generate a posterior distribution. This process continued for 50 iterations, after which the combination of parameters with the highest observed or predicted cell viability was chosen as the solution.
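The loop described above can be sketched end to end in plain Python. This is a simplified illustration under several assumptions, not the implementation used in the study: the Gaussian process has a fixed length scale of 1 and a constant mean, the acquisition function is maximized over a fixed candidate grid, only 15 iterations are run to keep the example fast, and `toy_viability` is a hypothetical stand-in for the trained regression network with both inputs scaled to [0, 1]:

```python
import math


def matern52(a, b, length_scale=1.0):
    """Matern kernel with smoothness 2.5 between two points."""
    s = math.sqrt(5.0) * math.dist(a, b) / length_scale
    return (1.0 + s + s * s / 3.0) * math.exp(-s)


def cholesky(m):
    """Lower-triangular factor L of a positive-definite matrix m = L L^T."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(m[i][i] - s) if i == j else (m[i][j] - s) / low[j][j]
    return low


def solve_lower(low, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i, row in enumerate(low):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def solve_upper(low, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit_gp(xs, ys, alpha=1e-3):
    """Factorize K + alpha*I and precompute the weights K^-1 (y - mean)."""
    mean = sum(ys) / len(ys)
    k = [[matern52(a, b) + (alpha if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    low = cholesky(k)
    weights = solve_upper(low, solve_lower(low, [y - mean for y in ys]))
    return low, weights, mean


def predict(gp, xs, x_new):
    """Posterior mean and standard deviation at x_new."""
    low, weights, mean = gp
    k_star = [matern52(a, x_new) for a in xs]
    mu = mean + sum(w * k for w, k in zip(weights, k_star))
    v = solve_lower(low, k_star)
    var = max(1.0 - sum(t * t for t in v), 0.0)  # matern52(x, x) equals 1
    return mu, math.sqrt(var)


def bayes_opt(objective, candidates, init_points, n_iter=15, kappa=2.5):
    """Repeatedly probe the candidate with the highest upper confidence bound."""
    xs = list(init_points)
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        gp = fit_gp(xs, ys)

        def ucb(c):
            mu, sigma = predict(gp, xs, c)
            return mu + kappa * sigma

        x_next = max(candidates, key=ucb)
        xs.append(x_next)
        ys.append(objective(x_next))
    best = max(range(len(ys)), key=ys.__getitem__)
    return xs[best], ys[best], ys


def toy_viability(x):
    """Hypothetical surrogate: (scaled concentration, scaled time) -> viability."""
    return 0.9 - 0.4 * (x[0] - 0.4) ** 2 - 0.3 * (x[1] - 0.3) ** 2


grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
best_x, best_y, history = bayes_opt(toy_viability, grid, [(0.0, 0.0), (1.0, 1.0)])
print(best_x, round(best_y, 3))
```

Because every evaluation of the objective is a network prediction rather than a laboratory experiment, the loop can be run for many iterations at negligible cost; the jitter term `alpha` keeps the covariance matrix well conditioned even when the same point is suggested more than once.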

In-vitro studies
In this study, we conducted in-vitro experiments and bioprinted gelatin- and alginate-based bioinks under various printing and post-printing conditions. In general, bioink can be composed of various materials including hydrogels, polymers, and decellularized extracellular matrices (ECMs) [28]. Among these materials, hydrogels have attracted significant interest due to their properties, as they can provide the high water content of the ECM and allow the mechanical strength and biochemical properties to be tuned. These materials mostly need to be optimized for bioprinting by modifying the hydrogel's viscosity and mechanical strength, either through the addition of thickening materials or by crosslinking, to create an ECM-like scaffold [29].
Among different hydrogels, alginate and gelatin are the most popular in 3D bioprinting. Alginate is a biocompatible and biodegradable hydrogel that can be crosslinked by multivalent cations such as Ca²⁺ [7]. After crosslinking, alginate-containing scaffolds have been shown to retain their morphology and exhibit enhanced mechanical properties [8]. One disadvantage of alginate is its poor capability to provide enough binding sites for cells to attach and proliferate [9]. However, this problem can be solved by introducing gelatin, which contains cell attachment peptides such as the RGD sequence, a typical cell binding site [10]. Gelatin is a water-soluble polymer that is the product of partial hydrolysis of the natural polymer collagen and has a similar structure to that of collagen, the main ECM component. Gelatin contains several biological signals that improve cellular functions, including cell attachment, growth, and proliferation [8,11]. However, gelatin alone as a bioink has its own challenges, including relatively low mechanical strength and fast biodegradation, which can be overcome by combining it with alginate [11]. Hence, combining gelatin and alginate can be considered an effective way to develop a bioink with modified properties, and this combination is used to formulate the bioink for creating 3D cell-laden structures in this study [8].
We utilized the live-dead assay and confocal microscopy to measure cell survival following printing for each set of parameters and collected a dataset of effective parameters and the corresponding cell viability. To create a more comprehensive dataset, we combined our results with those from other laboratories. This resulted in a dataset with a categorical parameter, 'Cell type', encompassing 14 distinct cell types listed in the previous section. Some of the numerical variables in the final dataset are also summarized in table 1, including their maximum and minimum values.
Each of the parameters has the potential to affect the viability of cells in the bioprinting procedure. For example, cells printed with the same bioink (4% (w/v) gelatin-3% (w/v) alginate), crosslinking conditions, and printing settings but with different nozzle sizes showed different cell viabilities. Cells extruded using a nozzle with a diameter of 250 micrometers exhibited approximately 82% cell viability, while those printed using a nozzle with a diameter of 410 micrometers showed only 61% cell viability. This disparity in cell viability is due to the difference in nozzle size; the smaller the nozzle, the lower the cell survival rate. Another example is for cells printed with different bioink formulations. Figure 2 displays images of stained cells in 4% (w/v) gelatin-8% (w/v) alginate and 4% (w/v) gelatin-4% (w/v) alginate, with all other parameters held the same except for the printing pressure, which varied based on the bioink viscosity. Cells in 4% (w/v) gelatin-8% (w/v) alginate exhibited approximately 57% viability, whereas those in 4% (w/v) gelatin-4% (w/v) alginate exhibited about 68% viability. The lower cell viability in the 4% (w/v) gelatin-8% (w/v) alginate hydrogel may be attributable to the higher amount of alginate, resulting in a more viscous and stiffer bioink necessitating a greater printing pressure.

Computational studies

Regression and classification neural networks
To improve the bioprinting optimization process, we developed neural networks for regression and classification purposes to predict cell viability based on key variables. To construct these networks, we randomly split the data into a training set comprising 85% of the total and a test set comprising 15%. We utilized feature scaling to ensure that all numeric features were normalized to the same range. Cross-validation was performed on the training dataset to determine the optimal configuration of the ML algorithms. In order to prevent data leakage, the encoding and scaling methods were fitted exclusively on the subtraining dataset and then applied to the subvalidation dataset. The final configuration was determined by a grid search (for details see 'Hyperparameter fine-tuning'). Table 6 displays the results of the regression neural network model on both the train set and the test set. This table reports the mean and standard deviation of the performance metrics, including MSE, MAE, and R², across 10 folds for the train set, together with the corresponding values of MSE, MAE, and R² for the test set. The mean MSE, MAE, and R² values for the train set were found to be 0.010, 0.075, and 0.81, respectively. Similarly, the corresponding metrics for the test set were 0.016, 0.089, and 0.71, respectively.
These metrics provide insight into the model's predictive success. Smaller MSE and MAE values mean that the model predicts values more accurately, whereas a higher R² value indicates that the model explains a substantial portion of the variance in the data. Thus, the relatively low MSE and MAE values for both sets and the relatively high R² value suggest that our model was trained effectively on the dataset and can predict cell viability with good performance. Table 7 compares the cell viability of a subset of the data collected in the laboratories with the values predicted by the regression model. As shown in this table, the predicted and actual viabilities are in good agreement, confirming the model's good performance in predicting cell viability and its ability to replicate laboratory measurements.
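For reference, the three regression metrics can be computed as in the short sketch below. The function name and the toy viability values (expressed as fractions) are illustrative assumptions and are not taken from the dataset.

```python
def regression_metrics(y_true, y_pred):
    """Return (MSE, MAE, R^2) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variance
    ss_res = sum(e * e for e in errors)                 # unexplained variance
    r2 = 1.0 - ss_res / ss_tot
    return mse, mae, r2

# Invented viabilities expressed as fractions of living cells.
y_true = [0.60, 0.70, 0.80, 0.90]
y_pred = [0.65, 0.70, 0.75, 0.90]
mse, mae, r2 = regression_metrics(y_true, y_pred)
```

R² compares the residual error with the spread of the true values, so a model that always predicts the mean viability scores 0 and a perfect model scores 1.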
For the classification task, we defined two classes of cell viability: 0 for viabilities below 70% and 1 for viabilities greater than or equal to 70%, the threshold typically accepted as indicative of acceptable cell survival in bioprinting. To evaluate the neural network's performance in the training and validation processes, we measured the average accuracy, precision, and recall over 10 folds during training. As in the regression model evaluation, a test set comprising 15% of the whole dataset was used to assess the model's ability to predict cell viability on unseen data. The performance of the model on the train set and test set is presented in figure 3. This figure shows that the model yielded a mean accuracy of 0.91, precision of 0.96, and recall of 0.90 for the train set. For the test set, these metrics are 0.86, 0.85, and 0.97, respectively. Thus, with an accuracy of 0.91 on seen data and 0.86 on unseen data, the model correctly classified 91% and 86% of instances in the train set and test set, respectively, which indicates how well the model could be trained and predict the objective. Moreover, with a recall of 0.97, the model captured 97% of the actual positive, high-viability (≥70%) instances in the test set.
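The labeling rule and the three classification metrics can be illustrated with the following sketch. The viability values and the predictions are invented for illustration; only the 70% threshold comes from the text.

```python
def label_viability(viability_percent, threshold=70.0):
    """Class 1 if viability is at or above the threshold, otherwise class 0."""
    return 1 if viability_percent >= threshold else 0

def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # share of predicted 1s that are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # share of actual 1s that are found
    return accuracy, precision, recall

measured = [82.0, 61.0, 70.0, 57.0, 68.0, 76.0]  # illustrative viabilities (%)
y_true = [label_viability(v) for v in measured]
y_pred = [1, 0, 1, 1, 0, 0]                      # hypothetical model outputs
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

Note that a viability of exactly 70% falls in class 1 under this rule, matching the "greater than or equal to" definition.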
Comparing our findings with the performance of the Random Forest Classifier of Tian et al [13] on our dataset, we find that their best-performing method obtains a mean accuracy of 0.82, precision of 0.85, and recall of 0.89 for the train set, which is lower than our method in all metrics. On the test set, their method obtains 0.86, 0.89, and 0.90, respectively. Their precision is therefore slightly higher on the test set (0.89 versus 0.85), the test accuracy is identical (0.86), and our recall is higher (0.97 versus 0.90). Training and testing their best-performing method on our dataset thus shows that our method is more accurate than theirs on several metrics. Overall, these findings suggest that both the regression and classification neural networks are effectively trained on the created dataset and can predict cell survival with good performance and generalization, making them a reliable tool for predicting cell viability in bioprinting processes.
Our neural networks for both regression and classification also outperformed the most relevant study to date that predicted cell viability in extrusion-based bioprinting using machine learning. That study, conducted by Tian et al [13], compared different classical regression models and different classical classification models. Their analysis highlighted Random Forest models as the top performers, yielding the highest R² score in the regression task and the highest prediction accuracy in the classification task. To substantiate the superior performance of our models, we systematically trained and tested Tian et al's regression and classification Random Forest models on our dataset.
In the case of the regression model, using the Random Forest Regressor of Tian et al [13], we obtained R² values of 0.588 and 0.589 on the training set and test set, respectively. Our R² values (0.81 on the training set, 0.71 on the test set) are much higher than those of this model. This shows that using a neural network to predict cell viability yields more accurate predictions.
In the case of the classification model, we found that their best-performing method attained a mean accuracy of 0.82, precision of 0.85, and recall of 0.89 for the train set on our dataset. Our method attained a mean accuracy of 0.91, precision of 0.96, and recall of 0.90 for the train set. Hence, we outperformed Tian's Random Forest Classifier on the training set. On the test set, the Random Forest Classifier attained accuracy, precision, and recall of 0.86, 0.89, and 0.90, respectively, whereas we attained 0.86, 0.85, and 0.97. Their precision was slightly higher on the test set, the accuracy was identical, and our recall was higher. Training and testing their best-performing method on our dataset shows, similarly to the regression model, that our method is more accurate than theirs on several metrics.
These outcomes show the efficacy of our neural networks for predicting cell viability in bioprinting and imply that they may offer significant advantages over conventional regression and classification models. Additionally, we believe the models' performance can be enhanced further by expanding the dataset to capture a broader range of patterns and relationships, reducing overfitting, and enhancing the models' generalization capability. The number of instances with diverse combinations of bioprinting parameters and corresponding cell viability measurements can be increased by collecting additional in vitro data in the laboratory or by incorporating data from other sources.

Permutation importance of bioprinting parameters
Permutation importance was applied to determine the most significant bioprinting features for predicting cell viability with the regression neural network. Figure 4 shows the ten most important features; the x-axis represents the importance score, while the y-axis represents the evaluated bioprinting parameters.
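The idea behind permutation importance can be sketched as follows: shuffle one feature column at a time and record how much the model's score drops. This is a generic illustration, not the authors' code; `toy_predict` is a hypothetical stand-in for the trained regression network, and R² as the score and ten repeats per feature are assumptions.

```python
import random

def r2_score(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in R^2 when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            score = r2_score(y, [predict(row) for row in X_perm])
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return importances

def toy_predict(row):
    """Hypothetical model: feature 0 matters a lot, feature 1 a little, feature 2 not at all."""
    return 0.9 - 0.5 * row[0] - 0.05 * row[1]

rng = random.Random(1)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(200)]
y = [toy_predict(row) for row in X]
scores = permutation_importance(toy_predict, X, y)
```

A feature the model ignores receives an importance of zero, because shuffling it leaves every prediction unchanged.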
As demonstrated in this figure, 'Cell type' has the most significant impact on optimizing the bioprinting process compared with the other variables. In the dataset, we included cancer cells for cancer research and normal cells for tissue engineering purposes. In the bioprinting procedure, the survival and viability of these two cell types can vary significantly. Typically, cancer cells with a higher proliferation rate may be better able to recover from printing-related damage, while normal cells may be more fragile and susceptible to injury. Different cell types within cancerous and normal cell populations may also show varying sensitivity to bioprinting. The degree to which cells are affected by pressure and shear stress depends on the cell type. This variation can be attributed to the unique physiological properties of each cell type. For instance, cells with a more robust cytoskeleton may be more deformation-resistant during printing. Studies have shown, for example, that immortalized cells such as human MSCs, L929 fibroblasts, and HeLa cells experience less damage than mouse embryonic stem cells when exposed to the same level of shear stress [30,31]. In other studies, numerous cell types, including fibroblasts [33], myoblasts [34], endothelial cells [35], chondrocytes [36], and Schwann cells [37], were printed in alginate-based structures, and each exhibited a distinct vulnerability to the post-printing process. Hence, these observations highlight the significance of considering cell type as a critical element for optimizing cell viability in the bioprinting process.
The second most significant bioprinting variable influencing cell viability prediction is 'Printing_Pressure'. Extrusion uses pressure to force the bioink through the nozzle and deposit it onto the substrate. The amount of pressure applied determines the flow rate and deposition speed of the bioink, affecting the stress experienced by the cells during printing. Previous research has indicated that increasing the dispensing pressure during bioprinting has a damaging effect on cell viability [28]. This is primarily due to the mechanical stress and shear forces imparted on the cell membrane by the pressure, which cause damage and diminish viability. Additionally, high pressure may cause the formation of air bubbles or the fragmentation of the bioink, which can further damage cells. Consequently, optimizing the printing pressure is essential for achieving high cell viability during the bioprinting procedure. It is important to note that pressure can be controlled by considering factors such as nozzle size and the material properties of the bioink, which are discussed below.
Among the other 3D printer parameters, permutation importance in our study shows that, after 'Printing_Pressure', 'Nozzle_size' and 'Movement_speed' are among the ten most important variables affecting cell viability prediction. This result aligns with previous research, which concluded that the dispensing pressure has a more substantial impact than the nozzle size [29]. Smaller nozzle sizes typically result in higher printing pressures due to the restricted flow area, whereas wider nozzles allow for reduced extrusion pressures, resulting in decreased shear stress. Using wider nozzles, however, diminishes the filament's resolution. Hence, the nozzle size is one of the parameters that must be tuned to produce the best outcomes in terms of printability and viability [39]. High printing speed, the 9th most important parameter, can also lead to mechanical stress on cells, which can affect their viability and functionality. In addition, high extrusion speeds can reduce the precision and resolution of 3D printed structures, resulting in regions of high stress or low nutrient supply, which affects cell survival after printing. On the other hand, low printing speeds can result in prolonged cell exposure to the printing environment and excessive residence time in the nozzle, which is harmful to cells [12]. Therefore, the optimal printing speed may depend on several variables, including the required printing resolution and the sensitivity of the cells to mechanical stress.
The third most significant parameter in predicting cell viability identified in our study is 'crosslinker(CaCl2)_concentration'. The crosslinking method, physical or chemical, can be selected based on the bioink formulation [40]. For example, in this study, Ca²⁺ in CaCl₂ solution is used as a crosslinker for an alginate-containing scaffold, which forms a flexible and solid structure. Increasing the amount of crosslinker means that more calcium ions can move into the 3D network, leading to a more crosslinked and firmer construct, but a higher amount of Ca²⁺ can be toxic to cells and decrease cell viability [8]. Another important parameter in controlling the crosslinking process is 'physical_Crosslinking_time', which is the length of time the printed structure is immersed in the CaCl₂ solution and is among the ten most important factors in optimizing cell viability during bioprinting. A high 'physical_Crosslinking_time' and long-term exposure to Ca²⁺ ions can further reduce cell viability, highlighting the significance of reducing both the physical crosslinking duration and the crosslinker amount to minimize cell damage.
However, in alginate-based bioprinting, the absence or improper execution of crosslinking can also lead to reduced cell viability. Proper crosslinking is critical for preserving structural integrity, preventing cell leakage and relocation, and supplying nutrients and oxygen. Without sufficient crosslinking, the cell-laden structure may lack mechanical integrity, resulting in ineffective cell attachment and arrangement. Moreover, inadequate crosslinking can result in cell leakage, uneven distribution, and restricted nutrient exchange, all contributing to a decline in cell viability. This creates a trade-off, emphasizing the significance of optimizing the crosslinking conditions to balance structural integrity and cell viability. Therefore, to achieve optimal cell viability and overall success in alginate-based bioprinting, it is essential to ensure appropriate crosslinking conditions.
The 'Time of Measurement' parameter is the 4th most important parameter for the predicted cell viability. Over time after printing, cells within the bioprinted structure are likely, depending on their proliferation rate, to grow and repair damage caused by the bioprinting process. However, after some time, there is a growth capacity threshold beyond which cell viability may plateau or decline. This phenomenon was illustrated in our previous study [22], where we assessed the viability and proliferation of MDA-MB-231 cells over an 11 day period after bioprinting. Our results indicated that the viability of cells was around 76% on the day of printing (day 0) and progressively increased during the first week, reaching about 98% on day 4 and 99% on day 7. Then, from day 7 to day 11, the viability decreased slightly, reaching approximately 96% on day 11. Therefore, the time at which measurements are taken has a significant impact on the observed cell viability and should be carefully considered in accordance with the specific objectives of the study.
From the material perspective, 'Alginate_Concentration' and 'Gelatin_Concentration' are the 5th and 8th most significant variables in predicting cell viability based on our neural network. Indeed, the type of hydrogel and its concentration define the bioink properties, which are among the critical factors determining bioprinting success in terms of cell survival and printability. Depending on the bioink components, the rheological behavior of the bioink, such as its viscosity and mechanical properties, can change [1]. When the bioink viscosity is high, shear stress increases at the point of contact between the bioink and the nozzle wall, leading to cellular damage such as membrane rupture, cytoskeletal damage, and DNA damage [32]. In addition, bioinks with a greater viscosity or stiffness require higher extrusion pressures to overcome resistance, flow through the nozzle, and deposit properly. For instance, increasing the alginate concentration can increase the hydrogel's rigidity and mechanical strength, exposing the cells to greater mechanical stress during extrusion and printing [13]. This increased rigidity can also reduce the diffusion of nutrients and oxygen to the cells, resulting in reduced cell viability and growth in longer studies after printing. Hence, this result shows the significance of the bioink formulation in achieving high viability during and after printing.

Bayesian optimization model based on the built regression neural network
This study utilized a neural network-based Bayesian optimization model to optimize the bioprinting process and obtain the highest possible cell viability. Bayesian optimization is a powerful ML optimization tool that is applied to discover the optimal set of variables for a given objective function [24]. The Bayesian optimization model in this study allows us to quantify the uncertainty in cell viability prediction as well as to predict the optimal bioprinting experimental features for maximizing cell viability. In extrusion-based bioprinting, this method was first used in the study by Ruberu et al [1] to optimize the printability of the bioink. The researchers utilized Bayesian optimization as a novel method for optimizing some of the printing variables while minimizing the number of necessary experiments. The study demonstrated the promise of this method in speeding up the extrusion-based bioprinting process compared with conventional optimization.
Our current work aims to eliminate laborious trial-and-error steps in optimizing cell viability in extrusion-based bioprinting by integrating our neural network with the Bayesian optimization technique. Inspired by the pioneering efforts of Gao et al [38], who successfully implemented an ML-based Bayesian optimization model for membrane design, we aimed to significantly accelerate optimization of the bioprinting process. By integrating our trained neural network with the Bayesian optimization model, the neural network approximates the objective function instead of every required experiment being conducted physically. This allows for more efficient and faster exploration and exploitation of the search space, saving significant time and resources in the optimization process. Therefore, this combination can expedite convergence and boost overall optimization performance.
Analysis of permutation importance revealed the significance of the crosslinking conditions for cell viability in the bioprinting process for alginate- and gelatin-based bioinks, which encouraged us to optimize these parameters in this study. Hence, our optimization model focused on finding the optimal 'Crosslinking (CaCl2)_Concentration' and 'Physical_crosslinking_time' parameters, presuming that all other bioprinting parameters were already predefined. This method allowed us to determine the optimal combinations of crosslinking parameters for the desired bioink formulation and printing settings. Using this method, we are able to print cell-laden scaffolds with superior cell viability and minimal crosslinking-induced injury.
To execute Bayesian optimization, we first defined the search range for 'Crosslinking (CaCl2)_Concentration' and 'Physical_crosslinking_time' by determining the minimum and maximum values. We set the minimum value to be greater than zero because we intended to find optimal, non-zero crosslinking parameters for our particular bioprinting conditions. In addition, we defined the range's maximum value as the highest value observed in our dataset. The Bayesian optimization procedure relied on the neural network we developed for regression. During each iteration of the optimizer, the suggested 'Crosslinking (CaCl2)_Concentration' and 'Physical_crosslinking_time', together with the other predefined variables, were fed into the trained neural network to predict cell viability. This approach enabled us to rapidly investigate numerous unexplored parameter combinations and ultimately identify the most efficient parameter combination for printing. The Gaussian process in this procedure models the underlying patterns and correlations in the data, providing a probabilistic estimate of cell viability for a given set of printing parameters. The optimizer discovers the best parameters by iteratively selecting new candidates based on the outcomes of prior evaluations, and the Gaussian process is updated as new function evaluations become available [21].
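The loop described above can be sketched in a few dozen lines. Everything here is an illustrative assumption rather than the authors' implementation: `predicted_viability` is a hypothetical stand-in for the trained regression network, both crosslinking parameters are scaled to (0, 1], the kernel is a radial basis function with hand-picked constants, the acquisition function is expected improvement, and the initial points are drawn at random.

```python
import math
import random

SIGNAL_VAR = 0.05  # assumed prior variance of the viability fraction
LENGTH = 0.3       # assumed RBF length scale on the scaled inputs
JITTER = 1e-6      # small diagonal term for numerical stability

def predicted_viability(conc_scaled, time_scaled):
    """HYPOTHETICAL stand-in for the trained regression network."""
    return 0.9 - 0.5 * (conc_scaled - 0.3) ** 2 - 0.4 * (time_scaled - 0.2) ** 2

def rbf(a, b):
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2.0 * LENGTH ** 2))

def invert(A):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(A)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def expected_improvement(mu, sigma, best):
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - best) * cdf + sigma * pdf

def bayes_opt(objective, candidates, n_init=5, n_iter=15, seed=0):
    rng = random.Random(seed)
    X = rng.sample(candidates, n_init)  # random initial design (assumption)
    y = [objective(*x) for x in X]
    for _ in range(n_iter):
        n = len(X)
        mean_y = sum(y) / n
        K = [[rbf(X[i], X[j]) + (JITTER if i == j else 0.0) for j in range(n)]
             for i in range(n)]
        K_inv = invert(K)
        alpha = [sum(K_inv[i][j] * (y[j] - mean_y) for j in range(n)) for i in range(n)]
        best = max(y)

        def acquisition(x):
            k = [rbf(a, x) for a in X]
            mu = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))  # posterior mean
            quad = sum(k[i] * K_inv[i][j] * k[j] for i in range(n) for j in range(n))
            sigma = math.sqrt(max(SIGNAL_VAR * (1.0 - quad), 1e-12))  # posterior std
            return expected_improvement(mu, sigma, best)

        remaining = [c for c in candidates if c not in X]
        x_next = max(remaining, key=acquisition)
        X.append(x_next)
        y.append(objective(*x_next))  # query the (stand-in) network, not the lab
    i_best = max(range(len(y)), key=y.__getitem__)
    return X[i_best], y[i_best], X

# Candidate grid with a strictly positive lower bound, as in the text.
candidates = [(i / 10, j / 10) for i in range(1, 11) for j in range(1, 11)]
best_x, best_y, history = bayes_opt(predicted_viability, candidates)
```

In practice the scaled coordinates would be mapped back to physical units before printing. The text indicates that the Gaussian process was built from a filtered subset of the dataset, which would replace the random initial design used in this sketch.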
The bioprinting dataset used for developing the Gaussian process in Bayesian optimization was a smaller subset of the data used for training the neural networks. We selected the portion of the dataset that met the following constraints: non-zero concentration values for both gelatin and alginate, MDA-MB-231 as the cell type, and non-zero values for the crosslinking parameters. In this way, we focused on the data most relevant to our specific laboratory experiments.
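The subset selection can be expressed as a simple filter. The record layout, the field names, and the numeric values below are hypothetical; only the three constraints come from the text.

```python
# Invented records; field names are placeholders, not the dataset's column names.
records = [
    {"gelatin": 4.0, "alginate": 3.0, "cell_type": "MDA-MB-231", "cacl2": 100.0, "xlink_time": 5.0},
    {"gelatin": 0.0, "alginate": 3.0, "cell_type": "MDA-MB-231", "cacl2": 100.0, "xlink_time": 5.0},
    {"gelatin": 4.0, "alginate": 4.0, "cell_type": "HeLa", "cacl2": 100.0, "xlink_time": 5.0},
    {"gelatin": 3.0, "alginate": 7.0, "cell_type": "MDA-MB-231", "cacl2": 0.0, "xlink_time": 0.0},
]

def in_gp_subset(record):
    """Keep rows with both hydrogels present, MDA-MB-231 cells, and active crosslinking."""
    return (
        record["gelatin"] > 0
        and record["alginate"] > 0
        and record["cell_type"] == "MDA-MB-231"
        and record["cacl2"] > 0
        and record["xlink_time"] > 0
    )

gp_subset = [r for r in records if in_gp_subset(r)]
```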
To demonstrate the practical implementation of our model, we designed a process for printing cell-laden structures containing MDA-MB-231 cells, with alginate and gelatin hydrogels in the bioink. Initially, we determined the desired hydrogel concentrations and specific printing settings for two different structures, (1) 4%(w/v) gelatin-3%(w/v) alginate and (2) 3%(w/v) gelatin-7%(w/v) alginate, as outlined in table 8.
Using the Bayesian optimization model in combination with our regression neural network, we identified the optimal 'Crosslinking (CaCl2)_Concentration' and 'Physical_crosslinking_time' for each structure, presented in table 9, and thereby predicted the associated optimal cell viability. To assess the reliability and accuracy of our model, we printed the cell-laden structures in our laboratory and crosslinked the bioprinted constructs under the optimal crosslinking conditions identified by Bayesian optimization. Finally, the actual cell viability was measured with the live-dead assay. Table 9 presents the predicted and actual cell viabilities for both structures.
The findings demonstrate that this approach can successfully improve the bioprinting process by determining the optimum crosslinking condition. There is good agreement between the actual and predicted cell viabilities for both experiments, showing the promising application of this optimization method for bioprinting. Using this neural network-based Bayesian optimization model, researchers can determine optimal crosslinking conditions for any set of parameters for alginate-based bioinks and optimize bioprinting without trial-and-error experiments. Additionally, this optimization method can be extended to predict other bioprinting parameters, such as the concentration of materials, printing setting parameters, or combinations of all of them, to optimize cell survival in future studies.
Although the predicted and actual viabilities agree, differences remain. This discrepancy can be attributed to the performance of the neural network. Indeed, the performance of our Bayesian optimization model is based on the regression neural network, whose prediction ability is highly dependent on the dataset's precision and size. Our neural network was developed on a limited dataset collected mainly from the literature. Hence, the optimization model's performance can be further enhanced by including more precise data collected from different bioprinting laboratories, allowing for more accurate prediction of optimal parameter combinations.
Finally, it can be concluded that neural network-based Bayesian optimization provides a novel and effective strategy for the inverse design of the bioprinting process for gelatin- and alginate-based bioinks. This technique reduces the need for typical trial-and-error optimization, paving the way for a more straightforward and efficient optimization of the bioprinting process in the future.

Conclusion
In this study, we developed a novel cell viability optimization process for bioprinting through the application of ML techniques. To do this, we combined the 92 data points generated in our laboratory with those collected from other laboratories to create a larger dataset with 591 data points. Using the dataset, we successfully created regression and classification neural network models to predict cell viability for gelatin- and alginate-based bioinks. The trained neural networks yielded a regression R² value of 0.71 and a classification accuracy of 0.86, representing excellent performance and a significant improvement over previous studies that used ML techniques. Using these neural networks, researchers can predict the cell viability after printing for previously untested bioprinting parameters.
By calculating permutation importance for the regression neural network, we identified the bioprinting parameters that significantly impact cell viability prediction. Among the different parameters, 'cell type' emerges as the most critical variable, highlighting the different sensitivities of various cell types to the bioprinting procedure. In addition, 'extrusion pressure' is identified as the second most significant parameter, demonstrating the detrimental impact of excessive pressure on cell viability due to mechanical stress and shear forces on the cell membrane. After the bioprinting procedure, the 'crosslinker (CaCl2) concentration' and 'physical crosslinking time' are identified as the third and fifth significant features, respectively, which balance the structural integrity and cell viability of bioprinted structures. Therefore, we can conclude that tuning these parameters can strongly affect the survival of cells during the bioprinting procedure.
Finally, we developed a novel Bayesian optimization model based on the trained neural network to inversely predict optimal bioprinting crosslinking parameters, achieving the highest cell viability without trial-and-error experiments. We validated the performance of our Bayesian optimization model by conducting two distinct laboratory experiments. These experiments showed good agreement between the predicted and actual cell viability, demonstrating the promising potential of this optimization technique for determining optimal bioprinting parameters. By integrating the capabilities of our neural network and Bayesian optimization model, we achieved quicker and more efficient optimization that surpasses the limitations of traditional time-consuming and expensive optimization approaches. Our optimization technique has the potential to revolutionize bioprinting, as it provides a valuable tool to facilitate the design of new bioprinting experiments using gelatin- and alginate-based bioinks and to optimize bioprinting with high accuracy within minutes.

Future prospect
In addition to the accomplishments and contributions outlined in this study, several promising directions can enhance the performance and broaden the applicability of our research:

This optimization method can be extended to predict other bioprinting parameters, such as the concentration of materials, printing setting parameters, or combinations of all of them, to optimize cell survival in future studies.
By expanding the available dataset with additional literature and experimental data covering diverse values of each parameter, such as varying hydrogels, cell types, printing settings, and crosslinking conditions, we can further improve the prediction accuracy of both the neural networks and the Bayesian optimization model.
It is worth noting that the 3D nature and geometry of the scaffold can significantly impact cell viability. Factors such as pore size, shape, and layout within the scaffold can influence cell behavior. To gain insight into this, we recommend exploring various scaffold geometries in future studies. This will help us better understand how these variables affect cell survival during the printing process and enable the design of optimized scaffold structures for enhanced bioprinting outcomes.
The structural integrity of the bioprinted constructs is another factor that plays a vital role in the overall success of the bioprinting process and relies heavily on the crosslinking of hydrogels. Optimal crosslinking parameters can be determined more precisely by balancing the preservation of cell viability against the desired structural integrity of the printed constructs. Hence, to improve the predictive capability of the Bayesian optimization model for the crosslinking parameters, we can include both structural integrity and cell viability in the objective function. This provides a more comprehensive optimization model, allowing more effective prediction of crosslinking parameters that preserve cell viability while strengthening the structure.
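One simple way to include both quantities in a single objective is a weighted sum, sketched below. This is only one possible formulation and is not proposed in the text; the function name, the weight, and the assumption that both scores are normalized to [0, 1] are hypothetical.

```python
def combined_objective(viability, integrity, weight=0.5):
    """Weighted sum of two scores in [0, 1]; weight is the share given to viability."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * viability + (1.0 - weight) * integrity

# Hypothetical predictions for one candidate crosslinking condition.
score = combined_objective(viability=0.8, integrity=0.6, weight=0.75)
```

Raising the weight favors cell survival and lowering it favors a firmer construct, so the value would need to be chosen for the intended application.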

Figure 1. Algorithm of a Bayesian optimization model based on regression neural network for bioprinting process optimization.

Figure 2. Fluorescent live-dead assay images showing MDA-MB-231 cell viability on day 0 within 3D hydrogel structures. Bioink consists of: (A) 4%(w/v) gelatin-8%(w/v) alginate, (B) 4%(w/v) gelatin-4%(w/v) alginate. CA was used to stain living cells green, while PI was used to label dead cells red. A laser scanning confocal microscope was used to capture images of the cells. Scale bar, 100 µm.

Figure 3. Performance of a neural network classification model evaluated using accuracy, precision, and recall for the train set and test set. Error bars indicate the standard deviation for each metric for the train set across 10 folds.

Figure 4. Permutation importance of bioprinting parameters on cell viability, based on regression neural network. The boxplot illustrates the distribution of each parameter's importance scores.

Table 1. A subset of numeric variables in the bioprinting dataset and their range.

Table 2. Summary statistics of the cell viability for the regression task.
standard deviation, as well as minimum, maximum, and median values, as shown in table 2. The regression neural network was trained to predict the cell viability of each individual print given the printing parameter configuration.

Table 3. The label counts for the classification task.

Table 4. The hyperparameters tested (grid search) for the classification neural network.

Table 5. The hyperparameters tested (grid search) for the regression neural network.

Table 6. Performance of the regression neural network model evaluated using MSE, MAE, and R² for the train set and test set. ±Std indicates the standard deviation for each metric for the train set across 10 folds.

Table 7. A comparison between the predicted viability values using the regression neural network model and the actual value produced in the laboratory.

Table 8. Predefined values of bioprinting parameters for Experiment 1 and Experiment 2.

Table 9. Predicted optimal crosslinking parameters and cell viability using neural network-based Bayesian optimization model and the actual cell viability of Experiment 1 and Experiment 2.