An Artificial Intelligence-based model for cell killing prediction: development, validation and explainability analysis of the ANAKIN model

The present work develops ANAKIN: an Artificial iNtelligence bAsed model for (radiation induced) cell KIlliNg prediction. ANAKIN is trained and tested over 513 cell survival experiments with different types of radiation contained in the publicly available PIDE database. We show how ANAKIN accurately predicts several relevant biological endpoints over a broad range of ion beams and for a high number of cell-lines. We compare the predictions of ANAKIN to the only two radiobiological models for RBE prediction used in clinics, that is the Microdosimetric Kinetic Model (MKM) and the Local Effect Model (LEM version III), showing how ANAKIN has higher accuracy over all the considered biological endpoints. Finally, via modern techniques of Explainable Artificial Intelligence (XAI), we show how ANAKIN predictions can be understood and explained, highlighting how ANAKIN is in fact able to reproduce relevant well-known biological patterns, such as the overkilling effect.


Introduction
In the last decades, radiotherapy (RT) has increasingly proven to be an extremely effective cure against cancer. Within RT, particle therapy (PT) has been emerging [Durante and Flanz, 2019], and by the end of 2021 about 325,000 patients had been treated worldwide with PT, of which close to 280,000 with protons and about 42,000 with carbon ions [PTCOG, 2022]. Furthermore, other ions have recently been gaining attention [Rovituso and La Tessa, 2017]: in 2021 the first patient was (re-)treated with helium [Mairani et al., 2022] at the Heidelberg Ion Therapy center (HIT) in Germany, while perspective studies are looking into the possible use of oxygen [Kurz et al., 2012, Sokol et al., 2017].
The physical rationale for using hadrons in cancer treatment is their characteristic energy loss mechanism, which results in concrete biological advantages compared to photons, such as increased tumor control and greater sparing of normal tissues, with a consequently lower risk of toxicity.
Despite the theoretically superior physical properties of hadrons compared to photons, further research is critical for increasing the application of PT in the clinic. A correct and accurate estimation of radiation-induced biological damage remains one of the major limitations to the full exploitation of this treatment modality. The key quantity used to describe the effectiveness of a radiation in inducing a specific damage is the Relative Biological Effectiveness (RBE), which is defined as the ratio between the dose delivered by the reference radiation and the dose delivered by a given radiation yielding the same biological effect. RBE quantifies how much more lethal a certain radiation is compared to the reference radiation, usually X-rays, and is used in Treatment Planning Systems (TPS) to calculate the biological dose, namely the physical dose multiplied by the RBE. For this reason, over the last decades a plethora of mechanistic mathematical models [Hawkins, 1994, Inaniwa and Kanematsu, 2018, Kase et al., 2006, Bellinzona et al., 2021, Friedrich et al., 2013b, Friedrich et al., 2013a, Elsässer et al., 2010, McMahon and Prise, 2021, Vassiliev, 2012, Vassiliev et al., 2017, Tobias, 1985, Tobias, 1980, Kellerer and Rossi, 1974, Kellerer and Rossi, 1978, Cordoni et al., 2022, Cordoni et al., 2021, Manganaro et al., 2017], as well as data-driven phenomenological models [Tilly et al., 2005, McNamara et al., 2015, Chen and Ahmad, 2012, Carabe et al., 2012, Wilkens and Oelfke, 2004, Mairani et al., 2017], have been developed to estimate RBE based on biological as well as physical quantities. At the base of most models is the linear-quadratic (LQ) behavior of the logarithm of the cell survival $S$ with respect to the imparted dose $D$:

$$-\log S(D) = \alpha D + \beta D^2,$$

where $\alpha$ and $\beta$ are parameters that depend on both biological (e.g. tissue type) and physical (e.g. radiation quality) variables [McMahon, 2018]. Currently, a constant RBE of 1.1 is conservatively used in proton therapy, although evidence shows its variability, especially in the distal region [Paganetti et al., 2002, Paganetti, 2014, Paganetti, 2018, Missiaggia et al., 2020, Missiaggia et al., 2022a]. For carbon and helium ions, the RBE variations across the irradiation field are significant enough that a constant value cannot be used. Two radiobiological models are currently used to predict RBE in clinical practice: (i) the Microdosimetric Kinetic Model (MKM) [Inaniwa et al., 2010, Inaniwa and Kanematsu, 2018, Bellinzona et al., 2021], and (ii) the Local Effect Model (LEM) [Friedrich et al., 2013a, Elsässer et al., 2010, Pfuhl et al., 2022]. Both models have been extensively tested against in vitro and in vivo data [Pfuhl et al., 2022, Inaniwa and Kanematsu, 2018], but the outcomes have not indicated a clear superiority of one model over the other. In addition, significant differences in the prediction of RBE across models are evident, so that, at present, the use of a variable RBE in clinical practice strongly depends on the model chosen [Missiaggia et al., 2022a, Missiaggia et al., 2020, Giovannini et al., 2016, Bertolet et al., 2021].
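As a brief illustration of the two relations recalled above (the LQ survival curve and the biological dose obtained by multiplying the physical dose by the RBE), the following minimal sketch may help. The function names and the numerical values of $\alpha$ and $\beta$ are hypothetical and chosen only for illustration; they are not taken from PIDE or from any of the cited models.

```python
import math


def lq_survival(dose_gy, alpha, beta):
    """Surviving fraction under the LQ model: S(D) = exp(-(alpha*D + beta*D^2))."""
    return math.exp(-(alpha * dose_gy + beta * dose_gy ** 2))


def biological_dose(physical_dose_gy, rbe):
    """RBE-weighted (biological) dose: physical dose multiplied by the RBE."""
    return physical_dose_gy * rbe


# Hypothetical LQ parameters (illustrative only): alpha in Gy^-1, beta in Gy^-2.
alpha, beta = 0.2, 0.05

# Survival after 2 Gy: exp(-(0.2*2 + 0.05*4)) = exp(-0.6).
survival_2gy = lq_survival(2.0, alpha, beta)

# Biological dose of 2 Gy of protons with the constant clinical RBE of 1.1.
proton_biological_dose = biological_dose(2.0, 1.1)
```

A variable-RBE model replaces the constant factor in `biological_dose` with a value that depends on the radiation quality and on the tissue.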
The lack of a robust and general model for predicting RBE hinders the full exploitation of PT, including the use of ions heavier than carbon, such as oxygen, to successfully treat radio-resistant tumors [Boulefour et al., 2021], or multi-ion therapy, which is nowadays accessible from the technical point of view [Ebner et al., 2021].
Furthermore, although some RBE models have a general mathematical formulation, their implementation in the TPS, especially for inverse planning, requires a heavy computational effort. This issue is usually overcome both by using look-up tables and by making specific assumptions [Inaniwa and Kanematsu, 2018], such as physical or biological approximations, which clearly limit the generality of the model and affect the accuracy of its RBE predictions.
Aiming at deriving a general model able to accurately predict RBE across a wide range of physical and biological variables, we developed ANAKIN (an Artificial iNtelligence bAsed model for (radiation induced) cell KIlliNg prediction), a new general AI-driven model for predicting cell survival and RBE. Machine Learning (ML) and Deep Learning (DL) algorithms have recently started to gain attention in the medical physics community, with published applications to imaging [Sahiner et al., 2019], fast dose estimation [Götz et al., 2020], Monte Carlo simulation [Sarrut and Krah, 2021], and particle tracking [Missiaggia et al., 2022b]. However, only [Papakonstantinou et al., 2021] apply ML to predict radiation-induced biological quantities, conducting a study on the induction of DNA damage and its complexity, but no analysis of RBE is performed.
ANAKIN is composed of various ML- and DL-based modules, each with a specific task, and interconnected with each other. The model considers both physical variables, such as the kinetic energy of the incident beam or the Linear Energy Transfer (LET), that is the amount of energy that a particle transfers to the traversed material per unit distance [Durante and Paganetti, 2016], and biological variables, such as the $\alpha$ and $\beta$ values for the reference radiation response. To make the model as general as possible, we trained it on cell survival data for 20 cell-lines widely used in radiobiology and 11 different ion types, all available in the Particle Irradiation Data Ensemble (PIDE) [Friedrich et al., 2013b, Friedrich et al., 2021]. Together with particles of interest for clinical applications, we also included heavier ions, including iron, in the training process. This choice extends the application of ANAKIN to other research fields, such as radiation protection in space. To verify ANAKIN predictions and assess its accuracy, we randomly divided the data available in PIDE into two sets, one for training and one for testing. All results reported in the present work refer to the test set, which consists entirely of experiments that have not been included in the training set.
Artificial Intelligence (AI) has had a disruptive impact both in research and in real-life applications. The potential of modern and advanced Machine Learning (ML) and Deep Learning (DL) algorithms has started to gain attention in the medical physics community, where several research papers on the application of DL to imaging [Sahiner et al., 2019], fast dose estimation [Götz et al., 2020], Monte Carlo simulation [Sarrut and Krah, 2021], and particle tracking [Missiaggia et al., 2022b] have appeared. Quite surprisingly, to the best of our knowledge, the only result in the literature that uses ML to predict radiation-induced biological quantities is [Papakonstantinou et al., 2021], where the authors conduct a study on the induction of DNA damage and its complexity, but no analysis of RBE is performed [Davidovic et al., 2021].
ML and DL have been shown to be extremely powerful, accurate and flexible tools to extract information and hidden relations, as well as to predict the most likely outcome based on data of possibly different nature [Ongsulee et al., 2018, Khalid et al., 2007, Shwartz-Ziv and Armon, 2022]. Moreover, an excellent, systematic and comprehensive collection of cell survival experiments exists and is publicly available, the Particle Irradiation Data Ensemble (PIDE) [Friedrich et al., 2013b, Friedrich et al., 2021].
ANAKIN is constituted by various ML- and DL-based modules, each with a specific task, and interconnected with each other. Two different tree-based models, namely the Random Forest (RF) [Ho, 1995a, Ho, 1998] and the Extreme Gradient Boosting (XGBoost) [Chen and Guestrin, 2016a, Chen and Guestrin, 2016b] algorithms, are used to predict cell survival for a wide variety of radiations and cell-lines. It is worth stressing that the final goal of ANAKIN is to provide a robust and accurate model that is able to predict cell survival in the most general possible conditions. ANAKIN is trained to predict cell survival for 20 widely used cell-lines and for 11 different ion types. Concerning this last point, although the driving motivation is hadron therapy (HT), many different ions, such as iron, which is beyond possible clinical application, are included in the model. This makes ANAKIN extremely general, so that possible future applications in space radioprotection are also envisaged.
ANAKIN is trained on the PIDE. It is worth stressing that, in order to be as realistic as possible, the experiments contained in the PIDE dataset are divided into a training set and a testing set. All results reported in the present work refer to the test set, which consists entirely of experiments that have not been included in the training set. This means that ANAKIN is asked to predict the cell survival for experiments that it has never seen before. Besides the already mentioned variables, ANAKIN considers both physical variables, such as the kinetic energy of the incident beam or the Linear Energy Transfer (LET), that is the amount of energy that a particle transfers to the traversed material per unit distance [Durante and Paganetti, 2016], and biological variables, such as the $\alpha$ and $\beta$ values for the reference radiation response.
ANAKIN is tested over several endpoints and metrics to establish the actual accuracy of its predictions. Further, ANAKIN predictions are compared with the MKM and the LEM, which are the only two radiobiological models currently used in the clinic. Regarding the LEM, an extensive and thorough analysis has become available very recently [Pfuhl et al., 2022]. As a matter of fact, much of the analysis conducted in the current paper has been explicitly inspired by [Pfuhl et al., 2022]. In this direction, it must be stressed that the current paper uses the LEM III, since the LEM IV is not currently implemented in the survival toolkit, and thus the presented comparisons cannot be translated to the state-of-the-art version of the model. The results reported in [Pfuhl et al., 2022] on the LEM IV are clearly more accurate than those reported in the current research using the LEM III, and this fact must be taken into account.
Finally, the current work further aims at dispelling the myth that ML and DL models are obscure black boxes whose predictions cannot be interpreted. While this argument can be partially correct for the extremely deep and sophisticated NNs that have been built mostly in the field of Reinforcement Learning, the same cannot be said for the vast majority of ML and DL models developed in recent years. On one side, some ML models, such as tree-based models, are interpretable by nature; on the other side, considerable attention has recently been devoted to the development of mathematical techniques aiming at explaining ML and DL models that are not easy to interpret. This area of research is known as Explainable AI (XAI) [Gunning et al., 2019].
The goal of the present research is to: (i) develop for the first time a general AI-driven model to predict cell survival probability over a wide range of biological cell-lines and physical irradiation conditions; (ii) compare ANAKIN with the two radiobiological models used in the clinic (MKM and LEM); (iii) show that ML- and DL-based models are not only accurate, but can also help in gaining new knowledge and understanding in radiobiology and medical physics.

The dataset
The development, training and verification of ANAKIN are based on data from PIDE [Friedrich et al., 2013b, Friedrich et al., 2021].
The MKM and LEM predictions are computed via the survival toolkit [Manganaro et al., 2018, Attili and Manganaro, 2018]. This toolkit is an open source implementation that has been checked to be coherent with published results of the models, but differences with the most advanced versions of the two formalisms may nonetheless arise. Unfortunately, to date no extensive quantitative evaluation of the MKM predictions over many cell-lines exists, so that we could only rely on the survival toolkit.
The PIDE database contains a series of cell survival experiments, conducted over a multitude of different irradiation conditions and cell-lines. In addition to the original data, a set of LQ parameters calculated for each experiment is also reported. Following [Pfuhl et al., 2022], ANAKIN is thus trained on the exponential linear-quadratic fit of the cell survival experiments. This is done because many experiments contained in the PIDE clearly show anomalous variability in the reported survival fraction. Experiments reporting fewer than 3 measurement points are removed from the dataset, because at least 3 values are needed to fit an LQ curve. The dataset obtained from PIDE is then divided into two subsets, one for training ANAKIN and one for testing its predictions. The selection is done so that each subset contains a sufficient amount of data for each cell-line and ion to be statistically significant.
Unlike [Pfuhl et al., 2022], ANAKIN is trained on both monoenergetic and SOBP ion beams, and for this reason a specific variable is added to the data to specify the irradiation condition.
After applying all the selection criteria described above, the resulting dataset contains 513 experiments, including 20 cell-lines and 11 ion types, of which 333 were randomly assigned for training and the remaining 180 for testing. Figure 1a gives an overview of the number of considered experiments for each cell-line as well as ion type.
It is worth stressing that the division between training and test sets has been performed over the experiments, meaning that ANAKIN is tested on experiments that have never been seen before by the model. All results shown in the present paper refer to the test set, so that the test reflects a realistic situation in which ANAKIN should predict the cell survival of an experiment that it has never seen before.
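The experiment-level split described above can be sketched as follows. This is a minimal illustration under stated assumptions: the identifiers and dose points are synthetic, the helper name `split_by_experiment` is hypothetical, and the split is purely random, whereas the selection used in this work additionally ensures that each cell-line and ion is sufficiently represented in both subsets.

```python
import random


def split_by_experiment(experiment_ids, n_train, seed=0):
    """Randomly assign whole experiments (not single dose points) to train or test."""
    unique_ids = sorted(set(experiment_ids))
    random.Random(seed).shuffle(unique_ids)
    return set(unique_ids[:n_train]), set(unique_ids[n_train:])


# Synthetic stand-in for the cleaned dataset: 513 experiments, 3 dose points each.
rows = [(experiment_id, dose) for experiment_id in range(513) for dose in (1.0, 2.0, 4.0)]

train_ids, test_ids = split_by_experiment([row[0] for row in rows], n_train=333)

# Every dose point follows its experiment, so no experiment is shared between sets.
train_rows = [row for row in rows if row[0] in train_ids]
test_rows = [row for row in rows if row[0] in test_ids]
```

Splitting on single dose points instead would let points of the same survival curve appear in both sets, which would make the test error overly optimistic.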
ANAKIN takes as input 14 variables comprising both physical and biological parameters, either continuous, such as LET or energy, or discrete, such as ion type or cell-line. The full list is reported in Table 1. Variable names as reported in Table 1 are taken from the PIDE and described in [Friedrich et al., 2013b, Friedrich et al., 2021]. The only variable that has been added to the dataset is the square of the dose, named Dose2. The choice of considering the square of the dose is motivated by the well-known linear and quadratic form of the logarithm of the survival. It is worth noticing that the $\alpha_\gamma$ and $\beta_\gamma$ values are also passed as input to ANAKIN.
Table 1: List of all variables used as input to ANAKIN. The names are described according to PIDE documentation [Friedrich et al., 2013b, Friedrich et al., 2021] with the exception of Dose2, which represents the square of the dose value and has been introduced in this work.
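To make the role of Dose2 and of the LQ-fit targets concrete, the sketch below expands one experiment into one training row per dose. It is only an assumed layout: the helper `make_rows`, the dictionary keys other than Dose2, and all numerical values are hypothetical, and the actual PIDE column names should be taken from the PIDE documentation.

```python
import math


def make_rows(inputs, alpha_ion, beta_ion, doses_gy):
    """Expand one experiment into (features, target) pairs, one per dose.

    The target is the survival given by the experiment's LQ fit; the fitted
    ion parameters are used only to build the target, never as features.
    """
    rows = []
    for dose in doses_gy:
        features = dict(inputs)          # copy the experiment-level variables
        features["Dose"] = dose
        features["Dose2"] = dose ** 2    # square of the dose, as added in this work
        target = math.exp(-(alpha_ion * dose + beta_ion * dose ** 2))
        rows.append((features, target))
    return rows


# Hypothetical experiment-level inputs (keys and values are illustrative only).
experiment_inputs = {"Ion": "12C", "Cells": "V79", "LET": 50.0,
                     "alpha_gamma": 0.18, "beta_gamma": 0.02}
rows = make_rows(experiment_inputs, alpha_ion=0.5, beta_ion=0.02, doses_gy=[0.0, 1.0, 2.0])
```

Passing both Dose and Dose2 gives a tree-based model direct access to the two terms that appear in the logarithm of the LQ survival.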

Machine and Deep Learning models
ANAKIN is an ensemble AI model composed of ML and DL modules, each with a different task, that together predict the cell-survival probability. A schematic representation of ANAKIN is shown in Figure 1b. Four different tree-based models are trained on the PIDE: two Random Forest (RF) [Ho, 1998, Ho, 1995a] and two Extreme Gradient Boosting (XGBoost) [Chen and Guestrin, 2016a, Chen and Guestrin, 2016b] models. PIDE data are directly used as input for one RF and one XGBoost model, while for the other two they are first processed with the Deep Embedding [Guo and Berkhahn, 2016, Micci-Barreca, 2001] Neural Network (NN), where categorical variables with high cardinality (in this case Ion, Cells and CellCycle) are pre-processed to learn a new meaningful data representation. Once the initial parameters are selected (e.g. cell-line, ion type and kinetic energy), the survival is calculated with each of the four models, and these values are then used as input to ANAKIN to predict the final cell survival.
Tree-based models have been chosen for the predictive modules rather than Neural Network (NN)-based algorithms because, despite the groundbreaking impact that NNs have had on image detection, they have had a significantly smaller impact on tabular data; there is in fact empirical evidence that standard ML approaches achieve comparable or even better results than NNs [Grinsztajn et al., 2022]. DL algorithms are instead used within ANAKIN in an innovative way to solve a different task. As mentioned above, ANAKIN is trained to predict the cell survival over many cell-lines as well as ions. Such variables can assume only discrete values, and are typically referred to as categorical variables in the ML and DL community; for this reason they are not in principle easily handled by a ML or DL model. Even more problematic is the fact that such categorical variables have a high number of possible values. This poses a serious issue in how these variables must be mapped to numeric values to be efficiently treated by a ML model. Several possible solutions to the above problem exist [Seger, 2018], but recently DL has gained considerable attention not solely as a predictive tool but also as an extremely powerful data pre-processing tool, used for instance to extract new information from data. In particular, DL has been recently proposed to specifically treat categorical variables with a high number of values. This technique is called Deep Embedding [Micci-Barreca, 2001, Guo and Berkhahn, 2016, Shreyas, 2022], and consists in training a NN that learns the most efficient way of encoding a categorical variable, such as in the present case the cell-line or the ion type, into a low-dimensional numerical vector that can be efficiently used by another model to learn the relation between these variables and the target variable to predict. Therefore, ANAKIN has three modules specifically devoted to learning a new data representation for the cell-line, the ion type and the cell cycle. The DL-based Deep Embedding modules are connected to the previously mentioned tree-based predictive modules to create ANAKIN, the final ensemble model that takes each single module output and predicts the cell survival fraction.
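The way a learned embedding replaces a categorical variable can be sketched as a simple table lookup. The example below is a hedged illustration, not the ANAKIN implementation: the vocabulary, the two-dimensional vectors and the helper names are made up, and in practice the embedding matrix would be the weights learned by the NN rather than hand-written numbers.

```python
def one_hot(index, size):
    """One-hot encoding of a category index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def vector_times_matrix(vector, matrix):
    """Row vector times matrix, written out explicitly."""
    n_columns = len(matrix[0])
    return [sum(v * row[j] for v, row in zip(vector, matrix)) for j in range(n_columns)]


def embed(category, vocabulary, embedding_matrix):
    """Embedding lookup: the row of the matrix associated with the category."""
    return embedding_matrix[vocabulary[category]]


# Hypothetical vocabulary and made-up 2-dimensional embeddings (illustrative only).
ion_vocabulary = {"1H": 0, "4He": 1, "12C": 2}
ion_embeddings = [[0.1, -0.3], [0.4, 0.2], [0.9, 0.7]]

carbon_vector = embed("12C", ion_vocabulary, ion_embeddings)

# The lookup is equivalent to multiplying the one-hot vector by the matrix,
# which is what the first (embedding) layer of the network computes.
carbon_from_one_hot = vector_times_matrix(one_hot(2, 3), ion_embeddings)

# A downstream tree-based model then sees dense numbers instead of a label.
features = [50.0] + carbon_vector   # e.g. a LET value followed by the ion embedding
```

The dimension of the embedding is a design choice; it is typically much smaller than the number of categories.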
Each input model has been validated using a 10-fold cross-validation, and their hyper-parameters have been obtained using a Bayesian optimization technique, as described in [Missiaggia et al., 2022b].
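The 10-fold cross-validation mentioned above can be sketched as follows. Only the fold construction is shown; the Bayesian optimization of the hyper-parameters is not reproduced here. The helper name `k_fold_indices` and the use of 333 samples (the number of training experiments) are assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold k takes every n_folds-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        validation = folds[k]
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield training, validation


splits = list(k_fold_indices(333, n_folds=10))
```

Each candidate set of hyper-parameters would then be scored by its average validation error over the ten folds.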
Given a dataset $\{X_i, y_i\}_{i=1}^{N}$, $X_i \in \mathbb{R}^n$ are the $n$ input features on which a model is trained to predict the target variable $y_i \in \mathbb{R}$. In the current case, $X_i$ are the variables reported in Table 1, whereas $y_i$ is the cell survival. Given a set of parameters $W$, which depends on the model, and a suitably chosen training set $T := \{X_i, y_i\}_{i=1}^{N_T}$, $N_T < N$, the aim of the ML or DL models is to solve the following optimization problem

$$\min_{W} \sum_{i=1}^{N_T} L\left(y_i, \hat{\varphi}(X_i; W)\right),$$

being $\hat{\varphi}$ the output function for the model and $L$ the loss function. To improve the accuracy and reduce the overfitting, a regularization is added to the loss function, as done in [Bishop et al., 1995, LeCun et al., 2015].
The output function $\hat{\varphi}$ is the learned function approximating the ideal function $\varphi$ that describes the link between the features $X_i$ and the target $y_i$.

Ensemble tree-based models: Random Forest (RF) and Extreme Gradient Boosting (XGBoost)
Random Forest (RF) is an ensemble ML algorithm that combines weaker models, such as decision trees, to create a more robust final model [Ho, 1998, Ho, 1995a]. Being a bagging algorithm, the trees of the ensemble are built in parallel, and the output is the average of all tree outcomes. Compared to single decision trees, the Random Forest reduces the overfitting on the training data, and thus improves the prediction accuracy.
RF [Ho, 1995b, Friedman et al., 2001] assumes that $\hat{\varphi}$ is the average of weaker learners, the decision trees $\psi_k$, that is

$$\hat{\varphi}(x) = \frac{1}{K} \sum_{k=1}^{K} \psi_k(x),$$

where $\psi_k$ is the outcome of the $k$-th decision tree and $K$ is the number of trees. Like RF, XGBoost is an ensemble ML algorithm that combines weaker decision tree models to create a more robust final model [Chen and Guestrin, 2016a, Chen and Guestrin, 2016b]. XGBoost is a boosting algorithm, so that the ensemble model is created in series: the output of each single model is passed to the next one, with the aim of reducing the error of the previous one. Like bagging, boosting is used to reduce the overfitting on the training data and to improve the prediction accuracy.
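The serial construction just described can be illustrated with a minimal gradient boosting loop under a squared loss, where each new weak learner is fitted to the residuals of the current ensemble. This is a simplified sketch and not XGBoost itself, which adds regularization and a second-order approximation of the loss. The weak learner is a one-split regression stump, and the data, the learning rate and all function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump (least squares) on 1-D inputs."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=100, learning_rate=0.5):
    """Greedy boosting: start from a constant, then add stumps fitted to residuals."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(n_rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


def mean_squared_error(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic log-survival curve from hypothetical LQ parameters (illustrative only).
doses = [0.5 * i for i in range(13)]
log_survival = [-(0.3 * d + 0.03 * d * d) for d in doses]

mean_target = sum(log_survival) / len(log_survival)
constant_error = mean_squared_error(lambda d: mean_target, doses, log_survival)
boosted_error = mean_squared_error(boost(doses, log_survival), doses, log_survival)
```

A bagging ensemble would instead fit each tree independently on a resampled dataset and average the outputs, as in the RF formula above.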
XGBoost starts with a potentially inaccurate model

$$\hat{\varphi}_0(x; W) = \arg\min_{W} \sum_{i=1}^{N_T} L\left(y_i, W\right),$$

which is then expanded in a greedy fashion as

$$\hat{\varphi}_m(x; W) = \hat{\varphi}_{m-1}(x; W) + \arg\min_{\psi_m} \sum_{i=1}^{N_T} L\left(y_i, \hat{\varphi}_{m-1}(X_i; W) + \psi_m(X_i)\right).$$

Deep Embedding [Micci-Barreca, 2001, Guo and Berkhahn, 2016, Shreyas, 2022] is a NN-based technique for mapping a categorical variable into a vector. Being a supervised algorithm, the NN is trained to predict the cell survival fraction. The intermediate representation learnt by the network is then extracted and constitutes the new values used for the categorical variable.
In the context of NN, the $W$ parameters defined in equations (1)-(2) are usually referred to as weights. In this work, we chose the multilayer perceptron (MLP) NN, which is the first and most classical type of network used.
A Multi-Layer Perceptron (MLP) is created by connecting several single-layer perceptrons, where several nodes are placed in a unique layer. The inputs $(x_1, \dots, x_n)$ are fed to the network so that the final output $z$ is produced. Typically the output is a non-linear function of a weighted sum of the inputs, i.e.

$$z = \sigma\left(\sum_{i=1}^{n} w_i x_i + b\right),$$

where $w_i$ are the weights and $b$ is the bias. Also, $\sigma$ is a suitable (possibly) non-linear function, like a sigmoid, $\sigma(t) = 1/(1 + e^{-t})$. The connection between single-layer perceptrons is done in a preferred direction. This type of network is called feed-forward, because the inputs are fed to the first layer, then the output goes to the second layer and so on until the data reach the last output layer. By providing a series of correct results to the network, and thus making the problem supervised, the NN can learn the best weights and biases to reproduce any desired output [Bishop et al., 1995, LeCun et al., 2015].
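The feed-forward computation described above can be written in a few lines. The following sketch only shows the forward pass of a small MLP with sigmoid activations; the weights and biases are arbitrary illustrative numbers rather than trained values, and the function names are hypothetical.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def perceptron(inputs, weights, bias):
    """Single node: z = sigma(sum_i w_i * x_i + b)."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def mlp_forward(inputs, layers):
    """Feed the inputs through the layers in order (feed-forward network)."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = [perceptron(activations, weights, bias)
                       for weights, bias in zip(weight_rows, biases)]
    return activations


# Arbitrary, untrained weights (illustrative only): 2 inputs -> 2 hidden nodes -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1]),   # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
output = mlp_forward([1.0, 2.0], layers)
```

Training consists in adjusting the weights and biases so that the outputs match the provided targets; that step is not shown here.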

Explainable Artificial Intelligence
Several XAI techniques [Molnar, 2020, Biecek and Burzykowski, 2021] can be used to understand how a ML model works. In the present work, we focus on three well-known and powerful techniques, namely (i) variable importance, (ii) the Accumulated Local Effect (ALE) plot and (iii) the Shapley value.

Variable importance
Variable importance [Breiman, 2001, Fisher et al., 2019] measures the global importance of each feature to the final output of the model. The main idea behind the calculation is that, if a variable is important for calculating the final output of the model, then after a permutation of the variable values the model performance significantly decreases. Larger changes in the overall model performance are then associated with highly important features.

Accumulated Local Effect (ALE) plot
The ALE plot [Apley and Zhu, 2020, Grömping, 2020] is one of the most advanced and robust dependence plots for describing how variables influence, on average, the predictions of a ML model. One of the most advanced aspects of this technique is that it accounts for correlation between variables. The ALE plot thus calculates the average changes in the model prediction and sums (accumulates) them over the values assumed by a specific variable.
The ALE plot is defined as [Apley and Zhu, 2020]

$$\hat{f}_{ALE}(x_1) := \int_{z_{0,1}}^{x_1} \int p(x_2 \mid z_1)\, \frac{\partial \hat{\varphi}(z_1, x_2)}{\partial z_1}\, dx_2\, dz_1 - \text{constant}. \tag{3}$$

Instead of considering the effect of the prediction $\hat{\varphi}$, the ALE plot considers changes in the prediction, $\partial \hat{\varphi}(z_1, x_2) / \partial z_1$, which represent the local effect of the variable. This is averaged over all possible values of the other variable $x_2$, weighted by the actual probability of registering the value $x_2$ given the considered value of the first variable. Then, the result is integrated, or accumulated, up to $x_1$. This value is centered around the average prediction, represented by the constant appearing in equation (3), so that the average effect over the data is 0. Therefore, ALE plots calculate the average difference in the prediction to be attributed to a local change in a variable.
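In practice the integral defining the ALE is estimated on a grid: the derivative is replaced by a finite difference across each bin, averaged only over the data points that actually fall in that bin, and the bin effects are then accumulated and centered. The sketch below follows this standard estimator under simplifying assumptions; the function name, the toy model and the data are hypothetical, and the bin edges must cover the observed range of the feature.

```python
def ale_1d(predict, rows, feature, edges):
    """First-order ALE of one feature, returned at the upper edge of each bin."""
    n_bins = len(edges) - 1
    effect_sums = [0.0] * n_bins
    counts = [0] * n_bins
    for row in rows:
        # Bin k holds values up to edges[k + 1]; edges must cover the data.
        k = next(i for i in range(n_bins) if row[feature] <= edges[i + 1])
        lower, upper = list(row), list(row)
        lower[feature], upper[feature] = edges[k], edges[k + 1]
        # Local effect: prediction change across the bin, other features kept as observed.
        effect_sums[k] += predict(upper) - predict(lower)
        counts[k] += 1
    accumulated, running = [], 0.0
    for k in range(n_bins):
        if counts[k]:
            running += effect_sums[k] / counts[k]
        accumulated.append(running)
    # Center so that the average effect over the data is zero.
    mean_value = sum(a * c for a, c in zip(accumulated, counts)) / len(rows)
    return [a - mean_value for a in accumulated]


# Toy additive model and correlated toy data (illustrative only).
def toy_model(x):
    return 2.0 * x[0] + x[1]


toy_rows = [[0.5, 0.4], [0.5, 0.7], [1.5, 1.2], [1.5, 1.9],
            [2.5, 2.6], [2.5, 2.1], [3.5, 3.3], [3.5, 3.9]]
ale = ale_1d(toy_model, toy_rows, feature=0, edges=[0.0, 1.0, 2.0, 3.0, 4.0])
```

Although the second feature is strongly correlated with the first in the toy data, the estimated ALE grows by 2 per unit of the first feature, which is its own contribution to the model.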

The SHAP value
The SHapley Additive exPlanation (SHAP) value [Lundberg and Lee, 2017] is an extremely powerful local XAI technique that aims at explaining individual predictions, and in particular the contribution of each single variable to the overall prediction. The SHAP method computes Shapley values [Hart, 1989] as an additive feature attribution, like a linear model, so that the prediction is decomposed as

$$\hat{\varphi}(x) = \phi_0 + \sum_{i=1}^{n} \phi_i,$$

where $\phi_i$ represents the contribution of the $i$-th feature and $\phi_0$ is an intercept.
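The additive decomposition can be checked on a small example by computing exact Shapley values through enumeration of all feature subsets. The sketch below is a simplification of what SHAP does: features that are absent from a coalition are replaced by a single baseline value, whereas SHAP averages over a background dataset and uses efficient approximations. The toy model, the baseline and the function name are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, with absent features set to the baseline."""
    n = len(x)

    def value(present):
        mixed = [x[i] if i in present else baseline[i] for i in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi_i = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                phi_i += weight * (value(present | {i}) - value(present))
        contributions.append(phi_i)
    return value(set()), contributions   # intercept phi_0 and the phi_i


# Toy model with an interaction between the first two features (illustrative only).
def toy_model(v):
    return v[0] * v[1] + v[2]


point = [2.0, 3.0, 1.0]
phi_0, phi = shapley_values(toy_model, point, baseline=[0.0, 0.0, 0.0])
```

Here the interaction term is shared equally between the two interacting features, and the intercept plus the contributions add up to the prediction at the explained point.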

Error assessment
To provide a comprehensive and accurate assessment of ANAKIN performance, many metrics are used throughout this paper. In order to compare cell survival fractions, for each experiment we computed the logarithmic Root Mean Square Error (logRMSE), defined as

$$\mathrm{logRMSE}_i := \sqrt{\frac{1}{N_D} \sum_{D} \left(\log \hat{S}_i(D) - \log S_i(D)\right)^2},$$

where $D$ is the dose and $N_D$ is the number of doses measured in the $i$-th experiment. $\hat{S}_i$ and $S_i$ are the predicted and measured cell survivals, respectively. In the paper, the average and standard deviation of all the errors used are calculated over the experiments included in the test set.
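A direct transcription of the logRMSE for a single experiment is given below as a minimal sketch. The natural logarithm is assumed here, since the base is not specified, and the survival values are made up for illustration.

```python
import math


def log_rmse(predicted_survival, measured_survival):
    """Root mean square error between the logarithms of two survival curves."""
    squared = [(math.log(p) - math.log(m)) ** 2
               for p, m in zip(predicted_survival, measured_survival)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical survival fractions at the doses of one experiment (illustrative only).
measured = [0.80, 0.45, 0.10, 0.01]
predicted = [0.78, 0.50, 0.12, 0.008]
experiment_error = log_rmse(predicted, measured)
```

Working on the logarithm gives comparable weight to errors at high doses, where survival fractions span several orders of magnitude.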
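The RBE at the survival level $\rho$ is defined as

$$\mathrm{RBE}_\rho := \frac{D_\gamma^{\rho}}{D^{\rho}},$$

where $D^{\rho}$ is the dose giving a survival fraction $\rho$ for the considered radiation and $D_\gamma^{\rho}$ is the corresponding dose for the reference radiation. Also, we denote

$$\mathrm{RBE}_\alpha := \frac{\alpha_{ion}}{\alpha_\gamma}, \qquad \mathrm{RBE}_\beta := \sqrt{\frac{\beta_{ion}}{\beta_\gamma}},$$

where $\alpha_\gamma$ and $\beta_\gamma$ represent the $\alpha$ and $\beta$ values for the reference radiation, respectively. We specifically consider three survival levels, $\rho = 0.5, 0.1, 0.01$. In the paper, we focus on the RBE at $\rho = 0.1$ (10% survival, written RBE$_{10}$ in the Results), as this is the main value used in radiobiology for particle therapy.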
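When both survival curves follow the LQ model, the dose giving a survival $\rho$ is the positive root of $\beta D^2 + \alpha D + \log \rho = 0$, and the RBE follows as a ratio of two such doses. The sketch below works under that assumption; the function names and parameter values are hypothetical.

```python
import math


def dose_at_survival(rho, alpha, beta):
    """Dose D such that exp(-(alpha*D + beta*D^2)) equals rho."""
    target = -math.log(rho)
    if beta == 0.0:
        return target / alpha            # purely linear survival curve
    return (-alpha + math.sqrt(alpha ** 2 + 4.0 * beta * target)) / (2.0 * beta)


def rbe_at_survival(rho, alpha_ion, beta_ion, alpha_gamma, beta_gamma):
    """RBE at survival level rho: reference dose divided by ion dose."""
    return (dose_at_survival(rho, alpha_gamma, beta_gamma)
            / dose_at_survival(rho, alpha_ion, beta_ion))


# Hypothetical LQ parameters (illustrative only).
alpha_gamma, beta_gamma = 0.2, 0.05
alpha_ion, beta_ion = 0.6, 0.05
rbe_10 = rbe_at_survival(0.1, alpha_ion, beta_ion, alpha_gamma, beta_gamma)
```

The same function evaluated at $\rho = 0.5$ and $\rho = 0.01$ gives the other two endpoints considered in this work.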
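The comparison between the RBE measured and the RBE calculated with ANAKIN is investigated using the Mean Absolute Error (MAE) metric and the Mean Absolute Percentage Error (MAPE) metric,

$$\mathrm{MAE}_\rho := \frac{1}{N_{exp}} \sum_{i=1}^{N_{exp}} \left| \widehat{\mathrm{RBE}}^{\,i}_\rho - \mathrm{RBE}^{i}_\rho \right|, \qquad \mathrm{MAPE}_\rho := \frac{100}{N_{exp}} \sum_{i=1}^{N_{exp}} \frac{\left| \widehat{\mathrm{RBE}}^{\,i}_\rho - \mathrm{RBE}^{i}_\rho \right|}{\mathrm{RBE}^{i}_\rho},$$

where $N_{exp}$ is the number of experiments considered, and $\widehat{\mathrm{RBE}}^{\,i}_\rho$ and $\mathrm{RBE}^{i}_\rho$ represent ANAKIN values and measurements, respectively, for the endpoint $\rho = 0.5, 0.1, 0.01, \alpha, \beta$ and the $i$-th experiment. Since the range of RBE is extremely wide, the two metrics are often used together to provide a better evaluation of the performance of ANAKIN.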
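We also calculated the MAE values of $\alpha$ and $\beta$ as

$$\mathrm{MAE}_\alpha := \frac{1}{N_{exp}} \sum_{i=1}^{N_{exp}} \left| \bar{\alpha}^{i}_{ion} - \alpha^{i}_{ion} \right|, \qquad \mathrm{MAE}_\beta := \frac{1}{N_{exp}} \sum_{i=1}^{N_{exp}} \left| \bar{\beta}^{i}_{ion} - \beta^{i}_{ion} \right|,$$

where $N_{exp}$ is the number of experiments considered, $\bar{\alpha}^{i}_{ion}$ and $\bar{\beta}^{i}_{ion}$ are the predicted $\alpha$ and $\beta$ values for the $i$-th experiment, whereas $\alpha^{i}_{ion}$ and $\beta^{i}_{ion}$ are the measured data. For those quantities, the MAPE values were not calculated, as the absolute values of both $\alpha$ and $\beta$ were close to 0.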
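The two metrics can be sketched as follows. The MAPE is expressed in percent here, which is an assumption, and the RBE values are invented to show why the two metrics are read together: the same absolute error weighs much more, in relative terms, when the measured value is small.

```python
def mean_absolute_error(predicted, measured):
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(measured)


def mean_absolute_percentage_error(predicted, measured):
    """MAPE in percent; unstable when measured values approach zero."""
    return 100.0 * sum(abs(p - m) / abs(m)
                       for p, m in zip(predicted, measured)) / len(measured)


# Invented RBE values (illustrative only): same absolute error, different relative error.
measured_rbe = [1.0, 3.0]
predicted_rbe = [1.2, 3.2]

mae = mean_absolute_error(predicted_rbe, measured_rbe)               # 0.2
mape = mean_absolute_percentage_error(predicted_rbe, measured_rbe)   # (20% + 6.67%) / 2
```

The division by the measured value is also the reason why a percentage error is not meaningful for quantities such as $\alpha$ and $\beta$ that are close to zero.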

Results
Results of the current work include a quantitative and comprehensive comparison between ANAKIN cell survival predictions and the experimental measurements available in PIDE. A wide range of metrics, such as the RBE at different cell survival probabilities, the $\alpha$ and $\beta$ predictions, as well as the cell survival at different doses, are presented. A detailed description of the metrics used is reported in Section 2.4.
Figure 2a shows (A) MAPE and (B) MAE values (Section 2.4) for RBE$_{10}$, RBE$_{50}$ and RBE$_1$, while the numerical values as well as the logRMSE are reported in Table 2. The results indicate that ANAKIN has similar errors for different endpoints, with RBE$_{50}$ exhibiting a slightly higher MAE and MAPE than RBE$_{10}$ and RBE$_1$.
Measured RBE$_{10}$ values and ANAKIN predictions are reported in Figure 2b as a function of the LET and $\beta_\gamma/\alpha_\gamma$. In addition, the MAE for RBE$_{10}$, RBE$_{50}$ and RBE$_1$ are plotted against LET and $\beta_\gamma/\alpha_\gamma$. The results indicate an excellent agreement between the ANAKIN RBE$_{10}$ and the experimental data over the entire range of LET. The smoothing spline of the predicted RBE$_{10}$ as a function of LET completely overlaps with the experimental curve. Although this is not necessarily a proof of perfect agreement, it is nonetheless clear that the experimental trend is predicted by ANAKIN. A good agreement can also be seen by analyzing the results of each experiment. This is also supported by the MAE for the other endpoints (Figure 2b (C) and Table 2), which remains mostly constant for LET > 10 keV/µm. Concerning errors as a function of $\beta_\gamma/\alpha_\gamma$, there is a higher variability than observed for LET. The discrepancy observed in the spline smoothing at high $\beta_\gamma/\alpha_\gamma$ seems to be an artifact of the smoothing procedure, as it is not reflected in the MAE (panel (D)). On the contrary, at low $\beta_\gamma/\alpha_\gamma$, i.e. for cell-lines with high $\alpha_\gamma/\beta_\gamma$, ANAKIN clearly underestimates the RBE$_{10}$, as also indicated by the high MAE in the low $\beta_\gamma/\alpha_\gamma$ region.
Figure 3 shows the experimental RBE$_{10}$ against the ANAKIN prediction. The results are sharply distributed around the bisector representing the ideal perfect prediction. The deviation between the bisector and the model prediction increases as the RBE grows.
Figure 4 reports ANAKIN RBE$_{10}$ predictions compared to the measurements, plotted against LET for 4 different ions (protons, helium, carbon and iron) over a very broad LET range. Overall, ANAKIN seems to reproduce the experimental data well. For protons, ANAKIN can reproduce the small RBE variability at low LET as well as the clear rise above 20 keV/µm. ANAKIN is also accurate for helium and carbon ions, and it is clearly able to reproduce the overkilling effect, which yields a decrease in the RBE$_{10}$ around 100 keV/µm. ANAKIN values appear to be very close to the measurements also for iron.
Similar conclusions can be drawn from Figure 5a. Iron shows a strongly peaked distribution because of the low number of available experiments; nonetheless, iron shows a low error in both metrics. Besides iron, the other ions show comparable results, with helium having a broader distribution in both MAE and MAPE, reflecting a lower accuracy of ANAKIN. Protons exhibit an error distribution peaked around the average values, as well as some outliers with high errors, as clearly indicated by the spikes in the high-error region. However, these peaks appear for the MAPE, and we hypothesize that they might be mainly caused by low RBE values, which can result in high percentage errors.
Figure 5b shows MAE and MAE error distributions evaluated for RBE 10 , grouped by monoenergetic beam and Spread-out Bragg-peak (SOBP).The peak of the distributions is similar for both cases, but the error distribution for the monoenergetic beams is clearly broader than for the SOBP.
Figure 6 shows RBEα and RBEβ plotted against LET and βγ/αγ. RBEα values are accurately predicted by ANAKIN independently of LET and βγ/αγ. A higher inaccuracy is observed for RBEβ in the low βγ/αγ region. The absolute errors in the α and β predictions show a steady behavior over the LET range (panel (E)), while the errors on the α values clearly decrease as βγ/αγ increases, consistent with the analysis above.
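As a worked illustration of the endpoints discussed above, the following minimal Python sketch derives RBE10, RBEα and RBEβ from linear-quadratic (LQ) parameters. It assumes the usual definitions, namely RBE at survival level S equal to D_γ(S)/D_ion(S), RBEα = α/αγ and RBEβ = sqrt(β/βγ); the definitions actually adopted for ANAKIN are those of Section 2.4. The function names and all numerical parameter values are hypothetical and chosen only for illustration.

```python
import math


def dose_at_survival(alpha, beta, survival):
    """Dose (Gy) at which the LQ model S = exp(-alpha*D - beta*D^2) equals `survival`.

    Assumes alpha > 0 and beta >= 0.
    """
    if not 0.0 < survival < 1.0:
        raise ValueError("survival must lie strictly between 0 and 1")
    neg_log_s = -math.log(survival)
    if beta == 0.0:
        # Purely exponential survival curve.
        return neg_log_s / alpha
    # Positive root of beta*D^2 + alpha*D - neg_log_s = 0.
    return (-alpha + math.sqrt(alpha ** 2 + 4.0 * beta * neg_log_s)) / (2.0 * beta)


def rbe_at_survival(alpha_ion, beta_ion, alpha_gamma, beta_gamma, survival=0.10):
    """RBE as the ratio of photon dose to ion dose at the same survival level."""
    d_gamma = dose_at_survival(alpha_gamma, beta_gamma, survival)
    d_ion = dose_at_survival(alpha_ion, beta_ion, survival)
    return d_gamma / d_ion


# Hypothetical LQ parameters (Gy^-1 and Gy^-2), not taken from PIDE.
alpha_gamma, beta_gamma = 0.2, 0.02
alpha_ion, beta_ion = 0.6, 0.02

rbe10 = rbe_at_survival(alpha_ion, beta_ion, alpha_gamma, beta_gamma, survival=0.10)
rbe_alpha = alpha_ion / alpha_gamma
rbe_beta = math.sqrt(beta_ion / beta_gamma)
print(f"RBE10 = {rbe10:.2f}, RBE_alpha = {rbe_alpha:.2f}, RBE_beta = {rbe_beta:.2f}")
```

RBE50 and RBE1 follow from the same function by setting `survival` to 0.50 or 0.01.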

Comparison with MKM and LEM
To further assess the accuracy of ANAKIN in predicting cell survival and RBE, we compared it with the only two RBE models that are currently used in clinical practice, namely the MKM [Hawkins, 1994, Inaniwa et al., 2010, Inaniwa and Kanematsu, 2018, Bellinzona et al., 2021] and LEM [Krämer et al., 2000, Elsässer and Scholz, 2007, Elsässer et al., 2008, Pfuhl et al., 2022]. To calculate the biological outcomes from the MKM and LEM III, we used the survival toolkit [Manganaro et al., 2018]. We performed the comparison for the HSG and V79 cell-lines, because they are among the most used in radiobiological experiments, and several datasets are available in literature. For the V79 cell-line, we used 41 different experiments with proton, helium and carbon ions, while for the HSG cell-line, we included 15 experiments conducted with helium and carbon ions. To compare the models, the same metrics introduced in Section 2.4 are used.
Predictions with the MKM and LEM were performed with the survival toolkit [Manganaro et al., 2018, Attili and Manganaro, 2018], which implements a limited number of versions of these models. In particular, a newer version of the LEM, namely LEM IV [Elsässer et al., 2010], has been developed, but it has not been used in the current study since a freely usable version is not available; we therefore used LEM III. However, an extensive quantitative study has been published [Pfuhl et al., 2022], so that a further quantitative comparison between the accuracy of LEM IV and that of ANAKIN can be conducted. The comparison between the models is shown in Figures 7a-7b, while the numerical values are reported in Table 3. The results indicate that overall ANAKIN is more accurate than both the MKM and the LEM in predicting all metrics and endpoints considered. For both cell-lines, the LEM shows the largest deviations from the measurements, closely followed by the MKM, with ANAKIN reporting lower errors. In particular, for the HSG cell-line, the LEM shows a MAE for the RBE10 of 1.55, whilst the MKM and ANAKIN have a MAE of 1.18 and 0.43, respectively. For the V79 cell-line, the MKM and the LEM predict comparable results with a MAE for RBE10 of 1.5 and 1.2. ANAKIN shows a MAE for RBE10 of 0.43. Other endpoints and metrics show comparable results. Besides having lower average errors, ANAKIN exhibits narrower error distributions, and never reaches absolute errors as high as those of the MKM and LEM.
Figures 8-9 illustrate the absolute difference of MAE between ANAKIN and the MKM or LEM, calculated for the RBE10 of both cell lines. All experimental datasets were obtained from PIDE.
This analysis confirms the results shown in Figures 7a-7b and Table 3. ANAKIN is more accurate in predicting the selected biological outputs than both the MKM and LEM for the majority of experiments. The maximum discrepancy between MAE values is significantly higher when ANAKIN is closer to the experimental data (yellow dots), reaching differences above 6 for the V79 cell-line, but only 2 in the opposite case. This result indicates that, in the cases where ANAKIN is less accurate than the other models, its error is smaller than the error of the MKM and LEM III in the cases where their predictions are off.

Explainable Artificial Intelligence
Figure 10a shows the global importance of all variables included in ANAKIN, calculated over the whole test dataset. The plot suggests that the dose is by far the most relevant parameter, followed by βγ, LET and Dose2 (the dose squared), which are all close together. This finding indicates that ANAKIN uses both physical and biological variables to predict the biological outcome. The Ion and Cells variables, on which a categorical embedding has been performed, denote the ion type and cell-line, respectively, and both have a high impact on ANAKIN.
To better understand the global behavior of ANAKIN, and in particular the correlation between LET and the predicted survival, we calculated the ALE plot as described in Section 2.3; it is reported in Figure 10b as a function of the LET. To obtain an unbiased effect, the ALE has been evaluated at the same dose of 2 Gy for all the experiments. The typical trend of the overkilling effect is clearly visible: Figure 10b implies that small positive variations in the LET yield a clear negative variation in the cell survival, with a consequent increase of the RBE, up to 100 keV/µm, after which the cell survival starts increasing again as the LET increases, with a consequent decrease in the RBE. Figure 11a reports the SHAP value for each experiment plotted against LET, considering a fixed dose of 2 Gy. Unlike the ALE plot, SHAP is a local technique, namely the SHAP value is evaluated for each individual input variable, and thus Figure 11a shows the importance of LET in the overall cell survival assessment, evaluated for each experiment. As for the ALE plot, the typical behavior of the overkilling effect emerges. Protons exhibit a high positive SHAP value, which remains almost constant up to 15 keV/µm. As the LET rises above 15 keV/µm, protons show a steep increase in the RBE, and the SHAP value for LET drops. In this region, especially for low-energy protons, the kinetic energy becomes more important than the LET to predict the cell survival (Figure 12b). The SHAP value decreases steadily up to 100 keV/µm, after which it starts to increase again, reproducing the typical shape due to the overkill effect.
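The ALE computation referred to above can be sketched in a few lines of plain Python. This is a simplified first-order ALE estimator under stated assumptions, not the implementation used for Figure 10b: bins are taken from data quantiles, the curve is centred with a count-weighted mean, and `toy_model` is an arbitrary additive function with no physical meaning. The names `ale_1d`, `toy_model`, `let` and `dose` are hypothetical.

```python
import random


def ale_1d(predict, rows, feature, n_bins=10):
    """First-order ALE of `feature` for `predict` over `rows` (a list of dicts)."""
    values = sorted(row[feature] for row in rows)
    n = len(values)
    # Quantile-based bin edges taken from the data; duplicated edges are merged.
    edges = sorted({values[round(k * (n - 1) / n_bins)] for k in range(n_bins + 1)})
    if len(edges) < 2:
        raise ValueError("feature is constant; ALE is undefined")

    accumulated = [0.0]  # uncentred ALE at each bin edge
    counts = []          # number of rows falling in each bin
    for i, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        members = [r for r in rows
                   if lo < r[feature] <= hi or (i == 0 and r[feature] == lo)]
        # Mean change in the prediction when only `feature` moves across the bin,
        # evaluated on the rows that actually lie in that bin.
        diffs = [predict({**r, feature: hi}) - predict({**r, feature: lo})
                 for r in members]
        accumulated.append(accumulated[-1] + sum(diffs) / len(diffs))
        counts.append(len(members))

    # Centre the curve so that its count-weighted mean is zero.
    mean_effect = sum(a * c for a, c in zip(accumulated[1:], counts)) / sum(counts)
    return edges, [a - mean_effect for a in accumulated]


def toy_model(row):
    # Arbitrary additive toy function, used only to exercise the estimator.
    return -0.002 * row["let"] - 0.1 * row["dose"] ** 2


rng = random.Random(0)
# The dose is fixed at 2 Gy for every row, mirroring the evaluation described above.
rows = [{"let": rng.uniform(1.0, 200.0), "dose": 2.0} for _ in range(500)]
edges, ale = ale_1d(toy_model, rows, "let")
```

For an additive model such as `toy_model`, the ALE curve recovers the slope of the model in the chosen feature, which makes the estimator easy to check; for a model with interactions, the local differences are averaged only over rows that lie in each bin, which is what distinguishes ALE from a partial dependence plot.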
Figure 11b shows the SHAP values for αγ and βγ. For low αγ, which corresponds to cell-lines with αγ/βγ < 5 Gy, the SHAP value is positive and almost constant; as αγ increases and αγ/βγ exceeds 5 Gy, the SHAP value decreases linearly with αγ, reaching large negative values for high αγ and αγ/βγ. A similar trend is shown by the βγ SHAP values. For low βγ and high αγ/βγ, the SHAP value is positive, and then it begins to diminish. The data also indicate that the SHAP values for βγ show both higher variability and higher absolute values than those for αγ.
Finally, we performed a comparison of experiments considering the SHAP values. Figure 12a shows the SHAP values for ANAKIN input features for two experiments performed with protons of similar LET (18 keV/µm and 19 keV/µm) for different cell lines. The corresponding RBE10 values of the two experiments are significantly different, being 1.2 and 2.7, respectively, as can be seen in Figure 4 panel (A). ANAKIN outputs are extremely accurate for both experiments, being 1.13 and 2.5, with a MAE of 0.07 and 0.1 and a MAPE of 0.05 and 0.03, respectively. Figure 12a suggests that the only variables showing a significant difference between the two experiments are αγ and βγ, as expected since the two experiments were performed on different cell-lines.
(a) Importance analysis for the variables used by ANAKIN. The variable names reflect PIDE documentation [Friedrich et al., 2013b, Friedrich et al., 2021]. We also introduced Dose2 as a new variable, representing the square of the dose.
Figure 12b compares the SHAP values for 4 different experiments, performed with either protons or helium of different LET (high or low). The rationale for the selection of the measurements is to test ANAKIN for different ions and LET. Besides differences in the cell-line specific parameters, focusing only on ion specific variables, it can be seen that LET and energy are treated significantly differently. The SHAP value for LET is high and positive for both proton datasets and for low-LET helium, while it is negative for high-LET helium. The SHAP value related to the beam energy is positive and close to 0 for both particles when the LET is low, it is negative and close to 0 for high-LET helium, and negative with a high absolute value for high-LET protons.
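To make the local attributions discussed above more concrete, the following sketch computes exact Shapley values by brute force for a toy function of three inputs. It is only an illustration of the quantity that SHAP estimates, under the assumption that absent features are replaced by a single reference (baseline) value; it is not the procedure used to produce Figures 11-12, and `toy_survival`, its coefficients and the input values are hypothetical.

```python
from itertools import combinations
from math import exp, factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance` relative to `baseline`.

    Features outside a coalition are set to their baseline value. The cost grows
    as 2^n, so this is only practical for a handful of features.
    """
    features = list(instance)
    n = len(features)

    def value(coalition):
        mixed = {f: instance[f] if f in coalition else baseline[f] for f in features}
        return predict(mixed)

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {f}) - value(coalition))
        phi[f] = total
    return phi


def toy_survival(x):
    # Toy LQ-like survival with an LET-dependent linear term; "energy" is unused.
    alpha = 0.2 + 0.005 * x["let"]
    return exp(-(alpha * x["dose"] + 0.02 * x["dose"] ** 2))


instance = {"dose": 2.0, "let": 20.0, "energy": 5.0}
baseline = {"dose": 1.0, "let": 5.0, "energy": 50.0}
phi = shapley_values(toy_survival, instance, baseline)
```

By construction the attributions sum to the difference between the prediction at `instance` and the prediction at `baseline`, and a feature the model ignores receives a value of zero. The sign of each attribution says whether that input pushes the predicted survival above or below the reference prediction, which is how the positive and negative SHAP values above are read.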

Discussion
The results reported in Section 3 show that ANAKIN produces accurate results over a wide range of biological endpoints and beams of different particle species and energies, with a consistent behavior for different error metrics. Although logRMSE is the most robust metric, being able to detect discrepancies between the predicted cell survival and the experiments at different doses, most of the analysis of ANAKIN results has been conducted on the RBE, so that it can be easily compared with existing radiobiological models. As often in predictive analysis, the choice of the metric is of fundamental importance and strongly depends on the specific variable that the model is built to predict. In this particular case, the cell survival curves considered in the study have an extremely wide range, with corresponding RBE up to 6, and for this reason a single metric cannot give a robust evaluation of the model accuracy. Experiments conducted with high-LET radiation, which are characterized by high RBE values, might have high absolute errors. However, the relative percentage error might be lower in such cases, being perhaps a more appropriate metric for experiments with high RBE. On the contrary, for experiments where the RBE is close to 1, such as those conducted with high-energy protons, the MAPE might be misleading, and the MAE might be a better tool. Since the choice of the most relevant metric is not always trivial, and depends on the chosen endpoint, our analysis was usually performed studying MAPE and MAE together. Nonetheless, for the sake of readability and to avoid giving an excessive amount of information, when reasonable, only the MAE is reported, as it is considered to be, for the present case, more informative than the MAPE.
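The interplay between the two metrics can be illustrated with a short sketch. It assumes the standard definitions of MAE and MAPE (the latter expressed as a fraction), which may differ in detail from those of Section 2.4, and the RBE values below are invented for illustration only.

```python
def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def mape(measured, predicted):
    """Mean absolute percentage error, expressed as a fraction of the measured value."""
    return sum(abs(m - p) / abs(m) for m, p in zip(measured, predicted)) / len(measured)


# Invented values: the same absolute deviation of 0.3 at a low and at a high RBE.
low_measured, low_predicted = [1.1], [1.4]
high_measured, high_predicted = [5.0], [5.3]

mae_low, mape_low = mae(low_measured, low_predicted), mape(low_measured, low_predicted)
mae_high, mape_high = mae(high_measured, high_predicted), mape(high_measured, high_predicted)
print(f"low RBE:  MAE = {mae_low:.2f}, MAPE = {mape_low:.2f}")
print(f"high RBE: MAE = {mae_high:.2f}, MAPE = {mape_high:.2f}")
```

The MAE is identical in the two cases, whereas the MAPE is several times larger at low RBE, which is why the two metrics are read together.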
MAPE and MAE distributions (Figure 2a and Table 2) show that the errors for RBE10, RBE1 and RBE50 are all sharply peaked around the average values with low deviation, denoting an overall consistent cell-survival prediction despite the extremely large range of LET and cell-lines considered in the study. Further, it can be seen that ANAKIN is able to reproduce not only the average RBE, represented by the continuous splines, but also the high variability of the RBE across many LET values and cell-lines. The validation of ANAKIN against RBE10 measurements (Figure 2b) shows the model accuracy. When the MAE is plotted against the LET, two key aspects emerge: (i) in the low-LET region, the MAE for RBE50 is slightly lower than in the high-LET region; (ii) the MAE for RBE10 and RBE1 remains almost constant in the whole LET range, with a slight drop at around 30 keV/µm. Notably, in the range 80-120 keV/µm, the experiments exhibit a large variation in RBE; nonetheless, the ANAKIN error does not seem to be affected by this large variability, with no evident increase in inaccuracy. This is consistent with the observation above that ANAKIN is able to predict RBE fluctuations at high LET. Further, a feature that supports the potential of AI in modeling cell survival and RBE is that ANAKIN predicts the overkill effect around 100 keV/µm without any specific training.
ANAKIN RBE10 predictions show a slightly higher discrepancy from the experimental values in the low βγ/αγ region, again mostly for RBE50, which corresponds to cell-lines with high αγ/βγ (Figure 2b). These cell-lines are extremely radiosensitive, and are therefore characterized by a larger experimental variability, which is reflected in the lower accuracy of ANAKIN predictions. Further, fewer experiments have been performed for cell-lines with high αγ/βγ, so that a higher error might simply be a natural fluctuation due to lower statistics.
A specific analysis of the predictions for single ion species shows that ANAKIN accurately predicts cell survival over a wide range of ion species with very different LET, also correctly capturing the dependence of LET-RBE profiles on the ion type. For protons, ANAKIN is capable of reproducing the almost constant RBE at low LET with a steep increase after 5 keV/µm, as well as the extremely high RBE at around 20 keV/µm. As shown, for example, in [Missiaggia et al., 2022a], currently used RBE models are often unable to accurately reproduce the RBE for very low energy protons. ANAKIN could then provide a robust and accurate tool to predict the RBE of clinical protons, thus allowing the development of TPS with a variable RBE instead of the fixed value of 1.1 currently used.
The comparison with experimental data acquired with helium and carbon ions shows that ANAKIN predictions are also accurate for these species, even if they exhibit a larger variability in the errors. This is a direct consequence of the higher variability of the RBE characterization connected to these two ions. These findings suggest that ANAKIN could provide an invaluable tool for predicting RBE for heavy ions, for which it is commonly accepted that a constant value cannot be used in clinical applications.
The MAE distribution for monoenergetic beams peaks at a lower value than that for SOBP (Figure 5b), but it is broader. We hypothesize that this behavior could be due to the fact that, for a monoenergetic beam, an inaccurate LET estimation can result in a significantly different RBE prediction. Overall, ANAKIN is able to accurately predict RBE values for both monoenergetic beams and SOBP, without the need for ad hoc adaptations. RBEα and RBEβ also show a good agreement between predicted and experimental values. Both α and β errors seem to remain constant over the whole range of LET, whereas an analysis of the α variability as a function of βγ/αγ shows that for high αγ/βγ cell-lines the estimation of α is subject to higher uncertainty. There is a clear underestimation of α for high αγ/βγ cell-lines, which directly translates into the high RBE error for these cell-lines, as already discussed above. Similar conclusions can be drawn for β, with a slightly higher error in the high βγ/αγ region. These results point out that ANAKIN is able to reproduce not only α, but also β, which is typically subject to an extremely high uncertainty, as shown for instance by the low accuracy of many models in reproducing the β variability [Pfuhl et al., 2022]. In addition, β is predicted to be dependent on the radiation quality, as shown by the trend of the experimental data and in contrast to many other existing radiobiological models.

Comparison with MKM and LEM
To further test ANAKIN capability and appreciate its accuracy, we compared its results with predictions from the two radiobiological models currently used in clinics (MKM and LEM).
The findings of this comparison indicate that ANAKIN has an overall higher accuracy than both the MKM and the LEM in all the metrics. The MKM performs slightly better than the LEM, but this result could be related to the fact that we had to use LEM III, instead of the latest version LEM IV, which was not available. This hypothesis is supported by the results reported in [Pfuhl et al., 2022], which show that LEM IV has significantly better accuracy than LEM III.
ANAKIN error distribution is significantly less broad than both the MKM and LEM distributions, and its maximum error is lower. ANAKIN has not been specifically trained to predict the two cell-lines selected for the comparison (i.e. V79 and HSG), but on a wide range of different cell-lines available on PIDE.
The analysis performed on several experiments suggests that ANAKIN is more accurate than both the MKM and LEM. Overall, ANAKIN shows a lower error than the MKM and LEM, and even when its prediction is less accurate than the other two models, the maximum error is lower than those obtained with the other two.

Explainable Artificial Intelligence
The global variable importance study presented in Figure 10a shows how both biological and physical variables are efficiently used by ANAKIN to predict cell survival. The dose is the most important variable, as expected. The analysis also identifies the square of the dose as a relevant variable, which is also reasonable as the quadratic relation between the survival and the dose is widely used in many RBE models. Concerning the physical variables, LET is considered to be more important than the ion kinetic energy.
For the biological variables, αγ and βγ are among the most relevant input parameters together with the cell-line. Our hypothesis is that ANAKIN uses these three variables to understand how a specific cell-line responds to ionizing radiation.
The variables over which a Deep Embedding was performed, namely Ion, Cells and CellCycle, are also relevant to the model predictions, suggesting that such advanced DL based embedding has been able to uncover important information.
The overkill effect also emerges when analyzing the dependence of the SHAP value on the LET. For protons, ANAKIN gives approximately the same positive and high importance to LET up to 15 keV/µm. In this region, protons show an almost constant RBE, and ANAKIN recognizes this behavior by giving the same importance to different values of LET.
The association between high αγ values and high negative importance in Figure 11b reflects the fact that, for highly radiosensitive cell-lines, the αγ value, which describes the contribution of single tracks, should be more important than in cell-lines with lower αγ/βγ.
The SHAP value could also be extremely important in understanding which variables lead to a certain RBE. To support this hypothesis, we compared two experiments conducted with protons of comparable LET, namely 18 keV/µm and 19 keV/µm. The results indicate that ANAKIN can correctly predict a significant variability in the RBE, which in this case stemmed from the fact that different cell lines were considered, as pointed out by the SHAP values for the αγ and βγ parameters.
To investigate how different physical variables affect ANAKIN outcomes, we considered proton and helium beams of different energies. We found that LET always has a high positive impact except for high-LET helium, as this beam was the only one with an LET in the overkill region. Therefore it might be concluded that the LET is used by ANAKIN in a significantly different manner when an overkill effect is expected. The SHAP value for the beam kinetic energy is positive for protons and helium at low LET, whereas it is negative for the high-LET beams, which have extremely low energies (below 1 MeV/u), corresponding to depths downstream of the Bragg peak. The SHAP analysis indicates that the kinetic energy is much more important for protons than helium at low values. This behavior can be due to the fact that low-energy helium ions have an LET above the overkill threshold, and thus the main information to predict cell survival is carried by LET. Overall, it seems that ANAKIN is able to jointly use LET and kinetic energy to accurately predict the cell survival fraction.
Advanced XAI techniques have been applied to understand which variables are relevant to ANAKIN predictions, as well as to show how specific biological features observed in experimental data, such as the overkilling effect at high LET and the variable β coefficients, are reproduced by ANAKIN. The implementation of such behaviors is non-trivial in a purely mathematical model, and it thus represents a strength of ANAKIN. Furthermore, these XAI techniques can play a major role in clinical applications, since they allow not only the interpretation of ANAKIN predictions but also an understanding of how reliable a given prediction could be. It is worth stressing that one of the major limitations in biophysical modelling of radiation effects, both for curative and radioprotective purposes, lies exactly in the estimation of uncertainties. Although an AI-based approach is not derived from mechanistic considerations, unlike existing radiobiological models, and thus cannot provide a validation of these mechanisms, its power in processing and filtering the data dependencies can help to reveal features hidden in the data, which in turn can drive further comprehension of the phenomenon.

Conclusions
The present paper presents the AI-based model ANAKIN, for predicting the survival probability of various cell lines exposed to different types of radiation. The findings contained in this paper show that a single model is able to predict the behavior of different ion species, without the need to specifically train the model on data relative to a single ion. Although the main motivation for developing ANAKIN is to apply it in particle therapy, the model accuracy in predicting the biological effect of extremely high LET events could extend its application to other fields, such as space radioprotection.
The analysis described here indicates that ANAKIN is able to accurately predict cell survival and RBE over a wide range of different cell-lines and ion types. Higher uncertainties and errors emerge for cell-lines characterized by high αγ/βγ and for LET in the range from 20 to 150 keV/µm. These uncertainties reflect the uncertainties in the experimental data on which ANAKIN has been trained. In fact, cell-lines with high αγ/βγ as well as experiments with high-LET beams show a higher variability of RBE.
When compared with two of the most widely used radiobiological models, namely the MKM and LEM III, ANAKIN showed on average more accurate predictions. The gap between the models could be smaller if the latest versions of the MKM and LEM become available in literature.
Although purely data-driven models are often considered to be less powerful than mechanistic models, ML and DL have the advantage of being extremely flexible. This is supported by the fact that ANAKIN predicts both the overkill effect and the variable β. On the contrary, in mechanistic models, ad hoc correction terms must typically be added, for instance into the MKM, to include these effects.
The modular structure of ANAKIN makes it very easy to include advanced features. The most relevant example is the implementation of a radiation quality description different from the classical LET, such as microdosimetric or nanodosimetric quantities, as well as the coupling of ANAKIN with a mechanistic RBE model.
In conclusion, we showed that ANAKIN is an intuitive and understandable model that demonstrates high accuracy in predicting cell survival and RBE. Any prediction given by ANAKIN can be analyzed in detail, so that the contribution of each input variable can be precisely assessed. Several advanced XAI techniques can be used both to understand whether a well-known biological or physical phenomenon, such as the overkill effect or the LET-dependent β, has been included in ANAKIN, and to gain further insight and unveil existing correlations between different variables. While AI has been broadly employed in radiotherapy treatment planning, either for physical dose optimization or image segmentation, ANAKIN is the first application to radiobiological calculations, which may open the possibility of using for the first time an AI-based model for biological treatment planning, i.e. the optimization of the dose delivery with explicit consideration of a voxel-dependent RBE. This problem is typically extremely heavy computationally and strongly linked to uncertainties, two features for which a model like ANAKIN may be of the utmost advantage.
(a) Number of experiments for each ion type, reported on the horizontal axis, and cell-line, reported on the vertical axis; the size refers to the number of experiments. (b) A schematic representation of ANAKIN. Data from PIDE are input into two types of tree-based models (RF and XGBoost), either directly or after being processed with a Deep Embedding. All four models predict cell survival, and the values are used as input for ANAKIN, which suitably combines the predictions to provide a final cell survival output.

Figure 2: ANAKIN predictions and error assessment for different endpoints.

Figure 3: RBE10 extracted from PIDE plotted against the RBE10 predicted by ANAKIN. The color represents the density, while the diagonal dotted red line indicates the perfect prediction.

Figure 4: RBE10 predicted by ANAKIN (black) and extracted from PIDE (yellow) plotted against LET for protons (top left), helium (top right), carbon (bottom left) and iron (bottom right) ions. To guide the eye, the continuous lines represent spline smoothing.
(a) MAPE (A) and MAE (B) distributions for ANAKIN RBE10 prediction for carbon ions (yellow), iron (blue), helium (purple) and protons (red). Dotted vertical lines indicate the distributions average values. (b) MAPE (A) and MAE (B) distribution for ANAKIN RBE10 prediction for monoenergetic beams (yellow) and SOBP (blue). The dotted vertical lines indicate the distributions average values.

Figure 5: ANAKIN MAPE and MAE distributions for different ions and irradiation conditions.
(a) MAPE (A) and MAE (B) distributions for RBE10 calculated with ANAKIN (yellow), LEM III (blue) and MKM (purple) for the V79 cell line. The dotted vertical lines denote the average values. (b) MAPE (A) and MAE (B) distributions for RBE10 calculated with ANAKIN (yellow), LEM III (blue) and MKM (purple) for the HSG cell line. The dotted vertical lines denote the average values.

Figure 7: ANAKIN, MKM and LEM MAE and MAPE distribution for V79 and HSG cell-lines.

Figure 8: MAE difference between ANAKIN and MKM (A) or LEM III (B) calculated for different experiments (labeled with an ID number on the y axis). All values are for the RBE10 of the V79 cell line available in PIDE. The black dots represent measurements for which the MKM or LEM exhibits a lower MAE than ANAKIN, while the yellow dots represent those for which the ANAKIN MAE is lower. The red vertical line indicates zero.

Figure 9: MAE difference between ANAKIN and MKM (A) or LEM III (B) calculated for different experiments (labeled with an ID number on the y axis). All values are for the RBE10 of the HSG cell line available in PIDE. The black dots represent measurements for which the MKM or LEM exhibits a lower MAE than ANAKIN, while the yellow dots represent those for which the ANAKIN MAE is lower. The red vertical line indicates zero.
(b) ALE plot of the effect of LET on the survival probability predicted by ANAKIN for an imparted dose of 2 Gy.
(a) SHAP values for LET plotted against LET. All data are for a 2 Gy dose irradiation with different ions, ranging from protons to iron. (b) SHAP values calculated for αγ plotted against αγ (A) and βγ plotted against βγ (B). The data are divided into two groups, for αγ/βγ > 5 (yellow dots) and αγ/βγ < 5 (blue dots). The data are for a dose of 2 Gy.
(a) SHAP values calculated for the most relevant ANAKIN variables for protons of similar LETs ((A) 18 keV/µm and (B) 19 keV/µm). The dose has been set at 2 Gy. (b) SHAP values calculated for the most relevant variables for the V79 cell line for protons at (A) 4 keV/µm and (B) 28.8 keV/µm and helium at (C) 28 keV/µm and (D) 190 keV/µm. Dose has been set to 2 Gy.

Table 2: Average errors and standard deviations of different error metrics and endpoints.

Table 3: Average errors and standard deviations of ANAKIN, MKM and LEM III calculations for V79 and HSG cell-lines considering different endpoints.