Deciphering Oxygen Distribution and Hypoxia Profiles in the Tumor Microenvironment: A Data-Driven Mechanistic Modeling Approach

The distribution of hypoxia within tissues plays a critical role in tumor diagnosis and prognosis. Recognizing the significance of tumor oxygenation and hypoxia gradients, we introduce mathematical frameworks grounded in mechanistic modeling approaches for their quantitative assessment within a tumor microenvironment. Our approach provides a non-invasive method to measure and predict hypoxia using known blood vasculature. Formulating a reaction-diffusion model for oxygen distribution, we apply it to derive the corresponding hypoxia profile. The modeling and simulations successfully replicate the observed inter- and intra-tumor heterogeneity in experimentally obtained hypoxia profiles across various tumor tissues (breast, ovarian, and pancreatic) in our dataset. Employing a data-driven approach, we propose a method to deduce partial differential equation (PDE) models with spatially dependent parameters, enabling us to comprehend the variability of hypoxia profiles within a tissue. The versatility of our framework lies not only in capturing diverse and dynamic behaviors of tumor oxygenation but also in categorizing states of vascularization. These categories are distinguished based on the dynamics of oxygen molecules, as identified by the model parameters.


Introduction
Within several types of solid tumors, increased metabolic demands and abnormal vasculature lead to inadequate oxygen availability.This state is termed hypoxia and the corresponding regions of the tumor microenvironment are defined as hypoxic (McKeown 2014).Hypoxia, considered as one of the hallmarks of most solid tumors, compromises the functionality of cells and even their survival, resulting in the adaptation of (cancer) cells and their dynamics.These adaptations, which involve changes at both the cellular and the population levels, impinge on tumor progression by influencing drug resistance, invasiveness, angiogenesis, and vasculogenic mimicry (D'Aguanno et al 2021).Members of the hypoxia-inducible factor (HIF) family of transcription factors play a central role in the response to hypoxia by influencing multiple signaling pathways and gene regulation that promote neovascularization, invasion, migration, adhesion, metastasis, metabolism, DNA replication, protein synthesis and phenotypic switching (from proliferative to invasive) (Harris 2002, Widmer et al 2013, Al Tameemi et al 2019).Hypoxia-driven changes contribute to the intra-tumor heterogeneity, thereby affecting accurate diagnosis.Another crucial effect of hypoxia is reduced treatment efficacy, as tumor cells in hypoxic regions do not respond well to many treatments including those based on chemotherapies or radiation therapy.Such effects contribute to the aggressiveness of several tumor types including melanoma, the most aggressive, complex, and heterogeneous skin cancer (Bedogni and Powell 2009, Grzywa et al 2017, D'Aguanno et al 2021).The alleviation of hypoxia in a tissue region can be achieved through several strategies involving an increment in oxygen delivery or the decreased oxygen consumption by the cancer cells (Gallez et al 2017).The former is a widely accepted procedure (Suh et al 2006, Horsman andVan der Kogel 2009) and few studies, e.g.(Secomb et al 1995, Lin and Maity 2015, Gallez et al 2017), suggest that the latter can also be efficient in overcoming hypoxia and its consequences.Hence, a better assessment of oxygen distribution, hypoxia profiles, and O 2 consumption in the tumor microenvironment would improve prognosis and evaluate therapeutic options.
Mathematical modeling has proven to be beneficial in describing complex dynamics, offering powerful and cost-efficient tools for making predictions in various fields.In the context of tumor dynamics, mathematical models are valuable for reconciling spatiotemporal and phenotypic heterogeneity, as well as designing improved treatment strategies (Altrock et al 2015, Hodgkinson et al 2022).Numerous computational and mathematical models have been proposed to describe tissue oxygenation, especially focusing on the distribution of oxygen in both cancerous and regular tissues (Hsu and Secomb 1989, Secomb et al 1993, 2004, Goldman 2008, Toma-Dasu and Dasu 2013, Grimes et al 2016).
The cylinder model, initially proposed in Krogh (1919), laid the foundation for calculating oxygen flow based on an analytical expression dependent on the difference in oxygen tension in steady-state.Subsequently, several partial differential equation (PDE) models of the reaction-diffusion type were developed to represent the oxygen spatial distribution (Hsu and Secomb 1989, Secomb et al 1993, 2004, Dasu et al 2003, Powathil et al 2012, Skeldon et al 2012, Toma-Dasu and Dasu 2013, Welter et al 2016).These models involve different forms of source terms and boundary conditions.The resulting PDEs were numerically solved through discretization methods (finite-difference or finite element) or numerical integration, as seen in the alternative approach based on Green's function method (Hsu andSecomb 1989, Grimes et al 2016).This method was later modified in Secomb et al (1993Secomb et al ( , 2004))to account for more complex vascular architecture.
Despite these advancements, only a few works explicitly discuss hypoxia in connection with real-world tumor samples or experimental data.For instance, in Powathil et al (2012), the predicted hypoxia, observed using the needle electrode technique, is qualitatively compared with CD316 stained images.However, most formulations are limited to synthetic data and qualitative results, lacking direct application and connection with actual tumor tissues and experimental outcomes.Hence, there is still a need for accurate predictive models of hypoxia that take into account tissue heterogeneity.
Here, we propose a physical model that predicts hypoxia in heterogeneous tumor microenvironments from stained tissue images.For this aim, we develop a data-driven mathematical framework to model the distribution of oxygen and hypoxia in the tumor microenvironment, incorporating mechanistic inferences.We produce experimental data in the form of high-pixel images illustrating staining of markers of blood vessels, hypoxia, and cell nuclei.The identification of these three elements is detailed as follows: • Blood Vessels: The CD31 marker is a proxy for blood vessels.CD31, also known as PECAM-1 (platelet endothelial cell (EC) adhesion molecule 1), is a 130 kDa transmembrane glycoprotein belonging to the immunoglobulin superfamily.It is present on the surface of platelets, monocytes, macrophages, and neutrophils and is a constituent of the endothelial intercellular junction (Pusztaszeri et al 2006).ECs form a continuous lining (endothelium) along the inner surface of blood vessels, providing a barrier between circulating blood and surrounding tissues.By identifying these cells, the desired blood vessels are marked.• Hypoxia: Intra-tumoral hypoxia is responsible for activating the key transcription factor HIF, which mediates the activation of several defense mechanisms for tumor cells.HIF-1 is a heterodimeric transcription factor comprising an HIF-1α subunit and an HIF-1β subunit (Mabjeesh and Amir 2014).HIF-1, promoted by intra-tumoral hypoxia, regulates Carbonic Anhydrase IX (CA9) expression.CA9 is considered one of the best-characterized HIF-1 targets (van den Beucken et al 2009), serving as an intrinsic marker for tumor hypoxia7 .• Cell Nuclei: The DAPI marker, which stands for 4' ,6-diamidino-2-phenylindole, is used to identify cell nuclei (Tarnowski et al 1991(Tarnowski et al , kapuscinski 1995)).DAPI is a standard marker for detecting nuclei.
The images acquired upon immuno-detection of CD31, CA9 and DAPI on tissue sections prepared from Patient-Derived Tumor Xenograft (PDTX) mouse models illustrate intra-and inter-tumor heterogeneity in oxygen and hypoxia distribution.Leveraging this dataset, we develop a mechanistic framework to model oxygen and hypoxia distribution, predicting the hypoxia profile for a tumor tissue based on known vasculature.
Our study successfully recovers the biologically relevant heterogeneity of oxygenation and identifies crucial parameters involved in reducing hypoxia.As hypoxia promotes radio/chemo -resistance (Vaupel andMayer 2007, Lin andMaity 2015), these results are also valuable for predicting the efficacy of therapies.We modeled oxygen distribution using a steady-state solution of a reaction-diffusion PDE.For a known distribution of O 2 , we modeled hypoxia based on the dynamics of CA9 expression, used as a proxy for HIF-mediated transcription.While similar theoretical and computational models for oxygen distribution exist in the literature (Carlson et al 2011, Toma-Dasu and Dasu 2013, Kempf et al 2015), the novelty of our work lies in identifying and validating a predictive theoretical model using tumor samples and addressing the challenge of hypoxia heterogeneity.More precisely, we demonstrate the derivation of a space-dependent coefficient PDE model that predicts oxygen and hypoxia gradients.This is achieved by starting from a static model with space-invariant coefficients and identifying proxies for space-dependent coefficients directly from images of stained tissues.
The remainder of the article is organized as follows.The methods used in this work are detailed in section 2, covering information about the experimentally produced data, data processing, quantification of blood vessel diversity, mathematical modeling of oxygen and hypoxia, numerical simulation, parameter estimation, and validation.The obtained results are discussed in section 3. The study is concluded with a discussion in section 4 and a summary in section 5.
Once the PDTX model is established and frozen at passage 3, a single fragment of frozen tumor, approximately 8 mm 3 , is implanted into the inter-scapular fat pads of 3 to 4-week-old female Swiss-nude mice.Tumor growth is measured weekly with a caliper, up to the ethical limit of 1500 mm 3 (du Manoir et al 2014, Colombo et al 2015).

Immunohistochemistry (IHC) and immunofluorescence (IF)
PDTX samples are fixed in 4% neutral-buffered formalin for 24 h and then paraffin embedded.The FFPE tissues are sectioned into 4µm sections and processed for IHC or IF.CD31 (Abcam, #ab28364) and CA9 (Roche, #760-6080) stainings are carried out on a VENTANA Discovery Ultra automated staining instrument (Ventana Medical Systems) following the manufacturer's instructions.For IF, after applying the primary antibodies, sections are incubated for 1 h at room temperature with secondary antibodies coupled to Alexa-488, Alexa-647 (Life Technologies) and nuclei are counterstained with 4' ,6-diamidino-2phenylindole (DAPI, Sigma).Sections are then washed in PBST, and coverslipped using Mowiol mounting solution.For image analyses, sections are scanned on a 3DHISTECH scanner.
Analysis of CA9 and CD31 expression by IHC is conducted on adjacent tissue sections spaced four microns apart.The outcome reveals a brown color, indicating the presence of the targeted proteins, accompanied by cell nuclei depicted in blue, as exemplified in figure 1(middle and right columns).Conversely, IF stainings are performed on the same tissue section with different target proteins represented by specific colors.Specifically, CD31, CA9, and DAPI are shown in red, green, and blue colors, respectively.The left column images in figure 1 visually demonstrate these stains in the mentioned colors.It is important to note that although the tumor sample remains the same, the tissues involved in both scanning types differ.

Heterogeneity in data: blood vessel diversity
Inter-and intra-tumoral heterogeneity is considered a challenging factor affecting therapeutic responses, and such heterogeneity is evident in our dataset, particularly concerning blood vessels, as depicted in figure 2. The images reveal diversity in blood vessel architecture and geometry, showing long and large capillaries in figure 2(a) and smaller ones in figure 2(g).Capillaries are densely packed in patches figures 2(g) and (i), while they are rare in the second patch of figures 2(d) and (h).This diversity may arise from the underlying heterogeneity within and between tumors and the angle at which the 3D tissue is sliced for 2D imaging.Since effective oxygen distribution relies on blood vessel distribution in the tissue, quantifying this heterogeneity is crucial, especially in our modeling process discussed in the subsequent sections.

Data preprocessing
In order to utilize this data for spatial modeling, we apply some preprocessing techniques.Since the IHC stains of CA9 and CD31 are not conducted on the same samples, we perform image registration to obtain aligned images.Image registration is performed using two methods: • Control point-based method: In this feature-based registration method, we manually identify a few matching features in both images (CD31 and CA9 stained IHC images) -specifically, matching tissue shapes.We choose 10 such points identical for each IHC scan pair.This selection of points is followed by estimating an affine transformation that differentiates one image (CD31 stained image, considered as the fixed one) from the other image (CA9 stained, considered as the moving one) geometrically.Applying the estimated transformation to the moving image aligns it with the fixed image.• Optimization-based method: In this intensity-based registration method, we estimate the transformation using a regular step gradient descent optimization with mean square error metric configuration using the function imregtform in MATLAB8 .
For the images utilized in this study, the control point-based method outperforms the intensity-based method.However, the latter proves useful when marking identical features in the images is challenging.Following the completion of registration, we employ color thresholding in MATLAB to isolate specific marker colors in the tissue images.This allows us to obtain distinct stains for each sample; for samples stained with the IF method, color selection, and corresponding thresholding are executed in Zen software 9 to achieve the same outcome.Furthermore, given that these images are high-pixel images (i.e. a single image can be 20k × 20k in height and width), we extract small, representative patches from the whole image, particularly focusing on areas of interest for the subsequent steps outlined in the following sections.

Extraction of marker densities
After the necessary processing, we obtain images for the respective markers for each tissue sample.The upper row of figure 3 depicts the CD31, CA9, and DAPI marker staining images obtained from the preprocessing of the IF-based method for a patch of a breast tumor sample (1090).To facilitate modeling based on these images, spatial marker densities are required.This goal is achieved through three methods: • Intensity-based approach: In the first method, a 2D Gaussian filter 10 (G) is applied to the images, resulting in pixel intensity-based density.• Binary-pixel approach: The second method extends the first by obtaining a binary image with non-zero pixels for staining.Then, a Gaussian filter is applied to get the density.• Kernel density estimator approach: The third method is based on kernel density estimate 11 for the image parts covered by the non-zero pixels.In this case, a Gaussian kernel is defined for each point12 on a considered grid in non-zero pixel regions, and the resulting density is the sum of all the kernels for the image.
The mentioned approaches are applied to the three markers based on their specific characteristics: • For CA9: The intensity-based approach is applied to extract density for CA9, as the intensity and gradient are meaningful to mark the level of hypoxia.This method smoothens the images and ensures a continuous representation of hypoxia distribution.• For CD31: The binary-pixel approach is used to obtain CD31 marker density.Initially, it is crucial to binarize the vascular and non-vascular regions for the blood vessels, followed by obtaining the density.Additionally, the resulting binarized image is further smoothed using a Gaussian filter.• For DAPI: The kernel density estimator approach is applied to obtain cell density.This method captures the image portions covered by non-zero pixels and the proximity of the cell nuclei, forming a map of cells close to high density and low density for rarely packed cells.
The resultant densities from both these methods are demonstrated in the lower row of figure 3.

Heterogeneity quantification
To quantify the heterogeneity of blood vessels, we employ Claramunt entropy (Claramunt 2005) within patches.This entropy metric, a modified version of the Shannon entropy, takes into account the spatial distribution of blood vessels.Assuming that an increase in distance between entities of the same type and a decrease in distance between entities of different types will result in increased entropy, the spatial entropy H s is defined as follows: d int i denotes the average distance between the entities (blood vessels) of the same type i (i.e.intra-distance), while d ext i represents the average distance between the entities of a given type i and the entities of the all other types (i.e.extra-distance), here p i denotes the proportion of each entity i, with n being the number of such entities.
To apply the entropy method previously mentioned, we first categorize image patches based on the geometrical variations of blood vessels present within them.These variations are determined by the number of pixels in a region covered by each capillary in a given patch.To ensure consistency across all images, we 9 ZEISS ZEN (blue edition), Version 3.6.095.01 000, Carl Zeiss Microscopy GmbH, Germany. 10The Gaussian filter G for image I in 2D is defined as I * G, where G = 1 2π σ 2 e − x 2 2σ 2 , with σ being the standard deviation with zero mean applied to each point x = (x, y) on a 2D grid. 11The density pn(x) is estimated by pn , with K(x) being a Gaussian kernel function given by K , h being the smoothing bandwidth, x i be the observed data point, x is the value where kernel function is computed and n being the number of data points involved.classify these variations into five categories (C i ) based on pixel area 13 (A i ), i ∈ [1, 5] of blood vessels, ranging from very small to massive vessels (2.2) In the context of the formulation presented in equation (2.1), the term p i denotes the percentage of all non-zero pixels associated with the category C i .For a particular patch, as depicted in figure 4(a), these categories are graphically represented using five distinct colors, as shown in figure 4(b).Each color signifies a category C i , with the boundaries of individual blood capillaries drawn by these colors.Consequently, intra-distances d int i refer to the distances between vessels of the same color, whereas inter-distances d ext i are the distances between vessels of different colors.The measurement of distances is achieved using the centroid of each vessel, identified as the connected pixels within the image.This approach comprehensively characterizes the spatial relationships between blood vessels, forming the basis for the subsequent entropy calculation outlined in equation (2.1).
Applying a moving window approach, we compute the entropy for patches within each sample using equation (2.1).As illustrated in figure 5(a), we initially divide an image sample into 100 patches.Subsequently, we calculate the diversity index (H s ) for each patch, as one of the patches exemplified in the zoomed-in form in figure 5(b).The resulting diversity map for this tissue sample is depicted in figure 5(c).The colormap in figure 5(c) visually represents the diversity of blood vessel architecture within a tumor sample.This quantification of entropy provides valuable insights into the spatial distribution and variation of blood vessels, serving as a basis for identifying similar patches during the validation process.

Modeling
Utilizing the data mentioned above, which includes the densities of blood vessels, hypoxia, and cell nuclei, we develop a mathematical framework and physical models to investigate the relationship between oxygen and hypoxia distribution within tumor tissues.We begin by forming a model of oxygen distribution. 13Throughout this classification, the term 'area' is used to denote the number of pixels in a connected component, which remains independent of the actual pixel dimensions in each image.When applying equation (2.1), we require the proportion of such connected components, making it independent of the actual pixel length and breadth.Similarly, distances are calculated based on pixel coordinates, so the ratio of distances involved in equation (2.1) is also independent of actual pixel dimensions.This approach ensures that the comparison of Hs between different image patches is meaningful.) where the score is normalized (with respect to the maximum value) for improved visualization.The patch numbering is executed in a Cartesian coordinate system manner, commencing from the bottom left and then progressing from left to right in the x-direction.This counting pattern repeats until reaching the top right corner.

Model of oxygen distribution
The dynamics of oxygen distribution within the microenvironment are influenced by various factors, including blood vessel architecture, geometry, functional state, permeability, the metabolic activity of cells, and oxygen consumption.In this study, we specifically focus on the following aspects: • Production of O 2 : Our model assumes a direct correlation between oxygen production and blood vessel density in the tissue.Increased vessel density, particularly of capillaries, enhances oxygen exchange between blood and tissues.As detailed in section 2.4, estimating vessel density involves considering both overall density and spatial distribution.By incorporating vessel positions, we account for each vessel's neighborhood, crucial for determining local oxygen supply.• Neighborhood Determination: The appropriate neighborhood for O 2 distribution is determined by the ability of O 2 molecules to migrate the longest distance from their source vessels.To model this, we consider the random motion of oxygen molecules in the tissue, hence employing a linear diffusion to quantify this process.This method is a PDE counterpart of the experimentally observed diffusion length, which is applied in some previous studies as in Turner et al (2002) and Vaupel et al (2021).• Consumption in the Tissue: Oxygen undergoes consumption in the tissue, primarily by cells and other chemical reactions in the tumor microenvironment.The uptake of oxygen is a crucial factor affected by the cellular metabolic state and the state of the tumor microenvironment.From a treatment perspective, a decrease in consumption can be utilized to reduce hypoxia (Gallez et al 2017).In our modeling approach, we consider uptake directly proportional to the available oxygen concentration, encapsulating all the influencing factors into a unified representation.
To facilitate our modeling process, we introduce some notations for the variables involved.We consider the following: • u(t, x): the oxygen concentration depending upon time t and space x, • V(x): blood vessel density, • α: oxygen uptake coefficient, • β: oxygen production coefficient.
Given the considerations outlined for modeling, along with the declared variables, we formulate a reaction-diffusion model for oxygen distribution in the following form: In this formulation, the modulation of oxygen concentration over time is governed by constant diffusion (represented by the first summand on the right-hand side of equation ( 2.3)).Additionally, there is a constant uptake proportional to the existing O 2 distribution (second summand) and oxygen production linked to the blood vessel density (third summand).This formulation aligns with previous models developed in Kelly and Brady (2006), Skeldon et al (2012) and Welter et al (2016).Notably, in our data-driven approach, we have established an explicit relationship between blood vessel density obtained from data and oxygen production.Further, as oxygen dynamics is rapid enough and it reaches the corresponding equilibrium state very quickly, as in Kelly and Brady(2006) and Welter et al (2016), we consider the steady state from the reaction-diffusion model: with boundary conditions: ∇u • n = 0, which ensures no oxygen flux exchange between the considered domain and outside, with n representing the outward normal to the surface boundary (Welter et al 2016).
The model is equation (2.3) ensures the supply of oxygen by the available blood vessels; however, the production should not be unbounded.To incorporate this phenomenon, we define a limited production in another form of the model: where K V is the carrying capacity of the blood vessel density.Rewriting Ṽ = V KV and dropping the tilde for simplicity, we get the steady-state form as: with the same no-flux boundary condition.

Modeling of hypoxia-oxygen relationship
As mentioned in the introduction, hypoxic regions result from poor oxygenation.Therefore, the relationship between hypoxia and O 2 would be formulated in an inversely proportional manner.
One initial choice for modeling this relation is to consider a linear decay of hypoxia with an increase in O 2 .Let h(u) represent hypoxia as a function of normalized oxygen distribution 14 , and the linear decay can be expressed as: here, γ is a dimensionless constant conversion coefficient. 14The oxygen distribution is normalized by its maximum value in the considered domain.Another formulation could involve an exponential decay: with k 1 being a dimensionless constant representing the rate of decay.Such decay has also been observed in Qutub and Popel (2006).
A more general approach is to consider a sigmoidal form: (2.8) In this case, k i , where i ∈ [1, 3], are constants that govern a more controlled decrement of oxygen leading to hypoxia.
Utilizing the three models mentioned above, we obtain the corresponding hypoxia profiles, each representing the decay of hypoxia (linear, exponential, or sigmoidal).Furthermore, a switch-like15 relationship between O 2 and hypoxia is proposed in previous works, such as in Ginouves et al (2008), Cavadas et al (2013) and Nguyen et al (2015).We incorporate this response with the lower and upper thresholds of O 2 corresponding to non-zero hypoxia, denoted as O l and O h ; this relationship is modeled as follows: (2.9) For the particular values of O 2 thresholds (O l = 0.2 and O h = 0.8, figure 6 illustrates the hypoxia-O 2 relationship represented by (2.9).
The upper bound (O h ) is evident as hypoxia decreases to zero in the presence of sufficient oxygen (normalized value = 1).This decrease can manifest as a linear decay (figure 6(a)), a sharp exponential decay (figure 6(b)), or a sigmoidal-type decay (figure 6(c)).The lower bound (O l ) models two processes.First, in cases of severe and prolonged hypoxia, HIF-1 levels decline, as suggested in Ginouves et al (2008), Saxena and Jolly (2019), Korbecki et al (2021).As HIF regulates CA9 (Hirota andSemenza 2006, Mabjeesh andAmir 2014), which is a well-characterized HIF-1 target (van den Beucken et al 2009).Hence, below the lower threshold (O l ), the hypoxia h(u) switches to zero.Secondly, this lower bound is also useful to incorporate prediction for non-cellular regions, where CA9 and CD31 markers are absent simultaneously.However, in our simulations, these parameters are obtained from the underlying data using an estimation method.

Final model system
Hence, our developed model system involves a PDE (equation (2.4)) for oxygen distribution and an algebraic relation between O 2 and hypoxia profile in (equation (2.9)) based on the mentioned dataset, which will be utilized to estimate the involved parameters.This estimation is performed with the exponential decay form of equation (2.9).We have also tested the model with bounded production described by (2.5) but did not observe an increase in training and validation performances.

Numerical simulation methods
To estimate the parameters in our formulated system, we numerically solve the PDE presented in equation ( 2.3).We use a finite-difference scheme where the spatial domain is discretized into 400 × 400 grid points.A five-point stencil (central-difference) discretization is applied to the diffusion term on these grid points.The resultant system of linear equations consists of 160 000 unknowns, representing the values of unknown (u(x)) at the grid points.The simulations are conducted within the MATLAB environment.
Upon solving the PDE in equation ( 2.3) for a given set of parameters, the O 2 distribution u is obtained at the discretized points.Subsequently, hypoxia h(u) is calculated using equation (2.9) at the considered grid points within the domain.This forms the basis for further analysis and parameter estimation.

Estimation of parameters
Model parameters are determined using an optimization method.To do this, we first define a loss function as: (2.10) where h(u) is the computed hypoxia from (2.9), H represents the hypoxia profile obtained from the CA9 scan and ||.|| denotes the square of L 2 -norm per grid point.The loss function is computed within the considered domain where either of the scans (CD31 or CA9) is non-zero.The minimization of this loss function provides estimates for the parameters.
The optimization is performed in MATLAB using the built-in MultiSearch function, where 100 to 500 random values are chosen for the initial set of parameters in the optimization process.This function uses multiple random starting points and generates optimal and also sub-optimal solutions.The search of optima is parallelized and calls upon the sequential quadratic programming (sqp) algorithm via fmincon in MATLAB.Additionally, we have also explored the non-linear least square fit method and GlobalSearch algorithm, which is based on the scatter-search method of generating trial points and automatically selecting 1000 starting points.Our findings indicate that the MultiSearch with fmincon equipped with the sqp algorithm outperforms other methods, at least as far as our problem is concerned.

Training and validation datasets
We divide the dataset into two sections: one for training the parameters and the other for validating the obtained model predictions.In this partition, we ensure diversity in both datasets by including all types of tumor samples and scan types.This selection ensures that the same tumor tissue is not used for both training and validation, and a variety of tumor types are included in each set to validate predictions effectively.Our framework is designed to accommodate different scan types for staining.Therefore, we chose IF for training, while validation is performed on both IF and IHC types, as illustrated in table 2. Training on IF samples is convenient due to the same tissue used for staining.Additionally, for validation (as shown in table 3), we include both scan types and tumor types to comprehensively test the predictive capabilities of our modeling framework.Next, we further divide each sample into small patches and train the model on each of them, as shown in table 2, indicating the number of patches.

Training
As depicted in figure 5(a), we partition each image into 100 patches and independently train the parameters for each patch.The final model system discussed in section 2.6.3 is solved using a no-flux boundary condition for each patch.This condition accounts for the inflow and outflow of oxygen between neighboring patches, considering the presence of blood vessels on either side.Therefore, no flux is exchanged at the boundaries.To assess the training performance against a worst-case scenario, we compute the minimized loss function and compare it with the mean of 50 random values sampled from the hypoxia profile at each grid point, which is generated during the discretization of the PDE.This evaluation involves generating 50 random values (ranging between 0 and 1) at each point and calculating their mean.The obtained mean value represents the potential hypoxia value at that point in a worst-case scenario.Further, in the training process, we record the parameter values corresponding to the 50% higher than the global optimum for each patch.Executing this process for the entire training set enables us to obtain optimal and sub-optimal parameters and an interval for each parameter for each patch, considering sub-optimal solutions 50% higher than the global optimum.This information provides insights into the uncertainty of the estimated parameters across different patches in the image.

Stratification of tissue regions
Upon completing the training phase, the trained framework is applied to classify a patch and identify similar patches from the trained dataset based on the blood vessel architecture.For a given patch from a tissue sample outside the training data, the classification process involves the following steps: • Assigning Heterogeneity Score (H s ): Initially, calculate the heterogeneity score for the patch's blood vessel density using the method outlined in section 2.5 to quantify hypoxic regions.• Proportion of Non-zero Pixel (N p ) Calculation: Next, we calculate the proportion of non-zero pixels representing blood vessels in the patch.As discussed in section 2.4, a binary pixel method is used to detect blood vessels in the images, providing a measure of the vessel-filled region in the patch.• Initial Interval Determination: Determine two intervals with a variance of ±10% above and below both calculated values: (max(0, N p − 0.1N p ), N p + 0.1N p ) and (max(0, H s − 0.1H s ), H s + 0.1H s ), ensuring that the minimum values remain non-negative.• Identification of similar patches: Once heterogeneity and non-zero pixels intervals are obtained for the target patch, identify patches with trained parameters, whose two scores lie within the intervals.These identified patches are considered 'similar' patches, and their parameters can be utilized to predict the hypoxia profile.
Initiate an iterative search starting with the initial interval.The algorithm searches for trained patches within this interval.The search iteratively increases the interval until there is at least one set of values satisfying both conditions specified by these two intervals, providing a similar patch.
Hence, this classification ensures that similar patches contain equivalent vascular regions and that their spatial spread is comparable.

Validation of the physical model
This step involves predicting the hypoxia profile for a patch and subsequently for an entire image, then comparing the predictions with the actual data.The process is as follows:

Validation for a patch
To predict the hypoxia profile of a patch, the following steps are undertaken, starting from image preprocessing to solving a PDE, ultimately providing insights into the hypoxia profile of the tumor tissue sample.
• Input Image: Tumor tissue image stained with CD31 and DAPI, representing blood vessels and cell nuclei, respectively.• Necessary Preprocessing: Implementation of necessary preprocessing steps as mentioned in section 2.1 to obtain the respective markers' densities.• Patch Extraction: We divide the input sample into small patches.Depending upon the image pixel size, we divide them into 25 to 100 small patches.• Classification of a Patch: Next, we apply the stratification method to identify similar patches from the training dataset, thereby obtaining an interval for each parameter.• Parameter Assignment for Each Patch: We obtain a range of parameters for the patch, and each value can be identified as a different snapshot of dynamic oxygenation.For a general case, parameters are assigned for the patch by taking the mean of the interval of the similarly trained patches.
• PDE Solution: With the obtained parameters, the PDE in equation (2.4) is numerically simulated for the patch to get the O 2 distribution.• Cell Region Identification: Identification of the cell regions based on DAPI marker density is performed to mark the regions where cells can be present and to make predictions for hypoxia.• Hypoxia Calculation: Application of the hypoxia equation (2.9) and then setting the obtained to zero for non-cellular regions provide the desired profile for hypoxia for the input patch.• Result Image: The final result is the hypoxia profile for the input patch.

Validation on whole sample
We assume constant parameters in the formulated model system described by equations (2.3) and (2.9), which may not hold for all tumor tissue samples.We adopt a data-driven approach to retrieve space-dependent model parameters from the data to predict the hypoxia profile for an entire sample.This is accomplished through the following steps using a moving window approach: • Partition the input sample into several patches (e.g. 100 patches per sample), as illustrated in figure 5(a).
• Apply a classification strategy based on the heterogeneity score of each patch, as detailed in section 2.2.
• For each patch, acquire parameters from similar patches in trained samples, following the process detailed for the single patch case.• The obtained parameters for each patch result in space-dependent parameters that remain invariant within a patch, expressed by the equation: (2.11) • Together with the no-flux boundary condition, equation (2.11) serves as a data-driven, space-dependent counterpart of equation (2.4).
This adaptation enables the consideration of spatial heterogeneity in parameters within the tissue sample, providing a more realistic representation of oxygen and hypoxia profiles.To address the spatial dependence of the diffusion coefficient D and prevent numerical oscillations, especially with discontinuous diffusion coefficients in the simulations of equation (2.11), we employ a conservative form of the five-stencil discretization method, as described in section 2.7.

Training the mechanistic model for oxygen-hypoxia relationship
A simple yet effective mathematical framework is formulated to predict hypoxia from a known vessel architecture.The framework is mechanistic and data-oriented, utilizing underlying data for model development, training, and subsequent validation.The approach combines a theoretical foundation with empirical insights from the available dataset, emphasizing a balance between mechanistic understanding and practical applicability.Following the training protocol detailed in section 2.10, it is apparent that the proposed framework aligns effectively with the data, as evidenced by the representations in figures 7 and S2.These scan images result from two distinct staining techniques, as elucidated in section 2.1, and the modeling framework exhibits a robust fit with both types of scans.Beyond the visual appeal of the fit comparison, a quantitative assessment reveals a strong agreement as we compare the minimized loss function values with the worst-case scenario, as explained in section 2.10.This is observable in the figures presented in figures 7 and S2.This consistent trend persists across the entire training set, where each patch's minimized loss function value is significantly smaller than the worst values, as shown in the figure S1.

Biological heterogeneity and epistemic uncertainty are responsible for distributed physical parameters
Upon estimating parameters for all patches within the training set, a distribution for each parameter is obtained, as depicted in figure 8(a).This visualization shows the parameter distribution, revealing notable inter-patch heterogeneity.The observed hypoxia heterogeneity in experiments (Iakovlev et al 2007) is consistent with the model predictions, particularly in terms of inter-patch variation of parameters.
The first source of parametric variability is biological and integrates several factors.Tissue oxygenation is a complex process influenced by various factors beyond our modeling constraints, with one such factor being the dynamic nature of oxygenation, particularly in vascularization induced by normal or tumor-related processes.This intricate process involves a series of coordinated steps, commencing with tumor cells Initially, we determine the blood vessel density from the CD31 scan, shown in (a), and the hypoxia profile from the CA9 marker in (d), following the details in section 2.4.Further, with the initial guess of parameters, we calculate the oxygen distribution using the obtained blood vessel density and apply equation (2.3).The process continues with acquiring the hypoxia profile using equation (2.9).The obtained hypoxia profile is then compared with the CA9 marker density in (c) through the loss function in equation (2.10).Beginning with the initial parameter guess and iterating through this process until the loss function in equation (2.10) is minimized, an optimal value is achieved.The resulting oxygen and hypoxia distribution, depicted in (b) and (d) respectively, with the loss function value of 0.07 and the worst case value being 10.59.
overexpressing hypoxia-inducible regulators of angiogenesis, including vascular endothelial growth factor (VEGF).As a chemoattractant, VEGF promotes the proliferation of ECs lining the surrounding blood vessels.For a more detailed model and qualitative insights, particularly in the context of glioma, we refer to Kumar and Surulescu (2020).
The complexity of this process and the diversity in blood vessels and hypoxia profile are mirrored in the parameters.The observed inter-patch or inter-tumor variability is intricately linked to the underlying dynamics, facilitating an exploration of different possibilities and leveraging the mechanistic behavior of the model.These obtained distributions provide insights into the diverse, dynamic states of tissue oxygenation, as elaborated in subsequent sections.
Another source of parametric variability is epistemic, stemming from the fact that several parameter sets can result in similar fits.Therefore, trained parameters are inherently uncertain.In the optimization process, parameters corresponding to the minimum loss function, as defined in equation (2.10), are selected.However, introducing a bit more tolerance in the loss function results in an interval of parameters, reflecting epistemic parametric uncertainty.This type of variability results from insufficient training data, which can be mitigated by adding more data.

The stratification, based on blood vessel heterogeneity, reduces the variability of parameters
As we explore the variability of parameters across patches and consider the underlying dynamics, it becomes essential to narrow down options when making predictions using the framework.Indeed, model predictions depend on parameters that are a priori unknown.Our strategy is to use the heterogeneity score (spatial entropy H s ), which can be directly extracted from stained tissue images, to reduce the range of parameters required for prediction.
To assess the feasibility of this strategy, we employ the stratification method outlined in section 2.11 to associate a set of trained parameters with each new image patch for which we require prediction of hypoxia.This method effectively restricts the parameter range during the validation process.Applying the parameter assignment method, as detailed in section 2.12, reveals reduced variability in the parameters for individual patches-for example, one from breast tumor tissue (0520-25-2) and another from pancreatic tumor tissue (1087)-in the validation set.The former tissue image is an IF scan, while the latter is an IHC scan image.This reduced variability observation is evident from the distribution plots shown in figures 8(b) and S3.

Prediction based on a restricted set of parameters and assignment of a state of vascularization
Using the stratification method, we narrow down parameter intervals, resulting in a reduced range for each parameter.While not providing an exact value, this interval offers a range of possibilities reflecting the underlying dynamics in the tissue.The variations in individual parameters signify different underlying dynamics in the concerned tissue.To better understand these dynamics, we analyze the changes in specific parameters one at a time while keeping the other parameters at the mean of their obtained interval for a patch sample.For the given blood vessel in figure 9(a) of the patch, oxygen and hypoxia distribution is predicted, as shown in figures 9(b) and (d), whereas the CA9 marker density is depicted in figure 9(c).By exploring the range of parameters obtained for this patch, we observe the following:  • Diffusion coefficient (D): As mentioned, oxygen supply is assumed to be the diffusion of oxygen molecules produced by blood vessels.This diffusion is controlled by the diffusion coefficient (D).In a properly oxygenated tissue microenvironment, O 2 molecules would be acquiring the maximum limit of the diffusion coefficient, resulting in less hypoxia, as seen in figure 10  The results feature the same patch as shown in figure 9.For example, a minimum diffusion coefficient results in more regions of hypoxia for the cells (a), while very high diffusion would eliminate the possibility of hypoxic cells for most regions (d).Similarly, deficient oxygen uptake would maintain the tissue normoxic (b), whereas high uptake would result in hypoxic regions (e).Additionally, an increase in the decay rate of hypoxia with oxygen would lead to hypoxic regions (c), while a decrease shows the reverse trend (f).Therefore, the variation in these classified coefficients for similar patches can be seen as a snapshot of different oxygen dynamics.
Combining these parameters can help profile the blood vessel's functioning, with several possibilities of parameter combinations reflecting different dynamic conditions of tumor oxygenation.Here, we on two scenarios: functional and non-functional blood vessels.Functional blood vessels may result from efficient diffusion, low oxygen uptake in tumor tissues, and a high decay coefficient of hypoxia with increasing O 2 .In contrast, the non-functional category includes less diffusion, less production, and high oxygen uptake-where even with high production, this could be the case for new vessel linings.The plots in figure 10 showcase plots corresponding to these scenarios, extracting the dynamic nature of the dynamics through this static model.

Validation of the physical model using diverse tumor samples
A desirable feature of a mathematical modeling framework is its predictive nature, coupled with a mechanistic explanation.Moreover, it is crucial to validate these predictions to ensure biological relevance.As detailed in section 2.12, we validate the model predictions using the outlined procedure, beginning with the patch validation.In this section, we present the results, demonstrating alignment with the mean values of the obtained parameters.Although we have intervals for each parameter, the validation results are consistent with the mean values for many of these samples.The validation was performed using a diverse set of tumor samples (see table 3).

Validation for patches
Our model predictions demonstrate strong alignment with the validation data, as evident in figures 11(a)-(e) and figure S4 for IF, and figure S6 for the IHC scan type.The predicted hypoxia profiles in figures 11(e) and S4(d) showcase the recovery of CA9 positive cells, utilizing the mean of the parameters after the classification process detailed in the previous section.While most CA9-positive cells are well-recovered, there are some additional tissue regions with cells marked as hypoxia-positive.However, these instances are relatively low compared to the well-aligned predictions.These additional positive markings may arise from various approximations and simplifications incorporated in our modeling process.
Despite the presence of some additional cells marked as CA9-positive with a slight gradient in the previously shown patch prediction, it is crucial to highlight that the modeling framework's predictions exhibit excellent alignment with patches devoid of CA9.This is evident in the results obtained for a patch from an ovarian tumor sample (0721) sample, illustrated in figure S5.The scatter distribution of blood  1090), stained by IF, follows the steps outlined in 2.12.The blood vessel density is depicted in (a), and a range of involved coefficients are obtained.The resulting predictions from equations (2.3) and (2.9), representing oxygen distribution and the hypoxia profile, are shown in plots (c) and (e), respectively.The hypoxia profile closely aligns with the CA9 density shown in plot (d), yielding a validation error 0.2310.Lower row: Predictions and their validation for the entire tissue sample were carried out following the steps detailed in section 2.12.2.Utilizing the blood vessel density from an input image, as shown in plot (f) for a pancreatic tumor sample (1087) stained with IF, the model framework produces predictions for O2 and the hypoxia profile, presented in plots (g) and (i), respectively.Upon comparison with the CA9 data, displayed in (h), additional hypoxic cells are observed, resulting in a relatively high validation error of 0.6366.vessels in figure S5(a) is well-captured by the framework, leading to a uniformly distributed oxygen pattern and corresponding absence of hypoxic regions, as depicted in figures S5(b) and (d).Notably, this closely resembles the CA9 profile in figure S5(c) with a very small validation error.

Validation for whole sample
Detailed information at the patch level is valuable for examining microenvironments, but predicting hypoxia profiles for entire samples is equally crucial.Following the procedure outlined in section 2.12, we extend our predictions to whole samples, utilizing both IF and IHC scan types for testing.As illustrated in figures 11(e)-(h) for an IF scan from a breast tumor tissue and figure S7 for an IHC scan from a pancreas tumor, our predictions align well with the CA9 marker density.Similar to the observations at the patch level, our predictions cover CA9-positive cells and capture additional cells with a low hypoxia gradient.
It is essential to emphasize that identifying hypoxic cells is accurate; however, there are instances of false hypoxic regions being detected.This may be attributed to the coexistence of blood vessels and hypoxia or the non-availability of similar types of samples in the dataset.

Discussion
The well-established influence of hypoxia-driven alterations in the tumor microenvironment underscores the need for accurate assessments of oxygen and hypoxia distributions in effective cancer characterization and treatment design.We present a mechanistic framework applicable to various cancer types to predict hypoxic regions within the tumor microenvironment from tissue sections.This represents a crucial initial step towards a realistic model for hypoxia prediction and aiding in radiation dose estimation based on existing models (Dasu et al 2005, Titz and Jeraj 2008, Powathil et al 2012, Toma-Dasu and Dasu 2013).Understanding hypoxia is not only pivotal for radiation therapy efficacy but also essential for comprehending other hypoxia-induced phenomena, such as metabolic reprogramming, angiogenesis, tumor invasion and metastasis.In this study, we employ a simple mathematical approach capable of replicating biologically relevant hypotheses.
This data-driven yet mechanistic approach involves obtaining proxies for blood vessels, hypoxia, and cell nuclei (CD31, CA9, DAPI markers).We have formulated a mathematical model equation (2.3) to derive oxygen distribution from blood vessel density and determine hypoxia profiles equation (2.9).Optimization processes extract parameters by comparing model predictions with CA9 markers.The model is trained on tissue image patches, yielding parameter distributions.While similar results are obtained with models incorporating cell-dependent oxygen uptake (equation (S.1)) or introducing nonlinearity through limited oxygen production and uptake equation (2.5) and (S.2), we choose to utilize the linear PDE (equation (2.3)).This decision is motivated by computational efficiency concerns associated with solving nonlinear PDEs multiple times for optimization.Furthermore, sticking with the linear model helps mitigate overfitting and reduce epistemic uncertainty stemming from additional parameters.
Assigning a class to each tissue patch proves beneficial to address the heterogeneity in blood vessels across different tissues and within the same tissue.We classify each patch based on its blood vessel architecture using the spatial entropy measure (refer to equation (2.1)).This classification is integral to our prediction framework, where the first step involves identifying the patch class and obtaining the corresponding parameters.Additionally, a range of parameters is obtained for each classified patch, enabling the observation of different dynamics within the tissue.
Recognizing the limitations posed by the assumption of constant diffusion, production, and uptake of O 2 molecules, we introduce a data-driven method to ascertain spatially varying coefficients.The method is based on the spatial entropy of patches from CD31 stained tissue images, summarizing the heterogeneity of blood vessels distributions, but the same approach can be adapted to other types of data such as those obtained from hematoxylin and eosin (H&E) stained tissue sections.Utilizing observed variations from trained and similar (classified) patches, we have employed a varying coefficient model (refer to equation (2.11)) to quantify this heterogeneity.Through this process, we have deduced a reaction-diffusion PDE wherein the coefficients exhibit variability between the considered patches.This approach facilitates the use of tailored rates for different patch types, enabling predictions for the entire tissue sample using a simplified model.
Our modeling predictions perform well for both types of scans, IHC and IF, commonly used in practice for staining different markers.Specifically, IF is preferred due to computational efficiency, eliminating the need for image registration.Additionally, it utilizes the same tissue in experiments, simplifying both experimental and computational aspects as the cell structures remain consistent.In addition, since H&E -stained tissue sections are produced as a routine procedure for cancer diagnosis, we expect our model to provide additional interesting information in the future that may help pathologists evaluate hypoxia on this type of standardized biological material.
We have identified additional hypoxic regions beyond CA9 marker densities, possibly due to the 2D approximation of the 3D blood vessel architecture.To address this, we plan to enhance our model by incorporating the 3D geometry and architecture of blood vessels, moving beyond the 2D slices used in the current estimation.While 3D reconstructions, exemplified in the study (Arslan et al 2023), provide a more precise representation, its practicality in clinical settings is constrained by by tissue availability and the time consuming step of generating large series of adjacent tissue sections.To overcome this limitation, we are developing a framework to quantify the approximation error when using 2D datsets, crucial for evaluating hypoxia profiles in scenarios where 3D reconstruction is limited.
While CA9 is often considered an intrinsic marker for hypoxia, studies such as Hedley et al (2003), Mayer et al (2005) for uterine cervix cancer and Iakovlev et al (2007) for cervical carcinoma suggest a poor correlation between tumor oxygenation and CA9.This raises the need for exploring alternative markers or combinations of markers to improve the accuracy of hypoxia assessment.Additionally, the heterogeneity in oxygen consumption driven by cell metabolism is a crucial factor influencing hypoxia levels.Beyond the static assumptions and the 3D to 2D approximation, the dynamic state of cells may play a significant role in determining the hypoxia profile.
Predicting or estimating hypoxia without a time series dataset is valuable in various scenarios, especially given dynamic experiment challenges.Creating a profile based on the cell metabolic state and the functioning of blood vessels becomes essential, even if done less quantitatively, as attempted in our simulations for the diffusion coefficient, oxygen production, and uptake coefficients.Scores such as hypoxia scores (as seen in Van Laarhoven et al (2006)) may need more information, as they often overlook spatial heterogeneity in both cells and hypoxia when applied to an entire image.Experimental investigations involving cell metabolic states and other markers to assess the state of blood vessels would be essential for validating the proposed profiling approach.

Conclusion
This study introduces an innovative data-driven yet mechanistic tool for characterizing tumor oxygenation, effectively capturing the intricate dynamics of the tumor microenvironment.The computational framework successfully reproduces these complex dynamics, as demonstrated in the context of three distinct tumor types: breast, ovarian, and pancreatic.The results not only facilitate localized predictions for small tissue regions but also offer a data-driven approach to extract spatially varying rates related to oxygen (such as diffusion coefficient, production, and uptake rate).This spatially resolved information is crucial for capturing and predicting spatial heterogeneity across larger tissue samples.Additionally, based on the derived oxygen-related rates, the study explores the potential to predict whether tumor vascularization is in a functioning or non-functioning state.
This approach has multiple applications.In fundamental research, it can serve as the basis for more realistic models of tumor microenvironment heterogeneity, applicable to tumors undergoing various types of treatment.Along this line of research, we intend to enhance current models by incorporating the dynamics of cellular adaptation to local hypoxia.As a further enhancement, we also plan to expand the model dimension from 2D to 3D and validate the 3D models using serial sectioning of tumors.In translational research, having effective tools for predicting hypoxia could prove valuable in guiding radiotherapy, especially when coupled with imaging techniques for tumor blood vessel reconstruction.These techniques include positron emission tomography, computed tomography, magnetic resonance angiography, ultrafast ultrasound imaging, or their combinations (Provost et al 2018, Facchin et al 2020, Moog et al 2022, Denis et al 2023).

Figure 1 .
Figure 1.Our generated dataset comprises of images of tumor tissues stained with colors representing specific markers.Here we present the imaging data of pancreatic tumor tissue (1087), obtained through two distinct staining methods: immunofluorescence (IF) in the left column and immunohistochemistry (IHC) in the middle and right column images.The IF scanned image is depicted in (a), and a zoomed-in section (highlighted by the red rectangle) of this image is shown in (d).These images illustrate all the staining involved in our study, namely, CD31 in red, CA9 in green, and DAPI in blue, all on the same tissue.Figure (b), with a section zoomed in (e), focuses explicitly on CA9 staining in brown, with color intensity indicating the gradient of hypoxia.Image(c) and its zoomed section in (f) presents the blood vessels on the tissue adjacent to that used for CA9, also in brown.The cell nuclei are stained in blue color in the IHC-stained images, similar to the IF staining, as observed in the middle and right columns.It is important to note that although the tumor sample remains the same, the tissues involved in both scanning types differ.

Figure 2 .
Figure 2. Various types of blood vessel architecture, represented by CD31 staining, are depicted in the top and middle rows for breast tumors (0520-04-1 and 0520-05-1) and in the bottom row for a pancreatic tumor (1087).These are patches of the mentioned tumor samples and are IF-type scans.For better visualization, the background of the IF images has been changed to white.

Figure 4 .
Figure 4. (a) Blood vessels of a patch (pancreatic tumor 1087), (b) categorization based on pixel area covered by a blood vessel, recognized as a connected component; the colors correspond to the categories as represented by the color bar.

Figure 5 .
Figure 5. (a) A full sample of breast tumor blood vessel distribution, (b) a zoomed patch, (c) calculated heterogeneity score obtained using equation (2.1) where the score is normalized (with respect to the maximum value) for improved visualization.The patch numbering is executed in a Cartesian coordinate system manner, commencing from the bottom left and then progressing from left to right in the x-direction.This counting pattern repeats until reaching the top right corner.

Figure 7 .
Figure7.Data fitting and parameter estimation for a patch extracted from the breast tumor (0520-4-2) tissue image, an IF scan, involve utilizing the training procedure outlined in section 2.10 and the specified optimization method for extracting parameters.Initially, we determine the blood vessel density from the CD31 scan, shown in (a), and the hypoxia profile from the CA9 marker in (d), following the details in section 2.4.Further, with the initial guess of parameters, we calculate the oxygen distribution using the obtained blood vessel density and apply equation (2.3).The process continues with acquiring the hypoxia profile using equation (2.9).The obtained hypoxia profile is then compared with the CA9 marker density in (c) through the loss function in equation (2.10).Beginning with the initial parameter guess and iterating through this process until the loss function in equation (2.10) is minimized, an optimal value is achieved.The resulting oxygen and hypoxia distribution, depicted in (b) and (d) respectively, with the loss function value of 0.07 and the worst case value being 10.59.

Figure 8 .
Figure 8. Inter-patch parameter variation and the reduction of this variation through stratification of patches.(a) The trained parameters exhibit variation, as shown in the violin plots from left to right: diffusion coefficient D, O2 uptake and production rates (α, β), O l , O h , and heterogeneity score for the trained patches.Notably, these distributions are often multimodal.The variability in D spans several orders of magnitude, showcasing highly and poorly diffusing O2 molecules.The multimodal nature of production and uptake rates reflects variations in cellular demands.Meanwhile, the modality in k1 represents the variation in the decay of hypoxia profiles between different patches.(b) Distribution of parameters obtained after the classification of a selected patch from a breast tumor (0520-25-2) scan of the IF type.The parameter intervals are determined from similar patches, leading to significantly reduced variability compared to the distribution shown in (a).The parameters include the diffusion coefficient D, O2 uptake and production rates (α, β), O l , O h , and heterogeneity scores for the trained samples, arranged from left to right.

Figure 9 .
Figure 9. Prediction of oxygen distribution and the hypoxia profile for a patch of a pancreatic tumor sample (1087) of the IF type.The parameters used in this prediction are the mean values obtained from similar patches after the classification process.
(a).Conversely, less diffusion would lead to more hypoxia, as depicted in figure10(d).•Production and uptake coefficients: The production coefficient (β), related to blood vessels, characterizes the state of blood vessel functioning.Higher values of β reflect increased oxygen production, resulting in lower hypoxia.On the other hand, the Oxygen uptake (α) rate is modulated by the tumor microenvironment and cellular metabolism and plays a crucial role in the hypoxia profile.Increased uptake leads to more hypoxia, as demonstrated in figure10(b) for a minimum value and for a maximum value of α in figure 10(e).• Decay coefficient (k 1 ) and conversion coefficient (γ): The decay coefficient k 1 , which signifies the rate of decay of CA9 marker with increasing oxygen concentration, exhibits a high hypoxic response at minimum values (figure 10(f)), whereas maximum values result in low hypoxia (figure 10(c)).A high value of k 1 , such as the maximum selected parameter, suggests a microenvironment where hypoxia diminishes rapidly with the concentration of oxygen.This could indicate a tissue state with strong oxygen needs, that reacts abruptly to a lack of oxygen.Conversely, a low value represents a scenario where the tissue is less dependent on oxygen, as in the case of anaerobic metabolism.As γ is a conversion coefficient limiting the maximum value of hypoxia distribution, a high value of γ implies a state with high oxygen demands, while a lower value suggests lower oxygen dependence.

Figure 10 .
Figure10.Impact of varying different parameters, each representing distinct dynamics within the intervals obtained after stratification and assigning intervals for each of them.Each subfigure individually displays the variation of a specific parameter while keeping the other values as the mean of the obtained ones after stratification.The results feature the same patch as shown in figure9.For example, a minimum diffusion coefficient results in more regions of hypoxia for the cells (a), while very high diffusion would eliminate the possibility of hypoxic cells for most regions (d).Similarly, deficient oxygen uptake would maintain the tissue normoxic (b), whereas high uptake would result in hypoxic regions (e).Additionally, an increase in the decay rate of hypoxia with oxygen would lead to hypoxic regions (c), while a decrease shows the reverse trend (f).Therefore, the variation in these classified coefficients for similar patches can be seen as a snapshot of different oxygen dynamics.

Figure 11 .
Figure 11.Upper row: Validation of the prediction for a patch of the breast tumor sample (1090), stained by IF, follows the steps outlined in 2.12.The blood vessel density is depicted in (a), and a range of involved coefficients are obtained.The resulting predictions from equations (2.3) and (2.9), representing oxygen distribution and the hypoxia profile, are shown in plots (c) and (e), respectively.The hypoxia profile closely aligns with the CA9 density shown in plot (d), yielding a validation error 0.2310.Lower row: Predictions and their validation for the entire tissue sample were carried out following the steps detailed in section 2.12.2.Utilizing the blood vessel density from an input image, as shown in plot (f) for a pancreatic tumor sample (1087) stained with IF, the model framework produces predictions for O2 and the hypoxia profile, presented in plots (g) and (i), respectively.Upon comparison with the CA9 data, displayed in (h), additional hypoxic cells are observed, resulting in a relatively high validation error of 0.6366.

Table 1 .
Produced experimental data: In this work, we perform staining on three different PDX tumor types, namely breast, ovarian, and pancreatic.This table lists these tumor samples, with the first column containing the name of the sample used during the experiment.For each PDX, both staining techniques are applied to different tissue samples, generating one stained for IF and one for each CD31 and CA9 with IHC scans.Except for the tumor 0520, where IF is performed on multiple tissues of that tumor sample.