The stochastic digital human is now enrolling for in silico imaging trials – Methods and tools for generating digital cohorts

Randomized clinical trials, while often viewed as the highest evidentiary bar by which to judge the quality of a medical intervention, are far from perfect. In silico imaging trials are computational studies that seek to ascertain the performance of a medical device by collecting this information entirely via computer simulations. The benefits of in silico trials for evaluating new technology include significant resource and time savings, minimization of subject risk, the ability to study devices that are not achievable in the physical world, rapid and effective investigation of new technologies, and representation from all relevant subgroups. To conduct in silico trials, digital representations of humans are needed. We review the latest developments in methods and tools for obtaining digital humans for in silico imaging studies. First, we introduce terminology and a classification of digital human models. Second, we survey available methodologies for generating digital humans with healthy and diseased status, and briefly examine the role of augmentation methods. Finally, we discuss the trade-offs of four approaches for sampling digital cohorts and the associated potential for study bias when selecting specific patient distributions.


Introduction
Two decades ago, in the epilogue of their seminal textbook on image science [1], Barrett and Myers pointed out that in the future, sports games might be played with simulated athletes. The advancement of computer graphics and simulation technologies sparked the notion that perhaps the excitement of a real-life sports event could be recreated in the simulation space with digital models of athletes. Since then, continuous advances in computer processing power and modeling techniques have taken place, driven primarily by entertainment applications [2], and computational modeling has quickly become a significant component of research and development (R&D) efforts in a variety of industries ‡. Industries that have widely adopted computational modeling and in silico methods throughout the product life-cycle include automotive [3] and manufacturing [4], among others [5]. Medicine lags considerably behind [6] due, in part, to model complexity, challenging validation, associated potential risks for new devices and drugs, and lack of consensus and regulatory standards.
Randomized clinical trials, while often viewed as the highest evidentiary bar by which to judge the quality of a medical intervention, are far from perfect. Common causes of failure include safety issues and difficulties with patient recruitment, enrollment, and retention [7]. In addition, clinical trials can suffer from under-representation of rare subpopulations [8]. These limitations represent a unique opportunity to develop in silico trials that are completed as planned, safely, and that include digital cohorts with a representative distribution of subject characteristics and numbers large enough for appropriate statistical power. As pointed out in [9], in silico data has the potential to address lack of data availability, sharing mechanisms and privacy challenges associated with the use of medical information.
In silico imaging trials are computational studies that seek to ascertain the performance of a medical device for the intended population, collecting this information entirely in the digital world via computer simulations. The benefits of in silico imaging trials for evaluating new technology include significant resource and time savings, minimization of subject risk, and ethical considerations [10,11]. Moreover, in silico trials can be used to study devices that do not yet exist or are not practically attainable in the (limited) physical world, allow for the rapid and effective investigation of new technologies [11,12,13], and facilitate representation from all relevant subpopulations. Each one of these benefits is an essential consideration within the context of the regulatory evaluation of medical technology [11].
‡ To date, Super Bowl games are played with physical-world athletes, in part due to the difficulty of conveying real-life personal struggle, an essential component of the entertainment context for sport players and teams.
The realization that computational models of humans would take center stage in medical imaging system assessment is not new. Full optimization of imaging systems for specific medical tasks requires objects (physical or digital) that represent the variability seen in patients.
For many decades, scientists have relied on practical and simpler versions of patients [14]. However, recent advances in computer processing power and simulation methods are now facilitating the development of more detailed and realistic patient models that are based on digital stochastic descriptions of the model components. For instance, a recent report demonstrated the feasibility of an in silico trial, the Virtual Imaging Clinical Trial for Regulatory Evaluation (VICTRE), as an alternative approach to establish regulatory evidence in support of medical imaging products [15].
There are numerous parallels between digital- and physical-world trials. Fundamentally, in silico trials must include the same essential elements of well-designed physical-world clinical trials. Firstly, the population of subjects for whom the new device or technology is intended must be defined. The study design must contain clear rules for selection and rejection of subjects from a distribution of healthy and diseased subjects. However, in silico trials are not subject to effects from covariates in patient selection. For instance, a common problem in evaluating screening tests meant for asymptomatic subjects is that a portion of the enrolled population might be symptomatic [16], with the potential for verification bias [17]. Secondly, when there are two technologies that are being compared, i.e., a new, yet unproven technology and a comparator technology currently in clinical use, both must be unambiguously defined. A good choice for comparator technology should be associated with accurate representations of the device characteristics as supported by validation studies [18]. Thirdly, the study requires a definition of the users of the device's outcome (i.e., images in the case of an imaging device trial). These first three components reflect the physical intended use of the device under investigation, i.e., the intended populations of subjects, the intended device comparison, and the intended image interpreters that will be using the device in the physical world. Finally, whether physical or digital, the trial design must provide a definition of the primary outcome to be evaluated, a protocol and statistical analysis associated with the trial, and an analysis of the risk and benefits introduced by the device under investigation.
Both physical and in silico studies require enrollment of representative subjects.
In this review, we survey the latest developments in methods and tools for generating the cohorts of digital humans for imaging studies that represent the variability of physical-world subject populations. We refer to the digital cohorts consisting of digital humans (realizations of the digital human models) as "stochastic humans". Assessment of new technology and the regulatory evaluation of that technology requires establishing performance levels for intended populations and, therefore, necessitates computational models that allow sampling of the parameter space defining the subject population in the physical world.
We propose to name these models digital humans, as opposed to digital replicas or twins, to avoid confusion. The review is organized as follows.
First, we introduce terminology and representation models regarding the different types of digital humans described throughout the article. Second, we survey available methodologies for generating digital humans with healthy status and for generating diseased cases. Then, we briefly discuss the role of augmentation methods and conclude with an analysis of sampling techniques that may be used to generate the digital cohorts for evaluating the performance of imaging devices.

Terminology
A variety of terminologies are being used or proposed for describing digital representations of humans in medicine and other fields. In the literature, some of these are often used without the benefit of a clear definition and, in some instances, wrongly interchangeably.
We propose to use the term stochastic digital human to denote digital representations of humans (or human body parts) generated from multiple random outputs by sampling known distributions for the model characteristics matching the variability observed in human populations. In contrast, non-stochastic representations are deterministic digital versions of a single physical exemplar (e.g., a model of a human body at a given time) or a group (or family) of physical exemplars which are differentiated by varying physical parameters. Contrary to other terms and concepts currently being discussed, including digital families, avatars, chimeras, and digital twins, the concept of a stochastic digital human represents an approach for in silico trials and regulatory evaluation that estimates the performance of an imaging device for a population of subjects rather than for an individual patient, thus incorporating the variability observed in the population.
We propose to classify all digital humans as either individual or population models (see Figure 1).
Individual models are necessarily image-based, while population models can be derived either from images or from knowledge of the fundamental characteristics that define the relevant features of a human. Note that we will use the term digital human to refer to the models even if the represented object is a part of the body or the whole body of a subject.

Representations
Physical objects (including humans) can be represented using continuous variables. We consider the models of humans as continuous in space ($\mathbf{r}$) and time ($t$) and described by a coefficient vector affecting a set of model characteristics:

$$f(\mathbf{r}, t) \approx f_m(\mathbf{r}, t) = \sum_{n=1}^{N} \theta_n \, \phi_n(\mathbf{r}, t). \qquad (1)$$

Here, $N$ is the dimension of the approximate finite-dimensional representation of the object, and the subscript $m$ indicates the modeling approximation to differentiate from the actual object $f(\mathbf{r}, t)$.
The collection of expansion functions $\{\phi_n(\mathbf{r}, t)\}_{n=1}^{N}$ is employed to form $f_m(\mathbf{r}, t)$, and $\theta_n$ denotes the $n$-th component of the $N$-dimensional expansion coefficient vector $\boldsymbol{\theta}$. The quantity $f_m(\mathbf{r}, t)$ constitutes a discrete representation of a digital human that can be readily displayed on a computer or digitally processed. For the case where the expansion functions are defined as indicator functions that describe non-overlapping space-time voxels, $\boldsymbol{\theta}$ can sometimes be interpreted as a digital image whose components $\theta_n$ represent the integrated value of the object over the support of the voxel. More generally, a digital human model can be established by integrating the continuous representation $f_m(\mathbf{r}, t)$ over a collection of $N$ voxels as

$$f_n = \int_{v_n} f_m(\mathbf{r}, t) \, d\mathbf{r} \, dt, \quad n = 1, \dots, N, \qquad (2)$$

where $v_n$ denotes the support of the $n$-th spatial-temporal voxel and $f_n$ denotes the $n$-th component of an $N$-dimensional vector $\mathbf{f}$ that represents the digital human.
As discussed below, the choice of the expansion functions and associated expansion coefficients can be specified in different ways, with the general goal of making $f_m(\mathbf{r}, t)$ an accurate approximation of $f(\mathbf{r}, t)$. The expansion functions can depict geometry (e.g., size, morphology), material properties (e.g., x-ray interaction cross-sections, elasticity) or other relevant features (e.g., radioactivity, blood oxygenation levels). For simplicity, we will consider that the stochastic human does not vary with time and proceed only with the spatial dimension $\mathbf{r}$. However, the concepts that follow can readily be generalized to model time-varying descriptions [19]. In practice, the coefficient vector $\boldsymbol{\theta}$ can be modeled as a random vector and the expansion functions $\{\phi_n(\mathbf{r})\}_{n=1}^{N}$ as random processes. Methodologies for generating large cohorts of digital stochastic models of humans for in silico imaging trials, including models for organs and tissues with appropriate variability, can rely on sampling $\boldsymbol{\theta}$, $\phi_n$, or both from appropriate distributions representing the intended population. We can denote the cohort of digital stochastic humans as follows,

$$\left\{ f_s(\mathbf{r}) \right\}_{s=1}^{S}, \quad f_s(\mathbf{r}) = \sum_{n=1}^{N} \theta_{n,s} \, \phi_{n,s}(\mathbf{r}), \qquad (3)$$

where $s$ denotes a particular state or random realization of a digital human in a cohort of size $S$. When the $\phi_n$ are known, analytically or numerically, the stochastic models are referred to as procedural. In this case, the modeler is left with choosing the coefficient vector defining the object ($\boldsymbol{\theta}$). In cases for which the defining characteristics are unknown, $\theta_n$ and $\phi_n$ can be estimated from imaging data.
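The expansion, voxelization, and cohort-sampling steps described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the 1D geometry, the Gaussian expansion functions, the log-normal distribution for the coefficients, and the names `gaussian_basis` and `sample_cohort`. The number of expansion functions and the number of voxels are also kept separate here for clarity.

```python
import numpy as np

def gaussian_basis(grid, centers, width):
    """Evaluate the expansion functions phi_n(r) on a 1D grid (hypothetical choice)."""
    # Shape (N, len(grid)): row n holds phi_n sampled at every grid point
    return np.exp(-0.5 * ((grid[None, :] - centers[:, None]) / width) ** 2)

def sample_cohort(S, n_basis=8, n_voxels=64, sub=16, seed=0):
    """Draw S realizations f_s(r) = sum_n theta_{n,s} phi_n(r), then voxelize them."""
    rng = np.random.default_rng(seed)
    # Fine grid standing in for the continuous spatial variable r
    fine = np.linspace(0.0, 1.0, n_voxels * sub)
    centers = np.linspace(0.1, 0.9, n_basis)
    phi = gaussian_basis(fine, centers, width=0.05)   # known, fixed phi_n
    # Assumed population distribution for the coefficient vector theta
    theta = rng.lognormal(mean=0.0, sigma=0.3, size=(S, n_basis))
    f_m = theta @ phi                                 # shape (S, n_voxels * sub)
    dr = fine[1] - fine[0]
    # Riemann-sum approximation of the integral over each voxel support
    f = f_m.reshape(S, n_voxels, sub).sum(axis=2) * dr
    return theta, f

theta, f = sample_cohort(S=5)
print(f.shape)  # one voxelized digital object per row
```

Because the expansion functions are fixed and only the coefficients are random, this corresponds to the procedural case in the text; sampling the expansion functions as well would require drawing `centers` and `width` per realization.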
In the following sections, we review available methods and tools for generating digital human models and digital cohorts.We present a classification of available approaches in Figure 1.

Individual models
Individual models attempt to create a digital replica of a specific physical object. Individual models can be categorized as personalized and family models. These models are not stochastic since they are meant to represent individual subjects with as much detail and accuracy as achievable from the image data. In this respect, the representation introduced in Section 3 applies only with $S = 1$, resulting in a single coefficient vector ($\theta_n$) defining the individual.
The digital representation in these cases is typically a multidimensional voxelized array that can be segmented into structures such as tissues and organs. Early attempts relied on geometrical volumes represented by analytical expressions altered to generate a wide variety of sizes and shapes. In other words, the $\phi_n$ are described by quadrics and the $\theta_n$ represent properties of the volumes defined by the surfaces (e.g., x-ray attenuation and scattering properties). These computational models have proved useful in areas of quality control of imaging systems [20,21] and in radiation dosimetry [22]. Even with more sophisticated geometrical structures [23,24,25] and more spatial detail, these approaches lack the ability to accurately represent the statistical variability found in humans, organs and tissues. While these simpler models remain practical and useful for some tasks, the lack of realism and variability makes them unsuitable for generating digital humans for in silico imaging trials.

Personalized models
Personalized models aim to capture patient-specific information in a digital representation [26]. Medical digital replicas of human subjects are in silico representations of an individual in terms of anatomy and physiology. Sometimes referred to as digital twins [27], these replicas are designed to simulate parts or the whole body of a subject for prognostic or predictive assessments.
These models, including digital twins, can be continuously updated from multimodal medical data if the data characteristics change over time §. Digital twins are of interest in the context of evaluating and selecting optimal medical treatments [28] or imaging procedures [29] within clinical practice, and can also be incorporated into other in silico applications [30]. For instance, Wang [31] suggested three applications in the areas of medical imaging: optimal selection of scanning techniques (so-called "virtual comparative scanning"), data sharing from in silico scanning of the digital replica to the open source community, and improvement of the regulatory process of image reconstruction algorithms. Patient image datasets can also be used to generate models of specific tissues and organs. For instance, the Visible Human project [32] was first made available in 1994 by the National Library of Medicine (part of the NIH) to facilitate anatomy visualization applications and includes a detailed data set of cross-sectional photographs of the human body.

Family models
Personalized models of a small number of subjects can be assembled into families to generate a collection of a small number of digital humans spanning a common set of parameters, such as subjects' body size and age.These models are based on image acquisitions using different modalities including computed tomography (CT), magnetic resonance imaging (MRI) and chest radiographs (CXR).
An example of a family model is the Virtual Family [33] (https://www.fda.gov/about-fda/cdrh-offices/virtual-family), released by FDA in 2012. The Virtual Family consists of a set of detailed, anatomically correct whole-body models of an adult male, an adult female, and two children based on high-resolution MRI data of healthy volunteers. Organs and tissues are represented using computer-aided design (CAD) techniques where each component is a high-resolution, non-self-intersecting mesh. In this case, the models are used for electromagnetic, thermal and acoustic simulations in the safety assessment of active and passive medical implants [34]. Safety evaluations do not require full sampling of the intended population and can be performed with a small number of exemplars, provided the exemplars adequately cover the needed parameter space. Similar approaches are utilized in efforts to provide models of patient anatomy using patient images as the basis for development of cohorts, including using MRI and CT images for modeling lungs [35] and torso [14]. More recently, image-derived digital and physical models of the breast have been proposed by Kiarashi [36] and Bliznakova [37]. In this approach, a voxelized breast model is derived from patient images through image segmentation for determining the composition of each voxel [38,39,40,41,42,43,44]. Patient-derived models are limited to the imaging characteristics of the acquisition system and are also affected by the imperfections of the segmentation methods. The resulting models can also be augmented with physiological features to facilitate imaging studies involving contrast agents [45].
§ A related concept is an avatar, an artistic and sometimes aspirational digital representation of the human in the digital world for interactivity purposes.

Population models
Testing new imaging devices, however, requires the availability of large digital cohorts of stochastic digital humans that can be assembled to properly power a study not only on the aggregate (i.e., for the entire population), but also to analyze for specific subgroups with notable characteristics, including under-represented populations.In this section, we focus our attention on models suited for the generation of large cohorts of digital humans to be enrolled within in silico imaging trials.

Image-based models
Image-based models estimate and sample the model components $\phi_n$ and $\theta_n$ in Eq. 3 from relevant characteristics within the acquired medical images. Whether parametric or generative, all image-based models are limited by the quality of the source data (i.e., medical images), including noise, artifacts, and contrast constraints, and do not provide an unequivocal mapping to the underlying tissues. In practice, the use of image-based models should also acknowledge the limitation arising from the existence of a null space of the imaging system [46]. The null space, which typically arises from the mapping of a continuous object to discrete data with an imperfect image acquisition system, results in an unavoidable loss of information regarding the object. Given that the imaging system operator is only partially known for most imaging systems and cannot represent information obscured by the null space of the imaging transformation, image-based models are limited even when imaging system models include measurement noise.
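The effect of a null space can be seen in a small numerical sketch. It assumes a fully discrete, linear, noise-free imaging operator `H` with fewer measurements than object unknowns, which is a simplification of the continuous-to-discrete setting described above; the random matrix and the dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obj, n_meas = 16, 6                     # more object unknowns than measurements
H = rng.normal(size=(n_meas, n_obj))      # hypothetical linear imaging operator

f = rng.normal(size=n_obj)                # "true" discrete object
H_pinv = np.linalg.pinv(H)
f_meas = H_pinv @ (H @ f)                 # component of f visible to the system
f_null = f - f_meas                       # component lying in the null space of H

g = H @ f                                 # measured data from the full object
g_without_null = H @ f_meas               # data from the measurable component only

print(np.allclose(g, g_without_null))     # the data cannot distinguish the two
print(np.linalg.norm(f_null))             # yet the objects differ
```

Any model estimated only from `g` has no information about `f_null`, which is the sense in which image-based models cannot recover object components obscured by the imaging transformation.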

Image-based parametric models
In image-based parametric models, the generation of cohorts is achieved by creating models based on available sets of patient imaging data and model modification techniques including parametric deformation, morphing, and registration. Parametric models (also known as stylized phantoms [47]) capture a population cohort by a set of mathematical equations representing a series of surfaces (e.g., splines) defining organs that are later voxelized into a volumetric model. The popular 4D extended cardiac-torso (XCAT) phantom [48] is an example of an image-based parametric model, and a survey of other representations can be found in Kainz [49].
One limitation of this approach is that model development is typically performed on a small number of available patient images. For instance, Erickson [39] presented a methodology to create a database of anatomically variable 3D digital breast models from dedicated breast CT images using a tissue classification and segmentation algorithm and a fuzzy C-means segmentation algorithm.
The study provided a population of 224 breast phantoms incorporating a range of breast types, volumes, densities, and parenchymal patterns. However, using hundreds of images might be insufficient to properly characterize a population for statistically powered in silico imaging trials across patient variability.
Some recently released image datasets include a larger number of cases.
For example, the Medical Information Mart for Intensive Care (MIMIC) CXR dataset [50] contains 227,835 imaging studies from 65,379 patients presenting to the Beth Israel Deaconess Medical Center Emergency Department between 2011 and 2016. Similarly, the Medical Imaging and Data Resource Center (MIDRC) effort [51] is undertaking a large, multi-year, systematic effort to collect high-quality COVID data, and over 100,000 imaging studies have been made public after 2 years of work and with significant funding from the NIH.
However, data sets collected in these well-defined areas are likely still insufficient to capture the total variability in patient images and the large number of subgroups one may find interesting to study ¶. This limitation precludes the use of image-based parametric models for accurately creating digital cohorts for large-scale in silico trials.
Generation of multiple realizations of humans to constitute a cohort can be obtained by extending image-derived models to create populations in a statistical manner.
For instance, Sturgeon [52] developed synthetic breast models using principal component analysis (PCA) to describe a small training set of patient images.
In this approach, each existing patient breast CT volume was compactly represented by the mean image plus a weighted sum of eigenbreasts. The distribution of weights was sampled to create synthesized breast phantoms that matched fibroglandular density and noise power law exponent distributions in real images. Hence, the distribution of the synthetic model is determined by that of the training data and, therefore, might suffer from a lack of appropriate representations of cases at the tails of the distribution (e.g., very large or very small, very dense or very glandular breasts). A related concept from the computer vision and graphics community is the statistical human body model, in which a vertex-based model of the body surface is learned, typically via PCA, from subjects' input. The techniques rely on linear blend skinning (LBS) to constrain the surface vertex deformation with respect to a template bone skeleton [53]. Created for non-medical purposes, these parametric models are typically learned from training examples of lower resolution than what is common in medical imaging.
¶ "I cannot breed them. So help me, I have tried. We need more . . . than can ever be assembled. Millions, so we can be trillions more," Niander Wallace in Blade Runner 2049 (see https://www.imdb.com/title/tt1856101/characters/nm0001467).
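The "mean image plus a weighted sum of eigen-images" construction can be sketched generically as follows. This is not the implementation of [52]: the random stand-in training data, the independent Gaussian model for the weights, and the function names `fit_pca` and `sample_phantoms` are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(training, n_components):
    """Fit a PCA model to flattened volumes of shape (n_subjects, n_voxels)."""
    mean = training.mean(axis=0)
    centered = training - mean
    # Rows of vt are the principal directions (the "eigen-images")
    _, sing, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Standard deviation of the training weights along each component
    weight_std = sing[:n_components] / np.sqrt(training.shape[0] - 1)
    return mean, components, weight_std

def sample_phantoms(mean, components, weight_std, n_samples, rng):
    """Synthesize volumes as mean + weights @ components with Gaussian weights."""
    # Assumption: weights are independent, zero-mean Gaussians
    weights = rng.normal(size=(n_samples, len(weight_std))) * weight_std
    return mean + weights @ components

rng = np.random.default_rng(0)
training = rng.normal(size=(20, 100))     # stand-in for 20 flattened patient volumes
mean, components, weight_std = fit_pca(training, n_components=5)
phantoms = sample_phantoms(mean, components, weight_std, n_samples=3, rng=rng)
print(phantoms.shape)
```

Since the sampled weights follow the spread of the training weights, samples far outside the training distribution are rare, which mirrors the limitation at the tails noted above.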
One alternative approach is to add deformation morphing using an anatomic template [26]. Lee [47] introduced a hybrid, non-uniform rational B-spline surface (NURBS) based phantom of an infant by combining the expressiveness of a voxel phantom with the flexibility of geometric manipulation and organ positioning in a parametric phantom. Another example is the XCAT Warp [54], where AI-assisted unsupervised registration is used to warp XCAT to patient CT images to capture a broader set of variations, compared to the existing organ and model scaling offered by XCAT. These methods are suitable for investigating digital-twin approaches where individual models reflecting the characteristics of a single individual are needed.

Image-based generative models
Image-based generative models attempt to synthesize a population of stochastic digital humans from information contained in medical images. Ideally, this population captures the variability in the anatomy and tissue properties within a specified cohort of to-be-imaged subjects. Consider a collection of $N$-dimensional digital humans $\{\mathbf{f}_s\}_{s=1}^{S}$ that represents the cohort of interest as described by Eq. 3. This setting corresponds to a practical situation in which an in silico study employs a fully discrete representation of an imaging system in which a finite-dimensional approximation of an object is mapped to discrete image data. As mentioned in Section 3, each digital human $\mathbf{f}_s$ can be interpreted as a realization of a random vector $\mathbf{f}$ that is characterized by an unknown probability density function $\mathrm{pr}(\mathbf{f})$. The ability to sample from $\mathrm{pr}(\mathbf{f})$ to generate large ensembles of objects for use in in silico imaging trials is, at least conceptually, the ultimate objective of a stochastic digital human model.
Emerging generative methods that utilize neural networks are being actively developed for this purpose [55]. We refer to these methods as generative models. A generative adversarial network (GAN) is a type of generative model that has recently been very popular for high-resolution image synthesis [56], image translation [57,58] and a number of generative image applications [59]. Instead of explicitly modelling $\mathrm{pr}(\mathbf{f})$, which is difficult due to the high dimensionality of $\mathbf{f}$, GANs seek to define a stochastic process for drawing samples. As such, GANs are categorized as implicit generative models. Specifically, GANs operate by mapping samples from an analytically tractable, low-dimensional distribution $\mathrm{pr}(\mathbf{z})$ to the sought-after samples of the high-dimensional distribution $\mathrm{pr}(\mathbf{f})$. Typically, $\mathrm{pr}(\mathbf{z})$ is specified as an independent and identically distributed (i.i.d.) standard normal distribution and, therefore, samples of the random vector $\mathbf{z}$ can be readily generated. The mapping is usually implemented via a deep neural network referred to as the generator. Simultaneously with generator training, a discriminator network is trained to discriminate between the real and generated examples. Therefore, the training process is adversarial and approximately solves a min-max optimization problem [60]. In this case, a collection of training data (typically images) is utilized to learn how to sample from an empirical distribution that approximates $\mathrm{pr}(\mathbf{f})$. An excellent review of GAN applications for medical image generation can be found in [61].
The adversarial training process for GANs is inherently unstable and can result in a phenomenon known as mode collapse, in which the model fails to sample from certain regions of probability space. In addition, the generated samples are often of low resolution. A number of alternative generative models [62,63] have been developed to address these challenges in applications to medical imaging [64]. For example, generative latent optimization (GLO) [62] trains deep convolutional generators by minimizing a simple reconstruction loss, improving on GAN training instabilities. Diffusion models [63,65] define a Markov chain of diffusion steps that incrementally add noise to the data and learn to reverse this process, and have been reported to significantly outperform GANs in output image quality [66]. To date, almost all studies of deep generative models have focused on synthesizing images rather than object representations.
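For reference, the min-max problem mentioned above can be written in its commonly used form. The symbols $G$ (generator) and $D$ (discriminator) are introduced here for illustration, and the expression is the standard GAN objective rather than one stated in this review:

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{\mathbf{f} \sim \mathrm{pr}(\mathbf{f})}
    \left[ \log D(\mathbf{f}) \right]
  + \mathbb{E}_{\mathbf{z} \sim \mathrm{pr}(\mathbf{z})}
    \left[ \log \left( 1 - D\!\left( G(\mathbf{z}) \right) \right) \right]
```

The first term rewards the discriminator for recognizing training examples, and the second rewards it for rejecting generated samples $G(\mathbf{z})$; the generator is trained to make the second term small.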
Limitations
There are several significant challenges to employing GANs or other types of deep generative models to establish stochastic human models. A fundamental and potentially limiting issue is the fact that a collection of objects $\{\mathbf{f}_s\}_{s=1}^{S}$ is generally not available. Medical images are degraded by the presence of measurement noise and/or reconstruction artifacts, which are a limitation of the imaging system and not representative of the true underlying objects. As such, conventional GANs that are directly trained on degraded images will not learn how to sample from the true distribution of objects. In essence, there is a "chicken and egg problem" when seeking to establish stochastic human models via deep generative models. There are two possible ways to circumvent this limitation. First, one can utilize high-quality medical images as surrogates of the objects. For example, in certain tomographic imaging modalities and under specific conditions, images of object properties can be reconstructed and accurately approximate the true object properties. In this case, GANs are trained in the conventional manner, with images representing the training data. If these images are representative of the desired subject cohort, the GAN has the opportunity to accurately capture object variability. Second, one can modify the GAN training process to incorporate the image degradation process in training. This approach, referred to as an ambient GAN (AmGAN) [67], utilizes a generator network that is augmented with a measurement operator. Objects produced by the generator are mapped to degraded image data, which are then compared with experimental images by the discriminator network. This permits an implicit generative model that describes object randomness to be learned from indirect and noisy measurements of the objects themselves. In a preliminary study, the AmGAN was explored for establishing stochastic object models from imaging measurements for use in optimizing imaging systems [67].
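One way to write the ambient-GAN idea compactly, under the assumption of a known measurement operator $\mathcal{H}$ and additive measurement noise $\mathbf{n}$ (both symbols are introduced here for illustration and are not taken from [67]), is to apply the discriminator to measured data $\mathbf{g}$ instead of objects:

```latex
\mathbf{g} = \mathcal{H}(\mathbf{f}) + \mathbf{n},
\qquad
\min_{G} \max_{D} \;
  \mathbb{E}_{\mathbf{g}}
    \left[ \log D(\mathbf{g}) \right]
  + \mathbb{E}_{\mathbf{z},\, \mathbf{n}}
    \left[ \log \left( 1 - D\!\left( \mathcal{H}\!\left( G(\mathbf{z}) \right) + \mathbf{n} \right) \right) \right]
```

In this sketch the generator still outputs objects $G(\mathbf{z})$, but it is only ever scored after simulated degradation, so the learned distribution is constrained through the measurements alone.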
While promising, the use of deep generative models for in silico clinical trials is nascent and there remain important topics for future investigation. The objective assessment of these technologies is largely lacking, and there is no consensus regarding what statistical information can be reliably learned. Additionally, current models have largely been applied to 2D images, and their extension to three dimensions is an ongoing topic of research. Finally, as with any data-driven method for establishing stochastic human models, the presence of an imaging system null space will fundamentally limit the ability of GANs to describe certain components of the to-be-imaged objects. The extent to which the null space can be mitigated also remains a topic of ongoing research [67].

Knowledge-based models
Knowledge-based (also known as procedural) models are constructed by sampling a set of $\phi_n$ and $\theta_n$ in Eq. 3 from distributions representing the relevant characteristics of the models. The characteristics of the distributions are often derived from physical or biological measurements. Procedural models allow for an unlimited number of random realizations of the object, leading to the possibility of creating large cohorts of digital humans, including the representation of rare cases, and at varying spatial resolution, which can properly account for small structures that might be relevant for the specific imaging task being studied.
However, they are usually computationally intensive and require a large number of parameters to be defined and estimated based on prior knowledge. Their accuracy and realism depend on the parameter combinations, and they can sometimes generate completely unrealistic outputs.
Knowledge-based, procedural models are common in modeling breast anatomy for imaging studies. Graff [68] proposed a detailed model that begins with defining an outside surface using a quadratic hemisphere shell with a skin layer and nipple area overlaid.
The shape of the shell is then adjusted for the overall breast volume and surface curvature. Using a Voronoi segmentation approach, the interior is randomly divided into regions of fat or glandular components, with each glandular component containing a ductal network with terminal duct lobular units.
The volume is then filled with Cooper's ligaments, chest muscle, and blood vessels.
For the VICTRE trial [15], the breast model was sampled with a 50-µm voxel size. The implementation is initiated with a set of random seeds and creates random voxelized breast anatomy objects segmented into nine different tissue types. Several different modeling techniques are employed, including a non-isotropic Voronoi segmentation, recursive tree branching algorithms to generate a ductal tree and vascular network, and Perlin-noise perturbed random spheroids to create fat lobules.
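The Voronoi step can be illustrated with a short sketch. It assumes a plain isotropic Voronoi partition with two tissue labels and a rectangular volume; the function name, the `glandular_fraction` parameter, and the grid size are hypothetical, and the sketch omits the non-isotropic distance metric, breast shape, ducts, ligaments, and vessels of the models in [68] and [15].

```python
import numpy as np

def voronoi_tissue_volume(shape, n_seeds, glandular_fraction, rng):
    """Label each voxel by its nearest random seed, then assign every
    Voronoi cell to fat (0) or glandular tissue (1)."""
    seeds = rng.uniform(0.0, 1.0, size=(n_seeds, 3)) * np.array(shape)
    cell_tissue = (rng.uniform(size=n_seeds) < glandular_fraction).astype(np.uint8)
    axes = [np.arange(s) for s in shape]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)   # (*shape, 3)
    # Squared distance from every voxel to every seed: (*shape, n_seeds)
    d2 = ((grid[..., None, :] - seeds) ** 2).sum(axis=-1)
    nearest = d2.argmin(axis=-1)                                  # index of closest seed
    return cell_tissue[nearest]

rng = np.random.default_rng(42)
volume = voronoi_tissue_volume((32, 32, 32), n_seeds=50,
                               glandular_fraction=0.3, rng=rng)
```

Changing the seed of `rng` yields a new random realization of the anatomy, which is the property that lets procedural models generate arbitrarily large cohorts.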
A similar effort by Bliznakova [37] describes a 3D breast software model for x-ray breast imaging simulations based on a breast external shape, ductal lobular system, Cooper's ligaments and pectoralis muscle. In this approach, a mammographic background texture is added to the tissue regions. Blood vessels, nerves and lymphatics were not modeled explicitly. A similar but simpler approach was developed by Bakic [69] based on two ellipsoidal regions of large-scale tissue elements: predominantly adipose tissue and predominantly fibro-glandular tissue. Internal tissue structures within these regions are approximated by a distribution of elements including shells, blobs, and a ductal tree. Similar approaches have been reported for full-body models [47].

Modeling disease
Disease states can be incorporated into digital cohorts using image-based methods or object-space models of the condition. The analogy between digital human models and disease models can be established if we consider lesions as continuous variables in space (r) and time (t), described by a coefficient vector affecting a set of lesion model characteristics. For simplicity, we consider the disease to be additive and independent of the underlying anatomy in which it is located. This assumption allows us to represent the disease cases as a sum of the stochastic human model and the disease component, an addition that is typically performed in the voxelized object model or directly within the model images. We recognize that this approach is a simplification, as disease processes often have a significant impact on underlying tissues.
Analogously to the description provided by Eq. 3, we can generate a set of disease models $\{d^s\}$ defined by Eq. 4, where $\lambda_n^s$ is a disease characteristics coefficient vector described by the function $\psi_n$ over $N$ parameters. Characteristics that define lesions can include geometric functions (e.g., size, morphology), material properties (e.g., x-ray interaction cross-sections, elasticity) or other relevant features (e.g., radioactivity, blood oxygenation levels).
Methodologies for generating and incorporating disease into cohorts of digital stochastic models rely on sampling $\lambda_n$ and $\psi_n$ from appropriate distributions representing the intended population. In some cases, disease models are specific to a given anatomical location or physiology corresponding to a digital human exemplar. In other cases, disease models are independent of the digital healthy human and are simply added or inserted multiple times into models of healthy anatomy. In both cases, diseased subjects are denoted by a cohort of digital stochastic humans with added disease components (Eq. 5), where $\{f^s\}_{s=1}^{S}$ is a cohort of diseased digital humans (for simplicity, and as in the previous section, we choose to omit the time dimension). As with normal models, when the $\psi_n$ are unknown, models of disease can be obtained by relying on imaging. Alternatively, when the $\psi_n$ are known, analytically or numerically, the stochastic disease models are referred to as knowledge-based (also known as procedural).
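A plausible form of the two expressions referred to in this section as Eq. 4 and Eq. 5 is sketched below. It assumes that Eq. 3 is a linear expansion of the object in the functions $\phi_n$ with coefficients $\theta_n$, and that disease is additive, as stated above. The subscript $d$ marking the diseased object and the symbol $f_0^s$ for the underlying healthy model are hypothetical notation introduced here for clarity; they are not taken from the source.

```latex
% Assumed form of Eq. 4: disease model as an expansion analogous to Eq. 3
d^{s}(\mathbf{r}) = \sum_{n=1}^{N} \lambda_{n}^{s}\, \psi_{n}(\mathbf{r}),
\qquad s = 1, \dots, S

% Assumed form of Eq. 5: diseased subject = healthy stochastic human + disease component
f_{d}^{s}(\mathbf{r}) = f_{0}^{s}(\mathbf{r}) + d^{s}(\mathbf{r})
```

Under this reading, accounting for the interaction between tumor and host tissue would require replacing the simple sum in the second expression with an operator that depends on both the healthy and the disease models.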

Image-based models of disease
Similar to image-based models of the human body, image-based models of disease rely on imaging data for extracting lesion information. Various techniques for capturing disease characteristics, particularly for breast lesions, have recently been explored [70,71]. Image-based neural network models for disease modelling have also been explored. For instance, Kadia [72] proposed a method to generate synthetic, infection-like patterns in the lung to create large collections of 2D and 3D training examples for deep segmentation models.
While image-based models contain features from actual patient data and thus may look more realistic at first glance, they suffer from limited resolution of the tumor model, largely determined by the imaging acquisition characteristics, and from a limited number of available lesion morphologies, shapes, and sizes. In addition, image-based methods require institutional review board (IRB) approval for obtaining and utilizing the diseased case data for research and development, which could delay or disadvantage some analysis efforts.

Knowledge-based models of disease
Knowledge-based models of disease are constructed by sampling a set of known (or assumed known) $\psi_n$ and $\lambda_n$ in Eq. 4 from distributions representing the relevant characteristics of the disease, where distributions are often derived from physical or biological measurements. In contrast to image-based models, knowledge-based models enable the generation of unlimited numbers of lesion shapes with variable resolution.
Examples of knowledge-based models include the spiculated breast cancer mass model of de Sisternes [73] and the growing breast mass models of Sengupta [74]. In [74], a breast lesion growth method based on biological and physiological phenomena, accounting for the stiffness of surrounding anatomical structures, was introduced. Breast ligaments were considered as rigid structures with elastic moduli in the range of $8\times10^{4}$ to $4\times10^{5}$ kPa, while fat (elastic modulus varying from 0.5 to 25 kPa) and glandular tissues (elastic modulus varying from 7.5 to 66 kPa) constitute the more elastic regions of the breast. In this approach, tumor cells are less likely to grow through stiffer structures and instead preferentially proliferate through the more elastic regions of the breast. Depending on the local anatomical structures of the breast, a range of unique lesion morphologies can be realized, allowing lesions to blend naturally into the anatomical regions.
A common simplifying assumption is to define the disease model independently of other human model components. For example, in VICTRE [15] and in Sengupta [75], breast cancer mass lesions are added to the normal breast models by replacing voxels in the breast with voxels of the lesion model, without modification to adjacent voxels. This approach, while practical, does not account for the significant effect of a growing tumor on its surrounding tissues, typically visible in x-ray images as architectural distortions suggestive of abnormalities. To consider these effects, Eq. 5 needs to be modified to account for the interaction between normal and disease models.
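The voxel-replacement insertion described above can be sketched in a few lines. The volume size, the tissue labels, the spherical lesion, and the function name `insert_lesion` are all hypothetical choices made for illustration; the lesion models used in [15] and [75] are far more detailed.

```python
import numpy as np

def insert_lesion(breast, lesion_mask, corner, lesion_label):
    """Return a copy of `breast` in which lesion voxels replace tissue voxels.
    Voxels outside the lesion mask, including adjacent ones, are left unchanged."""
    out = breast.copy()
    z, y, x = corner
    dz, dy, dx = lesion_mask.shape
    region = out[z:z + dz, y:y + dy, x:x + dx]   # view into the copy
    region[lesion_mask] = lesion_label           # overwrite only lesion voxels
    return out

# Hypothetical inputs: a uniformly labeled volume and a spherical lesion mask.
breast = np.ones((40, 40, 40), dtype=np.uint8)   # 1 = placeholder tissue label
zz, yy, xx = np.ogrid[-5:6, -5:6, -5:6]
lesion_mask = zz**2 + yy**2 + xx**2 <= 25        # radius-5 sphere in an 11^3 box
diseased = insert_lesion(breast, lesion_mask, corner=(10, 10, 10), lesion_label=9)
```

Because only voxels inside the mask change, the surrounding anatomy shows no distortion, which is exactly the limitation noted in the text.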

Role of augmentation methods
Augmentation methods start with an already-defined object, image or a set of defined objects, and generate new examples based on properties of inputs, as well as pre-defined or data-driven transformations (in contrast, digital human models start with only an object description, such as that given in Eq. 1). GAN-based models (see Section 5.1.2) are similar to augmentation methods in that they employ complex transformations derived with the help of training data sets. Augmentation methods typically employ analytically-defined or stochastic operators that do not require the use of neural networks, and can be applied both in the object domain and in the acquired image domain. Techniques in the latter group generate examples that could be obtained through an imaging system applied to an object with an accompanying degradation (e.g., smoothing, noise, reconstruction artifacts).
Geometric transformations, intensity operations, and spatial filtering are among the most basic types of augmentation methods. Geometric transformations redefine the spatial relationships among voxels or geometrical locations in an object, and include affine (scaling, rotation, translation, reflection and shearing), as well as non-affine transformations, such as nonlinear warping and morphing [76]. Intensity operations modify intensity values in a grayscale image or channel values (e.g., RGB or CMYK) in a color image. Examples include a family of gamma corrections, linear contrast adjustments, and remapping of voxel values using a pre-defined or pseudo-random remapping curve [77,78]. Spatial filtering (using a filter mask) is another possibility for generating a new image or object based on an existing one. Spatial filtering can be linear (in which case it can be implemented by a convolution operation) or nonlinear (e.g., median filtering), and can be used to smooth or sharpen an image to emphasize certain features. Finally, all three types of augmentations can be combined using a continuous mapping from the parameter space of transformations to the image or object space [79].
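One example of each of the three basic augmentation types is sketched below using NumPy only. The function names, parameter values, and the random test image are illustrative assumptions; `box_filter` assumes an odd filter size, and none of this reproduces the specific methods of [76-79].

```python
import numpy as np

def gamma_correct(image, gamma):
    """Intensity operation: monotonic remapping of values clipped to [0, 1]."""
    return np.clip(image, 0.0, 1.0) ** gamma

def flip_and_rotate(image, k):
    """Geometric (affine) transformation: reflection, then k * 90 degree rotation."""
    return np.rot90(np.fliplr(image), k)

def box_filter(image, size=3):
    """Linear spatial filtering: convolution with a uniform size x size mask (size odd)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / size**2

rng = np.random.default_rng(1)
image = rng.uniform(size=(64, 64))
# The three operations can be chained to produce one augmented example.
augmented = box_filter(flip_and_rotate(gamma_correct(image, gamma=0.8), k=1))
```

Gamma correction with a positive exponent is monotonically increasing on [0, 1], so it preserves the ordering of intensities, a property that matters for the plausibility of augmented images.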
Noise injection is an image augmentation method that enhances robustness of machine learning models and belongs to the family of domain randomization (DR) methods [80]. Although noise injection after data acquisition does not generate a new member of a patient population, it can generate a different representation of an object in the image domain, and can be useful for augmenting patient cohorts obtained with in silico modeling. Some earlier and non-medical applications of noise injection in machine learning sought to augment the image data sets without regard to the physics of image acquisition [81,82].
Other works used physics-based techniques for noise modeling and addition, improving realism of the noise appearance in the augmented images [83,84]. The main benefit of noise injection in the image domain for in silico trials is that it may allow for the rapid generation of different representations of the same object at different noise levels, leading to comparisons that may require less computational power compared to a full implementation of image acquisition physics applied to a digital stochastic object model. Addition of texture to a model in the object domain has similarities to noise injection in the image domain in that both techniques aim at producing noise-like properties (e.g., using a noise power spectrum in modeling), but are different in that addition of texture in the object domain does not attempt to model the noise from data acquisition [85].
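Generating several noise levels from one noiseless image can be sketched as follows. The sketch assumes uncorrelated, purely Poisson (quantum) noise and a hypothetical `mean_counts` parameter standing in for dose; it ignores detector blur, noise correlation, and electronic noise, and is not the method of [83,84].

```python
import numpy as np

def inject_poisson_noise(noiseless_image, mean_counts, rng):
    """Scale a non-negative noiseless image so that its mean equals `mean_counts`,
    draw Poisson counts per pixel, and rescale to the original units."""
    scale = mean_counts / noiseless_image.mean()
    counts = rng.poisson(noiseless_image * scale)
    return counts.astype(float) / scale

rng = np.random.default_rng(7)
noiseless = np.full((128, 128), 2.0)
noiseless[48:80, 48:80] = 2.2          # faint square standing in for a lesion
low_dose = inject_poisson_noise(noiseless, mean_counts=50, rng=rng)
high_dose = inject_poisson_noise(noiseless, mean_counts=5000, rng=rng)
```

Both outputs are representations of the same object; only the relative noise level differs, roughly as the inverse square root of the mean count.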
Combination of objects or images is another popular augmentation technique.
In the object domain, combination of an object model for a normal (non-diseased) patient with a lesion model (as described in Section 6) can be thought of as an example of this type of augmentation. Generating new members of a patient population based on an eigenspace analysis of existing patient objects, as was done in [52] and described in Section 5.1.1, is another example of augmentation in the object domain. In the image domain, researchers have investigated tools for the extraction of image parts from one clinical image and their insertion into a new location on the same or a different image. Pezeshk [86] used an image blending technique based on Poisson image editing to insert pulmonary nodules extracted from one chest CT exam into another. Augmenting a training data set for a machine learning model using this technique can improve the model performance on independent, real test datasets [87]. Ghanian [88] used a similar technique to insert microcalcification clusters extracted from one mammogram into another mammogram, and showed that experienced observers cannot reliably distinguish between computationally inserted and native clusters. Besides the ability to convince experts, desirable properties for such combination techniques include acceptable noise properties in the combined image, plausible lesion-background combinations (which might require the intervention of an operator during the augmentation process), and a sufficient range of variation in the combined images that can be generated. These properties are often difficult to satisfy simultaneously.
The main advantage of data augmentation methods is their practicality. For example, existing images or models both for normal and diseased patients can be manipulated (with relative ease) with geometric transformations, leading to expanded patient representations. When implemented in the image domain, augmentation methods are fast, bypassing the stage where a model for the imaging system is applied to the object to yield an image. However, important shortcomings accompany these advantages. Unless deliberate attention is paid, augmentation methods may yield objects or images that are biologically or physically implausible. An extreme example may be an intensity transformation that results in bones with lower Hounsfield units than soft tissue. Although this can be avoided easily by using an intensity transformation that is monotonically increasing, most augmentation methods and transformations need careful planning to avoid such inconsistencies, and it may not be possible to avoid all inconsistencies. The consequences of such implausible images or objects on the results of an in silico imaging trial should be carefully considered. In addition, many augmentation techniques do not result in an independent, new representation from the population, but rather in representations that are highly dependent on the original objects or images used as inputs to the augmentation method. For example, lesion insertion methods described in the previous paragraph do not increase the number of lesions in the augmented data set, but only the lesion-background combinations that are generated. Again, the consequences of this limitation in the range of variation of generated images should be an important consideration in an in silico imaging trial that uses augmentation.

Considerations for sampling digital cohorts
In silico studies require careful study planning and good clinical trial design.
Even if and when methodologies for developing digital stochastic models of humans for imaging studies become widely available, generating digital cohorts requires an understanding of the trade-offs and potential for bias associated with selecting a specific distribution of study subjects. At the start of the design of an in silico imaging trial is the challenging task of scoping the population of digital humans to be included in the study. For instance, a number of previous computational studies in breast imaging using procedural models used a uniform sampling with a desired average of 50% adipose and 50% fibroglandular voxels [89] with an uncompressed breast size of 14 cm. Another example of an enrollment strategy can be found in the OpenVCT platform, where a range of size and glandularity is specified and then uniformly randomly sampled [90]. A more recent in silico imaging study sampled from a multi-class distribution over four breast density categories chosen to reflect the characteristics of the intended population [15]. Analysis of $\Delta f$ corresponding to a given in silico enrollment strategy may be needed to understand how the difference across study subject distributions could affect the outcome of the trial. Here, we discuss a test case (see Figure 2), derived from the VICTRE [15] project, that compares different enrollment strategies for an in silico trial comparing digital mammography (DM) and digital breast tomosynthesis (DBT). We assume the populations (digital and physical) consist of normal and diseased subjects with a prevalence of 0.5. These two classes of patients are therefore sampled with equal probability. We calculate the difference in performance between DM and DBT, measured using the area under the receiver operating characteristic curve (AUC) in the task of differentiating between normal and diseased subjects. We consider the following four sampling approaches. In the first approach (uniform), $f_i$ is unknown and subjects are sampled uniformly within a range of interest, from all possible combinations of the input parameters that define $f$. In the second approach (matched), $f_i$ is known and subjects are sampled from the true underlying distribution. In the third approach (simpler), $f_i$ is unknown, but can be approximated by another, simpler distribution from which samples are obtained. Finally, in the fourth approach (narrow), $f_i$ is known to be a narrow, well-defined subset of the general population of subjects of particular interest (e.g., rare diseases or very obese subjects).
For this simplified example, let $f_i$ be a bimodal distribution defined by two parameters (e.g., breast size and glandularity). Using Eq. 3, we can express the model through two expansion functions $\phi_{1,2}$, each associated with one of the two random variables affected by a random parameter set given by $\theta_{1,2}$. As seen in Figure 2, one of the modes of the distribution has twice the amplitude and half the variance of the other. The four density plots illustrate a top view of the distribution contour plot with the individual samples drawn using the four different sampling strategies. The results demonstrate that the choice of sampling strategy can have a significant effect on the difference in AUC, which for this example case ranges from 0.01 (almost zero) to 0.11 in terms of device performance.
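The four enrollment strategies can be sketched as samplers over a two-parameter population. All numerical values below (mode locations, variances, parameter range, and the `shrink` factor) are hypothetical and are not those used for Figure 2. The sketch assumes two isotropic Gaussian modes with equal mixture weights, so that the mode with half the variance has twice the peak density, and it stops at drawing the cohort; it does not simulate imaging or compute AUC.

```python
import numpy as np

# Hypothetical bimodal population over (breast size, glandularity), arbitrary units.
MEANS = np.array([[0.3, 0.3], [0.7, 0.6]])
VARIANCES = np.array([0.02, 0.01])          # second mode has half the variance

def sample_matched(n, rng):
    """Distribution known: draw from the true bimodal mixture (equal weights)."""
    mode = rng.integers(0, 2, size=n)
    return MEANS[mode] + rng.normal(size=(n, 2)) * np.sqrt(VARIANCES[mode])[:, None]

def sample_uniform(n, rng, low=0.0, high=1.0):
    """Distribution unknown: sample uniformly over the parameter range of interest."""
    return rng.uniform(low, high, size=(n, 2))

def sample_simpler(n, rng):
    """Approximate the population by one Gaussian with the mixture's mean and variance."""
    mean = MEANS.mean(axis=0)
    # Per-axis mixture variance = mean within-mode variance + variance of the mode means.
    var = VARIANCES.mean() + MEANS.var(axis=0)
    return mean + rng.normal(size=(n, 2)) * np.sqrt(var)

def sample_narrow(n, rng, mode=1, shrink=0.25):
    """Sample only a narrow subpopulation concentrated around one mode."""
    return MEANS[mode] + rng.normal(size=(n, 2)) * np.sqrt(VARIANCES[mode] * shrink)

rng = np.random.default_rng(3)
samplers = {"uniform": sample_uniform, "matched": sample_matched,
            "simpler": sample_simpler, "narrow": sample_narrow}
cohorts = {name: draw(20, rng) for name, draw in samplers.items()}  # 20 subjects each
```

Each row of a cohort array would then parameterize one digital subject, and the trial outcome could be compared across the four cohorts to expose sampling-induced bias.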

Summary and conclusions
In silico trials are an emerging area of regulatory research that offers the ability to capture highly diverse patient distributions at significant time and cost savings compared to traditional physical clinical trials. To conduct in silico trials, realistic digital representations of humans are needed. In this paper, we reviewed and discussed existing techniques for generating digital humans, including disease models, for in silico imaging trials.
Digital humans can be created using image-based or knowledge-based techniques. In summary, we favor techniques with object-based representations (rather than images of objects) in order to decouple the characteristics of the image acquisition system from the characteristics of the object (true representation of the physical-world human). In generating digital humans for in silico trials, one should consider the quality and quantity of the source data or knowledge used, and whether the models represent a single patient, a small cohort, or a sizable population with realistic patient variability.
It remains a crucial next step to evaluate the quality of the digital human models and the images that can be generated with them. In particular, it is essential to carefully identify the patient distribution that the particular digital human model can and cannot capture, in order to prevent misuse and ensure patient safety. We need to study to what extent model-derived data contributes to our understanding of performance levels for populations with rare diseases or for populations underrepresented in traditional clinical trials. Future work should examine the ethical and safety considerations of relying on digital humans for clinical trials. Overall, the use of in silico imaging trials and in silico trials in medicine is a rapidly developing field and has the potential to address many of the emerging challenges in the regulatory evaluation of medical devices.

Figure 1. Classification of methods to generate digital humans for in silico clinical trials.

Figure 2. Effect of sampling strategies on performance assessment. Sampling is from a bimodal distribution of subjects (seen in the 3D inset in the second panel from the left) described by 2 random parameters. Panels show, from left to right, the uniform, matched, simpler, and narrow strategies. Only 20 samples are shown here for ease of visualization. The gray shading depicts the distribution from which samples are taken in each of the 4 cases. $A_M$, $A_T$, and $\Delta A$ refer to the lesion detection average AUC for mammography, the average AUC for digital breast tomosynthesis, and the average AUC difference, respectively.