Experimental quantification of Geant4 PhysicsList recommendations: methods and results

The Geant4 physics_lists package encompasses predefined selections of physics processes and models to be used in simulation applications. Limited documentation is available in the literature about Geant4 pre-packaged PhysicsLists and their validation; the existing reports mainly concern specific use cases. This paper documents the epistemological grounds for the validation of Geant4 pre-packaged PhysicsLists (and their accessory classes, Builders and PhysicsConstructors) and some examples of the authors' scientific activity on this subject.


Introduction
Geant4 [1,2] PhysicsLists represent selections of physics processes and modelling options, among those available in the Geant4 toolkit, which are used in a simulation application. A collection of pre-packaged PhysicsLists (and their accessory classes, Builders and PhysicsConstructors) is distributed along with Geant4 source code for users' convenience; it is intended to facilitate the use of Geant4 functionality despite its intrinsic physics complexity.
Limited documentation is available in the literature regarding these Geant4 physics configuration tools, especially concerning the quantification of their accuracy, their computational performance and the stability of their results. The grounds for their assembly and the appraisal of their performance are mostly related to LHC experiments; assessments of their validity in other application environments are sparse. Quantitative estimates of the validity of the Geant4 physics models used in pre-packaged PhysicsLists and PhysicsConstructors are scarce in the literature. Comparisons of physics modeling components and simulated observables with experimental data often rest on qualitative appraisal, lacking objective quantification based on statistical methods.
The authors of this paper are involved in a broad-scoped, long-term scientific project concerning the validation of Geant4 physics capabilities, which exploits statistical methods for objective quantification of the results. The validation of Geant4 pre-packaged PhysicsLists and accessory classes (Builders, PhysicsConstructors) is logically pertinent to this programme. This paper reviews the methodological foundations of the validation of Geant4 pre-packaged PhysicsLists carried out in this context and illustrates some examples of results. More extensive documentation can be found in related publications produced by the authors' team or in preparation at the time when this contribution to the proceedings of CHEP 2015 was written.
We stress that the study of Geant4 physics validation is addressed as a scientific research project, which documents the statistical comparison of physics modelling functionality embedded in public versions of an open source code with respect to experimental data available in the literature: in doing this, there is no hidden intent to exalt, nor to denigrate any persons, institutes, computer codes, theoretical, phenomenological or empirical physics models.

Epistemological grounds
From the perspective of software development, simulation validation pertains to the discipline of testing. The validation of the physics capabilities of Monte Carlo simulation codes falls into the more general domain of software verification and validation. A brief summary of basic concepts is reported here to facilitate the comprehension of the following sections; it could also serve as a pedagogical reference for younger physicists, who often do not have access to appropriate training in the epistemology of physics simulation in the course of their university studies, although their experimental research activity usually involves directly or indirectly the use of simulation codes, which are nowadays an essential instrument in high energy and nuclear physics experiments.
The definition of these processes is established in IEEE Standard 1012 for System and Software Verification and Validation [3], which is related to other ISO (International Organization for Standardization) standards: ISO/IEC 15288 [4] and ISO/IEC 12207 [5] Standards. Consistent with the standards, simulation validation is the process of providing evidence that the software solves the right problem (e.g., correctly models physical laws, implements business rules, and uses the proper system assumptions), and satisfies intended use and user needs. It is distinct from the process of verification, which concerns the evaluation of conformity to requirements.
In the context of physics simulation, it is worthwhile to stress the difference between validation and calibration of the software [6], as these concepts are often confused. Calibration, also informally known as "tuning", is the process of improving the agreement of the outcome of the simulation with respect to a chosen set of benchmarks through the adjustment of parameters in the simulation model, while validation involves the comparison between the outcome of the simulation and independent experimental references.
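The distinction drawn above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the one-parameter toy model, the function names (model, calibrate, validate), the numerical values and the choice of a chi-squared test are hypothetical and do not represent any Geant4 model or any procedure used by the authors. The parameter is first tuned on a chosen set of benchmarks (calibration); the tuned model is then compared with independent measurements that played no role in the tuning (validation).

```python
import numpy as np
from scipy import stats


def model(x, a):
    """Toy simulation model with one adjustable parameter (hypothetical)."""
    return a * np.asarray(x, dtype=float)


def calibrate(x, y):
    """Calibration: tune the parameter by least squares on chosen benchmarks."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sum(x * x))


def validate(a, x, y, sigma):
    """Validation: chi-squared test against independent measurements.

    No parameter is fitted to these data, so the number of degrees of
    freedom equals the number of data points.
    """
    residuals = (np.asarray(y, dtype=float) - model(x, a)) / np.asarray(sigma, dtype=float)
    chi2 = float(np.sum(residuals ** 2))
    return chi2, float(stats.chi2.sf(chi2, df=len(residuals)))


# Hypothetical benchmark data used only for tuning the parameter.
a_tuned = calibrate([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])

# Hypothetical independent measurements, never used in the tuning step.
chi2, p_value = validate(a_tuned, [5, 6, 7], [10.1, 11.8, 14.2], sigma=0.3)
print(f"tuned parameter = {a_tuned:.3f}, chi2 = {chi2:.3f}, p-value = {p_value:.3f}")
```

Good agreement with the benchmarks used in calibrate says nothing about validity by itself; only the comparison performed in validate, on data independent from the tuning, addresses it.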
The foundation of the scientific method determines that simulation validation involves the comparison with experimental measurements. Comparisons of simulation models, of the outcome of simulation performed with different Monte Carlo codes or different modelling options of a Monte Carlo system (e.g. with different Geant4 PhysicsLists), do not qualify as simulation validation, although they could be interesting for other purposes.
The principles of the scientific method also define the quantitative and objective character of simulation validation. Qualitative, subjective visual appraisal of plots cannot be properly considered as a method to establish the validity of the software: at most, visual assessment can be a preliminary step in the course of the validation process. Established statistical methods exist that allow testing the hypothesis of compatibility between simulation and experiment on objective grounds. Two-sample goodness-of-fit tests are the most widely used statistical resources for this problem. While hypothesis testing is a well established branch of statistics, open issues are still present, some of which are specific to the problem domain of simulation validation.
A major open issue, which is still an area of active research, is the power of different goodness-of-fit tests that are documented in the literature (e.g. Kolmogorov-Smirnov [7,8], Anderson-Darling [9,10], Cramer-von Mises [11,12] etc.), and whether the relative power of these tests can be established in absolute terms or is related to the characteristics of the application scenarios. In the absence of definite conclusions on this subject, the use of a variety of goodness-of-fit tests in the course of the validation process of simulation models mitigates the risk of systematic effects in the assessment of validity drawn from the results of the tests, which could be due to peculiarities of their mathematical formulation. Another open issue, which is specific to the application of goodness-of-fit tests to the validation of simulation models, is the ability to calculate a quantitative uncertainty of the simulation data based on the p-value resulting from the tests. Statistical and systematic errors affecting to a variable extent the experimental data involved in the tests contribute to the complexity of estimating the uncertainties to be associated with model data.
Finally, a common question, once the validity of individual simulation models has been assessed, concerns the ability to quantify on objective grounds the relative merits of different simulation models regarding their validity.
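The practice of applying several goodness-of-fit tests to the same comparison can be sketched in Python with SciPy. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function name compare_samples, the 0.05 significance level and the input arrays are hypothetical, and the samples are treated as unbinned one-dimensional data without experimental uncertainties.

```python
import warnings

import numpy as np
from scipy import stats


def compare_samples(simulated, measured, alpha=0.05):
    """Apply three two-sample goodness-of-fit tests to the same pair of samples.

    Returns a dict mapping the test name to (p_value, compatible), where
    'compatible' means that the null hypothesis of a common parent
    distribution is not rejected at significance level alpha.
    """
    simulated = np.asarray(simulated, dtype=float)
    measured = np.asarray(measured, dtype=float)
    p_values = {}

    p_values["Kolmogorov-Smirnov"] = stats.ks_2samp(simulated, measured).pvalue
    p_values["Cramer-von Mises"] = stats.cramervonmises_2samp(simulated, measured).pvalue

    with warnings.catch_warnings():
        # SciPy warns when the Anderson-Darling p-value falls outside the
        # tabulated range and is therefore capped or floored.
        warnings.simplefilter("ignore")
        ad = stats.anderson_ksamp([simulated, measured])
    p_values["Anderson-Darling"] = ad.significance_level

    return {name: (float(p), bool(p >= alpha)) for name, p in p_values.items()}


# Hypothetical, deterministic samples used only to exercise the function.
reference = np.linspace(0.0, 1.0, 101)
for name, (p, ok) in compare_samples(reference, reference + 0.005).items():
    print(f"{name}: p-value = {p:.3f}, compatible = {ok}")
```

Reporting the outcome of all the tests, rather than that of a single one, makes any disagreement among them visible, which is the purpose of using a variety of tests.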
A methodology, first described and applied in [13], has been developed for this purpose: it consists of expressing the compatibility of the various simulation models with experiment, established by the outcome of goodness-of-fit tests, as categorical data, which are input to contingency tables, which in turn can be analyzed by means of appropriate statistical tests. Further refinements of this methodology, which take into account the character of independence, or possible dependence, of the categories subject to comparison, are described in [14].
Similarly to what was previously remarked concerning goodness-of-fit tests, the use of a variety of categorical tests mitigates the risk of systematic effects, possibly related to peculiarities of their mathematical formulation.
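The categorical analysis described above can be sketched as follows in Python. The sketch rests on assumptions: the text does not name the categorical tests, so Fisher's exact test and Pearson's chi-squared test are chosen here only as common examples applicable to independent categories; the function name compare_models and the counts of "pass" and "fail" outcomes are hypothetical and are not results from [13], [14] or [17].

```python
from scipy import stats


def compare_models(pass_a, fail_a, pass_b, fail_b, alpha=0.05):
    """Compare two simulation models through a 2x2 contingency table.

    Each row holds, for one model, the number of test cases in which a
    goodness-of-fit test did not reject ("pass") or rejected ("fail") the
    compatibility with experiment. Returns a dict mapping the test name to
    (p_value, significantly_different).
    """
    table = [[pass_a, fail_a], [pass_b, fail_b]]

    _, p_fisher = stats.fisher_exact(table)
    # Pearson chi-squared test without Yates' continuity correction; SciPy
    # raises ValueError if any expected frequency in the table is zero.
    _, p_chi2, _, _ = stats.chi2_contingency(table, correction=False)

    return {
        "Fisher exact": (float(p_fisher), bool(p_fisher < alpha)),
        "Pearson chi-squared": (float(p_chi2), bool(p_chi2 < alpha)),
    }


# Hypothetical counts: model A passes 90 of 100 test cases, model B 50 of 100.
for name, (p, different) in compare_models(90, 10, 50, 50).items():
    print(f"{name}: p-value = {p:.2e}, significantly different = {different}")
```

When the outcomes of the two models are not independent, the tests used here would not be appropriate; the refinements mentioned in [14] address that situation.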

What is validated?
In the context of the validation of the physics of Monte Carlo simulation, one can distinguish two main issues: the validation of elemental physics components of Monte Carlo transport codes (e.g. cross sections, atomic and nuclear parameters) and the validation of observables resulting from the execution of simulation (e.g. the energy deposited by particles in a given volume).
Observables produced by Monte Carlo simulation are usually the result of several physics processes. Their validation entails different epistemological aspects with respect to the validation of elemental physics modeling features, as it necessarily involves modelling an experimental scenario.

Validation of Geant4 elemental physics components
Conceptually, elemental components of physics models can be validated against experimental data independently from any specific application scenario, although in practice the extensive presence of dependencies in some Monte Carlo codes, often resulting from the evolution of the software over many years, in some cases prevents testing basic physics calculations in simple unit tests outside a full simulation application configuration: this issue is discussed in a dedicated paper of these conference proceedings [15].
Validation unrelated to any specific simulation configuration scenario allows establishing general conclusions: for instance, a cross section calculation based on QED (Quantum Electrodynamics) principles, which has been validated with respect to experimental data, retains its validity in any simulation application scenario where it is used (provided it is used within the range of energies, target materials etc. where its validity has been established).
It is worthwhile to note that the validation of Geant4 elemental physics components sets the grounds for the validation of complex observables deriving from the concurrent effects of multiple elemental components.
Quantitative, objective validation of Geant4 basic physics elements is scarcely represented in the literature, which is dominated by greater emphasis placed by the experimental community on the assessments of experimentally relevant quantities that are directly related to detector performance, while systematic evaluations of the physics foundations of general-purpose Monte Carlo simulation systems are usually performed by a small number of developers of these codes.

Validation of Geant4 pre-packaged PhysicsLists and accessory classes
Due to their intrinsic nature (an assembly of physics processes and models), Geant4 PhysicsLists can only be assessed over specific use cases, which in turn involve specific observables pertaining to the simulated experimental scenario. Validation of specific observables in specific simulation scenarios lacks generality: this implies that the validation of Geant4 pre-packaged PhysicsLists (and their accessory classes, Builders and PhysicsConstructors) cannot be established in absolute, general terms; rather, it can only rely on the body of knowledge derived from the use cases that are documented in the literature. This peculiarity of the validation of Geant4 pre-packaged PhysicsLists (and their accessory classes) determines the need to document their performance quantitatively over a large number of experimental use cases, as a reference for the experimental community in view of their use.
In this context, extensive, systematic assessments of relatively simple observables produced with Geant4 pre-packaged PhysicsLists (and their accessory classes) in a wide variety of experimental configurations are especially useful, as their findings provide valuable guidance for further applications.

An example of validation concerning a simple observable
The concepts and methods outlined in the previous sections have been applied to a systematic validation of Geant4 pre-packaged electromagnetic PhysicsConstructors regarding the simulation of the fraction of backscattered electrons. This observable, which is a manifestation of how electron multiple and single scattering are modelled in the simulation, is directly related to the spatial energy deposition pattern resulting from electron interactions with matter. An example of this relation is illustrated in Figure 1, which shows the energy deposited in a lead target produced by simulations involving different Geant4 multiple and single scattering models: Urban (in two configuration options), WentzelVI and Coulomb as well as two pre-packaged PhysicsConstructors (G4EmStandardPhysics and G4EmLivermorePhysics). One can observe significant differences across the various physics configurations of the simulation.
Figure 1. Energy deposited in a semi-infinite lead target (Pb, Z=82) by an electron beam, corresponding to the use of different Geant4 multiple and single scattering models (Urban, in two configuration options, WentzelVI and Coulomb) and of two pre-packaged electromagnetic PhysicsConstructors (Standard and Livermore). This plot was obtained using Geant4 version 9.6p03. (Vertical axis: deposited energy fraction; legend entries: Urban, UrbanBRF, WentzelBRF, Coulomb, EmLivermore, EmStd.)
The validation involved an extensive collection of experimental backscattering measurements from the literature, amounting to more than 3000 test cases in total, spanning a variety of target materials and electron energies. The corresponding experimental scenarios were reproduced in the simulation model; a sketch of a typical geometry configuration is shown in Figure 2. The simulation was configured with several pre-packaged Geant4 electromagnetic PhysicsConstructors: they are listed in Table 1 along with a brief identification of their characteristics as defined in Geant4 Application Developer's Guide [16]. Further details about these PhysicsConstructors can be found in the cited Geant4 user documentation.
Validation tests were performed over Geant4 versions 9.6-patch03, 10.0-patch03 and 10.1 to document the evolution of the performance of electromagnetic PhysicsConstructors over different Geant4 versions. The analysis was concerned not only with the behaviour of the pre-packaged PhysicsConstructor classes, but also with the individual physics model settings embedded in them. Rigorous statistical methods were applied in the comparison between simulation and experimental data.
Table 1. Pre-packaged Geant4 electromagnetic PhysicsConstructors used in the validation, with their characteristics as described in [16].
PhysicsConstructor              Description in [16]
G4EmLivermorePhysics            "models based on Livermore databases"
G4EmStandardPhysics             "default electromagnetic physics"
G4EmStandardPhysics_option1     "fast but less accurate electron transport", ApplyCuts
G4EmStandardPhysics_option2     "fast but less accurate electron transport"
G4EmStandardPhysics_option3     "for simulation with high accuracy"
G4EmStandardPhysics_option4     "combination of best EM models"
G4EmStandardPhysicsSS           -
G4EmStandardPhysicsWVI          -

The results of the validation tests are extensively documented in [17]. Page allocation constraints in the conference proceedings prevent entering the details of this extensive validation test; only one relevant issue is mentioned here, which was investigated on the basis of the results reported in [17]. This issue concerns the sensitivity of the outcome of the same physics configurations to the geometrical configuration of the simulation: it is illustrated in Figures 3 and 4, which were produced with the same electromagnetic PhysicsConstructors and the same Geant4 version 10.1, but with slightly different geometrical settings. Figure 3 corresponds to a geometrical configuration where a boundary surface is shared between the target and the detection system in the backward hemisphere, while Figure 4 corresponds to a geometrical configuration where the target is slightly separated from the backward detection system: this small displacement (1 pm) determines a negligible loss of geometrical acceptance in the detection of backscattered electrons. Significant differences are observed in the fraction of backscattered electrons. It is worthwhile to note that Geant4 built-in tests to identify malformed geometries did not detect any anomalies in either geometrical configuration. Further investigations were performed to ascertain the origin of the observed differences.
Verification with Geant4 built-in visualization tools demonstrates that the effect on electron backscattering is manifest as a basic physics feature of the simulation, rather than as a possible effect of algorithms registering "hits" in the detector. For instance, in a test involving the G4EmStandardPhysics_option3 PhysicsConstructor, Figure 5 shows the absence of backscattered electrons emerging from the target when a boundary surface is shared between the target and the backward detection system, while Figure 6 shows their presence when no boundary surface is shared. More detailed investigations and discussions are documented in a forthcoming dedicated publication; however, this example shows a concrete application of the epistemological statement in section 3.2, that the validation of PhysicsLists concerns specific observables in specific use cases.

Conclusions
The validation of Geant4 physics requires sound epistemological foundations and rigorous statistical methods, which are the grounds to provide objective guidance to the experimental community in the optimal configuration of simulation applications. This paper has briefly reviewed basic concepts and methods concerning the validation of the physics of Monte Carlo simulation codes. Two distinct aspects are involved in the validation of the physics of Monte Carlo simulation codes: the validation of elemental physics modelling features and of complex observables resulting from the concurrent contribution of various physics processes and models. The former can be addressed in a general way, independent from specific application scenarios. The validation of Geant4 pre-packaged PhysicsLists and accessory classes (Builders, PhysicsConstructors) concerns the latter, although it would benefit from the former: it involves specific observables and specific simulation scenarios. The body of knowledge deriving from extensive assessments of various observables in several scenarios would constitute a reference for the experimental community.
Our team is actively involved in the validation of Geant4 physics under different perspectives, including the development of sound validation methods and tools for its quantitative, objective evaluation. This activity has recently included a project concerning the evaluation of several pre-packaged electromagnetic PhysicsConstructors with respect to the observable represented by the electron backscattering fraction. In the course of this project various issues have emerged, which require further investigation.
Finally, it is worthwhile to draw the attention of the high energy physics community to the fact that the trust in Monte Carlo simulation codes is ultimately based on the availability of experimental data suitable for the validation of their physics models. The complexity of physics observables deriving from very high energy interactions in complex detectors is not always the optimal environment for the validation of the physics models embedded in Monte Carlo codes, which cover a wide range of energies and involve several particle types: their detailed assessment usually requires a fine appraisal of the observables they produce, which is beyond the scope of large scale experiments devoted to fundamental physics discoveries. A paradigm shift is needed in the experimental community [18] to appreciate the need of experimental measurements explicitly performed for the validation of Monte Carlo codes and to provide the necessary support for dedicated code-validation projects.