CaloClouds: fast geometry-independent highly-granular calorimeter simulation

Simulating showers of particles in highly-granular detectors is a key frontier in the application of machine learning to particle physics. Achieving high accuracy and speed with generative machine learning models would enable them to augment traditional simulations and alleviate a major computing constraint. This work achieves a major breakthrough in this task by, for the first time, directly generating a point cloud of a few thousand space points with energy depositions in the detector in 3D space without relying on a fixed-grid structure. This is made possible by two key innovations: i) Using recent improvements in generative modeling, we apply a diffusion model to generate photon showers as high-cardinality point clouds. ii) These point clouds of up to 6,000 space points are largely geometry-independent, as they are down-sampled from initial even higher-resolution point clouds of up to 40,000 so-called Geant4 steps. We showcase the performance of this approach using the specific example of simulating photon showers in the planned electromagnetic calorimeter of the International Large Detector (ILD) and achieve overall good modeling of physically relevant distributions.


Introduction
Large-scale experiments in high energy physics (HEP) have one unifying feature: the need to record and analyze an ever-increasing amount of data, produced primarily through higher collider luminosities and higher-granularity detectors with growing numbers of readout channels. The reduced experimental uncertainty that comes with this increased amount of higher-resolution measurement data presents one of the most promising avenues to increase the precision of measurements and the sensitivity of searches for new physical phenomena beyond the Standard Model.
In order to compare experimental measurement and theoretical prediction, modern HEP experiments rely on high-precision Monte Carlo (MC) simulations. These simulations model collider events from the initial hard scattering, through hadronization, to the detailed responses of the tracking, calorimeter, and muon systems of the detector. Such detailed modeling is a highly resource-intensive process. Further, for subsequent comparison to collision data to be valid, the number of simulated events has to at least equal the number of recorded collisions. The combination of these two factors means that MC simulation alone places a significant strain on the available HEP computing resources, which, for future high-luminosity experiments, is only bound to increase further [1,2].
The most notable bottleneck within the MC simulation chain is the step that has to deal with a large number of particles interacting with complex detectors, namely shower simulation, as the modeling of all involved particles is inherently time-consuming [3]. This provides a strong incentive to find new ways of speeding up MC simulations. One of the most promising current fast simulation approaches is provided by generative machine learning (ML) models. These models can learn an underlying distribution from a given data set and can subsequently be used to generate new data from the learned distribution, allowing them to produce simulated data potentially orders of magnitude faster than classical simulations.
Previous generative calorimeter models were influenced by machine learning models used in computer vision for image generation and therefore focused on fixed geometrical structures, either in the form of 3-dimensional tensors that mirror the design of a calorimeter or in the form of an ordered list that connects one output to one calorimeter cell. Within the fixed-structure framework, applications started out using comparatively simplistic calorimeter geometries and gradually moved to higher and higher detector granularity. This directly follows the advances in calorimeter development, which similarly aim for increasingly granular detectors [26,27].
It is, however, this move to higher granularity that presents the largest challenge for fixed-structure generative models, as higher data dimensionality entails more resource-intensive generation. Most notably, the increased sparsity of highly granular detector data means a large fraction of the model evaluation is wasted on empty cells. In an ideal scenario, this wasted computation could be avoided and only the non-empty sections of the calorimeter would be simulated.
To overcome this issue, we introduce CaloClouds, a generative shower simulation approach that does not rely on a fixed structure, but instead generates geometry-independent point clouds. This approach removes the need to simulate empty detector regions. Further, the produced point clouds can be directly projected into an arbitrary detector geometry, thereby allowing for effortless translation invariance and making the handling of e.g. non-regular grids or hexagonal structures straightforward.
Point cloud-based generative models have previously been explored in HEP, most notably for fast hadronization simulation [24,28,29]; however, this work is the first to successfully expand this technique to handle the significantly larger point clouds of several thousand space points required for realistic calorimeter simulations.
In the following, we describe the data in Section 2, introduce the model in Section 3, show results in Section 4, and offer conclusions in Section 5.

Data Samples
We investigate the performance of our generative point cloud model with a purpose-built dataset that describes the energy depositions of photon showers in the electromagnetic calorimeter (ECAL) of the International Large Detector (ILD) [30]. The calorimeter training set consists of 524k photon showers, with incident energy uniformly distributed between 10 GeV and 90 GeV, simulated in the ECAL of the ILD. The ILD ECAL is a highly granular sampling calorimeter, consisting of 30 layers with alternating passive tungsten absorbers and active silicon sensors. The first (as encountered by the shower) 20 absorber layers have a thickness of 2.1 mm, while the last 10 absorbers have a thickness of 4.2 mm. Every silicon layer has a thickness of 0.5 mm and is subdivided into 5 mm × 5 mm cells, which are read out individually. The showers are simulated using Geant4 version 10.4 (with the QGSP_BERT physics list) in a detailed simulation model of the ILD detector implemented in the iLCSoft [31] framework. In the case of the ILD ECAL, the model implemented in DD4hep [32] includes realistic gaps between silicon sensors and staggering between layers, such that the cells form an irregular (position-dependent) grid in 3D space. In order to describe the showers, we introduce two coordinate systems: a compact area near the photon's impact position within the calorimeter is described by the local coordinates $[x, y, z]$, with $x$ and $y$ pointing parallel to the orientation of the calorimeter layers and $z$ pointing perpendicular to the calorimeter layers into the detector along the direction of the shower propagation; the coordinates $[x', y', z']$ constitute the global coordinate system used in ILD, with $z'$ pointing parallel to the beam pipe, $x'$ lying in the horizontal plane, and $y'$ pointing vertically upwards.
The simulated photons are produced at $[x' = 0,\ y' = 1811.3\,\mathrm{mm},\ z' = 40\,\mathrm{mm}]$ with a trajectory along $y'$, i.e. they are created right at the front of the calorimeter in order to avoid interactions before entering the calorimeter, as well as being positioned so as to avoid cracks in the detector.
During the full simulation with Geant4, a very large number of individual energy depositions (on average 20,000 per shower) from secondary particles traversing the sensors is created in the sensitive materials and then added up in the actual calorimeter cells of 5 mm × 5 mm transversal size. While in principle these steps (following the standard Geant4 [33] terminology) could be used as input to the point cloud model, their number is prohibitively large, and therefore a pre-clustering procedure is applied before the actual training. However, this clustering still needs to be more granular than the physical cell size to allow a shower projection into any part of the calorimeter (except changing its depth) without introducing reconstruction artefacts due to cell staggering and gaps.
For this pre-clustering procedure, the steps are grouped by their layer number and then projected into a regular grid with a 36 times higher granularity than that of the actual calorimeter, i.e. into grid cells of 0.83 mm × 0.83 mm transversal size, thereby reducing the total number of space points by a factor of about 7. The effects of this clustering are largely negligible when considering complete showers, as is shown in detail in Appendix A. Note that only active steps are simulated, maintaining the main advantage of point cloud-based simulation.
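The pre-clustering described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the helper name `pre_cluster` and the `(x, y, layer, energy)` tuple format are assumptions for the example.

```python
import math
from collections import defaultdict

def pre_cluster(steps, cell_mm=5.0, upsample=6):
    """Merge Geant4 steps layer by layer on a virtual grid whose pitch is
    cell_mm / upsample = 0.83 mm, i.e. 36x finer in area than the physical
    5 mm x 5 mm calorimeter cells. Energies of steps falling into the same
    fine grid cell are summed; each cluster sits at its fine-cell centre.

    steps: iterable of (x_mm, y_mm, layer, energy) tuples (assumed format).
    """
    pitch = cell_mm / upsample  # 0.83 mm
    acc = defaultdict(float)
    for x, y, layer, energy in steps:
        # integer indices identify the fine cell a step falls into
        acc[(int(layer), math.floor(x / pitch), math.floor(y / pitch))] += energy
    # one cluster per occupied fine cell, total energy preserved
    return [((ix + 0.5) * pitch, (iy + 0.5) * pitch, layer, e)
            for (layer, ix, iy), e in sorted(acc.items())]
```

Since roughly 20,000 steps per shower are merged into a few thousand clusters, the deposited energy is conserved exactly while the point count drops by the quoted factor of about 7.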
In order to normalize the range of inputs to the model, we define a bounding box around the showers. This box extends from −200 mm to 200 mm in $x$ and $y$ around the fixed impact point and spans across all layers, making it around seven times more voluminous than what was used in previous ILD ECAL photon shower data sets [16]. Any clusters that still end up outside the bounding box (less than 4 per mille) are discarded. Finally, as there can be no recorded hits in the absorber layers, the data is transformed such that the absorber regions are removed and the active layers become contiguous. Furthermore, the $z$ positions within the active layers are uniformly smeared within a layer to produce data that is smooth in $z$, whereas in the ILD simulation the energy is deposited, for purely technical reasons, predominantly at the center of a layer. The cluster positions are then normalized such that the opposing boundaries of the bounding box correspond to a position of −1 and 1 respectively. Figure 1 presents an overview of the data preprocessing pipeline. In addition to the training data, two validation data sets and one test set were simulated. The two validation data sets were simulated at the same position as the training data, the first containing 40k samples with energies uniformly distributed between 10 GeV and 90 GeV, and the second consisting of fixed-energy photon showers with incident particle energies of 10 GeV, 50 GeV, and 90 GeV. Each set with fixed energy contains 2k showers, totaling 6k data samples. Finally, the test set contains 2k samples with a fixed incident particle energy of 50 GeV and differs in the position at which the photons were simulated.

Finally, Table 1 provides an overview of the types of point clouds considered in this work.

Model
Point cloud generative models take a given number of random noise points and transform them into a point cloud of the desired shape (potentially with additional point features, as in our case). This makes producing clouds with a variable number of points inherently challenging. To address this cardinality issue, we employ a two-step approach during model training and a three-step approach during sampling, drawing inspiration from Ref. [34]. In the following, we first briefly outline the overall structure, with additional details provided in the corresponding subsections. An illustration of the training and sampling procedure is shown in Figure 2.
During training, the original data is first encoded with a Variational Autoencoder (VAE)-like EPiC Encoder, conditioned on the incident particle energy and the number of points, into a near-Gaussian distributed latent space. Second, this latent space, together with the incident energy and number of points, is used in a conditional point cloud diffusion model termed PointWise Net.
During sampling, the encoded latent space is generated with a conditional Latent Flow model. Since this Latent Flow needs to be conditioned on the incident energy and the number of points, a second flow, the Shower Flow, is employed during sampling to generate an appropriate number of points from a requested incident energy. This way, the only conditioning variable for the whole model is the particle incident energy $E$. Additionally, the Shower Flow generates the total visible energy of the calorimeter point cloud $E_{\mathrm{sum}}$ as well as the number of points per layer $N_l$, for a post-diffusion calibration of the generated point cloud.
In Section 3.1 we first discuss the PointWise Net at the core of the overall proposed CaloClouds architecture. Then, in Section 3.2, we introduce the EPiC Encoder used during training and the Latent Flow used to perform sampling. Finally, in Section 3.3 we outline the Shower Flow used to generate the conditioning variables as well as calibration factors.

Diffusion Point Cloud Generator
The main part of the CaloClouds architecture is the PointWise Net, which is trained as a diffusion probabilistic model and was introduced in Ref. [34]. In general, diffusion models [35] learn a Markov chain of Gaussian distortions that over $T$ time steps diffuse data to noise. Here the data is a point cloud $X^{(0)} = \{x_i^{(0)}\}_{i=1}^{N}$ comprised of $N$ points, with the superscript in parentheses denoting the time step.
During training, a forward diffusion process for every point is modeled as a Markov chain:
$$ q\big(x_i^{(t)} \mid x_i^{(t-1)}\big) = \mathcal{N}\big(x_i^{(t)};\ \sqrt{1-\beta_t}\, x_i^{(t-1)},\ \beta_t I\big) $$
for time steps $t \in 1, \dots, T$, with $\mathcal{N}$ denoting the normal distribution and the variance schedule $\beta_1 \cdots \beta_T$ controlling the diffusion process. To generate realistic point clouds from a simple noise distribution, the reverse diffusion process starting from noise $p\big(x_i^{(T)}\big) = \mathcal{N}(0, I)$ is defined as:
$$ p_\theta\big(x_i^{(t-1)} \mid x_i^{(t)}, z\big) = \mathcal{N}\big(x_i^{(t-1)};\ \mu_\theta(x_i^{(t)}, t, z),\ \beta_t I\big), $$
where the estimated mean $\mu_\theta$ is modeled by a neural network with parameters $\theta$ and the generation process is steered by a conditioning latent vector $z$. This latent variable is crucial as the generated point cloud needs to have consistent global properties, i.e. realistic energy and center of gravity. As all initial noise points $x_i^{(T)} \in X^{(T)}$ are sampled identically and independently distributed (i.i.d.), the latent $z$ is the only source of shared information for the generated point cloud $X^{(0)}$.
For practical training, Ref. [36] showed that $x_i^{(t)}$ can be sampled directly from the data $x_i^{(0)}$ without going through all previous steps:
$$ q\big(x_i^{(t)} \mid x_i^{(0)}\big) = \mathcal{N}\big(x_i^{(t)};\ \sqrt{\bar{\alpha}_t}\, x_i^{(0)},\ (1-\bar{\alpha}_t) I\big), $$
with the notation $\alpha_t := 1 - \beta_t$ and $\bar{\alpha}_t := \prod_{s=1}^{t} \alpha_s$. This allows parameterising $x_i^{(t)}$ as
$$ x_i^{(t)} = \sqrt{\bar{\alpha}_t}\, x_i^{(0)} + \sqrt{1-\bar{\alpha}_t}\, \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I). $$
The model approximating the mean $\mu_\theta$ can be parameterized as:
$$ \mu_\theta\big(x_i^{(t)}, t, z\big) = \frac{1}{\sqrt{\alpha_t}} \left( x_i^{(t)} - \frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\, \epsilon_\theta\big(x_i^{(t)}, t, z\big) \right), $$
where $\epsilon_\theta\big(x_i^{(t)}, t, z\big)$ is a neural network which predicts $\epsilon$ from $x_i^{(t)}$. Hence, the model $\mu_\theta$ is rephrased as a noise predictor $\epsilon_\theta$, which can be optimized with a simple L2 loss:
$$ L_i^{(t)} = \big\lVert \epsilon - \epsilon_\theta\big(x_i^{(t)}, t, z\big) \big\rVert^2. $$
Therefore the diffusion loss for the full point cloud at time step $t$ is given by the average of $L_i^{(t)}$ over all $N$ points. For efficient training, a random time step $t \in 1, \dots, T$ is chosen at each optimization step. A full derivation of the PointWise Net and its training objective can be found in Ref. [34]. We optimize our model with a total of $T = 100$ time steps and a quadratic variance scheduler between $\beta_T = 0.02$ and $\beta_1 = 10^{-4}$.
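The variance schedule and the closed-form forward noising can be sketched numerically as below. This is a hedged illustration: it assumes the common "quadratic" convention of squaring a linear ramp in $\sqrt{\beta}$, and the L2 objective is written against a known noise sample rather than a trained network.

```python
import numpy as np

# Quadratic variance schedule between beta_1 = 1e-4 and beta_T = 0.02
# (T = 100 as in the paper); assumed convention: square a linear ramp
# in sqrt(beta).
T = 100
betas = np.linspace(1e-4 ** 0.5, 0.02 ** 0.5, T) ** 2
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # abar_t = prod_{s<=t} (1 - beta_s)

def noisy_sample(x0, t, eps):
    """Closed-form forward process: x^(t) = sqrt(abar_t) x^(0)
    + sqrt(1 - abar_t) * eps, with t running from 1 to T."""
    ab = alpha_bars[t - 1]
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps

def diffusion_loss(eps_pred, eps):
    """Simple L2 objective ||eps - eps_theta(x^(t), t, z)||^2,
    averaged over the point cloud."""
    return np.mean((eps - eps_pred) ** 2)
```

At each optimization step one would draw a random $t$, noise the clean cloud with `noisy_sample`, and regress the network output onto the drawn `eps` via `diffusion_loss`.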
The neural network $\epsilon_\theta$ is the same for each point $x_i^{(t)} \in X^{(t)}$, therefore we can sample permutation-equivariant point clouds with a variable number of points $N$. Output and input dimensions of the network need to be identical, in our case four (three spatial dimensions and energy). We utilize the same PointWise Net architecture as used in Ref. [34]. An overview of the architecture and the layer structure is shown in Figure 3. It consists of multiple ConcatSquash layers [37] which are conditioned on a global context vector, in this case a concatenated vector of the incident particle energy $E$, the number of points $N$, the time features, and the latent vector $z$. The ConcatSquash layers are essentially two-layer fully connected networks, with the first hidden layer aligning the dimensionality of the context conditioning vector with the hidden dimensionality of the layer input. A LeakyReLU activation follows the second hidden layer.
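The gating pattern of a ConcatSquash-style layer can be sketched as follows. This is a minimal numpy illustration with random, untrained weights; the paper's actual implementation follows Refs. [34,37] in PyTorch, so the class name and weight shapes here are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

class ConcatSquash:
    """ConcatSquash-style layer: a linear map of each per-point input,
    gated and shifted by two linear maps of the shared global context
    vector (incident energy, number of points, time features, latent z).
    Weights are random here; a real implementation would train them."""
    def __init__(self, d_in, d_out, d_ctx):
        self.W = rng.normal(0.0, 0.1, (d_in, d_out))
        self.b = np.zeros(d_out)
        self.Wg = rng.normal(0.0, 0.1, (d_ctx, d_out))  # context gate
        self.Ws = rng.normal(0.0, 0.1, (d_ctx, d_out))  # context shift

    def __call__(self, x, ctx):
        # x: (N, d_in) points; ctx: (d_ctx,) shared by all points
        gate = sigmoid(ctx @ self.Wg)
        shift = ctx @ self.Ws
        return (x @ self.W + self.b) * gate + shift
```

Because the same weights act on every point and the context enters only as a per-feature gate and shift, permuting the input points simply permutes the output rows, which is the permutation equivariance mentioned above.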
This architecture allows each point to be sampled independently of all other points of the point cloud. Note that the assumption of i.i.d. sampling of each point of the calorimeter shower is theoretically not optimal, as in reality the photon shower cascade develops through particle interactions. However, due to the simple topological structure of electromagnetic showers, we assume the points to be i.i.d. and achieve encouraging generative fidelity with this approach. Since we do not model any particle interactions, the point cloud generation scales O(N) with the number of points, which is crucial since we generate several thousand points per shower. We performed a few early experiments with the equivariant point cloud (EPiC) layers introduced in Ref. [28], as they allow for point interactions with linear computational scaling, but did not observe improved performance. This is likely due to the EPiC layers being optimised for much smaller point clouds of O(10) cardinality. Graph network and transformer approaches for point cloud generation such as Refs. [29,38-42] generally scale O(N^2) and are therefore significantly slower than our model. However, fast implementations of attention, graph networks or sequence convolutions [43-46] might be interesting avenues to explore in the future, e.g. for generating hadronic showers including nuclear interactions.
Figure 4 shows a visualization of the calorimeter point cloud generation at different stages of the reverse diffusion process (i.e. starting with noise).

Encoder & Latent Flow
The PointWise Net generates point clouds via the reverse diffusion process. For the resulting point clouds to be realistic, PointWise Net needs to be conditioned on additional quantities beyond the multiplicity $N$ and the shower energy $E$.
In principle, one could add additional physically relevant quantities such as the total visible energy, the center of gravity, or the shower start as explicit conditioning features. However, such a choice of observables might bias the generated showers. Instead, we opt for learning an additional global context vector $z$ to capture any other relevant distributions via an additional encoder.
This encoding is learned by an Equivariant Point Cloud (EPiC) Encoder using three EPiC layers introduced in Ref. [28] with a hidden dimensionality of 128. The EPiC Encoder is conditioned on $E$ and $N$ and learns to encode the original Geant4 point cloud into two latent space vectors $\mu$ and $\sigma$. Similar to the encoder in a VAE, $\mu$ and $\sigma$ are regularised towards a Gaussian distribution with the Kullback-Leibler divergence (KLD) loss and the latent space $z$ is sampled with the reparametrization trick [47]. The KLD loss is given by:
$$ L_{\mathrm{KLD}} = D_{\mathrm{KL}}\big( \mathcal{N}(\mu, \sigma^2) \,\big\Vert\, \mathcal{N}(0, I) \big), $$
with the latent variables sampled via $z \sim Z = \mathcal{N}(\mu, \sigma^2)$. We set the size of $z$ to 256, the default in Ref. [34], without performing a hyperparameter optimisation.
During training, the EPiC Encoder and the point cloud generator are trained in parallel using a combination of the KLD loss and the diffusion reconstruction loss. At a randomly sampled time step $t$ this results in the total loss function:
$$ L = L^{(t)} + \lambda_{\mathrm{KLD}} \cdot L_{\mathrm{KLD}}, $$
with the KLD weight $\lambda_{\mathrm{KLD}} = 10^{-3}$. To prevent a posterior collapse of $z$, we enforced a minimum KLD loss of $L_{\mathrm{KLD}} = 1.0$. The EPiC Encoder and the PointWise Net were implemented using PyTorch [48] and trained together for a total of 800k iterations using the Adam [49] optimizer with a learning rate scheduled between $10^{-3}$ and $10^{-4}$.
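The reparametrization trick and the weighted KLD term can be sketched as below. This is an illustrative numpy version; in particular, implementing the "minimum KLD loss" as a floor on the KLD term is an assumption about how the posterior-collapse guard is realised.

```python
import numpy as np

def reparametrize(mu, sigma, rng):
    """Sample z ~ N(mu, sigma^2) differentiably as mu + sigma * eps,
    with eps ~ N(0, 1)."""
    return mu + sigma * rng.standard_normal(mu.shape)

def kld(mu, sigma):
    """KL divergence of N(mu, sigma^2) from the standard normal,
    summed over the latent dimensions."""
    return 0.5 * np.sum(mu ** 2 + sigma ** 2 - 1.0 - np.log(sigma ** 2))

def total_loss(diff_loss, mu, sigma, lam=1e-3, free_bits=1.0):
    """Diffusion reconstruction loss plus weighted KLD; the KLD term is
    floored at free_bits (assumed interpretation of the minimum KLD
    loss of 1.0) so the encoder is not pushed into posterior collapse."""
    return diff_loss + lam * max(kld(mu, sigma), free_bits)
```

With a perfectly Gaussian posterior (`mu = 0`, `sigma = 1`) the KLD vanishes and the floor keeps a constant contribution of `lam * free_bits`, removing any gradient incentive to collapse the latent further.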
During sampling, we generate the encoded latent space $z$ with a Normalizing Flow model [50] termed Latent Flow. This Latent Flow model is conditioned on $E$ and $N$ and consists of ten coupling blocks with monotonic rational-quadratic splines [51], each with two layers, a hidden dimensionality of 128, and LeakyReLU activations. The Latent Flow is trained simultaneously with the EPiC Encoder and the diffusion model via the negative log-likelihood loss, but with a separate Adam optimizer and learning rate scheduler. It was implemented using the nflows package [52].
A similar encoding and latent flow strategy is implemented in Ref. [34]; however, we choose a more advanced encoder and flow architecture, and we disentangle the encoder and diffusion loss from the flow training loss to achieve a more flexible optimization regime.

Shower Flow & Post-Diffusion Calibration
To estimate the number of points $N$ in a point cloud, i.e. $N_{\mathrm{gen}}$ for a requested incident energy $E$, we employ a separately trained normalizing flow model, called Shower Flow. This Shower Flow is also trained to generate the showers' total visible energy $E_{\mathrm{sum}}$ and the number of points per calorimeter layer $N_l$ for the purpose of post-diffusion calibration. For consistency, the total number of points is defined by $N_{\mathrm{gen}} = \sum_{l=0}^{29} N_l$. The Shower Flow consists of ten blocks with seven coupling layers each. Of these, six are based on affine transformations [53] and one is based on element-wise rational splines [51]. Each coupling layer is conditioned on the incident particle energy $E$. This flow was implemented with the Pyro package [54].
While the number of points $N_{\mathrm{gen}}$ generated by the Shower Flow describes the corresponding Geant4 distribution $N_{\mathrm{data}}$ well, the number of calorimeter hits, i.e. the occupancy $O_{\mathrm{gen}}$, of the projected point cloud does not necessarily match the corresponding Geant4 distribution $O_{\mathrm{data}}$. This occurs because the projection of the space points into the cells of the detector grid has two possible failure modes: either too many points are projected into the same calorimeter cell, so that the number of hits $O_{\mathrm{gen}}$ is lower than the expected $O_{\mathrm{data}}$, or too few points are projected together, so that $O_{\mathrm{gen}}$ overestimates $O_{\mathrm{data}}$. For our model, we observed the latter case, since for a given shower $O_{\mathrm{gen}}$ is up to 10% larger than $O_{\mathrm{data}}$. To resolve this mismatch, we introduced a correction factor $c(N) = F_{\mathrm{gen}}(F_{\mathrm{data}}(N))/N$. Here, $F_{\mathrm{data}}$ represents a cubic polynomial fit of $O_{\mathrm{data}}$ to $N_{\mathrm{data}}$, while $F_{\mathrm{gen}}$ corresponds to a cubic polynomial fit of $N_{\mathrm{gen}}$ to $O_{\mathrm{gen}}$. This correction factor is used to rescale the $N_{\mathrm{gen}}$ predicted by the Shower Flow as well as the predicted points per layer $N_l$, resulting in a calibrated number of points $N_{\mathrm{cal}} = c \cdot N_{\mathrm{gen}}$ and $N_{l,\mathrm{cal}} = c \cdot N_l$. The calibrated number of points is then used for sampling from the diffusion model.
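The occupancy calibration above can be sketched with two cubic polynomial fits. This is an illustrative reconstruction under the stated interpretation: one fit maps a point count to the expected Geant4 occupancy, the inverse-direction fit maps an occupancy back to a generated point count; the helper name `occupancy_correction` is hypothetical.

```python
import numpy as np

def occupancy_correction(n_data, occ_data, n_gen, occ_gen):
    """Return the correction factor c(N) = F_gen(F_data(N)) / N.

    F_data: cubic fit of the Geant4 occupancy O_data vs. point count
            N_data (occupancy expected for a given number of points).
    F_gen:  cubic fit of the generated point count N_gen vs. occupancy
            O_gen (points needed to produce a given occupancy)."""
    f_data = np.polynomial.Polynomial.fit(n_data, occ_data, deg=3)
    f_gen = np.polynomial.Polynomial.fit(occ_gen, n_gen, deg=3)
    def c(n):
        return f_gen(f_data(n)) / n
    return c
```

For example, if the generated clouds produce a 10% higher occupancy than Geant4 at every point count, the factor comes out as roughly 1/1.1, i.e. the Shower Flow prediction is scaled down so that the projected occupancy matches the data.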
Finally, we consider an additional rescaling of the point features following the diffusion model's generation. The post-diffusion calibration is performed in four steps. First, any generated negative point energies are set to zero. Second, the total energy of the point cloud is rescaled to match the predicted $E_{\mathrm{sum}}$. Third, all points are ordered based on their Z-coordinate, and then iteratively the first $N_{l=0}$ points are assigned to layer index $l = 0$, the next $N_{l=1}$ points to layer index $l = 1$, and so on until layer index $l = 29$. This way, the total energy and the energy distribution in the Z-direction are well calibrated; both distributions are otherwise challenging to model precisely with the diffusion model alone. Fourth, a calibration of the center of gravity in the X- and Y-directions is performed by shifting all X- and Y-coordinates by the mean center of gravity of the training data set.
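The four calibration steps can be sketched as a single function. This is a minimal sketch of the described procedure, assuming a `(x, y, z, energy)` column layout and a non-empty cloud with positive total energy after clipping; the function name and argument names are illustrative.

```python
import numpy as np

def calibrate(points, e_sum, n_per_layer, cog_shift):
    """Post-diffusion calibration of one generated cloud.
    points:      (N, 4) array with columns (x, y, z, energy)
    e_sum:       total visible energy predicted by the Shower Flow
    n_per_layer: length-30 array of points per layer, summing to N
    cog_shift:   (dx, dy) shift matching the training-set centre of gravity
    """
    pts = points.copy()
    # 1) clip negative point energies to zero
    pts[:, 3] = np.clip(pts[:, 3], 0.0, None)
    # 2) rescale the total energy to the Shower Flow prediction
    pts[:, 3] *= e_sum / pts[:, 3].sum()
    # 3) order points by z; the first n_per_layer[0] points get layer
    #    index 0, the next n_per_layer[1] get index 1, and so on
    order = np.argsort(pts[:, 2])
    layer_idx = np.repeat(np.arange(len(n_per_layer)), n_per_layer)
    pts[order, 2] = layer_idx
    # 4) shift x and y to calibrate the centre of gravity
    pts[:, 0] += cog_shift[0]
    pts[:, 1] += cog_shift[1]
    return pts
```

Step 3 only relabels the longitudinal ordering, so the relative depth ranking of the generated points is preserved while the per-layer multiplicities exactly match the Shower Flow prediction.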
Overall the CaloClouds model together with these calibrations achieves high fidelity on a number of important shower physics observables as presented in the following section.

Results
In the following Section 4.1, we compare the distributions produced by the CaloClouds model to the Geant4 ground truth for several per-shower variables and global observables. For this, the space points generated with CaloClouds are projected back into the real and irregular grid-like geometry of the cells in the ILD detector model, in the same manner as in the Geant4 simulation. Section 4.2 then demonstrates the model's ability to generate showers originating from an arbitrary incident point of the particle along the detector surface. Finally, in Section 4.3 we conclude with an investigation of our model's speed-up in comparison to the Geant4 simulation.

Physics Performance
First, we consider how well the distribution of measured energies across a broad range of incident energies is described after projecting back to the physical coordinate system. Figure 5 (left) shows the per-cell energy distribution. Overall, this is well described by the diffusion model, with the largest differences occurring in regions of the spectrum where the slope changes (i.e. at 0.5 MeV and 100 MeV). Here, the shaded region indicates energies below 0.1 MeV, corresponding to half the energy deposited by a minimally ionizing particle (MIP). As these energies are dominated by detector noise, they are ignored in the reconstruction process and therefore are also not considered when calculating other quantities.
The radial shower profile (Figure 5 (center)) and the longitudinal shower profile (Figure 5 (right)) are overall well-learned by the diffusion model with some differences observed at the edges of the distribution.
Next, we consider the modeling of the center of the shower in Figure 6. For this, the center of gravity (i.e. the energy-weighted centroid) is calculated per shower along the X (left), Y (center), and Z (right) directions. The radial directions X and Y are well described, with the diffusion model predicting a slightly more peaked distribution in both cases. In the longitudinal direction, the shape is captured correctly by the diffusion model as well, with the peak shifted by approximately 10 mm compared to Geant4.
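The per-shower observable compared above is just the energy-weighted centroid, which can be computed as follows (a small illustrative helper, assuming the `(x, y, z, energy)` column convention used in the earlier sketches):

```python
import numpy as np

def center_of_gravity(points):
    """Energy-weighted centroid of a shower along X, Y, Z.
    points: (N, 4) array with columns (x, y, z, energy)."""
    w = points[:, 3]
    return (points[:, :3] * w[:, None]).sum(axis=0) / w.sum()
```

Evaluating this per generated and per simulated shower and histogramming each component yields the three panels of Figure 6.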
To test whether the response for different shower energies is learned correctly, we consider the visible energy (left) and the overall number of hits above threshold (right) in Figure 7. The distributions are shown for incident energies of 10, 50, and 90 GeV. At all incident energies, the visible energy distributions are very well modeled, an important feature for a calorimeter simulation, as it is the width of these distributions that directly determines the energy resolution of the detector. The distributions for the number of hits above threshold are modeled well, with some deviations visible at higher energies.

Shower Translation
To test the capability of the CaloClouds architecture to generate photon showers originating from arbitrary incident points on the detector, we compare the model using the two different data sets introduced in Section 2: a validation data set that has been simulated at the same position in the calorimeter as the training data set, and a test data set simulated at a different position in the calorimeter. Depending on the exact impact point in the calorimeter, the local cell geometry will look different, for example due to the local staggering of cells between layers, the position of the impact point with respect to the cell centers, gaps in the cell structure at the edge of silicon sensors, etc. The distribution that is most susceptible to potential artifacts resulting from such a translation is the cell energy distribution. This distribution is shown in Figure 8 for the validation (left) and test data set (right) respectively, where in the case of the test data set the generated point cloud has been translated to the new impact position and then projected into the local cell geometry used for the full simulation with Geant4. Both comparisons show equally reasonable agreement between Geant4 and CaloClouds with no visible deterioration for the translated point cloud.

Timing
The main motivation for simulation using generative models is speeding up the generation time per shower. In Table 2, we present the generation time of CaloClouds in comparison with the baseline Geant4 simulation on either a single CPU core or on a machine equipped with an NVIDIA A100 GPU. The Geant4 baseline time is taken from Ref. [16], which was obtained with an identical detector model and sub-detector to those studied here. In all cases, we consider uniform energy distributions with incident particle energies between 10 and 100 GeV. Note that the range of 10 - 100 GeV for the timing studies is chosen for consistency with Ref. [16]; while the model was trained on energies between 10 and 90 GeV, since it is modeling the data well, we expect the timing performance to be robust also in extrapolation.

Table 2 compares the generation time of CaloClouds with the Geant4 simulator on a single core of an Intel Xeon CPU E5-2640 v4 (CPU) and an NVIDIA A100 with 40 GB of memory (GPU). Showers were generated with incident energy uniformly distributed between 10 and 100 GeV. The batch size was set to 1 on the CPU and to 64 on the GPU. Values presented are the means and standard deviations over 25 runs. The Geant4 time is taken from Ref. [16].
On a single CPU core, the speed-up provided by CaloClouds is about 1.2× compared to Geant4, whereas on a GPU the relative speed-up increases to 107×.
When sampling, the Shower Flow is evaluated once for all events, followed by batch-wise sampling from the Latent Flow and the PointWise Net. Unlike previous image-based fast simulation models such as Refs. [16,18,21], the computational cost of the PointWise Net scales O(N) with the number of points, and hence with the energy $E$, just like Geant4. Therefore, to improve overall training and sampling speed, we batch together events with a similar number of points. Overall, the model is trained for 800k iterations on an NVIDIA A100, which takes about 80 hours.
Our implementation of PointWise Net uses a 100-step reverse diffusion process for shower generation. This reverse diffusion process is implemented as a sequential loop and can therefore not be parallelized. Recent developments have investigated how few-step or even single-step generation with diffusion models might be possible, e.g. using consistency distillation [55] or progressive distillation [56], which have already been applied in high-energy physics [23]. Using one of these techniques would likely speed up the generation by at least 10×, since the diffusion process in our model is significantly slower than the other components such as the Shower and Latent Flow generation. Since this work is focused on model development, such improvements are left to future work.

Conclusions
Motivated by the overarching need to simulate particle showers in complex calorimeter detectors in a time- and resource-efficient manner, generative models are widely explored in current particle physics research. A key problem is how to scale such generators up to simulate showers with thousands of particles in high-resolution calorimeter regions. Given the inefficient scaling behavior of fixed-structure approaches, point clouds are an attractive alternative and form the foundation of our proposed strategy. This work makes several contributions to advance the state of the art in this task. At the core of the proposed CaloClouds architecture is the PointWise Net, a permutation-invariant, diffusion-based point cloud generator. It is accompanied by an EPiC Encoder (in the training phase) and a Latent Flow (in the sampling phase) to provide realistic conditioning, and by an additional Shower Flow that learns the visible energy distribution and the number of points per layer.
In principle, this combination of four models could already be used to directly simulate a point cloud of calorimeter hits. However, the relatively dense arrangement of hits is difficult to model, with the correct number of hits per volume in particular being difficult to learn. Additional difficulties arise from realistic calorimeter cell layouts due to staggering between layers or gaps between sensors, resulting in different local cell topologies in different parts of the calorimeter. In order to circumvent these constraints, we, for the first time, move to an even higher-resolution space and consider individual Geant4 steps. While this increases the number of points to simulate by a small integer factor, it yields a distribution that can be more easily learned and subsequently downsampled to the actual resolution of a given calorimeter.
Together these two innovations allow us, for the first time, to generate calorimeter point clouds of unprecedented cardinality with good fidelity in all observed distributions and moderate speed-ups compared to Geant4. In follow-up studies, techniques such as consistency distillation [55] have proven to significantly increase the sampling speed [57]. We leave additional comparisons of the CaloClouds model to already established fixed-structure calorimeter generative models, i.e. the BIB-AE [16] and CaloScore [22], to further research. Based on the shown results, these models currently offer higher generative fidelity, yet are geometry-dependent and suffer from the discussed computational inefficiencies. For future research, we plan to improve the CaloClouds diffusion paradigm and the model architecture, taking inter-point correlations into account, to increase the generative fidelity.
For scenarios where higher-resolution versions of simulations are not available for training (e.g. shower simulations, samples taken from collider data, the ongoing CaloChallenge [58] competition, or problems from other domains), it might be possible to generate surrogate high-resolution data by smearing points with an appropriate function. Even so, CaloClouds demonstrates the potential of scaling point cloud simulations with high fidelity to much higher cardinalities, making the realistic simulation of showers at the HL-LHC, the ILC, and beyond a feasible target.
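The suggested smearing could, for instance, split each cell-level hit into several sub-points with divided energy and Gaussian-displaced positions. The splitting factor and smearing width below are hypothetical choices for illustration, not tuned values:

```python
import numpy as np

def smear_hits(hits, energies, k=7, sigma=1.0, rng=None):
    """Turn each cell hit into `k` surrogate high-resolution points via
    Gaussian position smearing and equal energy splitting. `k` and
    `sigma` (mm) are illustrative assumptions."""
    rng = rng or np.random.default_rng()
    pts = np.repeat(hits, k, axis=0)                     # (N*k, 3) duplicated positions
    pts = pts + rng.normal(scale=sigma, size=pts.shape)  # smear each sub-point
    e = np.repeat(energies / k, k)                       # conserve the total energy
    return pts, e

hits = np.zeros((100, 3))
energies = np.full(100, 0.5)
pts, e = smear_hits(hits, energies, k=7, sigma=1.0,
                    rng=np.random.default_rng(2))
```

Such a surrogate would only approximate the true sub-cell distribution, but it preserves the total energy per hit and supplies the higher-cardinality training targets the approach requires.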

A Effects of the Pre-Clustering
In order to study possible effects of the pre-clustering procedure (see Section 2), we test its closure by applying the procedure to a sample simulated with Geant4 alone and then projecting the resulting clusters that form the point cloud used in CaloClouds back into the realistic cell geometry of the ILD simulation model. Figure 9 shows this for a typical shower with 90 GeV incident energy in a given layer of the calorimeter (layer 21). The left figure shows the cell energies in this layer as generated during the Geant4 simulation by projecting the energy depositions into the calorimeter cells. The thick black bars are gaps between sensors as they occur in the simulation model of ILD at the chosen impact point. The center plot shows the resulting space points after the projection of the Geant4 steps into the virtual ultra-high-granularity grid. Finally, the right plot shows the cell energies resulting from simply projecting the space points back into the calorimeter cells. Additionally, the Geant4 steps are shown on the left in blue and the clusters on the right in orange. No obvious discrepancies are seen in this example. To quantify the effects, Figure 10 shows an overlay of 2,000 showers in the same layer. The left figure shows this overlay as produced by Geant4, while the center plot shows the overlay after a full round trip of pre-clustering and projecting back into the same cell geometry. The right plot shows the resulting relative difference for every cell. The difference is well below 10% for all cells and below 2% for most, indicating good closure. Importantly, most differences occur in sparsely populated parts of the showers, where small differences in absolute energies lead to large relative differences. The core, where most of the shower energy is concentrated, is especially well preserved. Finally, Figure 11 compares the difference plot for the applied procedure with 36 times higher granularity (left, same as Figure 10, right) to a difference plot arising from a granularity increased by only 4 times (right). While such a lower granularity would potentially result in faster sampling times, it also shows significantly larger differences compared to the applied pre-clustering procedure.
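The per-cell closure quantified above amounts to overlaying showers on a cell grid twice, once before and once after the round trip, and taking the relative difference per cell. A minimal sketch of that comparison on a hypothetical uniform 2-D cell grid (not the ILD geometry):

```python
import numpy as np

def cell_map(points, energies, edges):
    """Project 2-D space points into a cell grid and sum their energies."""
    h, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                             bins=edges, weights=energies)
    return h

def relative_difference(map_a, map_b, eps=1e-12):
    """Per-cell relative difference between two overlaid shower maps."""
    return (map_b - map_a) / np.maximum(map_a, eps)

rng = np.random.default_rng(3)
pts = rng.normal(size=(5000, 2))                 # overlay of many shower hits
e = rng.exponential(0.01, size=5000)
edges = [np.linspace(-3, 3, 13)] * 2             # 12 x 12 cells
before = cell_map(pts, e, edges)
# round trip emulated here by a tiny position perturbation (assumption)
after = cell_map(pts + rng.normal(scale=0.01, size=pts.shape), e, edges)
rel = relative_difference(before, after)
```

As noted in the text, cells in the sparsely populated tails naturally show the largest relative differences, since their absolute energies are small.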

Figure 1: Illustration of the data preprocessing pipeline.
[x′ = 0 mm, y′ = 1811.3 mm, z′ = 40 mm], matching the training data, while the test set places the origin of the photons at [x′ = 37.5 mm, y′ = 1811.3 mm, z′ = −36.1 mm]. This allows us to investigate the behavior of the generated point clouds under a translation to another position in the

Figure 2: Illustration of the training and sampling procedure of the CaloClouds architecture. The separate training of the Shower Flow and the Latent Flow is not shown.

Figure 3: Illustration of the CaloClouds PointWise Net (a), consisting of multiple ConcatSquash layers (b). The number of hidden dimensions is indicated. MLP denotes a multi-layer perceptron. *No activation function is applied in the last layer of the PointWise Net.

Figure 4: Illustration of the reverse diffusion process, starting from the initial noise. The color scale corresponds to the point energy.

Figure 5: Histograms of the cell energies (left), radial shower profile (center), and longitudinal shower profile (right) for both Geant4 and CaloClouds. In the per-cell energy distribution, the region below 0.1 MeV is grayed out (see main text for details). All distributions are calculated for a uniform distribution of incident particle energies.

Figure 6:

Figure 8: Per-cell energy distribution for the 50 GeV validation data set (left), created at the same position as the training data set, and for a 50 GeV test data set (right), simulated at a different position with the generated point cloud translated to this position.

Figure 9: Example shower of 90 GeV in one layer of the calorimeter. Left: cell energies as recorded in the ILD simulation model, with blue dots representing the Geant4 steps. Center: projection of the steps into the ultra-high granularity grid (pre-clustering). Right: resulting cell energies after projecting the space points (orange dots) back into the detector cells.

Figure 10: Overlay of 2,000 showers of 90 GeV in one calorimeter layer. Left: as simulated in Geant4. Center: the same showers after a full round trip of pre-clustering and projecting back into the detector cells. Right: relative difference between the initial and processed distributions.

Figure 11: Difference after pre-clustering and projecting back into the cell geometry for 36 times (left) and 4 times (right) increased granularity.

Table 1: Overview of the three different types of point clouds considered in this work, indicating their role. Note that the number of points per shower (second column) only provides an order of magnitude.

Table 2: Comparison of the computational performance of CaloClouds to the baseline Geant4.

but exceeds the training range of 10–90 GeV used for the CaloClouds model. However, while this might in principle be out-of-distribution for