Evaluation of clustering algorithms at the < 1 GeV energy scale for the electromagnetic calorimeter of the PADME experiment

A possible solution to the Dark Matter problem postulates that it interacts with Standard Model particles through a new force mediated by a “portal”. If the new force has a U(1) gauge structure, the “portal” is a massive photon-like vector particle, called the dark photon or A′. The PADME experiment at the DAΦNE Beam-Test Facility (BTF) in Frascati is designed to detect dark photons produced in positron-on-fixed-target annihilations and decaying to dark matter (e+e-→γA′) by measuring the final state missing mass. A key role in the experiment will be played by the electromagnetic calorimeter, which will be used to measure the properties of the final state recoil γ. The calorimeter will be composed of 616 BGO crystals of 21×21×230 mm³, oriented with the long axis parallel to the beam direction and arranged in a roughly circular shape with a central hole to avoid the pile-up due to the large number of low-angle Bremsstrahlung photons. The total energy and position of the electromagnetic shower generated by a photon impacting on the calorimeter can be reconstructed by collecting the energy deposits in the cluster of crystals involved in the shower. In PADME we are testing two different clustering algorithms, PADME-Radius and PADME-Island, based on two complementary strategies. In this paper we describe the two algorithms, with their respective implementations, and report on the results obtained with them at the PADME energy scale (< 1 GeV), both with a GEANT4-based simulation and with an existing 5×5 matrix of BGO crystals tested at the DAΦNE BTF.


Introduction
The long-standing problem of reconciling the cosmological evidence for the existence of dark matter with the lack of any clear experimental observation of it has recently revived the idea that the interaction of the new particles with the Standard Model (SM) gauge fields is not direct but occurs through "portals" connecting our world with new "secluded" or "hidden" sectors. One of the simplest models introduces a single U(1) symmetry, with its corresponding vector boson, called the Dark Photon or A'. In the most general scenario, the existence of dark sector particles with a mass below that of the A' is not excluded: in this case, so-called "invisible" decays of the A' are allowed. Moreover, given the small coupling of the A' to visible SM particles, which suppresses the visible rates by ε² (ε being the reduction factor of the coupling of the dark photon with respect to the electromagnetic one), it is not hard to realize a situation where the invisible decays dominate. There are several studies on searches for an A' decaying into dark sector particles, recently summarized in [...]. The PADME experiment is designed to detect the non-SM process e+e-→γA', with the A' undetected, by measuring the final state missing mass, using a 550 MeV positron beam from the improved Beam-Test Facility (BTF) of the DAΦNE Linac at the INFN Frascati National Laboratories [5]. The collaboration will complete the design and construction of the experiment by the end of 2017 and will collect O(10¹³) positrons on target in two years starting in 2018, with the goal of reaching a sensitivity of ε ~ 10⁻³ up to a dark photon mass of M_A' ~ 24 MeV/c².
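As a rough cross-check of the mass reach quoted above, the kinematics of a positron annihilating on a target electron can be sketched as follows. This is only an illustration: the function names are hypothetical and not part of the PADME software, and the target electron is assumed to be free and at rest.

```python
import math

M_E = 0.51099895  # electron mass in MeV/c^2


def centre_of_mass_energy(beam_energy_mev):
    """sqrt(s) in MeV for a positron of total energy E hitting an electron at rest.

    s = 2 m_e^2 + 2 m_e E; this is also the largest A' mass that can be produced.
    """
    s = 2.0 * M_E**2 + 2.0 * M_E * beam_energy_mev
    return math.sqrt(s)


def missing_mass_squared(beam_energy_mev, photon_energy_mev, photon_theta_rad):
    """(P_beam + P_target - P_gamma)^2 in MeV^2, with the beam along the z axis.

    photon_theta_rad is the photon polar angle with respect to the beam.
    """
    p_beam = math.sqrt(beam_energy_mev**2 - M_E**2)
    e_miss = beam_energy_mev + M_E - photon_energy_mev
    pz_miss = p_beam - photon_energy_mev * math.cos(photon_theta_rad)
    pt_miss = photon_energy_mev * math.sin(photon_theta_rad)
    return e_miss**2 - pz_miss**2 - pt_miss**2


print(round(centre_of_mass_energy(550.0), 1))  # 23.7
```

With a 550 MeV beam the available centre-of-mass energy is about 23.7 MeV, consistent with the M_A' ~ 24 MeV/c² limit on the accessible dark photon mass. The second function shows why the photon energy and angle measured by the calorimeter are the only final-state inputs needed for the missing mass.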
The experiment, shown in figure 1, is composed of a thin active diamond target, to measure the average position and the intensity of the positrons during a single beam pulse; a set of charged particle veto detectors immersed in the field of a 0.5 Tesla dipole magnet, to detect positrons losing their energy due to Bremsstrahlung radiation; and a calorimeter made of BGO crystals, to measure or veto final state photons. As the rate of Bremsstrahlung photons in its central region is too high, the calorimeter has a hole covered by a faster photon detector, the small angle calorimeter (SAC). The apparatus is inserted into a vacuum chamber, to minimize unwanted interactions of primary and secondary particles that might generate extra photons. The maximum repetition rate of the beam pulses is 50 Hz.
A key role in the experiment will be played by the electromagnetic calorimeter, which will be used to measure the properties of the final state recoil γ, as the final error on the measurement of the A' mass will directly depend on its energy, time, and angular resolutions. The total energy and position of the electromagnetic shower generated by a photon impacting on the calorimeter can be reconstructed by collecting the energy deposits in the set of crystals involved in the shower. This set of crystals is not known a priori and must be reconstructed with an ad hoc clustering algorithm. In PADME we tested two different clustering algorithms: one, called PADME-Radius, based on the definition of a set of crystals centered on a local energy maximum, and one, called PADME-Island, based on a modified version of the "island" algorithm in use by the CMS collaboration [6], where a cluster starts from a local energy maximum and is expanded by including available neighboring crystals.
In this paper we will describe the implementations of the two algorithms and report on the results obtained with them at the PADME energy scale (< 1 GeV), both with a GEANT4 based simulation and with an existing 5×5 matrix of BGO crystals tested at the DAΦNE BTF.

The PADME electromagnetic calorimeter
The recoil photon from the e+e-→γA' process will be detected by an electromagnetic calorimeter positioned 3 m downstream from the active diamond target (figure 2). It consists of 616 BGO crystals recovered from one of the electromagnetic end-caps of the L3 experiment at CERN [7]. The crystals are cut to a 21×21×230 mm³ shape and arranged in a roughly cylindrical shape with ~60 cm diameter. Light coming from the crystals is read by 19 mm diameter PMTs.
In the region immediately around the axis of the incoming beam, the rate of photons emitted by the Bremsstrahlung process in the target is too high to be resolved by the BGO crystals, which have a relatively slow signal decay time of 300 ns. To avoid this pile-up of low-angle Bremsstrahlung photons, a square hole of 5×5 crystals is left in the central region of the calorimeter. This hole is covered by a much faster small angle calorimeter (SAC), positioned immediately behind the electromagnetic calorimeter and composed of an array of 7×7 SF57 lead-glass blocks of 20×20×200 mm³ (figure 3).
A prototype of the calorimeter, composed of a 5×5 array of BGO crystals, was successfully tested at the BTF in 2016 [8].
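The layout described in this section can be approximated with a short sketch. It assumes that crystals sit on a square grid with a 21 mm pitch (the crystal side, ignoring wrapping and mechanical gaps), that the outer boundary is a circle of 300 mm radius (half of the ~60 cm diameter), and that the 5×5 hole is centred on the beam axis; the names are hypothetical and none of these values is taken from the actual PADME geometry description.

```python
PITCH_MM = 21        # crystal front-face side, from the text (gaps ignored)
RADIUS_MM = 300      # assumed: half of the ~60 cm diameter
HOLE_HALF_WIDTH = 2  # 5x5 central hole -> indices -2..2 on each axis


def build_layout():
    """Return the (ix, iy) grid indices of the crystals kept in the layout."""
    n = RADIUS_MM // PITCH_MM
    cells = []
    for ix in range(-n, n + 1):
        for iy in range(-n, n + 1):
            # Integer arithmetic: keep crystals whose centre lies inside the circle.
            inside = (ix * ix + iy * iy) * PITCH_MM**2 <= RADIUS_MM**2
            in_hole = abs(ix) <= HOLE_HALF_WIDTH and abs(iy) <= HOLE_HALF_WIDTH
            if inside and not in_hole:
                cells.append((ix, iy))
    return cells


layout = build_layout()
print(len(layout))  # 616
```

Under these assumptions the grid contains 616 cells, matching the number of crystals quoted in the text. The integer indices (ix, iy) also give a convenient key for storing per-crystal energy deposits.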

Clustering algorithms
When an electromagnetic particle (e± or γ) impacts on a dense material, it generates a shower which releases the particle's energy in a volume defined longitudinally by the radiation length and transversally by the Molière radius of the material.
In a finely segmented calorimeter, such as the PADME electromagnetic calorimeter, a shower develops through a set of adjacent crystals, usually called a cluster. Each crystal in the cluster collects only a fraction of the total energy released by the shower so that, in order to reconstruct the original energy of the impacting particle, it is necessary to sum up the energies of all the crystals in the cluster and then apply some corrections to take into account energy losses and shower fluctuations.
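The energy sum described above, together with a simple position estimate, can be sketched as follows. The linear energy-weighted centroid is an assumption made for illustration (the text does not specify how the position is computed), no corrections for energy losses are applied, and the function name and the 21 mm pitch default are hypothetical.

```python
def cluster_energy_and_position(cluster, pitch_mm=21.0):
    """Sum the crystal energies and compute an energy-weighted centroid.

    cluster maps crystal grid indices (ix, iy) to energy deposits in MeV.
    Returns (total_energy_mev, (x_mm, y_mm)).
    """
    total = sum(cluster.values())
    if total <= 0:
        raise ValueError("cluster has no energy")
    x = sum(ix * pitch_mm * e for (ix, _), e in cluster.items()) / total
    y = sum(iy * pitch_mm * e for (_, iy), e in cluster.items()) / total
    return total, (x, y)


# Two-crystal example: 80 MeV in the central crystal, 20 MeV in its right neighbour.
energy, (x, y) = cluster_energy_and_position({(0, 0): 80.0, (1, 0): 20.0})
print(energy, x, y)
```

In this example the centroid is pulled 20% of one crystal pitch towards the neighbour, which shows how sharing energy among crystals allows a position estimate finer than the crystal size.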
In general, the set of crystals composing a single cluster is not known a priori but must be reconstructed by looking at the distribution of energy deposits: this process is known as cluster finding and is usually delegated to dedicated algorithms which implement a wide range of strategies, each aimed at optimizing different aspects of the final reconstruction. Factors affecting the final result of the cluster finding algorithm include the minimum energy detectable by a crystal, the presence of noise coming from the energy measurement process, and the presence of multiple electromagnetic particles close to each other. In order to find the optimal approach to the cluster finding problem for the PADME electromagnetic calorimeter, the collaboration decided to develop and test two different algorithms, with two substantially different underlying strategies: the PADME-Radius algorithm and the PADME-Island algorithm.

The PADME-Radius algorithm
The PADME-Radius algorithm implements the simplest possible approach to cluster finding: a cluster is created around each local energy maximum (called seed) by collecting all crystals with some energy deposit within a given radius from it. Given its structure, the behaviour of the PADME-Radius algorithm can be tuned by setting three parameters:
• the minimum energy deposit for a crystal to be included in a cluster;
• the minimum energy deposit for a crystal to be used as seed for a cluster;
• the radius of the circle around the seed to include in the cluster. This parameter is strongly related to the Molière radius of the crystal (2.23 cm for BGO).
The values used for this test were, in the same order: 100 keV, 10 MeV, and 5 cm. Figure 4 shows the pseudocode listing of the algorithm while figure 5 shows an example of the results obtained by the algorithm for an arbitrary distribution of energy in the calorimeter.
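The strategy described in this section can be sketched in a few lines. This is not the listing of figure 4: it is a minimal illustration that assumes crystals are visited in order of decreasing energy, so that the most energetic unused crystal above the seed threshold acts as the local maximum, and that each crystal is assigned to at most one cluster. The function name, the grid-index representation, and the 21 mm pitch are hypothetical; the default thresholds and radius are the values quoted above.

```python
import math


def radius_clusters(hits, e_min=0.1, e_seed=10.0, radius_mm=50.0, pitch_mm=21.0):
    """Group crystals within a fixed radius of each seed.

    hits maps crystal grid indices (ix, iy) to energy deposits in MeV.
    Returns a list of clusters, each a dict {(ix, iy): energy}.
    """
    used = set()
    clusters = []
    # Visit crystals from the most to the least energetic.
    for seed in sorted(hits, key=hits.get, reverse=True):
        if hits[seed] < e_seed:
            break  # all remaining crystals are below the seed threshold
        if seed in used:
            continue  # already absorbed by an earlier cluster
        members = {}
        for cell, energy in hits.items():
            if cell in used or energy < e_min:
                continue
            dist = pitch_mm * math.hypot(cell[0] - seed[0], cell[1] - seed[1])
            if dist <= radius_mm:
                members[cell] = energy
        used.update(members)
        clusters.append(members)
    return clusters


hits = {(0, 0): 60.0, (1, 0): 20.0, (0, 1): 15.0, (2, 2): 3.0,
        (3, 0): 0.05, (6, 0): 30.0, (6, 1): 5.0}
for cluster in radius_clusters(hits):
    print(sorted(cluster), sum(cluster.values()))
```

In this example two clusters are found, with 95 MeV and 35 MeV. The 3 MeV deposit at (2, 2) lies outside both circles and is too small to seed a cluster of its own, so it is not counted.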

The PADME-Island algorithm
Like PADME-Radius, the PADME-Island algorithm starts by looking for a local energy maximum, the cluster seed. Neighboring crystals are then attached to the cluster by applying a recursive search, with the requirement that the energy of the neighbor is below that of the adjacent crystal on the cluster boundary. Figure 6 shows the pseudocode listing of the algorithm while figure 7 shows an example of the results obtained by the algorithm for the same arbitrary distribution of energy in the calorimeter used for figure 5.
The behaviour of the PADME-Island algorithm can be tuned by setting only two parameters:
• the minimum energy deposit for a crystal to be included in a cluster;
• the minimum energy deposit for a crystal to be used as seed for a cluster.
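The recursive search can be illustrated with the sketch below. It is not the listing of figure 6: it assumes that seeds are taken in order of decreasing energy, that only the four crystals sharing a side count as adjacent, and that the thresholds are the same 100 keV and 10 MeV used for PADME-Radius (the text does not state the values used for PADME-Island). The function name and data layout are hypothetical.

```python
def island_clusters(hits, e_min=0.1, e_seed=10.0):
    """Grow clusters outwards from each seed along decreasing energy.

    hits maps crystal grid indices (ix, iy) to energy deposits in MeV.
    Returns a list of clusters, each a dict {(ix, iy): energy}.
    """
    used = set()
    clusters = []

    def grow(current, members):
        # Attach the current crystal, then try each side-sharing neighbour.
        used.add(current)
        members[current] = hits[current]
        ix, iy = current
        for adjacent in ((ix + 1, iy), (ix - 1, iy), (ix, iy + 1), (ix, iy - 1)):
            energy = hits.get(adjacent)
            if energy is None or adjacent in used:
                continue
            # Accept the neighbour only if its energy is below that of current.
            if e_min <= energy < hits[current]:
                grow(adjacent, members)

    for seed in sorted(hits, key=hits.get, reverse=True):
        if hits[seed] < e_seed:
            break  # all remaining crystals are below the seed threshold
        if seed in used:
            continue
        members = {}
        grow(seed, members)
        clusters.append(members)
    return clusters


hits = {(0, 0): 60.0, (1, 0): 20.0, (2, 0): 2.0, (3, 0): 2.5, (0, 1): 15.0}
for cluster in island_clusters(hits):
    print(sorted(cluster), sum(cluster.values()))
```

In this example the crystal at (3, 0) holds slightly more energy than its neighbour at (2, 0), so the search stops before reaching it. Its 2.5 MeV is below the seed threshold and is left out, giving a single cluster of 97 MeV out of 99.5 MeV deposited.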

Algorithm comparison
Both algorithms were implemented and included in the general PADME software framework so that they could be tested on both MC events and real data coming from the testbeams at the DAΦNE BTF. An initial test with MC events looked for differences in the results from the two algorithms starting from two samples of 10,000 single-photon events, one at 100 MeV and one at 400 MeV. The list of the total energies deposited in each crystal was used as input to the clustering algorithms. From the list of clusters reconstructed by each algorithm we evaluated:
• the total number of reconstructed clusters;
• the energy distribution of the single cluster;
• the total reconstructed energy obtained by adding the energy of all the clusters found in the event.
Figure 9 shows the difference between the algorithms in the single cluster energy distribution at 100 MeV, while figure 10 shows the same comparison for the total energy distribution.
This very preliminary comparison shows that, as expected, in a single particle data sample PADME-Island has a tendency to split the total energy deposit into more clusters than PADME-Radius. This effect is also reflected in the total energy reconstruction, as the energy deposits not included in the reconstruction of the main clusters are too small to start a new cluster and are therefore excluded from the final energy count. By looking in more detail at the algorithms' results, we found that most of the differences in cluster splitting are due to statistical energy fluctuations in the crystals where the energy deposit is very low: a possible improvement for the PADME-Island algorithm could therefore involve modifying the "If adjacent is not used and its energy is below that of current" statement to take this effect into account.
The collaboration is now proceeding with more detailed tests on the two algorithms: these will include a more realistic energy reconstruction procedure and an optimization of the clustering parameters, and will evaluate the effect of the cluster finding algorithms on the final energy and spatial resolutions for both low- and high-multiplicity events.
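The three per-event quantities used in this comparison (number of clusters, single cluster energies, and total reconstructed energy) can be extracted from any list of clusters with a helper like the one below. The function name and the dictionary layout are hypothetical, and clusters are assumed to be dictionaries mapping a crystal identifier to its energy in MeV.

```python
def summarize_event(clusters):
    """Return the comparison quantities for one event.

    clusters is a list of dicts mapping a crystal identifier to its energy in MeV.
    """
    energies = sorted((sum(c.values()) for c in clusters), reverse=True)
    return {
        "n_clusters": len(energies),
        "cluster_energies": energies,
        "total_energy": sum(energies),
    }


# Example event with two reconstructed clusters.
event = [{(0, 0): 60.0, (1, 0): 20.0, (0, 1): 15.0}, {(6, 0): 30.0, (6, 1): 5.0}]
print(summarize_event(event))
```

Filling histograms of these quantities over a full sample, once per algorithm, gives distributions of the kind compared in the figures of this section.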

Conclusions
The optimization of the clustering strategy in the PADME electromagnetic calorimeter is of primary importance to the reconstruction of the missing mass in the e+e-→γA' process. The collaboration is currently evaluating two different clustering algorithms, called PADME-Radius and PADME-Island.
These two algorithms present significantly different characteristics: the first collects all energy within a cylinder around each energy maximum, while the second lets a cluster grow around each energy maximum until all nearby energy has been collected or until it meets a neighboring cluster. The effect of these differences has to be tested for all the relevant parameters of the electromagnetic calorimeter, from the energy and spatial resolutions to the cluster separation power, so that each reconstruction task will be able to select the optimal algorithm for its physics goals.
Both algorithms have been implemented in the general PADME software framework and are now being tested on MC events and on real data from testbeams at the DAΦNE BTF.