Track segments in hadronic showers in a highly granular scintillator-steel hadron calorimeter

We investigate the three-dimensional substructure of hadronic showers in the CALICE scintillator-steel hadronic calorimeter. The high granularity of the detector is used to find track segments of minimum-ionising particles within hadronic showers, providing sensitivity to the spatial structure and the details of secondary particle production in hadronic cascades. The multiplicity, length and angular distribution of identified track segments are compared to GEANT4 simulations with several different shower models. Track segments also provide the possibility for in-situ calibration of highly granular calorimeters.


Introduction
The physics goals of future high-energy linear electron-positron colliders place strict requirements on the performance of the detector systems. One of these requirements is excellent jet energy resolution to provide adequate separation of gauge bosons in all-hadronic final states. Detailed simulation studies have extensively demonstrated that such resolutions can be achieved with Particle Flow Algorithms [1,2,3]. To provide the necessary separation of particles within hadronic jets, highly granular imaging calorimeters are required. The CALICE collaboration has constructed and thoroughly tested several prototypes of electromagnetic and hadronic imaging calorimeters. One of these calorimeters is the analog hadron calorimeter (AHCAL) [4], a sampling calorimeter with 38 active layers of plastic scintillator sandwiched between steel absorber plates with lateral dimensions of 90 × 90 cm² and a total instrumented volume of approximately 1 m³. The scintillator layers consist of 5 mm thick scintillator tiles, each individually read out by a silicon photomultiplier (SiPM) [5,6]. The first 30 layers of the calorimeter have a highly granular core of 10 by 10 cells, each with a size of 30 × 30 mm². This is surrounded by three rings of 60 × 60 mm² cells and one incomplete ring of 120 × 120 mm² cells, as shown in Figure 1. In the last eight layers, the highly granular core is replaced by 60 × 60 mm² cells; otherwise the layout is identical. In total, the number of cells is 7608. The passive material per layer amounts to 21 mm of steel: 4 mm from the two cover plates of the cassettes housing the scintillator tiles, and 17 mm from the absorber plates of the calorimeter structure. In total, the depth of the calorimeter is 5.3 λ_I, corresponding to a depth for pions of 4.3 λ_π.
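As a quick cross-check of the geometry figures quoted above, the cell counts can be tallied in a few lines. This is only a sketch: the number of cells in each 60 mm ring follows from the stated dimensions, whereas the 20 cells assumed for the incomplete outer ring are not given in the text and are chosen here because they reproduce the quoted total of 7608.

```python
CORE_SIDE_MM = 300        # 10 x 10 cells of 30 mm
OUTER_RING_CELLS = 20     # ASSUMPTION: not stated in the text

def ring_cells(inner_side_mm, cell_mm):
    """Number of square cells in one complete ring around a square of given side."""
    outer_side_mm = inner_side_mm + 2 * cell_mm
    return (outer_side_mm ** 2 - inner_side_mm ** 2) // cell_mm ** 2

# Three complete rings of 60 mm cells around the 300 mm core.
rings_60 = sum(ring_cells(CORE_SIDE_MM + 120 * k, 60) for k in range(3))

fine_layer = 10 * 10 + rings_60 + OUTER_RING_CELLS                    # first 30 layers
coarse_layer = (CORE_SIDE_MM // 60) ** 2 + rings_60 + OUTER_RING_CELLS  # last 8 layers
total_cells = 30 * fine_layer + 8 * coarse_layer

steel_per_layer_mm = 4 + 17   # cassette cover plates + absorber plate

print(fine_layer, coarse_layer, total_cells, steel_per_layer_mm)
```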
During data taking at the CERN SPS, a silicon-tungsten electromagnetic calorimeter (SiW-ECAL) [7] was installed upstream of the AHCAL together with a downstream tail catcher and muon tracker (TCMT) [8]. These detectors are not explicitly used in the analysis presented here, but the ECAL contributes to the event selection discussed below.
The high granularity of the AHCAL enables detailed investigations of the three-dimensional structure of hadronic showers. Rather than consisting of amorphous blobs of energy deposits, hadronic showers are characterised by dense electromagnetic sub-showers and sparse hadronic components forming a tree-like structure. The hadronic component includes (approximately) minimum-ionising particles (MIPs) which can propagate an appreciable distance before undergoing an inelastic interaction. In the present analysis track segments are identified using a basic nearest neighbour algorithm. Figure 2 shows a typical hadronic shower in the AHCAL, with identified track segments highlighted in red. The properties of these track segments, such as length, multiplicity and angular distribution, are compared to GEANT4 [9] simulations performed with several shower models.

Figure 2: Event display of a typical hadronic shower in the CALICE AHCAL initiated by a negative pion with an energy of 60 GeV. The identified minimum-ionising track segments are highlighted in red. The beam enters from the lower left, indicated by the black arrow.

Track-finding
The tracking algorithm used here consists of two stages. The first stage is the identification of track candidates in a layer-by-layer search using a nearest neighbour algorithm. In a second stage, these candidates are passed through a filtering algorithm based on a Hough transformation to remove inconsistent hits such as noise hits and hits not due to energy depositions from the tracked minimum-ionising particle.

The tracking algorithm
For the track finding, the coordinate system is defined as indicated in Figure 4, with the z-axis given by the beam axis, the x-axis pointing left when looking downstream in positive z direction and the y-axis pointing up. The track finding algorithm used for the pattern recognition is a simple implementation of a nearest neighbour algorithm. The algorithm was specifically developed for the test beam data taken with the CALICE AHCAL. It exploits the primary flight direction of incoming beam particles along the z-axis by assuming that all particles found by the algorithm have a sizeable momentum component along that axis. This is reflected by the assumption that any MIP-like particle will create at most one hit in a given layer, and that cells on the same track in adjacent layers are neighbours, sharing at least one corner when projected on the same layer. With the layer-to-layer distance of 31.6 mm and a cell thickness of 5 mm, this limits the algorithm to the identification of tracks with a maximum angle with respect to the beam axis of approximately 60° in the central region with tiles of 30 × 30 mm², of 70° for the 60 × 60 mm² tiles and of 80° for the outer 120 × 120 mm² tiles. It is important to note that these requirements also allow the identification of backward-going tracks, since it is not required that the tracks point outwards from the beam axis when looking downstream.
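The quoted angular limits can be made plausible with a small geometric estimate. The sketch below rests on an assumed reading that the text does not spell out: the largest lateral step between adjacent layers is taken as the cell diagonal (a corner-sharing neighbour), and the shortest longitudinal distance as the layer pitch minus the cell thickness. The function name is illustrative only.

```python
import math

LAYER_PITCH_MM = 31.6      # layer-to-layer distance from the text
CELL_THICKNESS_MM = 5.0    # scintillator thickness from the text

def max_track_angle_deg(cell_size_mm):
    """Largest angle to the beam axis under the assumed corner-to-corner geometry."""
    lateral = math.sqrt(2.0) * cell_size_mm              # diagonal step to a corner-sharing cell
    longitudinal = LAYER_PITCH_MM - CELL_THICKNESS_MM    # ASSUMPTION: shortest gap in z
    return math.degrees(math.atan2(lateral, longitudinal))

for size in (30, 60, 120):
    angle = max_track_angle_deg(size)
    inclination = 1.0 - math.cos(math.radians(angle))
    print(f"{size:3d} mm tiles: about {angle:.0f} deg, 1 - cos(theta) about {inclination:.2f}")
```

Under these assumptions the estimate lands close to the approximate limits of 60°, 70° and 80° given in the text; it is meant as a consistency check, not as the derivation used in the analysis.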
Prior to the actual track finding, a set of calorimeter hits to be used in the algorithm is identified. As a first step, all hits with an amplitude below the noise cut-off are rejected. Then, isolated hits are identified. These hits have no directly adjacent (orthogonal or diagonal) neighbouring hit within the same layer. The tracking itself is performed recursively on the isolated hits only. For illustration, a flowchart of the algorithm, which is described in the following, is shown in Figure 3, with the recursive function findTrack enclosed in the gray box.
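The hit preselection can be sketched as follows. This is a minimal illustration, assuming a uniform grid of integer cell indices per layer (the real detector mixes three tile sizes) and the 0.5 MIP noise threshold quoted in the event selection; the function name and the hit tuple layout are hypothetical.

```python
NOISE_CUT_MIP = 0.5   # noise threshold quoted in the event selection

def isolated_hits(hits):
    """Return (layer, ix, iy) of hits above threshold with no neighbour in the same layer.

    hits: iterable of (layer, ix, iy, amplitude_mip) on a uniform index grid (ASSUMPTION).
    """
    above = {(layer, ix, iy) for layer, ix, iy, amp in hits if amp >= NOISE_CUT_MIP}
    isolated = []
    for layer, ix, iy in sorted(above):
        has_neighbour = any(
            (layer, ix + dx, iy + dy) in above
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)     # skip the cell itself
        )
        if not has_neighbour:
            isolated.append((layer, ix, iy))
    return isolated

example = [(0, 5, 5, 1.0), (0, 6, 6, 1.2), (0, 9, 9, 0.9), (1, 5, 5, 1.1), (1, 6, 5, 0.3)]
print(isolated_hits(example))
```

In the example the two diagonal neighbours in layer 0 are rejected, while the hit in layer 1 stays isolated because its neighbour falls below the noise cut.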
The underlying idea behind this recursion algorithm is that when searching for a track using a given (isolated) hit as seed, the hit itself is already a valid track candidate. A single recursion step seeks to extend this track candidate with track candidates further downstream within a search cone. These candidates in turn are identified by the (recursive) application of the same tracking algorithm, using the identified isolated hits (if they are not part of any previously identified tracks) in the search cone as seed. Note that the tracking algorithm only considers hits downstream with respect to the seeded hit and therefore is independent of the calling history, i.e. the track candidates found so far. Starting with the isolated hit which is furthest upstream, i.e. closest to the calorimeter front, the tracking search will be performed for all isolated hits that are not part of a previously identified track candidate.
The search cone consists of cells one layer further downstream which share at least a corner with the seeded cell when projected onto the layer of the seeded cell (i.e. the cell with the same x/y coordinates and the ring of cells around it), plus all cells two layers downstream of the seeded cell sharing at least a corner with the cells in the cone one layer downstream, again after projection onto the layer of the seeded cell (i.e. the cell with the same x/y coordinates and two rings of cells around it). This search cone is illustrated in Figure 4. Extending the search cone over two layers has the advantage that it gives the tracking algorithm the possibility to bridge gaps in a track, which can originate from the finite efficiency of the scintillator cells and from individual cells failing the isolation criterion due to neighbouring noise hits or other energy deposits.
All isolated hits within this cone which are not part of any track candidate so far are identified. As hits can only be assigned to one track candidate, all these hits are sorted by distance to the seeded cell to favor track continuity over gaps. As stated before, for each of these hits another recursion of the tracking algorithm is started and the resulting track candidates, each of which is a viable option to extend the current track candidate consisting of only the seeded cell, are collected. At this stage, three possibilities are considered:
• There is no isolated hit within the search cone of the seeded cell. Hence the seeded cell is the last one (furthest downstream) on a possible track. Therefore the recursion returns a new track candidate with the seeded cell as its only member.
• If a single isolated hit, i.e. a single track candidate, is found within the search cone of the seeded cell, the seeded cell itself is added to the candidate and the track candidate is returned, ending this recursion step.
• If the recursive calls return more than one track candidate as a possible continuation from the seeded cell, it has to be decided to which of these candidates the seeded cell should be added. For this analysis it was decided to maximize the number of found tracks. Therefore all track candidates that already surpass the minimum length threshold of five layers for the acceptance of tracks are declared final and added to the list of identified tracks, while the seeded cell is added to the longest candidate below the threshold. However, if all track candidates surpass the threshold, the shortest one is chosen to accept the additional hit (and is thus not declared final yet). This case of multiple possible continuations from the seeded cell is relatively rare, and occurs in less than 9% of the cases where a continuation is found.
At the completion of the algorithm, once all recursions are closed, the remaining uncompleted track candidates are declared to be final tracks if they fulfil the minimum length requirement of five layers.
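The recursion described in this section can be sketched in a few dozen lines. This is an illustrative reading rather than the CALICE implementation: hits are (layer, ix, iy) index tuples on a uniform grid, distance is measured in index units, track length is taken as the number of layers spanned, and short candidates that are not chosen in the multi-candidate case are simply dropped. All of these are assumptions, and the names find_track and find_tracks are hypothetical.

```python
MIN_LAYERS = 5   # minimum track length in layers, as quoted in the text

def in_cone(seed, hit):
    """One layer downstream within one ring of cells, or two layers within two rings."""
    dl = hit[0] - seed[0]
    return dl in (1, 2) and abs(hit[1] - seed[1]) <= dl and abs(hit[2] - seed[2]) <= dl

def dist2(a, b):
    """Squared distance in index units (ASSUMPTION: no physical cell sizes)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def n_layers(track):
    layers = [h[0] for h in track]
    return max(layers) - min(layers) + 1

def find_track(seed, hits, used, final_tracks):
    used.add(seed)
    cone = sorted((h for h in hits if h not in used and in_cone(seed, h)),
                  key=lambda h: dist2(seed, h))          # nearest hits first
    candidates = []
    for h in cone:
        if h not in used:                                # may be claimed by a deeper call
            candidates.append(find_track(h, hits, used, final_tracks))
    if not candidates:                                   # case 1: seed ends the track
        return [seed]
    if len(candidates) == 1:                             # case 2: unique continuation
        return [seed] + candidates[0]
    short = [c for c in candidates if n_layers(c) < MIN_LAYERS]
    long_ = [c for c in candidates if n_layers(c) >= MIN_LAYERS]
    if short:                                            # case 3: finalise long ones,
        final_tracks.extend(long_)                       # extend the longest short one
        chosen = max(short, key=n_layers)
    else:                                                # all long: extend the shortest
        chosen = min(long_, key=n_layers)
        final_tracks.extend(c for c in long_ if c is not chosen)
    return [seed] + chosen

def find_tracks(isolated):
    hits = sorted(set(isolated))                         # most upstream seed first
    used, final_tracks = set(), []
    for seed in hits:
        if seed not in used:
            candidate = find_track(seed, hits, used, final_tracks)
            if n_layers(candidate) >= MIN_LAYERS:
                final_tracks.append(candidate)
    return final_tracks

gappy = [(0, 5, 5), (1, 5, 5), (3, 5, 5), (4, 5, 5), (5, 5, 5)]
print(find_tracks(gappy))   # the missing layer-2 hit is bridged by the two-layer cone
```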

Hough based track filtering
The tracking algorithm presented in Section 2.1 works on a simple nearest neighbour basis and does not take continuity of the track direction into account. Noise hits and other energy deposits not originating from the tracked particle which are added to a track can give it an unphysical kink or step. In rare cases a track can even consist entirely of noise and other random hits. To remove hits probably not belonging to tracks and to eliminate fake tracks, a filtering stage is applied after track finding. Since solutions based on χ² straight line fits were found to be inefficient in this analysis, the filter used is based on a two-dimensional Hough transformation [10]. It identifies hits which are inconsistent with the dominant direction of the track. In the filter, the x-z plane and the y-z plane are treated separately. In the following, the procedure for the x-z plane is described in detail. The y-z plane is treated in an analogous way.
As the maximum track angle is limited to less than 90° with respect to the beam axis, a track given by a straight line can be described by an equation of the form

$x = m\,z + t$ (2.1)

with slope m and intercept t. Each point (x, z) in real space transforms into a line in Hough space (described by the coordinates t and m), given by

$t = -z\,m + x$ (2.2)

Points lying on one common track transform into lines which intersect in Hough space. The coordinates of this intersection are the parameters m and t of the original Equation 2.1 for an identified track.
Since the cells of the calorimeter have a finite size, the x and y position of a hit is not known precisely, but is uncertain by the horizontal and vertical size of the cell of 30 mm, 60 mm or 120 mm. Thus, a hit (x, z) in a cell with size s is transformed to a band in the Hough space, with the upper and lower limits given by

$t_{\pm} = -z\,m + x \pm \frac{s}{2}$ (2.3)

The problem of finding hits on a straight line through the intersections of lines in Hough space thus transforms into one of finding areas of overlap of the individual bands. The regions of overlap are determined numerically. The area shared by the highest number of bands is taken as the range of track parameters in Hough space. All hits whose bands are not overlapping with this area are considered to be noise and are removed from the track. If the resulting cleaned track has fewer than four hits, the entire track is removed from the event.
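A minimal numerical version of this band-overlap search, for one projection, might look as follows. It is a sketch under stated assumptions: each hit (x, z) in a cell of size s is taken to allow intercepts between x - m z - s/2 and x - m z + s/2 for a given slope m, the slope is scanned on a fixed grid, and a single best point (m, t) stands in for the overlap area. The scan range, step count and function names are hypothetical choices, not taken from the analysis.

```python
def best_intercept(intervals):
    """Return (count, t) for an intercept t covered by the most closed intervals."""
    events = sorted([(lo, 0) for lo, _ in intervals] + [(hi, 1) for _, hi in intervals])
    count = best = 0
    best_t = None
    for t, kind in events:            # openings sort before closings at equal t
        if kind == 0:
            count += 1
            if count > best:
                best, best_t = count, t
        else:
            count -= 1
    return best, best_t

def hough_filter(hits, min_hits=4, m_max=3.0, m_steps=121):
    """hits: list of (x_mm, z_mm, cell_size_mm). Returns the cleaned hit list, or []."""
    best = (0, 0.0, 0.0)
    for i in range(m_steps):
        m = -m_max + 2.0 * m_max * i / (m_steps - 1)      # scanned slope
        bands = [(x - m * z - s / 2.0, x - m * z + s / 2.0) for x, z, s in hits]
        n, t = best_intercept(bands)
        if n > best[0]:
            best = (n, m, t)
    _, m, t = best
    tol = 1e-9                                            # guard against rounding at band edges
    kept = [(x, z, s) for x, z, s in hits if abs(x - m * z - t) <= s / 2.0 + tol]
    return kept if len(kept) >= min_hits else []

# Five aligned hits in 30 mm cells plus one hit three cells off the line.
track = [(0.0, 31.6 * k, 30.0) for k in range(5)] + [(90.0, 31.6 * 5, 30.0)]
print(hough_filter(track))
```

Sweeping over interval end points finds the most populated intercept exactly for each scanned slope, so only the slope is discretised here.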

Data set and detector simulations
The tracks identified in hadronic showers are used to validate the realism of detector simulations based on GEANT4 by comparing results from test-beam data and simulated data. The analysis is based on a data set recorded at the CERN SPS at the H6 beam line in 2007 with energies ranging from 10 GeV to 80 GeV. All runs included in the present analysis used negative polarity secondary or tertiary beams, consisting of a mix of negative pions, muons and electrons. Antiproton contamination as well as kaon content are negligible.

Calibration
The response of each calorimeter cell is calibrated with high-energy muons, using the visible signal of these minimum-ionising particles (MIPs) as the cell-to-cell calibration scale [4]. Throughout the event selection and the analysis, this calibration scale is used since no conversion to reconstructed hadronic energy is required.

Event selection
With the event selection procedure, a pion sample of high purity entering the AHCAL is selected.
Figure 5 shows all events of a 25 GeV run with mixed particle content as a scatter plot of the energy measurement of the hadronic and the electromagnetic calorimeters. Within this plot different particle types are marked. Electrons are eliminated by rejecting events with an energy deposition of more than 80 MIP in the ECAL upstream of the AHCAL, which is incompatible with the assumption of a minimum-ionising particle traversing the entire ECAL. This requirement also rejects most events in which the pion has its first inelastic interaction in the ECAL. In addition, a minimum energy deposition of 150 MIP in the AHCAL is required, which is approximately three times the energy deposited by a minimum-ionising particle traversing the entire AHCAL. This selection rejects muons and requires some hadronic shower activity to take place in the calorimeter. To illustrate these cuts, Figure 5 shows the distribution of reconstructed energy in the ECAL versus reconstructed energy in the AHCAL for a 25 GeV hadron run, with the regions rejected by the selection cuts shown in red. The cuts successfully select hadrons showering in the AHCAL, while rejecting muons, electrons (which are fully absorbed in the ECAL) and showers starting already in the ECAL. Prior to the event selection this data set contains approximately 1.2% electrons and 5.8% muons, which are both successfully suppressed to a negligible contribution after the event selection. Since approximately 60% of all hadrons already start showering in the ECAL, about 35% of all events are accepted in the analysis, corresponding to typically 50 000 to 80 000 events at each energy.
In each event, noise hits are rejected by requiring a minimum amplitude of 0.5 MIP per channel. After this cut, the remaining noise contribution, measured in random trigger events, is 8.88 ± 0.14 MIP per event.
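The two selection cuts amount to a simple rectangular acceptance in the ECAL and AHCAL energy plane. The sketch below applies them to already summed energies on the MIP scale; the function name and the toy energy values are illustrative assumptions, while the cut values are those quoted in the text.

```python
ECAL_MAX_MIP = 80.0     # reject electrons and showers starting in the ECAL
AHCAL_MIN_MIP = 150.0   # reject muons, require shower activity in the AHCAL

def passes_selection(ecal_energy_mip, ahcal_energy_mip):
    """True for events kept in the pion sample (energies already summed, MIP scale)."""
    return ecal_energy_mip <= ECAL_MAX_MIP and ahcal_energy_mip >= AHCAL_MIN_MIP

# Toy events (ECAL, AHCAL) in MIP; the numbers are made up for illustration.
toy_events = {
    "muon-like": (45.0, 50.0),
    "electron-like": (900.0, 5.0),
    "pion showering in AHCAL": (45.0, 600.0),
    "pion showering in ECAL": (300.0, 400.0),
}
for label, (ecal, ahcal) in toy_events.items():
    print(f"{label}: {'kept' if passes_selection(ecal, ahcal) else 'rejected'}")
```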

Detector simulations
The simulations are based on a detailed model of the test-beam setup including calorimeters and beamline instrumentation performed with GEANT4 9.4p02. Following the GEANT4 shower simulations, detector-specific effects such as the light-yield of the scintillator, the response of the SiPMs and the influence of the readout electronics are simulated in a digitisation stage. Detector noise is taken into account by overlaying noise recorded at the test beam with random triggers during the runs that are to be compared with the simulations. The dependence on environmental parameters is fully taken into account. Both data and simulations are treated identically by the reconstruction code which identifies hits using the MIP-scale calibration. Further details on the simulation procedure of the AHCAL can be found in [11]. At each energy, 150 000 events are simulated, resulting in typically 65 000 accepted events.
GEANT4 provides various models for the simulation of hadronic interactions with different regions of validity in energy and for different particle species. To cover the full energy range of interest for detector simulations in high-energy physics, so-called physics lists are provided which are combinations of several of these models [12,13]. In energy regions where two models are overlapping in the physics list, GEANT4 chooses one of the two randomly on a call-by-call basis. The probability for the model valid at lower (higher) energies to be chosen decreases (increases) linearly with energy in the overlap region to provide a smooth transition between models.
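The linear hand-over between two overlapping models can be illustrated with a short sketch. It is not the GEANT4 interface: the function names are hypothetical, and the example energies (an overlap from 12 GeV to 25 GeV, as quoted in this paper for LEP and QGS in QGSP_BERT) are used only as an illustration.

```python
import random

def low_model_probability(energy_gev, e_low_gev, e_high_gev):
    """Probability to pick the low-energy model, falling linearly across the overlap."""
    if energy_gev <= e_low_gev:
        return 1.0
    if energy_gev >= e_high_gev:
        return 0.0
    return (e_high_gev - energy_gev) / (e_high_gev - e_low_gev)

def choose_model(energy_gev, e_low_gev, e_high_gev, low_name, high_name, rng=random):
    """Draw one of the two models for a single interaction (call-by-call choice)."""
    p_low = low_model_probability(energy_gev, e_low_gev, e_high_gev)
    return low_name if rng.random() < p_low else high_name

for energy in (10.0, 15.0, 18.5, 22.0, 30.0):
    p = low_model_probability(energy, 12.0, 25.0)
    print(f"{energy:5.1f} GeV: P(low-energy model) = {p:.2f}")
```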
For the analysis in this paper, the following four physics lists are used:
• LHEP: Uses the LEP (low energy parametrised) and HEP (high energy parametrised) models, with a transition region from 25 GeV to 50 GeV. This physics list is essentially a GEANT4 adaptation of the GEISHA model [14] used in GEANT3. It is known to be less accurate than newer models, but is included here to provide an indication of the progress achieved with more recent codes.
• QGSP_BERT: Uses a Quark Gluon String (QGS) model [15] followed by the Precompound (P) and evaporation model for the de-excitation of nuclei for energies above 12 GeV. The Bertini (BERT) cascade [16] is used for energies below 9.9 GeV. In the intermediate region between those two models, in the range from 9.5 GeV to 25 GeV, the LEP model is used.
• FTFP_BERT: Uses the Fritiof (FTF) model followed by a Reggeon cascade and the Precompound evaporation (P) model [17] for energies higher than 4 GeV. Below 5 GeV the Bertini cascade is used. This physics list uses the same cross section model as the QGSP_BERT list.
• QGS_BIC: This list is identical to QGSP_BERT for energies above 12 GeV. However, for lower energies the Bertini cascade is replaced by a combination of the LEP model and the binary cascade (BIC) [18], with a transition between 1.2 GeV and 1.3 GeV.

Results
Since the tracking algorithm introduced in Section 2.1 uses only isolated hits, it finds MIP-like track segments which are well separated from regions of dense shower activity. Apart from the track of the incoming charged pion prior to its first inelastic interaction, these are mainly tracks of higher-energy secondary particles which travel an appreciable distance before interacting again. A comparison of the test beam results to simulations is thus most sensitive to the sparse outer and tail regions of hadronic showers. To disentangle possible differences between the primary track of the incoming hadron and secondary tracks, the energy dependence of the observables is studied both for all identified tracks and for tracks starting in layer three or later.

Figure 6: Correlation between the number of identified track segments and the true number of charged particles (excluding electrons and positrons) with a kinetic energy of greater than 500 MeV. These results were obtained with the FTFP_BERT physics list for 10 000 simulated events at an energy of 25 GeV. The Pearson correlation coefficient of the distribution is 0.38.

Track multiplicity
The multiplicity of identified tracks per event is sensitive to the number of secondary isolated charged hadrons created in the hadronic shower. Figure 6 shows the correlation between the number of identified track segments and the true number of charged particles (except electrons and positrons) with a kinetic energy greater than 500 MeV for events simulated with the FTFP_BERT physics list for a beam energy of 25 GeV. The threshold of 500 MeV is chosen to ensure that pions remain approximately minimum-ionising for the typical length of identified tracks despite ionisation energy loss in the absorber plates. For both data and simulations the number includes the track of the primary pion. The observed correlation demonstrates that the multiplicity of identified track segments is indeed sensitive to the overall number of energetic secondary particles. The Pearson correlation coefficient has a value of 0.38 for FTFP_BERT and of 0.41 for QGSP_BERT.
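For reference, the Pearson correlation coefficient used here is the covariance of the two quantities divided by the product of their standard deviations. The sketch below computes it for paired per-event counts; the helper name and the toy inputs are hypothetical and are not taken from the simulated samples.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy per-event pairs: (identified track segments, true energetic charged particles).
n_tracks = [1, 2, 1, 3, 2, 4]
n_true = [3, 5, 6, 8, 4, 9]
print(f"r = {pearson(n_tracks, n_true):.2f}")
```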
The multiplicity of identified tracks for data taken with an energy of 25 GeV is shown in the upper panel of Figure 7. The comparison to simulations with different physics lists is shown in the lower panel as the residual r between data and simulations, given by $r = \frac{\mathrm{simulations} - \mathrm{data}}{\mathrm{data}}$. Here, the residual is calculated for distributions normalised to the overall integral to compare the shape of the distributions rather than the overall number of tracks. The grey band indicates the statistical uncertainty of the residual of data and simulations with the QGS_BIC physics list to give an indication of the size of the statistical uncertainties in the study. Since the number of simulated events is identical for all physics lists, the statistical errors of the other data points are omitted here and for the rest of this article for better legibility. In comparison to data, the four physics lists studied fall into two groups. While the QGSP_BERT and the FTFP_BERT lists reproduce the distribution of the track multiplicity very well, simulations with the LHEP and the QGS_BIC physics lists show significant discrepancies, with an overall shift towards a lower number of identified tracks.
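The shape comparison can be written down compactly: both histograms are normalised to their integral and the residual (simulations - data) / data is then formed bin by bin. The following sketch assumes plain lists of bin contents with identical binning; the function names and toy histograms are illustrative.

```python
def normalised(hist):
    """Scale bin contents so that they sum to one."""
    total = float(sum(hist))
    return [value / total for value in hist]

def shape_residual(sim_hist, data_hist):
    """Bin-wise (simulation - data) / data for unit-normalised histograms.

    Bins that are empty in data yield None instead of a division by zero.
    """
    sim = normalised(sim_hist)
    data = normalised(data_hist)
    return [(s - d) / d if d > 0 else None for s, d in zip(sim, data)]

data_counts = [10, 20, 10]
print(shape_residual([20, 40, 20], data_counts))   # same shape, twice the statistics
print(shape_residual([10, 10, 20], data_counts))   # different shape
```

Because of the normalisation, a simulation with the same shape but a different number of events gives a residual of zero in every bin.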
This comparison is further expanded by a study of the energy dependence of the mean track multiplicity per event, shown in Figure 8. Again, the statistical errors are omitted except for the residual of data and simulations with QGS_BIC. As for the differential distribution discussed above, the QGSP_BERT and the FTFP_BERT lists reproduce the energy dependence of the mean multiplicity quite well, with QGSP_BERT showing the best agreement with deviations below 5% at all energies. Simulations with QGS_BIC consistently underestimate the number of tracks by 15%, and LHEP produces 25% to 30% too few tracks. This shows that these two lists, in particular the rather old LHEP model, provide insufficient production of higher-energy secondary hadrons which propagate outside of the core of the shower. Since QGS_BIC uses the low-energy part of the LHEP model, LEP, for mesons in a wide energy range, its poorer performance compared to QGSP_BERT and FTFP_BERT is expected in light of the observed LHEP performance. Consistent behavior is observed when considering only secondary tracks, defined here as starting in calorimeter layer three or later. Figure 9 shows the mean multiplicity of identified secondary tracks as a function of energy. Compared to the full track sample shown in Figure 8 the mean is reduced, as expected from the exclusion of primary track segments. The comparison to simulations shows a slight increase of the discrepancy between data and the QGS_BIC and LHEP models with respect to the observations for the inclusive track sample, consistent with the interpretation of insufficient production of higher-energy secondary hadrons.

Track inclination
The distribution of 1 − cos θ, where θ is the angle of the track with respect to the beam axis, here for brevity referred to as track inclination, provides sensitivity to the contribution of higher-energy secondary particles emitted at large angles. The inclination of a track is calculated from cos θ, which is obtained by dividing the distance Δz in z between the positions of the centres of the first and the last tile in the track by their absolute distance r, $\cos\theta = \Delta z / r$. The algorithm is capable of identifying tracks with an inclination of 0 up to 1 − cos θ = 0.47 for the central detector region and up to 1 − cos θ = 0.69 for the outer detector region. The inclination distributions are normalised to unity to provide sensitivity to the shape of the distribution while ignoring the influence of differences in the overall number of identified tracks between data and simulations with different physics lists. The number of identified tracks falls approximately exponentially with higher inclination. An exception to this is the bin from 0 to 0.02, which contains an excess of tracks since it also includes the tracks of the primary beam particles. While all physics lists follow this trend, most tend to underestimate the importance of tracks with high inclinations emitted at large angles. Again the QGSP_BERT physics list provides a good description of the data, while the relative importance of large-angle tracks is underestimated by approximately a factor of two by the LHEP physics list. This is further illustrated by the energy dependence of the mean track inclination shown in Figure 11. The average track inclination increases with increasing energy, meaning that the contribution of large-angle tracks to the overall track sample increases. This is consistent with the observed increase in overall track multiplicity as discussed above, which is connected to an increase in the number of identified secondary tracks which are not necessarily aligned with the beam direction. This trend is reproduced by all physics lists. As expected from the observed differences in the shape of the track angle distribution, the quality of the description of the mean track inclination observed for data varies from list to list. In general, the differences are small due to the strong predominance of tracks at inclination values close to zero. Simulations with the QGSP_BERT physics list, which gives the best description, agree with the data at a level better than 5%, and in particular at low energies agree well with observations. On the other hand, LHEP, which gives the worst description, shows deviations of up to 20%.
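The inclination observable follows directly from the positions of the first and last tile of a track. The sketch below assumes tile centres given as (x, y, z) in millimetres with z along the beam; the function name is hypothetical, and the absolute value of the z distance is taken so that the ordering of the two tiles does not matter.

```python
import math

def track_inclination(first_tile, last_tile):
    """Return 1 - cos(theta) from the centres of the first and last tile of a track."""
    dx, dy, dz = (b - a for a, b in zip(first_tile, last_tile))
    r = math.sqrt(dx * dx + dy * dy + dz * dz)   # absolute distance between the tiles
    return 1.0 - abs(dz) / r

# A track along the beam axis and one displaced by 30 mm over 40 mm in z.
print(track_inclination((0.0, 0.0, 0.0), (0.0, 0.0, 158.0)))   # aligned with the beam
print(track_inclination((0.0, 0.0, 0.0), (30.0, 0.0, 40.0)))   # cos(theta) = 40/50
```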
Restricting the analysis to secondary tracks reduces the predominance of tracks along the beam axis, thus increasing the mean track inclination as shown in Figure 12. The secondary sample is nevertheless still dominated by tracks at small inclinations, since the production of highly energetic secondaries tends to be forward, resulting in a relatively small mean also for secondary tracks only. As in the case of the track multiplicity, the comparison to simulations gives results which are consistent with the observations for the inclusive track samples, with in general slightly smaller differences observed between data and the different physics lists.

Track length
The probability for a hadron to travel a given distance in matter without undergoing an inelastic hadronic interaction falls exponentially with the distance travelled. The negative inverse of the slope of this exponential distribution is given by the nuclear interaction length λ_I. The length distribution of the track segments identified by the tracking algorithm is expected to follow a similar distribution. In practice, however, the tracks are typically shorter since the tracking algorithm is not capable of identifying tracks in regions of dense shower activity and thus at the creation point of many secondary tracks. The finite single-hit efficiency can lead to a reduction of the overall length of an identified track, and the presence of noise and other energy deposits can result in an early termination or the splitting of a track. In addition to a sensitivity to the quality of the description of higher-energy cross sections in the physics lists, the comparison of the track length observed in data with that in simulations provides information on the quality of the overall description of detector effects in the simulations.
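The relation between the exponential distribution and the interaction length can be made explicit. The following lines are a standard textbook relation written in the notation of this section, with x the distance travelled; they are added as a reminder and are not taken from the analysis itself.

```latex
% Probability to travel a distance x without an inelastic interaction
P_{\mathrm{surv}}(x) = e^{-x/\lambda_I}
% Resulting distribution of the distance to the first inelastic interaction
\frac{\mathrm{d}P}{\mathrm{d}x} = \frac{1}{\lambda_I}\, e^{-x/\lambda_I}
% On a logarithmic scale the slope is -1/\lambda_I
\frac{\mathrm{d}}{\mathrm{d}x} \ln\!\left(\frac{\mathrm{d}P}{\mathrm{d}x}\right) = -\frac{1}{\lambda_I}
```

The negative inverse of the logarithmic slope therefore returns λ_I, which is the convention used for the track length parameter below.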
Figure 13 shows the distribution of the length of identified tracks for a beam energy of 25 GeV together with a comparison to simulations. The distributions are normalised to show differences in the shape of the distribution rather than in the overall number of identified tracks. All physics lists reproduce the general shape well, with larger deviations around a length of 80 cm and around 95 cm. These are due to reduced numbers of tracks found in data with a shower start in two layers in the front part of the calorimeter, which results in fewer tracks with a length given by the distance between these layers and the last calorimeter layer, as well as due to reduced tracking efficiency in data compared to simulations in the transition region from the fine to the coarse inner region of the detector layers. This points to an incomplete modelling of dead or noisy cells in these layers in simulations, which however does not have a significant impact on the overall results since the absolute number of affected tracks is very small.
To study trends with beam energy, the inverse slope parameter λ_track is determined with an exponential fit in the range from 30 cm to 70 cm, excluding problematic regions. Due to the reasons outlined above, this parameter is not identical to the nuclear interaction length of the calorimeter, but is related to it. Figure 14 shows the negative inverse slope parameter λ_track as a function of energy for data and simulations. It decreases with increasing energy, approximately following a 1/E behavior. This reduction of the typical track length with increasing energy can be understood by the increase in shower activity, which increases the probability that portions of tracks are not found due to other energy deposits in the vicinity, and by the increase of the contribution of secondary tracks which originate in regions of high energy density. All physics lists studied reproduce the general trend with energy, but consistently overestimate λ_track and with that the track length.
While QGSP_BERT is closest to data with a typical deviation of 5%, LHEP overestimates the slope parameter by 10%. The observed differences point to differences in shower shape and an underestimate of spurious hits which can lead to a truncation of tracks, but can also partially originate from the treatment of noise hits in simulations as discussed below.
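One simple way to extract such an inverse slope parameter from a binned length distribution is a straight-line fit to the logarithm of the bin contents within the fit range. The sketch below uses an unweighted least-squares fit, which is an assumption: the text does not specify the fit method, and the function name and toy histogram are hypothetical.

```python
import math

def fit_lambda_track(bin_centres_cm, counts, lo_cm=30.0, hi_cm=70.0):
    """Negative inverse slope of ln(counts) versus length within [lo_cm, hi_cm]."""
    points = [(x, math.log(n)) for x, n in zip(bin_centres_cm, counts)
              if lo_cm <= x <= hi_cm and n > 0]
    n_points = len(points)
    mean_x = sum(x for x, _ in points) / n_points
    mean_y = sum(y for _, y in points) / n_points
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return -1.0 / slope

# Toy distribution: a pure exponential with an inverse slope of 25 cm.
centres = [float(c) for c in range(20, 100, 4)]
toy_counts = [1000.0 * math.exp(-c / 25.0) for c in centres]
print(f"lambda_track = {fit_lambda_track(centres, toy_counts):.1f} cm")
```

A fit that weights each bin by its statistical uncertainty would be the more careful choice for real histograms; the unweighted version is kept here for brevity.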
The negative inverse slope parameter for secondary tracks, shown in Figure 15, is lower than the one observed for primary tracks, as expected from the bigger impact of the regions with higher shower density in the vicinity of inelastic interactions. The general trend with energy is identical to the one observed for the inclusive sample shown in Figure 14. As for the other observables, the comparison to simulations for secondary tracks only yields results which are consistent with the observations for all tracks. In general the differences between data and simulations are somewhat smaller for the secondary tracks, with the largest deviations observed for LHEP, which overestimates the slope parameter by approximately 8%.

Systematic Uncertainties
Systematic uncertainties on the track finding performance could potentially affect the conclusions drawn from the comparison of data and simulations. The sizes of possible effects are studied here using simulations with the QGSP_BERT physics list by comparing the results obtained with the standard simulations with those with modified parameters. The identification of track segments is a rather robust technique since it is purely based on the identification of hits irrespective of their precise energy content, and as such does not impose strict requirements on the control of the energy calibration of the detector. Nevertheless, there are two potential sources of systematic effects which are studied in the following, namely the MIP energy scale given by the precision of the single cell calibration and the modelling of the detector noise.

MIP energy scale
Each cell of the AHCAL is calibrated with muons recorded in dedicated calibration runs. The most probable value of the response to the penetrating MIPs is used as the cell-to-cell calibration scale and as the reference energy scale in the tracking algorithm, as discussed in Section 3. The exact calibration procedure is described in [4]. The uncertainty of these calibration factors is approximately 2%, originating from statistical and fit systematic uncertainties [11].
In the present analysis, the cell energy information is only used for noise rejection by requiring a minimum energy of 0.5 MIP.The uncertainty of the calibration can therefore have an influence on the single cell efficiency and thus on the track finding performance.The influence of an uncertainty  of the MIP energy scale is studied by altering the threshold, which provides an upper limit on possible effects since this corresponds to a correlated shift of the calibration of all detector cells.In addition, rather large shifts of the threshold of ±4% and ±8%, corresponding to values of 0.46, 0.48, 0.52 and 0.54 MIPs respectively, are studied.Even these very conservative values have little influence on all observables studied in the present article.With threshold shifts of ±8% the changes are comparable to or slightly in excess of the statistical error, while smaller shifts result in variations significantly below the statistical uncertainties.Overall, systematic effects originating from the MIP energy scale uncertainty are negligible, with respect to the data-simulation comparisons.
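The threshold variation described above amounts to simple arithmetic on the nominal 0.5 MIP cut. The following sketch reproduces the quoted threshold values and applies them to a toy list of cell energies. The helper names and the toy energies are hypothetical and serve only as an illustration, not as the analysis code.

```python
NOMINAL_THRESHOLD_MIP = 0.5  # noise-rejection cut quoted in the text

def shifted_threshold(relative_shift):
    """Threshold after a correlated relative shift of the MIP scale."""
    return NOMINAL_THRESHOLD_MIP * (1.0 + relative_shift)

def select_hits(cell_energies_mip, threshold_mip):
    """Keep cells whose energy is at or above the threshold."""
    return [e for e in cell_energies_mip if e >= threshold_mip]

shifts = [-0.08, -0.04, 0.04, 0.08]
thresholds = [round(shifted_threshold(s), 2) for s in shifts]

# Toy cell energies in MIP (assumed values, clustered around the cut)
toy_energies = [0.45, 0.47, 0.49, 0.51, 0.53, 0.55, 1.0, 1.2]
kept = [len(select_hits(toy_energies, t)) for t in thresholds]

for s, t, k in zip(shifts, thresholds, kept):
    print(f"shift {s:+.0%}: threshold {t:.2f} MIP, {k} hits kept")
```

Only hits with energies close to the cut change their status, which is consistent with the small effect on the track observables reported here.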

Detector Noise
The detector noise has a significant impact on the performance of the tracking algorithm, since only isolated hits are considered, irrespective of their energy content. On the one hand, noise hits, which are often isolated since they occur at random locations inside the detector, can be added to real tracks and thereby increase the track length. On the other hand, additional hits also reduce the number of real isolated hits which are used in the tracking, and thus reduce the length of identified tracks. In practice, this second effect far outweighs the first. The influence of noise on the track segments found is studied by comparing results from full simulations with simulations where no detector noise from data has been overlaid. While no significant effect on the track multiplicity and on the angular distribution is observed, the inverse slope parameter λ track of the track length distribution decreases by 5% to 10% once detector noise is included, as shown in Figure 16. The influence of noise on the track length is comparable to the difference observed between data and simulations. Since simulations use detector noise taken directly from the test beam data they are compared to, the uncertainties on the noise level in the simulations are significantly smaller than the noise level itself, and cannot alone explain the discrepancy between data and simulations.
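To connect the quoted 5% to 10% decrease with a residual of the form (without noise)/(with noise) − 1, the following sketch evaluates that ratio for the two limiting cases. It also includes a standard error propagation for a ratio of two uncorrelated quantities, which is an assumption made for illustration and not necessarily the treatment used for the figures. All function names and input numbers are hypothetical.

```python
import math

def residual(numerator, denominator):
    """Normalised residual: numerator / denominator - 1."""
    return numerator / denominator - 1.0

def residual_error(num, num_err, den, den_err):
    """Uncertainty of the residual, assuming uncorrelated inputs."""
    ratio = num / den
    return abs(ratio) * math.sqrt((num_err / num) ** 2 + (den_err / den) ** 2)

# Slope parameter without noise normalised to 1; noise lowers it by 5% or 10%
for decrease in (0.05, 0.10):
    with_noise = 1.0 - decrease
    r = residual(1.0, with_noise)
    print(f"decrease of {decrease:.0%} -> residual of {r:.1%}")

# Toy uncertainties (assumed): 1% on each slope parameter
print(f"residual error: {residual_error(1.0, 0.01, 0.95, 0.0095):.4f}")
```

A decrease of 5% to 10% in the slope parameter thus corresponds to a residual of roughly 5.3% to 11.1% in this convention.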

Track segments as a calibration tool
In addition to probing the three-dimensional structure of hadronic cascades, minimum-ionising track segments identified in hadronic showers can also serve as calibration tools to monitor the single cell response of a highly granular calorimeter in situ, without dedicated calibration runs. Figure 17 (left) shows the data distribution of the single cell amplitude in identified track segments in 80 GeV pion showers. To avoid corrections for the track length in the cell due to larger angles with respect to the beam axis, only tracks at small angles, satisfying cos θ > 0.98, are considered. This sample contains tracks from particles before (i.e. primary tracks) and after (i.e. secondary tracks) the first hard interaction, but is dominated by the former. The distribution is fit with a convolution of a Landau function with a Gaussian, where the former accounts for the energy loss of minimum-ionising particles in the scintillator tiles and the latter for effects from photon statistics, SiPM response and contributions from the detector front-end electronics. The most probable value of the distribution is consistent with 1 MIP, the calibration factor set by calibration runs with muons. This good agreement demonstrates the performance of the corrections applied to account for the dependence of the detector response on temperature and operational parameters, and shows that the identified track segments are indeed minimum-ionising particles. The cell-to-cell variations of the distribution are displayed in Figure 17 (right), which shows the distribution of the measured most probable value for all cells with at least 100 entries in the studied 80 GeV data set and an error on the most probable value below 0.3 MIP. The width of the distribution is dominated by the fit uncertainty of the most probable value and not by cell-to-cell response variations. The distribution of fit errors has a mean value of 4.2% and extends substantially beyond 10% for some of the cells with low statistics. Overall, these results demonstrate that track segments identified within hadronic showers can be used to calibrate and monitor highly granular calorimeter systems at future colliders. To increase the statistics for such a monitoring algorithm, non-isolated hits could also be considered, identified as belonging to a track by interpolating between already identified track segments.
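For orientation, the fit function and the angular cut used above can be written out explicitly. The symbols below (E_MPV and ξ for the Landau position and width, σ for the Gaussian width, N for the normalisation, d for the tile thickness) are notation assumed here for illustration and do not appear in the text. The path length expression assumes tiles oriented perpendicular to the beam axis.

```latex
% Landau convolved with a Gaussian (assumed notation)
f(E) = N \int_{-\infty}^{\infty}
       L\!\left(E';\, E_{\mathrm{MPV}},\, \xi\right)\,
       \frac{1}{\sqrt{2\pi}\,\sigma}
       \exp\!\left(-\frac{(E - E')^{2}}{2\sigma^{2}}\right)\, \mathrm{d}E'

% Path length in a tile of thickness d for a track at angle \theta
\ell(\theta) = \frac{d}{\cos\theta}
\quad\Rightarrow\quad
\frac{\ell(\theta)}{d} - 1 < \frac{1}{0.98} - 1 \approx 2\%
\quad \text{for } \cos\theta > 0.98
```

Under these assumptions the requirement cos θ > 0.98 limits the increase of the path length in a tile, and hence of the expected energy deposit, to about 2%.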

Summary
We have studied the spatial structure of hadronic showers in a highly granular scintillator-steel sampling calorimeter by identifying track segments of minimum-ionising particles within showers induced by negative pions with energies from 10 GeV to 80 GeV. The tracks are identified with a nearest neighbour algorithm considering isolated hits, providing sensitivity to the outer regions of hadronic showers. This analysis demonstrates the imaging capabilities of the CALICE AHCAL, which are also crucial for the performance of particle flow algorithms. The single-cell energies of hits on identified tracks show the distribution expected for minimum-ionising particles, making such tracks within hadronic showers suitable for an in-situ cell-to-cell calibration of highly granular calorimeters at future colliders. In GEANT4 simulations the reconstructed track multiplicity is correlated with the total number of charged higher-energetic particles (excluding electrons and positrons), providing the basis for tests of GEANT4 hadronic shower models with observables based on the reconstructed tracks. Four physics lists of GEANT4 9.4p02 have been investigated. The observations for all identified track segments and those for secondary tracks only are consistent with each other. QGSP_BERT generally provides the best agreement with data, with a good reproduction of the track multiplicity, while LHEP shows the largest discrepancies, producing too few secondary tracks overall and strongly underestimating the number of tracks at large angles.

Figure 1: The cell structure of the AHCAL active layers showing the different regions of granularity with cells of size 30 × 30 mm², 60 × 60 mm² and 120 × 120 mm².

Figure 3: Flowchart of the tracking algorithm. The recursive function findTrack(seedHit) is marked by the gray box. The control flow is indicated by the black arrows, with the exception of the recursion flow, which is indicated by the purple arrows. The blue dashed arrows show the direction of data flow for clarity.

Figure 5: The energy in the ECAL versus the energy in the AHCAL for a 25 GeV hadron run. Regions with electron and muon content as well as events with showers starting in the ECAL are marked in blue, while the event selection cuts are indicated in red.

Figure 7: Distribution of track multiplicity for 25 GeV pion showers. The upper panel shows the normalised distribution for test beam data, while the lower panel shows the normalised residuals (simulation/data − 1) between test beam data and the different physics lists. The grey area indicates the statistical error of the residual between test beam data and QGS_BIC.

Figure 8: Mean track multiplicity as a function of energy. The upper panel shows data while the lower one shows the normalised residuals (simulation/data − 1) between test beam data and the different physics lists. The grey area indicates the statistical error of the residual of test beam data and QGS_BIC. Systematic errors are below the level of statistical errors, as discussed in Section 4.4, and are not shown.

Figure 9: Mean secondary track multiplicity (tracks starting in layer three or later) as a function of energy. The upper panel shows data while the lower one shows the normalised residuals (simulation/data − 1) between test beam data and the different physics lists. The grey area indicates the statistical error of the residual of test beam data and QGS_BIC. Systematic errors are below the level of statistical errors, as discussed in Section 4.4, and are not shown.

Figure 10: Normalised distribution of inclination of identified tracks for 25 GeV pion showers. The upper panel shows the distribution for test beam data normalised to an integral of 1, while the lower panel shows the normalised residuals (simulation/data − 1) of data and the different physics lists. The grey area indicates the statistical uncertainty of the residual of data and the QGS_BIC physics list. Systematic errors are below the level of statistical errors, as discussed in Section 4.4, and are not shown.

Figure 11: Mean track inclination as a function of energy. The upper panel shows the mean track inclination (1 − cos θ) for all physics lists and test beam data. The lower panel shows the normalised residuals (simulation/data − 1) of data and simulations, with the grey area showing the statistical uncertainty of the residual of data and QGS_BIC. Systematic errors are below the level of statistical errors, as discussed in Section 4.4, and are not shown.

Figure 12: Mean secondary track inclination (tracks starting in layer three or later) as a function of energy. The upper panel shows the mean track inclination (1 − cos θ) for all physics lists and test beam data. The lower panel shows the normalised residuals (simulation/data − 1) of data and simulations, with the grey area showing the statistical uncertainty of the residual of data and QGS_BIC. Systematic errors are below the level of statistical errors, as discussed in Section 4.4, and are not shown.

Figure 13: Normalised distribution of the length of identified tracks for a beam energy of 25 GeV. The upper panel shows the distribution for data, while the lower panel shows the normalised residuals (simulation/data − 1) of data and simulations. The grey area indicates the statistical uncertainty of the residual of QGS_BIC compared to data.

Figure 14: Negative inverse slope parameter of the exponential fit to the track length distribution as a function of beam energy. The upper panel shows data and simulations. The lower panel shows the residuals (simulation/data − 1) of test beam data and simulations with the different physics lists, with the grey band indicating the statistical uncertainties of the residuals of data and QGS_BIC.

Figure 15: Negative inverse slope parameter of the exponential fit to the track length distribution as a function of beam energy for secondary tracks (starting in layer three or later). The upper panel shows data and simulations. The lower panel shows the residuals (simulation/data − 1) of test beam data and simulations with the different physics lists, with the grey band indicating the statistical uncertainties of the residuals of data and QGS_BIC.

Figure 16: Comparison of the inverse slope parameter of the track length distribution as a function of energy for simulations with and without the inclusion of detector noise. The residual is defined here as (QGSP_BERT w/o noise)/(QGSP_BERT with noise) − 1, and its statistical uncertainty is indicated by the grey band.