Statistical Properties of Radio Halos in Galaxy Clusters and Origin of Seed Electrons for Reacceleration

One of the most promising mechanisms for producing radio halos (RHs) in galaxy clusters is the reacceleration of cosmic-ray electrons by turbulences. However, the origin of the seed electrons for the reacceleration is still poorly constrained. In the secondary scenario, most of the seed electrons are injected via collision of proton cosmic-rays, while non-thermal electrons are directly injected in the primary scenario. In this paper, we examine the two scenarios for the seed electrons with the observed statistical properties of RHs, combining two methods: following the temporal evolutions of the electron energy and radial distributions in a cluster, and the merger history of clusters. We find that the RH lifetime largely depends on the seed origin, as it could be longer than the cosmological timescale in the secondary scenario. We study the condition for the onset of RHs with the observed RH fraction and the RH lifetime we obtained, and find that long-lived RHs in the secondary scenario should be originated from major mergers with a mass ratio of $\xi\sim0.1$, while the short lifetime in the primary scenario requires more frequent onsets by minor mergers with $\xi\sim0.01$. Our simple model of the turbulence acceleration can reproduce the observed radio luminosity-mass relation. The RH luminosity functions we obtained suggest that the expected RH number count with the ASKAP survey will detect $\approx10^3$ RHs in both the scenarios.


INTRODUCTION
An increasing number of galaxy clusters are found with diffuse cluster-scale synchrotron emission in the radio band. Giant Radio halos (RHs) are the Mpc-sized emission often found in the central region of massive clusters (see van Weeren et al. 2019, for a review). The progress in the RH observation has been achieved with the NRAO VLA Sky Survey (NVSS, Giovannini et al. 1999), the Giant Meterwave Radio Telescope (GMRT, Venturi et al. 2007) and its extension (Kale et al. 2013). A novel progress in this field is now ongoing with current generation of high-sensitivity radio telescopes, such as LOw Frequency ARay (LOFAR, van Haarlem et al. 2013) and the the pathfinders for Square Kilometer Array (SKA, Dewdney et al. 2009).
A variety of statistical properties of RHs have been reported. For example, the occurrence of RHs correlates with the dynamical disturbance of clusters observed in X-ray (e.g., Schuecker et al. 2001;Cassano et al. , 2013Cuciti et al. 2021a), and the radio luminosity is known to correlate with the X-ray luminosity or the mass measured with the Sunyaev-Zel'dovich (SZ) effect (e.g., Cassano et al. 2013). In addition, those correlations show bimodalities, as the clusters without RHs fall well below the relations (Brunetti et al. 2007a). The occurrence of RHs seems to increase with the mass of the cluster (Cuciti et al. 2015(Cuciti et al. , 2021a. Those facts indicate a firm connection between the formation process of galaxy clusters and the non-thermal components in the intra-cluster medium (ICM).
During the structure formation of galaxy clusters, a part of the gravitational potential energy would be dissipated into the acceleration of relativistic particles by the merger. In the so-called turbulent reacceleration model, RHs originate from the reacceleration of seed cosmicray electrons (CREs) by the merger-induced turbulence (e.g., Brunetti et al. 2001;Petrosian 2001;Fujita et al. 2003). As the turbulence could permeates the most of the cluster volume, this model can explain the large extension of the radio emission. This model is supported by the observational signature of the turbulence acceler-ation such as the steep spectral index (F ν ∝ ν −αsyn with α syn ≈ 1.5) or the break feature in the radio spectrum (e.g., Brunetti et al. 2008;Macario et al. 2010;Wilber et al. 2018;Di Gennaro et al. 2021). However, the origin of the seed electrons for the reacceleration has not been revealed yet (e.g., Brunetti & Jones 2014).
There are two possible origins of the seed CREs. One is the primary scenario, where CREs are directly injected by accelerators like large-scale accretion shocks, or escaped from radio jets launched from the central active galactic nuclei (AGNs) (e.g., Kang et al. 2012;Völk & Atoyan 1999;Gitti et al. 2002;Heinz et al. 2006). The other is the secondary scenario, where the primary CRE injection is dominated by the secondary injection from the pp collisions between cosmic-ray protons (CRPs) and thermal protons in the ICM (e.g., Brunetti & Blasi 2005;Brunetti & Lazarian 2011;Pinzke et al. 2017). Recently, by modeling the well-studied RH in the Coma cluster, we have found that the non-thermal properties, such as the CR energy density and the location of the primary CR accelerators, could be significantly different between those two scenarios, which could be a clue to reveal the seed origin (Nishiwaki et al. 2021). In this study, we further discuss possible differences appearing in the statistics of RHs.
The luminosity function of RHs has been calculated with the scaling relation between the radio power and the mass or X-ray luminosity (e.g., Enßlin & Röttgering 2002). Several theoretical attempts have been made to investigate the occurrence of RHs. In the context of the reacceleration scenario, Cassano & Brunetti (2005) constructed a self-consistent statistical model for the turbulent injection and the RH generation during the cluster evolution. Their model is constrained by the observed occurrence of 30% around 1.4 GHz, and predicts a much smaller RH fraction (≤ 5%) in less massive clusters. The increased occurrence at lower frequencies (≈ 100 MHz) is predicted in . The bimodality in the RH luminosity distribution may be explained with the rapid CR streaming or diffusion rather than the reacceleration. Cassano et al. (2016) pointed out that the lifetime of RHs and the critical mass ratio for the reacceleration onset can be constrained from the observed fraction of radio-loud clusters with the merger rate of the dark matter halo obtained in N-body simulations. Zandanel et al. (2014) and Cassano et al. (2012) proposed the combined scenario, where the emission in the core region is produced by the secondary CREs, while the outer parts are dominated by primary CREs.
The main objective of this paper is to investigate how the statistical properties found in the RH observations provide constrains on those different scenarios for the seed population. In this paper, we combine following two methods; the time-dependent calculation of the CR spectral evolution solving the Fokker-Planck equation, and the Monte Carlo (MC) procedure to simulate the macroscopic cluster evolution through mergers and the mass accretion. In the former calculation, we consider the reacceleration of CRs and compare the two scenarios, secondary or primary, of the CRE injection (Section 2). Those models are constrained by the observed spectrum and spatial profile of the Coma RH (Section 2.5). With this method, we can estimate how different the lifetime of the radio emission is between the secondary and primary scenarios (Section 3).
In the latter part, we construct the MC merger tree from the halo merger rate based on N-body simulations (Section 4). Considering the lifetime obtained from the former part, we investigate consistent conditions for the onset of RHs with the observed statistics for both the two scenarios (Section 5). Our attempt requires that the peak radio luminosity after a merger is related to the cluster mass scale. The required correlation is partially justified in Section 7, where we again solve the evolution of CRs in RHs with various masses, assuming a scaling between the acceleration efficiency and the cluster mass. Finally, the detectability of low-mass RHs with upcoming radio observations is discussed in Section 9.

TIME EVOLUTION OF COSMIC-RAY DISTRIBUTION
First, we study the emission lifetime of RHs in the turbulent reacceleration model. For this purpose, we integrate the Fokker-Planck (FP) equation for the CR distribution function and follow the time evolution of the synchrotron spectrum. We mainly discuss the comparison of the two scenarios for the origin of seed CREs, i.e., the primary scenario and the secondary scenario. Those scenarios are distinguished by the injection rate of primary cosmic-ray protons (CRPs) relative to that of CREs. Our model here is basically the same as Nishiwaki et al. (2021), which includes the reacceleration of CRs. We adopt the observed property of the Coma RH, for which we have well-examined data sets, to constrain the model parameters.
Our model is one-dimensional in space, which enable us to follow the evolution of brightness profile in addition to the spectral evolution. In our previous study, the large extension of the RH was explained with the radial dependence in the CR injection with a constant reacceleration efficiency in space. In this paper, we investigate another possibility: the radial dependence in the reacceleration efficiency for a fixed injection profile. While both the models well reproduce the Coma RH, the spectral evolution or the lifetime is not largely affected by the injection or reacceleration efficiency profile. Our model includes relatively less parameters and it is easy to apply to other RHs with different masses (Section 7).

The Fokker-Planck equation
We solve the isotropic one-dimensional Fokker-Planck (FP) equation to follow the time evolution of the distribution function in the phase space, N s (r, p, t) = (4πr 2 )(4πp 2 )f s (r, p, t), where f s (r, p, t) is the number density of particle species s of momentum p at radial position r, and time t in the phase space.
For CRPs, the FP equation can be written as (e.g., Brunetti & Blasi 2005;Fujita et al. 2003;Pinzke et al. 2017;Nishiwaki et al. 2021) where N tot p (r, p, t) is the distribution function of CRPs, b (p) C is the momentum loss rate (b ≡ −dp/dt) due to the Coulomb collisions, D pp is the momentum diffusion coefficient due to interactions with turbulence, and Q p (r, p) denotes the injection of primary CRPs. The factor τ pp denotes the pp collision timescale. For simplicity, we ignore the effect of repeated collisions of a CRP, so we do not follow an energy loss per collision. The cooling due to the pp collision is expressed like escape as −N p /τ pp , where the loss timescale includes the inelasticity coefficient of κ pp = 0.5. The details of τ pp and b (p) C are shown in Nishiwaki et al. (2021).
Throughout this paper, we focus on the synchrotron emission only below ∼GHz and the gamma-ray emission below ∼ 100GeV. The spatial diffusion of low-energy particles responsible for such emission is negligible, so we omit the spatial diffusion terms, D rr .
For CREs, the FP equation is expressed as where b (e) C is the energy loss rate due to CRE-e collisions, b rad denotes the radiative cooling, and Q e (r, p; N p ) is the source term, which includes both the secondary and primary injections: Q e (r, p; N p ) = Q sec e (r, p, ; N p ) + Q pri e (r, p). We use the same formulae for the momentum loss term b, b (e) C and b as those given in Nishiwaki et al. (2021). We do not solve the evolution of high-energy CREs of p > 10 7 m e c to save the computation time.
The radial profile of the thermal electron density is expressed with the beta-model profile as The parameters for the Coma cluster are (Briel et al. 1992), n th (0) = 3.42×10 −3 cm −3 , β = 0.75, and r c = 290 kpc. We also use the same temperature profile as that in Nishiwaki et al. (2021). For simplicity, the ICM is a pure hydrogen plasma, and we use Eq. (3) for the density of thermal protons in our model. Relativistic electrons can be produced through the decay of pions produced by the inelastic collisions between CRPs and thermal protons (p+p → π 0,± +X). We adopt the same method as that in Nishiwaki et al. (2021) to calculate the secondary injection from pp collisions.

Injection of primary cosmic-rays
Primary CRPs can be accelerated through the shocks caused by the structure formation process or injected in the ICM from the internal sources like AGNs. We assume a single power-law spectrum with an exponential cutoff for the injection spectrum of primary CRs; where C inj p is the normalization factor, and Q(r) represents the radial dependence of the injection 1 . Those two quantities can be constrained by the observed flux and the brightness profile of the RH, respectively. The maximum energy of primary CRPs is taken to be E max p = 100 PeV. The minimum momentum of CRPs is taken to be ten times larger than that of the thermal protons. Following Brunetti et al. (2017), we adopt the injection index of α inj = 2.45.
We also take into account the injection of primary CREs. The number ratio of primary CREs to primary CRPs at the injection is expressed with a parameter f ep ; Eq. (4), which is defined only above the minimum momentum of CRPs, p = 30 MeV/c, is extrapolated to the minimum momentum of CREs, p e = 0.3m e c = 150 keV/c, where the strong Coulomb cooling inhibits the reacceleration.
As we have mentioned, we compare the two scenarios: the secondary scenario (f ep = 0), where all of CREs are produced through the pp collision, and the primary scenario (f ep = 0.01), where the primary CRE injection dominates the injection from the hadronic interaction. The adopted ratio of f ep = 0.01 is close to the observed CRE to CRP ratio in our galaxy (e.g., Schlickeiser 2002). We neglect the possible radial dependence of f ep (e.g., Pfrommer et al. 2008) for simplicity. In Sect. 3, we show that the expected lifetime of the radio emission is significantly different between these two scenarios.
The extended profile of RHs indicates a fair amount of CREs outside the thermal core of the cluster. This challenges the secondary scenario, since the injection of secondary CREs is more effective at smaller radii where the timescale of the pp interaction is shorter. Nishiwaki et al. (2021) tested the case where the injection profile K(r) has a peak at r ≈ 1 Mpc to reproduce the surface brightness profile of the Coma RH. However, even though CRPs are injected outside the core, high-energy CRPs selectively diffuse into the core, where the radio emissivity is higher due to the stronger magnetic field. As a result, the spectral hardening is caused by this radial diffusion of parental CRPs, and this makes it difficult to fit the observed spectrum. This difficulty is expected to be relaxed if the reacceleration is more efficient at larger radii.
In this paper, we assume that CRs in the ICM are mainly injected from internal sources, such as AGN activities and other outflows from member galaxies. We adopt the injection profile of primary CRs proportional to the thermal density profile (Eq. (3)): where we use the same r c and β as Eq.
(3). We fix this injection for both the primary and secondary scenarios. As will be explained in Sect. 2.3, the observed brightness profile of the RH can be modeled by tuning the radial profile of D pp .

Turbulent reacceleration
Recent studies show that the observed ∼Mpc extension of the Coma RH with the cored profile of the magnetic field (Bonafede et al. 2010) requires the CR profile flatter than the ICM profile (e.g., Brunetti et al. 2017;Pinzke et al. 2017). In this study, we consider that the extended profile of radio-emitting CREs is reproduced by the radial dependence of the reacceleration coefficient, D pp , rather than that of the CR injection. This assumption is supported by the turbulent profile often seen in the numerical simulations of the cluster formation (e.g., Nelson et al. 2014;Vazza et al. 2017;Angelinelli et al. 2020). In the following, we discuss how the profile of D pp is related to the profile of the turbulence, using the expression derived from the quasi-linear theory. As a result, D pp (r, p) will be expressed with two parameters; one is for the radial dependence, and the other is for the normalization.
We assume that the (re)acceleration of CRs occurs through the transit-time damping (TTD) with the compressible turbulence in the ICM (e.g., Brunetti & Lazarian 2007;Teraki & Asano 2019). In this case, the momentum diffusion coefficient becomes hard-sphere type (D pp ∝ p 2 ), which implies that the acceleration timescale is independent of the particle momentum t acc = p 2 /(4D pp ) ∝ p 0 . We focus on only the TTD acceleration in this paper, though other possibilities for the reacceleration mechanism have been proposed, such as the reacceleration by the incompressible turbulence(e.g., Brunetti & Lazarian 2016) or Alfvén waves(e.g., Fujita et al. 2003).
We assume that the fast mode turbulence in the ICM is well described with the Iroshnikov-Kraichnan (IK) scaling (e.g., Brunetti & Lazarian 2011). Isotropic cascades of fast mode turbulences with the IK spectrum are seen in numerical simulations (e.g., Cho & Lazarian 2003).
Using the turbulent energy spectrum W(k) ∝ k −3/2 , D pp can be written as (e.g., Miniati 2015) where k L corresponds to the injection scale, while k cut represents the cut-off scale (see Appendix B).
Here, we have assumed that the turbulence is isotropic with respect to the background magnetic field. The average over the angle θ between the direction of the wave and the background magnetic field is represented by a dimensionless function, ≈ 5, where x = v ph /c and v ph is the phase velocity of the wave. Because the TTD is the interaction with the fast mode waves and the ICM is a high-beta plasma (β pl 100 for ∼ 1µG magnetic field), we adopt the sound velocity c s for v ph .
The diffusion coefficient D pp could depend on the radial coordinate r through the profile of the turbulent energy density. Instead of specifying the parameters such as k L and the normalization of W(k), we use another parameter, t acc (r c ), that represents the acceler-ation timescale at r = r c , and discuss constraints on this parameter from the observed spectrum. Introducing cut-off factors for both the maximum and the minimum energies (Nishiwaki et al. 2021), the diffusion coefficient may be written as where D(r) is a dimensionless factor representing the radial dependence, and m s is the mass of particle species s. We adopt the Hillas limit (Hillas 1984) for the maximum energy of CRs, E max c (r) = qB(r)l F c ∼ 9 × 10 19 (B(r)/1 µG)(l F c /0.1Mpc) eV, where l F c is the maximum size of the turbulent eddy of compressible turbulence. We assume l F c = 0.1 Mpc as a reference, which is compatible with the constraint from the thermal Sunyaev-Zel'dovich effect (SZ) observation of the Coma cluster (Churazov et al. 2012).
Following Pinzke et al. (2017), the turbulent energy density, ε turb ≈ ρV 2 L , is assumed to scale with the thermal energy density ε th as where R 500 is a parameter R ∆ with ∆ = 500. The radius R ∆ is defined as the radius inside which the average of the total density becomes ∆ times the critical density of the Universe, ρ cr : We adopt R 500 = 47 arcmin for the Coma cluster (Planck Collaboration et al. 2013), which corresponds to R 500 = 1.35 Mpc under the cosmology we adopted. The need for the cut-off in Eq. (9) is discussed in Section 2.5.
From Eq. (7) and the Mach number for the turbulent velocity at the injection scale r) , where I L is the volumetric injection rate of turbulence energy. For simplicity, the injection scale k L is assumed to be a constant. With those assumptions concerning ε turb and k L , the radial dependence factor D(r) in Eq. (8) can be expressed with one parameter, α turb : where the normalization is given at r = r c , i.e., Using Eq. (7) and neglecting the cut-off terms in Eq. (8), t acc (r c ) can be obtained as a function of the ratio of the turbulent energy density to the thermal energy density: where x = c s (r)/c.

Model outline
In our modeling, the history of the RH evolution is divided into following three phases: • Injection phase. In this phase, we follow the longterm evolution of the CR distribution before the major merger event. We integrate the FP equations (Eqs. (1) and (2)) with D pp ≡ 0 for 4 Gyr until the the CRE spectrum settles in a steady shape due to the balance between the constant injection and the cooling processes.
• Reacceleration phase. This phase starts just after a major merger event. The merger-induced turbulence reaccelerates CRs. The initial condition of this phase is the final state of the injection phase. We follow the evolution including the reacceleration expressed with the diffusion coefficient of Eq. (8). The Coma RH is assumed to be in this phase and the elapsed time in this phase, t R , is treated as a model parameter. We do not consider the time evolution of the turbulence, so the parameters, α turb and t acc are not time dependent.
• Cooling phase. After the reacceleration ceases, the emission decays with time. This phase is modeled with D pp = 0, and the emission lifetime will be investigated in Section 3.
The CR injection spectral index is fixed to be α inj = 2.45, which can model the typical radio spectral index of RHs, α syn ≈ 1.2. Concerning the CR injection, we have one free parameter, L inj p , which is the injection luminosity (in unit of [erg/s]) of relativistic (p > m p c) protons integrated over the cluster volume, i.e., L inj p = dr dpE p (p)Q p (r, p), where E p (p) = m 2 p c 4 + p 2 c 2 . The injection rate of primary CREs is determined by Eq (5).
For a given magnetic field, our model includes four parameters: the CR injection luminosity, L inj p , the powerlaw index for the radial dependence of D pp , α turb , the timescale of the reacceleration at a given radius, t acc (r c ), and the elapsed time after the reacceleration onset, t R .

Modeling the Coma RH
In this section, we apply the model explained above to the Coma RH and discuss the constraints on the model parameters. The radial dependence of D pp is constrained by the observed brightness profile, while t R , t acc , and L inj p are by the spectral shape and the flux at a given frequency. With those parameters, we study the emission lifetime in Section 3.
The magnetic field in the Coma cluster is well studied with the rotation measure (RM) measurements. Here we use the following scaling of the magnetic field strength with cluster thermal density: where the best fit values are B 0 = 4.7 µG and η B = 0.5 (Bonafede et al. 2010), and see Eq.
(3) for n th (r). The uncertainty in the magnetic field and its impact on the multi-wavelength spectrum are discussed in Nishiwaki et al. (2021). For a fixed injection of primary CRPs, the profile of the turbulence α turb can be constrained from the brightness profile of the RH. Before the reacceleration, the injection with profile of Eq. (4) produces too steep brightness profile in both primary and secondary scenarios (dashed lines in Figure 1). This steepness is further enhanced in the secondary model because of the density dependence of the electron injection.
As seen from Eq. (8), the turbulent profile with α turb < 1 enhances the acceleration at larger radii (r > r c ), and the brightness profile would be modified to fit the observation. As shown in Figure 1, we find that α turb = 0.28 and α turb = 0.52 provide the best fit to the secondary and the primary scenarios, respectively. The other model parameters are summarized in Table  1. The radial dependence of the acceleration timescale t acc (r) ≡ p 2 /(4D pp (r, p)) is shown in Figure 2. Adopting Eq. (12) with constant L = 500 kpc, we estimate the turbulent energy density from t acc as shown in Figure 2.
In the secondary scenario, the reacceleration timescale at the edge of the RH is required to be almost 2 times shorter than inside the core region. The decline of t acc towards large radii is suppressed by the artificial cut-off of the turbulent profile at R 500 (Eq. (11)), with which the observed decline of the brightness around 1 Mpc is well explained.
If we omit the cut-off term in Eq. (11), the acceleration efficiency becomes unrealistically high at the periphery of the cluster. Although a model without the cutoff can explain the synchrotron spectrum within r ap = 500 kpc, the gamma-ray flux calculated within r < R 200 exceeds the upper-limit given by Fermi-LAT (Ackermann et al. 2016) in the secondary scenario. This problem can be circumvented by introducing the cut-off.
The elapsed time, t R , and the timescale of reacceleration, t acc , can be constrained from the spectral shape of the RH (see Nishiwaki et al. 2021, for the detail), especially the spectral steepening around 1 GHz. We find that t acc (r c ) ≈ 300 Myr is favorable in both the scenarios.
The log-scaling in the cross section of the pp collision (e.g., Kamae et al. 2006) causes the hardening in the secondary CRE specrtum, and our secondary model predicts a larger flux above 1 GHz (Figure 1). A possible prescription for that tension is adopting a steeper injection index. The spectra for different α inj are shown in Appendix A, where we change L inj p and t R for each α inj to fit the data, while the other parameters, α turb and t acc are fixed. We find that the α inj = 3.0 model provides a better fit to the higher frequencies. Since the purpose of this paper is the discussion about the RH statistics, we do not excessively tune the parameters to match with the Coma RH spectrum. Throughout this paper, we fix α inj = 2.45 as a fiducial value (e.g., Pinzke & Pfrommer 2010;Brunetti et al. 2017).
In each scenario, The expected flux of gamma-ray becomes smaller than the upper limit by Fermi-LAT observation (Ackermann et al. 2016). Note that possible gamma-ray signals are reported in the direction of Coma (e.g., Xi et al. 2018;Adam et al. 2021). The predicted flux in our secondary scenario is comparable to the data reported in those papers, while the GeV gamma-ray flux becomes almost two order of magnitude lower in the primary model. Note-The other fixed parameters are 4 Gyr for the duration of the injection phase, αinj = 2.45 for the slope of the injection spectrum, and the cutoff R500 of the turbulent profile.
As shown in Table 1, the acceleration timescale t acc are similar to those in the previous studies (e.g., Brunetti et al. 2017;Nishiwaki et al. 2021). However, the primary CR injection luminosity in the primary scenario is one order of magnitude larger than that in Nishiwaki et al. (2021). This is attributed to the different best fit values of t R /t acc . In our previous model, t R /t acc 3.0 is required to match the brightness at  outer radii (r > 500 kpc), while this requirement is relaxed considering the radially increasing turbulent profile. The relatively small t R /t acc in the current model reduces the energy contribution from the reacceleration, which in turn increases the contribution from the primary CR injection. The injection power of primary CREs (p e c ≥ 150 keV) is smaller than that of CRPs (pc ≥ 30 MeV) by a factor of ≈ 5 in the primary scenario with α inj = 2.45. In our previous study, we modeled the brightness profile of Coma with a fine-tuned injection profile with larger number of model parameters, under the assumption of spatially homogeneous acceleration efficiency. In this study, however, we succeeded in the modeling with much simpler injection profile (Eq. (4)), adopting the turbulent profile shown in Figure 2. Note that both the models are compatible with the current radio observations. The direct measurement of turbulent velocity with future X-ray telescopes featured with improved spectral resolution and effective area is desired to distinguish those two models. A similar turbulent profile for a Coma-like halo is obtained in Pinzke et al. (2017). The best-fit value for their M-turbulence model is α turb = 0.67, which is slightly larger than that in our primary scenario. Our model is in good agreement with that model concerning the profile of t acc , although the overall normalization is larger in their M-turbulence model by a factor ∼2.

LIFETIME OF RADIO HALOS
In this Section, we discuss the lifetime of RHs in the cooling phase (Section 2.4). As an example case, we present the time evolution of a Coma-like cluster after the reacceleration ends. In this cooling phase, we solve the FP equation with D pp = 0. In our method, we do not follow the evolution of the turbulent spectrum, so the acceleration timescale is assumed to be constant within the reacceleration phase and it suddenly turns to be zero at the transition to the cooling phase. Two competing effects govern the evolution of the CR spectra: cooling processes and injections (primary and secondary).
The model is basically the same as that explained in Section 2. We adopt the same magnetic field, ICM profile, CR injection profile, slope of the injection spectrum, and redshift. We adopt the model parameters, such as L inj p , t acc , and α turb , from the obtained values in Section 2. However, we change t R as explained in the following.
The initial condition for the cooling phase corresponds to the final state of the reacceleration phase. To set the initial condition, the maximum duration of the reacceleration phase needs to be specified. In the case of the Coma RH, the present stage does not necessarily correspond to the end of the reacceleration phase, i.e., the peak of the luminosity evolution. We assume that the Coma RH is in the midst of the reacceleration phase. Considering that there exit some RHs with masses similar to the mass of Coma and much brighter radio luminosities as P 1.4 ≈ 10 25 W/Hz, such as RXCJ 0510.7-0801 and Abell 2744 (Cuciti et al. 2021b), we assume that a Coma-like cluster could achieve similar luminosity at t R = t peak R . Our model suggests that the luminosity of a Coma-like halo reaches that value around t R ≈ 500 Myr (about 200 Myrs from the present stage of Coma). Thus, we set the initial condition for the cooling phase by the state at t peak R = 500 Myr In the following, we compare the two scenarios for the seed population: the secondary scenario (Section 3.1) and the primary scenario (Section 3.2). We define the emission lifetime as the duration for which the flux at 1.4 GHz, F 1.4 , satisfies the condition, F 1.4 0.1F peak 1.4 , where F peak 1.4 is its peak value.

Secondary scenario
In the left panel of Figure 3, we show the time evolution of synchrotron spectrum of our Coma-like cluster in the secondary scenario. We adopt the model parameters, α turb = 0.28, t acc = 330 Myr (at r = r c ), and t R = 500 Myr. The aperture radius is fixed to r ap = 500 kpc. The thick black line shows the flux at the end of the reacceleration phase, or the beginning of the cooling phase. The flux at the each epoch in the cooling phase is distinguished by colors, as it evolves from red to blue. After several 100 Myrs, the spectrum approaches a power-law shape due to the balance between the electron injection and the cooling. The flux decrement in the cooling phase is at most one order of magnitude and most highlighted at ≈ 300 MHz, where the cooling and acceleration balance in the reacceleration phase. At higher frequencies (ν > 1 GHz), the peak flux is maintained by a high injection rate of secondary CREs from reaccelerated CRPs. Since parent CRPs do not significantly suffer from cooling, the flux at > 1 GHz is maintained even after the reacceleration phase. Therefore, the lifetime of the RH can be much longer than the cooling time of CREs especially at high frequencies.
Our calculation demonstrates that not only the reacceleration of the pre-deposited secondary CREs but also the flesh electrons injected from reaccelerated CRPs make significant contributions to the radio emission. The fractional contribution of those two components could depend on the parameters, t acc and t peak R . As shown in Nishiwaki et al. (2021), in the reacceleration phase, the evolution of the radio flux around 300 MHz is relatively faster than that of the gamma-ray flux originated from π 0 decay. This suggests that as t R /t acc increases, in the radio flux, the contribution of the reaccelerated CREs dominates that of the enhanced electron injection from reaccelerated CRPs. Even at 1.4 GHz, the flux decay would be more prominent, if the reacceleration is more efficient (t acc 200 Myr) or the reacceleration lasts longer (t peak R 500 Myr). However, such a model causes a tension with the observed spectral feature of the Coma RH (e.g., Nishiwaki et al. 2021) or the radio luminosity becomes far brighter than any observed RHs.
Note that such long-living emission is achieved only under the assumption of hard-sphere type reacceleration (D pp ∝ p 2 ), where the reacceleration timescale is the same between CRPs and CREs. When the CRP reacceleration is less efficient compared to that for CREs, e.g., in the case of the Kolmogorov type reacceleration (Nishiwaki et al. 2021), the emission would decline within a few times the cooling timescale.
As the radiative cooling proceeds, the shape of the spectrum approaches to a single power-law shape. The concave shape caused by the reacceleration disappears within a few times the cooling timescale. If the observed break feature around 1.4 GHz is considered to be significant, the RH in Coma should still be in the reacceleration phase or within ≈ 200 Myrs after the reacceleration stops.
In Figure 4, we show the evolution of the brightness profile at 350 MHz. The decline of the brightness is   within one order of magnitude, consistent with the spectral evolution. Given a threshold for S ν , the size of the RH shrinks with time. For example, considering an observational threshold around S ν ≈ 10 −2 [Jy/arcmin 2 ], the apparent RH size evolves from 1 Mpc to 0.5 Mpc in the cooling phase. A slight increase of the brightness in the core region is seen at the later stage. This is a transitional phenomenon while the CRE distribution is evolving towards the equilibrium state regulated by the balance between the cooling and the enhanced injection from reaccelerated CRPs.

Primary scenario
The right panel of Figure 3 shows the spectral evolution in the primary scenario. The model parameters are, α turb = 0.52, t acc = 255 Myr (at r = r c ), and t R = 500 Myr. In this case, the evolution of the secondary injection rate is negligible, and the radio flux in lower frequencies (ν < 1 GHz) decays almost one order of magnitude within ≈ 500 Myr after the end of the reacceleration phase. The lifetime of the RH at 1.4 GHz is comparable to the cooling timescale of responsible electrons, or the reacceleration timescale, as shown in the previous studies (Cassano & Brunetti 2005).
The evolution of the brightness profile is shown in the right panel of Figure 4. Adopting the same threshold for S ν as in Sect. 3.1, we find that the RH disappear in 400 Myr after the peak time. In this scenario, not only the flux but also the size of the RH show rapid increase and decrease. Considering both the reacceleration phase and the cooling phase, one can approximate the lifetime of the giant RH as ≈500 Myr in total.
The spectral profile at the late stage of the cooling phase, t off > 1 Gyr, is not equivalent to the prereacceleration profile (Figure 1 dashed line). This difference is originated from the assumption of f ep = 0.01, i.e., the finite injection rate of CRPs even in the primary scenario. The off-state emission is supported by the enhanced injection from reaccelerated CRPs, although the synchrotron flux becomes smaller by one order of magnitude than the peak state.

Discussion
In this Section, we have discussed the lifetime of the radio emission around 1.4 GHz. As seen from Figure 3, the lifetime depends on the frequency. A longer time is required for lower frequencies to reach the steady flux due to the balance between the cooling and the injection. In addition, the cooling timescale depends on the magnetic field and redshift. Especially for the primary scenario, one can expect a much longer lifetime for a weaker magnetic field and lower frequencies.
While we have adopted α inj = 2.45, as demonstrated in Figure 7 in Nishiwaki et al. (2021), a smaller α inj in the secondary model results in a harder and brighter flux around 10 GHz, where the emission is mostly powered by the injection from reaccelerated CRPs. Then, the decay of the emission seen in secondary scenario ( Figure 3 left) becomes less important when α inj < 2.45.
We have estimated the RH lifetime under the "stepfunction approximation" for D pp . In reality, the transition into the cooling phase would be gradual. If its transition timescale is much longer than the cooling timescale, the lifetime in the primary scenario can be longer than our estimate in the previous section.

MERGER TREE OF DARK MATTER HALOS
In the previous section, we have studied the lifetime of RHs. Our next steps is to calculate the population of RHs induced by cluster mergers, which can be compared with the observed statistic of RHs. In this Section, we explain our method to follow the cosmological evolution of galaxy clusters by constructing merger trees.
Merger trees describe the mass evolution of a dark matter (DM) halo. In the extended Press-Shechter (EPS) formalism (e.g., Press & Schechter 1974;Bond et al. 1991;Lacey & Cole 1993), the mass growth rate of a halo, or the conditional mass function, is given consistently with the Press-Shechter (PS) mass function. However, the EPS formalism tends to underestimate the number density of massive halos compared to cosmological N-body simulations (e.g., Sheth & Tormen 1999;Tinker et al. 2008). We introduce a Monte Carlo algorithm to build the merger trees compatible with the mass function given in those simulations.

Mean merger rate per halo
In our picture, DM halos evolve through stochastical mergers and continuous accretion of matters. Each halo experiences both the continuous mass evolution due to the accretion and the sudden mass increase due to the mergers. We employ the fitting formulae for the merger rate and the mass accretion rate (MAR) from N-body simulations (Fakhouri et al. 2010). However, as explained in the following, the separation between accretion and mergers is not definitive, so we need to define the minimum mass ratio for mergers, on which the estimate of the continuous mass accretion depends.
We treat a merger as a two-body event. The merger between n ≥ 3 halos can be regarded as a sequence of binary mergers within a sufficiently small time interval. For a merger between two halos of masses M 1 and M 2 , we call these two halos "progenitor" halos, and the one produced by the event "descendant" halo.
In the following sections, we assume that RHs are induced by "mergers" and discuss the condition for the RH onset. Thus, the value of ξ min needs to be smaller than the threshold mass ratio for the RH onset, ξ RH (Section 5.1). In this paper, we set ξ min = 10 −3 to make the above condition satisfied with saving the computational cost. This value is larger than the minimum ξ used in Fakhouri et al. (2010). The number of mergers for given intervals ∆z and ∆ξ is expressed as In our computation, we adopt sufficiently small values for ∆z and ∆ξ so that P(M 0 , ξ, z) < 10 −3 for ξ ≥ ξ min , ensuring the two-body merger approximation.

Mass accretion rate
In the limit of ξ min → 0, the mass evolution rate due to mergers obtained by integrating Eq. (14) with respect to ξ becomes infinity. This indicates that the mass evolution is dominated by numerous minor mergers with small ξ, rather than rare major mergers. For an appropriate time interval, most of halos with similar masses would evolve with similar rates through minor mergers and the accretion, while a small fraction of halos experience abrupt growth due to major mergers. To express the continuous accretion, we adopt the median value of the total MAR in N-body simulations, in which the contribution of major mergers may be negligible.
The total MAR is defined as dM dz = −(M 0 − M 1 )/∆z, where M 0 is the descendant mass at z and M 1 is the mass of its most massive progenitor at z + ∆z. The process of the mass increase, mergers or accretion, is not specified here. According to Fakhouri et al. (2010), the median values of MAR in the two Millennium simulations is well fitted with the following formula; As discussed in Fakhouri et al. (2010), the mean MAR is overall larger than than median MAR, because the probability distribution of dM dz has a long tail in the highaccretion rate region. As the long tail, which boosts up the mean value, is mainly due to major mergers, the median value would be appropriate as the accretion rate in our definition.

Halo mass function
The merger tree calculation needs to reproduce the halo mass function (HMF), which describes the number of DM halos per unit comoving volume per unit mass. This can be written as, where ρ 0 is the mean mass density of the Universe, σ is the rms mass variance, and the function f (σ, z) is called multiplicity function, whose functional form is determined analytically (Press & Schechter 1974) or empirically from the fit to the mass function in N-body simulations (e.g., Sheth & Tormen 1999;Jenkins et al. 2001;Tinker et al. 2008;Watson et al. 2013 as where P (k) is the linear matter power spectrum and W (χ) is the Fourier transform of the top-hat window function, which is expressed aŝ The radius R M is larger than the physical radius of galaxy clusters R ∆ (Eq. (10)), since the mass density in clusters is generally larger than the mean mass density of the Universe. The matter power spectrum, P (k), in Eq (18) is conventionally expressed as where T (k) is the transfer function, and n s is the slope of the primordial power spectrum. The deformation from the primordial power spectrum at each scale is expressed by T (k). The calculation of T (k) includes technical challenges, because it is affected by, for example, various damping processes due to interaction between matters and relativistic particles. In this study, we adopt T (k) generated with the public code CAMB.py (Lewis et al. 2000). The normalization A 8 is calculated from the cosmological parameter, σ 8 , which is the mass variance on a scale of 8h −1 Mpc. We adopt n s = 0.97 and σ 8 = 0.81 from Planck Collaboration et al. (2020). From Eq. (18), we can calculate the rightmost factor of Eq. (17); Under the PS formalism, the multiplicity function, f (σ, z), appearing in Eq (17) becomes universal to the changes in redshift and cosmological parameters and has an expression of f PS (σ) = 2 π δc σ exp − δ 2 c 2σ 2 , where δ c = 1.69 is the threshold parameter for the density fluctuation. However, we adopt the empirical formula for the multiplicity function given by Tinker et al. (2008), which agrees with the results in N-body simulations, as with best-fit parameters, , where ∆ vir is the overdensity with respect to ρ 0 within the sphere of radius R M . We adopt ∆ = 500 to compare with the mass of the sample clusters in radio observations (Sect. 3).

Monte Carlo Simulation
For the initial state, we prepare N = 4, 000 halos at z = 0 in the mass range of [10 13 M , 10 16.5 M ]. To realize the mass distribution consistent with the HMF of Tinker et al. (2008), each halo is weighted with the HMF.
In our MC algorithm, the halo mass decreases through the mergers and the accretion as the time goes back to higher redshifts. The occurrence of a merger with each ξ in each time step is simulated with a random uniform number R in the interval 0 to 1. When R < P(M 0 , ξ, z), the descendant halo with mass M 0 split into two progenitor halos with masses of M 1 = M 0 /(1 + ξ) and M 2 = ξM 0 /(1 + ξ). This criterion is adopted for each ξ in each time step. When R ≥ P(M 0 , ξ, z) for all ξ, the mass decreases only through the accretion.
In very rare cases, two or more mergers occur in one time step with different ξ. In this case, we choose M 1 as the mass of the most massive progenitor and M 2 as the sum of the rest, and redefine the mass ratio of the event as ξ = M 2 /M 1 . For example, in an event where a descendant halo with a mass of M 0 split into three haloes with M a > M b > M c , we set M 1 = M a and M 2 = M b + M c . The ratio ξ = M 2 /M 1 can be larger than unity in this case.
The boundary between the merger and accretion is ambiguous, so that the simple summation of ∆M = M 2 + ∆M median , where ∆M median = dM/dz median × ∆z, would overestimate the mass evolution in each time step in out computation. Thus, we set ∆M = ∆M median for M 2 < ∆M median , considering that the contribution of frequent minor mergers are effectively included in ∆M median . When M 2 > ∆M median , the halo mass at z + ∆z is chosen to be M 1 = M 0 − M 2 to account for the abrupt growth due to the major merger. Our code also follows the evolution of sub-progenitors with masses larger than 10 13 h −1 M , although those trees are not used in the following analysis. When multiple mergers occur in one time step, although such event hardly occurs, two or more sub-progenitors are produced in that step.
We follow the mass evolution of each halo up to z = 2. An example of the MC merger tree is shown in Figure 5 (left). Figure 5 (right) compares the HMF obtained from our MC merger tree with the fitting form of Tinker et al. (2008). Starting from the HMF given by Tinker et al. (2008) at z = 0 (blue), the HMF evolution shows a good agreement with the fitting formula at higher redshifts. Our method slightly (a factor of 2) overproduce halos at a higher redshift (z > 1.0). However, since galaxy clusters above M > 10 14 M are mostly formed at a low redshift, that difference does not affect our discussion in the following sections. Although better match at higher redshifts would be achieved by tuning the param-eters appearing in Eqs. (14) and (16), we adopt the same parametrization as Fakhouri et al. (2010) for simplicity.

STATISTICAL STUDY ON THE MERGER-RH CONNECTION
In this section, we study the connection between the halo merger and the occurrence of RHs, using the MC merger tree (Section 4). The RH survey observations have revealed that only a fraction of clusters host RHs, and they are preferentially found in merging systems (e.g., Buote 2001;. We assume two conditions for the RH onset: massratio condition and break-frequency condition (Section 5.1). Based on the RH lifetime obtained in Section 3, we model the mass dependence and time evolution of the radio power in Section 5.2. We then calculate the luminosity evolution of each cluster between 0 ≤ z ≤ 2. A cluster is categorized as a RH only when the radio power at the observation epoch is larger than the minimum value for the detection (Section 5.3). In Section 5.4, we discuss the constraints on model parameters by calculating the fraction of clusters with RHs, f RH , and comparing it with the observation (Cuciti et al. 2021a).
The mass of clusters in those observations is defined with the over-density with respect to the cosmic critical density, while our MC merger tree refer to the mean mass density. We multiply a factor of Ω 1/2 m to convert the mass in merger tree into the observational mass (see also Enßlin & Röttgering 2002).

Criteria for the onset
We introduce two conditions for the onset of the radio emission in the ICM. The first one is called mass-ratio condition, which is a simplified approach using a threshold for the merger mass ratio, ξ RH , above which mergers can ignite the reacceleration. We treat ξ RH as a model parameter and do not include its dependence on the total mass or redshift.
The purpose of this simplified model is to discuss the relation between the RH lifetime (Section 3) and the typical mass ratio for the onset that can explain the observed fraction of RHs f RH . Since the merger rate (Eq. (14)) decreases with ξ, we can expect that the model with longer lifetime requires larger ξ RH (Cassano et al. 2016).
The other condition, the break-frequency condition, is more sophisticated approach based on the reacceleration model. The reacceleration of CREs balances with the radiative cooling at a certain energy scale, which causes a spectral steepening at the break frequency ν b . That steepening is actually seen in both the secondary and primary scenarios (Figure 3). In this break-frequency condition, ν b need to be large enough to reproduce typical power and spectral index of RHs.
To calculate the efficiency of the reacceleration, we introduce a parameter η t , which is the fraction of the turbulent energy to the merger kinetic energy. The formalization in Appendix B is used to calculate the reacceleration efficiency, or its timescale, t acc , of each merger event as a function of η t , M and ξ.
In this model, we firstly set ν b around 1.4 GHz. The case with different ν b will be discussed in Section 5.4.2. A merger can ignite the emission, whose spectral feature matches with the typical observed one, only when the acceleration is efficient enough to balance or overcome the cooling at a given ν b , i.e., t acc ≤ t cool | ν=ν b . The cooling time scale due to synchrotron and IC radiation in the Thomson limit is approximated as (e.g., Rybicki & Lightman 1985;Brunetti & Jones 2014) where σ T is the Thomson cross section, u B = B 2 /8π and u CMB = 0.262 eV/cm 3 (T CMB /2.73 K) 4 (1 + z) 4 are the energy densities of the magnetic field and the cosmic microwave background (CMB), respectively, and the Lorentz factor γ b is related to ν b as ν b = 3eB 4πmec γ 2 b . For simplicity, we fix B = 3.2 µG and t cool is treated as a function of z and ν b . The acceleration timescale becomes shorter with a larger η t , which leads to a larger f RH for a fixed ν b .
For the secondary scenario, we do not take into account the break-frequency condition. The spectrum of CRPs has no cooling break in low-energy region. As seen in Section 3.1 (left panel of Figure 3), a significant part of the radio emission at 1.4 GHz is powered by the enhanced electron injection from CRPs. This makes the break feature less significant compared to the primary scenario. To simplify the following discussion, we apply break-frequency condition only to the primary scenario.

Power of the radio halo
One of the main objectives of this paper is to calculate the luminosity function of RHs (RHLF). Combining the HMF and the criteria proposed in Section 5.1, one can obtain the mass function of RHs. Enßlin & Röttgering (2002) used the empirical relation between the radio power and the cluster mass to transform the HMF into the RHLF.
The important improvement in our method is to include the time evolution of the radio luminosity discussed in Section 3. Although the calculation in Section 3 was done only for the Coma-like RH, we have adopted a similar luminosity evolution for different masses. We assume a simple power-law relation between the maximum radio power at the end of the reacceleration period and M 500 like the previous study in Enßlin & Röttgering (2002).
With those assumptions, the radio luminosity at 1.4 GHz is expressed as a function of both mass and time. We write it as P 1.4 (M 500 , t R ), where t R is the time interval between the RH onset and the observation. We do not consider the possible dependence on ξ for P 1.4 , although this parameter could affect the efficiency of the reacceleration (Cassano & Brunetti 2005). This simpli-fication may be justified, if the merger rate is a steep function with respect to ξ and a large fraction of RHs arise from mergers with a value close to the lower limit ξ RH .
Since both the mass and the luminosity evolve with time, it is important to distinguish P 1.4 − M 500 relation at the onset from that at the observation. The timescale of mass increment can be evaluated from Eq. (16), and it is about 5 Gyr for M 500 ≈ 10 15 M . If the RH lifetime is much shorter than this timescale, as in the primary scenario, the mass evolution after the onset is almost negligible. However, that can be not the case for the secondary scenario, since the emission can last for the cosmological timescale (≈ 10 Gyr). We adopt a powerlaw relation between the peak luminosity and the mass at the onset as P 1.4 (t peak R ) ∝ M α M 500 , which could satisfy the observed P 1.4 − M 500 relation, neglecting the mass evolution during the reacceleration phase (≈ 500 Myr). Thus, P 1.4 (M 500 , t R ) can be expressed as (24) where the function f (t R ) satisfies f (t peak R ) = 1, so that the factor 10 ARH M 500 /10 14.9 M α M corresponds to the peak luminosity for the descendant mass M 500 . The overall normalization A RH is considered to be a constant, and treated as a model parameter.
As shown in Figure 6, we adopt the temporal evolution of RHs extrapolated from the modeling of the Coma-like RH. We have assumed in Section 3 that the Coma-like RH is in the midst of the reacceleration phase and the peak luminosity has not been achieved. The maximum duration of the reacceleration phase was assumed to be t peak R ≈ 500 Myr, which is close to the timescale of the turbulent cascade at the ≈ 500kpc scale (Section 7). The peak luminosity of the Coma-like RH becomes almost two orders of magnitude larger than the luminosity at the pre-acceleration state.
In the secondary scenario, such a high luminosity is sustained even in the cooling phase (Figure 3). We take into account the decline (a factor of ≈ 4 at 1.4 GHz) in the cooling phase. The time evolution of the luminosity, f (t R ), can be modeled as where t cool (z) is the cooling timescale at 1.4 GHz (Eq. (23)), and t 1 = 1.4t cool (z) corresponds to the time at which the luminosity becomes ≈ 4 times smaller than the peak value, i.e., f (t 1 ) = 0.25. Since the Coulomb cooling is negligible at higher energies (γ > 10 3 ), t cool is calculated from Eq. (23), for which we substitute the Lorentz factor γ b corresponding to 1.4 GHz. As done in Section 5.1, we fix B = 3.2 µG for the magnetic field. In this case, t cool only depends on z. The luminosity is constant after t R − t peak R > t 1 . In the primary scenario, the RH would revert to the radio-quiet state within a few times the cooling timescale. We assume an exponential decay after the peak time as When another merger occurs after one merger event, we compare the two values of luminosity induced by each merger and choose the larger one as the luminosity at a given epoch.
The relation between the peak (t R = t peak R ) luminosity and the mass is shown in the left panel of Figure 6. As a reference, we adopted (α M , A RH ) = (3.5, 0.6). Those values are constrained from the observed fraction of RHs, and their number counts per observed flux (see Section 5.4). The observed power of RHs and the upper limits listed in the extended sample of Cuciti et al. (2021a) are plotted with black points and green arrows, respectively. The black lines are the sensitivity limits for NVSS and ASKAP at z = 0.2 (Section 5.3). The luminosities of some RHs, such as Abell 545, Abell 1995, and Abell 2744 are close to the assumed peak luminosity for their mass, indicating that they are in the transition from the reacceleration phase to the cooling phase.
In the right panel of Figure 6, we plot the assumed evolution of Eqs. (25) and (26). We have adopted t cool at z = 0 for the primary scenario.
As discussed in Section 3.2, the CR injection rate does not revert to that at the pre-reacceleration state, when a certain amount of CRPs are injected as primaries. However, we have neglected this effect in Eq. (26) in order to treat each merger event independently.

Observational limit
To estimate the RH number expected in the previous and future radio surveys, we need the minimum flux detectable in those surveys. We adopt the brightnessbased criterion of  for the observational limit in the NVSS survey. In this case, the minimum flux is estimated as,   25) and (26) for the secondary (red) and the primary (blue) scenarios, respectively. We assumed M500 = 10 14.9 M for both the lines, while different values for ARH are adopted to improve the visibility. The vertical dashed line shows the peak time. Right: The red solid line shows the mass dependence of the peak luminosity. The black lines show the sensitivity limits of observations. The mean redshift z = 0.2 is adopted to convert the flux limit into the luminosity limit. The observational limits for NVSS and GMRT (dotted and solid) are calculated with Eq. (27).
The dot-dashed line shows the sensitivity of ASKAP survey (Eq. (28)). Data points are taken from Cuciti et al. (2021a). The Coma RH is shown with the yellow star.
where θ b and θ H are the angular size of the beam and the source, respectively, and F rms is the rms noise per beam. Following Cassano et al. (2012), we assume ζ 1 ≈ 3, and (F rms , θ b ) = (0.45 mJy, 45 arcsec) for the rms noise and the beam size of the NVSS survey. We adopt θ H = 0.35θ 500 as the typical size of RHs, where θ 500 is the angular size corresponding to R 500 . In Section 5.4, we compare our calculation to the data given by Cuciti et al. (2021b). The data includes the RHs observed with GMRT at 330 MHz and 610 MHz. Although our method only follows the luminosity at 1.4 GHz, we also consider the sensitivity of those low frequency observations. In this study, we simply scale the sensitivity at 610 MHz to 1.4 GHz, assuming a spectral index of α syn = −1.2. We adopt F rms and θ b for GMRT 610 MHz provided in Brunetti et al. (2007b), which leads to (F rms , θ b )=(19 µJy, 25 arcsec) at 1.4 GHz. Note that the value of F rms in Brunetti et al. (2007b) is similar to the ones in the actual GMRT survey (Venturi et al. 2007;Kale et al. 2013).
The flux of each RH is evaluated from P 1.4 as f 1.4 = (1 + z) 1−αsyn P 1.4 /(4πD L (z) 2 ), where D L (z) is the luminosity distance at redshift z and α syn is the spectral index of the RH (Section 1). For simplicity, we fix α syn = −1.2 for all RHs in our merger tree.
In the right panel of Figure 6, the observational limits for NVSS and GMRT are shown with the dotted and solid lines. The mean redshift of the sample of Cuciti et al. (2021b), z = 0.2, is adopted to convert the flux into luminosity. The figure shows that the NVSS limit is comparable to the upper limits in the sample (green arrows), while the GMRT limit is lower than the upper limits and the power of detected halos (black points). Note that the redshift was fixed only for the visualization purpose. In the following calculation, the flux of each RH is calculated depending on its redshift.
In Section 9, we use the flux-based criterion to estimate the number count in the ASKAP survey. In this case, f min can be written as f min = 1.43 × 10 −3 mJy ζ 2 F rms 10 µ Jy Following Cassano et al. (2012), we adopt ζ 2 ≈ 10 and (F rms , θ b ) = (10 µJy, 25 arcsec) for the ASKAP survey. The apparent halo size is assumed to be θ H = 0.35θ 500 . The radio power at z = 0.2 corresponding this limit is shown with the dash-dotted line in Figure 6.

mass-ratio condition
First, we discuss the case where the onset is triggered by the condition ξ > ξ RH . In this model, we have three model parameters: ξ RH , α M , and A RH . The observed fraction of RHs and the luminosity function (Section 5.5) give constraints on those parameters. Cuciti et al. (2021a) reported that the fraction of RHs, f RH , is ≈ 0.7 in the high-mass (HM) bin (8.0 × 10 14 < M 500 < 12.0 × 10 14 ), while it drops to ∼ 0.35 in the low-mass (LM) bin (5.7×10 14 < M 500 < 8.0×10 14 ) in a sample of clusters from the Planck SZ catalogue. When the small halos and the ultra-steep spectrum radio halos (USS-RHs) (e.g., Brunetti et al. 2008) are excluded from the statistic, f RH ≈ 0.33 in the HM bin and f RH ≈ 0.23 in the LM bin.
We adopt a Monte Carlo procedure to calculate the fraction f RH and its statistical error in our MC merger tree. We randomly extract a sub-sample of clusters from our merger tree, and calculate f RH as a ratio between radio-loud and radio-quiet cluster. The size of the subsample is taken to be the same as the observation (Cuciti et al. 2021b), i.e., 60 clusters for the LM bin and 15 clusters for the HM bin. Those sub-samples are parts of the population with masses in the LM or HM bin at some time within the sampling range of redshift (0.088 < z < 0.33). If the radio flux of a halo, calculated with Eq. (25) or Eq. (26), is larger than the observational limit (Eq. (27)) at a given redshift, the halo is counted as a radio-loud one, while halos with a flux below the above criteria are regarded as radio-quiet one. The observation redshift (0.08 < z < 0.33) is divided into 50 bins, and we calculate the mean value of f RH , weighting each redshift bin equally. Note that we have calculated the luminosity evolution over the whole redshift range of the merger tree (0 ≤ z ≤ 2). The time interval t R in Eq. (25) or (26) is measured from the most recent merger that satisfies the condition ξ > ξ RH before the observation.
The observed fraction could be reproduced in our merger tree by tuning the model parameters. Since the merger rate is larger for smaller ξ (Eq. (14)), one can expect more RHs for smaller ξ RH . The parameter A RH , which regulates the typical radio flux, also affects the RH fraction.
For any ξ RH , we can find a parameter sets of (α M , A RH ) to reproduce the observed f RH and its mass dependence. However, there should exit a typical mass for a given luminosity of RHs, considering the observed RHLF and the MF of clusters. For example, in the models with A RH ≥ 1.0 and α M = 3.5, even low-mass RHs with M 500 ≤ 5 × 10 14 M can be observable with the NVSS sensitivity, and such a model overproduces the RH number count (Figure 7). Thus, the parameter A RH is constrained to be A RH ≈ 0.5. 2 If the mass dependence of f min Eq. (27) is significantly weak, the assumed peak luminosity-mass relation naturally leads to the positive mass dependence of f RH (see Figure 6). However, the observed mass dependence is 2 A larger value of A RH is preferable for a larger α M . not significant enough to give a definite constraint on the parameters. We temporary adopt α M similar to the one adopted from the fitting to the observed RH, i.e. α M = 2.5 − 4.0. This ensures a reasonable fit to the observed number count (Section 5.5). In Section 7, we show that a similar mass dependence can be reproduced by calculating the peak luminosity for clusters with various masses. On the other hand, the threshold ξ RH does not largely affect the mass dependence of f RH , since the merger rate has a weak mass dependence. We search for the best fit value of ξ RH for given (α M , A RH ) in the range discussed above, and find ξ RH ≈ 0.1 and ξ RH ≈ 0.01 for the secondary and the primary models, respectively. The values of the three parameters, constrained from both the observed f RH and the number count (Figure 7), are listed in Table 2. Here, we refer the secondary and the primary models as model (i) and (ii), respectively. To study the impact of different ξ RH and P 1.4 (M 500 , t R ) on the observable quantities, we adopt the same (α M , A RH ) = (3.5, 0.6) for both the models. Possible constraints from the correlation between the RH occurrence and the X-ray morphological disturbance will be discussed in Section 6.
In Figure 7, we show f RH for models (i) and (ii). The errors in the models due to statistical fluctuation are given by N = 100 trials of the MC sampling. The gray points and the shaded region show the observed fraction and its 1σ uncertainty associated with the statistical error on the masses, adopted from Cuciti et al. (2021a).
The calculation was carried out for the two different sensitivities: NVSS (F rms = 0.45 mJy, θ b = 45 arcsec) and GMRT (F rms = 19 µJy, θ b = 25 arcsec). With the NVSS sensitivity (points with dotted error bars), the fraction f RH is about 10 − 30%, while it becomes 40 − 70% with the GMRT sensitivity (points with solid error bars). The mean value of f RH at the GMRT sensitivity within LM and HM bins are shown in Table 2 Table 2. The black points show the NVSS number count within 0.04 < z < 0.2 for the all-sky, adopted from Cassano et al. (2012).
models result in f RH close to the observation including small halos and USSRHs, f RH = 0.44 (Cuciti et al. 2021a). The difference in the best fit values for ξ RH between the secondary and primary scenarios (models (i) and (ii)) is caused by the difference in the RH lifetimes. The longer lifetime in the secondary scenario requires less frequent onsets and thus a larger value for ξ RH . In this scenario, RHs appears only after major mergers. This condition supports the expectation that only major mergers ξ ≈ 0.1 can cause disturbances over the scale of giant RHs (Cassano et al. 2016). On the other hand, the short lifetime in the primary scenario requires a much smaller ξ RH for the same (α M , A RH ). Thus, RHs need to be powered by frequent minor mergers.
Since the lifetime in model (i) is longer than the timescale of the mass evolution, a RH observed with a certain mass, M obs , can have the low radio luminosity corresponding to a smaller mass, M onset < M obs . Since the relative margin between the peak luminosity and the observational limit is enough for larger masses (see Figure 6), more "aged" RHs (larger t R ), can contribute to the number count in higher mass bins. In other words, almost all of the RHs in lower mass bins are driven by recent (t R 1 Gyr) mergers, while the count in the higher mass bins can include RHs with larger t R (> Gyr). On the other hand, in model (ii), ages of RHs should be similar in all mass bin.

Break-frequency condition
In this section, we discuss the break-frequency condition in the primary model. In the models (iii) and (iv), we observe the competition between the acceleration and the cooling (Section 5.1). These models are basically compatible with model (ii) but described with different parameters, ν b and η t . As long as the same luminosity evolution is assumed, mergers with ξ 0.01 are required for the RH onset to explain the observed f RH and number count.
We adopt a slightly steeper P 1.4 − M 500 relation for models (iii) and (iv), as it provides a better fit to the mass dependence of f RH for the primary scenario. Because we have not considered the redshift dependence in the parameters, while t cool decreases with z (Eq (23)), this model tends to predict a smaller f RH at higher redshifts. That makes it difficult to fit simultaneously both the observed f RH and the RH number count (Section 5.5), which is measured in lower redshifts. We find that the model is improved by the combination of a larger η t and a smaller A RH , rather than a smaller η t and a large A RH . Here, we adopt a slightly smaller A RH compared to model (ii).
For ν b = 1.4 GHz, η t ≈ 0.6 is required. More than a half of the kinetic energy need to be dissipated into the compressible turbulence to achieve an efficient acceleration in mergers with ξ ≈ 0.01. A smaller value of η t can be allowed for a smaller ν b (model (iv)). Even with those large η t , models (iii) and (iv) predict a slightly small f RH compared to the observed value, while the RH number count is well reproduced (Figure 7 right). A better model would be available with an even larger η t (smaller A RH ). Therefore, if a large η t > 0.5 or small ν b is allowed, the observed RH statistics are reproduced even by minor mergers with ξ 0.01.
In the right panel of Figure 7, we show the all-sky cumulative number counts of RHs observable with NVSS sensitivity and compare them with the number count in the NVSS follow-up (Giovannini et al. 1999) of the XBACs sample (Ebeling et al. 1996). In this figure, we take into account not only the observational limit explained in Section 5.3 but also the X-ray flux limit in the XBACs sample (f X > 5.0×10 12 [erg/cm 2 /s]), where we have used the scaling relation reported in Yuan et al. (2015) to translate the radio power at 1.4 GHz P 1.4 into the cluster X-ray luminosity L X . The parameters listed in Table 2 agree with the observed count (data points are adopted from Cassano et al. (2012)), ensuring the compatibility between the observations and our models. In Section 9, we discuss the number count in future highsensitivity survey. Figure 8 shows total RHLF calculated in our model at different redshifts by directly counting the RH number in our MC merger tree. Any observational limits are not included in this figure. Because we do not impose any conditions concerning the mass for the RH onset, a large number of RHs with low luminosities are predicted. Both the secondary and primary models predict similar redshift evolution. The RHLF at z = 0 is well approximated with that calculated from the HMF and the typical P 1.4 − M 500 relation obtained from observations (e.g., Cassano et al. 2013), assuming mean occurrence of f RH = 0.3 (dashed line) (see also Enßlin & Röttgering 2002). The best fit value of A RH for the RHLF, A RH ≈ 0.0 − 0.1, becomes smaller than those assumed for reproducing the peak luminosity in Table. 2, and it becomes similar to the fit to the observation (e.g., Cuciti et al. 2021a).

Caveats
We have assumed that all RHs follow the same luminosity evolution described with Eqs. (25) and (26), regardless of ξ. The apparent difficulty in the primary scenario, such as the requirement of a small value for ξ RH or relatively large value for η t , may be due to this simplification.
In some numerical simulations of galaxy clusters, the turbulence is generated over a duration comparable to the crossing timescale of merger events (e.g., Miniati 2014;Vazza et al. 2018). In such cases, the emission could be sustained over a longer duration than the lifetime we assumed in Section 5.2.
Assuming a longer lifetime of RHs, one can obtain a larger ξ RH . To clarify this point, we have tested the primary scenario with a different assumption about its luminosity evolution. We introduce a decay time t decay instead of using t cool (z) in Eq. (26). When t decay > t cool (z), the decline of the luminosity after the peak becomes modest and the lifetime is effectively extended. Such a slow decline could be reproduced with a moderate acceleration efficiency, which is smaller than that we assumed in Section 2.5. We find that the threshold mass ratio becomes ξ RH = 0.1 for t decay = 1 Gyr, and ξ RH = 0.15 for t decay = 2 Gyr, which is consistent with the estimate in Cassano et al. (2016). Thus, the apparent difficulty in the primary scenario can be alleviated if the emission is sustained over ∼1 Gyr.
Assuming that t decay is comparable to the cascade timescale of the IK turbulence, we find that η t ≤ 0.1 leads to t decay > 1 Gyr for the major (ξ ≈ 0.2) merger of massive (M 500 ≈ 10 15 M ) clusters (Appendix B).
We have simplified the luminosity evolution in a sequence of mergers, as we neglect the effect of mergers before the onset. However, the turbulent energy could be accumulated though the evolution of clusters (e.g., Cassano & Brunetti 2005). The time between mergers can be shorter than the cascade timescale of turbulence for minor mergers with ξ < 0.01. Our method would underestimate the lifetime and the luminosity of RHs initiated by such mergers.

CORRELATION WITH MERGING SYSTEMS
RHs are preferentially found in clusters showing the signatures of merger activities (e.g., Schuecker et al. 2001;Govoni et al. 2004). In this section, we quantify the fraction of RHs classified as a merging cluster in observations using our merger tree. We employ the same method used for the calculation of f RH with the mass-ratio condition (Section 5.4.1).
The RH lifetime in our secondary scenario is comparable to the loss timescale of CRPs, so it can exceed the dynamical timescale of cluster mergers. Thus, the secondary scenario predicts a fraction of RHs hosted in relaxed clusters. This scenario should be tested in the light of the observed correlation between the occurrence of RHs and the dynamical disturbance in X-ray morphology.
According to Cassano et al. (2013), the fraction of all merging clusters, including both RHs and non-RHs, is about 60% in both LM and HM bins. Firstly, we try to recover this fraction in our merger tree, using a similar method to that in Section 5 (see also Cassano et al. 2016). We newly introduce two parameters, ξ mer and t relax . The former one, ξ mer , is the threshold mass ratio, above which such a merger makes a descendant cluster disturbed enough to be classified as a merging system in the X-ray morphological analysis. The parameter t relax is the timescale required for a merger system to recover a relaxed state, or the "lifetime" of the merging system. We do not model the detailed merger dynamics with parameters to diagnose if clusters are merging systems, such as the concentration parameter, c, the centroid shift, w, and the power ratio, P 3 /P 0 (e.g., Mohr et al. 1993;Buote 2001;Santos et al. 2008). Those parameters generally depend on various factors other than the mass ratio, e.g., the projection effect and the impact parameter of the merger.
Unlike Section 5, we fix ξ mer and find t relax that can explain the merging fraction of ≈ 60%, because the latter is less constrained from observations. As a reference value, we assume ξ mer = 0.1 for the secondary scenario, which is close to the minimum mass ratio of observed RH-hosting clusters (Cassano et al. 2016). As in Section 5.4, we extract sub-sample of halos, which have masses in LM or HM bin within 0.088 < z < 0.33. The redshift range is divided into 50 bins as before. A cluster is identified as a merging cluster only when the time interval between the merger and the observation is smaller than t relax , i.e., t R ≤ t relax . Under the above condition, we find that t relax = 3.6 Gyr successfully explain the merging fraction of ≈ 60% (see also Cassano et al. 2016).
In the next step, we calculate the fraction of RHs found in merging clusters over all RHs, f mer∩RH /f RH , to test the correlation between the merging state and RHs. In the sample of Cuciti et al. (2021b), this fraction is 90%-100%. We use the same procedure as before to calculate that fraction in our merger tree; f mer∩RH is defined as the fraction of clusters that show the radio power larger than the observational limit and also satisfy t R < t relax at the observed epoch. Figure 9 shows the results for models (i) and (ii). We find that the fraction is compatible with the observation. Thus, most of RHs in our merger tree should show disturbed morphology at the observed epoch and are classified as merging systems. In other words, the time interval between the onset and the observation, t R (Eq. (25)), is typically shorter than t relax .
A similar discussion can be applicable for the primary scenario as well. The tight correlation between RHs and merger systems requires that ξ mer should be similar to or smaller than ξ RH . Although ξ RH can be lager with longer lifetime (Section 3.3), ξ mer ≈ 0.01 is required in our fiducial primary scenario.
We adopt ξ mer = 0.02 for the primary scenario and find that t relax = 1.7 Gyr leads to the merger fraction of 60%. With those ξ mer and t relax , f mer∩RH /f RH 0.8 is well reproduced in this scenario (Figure 9). However, that requirement for ξ mer 0.01 seems problematic, since the mass ratio estimated from optical or near infrared observations is typically ξ 0.1 for known merging clusters (Cassano et al. 2016).
The above difficulty in the primary scenario may be due to our simplified treatment of the turbulent reacceleration. As discussed in Section 5.4.2 this difficulty can be alleviated if the emission is sustained for a longer duration.

PEAK LUMINOSITY-MASS RELATION
In Section 5.2, we have introduced a simple powerlaw relation between the peak luminosity and the mass of RHs. In this section, we verify if such a steep relation could arise from the mass dependence of the reacceleration efficiency. Some previous attempts have succeeded in reproducing the relations similar to the observed ones (e.g., Cassano et al. 2007;Zandanel et al. 2014). However, the relation between the steep mass dependence and the reacceleration parameters, such as f ep , t acc , or t R , has not yet been clarified. We examine this point, extending the model used in Section 2 for RHs with various masses.
In this section, we use a representative value for the acceleration timescale, τ acc , which is the volume average of t acc (r) = p 2 /(4D pp (r)) within r < 0.5R 500 , where most of the radio emission is produced. As seen in Nishiwaki et al. (2021), our turbulent reacceleration model is more sensitive to t R or τ acc than the duration of the injection phase, and the energy injection from the reacceleration is at least an order of magnitude larger than that from the injection. The steep mass-dependence of the peak luminosity could be due to the steep dependence on the ratio t R /τ acc .

Luminosity evolution in the reacceleration phase
First, we show the exponential increase of the radio luminosity in the reacceleration phase. We examine the Coma-like case as an example. We use the same model adopted in Section 2.5 with the same model parameters. The typical reacceleration timescale becomes τ acc = 253 Myr and 231 Myr for the secondary and the primary scenarios, respectively.
In Appendix B, we roughly estimate the timescales, τ acc and t peak R with the simple binary merger approximation. There, we introduce a parameter η t , which quantifies the fraction of the turbulent energy to the merger kinetic energy. A merger is parameterized with the mass ratio ξ and the mass M 500 . We find that the reacceleration with τ acc ≈ 250 Myr could be induced by a merger with ξ ≈ 0.1 and M 500 ≈ 10 15 M when η t ≈ 0.15. The duration of t peak R ≈ 500 Myr assumed in Section 3 might correspond to the turbulent decay time at the injection scale but multiplied by a factor of ≈2.
In Figure 10, we show the time evolution of P 1.4 during the reacceleration phase. The luminosity is normalized with the value at the beginning of the reacceleration phase, while t R is normalized with τ acc . The exponential increase of the luminosity can be seen in both the scenarios. The evolution in the secondary scenario can be approximated as P 1.4 ∝ 14 x , where x = t R /τ acc .
As discussed in Section 3, the synchrotron at high frequencies is mostly powered by the enhanced injection from reaccelerated CRPs. That discussion seems in line with the evolution seen in Figure 10, as explained in the following. Since CRPs do not suffer from significant coolings, the hard-sphere type reacceleration causes the advection of the spectrum in the momentum space. At t R = τ acc , the spectrum shifts to a higher energy by a factor of e ≈ 2.718. As we have assumed α inj = 2.45 for the typical injection index, the production rate of secondary CREs, which is proportional to N p , is increased by e 2.45 ≈ 11.59, slightly lower than 14. The slight deviation from this estimate would come from the spectral evolution of CREs or the definition of the typical acceleration timescale τ acc (Note that D pp depends on the radius).  Cuciti et al. (2021a). We also plot the observed luminosities with thin black data points. Some of the RHs are highlighted with star symbols. We adopt t peak R = 690 Myr and 700 Myr for the secondary and the primary scenarios, respectively. The solid lines show the fits to the red and blue points, while the dashed line is the relation assumed in Section 5.4.
The luminosity evolution in the primary scenario (blue line in Figure 10) shows more complex features. The overall trend is slightly shallower than the secondary scenario, possibly because of the radiative cooling. It is characterised by the steepening around t R /τ acc ≈ 1 and subsequent flattening around 2t R /τ acc . Those features should be related to the spectral evolution of CREs. As seen in Nishiwaki et al. (2021), the CRE spectrum has a bump like-shape caused by the Coulomb cooling and the radiative cooling. The reacceleration shifts that bump towards higher energies.

peak luminosity for various masses
We calculate the peak luminosity of RHs with various masses, extending the model in Section 3, which was adopted for the Coma RH (Section 2.5). For this purpose, we introduce a scaling between the reacceleration parameter and the mass. As shown in Appendix B, under the simple binary merger approximation with zero impact parameter and the IK scaling for the compressible turbulence, one can find τ acc ∝ M −1/3 500 . Thus, the reacceleration could be more efficient for larger masses. On the other hand, we assume that t peak R is independent of the mass for simplicity. This may be justified if the duration is regulated by the turbulent cascade timescale (see Appendix B). The values of τ acc and t peak R will be normalized at the Coma mass, M 500 ≈ 8.5 × 10 14 M .
We neglect the ξ dependence of the acceleration efficiency and the duration, although the acceleration efficiency is more sensitive to ξ than M 500 (Figure 14 in Appendix B). In other words, all RHs are assumed to be generated through the mergers with the threshold value ξ ξ RH , regardless of the mass. As noted in Section 5.2, the steep ξ dependence of the accretion rate ensures that events with ξ ξ RH are negligible. More precisely, the ξ dependence would cause a scatter in the peak luminosity-mass relation. The parameters we adopt here can be interpreted as the merger-rate weighted means for ξ RH < ξ < 1.
In addition to the above assumptions, we introduce following assumptions to make our model applicable to RHs with various masses: • We adopt the beta-model of the ICM profile, where the core radius is scaled with the virial radius, r c ≈ 0.2R 500 , and β = 0.75 is taken to be constant. Under the assumption of a constant baryon fraction for all masses (see, e.g., Allen et al. 2008, for the observational constraints), this leads to a constant central density of n 0 ≈ 3 × 10 −3 cm −3 .
• We adopt the same Eq. (13), B 0 , and η B for the profile of the magnetic field, without any dependence on mass or redshift.
• We adopt Eq. (6) with β = 0.75 and r c ≈ 0.2R 500 , and α inj = 2.45 for the primary CR injection. The volumetric CR injection rate at the center (r = 0) is scaled with the thermal energy density norimalized by the model for the Coma cluster.
• We adopt the same index for the turbulent profile, α turb , as Coma.
• All clusters are observed at redshift z = 0.2, which roughly corresponds to the mean redshift of the sample of Cuciti et al. (2021a).
• The duration of the injection phase is taken to be the time gap between z = 0.2 and the epoch when the cluster mass was a half of that at the observation calculated with Eq. (16).
The observations indicate that ∼ µG magnetic field is ubiquitous in nearby merging clusters, while some coolcore clusters could have a central field of a few ∼ 10µG (e.g., van Weeren et al. 2019, for review). Recent lowfrequency RH observation suggests that some distant clusters have similar magnetic field to those in nearby clusters (Di Gennaro et al. 2021).
With those assumptions, we calculate the peak luminosity of RHs with M 500 = 2, 3, 4, 5, 6.5, 8.5, 10.0 × 10 14 M at z = 0.2. The acceleration timescale averaged within r < 0.5R 500 , τ acc , is normalized at the Coma mass (M 500 = 8.5 × 10 14 M ) as τ acc = 253 Myr and 231 Myr for the secondary and the primary scenarios, respectively. With the scaling of τ acc ∝ M −1/3 500 , τ acc spans 237 Myr < τ acc < 405 Myr for the above mass range in the case of the secondary scenario. We use the value of peak luminosity assumed in Section 5.4 to derive the duration required to achieve the peak luminosity, t peak R . Adopting (α M , A RH ) = (3.5, 0.6), the peak luminosity becomes P 1.4 ≈ 1 × 10 24 W/Hz at the Coma mass. We find that t peak R = 690 Myr and t peak R = 700 Myr can reproduce that luminosity at the Coma mass for the secondary and the primary scenarios, respectively. As noted before, t peak R is assumed to be constant with mass. With those assumptions, the peak luminosity increases with mass.
In the secondary scenario, the typical value of the average CR energy density in the RH volume (r ≤ 0.5R 500 ), ε CR , becomes ≈ 5% of the thermal energy density at the peak state. As discussed in Appendix B, the acceleration timescale could be reproduced with η t ≈ 0.15, ξ ≈ 0.2 and M 500 ≈ 10 15 M . With those values, the energy density of the compressible turbulence induced by a merger becomes ≈ 40% of the thermal one. Thus, η CR ≈ 12.5% of the turbulent energy should be consumed for the CR acceleration.
In the primary scenario (f ep = 0.01), we find ε CR ≈ 0.003ε th at the peak time. The energy density ε CR is still dominated by CRPs even after the reacceleration. With the same values of η t , ξ, and M 500 as those in the secondary scenario above, only η CR ≈ 0.75% of the turbulent energy should be converted into the CR energy.
As seen in Section 5, however, ξ RH ≈ 0.01 rather than 0.2 is required to explain the observed fraction of RHs in the primary scenario. When ξ is taken to be 0.01 as the typical value, the turbulent energy fraction is estimated as ≈ 10% of the thermal energy, then η CR ≈ 3% is required. On the other hand, a higher efficiency of the turbulent excitation (η t ≈ 0.4) is required for this scenario (Appendix B).
In the above discussion, the energy injection in the form of primary CR injection is not taken into account, since that is negligible compared to the energy injection by the reacceleration (Nishiwaki et al. 2021).
In the left panel of Figure 11, we show the brightness profiles at the peak time. Since a larger t peak R /τ acc is assumed for higher masses, the overall brightness increases with mass. Note that all of those RHs are assumed to be at the same redshift, z = 0.2. The typical size, or the cut-off radius of the profile, also increases with mass. We have assumed that the core radius of the ICM (r c ) scales with R 500 . The profile of the magnetic field, the CR injection, the turbulence for the reacceleration, and thus the synchrotron brightness also scale with R 500 . The profiles in the primary scenario (dashed lines) are slightly flatter than the secondary scenario (solid lines), because of the flatter turbulent profile assumed in the primary scenario. In addition, the brightness in the primary scenario more steeply depends on the mass than the secondary scenario.
In the right panel of Figure 11, the red and blue points show the peak luminosity at 1.4GHz for the secondary and the primary scenarios, respectively. Those points are fitted with a power-law function (4.00, 0.54) (primary). We can reproduce the steep relation with α M ≈ 3 − 4 for both the scenarios. The slope of the relation reflects the steepness of the luminosity evolution around t R /τ acc ≈ 2, shown in Figure 10. Since the dependence on t R /τ acc is larger in the primary scenario, the fit result in Figure 11 is also steeper than that in the secondary scenario.

GAMMA-RAY LIMITS IN THE SECONDARY SCENARIO
As long as the Coma RH is assumed to be in the midst of the reacceleration phase or in the very early stage of the cooling phase, our model has not strong tension with the gamma-ray limit under the magnetic field constrained from RM (Section 2.5). Recently, several papers on the analysis of Fermi data reported the detection of diffuse gamma-ray emission in the direction of Coma (Xi et al. 2018;Abdollahi et al. 2020;Adam et al. 2021). As noted by Adam et al. (2021), the flux is comparable to the expectation in the secondary reacceleration models (e.g., Brunetti & Lazarian 2011).
Although larger gamma-ray fluxes could be expected from brighter RHs, the current Fermi limits on the Coma cluster gives the most stringent constraint on the secondary model. For example, the RH in Abell 2744 is ∼ 23 times more luminous than the Coma RH, but the expected gamma-ray flux becomes ∼100 times smaller due to the large distance to the source.

NUMBER COUNT IN THE ASKAP ERA
We have focused on the RH population detectable with NVSS sensitivity. The RHLF in Section 5.5 indicates a large number of RHs with low luminosities, which would be detectable with future survey with high sensitivity radio telescopes, such as Square Kilometre Array and its pathfinders.
The Australian SKA Pathfinder (ASKAP) is a next generation radio telescope array being built at Murchison Radio-astronomy Observatory in Western Australia (Johnston et al. 2008). It consists of 36 dish antennas, each in 12-m diameter, distributed over with baselines up to 6 km. Its array configuration balances the need for high sensitivity to extended structures with the need for high resolution for continuum projects such as EMU, the "Evolutionary Map of the Universe (Norris et al. 2011). The short-spacing uv-coverage of the ASKAP array has higher sensitivity to extended structures such as cluster RHs. The EMU survey is expected to achieve the sky coverage (75%) similar to the NVSS survey (Condon et al. 1998) but with 45 times better sensitivity. We use the model explained in Section 5 to calculate the RH population observable with the ASKAP sensitivity. Compared to Section 5, the only modification is the observational limit; we here use the flux-based criterion (Eq. (28)) with θ b = 25 arcsec, F rms = 10 µJy and ζ 2 = 10, following the previous study (Cassano et al. 2012). For the criteria for the RH onset, we here only discuss the mass-ratio condition with the parameters listed in Table 2. The luminosity evolution and its mass dependence are unchanged.
In the left panel of Figure 12, we show the mass trend of f RH within 0.05 < z < 0.5. The fraction f RH at lower mass bins, where the radio power is relatively small, is larger than the values in Figure 7. The mass trend seems similar between the secondary (red) and the primary(blue) scenarios. In the right panel of Figure 12, we show the cumulative number count of RHs for models (i) and (ii). Most of RHs are expected to be found in z < 0.5. The cluster mass corresponding to the minimum luminosity observable with the ASKAP sensitivity is M 500 ≈ 2 × 10 14 M . For z > 0.5, the cut-off mass of the HMF becomes smaller than the above threshold mass. Thus, the RH detection at z > 0.5 becomes inefficient. We can expect that the ASKAP survey will detect 10 3 RHs, while only ≈ 30 RHs are available with NVSS. Since the RHLF (Figure 8) is a steep function of the luminosity (Figure 5), the number count is dominated by the low-power systems (P 1.4 3 × 10 23 h −2 70 W/Hz). Although the onset conditions are considerably different between models (i) and (ii), f RH and the number count show similar trends. The results look very similar as in Figure 8. This indicates that the survey at a single frequency is not enough to determine the CRE origin in RHs. A multi-band analysis ranging from ≈ 100 MHz to ≈ 10 GHz is important for further discussions. Especially at higher frequencies, where the CRE cooling is more efficient and the RH lifetime is expected even shorter in the primary scenario, the difference between those scenarios could be more apparent. In addition to the radio properties, the statistics of the merger mass ratio, which can be constrained from the X-ray and near-IR band observations, must be important, since it differs much between those scenarios (Section 5).

CONCLUSIONS
An increasing number of RHs have been found in recent radio observations of galaxy clusters. The correlation between RHs and dynamically disturbed clusters supports the scenario where the RH emission is powered by the reacceleration of CRs by the merger-induced turbulence, but the origin of the seed CREs for the reacceleration is still poorly constrained. In this study, we compare two possibilities for the origin; the secondary scenario, where the seed population is provided as secondary particles injected via the pp collision, and the primary scenario, where most of the seed CREs are originated from the background thermal electrons and the secondary injection from the pp collision is negligible.
We have solved the FP equations to follow the evolution of the CR distribution in RHs. Our calculation method is basically the same as the one in Nishiwaki et al. (2021), but we have newly included the radial dependence of the reacceleration efficiency in this paper. The radio and gamma-ray observations of the Coma  Figure 12. Left: RH fraction with the ASKAP sensitivity (θ b = 25 arcsec and Frms = 10 µJy) within 0.05 < z < 0.5. We adopt ζ2 = 10 and the parameters listed in Table 2 for models (i) and (ii). Right: The all-sky number count of RHs observable with ASKAP (solid lines) and NVSS (dashed lines), accumulated with respect to redshift. In both the panels, the red and blue lines show the results for the secondary and the primary scenarios, respectively.
cluster is used to constrain the model parameters. Assuming that the primary CRs are injected with the same radial profile as the thermal gas density, we have found that the radially increasing efficiency of the reacceleration reproduces both the radial profile and the spectrum of the Coma RH under the gamma-ray limit given by Fermi-LAT. Our model calibrated by the Coma RH is then used to study the lifetime of RHs in Section 3. We have followed the evolution in the cooling phase, where the reacceleration ceases. The main results in this part are summarized as follows: • There are two factors that powers the radio emission: the reacceleration of the seed CREs and the enhanced secondary injection from reaccelerated CRPs.
• In the secondary scenario, a substantial part of the emission is produced by the injection from reaccelerated CRPs. This emission lasts much longer than the cooling timescale of CREs.
• In the primary scenario, the RH emission decays within the cooling timescale (t cool ≈ 300 Myr).
• The reacceleration boosts the radio power almost two order of magnitude larger. Thus, we can clearly distinguish radio-loud and radio-quiet clusters.
Based on the merger tree provided in Section 4, we have studied the occurrence conditions of RHs to satisfy the observed RH statistics (Section 5). The most important update from previous studies is the inclusion of the emission lifetime studied in Section 3. We introduce the two conditions for the onset of RHs: mass-ratio condition and break-frequency condition. The former condition is expressed with the threshold mass ratio, ξ RH , while the latter has two parameters, i.e., the break frequency, ν b , and the fraction of the turbulent energy to the merger kinetic energy, η t .
Our results in Sections 5-9 can be summarized as follows: • The observed fraction and number counts of RHs could be explained in both the two scenarios, but the required threshold value for the merger mass ratio ξ RH differs very much: ξ RH = 0.12 and ξ RH = 0.01 are required for the secondary and the primary scenarios, respectively.
• The present Coma RH should be in the middle of the luminosity growth due to the reacceleration. The Coma-like RH would achieve its peak luminosity of P 1.4 ≈ 10 25 W/Hz at an elapsed time ≈ 700 Myr after the onset of the reacceleration in both the secondary and the primary scenarios.
• A correlation between the halo mass and the peak radio luminosity is required. This correlation can be reproduced by the mass dependence of the acceleration timescale (τ acc ∝ M −1/3 ) and a constant duration timescale of the reacceleration. Those parameter behaviors can be justified by a simple model as discussed in Appendix B.
• In order to reproduce the tight correlation between the RHs and the dynamical disturbance seen in the X-ray morphology, the threshold mass ratio to induce the merger signature in X-ray observation should be close to ξ RH . For the primary scenario, this condition requires that even minor mergers with a mass ratio ξ ≈ 0.01 should produce observable global disturbances.
• The apparent difficulty in the primary scenario can be relaxed if the emission is sustained for a few Gyrs by a gentle reacceleration.
• The RH fraction steeply decreases with decreasing halo mass. However, our model predicts a large number of RHs below the present observable flux limit. The future surveys such as ASKAP are expected to detect ∼ 1000 RHs below redshift 0.5 for both the secondary and the primary models.
In summary, we have found that both the secondary and primary models for the origins of the seed CREs are allowed from the current RH statistics. Both the models have both merits and demerits. The secondary scenario requires only major mergers for the RH onset, which is consistent with the fact that RH clusters tend to show observable morphological disturbance. However, for the Coma RH, the secondary model requires a steeper injection spectral index compared to the primary scenario (see Appendix A). That requirement is even severer for USSRHs, and the CRP energy density possibly exceeds that of the thermal ICM for such RHs in the absence of the reacceleration (e.g., Brunetti et al. 2008). On the other hand, the primary scenario provides a better fit to the Coma spectrum with a flatter injection index and potentially explains USSRHs without requiring an extremely high CR energy density. However, the short lifetime in this scenario requires frequent onsets with ξ ≈ 0.01 mergers, which is one order of magnitude smaller than the typical value estimated in RH clusters (e.g., Cassano et al. 2016). The required efficiency for the turbulence excitation is relatively high in the primary model to explain the acceleration time of ≈ 300 Myr within the formulation of Appendix B. Either an extended study dedicated for the RH population in lower (≈ 100MHz) and higher frequencies (≈ 10GHz), or a statistical analysis of the merger mass ratio of RHs would further constrain the origin of relativistic electrons in the ICM.

ACKNOWLEDGMENTS
We thank the anonymous referee for the useful comments that greatly improved the presentation of the paper. K.N. acknowledges the support by the Forefront Physics and Mathematics Program to Drive Transformation (FoPM). This work is supported by the joint research program of the Institute for Cosmic Ray Research (ICRR), the University of Tokyo, and KAKENHI No. 22K03684 (KA).

A. SECONDARY SCENARIO WITH A STEEPER INDEX
As discussed in Section 2.5, our secondary model shows a slight tension with the observed radio flux of Coma above 1 GHz. In this section, we discuss the possibility that this tension could be relaxed when we adopt a steeper injection index α inj .
We have tested the case of α inj = 2.45, 2.6, 2.8 and 3.0. Since the spectral shape of the synchrotron emission is mostly determined by t R /t acc (Nishiwaki et al. 2021), the parameter t acc (r c ) was fixed to 310 Myr. From the gamma-ray upper limit, the lower limit of t R is constrained as t R > 0.8t acc (r c ). For the turbulent profile, we firstly test the case of α turb = 0.28, i.e. the same value as that in Section 2.5. However, we find that this profile overproduces the 350 MHz intensity at r > 500 kpc for t R > 0.8t acc (r c ). A model with a flatter profile of α turb = 0.35 is compatible with both the radio profile and the gamma-ray limit. Finally, from the radio spectrum, we constrain t R = 310 Myr, .i.e., t R /t acc (r c ) = 1.0. Figure 13 shows the synchrotron spec-trum of the best fit model with α inj = 3.0. The observed spectrum is well fitted with the injection luminosity of L inj p = 1.9 × 10 44 (solid line). The CR energy density in r < 0.5R 500 becomes ε CR /ε th ≈ 0.013, where ε th is the thermal energy density averaged within r < 0.5R 500 . The fraction increases to ε CR /ε th ≈ 0.05 at the peak state (t R ≈ 500 Myr).
Throughout this paper, we have assumed r ap = 500 kpc for the aperture radius to calculate the radio spectrum. With this assumption, the overall normalization of the 350 MHz profile becomes consistent with the flux around 350 MHz. In general, the fluxes at other frequencies are taken from different observations and thus originally measured with different aperture radii. We have not taken into account the systematic uncertainty in the re-scaling of those fluxes to r ap = 500 kpc.
By assuming different r ap , we test the case when the normalization is not anchored to the flux at 350 MHz. Assuming r ap = 400 kpc, the calculated flux slightly shifts down and the fit to the data points at high frequency is slightly improved (dashed line). r ap = 500kpc r ap = 400kpc Figure 13. Synchrotron spectra of the Coma RH for the secondary scenario with αinj = 3.0 (solid: rap = 500 kpc, dashed: rap = 400 kpc).

B. REACCELERATION EFFICIENCY IN THE BINARY MERGER APPROXIMATION
In this section, we discuss the efficiency of the turbulent reacceleration triggered by a cluster merger. As explained in Section 5, we adopt the binary merger approximation. The mass of the descendant halo and the merger mass ratio are denoted as M and ξ = M 2 /M 1 , respectively. The impact parameter is assumed to be zero for simplicity. We especially focus on the CR acceleration by TTD with compressible turbulence. As assumed in Section 2.3, the interaction between turbulence and particles is assumed to be fully collisionless (Brunetti & Lazarian 2011). The discussion below is based on the one-zone approximation, unlike in Section 2.3, where we have considered the radial dependence.
To quantify the turbulent energy induced by mergers, we introduce a parameter η t , which denotes the fraction of the turbulence energy dissipated from the merger kinetic energy. With the volume of the turbulent region V t , the turbulent energy density is written as ε t = η t GM 1 M 2 /R mer /V t , where R mer is the distance between the centers at the impact. Following Cassano & Brunetti (2005), we choose R mer = R 1 , i.e., the physical radius of the major progenitor. Eq. (10) provides the radius of the cluster. Cassano & Brunetti (2005) evaluates V t from the comparison between typical radius of the RHs and the stripping radius, where the ram pressure experienced by the minor progenitor becomes comparable to its static ICM pressure. We adopt a more simplified approach, V t ≈ 4π 3 (R 1 R 2 ) 3/2 , assuming that V t increases with both M 1 and M 2 . The ξ dependence of ε t is affected by the choice of R mer and V t , while M dependence is not apparently affected, as long as both R mer and V t are expressed as functions of R 1 and R 2 (or M 1 and M 2 ). We can evaluate the turbulent velocity at the scale of the injection from ρV 2 L ≈ ε t as, V L ≈ 1.25 × 10 3 km/s ξ 1 4 (1 + ξ) where ρ and n are the mass and number densities of the ICM, respectively. The subscript L denotes the injection scale of the turbulence. After a merger with ξ = 0.2, the turbulent energy density in the cluster with M 500 = 1.0 × 10 15 M becomes ≈ 40% of the thermal energy density. This fraction is almost independent of the mass, since the mass dependence of the turbulent energy is similar to that of the temperature.
We adopt the IK scaling, and approximate the outer scale of the turbulence as the size of the minor progenitor, i.e. L ≈ R 2 . Since the ICM is a high-beta plasma, the sound speed c s and k cut characterize the timescale of TTD. We simply adopt the observed M − T relation (Eq. (29)) to calculate c s for different masses. Because the observed relation is similar to the expectation from the virial theorem, T ∝ M 2/3 , the turbulent Mach number is nearly independent of the mass.
We assume that the wave damping is dominated by the TTD interaction with non-relativistic thermal electrons in a high-beta plasma. In this case, the decay timescale of turbulence at scale k is ∼ m p /(m e kv e th ), where v e th is the thermal velocity of the electrons. The cut-off scale k cut appears where the cascade timescale becomes comparable to the decay timescale. For the IK turbulence, the cascade timescale becomes t kk ≈ 2 9 cs V 2 L k −1/2 L k −1/2 , and k cut is approximated as k cut ≈ 1.08 × 10 4 M 4 s k L , where M s = V L /c s is the Mach number for the turbulent velocity at the injection scale (e.g., Brunetti & Lazarian 2007). Since k cut is much larger than k L , D pp (Eq. (7)) can be written as In the ICM, x = c s /c 1 so I θ (x) = x 4 4 + x 2 − (1 + 2x 2 ) ln x− 5 4 has only a log-dependence on x. Neglecting this dependence, D pp scales as D pp ∝ ξ 2/3 (1+ξ) −1 M 1/3 . The acceleration timescale, t acc = p 2 /4D pp , is plotted in Figure 14 (left panels). The mass dependence is slightly weaker than M 1/3 , since the integral I θ (c s /c) (Eq. (7)) has a weak dependence on c s . The typical acceleration timescale assumed in Section 7, t acc ≈ 250 Myr, could be reproduced with η t = 0.15 and ξ = 0.2 (solid line in the upper left panel). To achieve a similar acceleration efficiency with ξ = 0.01 mergers, the efficiency of turbulent excitation should be as large as η t ≈ 0.4 (green dashed line). In Section 7, we have shown that the steep mass dependence in the observed P 1.4 − M 500 relation could be explained by the reacceleration with a constant duration. One possible explanation for this model is that the duration is regulated by the timescale of the turbulent cascade, t kk , rather than that of the dynamical motion that injects turbulence. The cascade timescale at the injection scale can be expressed as (Brunetti & Lazarian 2007), where D kk is the wave-wave diffusion coefficient (e.g., Brunetti & Lazarian 2007). Adopting the same models for L and V L as that adopted for t acc , one can find the scaling of t kk ∝ ξ −1/6 (1 + ξ) −1 M 0 . The duration adopted in Section 7, t R ≈ 600 Myr, corresponds to1.5×t kk for ξ = 0.1.