Radial Transport in the Earth’s Radiation Belts: Linear, Quasi-linear, and Higher-order Processes

Observational studies of the Earth’s radiation belts indicate that Alfvénic fluctuations in the frequency range of 2–25 mHz accelerate electrons to relativistic energies. For decades, statistical models of radiation belts have quantified the impact of Alfvénic waves in terms of quasi-linear diffusion. However, quasi-linear models are inadequate to quantify Alfvénic radial transport occurring on timescales comparable to the azimuthal drift period of 0.1–10 MeV electrons. With recent advances in observational methodologies offering coverage of the Earth’s radiation belts on fast timescales, a theoretical framework that distinguishes between fast and diffusive radial transport can be tested for the first time in situ. In this report, we present a drift-kinetic description of radial transport for planetary radiation belts. We characterize fast linear processes and determine the conditions under which higher-order effects become dynamically significant. In the linear regime, wave–particle interactions are categorized in terms of resonant and nonresonant responses. We demonstrate that the phenomenon of zebra stripes is nonresonant and can originate from injection events in the inner radiation belts. We derive a radial diffusion coefficient for a field model that satisfies Faraday’s law and that contains two terms: one scaling as L 10 independent of the azimuthal number m, and a second scaling as m 2 L 6. In the higher-order regime, azimuthally symmetric waves with properties consistent with in situ measurements can energize 10–100 keV electrons in less than a drift period. This process provides new evidence that acceleration by Alfvénic waves in radiation belts cannot be fully contained within diffusive models.


Motivation and Background
Radiation belts are torus-shaped plasma environments confined by planetary magnetic fields.Due to porous boundaries and energy-momentum deposition from the solar wind, the Earth's radiation belts are continuously driven away from a state of local thermodynamical equilibrium (LTE).With very low particle densities7 and mean free times between collisions of the order of several months to a few years, the Earth's radiation belts are weakly collisional but respond rapidly to departure from LTE by sustaining a wide range of plasma instabilities that mimic collisions and thermalize the plasma.The plasma instabilities result in a broad spectrum of fluctuations that accelerate particles to relativistic energies on timescales of a few hours to a few days.With electron's energies spanning almost seven orders of magnitude, and reaching as high as several MeV, the Earth's radiation belts are the closest natural laboratory, in which charged particles are accelerated close to the speed of light (Roederer & Zhang 2014).
From a fundamental physics perspective, it is an observational fact that planetary radiation belts and a plethora of astrophysical plasma environments are efficient particle accelerators.The Earth's radiation belts constitute the most accessible environment to perform detailed in situ studies relevant to a wide range of fundamental physics problems, such as cosmic rays' acceleration (Cronin 1999), upper and middle atmosphere climatology (Turunen et al. 2009), and even the microphysics of accretion disks (Quataert & Gruzinov 1999;Sironi & Narayan 2015).With electron to magnetic pressure ratio (β e = 2μ 0 n e k B T e /B 2 ; 0.1-0.01)and relativistic electron energies (γm e c 2 ; 10 MeV) in accretion disks comparable to the Earth's radiation belts (β e ; 10 −3 -10 −1 and γm e c 2 ; 1-10 MeV), kinetic plasma physics near black holes (but far from the event horizon), lies at our doorstep!From an applied physics perspective, and due to their high energies and confinement location around geostationary orbits, radiation belts' particles constitute a threat to satellites orbiting Earth, and are therefore a research focus for communication and military industries.Driven by fundamental scientific questions and risk mitigation to communication infrastructures, radiation belt research aims to quantify the acceleration and loss confinement processes of energetic electrons (Cannon 2013;Hands et al. 2018;Horne et al. 2018).
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence.Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
More than 60 yr of research following the discovery of the Earth's radiation belts (Van Allen et al. 1958), plasma physicists have identified two dominant mechanisms responsible for the transport and acceleration of charged particles: (1) spatially localized wave-particle interactions driven by smallscale kinetic fluctuations (Thorne 2010), and (2) large-scale electromagnetic fluctuations induced by global magnetospheric currents and encompassed under the formalism of radial diffusion (Lejosne & Kollmann 2020).Both mechanisms can be understood in terms of the theory of adiabatic invariants in nearly periodic Hamiltonian systems (Cary & Brizard 2009).In the absence of collisions, the motion of magnetically trapped electrons can be decomposed fully in terms of three separate motions with very distinct timescales: 1. Larmor motion around the local magnetic field (Ω ; 1-10 kHz), 2. The bounce motion between magnetic mirror points (ω b ; 0.1-1 Hz), 3. The azimuthal drift around the Earth's midplane (Ω d ; 0.1-1 mHz).
In order to break one of the three periodic motions, a wave with a frequency higher or comparable to one of the periodic motions has to interact with the particles.Since the Earth's radiation belts sustain broadband fluctuations with frequencies ranging between 10 −4 and 10 4 Hz (Murphy et al. 2020), all three invariants can repeatedly be violated.Small-scale kinetic fluctuations accelerate electrons if one of the first two adiabatic invariants μ = E k⊥ /B and p ds . With the first and second adiabatic invariant conserved, particles carried closer to Earth gain energy through a betatron process (Kulsrud 2005) as they sample a larger magnetic field, whereas particles diffusing to higher radial distances sample a weaker magnetic field, lose energy, and experience a greater likelihood of losses at the outer magnetopause boundary (Turner et al. 2012;Hudson et al. 2014;George et al. 2022).
Similarly, violation of the first and second adiabatic invariants for a collection of particles is also modeled in terms of Fokker-Planck equations (Lichtenberg & Lieberman 1983).Contrary to radial diffusion, scattering associated with the first two invariants results in a localized enhancement along the radial distance.
From an observational perspective it has therefore been possible to infer which acceleration mechanism is dominant by computing from satellite data the distribution function in terms of the three adiabatic invariants, i.e., * f L , , ( )  m (Green & Kivelson 2004).As shown in Figure 1, if radial diffusion dominates, the distribution function results in a flattening along the radial Figure 1.Illustration of the conceptual frameworks for the acceleration of charged particles in planetary radiation belts.Following the injection of particles at t = 0 (darker-shaded region) and the generation of plasma instabilities, the phase-space density will be deformed.In the left panel, Alfvénic fluctuations drive radial diffusion and a flattening of the phase-space density along the equatorial radial distance (Lejosne & Kollmann 2020).Particles scattered to lower radial distance sample a larger magnetic field, and gain energy through a betatron process.In comparison, the signature of small-scale fluctuations consists of a localized enhancement along the radial distance (Green & Kivelson 2004), as shown in the right panel.The radial shift of the peak in the right panel illustrates that violation of the first and/or second adiabatic invariant results in a change in the third adiabatic invariant as well (Öztürk & Wolf 2007;O'Brien 2014;Desai et al. 2021).Both frameworks are expressed in terms of Fokker-Planck equations.Transport by Pc4 and Pc5 Alfvénic waves is encoded in a radial diffusion coefficient D LL .Transport by small-scale interactions is encoded in an energy diffusion coefficient D EE (Summers 2005;Shprits et al. 2006). 8It should be kept in mind that when the background dipole magnetic field is deformed on long timescales compared to the drift period, the third adiabatic invariant Φ does not map into the normalized radial distance L since . The background magnetic field model used in this communication is dipolar and the third adiabatic invariant can be interpreted as an effective radial distance.
distance, but if small-scale waves are the primary drivers, localized enhancements along the radial distance should be observed.Contemporary observational and modeling studies of the radiation belts rely on this conceptual framework to determine which of the two mechanisms dominate on timescales of hours to several days (Chen et al. 2007;Reeves et al. 2013;Jaynes et al. 2018).

Benefits of Quasi-linear Models of the Earth's Radiation Belts
The theoretical framework to quantify and interpret the dynamical evolution of radiation belts on timescales of a few hours to several days relies exclusively on quasi-linear theories (Fälthammar 1965;Kennel & Engelmann 1966;Diamond et al. 2010;Brizard & Chan 2022).The overwhelming reliance on quasi-linear models in radiation belt research is not fortuitous as it offers two benefits alternative computational and theoretical approaches lack: 1. Computationally inexpensive reduced models.The full particle motion requires a seven-dimensional description (three adiabatic invariants with three associated phases plus time).Since energetic electrons span four orders of magnitude in energy, and more than six orders of magnitude in time and space, reduced statistical models are necessary to account for geomagnetic storms occurring on timescales of at least a few hours.Quasilinear models for small-scale wave-particle interactions (Summers 2005;Shprits et al. 2006) and radial diffusion (Lejosne & Kollmann 2020) take the form of Fokker-Planck equations that are computationally inexpensive and can be easily implemented in global magnetospheric models.2. Generalizability.With sparse measurements of electric and magnetic fields responsible for the violation of the three adiabatic invariants, quasi-linear models encode wave-particle interactions in diffusion coefficients that have simple algebraic forms.For instance, radial diffusion coefficients are amenable to parameterization in terms of ground magnetometers' measurements (Brautigam & Albert 2000;Sarma et al. 2020) that are correlated with fluctuations that drive dynamic radiation belts.Current quasi-linear models can therefore be generalized to periods of unavailable in situ measurements.
Quasi-linear model comparisons with data yields, in several events, accurate estimates of electron fluxes (Reeves et al. 2013;Thorne et al. 2013;Jaynes et al. 2015).However, the dominance of quasi-linear models also stems from the fact that building statistical models that are a departure from quasi-linear assumptions is an outstanding theoretical challenge, since it falls into the class of multiscale nonlinear problems (Dupree 1966;Orszag & Kraichnan 1967;Dupree 1972;Schekochihin et al. 2008;Diamond et al. 2010;Davidson 2012).9Moreover, a multipoint satellite methodology that can quantify the evolution of energetic particle fluxes on a timescale comparable to a drift period has only recently been developed with the availability of 140 keV-4 MeV electron GPS fluxes calibrated with the Van Allen Probes (Morley et al. 2016;Kalliokoski et al. 2023).GPS instruments combined with the Van Allen Probes offer, for the first time, an unprecedentedly large number of measurement points, thus providing a broader spatial coverage of the radiation belts and a better temporal resolution in terms of drift shells.Energetic electron fluxes inferred from GPS electron counts and calibrated against MagEIS and REPT instruments on board the Van Allen Probes (Morley et al. 2017) can be used to quantify processes that are too fast to be quantified by radial diffusion.Probing radiation belt processes on timescales of the drift period is now observationally possible, and statistical models that quantify the impact of Pc4 and Pc5 waves on fast timescales comparable to the drift period are missing.
New tools for the radiation belts that can complement and supersede quasi-linear models would have to provide the benefits listed above in order to be incorporated into global models.In this paper, we provide the theoretical framework to address the limitation of radial diffusion models and extend radial transport beyond a quasi-linear description.But before doing so, we describe the limits of quasi-linear theory and how it constrains the interpretation of radiation belt observational studies.

On the Need for a New Theoretical Framework of Radial Transport
Quasi-linear models in the radiation belts are mean-field theories that assume that the average interaction of electrons with small-amplitude waves will describe accurately the long timescale evolution of the particles and that nonlinearities arising due to mode-mode coupling or particle orbits can be neglected.Quasi-linear models in the radiation belts therefore contain the following inherent constraints:10 1. Scale separation between fast and diffusive timescales.In quasi-linear models, the cumulative effect of many waves on the distribution functions is slow and diffusive (Vanden Eijnden 1997).This slow timescale for diffusion is contrasted with the fast timescales associated with a single encounter/transit time of a wave with the particles.When the timescales for diffusion becomes comparable to the transit time for the wave-particle interactions, the quasilinear hierarchy breaks down (Kennel & Engelmann 1966).2. Absence of nonlinear processes.The fast response of the distribution function is assumed to be unperturbed and nonlinear processes such as particle trapping (Bernstein et al. 1957;Artemyev et al. 2012;Osmane et al. 2016) or mode-mode coupling (Schekochihin et al. 2016;Adkins & Schekochihin 2018) are ignored.
On the basis of the first constraint, the slow diffusion expressed in terms of a Fokker-Planck equation cannot be used to describe the acceleration of particles on fast timescales comparable to a single interaction or transit time.Nonetheless, current diffusion coefficients used for radial transport become sufficiently large during high geomagnetic activity (Brautigam & Albert 2000;Ozeke et al. 2014;Sandhu et al. 2021) to result in violation of the scale separation constraint of quasi-linear theory.For instance, Figure 4 of Ozeke et al. (2014) shows that the diffusion coefficient D LL can be of the order of 10 2 -10 3 days −1 for Kp > 5. Consequently, the diffusion time for a particle to be carried across one drift shell ΔL * scales between τ D ; 15 minutes and a few minutes.Similarly, the impact of radial transport on losses cannot be quantified in terms of quasilinear models if particles are depleted on timescales comparable to or less than an azimuthal drift period.Olifer et al. (2018) show through observations that fast losses on timescales as short as half an hour can take place during intense magnetic storms.Such transport timescales are inconsistent with a quasi-linear theory relying on a scale separation between fast and slow timescales, with the fast timescales comparable to azimuthal drift orbits of the order of tens of minutes to a few hours.
The second constraint can be justified on the basis that largeamplitude fluctuations are statistically rare occurrences: an electron will be scattered hundreds of times by small-amplitude fluctuations before encountering a large-amplitude wave.However, from a theoretical perspective, wave amplitudes do not need to be very large for higher-order effects to become comparable to linear terms and for a quasi-linear theory to break down.This property of nonlinear systems is well known among astrophysical and fluid turbulence research and underlies the assumption of critical balance, in which the transit time becomes comparable to the nonlinear interaction time (Goldreich & Sridhar 1995). 11 Observational evidence and theoretical studies of fast and nonlinear processes at the heart of the Earth's radiation belts have become substantial in the last 15 yr but are typically associated with electron-scale whistlers and chorus (Bortnik et al. 2008;Cattell et al. 2008;Cully et al. 2008;Albert et al. 2012;Artemyev et al. 2012Artemyev et al. , 2015;;Mozer et al. 2013;Malaspina et al. 2014;Santolík et al. 2014;Agapitov et al. 2015;Osmane et al. 2016Osmane et al. , 2017;;Tao et al. 2020;Omura 2021) and ion-scale EMIC waves (Hendry et al. 2019;Bortnik et al. 2022;Grach et al. 2022).With the exceptions of the numerical studies of Li et al. (1993Li et al. ( , 2018)), Degeling et al. (2008), Hudson et al. (2017), and extreme driving events such as the one reported by Kanekal et al. (2016), fast and higher-order radial transport are rarely considered and have yet to be accounted for in global models.However, observational studies demonstrate the existence of large-amplitude fluctuations that can sustain radial transport.For instance, Hartinger et al. (2013) demonstrated that transient foreshock perturbations during moderate geomagnetic periods lead to the generation of ultralow frequency (ULF) electric and magnetic fields as high as 10 mV m −1 and 10 nT, respectively.A statistical study by Simms et al. (2018) and an information-theoretic analysis by Osmane et al. (2022) characterized the statistical dependence of energetic electron fluxes in the Earth's radiation belts on ULF wave power measured on the ground and at geostationary orbit.Both studies demonstrated that ULF wave power is nonlinearly coupled to energetic electron fluxes. 12And as higher-order effects become significant, the scale separation constraint of quasi-linear models also breaks down.In this paper, we present a theoretical framework to distinguish quasi-linear diffusion from fast linear and higher-order processes.

Next Generation of Radial Transport Models for Radiation Belts
The physics of the Earth's radiation belts is nonlinear, high dimensional, and multiscale and it is not computationally possible to resolve energetic particle motion ranging from milliseconds to hours during geomagnetic storms that can last from several hours to a few days.Consequently, reduced statistical models relying on quasi-linear theories have been developed to predict the dynamical evolution of energetic electrons in terms of physical drivers (i.e., in the solar wind and the magnetosphere).With growing satellite measurements and coverage, we now know that large-amplitude Alfvénic fluctuations and fast processes occurring on timescales beyond the reach of quasi-linear radial diffusion are commonly observed in the radiation belts (Li et al. 1993;Hudson et al. 1995;Turner et al. 2012;Hartinger et al. 2013;Kanekal et al. 2016;Olifer et al. 2018).The current modeling tools are therefore unable to quantify the impact of fast and/or higherorder radial transport on the energetic electrons, and thus unable to distinguish it from small-scale wave-particle interactions (Lejosne et al. 2022).Figure 2 illustrates the spatial and temporal scales covered by radial diffusion in comparison to characteristic waves and particle motions.In order to characterize processes occurring on fast timescales, we need to use a reduced statistical framework that accounts for variations during the drift motion.Drift kinetic models have been developed for decades, mostly for laboratory fusion plasma (Goldston & Rutherford 1995;Parra & Catto 2008), but is an ideal starting point to quantify the impact of Pc4 and Pc5 ULF waves on energetic electrons that belong to the long wavelengths (kρ e = 1) and short frequency limit (ω/Ω e = 1).

Summary of the Main Results
1.The choice of the magnetic field model to quantify radial transport is essential for radial transport models and needs to respect Maxwell's equations.If Faraday's equation is violated, we show that Liouville's theorem is also not respected, and thus phase-space density is not conserved.This result also has implications for test-particle experiments in global magnetospheric simulations (Tu et al. 2012).If Faraday's equation is not respected in the simulation box, the construction of the distribution function from the particle trajectories can violate Liouville's theorem.2. The linear wave-particle response of the distribution function to a single Alfvénic ULF mode consists of three separate terms, two nonresonant processes and one resonant one: (1) a nonresonant modulation of the distribution function in terms of the ULF wave frequency ω, (2) a nonresonant modulation of the distribution function in terms of particle's drift frequency Ω d , known as drift echoes, and (3) a drift resonant response in the instance where the frequency of the ULF wave corresponds the drift frequency of the particle, i.e., ω ; Ω d .All three responses are a function of the radial gradient in the background distribution function, and the 11 In critical balance, the linear transit timescale (the time it takes for an Alfvén wave packet to transit across another Alfvén wave packet) becomes comparable to the nonlinear interaction time.In the radial transport problem, the transit timescale (the time it takes for a magnetically trapped particle to transit/sample an Alfvén wave) becomes comparable to the time it takes for higher-order effects to be felt.This is quantified in Section 3.4.
12 Counterintuitively, energetic electrons with 100 keV were shown to possess the largest statistical dependency with ULF waves that should only resonate with relativistic electrons >1 MeV.In Section 3.4.2,we provide a nonresonant mechanism, unaccounted by quasi-linear radial diffusion, that can explain the results of Simms et al. (2018) and Osmane et al. (2022) as a result of ULFdriven impulsive acceleration of 100-400 keV.
modulation in terms of the ULF wave, sometimes interpreted as evidence of drift resonance (Claudepierre et al. 2013), can also be the product of a nonresonant interaction.3. The formation of Zebra stripes does not require drift resonant interactions, and can be the signature of injected particles in the inner belts in the absence of ULF waves and radial gradients of the distribution function.We argue that the injection events reported by Zhao & Li (2013) provide all the necessary ingredients for the formation of zebra stripes.4. We derive from the drift kinetic equation a quasi-linear radial diffusion coefficient that consists of two terms.The first term is independent of the wave azimuthal number m and scales as L 10 , and the second term is a function of the azimuthal wavenumber and scales as L 6 .The diffusion coefficients account for electric and magnetic field fluctuations that respect Faraday's equations, and thus, the separation of the diffusion coefficient in terms of an electric and magnetic D LL , as commonly used in the literature (Fei et al. 2006;Ozeke et al. 2014;Sandhu et al. 2021), becomes redundant.Our derived diffusion coefficient can be computed on the basis of the magnetic field wave power alone.5. We provide criteria to determine the limit where higherorder radial transport processes become significant on timescales comparable to the drift period.We demonstrate that when higher-order effects are accounted for, symmetric and compressive ULF waves can accelerate electrons with energies of the order of 10 to a few hundred keV by convecting them inward.This process is a higher-order generalization of the mechanism presented by Parker (1960) and does not require drift resonance.

Drift Kinetic
In a strongly magnetized plasma, charged particle motion can be split into a fast gyration around the local magnetic field and the motion of its guiding center.The Larmor motion is analytically solvable when the electric and magnetic fields, E and B, respectively, are assumed constant in time and uniform in space.However, this solution can also be extended to more general electromagnetic fields that are approximately constant on timescales comparable to the Larmor period m q B s s s 1 W = and spatial scales of the order of the Larmor radius ρ = v/Ω s , where v is the characteristic speed of particles sampling the field, q s is the charge, and m s is the rest mass of a particle species (s = p for protons and e for electrons).
We consider a system with characteristic scale size l and frequency ω ∼ v/l.The time and spatial scales of the system are  (2000) for L = 8 and Kp = 6, which corresponds to strong geomagnetic conditions.A D LL at L = 8 and Kp = 6 indicates radial transport over one L-shell on a timescale of 30 minutes.For a >4 MeV electron, a drift period is of the order of 3 minutes, and radial diffusion over one drift shell after 10 azimuthal drift periods is very fast, but perhaps possible through quasi-linear diffusion.For lower-energy electrons, e.g., 400 keV, a complete azimuthal drift is of the order of 20 minutes, and diffusion over one drift shell in less than two azimuthal drift is inconsistent with the quasi-linear assumption of small changes over fast timescales.It should therefore be kept in mind that the range of validity of quasi-linear radial diffusion becomes smaller for less energetic particles.estimated from derivatives of the electromagnetic fields a sufficiently strong background magnetic field, the small parameter ε can be defined as this limit the particle does not sense significant variations in the electromagnetic field during characteristic Larmor time and spatial scales.By choosing appropriate coordinates, the fast gyration around the guiding center can be ignored and a kinetic theory for a collection of particles in a magnetized plasma can be constructed. 13Put differently, starting from the Lorentz equation or Hamilton's equations to compute the particle motion for slowly varying electromagnetic fields, one can build a statistical description of particles confined by large-scale inhomogeneous magnetic fields (Goldston & Rutherford 1995;Parra & Catto 2008;Cary & Brizard 2009;Hazeltine & Meiss 2013).In the Earth's radiation belts, such a description is therefore appropriate for energetic electrons with Larmor periods 0.1 1 e 1 -Wms, and interacting with electromagnetic fluctuations in the Pc4 (ω ∼ 8-25 mHz) and Pc5 (ω ∼ 2-7 mHz) ULF range. 14 In this study, we use a kinetic theory of guiding centers known as drift kinetics to quantify the radial transport of energetic particles interacting with ULF fluctuations.Our starting point is the conservative drift kinetic equation derived recursively by Hazeltine (1973 in terms of the gyro-averaged distribution function 〈f〉 defined as ò p m q q á ñ = p the guiding center position vector r, parallel velocity v ∥ , and gyrophase θ g , first adiabatic invariant μ, to account for the relativistic correction that appears for particles with kinetic energies of E c = m e c 2 (γ − 1) comparable to the electron rest mass m e c 2 = 511 keV. 16 The appearance of the magnetic field amplitude B in Equation (3) originates from the Jacobian when one transforms variables from (r, v) to (r, μ, v ∥ , θ g ).In the absence of collisions, conservation of phase-space density for a collection of guiding center particles requires that the following equation be respected: Equation (6) is a statement of Liouville's theorem, and is a function of the electromagnetic field model and of the guiding center's particle trajectory.In open systems the impact of electromagnetic fluctuations will naturally lead to transport to the boundaries, and thus to irreversible losses.Terrestrial and planetary radiation belts are not closed systems and the inner and outer boundaries allow for particles' injection and losses (Millan & Thorne 2007;Aryan et al. 2020;Walton et al. 2022).However, the wave-particle interactions with ULF waves, in the absence of boundary effects, have to conserve phase-space density.Equation (6) is, therefore, a different statement, independent of the presence of porous boundaries, and determines whether phase-space density, and thus the number of particles, is conserved in a closed phase-space volume.The choice of a fields' model that violates phase-space density is unphysical and necessarily results in erroneous quasi-linear diffusion coefficients.For instance, if a field model that does not conserve phase-space density is chosen, and boundary effects are added, the resulting losses would either be amplified or underestimated.Liouville's theorem can therefore be used as a constraint for the electromagnetic fields, as shown in Section 2.2.The particle guiding center description in the (r, v ∥ , μ) phase space, for a given problem, is a function of the strength of the electric field when compared with the magnetic force.If the characteristic speed of the particle is comparable to the E × B drift, additional sources for perpendicular drifts can be ignored.For instance, in the collisionless MHD approximation, the perpendicular velocity of ion and electron fluids are to first order comparable to the E × B drift and MHD fluid equations can be derived from the kinetic equation with the perpendicular velocity approximated by the E × B drift (Hazeltine 2018).However if additional drifts are comparable in size to the E × B drift, or if the characteristic speed of a particle population is much greater than the E × B drift, perpendicular velocities of ions and electrons are going to decouple, and additional drifts have to be taken into account.Hazeltine (1973) suggests two regimes to account for the ordering of the E × B in a given problem: the high flow regime, with strong perpendicular electric fields |E ⊥ | ; vB, and the low flow regime, with small electric fields, making the E × B drift small compared to the characteristic speed of the particle.Thus, in the high flow regime, the perpendicular electric field can be 13 http://www-thphys.physics.ox.ac.uk/people/FelixParra/CollisionlessPlasmaPhysics/CollisionlessPlasmaPhysics.html 14 Terrestrial and planetary radiation belts also sustain high-frequency electromagnetic fluctuations with characteristic frequencies ω comparable to the Larmor frequency Ω s , e.g., the whistler-mode wave branch at Earth (ELF/ VLF) (see Ukhorskiy & Sitnov 2012 for more details).The drift kinetic description relying on the small parameter ordering, Equation (2), can therefore not be generalized to wave-particle interactions with such modes and one needs to resort to a full Maxwell-Vlasov system (Kulsrud 2005). 15A pedagogical step-by-step derivation of Hazeltine's (1973) results can be found in the lecture notes at http://www-thphys.physics.ox.ac.uk/people/FelixParra/CollisionlessPlasmaPhysics/CollisionlessPlasmaPhysics.html. 16In the Earth's radiation belts, particles are injected at energies of the order of 1-100 keV, but are accelerated to energies comparable to the rest mass and as high as a few MeV (Turner et al. 2017).It is therefore crucial to keep track of the relativistic effects.In our particular problem limited to equatorially trapped particles, the relativistic effects appear in the first adiabatic invariant but an extension to non-equatorially trapped particles will require a relativistic representation of the drift kinetic equation in terms of the parallel momentum p ∥ = m e γv ∥ .
comparable to the magnetic force, and the E × B drift is the dominant drift.In the low flow ordering, the perpendicular electric field cannot balance the magnetic force, and since the E × B drift is not dominant, additional magnetic drifts, such as the curvature drift and the magnetic gradient drift −μ∇B have to be included.
For an application to energetic electrons in the Earth's radiation belts possessing kinetic energy ranging between hundreds of keV and a few MeV, and interacting with ULF waves, the low flow regime is the correct limit since it accounts for the dominance of the magnetic gradient drift over the E × B drift.Dominated by the magnetic gradient drift, energetic electrons in the Earth's radiation belts perform one complete azimuthal loop on timescales ranging from a few minutes, for MeV electrons, to a few hours for 50 to a few hundred keV electrons.In comparison, the additional drifts present in Equation (7) are weaker on such timescales.However, we keep track of additional drifts since they are cumulatively responsible for the irreversible transport of particles across drift shells on long timescales of several hours to a few days (Lejosne & Kollmann 2020).
In the low flow regime, the position is to first order in the small parameter ε evolving according to 17 in terms of the local magnetic field direction b = B/B.The five terms are, respectively, the velocity parallel to the magnetic field, the Baños parallel drift, the E × B drift, the curvature drift, and the magnetic gradient drift.Coupled with particle's position, the evolution of the parallel velocity is given by . 9 Combining Equations (3), ( 7), (8), and (10) with a model of electromagnetic fields consistent with Liouville's theorem (Equation ( 6)), one can quantify the evolution of the distribution function for a collection of energetic particles in planetary magnetosphere on timescales much shorter than quasi-linear times and therefore comparable to the azimuthal drift periods of magnetically confined particles.The drift kinetic approach therefore provides the foundation for a variety of models (linear, quasi-linear, nonlinear, with or without porous boundaries) to account for ULF radial transport of particles.
A priori the set of drift kinetic equations is nonlinear when coupled with Maxwell's equation and therefore not easily tractable analytically.However, the equations can be simplified when energetic particles confined to the equator of the Earth's magnetosphere are studied.Equatorially trapped particles have pitch angles -^and thus v ∥ = 0.Moreover, the absence of a ULF parallel electric field results in 0  m = , v 0   = , and the evolution of the conservative kinetic equation for the distribution function f (r, v ∥ = 0, μ = μ c ), for a fixed magnetic moment μ c , takes the simple form of In the remaining part of this communication, we will use a kinetic equation, Equation (11), to describe equatorially trapped particles and leave the generalization to non-equatorial particles (α ≠ π/2) for future work. 18But before solving the kinetic equation we need to complement it with a a model of electromagnetic fields.

Review of Electromagnetic Fields Used for Radial Diffusion Models
In this section, we review the electromagnetic fields that have been chosen to model ULF radial transport.We focus solely on electromagnetic models that can be written analytically and that have been used to model coefficients for Fokker-Planck equations.Our aim in this section is also to demonstrate that an arbitrary choice of electromagnetic fields can violate the conservation of phase-space density given by Equation (6).

Mead Field
The Mead field (Mead 1964) consists of the superposition of two perturbations: an azimuthally symmetric fluctuation with amplitude S(t) and an azimuthally asymmetric fluctuation A t r cos ( ) ( ) j superposed to a background magnetic dipole field of amplitude B R r E E 3 3 .The Mead model has the benefit of being mathematically simple yet containing all the necessary ingredients, through the presence of an asymmetric perturbation, for the violation of the third adiabatic invariant experienced by a collection of magnetically trapped particles.The Mead field was therefore a natural choice for early models of radial diffusion (Fälthammar 1965;Schulz & Eviatar 1969;Schulz & Lanzerotti 1974) and has been used as the field model for empirical (Brautigam & Albert 2000;Cunningham 2016) and theoretical studies (Lejosne 2019; Osmane & Lejosne 2021) of quasi-linear radial diffusion in recent years.
In our analysis, we will argue that the choice of the Mead field is preferable for analytical studies. 19As stated in 17 Terms of order ε 2 ; (ρ/l) 2 are neglected. 18Since ULF waves propagate off the equatorial plane (Sarris et al. 2022), additional drifts have to be accounted for non-equatorial particles. 19One particular exception where a Mead field is clearly inappropriate consists of events such as the one on 1991 March 24 (Li et al. 1993;Hudson et al. 1995) in which an electromagnetic pulse is highly localized in magnetic local time and radial distance.For weak geomagnetic activity (Kp < 4), the Mead field with a single m = 1 mode is a good representation (Lejosne et al. 2013).For strong geomagnetic activity and ULF modes that are not spatially localized, the Mead field containing multiple Fourier modes should provide a reasonable fit of the wave field.
Section 2.1, we will focus here exclusively on equatorially trapped particles, but note that a generalization to non-equatorial particles can also be done.We also generalize the Mead field to antisymmetric perturbations with azimuthal wavenumbers m ≠ 1 This generalization of the Mead field will have little incidence for the linear and quasi-linear radial transport equation since the perturbed distribution function due to various m modes are independent from one another another.In the higher-order regime of radial transport in turn, as shown in Section 3.4, mode coupling of various m modes can interact with one another.
Thus, the magnetic field for equatorial particles can be written in cylindrical coordinates (r, j, z), with r the radial distance, j the azimuthal angle, and z the cylindrical axis direction: The above Mead field results in two drifts, the E × B drift, and the magnetic gradient drift20 written for the electron charge written in terms of the background magnetic dipole magnitude B 0 = B E R 3 /r 3 and the magnitude B B S t Conservation of phase-space density for a collection of particles trapped in a magnetic dipolar field and interacting with ULF fluctuations can be written as in which the first term on the right-hand side of Equation ( 16) is the projection of Faraday's law along the background magnetic field direction b.Since we are focusing solely on equatorially trapped particles for the Mead field we can switch to cylindrical coordinates (r, j, θ = z).Thus, phase-space density is always conserved for particles confined in a magnetic dipole if Faraday's law projected onto the mean field is respected.A corollary is that the choice of time-varying electric fields that do not satisfy Faraday's law does not satisfy Maxwell's equation and also has the additional undesirable consequence that it does not conserve phase-space density.Since the electric field in the Mead model satisfies Faraday's equation, the Mead field conserves phase-space density.The choice of the Mead field is therefore appropriate for developing a kinetic theory of radial diffusion.Elkington et al. (2003) argued that enhanced radial diffusion could take place by accounting for an asymmetric background magnetic field attributed to periods of high solar wind pressure and solar wind speeds.In their model, Elkington et al. (2003) chose a background dipole magnetic field with a superposed perturbation ΔB:

Asymmetric Background Field
Here the azimuthal angle is chosen to be zero at noon and we denote the model as B EK to distinguish it from the Mead field.
In addition to the background field, ULF wave perturbations in the electric and magnetic field are chosen to be the sum of azimuthal Fourier components: The above perturbations have no particular polarization, with unspecified toroidal (δE r,m ) and poloidal (δE j,m ) electric fields components, and the relation between the magnetic and electric components are ignored.In order for these fields to conserve phase-space density, two constraints have to independently hold: the first one applies to the stationary background magnetic field given by Equation (17), and is respected for a perturbation ΔB(r) with an existing first derivative along the radial direction.The second one is Faraday's law for the time-varying electric and magnetic perturbations, Equations ( 18) and ( 19), which results in the following three constraints for the electric and magnetic field amplitudes: For the sake of simplicity, we assume that the magnetic field perturbations have no poloidal (B j = 0) or toroidal (B r = 0) component, and thus only require the constraint (Equation ( 23)) to be enforced.In terms of a Fourier decomposition in time (δB m,z ∼ e − i ω t ), Equation (23) can thus be written as This last equation constrains the choice of poloidal or toroidal electric fields.For a purely toroidal electric field (δE m,r ≠ 0, δE m,j = 0), the complex coefficients have the following constraint: δE m,r = − ωδB m,z /m.For a purely poloidal electric field (δE m,j ≠ 0, δE m,r = 0) that has no radial dependence the following equality must be held: δE m,j /r = ωδB m,z /m.We, therefore, conclude that the asymmetric model used to compute the radial diffusion coefficients in Fei et al. (2006) does not conserve phase-space density and that the diffusion coefficients derived on the basis of this field model yield unphysical results.The violation of Faraday's law in the model Fei et al. (2006) has already been noted by Lejosne (2019) and shown to enhance the diffusion coefficient by a factor of 2. By treating this problem kinetically, we have also shown that it violates Liouville's theorem.Equation ( 24) also provides a constraint on the electrostatic model (∇ × δE = 0) of Fälthammar (1965).For the case of a purely poloidal component, Faraday's equation requires , and thus, δE m,j ∼ 1/r.The assumption of a poloidal field independent of the radial distance used in Fälthammar (1965) therefore also violates Liouville's theorem and yields an unphysical radial transport coefficient.
We note that both the Fei et al. (2006) electromagnetic model and Fälthammar (1965) electrostatic models can nonetheless be corrected by accounting for Faraday's law.This correction can be done by enforcing Equation (24) when computing the diffusion coefficient with or without the asymmetry introduced by Elkington et al. (2003).On the basis of this section and the previous one, we choose to use the Mead model since it conserves phase-space density for equatorially trapped particles and already contains all the key ingredients to model radial transport in the Earth's radiation belts. 213. Linear, Quasi-linear, and Higher-order Limits of Radial Transport

Multiscale Dynamics and Separation between Slow and Fast Variables
In this section, we develop a mean-field theory from the drift kinetic equation, Equation (3), for charges confined in a magnetic dipole and interacting with ULF fluctuations given by the Mead field (Section 2.2.1).We will solely focus on particles confined in the equatorial plane (α = π/2) and leave the more involved case of particles bouncing off at mirror points at higher and lower latitudes to future studies.In order to build a mean-field theory we separate slow changes in the third adiabatic invariant L * and background quantities and fast changes in the associated invariant phase and fluctuation timescales parts of the distribution function 22 in which r is the radial distance at the equator, j is the azimuthal angle j ä [0, 2π], and the small parameter ε characterizes the scale separation between large-scale and small-scale parts of the distribution.We note that it is possible to build a background distribution function with azimuthal dependence.For instance, in the presence of an azimuthaldependent source or loss term that evolves slowly in time compared to the azimuthal drift period of the particles.Such an azimuthal dependence can then be accounted for in terms of ε a j, for a > 0, and resulting in ∂f 0 /∂j = ε a f 0 .But for simplicity, and comparison with the previous radial transport model, we will assume that the background distribution function has no dependence on the azimuthal angle, i.e., Formally, this equilibrium distribution can be defined as the average of the exact distribution function over the range of azimuthal angle and over timescales that are intermediate between the fast and the slow ones: +D 21 A reader might then wonder why not simply use the field in Fei et al. (2006) after enforcing the constraint given by Equation (24).The short answer is that the main benefit of using the asymmetric field results in a modification of the diffusion coefficient of the order of ΔB 2 /B 2 = 1.This modification is therefore negligible. 22A scale separation between fast and slow motion is the basis of quasi-linear theories in astrophysical plasmas (Kulsrud 2005;Diamond et al. 2010;Schekochihin 2017).This approach is identical to the one performed in Kennel & Engelmann (1966) for a quasi-linear theory of magnetized charged particles interacting with plasma waves of frequencies comparable to the Larmor frequency.The resulting diffusion models written in the form of Fokker-Planck equations would not be possible without such a scale separation and constrains the timescales upon which the quasi-linear theory can be used.
for ω −1 = Δt = t eq , where ~ ¶ ¶ , the timescale for an equilibrium in the distribution to form.This definition of f 0 constrains the time and spatial scales upon which the background distribution function can be computed.It is shown in Section 3.2 that particles with azimuthal drift frequencies Ω d , as defined by Equation (29), comparable to ULF wave frequency with azimuthal mode number m experience resonance.Thus, since resonance requires ω ; mΩ d , Equation (27) also constrains the evolution of the background distribution function f 0 on timescales much larger than 1/mΩ d .For the mode m = 1, the implication of the quasi-linear theory is that the diffusion cannot take place on timescales comparable to the azimuthal drift periods.
For equatorial particles with a conserved first adiabatic invariant μ interacting with a Mead field, the kinetic Equation (11) takes the form of We now define the drift frequency for equatorially trapped particles as q r 3 2 9 The right-hand side of Equation (31) describes the slow evolution of the background distribution due to the effect of fluctuations.As is often the case in space and astrophysical plasmas we need a closed equation for the evolution of the background.The correlation ñ 24 can be computed if we can write an equation for the perturbation δf m , replace it in Equation (31), and take the average defined by Equation (27).The details of this calculation can be found in Appendix B, and results in the following higher-order equation for the perturbation: .
The three terms that control the evolution of the perturbed distribution in Equation (32) represent free ballistic motion, or streaming, linear wave-particle interaction, and higher-order wave-particle interaction.The term , given by Equation (B2), , otherwise it has to be accounted for and will result in mode-mode coupling even if the ULF wave amplitudes are considered small, i.e., δB ; rA m = B 0 and S(t) = B 0 .In the next sections, we solve these equations in the linear and quasi-linear regimes and describe the conditions in which higher-order processes become significant.

Linear Theory and Radial Transport on Fast Timescales
In linear theory, we consider small perturbations of the equilibrium that evolve on fast timescales comparable to the drift period.All higher-order terms can then be ignored and the background distribution is assumed as constant in time, i.e., f 0 (t) = const.The linear equation is therefore given by The first term on the right-hand side of Equation (34) is a ballistic mode that we will see is responsible for the formation of transient structures in the phase space (r, j).The second term on the righthand side is the linear wave-particle response of the distribution function to the ULF wave.This problem is almost identical to the self-consistent electrostatic problem solved by Landau (1946), in which perturbations of the background distribution result in growing or decaying fluctuations.However, the radial transport problem in the linear regime, contained in Equation (34), is simpler than the one solved by Landau (1946), since the resonant energetic electrons with densities of the order of 0.1% or less are passive tracers and self-consistent effects can be to a very good degree of accuracy ignored. 25One therefore has the freedom to model the ULF fluctuations in a manner consistent with in situ observations, 23 The generalization of the Mead field in Section 2.2.1 has already been expressed in terms of Fourier modes for the antisymmetric perturbations. 24Since the magnetic field amplitude is real, we can write the Fourier coefficient as long as Faraday's law is respected.For a ULF fluctuation given as a single Fourier mode (A m (t) ∼ e − i ω t ) or some stochastic noise, one can solve the linear system analytically.To the best of our knowledge, the analytical solution of Equation (33), i.e., Equation (34), has not appeared in peer-reviewed studies of terrestrial radial transport before so we proceed hereafter with a detailed analysis.

Ballistic Solution and the Formation of Zebra Stripes
Inserting the ballistic solution in the perturbed distribution, i.e., the term f r e , 0 -W in Equation (34), in Equation (30), the total distribution in the linear limit is given by , 0 .35 We can consider the ballistic solution separately of the linear wave-particle response since the former is independent on the radial gradient of f 0 and the latter is not.The ballistic response is therefore the only possible observable response when radial gradients in the distribution function are very small.
If we consider a single Fourier mode m = 1, we note that an initial perturbation δf (r, t = 0, j) will develop fine structures in the (r, j) space as t → ∞.The formation of fine structures in space occurs because an initial perturbation δf m (r, 0) will experience a differential shearing along the radial position r.We can use the solution, Equation (35), to quantify the parametric dependence of the structures arising from ballistic motion in a magnetic dipolar field B = B E /L 3 .For an initial phase j 0 , the perturbed distribution, δf is constant along the curve in which we replaced the Lorentz factor γ by the kinetic energy E c = m e c 2 (γ − 1).For a fixed time t ≠ 0, Equation (36) indicates that energetic particles will have phase-space structures with a kinetic energy that is inversely proportional to the radial distance, i.e., E c ∼ 1/L.
The time evolution of the perturbed distribution function is shown in Figure 3 for time snapshots of 10 minutes, and 1, 2, and 8 hr.Energetic particles ranging between 50 and 400 keV experience a full azimuthal drift on the order of a few hours.Figure 3 shows that phase-space structures can form on timescales of the order of a single drift period, which is on timescales that are far too rapid to be accounted for by radial diffusive effects.After several drift periods, phase-space structures in (E c , L) become thinner even though their numbers grow.This behavior of the ballistic solution is consistent with the phenomenon of zebra stripes commonly observed in the inner part of the Earth's radiation belts (Imhof & Smith 1965;Datlowe et al. 1985) and in the magnetospheres of Saturn and Jupiter (Hao et al. 2020;Sun et al. 2021Sun et al. , 2022)).Zebra stripes are transient structured peaks and valleys observed on spectrograms of inner radiation belts' electrons with energies ranging between tens to hundreds of keV.The zebra stripes that are measured in situ are also characterized by energy peaks and dips that vary as the inverse of the radial distance, i.e., E c ∼ 1/L.They are also associated with substorms onsets and correlated with various geomagnetic indices, such as Kp and Dst, but are also observed during quiet geomagnetic conditions (Sauvaud et al. 2013;Lejosne & Roederer 2016;Lejosne & Mozer 2020a, 2020b).Mechanisms explaining the formation of zebra stripes must therefore reproduce the E c ∼ 1/L dependence and explain the processes responsible for their transient nature and appearance under a wide range of geomagnetic conditions.
Mechanisms suggested for the formation of zebra stripes can be categorized into two types.In the first type, particles sample an electric field that varies on timescales consistent with their drift motion (see, e.g., Ukhorskiy et al. 2014, and references therein for the most recent advances on the subject).Consequently, a collection of trapped particles can experience drift resonance with the field, and result in zebra stripe structures as resonant particles that are scattered to different drift shells.In the second type, illustrated by the studies of Liu et al. (2016) and Lejosne et al. (2022), zebra stripes also sample an electric field but are nonresonant.The formation of zebra stripes for this mechanism is akin to a phase-mixing process.Magnetically trapped particle's drifts are faster for more energetic particles.When fluxes are projected in energy and radial distance, the shearing of the distribution leads to an E c ∼ 1/L dependence.
However, our analysis of the ballistic motion also demonstrates that phase-space structures consistent with in situ observations of zebra stripes can form in the absence of both drift resonance and electric field perturbations.The formation occurs on timescales comparable to the drift period of energetic particles and is equivalent to the phase-mixing scenario presented by Ukhorskiy et al. (2014) in that it does not require drift resonance.However, the ballistic solution we derived assumes a perturbation of the distribution function δf m (t = 0, r) at some arbitrary time.This perturbation of the distribution function can either be due to particles being lost δf m (t = 0, r) < 0, e.g., to the boundaries, or particles being injected δf m (t = 0, r) > 0. While more quiescent than the outer belts, the inner belts experience injection events of energetic electrons even during moderate geomagnetic storms (Zhao & Li 2013). 26n order to inject electrons in the inner belts a radial transport mechanism, such as a convective electric field, is required.But once injected in the inner belts, the ballistic term shows that zebra stripes can form in the absence of any ULF perturbations.In Figure 4, we show the formation of the zebra stripes following localized loss of energetic electrons centered at L = 2 and spread with a standard deviation along radial distances of ΔL = 0.75.Localized injection and losses also result in stripes on timescales comparable to the drift period but shearing of the distribution function results in structures spreading across radial distances beyond the injection or loss location.
The transient nature of zebra stripes can also be evidenced when projecting the ballistic solution in the equatorial plane.Figure 5 shows the temporal evolution of 100 keV electrons' injection (at j ä [0, π]) and losses (at j ä [π, 2π]).The drift period of 100 keV electrons between L = 1 and L= 3 ranges between 2.6 and 8 hr.After a single drift period the distribution function preserves their initial shape and has yet to phase mix.In comparison, Figure 6 shows the temporal evolution of 400 keV electrons' injection (at j ä [0, π]) and losses (at j ä [π, 2π]).The drift period of 400 keV electrons between L = 1 and L = 3 ranges between 45 minutes and 2.3 hr.For more energetic particles, since the drift period is shorter, shearing of the initial distribution phase mixes the distribution on faster timescales.After 4 hr, the zebra stripes of 400 keV have very fine-scale structures in the equatorial plane.
Injection or losses of particles can therefore result in the formation of zebra stripes without the need for drift resonance or the presence of an electric field.The injection and losses are encoded in the ballistic solution but since shearing of the distribution function occurs on timescales of a few drift periods, the most energetic electrons develop quickly fine-scale structures in the distribution function that might not be resolved by spacecraft instruments.Nonetheless, the ballistic solution does not preclude the possibility of the formation of zebra stripes as a response to a ULF electric field for resonant or nonresonant particles.In the next section, we compute the linear solution to include the impact of ULF waves on the distribution function and differentiate between resonant and nonresonant responses.

Solution to the Linear Wave-Particle Interaction
In the previous section, we described the time evolution of the ballistic term in the distribution function and argued that it should dominate the particle's response when radial gradients in the distribution function are small.However, in the absence of phase-space injection and/or loss terms, and thus in instances where the ballistic term is zero, i.e., δf (t = 0, r) = 0, and ∂f 0 /∂r ≠ 0, the linear wave-particle response should dominate.
In this section, we describe the linear wave-particle solution found in Equation (34).For the sake of simplicity, we assume a single Fourier mode for the ULF wave: with initial amplitude a m , frequency ω, and growth/damping rate γ m .We can generalize this solution to a spectrum of Fourier modes, but since the solution is linear, each are independent of one another.The linear solution is valid in the limit where the growth rate is sufficiently small, for the fluctuations to remain sufficiently small in amplitude and higher-order effects negligible (Davidson 2012). 27We insert Equation (37) into Equation (34) 27 See Section 3.4 in which we quantify the conditions for the linear regime to break down.It will not come as a surprise to readers' familiar with solar wind turbulent problems that nonlinear effects can become dynamically significant for small-amplitude electromagnetic fluctuations.In the magnetohydrodynamic limit, this condition is associated with a state of critical balance (Goldreich & Sridhar 1995) at fluid scales, but has also been generalized to kinetic problems in space plasmas (Schekochihin et al. 2016;Meyrand et al. 2019).For the problem of radial transport the nonlinear regime is reached even in the limit where ULF wave amplitudes and the perturbed distribution function are small, i.e., δB/B 0 = 1, and δf/f 0 = 1, respectively.
to find the following linear wave-particle response f r t , m L ( ) d : Equation (38) contains a resonant part indicating that particles with a drift frequency Ω d can be scattered across drift shell efficiently with ULF waves of frequencies ω m .We can decompose Equation (38) in terms of a linear wave-particle resonant part that can grow in time, and two oscillating parts as follows: The first term on the right-hand side of Equation ( 39) is a nonresonant term oscillating mode that can grow or damp with the ULF wave at a rate γ m and modulates the distribution function at a frequency ω m .The second term on the right-hand side of Equation (39) is a nonresonant ballistic term that indicates that a ULF fluctuation of arbitrary frequency ω m can sustain fluctuation in the distribution function at frequencies Ω d without drift resonances involved.In the past 60 yr, time series of particle fluxes observed to have temporal frequencies comparable to the drift period have been termed drift echoes (Lanzerotti et al. 1967).For instance, Figure 6 of Kokubun et al. (1977) shows the simultaneous association of the transverse ULF wave mode with oscillations in energetic ion fluxes and energetic electron fluxes of 79, 158, and 266 keV.The low energy fluxes are modulated by the ULF wave, and the phases of modulations are energy dependent.The oscillations reported by many authors (Kokubun et al. 1977;Zong et al. 2007Zong et al. , 2009) ) occur on timescales comparable to the drift periods of energetic populations and are therefore produced too quickly to be sustained by a quasi-linear radial diffusive process.This second term, responsible for drift echoes, is another source responsible for the formation of zebra stripes (Section 3.2.1)for nonresonant particles and corresponds to the mechanism explained in Lejosne et al. (2022).The difference between this second term and the zebra stripe source derived in Section 3.2.1 is that the latter requires the phase-space loss (δf (t = 0) < 0) or injection of particles (δf (t = 0) > 0) and no electric fields, 28 whereas the former requires the perturbation of the distribution function from a ULF fluctuation with amplitude a m and a gradient in the distribution function, i.e., ∂f 0 /∂r ≠ 0. The third term on the right-hand side of Equation (39) represents the linear wave-particle resonance between ULF fluctuations of frequencies ω m and particles with drift frequency Ω d .It can be shown that this last term can grow in time for the limit mΩ d ∼ ω m and for what we here call intermediate times: The intermediate time range defined by Equation (40) means that the ULF wave has had time to oscillate, but the perturbation has not yet been damped away significantly, or grown appreciably, for linear effects to break down (Schekochihin 2017).In this limit, the resonant term in Equation (39) dominates over the other two, and the perturbation in the linear response of the distribution function takes the following form: Equation ( 41) is valid at intermediate times given by Equation (40) and assumes that mΩ d −ω m ?γ m .In the limit where mΩ d −ω m = 1/t, the exponential term in Equation (41) can be expanded as a Taylor series, and the dominant term for the perturbed distribution function gives and thus demonstrates that fluctuations grow linearly in time due to resonant interactions.Equation (41) is an instance of a Case-van Kampen mode, initially derived for a Vlasov-Poisson plasma (Van Kampen 1955;Case 1959), but rederived here in the context of radial transport.In the limit where t → ∞ but γ m t = 1 is necessary to respect Equation (40), the right-hand side of Equation (41) tends to a delta function. 29 28 It should be noted that in the terrestrial radiation belts injection and losses are separated between adiabatic and nonadiabatic ones.Reversible losses are associated with adiabatic perturbations, whereas irreversible losses are associated with nonadiabatic effects, for instance, the scattering of particles inside the atmosphere (Millan & Thorne 2007).The ballistic amplitude δf (t = 0) can account for both reversible and irreversible losses. 29These non-eigenmodes are not only of theoretical interest.Non-eigenmodes have to be tracked in order to quantify entropy production in kinetic systems.See, for instance, Section 5.6 of Schekochihin (2017) for an introduction in terms of a Vlasov-Poisson system and Zhdankin (2022) for a framework that can be applied to kinetic problems in planetary radiation belts.
The resonant linear response presented in this section occurs on timescales comparable to or larger than the drift period but smaller than 1/γ m , while phase mixing and zebra stripes are taking place on timescales comparable to drift periods.For finite damping ULF rate γ m < 0, the resonant part decays e 0 t g  on timescales of |γ m |t ? 1, and the ballistic response proportional to e i t d -W in Equation (39) dominates.This criterion can be used to distinguish nonresonant to resonant drift particle interactions from spacecraft data since both require a radial gradient in the MLT averaged distribution function f 0 .The requirement for a nonzero radial gradient in f 0 of a given energetic population is an experimental constraint on the observation of phase-space structures, as reported by Hartinger et al. (2020) and Sarris et al. (2021), and is discussed further in Section 4.2.
An additional criterion to distinguish resonant from nonresonant particle's response can also be achieved observationally for instruments recording energy-dependent fluxes.Drift resonance is energy dependent, and the signature of resonance for resonant energies should different than for nonresonant particles, even though Equation (41) shows that both experience oscillations with frequencies comparable to the ULF wave frequency ω m .Figure 7 shows the perturbed distribution function of 1.1 meV electrons at L = 8 in comparison to the particle's response for energies at 700 and 900 keV.Thus, a shift in energy can take particles out  of resonances and result in perturbed distribution functions that are more than 5 times smaller in amplitude.
Drift resonance is therefore an efficient mechanism for ULF waves to exchange energy with energetic electrons.In Figure 8, we plotted the drift period as a function of kinetic energy and parametrized it in terms of the radial distance L. The top panel is made for 45°pitch angles and the bottom panel for 90°pitch angles.The shaded and dashed rectangles bound the resonant frequency ω m /m for Pc5 ULFs with azimuthal mode numbers m = 1, 2, and 3. From Figure 8, we note that energetic electrons with kinetic energies larger than 200 keV and up to a few MeV have access to drift orbit resonance across broad drift shells.Figure 9 is the same as Figure 9 but the bounded rectangles are drawn for Pc4 waves with azimuthal wavenumbers m = 4, 7, and 10.In the case of Pc4 waves, they can sustain drift resonance for energetic electrons with kinetic energy less than 400 keV, but require larger azimuthal wavenumbers (Barani et al. 2019).Even though drift resonance is strongly energy dependent, Figures 8  and 9 show that they can be accessible to a broad range of energy and pitch angles across the radiation belts.
We therefore conclude this section by pointing out that the linear perturbation of the distribution function due to ULF electromagnetic fluctuations, particle injections (δf (r, t = 0) > 0) or losses (δf (r, t = 0) < 0), all result in phase-space drift structures on non-diffusive timescales comparable to the drift periods.Some of the phase-space structures for the lower energetic electrons (E c < m e c 2 ), assuming particle injection or gradient in the background distribution, can appear as zebra stripes in the inner radiation belts.Even though Equation (39) shows that the resonant part of δf also experiences phase mixing, drift echoes and zebra stripes nonetheless form for nonresonant drift frequencies at mΩ d ≠ ω m , and thus, stringent resonant conditions of mΩ d ; ω m do not constitute sine qua non constraints for the formation of drift echoes and zebra stripes.

Quasi-linear Theory of Radial Diffusion
In the previous section, we described the fast linear response of the perturbed distribution function to an electromagnetic ULF wave.We assumed that the background distribution f 0 was time independent, which is equivalent to saying that it did not experience significant variations on fast timescales.In this section, we compute the evolution of the background distribution function according to quasi-linear assumptions (Kennel & Engelmann 1966;Diamond et al. 2010;Schekochihin 2017;Allanson et al. 2022).In quasi-linear theories, one assumes that perturbations start modifying the equilibrium before they reach large amplitudes.In other words, the higher-order term  in Equation (32) can be ignored when the characteristic time for higher-order effects is longer than the time for the equilibrium to be reached.We also neglect the linear term in the quasi-linear limit, as shown in Appendix C. Thus, for our purpose, we assume that the evolution of the perturbation is determined by Equation (33).Similarly to the previous section, this linear equation can be solved by Duhamel's principle, for the initial condition δf m (r, t = 0) = 0: The linear solution given by Equation (43) can then be combined with the following quasi-linear equation to described the time evolution of f 0 : We note that the first two terms on the right-hand side of Equation (44) will result in a diffusion term, and the last two expressions in advection terms.Replacing the linear solution of δf m into Equation (44) to compute the correlation terms , results in the following two integrals: To compute the autocorrelations analytically we need to make some assumptions about the nature of the ULF amplitude A m (t).To account for finite and zero correlation times we choose to model the fluctuations as different realizations of an Ornstein-Uhlenbeck process (Papoulis 1991) given by the following time evolution equation: 30 Using Equation (48), we can compute the following quantities for a finite correlation time τ c ≠ 0: The above correlators are only a function of the time difference t t -¢, and not the particular times t and t¢, indicating that the Ornstein-Uhlenbeck process is stationary, or time homogeneous.
Returning to the integrals, Equations ( 45) and ( 46), it should be stressed that the gradient in the background distribution functions in the integrals is a function of time, i.e., f 0 = f 0 (r, t).The last step before solving the integral is to assume that a short decorrelation time τ c exists, such that the correlators with f (r, t) on the basis that C t t i ( ) t = -¢ changes appreciably before any significant variation in the background distribution (Vanden Eijnden 1997).This quasi-linear assumption indicates that the ULF wave amplitude cannot alter the background distribution function on timescales comparable to the ULF wave and drift period.The diffusion coefficient that follows in the next lines can therefore not lead to changes in timescales comparable to the azimuthal drift period and justifies the ensemble average defined by Equation (27).
For the sake of simplicity, we now assume zero correlation time, 31 which means e t t t t c c ( ) Using the above expressions, we compute the following correlators: 30 Energetic electrons in the Earth's radiation belts are passive tracers and the self-consistent response onto the field can therefore be ignored.This freedom allows one to model the ULF wave amplitudes in a manner empirically consistent with in situ measurements.
31 By keeping τ c finite but small (Ω d τ c = 1), the diffusion coefficient in the quasi-linear limit is rescaled by a factor , thereby introducing an energy dependence to the radial transport, as shown in Osmane & Lejosne (2021). .Moreover, since this diffusion coefficient has been derived for an electromagnetic field model that respects Faraday's law it can be expressed in terms of the wave power in the magnetic field alone and does not require the separation in terms of an electric D LL E and magnetic D LL B diffusion coefficients commonly used in radial transport studies (Ozeke et al. 2014;Sandhu et al. 2021).
The diffusion coefficient is dependent on the first adiabatic invariant μ contained in the azimuthal drift frequency Ω d .We note that for a large m ? 1 azimuthal wavenumber, the diffusion coefficient is energy dependent and has a radial distance dependence that goes as L 6 , even though the shortcorrelation time assumption would constrain Ω d τ c = 1.For m ; 1 and Ω d τ c = 1, the diffusion coefficient is independent of energy and has an L 10 scaling of  Barani et al. 2019).In such an instance, the model predicts an energy-dependent D LL that scales as L 6 .On the other hand, a narrow ULF wave spectrum along the azimuthal wavenumber m = 1 should result in a diffusion that is independent of energy and with a radial scaling dependence more sensitive to the radial distance.In other words, the parametric dependence of D LL is a function of how broad the ULF wave spectrum is in m.If the magnetospheric plasma is dominated by an m = 1 mode, with several orders of magnitude less power in m > 1 modes, a quasilinear modeling of the diffusion coefficient with an L 10 dependence should be chosen.If the choice of a quasi-linear model with an L 6 dependence and an energy dependence in D LL provides better accuracy, it would nonetheless be inconsistent with the above radial diffusion coefficients derived for a Mead field.

Beyond a Quasi-linear Theory of Radial Transport: Higher-order Regime
In the preceding sections, we described the linear response of the perturbed distribution function δf and wrote a Fokker-Planck equation for the quasi-linear evolution of the ensembleaveraged distribution function f 0 (L, t).Even in the quasi-linear limit, the perturbed distribution function is assumed to be linear, while the evolution of the background distribution is nonlinear in the sense that it depends on the correlator 〈δB m δf m 〉.However, the perturbed response given by Equation (32) contains a nonlinear term and this section aims to determine when linear assumptions of radial transport break down and higher-order processes become dynamically important.
We distinguish two types of higher-order regimes.In the first type, nonlinear phase-space structures associated with ULF waves are produced but isolated in the sense that they cannot interact with one another.Such structures have been covered in the case of ULF radial transport by Li et al. (2018) and Wang et al. (2018), and their observational signatures consist of the appearance of fluxes trapped in the potential well of electric fields.This regime of isolated trapped structures is equivalent to the formation of Bernstein-Green-Kruskal (BGK) mode for a Vlasov-Poisson system (Bernstein et al. 1957) and requires a sufficiently large-amplitude fluctuation to confine particles in their respective phase space.
In the second type, higher-order processes arise because multiple ULF modes are present and resulting fluctuations in the distribution function interact with one another.This second type of higher-order effect, unlike the first one, can be facilitated by the presence of large-amplitude fluctuations but does not require them.This regime is equivalent to the one presented by Dupree (1972) for a Vlasov-Poisson system and is associated with the formation of phase-space granulations.These phase-space granulations can consist of linear fluctuations arising due to ballistic trajectories, such as drift echoes, or trapped fluctuations equivalent to BGK modes.Theoretical and observational studies have indicated that such a regime of non-isolated structures might be common in weakly collisional plasmas (Schekochihin et al. 2008(Schekochihin et al. , 2016;;Servidio et al. 2017;Kunz et al. 2018;Meyrand et al. 2019), prevent Landau damping from dissipating fluctuations (Wu et al. 2019), and can result in a phase-space turbulent cascade akin to what is observed in fluid and MHD turbulent systems (Goldreich & Sridhar 1995).
While we acknowledge that ULF wave amplitude in the Earth's radiation belts can be sufficiently large to sustain trapped structures derived by Li et al. (2018) and Wang et al. (2018), the trapping along magnetic local time does not result in irreversible energy gain by the trapped populations.We focus hereafter on the second higher-order regime, which relies on the presence of more than one ULF Pc4 and Pc5 mode.We show hereafter that the second higher-order regime can result in the transport of particles along magnetic drift shells, and thus irreversible energizing of populations that would otherwise be unable to experience drift resonance.We also demonstrate that the inclusion of higher-order effects associated with the symmetric ULF fluctuation, which in the linear and quasi-linear regime had no impact, can suddenly become drivers of acceleration and losses.

Criteria to Determine when Higher-order Radial Transport
Becomes Significant The higher-order terms contained in  is given by Equation (B2) and can be understood as coupling terms, in which a mode with azimuthal wavenumber p m m = -¢ couples with a mode q m = ¢ to pump or sink energy from a mode number m.For instance, a collection of particles interacting with azimuthal wavenumbers m = 3 and encoded in δf m=3 and azimuthal wavenumber m 1 ¢ = encoded in f m 1 d ¢= can couple to another through a ULF mode with p = 2 with A p=2 .This higher-order wave-particle coupling can lead to the acceleration of nonresonant energetic particles with slow azimuthal drift periods compared to Pc4 and Pc5 ULF frequencies, i.e., mΩ d = ω.
However, satisfying the condition p + q = m is not enough to make higher-order effects relevant dynamically for radial transport.The higher-order coupling terms become significant when they become comparable to the linear transit term of a particle experiencing an azimuthal drift, which is given by the second expression on the left-hand side of Equation (39).For instance, if we account for the higher-order term associated with the symmetric ULF amplitude, S(t), with mode number p = 0, with the particle response to a mode q = m, , we find the following two criteria: in which the frequency ω m is associated with time variations of the perturbed distribution function δf m and the frequency ω with the symmetric ULF wave amplitude S(t).We note that the linear ballistic response of the perturbed distribution function given by Equation (39) resulted in time variations with frequencies of ω m = mΩ d ; thus, higher-order effects can be felt whenever the symmetric ULF amplitude becomes comparable to the local magnetic field.However, criteria I 1 can also be satisfied in the limit where the symmetric ULF fluctuations are small in amplitude, i.e., S(t)/B 0 = 1, if ω m ?mΩ d .For criteria I 2 , higher-order effects become significant for large gradients in the perturbed distribution ) even in the limit of S = B 0 .
We can account for higher-order processes associated with the antisymmetric perturbation by defining two additional criteria I 3 and I 4 in terms of the higherorder terms We note once more that since the linear response of the perturbed part f m d ¢ can result in time variations with frequencies m d  w ¢W , higher-order effects can be sensed whenever can nonetheless result in dynamically relevant higher-order effects.

I
L m S B L f symmetric higher order term 2 linear transit time 2 log 1, 59 3.4.2.Impact of Symmetric Perturbations on Fast Timescales In the previous section, we defined four criteria to argue that higher-order effects can become significant even for smallamplitude ULF fluctuations.In this section, we focus on processes arising from the symmetric ULF perturbation S(t).The higher-order equation, Equation (32), for the perturbed distribution function δf m can be solved analytically on fast timescales comparable to the drift period of particles.The linear solution to Equation (32) is independent of the symmetric perturbation S(t).However the higher-order term  contains a coupling term between the symmetric perturbation and δf m .This higher-order response of the particles with a mode m is due to the coupling between the m = 0 ULF mode contained in the symmetric perturbation and itself.If we assume that the higher-order coupling due to S(t) is greater than the one due to the antisymmetric ULF waves S ?rA m , Equation (32) becomes In order to isolate the impact of the symmetric perturbation arising due to higher-order coupling we split the perturbed distribution in terms of a linear part f m L d given by Equation (39) and a higher-order part f m NL d that can be extracted from the following equation: In Equation (63), we assume that the higher-order perturbation remains smaller than the linear response, , and thus we can solve the higher-order equation perturbatively to drop the coupling terms proportional to and can now be solved if we prescribe a solution for the linear response f m L d .For the sake of simplicity, and in order to highlight that ULF radial transport can have an impact on nonresonant particles on fast timescales comparable to or less than the drift period, we assume that the linear perturbation f m L d is given by an injection or a loss of 100 keV electrons consistent with the linear solution, and set A m = 0. 32 Particles with 100 keV confined in the equatorial plane at normalized radial distances L 8 have azimuthal drift periods of the order of 90-120 minutes.Thus, frequencies of the order of ω ; 1 mHz would require azimuthal wavenumbers of m 10 ( Barani et al. 2019).
We assume an injection of 50 keV given by a Gaussian centered at a radial distance L c and with a radial spread ΔL The symmetric perturbation is modeled as a compression of the magnetic field with a decay time c S t , S t b e .6 5 The perturbed solution for the distribution function following the Gaussian shaped injection and decaying symmetric ULF mode is given by The higher-order response given by Equation (66) for the m = 1 mode is shown in Figure 10 for a symmetric ULF wave amplitude of δb = 0.12B 0 .The top-left panel corresponds to the linear response.After the injection of the particles at L c = 5, the distribution function oscillates in time and gets sheared along L. However, when we introduce a symmetric perturbation with a decay time that is smaller than the drift period (with W has no impact on the distribution function, as shown in the bottom-right panel of Figure 10. The physical process responsible for this mechanism is illustrated in Figure 11.A symmetric ULF compression with amplitude S(t) results in an E × B differential gradient that is larger in amplitude at higher than lower drift shells.Drift shells with negative (positive) gradients result in particles being driven inward (outward).If the ULF compression is an adiabatic particle phase mix along L, but the compression is nonadiabatic and the E × B drift decays or grows too fast (compared to the azimuthal drift period) for phase-mixing to occur, the net radial drift is inward.This net motion of particles inward is shown in Figures 12, 13, and 14 for a ULF symmetric amplitude corresponding to 25% and 62% of the background field at L = 5.The inward-moving particles increase in energy in order to conserve the first adiabatic invariant whereas the outward-moving particles lose energy.This process can result in the fast and irreversible acceleration of particles as well as losses associated with shadowing even though there is no drift resonance with the ULF modes.These results demonstrate that the inclusion of higher-order effects can lead to non-diffusive The linear response is taken as the ballistic one f f t r e 0, The inclusion of the linear wave-particle response for A m ≠ 0 leads to the same physical process and is left for future more detailed studies of higher-order radial transport.and irreversible radial transport on fast timescales.Such a process cannot be modeled with quasi-linear radial diffusion.

When Can We Use Quasi-linear Radial Diffusion?
A drift kinetic description of ULF wave interaction with energetic particles is a convenient methodology to define the regime of validity of quasi-linear radial diffusion problems.In comparison, the derivation in terms of the particle's trajectories (Fälthammar 1965;Elkington et al. 1999;Lejosne 2019) is mathematically more transparent than the one provided in Section 3.3 but since it does not require computation of the perturbed orbits, it does not distinguish explicitly between the fast perturbed part and the slow background part of the distribution function.
The procedure to derive the radial diffusion coefficient is identical to the one pursued for other quasi-linear theories in laboratory and astrophysical plasmas (Kennel & Engelmann 1966;Diamond et al. 2010;Schekochihin 2017).Quasi-linear theories require temporal and spatial scale separation of the distribution function in terms of a slow ensemble-averaged background component and a fast perturbed component.The fast component can evolve on timescales comparable to the periods of electromagnetic fluctuations responsible for the wave-particle interactions.For instance, for seed electrons of 10-100 keV interacting with high-frequency whistler waves, the quasi-linear theory of Kennel & Engelmann (1966) is explicitly clear that the perturbed component evolves on timescales of the order of the whistler period, and thus the Larmor period as well, since ω ; Ω s .The diffusive evolution of the distribution function requires a large number of interactions with whistler waves and is therefore computed on timescales that have been averaged over a large number of whistler wave periods.The perturbed part is computed linearly, and thus quasi-linear theory assumes that higher-order effects such as trapping and mode-mode coupling associated with large amplitudes can be neglected.
For a quasi-linear theory of radial transport to be consistent, one needs to preserve the scale separation defined by Equations ( 25) and (27).The background distribution function is not only independent of magnetic local time, and thus j, it cannot change significantly on timescales comparable to the drift period Ω d .A radial diffusion coefficient that becomes comparable to the drift period (D LL ; Ω d ) indicates that a collection of particles can be carried across one drift shell ) during a single drift period.This argument stems from the fact that dimensionally the radial diffusion coefficient scales as D LL ; 〈ΔL 2 〉/t, and that the inverse of the diffusion coefficients gives a characteristic time for transport across one drift shell.Taking into account that the derivation of the quasi-linear diffusion coefficients requires a short decorrelation time of the ULF wave amplitude, and the observational fact that ULF waves are long lived and coherent (Hartinger    et al. 2013), it is inconceivable that a diffusive scattering along drift shells can occur over a single drift period. 33he determination of accurate radial diffusion coefficients is not merely of academic interest and has important consequences on space weather models and on studies of radiation belts focused on distinguishing between local and global acceleration processes (Green & Kivelson 2004).Current global magnetospheric models accounting for radial diffusion rely on D LL coefficients that become comparable, and for large geomagnetic activity larger than the drift periods of energetic electrons trapped in the Earth's radiation belts (Brautigam & Albert 2000;Ozeke et al. 2014).For instance, the radial diffusion coefficient of Ozeke et al. (2014) can become as large as 10 2 in units of day −1 for Kp > 5, which corresponds to the drift period of electrons with energies of 1 MeV.Additionally, derived radial diffusion coefficients assume that the ULF wave correlation 〈δB(t)δB(t + τ)〉 is time and space homogeneous along the particle's orbits.However, ULF waves are sustained by a wide range of processes that are not colocated, ranging from Kelvin-Helmholtz instabilities (Mills & Wright 1999), pressure pulses in the solar wind (Takahashi & Ukhorskiy 2007), foreshock transients (Hartinger et al. 2013), magnetospheric substorms (Volwerk 2016), and unstable plasma distributions (Southwood et al. 1969).As a consequence, ULF waves are not homogeneously distributed in the magnetosphere (Murphy et al. 2020), and unless the ULF waves decay very fast compared to the drift period, quasi-linear radial diffusion coefficients accounting for nonhomogeneous statistics have to be derived. 34he abovementioned limitations of quasi-linear radial diffusion do not imply that ULF waves cannot sustain transport on timescales comparable to the drift period.Rather, what is argued is that current quasi-linear radial diffusion models have clear limitations, and should not be used beyond their range of validity.Radial transport coefficients encoding the impact of ULF waves on fast timescales require models that are not quasi-linear.A drift kinetic approach to radial transport is also not confined to theoretical or modeling studies.With GPS flux measurements calibrated by Van Allen Probes' instruments, it is now possible to quantify observationally radial transport on timescales of the order of a single drift period for electrons with energies less than 1 MeV (Morley et al. 2016(Morley et al. , 2017;;Kalliokoski et al. 2023) and quantify the wide range of radiation belts responses to solar wind drivers (Mann et al. 2002(Mann et al. , 2004;;Rae et al. 2005;Hudson et al. 2008;Bentley et al. 2018;Simms et al. 2018;Osmane et al. 2022).

Distinguishing between Drift Resonant and Nonresonant Interactions
The scale separation described in Section 3.1 forms the basis to derive a quasi-linear theory of radial transport, but is also appropriate to quantify the linear and higher-order response of the distribution function that occurs on fast timescales comparable to a few drift periods.Section 3.2 described three different types of linear responses associated with a ULF wave of frequency ω m , growth or damping rate γ m , and azimuthal wavenumber m.Three of these responses are nonresonant and one corresponds to drift resonance of particles drifting the Earth's magnetic field with frequency Ω d ; ω m /m.The first type of nonresonant response is a modulation of the distribution function with the frequency of the ULF wave ω m , and the second type of nonresonant response is an oscillation of the distribution function at the drift frequency Ω d .While both resonant and nonresonant responses to a ULF wave are a function of the local gradient in the background distribution function, the resonant particles are energy dependent and the perturbed distribution is amplified by up to one order of magnitude and is therefore distinguishable from nonresonant responses.
Models of ULF drift resonance predict that satellites should observe the largest modulations in particle flux at energies corresponding to the resonant energy, with smaller modulation at lower/higher energy (Southwood & Kivelson 1981).Equation (41) confirms this signature for resonance but also demonstrates that nonresonant, as well as resonant particles, can oscillate at the ULF wave frequency.In situ observation of distribution functions or fluxes oscillating at a ULF ω m should therefore not be assumed as a signature of drift resonance unless the response is localized in energy spectrograms.For drift resonance, the timescales associated with the resonant interaction and the width are a function of the growth rate, and we here stress that seeing comparable modulation across multiple energy levels for a monochromatic ULF wave spectrum is an indication that the interaction is nonresonant.
In the study of Claudepierre et al. (2013), fluxes of energetic electrons between 20 and 500 keV are identified as unambiguous signatures of localized drift resonant interaction with a ULF wave.However, no analysis is provided to quantify the radial gradient of the distribution function for each respective energy flux.As shown in this paper, the modulation of particle fluxes in terms of the ULF wave does not require drift resonance and can be observed for nonresonant particles as well.The difference in amplitude between fluxes can be explained in terms of radial gradient differences between energetic fluxes.The localized modulation in time can be explained by a ULF wave that is being damped at a rate γ m , and the spatially localized modulation seen on one Van Allen Probe but missed by the second probe can be an indication that the radial gradient of the distribution function is highly spatially localized.Large and localized radial gradients of the distribution function have been reported for case studies.For instance, Hartinger et al. (2020) point out that at L = 4.5 and L = 6.6, the reported radial PSD gradients are 30-300 times larger at values corresponding to energies of 200 keV compared to 1 MeV.Consequently, residual flux oscillations in this particular case would be 30-300 larger for electrons with energies of 200 keV rather than 1 MeV.Thus, characterizing flux oscillations without accounting for radial gradients, known to vary by several orders of magnitude, can lead to erroneous interpretation of wave-particle processes.
As shown in Figures 8 and 9, ULF waves in the Pc4 and Pc5 range can be resonant with electrons of energies ranging between 100 keV and a few MeV, yet signatures of drift resonances for the most energetic MeV populations are rare.Hartinger et al. (2020) address this inconsistency between observations and theoretical assumptions.On the basis of the theoretical study of Southwood & Kivelson (1981), in order for drift resonances to be observed, one requires finite radial gradients in the background distribution function.Drift resonant interactions could still occur but would be masked by small radial gradients in the background distribution function.While we are in agreement with the conclusions of Hartinger et al. (2020), that drift resonance requires observable gradients to be detected, our analysis of the resonant response provides one additional constraint.Drift resonant signatures result in an amplification of the particle's response, as shown in Equation (41), which is localized in time and can be comparable to a single drift period.Moreover, if the ULF wave damps quickly, that is, on timescales comparable to a few drift periods, the resonant exchange could be too weak to be observed or distinguishable from the nonresonant one.Keeping in mind the conclusions of Hartinger et al. (2020) regarding the importance of radial gradients, our analysis provides an additional explanation as to why observations of drift resonant signatures have been rare when detected by a few spacecraft.Drift resonance is a transient process and detection by one spacecraft can be entirely missed by another spacecraft sampling the same orbit but on timescales larger than a few drift periods.

Mechanisms of the Formation of Zebra Stripes
Even though phase-space structures in the radiation belts are not necessarily indicative of violation of the third adiabatic invariant, and thus acceleration, their observed signatures can be used to test the validity of radial transport models or be used as diagnostic for electric fields or particle injections.In Section 3.2, we showed that phase mixing of trapped electrons can result in the formation of structures known as zebra stripes.Zebra stripes are transient-structured peaks and valleys observed on spectrograms of inner radiation belts' electrons with energies ranging between tens to hundreds of keV.The zebra stripes that are measured in situ are also characterized by energy peaks and dips that vary as the inverse of the radial distance, i.e., E c ∼ 1/L (Sauvaud et al. 2013;Lejosne & Roederer 2016;Lejosne & Mozer 2020a, 2020b).Since the zebra stripes can be produced on timescales of the order of a few drift periods, a radial diffusion mechanism should be immediately rejected.Our analysis also shows that zebra stripes can form without drift resonance with ULF waves, and as a result of a phase-mixing process described for nonresonant particles.The phase-mixing process described in this paper is triggered by particle injection or losses from the radiation belts, and the requirement for an electric field that sustains drift resonance, as shown in Ukhorskiy et al. (2014), is therefore unnecessary.The requirement for drift resonant interactions to produce zebra stripes is also more constraining than nonresonant phase-mixing mechanisms, since resonance requires ULF fluctuations with a narrow set of parameters and finite radial gradient in f 0 . 35  How can we distinguish between the mechanisms of the formation of zebra stripes?We note that the first phasemixing mechanisms, described in Section 3.2.1,require injection or losses of particles but no electric field.The second type, appearing as the ballistic term in Equation (39), requires ULF fluctuations and a finite radial gradient in the distribution function.The third type, described by Ukhorskiy et al. (2014), but appearing as the drift resonant term in Equation (39), requires ULF fluctuations that can resonate within a wide range of energies, and also a finite radial gradient in the distribution function.For all three types, the formation and shearing occur on the same timescales.In order to distinguish both phase-mixing mechanisms, one needs to measure radial gradients in the phase-space density and determine if the amplitude of the ULF fluctuations can provide the amplitude of the stripes' structures observed.If such a test proves inconclusive, the phase-mixing process requiring injection of particles, such as the one observed in the measurements of Zhao & Li (2013), might be favored.If future observational studies demonstrate that injection or loss of particles in the inner belts correlates with phase-mixed structures, one could use zebra stripes as proxies for injection and losses.Similarly, if the phase-mixing process is primarily driven by ULF fluctuations, the appearance of zebra stripes could be used as proxies to extract properties of electric fields in the inner belts.

Higher-order Parker Mechanism
The first radial transport model resulting in irreversible acceleration of particles was presented by Parker (1960) and did not require drift resonant interactions.In the Parker (1960) scenario, magnetically confined particles experience nonadiabatic transport as a result of asymmetric magnetic field perturbations.Since particles at different MLT sectors of the same drift shells sensed a different perturbation, they collectively experienced a net radial transport.The mechanism presented in Section 3.4.2 is a higher-order generalization of the Parker (1960) mechanism in that it does not require drift resonance with ULF waves.This higher-order mechanism is also the product of nonadiabatic perturbations but does not require asymmetric magnetic fluctuations.Rather the only two ingredients required for this higher-order process to result in irreversible radial transport is 1.Large amplitude symmetric perturbation δB/B 0 ; 10% decaying or growing nonadiabatically, and 2. Opposite radial gradients in the distribution function, or put differently, a localized minimum or maximum of the distribution function along the radial distance.
While particles on the same drift shells sense the same electromagnetic field and radial drift speeds, particles on different drift shells drift at different speeds, and the combined inward and outward transport in the presence of opposite gradients results in irreversible acceleration as more particles are pushed inward than outward.If the waves decorrelate very slowly (adiabatically) compared to the drift period, particles will phase mix radially and instead of a net injection inward, a plateau along the radial distance will form.Are symmetric ULF fluctuations observed in the Earth's radiation belts?In a recent observational study, Takahashi et al. (2022) provide the first description of symmetric compressional 35 On the basis of Occam's razor argument (Popper 2005) we would favor a phase-mixing mechanism of nonresonant particles to explain the formation of zebra stripes.For instance, the observation of zebra stripes in both electron and proton fluxes in Saturn's magnetosphere (Sun et al. 2022) is more easily explained through nonresonant processes.A drift resonant explanation for the stripes across ions and electrons of varying drift frequencies necessitates the additional presence of ULF waves broad enough in frequencies to resonate with each population.The nonresonant explanation, on the other hand, simply requires a single ULF mode that can perturb both ions and electrons adiabatically.
ULF fluctuations with magnetic field amplitudes comparable to the background magnetic field.The symmetric ULF waves are excited outside of the plasma sphere, and localized in MLT and radial distance.The large amplitude (δB/B 0 0.1) and compressional nature of the fluctuation described by Takahashi et al. (2022) are consistent with the one used for the acceleration process presented in Section 3.4.Moreover, the waves are observed in association with the injection of particles, and thus symmetric fluctuations are associated with local radial enhancements of particles.Even though it is too speculative at this point to determine whether this mechanism is commonly occurring in the radiation belts, we want to stress that the two required ingredients for the occurrence of this higher-order mechanism have been observed in the radiation belts.Unlike radial diffusion, which operates on long timescales and requires a large number of drift resonant interactions, fast and higher-order acceleration mechanisms can be both seldom and more efficient.

Conclusion
In this paper, we have presented a drift kinetic description of ULF radial transport for the Earth's radiation belts.The use of a drift kinetic formalism is particularly convenient to distinguish quasi-linear diffusion occurring on slow timescales, with fast wave-particle interactions associated with linear or higher-order processes.Current global models of the Earth's magnetosphere account for ULF radial transport solely in terms of quasi-linear diffusion models.Our analysis demonstrates that linear and higher-order processes occurring on timescales of the order of the drift period and with a spatial dependence on magnetic local time cannot be modeled in terms of quasi-linear diffusion.Observationally, fast and localized radial transport have been known for decades, but have been limited to extreme driving events or serendipitous satellite measurements (Li et al. 1993;Hudson et al. 1995Hudson et al. , 2017;;Kanekal et al. 2016).In recent years, calibration of GPS electron flux measurements with Van Allen Probes' instruments has offered for the first time unprecedented spatial and temporal coverage of the Earth's radiation belts on timescales comparable to the azimuthal drift period (Morley et al. 2016(Morley et al. , 2017;;Kalliokoski et al. 2023).Thus, a modeling framework that distinguishes between fast and slow radial transport is not only of theoretical interest, but can also be tested for the first time with in situ measurements for a wide range of geomagnetic driving conditions. 36 In recent years, dominant acceleration processes in the Earth's radiation belts have been categorized as belonging to local wave-particle interactions or global ULF radial diffusion.The observational signature of local wave-particle processes in the phase-space density consists of localized enhancements, whereas ULF radial diffusion results in the flattening of the phase-space density along the radial distance (Green & Kivelson 2004;Reeves et al. 2013).When including higherorder terms in the radial transport equation, we found that seed electrons with 50-100 keV injected in the outer belts can experience additional betatron acceleration in the presence of symmetric ULF wave amplitudes with amplitudes comparable to the one reported by Takahashi et al. (2022).This impulsive and convective higher-order process requires no drift resonance yet results in a localized enhancement of the phase-space density on timescales that are much shorter than the drift period.This theoretical result is therefore of particular interest to observational studies of radiation belts since ULF waves are also able to produce localized signatures attributed to smallscale wave-particle interactions.With growing satellite coverage and the capacity to measure electron fluxes on timescales comparable to the drift period the binary quasi-linear framework developed in recent years needs to be revisited.
The main focus of this paper has been on the radial transport of energetic electrons in the Earth's radiation belts.However, a drift kinetic description based on the work of Hazeltine (1973) can also be used to describe energetic ring current protons (>100 keV) with Larmor frequencies of Ω p ∼ 1-10 Hz responding to ULF fluctuations of ω ∼ 1 mHz (Murphy et al. 2014) and energetic electrons in a wide range of planetary environments, such as those of Jupiter or Saturn (Lejosne & Kollmann 2020). 37Another limitation of our paper is that it focused solely on equatorially trapped particles, and it ignored boundary effects that are known as a sink for energetic electron fluxes (Millan & Thorne 2007) or ring current ions (Li et al. 1993).A growing number of in situ experiments are showing that energetic electrons can be depleted on timescales comparable to a few drift periods (Turner et al. 2012;Jaynes et al. 2018;Olifer et al. 2018).While such sudden losses can, in theory, be explained by local wave-particle interactions (Zhang et al. 2022), in some events the small-scale waves appear insufficient to account for the losses (Albert 2014).Since ULF waves can effectively transport energetic electrons on fast timescales, it is worth investigating the net impact that they have on irreversible particle losses.The higher-order Parker scenario described in Section 3.4.2,and in association with a sudden and symmetric reduction of the magnetic field, will result in inward convective acceleration and outward convective losses to the outer magnetopause boundary.Finally, another notable limitation lies in the choice of a Mead electromagnetic field model.For extreme events with very large amplitude and localized electromagnetic pulses, such as that of the 1991 March 24 event (Li et al. 1993;Hudson et al. 1995), the radial transport equation needs to be revisited as a Fourier decomposition is not the optimal representation of the ULF field.In future studies, we will quantify radial transport for event-specific studies during both moderate and extreme geomagnetic driving conditions and we will quantify the role of ULF waves acceleration and losses occurring on timescales comparable to the drift period and therefore too fast to be 36 After submission, we were informed of the publications by Lejosne & Albert (2023) and Lejosne et al. (2023) of a new formalism that aims to distinguish between fast and slow radial transport processes.While the aim of their work overlaps with ours, we use different methodologies.Our framework relies on drift kinetics and we separate the fast and slow evolution of the distribution function in terms of coupled but separate equations.In our work, the perturbed part of the distribution function containing the fast motion is spectrally decomposed in terms of Fourier modes, and requires no assumptions of stochasticity for the ULF waves.The appearance of stochastic trajectories and diffusion in phase space can originate from deterministic chaos (Diamond et al. 2010) and radial diffusion should be recovered even in the limit where the ULF wave amplitudes are deterministic. 37Radial transport modeling of ring current is more complicated for two reasons: (1) ions of energy less than 10 keV can sustain fluctuations that violate the drift kinetic scale separation with k ⊥ ρ ; 1 (Crabtree & Chen 2004), and (2) the energy density of the ions is comparable to the energy density of ULF waves.The first problem can be solved by adding collision terms that mimic the effect of pitch angle and energy diffusion, and the second problem requires coupling with Maxwell's equations.Thus, for ring current ions, the higherorder terms described in this work are effectively nonlinear.
explained by radial diffusion.With a radial transport framework that incorporates linear, quasi-linear, and higher-order transport, we can now for the first time work to supplement global radiation belt models with transport coefficients that account for the full range of ULF wave-particle interactions.The first term in the first brackets in Equation (A1) contains the linear term, and the following two brackets with the double sums contain the higher-order terms.We solve this equation with the aid of the Fourier convolution theorem: The linear wave-particle interaction only depends on the antisymmetric magnetic field fluctuation A m .However, the higher-order perturbations are also a function of the symmetric magnetic field fluctuations, i.e., S(t).The traditional quasi-linear assumption consists of ignoring the higher-order terms in the field amplitudes by setting 0  = , and thus computes the fast linear response due to antisymmetric ULF waves.It is however possible, as shown in Section 3.4.2, to derive the fast higher-order response on timescales less than a drift period, and a higher-order quasi-linear theory for long timescales, by accounting for the terms associated with the symmetric ULF perturbations.
Appendix C Justification for Neglecting the Temporal Variation of the Background Distribution in the Linear Response (Equation ( 32)) We note that the linear equation in Equation (B1) contains a term proportional to A r B f t m 0 0 ¶ ¶ .In the quasi-linear limit of short autocorrelation this term will introduce an additional term in the diffusion equation that results in the following correction: .We note that even if the diffusion coefficient is artificially increased to values comparable to the drift period, thereby implying that transport across one drift shell is possible for one single drift period, the short autocorrelation time limit Ω d τ c < 1 would nonetheless hold the above dimensional analysis and justify the neglect of the partial time variation of f 0 in the linearized equation, Equation (B1).

Figure 2 .
Figure 2. Spatial and temporal scales of electromagnetic fields and particle motion in the Earth's radiation belts, and their relation to theoretical limits.The Larmor motion, bounce mirroring motion, and azimuthal drift motions are represented as turquoise ellipses.ULF waves ranging from 2-100 mHz are shown in shaded rectangles.The regime of validity of quasi-linear radial diffusion is shown in yellow and the regime covered by drift kinetic, which encompasses quasi-linear radial diffusion is in gray.The left boundary of the quasi-linear regime is computed from the inverse of the radial diffusion coefficient obtained from Brautigam & Albert(2000)  for L = 8 and Kp = 6, which corresponds to strong geomagnetic conditions.A D LL at L = 8 and Kp = 6 indicates radial transport over one L-shell on a timescale of 30 minutes.For a >4 MeV electron, a drift period is of the order of 3 minutes, and radial diffusion over one drift shell after 10 azimuthal drift periods is very fast, but perhaps possible through quasi-linear diffusion.For lower-energy electrons, e.g., 400 keV, a complete azimuthal drift is of the order of 20 minutes, and diffusion over one drift shell in less than two azimuthal drift is inconsistent with the quasi-linear assumption of small changes over fast timescales.It should therefore be kept in mind that the range of validity of quasi-linear radial diffusion becomes smaller for less energetic particles.

j
the first adiabatic invariant μ, and decompose the perturbed fluctuations along the azimuthal angle in Fourier space 23 Replacing the decomposition, Equation (30), in Equation (28) for m = 0 (the azimuthal average), and averaging over time according to Equation (27) results, as shown in Appendix A, in the quasi-linear equation * * *

Figure 3 .
Figure 3. Ballistic motion for particles trapped in a dipolar field results in the formation of zebra stripes.The top-left, top-right, bottom-left, and bottom-right panels show solutions at j 0 = 0 for t = 30 minutes, and 1, 2, and 8 hr, respectively.The initial distribution is uniform in L and kinetic energy E c .

Figure 4 .
Figure 4.The formation of zebra stripes after the injection of particles centered at L = 2 with a spread in radial distance of ΔL = 0.75.

Figure 5 .
Figure 5.The formation of zebra stripes for 100 keV equatorially trapped electrons in terms of (L−j).The initial distribution corresponds to a Gaussian-distributed beam centered at L = 2 for j ä [0, π] and a Gaussian-distributed drop centered at L = 2 for j ä [π, 2π].After 4 hr, the zebra stripes remain visible.

Figure 6 .
Figure 6.The formation of zebra stripes for 400 keV equatorially trapped electrons in terms of (L−j).The initial distribution corresponds to a Gaussian-distributed beam centered at L = 2 for j ä [0, π] and a Gaussian-distributed drop centered at L = 2 for j ä [π, 2π].After 4 hr, the zebra stripes have phase mixed.

Figure 7 .
Figure 7. Example of resonant and nonresonant response in the electron distribution function.The ULF wave has a frequency of ω = 7 mHz and a mode number of m = 1.The particles are located at L = 8 with pitch angle α = 45°.Particles with kinetic energies of the order of ;1.2 meV(2π/Ω d ; 15 minutes) are resonant, but particles with energies less than 1 meV (2π/Ω d 17 minutes) are not.The resonant particles experience fluctuations almost one order of magnitude greater than nonresonant particles with comparable kinetic energy.

Figure 8 .
Figure 8. Azimuthal drift period (2π/Ω d ) dependence in terms of the kinetic energy E c = [50-2000] keV and normalized radial distance L = r/R E = [4, 6, 8].The top panel is for α = 45 and the bottom one is for α = 90.The gray-shaded area is when the drift frequency matches the Pc5 ULF fluctuations with ω = [2, 7] mHz and resonant interactions are possible.The areas bounded in dashed and dotted lines show the resonant boundary for m = 2 and 3 modes, respectively.

Figure 9 .
Figure 9. Azimuthal drift period (2π/Ω d ) dependence in terms of the kinetic energy E c = [50-400] keV and normalized radial distance L = r/R E = [4, 6, 8].The top panel is for α = 45 and the bottom one is for α = 90.The gray-shaded area indicates when the drift frequency matches the m = 4 Pc4 ULF fluctuations with ω = [7, 25] mHz and resonant interactions are possible.The areas bounded in dashed and dotted lines show the resonant boundary for the m = 7 and m = 10 Pc4 modes, respectively.
on the righthand side of Equation (32) since it provides a correction of order B The quasi-linear diffusion equation therefore takes the general form of We normalize time and the radial distance in the quasi-linear equation, Equation (54), as τ = t/τ c and L = r/R E , and write|δB m | 2 = r 2 |A m | 2 to findin which the diffusion coefficient D LL normalized by τ c is given by conserves particles confined within a bounded volume since the total rate of change of particles is given by ), the distribution function splits at the injection point.This nonadiabatic behavior is shown in the top-right and bottom-left panels of Figure 10.In comparison, an adiabatic decay of the ULF mode with c S d  t

Figure 10 .
Figure 10.Impact on the perturbed distribution function of 100 keV injected electrons at L = 5 by symmetric perturbation on timescales less than one drift period for symmetric perturbation of amplitude δb = 0.12B 0 at L = 5.The color scale denotes the perturbed distribution amplitude.

Figure 11 .
Figure11.Explanation of the higher-order mechanism presented in Section 3.4.2.A symmetric ULF compression with amplitude S(t) results in an E × B differential gradient that is larger in amplitude at higher than that of lower drift shells.Drift shells with negative (positive) gradients result in particles being driven inward (outward).If the ULF compression is an adiabatic particle phase mix along L, but if the compression is nonadiabatic and the E × B drift decays too quickly for phase mixing to occur, the net drift is inward.

Figure 14 .
Figure14.Cut of the linear and nonlinear perturbed distribution function at j = 0 in Figure13.The nonadiabatic symmetric perturbation splits the distribution functions by pushing particles inward and outward.

.
After decomposing the perturbed fluctuations along the azimuthal angle in Fourier space f In order to obtain the quasi-linear equation, we first set p = 0, which corresponds to the spatial average of the kinetic equation, the time average defined in Equation (27) to find Equation (31),The right-hand side of Equation (A6) describes the slow evolution of the background distribution due to the effect of fluctuations.Appendix B Derivation of the Higher-Order Perturbed Equation (Equation (32))In order to obtain an equation for the perturbed part of the distribution function for Fourier modes m ≠ 0, we subtract Equation (A5) from Equation (A4), which results inEquation (32) in the limit of small ULF wave amplitude given by the Mead field B reduces the diffusion by a factor much less than 1.One can also give a dimensional argument to neglect the first term on the right-hand side of Equation (B1) to compute the linear response under quasi-linear assumptions.The diffusion coefficient D LL has units of 1 over time, and is bounded by the drift period Ω d of a particle.With D LL = Ω d , and thus, D LL = 1, the diffusion equation requires that D .If the time and spatial variations of the background distribution are slow and determined by nondimensional small parameters ε t = 1 and ε L = 1, respectively, then f 0 = f 0 (ε t t, ε L L) inserted into the diffusion equation gives the following scaling:D t L L L 2  e e .Therefore, the time variation of the background is smaller than the radial gradient of the background distribution in the linear response by a factor of D 1 magnetic field dipole magnitude B E Earth's magnetic field dipole moment B EK Asymmetric magnetic field model of Elkington et al. criteria associated with the antisymmetric ULF perturbations l Characteristic scale size of electromagnetic fluctuations in the ULF range L Normalized radial distance from the Earth's midplane L * Magnetic drift shell and third adiabatic invariant m s Rest mass of particle species "s" m Azimuthal wavenumber  Higher-order term in the kinetic equation p ∥ Relativistic momentum along the local magnetic Label for particle species s = i for ions and s = e for electrons t Adnane Osmane https:/ /orcid.org/0000-0003-2555-5953Emilia Kilpua https:/ /orcid.org/0000-0002-4489-8073Harriet George https:/ /orcid.org/0000-0002-3715-4623Oliver Allanson https:/ /orcid.org/0000-0003-2353-8586Milla Kalliokoski https:/ /orcid.org/0000-0002-6445-5595