Isodrastic magnetic fields for suppressing transitions in guiding-centre motion

In a magnetic field, transitions between classes of guiding-centre motion can lead to cross-field diffusion and escape. We say a magnetic field is isodrastic if guiding centres make no transitions between classes of motion. This is an important ideal for enhancing confinement. First, we present a weak formulation, based on the longitudinal adiabatic invariant, generalising omnigenity. To demonstrate that isodrasticity is strictly more general than omnigenity, we construct weakly isodrastic mirror fields that are not omnigenous. Then we present a strong formulation that is exact for guiding-centre motion. We develop a first-order treatment of the strong version via a Melnikov function and show that it recovers the weak version. The theory provides quantification of deviations from isodrasticity that can be used as objective functions in optimal design. The theory is illustrated with some simple examples.

On a short timescale, charged particles (mass m, charge e) in a strong magnetic field B perform helices around magnetic field lines with gyrofrequency e m |B| and gyroradius (1) ρ = mv ⊥ e|B| , v ⊥ being the magnitude of the component of the velocity perpendicular to B. We consider fields for which |B| = 0 in the region of interest, indeed large enough to make the gyroradius smaller than typical length-scales for variation of B.
On longer time-scales, the centre-line, radius and pitch angle of the helices drift, but there is an adiabatic invariant, the magnetic moment, whose asymptotic expansion starts (2) µ = mv 2 ⊥ 2|B| , and thereby makes ρ ∝ |B| −1/2 along trajectories. The relevant small parameter ε is the relative change in B (in magnitude and direction) seen by the particle during one gyro-period. The adiabatic invariant allows one to reduce the dynamics to rapid gyro-oscillation about a "guiding centre" whose motion is governed by a relatively slow Hamiltonian system of two degrees of freedom (DoF).
To zeroth order in ε the motion of the guiding centre is along magnetic field lines, governed by the canonical Hamiltonian dynamics of (3) H(s, p ) = 1 2m p 2 + µ|B(s)| for arc-length s and momentum p = mv along B (the effect of an electrostatic field can be included but we leave it out for now; so can gravitational fields and relativistic effects, see Appendix A). We will write b = B/|B|, and for derivative with respect to arc length along a fieldline, so e.g.
in differential forms and vector-calculus notation, respectively. We shall often express relations by differential forms, but we provide vector-calculus translations where feasible. For an example in the other direction, the condition divB = 0 for a magnetic field can be written as dβ = 0, where β is the magnetic flux 2-form i B Ω with Ω being the volumeform in physical space; this says that β is closed. For a tutorial on differential forms for plasma physics, see [M20].
For time-independent fields, the zeroth-order guiding-centre motion (ZGCM), equation (3), conserves the energy H = E, so the trajectories on a given fieldline can be classified into (see Figure 1 • passing: |B| < E/µ along the whole fieldline so the guiding centre moves unidirectionally along it (this includes the case µ = 0, E = 0); • one-sided bouncing: |B| < E/µ along the fieldline in one direction (say s < 0) but exceeds E/µ somewhere in the other direction, with |B| > 0 at the first point s 0 where |B| = E/µ, so a guiding centre moving from s < 0 with v > 0 reverses direction at s 0 ; • two-sided bouncing: |B| < E/µ in an interval (s 1 , s 2 ) along the fieldline with |B| = E/µ and |B| = 0 at both ends, so the guiding centre bounces periodically between s 1 and s 2 (the standard terminology for "bouncing" is "trapped", but it is only at zeroth order that they are trapped; the confinement problem is that at first order they might not be trapped; another term used in some contexts is "blocked", e.g. [F+]); • marginal: if |B| = 0 at a point where |B| = E/µ then the guiding centre takes infinite time to reach it.
Definition 1. We call the portion of fieldline between a pair of turning points for bouncing motion a segment. It is sometimes useful to identify a segment with its corresponding interval [s 1 , s 2 ] of arclength values.
To first order in ε, however, the guiding centre drifts across the field. The description of first-order guiding-centre motion (FGCM) that we prefer (a reformulation of [Li]) is that the motion is Hamiltonian on the space of guiding-centre position X in 3D and parallel velocity v , with the Hamiltonian (4) H = 1 2 mv 2 + µ|B(X)| and closed 2-form where β is the magnetic flux 2-form and b is the 1-form b·dX. This description generates the dynamics (Ẋ,v ) = V by solving i V ω = −dH for V (except whereB = 0, defined below, at which ω is degenerate; see [BE] for a modification that has no singularity). The solution can be written aṡ where (8) B = B + m e v c, c = curl b and B = B · b. The above use of differential forms is convenient because it produces a Hamiltonian form for guiding-centre motion despite there being no natural canonical coordinates for it; the phase space has three position-dimensions and one velocity-dimension.
Alternative expressions for FGCM appear in the literature (e.g. [LC, He]). One alternative that exhibits the standard curvature and grad-B drifts iṡ Note that c ⊥ can be written as b×κ, where the curvature vector κ = b·∇b. Equation (9) can be seen as the first-order expansion of (6,7,8) in ε (think of ε as 1/e). An equivalent way to write it is to define v = ± 2 m (E − µ|B|) as a (multivalued) function of position X, and thenẊ = v b + mv e|B| (curl(v b)) ⊥ , where E is the value of H treated as constant in computing the curl. Although (9) conserves H, we doubt that it has a Hamiltonian formulation in general (but see [Bo84] for some discussion), thus we prefer (6,7,8) on the grounds of preserving as much structure as possible from the original model. Whichever of the above equations are used, the motion still conserves H but produces drift across the field. By assigning to each point in guiding-center phase-space its corresponding ZGCM orbit, this cross-field drift may be visualized as motion through the space of ZGCM trajectories. Since the profile of |B| with respect to arc-length along nearby fieldlines is in general different, a drifting solution's instantaneous ZGCM trajectory may transition between classes. Repeated transitions between classes produces an effective diffusion (e.g. [GT, Men]) and hence poor confinement. In some classes (e.g. ripple-trapped) the drifts produce large, even unbounded, excursions, so transitions into such classes also produce poor confinement.
For passing motion, one may be able to design the field so that many of the fieldlines remain in some desired region R, e.g. a solid torus, and furthermore, so that the guiding centres for many initial conditions remain on fieldlines in R , the subset of R whose gyroorbits are contained in R. For the guiding centres, it suffices to have invariant 2-tori with spatial projections in R for all values of energy and magnetic moment used, because such tori confine all initial conditions inside them. Near-integrability (e.g. from an approximate flux function for B), C 3 -smoothness and generic conditions on the field guarantee existence of such tori, by KAM theory (there are various references, e.g. [Mo, SZ] and the semi-popular [Du]). Thus, ignoring the effects of interaction between particles, particles on passing trajectories inside such a torus remain within R.
One-sided bouncing can be treated together with passing. For example, for a mirror machine, guiding centres that enter one end, make one bounce inside and then leave by the same end, have much the same effect as passing ones that go out of the other end. For fields with an invariant set of finite volume, one-sided bouncing trajectories are rare, because Poincaré recurrence implies that almost every fieldline comes back arbitrarily close to any value of |B| it takes. So henceforth, "bouncing" will refer to two-sided bouncing.
For (two-sided) bouncing motion there is a second adiabatic invariant L, called "longitudinal", whose asymptotic expansion starts with (10) L = s 2 s 1 mv ds, as long as the fieldline seen by the guiding centre changes relatively little during one bounce period T = 2 s 2 s 1 v −1 ds, e.g. [He]. Then the motion in a bouncing class can be reduced to one DoF for the intersection of the fieldline segment with a transverse section Σ; the Hamiltonian is still H but evaluated at the value of v for which the bouncing segment has the given value of L; the symplectic form is eβ (by conservation of magnetic flux along fieldlines, this gives the same dynamics regardless of which transverse section is used). Closed level-sets of H on Σ (for given L) give invariant 2-tori for the guidingcentre motion. So the corresponding particles are confined to within one gyro-radius of the projections of such tori to physical space (ignoring the effects of interaction between particles). If the particles are on tori that fit in a desired region R then they stay in that region. Examples are the "banana" trajectories in a tokamak. They bounce above and below the outer midplane (where |B| is minimum along fieldlines), moving alternately inwards and outwards between bounces relative to a flux surface (thus tracing out a banana-shape in projection to a poloidal section), but not in general closing because of a slight rotation around the central axis so that the banana drifts around the central axis, tracing out a torus of banana cross-section.
If the drift motion leads towards a marginal case, however, the guiding centre may make transitions between the above classes or between different bouncing classes. Such transitions can lead to large changes in the region explored by a particle [Ne]. Transitions between bouncing classes may lead to larger tori that no longer fit in the desired region, or even unbounded motion. Transitions from bouncing to passing may lead to motion that is not confined by the tori for passing trajectories. Repeated transitions can lead to large accumulated changes. Transitions where one bouncing class gets divided into two lead to pseudo-random choice of new class. These are all particular problems for high-energy particles, e.g. [B+, F+] and [P+].
Thus, transitions between classes of guiding-centre trajectory are generally bad for confinement.
One way to solve this problem was proposed in 1975 by [HM], called "omnigenity". For a review, see [He]. It assumes the field B has a flux function ψ, i.e. such that i B dψ = 0 (B · ∇ψ = 0) and dψ = 0 (∇ψ = 0) almost everywhere. For a review of magnetic fields with a flux function, including some less known properties, see Appendix B.
The field is called omnigenous if the time-average of i V dψ (V · ∇ψ) along each zerothorder trajectory is zero, using the non-Hamiltonian V of (9). This is automatic for passing trajectories on irrational flux surfaces; see [He] or Appendix C. A necessary and sufficient condition for omnigenity is that L be constant for all bouncing trajectories with given energy and class on a flux surface. Necessity was proved in [LC] (extended to full generality in [PCHL]), and sufficiency (which seems not to have been proved before) is proved in Appendix C.
Axisymmetric fields are omnigenous (indeed, so are all quasi-symmetric fields). Constructions of non-axisymmetric omnigenous fields appear in [CS, LC, PCHL] but depend on existence of Boozer coordinates, which are derived assuming the field is nondegenerate MHS (e.g. [He]) and we are not aware that any non-axisymmetric nondegenerate MHS fields are known. In particular, we are not aware that |B| as a given function of Boozer coordinates can be realised by a magnetic field B in 3D.
For C 3 omnigenous fields, some generic conditions imply existence of many invariant tori for the passing trajectories, close in projection to the flux surfaces, by KAM theory, at least for low energies, as already mentioned. It furthermore implies that the invariant tori for bouncing motion are close in projection to parts of flux surfaces. Lastly it implies that transitions between classes are a second-order effect [CS].
Thus, omnigenity (assuming it is realisable) sounds a good solution for confinement. Nonetheless, not every field has a flux function, e.g. most vacuum fields do not, nor do most magnetohydrodynamic equilibria with anisotropic pressure or mean flow. Secondly, reducing transitions to second order is perhaps not enough for good confinement. Thirdly, even for fields with a flux function, the requirement that v d · ∇ψ = 0 for all GC trajectories is stronger than necessary for confinement; prevention of transitions and existence of KAM tori in each class would suffice. Fourthly, analytic exactly omnigenous fields have to be quasi-symmetric [CS] and it is suspected that non-axisymmetric quasi-symmetric fields do not exist [GB]; this last one is a minor objection, however, because analyticity is not necessary for real applications, where the current distribution need not be analytic (except in the vacuum case where analyticity is automatic because locally the field is the gradient of a function satisfying Laplace's equation). A weaker condition than omnigenity, called pseudo-symmetry, was introduced by Mikhailov [M+] but does not prevent transitions; the relation to our work is discussed in Appendix D.
In this paper we first extend [CS] to present a set of conditions for absence, to first order, of transition between classes of guiding-centre motion. The set does not require a flux function, and so has wider applicability than omnigenity. Even if there is a flux function, our conditions are weaker than omnigenity, so they stand more of a chance of being realisable without axisymmetry. Indeed, we construct non-axisymmetric mirror fields with no transitions. Then we present a "strong" form that prevents all transitions exactly. We call it "isodrasticity". We are unaware of any previous work that formulates such a criterion for non-perturbative suppression of transitions.
Isodrasticity has the further advantage that it can be adapted to high-energy particles, such as the α-particles produced by D − T fusion, for which a higher-order guidingcentre approximation may be required (see [BSQ] for explicit computation of higherorder guiding-centre approximations).
Our theory provides clear objective functions to contribute to the optimisation of magnetic fields. It provides enhanced understanding of the effects of imperfections in tokamaks, and more generally in quasi-symmetric fields. It suggests the prospect of controlling transitions between classes by weak breaking of isodrasticity via trim coils.
We emphasise that isodrasticity prevents transitions between classes but that one would still need to confine trajectories that stay within each class, as mentioned above. Failure to achieve that was the main problem with early stellarators. The basic way we propose to achieve it for bouncing classes is by designing a suitable subset of level curves of L to be closed and fit in the machine and finding a corresponding band of KAM tori for the guiding-centre motion at each value of energy and magnetic moment, which confine all trajectories inside, though this might be challenging for ripple-trapped classes. For circulating classes, we propose to achieve confinement by KAM tori derived from an approximate flux function. This important aspect of confinement is not addressed further here.

Reduced guiding-centre motion and weak isodrasticity
To explain our concept of weak isodrasticity, we first give a more detailed description of the reduction of first-order guiding-centre motion by the longitudinal invariant and of the set of critical points of field strength along the fieldlines. We end the section by computing the flux of reduced trajectories that make a transition if the field is not weak isodrastic.
2.1. Reduced guiding-centre motion and critical points of |B| along the field. We suppose that in the domain of interest, B is nowhere zero and is C r with r ≥ 2 (and for some purposes more).
Using energy conservation (4), for µ > 0 the second adiabatic invariant (10) can be written as L = √ mµ j with where h = E/µ and the bounce points are at arclengths s 1 , s 2 . We note in passing that this is the Abel transform 2) of the length (v) of the subsegments with |B| < v, which might provide a useful alternative way to compute it and is used in Section 4 and Appendix C.
Bouncing segments along a given fieldline come in one-parameter families, parametrised by the value h of the scaled field strength at its ends. The action j is differentiable with respect to h and a standard calculation shows that the derivative is This can be recognised as the time taken from s 1 to s 2 by the dynamics ds dt = ± 2(h − |B|) of the scaled Hamiltonian H = 1 2 u 2 + |B(s)|. Thus the period of bouncing in the original time is T = 2 m µ dj dh . As a bouncing segment approaches marginality, the period goes to infinity. For example, approaching a non-degenerate local maximum at one end with |B| = h 0 and |B| = −a, the period diverges asymptotically like T ∼ m µa log 2a |h−h 0 | . A key role is played in the reduced dynamics by the set Σ of critical points of |B| along fieldlines, i.e.
where, as before, denotes derivative with respect to arclength along a fieldline. Subdivide Σ into the disjoint union according as |B| > 0, = 0 or < 0, respectively (nondegenerate local minima, degenerate critical points, nondegenerate local maxima). By the implicit function theorem, Σ ± are C r−1 surfaces (with possibly several components). For r ≥ 3, Σ 0 is generically a C r−2 curve (with possibly several components) and generically forms the common boundary of Σ ± (see Appendix E). Some examples are shown in Figure 2. Figure 2. The set Σ of critical points of |B| along fieldlines for (a) a dipole, (b) a mirror machine, and (c) a tokamak.
The analysis presented in this Article demonstrates that the identification of Σ in a magnetic field, including its decomposition into Σ ± and Σ 0 , is a key step to understanding the confinement properties of the field.
There are various ways to label a segment of fieldline. All include specifying the fieldline γ and the (common) value h of |B| at its endpoints. To complete the label, we have to say which wells of |B| it visits along the fieldline. One way to do this is to specify the set of all local minima it visits, or indeed the set of all its intersections with Σ. Given γ and h this is more than is required to specify the segment, because the segment is connected, so one could instead select one of the local minima. The only problem with any of these specifications is that as the segment moves to nearby fieldlines the set of local minima may change size, because one might be annihilated by a local maximum or a new one created from a horizontal inflection; a jump will be required if the selected local minimum is annihilated. The set also changes size when there is a real change in the class, i.e. when one end becomes a local maximum so the segment can lengthen over some new wells, or when an interior local maximum rises to height h, which splits the segment into two, but we consider that natural. An alternative is to specify the first intersection with Σ − outside the segment when going in one direction along the fieldline (say the positive one). This choice does not suffer from the previous issue, but the local maximum might be annihilated with a local minimum a bit further away and then the first local maximum would jump to one further away, or a horizontal inflection might be born between the local maximum and the end of the segment, leading to a jump in the other direction. Thus there is no good solution.
The best way is to use equivalence classes of such labels. We denote such an equivalence class by M . As mentioned above, M = (M, h) comprises M , a field line together with visited wells along that line, and h, the common value of |B| at the segment's endpoints. Note that the set of possible M is in one-to-one correspondence with (images of) ZGCM bouncing orbits.
There is likewise not a good concept of class of segment. One might be tempted to say two segments are in the same class if one can be obtained from the other by continuous change of segment, but the example of Figure 3 shows that this can lead to two different segments of the same fieldline with the same h being in the same class, which is not what we want (it can even be done conserving j). On the other hand, discontinuous change in a segment is well defined, being a jump in segment at given h as the fieldline varies locally continuously, and we refer to this somewhat loosely as a transition between bouncing classes. Given a fieldline and the set M of intersections of a segment γ with Σ, there is an interval [h min , h max ] of compatible values of h. h min is the maximum of |B| over M and h max is the minimum of |B| over the first local maxima outside M in the two directions. The action j of (11) is a continuous increasing function of h for this interval of segments (and differentiable in the interior). It goes from some j min to a j max . If M is a single local minimum then j min = 0.
For given j in this interval (j is now a scalar rather than a function), we obtain the reduced and scaled Hamiltonian H j (M ) defined to be the unique value such that where γ(M, h) is the segment of the fieldline with label M and endpoints with |B| = h. H j (M ) is the energy divided by µ, equivalently, |B| at the bounce points, expressed as a function of M for given j. See Figure 4.
The phase space F j for the reduced system is It is easiest to think of the case of a single well, then F j is just a subset of Σ + , but in general it can be mapped to any transverse section of the field. The symplectic form on F j is eβ, which is invariant along the fieldlines between any choices of transverse section. To address the reduced dynamics on F j we first remark that onF j = {M : j min (M ) < j < j max (M )} (the subset of F j for which the segments are non-marginal), H j is differentiable. Indeed, Proposition 1. Let X be any vector field on guiding-centre phase-space whose flow commutes with that of ZGC dynamics and that satisfies dj(X) = 0. Then is related to the true bounce period T by T s = µ/m T /2.
Proof. Let Y be any vector field on guiding center phase space whose flow commutes with that of ZGC dynamics (we do not yet require that Y preserve level sets of j). The derivative of j along Y is given by The second term of the integrand integrates to 0 because the square root is zero at the ends of γ. The first term expands to Now db = i c Ω and b is tangent to γ, so we end up with Next we use (12) which gives the change in j for a change in h without change of fieldline (one way to derive that equation is a similar but simpler use of Lie derivative).
As discussed in the text, it implies dj/dh = (T /2) µ/m. On taking Y to be a field X that in addition satisfies dj(X) = 0, this leads to (15).
We refer interested readers to Appendix F for an alternative proof of this result that casts it in a more general light.
Applied to γ, b / 2(h − |B|) is the time interval dt for ZGCM. So, modulo the correction term involving i c Ω, the formula (15) says that dH j is the time-average of d|B| along the segment of fieldline. An important consequence for our analysis is that taking the limit as a bouncing segment goes to a homoclinic one, we see that i X dH j goes to the value of i X ⊥ d|B| at the critical point, because the bouncing segment spends all but a vanishing fraction of its time near there. At a critical point, |B| = 0 so i X ⊥ d|B| = i X d|B| there. Thus, using Σ − as transverse section and defining h = |B| on Σ − (being the value of h for homoclinics to Σ − ), as any homoclinic case is approached. The reduced dynamics on F j in a scaled time τ = µ e t is dM/dτ = u defined by (17) i T = e µ ∂Φ ∂h with j fixed. This comes from the standard formula that the period T of a periodic orbit γ of an autonomous Hamiltonian system is ∂S ∂E where S is its action ( γ α for a primitive α of the symplectic form ω) and E its energy (using that periodic orbits come in 1-parameter families).
The case of short bounces can be treated explicitly (we mean short in length, not in time; these are usually called "deeply trapped" but again they might not be trapped). The limiting case of zero length is j = 0. For these, M is a singleton. The motion is on Σ + with Hamiltonian H 0 = |B|. So the trajectories follow level curves of |B| on Σ + at rate µ e u in real time, with i u β = −d|B|. To keep these guiding centres in a region R one must put them on level sets of |B| on Σ + lying within R. For fields with flux surfaces, an ideal is that for each flux surface the minima of |B| along fieldlines on it have the same value of |B|, because then the short bouncers remain on that flux surface (pointed out already by [MCB]). In contrast, it goes wrong for ripple-trapped particles in tokamaks. They are a class of bouncing trajectories in a poloidally confined region (not around the equatorial plane) for which the |B| contours in Σ + are approximately vertical, so for one sign of e/m the guiding centres leave the desired solid torus. Any particles that make the transition into this class are lost. See Fig.5 of [GT] for an example, and [P+] for more.
One can also treat the linear approximation to short bounces with j > 0. It gives j = π(h − h min )/ |B| , where |B| is evaluated on Σ + (and is the bounce frequency in a scaled time). So to this order in j, Although our focus in this paper is on transitions between classes, the above behaviour of short bouncers is of independent interest. Yet, it will be seen in Section 3.2 to be of relevance to transitions for perturbations of a tokamak field.
2.2. Weak isodrasticity. The adiabatic invariant j is well conserved if the segment changes relatively little during one bounce period. The condition can be written as 2 2 |B| ∂j ∂h 1/ρ, where is a lengthscale for variation of B and ρ is the gyroradius (1). This fails when the period becomes large, in particular as a segment approaches a marginal case. Figure 5 shows the three principal ways marginal cases can be approached. In the weak version of isodrasticity, to determine whether marginal cases can be approached we make the approximation that j continues to be conserved up to and including when a marginal case is reached and the dynamics is given by the reduced dynamics (17) on F j . Definition 2. A magnetic field B is weakly isodrastic if the marginal cases are never reached from non-marginal ones by the first-order reduced dynamics.
We will develop some necessary and some sufficient conditions for weak isodrasticity. On Σ −0 = Σ − ∪ Σ 0 , define h = |B| (this agrees with E/µ for guiding centres with v = 0 on Σ −0 ), and for direction σ ∈ {±} along the field define  σ to be the value of j for a segment γ σ starting at the given point of Σ −0 and going in direction σ (assuming a turning point is reached; if not,  σ is undefined). When it is clear which direction is under consideration, we drop σ. So  = γ 2(h − |B|) ds.
Equivalently, given x ∈ Σ −0 the value of (x) is the area of the zeroth-order guidingcentre separatrix lobe attached to (X, v ) = (x, 0) in the appropriate direction along the field. Note the relation Loosely speaking, our results are that a magnetic field is weakly isodrastic iff the contours of h and  coincide on Σ −0 for both directions σ. See Figure 6 for an illustration of failure of weak isodrasticity. This explains the etymology of our definition. In Greek, Figure 6. Illustration of failure of weak isodrasticity; the level curves of h and  on Σ − do not coincide.
"iso" means equal and "drasis" means action. We're asking for marginally bouncing trajectories of given energy (which up to scaling by µ is h for particles on segments with an endpoint on Σ − ) to have the same action . Use of the term "isodrastic" goes back to [We], but in a different context.
We formulate precise statements of two necessary and one sufficient condition for weak isodrasticity in (Theorem 1). First, the function h is smooth on Σ − because B is assumed to be smooth and Σ − was proved to be smooth. For given direction along the field, the separatrix area  is also a smooth function on Σ − except at points with a heteroclinic connection, where the derivative generically becomes infinite and  jumps or ceases to be defined (if there is no longer any bounce point in that direction). We define Σ − to be this subset of Σ − . As already remarked, H j is differentiable for non-marginal segments and dH j → dh as a homoclinic case is approached (16).
Theorem 1. (a) If magnetic field B is weakly isodrastic then for both directions along the field, dh and d are linearly dependent at every point of Σ − ; (b) If B is weakly isodrastic and Σ 0 is a smooth curve without heteroclinic cases, then h and  are constant on connected components of Σ 0 ; (c) If for both directions,  is constant on components of level sets of h then B is weakly isodrastic.
Proof. (a) If for some direction along the field, d and dh are linearly independent at some x 0 ∈ Σ − , let j 0 = (x 0 ). In particular, d = 0 there so the level set  −1 (j 0 ) is locally a smooth curve. See Figure 7. Locally, it is the boundary of F j 0 and the motion is periodic in the interior of F j 0 . By independence of dh and d at x 0 , dh = 0 there and the tangent to the boundary is not in ker dh. dH j 0 → dh as x 0 is approached from the interior. So ker dH j 0 is at a non-zero angle to the boundary near x 0 . The reduced dynamics is the Hamiltonian dynamics of H j 0 with respect to the flux form β.  (b) If Σ 0 is a smooth curve and h is not constant along a component of Σ 0 then there exists a point x 0 ∈ Σ 0 where the derivative of h along Σ 0 is non-zero. Thus the tangent to Σ 0 there is not in ker dh. Let j 0 = (x 0 ). By assumption the segment from x 0 is not heteroclinic. Then dH j 0 → dh as x 0 is approached from Σ − . Thus ker dH j 0 is at a non-zero angle to Σ 0 near x 0 . So trajectories of the reduced dynamics for this value j 0 reach Σ 0 in finite time for one sign of scaled time τ , i.e. in finite positive time for one sign of charge. Thus B is not weakly isodrastic. We deduce that weak isodrasticity implies h is constant along smooth components of Σ 0 . Since weak isodrasticity also implies dh ∧ d = 0 up to the boundary in reduced space by (a), h = const. along the boundary implies also that  = const. along the boundary.
(c) Assume  is constant on level sets of h. Let γ(t) denote a trajectory for first-order reduced dynamics such that γ(0) is not marginal. If γ were to become marginal in (say) forward time there would be a t 0 > 0 such that γ(t 0 ) is marginal and γ(t) is not marginal for 0 ≤ t < t 0 . But this is impossible because we claim: γ(t 0 ) marginal implies γ(t) marginal for t in an open neighborhood of t 0 . We prove as follows. Suppose x 0 = γ(t 0 ) is marginal. The initial value problem for x 0 enjoys uniqueness because dynamics in the reduced space arise as a quotient of dynamics in the full guiding center phase space, in which uniqueness holds. Therefore if x 0 is a fixed point then we conclude . So assume x 0 is not a fixed point. Hamilton's equations imply that dH j (x 0 ) = 0. Since dh agrees with dH j at marginal points it follows that dh(x 0 ) = 0. By the implicit function theorem, the level set of h containing x 0 is a 1-manifold Γ near x 0 . And since  is constant along Γ the reduced Hamiltonian H j is constant along Γ. The pullback of Hamilton's equations to Γ now implies Γ is locally invariant, which establishes the claim. Thus the field is weakly isodrastic. The same argument applies for the case of splitting of a segment by an interior local maximum. Simply,  needs replacing by the sum  + +  − . If both  ± are constant on level sets of h then so is this sum.
One might ask why in (c) we did not need a hypothesis like h constant on components of Σ 0 . We think this follows from  constant on h-levels, at least under some generic assumptions. For example, in Appendix E we show that h constant on generic Σ 0 follows from linear dependence of dh and d for the short bouncing class on the neighbouring part of Σ − .
2.3. Quantification of failure of weak isodrasticity and transition flux. Theorem 1 leads to a quantification of failure to be isodrastic. The failure of contours of h and  to coincide on Σ − can be measured by the 2-form dh ∧ d. This is most simply described by comparing it to the magnetic flux-form β, which is a nondegenerate top-form on Σ − . Thus there is a function M on Σ − such that dh ∧ d = Mβ.
In Section 8, M will be identified as a "Melnikov function" for the FGCM dynamics. But for now, to compute M, if Σ − is given locally as the graph z = Z(x, y) of a function in Cartesian coordinates then where subscripts after a comma indicate partial derivatives.
The function M has units of square root of field strength divided by length, but it is natural to multiply M by the factor √ mµ to turn  into L. The quantity √ mµ M is an inverse time, so represents the rate of transition. Indeed, the flux of reduced orbits between classes is given precisely by the following Theorem 2. We need first to introduce the Liouville volume-form Λ on guiding-centre phase-space. It is defined by Λ = 1 2 ω ∧ ω, where ω is in (5). This can be computed to be (20) Λ = emB Ω ∧ dv , using the relations β ∧ b = |B| Ω and b ∧ db = −b · c Ω. In the following, we use the symbol ρ for a density in GC phase space (as opposed to the gyroradius).
Theorem 2. Let Λ denote the Liouville volume form in guiding center phase space. For a distribution of guiding centers with density ρ with respect to Λ in (X, v ), the flux of reduced orbits between classes is given by the 2-form on where ρ is the ZGC bounce-average of ρ.
Proof. In the 3-dimensional space of ZGCM bouncing orbits, particle flux is quantified by a 2-form Γ. The flux of reduced orbits between classes is given by restricting Γ to the 2-manifold of marginal bouncers. So we will find a formula for Γ and then analyze its restriction to the marginal orbits.
To determine an expression for Γ we first argue generally in the context of symplectic Hamiltonian systems with U (1) symmetry. This is relevant to bouncing particles because Kruskal's theory [Kr] of nearly periodic systems implies guiding-centre dynamics formally comprises such a system when restricted to bouncing orbits. Let V denote a Hamiltonian vector field with Hamiltonian H on the symplectic 2n-manifold (M, ω). Assume there is a symplectic U (1)-action Φ θ : M → M with momentum map P : M → R and that H is U (1)-invariant. We will also suppose that the quotient map π : M → M/U (1) sending phase points to their U (1) orbits is a smooth mapping between smooth manifolds.
Let Λ = 1 n! ω ∧n denote the Liouville volume on M . Given a particle density = ρ Λ, regarded as a top-form on M , we will derive a formula for the flux of particles in the orbit space M/U (1). At the level of measures, the density of particles in M/U (1) is simply the measure-theoretic pushforward along π of the phase space measure defined by . At the level of differential forms, this means the density on M/U (1) is given by the fibre integral π * -a volume form on M/U (1). If v denotes the vector field on M/U (1) induced by the U (1)-invariant vector field V , the particle flux form on M/U (1) is therefore Γ = ι v π * . We can simplify this abstract formula as follows. Let P, H denote the unique functions on M/U (1) such that π * P = P and π * H = H. A direct calculation shows that the fibre integral is given by where ω is any 2-form on M/U (1) that restricts to the Marsden-Weinstein reduced symplectic form on level sets of P , and ρ is the U (1)-average of ρ. It follows that the flux form Γ is given by Specializing now to the guiding-centre case (n = 2, P = 2L), we deduce that the flux form in the space of ZGCM orbits is given by Γ = 4π ρ dH ∧ dL, where H is the true (i.e. unscaled) Hamiltonian. To restrict this 2-form to the class boundary, we parameterize the set of marginal bouncers using the map i that sends points in Σ − to the corresponding ZGCM separatrix orbit attached to Σ − in the appropriate direction along the field. Since we have the pullback identities the particle flux through the class boundary is given by as desired.
The flux-form of Theorem 2 represents the transition fluxes out of and into a class as positive and negative contributions with respect to an orientation on Σ − . To obtain the flux in one direction, one has to integrate it over the subset with the appropriate sign.
The size of the function M can be quantified in various ways, for example, one could take the maximum of |M| over a relevant piece of Σ − (say between two contours of h), or the integral of |M| with respect to the flux form β over a relevant piece of Σ − . Note that the size of M can be used as an objective function for a magnetic field optimizer that encourages the optimized field to be weakly isodrastic.
The function M can be computed from (19) by numerical differentiation of h,  and Z, as will be illustrated in the next section, but a more direct method will be given in Section 8.
As a special case, a segment can approach marginality simultaneously at two (or more) points. It leads to the reduced phase space F j having corners as well as edges. This is discussed in Appendix G. In particular, Σ − can contain curves along which there is heteroclinic connection: the condition is just that there is a segment with both ends on local maxima, so basically a Maxwell equal-height condition.

Examples of deviations from weak isodrasticity
To help understand the constructions of the surfaces Σ in a magnetic field and the reduced Hamiltonian H j for bouncing trajectories with given (scaled) value j of longitudinal invariant, we treat examples of the three types of field presented in Figure 2. For each we start from an axisymmetric field and then consider the effects of breaking axisymmetry. The case of a dipole field has only Σ + , so is trivially isodrastic; it is treated in Appendix H. In the cases with Σ −0 , one in general loses weak isodrasticity. The big question is whether there are special perturbations that keep weak isodrasticity. For mirror fields, that will be answered positively in Section 4. See Appendix I for some practical tips for computing expressions involving b and |B| and for computing .
3.1. Mirror machine. Here we consider a mirror machine. As a simple axisymmetric and vacuum version, we take the field of two circular coils centred on a common axis that we orient vertically, with magnetic dipole moments in the same direction along the axis and with separation significantly larger than their radii so that the field is significantly weaker between the coils than at their centres (in contrast to the Helmholtz case). We focus attention on the region within less than the coil radii of the axis (outside the coils there are additional parts of Σ + and further away there can be nulls at which Σ + branches; the full picture will be presented in a future publication). Then Σ − consists of two surfaces, one spanning each coil, and Σ + is an intermediate surface cutting the axis (recall Figure 2 Segments can bounce between the stronger fields near the coils, or if they have enough energy compared to magnetic moment they can escape through one or other coil. The bouncing segments can be labelled by the radius r of their intersection with Σ + . For a segment γ labelled by r and with |B| = h at the bounce points, The function J increases with h and it is plausible that it increases with r too. Given j ≥ 0, the Hamiltonian H j is defined on the part of Σ + for which there is a bouncing segment with that j, by J(r, H j (r)) = j. The domain on Σ + is a disc for j ≤ j * and an annulus for j > j * , where j * is the value for the segment bouncing along the axis between the saddle points of |B| at the centres of the coils. Under the assumption that J increases with r, the derivative dH j /dr = − ∂J/∂r ∂J/∂h < 0, except 0 for r = 0. Its level sets are axisymmetric. So the bouncing segments precess round the axis.
If axisymmetry is broken by a smooth perturbation but not too strongly then the surfaces Σ ± deform smoothly and the function H j on Σ + deforms smoothly. It remains a Morse function (i.e. all its critical points are non-degenerate), so (under the assumption that ∂J ∂r > 0) the low level sets remain closed curves around a central point and the bouncing segments precess around them, but level sets for higher values of H j may reach the boundary of definition.
To consider transitions, it is simplest to break up-down symmetry by making, say, the lower coil produce a stronger field than the upper coil. Then there is a range of segments from the upper part of Σ − that bounce before the lower part of Σ − . Let  be the function on the upper part of Σ − giving j for the segment that starts at the given point on Σ − and goes into the mirror machine. Let j 1 be the minimum of  on the upper part of Σ − . In the axisymmetric case this is j * . For j < j 1 , the segments precess forever. But for j > j 1 , motion along a level set of H j could take the segment to the boundary of its domain of definition, where it becomes marginal. Then one has to examine h (= |B|) and  on the upper piece of Σ − . In general, their level sets do not coincide, which corresponds to motion of segments leading to marginal cases.
The result of breaking axisymmetry for a mirror machine is illustrated in Figure 8(a,b). As described in the previous section, one can quantify the failure of contours of h and  Figure 8. The effects of breaking axisymmetry in a mirror machine. The field is oriented vertically, with the stronger (red) coil at the bottom, having twice the current. Axisymmetry is broken by rotating the stronger coil by π/60 about the vector (− sin π/60, cos π/60, 0) at the center of the coil and then centering it at (1, 1, −7) while the weaker (blue) coil is at (0, 0, 7). (a) the parts of Σ above the square |x|, |y| ≤ 2; the pieces surrounded by the two coils are in Σ − , the piece in the middle is in Σ + ; (b) level sets of h and  on the upper part of Σ − (solid black for h, dashed red for ); (c) the function M on the upper part of Σ − .  to coincide by computing dh ∧ d. It is a multiple M of the flux 2-form β, so the convenient way to describe it is the function M. Figure 8(c) shows M for the example, computed by numerical differentiation.
Analysis of the motion would be completed by computing H j on Σ + , but for isodrasticity, it is enough to look at h and  on Σ − , as we have explained.
A more complicated case is perturbations of an up-down symmetric mirror machine. In the unperturbed case all the marginal cases are doubly marginal, because both ends of the segment are zeroes of |B| . On breaking axisymmetry (and up-down symmetry if desired) a subset of initial conditions on each piece of Σ − bounce before reaching the other piece of Σ − but another subset cross the latter at a lower value of |B| and thus escape. For these points,  is undefined. Actually, some of the escaping fieldlines might encircle a coil and come back for another approach to Σ − and bounce, thereby giving a defined but larger value of , so  would have jump discontinuity lines at heteroclinic cases. We do not discuss this further here, but the same issue is unavoidable for tokamaks, to be addressed in the next subsection.
It would be interesting to compute Σ, h and  on Σ − for some other mirror machines, for example, a baseball coil [Pos]. Indeed, we treat a different mirror field in Section 6.1.

Tokamak.
A simple example of an axisymmetric tokamak field, in physical components for cylindrical polar coordinates (R, φ, z), is (not minor radius) and C > 1 (this is a simple case of Balescu's class of standard axisymmetric magnetic fields [Ba]). It has a closed fieldline z = 0, R = 1, called the "magnetic axis", whose radius has been scaled to 1. The field is divergence-free but we made no effort to make it magnetohydrostatic; perhaps it would be better to use a Solov'ev equilibrium [So, CF], but we chose this one because we could do more calculations explicitly. The parameter C is chosen to exceed 1 for stability to kinks, as will be discussed shortly, though not being an equilibrium, this stability condition is not really applicable. The fieldlines preserve ψ = 1 2 (r 2 + z 2 ). With poloidal angle θ around the magnetic axis, one obtains This can be integrated to show that the change in φ for one revolution in θ is 2πC/ √ 1 − 2ψ, hence the rotational transform ι = √ 1 − 2ψ/C. In particular, if C > 1 then the "safety factor" q = 1/ι exceeds 1 on all flux surfaces, satisfying the Kruskal-Shafranov condition for stability to kinks, as claimed.
The field strength is So Σ 0 is the magnetic axis r = z = 0 and separates Σ into Σ + for r > 0 and Σ − for r < 0 (see Figure 2(c)). There are passing trajectories that circulate in the same direction forever. There are bouncing segments that cross Σ + repeatedly, bouncing at the stronger field where R is smaller; they give the bananas. As for the axisymmetric mirror machine, j = J(h, r) for some function of h = |B| at the bounce points and r the value at which the segment crosses Σ + . It is defined for the limits corresponding to the field strengths on Σ + and Σ − respectively. The function J increases with h, to a maximum of j * (r) corresponding to the upper limit of h. It is plausible also that it increases with r. The Hamiltonian H j on Σ + for motion with given j is again defined by J(H j , r) = j, on the subset for which bouncing motion with the given j is possible. This is the set with r ≥ j * −1 (j). Hence dH j /dr = − ∂J/∂r ∂J/∂h is negative (except on the magnetic axis, but that is relevant for only j = 0). Its level sets are axisymmetric, and the bananas precess at constant rate.
If axisymmetry is broken then Σ deforms to a nearby surface, because ∂ z |B| = R −2 = 0, so the implicit function theorem applies. By the same argument, Σ 0 deforms to a nearby curve (on the surface) because ∂ R |B| = 1 C = 0. Furthermore, the magnetic axis deforms to a nearby closed curve if its rotational transform ι 0 / ∈ Z, by persistence of nondegenerate fixed points of the return map of fieldline flow to a transverse section. Recall that for this example, ι 0 = 1/C ∈ (0, 1). If ι 0 / ∈ Z/2 then the perturbed magnetic axis remains elliptic. But typically the new magnetic axis is not contained in Σ. All fieldlines intersect Σ + and Σ − alternately, except for intersections with Σ 0 . The intersections with Σ 0 are tangential to Σ so produce no crossing except at special points where the contact is of odd order.
The level sets of H j deform smoothly except near Σ 0 and near the cases of heteroclinic orbits from Σ − to Σ − . Thus a lot of the motion remains similar to the axisymmetric case, with precessing bananas, but the motion is qualitatively different near Σ 0 and near the former heteroclinic cases; the result in both cases is transitions between bouncing and passing. Figure 9 shows an example of Σ for a perturbed tokamak, including the effect on Σ 0 and level curves of |B| on Σ. For this, we consider the physical components of a vector potential A ε = (0, εzR cos φ, 0) in cylindrical polar coordinates and perturb the axisymmetric field by adding B ε = ∇ × A ε . We give expressions for |B| and |B| in Appendix J for sake of completeness.
Note from the righthand panel that some of the level curves of |B| on Σ cross from Σ + to Σ − . From this we deduce (following the discussion in the previous section) that some short bouncers (deeply trapped) drift into Σ − where they become unstable.
Computation of  on Σ − is complicated by the fact that the unperturbed ZGC trajectories from Σ − are mostly heteroclinic (reaching critical points at both ends) rather than homoclinic (returning to the same critical point). Under small perturbation, the values of |B| at the two crossings of a fieldline with Σ − in general become different. This implies that under small perturbation they may bounce just before reaching Σ − again or they may cross Σ − and make another poloidal revolution before approaching Σ − again. They may bounce there or cross again, etc. There may even be trajectories that never bounce. Thus Σ − is divided into many components labelled by the number N ∈ {0, 1, 2 . . .} of crossings with Σ − before bouncing. They are separated by the curves on which the fieldline from the given point of Σ − crosses Σ − some number M of times at lower values of |B| than it started and then reaches Σ − at a point with exactly the same value of |B| as it started. On moving the starting point on Σ − across the curve labelled by M , the number N changes from something less than M to something at least M , but they are not necessarily neighbours. The function  has jump discontinuities at these curves. Interpreting this picture becomes challenging, though the simplest option is just to keep the component 0 and consider the rest of Σ − to go to the "circulating" class.
In addition, one should remember that different functions  ± are defined for each direction along the field, so one should plot two pictures of level curves of , and that the circulating classes for the two are in opposite directions.
Near Σ 0 , the picture is particularly intricate. Analysis of what happens near C 4generic Σ 0 is carried out in Appendix E. In particular, near a generic point of Σ 0 there are short bouncing segments in the well of a cubic (see Figure 24) and they make transitions to longer ones and vice versa. Necessary and sufficient conditions for the short bouncing class to make no transitions are derived in Appendix E. Also treated there are the longer bouncing classes that come close to Σ 0 . Breaking of axisymmetry in a tokamak induces many transitions between classes and hence some form of mixing in the core. Mixing in the core is not necessarily a bad thing, however; [Bo] makes the case for "annular confinement".
The picture will become clearer when we go to the exact version of the theory in the second half of this paper. The effect of the drifts in FGCM is in general to replace the equilibria of ZGCM by periodic orbits. We will analyse the effects of this after introducing strong isodrasticity, but for now, we conclude that for typical perturbation of the tokamak example, at any value of ψ there are in general repeated transitions between different types of bouncing trajectory and passing trajectories. They can produce relatively rapid diffusion in ψ, which is bad for confinement. Hence the desire to make the field isodrastic.
The same issue about critical bouncers being critical at both ends rather than just one end holds for all quasi-symmetric fields. This is because for a quasisymmetric field |B| is constant along the lines of the symmetry field.

Realisation of weak isodrasticity
Can isodrastic fields be realised, outside omnigenity? Here we show how one can construct many weakly isodrastic mirror fields that are not omnigenous.
Firstly, we construct such examples in the form of field strength as a function of fieldline coordinates. Then we prove under some additional conditions that such a function can be realised by a divergence-free field in Euclidean R 3 . 4.1. Construction in fieldline coordinates. By fieldline coordinates for a magnetic field, we mean a pair of fieldline labels u, v with independent derivatives, and (signed) arclength s along the fieldline from a transverse reference surface.
Choose a positive C 2 function h on a disk with coordinates (u, v), with a nondegenerate minimum at (0, 0) and no other critical points, so its level sets are nested closed curves around the origin. It will represent |B| on Σ − . Extend h to a C 2 function B of (s, u, v) for an interval of s with B(0, u, v) = h(u, v), such that along each line of constant (u, v), B has a non-degenerate local maximum at s = 0, a minimum at some s m (u, v) > 0 and a first point s b (u, v) > s m (u, v) at which B = h with positive s-derivative. B will represent |B| along the fieldlines. The functions s m and s b are to be chosen C 2 . Most importantly, we require also that There is a lot of freedom in these choices.
Any magnetic field realising such a function B is weakly isodrastic, but in general it is not omnigenous. Fields realising B automatically have a flux function, namely the value of h as a function of fieldline labels (u, v). By the isodrastic condition, any other flux-function has to have the same flux surfaces. Thus it is omnigenous iff in u, v) has to have the same value along each line with the same value of h. It is easy to make counterexamples.
Concretely, let us take B to be a cubic function of arclength along each fieldline: along the fieldline starting from (r, θ) in polar coordinates on Σ − . One can take a and b in the form of polynomials a(r, θ) = N n=0 a n r n e inθ , for example (N = 1 suffices). Then h = c + r 2 has level curves r = constant, and Thus we have an isodrastic field iff b 2 ∝ a 5/2 on r = constant. But the well depth (difference in |B| between the local maximum and minimum) is 4a 3 27b 2 , which can be varied independently of keeping  constant on r = constant.
Let f (r 2 ) = 1 + r 2 1+r 2 and where α ≥ 0 is a real parameter (a and b are expressed here in Cartesian rather than polar coordinates, to facilitate seeing that the result is smooth). For the field B = 1 u,v) , consistent with the above discussion. The separatrix action (u, v) is given by This field is therefore weakly isodrastic for each value of α, as illustrated in Fig. 10. On the other hand, the field strength at the local minimum s m is given by which is not constant along the circles h = const. when α = 0, as shown in Fig. 11. It follows that the field is not omnigenous for nonzero α. In fact the field is omnigenous Figure 10. Contours of h = |B| and  on Σ − for the isodrastic mirror example with α = 0.3. Figure 11. B-contours on the flux surface √ u 2 + v 2 = 1.2 when α = 0.3.
iff α = 0. Also, Figure 11 shows that this field is not pseudosymmetric; see Appendix D. Finally, note that a field that is not omnigenous cannot admit a rotation symmetry. Therefore our construction produces a truly 3-dimensional weakly-isodrastic field.

4.2.
Realisability by divergence-free field in Euclidean space. In this subsection we address the general question whether a given function B of fieldline coordinates can be realised as the field strength along the fieldlines of a divergence-free field in R 3 . We obtain a positive answer locally if the function is analytic. We suspect that analyticity is not really necessary, but it suffices for realising examples, such as the cubic examples of the previous subsection. We believe also that although the result is local, if c is large enough then it applies at least as far as s = a/b, thus containing all the bouncing trajectories of those examples.
Our construction makes a closed two-form β representing the magnetic flux-form for a magnetic field B. Given a volume-form Ω, it is immediate from this to construct B as the unique vector field such that i B Ω = β, and β being closed is equivalent to B being divergence-free.
To make sure that β is closed we will find a diffeomorphism φ from a neighbourhood V of (0, 0, 0) in (s, u, v) space to a neighbourhood U of a point p in physical space such that the pullback φ * β = du ∧ dv. This makes u and v locally into Clebsch coordinates for the field. Then β is closed because du ∧ dv is closed. φ * β = du ∧ dv looks like a large constraint. It could be relaxed to φ * β = f (u, v)du ∧ dv for any positive function f , but we will see that there is still an immense amount of freedom for the construction.
The two remaining constraints are that ∂ s φ should be a unit vector (s is supposed to represent arclength) and that |B| • φ = B (which can be written as φ * |B| = B).
Although the case of interest is R 3 with the standard Euclidean metric, our construction can be done in an arbitrary 3D Riemannian manifold, so we start with the more general formulation.
Definition 3. Let β be a nowhere-vanishing closed 2-form on a Riemannian 3-manifold (M, g). Use (s, u, v) to denote the standard Euclidean coordinate system on R 3 . A system of Clebsch coordinates for β near p ∈ M comprises a pair of open sets, U ⊂ M containing p, and V ⊂ R 3 containing 0, together with a diffeomorphism ϕ : V → U such that Definition 4. Let B : V ⊂ R 3 → R be a positive smooth function. We say B is realizable on a Riemannian 3-manifold (M, g) if there exists a nowhere-vanishing closed 2-form β on M and a system of Clebsch coordinates (p, U, V ) for β such that ϕ * |β| = B.
Here |β| denotes the pointwise norm of β defined by the inner product of 2-forms induced by g.
Theorem 3. Let g = dx 2 + dy 2 + dz 2 denote the standard flat Riemannian metric on Proof. First suppose that B is realizable. Then we have a nowhere-vanishing closed 2-form β on M and a diffeomorphism ϕ : V → U : (s, u, v) → (X, Y, Z) such that ϕ * β = du ∧ dv, g(∂ s ϕ, ∂ s ϕ) = 1, and ϕ * |β| = B. Let Ω = dx ∧ dy ∧ dz denote the Euclidean volume form on U and let e s denote the standard basis vector along the s-axis in V . There is a unique nowhere-vanishing vector field B on U such that ι B Ω = β. We also know that the vector field ∂ s = ϕ * e s on U satisfies ι ∂s β = 0, since Since the null space for β is one-dimensional by hypothesis there must therefore be a smooth function α with ∂ s = α B. But since ∂ s must be a unit vector. Therefore the function α is given by α = 1/|B| = 1/|β| (using the standard result that |β| = |B|). We arrive then at the useful identity Upon introducing the Jacobian determinant we may express the pullback of the previous identity along ϕ as which implies J B = 1. This formula, together with the condition recovers the desired system of PDEs (21) and (22).
Conversely, suppose that ϕ = (X, Y, Z) satisfies (21) and (22). Since B is nowherevanishing, ϕ is a local diffeomorphism. By restricting to an appropriate open set we may therefore assume it is a diffeomorphism onto its image. The diffeomorphism ϕ defines a system of Clebsch coordinates for the 2-form β = ϕ * (du ∧ dv) by (22). But since J B = 1 by (21), we may also write β as Using |ϕ * e s | = 1, we therefore have which says that B is realizable.
containing the origin and real analytic functions Proof. The proof is an application of the Cauchy-Kowalevski theorem [RR]. There is still a lot of freedom, so we will establish existence of a solution with Z = v. For such solutions the PDE system reduces to We will therefore also restrict our search to solutions near (X 0 , Y 0 ) (this means B is principally along the X-direction).
Assuming (X, Y ) is close to (X 0 , Y 0 ), we may reformulate the reduced system by solving for ∂ s X and ∂ s Y according to Since the right-hand-sides of these formulae are real analytic near  the Cauchy-Kowalevski theorem implies that the initial value problem 0 u, has a unique analytic solution in some open neighborhood of (s, u, v) = 0 in R 3 . This X and Y , together with Z = v, comprise the desired solution of the original PDE system (23)-(24).
Estimating a neighbourhood in which the Cauchy-Kowalevski theorem applies requires some work. An example where this has been done is [GRR], in which a given analytic magnetic field on a given 2D region with analytic boundary is proved to have a vacuum field extension to a neighbourhood.
Perhaps there are alternative proofs not requiring analyticity, but the results are likely to still be local in character.
The real challenge is to make a non-trivial isodrastic stellarator field. That will have to wait for a future publication. The cases of heteroclinic and homoclinic connections have to be addressed.
We close this section by commenting that, despite claims in the literature, it is not clear whether omnigenity can be realised outside axisymmetry. References like [CS, PCHL] construct |B| as a function of Boozer coordinates, but it is not evident that one can realise an arbitrary such function as the strength of a divergence-free field in R 3 .

Strong Isodrasticity
The treatment of weak isodrasticity rests on assuming conservation of the adiabatic invariant j, but that assumption fails near the transitions.
So now we develop a version that does not assume conservation of j. We derive an exact condition for absence of transitions, which we call "strong isodrasticity". We illustrate it in Section 6 and elaborate on the theory in Section 7. In Section 8 we recover the results for weak isodrasticity as a first-order approximation.
The key idea is that Σ − ×{v = 0} is an approximate normally hyperbolic submanifold for guiding-centre motion. A normally hyperbolic submanifold (NHS) for a dynamical system is an invariant submanifold such that any tangential contraction or expansion is weaker than normal contraction or expansion, respectively (precise specification of this property is technical, see [Fe, HPS], or [Ku] for a tutorial). An approximate NHS is a submanifold that is close to being tangent to the vector field and similar tangential versus normal contraction and expansion comparisons hold.
It follows from the theory of NHS that there is a locally unique true NHS N − near Σ − × {0} for the guiding-centre dynamics. In general, computing the true NHS near an approximate one is hard, but in this context the approximate NHS consists of equilibria so is a "slow manifold" and there is an algorithm to compute higher-order slow manifolds to arbitrary order (see [M04] for an indication of how to get started, though higher than first order is less straightforward than that reference would lead one to believe, and [BH] for more). Furthermore, in our context, for µ > 0 the system is Hamiltonian and the initial slow manifold is symplectic (meaning that the symplectic form is non-degenerate on it) and there is a streamlined procedure to compute arbitrarily high-order symplectic slow manifolds ( [M04] with the same caveat). Even more, in our context, the resulting NHS has only 1DoF so consists principally of periodic orbits plus some equilibrium points and homoclinic or heteroclinic orbits between them.
For this discussion, assuming µ = 0, it is simplest to treat FGCM in a scaled time τ = µ m t, scaling velocity to u = m µ v and magnetic moment toμ = m e 2 µ. Then FGCM and is the Hamiltonian dynamics of the scaled Hamiltonian and symplectic form This scaling reduces the set of parameters to just √μ . For the excluded limiting casẽ µ = 0, the inverse square root inω looks singular, but recall that it is the inverse of the symplectic form (the Poisson bracket) that gives the dynamics; the Poisson bracket is degenerate atμ = 0 leading to the conservation of fieldline (this is a case of Casimirs for degenerate Poisson brackets, e.g. [MR]). Thus the dynamic forμ = 0 is a well defined case, namely the motion of a unit mass in potential |B| along each fieldline. This scaling also allows one to extend to higher-order guiding-centre approximations (with corrections toH andω) that are relevant for high energy, in particular for the α-particles produced by D − T fusion. One could also non-dimensionalise arclength s by a typical lengthscale for variation of B, B by a typical field-strength B 0 , andμ by 1/B 0 , but little is gained by this.
Applying the symplectic slow manifold method of [M04] to leading order in √μ produces N − as a graph over Σ − (see Appendix K). There is a displacement tangent to Σ, which plays negligible role, and a scaled parallel velocity Approximations to N − can alternatively be computed by expanding and solving the PDE expressing invariance of a graph to desired order (this may appear in a separate paper). The dynamics on N − is given by the restrictions of the guiding-centre Hamiltonian and symplectic form to it (the restriction of the symplectic form is non-degenerate). Being 2D, the bounded trajectories are mostly periodic, the exceptions being equilibria and trajectories connecting them.
NHS have forward and backward contracting submanifolds W ± (usually called stable and unstable manifolds respectively, but that terminology is inconsistent with the concepts of stable and unstable sets), consisting of the set of points whose trajectory in the stated direction of time converges to the NHS. They are made up of sub-submanifolds W ± (x) for each point x of the NHS (Arnol'd's ingoing and outgoing "whiskers" [Ar]), consisting of the set of points whose trajectory in the stated direction of time converges together with the trajectory of x.
To prevent transitions, the main part of our strong isodrastic condition is that the relevant branches of W ± coincide, forming "separatrices": invariant submanifolds that separate motions of different types. A familiar example is the separatrices 1 2 p 2 = 1−cos θ for the pendulum, which separate librating motion from rotating motion. There the NHS is just a saddle point in 2D, but the same idea extends to higher dimensions (2D NHS in 4D in our case).
Perfect separatrices are achieved by integrable systems. In 2DoF, integrability corresponds to a continuous symmetry. In the GCM context, integrability is implied by quasisymmetry [BKM], but perfect quasisymmetry is perhaps not achievable outside of axisymmetry. Even if one allows a velocity-dependent symmetry (as in weak quasisymmetry) [BKM2], we are not aware of any exact examples.
Integrable systems are not the only way to obtain perfect separatrices in Hamiltonian systems, however; there are constructions with perfect separatrices that are not integrable (see Appendix L). Thus there is hope that one might be able to achieve this for GCM.
A little care is required in the above construction of N − , however, because the theory of NHS requires |B| ≤ c for some constant c < 0 (depending on the perturbation size), so it fails near the boundary Σ 0 of Σ − (if it has boundary). For cases with no Σ 0 , like the mirror machine, nothing needs doing, but for cases like the tokamak, one has potentially to exclude a neighbourhood of Σ 0 in the construction of N − . Indeed, as will be described in the next section, when µ is turned on, N for this example truly develops a gap around Σ 0 . Nonetheless, we will see that a good understanding of N − can be obtained.
To complete the strong isodrastic condition, we have to deal with the issue that if N − has a poorly defined edge then GCM trajectories on it might fall off its edge. So we require that the above mentioned potential failure of continuation of Σ − to N − near Σ 0 does not occur. Specifically, we ask for Σ −0 × {0} to continue to an invariant submanifold N −0 with boundary consisting of a NHS N − and its boundary N 0 . For N − to be invariant, N 0 has to also be invariant. Being 1D , the invariance condition for N 0 is just thatH = |B| + 1 2 u 2 is constant on it. We suspect that the above problem does not occur if (i) Σ 0 is generic, as per Appendix E, and (ii) |B| is constant on it, as for weak isodrasticity, but have not established this (see Appendix M for some discussion). So for present purposes we make the following definition.
Definition 5. A magnetic field is strongly isodrastic if Σ −0 ×{0} continues to a maximal invariant submanifold N −0 with boundary for guiding-centre motion for a range of √μ > 0, which can be decomposed into normally hyperbolic N − and its boundary N 0 , the relevant branches of the contracting submanifolds of N − coincide, andH is constant along N 0 .
In the case that there is no Σ 0 and hence no N 0 , the continuation is guaranteed, so strong isodrasticity is just the coincidence of W ± . This applies to many mirror fields. But in tokamak and quasisymmetric stellarators, Σ 0 is an essential feature and thus its continuation to an invariant N 0 and the continuation of N − right up to N 0 is an additional consideration for isodrasticity.
In the next section, we will illustrate how strong isodrasticity is in general lost for perturbations of axisymmetric mirror and tokamak fields. This will lead to a quantification of failure of isodrasticity that could be useful for reducing it.
The definition allows also for use of higher-order guiding-centre approximations, relevant to the alpha-particles generated by fusion of D and T for example.

Illustrations of the exact picture
We illustrate the ideas of the previous section (construction of normally hyperbolic submanifolds for FGCM and their contracting manifolds) by a mirror machine and tokamak again.
6.1. Mirror machine. For a mirror machine of the type described in Section 3.1 (not restricted to axisymmetry), there is a non-degenerate saddle point of |B| near the centre of each coil. For the gradient field ∇|B|, each of them has one-dimensional downhill subspace and two-dimensional uphill subspace. They give unstable equilibrium points of guiding-centre dynamics with v = 0. They are each surrounded by a family of periodic orbits of guiding-centre motion, called Lyapunov orbits, which form the 2D centre manifold of the equilibrium point. The periodic orbits are hyperbolic and the centre manifold is normally hyperbolic. This is a case of a general phenomenon for Hamiltonian systems with an index-one saddle, understood by Conley in the context of celestial mechanics [Co]. The forward contracting submanifold of the periodic orbit at given energy separates trajectories that bounce from those that pass over the saddle. The flux of energy-surface volume passing over the saddle at given energy is the action of the corresponding periodic orbit [M90]. This "flux over a saddle" picture is the basis for the current subsection. [RBNV] validated the flux formula of [M90] on a 2DoF four-well potential energy surface, using a numerical method for computing hyperbolic periodic orbits and their forward and backward contracting submanifolds that we shall use again here.
The field for the two-coil example used earlier involves elliptic integrals, which turned out to be tedious to deal with for the exact approach. So we switched to a mirror field based on [G+]. After scaling the field strength to 1 at r = 0, z = π 2k , theirs is an axisymmetric vacuum field B z = 1 − a cos kz I 0 (kr), B r = −a sin kz I 1 (kr), B φ = 0, with I j being modified Bessel functions. A corresponding vector potential, written as a 1-form, is A φ dφ, with A φ = 1 2 r 2 − r a k −1 cos(k z) I 1 (k z). We take a ∈ (0, 1) to avoid introducing nulls on the axis. The field is periodic in z but they consider one period to be a model for a mirror field. |B| has saddles at z = ±π/k.
For realism one should concentrate on just a core r < R(z) for some relatively small function R. In particular, the field has a ring of nulls on z = 0 at the radius r 1 such that I 0 (kr 1 ) = 1/a and R(0) should be taken less than r 1 (the same issue of nulls arises for the two-coil example).
To simplify treatment of transitions, we add a similar vacuum field of twice the period to break the reflection symmetry about z = 0, so B z = 1 − a cos kz I 0 (kr) − aη sin kz 2 I 0 ( kr 2 ), B r = −a sin kz I 1 (kr) + aη cos kz 2 I 1 ( kr 2 ), with η ∈ (0, 4), thereby making the upper saddle weaker than the lower one so that we can study transitions involving passing through the top alone, as for the two-coil example. We restrict to a < (1 + η 2 /8) −1 so as not to introduce zeroes on the axis. A potential for this field is given by A φ dφ, with A φ = 1 2 r 2 − r a k −1 cos(k z) I 1 (k z) − 2 k −1 r aη sin(k z/2) I 1 (k r/2). The saddles remain at z = ±π/k; indeed there is still reflection symmetry about z = ±π/k. In particular, Σ − is the two planes z = ±π/k. The parameter k can be scaled to any desired value; we will take k = 2.0 in Figures 12-14.
The saddles have |B| = B ± = 1 + a(1 ∓ η). The saddle at z = +π/k is the weaker one and it is the one on which we will focus attention. It is surrounded by a family of periodic orbits of guiding-centre motion. Indeed the plane z = π/k, v = 0, is invariant and the dynamics is a drift around the axis. Thus at energy E > µB + there is a periodic orbit at z = π/k with radius such that |B| = E/µ. If also E < µB − then the region accessible to the guiding centre has the form of a bottle (Figure 12(a)). This is somewhat irrelevant though, because the bottle contains the above-mentioned ring of nulls, whereas a realistic mirror machine would look like only a smaller core r ≤ R(z) of the field. So we should retain only the features that the level set of |B| has an annular neck A and a roughly horizontal disk D at the bottom (Figure 12(b)). Around the neck of the bottle is a periodic orbit of FGCM, given by the intersection of the bottle with Σ − . It is hyperbolic. We plot it in Figure 13(a) together with its forwards and backwards contracting submanifolds W ± in projection to physical space, up to the first bounce. W − is plotted by releasing initial conditions with the same energy slightly below the periodic orbit and integrating forwards in time. The system has time-reversal symmetry under simultaneous change of sign of t, v and φ. Thus, W + is just the time-reverse of W − . Consequently, in this projection, W ± coincide. In phase space they are distinct, having opposite signs of v , but at the first bounce they merge and thus form a perfect separatrix. It separates trajectories that bounce periodically in the mirror machine from those that enter downwards via the neck, make one bounce and then leave via the neck. Then we break axisymmetry by adding (in contravariant components) This can be generated from 1-form A r dr with A r = εr 2 cos φ cos kz, by The hyperbolic periodic orbit has a locally unique continuation, which no longer has v exactly zero, but as the average velocity in the field direction is zero, it crosses v = 0 at least twice in a period. Thus it forms a closed loop slightly inside the neck but touching it at at least two points. We compute it by finding a fixed point of a return map. Then we compute its contracting manifolds, as for the axisymmetric case. This time they do not coincide. In particular, their curves of first bounce do not coincide, as indicated in Figure 14. Transitions between bouncing and escape are now possible. In a range of energies a small amount above the saddle compared to the breaking of axi-symmetry, it is possible for W ± to completely miss each other at the first bounce, as illustrated in Figure 15.
To understand Figures 13-15 better in phase space, it is essential to resolve the twoto-one nature of the projection of an energy level to physical space. Let P be the region |B| ≤ E/µ of physical space, restricted to a core of the form r ≤ R(z) and to z between the disk D and a little above the neck (recall Figure 12(b)). To each point of P except on D and A there are two velocities ±v with the given energy. On D and A, these merge into a single value v = 0. In a neighbourhood of D × {0} the energy surface is diffeomorphic to D × I, with I being an interval of velocities containing 0. This is because ∇|B| = 0 on D so we can label nearby points of the energy surface by pairs (x, v) ∈ D × I, by flowing along −∇|B| from x to the locally unique point where |B| = h − mv 2 2µ . Similarly, in a neighbourhood of A × {0} (again cf. Fig. 12), the energy surface is diffeomorphic  Figure 15. (a) Projection to physical space of the hyperbolic periodic orbit for a non-axisymmetric case with a = 0.5, η = 1.5, ε = 0.1, E = |B| + + 2 × 10 −5 , (|B| + = 1.11887),μ = 10 −2 , and some trajectories on its backwards (magenta) and forwards (green) contracting manifolds up to the first bounce; (b) the traces of the first bounces for the contracting manifolds.
to A × I. Since ∂ r |B| = 0 on A, one way to explicitly realize such a chart is simply by projecting into (φ, z, v ) coordinates; the coordinates (φ, z) parameterize A and v parameterizes I. To visualize the local stable and unstable manifolds with this chart we introduce the variables Z = z, X = (2 + v ) cos φ, Y = (2 + v ) sin φ for convenience and display the results in Figure 16. Away from A∪D, we take two copies of physical space, one for each sign of v . The whole energy surface (restricted to r ≤ R(z)) is obtained by overlapping these charts. In principle, one can choose a global coordinate system for the energy surface (at least for the part projecting to P ). For example, choose coordinates (x, y) on D ∪C ∪A, where C is the cylinder r = R(z) connecting D to A (one can make them smooth by suitable choice of the function R); choose a vector field U transverse to D ∪ C ∪ A inwards and nowhere zero, so that every point of P is reached uniquely by flowing along U for some time t > 0 from a point (x, y); then given v at this point, let s = sign(v ) √ t. Then (x, y, s) are global coordinates for this part of the energy level, but there is a lot of choice and no obvious one to settle on.
One way to do something like this is to write (x, y) = t(z)(X, Y ), v = s(z)W , for some positive scaling functions s and t to be chosen as functions of z. The origin of Figure 16. Samples of stable (blue) and unstable (red) manifolds attached to the hyperbolic periodic orbit at energy level E = 1.061155 in (X, Y, Z) coordinates described in the text. Other parameters include e −1 = 0.1, = 0.01, m = 1, µ = 1.1, a = 0.5, η = 1.1, k = 2. Samples of stable manifold were generated using a rejection-sampling technique. The energy level is sampled randomly from a distribution whose spatial projection is uniform. Each sample is evolved forward in time using a well-resolved Runge-Kutta integration. If a sample stays in the window π/k − 0.2 < z < π/k + 0.2 for a fairly large time (∆t = 6 in this case) it is kept. Samples on the unstable manifold are generated in the same manner, but with backward-in-time integration.
(x, y) should be chosen to be the lowest point on the surface |B| = E/µ (or the scaling extended to include a shift). Then for suitable s and t, the energy surface is a graph z = Z(X, Y, W ). To see how and when this can be done, consider the equation for the energy surface in the scaled variables: 1 2 s(z) 2 W 2 + |B|(t(z)X, t(z)Y, z) = E/µ. The z− derivative of this expression is s sW 2 + t (X|B| ,x + Y |B| ,y ) + |B| ,z = s s v 2 + t t (x|B| ,x + y|B| ,y ) + |B| ,z . At the bottom of the accessible region, |B| ,z < 0 and the other two terms are zero. We can make it negative on the whole of the boundary of the accessible region by choosing t t sufficiently negative in the parts where |B| ,z ≥ 0; this assumes x|B| ,x + y|B| ,y > 0 away from the bottom, which is true for mild perturbations from axisymmetry. Lastly, we can make it negative for v = 0 by choosing s s sufficiently negative in places where what was constructed before is not already negative. Then by the implicit function theorem, the energy surface is a graph z = Z(X, Y, W ). The function Z has a minimum at (0, 0, 0) and its other level sets are topologically two-spheres.
We did not yet implement such a choice of global coordinates, but following this idea, it is convenient in sketches to represent v = 0 in the energy surface as a horizontal plane, with v > 0 above the plane, v < 0 below it. We put D × {0} at the centre of this plane and A × {0} as a concentric annulus. Between them is a torus T representing points on r = R(z) with z between the two components of |B| = h, and outside A is a cylinder C representing points on r = R(z) above the torus component. This is illustrated in Figure 17. Also, denoting by π the map from H −1 (E) to physical space, is a double cover of the disk Σ − ∩ P glued along its boundary Σ − ∩ A; so it is a sphere. A neighbourhood of S can be obtained in the form S × I for an interval I representing height z relative to Σ − ; this is known as the Conley-McGehee representation (see for example the sketch in [M90], and [KW]).
As a first pass, the region of the energy surface between S and T might be considered to be the states that are inside the machine. The upper hemisphere has v > 0 so one might say it consists of states that are just exiting the machine; but this is not quite right because guiding-centre drifts could compensate for small v, so we will give a dynamical construction shortly. Similarly, the lower hemisphere has v < 0 so consists of states that are just entering the machine, modulo the drift corrections. On T , states may enter or exit, but this is an aspect that we do not address here, as our focus is on transitions between different classes of guiding-centre motion. Now we come to the dynamics in the energy surface. FGCM has a periodic orbit γ which is the continuation of the circle π −1 (Σ − ∩ A) of equilibria for ZGCM. It is the Lyapunov orbit with energy E for the upper saddle. In general it does not have v identically zero, but v oscillates about 0 on it. It is hyperbolic, so it has forward and backward contracting manifolds W ± . It is possible to span γ by a surface diffeomorphic to a sphere that is transverse to the dynamical vector field except on γ. It is a perturbation of the sphere S, so we denote it by the same symbol. It is non-unique but an essential feature is that it lies in the sectors between W ± indicated in Figure 18. It is called a Figure 18. Dynamics in the energy surface H −1 (E), showing the periodic orbit γ, the contracting manifolds W ± and the transverse hemispheres S ± , in the case when the first intersections of W ± with v = 0 miss each other. Inside the sphere S, we show the trajectories below the horizontal as dotted and above as dashed while escaping out of the sphere is shown as continuous.
dividing surface, because the upper hemisphere S + has unidirectional flux from inside to outside, representing guiding centres leaving the machine, and the lower hemisphere S − has unidirectional flux from outside to inside, representing guiding centres entering the machine. The manifolds W ± form tubes that in one direction go inside the sphere. They necessarily cross v = 0 transversely, representing bounce on the disk D. Their first intersections with v = 0 are diffeomorphic to two circles. In the axisymmetric case, they coincide, but if axisymmetry is broken then they need not coincide. Generically for h only slightly above h 0 they miss each other entirely, as shown in Figure 18. This will be justified at the end of the subsection. In this case, we see that all the flux entering the sphere transitions to bouncing trajectories (in fact, making at least three bounces) and all the flux leaving the sphere came from bouncing trajectories (making at least three bounces). We call them "trapping" and "detrapping" fluxes, respectively, to align with standard terminology. The fluxes of energy-surface volume are equal and can be expressed as the action integral of γ: S = γ α, α = eA + mvb , or S = γ (eA + mvb) · dx, where A is any vector potential for B.
in, single bounce and out trapping detrapping Figure 19. Case when the first intersections of W ± (γ) with the bounce surface {v = 0} intersect. W ± not drawn, except for their first intersections with D, to give priority to the lobes on the dividing surface. Same convention for the trajectories as in Fig. 18.
For larger h, the two circles of first intersection of W ± with the bounce surface {v = 0} might intersect, as in Figure 19. We have drawn the simplest case of two intersections, but of course there could be more. The trajectories of the intersections are homoclinic to γ. They wrap onto γ in both directions in time. Now there are three types of flux. Firstly there is the trapping flux across a lobe on the upper hemisphere that crosses v = 0 in the indicated lobe and turns into a bouncing trajectory (at least three bounces). Secondly, there is the detrapping flux that comes from bouncing trajectories that cross v = 0 in the other indicated lobe and exit the lower hemisphere via the indicated lobe. Thirdly, there is the remaining flux across the upper hemisphere, that passes through the intersection of the disks bounded by the circles on v = 0 (where they perform one bounce) and then exit via the rest of the lower hemisphere. These are single-bounce trajectories.
The fluxes of energy-surface volume are related to the actions of the homoclinic orbits and of γ. Namely, the trapping flux is the difference in action between the homoclinic orbits at the ends of the corresponding lobe. The action of a homoclinic orbit does not converge, but the difference in actions of two homoclinic orbits to the same periodic one does, as long as one takes the end points to converge together, and that is what is assumed in this statement. The detrapping flux is equal to the trapping flux because they are both given by the difference in action of the same pair of homoclinic orbits. The single-bounce flux is the difference between the action of γ and the trapping flux.
To complete this discussion, we explain why generically the circles miss each other for h only slightly above h 0 . In the limiting case h = h 0 , the periodic orbit γ shrinks to an equilibrium point, namely the saddle of |B| near the centre of the top coil. The accessible region can be considered as an inside and an outside that are pinched together in a conical point at the saddle. The part of the energy surface corresponding to the inner part is a double cover of a sphere with a conical singularity at the saddle, hence it is a 3-sphere with a conical singularity. The "shrinking" of γ to this conical point corresponds in our representation to γ growing to a circle at infinity: think of how a small circle around the north pole maps under stereographic projection to a large circle in the plane tangent to the south pole. The saddle has 1D contracting manifolds. In one direction they pierce v = 0. Generically they pierce it in distinct points. Now deform the picture to h > h 0 : the points expand to small circles, but still miss each other for small h − h 0 . 6.2. Tokamak. For an axisymmetric tokamak, Σ × {v = 0} also persists to an invariant submanifold for µ > 0, but only after taking its union with another invariant submanifold that crosses it transversely, corresponding to closed guiding-centre trajectories, and allowing the crossing to break generically. The result was already illustrated in [M94]. It is illuminating to analyse this example in some detail. This analysis will demonstrate how and why strong isodrasticity generically breaks in magnetic configurations close to tokamaks.
We use the magnetic field of Section 3.2. For the guiding-centre motion we choose to scale e, m to 1 and take H = 1 2 v 2 + µ|B|, ω = β + d(vb ). This is so that the effect of turning on small µ is on the Hamiltonian instead of the symplectic form, which makes it easier to understand. We can take an axisymmetric vector potential A for B, leading to β = dA . Its (physical) component A φ = ψ/R = 1 2 (r 2 + z 2 )/R, where r = R − 1 (and in case it is useful, one can take A z = −C log R, A R = 0). By axisymmetry, H p (r, z) = 1 2 A typical contour plot of H p for p > 0 and µ small is shown in Figure 20, which shows some banana trajectories and some circulating and counter-circulating trajectories, and makes clear it has three critical points. We look for critical points of H p because they generate axisymmetric periodic orbits for H (or exceptionally, a circle of equilibria). The ones of interest for isodrasticity are the hyperbolic ones, but for the moment we consider them all. Using ψ = 1 2 (r 2 + z 2 ) and R = 1 + r, we see the z-derivative is zero iff z = 0. The r-derivative on z = 0 is So critical points of H p are the solutions r of (33) (C 2 − r)p 2 + Rr(C 2 + r 2 )p + µ RC 2 (C 2 − r) √ C 2 + r 2 = 0, wherep = p − r 2 /2. We can consider this instead as an equation forp given r. It has real roots iff Rr 2 (C 2 + r 2 ) 5/2 ≥ 4µC 2 (C 2 − r) 2 , that is for r outside an interval approximately (−2 √ µC, 2 √ µC). The value ofp at the fold points isp = − Rr(C 2 +r 2 ) 2(C 2 −r)) ≈ −r/2. For r 1 but large compared with √ µC, the dominant balances are (i) the second and third terms, giving a solution approximatelỹ p = − µC r , and (ii) the first and second terms, giving a solution approximatelyp = −r. These can be converted fromp to v by (31), which gives v ≈p and so the same approximate formulae.
Thus we obtain an invariant manifold N consisting of circular orbits in z = 0, as shown in Figure 21. The parts with v ≈ − µC r are a small perturbation of Σ × {v = 0} away from the gap. The parts with v ≈ −r correspond to perturbation of another invariant submanifold for µ = 0, consisting of the circular periodic orbits in z = 0 for which the vertical components of the curvature drift and parallel velocity balance (this can be computed exactly if desired, but has the leading form v = −r).
To identify which parts are normally hyperbolic or elliptic, we have to do further work. The boundary between them is determined by the turning points of p φ along this curve of critical points. There is just one such turning point and it is near r = −(µC) 1/3 . To see this, δp = 0 iff δp = −r δr, divide (33) by C 2 − r for convenience and differentiate the result to obtain the condition for δp = 0, δr = 0: where the derivatives are with respect to r. For µ and r small this isp − r 2 + µC ≈ 0. This occurs on the lower left branch, wherep ≈ −µC/r, r < 0, as a balance principally between the first and second terms. Hence the result. This gives N 0 . Study of the type of the critical points of H p gives the decomposition of the complement into N ± as shown in the figure. One can compute N 0 as a curve in (µ, E, p) parametrised by r. Note that at N 0 , r ≈ −(µC) 1/3 ,p ≈ (µC) 2/3 , so p ≈ 3 2 (µC) 2/3 . Thus, the regime for which H p has three critical points is approximately p > 3 2 (µC) 2/3 . This invariant manifold N persists under breaking axisymmetry, because the degenerate case N 0 is an elementary saddle-centre periodic orbit. This is because the zero of equation (34) is transverse.
Having established the existence of the invariant submanifold N − , we now wish to understand its contracting submanifolds. In the axisymmetric case, p φ is conserved, so they form pairs of homoclinic connections, as can be divined from Figure 20. To consider the effect of breaking axisymmetry, we have to abandon conservation of p φ , but energy is still conserved. So the useful viewpoint is to consider the energy surfaces H −1 (E) and the level sets of p φ in them for the axisymmetric case and then consider the effects of losing conservation of p φ . This is a view already promoted in [M94] (though some aspects of the figures there are inaccurate).
Firstly, as in [M94], one can represent points of H −1 (E) in the axisymmetric case by (v, z) because given (v, z, E) there is at most one compatible r in r 2 + z 2 ≤ r 2 0 . To see this, R 2 C 2 + (R − 1) 2 + z 2 is positive for R ∈ (0, C 2 + 1 + z 2 ), which contains r 2 + z 2 ≤ r 2 0 < 1. Thus there can not be two values of R with the same value of (v, z, E).
We can obtain R explicitly in terms of (v, z, E) because the equation for R is a quadratic: To save studying more than one case for the sign of the coefficient of R 2 , let us suppose that C > √ 3. Then |B| > 1 for C 2 + (R − 1) 2 + z 2 > R 2 , i.e. for R < 1 2 (C 2 + 1 + z 2 ), which is true for all r 2 + z 2 ≤ r 2 0 . So E − 1 2 v 2 > µ. It follows that the quadratic has one positive real root R. Writing R = 1 + r it has Then p φ can be expressed in terms of E, µ, v and z, by rewriting Eqn. (30) using |B| = E µ − v 2 2µ = h − u 2 /2, and substituting the above expression for r into A contour plot of p φ in (u, z) for some µ, C, E = hµ is given in Figure 22. Note that the previous considerations restrict us to C > √ 3, h > 1 and |u| < 2(h − 1). The region in (µ, E) for which there are three critical points of p φ can be obtained from the previous analysis. The energies for particles near the magnetic axis and with small v, as on the branches v ≈ − µC r , are near µC, so it is natural to consider a further scaled version E = h C − 1. The scaled energy at the degenerate critical point N 0 is E ≈ 3 2 (µC) 1/3 . Degenerate critical points of p φ for given E correspond to degenerate critical points of H for given p, so we obtain that the region of energy for which p φ has three critical points is approximately E > 3 2 (µC) 1/3 . In the regime of E for three critical points of p φ , one is a hyperbolic point (corresponding to that for H p ). It has a positive value of v (it is born near v = (µC) 2/3 at E ≈ 3 2 (µC) 1/3 and for larger E has approximately v = µC/E). The rate of change of φ is given byB which at the hyperbolic point has each term positive (remember v > 0 and r < 0), soφ is positive there. Thus in the full phase space including φ, it corresponds to a hyperbolic periodic-orbit of guiding-centre motion. The hyperbolic point has two homoclinic orbits, corresponding to the joint level set of H and p φ containing the hyperbolic point. The mapping (31), considered as giving v from (r, z), shows that the joint level set forms a figure of eight in (v, z), arranged as in Figure 22. The closed levels of p φ for given E inside the right and left lobes of the eight correspond to positively and negatively circulating trajectories, respectively (actually, some of the negatively circulating trajectories bounce, as can be seen in the Figure from the sign change of u, but it is convenient to consider them as purely circulating in this exact treatment). Those round the outside of the eight correspond to bouncing trajectories. In the full phase space the figure-eight gives two homoclinic submanifolds to the hyperbolic periodic orbit. For generic perturbation they can be expected to break, producing homoclinic oscillations and a zone of chaos around them. See Figure 23. Note thatφ is not of fixed sign (its sign is essentially that of v), so one can not draw a first return map to φ = 0 modulo 2π. But one should still get a similar homoclinic tangle (which would be worth illustrating). This corresponds to repeated transitions between bouncing trajectories and trajectories circulating in either of the two directions. Note that trajectories on the counter-circulating side and the bouncing trajectories on the outside of the figure-eight still have a net drift in the positive direction if close to the separatrix. But further away, φ may change to negative. It looks likely that for each of the counter-circulating and the bouncing classes there is a level set of p φ for given E for the unperturbed problem for which φ is zero; generically they break into island chains in the φ direction, creating what are called "super-bananas", together with their own broken separatrices. The scenario merits more in-depth analysis from this point of view (and discussion of ripple-trapping). v φ z W + W − γ Figure 23. Expected picture for the contracting manifolds W ± of the hyperbolic periodic orbit γ after a perturbation breaking axisymmetry.
The conclusion for isodrasticity is that to be strong isodrastic, one would need to restrict perturbations from axisymmetry to preserve the pairs of homoclinic orbits to the hyperbolic periodic orbits. This requires two functions f ± : R × R × S 1 → R to be zero, f ± (E, µ, φ) representing the splitting of the separatrix on the right or left, respectively, for energy E and magnetic moment µ at phase φ. It is an open question how one might achieve this in general. Perhaps the helical fields of [KIVM] would give some insights.
We note that quasi-symmetry implies the same situation of double homoclinics to periodic orbits. Indeed, the analysis proceeds just by replacing conservation of p φ by conservation of K = −eψ + mv u · b, where u is the quasi-symmetry vector field.

Exact treatment of the splitting of separatrices
To decide whether W ± for N − coincide there is a standard technique, usually called Melnikov's method, though its origins go back to Poincaré [Poi]. It is usually presented in the context of time-periodic perturbation of an autonomous 1 DoF system, whereas we want it for an autonomous 2DoF system (nevertheless, see [Ro,HM82a,HM82b], for example). Furthermore, it is usually presented in the case that the unperturbed system has a manifold of hyperbolic periodic orbits with homoclinic connections, whereas here the unperturbed system has a manifold of equilibria with homoclinic orbits and the periodic orbits appear only as a result of the perturbation. It is often presented as a first-order calculation in some perturbation parameter, but there is an exact version, to be presented here. Furthermore, the integral of the Melnikov function between zeroes represents a flux of energy-surface volume making a given transition and the flux can be written as the difference in action between homoclinic orbits [MM].
The NHS N − is a graph of arclength and parallel velocity over Σ − . Σ − is transverse to the magnetic field, so we can label its points (and those of N − ) by labels for the fieldline through the point. We write such labels as (x, y) ∈ R 2 , often shortened to ξ ∈ R 2 . It is possible to choose them so that the magnetic flux 2-form β = dx ∧ dy, in which case (x, y) are called Clebsch coordinates, but that is not really necessary.
For FGCM with givenμ, W ± (ξ) for ξ ∈ N − follow close to the fieldline through ξ and with parallel velocity u = ± 2(H(ξ) − |B|). W ± (ξ) reach bounce points where u = 0, equivalently |B| =H(ξ), but in general at different points because they are on different fieldlines by that time. Define Ξ ± (ξ) to be the fieldlines on which W ± (ξ) find themselves at their first bounces. The splitting of the separatrices of ZGCM under the perturbation parameterμ is to do with the displacement from Ξ + (ξ) to Ξ − (ξ) in the surface of section {u = 0}. More precisely, given ξ ∈ N − let G(ξ) be its trajectory on N − ; then the minimum over ξ ∈ G(ξ) of the displacement from Ξ + (ξ) to Ξ − (ξ ) is an appropriate quantifier of the splitting at ξ, because if it is zero that would still make a homoclinic orbit, just with a phase change. As a global quantifier of the splitting, one can take the maximum over ξ. Note that, in general there is not a natural notion of subtraction in the space of fieldline labels, but one can make one locally. Furthermore, the limit for small separations makes sense as a tangent vector in the space of fieldlines, which we will use for first-order treatment.
For the moment, we stay with the exact treatment, and show how to compute Ξ ± (ξ), at least conceptually. The idea is to integrate the rate of change in fieldline along W ± (ξ). Note that W ± (ξ) are not trajectories in general. They are the sets of points whose forward/backward trajectory converges together with that of ξ ∈ N − . But because there is only one unstable/stable normal direction to N − , they are one-dimensional. Furthermore, W ± (ξ) are in the same energy level as ξ.
Because the dynamics on N − is two-dimensional and Hamiltonian, it consists of periodic orbits or exceptionally, equilibria or connecting orbits between equilibria. The tangent to W + (ξ) at an equilibrium ξ ∈ N − is the contracting eigenvector of the derivative of the vector field. An equilibrium in N − is actually in Σ − so its contracting eigenvector is relatively easy to find. The tangent to W + (ξ) at a point ξ ∈ N − of a periodic orbit on N − is the contracting eigenvector of Dφ T (ξ) where φ is the flow of FGCM and T is the period of the orbit. This is not so straightforward to compute but is feasible. For ξ on a connecting orbit on N − from one equilibrium ξ 1 to another ξ 2 , the tangent to W + (ξ) is given by flowing the contracting eigenvector at ξ 2 backwards along the connecting orbit to ξ.
The analogous constructions for the opposite direction of time give the tangent to W − (ξ) for all points ξ ∈ N − . For ξ ∈ N − an equilibrium, W ± (ξ) are trajectories, and they start along the fieldline with u = ± −|B| s to first order in s. The velocity dX/dτ can be pulled back to Σ − by the derivative of b (plus a correction to bring it tangent to Σ − ) and thus the point Ξ + (ξ) can be obtained by integrating this pulled-back vector field on Σ − . Similarly for Ξ − (ξ).
For ξ ∈ N − on a periodic orbit, if we already found a small piece of W + (ξ) near ξ then we can construct a longer piece by applying φ −T to it, where T is the period. So we can take a small piece given approximately by the straight line in the tangent direction at ξ and iterate it backwards by φ T . The forwards contracting manifold of a periodic orbit γ is the union of W + (ξ) over ξ ∈ γ.
Note that the choice of surface of section to be the bounce surface is not necessary, but is somewhat natural in this problem.
Note also that this exact picture gives an exact formula for the flux of energy-surface volume making a transition. The transitions are represented by lobes formed by the backwards and forwards contracting manifolds between successive intersections. The intersections correspond to homoclinic orbits. The flux of energy-surface volume across such a lobe is precisely the difference in action between the two homoclinic orbits. Here, the action of a curve is eA + mv b and to compare the actions of two homoclinic orbits one has to take the limit of long segments whose backwards endpoints converge together (and to the periodic orbit) and forwards endpoints converge together (and to the periodic orbit). This follows from [MM].

Melnikov functions
Here we will show that to first order the conditions for strong isodrasticity are precisely the weak isodrasticity conditions, and give a computational test for weak isodrasticity in terms of "Melnikov functions" that is simpler than that of Section 2. Finally, we interpret the Melnikov functions in terms of flux of phase-space volume making transitions. 8.1. Strong isodrasticity to first order. To compute Ξ ± to first order inμ, we can integrate the rate of change of fieldline label along the unperturbed trajectory of ZGCM, compensated by the rate of change of the fieldline label at the base point.
Recall that the equations to first order can be written in scaled time τ as and c = curl b (as already mentioned, c ⊥ can be written as b × κ with κ = b · ∇b). Let ξ be a coordinate on Σ − (e.g. x or y from the previous section) and extend it along the fieldlines to a fieldline label. Its derivative a = dξ (a covector) can be computed by setting a(0) = dξ on tangents to Σ − and a(0)b = 0, and integrating the adjoint equation along the fieldlines: where s is arclength. The covector a quantifies the linearised change in the fieldline label. For trajectories starting on N − , take h in (37) to be |B| on Σ − to first order. Then the first-order change ∆ξ in fieldline label ξ along W − is given by integrating the rate of change of ξ by the perturbed vector field along the unperturbed trajectory, while simultaneously subtracting off its rate of change at the base point. Thus at the initial point on Σ − . The subtraction of C compensates for the denominator u at the start of the integration, where u ∼ −|B| s for small s. The denominator does not give a problem at a generic bounce point because it behaves like √ s 0 − s there, which gives an integrable singularity; a tidy way to compute the integral accurately is to switch variable of integration from s to u near the bounce. The first-order change along W + is the negative of ∆ξ. So the displacement in the chosen fieldline label ξ from W + to W − is to first order It is a function of the initial point on Σ − and we call it the Melnikov function for the fieldline label ξ, by analogy with formulae for splitting of separatrices in other contexts. If one takes two independent fieldline labels x, y then the Melnikov function has two components M x , M y . This is particularly important when one takes initial points on or near a critical point of |B| on Σ − . Compare the discussion of the mirror machine in Section 6.1.
For a periodic orbit on N − , however, we are more interested in the displacement in fieldline label h = |B| where it crosses Σ − than the possible phase shift along the periodic orbit. Thus we specialise to take h as coordinate on Σ − and compute the resulting firstorder change in h. In this case several things simplify. Firstly, we can take a(0) = d|B| because h = |B| on Σ − and i b d|B| = 0 on Σ − . Secondly, C = 0. Then letting k = curl(ub) we obtain M h = ak ⊥ |B| ds and ab = 0 so we can drop the ⊥. Thus we obtain the simple formula Of course this hides the computation of k and also the fact that k is singular at s = 0 (but the singularity is annihilated by a).
Note that in practice it may be better to integrate with respect to time T for fieldline flow dx/dT = B(x) than arclength s. This would replace ds/|B| by dT , and also simplify the computation of (38) as indicated in Appendix I. 8.2. Relation to weak isodrasticity. Next, we relate the Melnikov function M h to the ratio of dh ∧ d to β on Σ − .
Thus the main part of the weak isodrastic condition, that d and dh be linearly independent on Σ − , is the first-order condition for strong isodrasticity, and M h is the function M of Section 2.
Proof. Recall that  = u ds = ub along the segment of fieldline from Σ − to the first bounce point. Hence for a tangent vector v to Σ − , d(v) = iṽd(ub ), whereṽ is obtained from v by flowing with the derivative ofẋ = b(x) (v i =ṽ j ∂ j b i ). But d(ub ) = i k Ω and displacement along the segment is given by b ds, so Near a point of Σ − with dh = 0 we can choose h as one fieldline label and can take another one g such that β = dh ∧ dg (by Darboux's theorem; this is a construction of Clebsch coordinates). Extend h and g to fieldline labels along the fieldlines. Let us take v to have dh v = 0. Then as iṽdg is constant along the fieldline. Then evaluate dh ∧ d(w, v) on Σ − for an arbitrary it suffices to evaluate 2-forms on any pair of independent vectors to determine them.
At a point of Σ − where dh = 0 then both M h and dh ∧ d are zero, so the same result holds.
Computation of the Melnikov function via (39) (rather than by numerical differentiation of h and ) for some examples will be reported in a future paper.
To complete the discussion of strong isodrasticity at first order, we look at Σ 0 . The strong isodrasticity condition that H = 1 2 u 2 + |B| is constant along components of N 0 , evaluated to first order is equivalent to h being constant on components of Σ 0 . Linear dependence of dh and d on N − extends to the boundary, so from h constant on boundary components we also deduce that  is constant on them.
Hence strong isodrasticity at first order is equivalent to weak isodrasticity, as claimed.
8.3. Interpretation as flux. To conclude this section, we interpret the Melnikov function in terms of the first-order flux of phase-space volume making the given transition. This is analogous to the interpretation of dh ∧ d as transition flux for reduced dynamics in Section 2. For a 2 DoF Hamiltonian system with Hamiltonian H, symplectic form ω and vector field V defined by i V ω = dH, the phase-space volume form is 1 2 ω ∧ ω and the energysurface volume form ε on H −1 (E) is defined so that 1 2 ω ∧ ω = dH ∧ ε. A standard calculation (e.g. [MM,M90]) shows that the energy-surface volume flux form φ = i V ε is just ω.
Specialising to the mirror machine example for illustration, it follows that the flux of energy-surface volume making the transition from free to bouncing is the integral of ω over the lobe between W ± on the bounce surface B = {v = 0} in Figure 19. In the scaling we are using, ω = β/ √μ + d(ub ), so is just β/ √μ on {v = 0}. Choose a local coordinate g on Σ − such that β = dh ∧ dg (possible where dh = 0 by Darboux's theorem) and flow the functions h and g along the field. Then the flux is dh ∧ dg/ √μ over the lobe. But the change in h from W + to W − is to first order 2 √μ M h so the flux is 2M h dg along the arc of the level set of h between zeroes of M h . We can treat a range of energies simultaneously. The flux-form for phase space volume is i V ( 1 2 ω ∧ ω) = dH ∧ ω. So, given an area A on Σ − corresponding to a transition, the flux of phase-space volume making the transition is the integral of dH ∧ ω across the corresponding region on {v = 0}. To first order, dH = dh, because we are treating trajectories that graze the maximum in |B|.

So the flux is
The factor 2 comes from the convention that L is computed from one zero of v to the other, whereas for a full period it would be twice as much.
So far, we have been treating the problem in scaled variables. To turn this from scaled variables to the original variables, we use that the real symplectic form is √ mµ times the scaled one (29), so real phase-space volume is mµ times the scaled one, and real time is m/µ times scaled time. So real phase-space flux across A is: A point with given h on Σ − corresponds to real energy µh.
To convert these results to the flux of particles, we need to introduce a distribution function ρ, giving their density with respect to phase-space volume (this use of ρ is distinct from that for the gyroradius). It is best to include the dependence of the density on µ, rather than treating a fixed µ. The phase-space volume form dq 1 ∧ dq 2 ∧ dq 3 ∧ dp 1 ∧ dp 2 ∧ dp 3 for the 3DoF problem converts in gyro-coordinates to m 2B Ω ∧ dv ∧ dµ ∧ dφ, with Ω being ordinary volume for guiding-centre position and φ being gyrophase. Integrating over gyrophase we obtain the gyro-averaged volume-form 2πm 2B Ω ∧ dv ∧ dµ. This is preserved by the Hamiltonian guiding-centre flow (though not in general by (9), even ifB is replaced by |B|, another reason to prefer the Hamiltonian version). So the number of particles of given type in a volume W of gyro-averaged phase-space can be written as W ρ(X, v , µ) 2πm 2B (X) Ω ∧ dv ∧ dµ, where ρ denotes the usual scalar guiding-center distribution function.
Note that the gyro-averaged volume-form is the wedge of 2π m e dµ with the guidingcentre volume-form. The prefactor is because the standard convention for action variables was broken by many plasma physicists (though tends to be respected by A.J. Brizard and high-energy particle physicists). To see this factorisation of the gyro-averaged volume-form, reall from (20) that the guiding-centre volume-form Λ = emB Ω ∧ dv .
Thus to obtain the flux of particles from the flux of guiding-centre volume, one has to multiply by 2π m e ρ dµ and integrate over µ. To leading order in µ we can replaceB by |B|. So the flux of particles corresponding to an area A on Σ − , to leading order in µ, is A point to note is that the picture to first order depends on E and µ through only the combination E/µ. The rate of transition has a prefactor √ µ (or other powers depending on scaling), but is otherwise independent of µ given E/µ. As pointed out by Roscoe White (personal communication), this gives hope that fusion α-particles might remain in their initial class despite slowing down, because it seems for fusion α-particles E and µ decrease in such a way that E/µ remains roughly constant.

Discussion
We have generalised the concept of omnigenous magnetic fields and their analysis by [CS], to remove the requirement of a flux function (and even with a flux function, our condition is much weaker), obtaining the concept of weakly isodrastic field. They are magnetic fields for which there are no transitions between classes of guiding-centre motion, under the assumption that the longitudinal invariant is conserved. Furthermore, we have provided a quantification of deviation from the ideal case, namely Melnikov functions, thereby making available objective functions for optimisation of design.
We have extended our theory, by removing the assumption of conservation of the longitudinal invariant, to a notion of strong isodrasticity, which provides an exact prevention of transitions between classes. We have proved that to first-order in the magnetic moment, strong isodrasticity is weak isodrasticity, thus justifying weak isodrasticity as a first-order approximation. We have illustrated how isodrasticity is lost for general perturbations from axisymmetric fields for toy mirror machines and a toy tokamak. The exact theory has the advantage that it can in principle be applied to higher order guiding-centre approximations, of potential relevance to the fusion alpha-particles.
We have shown how to construct many weakly isodrastic mirror fields, in particular that are not omnigenous. The main question that remains is whether isodrasticity (weak or strong) is possible outside axisymmetry for a stellarator. Quasisymmetry implies isodrasticity. Although perfect quasisymmetry perhaps does not exist outside axisymmetry, close to quasisymmetric fields can be made (for recent examples, see [LP]). Close to quasisymmetric fields should be close to isodrastic, but perhaps there is a larger class of isodrastic fields than quasisymmetric. Omnigenous fields are weak isodrastic. Perhaps truly omnigenous fields have to be quasisymmetric, but close to omnigenous fields can be made, such as in Wendelstein 7-X. As weak isodrasticity is weaker than omnigenity, there is hope that truly isodrastic fields can be made. One option is to make all the marginal cases heteroclinic, but we have shown (not included in this paper) that this reduces to omnigenity. So we wish to make examples where most of the marginal cases are homoclinic, as discussed for typical perturbations of a tokamak in Section 3.2, but there will be codimension-one cases of heteroclinic, leading to the double transition scenario of Appendix G. We are looking for examples. For general Hamiltonian systems it is certainly true that there is a larger class of systems with some perfect separatrices than the integrable class (see Appendix L). Also, the concept of isodrasticity motivates some clear objective functions that could be fed to an optimisation to automate a search. For example, one could use the maximum of the Melnikov function or the integral of its positive part. Even if one does not make exact isodrasticity, these objective functions could be weighed against others in the design of stellarators.
The theory also suggests that one might be able to make controlled transitions between classes by slightly breaking isodrasticity via trim coils. The Melnikov function for a normally hyperbolic submanifold specifies the flux of energy-surface volume making a given transition, so if one could learn the effects of trim coils on the Melnikov function then one could control the flux.
Finally, the theory sheds light on the effects of imperfections on tokamaks and quasisymmetric stellarators. The marginal bouncing trajectories of the ideal case give rise to double homoclinic manifolds to hyperbolic periodic orbits for guiding-centre motion. Breaking the symmetry in general breaks both these separatrices, leading to a stochastic layer in which trajectories transition between bouncing and co-and counter-circulating. Furthermore, the guiding-centre dynamics near the magnetic axis is in general significantly disturbed from the axisymmetric case, leading to mixing in the core (though this is not necessarily a bad thing, as discussed by [Bo]). H = 1 2 mv 2 + µ|B| + eΦ + mV, where Φ and V are electrostatic and gravitational potentials, respectively. The symplectic form is unchanged. The resulting equations of motion arė ZGCM is just 1 DoF motion of unit mass in the potential µ m |B| + e m Φ + V . The longitudinal invariant is modified to The surface Σ is modified to be the set of points at which the first derivative µ|B| + eΦ +mV = 0, where again denotes derivative along the magnetic field, and decomposes into Σ ± and Σ 0 according to the sign of the second derivative. But Σ now depends on the ratio µ : e : m. Weak isodrasticity becomes that dE ∧ dL = 0 on Σ − , where L is for segments starting on Σ − . The exact FGCM dynamics is still 2DoF and has NHS with contracting submanifolds, whose intersections can be analysed the same way.

The second adiabatic invariant becomes
at H = E. If Φ and V are constant we again have Σ the set of points where |B| = 0 and the same analysis of weak isodrasticity. With Φ or V , however, the shape of the argument of the square root now depends on E so Σ becomes E-dependent and the nice picture breaks down. Nonetheless the exact FGCM still has a NHS continuing the non-relativistic Σ to the relativistic case, at least for µ not too large. The same picture of its contracting manifolds holds. For magnetically confined fusion devices, probably adiabatic invariance of µ fails before relativistic effects come in (the alpha particles produced by DT fusion have speed 4.3% of the speed of light and in a 1T field have gyroradius 0.266 metres times the sine of their pitch angle). But perhaps in some astrophysical contexts, relativistic guiding-centre motion is relevant.

Appendix B. Fields with a flux function
We take the opportunity to review the theory of magnetic fields possessing a flux function, including some results that we have not found in the literature.
Existence of a flux function is automatic for non-degenerate magnetohydrostatic (MHS) fields, i.e. those satisfying i J i B Ω = −dp (J × B = ∇p) for some function p with dp = 0 (∇p = 0) almost everywhere, where i J Ω = dB (J = curl B): one can take ψ = p (though it is generally preferred to take the toroidal flux enclosed by a level set of p). It is also automatic for most axisymmetric fields: let u be the vector field ∂ φ in cylindrical coordinates, then di u i B Ω = 0 (curl(B × u) = 0) so i u i B Ω = dψ (B × u = ∇ψ) for some function ψ locally, and generally there is no cohomological obstruction, so then ψ is a global function, i B dψ = 0 and dψ = 0 except where B is parallel to u.
The components of level sets of ψ are called flux surfaces. They are called regular if dψ = 0 (∇ψ = 0) on them. The bounded boundaryless regular flux surfaces are cooriented by ∇ψ and hence oriented. Because they carry a nowhere-zero vector field B their Euler characteristic is zero, so they are tori. Furthermore, they carry an area-form A satisfying A ∧ dψ = Ω, e.g. A = i n Ω (A(ξ, η) = n · (ξ × η)) with n = ∇ψ/|∇ψ| 2 . The fieldline flow preserves A on the flux surfaces, because applying di B to A ∧ dψ = Ω produces di B A ∧ dψ = 0 and so di B A = 0 on tangents to the flux surface. So there is no asymptotic convergence of fieldlines in either direction of time. This forces the flow to have a cross-section (a closed curve transverse to the vector field such that the trajectory of every point crosses it in forward and backward time) because the only other option for a nowhere-zero vector field on a 2-torus has a "Reeb component" (an annulus bounded by periodic orbits in opposite directions) and area can not be preserved. See sec.4 of [BGKM] for a summary of the theory of vector fields on a 2-torus.
Furthermore, integrating i B A from a reference point produces a local coordinate on the flux surface that is preserved by the flow. It follows that the return map to a crosssection is a rigid translation (in this coordinate). Thus, the fieldline flow on each torus is equivalent to that of a (non-zero) constant vector field on a standard torus R 2 /Z 2 up to a possible time-change (and examples can be made for which that is necessary) (equivalently, conjugate to a vector field with a constant direction). In particular, the fieldlines wind around the torus with a well-defined winding ratio (limit of ratio of number of turns in two angles), called its rotational transform ι, and in the rational case all the fieldlines are closed.
In the case of Diophantine winding ratio, i.e. |kι − m| ≤ Ck −σ for some C > 0, σ ≥ 1 for all integers k, m with k > 0, if the flow is smooth enough then it is conjugate to a constant vector field without any need for a time-change factor, but the conjugacy is in general less smooth than the flow (by about σ derivatives, depending whether one works in C r or Sobolev spaces). The C ∞ case is treated by [KH] in Prop 2.9.5, where C ∞ conjugacy results, but the method of proof there gives results for less smoothness.

Appendix C. Omnigenity for passing trajectories
In [LC], omnigenity is shown to imply that L is constant on flux surfaces for bouncing trajectories of given energy and class (actually, the extension to more than one class comes in [PCHL]). As mentioned at the end of [BKM], however, omnigenity also implies that L is constant for circulating trajectories of given energy on each rational flux surface, and to deduce omnigenity both conditions must be satisfied. In the meantime, we realised that constancy of L for circulating trajectories on a rational surface is implied by constancy for bouncing trajectories.
In this appendix we first prove that omnigenity requires constancy of L for all circulating trajectories on a rational flux surface. Then we prove that this is implied by the same statement for bouncing trajectories. After this, we give an equivalent proof to [He] that the time-average of V · ∇ψ is zero for circulating trajectories on irrational tori.
It follows that constancy of L for bouncing trajectories of given energy and class on each flux surface implies omnigenity, a result that we feel had not been established correctly before. The continuity argument of [He] from irrational to rational surfaces shows only that the flux-surface average of i V dψ for circulating particles is zero on each rational surface. But the time-averages along individual fieldlines on a rational surface need not agree with the flux-surface average. C.1. Omnigenity for circulating trajectories on rational flux surfaces. The longitudinal adiabatic invariant for general periodic orbits of zeroth-order guiding-centre motion (ZGCM) (not just bouncing ones) is where γ is the segment of fieldline covered. The term in A gives zero when integrated along a bouncing trajectory because it backtracks exactly, which is why it is usually left out. Also it is a constant for given flux surface when γ is a closed loop restricted to the flux surface, so plays no role in the present discussion. Thus we take L = γ mv b for both.
For circulating periodic orbits γ of ZGCM for a field with a flux function ψ, the first-order drift in ψ averaged over one period T is 1 2 mv 2 = E − µ|B|. The definition of omnigenity uses the non-Hamiltonian velocity (9) rather than the Hamiltonian one of (6-8). We drop thev component because we need only V · ∇ψ. Then using B · ∇ψ = 0, we obtain where ξ = (b × ∇ψ)/|B|. To test whether L is constant on a flux surface (for given E), it is enough to test whether dL ξ = 0, because tangents to a flux surface are linear combinations of the vector fields B and ξ, and the change of L along B is zero. Using the Lie derivative L ξ = i ξ d+di ξ (distinguish the function L and the Lie derivative L ξ along ξ) on differential forms, the total derivative term integrating to zero because γ is closed. Now dA = i B Ω so along fieldline γ it gives zero. The second term gives Differentiating (40) we obtain mv dv = −µd|B|; and using db = i c Ω, we obtain The middle term is zero because dψ applied to a tangent to γ is zero. We are left with This proves that for passing particles on a rational flux surface, ψ = 0 iff L is constant on it for given energy.
C.2. Omnigenity for bouncing trajectories implies omnigenity for circulating ones on rational surfaces. We scale out the magnetic moment µ by writing E = hµ and L = √ mµ j with for segment γ of fieldline. As remarked in section 2, the function j is the Abel transform of the length function for a fieldline. (v) is the length of the subsegments of γ for which |B| < v. Its Abel transform (with a factor of √ 2) is where h min is the minimum of |B| along it (one can replace h min by −∞ because d (v) = 0 for v < h min ). This applies equally well to a closed fieldline with h > h max , the maximum of |B| on it.
The condition that j be the same for all bouncing trajectories of given class on a flux surface with given h, for all values of h for this class, implies that the length function is the same for each segment of fieldline for the class on the flux surface, by the Abel inversion formula, e.g. [Ke]: This is a reformulation of a result of [SS] (see also [CS]). One could write in more detail about the treatment of multiple classes of bouncing trajectory [PCHL], but hopefully the above is clear enough. Note that it includes that h min and h max are the same for all fieldlines on the same flux surface. Our point is that it follows from the length function being the same for each bouncing class and the Abel representation (41) that j(h) is the same for all circulating trajectories on the rational surface with given h > h max . C.3. Omnigenity for circulating trajectories on irrational surfaces. ZGCM at given energy for circulating particles on an irrational flux surface in given direction is uniquely ergodic (i.e. it has a unique ergodic probability measure). Up to normalisation, the unique ergodic probability measure is given by the area-form |B| v A (where A was defined in Appendix B). This is because A is preserved by B but the time spent anywhere by ZGCM is a factor |B|/v longer than for fieldline flow. Thus, the time-average of any continuous function along ZGCM at energy E is equal to its surface-average with respect to |B| v A. On a flux surface, V · ∇ψ A = i V Ω because Ω = A ∧ dψ and dψ on tangents to a flux surface is zero. Now using the non-Hamiltonian form for V , (40) and other relations as above, we obtain |B| v i V Ω = |B| B i B Ω + m e d(v b ). The first term is zero on tangents to the flux surface and the integral of the second term over the flux surface is zero. One might worry that the time required for convergence to the time-average might be long for some irrationals, but actually under the omnigenity condition we believe a uniform estimate is possible.
It is curious that the results of this subsection and subsection C.1 are exact for the non-Hamiltonian form for V , but would be true only to first order if we used the Hamiltonian form (6).
Appendix D. Relation to pseudo-symmetry A weaker notion than omnigenity was introduced by Mikhailov, e.g. [M+], called pseudo-symmetry. There are various formulations, e.g. [Sk], but the way we choose is that a magnetic field is said to be pseudo-symmetric if it has a flux function and on each flux surface the contours of |B| are nowhere tangent to the magnetic field.
A pseudo-symmetric field is not necessarily isodrastic. Although the above formulation constrains the local maxima of |B| for a pseudo-symmetric field to form (noncontractible) closed curves on each flux surface, it does not force the separatrix area j to be the same for each homoclinic orbit coming from a local maximum with the given value of |B|.
Conversely, an isodrastic field is not necessarily pseudo-symmetric. Firstly, an isodrastic field need not have a flux function, though if the marginal trajectories are all heteroclinic then it does (h = |B| on Σ − extended along the fieldlines is a flux function). Secondly, even if an isodrastic field has a flux function, the contours of |B| on it could have an island chain instead of a curve of minima, as long as the fieldlines see only local minima on crossing the island chain (else new Σ − is created).
Appendix E. C 4 -generic Σ 0 A property is generic in a topological space if it happens on a countable intersection of open dense subsets.
Theorem 6. Restricting attention to bounded subsets, C 4 -generically Σ is a C 3 surface and Σ 0 is a C 2 -curve on Σ, separating it into Σ ± .
Proof. First recall that Σ is defined by |B| = 0 so if B is C r with r ≥ 2 then by the submersion theorem [La] the subset where d|B| = 0 is a C r−1 surface. The joint condition that |B| = 0 and d|B| = 0 is four conditions (div B = 0 does not make any restriction) on three variables, so generically does not happen. Thus C r -generically, Σ is a C r−1 -surface. Note that the condition d|B| = 0 is met in particular on Σ ± , which are the subsets where |B| > 0, < 0 respectively. Furthermore, it follows that Σ ± are transverse to B.
Secondly, the set Σ 0 is defined by |B| = 0, |B| = 0, so by the submersion theorem the subset where the derivative D(|B| , |B| ) has rank 2 is a C r−2 curve. The joint condition that |B| = 0, |B| = 0 and D(|B| , |B| ) does not have rank 2 is 4 conditions on 3 variables, so generically does not happen. Thus C r -generically, Σ 0 is a C r−2 curve. Σ 0 lies on Σ because |B| = 0 on it. It separates Σ into Σ ± because it is the subset with |B| = 0.
Next we study the generic behaviour of |B| near Σ 0 . Firstly, near a generic point of Σ 0 , there is a choice of fieldline labels x, y and a coordinate t along fieldlines, near to arc length s, such that for some function f and constant k = 0. Note that there is no remainder in this expression; this can be achieved by normal form principles of singularity theory [Mi]. The restriction div B = 0 plays no role in this, only in the mapping between (x, y, t) and physical space (compare the proof of realisability in Section 4.2). Then |B| = (y + 2xt + 3kt 2 )t , where t denotes ∂t ∂s holding x, y constant, so Σ is locally the surface (43) y = −2xt − 3kt 2 .
Thus we can use (x, t) as coordinates on Σ. Furthermore |B| = (2x + 6kt)t 2 + (y + 2xt + 3kt 2 )t = (2x + 6kt)t 2 + |B| t /t . Now |B| = 0 on Σ, and t is near 1, so Σ 0 is locally the curve Its projection to fieldline labels (x, y) is the curve 3ky = x 2 . It is a standard fold in Σ. Σ ± are the parts of Σ with x + 3kt > 0, < 0 respectively. The result is shown in Figure 24. We see that there are short bouncing segments on some fieldlines, namely those for which the cubic has a well, i.e. for which y < −2xt − 3kt 2 . Figure 24. Generic neighbourhood of a point (0, 0, 0) on Σ 0 . x, y are fieldline labels and s is arclength along the field. It shows that Σ 0 forms a smooth (blue) curve on Σ, separating Σ into Σ ± on the right and left, respectively. Three fieldlines are indicated, with x = 0, y = −1, 0, +1 and the shapes of |B| along them are sketched.
We can combine the treatment of generic points of Σ 0 and those having k = 0 by including both the kt 3 and at 4 terms in the expression for |B|, with k, a not simultaneously zero. Then on Σ, In particular, along Σ 0 , |B| = f (−3kt − 6at 2 , 3kt 2 + 8at 3 ) + kt 3 + 3at 4 , so using t now as a coordinate along Σ 0 , d|B| dt = −3kf ,x at t = 0, where f ,x denotes the partial derivative of f with respect to x at (0, 0). At points of Σ 0 where kf ,x = 0 the field is not weakly isodrastic, because |B| is not constant along Σ 0 (a little more analysis shows that the level sets of |B| cross Σ 0 with cubic tangency). A closed component of Σ 0 has at least one minimum and one maximum of |B|, so there are points on it where f ,x = 0 or k = 0. A question is whether both types of point have to occur, but generically they do not happen simultaneously. We analyse the local picture for the |B| levels on Σ near such points.
At a point of Σ 0 where f ,x = 0 one has generically a non-degenerate local maximum or minimum of |B| along Σ 0 . The level sets of |B| on Σ look locally like Figure 25. So the level curves of |B| cross Σ 0 except in the hyperbolic sector of the first case.
For points of Σ 0 where k = 0, note that on Σ 0 , k = 0 iff B is tangent to Σ 0 . So one has generically one of the cases of Figure 26. In the case of |B| coming to a quartic minimum, all level curves of |B| collide with Σ 0 . The case of |B| coming to a quartic maximum produces a curve of switch of lowest hill. This "Maxwell curve" is y = 0, x = 2t 2 . All |B| levels on Σ + cross Σ 0 .
If we want weak isodrasticity and there is a curve of Σ 0 then we must have kf ,x = 0 everywhere on it. Figure 26. The disposition of Σ and Σ 0 relative to the magnetic field lines near a point x = y = 0 with k = 0; (a) quartic minimum, (b) quartic maximum. In each case, three fieldlines are shown (denoted by red, green, and blue lines) and a bouncing segment on each of them is denoted by bar arrows. The |B| profiles along the fieldlines are shown in the bottom panel, together with the chosen level of E/µ for the bouncing segment.
To analyse the dynamics more thoroughly near Σ 0 one needs to compute j. In particular, to test isodrasticity near but not on Σ 0 one needs  on Σ − . We can compute this to a good approximation for the short homoclinics that occur near the generic points of Σ 0 in the well of the cubic. From a point of Σ − labelled by (x, t), we use t (s) ≈ 1 to get the approximation where C is the value at the local maximum s 0 of ys + xs 2 + ks 3 and s 1 is the other point at the same height. Write s = s 0 + v/k and α = −|B| /2 at the point of Σ − and recall that |B| = 2x + 6ks 0 . Then Recalling the expression for h = |B| on Σ, using y = −2xt − 3kt 2 (43). It is zero iff y = 2xf ,y + 3kf ,x . This PDE can be solved for f , resulting in h being an arbitrary function of x 2 − 3ky (the deviation from Σ 0 ), but the main conclusion can be read off immediately by putting (x, y) = (0, 0): kf ,x = 0 at (0, 0). This implies for k = 0 that the derivative of h along Σ 0 is zero, confirming one of our necessary conditions for weak isodrasticity.
It is useful to compare dh∧d to β. On Σ, locally β = c(x, y) dx∧dy for some non-zero smooth function c, because x and y are fieldline labels. Using y = −2xt − 3kt 2 and the expression for |B| , we obtain dx ∧ dy = −c|B| dx ∧ dt. So the Melnikov function M = (−|B| ) 1/2 3ck 2 (y − 2xf ,y − 3kf ,x ). Finally, for isodrasticity one also needs the corresponding condition for segments leaving Σ − in the opposite direction, which is not obtainable by local analysis. A typical picture for the phase space F j around a fieldline with a cubic critical point is sketched in Figure 27, using the normal form (42) and a generic assumption that the separatrixarea of the cubic critical point varies at non-zero rate along the fold curve Σ 0 (but note that this assumption is incompatible with weak isodrasticity, because we proved that  is constant along Σ 0 for weak isodrastic fields.) The cubic critical point unfolds in the downward vertical direction, to a local maximum and a local minimum. Moving horizontally to the right, the separatrix-area for the cubic critical point decreases, thus to maintain constant j, the energy-level changes as indicated. The boundary of F j is a cusped curve with width ∆x asymptotically proportional to (−y) 5/4 . This follows from the above computation of  for the little well in the cubic. The lefthand curve is determined by making the separatrix area for the main well be j; on the righthand curve the sum of the separatrix areas for the main well and the little well is j.
In conclusion, generic Σ 0 is incompatible with isodrasticity except with special design. The typical ways isodrasticity fails near Σ 0 reveals a sensitivity of axisymmetric and quasisymmetric designs to imperfections.

Appendix F. Derivative of reduced Hamiltonian
Here we give an alternative proof of Proposition 1 that illustrates its general connection to the theory of nearly-periodic systems. Figure 27. Sketch of the phase space F j near a fieldline with a cubic critical point with generic unfolding. Σ 0 is the curve with cubic critical points: 3ky = x 2 in fieldline labels (x, y). The cusped wedge is excluded.
Proof. In the bouncing region of the guiding center phase space the guiding center vector field V defines a Hamiltonian nearly-periodic system with limiting roto-rate R 0 = T 2π V 0 and exact -dependent symplectic form ω . Here Hamiltonian means ι V ω = dH , for some Hamiltonian H , and T denotes the true period for bounce motion. As for all Hamiltonian nearly-periodic systems [BS], there exists an all-orders roto-rate R such that ι R ω = dJ , where J denotes the all-orders bounce adiabatic invariant. By [BS] the formula for the first-order term in the roto-rate is where I 0 denotes the inverse of L V 0 restricted to the subspace of vector fields with vanishing U (1)-average, and V 1 = V 1 − V 1 is the first-order guiding center vector field V 1 less its U (1)-average. It is easy to show that H 0 = J 0 = 0, and that H 1 , J 1 correspond to the usual leading-order expressions for the guiding center energy and bounce invariant. Since both R and V are Hamiltonian vector fields we have In particular dH 1 = ι V 1 ω 0 + 2π T (dJ 1 − ι R 1 ω 0 ). This formula is useful because it simplifies computing the derivative of H 1 along U (1)-invariant vector fields X = u · ∂ X + a ∂ v that preserve J 1 , i.e. dJ 1 (X ) = 0. In particular, This establishes the Proposition after using dt = b /v and energy conservation.

Appendix G. Double transitions
As a special event, a fieldline segment can approach double transition. Some ways this can occur are indicated in Figure 28. They involve formation of a heteroclinic cycle. There are others involving formation of a degenerate critical point too, but they are more special. In the 2D space of fieldlines, a heteroclinic cycle requires only one condition, Figure 28. Some ways to double transition. namely that |B| have two local maxima at the same height, so it happens generically along curves in the space of fieldlines.
It leads to corners in the reduced spaces F j , where boundaries corresponding to two different transitions meet. See Figure 29.
One could make explicit examples, for example with |B| a quartic in arclength whose coefficients depend on two fieldline labels, as for the formulae leading to Figure 26(a). Although evaluating j requires elliptic integrals, the boundary cases can be computed explicitly (generalising the cubic case of Section 4 and Appendix E).
Near points of double transition, transitions between several classes can occur. Isodrasticity requires that ker dH j contain the tangent to the boundary at single transition points. Taking the limit to a transverse corner, this implies that dh = 0 there. The consequences of this are left to a future publication. Figure 29. An example of a corner in the reduced space F j .
Note that the times spent by a periodic orbit near the saddles of a heteroclinic cycle of a Hamiltonian system are asymptotically proportional to the Lyapunov times of the saddles (i.e. the inverses of their positive Lyapunov exponents). So using (15), along the curve for equal height, dH j is asymptotic to the convex combination of dh at the two ends, weighted by their Lyapunov times.
Note also that d goes to infinity at generic corners because the time spent near the second saddle grows logarithmically as it is approached along a level curve of , and the derivative of  is related to this time.

Appendix H. Dipole field
As a simple illustration of Σ and the reduced Hamiltonian H j , we consider GCM in a dipole field B = 3 cos θr −ẑ r 3 in spherical polar coordinates (radius r, colatitude θ, longitude φ), illustrated in Figure 30. The dipole strength has been scaled to 4π. The fieldlines are r = r e sin 2 θ, φ = constant, where r e is the radius at which they cross the equatorial plane. Along the fieldlines, |B| = √ 3 cos 2 θ + 1 r 3 e sin 6 θ .
The guiding centres bounce across the equatorial plane between the regions of stronger field near the poles. Then Σ is the equatorial plane and it consists of only Σ + (see Figure 2(a)). For a bouncing segment, j can be written as j = r −1/2 e F (hr 3 e ), where h is |B| at the bounce points and F (k) = θ 1 θ 0 2 k − √ 3 cos 2 θ + 1 sin 6 θ 2 cos 2 θ + 1 sin θ dθ, Figure 30. A meridional section through a dipole field, showing some fieldlines and the field strength in colour (blue to red indicates weak to strong and white indicates |B| exceeds a threshold). Σ + is the equatorial plane.
with θ i (k) being the zeroes of the first square root (symmetric about π/2). F is defined for argument greater than or equal to 1, has F (1) = 0, positive derivative (including at k = 1 where F (1) = π 3 ), and goes to infinity as k → ∞. FGCM is described in the adiabatic approximation with given value of j (and scaled time τ ) by the Hamiltonian H j on Σ + , defined by H j = r −3 F −1 ( √ rj) (we write r e = r on Σ + ), and the symplectic form that is just the magnetic flux-form r −2 dr ∧ dφ on Σ + . Now dH j dr = − 3 r 4 F −1 ( √ rj) + j 2r 7/2 F (k) = r −4 −3k + F 2F , where k = H j r 3 . We didn't check, but presumably F/F < 6k (this is certainly true for k = 1 where F/F is zero, and at k = ∞ where F/F ∼ 2k) so H j is a strictly decreasing function of r for given j. In any case, the level sets of H j are concentric circles, and the bouncing segments precess around the dipole axis at constant rate dφ dτ = r 2 dH j dr . If axisymmetry is broken but not too much in C 1 , as perhaps for the earth's magnetosphere, and we ignore changes far away (such as due to the solar wind), then Σ + deforms into a nearby surface and H j into a nearby function. Its level sets deform to closed curves near the original circles. The segments continue to precess, but in general no longer at constant rate. The precession period can be calculated from (18).
As there is no Σ −0 , there are no transitions and the field is automatically isodrastic. The only thing to check for confinement is which set of precessing segments to populate. For example, in the context of the earth's magnetosphere, the desired region is the outside of the earth. Then in the axisymmetric case, scaling the earth's radius to 1 and the field strength to 1 at the earth's equator to fit with the above, a calculation shows that the segments that do not hit the earth are those for which h < 4 − 3/r e , where h is the ratio of the energy to µ.
The earth's magnetosphere is perturbed not only by deviations from a dipole of magnetic generation in the earth but also by interaction with the magnetic field of the solar system and solar wind, which change the arrangement of the fieldlines, notably introducing magnetic nulls. We do not pursue those effects here.
Appendix I. Computational practicalities Firstly, we show how square roots can be avoided in computation of many of the quantities required to find Σ and the Melnikov function (though not all). Then we give suggestions for the computation of  on Σ − . I.1. Eliminating square roots. We have written expressions like |B| = b · ∇|B| but if B is given in components, |B| involves taking a square root, so b = B/|B| involves dividing by a square root. So does differentiating |B|. It can be better to write such quantities in terms of |B| 2 . For example, |B| = 1 2 B · ∇|B| 2 /|B| 2 . Here, we collect various such formulae.
The grad-B drift involves b × ∇|B|, which can be written as 1 2 B × ∇|B| 2 /|B| 2 . The curvature drift involves (curl b) ⊥ /|B|. Now (curl b) ⊥ /|B| = J ⊥ /|B| 2 + 1 2 B × ∇|B| 2 /|B| 4 . We can also remove square roots from the computation of the covector a in (38) by switching to fieldline-flow time T . We have da i /dT = −|B|a j ∂ i b j and ∂ I.2. Integrating along a fieldline. Next,  is defined at a point X 0 ∈ Σ − by taking h = |B(X 0 )| and then letting  = 2(h − |B|) ds with respect to arclength s along the fieldline through X 0 to the first bounce, i.e. where |B| = h again.
Suppose we are integrating in the positive direction along B (the obvious changes apply for the other direction). It is slightly more convenient to integrate with respect to fieldline flow time T than arclength s, i.e. start at T = 0 with j = 0 and X = X 0 and integrate dX dT = B(X), dj dt = v|B|, with v = 2(h − |B|), until the first bounce, thereby banishing square roots to only the second of the two equations.
There's also a term proportional to γ from the second derivative of |B| but the w term dominates it by the factor 1/μ. So we end up with γ = −μ b × ∇|B| · ∇|B| |B| 2 |B| 2 .
In particular, using (44), Appendix L. Systems with perfect separatrices There exist Hamiltonian systems that have one or more perfect separatrices but are not integrable.
Firstly, one can make area-preserving twist maps with this property. The idea is to use the construction by de la Llave in the appendix to [Mat]. Given a (lift of a) degree-one homeomorphism g of the circle R/Z, define h(x) = g(x) + g −1 (x) − 2x, and define map T for (x, y) on the cylinder R/Z × R by y = y + h(x), x = x + y It is an area-preserving twist map and the circles y = x − g(x) and y = x − g −1 (x) are invariant, with dynamics x = g −1 (x) and x = g(x) respectively. So we choose g to be a circle homeomorphism with two fixed points, e.g. g(x) = x + k sin 2πx with 0 < k < 1 2π .
Then we obtain a map T with two period-one islands with perfect separatrices. But in general it is non-integrable, as illustrated in Figure 31 (we presume that one could prove this if desired). One could modify the choice of g to make an example with a single Figure 31. Some orbits of the map T for k = 0.1 with the y = x−g −1 (x) invariant circle shown as the red and the y = x − g(x) as the blue curve.
period-one island but it would require careful matching of derivatives of g to the left and right of the fixed point if one wants T to be smooth. Similarly, one can make continuous-time Hamiltonian systems with a perfect separatrix without imposing integrability. For example, given a function S : R 2 → R with constants A, B such that S(x + m, y + n) = S(x, y) + mA + nB, and the vector fielḋ q = ∇S(q) induced on R 2 /Z 2 , then H = 1 2 |p| 2 − 1 2 |∇S(q)| 2 has invariant graph p = ∇S(q) on whichq = ∇S(q). So choose S with a periodic orbit repelling on one side, attracting on the other. What is not clear to us is how to make examples with perfect separatrices for all energies simultaneously.
Appendix M. Persistence of Σ 0 and Σ + In this appendix, we address the question of whether also Σ 0 and Σ + have continuations to invariant submanifolds for small µ.
We begin with Σ 0 . If a 2DoF Hamiltonian system has a 2D manifold N 0 consisting of elementary saddle-centre periodic orbits ("elementary" means that some generic conditions are satisfied) then firstly it is part of a 3D manifold N consisting of periodic orbits, which decomposes into N − , N 0 and N + , with the orbits on N − being hyperbolic and those on N + being elliptic. Secondly, all nearby Hamiltonian systems have a nearby such N locally. These results follow from [Mey], which is formulated in the context of area-preserving maps. This is a useful result once we have µ > 0, but unfortunately it does not apply to µ = 0 because firstly the points of Σ 0 × {0} are all equilibria, not periodic orbits. Secondly, the Poisson bracket is degenerate at µ = 0 so we are not starting from a genuine Hamiltonian system.
We suspect that under the conditions that Σ 0 be a generic curve and |B| be constant along its components then at least Σ −0 × {0} persists to an invariant N −0 . The |B| constant condition is necessary for the reduced dynamics for µ > 0 to have Σ 0 invariant.
One might ask why we do not conjecture that the whole of Σ × {0} persists to an invariant N . The answer is that generically we expect resonances to break Σ + × {0}. We give some explanation. Σ + × {0} is a slow manifold but normally elliptic instead of hyperbolic. In general, the best one can deduce for a normally elliptic slow manifold is that the true system has a sequence of submanifolds N + m that are invariant to n th order, but in general the sequence does not converge [M04]. In this low-dimensional case, however, the reduced dynamics on Σ + consists of periodic orbits that are non-degenerate except at integer resonances (in this context, integer resonances means the linearised bounce frequency is an integer multiple of the precession frequency). If we cut out a neighbourhood of these integer resonances, there is a true invariant submanifold N + nearby consisting of periodic orbits of the guiding-centre dynamics, which are elliptic except near halfinteger resonances (where they generically turn inversion hyperbolic). This procedure to construct N + fails at integer resonances (including at Σ 0 where the elliptic frequency goes to zero).
Integer resonances all correspond to saddle-centre periodic orbits but they are unlikely to be elementary: that would require in particular that the field strength happen to have a turning point at them. Thus we expect Σ + to break at integer resonances, if there is something for them to resonate with. For axisymmetric fields the resonances are not excited by turning on µ > 0, but for general non-axisymmetric ones they are excited. The (angular) bounce frequency is ω b = µ m |B| and the (angular) precession frequency is ω p = 2π µ e ∂h ∂Φ , so the condition for integer resonance n is ∂Φ ∂h |B| = 2πn μ, whereμ = m e 2 µ. Unfortunately, for typical fields with the derivative of the lefthand side non-zero this means that there is a large set of resonances forμ small. Nonetheless, if the lefthand side is non-zero they have large n and for smooth enough fields the resonances can be expected to be very weak. See [M04] for an example where they are exponentially weak for large n. Also, adiabatic invariance of L typically makes the motion near Σ + bounded.
We close this appendix by illustrating the problem of integer resonances for Σ + by the tokamak example of 3.2. Then Φ = 2πψ plus a constant, so 1 2π ∂Φ ∂h |B| = − r 3/2 R(C 2 + r 2 ) 1/4 C 2 − r on Σ + . So we get resonance −n at r ≈ Cμ 1/3 n 2/3 . As µ → 0 each −n resonance tends along Σ + to the magnetic axis, so Figure 21 would generically show a sequence of breaks on breaking axisymmetry.