Genuine Nonlinearity and its Connection to the Modified Korteweg-de Vries Equation in Phase Dynamics

The study of hyperbolic waves involves various notions which help characterise how these structures evolve. One important facet is the notion of \emph{genuine nonlinearity}, namely the ability for shocks and rarefactions to form instead of contact discontinuities. In the context of the Whitham modulation equations, this paper demonstrates that a loss of genuine nonlinearity leads to the appearance of a dispersive set of dynamics in the form of the modified Korteweg-de Vries equation, which then governs the evolution of the waves. Its form is universal in the sense that its coefficients can be written entirely in terms of linear properties of the underlying waves, such as the conservation laws and the linear dispersion relation. This insight is applied to two systems of physical interest, one an optical model and the other a stratified hydrodynamics experiment, to demonstrate how it can be used to understand how waves in these systems evolve when genuine nonlinearity is lost.


Introduction
The study of hydrodynamic systems remains at the heart of the study of nonlinear waves in modern physics. With applications ranging from fluids and optics to quantum mechanics and beyond [53], they continue to prove their ability to accurately describe observed phenomena within such systems. Central to the study of this class of systems are quantities known as characteristic speeds (or simply characteristics), which reveal several properties of the system. Primarily, they describe how information is transmitted in the problem, but they are also used to diagnose whether the underlying equations are hyperbolic or elliptic (real or complex characteristics respectively), which has implications for the stability of the states corresponding to such classifications. Another less frequent use of characteristics, which will be the key concept of this paper, is to diagnose genuine nonlinearity [35]. This notion determines whether nonlinear structures such as shocks and rarefactions can form. In cases where genuine nonlinearity is not operational, the system is said to be linearly degenerate and instead admits contact discontinuities. An important hyperbolic system within the field of nonlinear waves, and one which will be the focus of this paper, is the set of Whitham modulation equations (WMEs). These govern the slow evolution of the wavenumbers and frequencies of a given wave and thus determine its long-time evolution, and they have been successfully applied to problems on wave stability [7,11], the formation of dispersive shocks ([25,26] and references therein) and localised structures [48]. The WMEs lack a regularisation mechanism such as dissipation or dispersion, although this can be remedied via a phase dynamical analysis that introduces dispersion into the system (see, for example, refs. [8,9,46] and references therein).
The form of the resulting dispersive equation has been shown to depend largely on properties of the characteristics admitted by the WMEs, with recent works highlighting significant changes in evolution in the neighbourhood of the elliptic-hyperbolic transition. The aim of this paper is to explore the outcome of the phase dynamics when genuine nonlinearity is lost at a given state point (as opposed to a total linear degeneracy), and to determine how this alters the evolution of the wave. The analysis reveals that the dispersive equation operational in such scenarios is the modified KdV (mKdV) equation for a function u(x, t) related to a perturbation of the wavenumber. The mKdV is a well-known nonlinear equation that arises across many fields, such as internal waves [15,20,24,31], plasma physics [27,29,32,41] and optics [2,36,50]. A key outcome of this paper is to unify the emergence of the mKdV in such environments by providing a universal derivation and form of the mKdV from a Lagrangian formalism, with coefficients that depend solely on properties of the wave from which they are derived.
The most straightforward derivation of the WMEs for a given single-phase wavetrain U(kx + ωt; k, ω) ≡ U(θ; k, ω) involves an averaged Lagrangian approach [53], but they may also be obtained via formal asymptotics or by averaging the relevant conservation laws. In any of these cases, one arrives at the first order hydrodynamic system

$$k_T - \omega_X = 0, \qquad A(k, \omega)_T + B(k, \omega)_X = 0,$$

for local wavenumber k(X, T) and frequency ω(X, T), with slow variables X = εx, T = εt and ε ≪ 1, where A and B are the wave action and wave action flux. The most well-understood version of these equations applies to single-phase wavetrains, as described above, but there have been generalisations to accommodate relative equilibria, mean-flow effects and arbitrarily many phases. For these cases, the WMEs generalise naturally to the vector-valued system

$$k_T - \omega_X = 0, \qquad A(k, \omega)_T + B(k, \omega)_X = 0, \qquad k, \omega, A, B \in \mathbb{R}^N. \qquad (1.1)$$

Here, k and ω are the vectors containing each slow wavenumber and frequency respectively, and A and B are now the vector-valued wave action and wave action flux associated with each phase of the solution. Emerging from this system are up to 2N characteristics c satisfying the zero determinant condition of a quadratic matrix pencil [10]. The presence of a larger set of characteristics heralds an increasingly nontrivial set of ways in which these can interact and change the resulting dynamics for the system.

One shortcoming of the WMEs is that they lack a regularisation mechanism, such as dissipation or dispersion, that would prevent gradient singularities and multivalued solutions from occurring. There has however been a recent series of approaches to remedy this issue via the use of phase dynamics. Inspired by the early works of Pomeau and Manneville [43], Kuramoto [33] and Doelman et al. [21], Bridges et al. adopted a similar modulational ansatz for use in Lagrangian systems to introduce dispersion into the modulation equations.
In particular, one very recent advancement utilises the characteristics to generate such dispersion in a general way. To do so, one constructs a guess at a new solution of the form

$$U = U(\theta + \varepsilon\phi(X, T);\, k + \varepsilon^2\phi_X,\, \omega - \varepsilon^2 c\,\phi_X + \varepsilon^4\phi_T) + W(\theta, X, T), \qquad X = \varepsilon(x - ct), \quad T = \varepsilon^3 t, \quad \varepsilon \ll 1,$$

for phase θ = kx + ωt, phase perturbation φ and c a real characteristic of the WMEs. Substitution of the above into the Euler-Lagrange equations and a subsequent asymptotic analysis lead to a dispersive set of dynamics emerging in the form of the famous Korteweg-de Vries (KdV) equation, relying only on the fact that c is real and thus that the WMEs are hyperbolic [46]. A remarkable feature of this analysis is that it demonstrates that the coefficients of this KdV are universal, in the sense that they rely only on information regarding the conservation laws for the system rather than the particular form of the governing equations. Much like how the WMEs generalise to multiple phases, this approach and insight extend naturally to waves with arbitrarily many phases.

However, it can be shown that the form of the dispersive dynamics may alter depending on the nature of the characteristics. For example, when characteristics coalesce at the elliptic-hyperbolic transition point, it is instead the dynamics of the two-way Boussinesq equation which become operational [9,10]. Such a dynamical change heralds both quantitative and qualitative differences in how the system evolves: new solutions may arise, how they bifurcate may be altered and stability properties of solution families can change. This highlights that the properties of the characteristic can be used to diagnose which dispersive equation should be used to model the original wave's evolution, as well as lending insight into how one expects such evolution to proceed.
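As a brief check of the scalings in the ansatz above, the following sketch applies the chain rule to the perturbed phase θ + εφ(X, T), using only the definitions θ = kx + ωt, X = ε(x − ct) and T = ε³t stated in the text. It is an illustration of why the wavenumber and frequency slots of U carry the perturbations shown, not a substitute for the full asymptotic analysis.

```latex
% Derivatives acting on functions of the slow variables (X, T) only:
\begin{align*}
\partial_x &= \varepsilon\,\partial_X, \qquad
\partial_t = -\varepsilon c\,\partial_X + \varepsilon^{3}\,\partial_T, \\
% local wavenumber: x-derivative of the perturbed phase
\partial_x\bigl(\theta + \varepsilon\phi\bigr) &= k + \varepsilon^{2}\phi_X, \\
% local frequency: t-derivative of the perturbed phase
\partial_t\bigl(\theta + \varepsilon\phi\bigr) &= \omega - \varepsilon^{2}c\,\phi_X + \varepsilon^{4}\phi_T .
\end{align*}
```

The last two lines reproduce the second and third arguments of U in the ansatz, so the modulated wavenumber and frequency remain the x and t derivatives of a single phase function.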
It is in this spirit that the paper will proceed, with the main focus being on the connection between genuine nonlinearity and the resulting phase dynamics. The earliest work on this was undertaken by El et al. [26], who were able to demonstrate that a loss of genuine nonlinearity in a hydrodynamical system suggested the modified KdV (mKdV) equation should emerge, with highly nontrivial consequences for the resulting dispersive shocks. The aim of this paper is to prove this connection generally for the WMEs, so that a loss of genuine nonlinearity for a given underlying wave signifies that an mKdV equation governs the dispersive dynamics of the wave quantities. We phrase this precisely as the following theorem:

Theorem 1.1. Suppose a given Lagrangian admits an N-phase wavetrain solution. Then if the Lagrangian system is dispersive, the resulting Whitham modulation equations are hyperbolic, a chosen characteristic c is simple and its field is locally linearly degenerate for the wavetrain considered, the modified KdV equation is an asymptotically valid reduction of its Euler-Lagrange equations in a frame moving with this characteristic speed.
The criteria outlined within the above statement are important in ensuring that the coefficients of this reduction are nonzero. Hyperbolicity guarantees that c is real, and the simplicity of the characteristic ensures that the coefficient of the time term, α, does not vanish. The requirement for the system to be dispersive gives that γ is nonzero except in special cases where the dispersion is weak. The loss of genuine nonlinearity is crucial to the presence of the cubic nonlinearity instead of the quadratic one the KdV possesses. The notion of local linear degeneracy, a relaxed form of the classical linear degeneracy requirement, essentially states that genuine nonlinearity is lost only at points, and will be defined within the paper. This generically allows β ≠ 0 and thus for nonlinearity to be retained within the phase dynamics in the form of the cubic term.
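For orientation, the roles of the three coefficients named above can be summarised schematically. The display below is a sketch only: α, β and γ are the labels used in the preceding paragraph, U(X, T) is the modulation function used later in the paper, and the quadratic coefficient written as β₂ is a placeholder label that does not appear in the source. It is also assumed, for illustration, that the KdV-type reduction carries the same time and dispersive labels. The explicit expressions for the coefficients are those of theorem 1.1 and are not reproduced here.

```latex
\begin{align*}
% genuinely nonlinear case: quadratic nonlinearity (KdV type)
\alpha\,U_T + \beta_2\,U\,U_X + \gamma\,U_{XXX} &= 0, \\
% locally linearly degenerate case: quadratic term lost, cubic term retained (mKdV type)
\alpha\,U_T + \beta\,U^{2}\,U_X + \gamma\,U_{XXX} &= 0 .
\end{align*}
```

Read this way, each hypothesis of the theorem protects one coefficient: simplicity of c protects α, dispersion protects γ, and the local linear degeneracy removes the quadratic term so that the cubic term sets the nonlinearity.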
Once again the universality of the dispersive dynamics is apparent via the presence of the conservation laws (through E and its derivatives); however, there is an additional universal feature of the above equation arising from the linear dispersion relation σ. The connection between the dispersive term in KdV-like models and the linear dispersion relation for the original system has long been known heuristically (for example, see [3,22,23]) but has not yet been rigorously proven for generic dispersive systems, and this paper provides such a proof. To do so, a Fourier-Bloch analysis inspired by preceding work [21] is presented, which casts the coefficients of the resulting equation entirely in terms of quantities obtainable from straightforward linear analyses. This lends a further strength to the analysis here: the nonlinear PDEs sought can be readily constructed from expressions one likely already possesses or can easily obtain, easing access to information pertaining to the nonlinear evolution of the wave.
The essence of the proof of theorem 1.1 is to adopt a rescaled and slightly modified version of the previously used ansatz (1.3) so that all the terms in the mKdV asymptotically balance. The assumption of hyperbolicity, the use of a moving frame and the assumption of a loss of genuine nonlinearity then allow the analysis to proceed to the order at which the mKdV equation emerges.
The appearance of the mKdV itself already sheds light on how a loss of genuine nonlinearity will affect the evolution of the underlying wave. Chiefly, the mKdV admits a much larger solution set than the KdV equation, since the Miura transform connecting the two equations is not bijective [39], suggesting a more complex and richer evolution of the system. In addition to the cnoidal and sech-based solitary wave solutions present in the KdV equation, dnoidal waves and front solutions connecting conjugate steady states also emerge, which grow as the square root of their speed rather than linearly with it as is the case for the KdV (see, for example, Grimshaw et al. [24]). There are also breather and rational solutions of the mKdV [4,55]. The stability properties of the two equations differ as well; for example, periodic wave solutions of (1.4) can be modulationally unstable depending on the sign of the cubic nonlinearity, in stark contrast to the KdV, where all such solutions are stable [11]. With all of these factors considered, it is clear that the transition in dynamics from the KdV equation to the mKdV equation via a loss of genuine nonlinearity presents a nontrivial set of changes to the overall evolution of the nonlinear wave.
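To make the remark about the Miura transform concrete, the standard identity is recorded below in a common normalisation. The signs and factors of 6 are a conventional choice and are not the scaling used in this paper. The identity shows that every mKdV solution v generates a KdV solution u, whereas recovering v from a given u requires solving a Riccati equation, which is why the map is not a bijection.

```latex
\begin{align*}
% the two equations in a standard normalisation (assumed here for illustration)
&\text{mKdV:}\quad v_t - 6v^{2}v_x + v_{xxx} = 0, \qquad
 \text{KdV:}\quad u_t - 6u\,u_x + u_{xxx} = 0, \\
% Miura transform and the resulting factorisation
&u = v^{2} + v_x \;\Longrightarrow\;
 u_t - 6u\,u_x + u_{xxx}
 = \bigl(2v + \partial_x\bigr)\bigl(v_t - 6v^{2}v_x + v_{xxx}\bigr).
\end{align*}
```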
Genuine nonlinearity plays an important role within the study of nonlinear waves across physics, although it is not always explicitly identified. Similarly, its loss and the connection of this loss to the appearance of the mKdV within such systems remain widely unacknowledged. As part of the novelty of this work, it will be demonstrated how a loss of genuine nonlinearity can be identified in systems of physical interest, how the paper's theory can be used to construct the relevant mKdV in such scenarios and what the consequences of this might be for the original wave. In particular, we focus on a higher order Nonlinear Schrödinger model utilised in the study of optical systems, which turns out to also provide information regarding the evolution of Stokes waves, in addition to a stratified shallow water system representing active experiments into internal solitary waves. This provides a template for how the theory of this paper may be utilised to understand the dynamics of nonlinear waves in situations where genuine nonlinearity is lost.
The paper proceeds in the following way. In §2, the abstract theory necessary to undertake the phase dynamical approach is outlined and discussed. This includes a discussion of the wavetrain, the linearisation about it and the notion of genuine nonlinearity in the context of the WMEs. This is utilised in §3 to prove theorem 1.1 by constructing the relevant ansatz and undertaking a phase dynamical analysis. With the mKdV derived, we apply the theory to two examples in §4. The first is for a single-phase wavetrain arising within an optical wave system, and the second appeals to an experimental set-up used to study internal waves in stratified fluids. Concluding remarks appear in §5.

Abstract Set-up and Linearisation Properties
The starting point for the abstract theory discussed in this section is the multisymplectic form of the Lagrangian; for skew symmetric matrices M, J and Hamiltonian function S. The procedure to transform a given Lagrangian into its respective multisymplectic form is a standard sequence of Legendre transforms [8]. The motivation for using this form is to provide a clear connection between the modulation analysis and the conservation laws, which enter through the matrices M and J. The associated Euler-Lagrange equations in the multisymplectic formalism are then the variations of this Lagrangian: (2.1) Throughout, the notation D will refer to the directional derivative and the subscript, where present, will signify the argument being differentiated. The theory of this paper proceeds under the assumption that the Euler-Lagrange equations (2.1) have an N-phase wavetrain solution, where N is a natural number, so explicitly we write this as This wavetrain solution to (2.1) satisfies the PDE $\sum_{j=1}^{N}\cdots$ Moreover, one may evaluate the Lagrangian along this wavetrain, Differentiating this with respect to the wavenumbers and frequencies leads to the wave action and wave action flux evaluated on the solution Z: It is useful for the later analysis to take note of their derivatives with respect to wavenumber and frequency: as these matrices will arise within the definition of the characteristics, as well as being a central feature of the phase dynamical analysis. It is from these definitions for the conservation law components that we are able to discuss characteristics, which are the fundamental construct of the majority of this paper. The WMEs for an N-phase wavetrain can be written as $K_T - \Omega_X = 0$, $A(K, \Omega)_T + B(K, \Omega)_X = 0$ (2.4), for local vector-valued wavenumber K(X, T) and local frequency Ω(X, T).
The characteristics for this system about a fixed wavenumber and frequency k, ω can be found using the normal mode approach $(K, \Omega) = (k, \omega) + \delta(\hat{K}, \hat{\Omega})\exp(i(X - cT))$, which to order δ gives the quadratic matrix pencil The roots of its determinant, define the characteristics for the system. For these choices of c, we can define the eigenvector ζ satisfying E(c)ζ = 0 (2.7). Throughout the paper, we will assume that the characteristic chosen is simple, so that the kernel of E(c) is one dimensional and no other eigenvectors need be considered. As a consequence, it means that where the prime denotes differentiation with respect to c, tr denotes the trace of the matrix, T the matrix transpose and adj denotes the matrix adjugate. Interestingly, it is the final expression above that emerges as the coefficient of the time term in the mKdV, emphasising why the assumption on the characteristic's simplicity is necessary for its derivation.
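The normal mode step described above can be sketched explicitly. The calculation below assumes the WMEs take the form of (1.1) written for (K, Ω), uses the normal mode exp(i(X − cT)) exactly as stated in the text, and writes the amplitudes with hats (a notational assumption). The sign of the middle term depends on the sign convention chosen for c, so this is an illustration of the mechanism rather than a statement of the paper's equation for the pencil.

```latex
\begin{align*}
% conservation of waves at order \delta
K_T - \Omega_X = 0
 \;&\Longrightarrow\; -\mathrm{i}c\,\hat K - \mathrm{i}\,\hat\Omega = 0
 \;\Longrightarrow\; \hat\Omega = -c\,\hat K, \\
% conservation of wave action at order \delta, derivatives evaluated at (k, \omega)
A_T + B_X = 0
 \;&\Longrightarrow\;
 -c\bigl(\mathrm{D}_k A\,\hat K + \mathrm{D}_\omega A\,\hat\Omega\bigr)
 + \bigl(\mathrm{D}_k B\,\hat K + \mathrm{D}_\omega B\,\hat\Omega\bigr) = 0, \\
% eliminate \hat\Omega to obtain a pencil that is quadratic in c
&\Longrightarrow\;
 \Bigl[\mathrm{D}_k B - c\bigl(\mathrm{D}_k A + \mathrm{D}_\omega B\bigr)
 + c^{2}\,\mathrm{D}_\omega A\Bigr]\hat K = 0 .
\end{align*}
```

A nontrivial amplitude therefore requires the determinant of the bracketed matrix to vanish, which is a polynomial of degree at most 2N in c and so yields the "up to 2N characteristics" mentioned in the introduction.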

Linearisation Properties and Fourier-Bloch Analysis
For the analysis leading to the modified KdV we must consider the linearisation of the system (2.2) and its properties. Therefore, define the linear operator We can show, by taking $\theta_j$ derivatives of (2.2), that so that it is clear that $Z_{\theta_j}$ lies in the kernel of L. We make the assumption that the kernel of L is no larger than the span of these elements, so that $\ker(L) = \mathrm{span}\{Z_{\theta_j} : j \in \{1, \ldots, N\}\}$. As such, the condition that a given expression lies in the range of L can then be formulated as where ⟨·, ·⟩ is a suitable inner product for the problem. For multiply 2π-periodic waves, the natural choice is the averaging inner product over each phase: It is also necessary for the analysis leading to the modified KdV equation to consider derivatives of (2.2) with respect to the wavenumber and frequency. Doing so gives which may be combined into the single expression where c is a constant to be determined shortly. This suggests a Jordan chain structure is present, as is discussed in Bridges and Ratliff [9]. The details are briefly recounted here, and for further details the reader is referred to that work. Two chains emerge, one of length four and one of length two, and it is the former we are concerned with. It takes the form with the first two elements, for constants $\zeta_j$, following from (2.8) and (2.10) respectively. The third element, defined by may be found providing the right hand side is in the range of L. Assessing this using (2.9) leads to the vector system This is satisfied providing that c is a characteristic of the Whitham modulation equations associated with the wavetrain Z and the vector ζ is the eigenvector associated with the zero eigenvalue of E defined in (2.7). The zero eigenvalue of L is of even multiplicity, and so the existence of $v_3$ automatically guarantees the existence of $v_4$ with The length of the chain is precisely four when the right hand side of the expression for the next element of the chain, does not lie in the range of L.
Within this paper, we make this assumption, as in practice it is the most generic case, only failing in special cases where dispersion is sufficiently weak. By (2.9), this is precisely when Surprisingly, the termination of this Jordan chain may be related to the linear dispersion relation obtained about the solution Z. The details of this connection are a novel aspect of this paper and will be discussed below, and the result is simply that where σ(ν) is the linear dispersion relation about the solution Z. This will go on to form the coefficient of the dispersive term in the mKdV derived in this paper. In this light the connection between this coefficient and the linear dispersion relation is natural, as the linear dispersion relation for the resulting mKdV equation must match the long wave expansion of the linear dispersion relation of the system from which it is derived. The starting point to establish this connection is to introduce the Bloch ansatz where σ(ν) denotes a continuous set of eigenvalue curves which may be indexed [21]; however, we will restrict ourselves to a single one of these, as will be made clear shortly. This Bloch form suggests the definition of the Bloch operator which we assume has Z as a kernel element, so that The adjoint operator of this under the averaging inner product is simply its complex conjugate. When ν is taken to be zero, we have the eigenvalue problem Thus, σ(0) needs to be an eigenvalue of the original linear operator. We choose this, quite naturally, to be the zero eigenvalue so that the discussion corresponds to the linear theory about Z. In doing so, it becomes clear that σ(ν) is the linear dispersion relation about Z and we can write Z(θ, 0) as a linear combination of the kernel of L, This can be ensured by assuming that L only has a simple zero eigenvalue. Much of the discussion revolves around taking ν derivatives of the Bloch linearisation and evaluating these at ν = 0.
This essentially amounts to considering a long wave expansion of σ, and we will show that the Jordan chain structure discussed above arises naturally from doing this. Thus, differentiate (2.13) with respect to ν four times and set ν = 0: The first equation, once (2.14) is used, resembles the twisted Jordan chain result (2.10). It follows that for real constants $\alpha_j$. Using this in (2.15b) and assessing solvability gives that and thus we must have $\sigma'(0) = c$ and α ∝ ζ. This is not unexpected, as the linear dispersion relation must admit the linear long-wave speed as ν → 0, which is equivalent to the characteristics of the WMEs. Without loss of generality, make the above proportionality an equality for simplicity, and thus from (2.11), Using the results obtained thus far in (2.15c) and appealing to solvability, it can be seen that the first term vanishes due to the even multiplicity of the zero eigenvalue of L and the final term also vanishes. This leaves only The inner product in fact leads to the system As the characteristic c is assumed simple, this is only true if $\sigma''(0) = 0$. This is expected, as the WMEs for the wave Z are hyperbolic, and as such it is (modulationally) stable and the dispersion relation σ should therefore be real. Overall, we therefore have Now, in order for the right hand side of (2.15d) to lie in the range of L we require from all previous results that which is equivalent to the vector system A projection through a left multiplication by $-\zeta^T$ gives the scalar quantity appearing as the coefficient of the dispersion term in theorem 1.1: This is to say that the dispersion relation of the long wave model is consistent with the long wave expansion of the original system's dispersion relation (recalling that $\zeta^T E'(c)\zeta$ is the coefficient of the time derivative term in the mKdV equation outlined in theorem 1.1), as one expects.
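The consistency statement above can be illustrated with a short linear calculation. The sketch assumes the reduced equation has the schematic form α U_T + β U² U_X + γ U_XXX = 0 (α, β, γ as labelled in the introduction), linearises it about U = 0, and uses a Fourier mode with hypothetical frequency label Ω. It also assumes σ(0) = 0 and a vanishing second derivative of σ at ν = 0, as argued in the text. Signs depend on the conventions chosen for the phase and for c, so only the structure of the matching is meant to be taken from this.

```latex
\begin{align*}
% linearised reduced equation and its Fourier mode
\alpha\,U_T + \gamma\,U_{XXX} = 0, \quad
U \propto \mathrm{e}^{\mathrm{i}(\nu X - \Omega T)}
 \;&\Longrightarrow\;
 -\mathrm{i}\,\Omega\,\alpha + \gamma\,(\mathrm{i}\nu)^{3} = 0
 \;\Longrightarrow\;
 \Omega = -\frac{\gamma}{\alpha}\,\nu^{3}, \\
% long wave expansion of the full dispersion relation, with the O(\nu) term removed by the moving frame
\sigma(\nu) - \sigma'(0)\,\nu
 &= \tfrac{1}{6}\,\sigma'''(0)\,\nu^{3} + O(\nu^{4}) .
\end{align*}
```

Matching the cubic terms suggests that the ratio γ/α is fixed by σ'''(0)/6 up to a sign set by convention, which is the sense in which the dispersive coefficient is determined by the linear dispersion relation.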

Genuine Nonlinearity and its Role in Phase Modulation
One of the key results of this paper is to connect the notion of linear degeneracy in the Whitham modulation equations associated with a wavetrain and the emergence of the mKdV equation. In order to do so, we recount the notion of genuine nonlinearity in hyperbolic wave equations. Given such an equation of the form where u represents a state vector, then we may identify its characteristics c(u) by the standard relation for the set of right eigenvectors $R_c(u)$. From these notions, we may state the definition of genuine nonlinearity as follows: Definition 2.1. We say the evolution associated with the characteristic speed c is genuinely nonlinear, as defined by Lax [35], if for this speed we have that where · denotes the standard inner product on vectors. If a characteristic fails this criterion and instead then it is said to be linearly degenerate.
In the linearly degenerate regime, neither rarefactions nor shocks form and instead contact discontinuities are operational. However, the subsequent discussion requires a local definition of these properties, as the phase dynamics considers expansions of the system about a given basic state. To this end, we define the notion of local genuine nonlinearity as follows: Definition 2.2. Suppose one considers the evolution associated with the characteristic speed c close to some state point $u_0$, then we say that the system is locally genuinely nonlinear whenever Whenever the converse is true, we say that the evolution is locally linearly degenerate.
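A scalar illustration may help fix ideas. For a single conservation law u_t + f(u)_x = 0 the characteristic speed is c(u) = f'(u) and the right eigenvector is trivially 1, so Lax's condition reduces to f''(u) ≠ 0. The three flux functions below are illustrative choices and are not taken from the paper; a is an arbitrary constant.

```latex
\begin{align*}
% Burgers-type flux: genuinely nonlinear at every state
f(u) = \tfrac12 u^{2}:&\quad c(u) = u, \quad c'(u) = 1 \neq 0, \\
% cubic flux: genuine nonlinearity is lost only at the single state u_0 = 0
f(u) = \tfrac13 u^{3}:&\quad c(u) = u^{2}, \quad c'(u) = 2u, \quad c'(0) = 0, \\
% linear flux: linearly degenerate at every state
f(u) = a\,u:&\quad c(u) = a, \quad c'(u) \equiv 0 .
\end{align*}
```

The cubic flux is the prototype of the local situation considered here: the degeneracy occurs at an isolated state rather than identically, and the associated flux is the one carried by the nonlinear term of the mKdV.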
With the above notions, we will show that the modified KdV equation is a result of the phase dynamics having a local linear degeneracy along one of its characteristic fields.
In the context of the WMEs (2.4), we have that the characteristics satisfy (2.6) and the system gives the set of right eigenvectors In order to construct the expression to assess genuine nonlinearity, we use the Jacobi formula for the differentiation of a determinant to find the $k_i$ derivative of c from (2.6): Similarly, we also have that Then, the relevant condition to determine genuine nonlinearity It should be emphasised that in the above, the derivatives $D_k$, $D_\omega$ on the right hand side do not operate on the c terms in E. To simplify this, we note that as E is singular with a simple zero eigenvalue (as c is assumed to be simple) and symmetric, we may write its adjugate as where μ(E(c)) is the product of the remaining nonzero eigenvalues of E(c). From this, we may then show that The term in the square bracket is exactly the coefficient of the quadratic nonlinearity obtained via phase modulation in both the KdV [46] and the two-way Boussinesq [9] equations in the multiphase modulation theory (noting the different sign conventions of the speed c), since we can note that and so the multiphase Whitham modulation equations lose genuine nonlinearity precisely when Thus, a loss of genuine nonlinearity via a local linear degeneracy implies the loss of the quadratic term in the KdV equation and suggests the emergence of the mKdV, since this contains the cubic nonlinearity one expects to emerge. All that remains is to demonstrate that the phase modulation will lead to the mKdV in the linearly degenerate case, and this will be undertaken in the following section.
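The two linear-algebra facts invoked above can be combined in a few lines. The sketch below assumes only that det E(c; k, ω) = 0 defines c implicitly, that E is symmetric with a simple zero eigenvalue and eigenvector ζ, and that μ denotes the product of the nonzero eigenvalues of E, as in the text. The symbol ∂_{k_i}E denotes the derivative at fixed c and is notation introduced here for illustration.

```latex
\begin{align*}
% Jacobi formula applied to \det E(c(k,\omega); k, \omega) = 0
0 = \partial_{k_i}\det E
 &= \mathrm{tr}\Bigl(\mathrm{adj}(E)\bigl[\partial_{k_i}E + E'(c)\,c_{k_i}\bigr]\Bigr)
 \;\Longrightarrow\;
 c_{k_i} = -\frac{\mathrm{tr}\bigl(\mathrm{adj}(E)\,\partial_{k_i}E\bigr)}
 {\mathrm{tr}\bigl(\mathrm{adj}(E)\,E'(c)\bigr)}, \\
% adjugate of a symmetric matrix with a simple zero eigenvalue
\mathrm{adj}(E) &= \mu\,\frac{\zeta\zeta^{T}}{\zeta^{T}\zeta}, \qquad
 \mathrm{tr}\bigl(\zeta\zeta^{T}M\bigr) = \zeta^{T}M\zeta
 \quad\text{for any square matrix } M, \\
% the factor \mu / (\zeta^T \zeta) cancels in the ratio
&\Longrightarrow\;
 c_{k_i} = -\frac{\zeta^{T}\bigl(\partial_{k_i}E\bigr)\zeta}{\zeta^{T}E'(c)\,\zeta} .
\end{align*}
```

The same steps apply to the ω derivatives, and the denominator is the quantity that must be nonzero for a simple characteristic.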

Phase Dynamical Reduction to the Modified KdV Equation
With the abstract framework necessary to undertake the modulation approach outlined, all that remains is to undertake the calculation to demonstrate the emergence of the mKdV. This is to say that we are to prove theorem 1.1 and derive the weakly nonlinear dispersive model that arises in this scenario. The methodology to obtain the modified KdV in this way is to utilise the ansatz where X = εx, T = ε³t and ε ≪ 1. The phase modulation function Φ is defined as the summation of three vector-valued functions, Φ = φ(X, T) + εψ(X, T) + ε²α(X, T), and the wavenumber and frequency modulation functions q, Ω are defined as $\Phi_X$, $\Phi_T$ respectively. The presence of the q term in the frequency modulation is necessary to ensure the phase consistency condition $(\theta + \Phi(X, T))_{xt} = (\theta + \Phi(X, T))_{tx}$ in light of the moving frame. For convenience in the analysis, we also expand the remainder/correction term W in a simple asymptotic series, in order to make the role of W in the analysis clearer.
The approach is to substitute the ansatz (3.1) into the multisymplectic Euler-Lagrange equations (2.1), Taylor expand Z about ε = 0 and solve the resulting system of equations order by order. By using this ansatz and the multisymplectic formalism, the connection between the solvability conditions which arise from the analysis and the conservation laws for the system becomes clear and leads to the universal form of the equation.
This section presents a summary of the order-by-order analysis. There is significant overlap with previous phase dynamical analyses in this area, so for brevity the overlapping steps are only sketched and the reader is referred to these other works for complete details. The most relevant of these are some recent work on modulation in the moving frame for multiple phases [9] and a derivation of the modified KdV for two phases in the laboratory frame [45], and it is from these that the subsequent work draws most heavily.
The leading order is satisfied as Z solves (2.1), and the first order in ε is satisfied by appealing to (2.10) and the definition of q. The terms at order ε², when simplified, give the system By applying the solvability condition (2.9) to determine whether the right hand side is in the range of L, one generates the vector system $E(c)q_X = 0$.
Thus, for the problem to be solvable at this order, c must be a characteristic of the Whitham modulation equations and q = ζU(X, T). This speed is real providing the Whitham modulation equations are hyperbolic, as assumed within theorem 1.1. The solution for $W_0$ then reads where $v_3$ is the third element of the Jordan chain outlined in §2.1.
It is at this point that the analysis of this paper diverges from the existing works. The Euler-Lagrange equation at order ε³, once simplified, reads Appealing to the solvability of this equation, and using manipulations almost identical to previous work [10], leads to the vector system This vector system may be solved exactly when the linear degeneracy condition (2.19) holds, as assumed in the premise of theorem 1.1. In such cases, this imposes that The result of the analysis at this order is that $W_1$ is given by The final order of the analysis considered is ε⁴, at which the mKdV equation emerges.
Simplifying the Euler-Lagrange equation at this order and using (3.2) results in (3.3) The tilde above $W_2$ denotes the fact that all terms at this order which lie in the range of L have been absorbed into it. The exact form of these terms does not matter as the analysis terminates at this order; however, any analysis at higher orders of ε would require them. All that remains is to determine the condition for the remaining terms on the right hand side to also lie in the range of L, which results in the mKdV equation sought. Its coefficients are generated by appealing to the solvability of the above, and this is what we now do. Firstly, applying the solvability condition (2.9) to the term involving $U_T$ gives The terms involving $\alpha_i$ simply result in −E(c)α, as can be seen by its similarity to the expression arising at second order. Next, we do the same to the term involving $U_{XXX}$, which by the discussion of §2.1 gives where σ is the linear dispersion relation about the solution Z. The final term to be considered is the cubic nonlinearity, which requires a significant level of manipulation along the lines of previous works [9,45]. Due to its complexity, the details of these are presented in appendix A and we simply state here that the solvability condition applied to these terms generates Combining all of these results, the right hand side of (3.3) lies in the range of L providing that The role of α in the analysis is now clear: without it, we would have N equations for a single unknown, U, which would impose U = 0 and thus no modulation would take place. To remove it from the analysis, and so to find the scalar equation U must satisfy, we multiply the above by $\zeta^T$ on the left to project the system in the direction of the kernel of E. The result of this is the modified KdV equation The above equation is valid so long as none of the coefficients are zero.
We ensure α ≠ 0 under the assumption in theorem 1.1 that the characteristic is of multiplicity one and thus simple, as α = 0 is exactly the condition of coalescing characteristics [9]. Similarly, γ ≠ 0 whenever the long wave expansion of the linear dispersion relation satisfies $\sigma'''(0) \neq 0$, which is generic for dispersive waves and is one of the hypotheses of the theorem. Currently, there is no generic theory to discern in what cases the coefficient of the cubic term vanishes, but it is assumed not to within the theorem. Therefore, the modified KdV equation emerges as the asymptotically valid reduction of (2.1) whenever the Whitham modulation equations are hyperbolic and they are locally linearly degenerate for the chosen characteristic c.
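The projection step used to eliminate α can be sketched in general terms. In the display below, the vector a stands for the α-dependent unknown and r(U) collects the remaining terms of the solvability condition; both names are placeholders introduced here and do not appear in the paper. The only facts used are E(c)ζ = 0 and the symmetry of E(c), both stated earlier in the text.

```latex
\begin{align*}
% schematic vector solvability condition at the final order
&E(c)\,\mathbf{a} + \mathbf{r}(U) = 0, \qquad
 E(c)\,\zeta = 0, \qquad E(c)^{T} = E(c), \\
% left multiplication by \zeta^T annihilates the term containing the unknown vector
&\zeta^{T}E(c)\,\mathbf{a} = \bigl(E(c)\,\zeta\bigr)^{T}\mathbf{a} = 0
 \;\Longrightarrow\;
 \zeta^{T}\mathbf{r}(U) = 0 .
\end{align*}
```

The scalar condition on the right is the single equation for U, while the remaining N − 1 independent components of the vector system determine a up to a multiple of ζ.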

Weak Linear Degeneracy and the Gardner Equation
The derivation of the mKdV (3.4) relies on the assumption that there is a local linear degeneracy, which for the WMEs is equivalent to (2.19). [...] is instead the modulation equation which arises. The key reason the analysis is able to proceed in this case is that the linear system (3.2) is still solvable in a weak sense, namely that it is solved to leading order with an error of order $\varepsilon$. The projection of this error generates the above quadratic nonlinearity.

Examples of the Theory's Application
With the abstract result confirmed, we now demonstrate how it may be applied to problems of interest. Namely, we show how nonlinear dispersive models can be constructed using the above result, thus avoiding the need for any further asymptotic analysis. The first is a single-phase example to concisely pin down the steps one can take to reach the modified KdV and how one identifies the local linear degeneracy condition required for it to be valid. This will be done for a higher order Nonlinear Schrödinger model, which arises in both optical and oceanic settings. This is then used as a basis to show that the insight gleaned from this example is also applicable to the analysis of Stokes waves [53], which are prevalent across many nonlinear wave systems including water waves [38,52] and viscous fluid conduits [37,28]. The second, motivated by recent experiments [6,13,14,19], considers a multilayered shallow water system to demonstrate how a multiphase relative equilibrium may be treated and how the resulting dispersive reduction can be used to explain the experimental observations.

Application to Higher Order NLS
An illuminating example with a single-phase wavetrain is the higher order Nonlinear Schrödinger (NLS) equation (4.1). The real constants $\alpha_n$ are related to the dispersion relation $\omega_0(\kappa)$ of the system from which it is derived, namely $\alpha_n = \frac{1}{(n+1)!}\frac{d^{n+1}\omega_0}{d\kappa^{n+1}}$, and $\beta$ relates to the nonlinear correction to it. Conforti et al. [17,16,18,40] consider this and related equations to describe the evolution of resonances within optical fibres, but the above NLS equation (sometimes with further nonlinearities) also appears within the study of oceanic Stokes waves [1,51].
The simplest illustration of the theory of this paper is to investigate the phase dynamics of the genus 0 solution, which is the plane wave (4.2). The conservation law components for this system evaluated on this solution are given by (4.3), which can be used to find the characteristics (4.4). The eigenvector $\zeta$ associated with each characteristic in this case is simply unity, since there is only a single phase present. Hyperbolicity requires that $\beta(\alpha_1 + 3\alpha_2 k) < 0$, which is a higher order dispersive correction to the typical Benjamin-Feir-Lighthill condition $\alpha_1\beta < 0$, recovered for the Stokes wave case $k = 0$. This confirms the secondary effect of $\alpha_2$ on stability; however, as we will shortly see, it has a fundamental role in the characteristics undergoing a local linear degeneracy.
To determine the conditions for the local linear degeneracy, we compute the relevant derivatives for the nonlinear term. The resulting linear degeneracy condition (4.5) is satisfied when $\alpha_2^2|A_0|^2 + 2\beta^{-1}(\alpha_1 + 3\alpha_2 k)^3 = 0$, and requires the speed whose root is the same sign as $\alpha_2\beta$ to be chosen, meaning the mKdV equation may only arise for one of the speeds. It is now clear, even in the Stokes wave case of $k = 0$, that $\alpha_2$ is required to be nonzero for a local linear degeneracy, highlighting the role of the higher order dispersion in this transition. The vector $\kappa$ does not need to be considered for this single phase example, where it is in fact zero. Now that the linear degeneracy condition (4.5) has been identified, all that remains is to compute the coefficients. The coefficient of the cubic nonlinearity follows from the necessary derivatives of the conservation law components, and the time derivative coefficient is simple to obtain. Finally, we compute the linear dispersion relation about this wave by using either a Madelung transform (see [12] and references therein) or a Stuart-DiPrima-like analysis [49]. The long wave expansion of this is readily computed and simplified using the condition (4.5). Therefore, the modified KdV equation one obtains can be simplified to (4.6). There is much about the dynamics of the original wave that may be inferred from the mKdV (4.6), which one can readily see is the defocussing mKdV. This implies that all of its periodic solutions are stable [11], and these manifest themselves as undulations in the amplitude of the original wave (4.2), suggesting the weak formation of wavepackets. Solitary wave solutions exist for the defocussing mKdV so long as these have a non-zero background, and these correspond to bright and dark solitary waves forming from (4.2).
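Returning briefly to the degeneracy condition (4.5): as a consistency check, sketched here for illustration and using only the symbols already defined, it can be solved for the squared amplitude. Assuming $\alpha_2 \neq 0$ and $\alpha_1 + 3\alpha_2 k \neq 0$:

```latex
\begin{align*}
  \alpha_2^2 |A_0|^2 + 2\beta^{-1}(\alpha_1 + 3\alpha_2 k)^3 = 0
  \quad\Longrightarrow\quad
  |A_0|^2
  &= -\frac{2(\alpha_1 + 3\alpha_2 k)^3}{\alpha_2^2\,\beta} \\
  &= -\,\beta(\alpha_1 + 3\alpha_2 k)\,
     \frac{2(\alpha_1 + 3\alpha_2 k)^2}{\alpha_2^2\,\beta^2} .
\end{align*}
% Stokes wave case k = 0:
\begin{equation*}
  |A_0|^2 = -\frac{2\alpha_1^3}{\alpha_2^2\,\beta} .
\end{equation*}
```

The final fraction in the first display is positive, so the right-hand side is a positive number precisely when $\beta(\alpha_1 + 3\alpha_2 k) < 0$. In other words, an amplitude satisfying the degeneracy condition exists exactly in the hyperbolic regime identified earlier, and for $k = 0$ this reduces to the Benjamin-Feir-Lighthill condition $\alpha_1\beta < 0$.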
A family of front solutions, however, does exist in the defocussing mKdV, and these correspond to smooth shocks in the amplitude of the original wave. A full study of these solution families and their effect on the original wave is outside the remit of this paper, but such inferences already demonstrate the level of insight that the theory of this paper can afford regarding the evolution of waves such as (4.2) in locally linearly degenerate circumstances.
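To illustrate how such fronts arise, the following sketch works with a generic mKdV whose coefficients $a_0$, $a_1$, $a_2$ are placeholder names introduced here for illustration; they are not the specific coefficients of (4.6). Substituting a tanh profile with amplitude $A$, inverse width $B$ and speed $V$ fixes $B$ and $V$ in terms of $A$:

```latex
\begin{align*}
  & a_0 U_T + a_1 U^2 U_X + a_2 U_{XXX} = 0 ,
  \qquad U = A\tanh\xi , \quad \xi = B(X - VT) , \\[4pt]
  & U_T = -VAB\,\mathrm{sech}^2\xi , \qquad
    U_X = AB\,\mathrm{sech}^2\xi , \qquad
    U_{XXX} = -2AB^3\,\mathrm{sech}^2\xi\,\bigl(1 - 3\tanh^2\xi\bigr) , \\[4pt]
  & AB\,\mathrm{sech}^2\xi\,\Bigl[\, -a_0 V - 2a_2 B^2
      + \bigl(a_1 A^2 + 6 a_2 B^2\bigr)\tanh^2\xi \,\Bigr] = 0 , \\[4pt]
  & \Longrightarrow\quad
    B^2 = -\frac{a_1 A^2}{6 a_2} , \qquad
    V = -\frac{2 a_2 B^2}{a_0} = \frac{a_1 A^2}{3 a_0} .
\end{align*}
```

A real width $B$ requires $a_1 a_2 < 0$, which is the defocussing sign, consistent with fronts being available in the defocussing case. For the canonical choice $a_0 = 1$, $a_1 = -6$, $a_2 = 1$ this gives $B^2 = A^2$ and $V = -2A^2$.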

Connection to Stokes Wave Analyses
The higher order NLS equation (4.1) provides an informative example in which the computations may be done exactly; however, it also provides insight into the phase dynamics of Stokes waves. We illustrate how this can be done below, demonstrating how the analysis of the above example mirrors that of weakly nonlinear waves which retain full dispersive information.
Stokes wave solutions are a weakly nonlinear correction to linear waves, leading to corrections to both the wave's amplitude and frequency [53]. In doing so, one induces an effective Lagrangian of the form (4.7), where $a$ is the wave amplitude, which is assumed to be small, $\Omega(k, \omega)$ denotes the linear dispersion relation of the governing equations and $\Gamma$ represents the nonlinear correction to the linear wave's Lagrangian. To the order of the analysis described here, $\Gamma$ can be treated as constant, but for more detailed analyses it will vary with the wavenumber $k$. One should note that the effects of mean flow have been neglected in the above Lagrangian in order to retain parallels to the previous example, but such effects can be important to the evolution of the Stokes wave. An in-depth discussion of these from the perspective of this paper's approach is reserved for future study.
Variations of the Lagrangian (4.7) with respect to the wave parameters $a$, $k$ and $\omega$ yield expressions for the amplitude as well as the conservation laws we will need to investigate the phase dynamics. Firstly, variations with respect to $a$ lead to a relation which connects the wave parameters to one another. This variation also allows one to connect $\Gamma$ to the nonlinear frequency correction for the Stokes wave, $\omega_2$, for right-moving near-linear waves [53]. To do so, Taylor expand the dispersion relation about $\omega = -\omega_0(k)$, where $\omega_0$ is the right-moving root of the linear dispersion relation $\Omega$; a comparison to the literature then gives $\omega_2 = \Gamma\,\Omega_\omega(k, -\omega_0)^{-1}$. Considering the variations of (4.7) with respect to $k$ and $\omega$ gives the conservation law components. Notice that these recover exactly (4.3) for the case $\Omega = \omega + \alpha_1 k^2 + \alpha_2 k^3$, which is expected given the relation between NLS models and Stokes waves.
With the conservation laws determined, we now seek to obtain the characteristics for the case of right-moving near-linear waves, meaning that throughout we will be evaluating the conservation law derivatives at $\omega = -\omega_0$. This leads to useful expressions for the derivatives of $\Omega$, with the prime denoting derivatives with respect to $k$. With this in mind, we find that the characteristics satisfy $(\Omega_{\omega\omega} a^2 + \Omega_\omega^2 \Gamma^{-1})\,c^2 + 2(\Omega_{\omega k} a^2 + \Omega_\omega \Omega_k \Gamma^{-1})\,c + \Omega_{kk} a^2 + \Omega_k^2 \Gamma^{-1} = 0$.
In the small amplitude limit, one is able to find the characteristics in the form $c = c_0 + c_2 a + O(a^2)$, which results in the classical nonlinear splitting of the group velocity. This is of exactly the same form as (4.4) with $\omega_0 = \alpha_1 k^2 + \alpha_2 k^3$ and $\beta = -\omega_2$. For hyperbolicity we require the classical Benjamin-Feir-Lighthill condition $\omega_0''\omega_2 > 0$. We now assess the condition for a local linear degeneracy, which, once the necessary derivatives of the conservation laws are computed, yields a condition identical to (4.5), up to a scaling factor, when the previously mentioned choices for $\Omega$, $\omega_0$ and $\beta$ are made. As such, rearranging it gives the identical condition for a local linear degeneracy, (4.8). Thus, the plane wave analysis of the higher order NLS model mirrors that for Stokes waves, and such models serve as a good basis to understand Stokes waves in a formulation where the calculations are exact. It also reinforces that the properties of the linear dispersion relation play a substantial role in the nonlinear phase dynamics of the Stokes waves, as initially identified in the previous example.
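The small-amplitude expansion of the quadratic for the characteristics can be sketched as follows. This is an illustrative calculation based only on the quadratic as displayed above; the shorthand $Q(c)$ is introduced here and is not part of the original notation. Multiplying the quadratic by $\Gamma$ and substituting $c = c_0 + c_2 a + O(a^2)$ gives:

```latex
\begin{align*}
  & (\Omega_\omega c + \Omega_k)^2 + \Gamma a^2\, Q(c) = 0 ,
  \qquad Q(c) := \Omega_{\omega\omega} c^2 + 2\Omega_{\omega k} c + \Omega_{kk} , \\[4pt]
  & O(1): \quad (\Omega_\omega c_0 + \Omega_k)^2 = 0
    \quad\Longrightarrow\quad c_0 = -\frac{\Omega_k}{\Omega_\omega} , \\[4pt]
  & O(a^2): \quad \Omega_\omega^2 c_2^2 + \Gamma\, Q(c_0) = 0
    \quad\Longrightarrow\quad c_2^2 = -\frac{\Gamma\, Q(c_0)}{\Omega_\omega^2} .
\end{align*}
```

The $O(a)$ contribution vanishes identically because $\Omega_\omega c_0 + \Omega_k = 0$. The two roots $c_0 \pm c_2 a$ are therefore real, and the system hyperbolic at this order, when $\Gamma\, Q(c_0) < 0$. Rewriting $c_0$ and $Q(c_0)$ in terms of $\omega_0$ and its derivatives uses the expressions for the derivatives of $\Omega$ evaluated at $\omega = -\omega_0$.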
With the linear degeneracy condition identified, all that remains is to compute the relevant mKdV equation for the Stokes wave in the case where (4.8) holds. We start by computing the coefficient of the time derivative term. For the dispersive term, we require the linear dispersion relation about the Stokes wave solution (as opposed to the linear dispersion relation for the original system, $\Omega$), which we obtain by using the higher order NLS equation (4.1) as in the previous example, as is the typical approach [1,51]. This means the dispersion relation is identical to the previous example, and thus so is the resulting dispersive coefficient. Finally, one must compute the coefficient of the cubic nonlinearity, simplified using the condition (4.8). This is similar to the coefficient obtained for the higher order NLS in the previous example; however, there is now a higher order derivative of $\omega_0$ present. It is likely that this additional term would be recovered if further spatial derivatives were included in (4.1). Combining these results gives the mKdV operational for Stokes waves when (4.8) holds. This extension of the previous example shows that NLS-type models can yield much of the relevant information one needs to assess the evolution of Stokes waves, with the key difference resulting from the level of dispersive information the Stokes waves inherently possess. This can be remedied simply by adding further derivative terms to the NLS model used, as mentioned prior. However, unlike the mKdV (4.6), the additional term within the nonlinear coefficient for the Stokes wave case demonstrates that the classification of the mKdV derived can change from focussing to defocussing. This transition depends on the dispersive nature of the system the waves originate from, so a definitive analysis of this mKdV and its effect on the original Stokes wave is not investigated here.
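The focussing or defocussing classification referred to above can be read off from the signs of the coefficients. As a sketch, write a generic mKdV with placeholder coefficients $a_0$, $a_1$, $a_2$ (names assumed here for illustration, all nonzero) and rescale time and amplitude:

```latex
\begin{align*}
  & a_0 U_T + a_1 U^2 U_X + a_2 U_{XXX} = 0 , \\[4pt]
  & \tau = \frac{a_2}{a_0}\, T , \qquad
    U = \sqrt{6\left|\frac{a_2}{a_1}\right|}\; V
  \quad\Longrightarrow\quad
    V_\tau + 6 s\, V^2 V_X + V_{XXX} = 0 , \qquad
    s = \operatorname{sgn}(a_1 a_2) .
\end{align*}
```

The case $s = +1$ is the focussing mKdV and $s = -1$ the defocussing one, so the classification changes exactly when the product of the cubic and dispersive coefficients changes sign. Under this reading, an additional term in the cubic coefficient that can alter its sign is what allows the transition described above.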

Application to Stratified Hydrodynamics
Another particularly illuminating application of the theory of this paper is to stratified fluids. The fact that a modulation-based approach would be operational in such a system is surprising, but it arises from the fact that a set of affine symmetries is present and so the uniform flow solution forms a relative equilibrium. This allows the theory to proceed as described within the paper. A benefit of investigating this system is that it makes the connection between the characteristics discussed in this paper and the linear long wave speeds widely discussed within the area (in fact, they are the same), and so highlights that these can be used as a diagnostic tool to determine the relevant dispersive dynamics.
Motivated by the recent experiments on three-layered flow [6,13,14,19], we will discuss the shallow water system arising from the Choi-Camassa equations [5] with linearised dispersion, (4.10), for layer thicknesses $h_i$, quiescent thicknesses $H_i$, layer density $\rho_i$ and fluid velocity in each layer $u_i$. The layers are labelled from top to bottom, so that the top-most layer is indexed by 1 and the lowest by 3, meaning that stable stratification requires $\rho_1 < \rho_2 < \rho_3$. The pressure term $P$ is chosen based on the configuration, namely whether the upper-most surface is free ($P$ is a constant) or whether there is a rigid lid, which is the case for the experimental set-up we wish to consider. This rigid lid imposes the constraint (4.11) on the thicknesses. One is then able to eliminate the lid pressure by subtracting one momentum equation in (4.10) from the others, and one of the thicknesses using (4.11). Doing so leads to the system of equations (4.12). This configuration is sketched in figure 1. The Lagrangian structure emerges once one imposes that the flow is irrotational and introduces the velocity potential in each fluid, $\varphi_j$, such that $u_j = (\varphi_j)_x$. This allows one to introduce the Lagrangian (4.13), which generates the potential version of (4.12) as its Euler-Lagrange equations.
The relative equilibrium we study in this example is precisely the uniform flow solution. The flow velocities are given by $U_i$ (taking the place of $k_i$ in the theory, to better fit with the literature) and $\omega_i$ represents the Bernoulli head of each flow. Substitution into the Euler-Lagrange equations generated by (4.13) determines the quiescent thicknesses in terms of these parameters.

Figure 1: A sketch of the three-layered fluid system under consideration with rigid lid.
The vector-valued conservation laws one extracts from this system come from the conservation of mass equations in the shallow water system. With the rigid lid constraint (4.11) and consideration of the static state $U_1 = U_2 = U_3 = 0$, the characteristics which emerge from the determinant condition (2.6) satisfy the biquadratic (4.14), which always has real roots [5]. This also gives the eigenvector $\zeta$. The quantity $\gamma$ represents the ratio between the deflections of the two free surfaces of the problem [5, eq. 2.16], and it is positive for mode-1 waves (associated with the two largest magnitude roots of (4.14)) and negative for mode-2 waves (the two lowest magnitude roots). Therefore, the faster speeds generate waves with the same polarity and the slower speeds admit waves of opposing polarities.
We are now in a position to assess the criterion for a loss of genuine nonlinearity, which gives a condition highlighted in Barros et al. [5] for a loss of the quadratic nonlinearity [...] for nondimensional wave amplitude $a$, wavespeed $c$ and coefficient $M$ determining the slope. For the Gardner equation $M = 6$, and Carr's data, although there are only two low amplitude data points, suggests $M = 6.79$ for such waves. Thus it would appear that the Gardner equation gives a good qualitative picture of the dynamics of the three-layered systems that are experimentally considered, and a full quantitative assessment of this descriptive ability is reserved for future study. This is not to say that the derived Gardner equation gives a complete picture of the dynamics, despite its successes and widespread use in internal wave modelling. This insight should only be applicable for sufficiently small amplitude structures in (4.10), where (4.18) is applicable, and therefore cannot explain why internal waves destabilise at larger amplitudes, although it will likely apply to the smaller solitary waves resulting from the fission process observed. It is also unable to explain the observed asymmetry which arises in these experiments. Additionally, it is not expected to fully characterise conjugate flow states correctly in all cases [34]. For a more comprehensive investigation of these situations, one should instead use strongly nonlinear models to improve accuracy and extend the validity of long wave models.

Concluding Remarks
In this paper, we have connected the notion of genuine nonlinearity to the weakly nonlinear dispersive behaviour of the system, namely that local linear degeneracy signals the emergence of the modified KdV equation. This allows one to use a linear quantity, the characteristic, as a diagnostic for the behaviour of the nonlinear system. Moreover, quantities available from the linear theory form its coefficients and so the mKdV may be constructed simply from a linear analysis of the original wavetrain. The nonlinear equation which results from the analysis can then be used primarily for qualitative insight into the system's evolution at such degeneracy points; however, there is some evidence that it may also yield quantitative insight. A consequence of the universality of the resulting phase dynamical equation is that one is able to characterise the dispersive dynamics through conditions imposed on the quantities $c$ and $\sigma$, which one readily obtains via linear analyses. As a result, starting from the WMEs one can systematically identify the most suitable dispersive long-wave model simply by assessing which of the relevant properties the characteristic $c$ and linear dispersion relation $\sigma$ satisfy. Combined with connections made in previous works [46,10], this essentially turns the phase modulation analysis into a flow chart-like process. This is visualised in figure 3, demonstrating the connections between each of these well-established equations and the conditions required on the respective linear quantity. Further, it suggests that when such conditions are combined, much more complex phase dynamics should be expected. For example, when dispersion is weak and a given characteristic is locally linearly degenerate, the appropriate phase dynamical equation should resemble an extended version of the KdV equation. Moreover, combining the linear degeneracy of this paper with a double characteristic signals the emergence of a modified version of the two-way Boussinesq equation [44].
There are several avenues for future work based on these results. One of the most natural directions is to use this approach to investigate other nonlinear wave systems to discern the insight the theory offers, such as a more in-depth analysis of the Stokes wave example for particular systems including the water wave problem [52]. Most importantly, waves and their mean flow have a highly nonlinear interplay in such systems, and so the ideas of this paper and preceding work will likely shed significant light on this coupling. Another prospective direction concerns the "infinite" phase limit of Whitham modulation theory. The discussion within this work has dealt with finitely many phases, so that the formulation of conservation laws, characteristics and solvability conditions involves only linear algebraic constructions. However, there are examples where the family of relative equilibria depends continuously on a variable, which in essence makes it "infinite" phased. Such cases arise in the study of continuously stratified water waves [22,23,24] and wave fields involving a whole spectrum of wavenumbers [42,54]. In such cases, we expect there to be a spectrum of characteristics $c$ in play. It is therefore not clear how the notion of local linear degeneracy will generalise in these contexts, but it should still lead to the mKdV emerging, as it has been shown to for internal waves.

A Calculation of the Cubic Coefficient
Due to the cumbersome nature of the cubic coefficient's calculation, we undertake it here in an appendix. In short, we wish to write the inner product
$K\bigl(Z_{k_j k_m k_n} - c\,(Z_{k_j k_m \omega_n} + Z_{k_j k_n \omega_m} + Z_{k_m k_n \omega_j}) + c^2 (Z_{k_j \omega_m \omega_n} + Z_{k_m \omega_j \omega_n} + Z_{k_n \omega_j \omega_m}) - c^3 Z_{\omega_j \omega_m \omega_n}\bigr)$ (A.1)
in terms of derivatives of the conservation laws. We will ultimately show that this inner product leads to the vector term
$-\frac{1}{2}(D_k - cD_\omega)^4 E(c)(\zeta, \zeta, \zeta) + 3(D_k - cD_\omega)^3 E(c)(\zeta, \kappa)$. (A.2)
We do this in stages, as in [44,45], and this will require us to use further derivatives of the basic state. For example, some of the manipulation uses a relation to simplify the inner product, which can be obtained simply by differentiating (2.2) with respect to $\theta_i$, then either $k_j$ or $\omega_j$, and combining the results. Further relations of this nature can be obtained in a similar fashion but are not documented here. We manipulate, starting with the terms involving $\Xi$:
$\zeta_n K\bigl(Z_{k_m k_n} - c\,(Z_{\omega_m k_n} + Z_{k_m \omega_n}) + c^2 Z_{\omega_m \omega_n}\bigr) = \sum_{j,m,n=1}^{N} \zeta_j \zeta_m \zeta_n \bigl\langle Z_{\theta_i k_j} - c Z_{\theta_i \omega_j},\, K\bigl(Z_{k_m k_n} - c\,(Z_{\omega_m k_n} + Z_{k_m \omega_n}) + c^2 Z_{\omega_m \omega_n}\bigr)\bigr\rangle$
We colour code the terms which require no further manipulation, with red terms contributing to the first term in (A.2) and blue the second. In subsequent lines we contract these terms by writing them as red and blue respectively. We can then combine these terms with those in (A.1) involving $v_3$:
$\zeta_j \zeta_m \zeta_n \bigl\langle Z_{\theta_i k_j k_m} - c\,(Z_{\theta_i k_j \omega_m} + Z_{\theta_i k_m \omega_j}) + c^2 Z_{\theta_i \omega_j \omega_m},\, K(Z_{k_m} - c Z_{\omega_m})\bigr\rangle$
The remaining terms in (A.1) contribute to the red terms. We now gather the terms of each colour, the simpler of the two being the blue terms. Collecting the powers of $c$ together, one can show that these give
$\sum_{j,m=1}^{N} \zeta_j \kappa_m \bigl\langle Z_{\theta_i k_j} - c Z_{\theta_i \omega_j},\, K(Z_{k_m} - c Z_{\omega_m})\bigr\rangle$
This is exactly the index form of the second term in (A.2). Then, by gathering the red terms, we can also show that these give
$\sum_{j,m,n=1}^{N} \zeta_j \zeta_m \zeta_n \Bigl[\bigl\langle Z_{\theta_i k_j} - c Z_{\theta_i \omega_j},\, K\bigl(Z_{k_m k_n} - c\,(Z_{\omega_m k_n} + Z_{k_m \omega_n}) + c^2 Z_{\omega_m \omega_n}\bigr)\bigr\rangle + \bigl\langle Z_{\theta_i k_j k_m} - c\,(Z_{\theta_i k_j \omega_m} + Z_{\theta_i k_m \omega_j}) + c^2 Z_{\theta_i \omega_j \omega_m},\, K(Z_{k_m} - c Z_{\omega_m})\bigr\rangle\Bigr]$
This is the index form of the first term in (A.2), completing the connection between the inner product of this term and the conservation laws.