Hamiltonian formulation of a class of constrained fourth-order differential equations in the Ostrogradsky framework

We consider a class of Lagrangians that depend not only on some configurational variables and their first time derivatives, but also on second time derivatives, thereby leading to fourth-order evolution equations. The proposed higher-order Lagrangians are obtained by expressing the variables of standard Lagrangians in terms of more basic variables and their time derivatives. The Hamiltonian formulation of the proposed class of models is obtained by means of the Ostrogradsky formalism. The structure of the Hamiltonians for this particular class of models is such that constraints can be introduced in a natural way, thus eliminating expected instabilities of the fourth-order evolution equations. Moreover, canonical quantization of the constrained equations can be achieved by means of Dirac's approach to generalized Hamiltonian dynamics.


I. INTRODUCTION
In a monumental article written in 1848 (in French), Mikhail Vasilyevich Ostrogradsky laid the foundations for the Lagrangian and Hamiltonian formulation of higher-order differential equations and pointed out their disposition to instability [1]. Modern applications of Lagrangians with higher derivatives in particle physics include investigations of possible deviations of electroweak vector-boson self-interactions from the standard model (see [2] and references therein).
The present work on higher-order theories is motivated by attempts to introduce an alternative theory of gravity as a Yang-Mills theory based on the Lorentz group (see, for example, the works [3][4][5], spanning more than six decades, and references therein). The vector potentials of the Yang-Mills theory are no longer considered as primary fields, but rather as functions of a decomposition of the metric tensor including time derivatives (the vector potentials appear as a spin connection in the spirit of the Ashtekar variables proposed for a canonical approach to gravity [6,7]). Therefore, it is natural to consider theories given by standard Lagrangians $L(q,\dot q)$, where the variables $q(\bar q,\dot{\bar q})$ and $\dot q(\bar q,\dot{\bar q},\ddot{\bar q})$ are given in terms of more fundamental variables $\bar q$.
Based on this idea, we introduce a class of higher-order models in the Lagrangian (Section II) and Hamiltonian (Section III) settings, where the reformulation is achieved by means of the Ostrogradsky framework. The expected Ostrogradsky instability is cured by means of constraints which, for the proposed class of models, arise naturally (Section IV). We offer a number of concluding remarks (Section V), in particular, on the role of constraints and the canonical quantization of the proposed higher-order theories. In the appendices, the abstract general ideas are illustrated in the context of simple examples from mechanics (Appendix A) and field theory (Appendix B).

* hco@mat.ethz.ch; http://www.polyphys.mat.ethz.ch/

II. LAGRANGIAN FORMULATION
We start our development from the first-order Lagrangian (1) for the discrete set of variables $q_i$, $i = 1, \ldots, I$, where $q$ represents the list of all variables, $u_i(q)$ and $V(q)$ are sufficiently smooth functions, and we make use of Einstein's summation convention (summation over indices occurring twice). This type of Lagrangian is not only very common for mechanical systems, but also covers the space-discretized version of the Yang-Mills Lagrangian [8].
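As a minimal sketch of the type of Lagrangian meant here, one may assume a quadratic kinetic term, a coupling that is linear in the velocities through $u_i(q)$, and a potential $V(q)$. The mass parameter $m$, the normalization, and the symbol $E_i$ for the Euler-Lagrange expression are assumptions introduced only for this illustration.

```latex
% Assumed illustrative form of a first-order Lagrangian with u_i(q) and V(q)
\begin{align}
  L(q,\dot q) &= \tfrac{1}{2}\, m\, \dot q_i \dot q_i + u_i(q)\, \dot q_i - V(q), \\
  % conjugate momentum of the standard (first-order) theory
  \frac{\partial L}{\partial \dot q_i} &= m\, \dot q_i + u_i(q), \\
  % Euler-Lagrange expression; E_i = 0 are the I second-order equations
  E_i \equiv \frac{d}{dt}\frac{\partial L}{\partial \dot q_i}
        - \frac{\partial L}{\partial q_i}
      &= m\, \ddot q_i
        + \left( \frac{\partial u_i}{\partial q_j}
               - \frac{\partial u_j}{\partial q_i} \right) \dot q_j
        + \frac{\partial V}{\partial q_i}.
\end{align}
```

Only the antisymmetric part of $\partial u_i/\partial q_j$ enters the equations of motion, as for a magnetic-type coupling.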
As a next step, we assume that the variables $q$ can be expressed in terms of the variables $\bar q$, which are typically fewer than the variables $q$. Moreover, the variables $q$ are allowed to depend also on the time derivatives of $\bar q$,

$$ q_i = \alpha_i(\bar q) + \beta_{ik}(\bar q)\,\dot{\bar q}_k, \tag{2} $$

where $\alpha_i(\bar q)$, $\beta_{ik}(\bar q)$ are sufficiently smooth functions of the variables $\bar q_k$, $k = 1, \ldots, K \le I$. For dynamic consistency reasons, we postulate

$$ \dot q_i = \alpha'_{ik}\,\dot{\bar q}_k + \beta'_{ikl}\,\dot{\bar q}_k\,\dot{\bar q}_l + \beta_{ik}\,\ddot{\bar q}_k, \tag{3} $$

with the derivatives

$$ \alpha'_{ik} = \frac{\partial \alpha_i}{\partial \bar q_k}, \qquad \beta'_{ikl} = \frac{\partial \beta_{ik}}{\partial \bar q_l}. \tag{4} $$

In the following, we use an analogous notation for the second-order derivatives of $\alpha_i(\bar q)$ and $\beta_{ik}(\bar q)$. According to Eqs. (2) and (3), the Lagrangian (1) can be considered as a function of $\bar q$, $\dot{\bar q}$ and $\ddot{\bar q}$. Note that the variables $q$, $\dot q$ represent $2I$ degrees of freedom, whereas the variables $\bar q$, $\dot{\bar q}$, $\ddot{\bar q}$ represent $3K$ degrees of freedom. A particularly interesting situation arises for $I = (3/2)K$. We can then consider the case where there is a one-to-one correspondence between the variables $q$, $\dot q$ and $\bar q$, $\dot{\bar q}$, $\ddot{\bar q}$. In other words, we can assume that the functions (2), (3) can be inverted to obtain $\bar q$, $\dot{\bar q}$, $\ddot{\bar q}$ uniquely from $q$, $\dot q$. An example of such an invertible relationship is given in Appendix A (all the steps of the general development are illustrated for that example in the appendix). In general, we make the regularity assumption that the rank of the matrix $\beta_{ik}$ in Eq. (2) takes its maximum possible value, $K$.
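The counting argument can be made concrete with a small sketch for $K = 2$, $I = 3$. The linear map below, with a time constant $\lambda \neq 0$, is a hypothetical choice made for illustration only; it is linear in the basic variables and their first time derivatives, and its $3 \times 2$ coefficient matrix of the velocity terms has the maximal rank 2.

```latex
% Hypothetical linear map from K = 2 basic variables to I = 3 variables
\begin{align}
  q_1 &= \bar q_1 + \lambda\, \dot{\bar q}_2, &
  q_2 &= \bar q_2, &
  q_3 &= \lambda\, \dot{\bar q}_1, \\
  % consistent time derivatives (chain rule)
  \dot q_1 &= \dot{\bar q}_1 + \lambda\, \ddot{\bar q}_2, &
  \dot q_2 &= \dot{\bar q}_2, &
  \dot q_3 &= \lambda\, \ddot{\bar q}_1 .
\end{align}
% 2I = 6 quantities (q, \dot q) versus 3K = 6 quantities
% (\bar q, \dot{\bar q}, \ddot{\bar q}); the inverse reads
\begin{align}
  \bar q_1 &= q_1 - \lambda\, \dot q_2, &
  \dot{\bar q}_1 &= q_3/\lambda, &
  \ddot{\bar q}_1 &= \dot q_3/\lambda, \\
  \bar q_2 &= q_2, &
  \dot{\bar q}_2 &= \dot q_2, &
  \ddot{\bar q}_2 &= \left( \dot q_1 - q_3/\lambda \right)/\lambda .
\end{align}
```

The inverse exists only for $\lambda \neq 0$, and it is purely algebraic: nothing guarantees that the reconstructed velocity and acceleration of the basic variables are the time derivatives of the reconstructed positions along a given trajectory.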
Stationarity of the time integral of the Lagrangian (1), or action, with respect to variations of $q$ leads to the $I$ evolution equations (5) with (6). If the variations of $q$ are restricted to the variations of $\bar q$ in Eqs. (2) and (3), we obtain the smaller set of $K$ evolution equations (7). Note that Eq. (7) contains third-order time derivatives of $q$, implying a set of fourth-order differential equations for $\bar q$. This is the class of fourth-order differential equations considered in this paper. As a consequence of the chain rule, they have the factorized structure of Eq. (7) because they result from second-order differential equations by considering the unknowns as functions of potentially fewer, more basic variables and their time derivatives. Our further investigation is motivated by the question whether there is a canonical quantization procedure for this class of fourth-order equations. One would like solutions $q(t)$ of the second-order equations (5) to provide solutions $\bar q(t)$ of the fourth-order equations (7). Even if we assume that one can uniquely reconstruct $\bar q$, $\dot{\bar q}$, $\ddot{\bar q}$ from $q$, $\dot q$, this is not straightforward because the resulting $\dot{\bar q}(t)$, $\ddot{\bar q}(t)$ must be consistent with the time derivatives of $\bar q(t)$. Consistency is most easily achieved for static solutions. In general, symmetries are required to obtain valid solutions $\bar q(t)$ (see the example of Appendix A). Note that Eq. (5) is a system of third-order differential equations for $\bar q(t)$ consisting of $I$ equations for $K \le I$ functions.
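The factorized structure can be sketched by carrying out the restricted variation explicitly. The symbols $E_i$ (Euler-Lagrange expression of the standard Lagrangian) and $A_{ik}$ are shorthand introduced only for this sketch, and the transformation is assumed to be linear in the velocities of the basic variables with coefficient matrix $\beta_{ik}(\bar q)$.

```latex
% Restricted variation: \delta q_i is generated by \delta\bar q_k only
\begin{align}
  E_i &\equiv \frac{d}{dt}\frac{\partial L}{\partial \dot q_i}
            - \frac{\partial L}{\partial q_i},
  \qquad
  \delta S = - \int E_i\, \delta q_i \, dt, \\
  \delta q_i &= A_{ik}\, \delta \bar q_k + \beta_{ik}\, \delta \dot{\bar q}_k,
  \qquad
  A_{ik} \equiv \frac{\partial q_i}{\partial \bar q_k}
         = \frac{\partial \alpha_i}{\partial \bar q_k}
         + \frac{\partial \beta_{il}}{\partial \bar q_k}\, \dot{\bar q}_l, \\
  % integration by parts with vanishing boundary terms
  \delta S &= - \int \left[ E_i\, A_{ik}
              - \frac{d}{dt}\bigl( E_i\, \beta_{ik} \bigr) \right]
              \delta \bar q_k \, dt
  \;\;\Longrightarrow\;\;
  \frac{d}{dt}\bigl( E_i\, \beta_{ik} \bigr) - E_i\, A_{ik} = 0 .
\end{align}
```

Every solution with $E_i = 0$ satisfies these $K$ equations, but the converse does not hold, which is where the additional, potentially unstable solutions come from.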

III. HAMILTONIAN FORMULATION
We next consider the Hamiltonian formulation of the fourth-order differential equations (7) for $\bar q(t)$. Such a formulation can be achieved by means of the Ostrogradsky framework. The key idea is to use

$$ Q_{1k} = \bar q_k, \qquad Q_{2k} = \dot{\bar q}_k \tag{8} $$

as configurational variables and to define the corresponding conjugate momenta by

$$ P_{1k} = \frac{\partial L}{\partial \dot{\bar q}_k} - \frac{d}{dt}\frac{\partial L}{\partial \ddot{\bar q}_k} \tag{9} $$

and

$$ P_{2k} = \frac{\partial L}{\partial \ddot{\bar q}_k}. \tag{10} $$

Note that $P_{2k}$ contains second-order time derivatives of $\bar q_k$, whereas $P_{1k}$ contains third-order time derivatives of $\bar q_k$. In view of Eqs. (2), (3) and (8), the key step in calculating the Hamiltonian $H = P_{1k}\dot Q_{1k} + P_{2k}\dot Q_{2k} - L$ by Legendre transformation is the determination of $\ddot{\bar q}_k(Q_1, Q_2, P_2)$. Equation (10), combined with Eq. (3), gives Eq. (11). According to our regularity assumption, the symmetric $K \times K$ matrix $\beta_{ik}\beta_{il}$ has an inverse $B_{kl}$. We hence find the desired relation (12) for $\ddot{\bar q}_k(Q_1, Q_2, P_2)$, where $\dot{\bar q}_l = Q_{2l}$, the functions $\beta_{im}$, $\alpha'_{in}$, $\beta'_{iln}$, $B_{mk}$ depend on $Q_1$ and, finally, $u_i$ is a function of $q_i(Q_1, Q_2)$. In a similar way, we find the third-order time derivatives (13). At this point, with Eqs. (8), (12) and (13), we have fully established the one-to-one relation between the variables $\bar q$, $\dot{\bar q}$, $\ddot{\bar q}$ and $\dddot{\bar q}$ and the canonical variables $Q_1$, $Q_2$, $P_1$ and $P_2$. In particular, we can also find $q$, $\dot q$ and $\ddot q$ in terms of the canonical variables.
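The inversion step can be sketched under the assumption that the standard Lagrangian has the kinetic term $\tfrac{1}{2} m \dot q_i \dot q_i$ plus a term $u_i(q)\,\dot q_i$; the mass parameter $m$ is an assumption of this sketch. The momentum conjugate to $Q_{2k}$ is then linear in the second derivatives of the basic variables, so that these can be solved for with the inverse $B_{kl}$ of $\beta_{ik}\beta_{il}$.

```latex
% Momentum conjugate to Q_{2k} via the chain rule
\begin{align}
  P_{2k} &= \frac{\partial L}{\partial \dot q_i}\,
            \frac{\partial \dot q_i}{\partial \ddot{\bar q}_k}
          = \bigl( m\, \dot q_i + u_i \bigr)\, \beta_{ik}, \\
  % insert \dot q_i as a function of Q_1, Q_2 and \ddot{\bar q}
  \beta_{ik}\beta_{il}\, \ddot{\bar q}_l
         &= \frac{1}{m}\bigl( P_{2k} - u_i\, \beta_{ik} \bigr)
          - \beta_{ik}\bigl( \alpha'_{in}\, Q_{2n}
          + \beta'_{iln}\, Q_{2l}\, Q_{2n} \bigr), \\
  % apply the inverse B_{mk} of the symmetric K x K matrix
  \ddot{\bar q}_m
         &= B_{mk} \left[ \frac{1}{m}\bigl( P_{2k} - u_i\, \beta_{ik} \bigr)
          - \beta_{ik}\bigl( \alpha'_{in}\, Q_{2n}
          + \beta'_{iln}\, Q_{2l}\, Q_{2n} \bigr) \right].
\end{align}
```

The regularity assumption (rank $K$ of $\beta_{ik}$) is exactly what makes $\beta_{ik}\beta_{il}$ invertible.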
The Hamiltonian now takes the form (14). In order to find the derivatives of $H$ with respect to the canonical variables, we need to use the substitution rules for $q$ and $\dot q$. Several auxiliary results are helpful for this purpose. In deriving these auxiliary equations, only Eq. (11) and not Eq. (12) has been used, so that the resulting expressions for the derivatives of $H$ hold even without the regularity assumption. Some of the terms in Eqs. (19) and (20) can be written in a simpler form, directly in terms of canonical variables. For example, if we introduce $\bar V(\bar q, \dot{\bar q}) = V(q(\bar q, \dot{\bar q}))$, then the derivatives of the potential take the simple forms (21) and (22). The same type of simplifications can also be achieved for $\partial u_j/\partial q_i$. On the other hand, $\dot q_i$ is a fairly complicated function of the canonical variables, which involves the lengthy expression (12) for $\ddot{\bar q}_k$, to be inserted into Eq. (3). In view of this complexity, the auxiliary result (15) is remarkably simple. The canonical equations $\dot Q_1 = \partial H/\partial P_1$ and $\dot Q_2 = \partial H/\partial P_2$ are consistent with the definitions (8) of $Q_1$ and $Q_2$. As $\dot P_2 = -\partial H/\partial Q_2$ reproduces the representation (9) of $P_1$, the evolution equation must be contained in $\dot P_1 = -\partial H/\partial Q_1$ (note that $\dot P_1$ contains third-order time derivatives of $q$ and hence fourth-order derivatives of $\bar q$). Indeed, by using Eqs. (19) and (9), we recover the fourth-order evolution equation (7).
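The consistency of the canonical equations follows from the general Legendre structure alone. In the sketch below, $a_k(Q_1, Q_2, P_2)$ is shorthand, introduced only here, for the second derivative of the $k$-th basic variable expressed in canonical variables.

```latex
% Ostrogradsky Hamiltonian; a_k(Q_1,Q_2,P_2) is defined implicitly by
% P_{2k} = \partial L / \partial a_k
\begin{align}
  H &= P_{1k}\, Q_{2k} + P_{2k}\, a_k - L(Q_1, Q_2, a), \\
  \frac{\partial H}{\partial P_{1k}} &= Q_{2k} = \dot Q_{1k}, \\
  \frac{\partial H}{\partial P_{2k}}
    &= a_k + \Bigl( P_{2l} - \frac{\partial L}{\partial a_l} \Bigr)
       \frac{\partial a_l}{\partial P_{2k}} = a_k = \dot Q_{2k}, \\
  \dot P_{2k} &= -\frac{\partial H}{\partial Q_{2k}}
    = \frac{\partial L}{\partial Q_{2k}} - P_{1k}, \\
  \dot P_{1k} &= -\frac{\partial H}{\partial Q_{1k}}
    = \frac{\partial L}{\partial Q_{1k}} .
\end{align}
```

The fourth line is the definition of $P_{1k}$ rearranged, and the last line, combined with it, is the fourth-order Euler-Lagrange equation $\partial L/\partial \bar q_k - \tfrac{d}{dt}\,\partial L/\partial \dot{\bar q}_k + \tfrac{d^2}{dt^2}\,\partial L/\partial \ddot{\bar q}_k = 0$.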

IV. OSTROGRADSKY INSTABILITY
The occurrence of the term P 1k Q 2k in the Hamiltonian (14) implies that the energy can be lowered without any bound by increasing the momentum P 1k to large positive or negative values. Therefore, this term suggests an instability that is known as Ostrogradsky instability. Such an instability is generally considered as a strong argument against higher-order equations.
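The unboundedness can be made explicit with a one-line estimate. The function $h$ collects all terms of the Hamiltonian that do not depend on $P_1$, and $s$ is a real parameter; both symbols are introduced only for this sketch.

```latex
% H is linear in P_1: choose P_{1k} antiparallel to Q_{2k}
\begin{align}
  H &= P_{1k}\, Q_{2k} + h(Q_1, Q_2, P_2), \\
  P_{1k} = -s\, Q_{2k}
    \;\;&\Longrightarrow\;\;
  H = -s\, Q_{2k} Q_{2k} + h(Q_1, Q_2, P_2)
    \;\longrightarrow\; -\infty
  \quad (s \to \infty,\; Q_2 \neq 0).
\end{align}
```

At fixed $Q_1$, $Q_2 \neq 0$ and $P_2$ the energy therefore has no lower bound, whatever the form of $h$.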
The unbounded term in the Hamiltonian can be suppressed by imposing the $K$ primary constraints (23). The potentially nice features of these constraints can be recognized by a closer look at Eq. (9). This expression for $P_{1k}$ implies that the primary constraints (23) are fulfilled by the solutions of the basic equation (5), which we want to keep, but not necessarily by the solutions of the higher-order equation (7), from which we want to eliminate instabilities. On a more formal level, if the primary constraints (23) are imposed, the constrained Hamiltonian (14) is obtained by the same substitution idea as the Lagrangian: the variables $q$, $\dot q$ are given in terms of $\bar q$, $\dot{\bar q}$, $\ddot{\bar q}$ in Eqs. (2), (3) and, in turn, $\bar q$, $\dot{\bar q}$, $\ddot{\bar q}$ are expressed in terms of the canonical variables $Q_1$, $Q_2$, $P_2$ according to Eqs. (8) and (12). The constrained Hamiltonian does not depend on $P_1$ but, according to Eq. (20), the derivatives of the Hamiltonian on the constrained manifold do. The primary constraints are not automatically preserved by the dynamics, so that we introduce secondary constraints, which require the time derivatives of the primary constraints to vanish as well. These secondary constraints share the potentially nice features of the primary constraints, keeping physical solutions but not all terms of the fourth-order equations.
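Why constraints of this kind are compatible with the solutions of the standard theory can be sketched by evaluating the Ostrogradsky momentum $P_{1k}$ with the chain rule. The shorthand $p_i$, $E_i$ and $A_{ik}$ is introduced only for this sketch, and the transformation is assumed to be linear in the velocities of the basic variables.

```latex
% Shorthand: p_i = \partial L/\partial \dot q_i,
%            E_i = \dot p_i - \partial L/\partial q_i,
%            A_{ik} = \partial q_i/\partial \bar q_k
%                   = \alpha'_{ik} + \beta'_{ilk}\, \dot{\bar q}_l
\begin{align}
  \frac{\partial L}{\partial \ddot{\bar q}_k} &= p_i\, \beta_{ik}, \\
  \frac{\partial L}{\partial \dot{\bar q}_k}
    &= \frac{\partial L}{\partial q_i}\, \beta_{ik}
     + p_i \bigl( A_{ik} + \beta'_{ikl}\, \dot{\bar q}_l \bigr), \\
  \frac{d}{dt}\bigl( p_i\, \beta_{ik} \bigr)
    &= \dot p_i\, \beta_{ik} + p_i\, \beta'_{ikl}\, \dot{\bar q}_l, \\
  P_{1k}
    &= \frac{\partial L}{\partial \dot{\bar q}_k}
     - \frac{d}{dt}\frac{\partial L}{\partial \ddot{\bar q}_k}
     = p_i\, A_{ik} - E_i\, \beta_{ik} .
\end{align}
```

In this sketch, requiring $P_{1k} = p_i A_{ik}$ is equivalent to $E_i \beta_{ik} = 0$, which holds automatically whenever the second-order equations $E_i = 0$ are satisfied. For a linear transformation with unit kinetic coefficients this reproduces conditions of the form found in Appendix A.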
The explicit example of Appendix A shows that it may be necessary to continue the iterative procedure and to consider also tertiary constraints, obtained from the time derivatives of the secondary constraints. In general, the iterative procedure needs to be continued until full dynamic consistency is reached on the constrained manifold.

V. SUMMARY AND CONCLUSIONS
We have introduced a class of Lagrangians $L(\bar q, \dot{\bar q}, \ddot{\bar q})$ associated with fourth-order evolution equations by substituting a transformation $q(\bar q, \dot{\bar q})$, as well as the consistent transformation $\dot q(\bar q, \dot{\bar q}, \ddot{\bar q})$, into a standard Lagrangian $L(q, \dot q)$. The corresponding canonical Hamiltonian formulation with Hamiltonian $H(Q_1, Q_2, P_1, P_2)$ is obtained by means of the Ostrogradsky formalism. Natural constraints arise from the idea that, on the constrained manifold, the Hamiltonian of the higher-order problem should be obtained from a substitution procedure, just like the Lagrangian.
These natural constraints play a crucial role in the proposed class of models. They are needed to eliminate the Ostrogradsky instability that one expects because the Hamiltonian contains the momentum P 1 only in a linear term and hence is unbounded. Ideally, the constraints restrict the solutions of the higher-order problem to a subset of solutions of the original standard problem. This requirement is not fulfilled automatically and imposes restrictions on the original Lagrangian and its interplay with the transformation; the transformation should be consistent with the symmetries of the Lagrangian. If the higher-order formulation only picks out solutions from the standard formulation, the most interesting features may arise only upon coupling to other systems.
In general, the constraints have nonvanishing Poisson brackets, so that they may be classified as second-class constraints. For second-class constraints, the Poisson bracket can be modified into a Dirac bracket that leads to a canonical quantization procedure [9][10][11]. Compared to the alternative quantization scheme proposed in [12], which is based on reducing general higher-order Lagrangians to first-order Lagrangians, we here exploit the special features of our particular class of higher-order Lagrangians.
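For reference, the standard construction of the Dirac bracket for a set of second-class constraints $\varphi_a$ can be summarized as follows; the symbols $F$, $G$ and $C_{ab}$ are generic notation used only in this sketch.

```latex
% Dirac bracket for second-class constraints \varphi_a = 0
\begin{align}
  C_{ab} &= \{ \varphi_a, \varphi_b \},
  \qquad \det C \neq 0, \\
  \{ F, G \}_{\rm D} &= \{ F, G \}
    - \{ F, \varphi_a \}\, (C^{-1})_{ab}\, \{ \varphi_b, G \}, \\
  % the constraints are Casimir-like for the Dirac bracket
  \{ F, \varphi_c \}_{\rm D} &= \{ F, \varphi_c \}
    - \{ F, \varphi_a \}\, (C^{-1})_{ab}\, C_{bc} = 0 .
\end{align}
```

Because every constraint has a vanishing Dirac bracket with any function $F$, the constraints can be imposed as operator identities when commutators are modeled on the Dirac bracket instead of the Poisson bracket.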
There may be further constraints already present in the original theory represented by $L(q, \dot q)$, for example, for gauge theories. The gauge transformation of the $q$ variables should then be inherited from the gauge transformation behavior of the $\bar q$ variables. In the quantization procedure, the constraints associated with gauge transformations can be handled by the powerful BRST procedure [13][14][15], where the acronym BRST refers to Becchi, Rouet, Stora [16] and Tyutin [17].
There is an interesting epistemological aspect of the present work. The class of higher-order models proposed in this paper is particularly promising in situations where a successful theory should be modified because the fundamental objects ("particles") of the theory are discovered to be composed of even more fundamental objects. For example, the idea to decompose the space-time metric $g_{\mu\nu}$ of general relativity as $g_{\mu\nu} = \eta_{\kappa\lambda}\, b^{\kappa}{}_{\mu}\, b^{\lambda}{}_{\nu}$ in terms of the Minkowski metric $\eta_{\kappa\lambda}$ and the more fundamental variables $b^{\kappa}{}_{\mu}$ has profound implications for the theory of gravity [6,7]. In particular, the variables $A_{a\nu}$ characterizing the connection of the variables $b^{\kappa}{}_{\mu}$ ("spin connection," similar to the Levi-Civita connection for $g_{\mu\nu}$) can be considered as a most relevant example of a mapping $q(\bar q, \dot{\bar q})$ with $q = A_{a\nu}$ and $\bar q = b^{\kappa}{}_{\mu}$, where the subscript $a$ labels the generators of a Lie group (a simplified version of this field theoretic example is presented in Appendix B). If the Lie group is given by the Lorentz group [5], the index $a$ takes six values, compared to the four values of the space-time index $\kappa$; we thus find the ratio 3 : 2 for the natural Lorentz-group/space-time pairing, which was found in Section II to be of special relevance in one-to-one reformulations of standard Lagrangian theories. In terms of the variables $b^{\kappa}{}_{\mu}$, one can formulate a dissipative quantum field theory based on a dynamic "diffusive smearing" mechanism on the Planck scale [18].

Appendix A: Example from mechanics
For the linear transformation (A1), one can easily verify that it is invertible for $\lambda \neq 0$, where $\lambda$ has the dimensions of a time constant. Moreover, the consistency condition (3) is satisfied. For the three-dimensional harmonic oscillator with mass $m$ and spring constants $h_i$, whose Lagrangian is of the general form (1), a higher-order Lagrangian is obtained by insertion of the transformation. Instead of the usual three second-order equations (A4) (no summation over $i$), we find the two fourth-order equations (A5) for $\bar q_1$ and $\bar q_2$, which can be rewritten in the form (7). Solution of the usual equations (A4) for three harmonic oscillators requires six initial conditions. The solutions are of the form (A7), where $\omega_i = \sqrt{h_i/m}$ for $i = 1, 2, 3$. Using the inverse of the transformation (A1), these solutions suggest the functions (A8) as candidates for solving the higher-order equations. In general, however, these functions $\bar q_1$, $\bar q_2$ do not provide a solution to Eq. (A5). Only for equal $h_i$ ($h_i = h$, $\omega = \sqrt{h/m}$) do the functions given in Eq. (A8) provide a four-parameter solution to Eq. (A5). Note that the transformation (A1) breaks the symmetry between $q_1$, $q_2$ and $q_3$ in an unnatural way so that, in the anisotropic case, the higher-order problem becomes completely different from the lower-order problem. From now on, we therefore restrict ourselves to the isotropic case. On the one hand, for $\omega_1 = \omega_2$, the solution (A8) for $\bar q_1$, $\bar q_2$ represents only two independent harmonic oscillators with four parameters, whereas the underlying $q_1$, $q_2$, $q_3$ represent three independent harmonic oscillators with six parameters. On the other hand, we would expect eight parameters in the solution of the two fourth-order equations (A5), which we indeed recognize in the general solution, Eqs. (A9) and (A10), of the system in Eq. (A5) for the isotropic case. The existence of exponentially increasing and decreasing contributions with rates and frequencies independent of $m$ and $h$ is based on two identities. The exponentially growing solution illustrates the Ostrogradsky instability.
As the evolution equations are reversible, the exponentially growing solution is accompanied by an exponentially decaying solution. The growth and decay rates are determined by the time scale $\lambda$, which is not in the original problem and enters the picture only through the transformation (A1). One could avoid the instability by taking the limit $\lambda \to \infty$, in which $\bar q_1$, $\bar q_2$ are only shifted by constants. As these constants are unphysical, it seems preferable to avoid exponentially growing solutions by imposing suitable constraints, as proposed in the general development.
For the Hamiltonian formulation of the problem, we rely on the correspondences implied by Eqs. (8), (12) and (13), among them

$$ Q_{12} = \bar q_2, \qquad Q_{22} = \dot{\bar q}_2, \qquad P_{22} = \lambda m (\dot{\bar q}_1 + \lambda \ddot{\bar q}_2). \tag{A14} $$

The Hamiltonian (A16) is found to be of a quadratic form, where the last term (proportional to the parameter $h$) is the potential $\bar V$. Most of the evolution equations resulting from this Hamiltonian in a canonical way can be interpreted as assignments of variables, including

$$ \dot P_{21} = \frac{P_{22}}{\lambda} - P_{11} - \lambda^2 h\, Q_{21}. $$

The final two evolution equations coincide with the two fourth-order differential equations of the Lagrangian approach given in Eq. (A5). If we impose the four constraints

$$ \varphi_1 = \lambda P_{11} - P_{22} = 0, \qquad \varphi_2 = \lambda \dot\varphi_1 = \lambda (P_{12} - m Q_{22}) = 0, $$
$$ \varphi_3 = \lambda \dot\varphi_2 = -P_{22} + \lambda m Q_{21} - \lambda^2 h\, Q_{12} = 0, \qquad \varphi_4 = \lambda \dot\varphi_3 = \varphi_2 + P_{21} + \lambda^2 h\, Q_{11} = 0, \tag{A21} $$

the identity (A22) implies that the Poisson bracket of the Hamiltonian with each of these constraints vanishes on the constrained manifold. The first two constraints in Eq. (A21) actually coincide with the primary constraints (23) of the general development (except for a factor of $\lambda$). The second constraint arises also among the secondary constraints (25), so that this classification scheme clearly comes with ambiguities. In the general development, the fourth constraint in Eq. (A21) arises already as a tertiary constraint, after which the iterative procedure comes to an end. By means of the first two constraints, which are the primary constraints of the general development, the Hamiltonian (A16) on the constrained manifold can be written in the form (A23), which (for $m, h > 0$) is clearly bounded from below by zero. Indeed, the four constraints eliminate all the exponentially growing or decaying terms from the solutions (A9) and (A10) ($c_5 = c_6 = c_7 = c_8 = 0$), so that only the two harmonic oscillators (A8) resulting from the original system of three harmonic oscillators survive in the higher-order theory.
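A quadratic Hamiltonian with these properties can be sketched under an explicit assumption about the linear transformation. The map $q_1 = \bar q_1 + \lambda \dot{\bar q}_2$, $q_2 = \bar q_2$, $q_3 = \lambda \dot{\bar q}_1$ used below is a hypothetical choice; it is adopted here only because it reproduces the expression for $P_{22}$, the equation for $\dot P_{21}$, and the chain of constraints $\varphi_{n+1} = \lambda \dot\varphi_n$ quoted in this appendix for the isotropic oscillator.

```latex
% Assumed map: q_1 = \bar q_1 + \lambda \dot{\bar q}_2,
%              q_2 = \bar q_2,  q_3 = \lambda \dot{\bar q}_1
\begin{align}
  P_{21} &= m \lambda^2\, \ddot{\bar q}_1, \qquad
  P_{22} = \lambda m \bigl( \dot{\bar q}_1 + \lambda\, \ddot{\bar q}_2 \bigr), \\
  H &= P_{11} Q_{21} + P_{12} Q_{22}
     + \frac{P_{21}^2 + P_{22}^2}{2 m \lambda^2}
     - \frac{P_{22}\, Q_{21}}{\lambda}
     - \tfrac{1}{2}\, m\, Q_{22}^2 \nonumber \\
    &\quad + \tfrac{1}{2}\, h \left[ (Q_{11} + \lambda Q_{22})^2
     + Q_{12}^2 + \lambda^2 Q_{21}^2 \right], \\
  % check against the quoted canonical equation
  \dot P_{21} &= -\frac{\partial H}{\partial Q_{21}}
     = \frac{P_{22}}{\lambda} - P_{11} - \lambda^2 h\, Q_{21}, \\
  % on the manifold \lambda P_{11} = P_{22}, P_{12} = m Q_{22}
  H\big|_{\varphi_1 = \varphi_2 = 0}
    &= \tfrac{1}{2}\, m\, Q_{22}^2
     + \frac{P_{21}^2 + P_{22}^2}{2 m \lambda^2}
     + \tfrac{1}{2}\, h \left[ (Q_{11} + \lambda Q_{22})^2
     + Q_{12}^2 + \lambda^2 Q_{21}^2 \right] \;\ge\; 0 .
\end{align}
```

The two primary constraints remove exactly the indefinite terms $P_{1k} Q_{2k}$, $-P_{22} Q_{21}/\lambda$ and $-\tfrac{1}{2} m Q_{22}^2$, leaving a sum of squares for $m, h > 0$.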
The reduction from three to two harmonic oscillators corresponds to the reduction from $I$ to $K$ basic variables in the higher-order theory, where the reduction from three to two plays a special role because it allows for a one-to-one correspondence of the full sets of variables in the Lagrangian, as pointed out in the paragraph after Eq. (4). The matrix of Poisson brackets among the constraints, Eq. (A24), where $\Omega = \lambda^2 \omega^2$, implies that we deal with second-class constraints, for which a Dirac bracket can be introduced and a canonical quantization procedure is available [9][10][11]. As expected on general grounds, the matrix in Eq. (A24) is invertible for all $\Omega$. Only two $2 \times 2$ submatrices need to be inverted, so that the Dirac bracket can be written down in closed form.
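The structure of this matrix can be checked directly from the four constraints $\varphi_1 = \lambda P_{11} - P_{22}$, $\varphi_2 = \lambda(P_{12} - m Q_{22})$, $\varphi_3 = -P_{22} + \lambda m Q_{21} - \lambda^2 h Q_{12}$ and $\varphi_4 = \varphi_2 + P_{21} + \lambda^2 h Q_{11}$. The sketch below assumes the sign convention $\{Q_{ak}, P_{bl}\} = \delta_{ab}\delta_{kl}$ and $\Omega = \lambda^2 h/m$; the overall sign of the matrix flips with the opposite convention.

```latex
% C_{ab} = \{\varphi_a, \varphi_b\}, convention \{Q, P\} = +1
\begin{align}
  C &= \lambda m
  \begin{pmatrix}
    0 & -1 & 0 & -(1+\Omega) \\
    1 & 0 & 1+\Omega & 0 \\
    0 & -(1+\Omega) & 0 & -\Omega \\
    1+\Omega & 0 & \Omega & 0
  \end{pmatrix}, \\
  % reordering the constraints as (1,3 | 2,4) gives an off-diagonal
  % block form with the 2 x 2 block N and its negative transpose
  N &=
  \begin{pmatrix}
    -1 & -(1+\Omega) \\
    -(1+\Omega) & -\Omega
  \end{pmatrix},
  \qquad
  \det N = -\bigl( 1 + \Omega + \Omega^2 \bigr), \\
  \det C &= (\lambda m)^4 \bigl( 1 + \Omega + \Omega^2 \bigr)^2 > 0 .
\end{align}
```

Since $1 + \Omega + \Omega^2$ has no real zeros, the matrix is invertible for every real $\Omega$, and its inverse requires only the inverses of $N$ and its transpose.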

Appendix B: Example from field theory
We here look at the weak-field approximation for a Yang-Mills theory based on the Lorentz group, which has been proposed as an alternative theory for gravity [5]. For this field theoretic example, the transformation of the type (2) is given by Eq. (B1), where the Greek letters denote space-time indices (with the standard convention $x^0 = ct$, where $c$ is the speed of light). The space- and time-dependent field $h_{\lambda\nu}$ can be interpreted as the deviation of the metric from the Minkowski metric [with signature $(-, +, +, +)$, which we use for lowering or raising indices]. Note that $A_{\kappa\lambda\nu}$ is antisymmetric in $\kappa$ and $\lambda$ so that only six index combinations matter. These six index pairs correspond to the generators of the Lorentz group, where the pairs (0, 1), (0, 2), (0, 3) correspond to boosts in the three coordinate directions and (2, 3), (3, 1), (1, 2) correspond to rotations around the coordinate axes, indicated by the corresponding perpendicular planes. Therefore, in the definition of $A_{\kappa\lambda\nu}$, the pairs $(\kappa, \lambda)$ label the generators, and $\nu$ is interpreted as the space-time index of the vector potential from which a Yang-Mills field tensor can be defined.
To keep the example simple, we here assume that only the spatial components $h_{kl}$, which play the role of the basic variables $\bar q$, are needed to parametrize the vector potential, that is, we choose $h_{00} = h_{0k} = h_{k0} = 0$ in Eq. (B1). According to Section 10.2 of [19], this setting is sufficient for discussing gravitational waves. All fields $A_{\kappa\lambda\nu}$ then vanish for $\nu = 0$. The vector fields associated with rotations are linear in $h_{kl}$ [including spatial derivatives, which belong to the $\alpha$ part of the transformation (2)], whereas the vector fields associated with boosts are linear in the time derivative of $h_{kl}$ [and therefore belong to the $\beta$ part of the transformation (2)]. For symmetric $h_{kl}$, there are six independent basic variables. In the weak-field approximation, the Yang-Mills Lagrangian associated with the vector potentials (B1) is given by the space integral of the quadratic Lagrangian density (B3). One half of the expression in parentheses in Eq. (B3) is equal to the Yang-Mills field tensor. The Lagrangian density (B3) leads to the fourth-order field equation (B4). This field equation can be rewritten in the elegant alternative form (B5), where $R^{(1)}_{\mu\nu}$ and $R^{(1)}$ are the linearized versions of the Ricci tensor associated with $h_{\mu\nu}$ and the corresponding curvature scalar.
For the Hamiltonian formulation, we find the canonical variables in the same way as in the general development. One of the momenta, through a time derivative, includes the third-order time derivative of $h_{kl}$. The Hamiltonian density is found to be $\mathcal{H} = 2c^4 P_{2kl} P_{2kl} + P_{1kl} Q_{2kl} - \tfrac{1}{8} \ldots$ Three of the canonical Hamiltonian equations reproduce the time derivatives of the quantities in Eqs. (B6) and (B7). The Hamiltonian evolution equation for $P_{1kl}$ reproduces the field equation (B4). In order to find the primary constraints for eliminating the Ostrogradsky instability, we need to go back to the general development and the corresponding structure of Eq. (23). For our field theoretic example, these primary constraints can be expressed, with the help of Eqs. (B6) and (B8), in the form (B11). The secondary constraints obtained as the time derivative of Eq. (B11) can be used to eliminate the fourth-order time derivatives from the field equation (B4) to find a second-order differential equation in time.
The possibility of identifying the primary constraints is the key advantage of the class of higher-order models introduced in this paper. We illustrate their importance for the special case of gravitational waves. If we assume transverse waves, the field equation (B4) is reduced to Eq. (B12) and the primary constraints (B11) become

$$ \frac{\partial}{\partial t}\, \Box h_{kl} = 0. \tag{B13} $$

Both equations are satisfied by the plane wave solutions characterized by $\Box h_{kl} = 0$, but additional unstable solutions need to be excluded from the higher-order equations. The combination of Eqs. (B12) and (B13) gives $\Delta \Box h_{kl} = 0$, where $\Delta$ is the Laplacian. With suitable spatial boundary conditions one could arrive at the desired equation $\Box h_{kl} = 0$.
Instead of expressing the vector potentials in terms of the spatial components h kl it would be preferable to use the full deviatoric metric four-tensor h µν to keep the Lorentz covariance in the description. However, such an approach would be significantly more challenging because an explicit consideration of gauge invariance and the associated constraints would be required. These extra efforts should certainly be made if one wishes to go beyond the weak-field approximation used here for illustrative purposes.