Junction conditions in a general field theory

It is well-known in the modified gravity scene that the calculation of junction conditions in certain complicated theories leads to ambiguities and conflicts between the various formulations. This paper introduces a general framework to compute junction conditions in any reasonable classical field theory and analyzes their properties. We prove that in any variational field theory, it is possible to define unambiguous and mathematically well-defined junction conditions either by interpreting the Euler–Lagrange differential equation as a distribution or as the extremals of a variational functional and these two coincide. We provide an example calculation which highlights why ambiguities in the existing formalisms have arisen, essentially due to incorrect usage of distributions. Relations between junction conditions, the boundary value problem of variational principles and Gibbons–Hawking–York-like surface terms are examined. The methods presented herein relies on the use of coordinates adapted to represent the junction surface as a leaf in a foliation and a technique for reducing the order of Lagrangians to the lowest possible in the foliation parameter. We expect that the reduction theorem can generate independent interest from the rest of the topics considered in the paper.


Introduction
An important problem in classical field theory is the determination of junction conditions, which are relations that specify when two solutions of the field equations can be glued together along a common boundary surface.In electromagnetism, these involve the usual jump conditions on the electric and magnetic fields at an interface and is textbook material [1].For general relativity (GR) there is a rich literature for the junction conditions.The definitive formulation has been given by Israel [2] although there are preceding works on the matter by eg.Lanczos [3], Darmois [4], Synge and O'Brien [5] and Lichnerowicz [6].Israel's formulation is sensitive to the metric signature of the surface and is valid only for the timelike and spacelike case.His work has been generalized to null signature by Barrabes and Israel [7] and to completely arbitrary (including non-constant) signature by Mars and Senovilla [8] among many other papers on the subject (see also [9,10,11,12]).
A rich family of increasingly complicated classical field theories is provided by the modified gravity scene.Junction conditions of various generalities have been calculated for a number of modified theories such as Brans-Dicke theory [13], Gauss--Bonnet gravity [14], higher derivative theories [15], and Horndeski theory [16].These are only a selection of examples with no claim of completeness.There is a considerable interest in the computation of junction conditions for a large number of macroscopic phenomena can be described this way such as gravitational collapse [17], stellar boundaries [18], cosmological phase transitions [18], impulsive electromagnetic and gravitational signals [19].
Despite the interest in junction conditions, calculations in modified theories of gravity have been performed in an ad-hoc manner with no general theory behind junction conditions.Accordingly, a number of ambiguities have been observed when junction conditions are computed in different theories and surprisingly the origins and resolutions of these ambiguities have not been investigated so far to any significant detail.While for a given theory there are a number of specific ways to determine the junction conditions, there are two methods that do not rely at least conceptionally on the specific properties of the given field theory.
1.One method is to assume some reasonable regularity conditions on the field variables (such as smooth everywhere except at the junction surface, and C k with bounded left and right k + 1th derivatives at the surface, for some k ∈ N), then interpret any derivative that would not exist classically as being distributional.Then it is attempted to make sense of the field equations as a relation between distributions.For more complicated theories the ambiguities manifest in the appearance of invalid operations such as the product of two singular distributions or the product of a singular distribution with a discontinuous function.The Schwartz impossibility theorem [20] states that within the usual theory of distributions such products cannot be defined in any consistent manner.Products of Dirac deltas have a tendency to cancel1 , but since the products themselves are ill-defined, it is not even clear whether such cancellations lead to valid results even if the end result is free of invalid expressions.
More problematic are products of singular distributions with discontinuous functions since these have a general tendency of appearing and is not immediately clear how to deal with them.A common approach is to interpret the value of a discontinuous function at the junction surface as the mean value of its left and right limits.This ad-hoc interpretation is however enforced by hand and cannot be obtained from valid mathematical manipulations.In Section 3.6 we give an explicit example where this approach leads to wrong results.This difficulity has been recognized in eg.[14] where the author speaks of regularizing Delta functions in appropriate ways to ensure the correct result.In the formalism we develop here, no illegitimate operations between distributions ever appear and there is not need for any regularization of Delta functions.
2. The second method is variational in nature.Once again after some reasonable regularity conditions are imposed on the field variables, the variational principle for the field is interpreted in this regularity class.This usually implies that the class of fields we vary in are less regular at the junction surface than the Euler-Lagrange equations would require, therefore during the variation process there are boundary terms appearing at the interior boundary surface which provide the junction conditions.In the calculus of variations such solutions are often called broken extremals.The problem here is that the junction conditions obtained via this approach seem to depend on the specific properties of the Lagrangian from which the field equations are derived.If the Lagrangian has "too high order" when compared to the field equations (the second order Einstein-Hilbert Lagrangian for GR is an example of that), one obtains the correct junction conditions only if the action functional is extended with some boundary terms at the junction surface (for GR with timelike or spacelike junction surface, the well-known Gibbons-Hawking-York (GHY) term [21,22] provides an appropriate surface term).
The reasons why such surface terms are needed can be understood by analogy to be related to the well-posedness [23] of the variational principle at the outer (rather than inner) boundaries, but there is no general existence theorem for such boundary terms for a generic Lagrangian field theory and the precise relationship between these boundary terms and the junction conditions are poorly understood.Nonetheless, for most modified theories of gravity, appropriate surface terms can be found by direct calculation and the variational approach to the junction conditions does avoid the ambiguous aspects of the distributional formalism.This method has been used by Davis [14] to derive the junction conditions for Gauss-Bonnet gravity and by Padilla and Sivanesan for the Horndeski theory [16].
There are two main motivations behind this paper.First of all, it is intrinsically useful to give a general analysis of junction conditions where the field theory is as unspecified as it is possible to be.This way authors investigating novel classical field theories (primarily, modified theories of gravity) are given an algorithmic prescription for calculating the junction conditions.Secondly, we resolve the ambiguities of the two main methods described above.Our method relies in interpreting the junction surface as a leaf in a foliation of the parameter space of the field theory.This way we obtain a splitting of the variables where one of them, the evolutionary parameter describes the evolution of the fields transversal to the junction surface.If the field equations are variational and involve at most sth derivatives with respect to the evolutionary parameter, the Lagrangian with the lowest theoretically possible order of dependence on the evolutionary derivatives of the field involve ⌈s/2⌉th derivatives.Such Lagrangians are said to have minimal evolutionary order.Although it is known that Lagrangians of minimal total order do not always exist, we prove that every variational field equation has a Lagrangian of minimal evolutionary order.Such Lagrangians will play a central role in our analysis.
The crucial assumptions are that the field theory be variational and that the evolutionary order of the field equations be even.The odd order equations are inherently degenerate.In Section 3.4 we also extend the formalism to odd order equations at least formally, but we can say much less about this case than that of the even order equations.The particular form of the Euler-Lagrange equations when calculated from a minimal evolutionary order Lagrangian can be used to deduce that there are nontrivial2 regularity conditions under which the Euler-Lagrange expressions are well-defined distributions and the singular part of the Euler-Lagrange equations represents the junction conditions at the surface.Analogously, the equation of broken extremals for the variational problem determined by any Lagrangian of minimal evolutionary order leads to the same junction conditions we obtain distributionally.
This solves essentially all ambiguities presented before as we prove the well-posedness of the distributional formalism, that the variational and the distributional methods give the same result and also that appropriate boundary terms for the variational principle always exist at least locally (in a suitable sense of locality) with the junction conditions being independent of the choice of "appropriate" boundary term.The latter follows from the fact that the minimal evolutionary order Lagrangian necessarily differs from a non-minimal Lagrangian by a total derivative term which appears as a surface term when integrated.This surface term is precisely a GHY-like boundary term.The use of reduced order Lagrangians over surface terms increases direct computability of the junction conditions.As we will see, one only has to determine all canonical momenta of the minimal evolutionary order Lagrangian and the junction conditions can be expressed in terms of these momenta.This is generally simpler to do in practice than to calculate the total variation of both the non-minimal order Lagrangian and the surface term and perform the necessary integration by parts at the boundary to reduce the total boundary term to a form from which the junction conditions can be read off.
In order to prove the theorem on the existence of minimal evolutionary order Lagrangians, we need a number of advanced results from the formal calculus of variations.These are not in general widely known and they are usually considered as part of some abstract and highly technical framework such as the variational bicomplex [25,26,27] or the Vinogradov C-spectral sequence [28,29].For this reason a review of these operators and results including "classical" proofs is given in the Appendices at the end of the paper.
2 Junction conditions in a field theory

Setup and overview
We first define what we mean under a classical field theory.It consists of a coordinate manifold M of dimension n + 1 with standard coordinates (x µ ) n+1 µ=1 = (x 1 , . . ., x n+1 ), a dynamical field (q i ) m i=1 = (q 1 , . . ., q m ) of m components, and a (possibly nonlinear) differential operator of order s (here q (k) = (q i ,µ1...µ k ) stands for all kth order partial derivatives of the field).Then the field equations are given by E i [q] = 0. We also allow for the presence of sources in that if the ρ i (x) are m arbitrary functions of the coordinates, the field equations may be modified to For some theories, the sources may couple in a more complicated manner, but we shall make the above assumption on the form of the coupling, which is the most common in physics.It should also be noted that the sources ρ i may arise from the dynamics of some other field, but they can also be specified "externally".The origin of the sources are largely irrelevant assuming that we focus only on the dynamics of the field q.If the field theory is Lagrangian and has gauge symmetries (as in GR or Maxwell/Yang-Mills theory), then the operators E i [q] satisfy certain off-shell differential relations (Noether identities) [30], symbolically written as D[E] = 0, then the sources ρ i are admissible if and only if they also satisfy the relations D[ρ] = 0.If the sources arise from some coupled matter action that shares the gauge symmetries of the field, the resulting sources will automatically satisfy the Noether identities.See also Section 3.5 for consequences for the junction formalism.
A remark on the notation is that we shall use q to denote "generic" dynamical variables.When specific field theories are considered, then the appropriate common notation will be used in place of q (eg.g µν for a metric tensor or φ for a scalar field).
The above definition of a classical field theory is very restrictive.For most field theories in physics, especially when coupled to gravity, M is a more general differentiable manifold and the q fields are sections of some fibered manifold (almost always a locally trivial fibre bundle with structure group) over M .However if fibered coordinates are introduced in the field bundle, then locally (in the domain of those coordinates), the field theory can be cast into the above described coordinate form.The method we employ here for defining and calculating junction conditions in a classical field theory depends intrinsically on the use of adapted coordinates, therefore we dispense with the pretense of using global geometry from the beginning and utilize a coordinate-based formulation.Hence as long as we understand that the formalism presented here is local, we lose no generality by employing the above definition of a classical field theory.
Our goal is to derive junction conditions and thin shell equations for the field theory.We thus proceed by defining what we mean under these two terms.Suppose that the parameter space M is cut into two disjoint pieces by a surface 3 Σ with the two disjoint regions being M + and M − .Suppose that q + is a field defined in M + , q − a field defined in M − with sources ρ + and ρ − respectively and obeying the field equations (2.3) 3 We use the term surface for a codimension 1 submanifold.The term "hypersurface" is usually more common for this.Dropping the prefix "hyper-" is mainly for brevity, but it is also appropriate in the following sense.In three dimensional space, a surface has a local separation property, i.e. for any (interior) point on the suface, the point has a neighborhood (in the embedding space) which is cut into two disjoint pieces by the surface.For any pair of points in the two pieces, it is not possible to connect them with a continuous curve without intersecting the surface or leaving the neighborhood.In a general space of dimension eg.n + 1, the codimension 1 submanifolds have this property.Hence they are the submanifolds analogous to surfaces in 3-space.
Since the E i are local operators, it makes sense to let them act on the q ± which are defined only in M ± respectively.The q ± are assumed smooth4 in M ± respectively.The soldering of the field q is defined by which can be also written as where Θ Σ (x) is the Heaviside step function whose value is 1 in M + , 0 in M − and undefined (though often taken to be 1/2) on Σ.
Junction conditions for the field theory are then relations which have to be imposed on the fields q + and q − at the boundary surface Σ such that the soldered field q remains a solution of the field equations E[q] = 0 or E[q] = ρ in some appropriate weak sense.We also have the possibility of introducing singular sources concentrated on the surface Σ and solve the problem in the presence of these singular sources.The surface Σ is then called a thin shell.
In analytical form, the problem of determining junction conditions thus reduces to the question: what are the least restrictive regularity assumptions we can put on the field q at Σ such that the differential operator E[q] remains well-defined on q?
A differential operator E i [q] is said to be variational if there is a Lagrange function L[q] = L(x, q, . . ., q (r) ) of some order r such that are the Euler-Lagrange expressions of the Lagrangian.The Euler-Lagrange or variational derivative δ/δq i stands for where the d µ = d/dx µ are total derivatives with respect to x µ .It is well-known (and clear from the analytic expression) that if L[q] is an order r differential operator in q, then δL/δq i is at most order 2r.However if the Lagrangian function has some particular degeneracies in its functional structure, then it is possible for the Euler-Lagrange expressions to have order strictly less than 2r.
Note that as the variables q i ,µ1...µ k are symmetric in the lower indices, they are independent only for eg.
It would be inconvenient to order the indices in sums, therefore we observe the convention that we treat the q i ,µ1...µ k as if they were independent for all orders of the indices and define derivatives with respect to the q i ,µ1...µ k symmetrically such that they obey the chain rule as well as the symmetrized independence condition Given a surface Σ in M along which we are to construct junction conditions, let us introduce adapted coordinates as follows.The last (n + 1th) coordinate x n+1 is named z and it is defined such that the equation of Σ is z = 0.This is always possible, for example by constructing a local foliation with Σ as an initial leaf along the flow of a vector field transversal to Σ.The rest of the coordinates (y a ) n a=1 are then parameters on the surfaces z = const, Lie transported to a neighborhood of Σ via the foliation vector field.The coordinate z is then called the evolutionary parameter or variable and the rest of the coordinates y a are the instantaneous variables.Derivatives with respect to z are called evolutionary derivatives and derivatives with respect to y a are instantaneous derivatives.
A useful notational and terminological device is the following.We write for k ≥ 0 for the kth evolutionary derivative and also (for l ≥ 0) (2.10) Then we formally consider q i [0] to be different from q i in the sense that q i is a function of all the variables y 1 , . . ., y n , z on equal footing, whereas q i [0] , while still depends on all the variables, is considered to be a field variable only in the instantaneous variables y 1 , . . ., y n with z being a background parameter.Hence the fields q i [0] , q i [1] , etc. are all independent fields as functions of the y a .A Lagrangian L[q] can then be written in the form where this notation indicates that L can depend on q i [0] as well as a finite number of its instantaneous derivatives q i [0],a1...a l , on q i [1] and on a finite number of its instantaneous derivatives q i [1],a1...a l and so on with the highest evolutionary derivative appearing in L being the rth.However L may still depend on q i [r],a1...a l for some finite l.If L has the above functional form, then it is said to have evolutionary order r.This terminology is extended to all objects of similar type as L, such as the differential operators (2.12) that specify the field equations.

Minimal order Lagrangians
We first make some remarks on the covariant (non-split) case.If L[q] is a Lagrangian of order r, then the Euler-Lagrange expressions E i = δL/δq i are order s ≤ 2r.Therefore, if we have a system E i [q] of differential operators of order s which we know to be variational, the lowest theoretically possible order of any Lagrangian for the E i [q] is r = ⌈s/2⌉, i.e. r = s/2 when s is even and r = (s + 1)/2 when s is odd.Any Lagrangian of order ⌈s/2⌉ whose Euler-Lagrange expressions are the E i is known as a minimal order Lagrangian for the E i .
Do minimal order Lagrangians always exist?Unfortunately no.A positive example is the Einstein field equations for GR, where although the usual Einstein-Hilbert Lagrangian is second order, there are (see [40] for examples) minimal order (first order) Lagrangians for the Einstein field equations as well.The status of the Einstein-Hilbert Lagrangian as the "default" is because it is diffeomorphism-invariant.A first order Lagrangian for a metric tensor is well-known to be never diffeomorphism-invariant 5 .A negative example would be Gauss-Bonnet gravity [14] or the Horndeski theory [16], which do not admit first order Lagrangians.
The (local) existence of first order Lagrangians for second order field theories has been completely characterized by the difficult reduction theorem of Anderson and Duchamp [24] (see also [34] for a computationally simpler, but more technically intensive proof).The Euler-Lagrange equations of a first order Lagrangian L(x, q, q (1) ) always has the functional form E i (x, q, q (1) , q (2) ) = A µν ij (x, q, q (1) )q j ,µν + B i (x, q, q (1) ), i.e. it is an affine function of the second derivatives.It has been proven by Anderson and Duchamp in [24] that (locally, in the sense of having to assume the space of admissible field values has a simple topology) if the E i is variational (i.e.any Lagrangian of any order exists for it), then the above affine form is also a sufficient condition for the existence of a first order Lagrangian.Higher order generalizations of this result have been provided by Anderson in [27] through his scheme of minimal weight forms.Altogether, the conclusion is that in general minimal order Lagrangians do not exist.
On the other hand consider now a splitting of the variables (x µ ) = (y a , z) as described in Section 2.1 and a Lagrangian L[q [0] , . . ., q [r] ] of evolutionary order r.We only care about the evolutionary order, the total order may be higher (for example by also depending on q i [r],a1...a l for some l > 0).The Euler-Lagrange expressions can be written as where as usual δL δq i with p k being a nonnegative integer characterizing the order of dependence of L on the instantaneous derivatives of q i [k] .From (2.14), we see that if L has evolutionary order r then E i has evolutionary order s ≤ 2r.Analogously to the non-split case, if E i has evolutionary order s, then the theoretically possible lowest evolutionary order for any Lagrangian for E i is r = ⌈s/2⌉ and such Lagrangians are called minimal evolutionary order Lagrangians.Clearly, a minimal evolutionary order Lagrangian needs not be a minimal order Lagrangian in all variables.
One may then ask then whether a variational system of differential operators E i of evolutionary order s always has a Lagrangian of minimal evolutionary order.In the following, we prove that this is indeed the case.It should be noted that it is known [27] that variational systems of ordinary differential equations always admit minimal order Lagrangians.The proof for field theory systems and minimal evolutionary order Lagrangians is similar in spirit, but is more complicated, since the ordinary derivatives need to be replaced with various variational derivatives (Appendix C) and the Poincaré lemma needs to be replaced with the homotopy formulae presented in Appendix D. The contents of the appendices should thus be reviewed here as they are necessary to understand the following proof.
Theorem 2.1.Let E i [q] be a variational differential operator of evolutionary order s.Then there is a Lagrangian of minimal evolutonary order ⌈s/2⌉ for E i [q].Proof.As E i [q] is asssumed variational, there is a Lagrangian L of order r ≥ ⌈s/2⌉ such that where . (2.17) If r = ⌈s/2⌉, we are done, therefore let us suppose that r > ⌈s/2⌉, i.e. 2r > s when s is even and 2r − 1 > s when s is odd.We first compute the term with the highest evolutionary order in E i , which is order 2r.Note that if f = f [q [0] , . . ., q [r] ], then where p 0 , p 1 , . . ., p r are integers characterizing the dependence of f on the instantaneous derivatives of q i [0] , q i [1] , . . ., q i [r] .Then the evolutionary order 2r part of E i is ∂q j [r],a1...a l q j [2r],a1...a l + "lower order terms".
(2. 19) As E i is order s < 2r, it does not depend on q i [2r] in any way, and hence the coefficients of the q j [2r],a1...a l must vanish: ∂E (r) i ∂q j [r],a1...a l = 0, l = 0, 1, . . ., p r , i.e. it does not depend functionally on q [r] .We then apply the multi-field version (D.17) of the homotopy formula (D.4) to L with respect to the variables q i [r] : where the form of the current K a could be computed explicitly, but is irrelevant.As E (r) i does not depend on q [r] , this reduces to where L ′ differs from L in terms of a total instantaneous divergence d a K a (hence have the same Euler-Lagrange expressions), A i := E (r) i and B := L| q [r] =0 .We then plug this formula for L ′ back into the Euler-Lagrange expressions and isolate the term whose evolutionary order is 2r − 1: where the undisplayed terms are evolutionary order 2r − 2 and lower.We calculate the variational derivative in the second term: and here we use the higher order product formulae to get where the higher variational derivatives have appeared.From the first term we get hence the highest order part is [2r−1],a1...a k + "lower order terms" where the H a1...a k (r−1),ij (A) are the Helmholtz expressions computed on A i with respect to the variables The assumption r > ⌈s/2⌉ then again implies that E i does not depend on q [2r−1] and hence the coefficients must again vanish and we thus obtain We can thus apply the second homotopy formula (D.5) or (D.16) to A i to write where such an F can be constructed explicitly via the Vainberg-Tonti formula as Hence, the Lagrangian takes the form Total differentiation of the function F produces where we have integrated by parts at the last block of terms and introduced the "cut total derivative" (dF/dz) cut , which is calculated as dF/dz while pretending that F does not depend on the highest evolutionary derivative variable q [r−1] .Consequently, we may define the Lagrangian which differs from L ′ in total derivative terms only, and as it is order r − 1 in the evolutionary variable.If r − 1 = ⌈s/2⌉, we are done.If r − 1 > ⌈s/2⌉, then we may iterate this process unil we obtain a Lagrangian with minimal evolutionary order.
For clarity, we restate the steps needed to reduce the evolutionary order of a Lagrangian L[q [0] , . . ., q [r] ] by one in an algorithmic fashion, assuming that L is not of minimal evolutionary order.

First define
, 0].The expressions A i will necessarily have evolutionary order at most r − 1.

The A i are necessarily variational with respect to the q
by any other means.3. Write L reduced := B − (dF/dz) cut , which is a reduced Lagrangian.Here the cut total derivative of i.e. the total derivative but with the dependence of F on q [r−1] ignored.
Assume now that the field equations ] are of odd order and that L[q [0] , . . ., q [r] ] is a minimal order Lagrangian.Then the first part of the above proof still goes through and we obtain that is an equivalent of L (here A i and B are defined exactly as above).Hence for odd order systems one may always choose a minimal order Lagrangian which is an affine function of the highest derivative variables q [r] and depends on these only algebraically (i.e.without any instantaneous derivatives).

Junction conditions
As before, let E i [q] be a variational differential operator, Σ a surface in the n + 1-dimensional parameter space M partitioning it into two subdomains M ± , let (x µ ) = (y a , z) be adapted coordinates to the surface (such that Σ = {(y, z) ∈ M : z = 0}) and at this point we will also suppose that the evolutionary order of E i (with respect to this splitting) is s = 2r even.
We are now in the position of defining junction conditions for the field theory.Our claim is that if the field variables q i (y, z) are C ∞ (or C 2r ) in M + and M − respectively, and are C r−1 on Σ with bounded left and right rth derivatives, then the Euler-Lagrange expressions E i [q] are well-defined on q i in the distributional sense.We first note that as on M ± separately, the field variables are smooth, only evolutionary derivatives may cause discontinuities.Stated differently, q i [k] for 0 ≤ k ≤ r − 1 are continuous throughout M (including Σ) and smooth in the subdomains, hence q i [k],a1...a l for any l ≥ 0 are also continuous at Σ and smooth in M ± , while the rth evolutionary derivatives q i [r] suffer only a jump-discontinuity at Σ.
] ] of evolutionary order r for this system.In terms of this Lagrangian, the field equations are written as where , and as such each has evolutionary order r at most.Therefore, if the field variables q i belong the above stated regularity class, ] is a well-defined function throughout M \ Σ with at most a jump discontinuity at Σ stemming from q [r] having possibly a jump discontinuity there.The Euler-Lagrange expressions are computed from the E (k) i by letting total derivatives act on these, which are linear operators, hence can be interpreted distributionally.For any m-tuple ϕ i = ϕ i (y, z) of smooth test functions whose supports are compact and do not intersect any possible outer boundary of M , we have where for a singular distribution T , the bracket T, ϕ means the action of T on the test function ϕ, while if T is regular and represented by the locally integrable function f , this action is computed through integration as T, ϕ = f, ϕ = f (x)ϕ(x) dx.In (2.37) the last term involves only regular distributions with the bracket being an integral.We will need where the coefficients π are computed by evaluating the total derivative to get where we have temporarily defined π Comparing the coefficients, we get the recursion formulae which can be solved by backwards induction to obtain for 1 ≤ k ≤ r are then the (higher order, Ostrogradsky-) canonical momenta [35] of the field variables with respect to the given splitting.The bracket at the end of (2.37) may then be computed as (we write Σ(z) for the surfaces z = const with Σ = Σ(0)) where we have defined for the jump-discontinuity of a quantity.Write δ Σ for the Dirac delta associated with the surface Σ, which satisfies for any test function ϕ.The evolutionary derivatives of the Dirac delta are denoted δ Σ Plugging this back into (2.42),we get and if the "soldering" is the Heaviside step function, we get which then shows that the Euler-Lagrange expressions E i [q] evaluated on q are well-defined as a distribution with This relation is then the junction condition for the field theory, since if we are given a source where is the regular part and the expressions σ are (smooth) functions on Σ, the field equations E i [q] = ρ i are equivalent to the pair of bulk equations and the boundary equations which constitute the junction conditions.
We briefly remark on why L had to be a minimal evolutionary order Lagrangian.Let the situation be given as before but with an alternative Lagrangian L ′ [q [0] , . . ., q [r ′ ] ] of evolutionary order r ′ > r which is non-minimal.The Euler-Lagrange expressions are then where Since the E i are the same as before, we expect that the Euler-Lagrange expressions are well-defined distributionally if the q i are C r−1 at Σ with bounded left and right derivatives there and smooth everywhere else.However the E ′(k) i [q [0] , . . ., q [r] , . . ., q [r ′ ] ] are in general non-linear differential operators in q, and therefore the E ′(k) i would need to be evaluated on distributions, which is not in general well-defined.Of course as the existence of minimal evolutionary order Lagrangians show, it is always possible to eg. rearrange the various evolutionary derivatives that appear in the expressions E ′(k) i to obtain a form which can be evaluated on test functions correctly.But this process is impossible to discover if we only have the symbolic expressions E ′(k) i and is even difficult when we have explicit expressions.The passage to a minimal evolutionary order Lagrangian serves precisely this purpose in that it produces a symbolic form for the Euler-Lagrange expressions that can be correctly evaluated on test functions.

On the minimal regularity conditions
What we have shown in Section 2.3 is that if the appropriate conditions are satisfied (the theory is variational and the evolutionary order is even), then the Euler-Lagrange operator is well-defined on fields that are smooth away from Σ, and C r−1 on Σ with bounded left and right rth derivatives there and this is true for any such theory irrespective of the precise analytic form of either E i [q] or the associated Lagrangian L[q] of minimal evolutionary order.
In special cases it is possible that solutions of the field equations exist with even less restrictive regularity conditions at Σ.This cannot however be determined by such a general process, and a case-by-case analysis is necessary.The most obvious examples arise when the differential operator determining the field theory is linear, i.e. of the form where the coefficients D µ1...µ k ij may depend on the coordinates x µ but not on the field variables q i or their derivatives.This case is significant because the linear operator D has a formal adjoint D † (see Appendix B) and therefore it can act on any distribution by duality, i.e. as The right hand side defines the meaning of D i [q] whenever the q i are distributions.In consequence, the field variables q i can be interpreted as essentially any distribution and the field equations The junction conditions derived in Section 2.3 should thus be viewed as defining the lowest possible regularity properties of the field variables applicable to any field theory that obeys the few conditions outlined during the derivation.They are then always valid junction conditions, but in exceptional cases, even less restrictive regularity conditions are allowed.
In permissive cases such as this, physical considerations should be used to determine the appropriate junction conditions.For example it might happen that singular sources of a certain kind (eg.involving many derivatives of the Dirac delta) are deemed to be void of physical meaning and thus one does not consider such sources.Then the source term -or the lack thereof -in the field equations themselves will force certain regularity conditions upon the field that would otherwise not be imposed by merely mathematical reasoning.

Junction conditions are broken extremals
The junction conditions (2.52) are also extremals of a variation problem.The setup and notation is the same as in Section 2, ] is an m-tuple of variational differential operators of evolutionary order 2r, L[q [0] , . . ., q [r] ] is a minimal evolutionary order Lagrangian for it.The coordinate manifold M is of the form M = [z 0 , z 1 ] × Σ (here z 0 < 0 < z 1 ) with action functional We look for extremals of this functional in the class of functions q i (y, z) with q i smooth (or C 2r ) in M ± and C r−1 on Σ = Σ(0) with bounded left and right rth derivatives there.For simplicity we assume that the field variations δq i are compactly supported with the supports not intersecting the outer boundary pieces of M (this is to ensure there are no surface terms there).The extremals of S 0 will satisfy the junction conditions (2.52) only with vanishing sources.For simplicity, we first consider only sources concentrated on the junction surface Σ, therefore we modify the action by adding a contribution where the surface Lagrangian K involves at most r − 1 evolutionary derivatives to ensure that its value is unambiguous.When S 0 is varied, we split the z integral as z1 z0 dz = z1 0 dz + 0 z0 dz and collect the interior boundary terms at z = 0 (corresponding to Σ).The derivation is extremely similar to the one in Section 2.3, hence we omit most of it.In K, the fields q [0] , . . ., q [r−1] are evaluated at z = 0 since the integral is over the zero set of z, and we name the variational derivatives as which gives The extremals condition δS[q] = 0 for any field variation (with the required properties on the supports) implies the field equations in the bulk as well as the boundary equations which coincides with (2.52).It should be remarked that if we had also added bulk sources whose Lagrangians depend on q [k] for k ≥ (r − 1)/2, then the variations of these source Lagrangians would have themselves contributed surface terms at Σ.The total sources σ (k) i are then constructed from the sum of these contributions and those from the surface integral Σ d n y K.
In the calculus of variations with a single independent variable, extremals that are not sufficiently differentiable (for the Euler-Lagrange expressions to exist regularly) at a discrete set of isolated points are called broken extremals [36].Hence it is reasonable to refer to fields that obey the junction conditions (including the bulk field equations) as broken extremals as well.
We also note here that if we had used a Lagrangian with non-minimal evolutionary order, we would have encountered a larger number of boundary terms at Σ whose evolutionary order is higher.Correspondingly, we would had to have put higher regularity conditions on the field to be able to repeat the process done above.By adding total derivatives with arbitrarily high evolutionary orders, we can increase the number of boundary terms and the required differentiability class of the field without bounds.Hence, just as when we had used distributions in Section 2.3, the minimal order Lagrangians select the unique minimal allowed regularity class of the field in which junction conditions can be interpreted regardless of the Lagrangian's functional form (cf. Section 3.1).
Unlike the distributional case, where the junction conditions have been deduced from the field equations directly, albeit with the Lagrangian formulation used as a tool to arrive at a "favorable" symbolic form of the field equations, the uniqueness of the junction conditions (at least for a fixed splitting of the variables into evolutionary and instantaneous ones) is not apparent, as the Lagrangian for the field is not unique.
Once we have selected a minimal order Lagrangian L[q [0] , . . ., q [r] ], we still have the freedom to modify the Lagrangian by adding a total derivative: as long as F is at most evolutionary order r − 1 (adding instantaneous divergences eg.d a K a would not contribute surface terms at Σ). Adding this term causes the change in the canonical momenta.However by assumption, the fields are C r−1 at Σ and as F has evolutionary order r − 1 at most, the jump-discontinuity of this added term vanishes, hence which explicitly shows that the choice of Lagrangian within the class of minimal evolutionary order Lagrangians does not affect the junction conditions.

Variational counterterms
The Einstein-Hilbert action for a metric tensor on an n + 1 dimensional space is which is second order in the metric due to the scalar curvature R. On the other hand, its Euler-Lagrange equations (the Einstein field equations) are second order as well.There are in fact first order actions for the metric, but they suffer from a number of drawbacks such as not being diffeomorphisminvariant.The variational problem associated to this action is not well-posed (see [23] for a detailed discussion on what this means) which essentially stems from having "too many boundary conditions" due to the Lagrangian having too high order compared to the field equations.This also prevents the junction conditions to be directly calculated from this action, which is not surprising, as the Lagrangian does not have minimal order.
It should be remarked that this particular argument on the well-posedness of the variational principle is, at least in the general case, only a useful heuristic that can be summarized as "a system with evolutionary order 2r requires r+r boundary conditions (for each dynamical variable) on the initial and final surfaces of constant evolutionary parameter, mapped to 2r initial value data", which is rigorously justifiable for hyperbolic PDEs with noncharacteristic initial/final surfaces or Cauchy-Kovalevskayatype systems with analytic coefficients, but remains a mere heuristic in other cases.
The typical solution to this problem is to add surface terms to the action.If the boundary ∂M consists of timelike and spacelike pieces only 6 , then to each piece one adds the integral Σi ε κ K √ hd n x, where K is the extrinsic curvature, h is the absolute value of the determinant of the induced metric and ε = ±1 depending on whether the piece Σ i is timelike or spacelike.This is the so-called GHY boundary term [21,22].Generalizations of this term also exist to boundaries of arbitrary signature [37,38], and there are also alternative boundary terms arising from the reduction to the mentioned non-invariant first order Lagrangians [31,32,33,40].It is possible to deduce [39,41,40] the junction conditions in GR from the action extended by the GHY term via a variational procedure analogous to Section 3.2, provided one adds the GHY term to both sides of the interior boundary surface.
It is easy to see that there is a close relationship between minimal evolutionary order Lagrangians and GHY-like terms added to the initial and final surfaces (with respect to a foliation).Suppose that L[q [0] , . . ., q [r] ] is a Lagrangian of evolutionary order r and the field equations E i [q [0] , . . ., q [s] ] ≈ 0 are evolutionary order s with s < 2r and s even.By varying the action functional determined by L we obtain rm boundary conditions on the initial surface Σ(z 0 ) and also rm boundary conditions on the final surface Σ(z 1 ).We however expect that the Cauchy problem for the equation requires sm < 2rm initial conditions (see [23] for the relation between initial and boundary data), hence the variational boundary conditions overdetermine the system.
Writing the variation of the action as we see that if s = 2p, the terms π i δq i [r−1] are in excess and should be absent if the variational problem to be well-posed.A variational counterterm is an integral of the form which has the effect that the combined action S + B behaves at the initial and final slices as if its integrand were a minimal evolutionary Lagrangian (of evolutionary order p).Now if we see that Σ(z) dz F is a variational counterterm, hence every process of reducing the evolutionary order of a Lagrangian to a minimal order one also supplies a variational counterterm.The converse is also clear, i.e. whenever B = d n y B is a counterterm, the Lagrangian L min = L + d dz B is a minimal evolutionary order Lagrangian possibly up to total instantaneous divergences.
Hence via Theorem 2.1, we have proven the general -albeit only local -existence of GHY-like counterterms at least for surfaces transverse to the evolutionary direction.Our formulation only reduces the order of the Lagrangian in the evolutionary direction, and the minimal evolutionary order Lagrangian can have quite pathological behaviour at the instantaneous boundaries (i.e. the boundaries ∂Σ(z)).It is then clear that if on the total manifold I × Σ we can set up "polar coordinates" on each slice Σ(z) hence that ∂Σ(z) corresponds to z = const.and r = const (with r being the "radial coordinate"), then the behaviour of the action can also be regularized at the instantaneous boundary by computing the relevant counterterm with now r being the evolutionary coordinate.This will not affect the Lagrangian itself but might produce corner terms at the boundaries ∂Σ(z 1 ) and ∂Σ(z 0 ).It is beyond the scope of this text whether these corner terms can be eliminated by further counterterms.
The line of thought presented above regarding the effective equivalence of GHY-like variational counterterms and the use of minimal evolutionary order Lagrangians then demonstrates why many authors (such as in [16]) are able to derive junction conditions in GR or more general modified theories of gravity by first adding GHY-like counterterms to the action at the junction surface Σ from both sides then taking the variational process for broken extremals.This is essentially the same as when the process in Section 3.2 is carried out with the minimal evolutionary order Lagrangian.

Systems of odd order
Systems with odd evolutionary order are intrinsically degenerate and can never have well-posed variational principles.Since they never have Lagrangians whose evolutionary order is precisely half of that of the field equations', the determination of the junction conditions for these systems is much less clear.When the evolutionary order of the field equations is s = 2r − 1, we can nonetheless show that the Euler-Lagrange expressions exist distributionally when the field variables have the same regularity at Σ that they would have for an equation of evolutionary order s + 1, i.e. the fields must be C r−1 at Σ with bounded left and right rth derivatives there.
To verify this, we recall Eq. (2.35), namely that systems with odd evolutionary orders always have minimal evolutionary order Lagrangians of the form The Euler-Lagrange expressions can be written in the form where and we notice that where π (r) i are the highest canonical momenta.On the other hand, for (k) i has evolutionary order r but is linear in q i [r] and its instantaneous derivatives, while E (r) i has evolutionary order r − 1. Writing the coefficients A (k)|a1...a l ij may in principle be any complicated nonlinear function of the variables q [r−1] , hence in general such an expression is well-defined as a distribution only if these coefficients are continuous functions, i.e. we once again obtain the regularity condition that the q i must be C r−1 at Σ with bounded rth derivatives, which coincides with the case when the field equations have evolutionary order 2r.But then a process completely analogous to the one presented in Section 2.3 gives us However we may then witness that as π (r) i = A i and this is of evolutionary order r − 1, we necessarily and always have π = 0 as an identity, since the r − 1th evolutionary derivatives are continuous.Hence although the regularity conditions for a field theory with evolutionary order 2r − 1 are the same as for on with evolutionary order 2r, we obtain less junction conditions, namely now and if singular sources are also introduced as and σ (r) i = 0.

Initial value constraints
A Lagrangian L[q [0] , . . ., q [r] ] of evolutionary order r7 is degenerate if the matrix is not invertible (for simplicity, we assume that L does not depend on instantaneous derivatives of q i [r] , which is often the case).As it is well-known [30], this means that the Hamiltonian formulation does not exist in a traditional manner, and there are nontrivial initial conditions the initial data must satisfy.Although the presence of initial value constraints do not necessarily imply the existence of gauge symmetries (see [30] for a detailed analysis), the converse is true, i.e. gauge symmetries always signal degeneracies in the Lagrangian and there are always initial value constraints associated, moreover in physically relevant situations, initial value constraints do tend to appear in tandem with gauge symmetries.Hence in the sequel we will suppose that any initial value constraints present in the system are connected to gauge symmetries.
Initial value constraints appear in the junction formalism as constraints on the singular sources.Recall [2] that in GR along a timelike or spacelike junction surface Σ, the junction conditions are where ε = ±1 depending on whether the surface is timelike or spacelike and S ab is the surface energymomentum tensor.These junction conditions are however supplemented by the relations and where − is the arithmetic mean (eg.F = 1 2 (F + + F − )) of a discontinuous field quantity, D and R are the Levi-Civita connection and scalar curvature respectively, associated to the induced metric h ab on Σ.These relations are consequences of the constraints [18] valid on any (timelike or spacelike) hypersurface and are obtained by taking sums and differences of the constraints on each side of the junction surface Σ.
Such relations are expected crop up when the junction conditions are computed for any field theory that has gauge symmetries.Unfortunately, a general analysis of such relations are very difficult, therefore we provide a complete analysis only when the Lagrangian is completely first order and the gauge symmetries are exact, vertical and have differential index 1.Let the Lagrangian be L(q, q (1) , x) and suppose that we have the infinitesimal gauge transformation where the coefficients Q i A and Q i,µ A may in general be field-dependent and the λ A are the gauge parameters.Verticality means that the there are no corresponding transformation δx µ of the coordinates, exactness means that under the above variations we have δL = 0, and having differential index 1 means that the variation δq i contains at most first derivatives of the gauge parameters.Let be the Lagrangian momenta.Then (essentially applying the second Noether theorem) we have and here it is possible to perform further "integration by parts" on the gauge parameter to get Set Integration of the above relation on the manifold with λ A having compact support gives the off-shell identities N A = 0 and d µ S µ = 0. (3.37) If the field equations are E i = ρ i for some sources ρ i , then the former equation reduces to which constrains the sources.For GR this is the well-known pseudo-conservation law ∇ µ T µ ν = 0. Now take an arbitrary hypersurface Σ, introduce adapted coordinates (x µ ) = (y a , z), and choose the gauge parameters λ A such that they have compact support that intersects Σ such that supp λ ∩ Σ is partition by Σ into two disjoint regions.Integration of the conservation law d µ S µ = 0 over the manifold piece corresponding to z ≤ 0 while the field equations are enforced then produces the relation (recall that π i := π z i is the canonical momentum associated to this splitting of the coordinates) Since on the surface Σ given by z = 0 we can choose λ A and λ A ,z separately, we obtain which are relations valid over any surface Σ.
Suppose now that we have a junction situation along Σ.Then these relations are valid on Σ as considered from either side of the surface.The junction conditions are [π i ] + − = σ i , where σ i are the singular sources.Taking jumps and means of the constraints, we get where we have assumed that the coefficients Q depend on the field such that they are continuous at Σ, which is often the case.If this condition is not met, then evidently the jumps and means must involve them as well.
Similar relations are given when the Lagrangian is higher order and/or the gauge symmetry has higher differential index.However the combinatorial complexity of the resulting formulae and thus also the analysis involved gets rapidly unmanageable.Moreover when the gauge symmetry is non-vertical or it is a quasi-symmetry (δL = d µ K µ for some non-closed current K µ ), the analysis again gets a lot more complicated.
In the very general case, what we can say is that in the presence of gauge symmetries, there exist a number of constraint equations which involve the sources explicitly and depend on the field variables up to some unspecified evolutionary order s.These constraint equations can be obtained directly from the degeneracy structure of the Lagrangian (as detailed in eg.[30]) without having to use the second Noether theorem.Then taking sums and differences at the junction surface Σ, we obtain the sets which provide a number of constraints on both the bulk and singular sources at the junction surface.

A cautionary example
In this section we consider a scalar field theory in flat four dimensional spacetime M which illustrates that a naive usage of distributions (eg.treating singular distributions as if they were functions) to derive the junction conditions can produce erroneous results.This is the main reason behind the perceived ambiguities in the junction formalism that has been alluded to in the Introduction.
The scalar field theory is defined by the Lagrangian where (∂φ Here the dynamical variable is denoted φ, the independent variables are t, x, y, z and indices are raised and lowered by the canonical flat metric η µν .This Lagrangian belongs to the G 3 -part of Horndeski theory [16].
The Euler-Lagrange expression is easily derived to be which is quadratic in the second derivatives of the field, hence there is no Lagrangian for this field that is completely first order in all variables.Suppose now that there is a defect at z = 0 and the latin indices a, b, c . . .will take the values t, x, y.Splitting the Euler-Lagrange expression gives The regularity assumptions on the field φ are that it is continuous at z = 0 with possibly discontinuous but bounded first z-derivatives φ ,z .Hence φ ,a , φ ,ab etc. are also continuous at z = 0 and φ ,z , φ ,za , φ ,zab etc. have well-defined, bounded left and right limits at z = 0. Then distributionally we have where δ Σ is the Dirac delta associated to the surface Σ determined by the equation z = 0.If we now pretend that δ Σ is a function and insert this expression into E, we find where E is the collection of all the regular terms in E. Note that φ ,z , thus also φ 2 ,z does not have a well-defined value at z = 0, but all coefficients of δ Σ are evaluated at z = 0. Define for any function f continuous in M with bounded left and right limits at z = 0 the mean value Expressions such as φ 2 ,z δ Σ are often interpreted to mean φ 2 ,z δ Σ , hence we obtain The Schwarz impossibility theorem [20] guarantees that the products of singular distributions like δ Σ with discontinuous functions cannot make sense if we wish to keep the meaning of the product of regular distributions as the usual pointwise product.There is thus no way to actually derive this result, this interpretation has to be supplied by hand.
The original Lagrangian (3.44) in 3 + 1 split form is We may then reduce this to first order in z by applying Theorem 2.1 to it.In fact the reduction here is easy to perform even without knowing the Theorem.Consider a function F (φ, φ ,a , φ z ) and its total z-derivative and we want to solve the equation which is integrable with eg.
Then the Lagrangian L ′ := L + dF/dz has the same Euler-Lagrange expression as L but contains only first derivatives in the z variable, Thus L ′ is a Lagrangian of minimal evolutionary order.From the general theory we know that where This is calculated to be hence + − be the coefficients of the Delta distribution in E and E naive respectively.In order to reduce notation clutter, let a := φ + ,z and b := φ − ,z .Then The second term here is which is a nonzero polynomial in a, b.Hence we must conclude that E naive is not equivalent to E. However both the rigorous distributional calculation and the variational framework establish that E is the correct singular part for the Euler-Lagrange expression.The junction conditions for the scalar field theory at the surface z = 0 are then where σ is a singular source concentrated on z = 0.
The conclusion for this section is that when one calculates the junction conditions using distribution theory, all distributional relations must be evaluated by smearing the distribution against a test function and performing only "valid" manipulations.Starting with (3.46), it would take a very lucky burst of intuition to be able to isolate total z-derivatives from the coefficients of φ ,zz in a way that the operator can be interpreted properly as a distribution.Our whole formalism is based on the premise that if we find a minimal evolutionary order Lagrangian, then the total derivatives in the Euler-Lagrange expressions appear "just right" so that we are able to do that.

Equations with evolutionary order 2
It can be extremely laborous to derive the junction conditions even for low order field equations, especially for modified theories of gravity.Therefore it would be useful to invent a process which can be applied to the equations of motion directly to obtain the junction conditions, without having to go through the pains of reducing the order of the Lagrangian and the calculating the canonical momenta from the reduced Lagrangian.
The case with the most practical importance is when the field equations have evolutionary order two and thus the minimal Lagrangian has evolutionary order one.For this particular case, such a process exists.First of all, the weak Euler-Lagrange expressions take the form hence there is only a single set of canonical momenta which need to be computed, and if L[q [0] , q [1] ] is a minimal order Lagrangian, we have (for a regular field q) where the terms with second evolutionary order are all concentrated in dπ i /dz.In particular, if we expand the total derivative, we get and hence the field equations take the form where we know that for each 0

.68)
The coefficients W a1...a k ij can be read off the expressions E i and we know that the above equation is integrable for π i .We can then construct , tq [1] ]q j [1],a1...a k , (3.69) which will necessarily satisfy W a1...a k ij = −∂Π i /∂q j [1],a1...a k .Now, we cannot know for sure that Π i is equivalent to π i , we only know that π i − Π i does not depend on q [1] in any way.But the pertinent regularity condition for a system with evolutionary order 2 is that the field q is C 0 at the junction surface with the first evolutionary derivatives q i [1] having a jump discontinuity there.Hence [Π i ] + − = [π i ] + − and thus the junction conditions can be computed from Π i .
Another useful consequence of this approach is that for generally covariant theories, it becomes possible to compute the junction conditions in a more geometric manner than the process outlined in Section 2.3.The second evolutionary derivatives of the dynamical variable take the distributional form which when plugged back into (3.67)gives It should be emphasized that this is a naive calculation (cf.Section 3.6), because the coefficients , q [1] ] are not in general continuous at Σ, hence they cannot multiply the Dirac delta.As explained in Section 3.6, such terms are usually interpreted as average values, but this is a mathematically untenable interpretation that can produce wrong results.The utility of this is that the W a1...a k ij can be read off as the coefficients of the q j [1],a1...a k + − δ Σ when the above is interpreted as a formal expression.One may then wonder why not just simply determine the W a1...a k ij from the field equations directly, as in (3.67)?For modified theories of gravity, reading off the W a1...a k ij from the E i requires splitting the variables, which often leads to very complicated and unwieldy coordinate expressions, but the "formal expression" (3.71) can often be computed in a geometric manner.Thus we are able to compute the junction conditions "naively", which can give wrong results but is otherwise the practically simplest approach, then we are able to infer from the wrong naive result the actually correct junction conditions.Rather than attempting further explanations here, we direct the reader to Section 4 for an example application of this method.

Example application
We revisit the Horndeski-class scalar-tensor theory the author has investigated in [42] whose Lagrangian is in which the dynamical variables are a metric tensor g µν and a scalar field φ with X = − 1 2 g µν ∇ µ φ∇ ν φ and B, ξ, F are arbitrary unspecified smooth functions of the scalar field.Suppose that M is four dimensional (this is not a necessary restriction however) and Σ is a junction surface whose causal type is completely arbitrary, including the case when it is signature-changing.The Lagrangian is second order in all coordinates, but its field equations are also second order.The field equations are quadratic in the scalar fields's second derivative, hence there is no totally first order Lagrangian for the field.To determine the junction conditions, we employ the process outlined in Section 3.7, i.e. we work with the field equations directly, which are with with E µν = i=2,3,4 E µν i and E φ = i=2,3,4 E φ i .Per the discussion in Section 3.7, we work covariantly as much as it is possible.The regularity conditions are that g µν and φ are C 0 at Σ with bounded but possibly discontinuous first derivatives and let l µ be a vector field along Σ transversal to Σ and pointing from M − to M + .We take n µ to be the unique covector field along Σ which is normal to Σ and satisfies n µ l µ = 1.We will also use the notations and conventions of rigged surfaces, which are reviewed in Appendix E for the reader's convenience.If Θ Σ is the Heaviside step function, which is being 1 in M + and 0 in M − , the covector valued Dirac delta distribution is where δ Σ is the scalar Dirac delta distribution (see eg. [8] or [40]).We may then compute that where γ µν is a tensor field along Σ calculated as [g µν,ρ ] + − l ρ .This is because only the transversal derivatives of g µν suffer jumps at Σ. Analogously, for the scalar field we may write where χ = φ ;ρ l ρ .The second covariant derivatives of the scalar field are computed as which also gives For the metric sector, we then have and the curvature tensor is If we introduce coordinates adapted to l µ we get (latin indices take the values 1, 2, 3 and see Appendix E for the notation for various projected quantities) For simplicity, let If we plug in the distributional expressions into the field equations (4.2), we get where φ n = n µ φ ;µ and where the subscripts "n" indicates that these are naive formal expressions.Now we want to factor γ ρσ out of the Q tensors: with and We then write since as explained in Section 3.7, we are able to compute the (scalarized) canonical momenta of the system as where W ... {t} means that W ... is evaluated at tg µν,l and tφ l , with g µν,l := l ρ g µν,ρ and φ l := l ρ φ ;ρ .We can read off the coefficients as First we verify that W µν|ρσ and W µν are tangent to Σ in that the contraction with n ν vanishes.For W µν|ρσ , this is equivalent to verifying that B µν |ρσ n ν = 0, and B µν |ρσ n σ = 0. We have and indeed hold.For W µν the analogous calculation is easily performed to yield Finally, for W |ρσ , we can check A |ρσ n σ = 0 and A µν |ρσ φ ;µ φ ;ν n σ = 0 separately: hence Also we notice that hence these two coefficients coincide.
We now use coordinates adapted to l µ and compute the coefficients W ... in decomposed form.As these coefficients are all tangent, only the W ab|cd and W ab = W |ab parts will be nonzero: and to further evaluate this expression we insert to get We then compute and finally First we note that W ab|cd {t} = W ab|cd as these coefficients do not depend on the first evolutionary derivatives of neither the metric nor the scalar field.We then calculate the (scalarized) metric canonical momenta by where we note that hence The scalar field momentum is then where the primary difficulty lies in computing ∆{t}.Both φ ||ab and K ab depend nontrivially on h ab,l .We have thus and and Since we have defined E µν as the coefficients of δg µν , these expressions are negatives of the ones considered usually, hence the regular field equations are where T µν is the energy-momentum tensor in the usual sense and T φ is a scalar source.The correct distributional form of the Euler-Lagrange expressions are then with [Π µν ] + − being the four dimensional extension of Π ab + − , and the junction conditions are where the S ab and S φ are the surface energy-momentum tensor and surface scalar source respectively.
The junction conditions above have been computed under no assumptions on the causal type of Σ.
Let us now assume that Σ is timelike or spacelike, choose l µ to be the outward pointing unit normal in which case n µ = εl µ , where ε = ±1 with the + sign chosen when Σ is timelike and the − chosen when Σ is spacelike.Then ε(n) = ε, ν a = 0, h ab * = h ab , h ab,l = 2K ab , K ab = εK ab , ω ab = −εh ab and φ ||ab + − = 0. Inserting these relations give and These junction conditions correct and greatly expand the ones that have appeared in [42].

Conclusion
We have devised a method to derive the junction conditions and thin shell equations, applicable to an extremely wide class of field theories, by reducing partially the order of the Lagrangian of the field theory.We have proven that the junction conditions can be both realized as singular extensions of the ordinary Euler-Lagrange equations as well as broken extremals of the action functional and both of these representations coincide.Both the reduction process and the subsequent computation of the junction conditions are constructive.The former requires the calculation of a sequence of quadratures while the latter the computation of a number of variational derivatives.We gave an example illustrating how the distributional formulation can produce erroneous results when applied incorrectly and related the reduction process of the Lagrangian to Gibbons-Hawking-York-like boundary terms, thereby also giving a general -although local -existence proof for such boundary terms.
For the most common case, when the field theory has equations of order 2, we have also derived a formal trick for computing the correct junction conditions from the incorrect "naive" ones by applying a homotopy integral to the "naive" coefficients.The value of this method lies in that it is usually possible to deduce the "naive" junction conditions through covariant geometric means, without splitting the variables.This formal trick then has been used to compute the junction conditions for a relatively complicated Horndeski class theory valid for arbitrary signature junction surfaces.
Even though the various ambiguities and tensions of the existing methods to compute junction conditions are well-known on a folklore level, it appears that until now no systematic analysis of the junction conditions from a general point of view has been performed.This work thus addresses this particular gap in the literature, while also providing effective methods for calculating the junction conditions.

B Formal adjoints
The higher product formulae will now be used to provide an explicit formula for the formal adjoints of linear differential operators.Let φ : U → R m , φ = (φ 1 , . . ., φ m ) = (φ i ) m i=1 be an m-tuple of smooth functions.A linear differential operator of order r acting on φ is of the form where φ ,µ := ∂ µ φ = ∂φ/∂x µ , and the D µ1...µ k for 0 ≤ k ≤ r are m × m matrices of smooth functions symmetric in the indices and • denotes matrix multiplication.For a pair of smooth functions φ, ψ : which does not necessarily converge (as U is not compact).Here φ t is the transpose of the column φ.
The formal adjoint of D is the linear differential operator D † such that for all compactly supported (hence the following integrals converge) smooth function ψ : Given the linear differential operator (B.1), its formal adjoint is with where (D t ) µ1...µ k ν1...ν l is the transpose of the matrix D µ1...µ k ν1...ν l .
Proof.From the definition of the formal adjoint, we have where we have used Corollary A.2, but hid all except the first term under the divergence As ψ is compactly supported, this term vanishes.We then apply the higher order product rule (Proposition A.1 and Corollary A.1) to the first integral here to get where From the symmetry of the bracket ψ, φ it is clear that the act of taking the formal adjoint is an involution, i.e.D † † = D for any linear differential operator.

C Higher Euler operators and Helmholtz conditions
The notation we use here is similar to that of Section 2.1 (the main exception is that we take M ⊆ R n instead of R n+1 ), but to ensure the independence of the Appendix, we quickly summarize here.We consider a variation problem specified by an order r Lagrange function L[q](x) = L(x, q(x), . . ., q (r) (x)), where the independent variables are x = (x 1 , . . ., x n ) = (x µ ) n µ=1 (and they are taken from some open set M ⊆ R n ), the dynamical variables are q(x) = (q 1 (x), . . ., q m (x)) = (q i (x)) m i=1 with formal derivatives q (k) = (q i ,µ1...µ k ).As the derivative variables q i ,µ1...µ k are symmetric in the lower indices, the differentiation rules they obey are summarized through As a rule, in f [q], the "functional argument" [q] stands for a dependence on x, q, q (1) , . . ., q (r) up to some finite order r (the order may differ for different objects f ) and such functions are also called "operators", since they are essentially differential operators on the dynamical variables q i (x).Total differentiation is denoted where ∂/∂x µ treats the q i and the derivative variables as if they were constants.
The variation of the Lagrangian given above is then and this can be rewritten to separate a total divergence in at least two ways.
1.The (higher ) Euler operators, also called (higher ) variational or Euler-Lagrange derivatives are defined as 2. The Lagrangian momenta are defined as The Euler operators are "operators" in the sense of associating to each Lagrangian L the expressions δL/δq i ,µ1...µ k .They were defined and extensively investigated by Aldersley [45] (see also [27]).The term "higher" refers to the fact that for k = 0 we have which are the usual Euler-Lagrange derivatives, thus for k > 0 these are "higher order" analogues of the Euler-Lagrange derivative.For k = 0 we also have P i = δL/δq i , hence we usually call the P µ1...µ k i Lagrangian momenta only for k > 0. The expressions for the Lagrangian momenta and the higher variational derivatives are extremely similar with only the binomial factor being a difference.But as this binomial factor is different in each term of the sum, the difference between the momenta and the higher variational derivatives is not trivial to quantify.
Proposition C.1.For any Lagrangian L of order r, we have the following two expressions for its variation: and without calculating anything explicitly, we see via the higher product formula (Corollary A.1) that this can be rewritten as for some coefficients P µ1...µ k i (1 ≤ k ≤ r) which can moreover be taken to be symmetric in the upper indices.In the preceding formula, the total derivative is evaluated and we also abbreviate P i = δL/δq i to get These can be solved by backwards induction to get as asserted.
1.The linearization of this operator is the linear differential operator 2. The Helmholtz operator H is the operator transforming an m-component differential operator ε = (ε i ) into the linear differential operator where D † is the formal adjoint.
3. Applying the formula (B.5) for the formal adjoint as well as the definitions of the higher variational derivatives (C.4), we get where which are called the Helmholtz expressions associated to ε i .

The equations
are called the Helmholtz variationality conditions (or just Helmholtz conditions) for ε i .
For clarity, we write out the Helmholtz expressions for s = 2: denote an m-component differential operator of order s ≤ 2r such that there is a Lagrangian function L[q] of order r with E i = δL/δq i (then E is called variational).Then the differential operator E satisfies the Helmholtz variationality conditions, i.e.
Proof.We work with the action integral directly.Ordinarily, variations δq i are obtained by first considering a one-parameter family q i s (x) of fields and taking the s-derivative at s = 0, i.e. δq i = ∂q i s /∂s s=0 .We now consider a two-parameter family q i s,t (x) of fields with Any of the two variations δ 1 q i and δ 2 q i may be chosen to have compact support. Then where we have integrated by parts and chosen δ 1 q to have compact support, thus As the two variations commute, by substracting δ 1 δ 2 S from this, we get zero: where we have used the formula for the formal adjoint along with the definition of higher variational derivatives.Thus Since δ 1 q i is any compactly supported bump function, the usual argument tells us that this is only possible if Then at any one point x ∈ M we may set δ 2 q j ,µ1...µ k (x) to have any fixed value as long as it is symmetric in the lower indices.As H µ1...µ k ij (ε) is also symmetric in the greek indices, we obtain that Stated differently, every Euler-Lagrange differential equation is such that its linearization is (formally) selfadjoint.
Remark.It is difficult to pinpoint where exactly did the Helmholtz variationality conditions appear to this degree of generality for the first time.Formulations of the variational condition in special cases such as second order ordinary differential equations date back to the 19th century.The formally self-adjoint nature of the Euler-Lagrange equations have also been recognized relatively early.We refer to [24,27,43] for the Helmholtz variationality conditions and the citations therein.

D Homotopy operators and integrability conditions
In this section we assume that the zero field q i (x) = 0 is a valid field configuration and if q i (x) is allowed, so is tq i (x) for 0 ≤ t ≤ 1.In other words, the space of field variables is star-shaped with respect to 0. For the applications in this paper, this will not be restrictive.If this assumption is made on the field variables, we are able to derive a number of powerful integrability relations for variational systems.A current will be an n-component differential operator Definition D.1.
2. Let ε i [q] be an m-component differential operator of order s.To ε we associate a Lagrangian Λ(ε) as called the Vainberg-Tonti Lagrangian associated to ε.

Let
[q]δq j ,µ1...µ k be a linear differential operator on field variations whose coefficients D µ1...µ k ij [q] are themselves (nonlinear) differential operators acting on the dynamical variables q i .To every such linear operator we associate an m-component (nonlinear) differential operator Let us symbolically write Div for the total divergence, i.e. the operator which associates to each current K µ the Lagrangian Div(K) := d µ K µ and E for the Euler-Lagrange operator which associates to each Lagrangian L the Euler-Lagrange expressions E(L) := (δL/δq i ) m i=1 .Also write H(ε) := (H i (ε)) m i=1 for the symbolic form of the Helmholtz operator.Finally, if L is a Lagrangian let L 0 (x) := L[0](x) be the ordinary function of the independent variables obtained by evaluating the Lagrangian on the zero field.
Proposition D.1.The operators Div, E, H and K, Λ, Q satisfy the following relations: 1. for each Lagrangian L we have Proof.
1.For deriving concrete formulae, suppose that the order of L is r.Then we have where we have used the first variation formula (C.8) evaluated at tq with δq i = q i .Note that for any operator f [q] we have ) which is clear when the total derivative is expanded.Integrating the above relation from 0 to 1 gives which is just (D.4) written out in more concrete terms.
2. Suppose that ε i has order s.Let L := Λ(ε) denote the Vainberg-Tonti Lagrangian associated to ε i and E i := δL/δq i .We then express E i in terms of ε i : The latter is just (D.5) when expanded in coordinates.
are taken with respect to the variables q i only.Note that the term L[0, p] at the end is no longer functionally constant and still depends on the other field variables p α .Actually if in addition to the condition on the space of the field variables we had also assumed the space of independent variables to be star-shaped then we would be able to write in (D.15) L 0 (x) = d µ 1 0 t n−1 L 0 (tx)x µ dt and hence find that under these more restrictive conditions every null Lagrangian is a total divergence without the functionally constant term.But the same result does not extend to the multi-field case (D.17) as L[0, p] still depends functionally on p.
Remark.The original solution of the total divergence problem (i.e. the current (D.1)) is due to Horndeski [46].The explicit solution (D.2) of the inverse variational problem has been derived by Vainberg [47] and Tonti [48].In the modern context, the contents of this Appendix are considered as part of the differential calculus on jet spaces [27,29,43].

E.1 Elementary definitions
Here we review the notion of a rigged (hyper -)surface in a pseudo-Riemannian space.The concept has been originally introduced by Schouten [49] and have been employed in GR to great effect by Mars [11] and Mars and Senovilla [8].For detailed discussion and proofs we refer to the latter.Previously the author has also used rigged surfaces to derive a general set of junction conditions in GR through a variational principle [40], the notations introduced here are similar.
Let M be an n + 1 dimensional smooth manifold with a pseudo-Riemannian metric g.Let Σ ⊆ M be a surface with embedding i : Σ → M .The induced metric is h := i * g, the pullback of g to Σ.It may happen that h fails to be a pseudo-Riemannian metric.A null point of Σ is a point where h is degenerate.Recall that at any x ∈ Σ, the T x Σ is canonically a subspace of T x M , hence for any vector, and more generally, a contravariant tensor, we may ask whether it is tangent to Σ or not.However for the dual tangent spaces we have T * x Σ = T * x M/T 0 x Σ, where T 0 x Σ is the annihilator of T x Σ, i.e. the set of all covectors which vanish on vectors tangent to Σ. Elements of T 0 x Σ are called normal and this vector space is one dimensional.It is thus meaningful to ask whether a covector, or more generally, a covariant tensor, is normal to Σ.
It should be remarked that contrary to common intuition, it is not meaningful to talk about a contravariant vector being normal or a covector being tangent to Σ, at least not a priori.We may call a vector N ∈ T x Σ to be normal if g(N, −) is a normal covector, i.e.N is the metric dual of a normal covector.Let N x Σ ≤ T x M denote the set of all normal vectors at x ∈ Σ, clearly dim N x Σ = 1.Then N x Σ is not necessarily disjoint from T x Σ.One may show that T x M = N x Σ ⊕ T x Σ if and only if x is not a null point.Whenever x is a null point, we have N x Σ ≤ T x Σ.Thus if Σ has null points, then there is no natural complement to T Σ in T M | Σ .
A rigged surface (Σ, l) is a surface Σ together with a vector field l ∈ Γ( T M | Σ ) along Σ, which is transversal everywhere, i.e. l x / ∈ T x Σ for any x ∈ Σ.Let l x ≤ T x M be the subspace spanned by l and l = x∈Σ l x , in which case we have The vector field l is called the rigging vector field.Note that if M and Σ are both orientable (which we suppose), then there is always a global transversal vector field along Σ, hence a global rigging.For any fixed choice of rigging, there is a unique normal covector field n ∈ Γ(T 0 Σ) satisfying n(l) = 1.
The rigging is of course not unique, given any rigging l, we may subject it to the transformations where α is a smooth, nowhere vanishing function on Σ while T ∈ Γ(T Σ) is a vector field along Σ tangent to it.The first kind of transformation preserves the subbundle l and only changes the global section representing it, while the second class of transformations changes the rigging subbundle.However the latter transformation preserves the normalization of n, i.e. n( l) = n(l).
Corresponding to any choice of rigging is the projection operator P : T M | Σ → T Σ acting as P X = X − n(X)l and its linear adjoint P * : T * Σ → T * M | Σ , (P * ω)(X) = ω(P X) for any ω ∈ Γ(T * Σ) and X ∈ Γ( T M | Σ ).In addition to the induced metric h = i * g we also define λ := i * l ♭ , where l ♭ is the metric dual of l, ν := P n ♯ , where n ♯ is the metric dual of n, ε(n) = g(n, n) and ε(l) = g(l, l).

E.2 Coordinate formulae
Let y a be a coordinate system on Σ, x µ a coordiate system on M with the inclusion i : Σ → M described by the parametric relations x µ = x µ (y).Set The projection operator is P µ ν = δ µ ν − l µ n ν .A vector field X along Σ is tangent if and only if it can be written in the form X µ = X a e µ a , hence in mixed bulk-surface coordinates, the projection operator is described by the coefficients ϑ a µ .The induced metric is Let us now smoothly extend l µ off Σ and introduce coordinates x 0 , y a adapted to l µ in the sense that the surface coordinates y a are Lie transported off Σ along l µ .We also extend n µ such that the relations l µ n µ = 1 are preserved.Equality in adapted coordinates is denoted * =.Then we have