
Variational formalism for generic shells in general relativity

Published 6 December 2021 © 2021 The Author(s). Published by IOP Publishing Ltd
Citation: Bence Racskó 2022 Class. Quantum Grav. 39 015004. DOI 10.1088/1361-6382/ac38d2


Abstract

We investigate the variational principle for the gravitational field in the presence of thin shells of completely unconstrained signature (generic shells). Such variational formulations have been given before for shells of timelike and null signatures separately, but so far no unified treatment exists. We identify the shell equation as the natural boundary condition associated with a broken extremal problem along a hypersurface where the metric tensor is allowed to be nondifferentiable. Since the second order nature of the Einstein–Hilbert action makes the boundary value problem associated with the variational formulation ill-defined, regularization schemes need to be introduced. We investigate several such regularization schemes and prove their equivalence. We show that the unified shell equation derived from this variational procedure reproduces past results obtained via distribution theory by Barrabès and Israel for hypersurfaces of fixed causal type and by Mars and Senovilla for generic shells. These results are expected to provide a useful guide to formulating thin shell equations and junction conditions along generic hypersurfaces in modified theories of gravity.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Thin shells in general relativity (GR) and field theories in general are weak (distributional) solutions to the field equations whose pathological behaviour is concentrated on a single hypersurface (or a series of nonintersecting hypersurfaces) in spacetime. In GR such solutions may describe energetic phenomena such as phase transitions and impulsive electromagnetic and gravitational waves [1, 2]. Thin shells also give rise to junction conditions on the common boundary surface when glueing together spacetime domains.

Thin shells and junction conditions in GR have been considered by Lanczos [3], Darmois [4], O'Brien and Synge [5], and Lichnerowicz [6]; however, the most commonly used formulation has been given by Israel [7]. On a timelike or spacelike hypersurface partitioning the spacetime manifold into two subdomains, Israel prescribed the continuity of the induced metric $h_{ab}$ and the Lanczos equation relating the jump of the extrinsic curvature $K_{ab}$ to the surface energy–momentum tensor. In the absence of a material shell, the Lanczos equation reduces to the continuity of the extrinsic curvature. The case when the surface energy–momentum tensor does not vanish will be referred to as a thin shell, and the relations imposed by the vanishing of the surface energy–momentum tensor as the junction conditions¹. An advantage of Israel's formulation is double covariance. For practical calculations it is often useful to work with coordinate systems adapted to the subdomains that mismatch along the hypersurface. Differentiability classes of tensor fields may only be established in coordinate systems whose differentiability class exceeds that of the tensor field. Israel's equations are, however, relations between hypersurface tensors, so one only has to ensure that the parametrization of the hypersurface is the same as viewed from either side and may otherwise work with disjoint systems of bulk coordinates in each spacetime region.

Israel's formulation breaks down when the hypersurface has null points. At null points the normal vector field becomes tangential as well, and the extrinsic curvature, which can be seen as the normal derivative of the metric, becomes an intrinsic tangential quantity that carries no transverse information. The 3 + 1 orthogonal decomposition along the shell facilitated by the normal vector becomes degenerate. To fix terminology, a hypersurface will be called pure if it is either timelike, spacelike or null, while it will be referred to as causality-changing, signature-changing or non-pure if its causal type is not constant. The term generic hypersurface is used when the causal type is not fixed in advance and the surface may be either pure or causality-changing.

Null shells are physically relevant (we refer to [2] for an extensive treatment of their applications), for example as models for impulsive electromagnetic and gravitational waves. Generalizations of the Israel formalism for null shells have been given by, among others, Clarke and Dray [8], Barrabès and Israel [9], Mars and Senovilla [10], Poisson [11], Mars [12] and Senovilla [13]. Out of these, the formalisms of [9, 10, 12, 13] give a unified prescription valid for generic hypersurfaces². The common point of generalization is that the normal vector field is accompanied by a transversal vector field which generates a non-orthogonal decomposition of the spacetime along the hypersurface. The role of the extrinsic curvature is carried by an analogous quantity built from the transversal vector field. A hypersurface equipped with a selected transversal vector field is called a rigged hypersurface. This structure has been investigated by e.g. Eisenhart [14] and Schouten [15] to describe the geometry of subspaces of manifolds with linear connections. The formalism has been systematically applied to GR by Mars and Senovilla in [10].

There exist at least four methods of obtaining the timelike or spacelike shell equation in GR [13]. These will be referred to as (i) the 'pillbox integration'³, (ii) the distributional method, (iii) the intrinsic method and (iv) the variational method. Pillbox integration has been employed by Israel in [7] and involves writing the field equations in Gaussian coordinates adapted to the hypersurface, separating a normal derivative and integrating the field equations through the shell as its thickness tends to zero. This approach is similar to the well-known textbook method [17] to derive the analogous jump conditions in electrodynamics. The distributional method has been pioneered by Taub [18] and by Geroch and Traschen [19]. The metric tensor is taken to be a $C^0$ regular distribution⁴, from which it follows that the connection is allowed to have discontinuities and the curvature tensor may contain a delta function term. The field equations then impose a relation between the singular part of the Einstein tensor and a singular contribution to the energy–momentum tensor, which is Lanczos' equation. If the metric tensor were allowed to be discontinuous, the connection would pick up a delta function term, and the curvature tensor (quadratic in the connection) would involve products of delta functions, which are ill-defined. This imposes the continuity of the metric as a junction condition. The intrinsic method has been used by Mars [12] as an application of his concept of hypersurface data. He abstracted the properties of rigged hypersurfaces by defining data on an arbitrary hypersurface which may correspond to data specified by a rigging when the hypersurface is embedded in a pseudo-Riemannian space. The purpose has been to open the road for initial value problems in GR for any possible initial hypersurface. Through the use of the rigged analogues of the usual constraint equations, however, it also becomes possible to formulate shells in a purely intrinsic manner, with no need even to embed the hypersurface.
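The distributional argument can be made concrete as follows. This is a generic sketch rather than the notation of the cited works: it assumes that the hypersurface is the level set Φ = 0 of a smooth function Φ, writes Θ for the Heaviside function of Φ, uses $[X]=X^{+}-X^{-}$ for the jump across Σ, and assumes the curvature convention $R^{\kappa}{}_{\lambda\mu\nu}=\partial_\mu\Gamma^{\kappa}_{\nu\lambda}-\partial_\nu\Gamma^{\kappa}_{\mu\lambda}+\dots$

```latex
\begin{align}
  g_{\mu\nu} &= \Theta\, g^{+}_{\mu\nu} + (1-\Theta)\, g^{-}_{\mu\nu},
  \qquad \partial_\kappa \Theta = \delta(\Phi)\,\partial_\kappa \Phi, \\
  \partial_\kappa g_{\mu\nu} &= \Theta\,\partial_\kappa g^{+}_{\mu\nu}
     + (1-\Theta)\,\partial_\kappa g^{-}_{\mu\nu}
     + \delta(\Phi)\,\partial_\kappa\Phi\,[g_{\mu\nu}], \\
  [g_{\mu\nu}] = 0 \;\Longrightarrow\;
  \Gamma^{\kappa}_{\mu\nu} &= \Theta\,\Gamma^{+\kappa}_{\mu\nu}
     + (1-\Theta)\,\Gamma^{-\kappa}_{\mu\nu}, \\
  R^{\kappa}{}_{\lambda\mu\nu} &= \Theta\, R^{+\kappa}{}_{\lambda\mu\nu}
     + (1-\Theta)\, R^{-\kappa}{}_{\lambda\mu\nu}
     + \delta(\Phi)\bigl(\partial_\mu\Phi\,[\Gamma^{\kappa}_{\nu\lambda}]
     - \partial_\nu\Phi\,[\Gamma^{\kappa}_{\mu\lambda}]\bigr).
\end{align}
```

The δ(Φ) term in the last line is the singular part of the curvature. A nonvanishing $[g_{\mu\nu}]$ would instead place δ(Φ) in the connection and hence its square in the curvature, which is the ill-defined product referred to above.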

The shell equations have also been obtained via variational methods [20–22]. This is particularly useful for braneworld scenarios [21], where the Lanczos equation on the brane is a part of the equations of motion and thus the brane and bulk dynamics arise from a unified variational principle. When the variational formulation is followed, the combined shell + bulk dynamics appear as the broken extremals [23] of a variational problem with the shell equation being the natural boundary condition on the surface.

For a second order theory described by a first order Lagrangian, this is straightforward. The Einstein–Hilbert Lagrangian on the other hand is second order. Since a second order Lagrangian normally produces equations of motion of order four, the boundary conditions pertinent to the variational problem are those of a fourth order differential equation and require the fixing of both the metric and its transverse derivative at the boundary. As the actual field equations are only second order, fixing the transverse derivative would overdetermine the field equations and this causes the variational problem to become ill-defined [24]. When variational principles with only outer boundaries are considered, a common method of solution [25] is to add a boundary term (for example the Gibbons–Hawking–York term [26, 27], but other boundary terms could be introduced at the price of also introducing additional structures) to the Einstein–Hilbert action such that the combined bulk + boundary action requires the boundary conditions consistent with a first order Lagrangian, and thus the variational problem becomes well-defined. After Parattu et al [28] such boundary terms will be referred to as variational counterterms. A shell may be considered as an interior boundary of the spacetime manifold, so a similar regularization procedure is needed to obtain the correct results. One such way of regularization is to also add the Gibbons–Hawking–York counterterms to the action at the shell [21, 22]. Another, which has been employed by Hájíček and Kijowski [20], is to consider the Lagrangian itself as a distribution. Since the Lagrangian involves a curvature tensor, it has a delta function term which is proportional to the difference of the Gibbons–Hawking–York terms as calculated from the two sides.

The shell equations and junction conditions for null and generic shells have in general been derived via the distributional method, which is simple to generalize. Senovilla [13] has also shown that pillbox integration can be adapted to the generic case, and the intrinsic method was already applicable to generic shells. It seems, however, that not much attention has been given to the variational method for hypersurfaces that are not timelike or spacelike. Jezierski et al [29] have considered the variational treatment of null shells; however, they did not show that their results agree in the null limit with the results of e.g. Barrabès and Israel [9] or Mars and Senovilla [10]. There is also an unaddressed issue that has been pointed out by Parattu et al [30] when investigating counterterms on null boundaries. A variation in the metric is a variation in the causality, and such variations do not preserve the nullity of a hypersurface. The underlying reason is that in the tangent space at a fixed point, null vectors form a topologically closed set and every neighborhood of each null vector contains both timelike and spacelike vectors. A general variation will push the initially null surface off the lightcone. The same issue is not encountered with regard to timelike or spacelike surfaces, as timelike/spacelike vectors form open sets and each such vector has a neighborhood that consists entirely of timelike/spacelike vectors. It stands to reason that variational methods involving null surfaces should be formulated in a way that can accommodate surfaces of arbitrary causal type.

The purpose of this paper is thus to fill this gap in the literature and provide a formulation of thin shells and junction conditions for GR through a variational principle valid for a generic hypersurface. A natural question is then why one should consider generic shells. One reason is that it is beneficial to provide a unified formalism capable of encompassing timelike, spacelike and null shells at the same time, rather than assuming the signature from the beginning. As the example of the Barrabès–Israel formalism shows (remarked in footnote 2), such unified formulations tend to include the case of non-pure hypersurfaces as well. Moreover, as argued before, even if one is interested in null shells exclusively, the convenient setting for a variational treatment of null hypersurfaces is the one which is applicable to generic hypersurfaces equally. Another reason is that non-pure hypersurfaces themselves can appear in physically interesting situations. Some examples may be found in [13]. To give one explicitly, the stationary limit surface of a Kerr black hole is timelike almost everywhere but null at a set of measure zero. If one wishes to obtain matching conditions for spacetime regions separated by such hypersurfaces, one must incorporate signature-changing hypersurfaces. For an application of matching non-pure hypersurfaces, we refer to the works [31–33] by Mars et al on signature change on brane worlds.

The primary motivation for the development of this work is the formulation of thin shell equations in modified theories of gravitation. Thin shells have already been considered in extended gravitational theories, for example in [34] thin shells and junction conditions have been examined in Brans–Dicke type scalar–tensor theories via the distributional formalism with the null and non-null cases separately. A variational formalism has also been given but only for the non-null case. In [35, 36], junction conditions have been formulated in Gauss–Bonnet gravity for applications to Gauss–Bonnet brane worlds via the variational formulation, but once again only for non-null cases. Shells in higher order gravity have also been investigated in [37] through the use of distributions. Higher order theories have qualitatively different shell behaviour with so-called double layers—energy–momentum terms proportional to the Dirac delta's derivative—appearing.

The most general scalar–tensor theory with second-order field equations is Horndeski's theory (originally published as [38], but the most common form is the equivalent 'DGSZ reformulation' [39]). Thin shell equations in Horndeski's theory have been found by Padilla and Sivanesan [40] through a variational method valid only for non-null hypersurfaces.

In [41] we gave a formulation of null shells in a reduced class of Horndeski theories via the distributional method and the qualitative form of these equations differed greatly from those obtained by Padilla and Sivanesan. For a more effective comparison it would have been beneficial to also follow a variational approach, however no such method was found that would be valid for generic shells, yet it is a valuable and often-used method for non-null hypersurfaces. It is thus reasonable to first examine how the variational formalism works for generic shells in GR before generalizing to more complicated theories.

The main obstacle for such a formalism seems to be the lack of an appropriate variational counterterm for generic boundaries, as the Gibbons–Hawking–York term is valid only for timelike and spacelike surfaces. Counterterms valid for null boundaries have been considered by Parattu et al [30] and extended to piecewise smooth boundaries involving corner terms by Lehner et al [42]. This formalism can be used when the boundary has separate timelike, null and spacelike pieces but does not allow for a unified treatment or for cases when the boundary has null points that do not form an entire segment (for example the null point is isolated or the null points form a line, etc). An alternative formulation in terms of tetrads has also been given by Jubb et al [43], which nonetheless shares the features of the formulation by Lehner et al in that it is necessary to break the boundary into pieces of pure signatures instead of giving a fully unified treatment.

However, a unified counterterm has recently been provided, also by Parattu et al [28], which is valid for any boundary hypersurface rigged with a transversal vector field and reduces to the Gibbons–Hawking–York term in the appropriate limit. Although the formulation has not been extended to corner terms, we are mainly interested in smooth shells (as in the hypersurface corresponding to the shell being smooth) and therefore this limitation of the formalism does not affect our results. We show that this counterterm properly regularizes the action at the shell and the equations derived in e.g. Barrabès and Israel [9], Mars and Senovilla [10] and Senovilla [13] via the distributional method arise as the natural boundary conditions along the hypersurface. To make contact with the alternative distributional regularization procedure of Hájíček and Kijowski [20], it is also shown that the singular part of the Lagrangian supported on a generic surface is proportional to the difference of the counterterms of Parattu et al, and thus it leads to the same variational principle we obtain by adding the counterterms manually. Finally, we also derive the correct shell equation via a first order equivalent to the Einstein–Hilbert action where no regularization is necessary. This is actually a special case of the regularization by counterterms, as such first order equivalents can be seen as the Einstein–Hilbert action augmented by a different counterterm.

Outline of the paper: in section 2 we provide a short summary of the rigged hypersurface formalism which will be used throughout this paper. In section 3, several known variational counterterms for the Einstein–Hilbert action are discussed including the one recently proposed by Parattu et al [28] valid for generic hypersurfaces. Some general properties of these counterterms are investigated. The main part of the paper is section 4, where the dynamics of thin shells are formulated as a variational principle via three separate regularization schemes. Variational counterterms are employed in subsection 4.1, distributional regularization is considered in subsection 4.2 and the shell equation is also derived from a first order action without the need for regularization in subsection 4.3. Some of the longer calculations are given in appendices A and B.

Notation: the spacetime manifold is D + 1 dimensional and is denoted M. Coordinates on M are $x^\mu$ with the Greek indices running μ = 0, 1, ..., D. Σ is a hypersurface in M, that is a D dimensional submanifold with coordinates $y^a$ with Latin indices a, b, c, ... taking the values 1, ..., D. Summation convention on repeated indices is assumed. The metric tensor in M is $g_{\mu\nu}$, its determinant is $\mathfrak{g}$ and the volume form determined by it is

Equation (1)

Inner products with respect to the spacetime metric are denoted with dots in indexless notation, e.g. $X\cdot Y = X^\mu Y^\nu g_{\mu\nu}$. All manifolds are assumed orientable and oriented.

2. Rigged hypersurfaces

In this section we review the formalism of rigged hypersurfaces, establishing the notation to be used in the rest of the paper. We refer to the exposition by Mars and Senovilla [10] as well as the works [12, 44] for proofs of the statements made here. We also recover the limiting cases when the hypersurface is timelike or spacelike and we derive the null limit.

2.1. Structures induced by the rigging

We consider a hypersurface Σ in the D + 1 dimensional manifold M. The hypersurface is given locally by the embedding functions

Equation (2)

where the ya are the intrinsic coordinates of Σ. The derivatives

Equation (3)

are interpreted as the components of the holonomic coordinate frame of Σ, or from a more invariant point of view, the components of the pushforward and pullback operations between Σ and M. Without introducing any extra structure, a vector $v^\mu$ defined at a point p ∈ Σ is tangential if it can be written in the form ${v}^{\mu }={v}^{a}{e}_{a}^{\mu }$ for some intrinsic hypersurface vector $v^a$. Then $v^\mu$ is the pushforward of $v^a$. Thus, it is possible to decide whether a contravariant vector (and thus a general contravariant tensor on an index-by-index basis) is tangential or not. A covector $n_\mu$ defined at some point p ∈ Σ is normal if ${n}_{\mu }{e}_{a}^{\mu }=0$, that is, if it annihilates all tangential vectors. The space of normal covectors at each point is one dimensional. Thus it is meaningful to decide if a covariant vector is normal or not. If $\omega_\mu$ is a covector at some p ∈ Σ, its pullback to Σ is the hypersurface covector ${\omega }_{a}={\omega }_{\mu }{e}_{a}^{\mu }$ (this notion is extended to all covariant tensors index-by-index).

The induced metric or first fundamental form on Σ is the pullback

Equation (4)

The point p ∈ Σ is a null point of the hypersurface if and only if ${h}_{ab}\left(p\right)$ is a singular matrix. Since we allow for null points and thus non-invertible induced metrics, we do not raise or lower Latin indices.

To proceed, we need to introduce a vector field $\ell^\mu$ along Σ, which is nowhere tangential (nor zero). We call this a choice of rigging and the pair $\left({\Sigma},\ell \right)$ is a rigged hypersurface. The set $\left(\ell ,{e}_{1},\dots ,{e}_{D}\right)$ is then a frame of M along Σ. The choice of rigging selects a unique normal covector field $n_\mu$ which satisfies

Equation (5)

Then the set $\left(n,{\vartheta }^{1},\dots ,{\vartheta }^{D}\right)$ is the dual frame of $\left(\ell ,{e}_{1},\dots ,{e}_{D}\right)$, where the covector fields (along Σ) ${\vartheta }_{\mu }^{a}$ are uniquely determined by the duality relations

Equation (6)

Using ${\vartheta }_{\mu }^{a}$, given a hypersurface covector $\omega_a$, we can create a spacetime covector ${\omega }_{\mu }={\vartheta }_{\mu }^{a}{\omega }_{a}$ which satisfies ${\omega }_{a}={e}_{a}^{\mu }{\omega }_{\mu }$ and $\omega_\mu \ell^\mu = 0$. Likewise, we can project a spacetime vector $v^\mu$ into Σ as ${v}_{{\Vert}}^{a}={v}^{\mu }{\vartheta }_{\mu }^{a}$, and also obtain a direct projection operator ${P}_{\nu }^{\mu }$ by pushing forward ${v}_{{\Vert}}^{a}$, i.e.

Equation (7)

with

Equation (8)

With respect to these bases, the spacetime metric and inverse metric can be expressed as

Equation (9)

where the elements that appear here are given explicitly as

Equation (10)

In particular, ${h}_{\ast }^{ab}$ may be seen as a pseudo-inverse to $h_{ab}$.
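The frame and dual-frame relations above can be checked numerically. The following is a minimal sketch and is not taken from the paper: it assumes a flat 2 + 1 dimensional Minkowski metric, the null hyperplane t = x, and a hand-picked null rigging. The helper name `dual_frame` and all numerical values are illustrative assumptions. The dual frame is obtained by inverting the matrix whose columns are ℓ, e_1, ..., e_D, and ν^a and h_*^{ab} are computed by contracting the inverse metric with the dual frame, which is assumed here to agree with the definitions in equation (10).

```python
import numpy as np

def dual_frame(ell, e):
    """Return (n, theta) dual to the frame (ell, e_1, ..., e_D).

    ell : shape (D+1,), components ell^mu of the rigging
    e   : shape (D, D+1), row a holds the tangent vector e_a^mu
    """
    frame = np.column_stack([ell, e.T])   # columns: ell, e_1, ..., e_D
    coframe = np.linalg.inv(frame)        # rows: n, theta^1, ..., theta^D
    return coframe[0], coframe[1:]

# Assumed example: Minkowski metric in coordinates (t, x, y), so D = 2.
g = np.diag([-1.0, 1.0, 1.0])
g_inv = np.linalg.inv(g)

# Null hyperplane t = x, embedded as x^mu(y^1, y^2) = (y^1, y^1, y^2).
e = np.array([[1.0, 1.0, 0.0],    # e_1
              [0.0, 0.0, 1.0]])   # e_2
ell = np.array([0.5, -0.5, 0.0])  # hand-picked transversal (null) rigging

n, theta = dual_frame(ell, e)

h = e @ g @ e.T                   # induced metric h_ab
ell_low = e @ g @ ell             # ell_a = ell . e_a
nu = theta @ g_inv @ n            # nu^a = g^{mu nu} n_mu theta^a_nu
h_star = theta @ g_inv @ theta.T  # h_*^{ab} = g^{mu nu} theta^a_mu theta^b_nu
P = e.T @ theta                   # P^mu_nu = e_a^mu theta^a_nu

print("n =", n)
print("det h =", np.linalg.det(h))
print("P = 1 - ell n:", np.allclose(P, np.eye(3) - np.outer(ell, n)))
print("h_* h = 1 - nu ell:", np.allclose(h_star @ h, np.eye(2) - np.outer(nu, ell_low)))
```

In this example the induced metric is singular, as expected at a null point, while the product of h_* and h still equals the identity minus the outer product of ν^a and ℓ_c. This relation follows from the duality relations and the assumed definitions, and it is one way to read the statement that h_* acts as a pseudo-inverse.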

The choice of rigging also gives a volume form

Equation (11)

on Σ with

Equation (12)

where ${\pi }_{\mu {\mu }_{1}\dots {\mu }_{D}}$ is the D + 1 dimensional Levi-Civita symbol. This particular volume element is such that if Ω ⊆ M is a compact D + 1 dimensional domain of integration, whose boundary ∂Ω is rigged with an outward pointing transversal $\ell^\mu$, Gauss' theorem takes the form

Equation (13)

where $n_\mu$ is the normal adapted to the rigging, i.e. $n_\mu \ell^\mu = 1$. Further properties of the volume form may be found in [10, 44].

Extrinsic curvature-type quantities may be obtained by differentiating the frame vectors in the tangential directions as

Equation (14)

Equation (15)

Equation (16)

These are all hypersurface tensors and $\chi_{ab}$ is symmetric. For thin shells and junction conditions it is also useful to define

Equation (17)

which is non-symmetric in general and is not independent of the triplet $\left(\chi ,\psi ,\varphi \right)$. However, it will turn out that this quantity is what most naturally generalizes the extrinsic curvature to thin shells and will play an important role. For $H_{ab}$ we make an exception to our convention not to raise Latin indices and define

Equation (18)

A connection-type quantity ${\gamma }_{ab}^{c}$ is also given on Σ by

Equation (19)

This connection is torsionless but is not metric compatible in general.

2.2. Transformations between riggings

The choice of rigging $\ell^\mu$ along a hypersurface Σ ⊆ M is not unique and it may be subjected to two kinds of transformations. The first is a tangential shift

Equation (20)

where $T^a$ is an arbitrary tangent vector field to Σ, and the second kind is a rescaling

Equation (21)

where α is a function along Σ that is nowhere vanishing. These transformations form a group parametrized by D + 1 functions whose structure has been analyzed in [44]. The quantities associated with rigged hypersurfaces transform under the shift as [10, 12]

Equation (22)

while ${e}_{a}^{\mu }$, $n_\mu$, $h_{ab}$, $\chi_{ab}$ and $\mu_{\ell,g}$ are invariant. Under a rescaling, the transformations are

Equation (23)

while ${e}_{a}^{\mu }$, ${\vartheta }_{\mu }^{a}$, $h_{ab}$, ${h}_{\ast }^{ab}$ and ${\gamma }_{ab}^{c}$ are invariant. Note that since the volume element $\mu_{\ell,g}$ is invariant under shifts, the definition of $\mu_{\ell,g}$ essentially depends on that of the normal $n_\mu$ only. Thus if one has a preferred normal along a hypersurface, the scaling of the normal already fixes the volume element without the need to choose a rigging explicitly.
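Some of the invariance statements in this subsection can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the paper's derivation: it takes the shift to be ℓ → ℓ + T^a e_a and assumes the rescaling has the form ℓ → αℓ, reuses a hypothetical helper `dual_frame`, and works on a null hyperplane in flat 2 + 1 dimensional spacetime with arbitrary sample values for T^a and α. The transformation ϑ^a → ϑ^a − T^a n used in the check follows from the duality relations.

```python
import numpy as np

def dual_frame(ell, e):
    """Dual frame (n, theta) of (ell, e_1, ..., e_D); rows of e are e_a^mu."""
    coframe = np.linalg.inv(np.column_stack([ell, e.T]))
    return coframe[0], coframe[1:]

# Inverse Minkowski metric in coordinates (t, x, y).
g_inv = np.diag([-1.0, 1.0, 1.0])

# Null hyperplane t = x with an assumed rigging.
e = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
ell = np.array([0.5, -0.5, 0.0])
n, theta = dual_frame(ell, e)

# Tangential shift: ell -> ell + T^a e_a (sample values for T^a).
T = np.array([0.3, -1.2])
n_shift, theta_shift = dual_frame(ell + T @ e, e)

# Rescaling, assumed to be ell -> alpha * ell (sample value for alpha).
alpha = 2.5
n_scaled, theta_scaled = dual_frame(alpha * ell, e)

h_star = theta @ g_inv @ theta.T
h_star_scaled = theta_scaled @ g_inv @ theta_scaled.T

print("n invariant under shift:", np.allclose(n_shift, n))
print("theta -> theta - T n under shift:",
      np.allclose(theta_shift, theta - np.outer(T, n)))
print("n -> n / alpha under rescaling:", np.allclose(n_scaled, n / alpha))
print("theta, h_* invariant under rescaling:",
      np.allclose(theta_scaled, theta) and np.allclose(h_star_scaled, h_star))
```

The shift leaves the normal untouched while the rescaling changes it by the factor 1/α, which is consistent with the remark that the normalization of the normal is what fixes the volume element.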

2.3. Pseudo-Riemannian limit of rigged hypersurfaces

The usual formalism of timelike and spacelike (collectively, pseudo-Riemannian) hypersurfaces may be obtained from the rigged formalism by making a particular choice of rigging $\ell^\mu$. We assume that Σ is timelike or spacelike and set

Equation (24)

to allow for both cases to be considered simultaneously. The induced metric $h_{ab}$ is nondegenerate throughout Σ, its inverse $h^{ab}$ exists and we raise and lower Latin indices with $h^{ab}$ and $h_{ab}$ respectively. Normal vectors are everywhere transversal, therefore we take as the rigging

Equation (25)

the unit normal (i.e. $\hat{n}\cdot \hat{n}={\epsilon}$) to Σ, which is unique up to sign. With this particular choice of the rigging, the normal associated to the rigging is

Equation (26)

We will only use $\hat{n}$ and keep track of the epsilons that appear. The rest of the quantities become

Equation (27)

where

Equation (28)

is the usual extrinsic curvature and $\mathfrak{h}=\mathrm{det}\left({h}_{ab}\right)$ is the determinant of the induced metric. The connection ${\gamma }_{ab}^{c}$ becomes the Levi-Civita connection of the induced metric $h_{ab}$ with

Equation (29)

and Gauss' theorem takes the form

Equation (30)

where ${\hat{n}}^{\mu }$ is the outward pointing unit normal to ∂Ω.
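The pseudo-Riemannian limit can be illustrated with a small numerical check. This is a sketch under assumptions that are not part of the paper: a boosted spacelike hyperplane t = v x in flat 2 + 1 dimensional spacetime with the sample value v = 0.6, and a hypothetical helper `dual_frame`. With the unit normal as rigging, the normal covector adapted to the rigging comes out as ε times the index-lowered unit normal, ν^a vanishes, and h_*^{ab} coincides with the inverse induced metric.

```python
import numpy as np

def dual_frame(ell, e):
    """Dual frame (n, theta) of (ell, e_1, ..., e_D); rows of e are e_a^mu."""
    coframe = np.linalg.inv(np.column_stack([ell, e.T]))
    return coframe[0], coframe[1:]

# Minkowski metric in coordinates (t, x, y).
g = np.diag([-1.0, 1.0, 1.0])
g_inv = np.linalg.inv(g)

# Spacelike hyperplane t = v x for |v| < 1 (sample value).
v = 0.6
e = np.array([[v, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

# Unit normal vector, used as the rigging.
n_hat = np.array([1.0, v, 0.0]) / np.sqrt(1.0 - v**2)
eps = n_hat @ g @ n_hat            # -1 for a spacelike hypersurface

n, theta = dual_frame(n_hat, e)

h = e @ g @ e.T                    # induced metric h_ab
h_star = theta @ g_inv @ theta.T   # h_*^{ab}
nu = theta @ g_inv @ n             # nu^a

print("eps =", eps)
print("n = eps * n_hat (lowered):", np.allclose(n, eps * (g @ n_hat)))
print("h_* equals inverse of h:", np.allclose(h_star, np.linalg.inv(h)))
print("nu vanishes:", np.allclose(nu, 0.0))
```

The factor ε in the normal covector is the reason the epsilons have to be tracked when expressing the rigged quantities through the unit normal.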

2.4. Null limit of rigged hypersurfaces

Suppose now that Σ is null. There is no universally preferred convention here for the rigging; however, the null rigging used by e.g. Poisson [11] is a useful choice and we present it here. If Σ is null then any normal field $n^\mu$ is also null and is tangential to Σ. Moreover it satisfies the geodesic equation

Equation (31)

for some non-affinity function κ. We may then set up coordinates $\left({y}^{a}\right)=\left(r,{\theta }^{A}\right)$ on Σ (A, B, ... = 2, ..., D) such that

Equation (32)

and choose a null rigging $\ell^\mu$ which satisfies

Equation (33)

where

Equation (34)

are the rest of the basis fields, necessarily spacelike. The functions

Equation (35)

are then the components of the spacelike induced metric on the slices r = const. They are also the only nonvanishing components of the induced metric on the entire surface, i.e.

Equation (36)

The (D − 1)-metric $q_{AB}$ does possess an inverse, denoted $q^{AB}$, and the capital Latin indices are raised and lowered via $q^{AB}$ and $q_{AB}$ respectively. The most important extrinsic curvature quantity in this case is $H_{ab}$, which is now symmetric and we split it as $H_{11}$, $H_{1A}$ and $H_{AB}$. We have

Equation (37)

We may express most of the quantities with $H_{ab}$ as

Equation (38)

The primary exception is $\chi_{ab}$, which is

Equation (39)

and is thus not expressible with $H_{ab}$. As only tangential derivatives of tangential vectors are taken, when thin shells are involved, the jump of this quantity always vanishes.

Finally, with respect to the frame $\left(\ell ,n,{e}_{A}\right)$ the full metric tensor has components

Equation (40)

from which it follows that in this frame

Equation (41)

where $\mathfrak{q}=\mathrm{det}\left({q}_{AB}\right)$. The volume element can be thus written as

Equation (42)

Note that while it appears that the volume element $\mu_q$ is canonically given, it does depend on the way the manifold Σ has been sliced into spacelike (D − 1)-surfaces.

3. Variational counterterms

The Einstein–Hilbert action over M is

Equation (43)

where ϰ = 8πG. The integrand is second order in the metric while its Euler–Lagrange equations are also second order. If we assume the boundary ∂M has been rigged by an outward pointing vector $\ell^\mu$, we may write the variation of the action in generic form as

Equation (44)

where $\delta {g}_{\mu \nu ,a}={e}_{a}^{\kappa }{\partial }_{\kappa }\delta {g}_{\mu \nu }$ are the tangential derivatives of the metric variation, $\delta g_{\mu\nu,\ell}=\ell^{\kappa}\partial_{\kappa}\delta g_{\mu\nu}$ is the transversal derivative and $Y^{\mu\nu}$, $Y^{\mu\nu,a}$ and ${Y}_{\ell }^{\mu \nu }$ are the appropriate coefficients that appear on the boundary. Imposing Dirichlet boundary conditions ${\left.\delta {g}_{\mu \nu }\right\vert }_{\partial M}=0$ gets rid of the first two terms on the boundary, but not the third. On the other hand, demanding the transversal derivatives to also vanish would overdetermine the field equations. In order to make the variational problem well-defined, a variational counterterm

Equation (45)

is added to the action, where ${\left(\partial g\right)}_{{\Vert}}$ and ${\left(\partial g\right)}_{\ell }$ are schematic notations for the tangential and transversal derivatives respectively. If the integrand $\mathcal{B}$ satisfies

Equation (46)

then it follows that imposing the usual Dirichlet condition ${\left.\delta {g}_{\mu \nu }\right\vert }_{\partial M}=0$ on the combined action $S_{\mathrm{EH}}+B$ will get rid of all boundary terms. The variational counterterm is not unique; however, if $\mathcal{B}$ and ${\mathcal{B}}^{\prime }$ are both integrands of variational counterterms, their derivatives with respect to $g_{\mu\nu,\ell}$ must be the same function $-{Y}_{\ell }^{\mu \nu }{\rho }_{g,\ell }$ and thus the difference $\mathcal{B}-{\mathcal{B}}^{\prime }$ is a function of g and ${\left(\partial g\right)}_{{\Vert}}$ only. This result will be of significance for thin shells.
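The non-uniqueness statement can be spelled out in a short worked step. The following is a sketch in the schematic notation of this section; the symbol F for the difference of the two integrands is introduced here for illustration only.

```latex
\begin{align}
  F := \mathcal{B}-\mathcal{B}', \qquad
  \frac{\partial F}{\partial g_{\mu\nu,\ell}}
    &= \frac{\partial \mathcal{B}}{\partial g_{\mu\nu,\ell}}
     - \frac{\partial \mathcal{B}'}{\partial g_{\mu\nu,\ell}} = 0
  \quad\Longrightarrow\quad
  F = F\bigl(g,(\partial g)_{\Vert}\bigr), \\
  \delta F &= \frac{\partial F}{\partial g_{\mu\nu}}\,\delta g_{\mu\nu}
     + \frac{\partial F}{\partial g_{\mu\nu,a}}\,\delta g_{\mu\nu,a} .
\end{align}
```

Both terms in the second line vanish on the boundary once the Dirichlet condition is imposed, since the tangential derivatives of a variation that vanishes on ∂M vanish as well. Two valid counterterms therefore lead to the same Dirichlet variational problem, even though their values generally differ.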

There are several counterterms known for the Einstein–Hilbert action:

The Gibbons–Hawking–York counterterm: when the boundary ∂M consists of pseudo-Riemannian pieces only, the appropriate rigging (see section 2.3) can be chosen. The counterterm is

Equation (47)

Its validity follows from the variational formula [45]

Equation (48)

The first boundary term involves only tangential derivatives of the metric and the second term—which contains normal derivatives—is an exact variation. Adding this term to the action with an opposite sign will ensure that fixing the metric without fixing its derivatives on the boundary makes all remaining boundary terms vanish.

The Einstein counterterm: we assume that the manifold M is covered by the domain of a chosen (and fixed) coordinate chart $x^\mu$. Let us also take an outward pointing normal $n_\mu$ along ∂M, and let $\mu_{g,n}$ denote the corresponding volume element obtained via any rigging $\ell^\mu$ which satisfies $\ell^\mu n_\mu = 1$. The Einstein counterterm is then defined as

Equation (49)

where

Equation (50)

This expression is naturally defined in the interior of M as well, and by Gauss' theorem

Equation (51)

where the covariant derivative treats $w^\kappa$ as if it were a vector field (the rationale behind this is that we may consider $\partial_\mu$ to be a locally defined connection associated to the chart $x^\mu$, and from this point of view the connection coefficients ${{\Gamma}}_{\mu \nu }^{\kappa }$ are tensor components, namely the components of the difference tensor between ∇ and ∂). Decomposing the scalar curvature as

Equation (52)

one obtains

Equation (53)

which is Einstein's first order, noncovariant ΓΓ-action [46, 47]. Since it is first order, fixing the metric at the boundary is sufficient to eliminate all boundary terms. The Einstein counterterm is not unique in the sense that different coordinate systems will produce different Einstein counterterms, as is clear from the lack of covariance of (50).

The background connection counterterm: we can also introduce an arbitrary torsionless connection ${\bar{\nabla }}_{\mu }$. Quantities calculated from ${\bar{\nabla }}_{\mu }$ are denoted with an overbar. Let $n_\mu$ be any outward pointing normal to the boundary ∂M and $\mu_{g,n}$ the associated volume element. The background connection counterterm is

Equation (54)

where ${{\Delta}}_{\mu \nu }^{\kappa }={{\Gamma}}_{\mu \nu }^{\kappa }-{\bar{{\Gamma}}}_{\enspace \mu \nu }^{\kappa }$ is the difference tensor. The Einstein counterterm is reproduced if M fits into a single coordinate domain and we choose ${\bar{\nabla }}_{\mu }={\partial }_{\mu }$. The term ${{\Delta}}_{\mu \nu }^{\kappa }{g}^{\mu \nu }+{{\Delta}}_{\nu \mu }^{\nu }{g}^{\kappa \mu }$ is once again defined on the entire manifold M and after using Gauss' theorem we get

Equation (55)

Since $\bar{R}={g}^{\mu \nu }{\bar{R}}_{\mu \nu }$ is the scalar curvature of the nondynamical background connection ${\bar{\nabla }}_{\mu }$, this action is also first order, from which it immediately follows that fixing the metric at the boundary removes all boundary terms. Unlike the Einstein counterterm, the background connection counterterm is globally defined and both the counterterm and the resulting first order action are covariant. However, the counterterm and action both contain an unphysical background field. This counterterm is also non-unique, as it depends on the connection chosen as the background.
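The covariance claimed here rests on the difference of two connections being a tensor. As a brief sketch, using only the textbook transformation law of connection coefficients (the primed chart $x'^{\mu}$ is introduced purely for illustration and does not appear in the original text), the inhomogeneous term cancels in the difference:

```latex
\begin{align}
{\Gamma'}^{\kappa}_{\mu\nu}
  &= \frac{\partial x'^{\kappa}}{\partial x^{\alpha}}
     \frac{\partial x^{\beta}}{\partial x'^{\mu}}
     \frac{\partial x^{\gamma}}{\partial x'^{\nu}}\,\Gamma^{\alpha}_{\beta\gamma}
   + \frac{\partial x'^{\kappa}}{\partial x^{\alpha}}
     \frac{\partial^{2} x^{\alpha}}{\partial x'^{\mu}\,\partial x'^{\nu}}, \\
{\bar{\Gamma}'}{}^{\kappa}_{\mu\nu}
  &= \frac{\partial x'^{\kappa}}{\partial x^{\alpha}}
     \frac{\partial x^{\beta}}{\partial x'^{\mu}}
     \frac{\partial x^{\gamma}}{\partial x'^{\nu}}\,\bar{\Gamma}^{\alpha}_{\beta\gamma}
   + \frac{\partial x'^{\kappa}}{\partial x^{\alpha}}
     \frac{\partial^{2} x^{\alpha}}{\partial x'^{\mu}\,\partial x'^{\nu}}, \\
{\Delta'}^{\kappa}_{\mu\nu}
  &= {\Gamma'}^{\kappa}_{\mu\nu} - {\bar{\Gamma}'}{}^{\kappa}_{\mu\nu}
   = \frac{\partial x'^{\kappa}}{\partial x^{\alpha}}
     \frac{\partial x^{\beta}}{\partial x'^{\mu}}
     \frac{\partial x^{\gamma}}{\partial x'^{\nu}}\,\Delta^{\alpha}_{\beta\gamma}.
\end{align}
```

With the choice ${\bar{\nabla }}_{\mu }={\partial }_{\mu }$ one has $\bar{\Gamma}^{\kappa}_{\mu\nu}=0$ only in the chart $x^{\mu}$ itself; in any other chart the background coefficients equal the inhomogeneous term above, which is how the chart dependence of the Einstein counterterm is encoded in this language.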

The rigged counterterm: this counterterm has been introduced by Parattu et al [28] as a generalization of the Gibbons–Hawking–York term valid for hypersurfaces of arbitrary causal type. We fix an outward pointing rigging ℓμ along the boundary ∂M. The rigged counterterm is [28]

Equation (56)

where ${P}_{\nu }^{\mu }={\delta }_{\nu }^{\mu }-{\ell }^{\mu }{n}_{\nu }$ is a tangential projector that removes the ℓ-directed parts of vectors. Parattu et al did not employ the formalism of rigged hypersurfaces, therefore this counterterm appeared in terms of spacetime, rather than hypersurface quantities. We rewrite it via ${P}_{\nu }^{\mu }={e}_{a}^{\mu }{\vartheta }_{\nu }^{a}$ and ${\vartheta }^{\mu a}={\nu }^{a}{\ell }^{\mu }+{h}_{\ast }^{ab}{e}_{b}^{\mu }$ as

Equation (57)

thus the counterterm has the equivalent expression

Equation (58)

a form resembling the Gibbons–Hawking–York counterterm with χab and φa playing the role of the extrinsic curvature. From (58) it can be seen that when the boundary is pseudo-Riemannian, choosing ${\ell }^{\mu }={\hat{n}}^{\mu }$ gives (see section 2.3) φa = 0, χab = εKab and ${h}_{\ast }^{ab}={h}^{ab}$, thus the rigged counterterm reproduces the Gibbons–Hawking–York term in this limit.
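The algebraic properties of the projector ${P}_{\nu }^{\mu }={\delta }_{\nu }^{\mu }-{\ell }^{\mu }{n}_{\nu }$ used in the rigged counterterm follow from the normalization ${\ell }^{\mu }{n}_{\mu }=1$ and from ${n}_{\mu }{e}_{a}^{\mu }=0$ alone. A short verification, written out as a sketch:

```latex
\begin{align}
P^{\mu}{}_{\nu}\,\ell^{\nu} &= \ell^{\mu} - \ell^{\mu}\,(n_{\nu}\ell^{\nu}) = 0, \\
n_{\mu}\,P^{\mu}{}_{\nu} &= n_{\nu} - (n_{\mu}\ell^{\mu})\,n_{\nu} = 0, \\
P^{\mu}{}_{\nu}\,e^{\nu}_{a} &= e^{\mu}_{a} - \ell^{\mu}\,(n_{\nu}e^{\nu}_{a}) = e^{\mu}_{a}, \\
P^{\mu}{}_{\rho}\,P^{\rho}{}_{\nu} &= P^{\mu}{}_{\nu} - \ell^{\mu}\,(n_{\rho}P^{\rho}{}_{\nu}) = P^{\mu}{}_{\nu}.
\end{align}
```

Thus P annihilates the rigging, leaves tangent vectors untouched and is idempotent, without any assumption on the causal type of the hypersurface.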

In the presence of a boundary ∂M of any causal type, equipped with a rigging, the variational formula (48) is replaced by [28]

Equation (59)

where

Equation (60)

We remark that this quantity involves the covariant derivative of the normal vector and thus appears to depend on its extension to a neighborhood of the boundary. However, as the calculations in appendix A show, the expression is in fact independent of the extension of the normal field. The quantity Πμν will also be decomposed in terms of hypersurface quantities in subsection 4.1. The sign of the last term has been corrected as compared to the corresponding result in [28]. The boundary term that results from the variation of SEH + BR is proportional to δgμν and vanishes when the metric is fixed on the boundary. The rigged counterterm is not unique: different choices of rigging give different counterterms.

4. Variational formalism of thin shells

We assume that the hypersurface Σ partitions the spacetime manifold M into two domains M+ and M−. These domains are manifolds with boundaries and their interiors are disjoint. For simplicity we assume that M has no outer boundary, which implies that ∂M+ = ∂M− = Σ (this notation currently ignores orientations). The formalism may be equally well used in the presence of outer boundaries, but including them would needlessly complicate the notation and outer boundaries play no role in our formalism anyway.

The regions M+ and M− are distinct as manifolds with smooth⁵ metrics ${g}_{\mu \nu }^{+}$ and ${g}_{\mu \nu }^{-}$ respectively. Coordinate systems ${x}_{+}^{\mu }$ and ${x}_{-}^{\mu }$ are employed which need not satisfy any matching conditions at Σ. As per the analysis of Clarke and Dray [8] (also comments made in [10, 13]), corrected and extended for the case of hypersurfaces with null points by Mars et al [32], the condition for the existence of a C1 structure on M is that the induced metrics ${h}_{ab}^{+}$ and ${h}_{ab}^{-}$ agree on Σ, and in case Σ is not timelike or spacelike, that there is a pair of rigging vectors ${\ell }_{+}^{\mu }$ and ${\ell }_{-}^{\mu }$ along Σ such that ℓ+ and ℓ− both point towards M+ (or both towards M−, depending on one's choice) and

Equation (61)

where ${\lambda }_{a}^{\pm }={\ell }_{\pm }\cdot {e}_{a}$ are the projections of the transverse vectors ${\ell }_{\pm }^{\mu }$ on the tangent basis of the hypersurface. This identifies ℓ+ and ℓ− as 'being the same', and thus generates a C1 differentiable structure at Σ. It follows that any coordinate system adapted to the rigging ℓ, that is, a coordinate system $\left(\sigma ,{y}^{a}\right)$ such that σ = 0 is the equation for Σ and

Equation (62)

is a C1 coordinate system. If Σ is timelike or spacelike, then the unit normal ${\hat{n}}^{\mu }$ always provides a rigging which satisfies the above conditions, therefore in that case there is no need to find a pair of matching riggings and it follows that Gaussian normal coordinates are always C1. From this point on we assume that all spacetime coordinate systems are C1 on Σ and C4 away from Σ. Since the final results will be expressed as hypersurface tensors, this does not reduce the practical applicability of the formalism. In these coordinate systems, the relation ${h}_{ab}^{+}={h}_{ab}^{-}$ also implies that the spacetime metric gμν is continuous, due to expansion (9), which involves only λa, ℓ² and hab, which are then all assumed continuous.

We use the notation

Equation (63)

for the jump discontinuity of a field F at Σ (thus $\left[F\right]$ is a function defined only on Σ) and

Equation (64)

for the 'soldering' of a field, where

Equation (65)

is the Heaviside step function associated to Σ. Any choice of value for θ at Σ ensures that for a continuous field F, $F=\bar{F}$ is a pointwise identity. The choice ${\left.\theta \right\vert }_{{\Sigma}}=1/2$ is taken for reasons of symmetry. We imagine that the hypersurface of discontinuity Σ is the limit of a layer of finite thickness, where the field F is continuous albeit rapidly varying. In the limit of infinitesimal thickness a value between F+ and F− should be picked on Σ and taking the arithmetic average (corresponding to ${\left.\theta \right\vert }_{{\Sigma}}=1/2$) is the most 'democratic' choice that gives no preference to the field values on either side of the layer.
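To make the role of the value ${\left.\theta \right\vert }_{{\Sigma}}=1/2$ concrete, the following sketch spells out the soldering on Σ and a product rule for jumps. It assumes the conventions $\bar{F}=\theta {F}^{+}+(1-\theta ){F}^{-}$ and $\left[F\right]={F}^{+}-{F}^{-}$ with θ = 1 on M+ and θ = 0 on M−; the displayed definitions are not reproduced in this text, so these signs and the auxiliary field G are assumptions made for illustration.

```latex
\begin{align}
\bar{F}\big|_{\Sigma} &= \tfrac{1}{2}\left(F^{+} + F^{-}\right)\big|_{\Sigma}
   \qquad \text{for } \theta|_{\Sigma} = \tfrac{1}{2}, \\
[FG] &= F^{+}G^{+} - F^{-}G^{-} \nonumber \\
     &= \tfrac{1}{2}\left(F^{+} + F^{-}\right)\left(G^{+} - G^{-}\right)
      + \tfrac{1}{2}\left(F^{+} - F^{-}\right)\left(G^{+} + G^{-}\right) \nonumber \\
     &= \bar{F}\,[G] + [F]\,\bar{G} \qquad \text{on } \Sigma .
\end{align}
```

The Leibniz-type rule in the last line holds only for the symmetric choice of θ on Σ, which is one practical reason to prefer it. If F is continuous, then $\left[F\right]=0$ and the rule reduces to $\left[FG\right]=F\left[G\right]$.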

To conform to the usual conventions, we also assume that the rigging vector field ℓμ points from M− to M+. The orientation on Σ is induced by the rigging ℓ. It follows that Σ has the boundary orientation inherited as the boundary of M− and the opposite of the boundary orientation inherited from M+.

4.1. Thin shell equation from the action regularized by counterterms

The total action will be taken to consist of the gravitational action SEH, an unspecified bulk matter action SM and an unspecified thin shell matter action STS. Instead of integrating over M at once, we split the integrals into sums of integrals over M+ and M−. Since Σ is not a part of the outer boundary of the manifold, the usual Dirichlet conditions do not apply to Σ: the metric is not fixed there. We suppose the metric is C0 across Σ and at least C3 away from Σ. Since we are varying within this differentiability class, δgμν also inherits these properties. The equations of motion of the shell arise as the natural boundary conditions on the shell as the bulk and boundary contributions to the variation of the action must vanish separately.

The shell hypersurface Σ is an interior boundary and thus we add the rigged counterterm (56) to the action at both sides of Σ to ensure the proper boundary behaviour of the action. The total action is then

Equation (66)

The relative sign difference between ${B}_{\mathrm{R}}^{+}$ and ${B}_{\mathrm{R}}^{-}$ is caused by the orientation of Σ being opposite to the boundary orientation inherited from the domain M+. Variation of this integral with respect to the metric is carried out by applying the variation formula (59) to both the + and − integrals, giving

Equation (67)

where ${L}_{\mathrm{M}}={\mathcal{L}}_{\mathrm{M}}/\sqrt{-\mathfrak{g}}$ is the scalarized matter Lagrangian. The variation of the integral should vanish for all variations δgμν that are C0 across Σ and C3 away from Σ. In particular, we can choose an arbitrary δgμν which satisfies ${\left.\delta {g}_{\mu \nu }\right\vert }_{{\Sigma}}=0$, which implies that the coefficients of the δgμν in the bulk integrals should vanish, giving the Einstein field equations in the bulk. It then follows that the surface term ${\int }_{{\Sigma}}\left(\cdots \right)\delta {g}_{\mu \nu }\enspace {\mu }_{\ell ,g}$ should vanish separately even for arbitrary δgμν , which results in the equation

Equation (68)

where

Equation (69)

is the surface energy–momentum tensor. The second term is a contribution coming from the bulk matter Lagrangian if it also depends on the derivatives of the metric tensor, usually via the connection. It arises precisely as follows. If SM is the matter action with ${S}_{\mathrm{M}}={\int }_{M}{\mathcal{L}}_{\mathrm{M}}\left(g,\partial g,\psi ,\partial \psi \right){\mathrm{d}}^{D+1}x$, and scalar Lagrangian function ${L}_{\mathrm{M}}={\mathcal{L}}_{\mathrm{M}}/\sqrt{-\mathfrak{g}}$, the variation of the matter action with respect to the metric is

Equation (70)

The total divergence term here can be expressed in terms of the scalar Lagrangian and the covariant divergence. Specifically, since the metric determinant is independent of the metric's first derivative, we get

Equation (71)

and even though gμν,κ is not a tensor, ∂LM/∂gμν,κ is (see [15], chapter 2, section 11). We can therefore write

Equation (72)

If this integral is performed over a spacetime with a shell Σ we thus obtain the difference term $-{n}_{\kappa }\left[\frac{\partial {L}_{\mathrm{M}}}{\partial {g}_{\mu \nu ,\kappa }}\right]$ on the shell which contributes to the energy–momentum tensor. Out of the standard model fields, only the Lagrangian of the Dirac field depends on the connection; however, since the Dirac field is spinorial, an alternative formulation based on orthonormal tetrads would be necessary to incorporate it into the formalism. For some exotic matter fields (for example the scalar sector of Horndeski's theory [38]) this term may be nonvanishing. As far as we are aware, such possible contributions to the thin shell energy–momentum tensor have not been explored so far in the literature.

Equation (68) is the equation of motion for the thin shell in unprojected form. To proceed, we decompose the tensor Πμν in the frame $\left(\ell ,{e}_{a}\right)$. This is best accomplished by transitioning to a coordinate system $\left(\sigma ,{y}^{a}\right)$ adapted to the rigging (i.e. ${\ell }^{\mu }={\left(\partial /\partial \sigma \right)}^{\mu }$ and the ya are the hypersurface coordinates), giving

Equation (73)

where $\psi ={\psi }_{a}^{a}$ is the trace. The details of this derivation may be found in appendix A.

Since the metric is continuous, only the extrinsic curvature-type quantities χab, φa, ${\psi }_{a}^{b}$ may suffer jumps, as they in general involve the metric's transversal derivatives. The tensor Hab was introduced in (17) because, as it turns out, the jumps of all such quantities can be related to that of Hab. We refer to Mars and Senovilla for details (equations (72)–(76) in [10]) and merely list the jump relations

Equation (74)

where

Equation (75)

and is always symmetric. The jump of the metric derivatives can be written as

Equation (76)

where the jump of the tangential derivative $\left[{\partial }_{a}{g}_{\mu \nu }\right]{\vartheta }_{\kappa }^{a}$ vanishes because of the continuity of the metric and ${g}_{\mu \nu ,\ell }={\ell }^{\kappa }{\partial }_{\kappa }{g}_{\mu \nu }$ is the transversal derivative. We then have

Equation (77)

where ${\xi }_{\mu \nu }{:=}\left[{g}_{\mu \nu ,\ell }\right]$ and it follows that

Equation (78)

which in adapted coordinates is the jump of the transversal derivative of the induced metric,

Equation (79)

For this reason it is $\left[{H}_{ab}\right]$ that carries information about the discontinuities of the metric's transversal development.

Inserting the jump relations (74) into (73) gives

Equation (80)

Equation (81)

and

Equation (82)

It follows that the jump $\left[{{\Pi}}^{\mu \nu }\right]$ is a tangential tensor field along Σ, which we may write as $\left[{{\Pi}}^{\mu \nu }\right]=\left[{{\Pi}}^{ab}\right]{e}_{a}^{\mu }{e}_{b}^{\nu }$. Following from (68), the surface energy–momentum tensor must also be tangential with ${S}^{\mu \nu }={S}^{ab}{e}_{a}^{\mu }{e}_{b}^{\nu }$ and the shell equation can be considered as the hypersurface tensor equation

Equation (83)

Since a contravariant tensor being tangential is an intrinsic notion independent of any choice of rigging, the components $\left[{{\Pi}}^{ab}\right]$ are calculated from $\left[{{\Pi}}^{\mu \nu }\right]$ in a way that is independent of the rigging. Applying the transformation formulae of section 2.2 to (83) shows that $\left[{{\Pi}}^{ab}\right]$ is invariant under the shift transformation ${\ell }^{\mu }{\mapsto}{\ell }^{\mu }+{T}^{a}{e}_{a}^{\mu }$ of the rigging and changes as $\left[{{\Pi}}^{ab}\right]{\mapsto}{\alpha }^{-1}\left[{{\Pi}}^{ab}\right]$ under the rescaling ${\ell }^{\mu }{\mapsto}\alpha {\ell }^{\mu }$. This ambiguity in the shell equation is related to the fact that for a generic hypersurface there is no preferred scaling for the normal field nμ. In the variational principle, both $\left[{{\Pi}}^{ab}\right]$ and Sab appear as a factor in the expression

Equation (84)

and the volume element μℓ,g depends on the scaling of the normal. It follows that for the densitized surface tensor ${\mathfrak{S}}^{ab}={S}^{ab}{\rho }_{\ell ,g}$ and densitized Π-tensor ${\mathfrak{P}}^{ab}={{\Pi}}^{ab}{\rho }_{\ell ,g}$, the analogous equation

Equation (85)

is completely independent of any gauge choices, including the scaling of the normal. If one wishes to use tensor equations, the scaling ambiguity in the generic case is unavoidable. For timelike or spacelike hypersurfaces a canonical choice is given by the unit normal which fixes the scaling of $\left[{{\Pi}}^{ab}\right]$ and Sab , while in the null case Poisson [11] gave a physical interpretation of this ambiguity in terms of observers taking measurements of the null shell.
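The cancellation behind this gauge independence can be sketched explicitly. The block below assumes that the hypersurface volume element is the contraction of μg with the rigging pulled back to Σ, written here as ${\mu }_{\ell ,g}={\iota }_{\ell }{\mu }_{g}{\vert }_{{\Sigma}}$, and that α is a positive function on Σ; the notation ι for the interior product is introduced for illustration only.

```latex
\begin{align}
\ell^{\mu} \mapsto \alpha\,\ell^{\mu}, \quad n_{\mu} \mapsto \alpha^{-1} n_{\mu}
  \quad &\Longrightarrow \quad
  \rho_{\alpha\ell,g} = \alpha\,\rho_{\ell,g}, \qquad
  [\Pi^{ab}] \mapsto \alpha^{-1}[\Pi^{ab}], \\
\mathfrak{P}^{ab} = [\Pi^{ab}]\,\rho_{\ell,g}
  \quad &\mapsto \quad
  \alpha^{-1}[\Pi^{ab}]\;\alpha\,\rho_{\ell,g} = \mathfrak{P}^{ab}, \\
\ell^{\mu} \mapsto \ell^{\mu} + T^{a}e^{\mu}_{a}
  \quad &\Longrightarrow \quad
  \iota_{\ell + T}\,\mu_{g}\big|_{\Sigma} = \iota_{\ell}\,\mu_{g}\big|_{\Sigma},
\end{align}
```

where $\mathfrak{P}^{ab}$ is understood here as built from the jump. The last line holds because a (D + 1)-form with one slot filled by a tangent vector vanishes when pulled back to the D-dimensional hypersurface. Since $\left[{{\Pi}}^{ab}\right]$ is itself shift invariant, the densitized jump is unaffected by both rescalings and shifts of the rigging.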

The tensor Πμν which has been split into the components Π00, Π0a and Πab may be identified with the canonical momentum of the gravitational field, up to scaling and densitization (canonical momenta are usually taken to be tensor densities). Ordinarily, canonical momenta are constructed by foliating spacetime into a one-parameter family of spacelike hypersurfaces [25], but one may equally well analyze the dynamics with respect to any foliation of spacetime, including the case when we foliate with respect to the transversal coordinate σ adapted to the rigging ℓμ. In the usual formulation, the canonical momentum is the derivative of the Lagrangian with respect to the 'time' derivative of the field (where 'time' is in this case σ); however, it is well known [48] that the canonical momentum may also be identified with the coefficients of the field variation on the boundary when the Dirichlet conditions are not imposed. This is the basis for the so-called covariant phase space formalism [49, 50]. For the Einstein–Hilbert action extended with the rigged boundary term, by (59), the boundary part is

Equation (86)

which shows that Π00, Π0a and Πab are proportional to the canonical momenta corresponding to the metric degrees of freedom ℓ², λa and hab. The shell equation then has the interpretation that the surface energy–momentum tensor is the jump of the canonical momentum on the hypersurface.

If the condition ${\left.\delta {g}_{\mu \nu }\right\vert }_{\partial M}=0$ is not imposed on a boundary (such is the case for thin shells), the vanishing of the variation of the action forces the coefficients of the δgμν to vanish on the boundary. Since these coefficients are identified with the canonical momentum of the field, the canonical momentum must vanish on the boundary. This is referred to as the natural boundary condition [23], as it arises without having to impose a boundary condition by hand. We can thus also see that the shell equation $\frac{1}{2}{S}^{ab}-\frac{1}{2\varkappa }\left[{{\Pi}}^{ab}\right]=0$ is the natural boundary condition for the combined gravitation + bulk matter + shell matter actions on the hypersurface.
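A one-dimensional toy model illustrates how a jump condition arises as a natural boundary condition of a broken extremal problem. The functional below is a hypothetical example chosen for illustration, not taken from the paper: a field u on [−1, 1], fixed at the endpoints, continuous but possibly kinked at x = 0, coupled to a point source of strength s located there.

```latex
\begin{align}
S[u] &= \int_{-1}^{0} \tfrac{1}{2}\,u'^{2}\,\mathrm{d}x
      + \int_{0}^{1} \tfrac{1}{2}\,u'^{2}\,\mathrm{d}x + s\,u(0), \\
\delta S &= -\int_{x \neq 0} u''\,\delta u\,\mathrm{d}x
      + \bigl(u'(0^{-}) - u'(0^{+})\bigr)\,\delta u(0) + s\,\delta u(0) \nonumber \\
     &= -\int_{x \neq 0} u''\,\delta u\,\mathrm{d}x
      + \bigl(s - [u']\bigr)\,\delta u(0), \\
\delta S = 0 \;\;\forall\,\delta u
  \quad &\Longrightarrow \quad
  u'' = 0 \;\;(x \neq 0), \qquad [u'] = s .
\end{align}
```

Here u′ plays the role of the canonical momentum conjugate to u with x as 'time', the bulk equation is the analogue of the Einstein equations away from Σ, and the condition $[u']=s$ is the analogue of the shell equation: the source on the 'shell' equals the jump of the momentum. No boundary condition was imposed at x = 0; the jump condition follows from stationarity alone.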

Unlike the equations of motion, canonical momenta are not invariant under equivalence transformations of Lagrangians such as adding total divergences, and in the case of Einstein–Hilbert type Lagrangians they are sensitive to the specific form of the variational counterterm added to the action. However, as discussed in section 3, the difference of two variational counterterms may depend only on the metric tensor and its tangential derivatives, but never on the transversal derivative. Only the transversal derivative has nonzero jump, thus while the expressions Π00, Π0a and Πab depend on the choice of counterterm, their jumps (of which only $\left[{{\Pi}}^{ab}\right]$ is nonvanishing) do not. Therefore, the thin shell equation (83) is actually independent of the choice of counterterm.

If Σ is timelike or spacelike and we apply the canonical choice of rigging presented in section 2.3, we obtain the equation

Equation (87)

which is the well-known Lanczos equation [7]. If instead we take Σ to be null and choose the null rigging adapted to a spacelike foliation of Σ (section 2.4), we decompose the equation into components S11, S1A and SAB , which are respectively

Equation (88)

These relations agree with those of Poisson [11], who interprets μ := S11 as the surface energy density, jA := S1A as the surface current and—since SAB is diagonal in that it is proportional to the metric—$p{:=}-\left[{H}_{11}\right]$ as the isotropic surface pressure of the null shell.

We conclude this section by comparing the result (83) to the analogous results in previous works. As mentioned in footnote 2, Barrabès and Israel [9] assume $n\cdot n=\text{const}$, which formally excludes causality-changing hypersurfaces, and they use the normalization $n\cdot \ell ={\eta }^{-1}$, where η is a nowhere vanishing function along Σ. Once this differing normalization convention is taken into account, equation (31) in [9] agrees with our shell equation (83). In place of Hab, they employ a different quantity (denoted ${\mathcal{K}}_{ab}$), the jump of which however coincides with that of Hab in all cases.

In [10] Mars and Senovilla consider only junction conditions and analyze the distributional forms of curvature tensors, therefore the shell equation itself does not appear directly. However since the energy–momentum tensor is proportional to the Einstein tensor, the singular part of the Einstein tensor (equation (71) in [10]) agrees with our $\left[{{\Pi}}^{ab}\right]$ up to the appropriate constant factor and projection. This singular part of the Einstein tensor also appears in explicitly projected form in equation (23) of [13].

4.2. Thin shell equation from the action regularized distributionally

Here we explore a different method of regularizing the action integral at the shell. In the timelike case this method was applied by Hájíček and Kijowski [20]. We show that it also works for shells of any signature. We can write the metric tensor as

Equation (89)

where θ is the Heaviside step function defined in (65). This relation is then interpreted distributionally. Reasonably rigorous treatments of tensor distribution theory on manifolds can be found in [10, 18, 19, 51, 52], therefore we give only a short review here.

If T is a type (k, l) tensor field on M we say that a type (l, k) tensor density φ of weight 1 is a dual density to T, since then the contraction $\langle \varphi ,T\rangle ={{\varphi }_{{\mu }_{1}\dots {\mu }_{k}}}^{{\nu }_{1}\dots {\nu }_{l}}{{T}^{{\mu }_{1}\dots {\mu }_{k}}}_{{\nu }_{1}\dots {\nu }_{l}}$ is a scalar density of weight 1 that may be integrated over D + 1 dimensional regions of M. Let us define the vector space Dk,l(M) to consist of smooth compactly supported tensor densities of type (l, k) (called test densities), and the space ${D}_{k,l}^{\ast }(M)$ to consist of linear functionals on Dk,l(M) that are continuous in the following sense. A linear functional $\chi :{D}_{k,l}(M)\to \mathbb{R}$ is continuous and thus belongs to ${D}_{k,l}^{\ast }(M)$ if for each sequence φn ∈ Dk,l(M) of test densities whose supports are contained in a common compact set K ⊂ M which is itself located in the domain of a coordinate chart, and such that the components ${{\left({\varphi }_{n}\right)}_{{\mu }_{1}\dots {\mu }_{k}}}^{{\nu }_{1}\dots {\nu }_{l}}$ and their partial derivatives of all orders tend to 0 uniformly, we have $\underset{n\to \infty }{\mathrm{lim}}\,\chi [{\varphi }_{n}]=0$. Elements of ${D}_{k,l}^{\ast }(M)$ are called tensor distributions of type (k, l). A tensor distribution $\chi \in {D}_{k,l}^{\ast }(M)$ is regular if there exists a (locally integrable but otherwise 'rough') tensor field also denoted χ such that for any test density φ we have $\chi [\varphi ]={\int }_{M}\langle \chi ,\varphi \rangle \,{\mathrm{d}}^{D+1}x$. This integral converges because φ has compact support and since the integrand is a density, no volume form is necessary here. Otherwise the distribution is singular. We remark that it is well-defined to take the tensor product of a tensor distribution with a smooth tensor field, however products with non-smooth tensor fields only make sense in limited circumstances.

de Rham [51] refers to a distributional k-form in the above sense as a current of degree k or a k-current for short. Since antisymmetric contravariant tensor densities with (D + 1) − k indices can be identified canonically with k-forms, it follows that the dual densities φ of k-forms ω can be canonically identified with (D + 1) − k-forms under the pairing map ⟨φ, ω⟩ = ω ∧ φ, thus k-currents are continuous linear functionals on (D + 1) − k-forms. de Rham defines the boundary ∂ω of a k-current ω by ∂ω[φ] = ω[dφ], then the (distributional) exterior derivative by $\mathrm{d}\omega ={(-1)}^{k+1}\partial \omega $.

Finally, a few remarks on notation and local representations are in order. As de Rham proves⁶ in [51], distributions have the sheaf property, i.e. if ${D}_{k,l}^{\ast }(U)$ denotes the space of type (k, l) tensor distributions over the open set U, and V ⊆ U is an open subset, we have a well-defined restriction map ${\mathrm{r}\mathrm{e}\mathrm{s}}_{V,U}(\chi )\equiv {\left.\chi \right\vert }_{V}$ given by

Equation (90)

where extU,V : Dk,l(V) → Dk,l(U) extends the tensor density φ ∈ Dk,l(V) defined on V with compact support to a tensor density defined on U with compact support by taking φ to be zero on U\V. This means that the rule $U{\mapsto}{D}_{k,l}^{\ast }(U)$ is a presheaf of real vector spaces, and is in fact a sheaf, i.e. if a tensor distribution vanishes in a neighborhood of each point in U, then it vanishes on U, and if compatible local distributions are given on an open cover, they glue together to give a well-defined tensor distribution on the covered domain. One may then show that if U ⊆ M is a coordinate domain and $\chi \in {D}_{k,l}^{\ast }(U)$ is a tensor distribution of type (k, l), we can write χ uniquely as

Equation (91)

where the components are scalar distributions on U, and for distributions defined on M, the entire distribution may be reconstructed from its sets of components if the manifold is covered by coordinate domains. Moreover, on any test density φ we have

Equation (92)

where the contraction is a distributional scalar density (i.e. D + 1-form) interpreted as a D + 1-current and it acts on the 0-form 1. Although the 1 function is not compactly supported, one can also show [51] that it makes sense to let a distribution act—through the use of a partition of unity—on a non-compactly supported test density and if the distribution itself has compact support, then this is always convergent, therefore the above expression is well-defined. If we further denote the action of a D + 1-current ω on 1 as

Equation (93)

we obtain 'classical' notation for tensor distributions (e.g. similar to what is found in [1]), since (1) it is possible to use index notation with tensor densities and make local calculations, (2) actions of distributions can be symbolically denoted by an integral.

We identify the Heaviside step function θ with the corresponding 0-current and define the (one-form) Dirac delta ${\delta }_{\ast }^{{\Sigma}}$ associated to the hypersurface Σ to be the exterior derivative of the Heaviside current, i.e. we have for any smooth compactly supported D-form (or equivalently, vector density) φ

Equation (94)
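The definition can be unpacked with Stokes' theorem. The following sketch assumes that θ = 1 on M+ and θ = 0 on M−, that the 0-current θ acts on a (D + 1)-form ψ by ${\int }_{M}\theta \,\psi $, and that Σ carries the orientation opposite to the boundary orientation inherited from M+, as stated at the end of the introduction to section 4.

```latex
\begin{align}
\delta^{\Sigma}_{\ast}[\varphi] = \mathrm{d}\theta[\varphi]
  &= (-1)^{0+1}\,\partial\theta[\varphi]
   = -\,\theta[\mathrm{d}\varphi]
   = -\int_{M^{+}} \mathrm{d}\varphi \nonumber \\
  &= -\int_{\partial M^{+}} \varphi
   = \int_{\Sigma} \varphi .
\end{align}
```

In one dimension, with M+ = {x > 0} and φ a compactly supported function f, this reduces to $-{\int }_{0}^{\infty }{f}^{\prime }\,\mathrm{d}x=f(0)$, the familiar action of the Dirac delta.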

Since M is equipped with a volume form μg, choosing a rigging ℓμ pointing from M− to M+ with adapted normal nμ satisfying nμ ℓμ = 1 also defines the volume element μℓ,g on Σ. Then it becomes possible to define a scalar distribution δΣ evaluated on a test D + 1-form φ as

Equation (95)

where f is the scalar function uniquely determined⁷ by φ = f μg. It is easy to verify the relation (${\delta }_{\mu }^{{\Sigma}}$ are the components of the one-current ${\delta }_{\ast }^{{\Sigma}}$)

Equation (96)

which shows that the scalar δΣ depends on the choice of normal (i.e. it depends on the rigging only up to scaling). For any soldered quantity $\bar{F}$ we then have distributionally

Equation (97)
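A one-dimensional analogue may help fix ideas about how the hierarchy continuous metric, discontinuous connection, singular curvature comes about. The sketch below uses a function on the real line soldered at x = 0, with the same assumed conventions $\bar{F}=\theta {F}^{+}+(1-\theta ){F}^{-}$ and $\left[F\right]={F}^{+}(0)-{F}^{-}(0)$; the specific function |x| is chosen for illustration only.

```latex
\begin{align}
\frac{\mathrm{d}\bar{F}}{\mathrm{d}x}
  &= \overline{\frac{\mathrm{d}F}{\mathrm{d}x}} + [F]\,\delta(x), \\
f(x) = |x|: \qquad [f] &= 0, \qquad
  f'(x) = \operatorname{sgn}(x) \;\; (x \neq 0), \qquad [f'] = 2, \\
f'' &= \overline{f''} + [f']\,\delta(x) = 2\,\delta(x).
\end{align}
```

The function f is continuous like the metric, its first derivative jumps like the connection, and its second derivative is purely singular like the shell part of the curvature, with the coefficient of the delta given by the jump of the first derivative.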

Since the jump of the metric vanishes, the connection can be written as

Equation (98)

and its jump as (77)

Equation (99)

where ${\xi }_{\mu \nu }=\left[{g}_{\mu \nu ,\ell }\right]$. The curvature tensor is then

Equation (100)

Let ${\mathcal{R}}_{\lambda \mu \nu }^{\kappa }$ denote its singular part, i.e. the coefficients of δΣ. Expanding gives

Equation (101)

where $\left[{H}_{\mu \nu }\right]=\left[{H}_{ab}\right]{\vartheta }_{\mu }^{a}{\vartheta }_{\nu }^{b}$ (we refer to [10] for details). The scalar curvature is calculated by contracting the curvature tensor twice as $R=\bar{R}+\mathcal{R}{\delta }^{{\Sigma}}$, where

Equation (102)

According to the jump relations (74), we may rewrite this as

Equation (103)

which is −2ϰ-times the jump of the integrand of the rigged counterterm (58). The gravitational (scalar) Lagrangian in the presence of a shell and interpreted as a distribution is then

Equation (104)

It follows that the Einstein–Hilbert action over M is

Equation (105)

where we have used that ${\chi }_{ab}{h}_{\ast }^{ab}-{\varphi }_{a}{\nu }^{a}$ can be written in the form ${P}_{\nu }^{\mu }{\nabla }_{\mu }{n}^{\nu }$. If we add to this the bulk and thin shell matter actions, we obtain the same variational principle as given by (66). We have thus shown that if instead of splitting the action into separate integrals on M+ and M− and adding counterterms, we integrate over M while taking into account the singular contribution to the Lagrangian, the resulting singular terms give precisely the difference of the counterterms that otherwise would have had to be added by hand.

It is interesting to note that there is no a priori reason for the singular part of the Lagrangian to have the same value as the difference of the counterterms. The Einstein–Hilbert Lagrangian density can be written in the form

Equation (106)

where the coefficients are

Equation (107)

and

Equation (108)

and ${{\Gamma}}_{\ast }^{\kappa }={{\Gamma}}_{\mu \nu }^{\kappa }{g}^{\mu \nu }$. Since only the second derivatives contribute singular terms, the Lagrangian has the distributional form

Equation (109)

while a variational counterterm is given by

Equation (110)

which is obtainable by integrating the first term in ${\mathcal{L}}_{\text{EH}}$ by parts.

Since Pκλμν is algebraic in the metric and thus does not depend on the transversal derivatives, it is clear that the jump of the integrand of B is the same as the singular part of ${\mathcal{L}}_{\text{EH}}$. However if Pκλμν were to depend on the metric's transversal derivative, the singular part of the Lagrangian would be mathematically meaningless as Pκλμν would be discontinuous at Σ where it is being evaluated. If we choose ${\left.\theta \right\vert }_{{\Sigma}}=1/2$ as the value of the step function on Σ, then the meaning of such an expression can be salvaged as Pκλμν evaluated on the average value of the metric's transversal derivative at the price of taking products of Dirac deltas with discontinuous functions. Moreover, were Pκλμν to depend on the metric's derivatives, the counterterm would have to take a different form as one could no longer get rid of second derivatives in the action by simple integrations by parts.

It thus seems that such a simple relation between the singular part of the Lagrangian and the jump of the counterterms exists if the Lagrangian is affine in the second derivatives of the field with coefficients that do not depend on the derivatives of the field, however if these conditions are violated in a modified gravitational theory the above derivation breaks down and further analysis would be necessary.
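The mechanism can be seen in a mechanical toy model. The Lagrangian below is a hypothetical one-dimensional analogue, not taken from the paper: it is affine in the second derivative with a coefficient P(q) that does not depend on derivatives, and the trajectory q(t) is continuous with a kink at t = 0. Overall sign conventions for the counterterm are an assumption of the sketch.

```latex
\begin{align}
L &= P(q)\,\ddot{q} + Q(q,\dot{q})
   = \frac{\mathrm{d}}{\mathrm{d}t}\bigl(P(q)\,\dot{q}\bigr)
     + \underbrace{Q(q,\dot{q}) - P'(q)\,\dot{q}^{2}}_{\text{first order}}, \\
B &= -\,P(q)\,\dot{q}\,\Big|_{\text{boundary}}
   \qquad \text{(counterterm removing the total derivative)}, \\
\ddot{q} &= \overline{\ddot{q}} + [\dot{q}]\,\delta(t)
   \quad \Longrightarrow \quad
   L = \bar{L} + P\bigl(q(0)\bigr)\,[\dot{q}]\,\delta(t), \\
[P(q)\,\dot{q}] &= P\bigl(q(0)\bigr)\,[\dot{q}]
   \qquad \text{since } P(q) \text{ is continuous at } t = 0 .
\end{align}
```

The singular part of L coincides with the jump of the quantity whose boundary value defines the counterterm, precisely because P is continuous across the kink. If P depended on $\dot{q}$, the product $P\,\delta (t)$ would multiply a delta by a discontinuous function and the identification would no longer be automatic.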

4.3. Thin shell equation from a first order action

We mention here for completeness that the correct shell equation may also be obtained without having to regularize the Einstein–Hilbert action on Σ by employing a first order equivalent. To ensure global validity, we choose the background connection action (55) rather than the noncovariant ΓΓ-action (53).

As we have shown in section 3, we may view the first order equivalents as the Einstein–Hilbert action extended with a particular variational counterterm. Therefore we may ascertain without any explicit calculations that the first order action (55) leads to the correct shell equation, as the difference of two different variational counterterms does not depend on the metric's transversal derivative, therefore their jumps always agree. However it is the jump of the counterterm that appears in the action (66), therefore a first order action will lead to the same variational principle and thus the same shell equation.

Nonetheless it is instructive to rederive the result via the first order action from the beginning. The background connection action (55) is

Equation (111)

where

Equation (112)

and ${{\Delta}}_{\mu \nu }^{\kappa }={{\Gamma}}_{\mu \nu }^{\kappa }-{\bar{{\Gamma}}}_{\mu \nu }^{\kappa }$. To ensure manifest covariance, we consider L to be a function of gμν and ${\bar{\nabla }}_{\kappa }{g}_{\mu \nu }$, the covariant derivative of the metric with respect to the background connection, where the relation [25]

Equation (113)

is relevant. A variation of the action leads symbolically to

Equation (114)

where

Equation (115)

Since the background connection ${\bar{\nabla }}_{\kappa }$ has no a priori relation with the volume element μg , we may not use Gauss' theorem with it. Therefore we must express ${\bar{\nabla }}_{\kappa }\delta {g}_{\mu \nu }$ with the Levi-Civita connection ∇κ . This is accomplished via the difference formula [25]

Equation (116)

Inserting this into the variation gives

Equation (117)

thus the variation of the action is

Equation (118)

The bulk terms must be $-\frac{1}{2\varkappa }{G}^{\mu \nu }$ in disguise, since the action differs from the Einstein–Hilbert action in a total derivative term only. For the boundary term we have

Equation (119)

where ${{\Delta}}_{\lambda }={{\Delta}}_{\mu \lambda }^{\mu }$ and ${{\Delta}}_{\ast }^{\kappa }={{\Delta}}_{\mu \nu }^{\kappa }{g}^{\mu \nu }$. The details of the analogous derivation for the ΓΓ-action (53) are given in appendix 9 of [47] and are therefore omitted here. Let the contraction of the above expression be denoted

Equation (120)

We rewrite the variational principle for thin shells as

Equation (121)

and varying this with respect to the metric gives

Equation (122)

where ${S}^{\mu \nu }$ is again defined by (69). Imposing the stationarity of the action gives the boundary equation

Equation (123)

The jump of ${M}^{\mu \nu }$ can be written as

Equation (124)

where ${{\Gamma}}_{\lambda }={{\Gamma}}_{\lambda \mu }^{\mu }$ and ${{\Gamma}}_{\ast }^{\kappa }={{\Gamma}}_{\mu \nu }^{\kappa }{g}^{\mu \nu }$. The connection ${\bar{\nabla }}_{\mu }$ was assumed to be a smooth background structure, therefore $\left[{{\Delta}}_{\mu \nu }^{\kappa }\right]=\left[{{\Gamma}}_{\mu \nu }^{\kappa }\right]-\left[{\bar{{\Gamma}}}_{\mu \nu }^{\kappa }\right]=\left[{{\Gamma}}_{\mu \nu }^{\kappa }\right]$. If we decompose $\left[{M}^{\mu \nu }\right]$ in the frame $\left(\ell ,{e}_{a}\right)$ (since the calculation is lengthy, the details are in appendix B), we obtain that $\left[{M}^{00}\right]=0$, $\left[{M}^{0a}\right]=0$, and thus $\left[{M}^{\mu \nu }\right]$ is tangential with $\left[{M}^{\mu \nu }\right]=\left[{M}^{ab}\right]{e}_{a}^{\mu }{e}_{b}^{\nu }$, its projected components being

Equation (125)

which is equal to $\left[{{\Pi}}^{ab}\right]$. The shell equation is

Equation (126)

where ${S}^{ab}$ is defined by ${S}^{\mu \nu }={S}^{ab}{e}_{a}^{\mu }{e}_{b}^{\nu }$. This equation agrees with (83), which shows that the first order action indeed leads to the correct equations.

We remark that the corresponding derivation for the Einstein–Hilbert action extended with counterterms crucially relied on the variation formula (59) originally derived by Parattu et al [28], which is a nontrivial result and difficult to obtain. The first order action, on the other hand, provides a straightforward derivation, which is clearly advantageous. The disadvantage of the first order approach is that sufficiently complicated theories of gravitation (e.g. Horndeski theory [38]) do not admit first order equivalent Lagrangians, therefore this method cannot always be relied upon.

Whether a given modified theory of gravity with second order field equations can be described in terms of a first order Lagrangian can be determined easily by looking at the field equations. A first order Lagrangian will produce Euler–Lagrange equations that have at most an affine dependence on the second derivatives of the field variables. It is known [54, 55] that, at least locally, the converse of this statement is also true: every locally variational second order differential equation (in the sense made precise in the final footnote) that is affine in the second derivatives has a local first order Lagrangian. Thus, a theory of gravitation specified in terms of a second order Lagrangian with second order field equations will have a (possibly only local and non-covariant) first order equivalent if and only if the field equations are affine functions of the second derivatives.
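The affine dependence on second derivatives can be made explicit with a short worked expansion. The notation here is hypothetical and introduced only for illustration: ${u}^{A}$ denotes generic field variables, ${x}^{\mu }$ the coordinates, ${u}_{\mu }^{A}$ and ${u}_{\mu \nu }^{A}$ the first and second partial derivatives, and $L=L(x,u,\partial u)$ a first order Lagrangian.

```latex
% Euler-Lagrange expression of a first order Lagrangian L(x^\mu, u^A, u^A_\mu).
% The total derivative d/dx^\mu is expanded with the chain rule.
\begin{align*}
E_A &= \frac{\partial L}{\partial u^{A}}
      - \frac{d}{dx^{\mu}}\frac{\partial L}{\partial u^{A}_{\mu}} \\
    &= \frac{\partial L}{\partial u^{A}}
      - \frac{\partial^{2} L}{\partial x^{\mu}\,\partial u^{A}_{\mu}}
      - \frac{\partial^{2} L}{\partial u^{B}\,\partial u^{A}_{\mu}}\,u^{B}_{\mu}
      - \frac{\partial^{2} L}{\partial u^{B}_{\nu}\,\partial u^{A}_{\mu}}\,u^{B}_{\mu\nu}.
\end{align*}
```

The second derivatives ${u}_{\mu \nu }^{B}$ appear only in the last term and only linearly, with coefficients that depend on at most first derivatives. This is the forward direction of the criterion; the converse cited from [54, 55] is the nontrivial statement.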

Looking at the field equations of Horndeski's theory (presented for example in [40]), one can ascertain that the restrictions ${G}_{5}(\phi ,X)=0$, ${G}_{4}(\phi ,X)={G}_{4}(\phi )$ and ${G}_{3}(\phi ,X)={G}_{3}(\phi )$ are necessary to ensure the existence of a first order equivalent. This includes the Brans–Dicke type theories, where the scalar field Lagrangian is first order and the non-minimal coupling of the scalar field to gravity does not involve the scalar field derivatives. It excludes the Galileon-type models as well as kinetic gravity braiding, where the higher-order nonlinear derivative interaction of the scalar field prevents the existence of first order equivalent Lagrangians. Outside Horndeski theories, Gauss–Bonnet gravity is an example of a theory with no first order Lagrangian, as the field equations are quadratic in the curvature tensors [35].

5. Conclusions

The purpose of this paper was to provide a variational formalism for spacetimes containing a thin shell of completely unconstrained signature. To treat shells of arbitrary signature, we have used the formalism of rigged hypersurfaces, reviewed in section 2. Shells are incorporated into the variational principle as interior boundaries and their equations of motion are the natural boundary conditions on them. The Einstein–Hilbert action needed to be regularized at the shell to ensure a valid variational principle. We have investigated multiple possible regularization procedures.

In subsection 4.1, regularization has been carried out by adding variational counterterms (reviewed in section 3) to the action. The shell equation (83) obtained by varying this modified action reproduces the results obtained through distributional methods by Barrabès and Israel [9], Mars and Senovilla [10] and Senovilla [13]. We have shown that the shell equation does not depend on the choice of the counterterm and have identified the geometric quantity the jump of which appears in the equations of motion to be the (tensorial) canonical momentum of the gravitational field, generalized to unconstrained instead of just spacelike foliations.

We have considered a different regularization process in subsection 4.2 by focusing on the singular part of the Lagrangian. We have shown that the singular term is related to the jump of the counterterm and leads to the same variational principle. This generalized the procedure employed by e.g. Hájíček and Kijowski [20] to arbitrary shells. We have also argued that a more general Lagrangian might have a less trivial relationship between the singular parts and the counterterms.

Finally, in subsection 4.3, we have obtained the equations of motion of the shell by employing a first order equivalent Lagrangian. This led to a simpler variational procedure, but we have noted that more complicated theories might not have first order equivalents, rendering this method less adequate for generalization.

Aside from filling a gap in the literature, we expect that this work will be useful for formulating thin shells and junction conditions along generic hypersurfaces in second order modified theories of gravity, such as Horndeski theory [38]. Second order Lagrangians are capable of producing second order differential equations at least quadratic in the second derivatives (the equations of motion associated with the ${G}_{3}$ term in Horndeski's theory are an example), which could lead to ill-defined products of delta functions if the distributional method were to be followed. Thus it would seem that variational approaches to thin shells are better behaved for such theories and always lead to unambiguous shell equations.

Acknowledgments

This research was funded by the Hungarian National Research Development and Innovation Office (NKFIH) in the form of Grant 123996. I am grateful to László Á Gergely for valuable discussions on the topics of this paper.

Data availability statement

No new data were created or analysed in this study.

Appendix A. Decomposition of ${{\Pi}}^{\mu \nu }$

In this appendix we carry out the explicit decomposition of the tensor field

Equation (A.1)

defined along the hypersurface Σ in the frame $\left(\ell ,{e}_{a}\right)$. This calculation is best carried out by evaluating ${{\Pi}}^{\mu \nu }$ in adapted coordinates $\left(\sigma ,{y}^{a}\right)$ such that the ${y}^{a}$ parametrize Σ, while

Equation (A.2)

In such an adapted coordinate system we have

Equation (A.3)

Equation (A.4)

and the connection ${{\Gamma}}_{\mu \nu }^{\kappa }$ has components

Equation (A.5)

Equation (A.6)

The elements U and Za involve transversal derivatives of the frame vectors and thus are not independent of the way the quantities are extended off Σ. Fortunately, they will cancel. We first evaluate what we can without fixing the free indices as

Equation (A.7)

then we get

Equation (A.8)

Equation (A.9)

Equation (A.10)

Appendix B. Decomposition of $\left[{M}^{\mu \nu }\right]$

We now carry out the decomposition in the frame $\left(\ell ,{e}_{a}\right)$ of the tensor field

Equation (B.1)

defined only along the hypersurface Σ, given in (124). As we have argued at (76), we may write the jump of the metric's derivative as

Equation (B.2)

where ${\xi }_{\mu \nu }=\left[{g}_{\mu \nu ,\ell }\right]$ is the jump of the transversal derivative. The jump of the connection is then

Equation (B.3)

This gives

Equation (B.4)

and

Equation (B.5)
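The jump relations used up to this point can be sketched in the following hedged form. The sketch assumes the normalization ${n}_{\mu }{\ell }^{\mu }=1$ between the normal one-form and the rigging, a metric that is continuous across Σ (so that only its transversal derivative jumps), and indices raised with ${g}^{\mu \nu }$, with $\xi ={g}^{\mu \nu }{\xi }_{\mu \nu }$. These conventions are assumptions made for illustration, and the displays are not copied from the paper's own numbered equations, which may be arranged differently.

```latex
% Assumptions: n_\mu \ell^\mu = 1, [g_{\mu\nu}] = 0, \xi_{\mu\nu} = [g_{\mu\nu,\ell}],
% \xi = g^{\mu\nu}\xi_{\mu\nu}, \Gamma_\lambda = \Gamma^\mu_{\lambda\mu},
% \Gamma^\kappa_\ast = \Gamma^\kappa_{\mu\nu} g^{\mu\nu}.
\begin{align*}
[\partial_{\kappa} g_{\mu\nu}] &= n_{\kappa}\,\xi_{\mu\nu}, \\
[\Gamma^{\kappa}_{\mu\nu}] &= \tfrac{1}{2}\left( n_{\mu}\,\xi^{\kappa}{}_{\nu}
   + n_{\nu}\,\xi^{\kappa}{}_{\mu} - n^{\kappa}\,\xi_{\mu\nu} \right), \\
[\Gamma_{\lambda}] &= \tfrac{1}{2}\, n_{\lambda}\,\xi, \qquad
[\Gamma^{\kappa}_{\ast}] = n_{\mu}\,\xi^{\kappa\mu} - \tfrac{1}{2}\, n^{\kappa}\,\xi .
\end{align*}
```

The second line follows by inserting the first into the Christoffel formula, and the two contractions in the third line follow from it using the symmetry of ${\xi }_{\mu \nu }$. Note that no assumption on the causal character of Σ enters: ${n}^{\kappa }={g}^{\kappa \lambda }{n}_{\lambda }$ is allowed to be tangent to Σ at null points.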

We also have

Equation (B.6)

With these, we can write

Equation (B.7)

Contracting with ${n}_{\nu }$ gives

Equation (B.8)

Since $\left[{M}^{\mu \nu }\right]$ is symmetric, this implies that it is tangential to Σ with $\left[{M}^{\mu \nu }\right]=\left[{M}^{ab}\right]{e}_{a}^{\mu }{e}_{b}^{\nu }$, and these components are given by

Equation (B.9)

In order to proceed, we write

Equation (B.10)

where we have used $\left[{H}_{ab}\right]=\frac{1}{2}{e}_{a}^{\mu }{e}_{b}^{\nu }{\xi }_{\mu \nu }$ and defined

Equation (B.11)

Then

Equation (B.12)

and inserting these back into $\left[{M}^{ab}\right]$ gives

Equation (B.13)

Here all terms involving ${\xi }_{a}^{\ell }$ and ξ cancel, and the remaining terms are

Equation (B.14)

Footnotes

  • This terminology is not universal. Some authors refer to the Lanczos equation itself as a junction condition, even if the surface energy–momentum tensor does not vanish.

  • The work by Barrabès and Israel imposes the condition ${n}^{\mu }{n}_{\mu }=\text{const}$, i.e. the length of the normal vector is constant along the hypersurface. This formally restricts their formalism to pure hypersurfaces. However, the shell equation obtained therein agrees (after the differences in conventions have been addressed) with that of e.g. Mars and Senovilla, which is valid for generic shells, showing that this condition is not imperative in the derivation of the shell equation.

  • This terminology has been borrowed from Misner et al [16].

  • In ${C}^{1}$ coordinates, the continuity of the first fundamental form is equivalent to the continuity of the spacetime metric on the shell. See Clarke and Dray [8] as well as the comments in [10, 13] for proof.

  • Or is at least ${C}^{3}$ to ensure both the field equations and the Bianchi identities exist as regular functions.

  • de Rham deals only with currents in [51], not general tensor distributions. However his arguments are straightforward to generalize to tensor distributions, in fact to distributions modelled on sections of arbitrary vector bundles.

  • Note that in a shell spacetime, ${\mu }_{g}$ is merely continuous on Σ. This is not a problem, as some distributions (those that can be identified with Radon measures), including the various Dirac deltas, can also be seen as linear functionals on continuous, rather than smooth, test functions [51, 53].

  • Strictly speaking, those differential equations for which the number of equations agrees with the number of unknown functions, which are referred to as source equations in e.g. [56]. Euler–Lagrange equations are always source equations.
