An omnidirectional seismic image extension

This paper introduces an operator, which extends the Born operator for the linearized variable density acoustic wave equation. Inspired by angle dependent scattering, I will define it as acting on distributions depending on space and angle coordinates. I will show that it can be written as a weighted integral of space-shift extended operators associated to arbitrary directions after a suitable coordinate transform. I will also formulate conditions under which its normal operator is a pseudo-differential operator. These will be seen to be less restrictive than the ones imposed on horizontal space-shift extended operators in the sense that they allow diving wave paths.


Introduction
Seismic inversion aims at reconstructing material parameters describing the earth's interior from seismic data, which consist of measurements of seismic waves at the surface of the earth. The simplest model for seismic wave propagation is the constant density acoustic wave equation Original Content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Here p(⃗ x, t) is the pressure field resulting from a source function w(⃗ x, t). Choosing w(⃗ x, t) = δ(⃗ x −⃗ x s )δ(t) and imposing causality, p = 0 for t < 0, leads to the causal fundamental solution G(⃗ x,⃗ x s , t), which describes the response of the earth due to a point source at ⃗ x = ⃗ x s at t = 0. In practice, seismic sources are reasonably approximated by point sources emitting a timedependent signature w(t), i.e. one measures the response of the earth due to a source function of the form w(⃗ x, t) = δ(⃗ x −⃗ x s )w(t), which is given by p = w * G. In this simple setting, the task at hand is to reconstruct the velocity function c(⃗ x) from measurements of the source signature w(t) and of the pressure field p(⃗ x r ,⃗ x s , t) depending on sources ⃗ x s = (x s , y s , z s ) and receivers ⃗ x r = (x r , y r , z r ) placed at the surface of the earth, i.e. for z s = z r = 0, at time t > 0.
A fruitful approach is to assume a scale decomposition for the velocity, i.e.
where the so-called background velocity c 0 (⃗ x) is a smooth function explaining the kinematics of wave propagation and where δc(⃗ x) contains the singularities of the velocity function, which are responsible for scattering of the wave field, see, e.g. the review paper [23]. The first order perturbation of the wave equation around c 0 (⃗ x) with respect to a velocity perturbation δc(⃗ x) is given by where G 0 is the fundamental solution of the wave equation with velocity c 0 (⃗ x) and δG is the first order perturbation of G. The approximation G = G 0 + δG is known as the Born or single scattering approximation of wave propagation.
The solution operator B[c 0 ] : V(⃗ x) := 2δc(⃗ x)/c 3 0 (⃗ x) → δG(⃗ x r ,⃗ x s , t) of equation (3) takes the from This operator is known as the Born operator of constant density acoustics. Rakesh [18] has shown that it is an order 0 Fourier integral operator under mild conditions. Moreover, under a traveltime injectivity condition formulated in [25], its normal operator is a pseudo-differential operator. This provides the framework for asymptotic linearized inversion, where one reconstructs the highest order singularities in V and δc. Assuming linearized data deconvolved for the source signature w(t), i.e. d = δG, the imaging formula is evaluated up to highest order asymptotics in terms of quantities that can be calculated through ray tracing, see, e.g. [1]. Alternatively, one can obtain the right hand side of formula (5) by minimizing the objective function for sufficiently small ϵ. The latter formulation is also suitable in a wave equation based framework. Since pseudo-differential operators do not move singularities, the location of singularities in V can be found by just using the adjoint of B[c 0 ] in formula (5). This simple procedure has been standard in seismic imaging for a long time.
To solve the much harder, non-linear problem of estimating c 0 (⃗ x), one can make use of the concept of space-shift dependent imaging, which was introduced by Claerbout [7][8][9] in the 70s under the name of survey sinking. His idea was to move or 'redatum' the acquisition surface from z = 0 to arbitrary depth z by back propagation of data using fundamental solutions G ′ 0 (⃗ x s ,⃗ x − ⃗ h, t) and G ′ 0 (⃗ x r ,⃗ x + ⃗ h, t) of the acoustic wave equation with smooth velocity c ′ 0 (⃗ x), where ⃗ h = (h 1 , h 2 , 0) t is a so-called horizontal space-shift vector. The sunken sources and receivers are located at ⃗ y s = ⃗ x − ⃗ h and ⃗ y r = ⃗ x + ⃗ h and are obviously related by a horizontal space-shift, Back propagation formally corresponds to the application of the adjoint of an order 0 Fourier integral operatorB hor [c 0 ] : with A an order 0 pseudo-differential operator, which annihilates distributions r(⃗ x, ⃗ h) focused at ⃗ h = 0. A suitable choice is A = | ⃗ h|. This strategy for velocity estimation is called differential semblance optimization and was introduced by Symes [23]. The original formulation was based on using the adjoint ofB hor [c 0 ] rather than its generalized inverse in the objective function (9). Later formulations [6,14,16,17,24] recognized that using the generalized inverse, or an approximation thereof, leads to a better objective function. The horizontal space-shift extended Born operator has been studied by several authors, see e.g. [22,26]. If rays connecting surface and subsurface locations have no horizontal tangents along their trajectories, its normal operator is a pseudo-differential operator. Its symbol has been calculated in [26] and has been used to define a pre-conditioning scheme to efficiently calculateB hor [c 0 ] † d in [12,13]. This also leads to an approximation ofB hor [c 0 ] † , which can be calculated with almost the same computational effort as the adjointB hor [c 0 ] * and leads to robust velocity inversion results, see [14].
The only serious downside to the framework of velocity estimation based on horizontal space-shift extended Born modeling outlined above is the exclusion of diving or turning rays as a condition for the normal operator to be pseudo-differential. These are rays that have segments in the downward and in the upward directions. They occur frequently in practice for velocity functions which are increasing with depth. A very simple example is the linearwith-depth function c(z) = c 0 + kz, in which ray paths are parts of circles. The exclusion of such diving rays is a much stronger condition than the traveltime injectivity condition used in non-extended imaging. It precludes the reconstruction of vertical reflectors, i.e. distribu- with singularities in the horizontal spatial directions. Vertical or nearvertical reflectors are quite common in complex geologies and diving rays play a crucial role in imaging them. It is highly desirable to build an extended imaging framework incorporating diving rays, as this would allow to use scattering caused by such reflectors in velocity estimation. In addition, it would allow to deal with two-sided illumination of reflectors as pioneered in [3] and incorporate reflections from 'above' and from 'below' in velocity estimation. Trying to build such a framework for extended imaging and velocity analysis was the main motivation for the work underlying this paper.
The key idea is to start with linearized angle dependent scattering, which is natural in the case of variable density acoustics. In section 2, I will explain how this leads to the Born operator of variable density acoustics acting on angle dependent scattering potentials of the form Here ρ stands for density and ρ(⃗ x) = ρ 0 (⃗ x) + δρ(⃗ x) is a scale decomposition similar to the one for velocity in equation (2). I will show that such an operator can be elegantly expressed by introducing an angle function θ e (⃗ p r ,⃗ p s ), defined on a conic set Ω ⊂ R 3 × R 3 , which extends the normal concept of scattering angle in the sense that it coincides with the normal scattering angle on the five-dimensional subspace defined by |⃗ p r | = |⃗ p s |. While introduced as an operator acting on scattering potentials of the form (10), which are clearly smooth in the angular variable, it will be seen to be equally applicable to scattering potentials, which are non-smooth in that variable. Taking this one step further, I will equalize the number of model variables to the number of data variables by introducing a scattering azimuth function ϕ e (⃗ p r ,⃗ p s ) extending the normal scattering azimuth and I will introduce an operator acting on distributions V(⃗ x, θ, ϕ) in spatial and angle coordinates. For suitable functions θ e (⃗ p r ,⃗ p s ) and ϕ e (⃗ p r ,⃗ p s ), the operator B e [c 0 ] is a Fourier integral operator of order 1 extending the normal variable density Born operator to distributions in ⃗ x, θ and ϕ. The vectors ⃗ p r and ⃗ p s occurring in the extended angle-azimuth functions will be seen to be tangents to rays arriving in subsurface locations ⃗ y r and ⃗ y s from surface locations ⃗ x r and ⃗ x s . Stationarity of the phase function of B e [c 0 ] requires that Here p θ and p ϕ are variables conjugate to θ and ϕ arising in the description of the wavefront relation of B e [c 0 ]. This space-shift is in general different from the horizontal one in equation (7). It is zero for p θ = p ϕ = 0, i.e. when acting on scattering potentials, which are smooth in the angular variables. In other words, the restriction of any extension B e [c 0 ] to such distributions coincides with the normal variable density Born operator, as it should. I will now sketch the construction of the angle-azimuth functions θ e (⃗ p r ,⃗ p s ) and ϕ e (⃗ p r ,⃗ p s ) as the solutions of a boundary value problem. They coincide with the normal angle-azimuth functions on the boundary manifold defined by |⃗ p r | = |⃗ p s | and are governed by a linear partial differential equation Lθ e = Lϕ e = 0. Here L is the partial differential operator with⃗ v(⃗ p r ,⃗ p s ) a vector, which can be chosen freely a priori. The only restriction is that⃗ v(⃗ p r ,⃗ p s ) · (⃗ p r +⃗ p s ) = 0 at the boundary manifold. This means that the characteristics of L are not tangent to the boundary manifold, so the boundary value problem is well-posed in the sense that it has local solutions. The space-shift (12) is orthogonal to ⃗ v(⃗ p r ,⃗ p s ), so it is determined by the choice of this vector. If⃗ v(⃗ p r ,⃗ p s ) is of the form⃗ v(⃗ p r +⃗ p s ), it is constant along the characteristics, which will therefore be straight lines ⃗ p To arrive at horizontal space-shifts, one would have to choose ⃗ v = ⃗ e 3 , the unit vector in the vertical direction. The resulting angle function θ e = θ hor is different from the normal angle functionθ(⃗ p r ,⃗ p s ) defined by because ⃗ e 3 · (∇ ⃗ pr − ∇ ⃗ ps )θ = 0, which will be demonstrated in section 2. Unfortunately, θ hor (⃗ p r ,⃗ p s ) is not defined in points (⃗ p r ,⃗ p s ) for which |⃗ p r | = |⃗ p s | and (⃗ p r +⃗ p s ) ·⃗ e 3 = 0. This is related to the above-mentioned fact that vertical reflectors cannot be reconstructed by inverting horizontal space-shift extended Born operators.
Choosing v(⃗ p r ,⃗ p s ) = ⃗ p r +⃗ p s is clearly a better idea, as it leads to angle functions θ e = θ(⃗ p r ,⃗ p s ), ϕ e = ϕ(⃗ p r ,⃗ p s ) which are well defined away from |⃗ p r | = |⃗ p s | as long as ⃗ p r +⃗ p s = 0. The associated angle-azimuth extended operator B[c 0 ] will be the focus of this paper. Its spaceshifts (12) are orthogonal to the sum of the ray tangents ⃗ p r +⃗ p s . In section 2, I will show that θ is again different fromθ, except in the case of constant velocity, so B[c 0 ] is in general different from the operatorB[c 0 ] defined by the standard anglesθ andφ.
Following Fomel, Sava and Rickett [19][20][21], I will introduce a Radon transform R : This transform is easily seen to be invertible and therefore induces a transformation of variables between (⃗ x, θ, ϕ) and The first result of this paper is thatB[c 0 ] can be written as a weighted sum over the unit sphere of operatorsB ν [c 0 ] associated to space-shifts orthogonal to a vector on the unit sphere characterized by a tuple of polar angles ν = (ν 1 , ν 2 ) The weights π ν are Fourier integral operators forming a partition of the unity operator, i.e. S 2 d 2 ν π ν = 1. The space-shift extended operatorsB ν [c 0 ] have been considered by many authors, e.g. [3,4,27,28]. Equation (17) puts all of them on the same footing in a natural manner and there is clearly no preferred direction such as the horizontal one.
Note that the variables h 1 , h 2 appearing in (16) are no longer the coordinates of a horizontal space-shift vector in this setup. Instead, they define a space-shift orthogonal to the direction defined by the polar angles ν = (ν 1 , ν 2 ) for eachB ν [c 0 ] in the integral (17). I stress that (the inverse of) the Radon transform (15) links the space-shift extended operatorB[c 0 ] to B[c 0 ], not toB[c 0 ], a difference which is relevant when acting on angle-azimuth dependent images which are non-smooth (or non-flat) in the angular directions. This also means that the angles appearing in so-called wave equation based angle gathers, see [4], are θ, ϕ, which will differ from the normal anglesθ,φ in the case of an erroneous estimate c 0 .
In section 4, I will analyze the wavefront relation ofB[c 0 ]. I will introduce an extended traveltime injectivity condition and show that under this condition it is locally, in the neighborhood of the wavefront relation of B[c 0 ], the graph of a map T * Y → T * X e from the cotangent bundle of data space to the cotangent bundle of space-shift extended image space. This neighborhood can be taken equal to the full wavefront relation ofB[c 0 ] by adding an extra condition on directional derivatives of c 0 . Under these conditionsB[c 0 ] * B [c 0 ] is a pseudodifferential operator, which is the second result of this paper. The third one is an expression for its principal symbol and a preconditioned least squares inversion scheme defined in the spirit of Hou and Symes, [12,13], which I will derive in section 5. The various conditions mentioned above replace the 'no-horizontal-ray-tangent' conditions formulated in [26] to derive the pseudo-differential property of the normal operatorB hor [c 0 ] * B hor [c 0 ]. The new conditions are less restrictive in the sense that they do allow diving rays, which was the main goal of this paper.
The results in this paper are satisfactory in the sense that they allow diving waves in extended imaging and velocity inversion from a theoretical perspective. This said, I do not know at this moment how to implement them numerically. I hope this paper will motivate other researchers to search for such an implementation.

Angle dependent scattering
In this section, I will be more precise on how to define angle dependent scattering for distributions which are not smooth in the angular variables. I will start with a short recap of linearized angle dependent scattering in an acoustic medium along the lines of Beylkin and Burridge [1]. Consider a three-dimensional acoustic medium characterized by sound speed c(⃗ x) and bulk density ρ(⃗ x). Wave propagation in such a medium is described by the acoustic wave equation be a decomposition in smooth and singular parts of the medium parameters, denote by G 0 Green's function associated to (c 0 , ρ 0 ) and by δG the first order perturbation of G with respect to these medium parameters. Using a high frequency approximation, it is easy to show that with V(⃗ x, θ) the angle dependent scattering potential defined in equation (10) and θ(⃗ x r ,⃗ x,⃗ x s ) half the angle between the tangent vectors to the rays arriving in ⃗ x from source and receiver locations ⃗ x s and ⃗ x r respectively. In general, there can of course be multiple rays connecting two points in a medium. This can be addressed by means of a multi-valued angle function θ(⃗ x r ,⃗ x,⃗ x s ), but this will lead to a rather ugly formula, which explicitly refers to the different rays connecting source and receiver points to subsurface locations. To avoid this, I insert the trivial identity and a similar one for Green's function associated to the source location ⃗ x s into equation (19). The slowness vectors ⃗ p r and ⃗ p s are tangent to rays arriving in ⃗ x and originating from ⃗ x r and ⃗ x s respectively and it is natural to introduceθ(⃗ p r ,⃗ p s ) as half the angle between these vectors. More formally,θ is defined by equation (14), which shows it is homogeneous of degree 0 in ⃗ p r and ⃗ p s . This allows me to rewrite equation (19) in the form Note that the summation over multiple arrivals is implicit in the integration over slowness variables ⃗ p r and ⃗ p s in this formula.
Equation (21) defines a map from distributions V(⃗ x, θ) which are smooth in θ to linearized scattering data δG(⃗ x r ,⃗ x s , t). In this section I will construct two different extensions of this map to Fourier Integral Operators B e : V(⃗ x, θ, ϕ) → d(⃗ x r ,⃗ x s , t), where V(⃗ x, θ, ϕ) are distributions which may be non-smooth in θ and ϕ. Contrary to the introduction, I will hide the dependence of various Born operators on the velocity c 0 from here on. I will start by introducing an azimuth functionφ(⃗ p r ,⃗ p s ) next to the already defined angle functionθ(⃗ p r ,⃗ p s ). Its construction follows Beylkin and Burridge [2], who restricted themselves to unit vectors ⃗ p r ,⃗ p s ∈ S 2 . It can however be easily extended to all independent vectors (⃗ p r ,⃗ p s ) ∈ R 3 × R 3 . Define a positively oriented, orthonormal basis {⃗ e 1 (⃗ p ),⃗ e 2 (⃗ p ),⃗ e 3 (⃗ p )} by where Note that the maps ⃗ p ∈ R 3 → ⃗ e i (⃗ p ) ∈ S 2 are only defined for p = 0, p h = 0. This is dealt with in remark 2, where I will argue that contributions from p = 0 and p h = 0 do not contribute to the integral kernels (32) and (46). Using the vectors ⃗ e i (⃗ p ), I introduce two orthonormal bases depending on ⃗ p r and ⃗ p s . The first one consists of vectors ⃗ u i (⃗ p r ,⃗ p s ) defined by A second orthonormal basis {⃗ v i (⃗ p r ,⃗ p s ), i = 1, 2, 3} is introduced as follows. The third basis vector is simply chosen as whereas the first one is chosen in the plane spanned by ⃗ p r and ⃗ p s . Since it has to be orthogonal to ⃗ v 3 and has to have norm 1, this determines ⃗ v 1 up to a sign. Defining ⃗ e r := ⃗ p r /|⃗ p r |, ⃗ e s := ⃗ p s /|⃗ p s | and choosing angles α, β ∈ [0, π] such that it is easy to see that for some ϵ = ±1 and we can obviously use the freedom in ⃗ v 1 to choose ϵ = +1. Here 2θ is the angle between ⃗ e r and ⃗ e s , see equation (14). Solving ⃗ v 1 from these equations, one finds which, in combination with equation (26), can also be seen as definition for ⃗ v 1 (⃗ p r ,⃗ p s ). The definition of the second basis is completed by Since the pairs of vectors {⃗ u 1 ,⃗ u 2 } and {⃗ v 1 ,⃗ v 2 } are both orthonormal bases of the space of vectors orthogonal to ⃗ p r +⃗ p s , they can be obtained from each other by a rotation. This finally defines the azimuth angleφ(⃗ p r ,⃗ p s ): Note that the vectors ⃗ u i , ⃗ v i , i = 1, 2, 3 are homogeneous of degree 0 in ⃗ p r and ⃗ p s , hence the same holds for the azimuth angleφ. The definitions ofθ andφ lead to the following naive extension operator B e =B of the variable density Born operator. Definition 1. Letθ(⃗ p r ,⃗ p s ) be half the angle between ⃗ p r and ⃗ p s and letφ(⃗ p r ,⃗ p s ) be as defined in equation (30). The operatorB : with integral kernel Hereχ(p) is a symbol of order 0 in p, tapering away a neighborhood of p = 0.

Proposition 1.
If there are no grazing rays, i.e. rays arriving at the acquisition plane z = 0 in a horizontal direction,B is a Fourier Integral Operator of order 1, which extends the map V(⃗ x, θ) → δG defined in (21) to distributions V(⃗ x, θ, ϕ), which may be non-smooth in the angular directions θ and ϕ.
Proof.B has a phase function where the so-called redundant coordinates ⃗ R are given by In appendix A it is shown that in the absence of grazing rays and away from ⃗ p r +⃗ p s = 0 this phase function is non-degenerate on the stationary set i.e. that the rank of the 15 × 25 matrix M : The order of an FIO is given by µ + N/2 − (n X + n Y )/4, see [10]. Here µ is the order of the amplitude in the kernel expression (32), N is the number of redundant variables, n X the number of variables in the tuple To calculate µ, one needs to redefine the redundant variables such that the phase function Φ is homogenous of degree 1 in the redefined variables. This can be achieved by choosing This leads to a factor ω −6 . Combining this with the second time derivative, shows that µ = −4, whence the order ofB is 1.

Remark 1.
One can obviously perform the integrals with respect to θ and ϕ in the expression for the integral kernel (32), which would lead to delta functions δ(θ −θ) and δ(ϕ −φ). I have kept the integrals because the terms p θ (θ −θ) and p ϕ (ϕ −φ) belong to the phase function of B, for which one needs to prove non-degeneracy. (28)-(30) that the vectors ⃗ v 1 and ⃗ v 2 and therefore the azimuth functionφ(⃗ p r ,⃗ p s ) are not defined forθ = 0, π/2, i.e. when the vectors ⃗ p r and ⃗ p s are dependent. The contributions ofθ = 0, π/2 to the integral kernels (32) and (46) below have weight 0 however because the coordinate transformation
The latter can be derived from the explicit form of this transformation where {⃗ n i (ν), i = 1, 2, 3} is the orthonormal basis associated to ν = (ν 1 , ν 2 ) defined in (57). These formulas also show that p = 0 corresponds toθ = π/2 and that p h = 0 corresponds to ν 1 = 0. Contributions from p = 0 and p h = 0 therefore do not contribute to the integral kernels either.
The consequence of the extension to distributions which are non-smooth in the angular directions is that one cannot impose stationarity with respect to the θ and ϕ variables. This means that the integration variables ⃗ y r and ⃗ y s do not coincide anymore at stationarity. Instead, they will be shifted in space according to the relation A straightforward calculation, which can be found in appendix B, shows that Since the slownesses ⃗ p r and ⃗ p s are tangent to rays in ⃗ y r and ⃗ y s respectively, they may have different lengths at stationarity, so the equations (39)-(40) mean that the shift between⃗ y r and⃗ y s is non-zero and will in general have a component along⃗ v 3 = (⃗ p r +⃗ p s )/ |⃗ p r +⃗ p s |. The following lemma introduces an alternative definition of the angle between ⃗ p r and ⃗ p s , which will lead to space-shifts between ⃗ y r and ⃗ y s orthogonal to ⃗ v 3 . It is based on the idea of solving a boundary value problem, which was outlined in the introduction.
, which coincide withθ(⃗ p r ,⃗ p s ) and ϕ(⃗ p r ,⃗ p s ) respectively for p r = p s and are annihilated by the linear partial differential operator They are given by These functions satisfy The proof of this lemma can be found in appendix C.
showing that θ(⃗ p r ,⃗ p s ) andθ(⃗ p r ,⃗ p s ) differ in general, but coincide for p r = p s , as in this case α = β =θ. The latter is the case for constant velocity media.
Lemma 1 obviously leads to a second extension operator B e =B.
with integral kernel Hereχ(p) is again a taper function similar to the one in proposition 1.

Proposition 2.
If there are no grazing rays, i.e. rays arriving at the acquisition plane z = 0 in a horizontal direction,B is a Fourier integral operator of order 1, which extends the map V(⃗ x, θ) → δG defined in (21) to distributions V(⃗ x, θ, ϕ), which may be non-smooth in the angular directions θ and ϕ.
Proof. The proof is completely similar to the one of proposition 1. Non-degeneracy of the phase function ofB is demonstrated in appendix A.
For completeness sake I also provide the full expressions for the derivative ∇ ⃗ pr − ∇ ⃗ ps onθ andφ in the following lemma, the proof of which is left to the reader.
Note that the first formula implies that ⃗ e 3 · ∇ ⃗ pr − ∇ ⃗ ps θ is in general unequal to 0 as mentioned in the introduction. This shows that θ hor =θ.

Space-shift extended Born scattering
In this section I will show how the angle extended Born operator B is connected to the more familiar space-shift extended Born operators. The connection is provided by the invertible Radon transform R : (15). I will show that the space-shift extended Born operatorB = BR can be cast in the form of a weighted integral over the unit sphere of Born operators extended by space-shifts orthogonal to the directions being integrated over. The weights will be seen to be Fourier integral operators which form a partition of the unity operator. The connection between space-shift dependent images and angle dependent ones by a Radon transform was first formulated by Fomel, Sava and Rickett, [19][20][21]. These authors restricted themselves to horizontal space-shifts, characterized by a vector (h 1 , h 2 , 0) t , whereas I will be dealing with space-shifts of the form Using the notation the definition (15) for R can be rewritten in the compact form The following lemma formulates the essential properties of this Radon transform.
Proof. Non-degeneracy of the phase function of R is trivial, as is the calculation of its order The formula for the inverse is demonstrated by a straightforward calculation of the products R −1 R and RR −1 , which I will leave to the reader.
The Radon transform R and its inverse clearly provide coordinate transforms between (⃗ x ′ , h 1 , h 2 ) and (⃗ x, θ, ϕ), θ = 0, π/2. The cases θ = 0, π/2 are dealt with in remark 2. From a physical point of view, the angular variables may be most natural, as they are the actual scattering angles for the case of distributions smooth in these variables. From a mathematical point of view, the analysis of the wavefront relation ofB = BR will be seen to be easier in the space-shift coordinates though.
thenB is a Fourier integral operator of order 0. Its action is given by with kernelK defined bỹ

Hereχ(p) is the taper function introduced in the defining relation (46) for K.
Proof. The proof can be found in appendix D.
For completeness sake, I will briefly discuss the case of horizontal space-shift extended operators. In this case, one needs to start from an angle-azimuth extended operator B hor defined by the functions θ hor , ϕ hor , which are solutions of ⃗ e 3 · ∇ ⃗ pr − ∇ ⃗ ps θ hor = 0, ⃗ e 3 · ∇ ⃗ pr − ∇ ⃗ ps ϕ hor = 0 and coincide withθ,φ for |⃗ p r | = |⃗ p s |. One also needs to use a Radon transform R hor with phase function ⃗ k(⃗ p, θ, ϕ) · ⃗ h where ⃗ h = h 1 ⃗ e 1 + h 2 ⃗ e 2 is horizontal rather than orthogonal to ⃗ p. This is the content of the following proposition, the proof of which can be found again in appendix D.
Then R hor has an inverse which is well defined on distributions V(⃗ x, θ, ϕ) whose wavefront set does not contain points (⃗ x, θ, ϕ,⃗ p, p θ , p ϕ ) for which p 3 = 0. Its kernel is given by Moreover, the operatorB hor := B hor R hor is given by the kernel (8).
I will conclude this section by a directional decomposition ofB which follows readily from the expression (54) by splitting the ⃗ p integral into an integral over the length p and the polar angles ν = (ν 1 , ν 2 ) associated to the unit vector ⃗ e 3 (⃗ p ). Introducing the notation the space-shift vector ⃗ h(⃗ p ) can be written as With this notation, the space-shift extended Born operator associated to the direction ⃗ n 3 (ν) can be defined as with kernelK In the sequel I will simply refer toB ν as the space-shift extended Born operator of type ν. It is easy to see from equation (60) that the order ofB ν is 0.
Using the notation d 2 ν = sin ν 1 dν 1 dν 2 , it is clear that which shows that the Fourier Integral Operators π ν form a partition of the unity operator. The order of π ν is 1.

Proposition 5.
LetB ν be the space-shift extended Born operator of type ν defined in (59, 60), π ν the operator defined in (61) and χ the pseudo-differential operator with symbolχ introduced in proposition 1. The space-shift extended Born operatorB = BR has the following directional decompositioñ Note that the integration over ν 1 , ν 2 in the right hand side of (63) lowers the order of the product B ν π ν by 1, leading toB having order 0 as stated in proposition 3. Formula (63) can be made identical to formula (17) from the introduction by absorbing the pseudo-differential operator χ in the definition ofB ν .

Analysis of the wavefront relation ofB
In this section I will analyze the wavefront relation Λ(B) ofB. My approach will be twopronged. First, I will show that stationarity of the phase function ofB implies that there are points on ray trajectories starting in ⃗ x r and ⃗ x s , whose connecting vector is orthogonal to the sum of the slowness vectors in these points. This should not come as a surprise, because the angle functions θ(⃗ p r ,⃗ p s ) and ϕ(⃗ p r ,⃗ p s ) defining B =BR −1 have been constructed such that the space-shifts (12) are orthogonal to ⃗ p r +⃗ p s , see lemma 1. I will show that this property locally describes Λ(B) as the graph of a smooth map T * Y → T * X e in a conic neighborhood Ω ⊂ Λ(B) of any point in Here ⃗ x r ,⃗ x s , t,⃗ κ r ,⃗ κ s , ω are coordinates in T * Y, the cotangent bundle of data space, ⃗ x ′ , ⃗ h,⃗ p,⃗ q are coordinates on T * X e , the cotangent bundle of space-shift extended image space. I will also formulate an extended Traveltime Injectivity Condition under which this description can be extended to a conic neighborhood Ω of Λ 0 (B) in Λ(B). The second part of the analysis will focus on the description of Λ(B) as a one-to-two relation between T * X e and T * Y, which can be done explicitly. Connecting the two approaches will lead to an explicit description of Λ(B) in proposition 7.
The phase function ofB can be read off from equations (59)-(61), (63) and the most singular Here τ (⃗ x,⃗ y) is the traveltime along a ray from ⃗ x to ⃗ y, A(⃗ x,⃗ y) an amplitude which does not play a role in the phase function. Putting this all together leads to where λ i , i = 1, 2 are given by Moreover, stationarity with respect to the redundant variables ⃗ x, ω, p and ν i , i = 1, 2 in the integral kernel ofB leads to the relations Equations (66)-(68) define Λ(B). The last two of these equations determine a space-shift with δ 1 , δ 2 functions of ⃗ x, ⃗ h, ⃗ p and ⃗ q, so ⃗ x ′ is a function of these variables. The other equations state that there are rays starting in ⃗ x r and ⃗ x s in the downward directions (⃗ κ r , (ω 2 /c 2 0 (⃗ x r ) − ⃗ κ 2 r ) 1/2 ) and (⃗ κ s , (ω 2 /c 2 0 (⃗ x s ) − ⃗ κ 2 s ) 1/2 ) to ⃗ x + ⃗ h ν and ⃗ x − ⃗ h ν respectively, such that the sum of their slowness vectors in these points is in the ν-direction, i.e. orthogonal to the space-shift 2 ⃗ h ν defined in (58). Moreover, the total traveltime along these ray trajectories equals t.
Writing ⃗ y r = ⃗ y r (t r ;⃗ x r ,⃗ κ r , ω) and ⃗ y s = ⃗ y s (t − t r ;⃗ x s ,⃗ κ s , ω) for these ray trajectories and ⃗ p r = ⃗ p r (t r ;⃗ x s ,⃗ κ r , ω) and ⃗ p s = ⃗ p s (t − t r ;⃗ x s ,⃗ κ s , ω) for the associated slowness vectors, this orthogonality property can be expressed as which is of the form F(t r ;⃗ x r ,⃗ x s , t,⃗ κ r ,⃗ κ s , ω) = 0. From the above it is clear that for each (70). The implicit function theorem guarantees the existence of a local solution t r = T r (⃗ x r ,⃗ x s , t,⃗ κ r ,⃗ κ s , ω), provided that F tr = 0. Using this function T r and the equations (66)-(69), one can locally construct (⃗ x ′ , ⃗ h,⃗ p,⃗ q ) as functions of (⃗ x r ,⃗ x s , t,⃗ κ r ,⃗ κ s , ω).
The t r -derivative of F is calculated as follows where, using that In (71) I have also used the obvious notation h = | ⃗ h| and the simple identity Points in Λ(B) for which the right hand side of (71) does not vanish have a conic neighborhood Ω = (Ω X × Ω Y ) ∩ Λ(B), which can be written as the graph of a map from Ω Y ⊂ T * Y → Ω X ⊂ T * X e . Equivalently, the projection π Y : Ω → Ω Y is an injective immersion. This is certainly the case for points in Λ 0 (B), as the right hand side of (71) reduces to 4ω cos 2θ = p 2 /(ωs 2 ) in this case. This local result can be extended to a neighborhood Ω ⊂ Λ(B) of the full manifold Λ 0 (B) under the following condition, which extends the one formulated in [25].
Assuming eTIC, the projection π Y is injective and immersive on the union of all conic neighborhoods of points on Λ 0 (B) which can be described as local graphs of maps, so it can be represented as the graph of a map Note that eTIC is trivially satisfied if the velocity is a constant. Rays are straight lines of the form ⃗ y r = ⃗ x r + c 2 t r ⃗ p r , ⃗ y s = ⃗ x s + c 2 t s ⃗ p s , with constant slowness vectors ⃗ p r and ⃗ p s in this case, so the equation (⃗ y r −⃗ y s ) · (⃗ p r +⃗ p s ) = 0 leads to so the map (⃗ y r ,⃗ p r ,⃗ y s ,⃗ p s ) ∈ Σ rs → t r + t s is indeed injective. In appendix E I will show that eTIC is also satisfied for velocity functions of the form c(⃗ x) = c 0 + kz, with two well-defined exceptions. Ray paths in such velocity models are circles, so these velocity functions provide examples where eTIC allows for diving ray paths. The two exceptions are illustrated in figures 3 and 4 in appendix E. Note that in the limit h = | ⃗ h| → 0 they correspond to scattering over π, which is addressed through the taper functionsχ(⃗ p in propositions 1 and 2. In fact, it is clear from figures 3 and 4 that such taper functions also deal with the two exceptions in a neighborhood of Λ 0 (B). I now turn to the second part of the analysis and focus on the projection π X : Λ(B) → T * X e , which is in general a two-to-one map, because points ⃗ x + ⃗ h(⃗ p ) and ⃗ x − ⃗ h(⃗ p ) in the subsurface can in principle be reached by rays in the directions ∇τ (⃗ x r ,⃗ x + ⃗ h(⃗ p )) and ∇τ (⃗ x s ,⃗ x − ⃗ h(⃗ p )) or by different rays arriving in the opposite directions ∇τ ). In appendix F I have constructed the wavefront relation for the space-shift extended Born operator for the case of horizontal space-shifts in an algebraic manner. This is done by reconstructing unit ray tangent vectors at starting points and ⃗ q = q 1 ⃗ e 1 + q 2 ⃗ e 2 and following the rays defined by these starting points and directions back to the measurement surface z = 0. It turns out that the unit ray tangent vectors are determined up to a sign, so the construction yields a 1-to-2 relation between T * X e and T * Y, which is of course the inverse image of the projection π X .
The results for horizontal space-shifts are formulated in proposition 8 and carry over trivially to arbitrary space-shifts. They do in fact simplify considerably for slowness vectors orthogonal to space-shift vectors. Consequently, The definition for D makes sense if q 2 < p 2 tan 2 θ c (⃗ x, ⃗ h,⃗ p ). The Radon transform R introduced in (15) changes coordinates (⃗ x ′ , ⃗ h,⃗ p,⃗ q ) to (⃗ x, θ, ϕ,⃗ p, p θ , p ϕ ) such that q = p tan θ. This means that in angular coordinates D is defined for θ < θ c (⃗ x, ⃗ h(θ, ϕ, p, p θ , p ϕ ),⃗ p ), which explains the nomenclature of critical angle. A little further algebra starting from equations (221) leads to the following unit tangent vectors to rays in ⃗ with ϵ = ±1 and From these expressions for λ and µ one easily derives the identities whence All the formulas above simplify enormously of course for ⃗ h = 0, as in this case s + = s − , θ c = π/2, D = 1, λ = µ = 1.

I now connect the arrival slownesses
to the departure directions ⃗ e α and ⃗ e β by setting Using equations (79), (80) and the definition cos 2θ = ⃗ p r /p r ·⃗ p s /p s = ⃗ e ϵ α ·⃗ e ϵ β , I derive the following identity, which holds for all points (⃗ x, ⃗ h,⃗ p,⃗ q ) in the image of the ray based map Ω Y ⊂ T * Y → Ω X ⊂ T * X e constructed in the beginning of this section.
I can now formulate the following proposition.

If eTIC holds and if h∂ h f
is the graph of a smooth map T * Y → T * X e andB * B is a pseudo-differential operator.
The assumption h∂ h f ν ⩾ 0 is obviously valid for constant velocity media, but also for velocity functions of the form c(⃗ x) = c 0 + kz. Indeed, in the latter case one has For media for which h∂ h f ν can become negative, we have to restrict to a subset of the wavefront relation on which the right hand side of equation (85) is positive in order for π Y to be immersive.
The following definition and proposition summarize the results obtained in this section.
Definition 3. Ω ϵ X ⊂ T * X e , ϵ = ±1 are the conic regions defined by all points (⃗ x, ⃗ h,⃗ p,⃗ q ) for which and rays starting in⃗ reach the surface of the earth. Proposition 7. The following properties hold.
That is, ⃗ x r and ⃗ x s are the coordinates of the intersections of these rays with z = 0, t is the total traveltime along these rays and

(90)
Assuming eTIC and no grazing rays, the maps F ϵ are injective and smooth. The inverse maps G ϵ : Ω ϵ Y → Ω ϵ X exist and are smooth as well.

(d)B * B is a pseudo-differential operator when acting on distributions with wavefront set in
Proof.
(a) Smoothness of the maps F ϵ follows from the fact that ray trajectories depend smoothly on their initial parameters and that the rays are not tangent to z = 0. Injectivity of F ϵ is a direct consequence of eTIC. The inverse maps G ϵ are constructed by tracing the rays back into the subsurface and imposing condition (70). Because of the second condition in (89) and because rays depend smoothly on their initial conditions, this leads to a smooth map.

(b) From equations (77)-(80) one reads off that
. This proves the second property. (c) The third property follows directly from the definitions of Ω + X and Ω − X . (d) The fourth property follows from the fact that there is a smooth map G : Note that the last equation of (90) means that Ω + Y contains only positive temporal frequencies and Ω − Y only negative ones. This means that the wavefront set of a distribution of the form B * (1 + iH t )d, with H t the temporal Hilbert transform and d = d(⃗ x r ,⃗ x, t) seismic data, will end up in Ω + X . Similarly, the wavefront set ofB * (1 − iH t )d belongs to Ω − X . With this observation the relevance of the second property in proposition 7 becomes clear. It extends a well known property of non-extended imaging in a medium without diving rays, namely that the positive frequencies reconstruct singular directions ⃗ p in the image pointing in the upward direction, the negative ones the singular directions pointing in the downward direction.

Preconditioned least squares inversion
In this section I will consider the least squares problem of finding space-shift dependent reflectivity and suitably small ϵ. This would lead toB † and a velocity inversion framework as explained in the introduction. I will calculate the principal symbol of the pseudodifferential operatorB * ∂ zr ∂ zs ∂ −2 tB and show how it can be used to precondition the least squares inversion scheme forB in the spirit of Hou and Symes [12,13]. Here ∂ zr and ∂ zs are pseudo-differential operators acting on distributions d(⃗ x r ,⃗ x s , t) in data space Y, defined by their symbols −isgn ω (ω 2 /c(⃗ x r ) 2 − ⃗ κ 2 r ) 1/2 and −isgn ω (ω 2 /c(⃗ x s ) 2 − ⃗ κ 2 s ) 1/2 . They can be interpreted as differentiations with respect to vertical receiver-source coordinates z r and z s .
A numerical implementation of this preconditioned least squares problem would of course require the repeated evaluation of quantities of the formBr andB * d, which is something I do not know how to do in a feasible manner. A naive implementation based on the expres-sionB =´d 2 νB v π ν is almost certainly out of reach computationally. The discussion in this section is therefore theoretical in nature. I hope it will support future work towards a practical implementation.
The calculation of the symbol ofB * ∂ zr ∂ zs ∂ −2 tB is along the same lines as described in [26], so I will only outline the major steps here. SinceB * ∂ zr ∂ zs ∂ −2 tB maps distributions depending on (⃗ x ′ , ⃗ h ′ ) to distributions depending on (⃗ x, ⃗ h ), its phase function is of the form Here ⃗ x 0 , ⃗ x 1 , ⃗ p, ⃗ p ′ and ⃗ x r , ⃗ x s , ω are redundant variables which are integrated over. The first step is to linearize Φ around the stationary values ⃗ x 1 = ⃗ x 0 , ⃗ h ′ = ⃗ h and ⃗ p ′ = ⃗ p. The linearized phase function looks like The next step is to change coordinates (⃗ x r ,⃗ x s ) → (α, β), with α = (α 1 , α 2 ), β = (β 1 , β 2 ) angles determining the vectors ⃗ e α = s −1 ). The Jacobian of this transformation leads to a combination of spherical spreading factors and cosines of emergence angles, which cancel against similar factors in Green's functions ) and the symbols of ∂ zr and ∂ zs , see [26]. The phase function becomes Note that the functions This leads to a phase function of the form To arrive at this result, I have used and have subsequently replaced ⃗ e 3 (⃗ p ) in the last term by its stationary value ω(s + ⃗ e α + s − ⃗ e β )/p, which is justified because the relevant term is already first order in (⃗ p ′ −⃗ p ).
Here ω = ω ϵ (⃗ x 0 , ⃗ h,⃗ p,⃗ q ) is defined in the first relation of (80). The Jacobian of the transformation (ω,⃗ e α ,⃗ e β ) → (⃗ p 0 ,⃗ q ) has been calculated in [26]. It is given by Integrating over ⃗ x 1 and ⃗ p 0 leads to Setting A somewhat lengthy calculation, which can be found in appendix G, leads to the Jacobian in which one recognizes the right hand side of ∂F/∂t r up to a factor, see (85). The second condition in proposition 7 guarantees that this Jacobian is positive on as determined by the last two equations of (68) in all amplitudes, which restricts the ⃗ x 0 -dependence to the phase function. Integrating over ⃗ x 0 and ⃗ x 1 and taking into account the various Jacobians, one finally arrives at Note that points in the intersection Ω + X ∩ Ω − X ⊂ Ω X contribute twice to the integral (101), each time with the same weight, corresponding to two-sided illumination.
Using the second equation of (82), this can be rewritten in the form This shows that where I have introduced the Hilbert transform with respect to time H t and the order 0 pseudodifferential operator ψ with symbol with accounting for single or double illumination. Following Hou and Symes [12,13], I define preconditioning operators in image and data space by Note that approximatingB † by W −1 XB * W Y , as was done successfully in [14] for velocity inversion, may lead to problems in case of mixed single and double illumination.

Data availability statement
No new data were created or analysed in this study.

Acknowledgments
I would like to thank Bill Symes for many inspiring discussions over the years. Some of the ideas for this paper originated during a visit to Rice University many years ago. I would also like to thank an anonymous reviewer for valuable feedback.
Let's denote the redundant coordinates by ⃗ R, i.e.
Clearly, the number of redundant variables is 15. I will need to show that on the stationary set the rank of the 15 by 25 matrix M := ∇ t (⃗ x,θ,ϕ,⃗ xr,⃗ xs,t, ⃗ R) ∇ ⃗ R Φ e is 15. The differential with respect to the redundant variables is given by This leads to where I have introduced the notation Apart from this, I have used the obvious notations 0 mn for the m × n matrix filled with zeroes, 0 n as short hand for 0 nn and 1 n for the n × n identity matrix. It is immediate that with It follows that with I will now first restrict to the case (θ e , ϕ e ) = (θ, ϕ) and show that M 2 has rank 3 away from ⃗ p r +⃗ p s = 0 in this case. I start with the identities (43) with This shows that, away from ⃗ p r +⃗ p s = 0, the columns ⃗ c s −⃗ c r and ⃗ d s − ⃗ d r in the matrix M 2 can be replaced by the independent vectors ⃗ v 1 and ⃗ v 2 respectively for the rank calculation.
The rank of M 2 is therefore at least 2 and I need to identify just one extra independent column. Moreover, it shows that for the rank calculation of M 2 columns can be evaluated modulo linear combinations of ⃗ v 1 and ⃗ v 2 . I will use the symbol ∼ to indicate that left and right hand side of an equation are the same modulo such linear combinations.
Below, I will focus on the two columns of (P r − Q)B r and show that, away from ⃗ p r +⃗ p s = 0, the space spanned by ⃗ v 1 , ⃗ v 2 and these columns has dimension 3, unless cos α = 0. The 3 × 2 matrix of mixed second order derivatives B r can be written as where ⃗ e 1 (⃗ x) and ⃗ e 2 (⃗ x) are mutually orthogonal unit vectors orthogonal to the ray from ⃗ x to ⃗ x r in the point ⃗ x and Q 2 = Q 2 (⃗ x,⃗ x r ) is an invertible 2 × 2 matrix which can be calculated by dynamic ray tracing along that ray, see [5]. The subscript h in ⃗ e K,h means taking the horizontal component of ⃗ e K . Defining Equation (119) can be rewritten as Introducing the linear differential operators and using I derive It follows that M 2 has rank 3, unless It is easy to see that ⃗ a 1 and ⃗ a 2 are independent if there are no so-called grazing rays (i.e. rays tangent to the surface z = 0 on emergence at ⃗ x r ). Formula (125) is therefore equivalent to A straightforward calculation, using that ⃗ e 1 and ⃗ e 2 are orthogonal to ⃗ e r and can therefore (see formula (27)) be written as for some ϕ, shows that Hence, if cos α = 0 and ⃗ p r +⃗ p s = 0, the equations (125) imply p θ = p ϕ = 0, in which case P r = P s = Q = 0 3 and the rank of M 2 is obviously 3. In other words, away from ⃗ p r +⃗ p s = 0, the rank of M 2 can only be smaller than 3 if cos α = 0. A completely similar argument shows that the space spanned by ⃗ v 1 , ⃗ v 2 and the columns of (Q t − P s ) B s has dimension 3, unless cos β = 0. Since cos α = 0 = cos β would imply ⃗ p r · (⃗ p r +⃗ p s ) = 0 and ⃗ p s · (⃗ p r +⃗ p s ) = 0 and hence ⃗ p r +⃗ p s = 0, the rank of M 2 must be 3 away from ⃗ p r +⃗ p s = 0.
A similar argument to the one above leads to the conclusion that the rank of M 2 is again 3, unless The calculation of this determinant is somewhat tedious. Using the definitions (131), it is straightforward that where for L = 1, 2 To proceed, I note that and where I have used that ⃗ v 2 is orthogonal to ⃗ e r and ⃗ e s . Using these identities and equation (127), it is straightforward to show that Using the definitions (130), the first factor can be easily rewritten as Using the definitions (127) and the relations (136), the determinant in the second factor can be written as where I have introduced the linear differential operator Using the simple identities D r (p r ) = 0 = D r (p s ) , D r cos 2θ = − 1 p r sin 2θ, it is easy to see that With this expression, I find Combining (140) and (145), finally leads to It follows that the rank of the matrix M 2 can only be smaller than 3 if cos β = 0. A similar analysis on the columns of the matrix (Q t − P s ) B s and ⃗ v ′ 1 and ⃗ v ′ 2 leads to the conclusion that the rank of M 2 can only be smaller than 3 if cos α = 0. This shows that also for the case (θ e , ϕ e ) = (θ,φ) the phase function Φ is non-degenerate away from ⃗ p r +⃗ p s = 0.

Appendix B. Proof of equation (40)
In this appendix I will prove the equation (40). To alleviate the notation, I will denote the slowness variables ⃗ p r and ⃗ p s by ⃗ x and ⃗ y respectively. With this notation the operator L defined in (41) can be written as I also record the following simple expressions A straightforward calculation shows that Using these identities, I derive This proves the first equation of (40). To proceed, I will need a couple of identities, which are formulated in the following lemma. Consequently, Note that the third identity of (151) is trivial, since⃗ v 3 (⃗ x,⃗ y) = ⃗ u 3 (⃗ x +⃗ y). Equation (152) follows immediately from equations (151).
To prove the first two identities of (151), I start with the definition (28), which I rewrite in the form with coefficients a and b given by Since {⃗ v 1 ,⃗ v 3 } is a basis of the space spanned by the vectors ⃗ x and ⃗ y, I can write The coefficients α i and β i can be derived from the properties which follow from |⃗ v 1 | = 1, ⃗ v 3 ·⃗ v 1 = 0 and the third relation of (151). Combining (155) and (156) leads to Using these relations, equations (154) and the expressions (148), I write which proves the first relation of (151). The second one is derived as follows: This completes the proof of lemma 4. The proof of the second equation of (40) is now straightforward. Since the vectors ⃗ u i (⃗ x,⃗ y), i = 1, 2, 3 only depend on⃗ x +⃗ y, they are annihilated by L, just as the vectors⃗ v i (⃗ x,⃗ y), i = 1, 2, 3. Using the defining relations (30) for the azimuth angleφ, it then follows immediately that Lφ = 0.

Appendix C. Proof of lemma 1
In this appendix I will provide a proof for lemma 1. Just like in appendix B I will again denote the slowness variables ⃗ p r and ⃗ p s by ⃗ x and ⃗ y respectively.
Consider the characteristic curves (⃗ x(t), ⃗ y(t)) associated to the operator L defined in equation (41), which are determined by the equations The characteristic curve through ⃗ x = ⃗ x(0) and ⃗ y = ⃗ y(0) is given by It is easily seen that the norms x(t) = |⃗ x(t)| and y(t) = |⃗ y(t)| are equal if and only if t = t 0 (⃗ x,⃗ y) given by This means that each point (⃗ x,⃗ y) not on the 5-dimensional submanifold determined by the equation x = y can be connected to precisely one point (⃗ x 0 ,⃗ y 0 ) := (⃗ x(t 0 ),⃗ y(t 0 )) on this manifold by the characteristic through that point (⃗ x,⃗ y). This shows that the functions θ(⃗ x,⃗ y) and ϕ(⃗ x,⃗ y) exist and are unique. They are given by Since Lφ = 0, the functionφ(⃗ x,⃗ y) is constant along a characteristic curve of L, meaning that ϕ(⃗ x 0 ,⃗ y 0 ) =φ(⃗ x,⃗ y). The second equation in (163) therefore implies that ϕ =φ as stated.
The formula for tan θ is proved as follows. Since x 0 = y 0 , we have Moreover, it follows from the definitions (26) of the angles α and β that Using these identities and the definition (28) of the vector ⃗ v 1 , I find from which it follows that Using the first equation in (164) and equation (167), I derive The vector ⃗ v 1 is annihilated by L according to lemma 4 and the same holds trivially for |⃗ x +⃗ y|, these quantities are constant along its characteristics. Therefore, This completes the proof of the defining relations (42) for θ and ϕ.
Using equation (169) and the first equation of (151), I find where I have used that⃗ v 2 is orthogonal to the space spanned by the vectors⃗ x and⃗ y. This proves the formula for (∇ x − ∇ y ) θ in (43).
Applying the operator ∂ xi − ∂ yi to the first of the defining relations (30) for ϕ and using the second of the relations (30) and the first two identities of (151), I find from which the formula for (∇ x − ∇ y ) ϕ in (43) can be read off.

Appendix D. Proof of proposition 3
In this appendix I will show that the operatorB = BR is a Fourier integral operator with kernel defined in equation (54). Here B is defined in definition 2, R is defined in equation (50). First of all, it is easily verified based on the phase function in equation (50) that the wavefront relation of R is the graph of an invertible map from the cotangent bundle of angle extended image space to the cotangent bundle of space-shift extended image space, away from θ = 0, π/2. The exceptions are handled by remark 2. This means that the composition of B and R is again a Fourier integral operator, see [11].
To calculate the expression for the kernel, the product BR is evaluated as follows.
The right hand side of this equation has a fixed sign, unlesŝ aa = tan(θ/2) tan(θ/2) = 1, which can only happen ifθ + θ = π. Substituting this in equation (193), I observe that this can only be the case ifx 0 = x 0 ,r = r orx 0 = x 0 ,r = r. The casex 0 = x 0 ,r = r,θ + θ = π clearly corresponds to scattering over π and can be excluded. These exceptions are illustrated in figures 3 and 4. I conclude that eTIC is satisfied with the exception of the combination of rays traveling in opposite directions, which either have the same center, or the same radius (but not both).