On variable viscosity and enhanced dissipation

In this article we consider the two-dimensional Navier–Stokes equations with variable viscosity depending on the vertical position. As our main result we establish linear enhanced dissipation near the non-affine stationary states replacing Couette flow. For instance, these shear flows may grow exponentially. Moreover it turns out that, in contrast to the constant viscosity case, decreasing viscosity leads to stronger enhanced dissipation and increasing viscosity leads to weaker dissipation.


Introduction
In the present paper we are concerned with the two-dimensional incompressible Navier-Stokes equations in the presence of large (stratified) viscosity variations. Here $t \in [0, \infty)$ and $(x, y) \in \mathbb{T} \times \mathbb{R}$ denote the time and space variables. The vector-valued function $v = v(t, x, y) : [0, \infty) \times \mathbb{R}^2 \to \mathbb{R}^2$ and the scalar function $p = p(t, x, y) : [0, \infty) \times \mathbb{R}^2 \to \mathbb{R}$ denote the unknown velocity vector field and the unknown pressure of the two-dimensional flow, respectively. The symmetric part of the velocity gradient, $\frac{1}{2} Sv := \frac{1}{2}(\nabla v + (\nabla v)^T)$, denotes the symmetric deformation tensor. The viscosity coefficient is a given non-constant positive scalar function. More precisely, we consider the case of stratified viscosity $\mu(y)$ depending on the vertical direction only and study its interplay with 2D shear flows. Viscous stratification is a typical phenomenon not only in nature (e.g. in the atmosphere and in ocean flows) but also in industrial applications (e.g. in the chemical and food industries). The (in)stabilities in viscosity-stratified flows have attracted the sustained interest of physicists [Cra69, GS14, Hei85, Lin44, HB87, Yih67]. While additional dissipation at first sight suggests stabilization¹, in experiments viscosity exhibits dual roles [Dra02, Chapter 8, p. 160]: a stabilizing role due to the dissipation of energy and a more subtle destabilizing role. Yih [Yih67] showed that instability in a low Reynolds number flow can be caused by viscosity stratification (see also Craik [Cra69] for the study of flows with continuous viscosity stratification). These results motivated decades of active research on the instability caused by viscosity interfaces, see [GS14] for a review paper on this topic.
¹ The Orr-Sommerfeld eigenvalue problem has only positive eigenvalues for Couette flows, which implies the stability of Couette flows for all Reynolds numbers, but experiments showed instability under small but finite perturbations.
Date: October 22, 2021.
In this paper we consider the model (1) of fluids with equal density and temperature but different viscosities, which can for instance be used to describe the transport of highly viscous oil together with an immiscible lubricant of low viscosity (see e.g. [JRR84, PV91] for the relevant instability analysis). We then study the asymptotic behavior of perturbations to the shear flow solutions which satisfy the hydrostatic balance
$\partial_y(\mu \partial_y U) = 0$. (3)
This condition implies that as $\mu$ decreases $\partial_y U$ increases and vice versa.
As an additional assumption, while we allow µ and hence ∂ y U to change by several orders of magnitude, we require that locally there is not too much oscillation: Thus, for instance µ may decrease exponentially but only with a small exponent c.
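The relation between the viscosity and the slope of the shear can be made concrete with a small numerical sketch. It assumes the balance $\partial_y(\mu \partial_y U) = 0$, so that $\mu \partial_y U$ equals a constant (called `SIGMA` below), and a purely hypothetical exponentially decaying viscosity profile; the numerical values `SIGMA`, `MU0` and `C` are illustrative choices and are not taken from the article. Under these assumptions the shear profile grows exponentially.

```python
import math

# Illustrative (hypothetical) parameters, not taken from the article.
SIGMA = 1e-3   # assumed constant value of mu * U'
MU0 = 1e-4     # assumed viscosity at y = 0
C = 0.1        # assumed small exponent of the exponential decay


def mu(y):
    """Hypothetical viscosity profile decaying exponentially in y."""
    return MU0 * math.exp(-C * y)


def shear_slope(y):
    """U'(y) obtained from the balance mu(y) * U'(y) = SIGMA."""
    return SIGMA / mu(y)


def shear_profile(y):
    """Closed form of U(y) = int_0^y SIGMA / mu(s) ds for the profile above."""
    return SIGMA / (MU0 * C) * (math.exp(C * y) - 1.0)


def shear_profile_numeric(y, n=10000):
    """The same integral by the midpoint rule, as an independent check."""
    h = y / n
    return h * sum(shear_slope((i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    for y in (0.0, 10.0, 20.0):
        print(y, mu(y), shear_slope(y), shear_profile(y))
```

In this sketch the slope $U'$ increases by a factor $e$ whenever $\mu$ decreases by a factor $e$, in line with the remark that decreasing viscosity steepens the shear.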
As a consequence of the balance relation (11) one observes that the variable viscosity coefficient changes the slope of the underlying velocity profile, such that the viscous stratification comes into play even at high Reynolds numbers². More precisely, as we discuss following Theorem 1.1, as the viscosity decreases towards zero the effective dissipation rate $\mu(\partial_y U)^2$ becomes larger. This also helps to explain wall heating or cooling techniques (corresponding to liquid flows or gas flows, respectively) in industrial applications, which produce less viscous flow near the wall and hence stabilize the flows [BG81].
In recent years there has been extensive research on the stability of the shear flows (10) for inviscid fluids with $\mu = 0$, and for viscous fluids with constant viscosity $\mu = \text{const.} > 0$.
Since the literature is extensive, we do not provide a complete overview here but refer the interested reader to the following recent works for further discussion [Jia19, LX19, WZZ18, IJ18, Wid18, YL18, BMV16, EW15, LZ11, WZZ17, BGM17, BM15, DWZ20]. In particular, we recall that for the linearized equations around Couette flow, $\mu = \text{const.} > 0$, $U(y) = y$, it can be shown by explicit calculations that the interplay of shearing and dissipation leads to damping with a rate $\exp(-C \sqrt[3]{\mu}\, t)$, and thus on a time scale $\mu^{-1/3}$ much smaller than the dissipation time scale $\mu^{-1}$. This phenomenon is hence called enhanced dissipation (see [BVW18] for further discussion and the analysis of the nonlinear problem).
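The explicit calculation for Couette flow can be sketched numerically. The following is the standard computation, recalled here as an illustration and not quoted from this article: in coordinates moving with the flow, each Fourier mode $(k, \xi)$ with $k \neq 0$ is damped by the factor $\exp(-\mu \int_0^t (k^2 + (\xi - ks)^2)\, ds)$, and the shear part of the integral is at least $k^2 t^3/12$ for every $\xi$. The function names and sample values below are hypothetical.

```python
import math


def decay_exponent(mu, k, xi, t):
    """mu * int_0^t (k^2 + (xi - k*s)^2) ds in closed form (requires k != 0)."""
    shear = (xi**3 - (xi - k * t)**3) / (3.0 * k)
    return mu * (k**2 * t + shear)


def decay_exponent_numeric(mu, k, xi, t, n=20000):
    """The same integral by the midpoint rule, as an independent check."""
    h = t / n
    total = sum(k**2 + (xi - k * (i + 0.5) * h)**2 for i in range(n))
    return mu * h * total


def shear_lower_bound(mu, k, t):
    """Worst case over xi of the shear part, attained at xi = k*t/2."""
    return mu * k**2 * t**3 / 12.0


# Time scales for an illustrative viscosity value.
mu = 1e-6
enhanced_time = mu ** (-1.0 / 3.0)   # enhanced dissipation time scale mu^(-1/3)
heat_time = 1.0 / mu                 # heat equation time scale mu^(-1)

if __name__ == "__main__":
    print(enhanced_time, heat_time)
```

Since the exponent grows like $\mu k^2 t^3$, it becomes of order one at times of order $\mu^{-1/3}$, which is the time scale stated above.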
Given a particular value of the viscosity at a given point, $\mu(y_0)$, in this paper we are interested in the change of the (local, effective) dissipation rates if $\mu$ varies as $y > y_0$ increases. For instance, how much of an increase or decrease of $\mu$ is required to change the dissipation rate by a factor of 10? As our main results we establish stability of the linearized equations and prove damping with a local rate $\sqrt[3]{\mu (U')^2}$, which is inversely proportional to $\sqrt[3]{\mu}$.
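The question of how much $\mu$ has to change in order to change the rate by a factor of 10 can be answered by a short computation. The sketch below assumes the balance $\mu U' = \sigma$ with a constant $\sigma$ and takes the local rate to be $(\mu (U')^2)^{1/3} = (\sigma^2/\mu)^{1/3}$; the values of `SIGMA` and `MU_REF` are hypothetical and only chosen so that $U' \geq 1$ and $\mu (U')^2 < 1$ in the examples.

```python
import math


def local_rate(mu, sigma):
    """Cube root of the effective dissipation mu * (U')^2 with U' = sigma / mu."""
    u_prime = sigma / mu
    return (mu * u_prime**2) ** (1.0 / 3.0)


def viscosity_factor_for(rate_factor):
    """Factor by which mu must change to change the local rate by rate_factor."""
    return rate_factor ** (-3.0)


# Illustrative (hypothetical) values.
SIGMA = 1e-5
MU_REF = 1e-6

# Decreasing mu by a factor 1000 increases the rate by a factor 10.
gain = local_rate(MU_REF / 1000.0, SIGMA) / local_rate(MU_REF, SIGMA)
# Increasing mu by a factor 8 halves the rate.
loss = local_rate(8.0 * MU_REF, SIGMA) / local_rate(MU_REF, SIGMA)

if __name__ == "__main__":
    print(gain, loss, viscosity_factor_for(10.0))
```

Because of the cube root, a change of the rate by one order of magnitude requires a change of the viscosity by three orders of magnitude.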
Theorem 1.1. Let $\mu \in C^2(\mathbb{R})$ with $\mu > 0$ be a given stratified viscosity profile. Then a stationary solution of the Navier-Stokes equations (1) on $\mathbb{T} \times \mathbb{R}$ in the presence of the variable viscosity $\mu$ is given by a shear flow $v = (U(y), 0)^T$ such that $\partial_y(\mu \partial_y U) = 0$ (5) and the linearized equations around this solution in vorticity formulation read where $U' = \partial_y U$ and $U'' = \partial_y^2 U$ denote y derivatives. Additionally suppose that $\mu$ only varies gradually, in the sense that where $F(\mu)$ denotes the (tempered) Fourier transform and that $\mu$ is bounded above and below and sufficiently small so that For instance, $\mu$ may grow at a (small) exponential rate from a value $\nu^2$ to $0.1\nu$ with $\nu < 0.1$. For simplicity of presentation also assume that $U' \geq 1$ (which can be assumed without loss of generality after rescaling). Then the linearized equations (6) are stable and exhibit enhanced dissipation. More precisely, there exists a time-dependent family of operators $A(t)$ with Furthermore, if $\int \omega_0 \, dx = 0$ then for all times $t > 0$ it holds that Moreover, under further regularity assumptions these results also extend to stability of the "profile" $W(t, x, U(y)) := \omega(t, x - tU(y), y)$ in higher Sobolev norms $H^N$ (see Proposition 5.1 for a precise statement).
Let us comment on these results:
• We remark that due to the balance relation (5) it holds that $\mu U' = \text{const.} =: \sigma$. Hence, one observes that the local dissipation rate satisfies $(\mu (U')^2)^{1/3} = (\sigma^2/\mu)^{1/3}$ and thus is proportional to $\mu(y)^{-1/3}$. Thus a decrease of $\mu$ by a factor 1000 corresponds to an increase of the dissipation rate by a factor 10. Conversely, increasing the viscosity compared to $\mu(y_0)$ corresponds to weaker dissipation.
• Our assumptions on $\mu$ ensure that the effective dissipation rate $\mu (U')^2$ is always smaller than one. The dependence in terms of a third root hence reflects the enhancement of the mixing rate due to shear.
• The nonlinear constant viscosity Navier-Stokes problem has been studied in [BVW18]. This article extends these results in the linearized case to the stratified viscosity problem. In particular, we extend the by now common Cauchy-Kowalewskaya approach to the setting where $U'(y)$ and $\mu(y)$ may vary by many orders of magnitude (but may do so only gradually). We expect these methods to be of independent interest to the wider community and applicable also to other related problems (e.g. the variable viscosity Boussinesq equations).
• Unlike in the constant viscosity setting, for the shear flow considered in this article the second derivative of the shear $U''$ is non-trivial and does not approach zero under the (variable viscosity) heat flow. Hence, as in the inviscid setting [Zil17], we require a smallness condition to control the correction term $U'' v_2$ in the linearized equation (this condition further allows us to control derivatives of the viscosity). This motivates our assumption (7) (see Section 3.1 for further discussion).
• We remark that in view of (5) the shear flow $U$ is strictly monotone and hence invertible (but $U'$ might be very large). In our analysis it will thus prove advantageous to equivalently consider the variable $z = U(y)$ and the profile $W$ moving with the shear. Moreover, stability is most naturally phrased in terms of spaces $L^2(dz) \simeq L^2(U' dy)$.
• The first condition in (7) allows $\mu$ to grow exponentially but only with a small exponent. In particular, this implies that level sets of the form $\{y : 10^j < \mu(y) < 10^{j+1}\}$ for $j \in \mathbb{Z}$ are bounded below in size, which we exploit in a partitioning construction in Section 4. The second condition in (7) should be understood as a regularity assumption on the relative change $\mu / \sup(\mu)$, which should decay at very large frequencies (larger than $\sup(\mu)^{-1/3}$).
As we discuss in the following corollary, the weighted dissipation estimate implies exponential decay if $\omega$ remains suitably localized.
Corollary 1.2. Let $W(t, x, U(y)) := \omega(t, x - tU(y), y)$ be as in Theorem 1.1, let $M \subset \mathbb{R}$ and suppose that on a given time interval $(t_1, t_2) \subset (0, \infty)$ a fraction of at least $\theta \in (0, 1)$ of the $L^2$ energy is localized in $M$. That is, Then for all $t \in (t_1, t_2)$ it holds that
Proof. By Theorem 1.1 it holds that with $c := (\mu (U')^2)^{1/6}$. By assumption the left-hand side can be bounded from above by which yields the result.
We stress that the time interval considered in this corollary might be very small if M is a region with a very fast effective decay rate, since then the L 2 energy (or enstrophy) in that region can be expected to decay much faster than in other regions and hence will correspond to a much smaller fraction than θ after a time.
Based on the local dissipation rate at first sight one might also conjecture an estimate of the form to hold.However, such an estimate cannot be expected to hold in general, since the Biot-Savart law is non-local and not decaying quickly enough.More precisely, if W is highly localized in a region M , then the velocity field generated by W exhibits decay away from M in terms of a power law of the distance dist(y, M ).Hence, supposing for the moment that W remains localized and that M is a region with small decay rate, we expect W to decay with slower rate.In particular, if M ′ is a different region with much higher damping rate, then the decay of the Biot-Savart law in terms of dist(M, M ′ ) is not sufficiently strong to compensate for the difference in dissipation rates.
The remainder of our article is structured as follows:
• In Section 2 we introduce function spaces, changes of variables and notational conventions used throughout the article.
• As a first model setting, in Section 3 we establish linear $L^2$ stability for the case when $\mu$ varies only by a bounded factor. This allows us to more clearly present the main tools of our proofs and discuss the necessity of assumptions.
• In Section 4 we extend these $L^2$ stability results to the general setting by constructing local versions of several estimates. Here the non-local structure of the Biot-Savart law and the interaction of the localization and dissipation require careful analysis.
• Using the linear $L^2$ stability results as a building block, in Section 5 we establish linear stability in $H^N$ and thus prove Theorem 1.1.

Stationary Solutions and Notation
In this section we establish that the shear flow U given in Theorem 1.1 indeed is a stationary solution.Furthermore, we derive the linearized equation around this state in vorticity formulation.
In our analysis of the Navier-Stokes equations it is often convenient to work in Lagrangian coordinates moving with the underlying shear flow (U (y), 0).Moreover, since we assume that U is strictly monotone there exists a change of coordinates y → z = U (y) which straightens out the flow lines.For easier reference the equivalent formulations of the equations with respect to these coordinates are also collected in this section.Moreover, we define Sobolev spaces and multipliers with respect to z.
We remark already here that this construction requires further refinement for the general situation, but provides a good description if one additionally assumes that µ is globally comparable to a constant, which is the model setting of Section 3. In Section 4 we replace this global change of variables by a family of suitably localized coordinate changes, which accounts for the fact that µ and hence ∂ y U may change by many orders of magnitude.
Proof of Lemma 2.1. Following Theorem 1.1 we make the ansatz $v = (U(y), 0)^T$. The Navier-Stokes equations (1) then reduce to the following equations: $\partial_x p - \partial_y(\mu \partial_y U) = 0$ and $\partial_y p = 0$. The second equation $\partial_y p = 0$ implies $p = P(x)$ for some function $P$ depending only on $x$, while $\partial_y(\mu \partial_y U)$ depends only on $y$. Hence, both functions need to equal a common constant, which yields the hydrostatic balance relation $\partial_y(\mu \partial_y U) = C_0$ and $p = P(x) = C_0 x + C_1$, where $C_0, C_1 \in \mathbb{R}$ are constants. In particular, specializing to the case $C_0 = 0$, we verify that our choice of $U$ yields a stationary solution.
If we also allow for $C_0$ to be possibly non-trivial there are many solutions of potential interest:
• The uniform flow: $U = \text{const.}$
• The jet or wake: $U = \operatorname{sech}^2(y)$, with $\mu = y \cosh^2(y) \coth(y)$.
In this article we restrict to the case $C_0 = 0$ since then for non-vanishing viscosity the (non-trivial) shear flow $U$ has no critical points, which would pose an obstacle to damping estimates. Furthermore, in view of physical applications we additionally assume that the effective damping rate $\mu (\partial_y U)^2$ is not large.
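The jet example can be checked numerically. The following sketch evaluates $\mu U'$ for $U = \operatorname{sech}^2(y)$ and $\mu = y \cosh^2(y) \coth(y)$ and differentiates it by central differences. The derivative $U' = -2 \operatorname{sech}^2(y) \tanh(y)$ is computed by hand, and the sample points and step size are arbitrary choices (the point $y = 0$ is avoided since $\coth$ is singular there). In this example the constant in the balance relation comes out as $C_0 = -2$, so it is indeed a case with non-trivial $C_0$.

```python
import math


def U_prime(y):
    """Derivative of the jet profile U(y) = sech^2(y)."""
    return -2.0 * math.tanh(y) / math.cosh(y)**2


def mu(y):
    """Viscosity profile of the jet example (y != 0)."""
    return y * math.cosh(y)**2 / math.tanh(y)


def flux(y):
    """The quantity mu * U', which equals -2*y for this example."""
    return mu(y) * U_prime(y)


def balance(y, h=1e-3):
    """Central difference approximation of d/dy (mu * U')."""
    return (flux(y + h) - flux(y - h)) / (2.0 * h)


if __name__ == "__main__":
    for y in (-1.5, 0.5, 1.0, 2.0):
        print(y, mu(y), flux(y), balance(y))
```

The viscosity in this example is positive for all sample points, while $U'$ vanishes at $y = 0$, which illustrates the critical points excluded by the restriction to $C_0 = 0$.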
In the following let $U, \mu$ be solutions of (11), which hence are solutions of the Navier-Stokes equations in velocity formulation. We may then obtain the equation for the vorticity by applying the operator $\nabla^\perp \cdot$ to the velocity equation (1). Notice that We may calculate (see also [HL20]) which can be equivalently expressed as Thus we arrive at the vorticity formulation for the Navier-Stokes equations with viscosity $\mu$: Finally, we linearize the vorticity equation (12) around this shear flow to arrive at the following linearized equation In the following we introduce some equivalent reformulations of the linearized equations (9) in order to simplify our notation. We first observe that in the equations (9) Hence the evolution of the x-average of the vorticity, which we denote by $\omega_{=}$, decouples as The x-average hence evolves as in a variable coefficient heat equation and does not influence the evolution of the orthogonal complement For this reason we assume in the following without loss of generality that initially $\omega_{=} = 0$, which then remains the case for all times.
As another consequence of the lack of x-dependence, the equations decouple after a Fourier transform in $x$, which we denote by $\hat{\omega}(t, k, y) = \frac{1}{2\pi} \int e^{-ikx} \omega(t, x, y)\, dx$.
Our equations read: We may further consider the vorticity moving with the underlying shear Expressed in Fourier variables it holds that and hence As a final step we observe that by assumption U (y) is strictly monotone and hence there exists a change of variables y → z = U (y), which serves as our main formulation in the following.

Lemma 2.2 (Vorticity formulation).
Let U, µ be as in Theorem 1.1 and consider the linearized equations in vorticity formulation.Further denote the change of coordinates y → z = U (y) and define Then ω solves (9) if and only if for every k ∈ Z the Fourier transform of W with respect to x solves: where with slight abuse of notation coefficient functions are evaluated in y such that z = U (y), e.g.
. In other words, by introducing we may write the above equation for F x W as Proof of Lemma 2.2.Since by assumption µ∂ y U equals a non-trivial constant and µ > 0 does not vanish, it follows that U is strictly monotone and hence invertible.Thus, the above claimed change of variables exists.Furthermore, it holds that which together with (14) concludes the proof.
In the following sections we establish asymptotic stability of W in Sobolev regularity.More precisely, we will first consider the special case where U is globally bilipschitz with comparable upper and lower Lipschitz constants in Section 3. Building on these results, in Section 4 we consider the general case, where we further introduce modified changes of coordinates adapted to the local behavior of the coefficient functions.Finally, in Section 5 we bootstrap the stability results in L 2 to establish stability in H N .
Unless noted otherwise we here always work in coordinates with respect to $z$ and may without loss of generality assume that $W$ is localized at frequency $k \neq 0$, arbitrary but fixed, with respect to $x$. We thus briefly write and use $\langle \cdot, \cdot \rangle$ to refer to the inner product on that space.

A Model Case and L 2 Estimates
In this section we consider a special case of the linearized Navier-Stokes equations (9) in vorticity formulation in which we additionally assume that $\mu$ is comparable to a constant globally instead of just locally. More precisely, in this section we additionally require that $\sup(\mu) / \inf(\mu) \leq 100$.
We note that this further implies that U is bilipschitz and hence allows for a global change of variables to z = U (y) (see Section 2 and Lemma 2.2).This simplification therefore allows us to more clearly present commutator estimates and introduce techniques of proof.
In a second step, in Section 4 we use that such bounds are true locally, that is, when $\mu$ and $U$ are restricted to bounded intervals of a given length. Extending these restrictions to functions on the whole space which are bounded above and below, we will hence be able to use this section's results for the "localized" problems. A key challenge then lies in controlling the non-local interaction due to the Biot-Savart law and in "gluing" the various estimates in a way that preserves dissipation and decay estimates.
The following proposition summarizes our main results for this section and employs a by now common Lyapunov functional/energy approach (see for instance [MSHZ20, BMV16, TW19, Lis20]), where a key challenge lies in constructing a suitable time-, frequency- and space-dependent operator $A$ which captures possible growth in the evolution of solutions to (9).
Proposition 3.1. Let $\mu, U$ satisfy the assumptions of Theorem 1.1 and additionally suppose that $\sup(\mu) / \inf(\mu) \leq 100$. Then there exists a time-dependent family of operators $A(t)$ such that for any initial data In particular, the linear stability estimates in $L^2$ of Theorem 1.1 hold in this setting.
• The operator $A(t)$ is defined in terms of a Fourier multiplier in Definition 3.2.
• We remark that in the present setting by assumption $\mu$ and hence $U'$ may only vary by a factor 100. Therefore $u$ is also comparable to an average value of $U'$.
• The decay by $\nu^{1/3} + \nu (\xi - kt)^2$ quantifies the enhanced dissipation mechanism. More precisely, if $|\xi - kt| \geq \nu^{-1/3}$ the latter term dominates, but for frequencies with $|\xi - kt|$ smaller than this the enhanced rate $\nu^{1/3}$ still persists (see Definition 3.2 and Lemma 3.4 for further discussion).
• The last multiplier $\frac{u}{1 + u^2 (\xi - kt)^2}$ corresponds to control of the velocity perturbation. Since $u$ is bounded below, this contribution is dominated by $\nu^{1/3}$ for large frequencies. The statement of the theorem thus is that even for frequencies where $|\xi - kt|$ is small this control persists. In particular, this control remains valid as $\nu$ tends to zero, which is crucial for the control of $U'' v_2$ in the evolution equation (see Lemma 3.6).
• We note that here the choice of $G$ involves a small factor 0.1, as do the dissipation rates in Theorem 1.1. These small factors allow us some flexibility in the control of interaction terms (see Section 3.1).
In the interest of clear presentation we formulate the main steps of the proof as a series of lemmas. We then show how to use these to establish Proposition 3.1 before proving the lemmas at the end of this section.
Definition 3.2 (Decreasing Multiplier and Fourier sets). Let $\mu, U$ be a given stationary solution as in Theorem 1.1 and define the local dissipation rate $\nu$ as and the local shear rate as

Let further
We then define the good set G t ⊂ Z × R by and the bad set B t as the complement (excluding k = 0) For any fixed k, if B t ∩{k} × R is non-empty, the set G t ∩{k} × R has two connected components, where we denote by G − t the half-line extending to −∞ and by G + t the half-line extending to +∞.
Associated with this partition we define a Fourier multiplier m by and the asymptotic condition We denote the operator associated with the Fourier multiplier m by A(t): where F denotes the Fourier transform with respect to x and z = U (y).
This multiplier combines features of the inviscid multiplier of [Zil17] and the constant viscosity multiplier of [Lis20,BVW18].
• The relative decay of µ by $-\nu^{1/3}$ compensates for the relatively weak dissipation in the bad Fourier region. Here the decay of $A$ allows us to establish damping of $AW$.
• The second multiplier models the growth of $v_2$ as given by the Biot-Savart law, or rather of $U' v_2$. As we discuss in the proof of Lemma 3.6, we then use that by assumption $U'' = \frac{U''}{U'} U'$ is small compared to $U'$ and hence the linearization error $U'' v_2$ can be controlled by this multiplier when we are in the bad region.
As we prove in the following subsection the multiplier m (and hence the operator A) satisfies several useful bounds and, in particular, serves to control various error terms when W is concentrated in the bad set.

Lemma 3.3. Let m be as in Definition 3.2. Then m satisfies the following estimates:
(1) There exists a constant $c$ independent of $\xi$ and $t$ such that
(2) The multiplier $m$ is constant (independent of $\xi$ and $t$, but might depend on $k$) for large positive or negative times. By the conventions of our definition one of these constants is chosen as 1 and the other as $c$:
(3) The operator $A$ is a continuous invertible operator from $L^2$ to $L^2$ and satisfies
Given this definition of our multiplier, our main task in the following is to establish suitable estimates for where we used the equation (17) to rewrite $A \partial_t W$. More precisely, we intend to show that the dissipation and the decay of $m(t)$ are strong enough to absorb possible growth and that hence $\|AW\|^2$ is decreasing in time. Integrating these estimates we thus obtain a Lyapunov functional, which allows us to prove Proposition 3.1.
The following lemma quantifies the combined strength of the dissipation mechanism and the decay of the multiplier.
Lemma 3.4 (The dissipation term).Let t ≥ 0, let A, G and m be given by Definition 3.2 and let W ∈ L 2 be a given function.Then it holds that where u and ν are defined as in Definition 3.2.
This decay lies at the core of our damping mechanism. In the following lemmas we show that all other contributions to $\frac{d}{dt} \|AW\|^2$ can be considered as errors. We begin with the errors in the dissipation term due to the fact that $\mu$ is non-constant. Here we recall that by the assumptions imposed in Theorem 1.1 the relative change of $\mu$ is required to be small: as well as This smallness together with the fact that commutator terms involve integro-differential operators of lower order than 2 allows us to control these errors.
Lemma 3.5 (Viscosity errors). Let $t \geq 0$, let $A, C, \nu, u$ and $m$ be given by Definition 3.2 and let $W \in L^2$ be a given function. Then it holds that
Lemma 3.6 (Velocity errors). Let $t \geq 0$, let $A, C, \nu, u$ and $m$ be given by Definition 3.2 and let $W \in L^2$ be a given function. Then it holds that
The proofs of Lemmas 3.3 to 3.6 are given in the following Section 3.1. We briefly discuss how to combine the estimates of the lemmas to establish Proposition 3.1.
Proof of Proposition 3.1. Let $A, m$ be given as in Definition 3.2, let $W$ denote the solution of the linearized Navier-Stokes equations and consider the energy Then by the results of Lemmas 3.4, 3.5 and 3.6 it follows that In order to conclude we recall that by Lemma 3.3 the Fourier multiplier $m$ corresponding to $A$ is bounded between $c$ and 1 and we may hence relate $E$ with $\|W\|_{L^2}^2$.
In Section 4 we will extend these estimates to the case where $\mu$ (and hence $U'$) is allowed to vary by many orders of magnitude. In particular, the values of $\nu$ and $\mu$ then are only locally defined. A key challenge there is to show that non-local effects and interactions between different regions in space can be controlled in a sufficiently good way. Finally, in Section 5 we show that the damping estimates in $L^2$ can be bootstrapped to yield stability in arbitrary Sobolev regularity, following an argument of [Zil21].
3.1. Proof of Lemmas.
We begin by discussing the properties of the multiplier $m$, which may be computed explicitly in terms of integrals.
Proof of Lemma 3.3.By definition it holds that ∂ t m ≤ 0 and hence Furthermore, we may explicitly compute m as where we used that m(−∞, k, ξ) = 1.It then holds that is bounded by a uniform constant (and further improves for k large).Furthermore, also is uniformly controlled.Therefore, we may estimate We further observe that for and m is thus constant on these intervals.On the left interval m equals m(−∞, k, ξ) = 1, while on the right it equals exp(− Finally, by Parseval's identity these bounds for the multiplier m are equivalent to L 2 bounds for the operator A. Having established these properties of the operator A we next turn to estimating the dissipation.Here for good frequencies in the sense of Definition 3.2 the dissipation is rather strong.If one instead considers bad (that is, close to resonant) frequencies we rely on the fact that m decreases in time t to provide sufficient decay.
Proof of Lemma 3.4.We recall that by Lemma 3.3 the multiplier m is multiple of the identity on the connected components of the good Fourier set G t .In particular, when restricted to these sets A commutes with all other operators.In our proof we hence expand according to this Fourier decomposition.That is, we study We begin by discussing the diagonal cases i = j in the good regime.Estimates for AW j , A div t (µ∇ t )W j , j = 1 or 3: We recall that A = Id or c Id when applied to W 1 or W 3 , respectively.Hence, we may explicitly compute that where we estimated the first summand by 0 from above and used that µ(U ′ ) 2 ≥ ν by definition.For the last term we recall that µ and hence U ′ is slowly varying and we may therefore control where we used that (∂ z − t∂ x ) is invertible with operator norm of the inverse map bounded by G −1 on the good set and controlled and hence this error can be absorbed into the decay.We thus note that and by the same argument Since µ and U ′ are not constant there further is some non-trivial interaction between different good contributions.Estimates for AW i , A div t (µ∇ t )W j with (i, j) = (1, 3) or (i, j) = (3, 1): We observe that AW i , A div t (µ∇ t )W j = A 2 W i , div t (µ∇ t )W j and that A 2 is a multiple of the identity.In the following we may thus for simplicity of notation instead consider

By the same argument as above
can be considered a negligible error term.
For the remaining terms we instead exploit that $W_i$ and $W_j$ have disjoint support in Fourier space and that these supports have distance at least $2G$. In particular, we may estimate This interaction term can hence be absorbed by the decay provided which is part of our assumption (7) that the relative size of $\mu$ is slowly varying. We remark that here $F(\mu)$ refers to the distributional Fourier transform, since $\mu$ is bounded but not in $L^1$.
By the same argument and using that which can be absorbed provided which holds by assumption.Estimates for terms involving W 2 : It remains to discuss the influence of the part W 2 Fourier-localized in the bad set.
We first study the self-interaction term: Since none of $A$, $\mu$ and $U'$ are constant, we cannot easily appeal to the negativity of the elliptic operator in this regime. Instead we use that since $W_2$ is localized in the Fourier set where $|\xi - kt|$ is not large (yet). We recall that here $G$ was chosen in such a way that $\nu G^2 < 0.01 \nu^{1/3}$ and thus the dissipation in this bad region is weaker than desired. Similarly, we may estimate where we used that $(U')^2 \geq 1$ and hence $\inf(\mu) \leq \nu$. As remarked following Definition 3.2, we may without loss of generality only consider those $k$ for which $\nu k^2 < 0.001 \nu^{1/3}$ is much smaller than the enhanced dissipation rate, since otherwise this horizontal dissipation already achieves the desired decay. With this understanding we restrict to this case for the remainder of the article. It then also holds that with a small constant $C$. Furthermore, we may control with a small absolute constant $C$, by our choice of cut-off $G$.
Since the decay of $A$ yields that these contributions can be absorbed. Finally, it remains to discuss the cross terms , where one of $i, j$ equals 2. For the first term we estimate by which can be absorbed as above. Similarly, the second term can be controlled by and the last term by We may thus use Young's inequality and the previously obtained damping estimates to absorb these error terms, which concludes the proof.
In the previous lemma we have established dissipation due to the "main" term of the dissipation operator. The following lemma shows that this decomposition is indeed justified and that all terms involving higher derivatives of $\mu$ can be considered lower order. We recall here that by assumption $\mu$ is slowly varying in the sense that
Proof of Lemma 3.5. Given the decay established in Lemma 3.4 we may use integration by parts, Hölder's inequality and Young's inequality to reduce our proof to establishing bounds on norms of and of In analogy to the proof of Lemma 3.4 we here again distinguish between the parts of $v_1, v_2$ generated by the vorticity in the good region $W_1, W_3$ and the one localized in the bad region $W_2$. More precisely, we note that as in [CZZ19] in these estimates we may replace $v = \nabla_t \phi$, defined in terms of the usual stream function, by a simpler potential using the averaged value $u$ of $U'$. For this purpose we define ψ to satisfy 2 )φ with suitable integrability assumptions at infinity. Then testing these equations with either $\psi$ or $\phi$ one obtains that the energies ψ 2 are comparable (in the sense of bilinear forms acting on $W$).
In the following we may thus discuss We then observe that in the good Fourier region Similarly, in the good Fourier region can be controlled by the dissipation, since It thus only remains to discuss the bad Fourier region.However, there the weight A is chosen in just such a way that More precisely, we may estimate in the good region.The right-hand-side then is controlled by the decay of A and by the dissipation, which concludes the proof.
Finally we turn to the control of the error in due to the convective term: Proof of Lemma 3.6.We remark that in the case of very small effective viscosity one cannot expect to control in terms of the dissipation.Hence, we fall back to the estimates developed for the inviscid case in [Zil17].More precisely, let again u = min U ′ and define the constant coefficient stream function ψ A and ψ as the solution of the equations 2 )ψ = W, respectively.Note that both differential operators involve constant coefficients and hence both ψ and ψ A can be explicitly computed in terms of Fourier multipliers.
Furthermore, integrating by parts and using that v 2 = ∂ x φ we control (23) by We next claim that it holds that Assuming this claim for the moment, we may compute and hence observe that the velocity error (23) can be estimated by

Since by assumption
is small this error can thus indeed be absorbed using the dissipation and the decay of A(t).
It remains to prove the claim (24), for which we argue as in [Zil17]. That is, we test the stream function equation , which yields the first estimate of (24). The second estimate of (24) immediately follows from the explicit characterization of $\psi$ and $\psi_A$ in terms of Fourier multipliers (which only differ by multiplication with the Fourier weight of $A$).
We remark that unlike the estimates of Lemmas 3.4 and 3.5 the above estimate does not explicitly involve the viscosity and has been obtained in the inviscid case.This lemma hence imposes the strongest restrictions on the profile U (and hence equivalently on µ).As discussed following the statement of Theorem 1.1 we do not expect the smallness condition to be optimal, but rather a non-resonance/spectral condition as in [WZZ18].However, the present stronger assumption allows for an approach in terms of a Lyapunov functional.

Localization and Non-local Interactions
In this section we consider the linearized equations (9) in vorticity formulation Unlike in Section 3 we here allow for µ (and hence also U ′ ) to vary by many orders of magnitude.
Our main result of this section, Proposition 4.1, then establishes the stability and damping results of Theorem 1.1 in L 2 , for which a special case had been treated in Proposition 3.1.
Proposition 4.1. Let µ, U satisfy the assumptions of Theorem 1.1. Then there exists a time-dependent family of operators A(t) such that for any initial data ω 0 ∈ L 2 the solution W (t) with that initial data satisfies
We recall that as part of the assumptions of Theorem 1.1 we require that (7) holds: This quantifies the requirement that µ may only change gradually (but since R is unbounded it may change by many orders of magnitude overall). This constraint on the relative rate of change then further implies that when restricted to any interval I of suitable size, it holds that $\max_I \mu / \min_I \mu \le 100$.
Thus, if we extend the restrictions µ| I , U | I by constants to functions µ I , U I on all of R, then these extensions satisfy the assumptions of Section 3. Thus we may "locally" reduce to that model setting.However, these restrictions and extensions have to be related to the actual whole space problem (9) (see Lemma 4.4) and have to be combined to control growth of the whole space problem (see Lemmas 4.5, 4.6 and 4.7).
Our main challenges in the following are to formalize this intuition and to control non-local errors.More precisely, since the velocity is non-local and so are several commutator terms, it is not possible to just restrict W and reduce estimates to the ones of Section 3. Instead we will show that in the sum over all localized estimates still holds.

Partitions and Non-local Interaction.
The following lemma establishes the existence of a partition of R such that on each interval of the partition µ (and hence U ′ ) is comparable to a constant. Furthermore, the sizes of these intervals are bounded below and hence cut-off functions and partitions of unity corresponding to this partition have controlled W k,∞ norms. Using these partitions we may also construct extensions of the restrictions of µ, U which satisfy the assumptions of the model setting studied in Section 3. for all where 3I j denotes the rescaled intervals with the same center. Furthermore, the length of each interval I j is bounded below by 1.
Associated with this partition there exists a family of non-negative functions χ j ∈ C ∞ c with supp(χ 2 j ) ⊂ 3I j such that χ 2 j is a partition of unity. For each j there exist µ j , U j ∈ C N +2 (R) such that and so that µ j and ∂ y U j are constant outside 3I j and $\max_{\mathbb{R}} \mu_j / \min_{\mathbb{R}} \mu_j \le 100$.
Proof of Lemma 4.2. We recall that by assumption on µ the relative rate of change $\mu'/\mu$ is bounded. Hence, given any two points y 1 , y 2 we observe that $\frac{\mu(y_2)}{\mu(y_1)} = \exp\left(\ln(\mu(y_2)) - \ln(\mu(y_1))\right) = \exp\left(\int_{y_1}^{y_2} \frac{\mu'(y)}{\mu(y)}\,dy\right)$ is bounded in terms of |y 2 − y 1 |, also when exchanging y 1 and y 2 . In order to construct the intervals I j we thus pick an initial point y 1 = 0 and then choose y 2 > 0 (or y 2 < 0) maximally such that (25) holds with I j = (y 1 , y 2 ). By the above calculation the size of I j is bounded below. Therefore, iterating this procedure with y 1 chosen as a boundary point of a previously generated interval, we obtain the desired partition (I j ) j of R with the size of each I j bounded below.
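The lower bound on the interval size can be made quantitative. The following is a sketch, assuming that the boundedness assumption takes the form $\kappa := \|\mu'/\mu\|_{L^\infty(\mathbb{R})} < \infty$ (the symbol $\kappa$ is introduced here only for illustration) and that the threshold in (25) is the factor 100 used above. If (25) is instead imposed on the rescaled interval 3I j , the admissible length shrinks by a corresponding factor of 3.

```latex
% Assumption (hypothetical notation): \kappa := \|\mu'/\mu\|_{L^\infty(\mathbb{R})} < \infty.
\begin{align*}
\left|\ln\mu(y_2)-\ln\mu(y_1)\right|
  &= \left|\int_{y_1}^{y_2}\frac{\mu'(y)}{\mu(y)}\,dy\right|
  \le \kappa\,|y_2-y_1|, \\
e^{-\kappa|y_2-y_1|}
  &\le \frac{\mu(y_2)}{\mu(y_1)}
  \le e^{\kappa|y_2-y_1|}.
\end{align*}
% Consequently, on any interval I of length |I| \le \ln(100)/\kappa:
\[
\frac{\max_I \mu}{\min_I \mu} \le e^{\kappa |I|} \le 100 .
\]
```

In particular, a small value of $\kappa$ corresponds to long intervals, which is what later yields small derivatives of the cut-off functions.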
We remark that in this greedy procedure it is possible that for up to two choices of j the interval I j is unbounded (in which case the above procedure only generates finitely many j).In this case we may instead impose that y 2 should be maximized under the additional constraint that |y 2 − y 1 | ≤ 1000.
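The greedy construction can be illustrated numerically. The following Python sketch is not part of the proof: it works on a bounded window [y_start, y_end], advances in discrete steps, and uses the hypothetical names `greedy_partition`, `ratio`, `max_len` and `step`. The threshold 100 and the length cap 1000 are taken from the discussion above.

```python
import math

def greedy_partition(mu, y_start, y_end, ratio=100.0, max_len=1000.0, step=0.01):
    """Greedily split [y_start, y_end] into intervals with max(mu)/min(mu) <= ratio.

    mu      : positive function (the viscosity profile)
    ratio   : admissible ratio max/min on each interval (assumed to be 100)
    max_len : cap on the interval length, so that no interval is unbounded
    step    : grid resolution used to search for the right end point
    """
    breakpoints = [y_start]
    left = y_start
    while left < y_end:
        lo = hi = mu(left)
        right = left
        # Extend the interval to the right as long as the ratio bound holds.
        while right < y_end and right - left < max_len:
            candidate = min(right + step, y_end)
            value = mu(candidate)
            new_lo, new_hi = min(lo, value), max(hi, value)
            if new_hi / new_lo > ratio:
                break
            lo, hi, right = new_lo, new_hi, candidate
        if right == left:
            raise ValueError("step is too coarse for this viscosity profile")
        breakpoints.append(right)
        left = right
    return breakpoints

# Example: exponentially growing viscosity, relative rate of change equal to 1.
example_breakpoints = greedy_partition(math.exp, 0.0, 20.0)
```

For the exponential profile every interval except the last has length close to ln(100), in agreement with the lower bound on the interval size used in the proof.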
It is a classical result that given such a partition of intervals there exists a partition of unity for which the square root of each function is still smooth and such that bounds on C k norms are uniform in j (since the size of each I j is bounded below).
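One standard way to obtain such functions is to normalize smooth bumps by the square root of the sum of their squares. The sketch below is only an illustration under assumptions: the break points, the overlap width `w` and the helper names `smooth_step` and `partition_functions` are hypothetical and not taken from the paper. It also checks numerically that localizing with χ j preserves the discrete L 2 norm.

```python
import numpy as np

def smooth_step(s):
    """Smooth transition that equals 0 for s <= 0 and 1 for s >= 1."""
    s = np.asarray(s, dtype=float)

    def f(t):
        out = np.zeros_like(t)
        pos = t > 0
        out[pos] = np.exp(-1.0 / t[pos])
        return out

    a, b = f(s), f(1.0 - s)
    return a / (a + b)  # the denominator never vanishes

def partition_functions(y, breakpoints, w):
    """Return an array chi with chi[j] supported near the j-th interval and sum_j chi[j]**2 = 1.

    y must lie inside [breakpoints[0], breakpoints[-1]];
    w is the overlap width and should not exceed half the shortest interval.
    """
    bumps = []
    for left, right in zip(breakpoints[:-1], breakpoints[1:]):
        rise = smooth_step((y - left + w) / (2 * w))
        fall = smooth_step((right + w - y) / (2 * w))
        bumps.append(rise * fall)
    bumps = np.array(bumps)
    # Dividing by the root of the sum of squares keeps each chi[j] smooth.
    return bumps / np.sqrt(np.sum(bumps**2, axis=0))

breakpoints = [0.0, 5.0, 9.0, 15.0]      # hypothetical partition
y = np.linspace(0.0, 15.0, 3001)
chi = partition_functions(y, breakpoints, w=1.0)

W = np.sin(y) * np.exp(-0.1 * y)         # arbitrary test profile
local_norm_sq = sum(np.sum((chi_j * W) ** 2) for chi_j in chi)
global_norm_sq = np.sum(W**2)
```

Since the squares sum to one pointwise, `local_norm_sq` and `global_norm_sq` agree up to rounding. Longer intervals allow a larger `w` and hence smaller derivatives of the cut-offs.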
Furthermore, given this partition of unity we construct an extension of µ by The associated shear profile U j is then constructed by integrating with C and the constant of integration chosen such that U j (y 1 ) = U (y 1 ) and ∂ y U j (y 1 ) = ∂ y U (y 1 ). This then directly implies the desired bounds, where we used that the derivatives of the partition of unity are bounded and hence the estimate (25) only possibly deteriorates by a small factor under this extension.
Given these partitions we may naturally define operators acting on χ j W by using the results of Section 3. Definition 4.3 (Localized Fourier weights).Let χ 2 j be the partition of unity of Lemma 4.2 and let µ j , U j be the collection of viscosities and shear associated with these partitions.
We then define A j to be the operator as given in Definition 3.2 for µ, U replaced by µ j , U j . Furthermore, we define and the energy functional We remark that here for each interval I j we consider the L 2 inner product on L 2 (dz j ). However, since χ j is compactly supported in 3I j we observe that , where in the last step we used that z and z j agree on this support. In view of this compatibility with L 2 (dz) we may hence transparently switch between the spaces L 2 (dz j ), j ∈ Z in several estimates and thus suppress this formal j dependence in our notation.
Since χ 2 j is a partition of unity, the norms of W and the sum of the norms of W j are comparable.Lemma 4.4 (Norm estimates).Let χ j and W j be as in Definition 4.3 and suppose that χ j ∈ C 0 b .Then there exist constants 0 < c 1 < c 2 < ∞ such that the L 2 norms satisfy Let next N ∈ N and suppose that χ j ∈ C N b .Then there exist constants d 0 , . . ., d N with d N = 1 and c 1 , c 2 such that Proof of Lemma 4.4.Since χ 2 j is a partition of unity, this estimate is actually trivially true with c 1 = c 2 = 1 and equality.
Moreover, if A j W ≈ W with constants uniform in j, this also implies that Here and in the following the notation a ≈ b states that there exist constants 0 For N > 1 we argue by induction.More precisely, for any given multi-index α we may expand By the same argument as in the L 2 case it holds that For all other terms we note that and that the supports of the functions χ j at most cover R twice.Hence, we may control which can be controlled in terms of by the induction assumption.We further remark that these comparisons remain true if W j is replaced by A j W j .
Given this definition of an energy, we next need to verify that it indeed is a Lyapunov functional and thus study Compared to the results of Section 3 we here encounter several additional challenges:
• The Biot-Savart law is non-local. Therefore χ j v depends on all (W j ′ ) j ′ not just W j . We thus need to compare various localizations of the Biot-Savart law, while at the same time also localizing in frequency.
• The evolution of W j hence also depends on all (W j ′ ) j ′ .
• In the dissipation term we have a double sum with respect to j and j ′ .
Here we observe that for |j − j ′ | ≥ 2 the supports of χ j and χ j ′ are disjoint and hence we only need to consider j ′ ∈ {j − 1, j, j + 1} (only neighbors instead of full non-local interaction as for the velocity). However, the coupling introduced by this interaction implies that we cannot hope to control A j W j , A j W in terms of itself, but rather have to control sums over all j. The following lemma generalizes Lemma 3.4 to the present setting.
Lemma 4.5 (Localized dissipation estimates). Let W ∈ S, then it holds that 0.01
Proof of Lemma 4.5. We note that in (26) the dissipation involves W and not just W j and we thus have to control the interaction with other intervals. However, by construction only neighboring functions χ j , χ j ′ with j ′ ∈ {j − 1, j, j + 1} have intersections of their support.
We thus expand Here the "diagonal term" can be controlled by using Lemma 3.4 of Section 3.
For the other terms we note that [div t µ∇ t , χ j ] is a first order differential operator.In the good region it can thus easily be controlled by the dissipation by the same argument as in the proof of Lemma 3.4.
In the bad region we have to require that derivatives of χ j are not too large. As discussed in Lemma 4.2 this control of the derivatives is a consequence of our assumption that µ only varies gradually and that hence the sizes of the intervals I j are bounded below by a (large) constant. This then implies that we can use Young's inequality to absorb these terms into the dissipation.
This smallness is a consequence of our assumptions on µ, which imply that each χ j is supported on an interval of size at least L and hence an n-th order derivative is controlled in terms of L −n , which is much smaller than 1.
Lemma 4.6 (Non-local velocity estimates). Let t ≥ 0, let A, C and m be given by Definition 3.2 and let W ∈ H N be a given function. Then it holds that
Proof of Lemma 4.6. We again observe that here the right-hand-side depends on all of W . However, unlike in Lemma 4.5 here χ j v 2 depends on W j ′ for all j ′ and not just j ′ ∈ {j − 1, j, j + 1}. Instead of estimating in terms of j ′ as in Lemma 4.5, we generalize the elliptic estimates of [CZZ19] to the present setting.
More precisely, let φ j be the stream function generated by W j : ∆ j φ j = W j = χ j W, and let φ denote the stream function generated by W : Then by testing the above equations with −φ j and −φ, respectively, we observe that and Using the fact that derivatives of χ j are bounded, it thus follows that Thus errors in velocity can be controlled in terms of sums of ∇ j φ j (see also Lemma 4.4).Moreover, the above argument extends to considering weighted spaces.
In order to conclude, we note that by the definition of U j , µ j and W j each such contribution can be controlled in terms of the decay of the multiplier A j and the dissipation.Hence the velocity errors can be absorbed.Lemma 4.7 (Viscosity errors).Let t ≥ 0, let A, C and m be given by Definition 3.2 and let W ∈ H N be a given function.Then it holds that Proof of Lemma 4.7.In order to prove these estimates we employ a combination of the methods used in the proofs of Lemmas 3.5, 4.5 and 4.6.More precisely, we first use the structure of the Biot-Savart law to express in terms of W and lower order terms.For the terms involving W we can then argue analogously as in Lemma 3.5, using the decoupling of χ j and χ j ′ if j and j ′ are far apart as in Lemma 4.5.Finally, for the remaining terms involving the velocity, we argue as in Lemma 4.6 and thus reduce to estimating ∇ j φ j in place of v. Summing over the "diagonal" estimates as established in Lemma 3.5 then yields the result.
Having established these estimates, we are now ready to prove Proposition 4.1 and thus also prove part of Theorem 1.1. An extension of these results to higher Sobolev norms H N is given in Section 5, which then completes the proof of Theorem 1.1.
Proof of Proposition 4.1. Let ω 0 ∈ L 2 (dz) be a given initial datum, let µ, U satisfy the assumptions of Theorem 1.1 and let W denote the solution of (17) with this initial data, where Then by Lemma 4.2 there exists a partition of R into intervals I j and an associated partition of unity χ 2 j . We then define A j and as in Definition 4.3 and study the evolution of the energy Inserting the evolution equation (17) we then have to estimate Combining the estimates of each summand, derived in Lemmas 4.4 to 4.7, we deduce that Finally, we recall that µ j , U ′ j , z j agree with µ, U ′ , z on each interval I j and that by Lemma 4.4 the energy E(t) is comparable to W (t) 2 L 2 (dz) . This hence concludes the proof of Proposition 4.1 where the symmetric operator A is defined such that A(t)W (t) 2 := E(t).

Stability in H N
As the last step of our proof of Theorem 1.1, in this section we extend the stability and damping estimates in L 2 established in Section 4.1 to estimates in H N . Here we follow an inductive approach introduced in [Zil21] in the inviscid setting. We consider the linearized equations (17) x + (U ′ (∂ z − t∂ x )) 2 W, where we introduced the time-dependent linear operator L for brevity of notation. We remark that derivatives with respect to x can be identified with multiplication by ik, since the linearized equations decouple with respect to k. Hence higher derivatives in x can be estimated using the L 2 energy. In the following we hence only consider derivatives with respect to z. Applying N derivatives to (17) we obtain that In the following lemma we then show that the commutator term can be considered an error term involving fewer than N derivatives, while L∂ N z W can be treated in the same way as in the L 2 estimate. In this sense the L 2 estimate forms the core of our argument.
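The commutator structure referred to here can be written out schematically. The following is a sketch under the assumption that (17) takes the form $\partial_t W = L W$. The coefficient $a(z)$ is a hypothetical smooth function standing in for the variable coefficients of L and is not notation from the paper.

```latex
% Assumption: (17) is written as \partial_t W = L W.
\[
\partial_t \partial_z^N W = L\,\partial_z^N W + [\partial_z^N, L]\,W .
\]
% For a smooth coefficient a(z) (hypothetical) acting by multiplication, Leibniz' rule gives
\[
[\partial_z^N, a]\,f = \sum_{l=1}^{N} \binom{N}{l}\,(\partial_z^l a)\,\partial_z^{N-l} f ,
\]
% so only derivatives of f of order at most N-1 appear.
```

The term with l = 0 cancels in the commutator, which is why the coefficients contribute only derivatives of lower order. This is the sense in which the commutator can be treated as an error term in the induction.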
Proposition 5.1.Let µ, U satisfy the assumptions of Theorem 1.1.In particular, let N ∈ N and suppose that ∂ z ln(µ) ∈ W N +1,∞ .Let A be as in Proposition 4.1, then there exist constants c 0 , c 1 , . . ., c N > 0 depending only on the W k,∞ norms of ∂ z ln(µ) such that is a Lyapunov functional and satisfies We remark that here we only require that the W N +1,∞ norm is finite.Only the W 1,∞ needs to be small in order to establish the L 2 stability estimate.
Proof of Proposition 5.1. The case N = 0 has been established in Proposition 4.1 with c 0 = 1. We hence aim to proceed by induction. Hence, suppose that the estimates have been established for the case N − 1 and consider with c N to be determined later.
Then by the induction assumption it holds that for all 0 ≤ l ≤ N − 1 In particular, all derivatives of W up to order N − 1 can be controlled by the induction assumption.We thus turn to the control of the "leading