\Gamma-convergence of Onsager-Machlup functionals. Part II: Infinite product measures on Banach spaces

We derive Onsager-Machlup functionals for countable product measures on weighted $\ell^p$ subspaces of the sequence space $\mathbb{R}^{\mathbb{N}}$. Each measure in the product is a shifted and scaled copy of a reference probability measure on $\mathbb{R}$ that admits a sufficiently regular Lebesgue density. We study the equicoercivity and $\Gamma$-convergence of sequences of Onsager-Machlup functionals associated to convergent sequences of measures within this class. We use these results to establish analogous results for probability measures on separable Banach or Hilbert spaces, including Gaussian, Cauchy, and Besov measures with summability parameter $1 \leq p \leq 2$. Together with Part I of this paper, this provides a basis for analysis of the convergence of maximum a posteriori estimators in Bayesian inverse problems and most likely paths in transition path theory.


Introduction
A maximum a posteriori (MAP) estimator is an important feature of a Bayesian inverse problem (BIP) because of its interpretation as a mode of the posterior distribution, i.e. as a point in the parameter space $X$ to which the posterior assigns the most mass, relative to other points. This interpretation is only heuristic, because even in the straightforward case that the parameter space has finite dimension and the posterior admits a Lebesgue density, every point will have measure zero. To make the interpretation rigorous, one can consider, for a given probability measure $\mu$ on $X$, the behaviour of ratios of small ball probabilities $\mu(B_r(x_1)) / \mu(B_r(x_2))$ for infinitesimally small $r$ and for any two parameters $x_1, x_2 \in X$. Intuitively, if $x_2$ is a mode of $\mu$, then, for any $x_1$, the limit superior of this ratio must be less than or equal to 1.
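The small-ball ratio can be made concrete in one dimension. The following minimal sketch assumes, purely for illustration, that $\mu$ is the standard Gaussian measure on $\mathbb{R}$ (this choice and all function names are assumptions, not taken from the paper). It evaluates $\mu(B_r(x_1)) / \mu(B_r(x_2))$ for shrinking radii and compares the result with the density ratio $\rho(x_1)/\rho(x_2)$, which is the value one expects in the limit when a continuous Lebesgue density exists.

```python
import math


def normal_cdf(t):
    """Distribution function of the standard Gaussian on R."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def ball_mass(x, r):
    """mu(B_r(x)) for the standard Gaussian mu (illustrative choice of mu)."""
    return normal_cdf(x + r) - normal_cdf(x - r)


def ball_ratio(x1, x2, r):
    """Ratio of small ball probabilities mu(B_r(x1)) / mu(B_r(x2))."""
    return ball_mass(x1, r) / ball_mass(x2, r)


x1, x2 = 1.0, 0.0  # x2 = 0 is the mode of the standard Gaussian
density_ratio = math.exp(-(x1 ** 2 - x2 ** 2) / 2.0)  # rho(x1) / rho(x2)

for r in (1.0, 0.1, 0.01, 0.001):
    print(f"r = {r:<6} ratio = {ball_ratio(x1, x2, r):.8f}")
print(f"density ratio    = {density_ratio:.8f}")
```

With $x_2 = 0$, the ratio stays at or below 1 for every radius tried, in line with the heuristic description of a mode.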
In Part I of this paper (Ayanbayev et al., 2021), we called any $x_2$ that satisfies the limit superior inequality in the previous paragraph a global weak mode of $\mu$, and showed that, under certain assumptions, a point is a global weak mode if and only if it minimises an Onsager-Machlup (OM) functional $I_\mu \colon X \to \overline{\mathbb{R}}$ of $\mu$. In practice, the full posterior is not accessible and must be approximated, and we also analysed the convergence behaviour of the modes associated to an arbitrary collection $\{\mu^{(n)} \mid n \in \mathbb{N} \cup \{\infty\}\}$ of measures defined on a metric space $X$, where $\mu^{(\infty)}$ plays the role of the full posterior and $(\mu^{(n)})_{n \in \mathbb{N}}$ plays the role of a sequence of approximate posteriors. Our findings were as follows:
(a) If (extended) OM functionals $I_{\mu^{(n)}} \colon X \to \overline{\mathbb{R}}$ exist for each $n \in \mathbb{N} \cup \{\infty\}$ and $(I_{\mu^{(n)}})_{n \in \mathbb{N}}$ is an equicoercive sequence with $\Gamma\text{-}\lim_{n \to \infty} I_{\mu^{(n)}} = I_{\mu^{(\infty)}}$, then minimisers of $I_{\mu^{(n)}}$ converge (up to taking subsequences) to a minimiser of $I_{\mu^{(\infty)}}$ (Ayanbayev et al., 2021, Section 4).
(b) Since modes of $\mu^{(n)}$ are minimisers of their OM functionals, it follows that modes converge (up to taking subsequences) to a mode of $\mu^{(\infty)}$ (Ayanbayev et al., 2021, Section 4).
(c) In a BIP, each posterior $\mu^{(n)}$ is determined by a potential $\Phi^{(n)}$ and a prior, $n \in \mathbb{N} \cup \{\infty\}$. Under rather weak assumptions on the $\Phi^{(n)}$, if the conditions in (a) hold for the priors, then they also hold for the posteriors. In particular, the existence of the OM functionals $I_{\mu^{(n)}}$ for the posteriors follows from the existence of the OM functionals for the priors (Ayanbayev et al., 2021, Section 6).
In principle, establishing $\Gamma$-convergence and equicoercivity would require explicit formulae for the OM functionals of the posteriors, and such formulae can be difficult to obtain. Fortunately, by (c), we only need to prove $\Gamma$-convergence and equicoercivity for the OM functionals of the priors and continuous convergence of the potentials.
Indeed, for some commonly-used priors, the OM functionals of the priors have a simple form and the requisite Γ-convergence and equicoercivity calculations can be performed more-or-less explicitly.
In Part I of this paper (Ayanbayev et al., 2021), we determined OM functionals and proved (a) for possibly degenerate Gaussian measures, as well as for Besov-1 measures. In this paper, we aim to do the same for a rather large class of countable product measures defined on weighted sequence spaces. This class of measures consists of countable products of scaled and shifted copies of a reference probability measure $\mu_0$ on $\mathbb{R}$, where $\mu_0$ admits a sufficiently regular Lebesgue density. The class includes Gaussian measures, Cauchy measures, and Besov-$p$ measures for $1 \leq p \leq 2$. The precise description of this class is given in Assumption 4.1.
The first main contribution of this paper, Theorem 4.10, shows the existence of and derives an explicit formula for OM functionals of measures in this class under another technical assumption. The second main contribution is to prove equicoercivity and Γ-convergence of OM functionals associated to a convergent sequence in this class, where convergence is meant in the sense of convergence of the scale and shift sequences, and convergence of the Lebesgue densities of the reference probability measures: see Theorems 4.13 and 4.14.
As concrete examples, we consider Besov-$p$ measures for $1 \leq p \leq 2$, and Cauchy measures. Since Bayesian inference is often performed on infinite-dimensional separable Banach or Hilbert spaces, we also translate the results from the weighted sequence space setting to the separable Banach or Hilbert space setting.
The main challenge in this work is proving the existence of the extended OM functionals. In this paper, we consider two approaches for this. The first approach, which we call the continuity approach, considers shifted measures $\mu_h(\,\cdot\,) := \mu(\,\cdot\, - h)$ and the corresponding Radon-Nikodym derivatives $r^\mu_h := \frac{d\mu_h}{d\mu}$, whenever they exist. The main idea of this approach, which has previously been used by Helin and Burger (2015) and Agapiou et al. (2018), is to consider the negative logarithm of the function $E \ni h \mapsto r^\mu_{-h}(u^*)$, where $u^*$ is some suitable reference point, and $E \subseteq X$ is a subset on which $r^\mu_h$ is continuous and which may depend on the reference point $u^*$. We make some contributions to this approach. Ultimately, we do not use it for the derivation of our main results, because proving continuity on a sufficiently large subset $E \subseteq X$ turns out to be more challenging than using a different approach.
The second approach, which we call the direct approach, avoids considering continuity of $r^\mu_{-h}$, and directly addresses the limit of the ratio $\mu(B_r(x_1)) / \mu(B_r(x_2))$ as $r \searrow 0$ to derive the OM functional of $\mu$ on a sufficiently large subset $E \subseteq X$. By removing the constraint on $E$ that $r^\mu_{-h}$ must be continuous on $E$, we can prove a formula for the OM functional using this direct approach, for the class of probability measures mentioned above.
We emphasise, however, that in both approaches it is important to consider points in X \ E with great care. In the direct approach, we achieve this by proving a property M (µ, E) which guarantees that we do not miss any modes outside of E.
The structure of the paper is as follows. In Section 2 we discuss related work. Section 3 introduces key notation and concepts, including the formal definition of the OM functional. In Section 4, we present the main results of this paper, namely the derivation of OM functionals of certain product measures on the sequence space R N as well as the Γ-convergence and equicoercivity properties of sequences of such measures (and the images of such measures in Hilbert and Banach spaces). In Section 5, we summarise the results of the paper and suggest some directions for future work. We collect auxiliary results in Appendix A and state technical proofs in Appendix B.

Overview of related work
OM functionals have been extensively studied in the context of stochastic processes defined by stochastic differential equations; see e.g. (Ledoux, 1996, Chapter 7) and the references therein. However, $\Gamma$-convergence does not appear to have been considered in this context until the work of Pinski et al. (2012). In their work, $\Gamma$-convergence tools were used to study the minimisers of OM functionals in the zero temperature limit. Lu et al. (2017a) considered optimal Gaussian approximations of the law of a diffusion process with respect to the Kullback-Leibler divergence using $\Gamma$-convergence, and studied the relationship between the OM functional and the so-called Freidlin-Wentzell rate functional. Some examples of recent work that further investigate this relationship include (Du et al., 2021;.
OM functionals have only recently been studied in the context of BIPs and their MAP estimators, beginning with the seminal work of Dashti et al. (2013), and continuing with (Clason et al., 2019; Dunlop and Stuart, 2016; Helin and Burger, 2015), for example. The importance of the OM functional in this context is that its minimisers are the modes (MAP estimators) of the posterior measure. However, these works establish OM functionals only for very few measures and do not consider $\Gamma$-convergence, as they only study a single fixed posterior measure instead of a sequence of such measures. As far as we are aware, the only application of $\Gamma$-convergence tools in the context of BIPs is the work of Lu et al. (2017b), where the goal is to find optimal Gaussian approximations of non-Gaussian probability measures on $\mathbb{R}^d$ with respect to the Kullback-Leibler divergence. The $\Gamma$-limits of interest are specified in terms of increasing quantity of data or decreasing amplitude of noise in the data. The $\Gamma$-limit is used to characterise frequentist consistency properties of the measure, including a Bernstein-von Mises result. However, Lu et al. (2017b) do not mention OM functionals.

Preliminaries and notation
Throughout this article, $X$ will denote a topological space, which in many cases will be a metric, normed, Banach or Hilbert space. When thought of as a measurable space, $X$ will be equipped with its Borel $\sigma$-algebra $\mathcal{B}(X)$, which is generated by the collection of all open sets. If $X$ is a metric space, then we write $B_r(x)$ for the open ball in $X$ of radius $r$ centred on $x$, in which case $\mathcal{B}(X)$ is generated by the collection of all open balls. The most prominent spaces considered in this manuscript are the real sequence spaces $\ell^p := \ell^p(\mathbb{N})$ of $p$th-power summable sequences, $1 \leq p < \infty$, as well as the $\alpha$-weighted $\ell^p$ spaces defined by
$$\ell^p_\alpha := \bigl\{ x \in \mathbb{R}^{\mathbb{N}} \bigm| \|x\|_{\ell^p_\alpha} := \bigl\| (x_k / \alpha_k)_{k \in \mathbb{N}} \bigr\|_{\ell^p} < \infty \bigr\}, \qquad \alpha \in \mathbb{R}^{\mathbb{N}}_{>0}.$$
The $\ell^p$ and $\alpha$-weighted $\ell^p$ spaces are separable Banach spaces. In many cases, we will first define the measure $\mu$ on $(\mathbb{R}^{\mathbb{N}}, \mathcal{B}(\mathbb{R}^{\mathbb{N}}))$, where $\mathbb{R}^{\mathbb{N}}$ is equipped with the product topology, show that $\mu(X) = 1$ for $X = \ell^p_\alpha$ for some $1 \leq p < \infty$ and $\alpha \in \mathbb{R}^{\mathbb{N}}_{>0}$, and then view $\mu$ as a measure on $(X, \mathcal{B}(X))$. For this purpose, it is important to note that the Borel $\sigma$-algebra $\mathcal{B}(X)$ is contained in the Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^{\mathbb{N}})$; see Lemma B.1.
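To make the role of the weights concrete, the following minimal sketch computes the weighted norm of finite truncations of a sequence. It assumes the convention $\|x\|_{\ell^p_\alpha} = (\sum_k |x_k/\alpha_k|^p)^{1/p}$, which is an assumption of this illustration, as are the function name and the example sequences. With growing weights $\alpha_k = k$, the harmonic sequence $x_k = 1/k$ has bounded weighted $\ell^1$ partial sums, although it is not summable in the unweighted sense.

```python
import math


def weighted_lp_norm(x, alpha, p):
    """Weighted l^p norm of a finite truncation.

    Assumed convention (not confirmed by the text): entries are divided
    by the weights, i.e. the norm is (sum_k |x_k / alpha_k|^p)^(1/p).
    """
    if p < 1:
        raise ValueError("p must satisfy 1 <= p < infinity")
    return sum(abs(xk / ak) ** p for xk, ak in zip(x, alpha)) ** (1.0 / p)


n = 10_000
x = [1.0 / k for k in range(1, n + 1)]        # harmonic sequence x_k = 1/k
ones = [1.0] * n                              # trivial weights: plain l^1 norm
alpha = [float(k) for k in range(1, n + 1)]   # growing weights alpha_k = k

unweighted = weighted_lp_norm(x, ones, 1)     # partial harmonic sum, grows like log n
weighted = weighted_lp_norm(x, alpha, 1)      # partial sum of 1/k^2, below pi^2/6

print(f"unweighted partial norm: {unweighted:.4f}")
print(f"weighted partial norm:   {weighted:.4f} (bound {math.pi ** 2 / 6:.4f})")
```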
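The set of all probability measures on $(X, \mathcal{B}(X))$ will be denoted $\mathcal{P}(X)$. We denote its elements by $\mu$, $\nu$, $\mu_0$, $\mu^{(n)}$, $n \in \mathbb{N} \cup \{\infty\}$, etc. The topological support of a measure $\mu \in \mathcal{P}(X)$ on a metric space $X$ is
$$\operatorname{supp}(\mu) := \{ x \in X \mid \mu(B_r(x)) > 0 \text{ for all } r > 0 \},$$
which is always a closed subset of $X$. We write $\overline{\mathbb{R}}$ for the extended real line $\mathbb{R} \cup \{\pm\infty\}$, i.e. the two-point compactification of $\mathbb{R}$, and $\overline{\mathbb{R}}_{\geq 0} := \mathbb{R}_{\geq 0} \cup \{\infty\}$. We denote the absolute continuity of $\mu$ with respect to $\nu$ by $\mu \ll \nu$, their equivalence (i.e. mutual absolute continuity) by $\mu \sim \nu$, and their mutual singularity by $\mu \perp \nu$.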
As motivated in Section 1, we now introduce the term "Onsager-Machlup functional" of a measure µ, the minimisers of which correspond exactly to global weak modes of µ under certain assumptions (Ayanbayev et al., 2021, Proposition 4.1).
Definition 3.1. Let X be a metric space and let µ ∈ P(X). We say that (3.4) and in this situation we extend I to a function I : X → R with I(x) := +∞ for x ∈ X \ E.
As we remark in Part I of this paper (Ayanbayev et al., 2021, Section 3), property M (µ, E) does not depend on the choice of x ⋆ in (3.4). The importance of property M (µ, E) is that it guarantees that we only need to look for global weak modes of µ within E and may freely ignore points in X \ E. This also justifies setting I := +∞ outside E. However, in order for this property to hold, the subset E on which an OM functional can be defined needs to be chosen to be as large as possible. On the other hand, any measure has an OM functional on sufficiently small E (such as a singleton set), and so there is a certain tension between existence of an OM functional and the M -property. We recall also that OM functionals are at best unique up to the addition of real constants (Ayanbayev et al., 2021, Remark 3.4). Whenever we prove Γ-convergence and equicoercivity, we use the same version of the OM functional.
The following terminology will be necessary for the continuity approach mentioned in Section 1.
Definition 3.2. When X is a linear topological space, µ ∈ P(X), and h ∈ X, we write µ h for the shifted measure For h ∈ Q(µ), we define the shift density r µ h := dµ h dµ ∈ L 1 (µ) as the Radon-Nikodym derivative of µ h with respect to µ, i.e.
Remark 3.3. Note that, in contrast to OM functionals, the shift-quasi-invariance space Q(µ) and the shift density r µ h do not depend on a particular metric.

OM functionals for product measures; equicoercivity and Γ-convergence
Determining the shift-quasi-invariance space $Q(\mu)$, the shift density $r^\mu_h$ and the OM functional $I_\mu$ for a general measure $\mu$ on an infinite-dimensional space is a challenging task, as is establishing $\Gamma$-convergence and equicoercivity for such OM functionals. In the following, we describe two approaches that apply to a class of shifted product measures $\mu = \bigotimes_{k \in \mathbb{N}} \mu_k$, $\mu_k(\,\cdot\,) := \mu_0(\gamma_k^{-1}(\,\cdot\, - m_k))$. This class includes many of the classical prior measures that arise in the study of inverse problems, such as Gaussian, Besov, and Cauchy measures. Their common structure is summarised by the following assumptions on $\mu$, where (A1)-(A3) should be seen as common basic assumptions, while (A4)-(A6) are technical assumptions that will be used individually in specific settings.
Remark 4.2. While many product measures satisfy (A5), the Besov measure $\mu = B^s_p$ with $1 \leq p < 2$ does not have a sufficiently smooth probability density $\rho$. This is why we treat this case separately, via (A6).
Note also that, since the shift-quasi-invariance space Q(µ) and the shift density r µ h do not depend on the particular metric (cf. Remark 3.3), the corresponding results hold on all of R N and do not require (A1).
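Many prior measures of interest, such as Gaussian, Cauchy and Besov measures, are often defined on Banach or Hilbert spaces $Z$ that are not subspaces of $\mathbb{R}^{\mathbb{N}}$. Thus, we introduce the following notation, which will allow us to translate the results from $\ell^p_\alpha \subseteq \mathbb{R}^{\mathbb{N}}$ to $Z$:
Notation 4.3. Let $X = \ell^p_\alpha$ for some $1 \leq p < \infty$ and $\alpha \in \mathbb{R}^{\mathbb{N}}_{>0}$. Let $Z$ denote a separable Banach space with Schauder basis $\psi = (\psi_k)_{k \in \mathbb{N}}$ such that the synthesis operator
$$S_\psi \colon X \to Z, \qquad x = (x_k)_{k \in \mathbb{N}} \mapsto \sum_{k \in \mathbb{N}} x_k \psi_k,$$
and the coordinate operator
$$T_\psi \colon Z \to \mathbb{R}^{\mathbb{N}}, \qquad z = \sum_{k \in \mathbb{N}} v_k \psi_k \mapsto (v_k)_{k \in \mathbb{N}},$$
are well defined and $S_\psi$ is a continuous embedding. Note that $T_\psi \circ S_\psi = \mathrm{Id}_X$. For a probability measure $\mu \in \mathcal{P}(X)$, we denote by $\mu_\psi := (S_\psi)_\# \mu$ the push-forward of $\mu$ under $S_\psi$. If instead of $\mu \in \mathcal{P}(X)$ we have $\mu \in \mathcal{P}(\mathbb{R}^{\mathbb{N}})$ and $\mu(X) = 1$, then $\mu_\psi$ denotes the push-forward of the restriction of $\mu$ to $(X, \mathcal{B}(X))$.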
Example 4.4. The standard example of the setup described by Notation 4.3 is to consider $(\psi_k)_{k \in \mathbb{N}}$ to be the standard Fourier basis of the space $Z = L^2(\mathbb{T}^d; \mathbb{R})$ of square-integrable periodic functions in $d$ variables. Taking $p = 2$ and $\alpha = (1, 1, \dots)$, the operators $T_\psi$ and $S_\psi$ are isometries: they are the Fourier transform and its inverse, respectively. By way of contrast, taking $\alpha_n \sim n^s$ for $s > 0$ yields a Sobolev space as $Z$, and further taking $p \neq 2$ yields a Besov space.
Most of our results on $X = \ell^p_\alpha$ can be transferred to the Banach space $Z$ via $S_\psi$. However, for the statements concerning OM functionals, we will assume in addition that $S_\psi$ is an isometry, i.e. that $\|x\|_X = \|S_\psi x\|_Z$ for every $x \in X$. This is because the definition of the OM functional depends strongly on the metric, and because even equivalent norms can yield different OM functionals (Ayanbayev et al., 2021, Example B.4).
Lemma 4.5. Suppose that Assumption 4.1 (A1)-(A3) hold. If S ψ in Notation 4.3 is an isometry, then Hence, if I µ : X → R is an OM functional for µ, then defines an OM functional for µ ψ . Similarly, if I µ ψ : X → R is an OM functional for µ ψ , then I µ := I µ ψ • S ψ : X → R defines an OM functional for µ.
The two approaches that we consider for establishing OM functionals consist of the continuity approach, which we present in Section 4.1, and the direct approach, which we present in Section 4.2. In the literature on MAP estimators, the continuity approach appears to have been first proposed by Helin and Burger (2015). The approach connects the OM functional for µ with the continuity of the shift density r µ h from Definition 3.2. In contrast, the direct approach considers the ratio of small ball probabilities directly, and does not require continuity of the shift density r µ h .

Continuity approach
We present some results that are related to the approach from (Helin and Burger, 2015), i.e. the approach of using continuity of the shift density $r^\mu_h$. The results Lemma 4.6 and Corollary 4.7 do not require the product structure of the measure as formulated in Assumption 4.1. Theorem 4.8 derives the shift-quasi-invariance spaces $Q(\mu)$ and shift densities $r^\mu_h$ (see footnote 1) specifically for product measures fulfilling Assumption 4.1 (A2)-(A4). These assumptions refer to the continuity and symmetry of the reference density $\rho$, the affine transformation relationship between the $\mu_k$ and $\mu_0$, and the finite Fisher information condition. One of the key disadvantages of this approach is that it requires the existence of representatives of shift densities or logarithmic derivatives that are continuous on sets of full measure, see e.g. (Helin and Burger, 2015, Assumption (A1)). This is the reason why we do not use either Lemma 4.6 or Corollary 4.7 to derive OM functionals.
Lemma 4.6. Let $X$ be a vector space with a metric and $\mu \in \mathcal{P}(X)$. Let $A \in \mathcal{B}(X)$ be a bounded neighbourhood of the origin. Let $\mu(F) = 1$ for some $F \in \mathcal{B}(X)$, and $h \in Q(\mu)$. Assume that the shift density $r^\mu_h$ has a representative $\tilde{r}^\mu_h$ (i.e. $r^\mu_h - \tilde{r}^\mu_h = 0$ in $L^1(\mu)$) such that $\tilde{r}^\mu_h|_F \colon F \to \mathbb{R}_{\geq 0}$ is continuous (see footnote 2). Then, for all $x \in F \cap \operatorname{supp}(\mu)$, the limit below exists and
Proof. Let $x \in X$ and $\varepsilon > 0$ be arbitrary. By definition of the shift density $r^\mu_h$,
Footnote 1. We wish to highlight the case of Besov-$p$ measures: in previous work (Agapiou et al., 2018), formulas for $Q(\mu)$ and $r^\mu_h$ could only be derived for $p = 1$ by a considerable amount of work, while our results include the cases $1 \leq p < \infty$ and the proof is a rather simple application of Theorems A.1 and A.2.
Footnote 2. This is a much weaker assumption than continuity of $\tilde{r}^\mu_h$ on $F$, which would mean that $\tilde{r}^\mu_h$ is continuous at each point of $F$ as a function on $X$. See also (Lie and Sullivan, 2018a, Lemma 4.6) for a result that only requires local continuity.
Lemma 4.6 generalises (Agapiou et al., 2018, Lemma 2.3) in two ways: it does not require symmetry or convexity of $A$, and it requires the continuity of the restriction of $\tilde{r}^\mu_h$ to some set of full measure $F$, instead of continuity of $\tilde{r}^\mu_h$ on the whole space $X$. Continuity on $X$ was also assumed by Helin and Burger (2015, Lemma 2). On the other hand, Agapiou et al. (2018, Lemma 2.3) do not assume $A$ to be a bounded neighbourhood of the origin. However, the fraction of small ball probabilities on the left-hand side of (4.3) may be ill defined even if $\operatorname{supp}(\mu) = X$ and $A$ is symmetric and convex. For example, if $\mu$ is an absolutely continuous measure on $(\mathbb{R}^2, \mathcal{B}(\mathbb{R}^2))$ and $A = \{0\} \times [-1, 1]$ is a line segment, then $\mu(\varepsilon A + x) = 0$ for every $x$ and $\varepsilon$. If $A$ is a bounded neighbourhood of the origin, then the expression on the left-hand side of (4.3) is well defined if and only if $x \in \operatorname{supp}(\mu)$. In this case, we obtain the following result.
Corollary 4.7. Let X be a vector space with a metric, µ ∈ P(X) and F ∈ B(X) be a set of full measure. Assume that, for some h ∈ Q(µ), the shift density r µ h has a representativer µ h such Assume that the above condition holds for any h ∈ Q(µ), let x * ∈ F ∩ supp(µ) be arbitrary and defines an OM functional for µ on E(x * ).
and we obtain (4.5). Next, recall that (3.7) states that
The derivation of $Q(\mu)$ and $r^\mu_h$ for product measures $\mu$ that satisfy Assumption 4.1 (A2)-(A4) relies on a theorem of Kakutani (1948) and a consequence of this theorem, due to Shepp (1965). Therefore, we state both in Appendix A. Below,
$$H(\mu, \nu) := \int_\Omega \sqrt{\frac{d\mu}{d\lambda} \, \frac{d\nu}{d\lambda}} \, d\lambda$$
denotes the Hellinger integral of two probability measures $\mu$ and $\nu$ on the same measurable space $(\Omega, \mathcal{F})$, where $\lambda$ is another measure on $(\Omega, \mathcal{F})$ with $\mu, \nu \ll \lambda$. Note that the value of $H(\mu, \nu)$ is independent of the choice of $\lambda$; see e.g. (Jacod and Shiryaev, 2003, Chapter IV, §1.a, Lemma 1.8).
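The way Hellinger integrals detect quasi-invariance can be illustrated in the Gaussian special case. The sketch below is only an illustration under stated assumptions: it takes each factor to be the centred Gaussian $N(0, \gamma_k^2)$ (probabilist's scaling), uses the closed form $H = \exp(-h^2/(8\gamma^2))$ for the Hellinger integral of $N(0, \gamma^2)$ and $N(h, \gamma^2)$, and checks it against a simple quadrature. By Kakutani's dichotomy, the product measure and its shift are equivalent exactly when the product of the factor-wise Hellinger integrals is positive, which here means $\sum_k (h_k/\gamma_k)^2 < \infty$. All function names and the example sequences are hypothetical.

```python
import math


def hellinger_closed_form(h, gamma):
    """Hellinger integral of N(0, gamma^2) and N(h, gamma^2)."""
    return math.exp(-h * h / (8.0 * gamma * gamma))


def hellinger_numeric(h, gamma, n=20001, half_width=12.0):
    """Quadrature approximation of the integral of sqrt(p * q) over R."""
    lo = 0.5 * h - half_width * gamma
    step = 2.0 * half_width * gamma / (n - 1)
    norm = 1.0 / (gamma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(n):
        x = lo + i * step
        # sqrt(p(x) * q(x)) for the two Gaussian densities p and q
        total += norm * math.exp(-(x * x + (x - h) ** 2) / (4.0 * gamma * gamma))
    return total * step


def product_hellinger(h, gamma):
    """Product of factor-wise Hellinger integrals for a truncated product measure."""
    return math.prod(hellinger_closed_form(hk, gk) for hk, gk in zip(h, gamma))


n = 2000
gamma = [1.0 / k for k in range(1, n + 1)]
h_good = [1.0 / k ** 2 for k in range(1, n + 1)]  # h_k / gamma_k = 1/k, square summable
h_bad = [1.0 / k for k in range(1, n + 1)]        # h_k / gamma_k = 1, not square summable

good = product_hellinger(h_good, gamma)  # stays bounded away from 0 as n grows
bad = product_hellinger(h_bad, gamma)    # equals exp(-n / 8), tends to 0

print(f"quadrature vs closed form: {hellinger_numeric(1.0, 2.0):.12f} "
      f"vs {hellinger_closed_form(1.0, 2.0):.12f}")
print(f"product for admissible shift:   {good:.6f}")
print(f"product for inadmissible shift: {bad:.3e}")
```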
Theorem 4.8 (Shift-quasi-invariance space and shift density r µ h of certain product measures). Let µ satisfy Assumption 4.1 (A2)-(A4). Then the shift-quasi-invariance space of µ is Q(µ) = ℓ 2 γ and, for any h ∈ Q(µ) and x ∈ R N , . (4.8) Further, if Assumption 4.1 (A1) is satisfied, then the objects µ ψ , S ψ and T ψ defined in Notation 4.3 satisfy Q(µ ψ ) = S ψ (ℓ 2 γ ), and, for any h ∈ Q(µ ψ ) and z ∈ Z, r From the definition of µ k above, (A2) and (A3), we have dµ Using the definition of ν k and the a.e. positivity of ρ in (A4), it follows that µ k ∼ ν k and µ k ∼ν k for all k ∈ N. Using the change of variables formula, Hence, by Kakutani's theorem (Theorem A.1), µ ∼ ν if and only ifμ ∼ν, and similarly µ ⊥ ν if and only ifμ ⊥ν. Finally, Shepp's theorem (Theorem A.2) implies the following: This proves Q(µ) = ℓ 2 γ , where we used that ℓ 2 γ ⊆ X by Corollary B.5, while (4.8) follows directly from Theorem A.1. For the final statement first note that, since µ(X) = 1 by assumption, we have, for any B ∈ B(Z) and h ∈ Z, Hence, for h ∈ Z, the shift density r Having identified the shift-quasi-invariance space Q(µ) and the shift density r µ h , the second step in the continuity approach involves finding a representativer µ h and a sufficiently large subset F of X such that the restriction ofr µ h to F is continuous. The third step is then to apply either Lemma 4.6 or Corollary 4.7. We do not pursue the continuity approach further because the second step is difficult to carry out and because a more direct approach yielded the desired results. We describe the direct approach in the next section.

Direct approach
The following definition and theorem provide the basis for establishing the OM functional for the product measures defined in Assumption 4.1. We demonstrate this by applying both to the Cauchy measure in Corollary 4.28, and to the Besov-$p$ measure with $1 \leq p \leq 2$ in Corollary 4.21.
Recall that (A2) assumes that the reference measure $\mu_0$ on $\mathbb{R}$ has a continuous, symmetric density $\rho$ decreasing on $\mathbb{R}_{\geq 0}$, and (A3) assumes that each measure $\mu_k$ on $\mathbb{R}$ is obtained from $\mu_0$ by an affine change of variables.
In particular, in this case and under the additional assumption that E γ,m ⊆ m + ℓ 2 γ , I µ = q γ,m : X → R is an (extended) OM functional for µ.
Proof. The technical proof is given in Appendix B.1. Similar statements follow for the Banach space Z in Notation 4.3 under the assumption that S ψ is an isometry.
Corollary 4.11. Using Notation 4.3, assuming S ψ to be an isometry, and assuming that Assumption 4.1 (4.14) In particular, in this case and under the additional assumption that Proof. By Lemma 4.5, (4.13) and (4.14) follow directly from (4.11) and (4.12).
Theorem 4.10 yields the full OM functional for a limited class of product measures. We conjecture that the conclusions of Theorem 4.10 hold for a larger class of product measures.
In particular, property M (µ, E γ,m ) is satisfied and I µ : X → R with I µ = q γ,m defines an OM functional for µ.
The following two theorems refer to (4.2), which we recall below:
The following result concerns equicoercivity of a sequence of OM functionals. It assumes that one is given a sequence of probability measures, where each probability measure $\mu^{(n)} \in \mathcal{P}(\mathbb{R}^{\mathbb{N}})$ is defined, in the sense of Assumption 4.1 (A1)-(A3), by an absolutely continuous reference measure $\mu_0^{(n)} \in \mathcal{P}(\mathbb{R})$, a shift vector $m^{(n)} \in X = \ell^p_\alpha$, and a scaling vector $\gamma^{(n)} \in \mathbb{R}^{\mathbb{N}}_{>0}$, $n \in \mathbb{N} \cup \{\infty\}$ (note that $\gamma^{(n)} \in X$ by Lemma B.3). Furthermore, it assumes that each probability measure has an OM functional. The result states that if the sequence of probability measures $\mu^{(n)}$ converges to $\mu^{(\infty)}$ in the sense that both the sequence of shift vectors and the sequence of scaling vectors converge in $X$ to the corresponding pair of shift and scaling vectors, and if the sequence of Lebesgue densities of the reference measures converges pointwise, then the sequence of OM functionals is equicoercive.
Proof. By Assumption 4.1 (A2), the negative log-densities $q^{(n)} \colon \mathbb{R} \to \mathbb{R}_{\geq 0}$ are symmetric and their restrictions $q^{(n)}|_{\mathbb{R}_{\geq 0}} \colon \mathbb{R}_{\geq 0} \to \mathbb{R}_{\geq 0}$ are strictly monotonically increasing bijections. Let $t \geq 0$ and $a_n := (q^{(n)}|_{\mathbb{R}_{\geq 0}})^{-1}(t)$. Since $\rho^{(n)}$ converges pointwise to $\rho^{(\infty)}$ by assumption, $a_n \to a_\infty$ as $n \to \infty$. Further, by Theorem 4.10, $x \in X$. (4.16) The proof is structured around the following four steps, of which the second and fourth are straightforward.
Step 1. The operators are well defined, compact and a n T (n) − a ∞ T (∞) → 0 as n → ∞.
Step 2. It follows that the sets are pre-compact and, by (4.16), Step 3.
is sequentially pre-compact. Hence K t := K • t is compact, which proves equicoercivity of (I µ (n) ) n∈N . Note that for t < 0 there is nothing to prove, since Step 4. Equicoercivity of (I µ (n) ψ ) n∈N follows directly from Lemma 4.5. Recall that this lemma transforms an OM functional on the sequence space X into an OM functional on the separable Banach space Z, where X and Z are related by the synthesis operator S ψ : X → Z and coordinate operator T Ψ : Z → R N .
We now give the proofs of the non-trivial first and third steps.
Proof of Step 1. Let $n \in \mathbb{N} \cup \{\infty\}$. Since $\gamma^{(n)} \in \ell^p_\alpha$ by Lemma B.3, Hölder's inequality implies, for any $v \in \ell^\infty$, proving well-definedness of $T^{(n)}$. Consider the finite-rank operators Then $\|T^{(n)}_m - T^{(n)}\| \to 0$ as $m \to \infty$, since Hölder's inequality implies, for any $v \in \ell^\infty$ with $\|v\|_{\ell^\infty} \leq 1$, where the last term is independent of $v$ and goes to 0 as $m \to \infty$ since $\gamma^{(n)} \in \ell^p_\alpha$. Hence, $T^{(n)}$ is a compact operator. Finally, $\|T^{(n)} - T^{(\infty)}\| \to 0$ as $n \to \infty$, since Hölder's inequality implies, for any $v \in \ell^\infty$ with $\|v\|_{\ell^\infty} \leq 1$, where the last term is independent of $v$ and goes to 0 as $n \to \infty$ by assumption.
It follows that has a subsequence, which for simplicity we also denote by $(w^{(j)})_{j \in \mathbb{N}}$, that converges to some element $w \in X$. It follows that, as $j \to \infty$, and thus $(x^{(\nu)})_{\nu \in \mathbb{N}}$ has a convergent subsequence and $K^\bullet_t$ is sequentially pre-compact.
The following result concerns $\Gamma$-convergence of OM functionals. As in Theorem 4.13, one is given a sequence of probability measures, where each probability measure $\mu^{(n)}$ is defined by an absolutely continuous reference measure $\mu_0^{(n)}$, a shift vector $m^{(n)}$, and a scaling vector $\gamma^{(n)}$, and each probability measure has an OM functional. Again, we assume convergence in $X$ of the sequence of shift vectors and the sequence of scaling vectors. However, we replace the assumption of pointwise convergence of the sequence of Lebesgue densities in Theorem 4.13 with the assumption of local uniform convergence from below of the negative log-densities and assume the OM functionals to have the specific form $I_{\mu^{(n)}} = q^{(n)}_{\gamma^{(n)}, m^{(n)}}$. Under these assumptions, we obtain $\Gamma$-convergence of the OM functionals.
Note that Fatou's lemma is general enough to handle extended real-valued sequences, so we do not need to treat cases such as $I_{\mu^{(\infty)}}(x) = \infty$ separately. For the $\Gamma$-$\limsup$ inequality, let $x \in X$ and choose the sequence $(x^{(n)})_{n \in \mathbb{N}}$ in $X$ by is finite. By (A2) (the assumption that the reference density $\rho$ is continuous, symmetric, and monotonically decreasing) and the formula (4.9) (which states that $q(u) := -\log \frac{\rho(u)}{\rho(0)} = \log \rho(0) - \log \rho(u)$), it follows that $q^{(\infty)}$ is monotonically increasing, with $q^{(\infty)}(x) \to \infty$ as $|x| \to \infty$. If the terms are unbounded, then this implies that the $q^{(\infty)}(\,\cdot\,)$ are unbounded, and hence that $I_{\mu^{(\infty)}}(x)$ is not finite.
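The formula $q(u) = \log \rho(0) - \log \rho(u)$ can be evaluated directly for concrete reference densities. The sketch below is an illustration only: the Cauchy and standard Gaussian densities are chosen here as examples of continuous, symmetric densities that decrease on the non-negative half-line, and the function names are hypothetical. In both cases $q(0) = 0$ and $q$ increases without bound along the grid, matching the monotonicity and growth used in the argument above.

```python
import math


def q(rho, u):
    """q(u) = log rho(0) - log rho(u) for a density rho that is positive at 0 and u."""
    return math.log(rho(0.0)) - math.log(rho(u))


def cauchy_density(u):
    """Standard Cauchy density on R."""
    return 1.0 / (math.pi * (1.0 + u * u))


def gaussian_density(u):
    """Standard Gaussian density on R."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


grid = [0.0, 0.5, 1.0, 2.0, 5.0, 10.0]
cauchy_q = [q(cauchy_density, u) for u in grid]    # equals log(1 + u^2)
gauss_q = [q(gaussian_density, u) for u in grid]   # equals u^2 / 2

for u, qc, qg in zip(grid, cauchy_q, gauss_q):
    print(f"u = {u:5.1f}   Cauchy q = {qc:8.4f}   Gaussian q = {qg:8.4f}")
```

The logarithmic growth of the Cauchy case against the quadratic growth of the Gaussian case is visible in the printed table.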

By taking the contrapositive, it follows that if
Thus, It follows that Using the reverse Fatou lemma and that q (n) q (∞) for all but finitely many n ∈ N, by assumption.
follows directly from Lemma 4.5. For the Γ-lim inf inequality, we additionally use that ran S ψ is complete and therefore closed in Z.
While the proof of equicoercivity (Theorem 4.13) only uses the inequality $I_{\mu^{(n)}} \geq q^{(n)}_{\gamma^{(n)}, m^{(n)}}$, which holds by Theorem 4.10, the $\Gamma$-convergence of the corresponding OM functionals relies on the complete knowledge of the OM functionals, which are assumed to be given by $I_{\mu^{(n)}} = q^{(n)}_{\gamma^{(n)}, m^{(n)}}$. This assumption is proven in Theorem 4.10 only for certain product measures. For example, Theorem 4.10 applies to Cauchy measures and Besov-$p$ measures with $p \in [1, 2]$ (cf. Corollaries 4.21 and 4.28), but does not apply to Besov-$p$ measures with $p > 2$, because $m + \ell^2_\gamma \subsetneq E_{\gamma, m}$ in this case. Therefore, Conjecture 4.12 remains an important open problem.

Application to Besov measures
This section considers the $\Gamma$-convergence of OM functionals of Besov measures as introduced by Lassas et al. (2009) and Dashti et al. (2012) (see footnote 3). We will consider Besov $B^s_p$ measures with integrability parameter $1 \leq p \leq 2$ and smoothness $s \in \mathbb{R}$, in contrast to the analysis of Part I of this paper (Ayanbayev et al., 2021, Sections 5.1 and 5.2), which was limited to the cases $p \in \{1, 2\}$.
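Throughout this subsection, we make use of the following notation:
Notation 4.15. Let $s \in \mathbb{R}$, $d \in \mathbb{N}$, $1 \leq p \leq 2$, $\eta > 0$, $t := s - p^{-1} d (1 + \eta)$ and assume that $\tau := (s/d + 1/2)^{-1} > 0$. Define $\gamma_0 := 1$ and $\gamma, \delta \in \mathbb{R}^{\mathbb{N}}$ by as well as the probability measures $\mu_k$, $k \in \mathbb{N} \cup \{0\}$, on $\mathbb{R}$ with probability densities where $m \in \ell^p_\delta$ is some fixed shift. Further, let $Z_0$ be a separable Hilbert space (see footnote 4) with complete orthonormal basis $\psi = (\psi_k)_{k \in \mathbb{N}}$ and $S_\psi \colon \mathbb{R}^{\mathbb{N}} \to \prod_{k \in \mathbb{N}} \operatorname{span} \psi_k$, $c \mapsto \sum_{k \in \mathbb{N}} c_k \psi_k$. We emphasise that the direct product $\prod_{k \in \mathbb{N}} \operatorname{span} \psi_k$ is neither $\operatorname{span} \psi$ nor $Z_0$. In Corollary 4.20, we state how $S_\psi$ here is related to the synthesis operator $S_\psi \colon X \to Z$ from Notation 4.3.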
The role of $\eta$, $t$ and $\delta$ will be explained in Remark 4.19, where we discuss normed spaces of full Besov measure. We define (shifted) Besov measures as follows, using notation that is an adaptation of that of Dashti et al. (2012): and define the Besov space $X^s_p = X^s_p(\psi)$ as the completion of $\tilde{X}^s_p$ with respect to $\|\cdot\|_{X^s_p}$. By Parseval's identity, the initial space $Z_0$ coincides with the Besov space $X^0_2$.
Remark 4.18. Since it is the parameter $p$ that most strongly affects the qualitative properties of the measure, we often refer simply to a "Besov-$p$ measure" for any measure in the above class, regardless of the values of $s$, $d$, etc. The scaling of the Besov-2 measure corresponds to the "physicist's Gaussian distribution" rather than the "probabilist's Gaussian distribution". In particular, for $p = 2$, $v_k \sim \mu_k$ has variance $\frac{1}{2} \gamma_k^2$. A consequence of this is that the OM functional of the Besov-$p$ measure will be $\|\cdot\|^p_{X^s_p}$, i.e. appears to lack a prefactor of $\frac{1}{p}$ relative to the Gaussian OM functional (one half of the square of the Cameron-Martin norm) given by Ayanbayev et al. (2021, Section 5.1).
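The variance claim for $p = 2$ can be checked numerically. The sketch below assumes, as an illustration that is consistent with the stated variance but not confirmed by the text, that the unshifted one-dimensional factor has density proportional to $\exp(-|x/\gamma|^p)$. Under this assumption the variance is $\gamma^2 \, \Gamma(3/p) / \Gamma(1/p)$, which equals $\gamma^2/2$ for $p = 2$ and $2\gamma^2$ for $p = 1$. The function names are hypothetical.

```python
import math


def variance_closed_form(p, gamma):
    """Variance of the density proportional to exp(-|x / gamma|**p) (assumed form)."""
    return gamma ** 2 * math.gamma(3.0 / p) / math.gamma(1.0 / p)


def variance_numeric(p, gamma, n=200001, half_width=40.0):
    """Riemann-sum check of the variance on [-half_width * gamma, half_width * gamma]."""
    step = 2.0 * half_width * gamma / (n - 1)
    mass = 0.0
    second_moment = 0.0
    for i in range(n):
        x = -half_width * gamma + i * step
        w = math.exp(-abs(x / gamma) ** p)  # unnormalised density at x
        mass += w
        second_moment += x * x * w
    # The common factor `step` cancels in the ratio; the mean is 0 by symmetry.
    return second_moment / mass


gamma = 3.0
print(f"p = 2: closed form {variance_closed_form(2, gamma):.6f}, "
      f"numeric {variance_numeric(2, gamma):.6f}, gamma^2 / 2 = {gamma ** 2 / 2:.6f}")
print(f"p = 1: closed form {variance_closed_form(1, gamma):.6f}, "
      f"numeric {variance_numeric(1, gamma):.6f}")
```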
Remark 4.19. Note that the random variable $u = \sum_{k \in \mathbb{N}} v_k \psi_k$ in Definition 4.17 takes values in a space $Z$ that may be larger than $Z_0$. It has already been shown by Lassas et al. (2009, Lemma 2) that, for $\tilde{t} \in \mathbb{R}$, Hence, using the choice $t := s - p^{-1} d (1 + \eta)$ in Notation 4.15, $Z$ can be chosen as the Besov space $X^t_p(\psi) = S_\psi(\ell^p_\delta)$, i.e. "just a bit larger than" $X^{s - d/p}_p(\psi) = S_\psi(\ell^p_\gamma)$. The shift by $m \in \ell^p_\delta$ does not cause problems, since $S_\psi(m) \in X^t_p(\psi)$. For the sequence space Besov measure $\mu = B^s_p$ on $\mathbb{R}^{\mathbb{N}}$, the space $X^t_p = \ell^p_\delta$ has full $\mu$-measure.
Given Remark 4.19, we will from now on consider the Besov measures $\mu = B^s_p$ and $\mu = B^s_p(\psi)$ as measures on the normed spaces $X = X^t_p$ and $Z = X^t_p(\psi)$, respectively. Apart from the different degree of summability (2 in place of $p$), the next result can be interpreted as saying that the shifts $h$ with respect to which the $B^s_p$ measure is quasi-invariant are $\frac{d}{2}$ degrees smoother than the typical draws from that measure. For $p = 1$, the corresponding result was obtained in Agapiou et al. (2018, Lemma 3.5), without using Shepp's theorem.
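In preparation for the next two results, we recall Notation 4.3: $X = \ell^p_\alpha$ for some $1 \leq p < \infty$ and $\alpha \in \mathbb{R}^{\mathbb{N}}_{>0}$, $Z$ is a separable Banach space with Schauder basis $\psi = (\psi_k)_{k \in \mathbb{N}}$, the synthesis operator $S_\psi \colon X \to Z$ satisfies $x = (x_k)_{k \in \mathbb{N}} \mapsto \sum_{k \in \mathbb{N}} x_k \psi_k$, and the coordinate operator $T_\psi \colon Z \to \mathbb{R}^{\mathbb{N}}$ satisfies $z = \sum_{k \in \mathbb{N}} v_k \psi_k \mapsto (v_k)_{k \in \mathbb{N}}$. If $\mu \in \mathcal{P}(X)$, then $\mu_\psi := (S_\psi)_\# \mu$ is the push-forward of $\mu$ under $S_\psi$. For the following result, $X^t_p(\psi)$ and $B^s_p(\psi)$ are given in Definition 4.17.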
Corollary 4.20 (Shift-quasi-invariance space and shift density of a Besov measure). Let $\mu = B^s_p$ be the sequence space Besov measure on $\mathbb{R}^{\mathbb{N}}$ or on $X = X^t_p = \ell^p_\delta$. Then $Q(\mu) = \ell^2_\gamma = X$ and, for any $h \in Q(\mu)$ and $x \in \mathbb{R}^{\mathbb{N}}$ (respectively $x \in X$), Further, using Notation 4.3 with $\alpha = \delta$ and $Z = X^t_p(\psi)$ and, for any $h \in Q(\mu_\psi)$ and $z \in Z$, $r^{\mu_\psi}_h(z) = r^{\mu}_{T_\psi(h)}(T_\psi(z))$.
Proof. Assumption 4.1 (A2) and (A3), which concern the continuity and symmetry of the reference density $\rho$ and the assumption that each $\mu_k$ is related to $\mu_0$ by an affine transformation respectively, are satisfied by virtue of Definition 4.16. Assumption 4.1 (A1), which concerns the assumption that $X = \ell^p_\alpha$ and $\mu(X) = 1$, follows from Remark 4.19, while (A4), which states that the reference density $\rho$ has finite Fisher information, follows from a straightforward computation. Theorem 4.8 yields the formula (4.18) for $r^\mu_h$, the spaces $Q(\mu)$, $Q(\mu_\psi)$, and the equation for $r^{\mu_\psi}_h$.
The following corollary is an application of Theorems 4.10, 4.13 and 4.14 to Besov-$p$ measures $\mu$, $\mu^{(n)}$, $n \in \mathbb{N}$, $1 \leq p \leq 2$, with different smoothness parameters $s$, $s^{(n)}$ and shifts $m$, $m^{(n)}$ such that $s^{(n)} \to s$ and $m^{(n)} \to m$ as $n \to \infty$. Note that it is not entirely clear on which space $X$ to consider equicoercivity and $\Gamma$-convergence, since the measures $\mu$, $\mu^{(n)}$ seem to live on different spaces $X = \ell^p_\delta$, $X^{(n)} = \ell^p_{\delta^{(n)}}$ with After all, Theorems 4.13 and 4.14 explicitly demand all measures $\mu$, $\mu^{(n)}$ to be defined on the same space $X = \ell^p_\alpha$. However, as we will see, the assumed convergence $s^{(n)} \to s$ guarantees the existence of such a common space $X$ of full $\mu^{(n)}$-measure for all but finitely many $n \in \mathbb{N}$.

Using Notation 4.15, the OM functional
Then there exists $n_0 \in \mathbb{N}$ such that, for each $n \geq n_0$, $\mu^{(n)}(X) = 1$ and we therefore consider these measures on the same space $X = X^t_p = \ell^p_\delta$. Moreover, the sequence $(I_{\mu^{(n)}})_{n \geq n_0}$ of OM functionals of $\mu^{(n)}$ given by $I_{\mu^{(n)}} = \|\cdot - m^{(n)}\|^p_{X^{s^{(n)}}_p} \colon X \to \mathbb{R}$ is equicoercive and $I_\mu = \Gamma\text{-}\lim_{n \to \infty} I_{\mu^{(n)}}$. Similarly, using Notation 4.3 and assuming $S_\psi$ to be an isometry, $I_{\mu_\psi}$ and $I_{\mu^{(n)}}$
Proof. Assumption 4.1 (A1)-(A3) and (A6), i.e. the support condition on $\mu$, continuity and symmetry of the reference density $\rho$, affine transformation property and Besov property, are satisfied by Definition 4.16 and Remark 4.19 with hence (4.19) follows directly from Theorem 4.10. In other words, the result in Conjecture 4.12 holds for the Besov measures $\mu$ and $\mu^{(n)}$: and a similar result holds with $\mu$ replaced by $\mu^{(n)}$. The analogous statement for $I_{\mu_\psi}$ and $I_{\mu^{(n)}_\psi}$, $n \in \mathbb{N}$, follows from Lemma 4.5. Recall that this lemma transforms an OM functional on the sequence space $X$ into an OM functional on the separable Banach space $Z$, where $X$ and $Z$ are related by the synthesis operator $S_\psi \colon X \to Z$ and coordinate operator $T_\psi \colon Z \to \mathbb{R}^{\mathbb{N}}$. Since $s^{(n)} \to s$, there exists $n_0 \in \mathbb{N}$ such that, for $n \geq n_0$, $|s^{(n)} - s| \leq \frac{d\eta}{2p}$. Therefore, for $n \geq n_0$, $t = s - p^{-1} d (1 + \eta) < s^{(n)} - p^{-1} d$ and $\mu^{(n)}(X) = 1$ for $X = X^t_p = \ell^p_\delta$ by Remark 4.19. Further, for $n \geq n_0$, the sequences $a^{(n)} = \bigl(k^{-1-\eta} \, |k^{\frac{p}{d}(s - s^{(n)})} - 1|\bigr)_{k \in \mathbb{N}}$ are (uniformly) bounded by the summable sequence $a = (2 k^{-1-\eta/2})_{k \in \mathbb{N}}$ and the reverse Fatou lemma implies proving $\|\gamma^{(n)} - \gamma\|_X \to 0$. Equicoercivity and $\Gamma$-convergence of the sequences $(I_{\mu^{(n)}})_{n \in \mathbb{N}}$ and $(I_{\mu^{(n)}_\psi})_{n \in \mathbb{N}}$ directly follow from Theorems 4.13 and 4.14 respectively.
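The domination step in the proof above can be illustrated numerically. The sketch below assumes that the sequences read $a^{(n)}_k = k^{-1-\eta}\,|k^{e_n} - 1|$ with $e_n = \frac{p}{d}(s - s^{(n)})$ and $|e_n| \leq \eta/2$; this reading of the exponent is an assumption, and the names `a_n`, `dominating` and the chosen values of `eta` and `K` are hypothetical. It checks the bound by $2k^{-1-\eta/2}$ on a finite range and shows that the truncated sums shrink as $e_n \to 0$.

```python
def a_n(k, eta, exponent):
    # Term k of the sequence a^(n); `exponent` stands for (p/d) * (s - s^(n)).
    return k ** (-1.0 - eta) * abs(k ** exponent - 1.0)

def dominating(k, eta):
    # Term k of the summable dominating sequence a = (2 k^(-1 - eta/2))_k.
    return 2.0 * k ** (-1.0 - eta / 2.0)

eta = 0.5  # hypothetical value of the parameter eta
K = 5000   # truncation level for this finite check

# Exponents with absolute value at most eta / 2.
exponents = [eta / 2.0, 0.1, 0.01, -0.01, -0.1, -eta / 2.0]
bound_holds = all(
    a_n(k, eta, e) <= dominating(k, eta)
    for e in exponents
    for k in range(1, K + 1)
)

# Truncated sums for exponents tending to zero.
partial_sums = [
    sum(a_n(k, eta, e) for k in range(1, K + 1))
    for e in (0.2, 0.1, 0.05, 0.0)
]
print(bound_holds, partial_sums)
```

A finite check of this kind does not replace the dominated convergence argument; it only shows how the summable bound controls every term uniformly in $n$.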

Application to Cauchy measures
This section considers infinite-dimensional Cauchy measures in the sense of infinite products of one-dimensional Cauchy distributions, as used by e.g. Sullivan (2017) and Lie and Sullivan (2018b). We note that there is another class of "Cauchy measures" in the literature, namely the class of stochastic processes with Cauchy-distributed increments, as used by e.g. Markkanen et al. (2019) and Chada et al. (2021).
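Definition 4.22. We define the Cauchy measure $C(m, \gamma) := \bigotimes_{k \in \mathbb{N}} C(m_k, \gamma_k)$ on $\mathbb{R}^{\mathbb{N}}$ with shift parameter $m \in \mathbb{R}^{\mathbb{N}}$ and scale parameter $\gamma \in \mathbb{R}^{\mathbb{N}}_{>0}$ as the product measure of one-dimensional Cauchy measures on $\mathbb{R}$ with shift parameter $m_k$ and scale parameter $\gamma_k$, $k \in \mathbb{N}$, i.e. with probability densities $x \mapsto \frac{1}{\pi \gamma_k} \bigl(1 + \gamma_k^{-2} (x - m_k)^2\bigr)^{-1}$.
Assumption 4.23. $X = \ell^q$ for some $q \geq 1$, $m \in \ell^q$, $\gamma \in \ell^1(\mathbb{N}) \cap \mathbb{R}^{\mathbb{N}}_{>0}$. In addition, if $q = 1$, then $\gamma$ satisfies $\sum_{k \in \mathbb{N}} |\gamma_k \log|\gamma_k|| < \infty$.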
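As a small illustration of Definition 4.22, the sketch below implements the standard one-dimensional Cauchy density with shift $m_k$ and scale $\gamma_k$ and confirms by quadrature that the interval $[m_k - \gamma_k, m_k + \gamma_k]$ carries half of the mass, which is what "scale parameter" means for this family. The function names and the numerical values are hypothetical choices for this example.

```python
import math

def cauchy_density(x, m_k, gamma_k):
    # Standard Cauchy density with shift m_k and scale gamma_k > 0.
    z = (x - m_k) / gamma_k
    return 1.0 / (math.pi * gamma_k * (1.0 + z * z))

def mass_between(a, b, m_k, gamma_k, n=20001):
    # Composite trapezoidal rule for the probability of the interval [a, b].
    step = (b - a) / (n - 1)
    total = 0.5 * (cauchy_density(a, m_k, gamma_k) + cauchy_density(b, m_k, gamma_k))
    for i in range(1, n - 1):
        total += cauchy_density(a + i * step, m_k, gamma_k)
    return total * step

m_k, gamma_k = 0.3, 0.05  # hypothetical shift and scale of one coordinate
central_mass = mass_between(m_k - gamma_k, m_k + gamma_k, m_k, gamma_k)
print(central_mass)
```

The exact value is $\frac{1}{\pi}(\arctan(1) - \arctan(-1)) = \frac{1}{2}$, so $m_k \pm \gamma_k$ are the quartiles of the $k$-th coordinate.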
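Recall Notation 4.3: $X = \ell^p_\alpha$ for some $1 \leq p < \infty$ and $\alpha \in \mathbb{R}^{\mathbb{N}}_{>0}$, $Z$ is a separable Banach space with Schauder basis $\psi = (\psi_k)_{k \in \mathbb{N}}$, the synthesis operator $S_\psi \colon X \to Z$ satisfies $x = (x_k)_{k \in \mathbb{N}} \mapsto \sum_{k \in \mathbb{N}} x_k \psi_k$, and the coordinate operator $T_\psi \colon Z \to \mathbb{R}^{\mathbb{N}}$ satisfies $z = \sum_{k \in \mathbb{N}} v_k \psi_k \mapsto (v_k)_{k \in \mathbb{N}}$. If $\mu \in \mathcal{P}(X)$, then $\mu_\psi := (S_\psi)_\# \mu$ is the push-forward of $\mu$ under $S_\psi$.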
The following theorem guarantees the well-definedness of the random variable $u$ above:
Proof. The support condition (A1) follows from Theorem 4.25; the continuity and symmetry of the reference density $\rho$ (A2) and the affine transformation property of the $(\mu_k)_{k \in \mathbb{N}}$ (A3) follow from Definition 4.22; the finite Fisher information (A4) and smoothness assumptions on the reference density $\rho$ (A5) can be verified by straightforward computations.
Further, for $n \in \mathbb{N}$, let $\mu^{(n)} = C(m^{(n)}, \gamma^{(n)})$ be Cauchy measures such that $m^{(n)}$ and $\gamma^{(n)}$ satisfy Assumption 4.23 for the same $q \geq 1$ as above and $\|m^{(n)} - m\|_X \to 0$ and $\|\gamma^{(n)} - \gamma\|_X \to 0$ as $n \to \infty$. Then the sequence $(I_{\mu^{(n)}})_{n \in \mathbb{N}}$ is equicoercive and $I_\mu = \Gamma\text{-}\lim_{n \to \infty} I_{\mu^{(n)}}$. Similarly, using Notation 4.3 with $\alpha \equiv 1$ and assuming $S_\psi$ to be an isometry, $I_{\mu_\psi}$ and $I_{\mu^{(n)}_\psi}$, $n \in \mathbb{N}$, defined by (4.2) constitute OM functionals for $\mu_\psi = C_{q,\psi}(m, \gamma)$ and $\mu^{(n)}_\psi = C_{q,\psi}(m^{(n)}, \gamma^{(n)})$, respectively,
Proof. Assumption 4.1 (A1)-(A5) are satisfied by Lemma 4.26. We have where we used that $\sum_{k \in \mathbb{N}} \log\bigl(1 + \gamma_k^{-2} (h_k - m_k)^2\bigr)$ is finite if and only if $h - m \in \ell^2_\gamma$, as well as Corollary B.5 to guarantee that $\ell^2_\gamma \subseteq X$. Thus, the first statement follows from Theorem 4.10, i.e. the result in Conjecture 4.12 holds for the Cauchy measures $\mu$ and $\mu^{(n)}$, $n \in \mathbb{N}$: and a similar result holds with $\mu$ replaced by $\mu^{(n)}$. The analogous statement for $I_{\mu_\psi}$ and $I_{\mu^{(n)}_\psi}$, $n \in \mathbb{N}$, follows from Lemma 4.5. Recall that this lemma shows that an OM functional on the sequence space $X$ yields an OM functional on the separable Banach space $Z$, where $X$ and $Z$ are related by the synthesis operator $S_\psi \colon X \to Z$. The equicoercivity and $\Gamma$-convergence of the sequences $(I_{\mu^{(n)}})_{n \in \mathbb{N}}$ and $(I_{\mu^{(n)}_\psi})_{n \in \mathbb{N}}$ now follow directly from Theorems 4.13 and 4.14 respectively.
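The sum appearing in the proof above can be evaluated on truncated sequences. The following is a minimal sketch, assuming that the OM functional of $C(m, \gamma)$ is $h \mapsto \sum_{k} \log\bigl(1 + \gamma_k^{-2}(h_k - m_k)^2\bigr)$ whenever $h - m \in \ell^2_\gamma$; the displayed formula is not reproduced in this section, so this form is an assumption. The names `cauchy_om_truncated` and `perturbed_value`, the truncation level `K`, and the particular sequences are hypothetical.

```python
import math

def cauchy_om_truncated(h, m, gamma):
    # Truncated sum of log(1 + ((h_k - m_k) / gamma_k)^2) over the given coordinates.
    return sum(
        math.log1p(((h_k - m_k) / g_k) ** 2)
        for h_k, m_k, g_k in zip(h, m, gamma)
    )

K = 200
gamma = [k ** -2.0 for k in range(1, K + 1)]          # summable scales gamma_k = k^(-2)
m = [0.0] * K                                         # shift sequence
h = [g / k for k, g in enumerate(gamma, start=1)]     # (h_k - m_k) / gamma_k = 1/k

value_at_shift = cauchy_om_truncated(m, m, gamma)     # zero at h = m
value_at_h = cauchy_om_truncated(h, m, gamma)
# Untruncated value of sum_k log(1 + 1/k^2), from the product formula for sinh.
limit_value = math.log(math.sinh(math.pi) / math.pi)

def perturbed_value(n):
    # Functional for perturbed parameters m^(n) = gamma / n and gamma^(n) = (1 + 1/n) gamma.
    gamma_n = [(1.0 + 1.0 / n) * g for g in gamma]
    m_n = [g / n for g in gamma]
    return cauchy_om_truncated(h, m_n, gamma_n)

print(value_at_shift, value_at_h, limit_value, perturbed_value(10 ** 6))
```

This only illustrates pointwise convergence of the truncated functionals at a fixed $h$ as the parameters converge; it is not a numerical check of $\Gamma$-convergence or equicoercivity.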

Closing remarks
In this paper, our first main contribution is to obtain a formula for the OM functionals of a class of probability measures on a weighted sequence space $X = \ell^p_\alpha$. This class is defined using Assumption 4.1, and the key result that we used to obtain these formulas is Theorem 4.10. In addition, we considered collections of measures in this class that converge to a limiting measure in the sense that the shift and scale sequences converge to a limiting pair of shift and scale sequences, and the Lebesgue densities of the associated reference measures converge. Our second main contribution is to state sufficient conditions for equicoercivity and $\Gamma$-convergence of the corresponding sequence of OM functionals. For this, we relied on Theorem 4.13 and Theorem 4.14. In addition, we applied these results to Cauchy and Besov-$p$ measures for $1 \leq p \leq 2$. We used the results in the weighted sequence space setting to prove the analogous results for measures on separable Banach or Hilbert spaces.
In the context of BIPs, the Besov, Cauchy, and more general product measures considered in this paper arise most naturally as prior distributions. The results of this paper therefore provide a convergence theory for the corresponding prior OM functionals. Since these priors are unimodal, this convergence theory would appear to be surplus to requirements; it is in some sense "obvious" how the modes of sequences of such measures ought to converge. However, the importance of this paper's results is that prior Γ-convergence and equicoercivity can be transferred to the posterior using the results of Part I of this paper (Ayanbayev et al., 2021, Section 6), and understanding the convergence of posterior modes (i.e. MAP estimators) is a non-trivial and novel contribution.
An important open problem raised in this paper is Conjecture 4.12. Proving this conjecture would significantly enhance the applicability of our results. In addition, it would be of interest to study equicoercivity and Γ-convergence of so-called "generalised OM functionals" as introduced by Clason et al. (2019).

A. Equivalence of product measures
The following two dichotomies on the equivalence or mutual singularity of certain infinite product measures are classical results. Here, H(µ, ν) denotes the Hellinger integral defined in (4.7).
Then precisely one of the following alternatives holds true:

B. Technical supporting results
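Lemma B.1. Let $X = \ell^p_\alpha$ for some $1 \leq p < \infty$ and $\alpha \in \mathbb{R}^{\mathbb{N}}_{>0}$ and let $Y = \mathbb{R}^{\mathbb{N}}$ be equipped with the product topology and the corresponding Borel $\sigma$-algebra $\mathcal{B}(Y)$. Then $\mathcal{B}(X) \subseteq \mathcal{B}(Y)$.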
Proof. By definition of the product topology, for $i \in \mathbb{N}$, the projections $\pi_i(y) = y_i$, $y \in Y$, are continuous and so are the functions $f_i(y) = \bigl|\frac{y_i - z_i}{\alpha_i}\bigr|^p$, where $z \in Y$ is any fixed sequence. Hence, the $(f_i)_{i \in \mathbb{N}}$ are Borel measurable, and so is the function $f(y) = \|y - z\|^p_{\ell^p_\alpha}$ for some $0 < \tau < \infty$, if the following condition is fulfilled: $\sim \mu_0$. Note that we may assume $m = 0$ without loss of generality since $m \in X$ and therefore $v \in X$ if and only if $v + m \in X$. Let $w_k := \bigl|\frac{\gamma_k}{\alpha_k} u_k\bigr|^p$. Since $\mu(\ell^p_\alpha) = 1$, $\|v\|^p_{\ell^p_\alpha} = \sum_{k \in \mathbb{N}} w_k < \infty$ a.s., which, by Kallenberg (2021, Theorem 5.18), implies: (i) for any $A > 0$, $\sum_{k \in \mathbb{N}} \mathbb{P}(|w_k| > A) < \infty$ and (ii) First note that (i) implies $\gamma_k / \alpha_k \to 0$ as $k \to \infty$. Hence, $c := \min_{k \in \mathbb{N}} c_k$ is strictly positive, where Since $|w_k| < 1$ if and only if $|u_k| < \frac{\alpha_k}{\gamma_k}$, it follows from (ii) that proving (a). If condition (B.1) is fulfilled, then there exists $K \in \mathbb{N}$ such that, for all $k \geq K$, $\frac{\alpha_k}{\gamma_k} \geq x_0$ and thereby Hence, condition (i) implies (b).
Proof. Since $\gamma \in \ell^p_\alpha \subseteq \ell^\infty_\alpha$ by Lemma B.3, the claim follows directly by considering the first and second alternatives in Proposition B.4 for the case where $p < 2$ and $p \geq 2$ respectively.

B.1. Proof of Theorem 4.10
In this section we give the proof of Theorem 4.10, which is technical and requires additional notation and lemmas:
Definition B.6. A non-negative function $f \colon \mathbb{R}^d \to \mathbb{R}_{\geq 0}$, $d \in \mathbb{N}$, has the symmetric decay property if either
• $d = 1$ and $f$ is symmetric, i.e. $f(x) = f(-x)$ for every $x \in \mathbb{R}$, and the restriction $f|_{\mathbb{R}_{\geq 0}}$ is monotonically decreasing; or
• $d > 1$ and $f$ has the symmetric decay property "along each coordinate", i.e., for any $u \in \mathbb{R}^d$, the functions $f(\,\cdot\,, u_2, \dots, u_d)$, $f(u_1, \,\cdot\,, u_3, \dots, u_d)$, $\dots$, $f(u_1, \dots, u_{d-1}, \,\cdot\,)$ have the symmetric decay property.
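The one-dimensional case of Definition B.6 can be probed on a finite grid. The sketch below is a necessary check only (a function may pass on a grid and still fail the definition elsewhere); the helper `has_symmetric_decay_1d` and the two test densities are hypothetical names introduced for this example.

```python
import math

def has_symmetric_decay_1d(f, grid):
    # Grid-based check of the d = 1 case: non-negativity, symmetry f(x) = f(-x),
    # and monotone decrease of f on the non-negative grid points.
    pts = sorted(x for x in grid if x >= 0)
    nonnegative = all(f(x) >= 0 and f(-x) >= 0 for x in pts)
    symmetric = all(math.isclose(f(x), f(-x), rel_tol=1e-12, abs_tol=1e-12) for x in pts)
    decreasing = all(f(a) >= f(b) for a, b in zip(pts, pts[1:]))
    return nonnegative and symmetric and decreasing

def centred_cauchy(x):
    # Standard Cauchy density, symmetric about 0 and decreasing on [0, infinity).
    return 1.0 / (math.pi * (1.0 + x * x))

def shifted_cauchy(x):
    # Cauchy density centred at 1, which is not symmetric about 0.
    return 1.0 / (math.pi * (1.0 + (x - 1.0) ** 2))

grid = [i * 0.1 for i in range(-50, 51)]
centred_ok = has_symmetric_decay_1d(centred_cauchy, grid)
shifted_ok = has_symmetric_decay_1d(shifted_cauchy, grid)
print(centred_ok, shifted_ok)
```

For $d > 1$, the same check could be applied to each coordinate slice with the remaining coordinates held fixed, in line with the second bullet of the definition.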
Proof. We will show that h has the symmetric decay property along the first coordinate. The proofs for the other coordinates proceed analogously. For any u = (u 2 , . . . , u d−1 ) ∈ R d−2 and any u 1 , u ′ 1 ∈ R with |u 1 | |u ′ 1 |, it holds that s(u 1 , u) s(u ′ 1 , u), and therefore The symmetry of h follows directly from the symmetry of s and f .
Proof. Due to symmetry, we only need to consider $v \geq 0$, and we split this into two cases, according to how $v$ compares with $2s$.
Since Λ : R → R 0 is even and non-negative, we obtain Step 1. Let m = 0. For every h ∈ X, N > 0 and 0 < ε < 1 there exist r * > 0 and K * ∈ N such that for any 0 < r < r * and K > K * , − log µ [1:K] (B Since the right-hand side does not depend on r and K and since N, ε > 0 are arbitrary, this proves (4.11) for m = 0.
We now give the proofs of the non-trivial first and second steps.