Effective estimates of Lyapunov exponents for random products of positive matrices

In this note we describe estimates on the error when calculating the Lyapunov exponent for random products of positive matrices using dynamical determinants. This extends the results in (Jurga N and Morris I 2019 Nonlinearity 32 4117–46; Pollicott M 2010 Invent. Math. 181 209–26) by drawing upon a new approach introduced in (Jenkinson O and Pollicott M 2018 Adv. Math. 325 87–115; Jenkinson O, Pollicott M and Vytnova P 2018 J. Stat. Phys. 170 221–53).


Introduction
The importance of Lyapunov exponents for random matrix products has been illustrated by the renowned work of Kesten [4], Furstenberg [3], Oseledets [9], and others, and by their influence in other areas of mathematics and physics. Lyapunov exponents for positive matrices, in particular, play a useful role in such topics as entropy rates in information theory [5] and in the theory of hidden Markov processes.
Let us consider for definiteness the case of 2 × 2 matrices, the general case being similar. Assume that A^(0) = (a^(0)_ij) and A^(1) = (a^(1)_ij) are two such matrices with strictly positive entries.
Moreover, for simplicity, we will assume that both matrices have determinant 1; otherwise the contribution from the determinant can easily be compensated for. Let 0 < p_1, p_2 < 1 with p_1 + p_2 = 1. We can define the Lyapunov exponent λ = λ(A_1, A_2; p_1, p_2) by

λ = lim_{n→+∞} (1/n) E( log ‖A_{i_n} ⋯ A_{i_1}‖ ),

where i_1, i_2, … ∈ {1, 2} are independent with P(i_k = j) = p_j, the existence of the limit following by subadditivity.
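As a simple illustration, the definition can be turned directly into a naive numerical estimate by following one long random orbit of a unit vector and renormalising at each step. This is only a sketch: the two matrices below are hypothetical examples (positive entries, determinant 1), not ones taken from the text.

```python
import math
import numpy as np

# Hypothetical example matrices (strictly positive entries, determinant 1).
A1 = np.array([[2.0, 1.0], [1.0, 1.0]])
A2 = np.array([[1.0, 1.0], [1.0, 2.0]])

def lyapunov_monte_carlo(mats, probs, n_steps=100_000, seed=0):
    """Estimate lambda = lim (1/n) E log ||A_{i_n} ... A_{i_1}|| by iterating
    a single long random product on a vector, renormalising at each step so
    that only the accumulated log-norms are stored."""
    rng = np.random.default_rng(seed)
    v = np.ones(2)
    total = 0.0
    for _ in range(n_steps):
        v = mats[rng.choice(len(mats), p=probs)] @ v
        norm = np.linalg.norm(v)
        total += math.log(norm)
        v /= norm
    return total / n_steps
```

The slow O(1/√n) Monte Carlo convergence of this estimator is precisely the inefficiency that motivates the determinant method described below.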
Typically there is no simple closed form for λ and so we often want to estimate the associated Lyapunov exponent numerically, as accurately as possible, given the data we can compute from finite products of matrices. Working from the definition is not very efficient in this regard, but there are a variety of different methods available. One empirically efficient approach was proposed in [10] and was further developed in [8]. The method involves introducing a convergent power series

d(z, s) := 1 + ∑_{n=1}^∞ a_n(s) z^n (for s ∈ R),

where the coefficient a_n(s) is expressed in terms of the 2^{n+1} possible finite products of up to n matrices, and expressing λ in terms of this series. This will be described in more detail in section 2. Ultimately, the value of any computed numerical quantity depends on its known accuracy. This in turn depends on the bounds on the size of the terms a_n(t). A useful indication of the size of the expected error comes from the size of the coefficient a_n(t). This is formalized in section 3, but has already been studied in [8, 10].
In this note we want to concentrate on the validated accuracy of these approximations to these coefficients, and thus to the Lyapunov exponent. There are what we shall call basic Euler bounds coming directly from the method used in [10], perhaps suitably refined using different functional spaces as in [6, 8]. But the updated approach in [6], using what we shall call computed bounds, usually leads to better bounds and, consequently, to more accurate estimates on the Lyapunov exponents. We will elaborate on these different bounds in section 3.
In this brief note we compare these two approaches for estimating the error and illustrate the improvement in the error through a number of examples. This is especially pronounced when the matrices are less hyperbolic (i.e., their eigenvalues have absolute values close to 1). This is illustrated by the following example.
Since the method of computing the Lyapunov exponent is based on the analysis of series like d(z, t), an indication of its accuracy is given by the size of the coefficients a_n. For p_1 = p_2 = 1/2 we show in table 1 the estimates for the Lyapunov exponents using the terms a_n(t) for n = 1, …, 9. We also show the different bounds for a_9(0) we get using the traditional Euler bounds and the new computational bounds. The improvement in the error term becomes particularly significant when t > 0 is small. This illustrates that the computed bounds are somewhat better when the hyperbolicity is weaker.
The new approach we describe for estimating the error terms in the Lyapunov exponent in terms of computational bounds is based on similar approaches for estimating other quantities in [6,7]. In particular, the present manifestation helps to further illustrate the principle of using computed bounds to estimate the error.

An approach using determinants
To begin, we first describe the basic approach from [10] which is based on fixed points for contractions of the interval.

A dynamical formulation
Let A_1, A_2 be two 2 × 2 matrices with strictly positive entries. We associate to the matrices A_1 = (a^(1)_ij)²_{i,j=1} and A_2 = (a^(2)_ij)²_{i,j=1} the maps

T_k(x) = (a^(k)_11 x + a^(k)_12) / (a^(k)_21 x + a^(k)_22)

for k = 1, 2. These correspond to the projective maps associated to the linear actions of the matrices, and the positivity of the matrices ensures that T_1 and T_2 are both contractions of the interval. We next formally want to associate a (bi-complex) function of two variables z and s formally defined by

d(z, s) = exp( − ∑_{n=1}^∞ (z^n/n) ∑_{|i|=n} p_i (T_i′(x_i))^s / (1 − T_i′(x_i)) ),

where we use the notation: for i = (i_1, …, i_n) ∈ {1, 2}^n we write |i| = n, T_i = T_{i_1} ∘ ⋯ ∘ T_{i_n}, x_i for the unique fixed point of T_i, and p_i = p_{i_1} ⋯ p_{i_n}. It is easy to see from the definition that the function d(z, s) converges for |z| sufficiently small. The following result is essentially due to Ruelle [11] (see also corollary 1 in [6, 10]).

Lemma 2.1 (after Ruelle). The function d(z, s) extends as a bi-analytic function to all of C × C.
We now see by the implicit function theorem that s → z(s) is differentiable at s = 0. We can express the Lyapunov exponent in terms of the function z(t). Rather than the original formula from [10] we use a simple alternative formulation due to Jurga and Morris [8]: since the matrices are invertible, the Lyapunov exponent is given by λ = (1/2) z′(0). We can use lemmas 2.1 and 2.2 and the implicit function theorem to write λ in terms of d(z, t):

λ = (1/2) z′(0) = −(1/2) (∂d/∂s)(1, 0) / (∂d/∂z)(1, 0),     (2)

since z(0) = 1.

Remark 2.3.
The way in which we have associated to the matrices A_1 and A_2 the Möbius maps T_1 and T_2 is far from unique. Different choices of representations of projective space may lead to different maps, with slightly different properties, but will necessarily lead to the same value of the Lyapunov exponent.

Approximating determinants
Now that (2) gives a formal expression for the Lyapunov exponent in terms of the determinant, we need to use an approximation in order to obtain a numerical estimate. By lemma 2.1, d(z, s) is bi-analytic and thus has a Taylor expansion in z at zero of the form

d(z, s) = 1 + ∑_{n=1}^∞ a_n(s) z^n,

where a_n → 0 superexponentially as n → +∞. For any given N ≥ 1 we can consider the polynomial approximations

d^(N)(z, s) = 1 + ∑_{n=1}^N a_n(s) z^n.

From the definition of d(z, t) we see that the coefficients of d^(N)(z, s) can be computed using only the 2^{N+1} − 1 fixed points x_i with |i| ≤ N. Let z_N(s) > 0 be the smallest positive real zero of d^(N)(z, s) guaranteed by lemma 2.1. We can approximate λ by the values λ_N defined by the following analogue of (2):

λ_N = −(1/2) (∂d^(N)/∂s)(z_N(0), 0) / (∂d^(N)/∂z)(z_N(0), 0).

In practice we would choose N as large as practicable so that λ_N is a good approximation to λ.

Remark 2.4. To obtain the Taylor coefficients a_n(s) for d(z, s) as a function of z it is convenient to denote

b_n(s) = (1/n) ∑_{|i|=n} p_i (T_i′(x_i))^s / (1 − T_i′(x_i)),

so that d(z, s) = exp(−∑_{n=1}^∞ b_n(s) z^n). Collecting together the powers of z gives an explicit formula:

a_n(s) = ∑_{k=1}^n ((−1)^k / k!) ∑ b_{m_1}(s) ⋯ b_{m_k}(s),

where the summation is over the finite set of k-tuples (m_1, …, m_k) ∈ N^k with m_1 + ⋯ + m_k = n, for any 1 ≤ k ≤ n. This expression can then be differentiated in s to get a formula for a_n′(s).
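The truncation scheme above can be sketched numerically in a few lines. This is a non-rigorous illustration built on several assumptions: the weight of a word i is taken to be p_i (T_i′(x_i))^s/(1 − T_i′(x_i)), the coefficients are recovered from the recursion n a_n = −∑_{k=1}^n k b_k a_{n−k}, the exponent is read off as λ ≈ −(1/2) ∂_s d(1,0)/∂_z d(1,0), and the example matrices are hypothetical determinant-1 positive matrices.

```python
import itertools
import math
import numpy as np

# Hypothetical determinant-1 positive matrices with equal probabilities.
MATS = [np.array([[2.0, 1.0], [1.0, 1.0]]), np.array([[1.0, 1.0], [1.0, 2.0]])]
PROBS = [0.5, 0.5]

def fixed_point_derivative(M):
    """Attracting fixed point x > 0 of x -> (ax+b)/(cx+d) and the derivative
    T'(x) = 1/(cx+d)^2 there (using det M = 1, which avoids catastrophic
    cancellation in ad - bc for long matrix products)."""
    (a, b), (c, d) = M
    x = ((a - d) + math.sqrt((a - d) ** 2 + 4.0 * b * c)) / (2.0 * c)
    return x, 1.0 / (c * x + d) ** 2

def determinant_coefficients(mats, probs, N):
    """Taylor coefficients a_n(0) and their s-derivatives a_n'(0), n <= N."""
    b = np.zeros(N + 1)    # b_n(0)  = (1/n) sum_{|i|=n} p_i / (1 - T_i'(x_i))
    db = np.zeros(N + 1)   # d/ds b_n at s = 0 brings down a factor log T_i'(x_i)
    for n in range(1, N + 1):
        for word in itertools.product(range(len(mats)), repeat=n):
            M, prob = np.eye(2), 1.0
            for k in word:
                M, prob = M @ mats[k], prob * probs[k]
            _, t = fixed_point_derivative(M)
            b[n] += prob / ((1.0 - t) * n)
            db[n] += prob * math.log(t) / ((1.0 - t) * n)
    # d(z,s) = exp(-sum_n b_n(s) z^n)  =>  n a_n = -sum_{k=1}^n k b_k a_{n-k}.
    a, da = np.zeros(N + 1), np.zeros(N + 1)
    a[0] = 1.0
    for n in range(1, N + 1):
        a[n] = -sum(k * b[k] * a[n - k] for k in range(1, n + 1)) / n
        da[n] = -sum(k * (db[k] * a[n - k] + b[k] * da[n - k])
                     for k in range(1, n + 1)) / n
    return a, da

def lyapunov_estimate(mats, probs, N=10):
    """lambda_N ~ -(1/2) d_s d(1,0) / d_z d(1,0), truncated at order N."""
    a, da = determinant_coefficients(mats, probs, N)
    return -0.5 * da[1:].sum() / sum(n * a[n] for n in range(1, N + 1))
```

A useful sanity check on the signs in this reconstruction: when both matrices are equal, the Lyapunov exponent reduces to the logarithm of the leading eigenvalue, which the truncated series recovers to high accuracy for modest N.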
We can illustrate this method with two simple examples.
We can then consider the two associated contractions T_1, T_2. Let p_1 = p_2 = 1/2. Taking N = 10, say, we can estimate the Lyapunov exponent λ. We will return to the accuracy of this estimate in the next section.

Example 1.1 revisited.
We can associate to the matrices A_1(t) and A_2(t) the corresponding Möbius maps T_1, T_2.

Now that we have described the method of approximating the Lyapunov exponent using determinant functions, we need to ask what confidence we should have in this estimate. The purpose of this note is to give the best possible error bounds. We address this in the next section.

The error estimates
We now concentrate on efficiently bounding the coefficients a_n(s) (for n > N), which is the key to estimating the difference between λ_N and λ.

Contraction rates
To formulate a basic bound on the coefficients a_n(s) (for n > N) we consider a disk D = D(c, r) in the complex plane C centred at c ∈ R and of radius r > 0, and the image disks T_1(D), T_2(D) ⊂ C. Assume these are contained in a concentric disk D′ = D(c, r′) with r′ < r:

T_1(D) ∪ T_2(D) ⊂ D′ = D(c, r′) ⊂ D.

We can set θ = r′/r ∈ (0, 1). Typically, we would like to choose c, r so as to minimize the value of θ (figure 2).
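The ratio θ can be estimated numerically: since each T_k is analytic on a neighbourhood of the closed disk, the supremum of |T_k(w) − c| is attained on the boundary circle |w − c| = r, so sampling the boundary suffices. The sketch below does exactly that; the example matrices and the particular disk D(1.25, 1.2) in the test are assumptions chosen so that the poles of both maps lie outside D.

```python
import numpy as np

def contraction_ratio(mats, c, r, samples=2000):
    """Approximate theta = r'/r, where D(c, r') is the smallest disk centred
    at c containing T_1(D) and T_2(D) for D = D(c, r).  Each matrix is given
    as ((a, b), (cc, d)) acting by the Mobius map w -> (a w + b)/(cc w + d).
    By the maximum principle the sup of |T_k(w) - c| over the closed disk is
    attained on the boundary circle, which we sample uniformly."""
    ws = c + r * np.exp(2j * np.pi * np.linspace(0, 1, samples, endpoint=False))
    r_prime = 0.0
    for (a, b), (cc, d) in mats:
        imgs = (a * ws + b) / (cc * ws + d)
        r_prime = max(r_prime, np.abs(imgs - c).max())
    return r_prime / r
```

In practice one would then minimize the returned θ over a grid of candidate centres c and radii r, subject to the poles −d/cc staying outside D.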

Bounds on the a_n(s) terms
In this subsection we specify the bounds on the terms a_n(s) in a fairly formal way. In appendix A we sketch the ideas behind them. We will also illustrate them with examples in the next section.
We begin by introducing the notation needed to describe the basic Euler error bounds on the coefficients a_n(s).

Notation (basic Euler bounds). Let us denote
Thus defined, we can recall the original bounds on a n (s).

Lemma 3.1 (after Ruelle).
We can bound |a_n(s)| ≤ K_s^n E_n, for n ≥ 0.

Proof. This essentially dates back to the original work of Ruelle and Grothendieck (cf [10]), although by working in the context of Hardy–Hilbert spaces one can streamline the estimates a little (see [1, 6, 8]). The bound is sufficient for large values of n.
In those examples for which 0 < θ < 1 is sufficiently small, the bounds given in lemma 3.1 will be reasonably effective for realistic values of N. The size of N is determined by the practical considerations of computing the 2^{N+1} − 1 fixed points and the associated weights given by derivatives.

We next introduce the notation needed to describe the new error bounds. This allows us to take advantage of estimates of other ingredients in the bounds which can be numerically computed to high precision. The expression for ρ_n(s) may appear complicated, but it can be readily and accurately computed, and it leads to the following (compare with lemma 3.1). The improvement is that the bound for N + 1 ≤ n ≤ M follows by adapting the analysis in [6] (and is better for smaller values of n).

Example 2.5 revisited. The first 20 values of ρ_n(0) and K_0^n E_n are given in table 2(i). For this particular example the matrices are relatively hyperbolic (i.e., with eigenvalues not close to 1) and the improvement in the bounds coming from theorem 3.2 is not particularly impressive.

Example 2.6 revisited. The first 20 values of ρ_n(0) and K_0^n E_n are given in table 2(ii). For this particular example the matrices are less hyperbolic than in the previous example, and in this case the improvement in the bounds by using theorem 3.2 is more significant. For example, when n = 12 the computational error bounds are reasonably effective, inasmuch as the error term is significantly smaller than the principal term, whereas the basic bounds are not.

Comparing (2) and (4) we see that in order to estimate λ we first require bounds on the coefficients.

Theorem 3.2 (improved bounds on the |a_n(s)|). We can bound the terms |a_n(s)| ≤ ρ_n(s) for N + 1 ≤ n ≤ M, and |a_n(s)| ≤ K_s^n E_n for n > M, using Cauchy's theorem. As one easily observes, choosing 0 < θ < 1 as small as possible helps in making the bound better. This, in turn, depends on the choice of c and r.
More precisely, obtaining upper bounds on the values |a_n′(0)| appearing in the numerator of the expression in (6) comes from a standard application of Cauchy's theorem, i.e.,

|a_n′(0)| ≤ (1/δ) sup_{|s|=δ} |a_n(s)|

for any suitably small δ > 0.
Since |∑_{n=1}^∞ n a_n(0)| = |(∂d/∂z)(z, 0)|_{z=1}| > 0 (we will give a more explicit lower bound at the end of appendix A), for sufficiently large N we can write

|∑_{n=1}^∞ n a_n(0)| ≥ |∑_{n=1}^N n a_n(0)| − ∑_{n=N+1}^∞ n |a_n(0)|,     (8)

where we have used the estimates in theorem 3.2 to bound the tail of the series.
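In code, this lower bound is a one-liner once the truncated coefficients and the tail bounds are in hand. The sketch below takes the tail bounds as plain inputs, standing in for whichever of ρ_n(0) or K_0^n E_n is used; the numbers in the test are placeholder values for illustration only.

```python
def denominator_lower_bound(a, tail_bounds):
    """Lower bound for |sum_{n>=1} n a_n(0)|: the computed main term
    |sum_{n<=N} n a_n(0)| minus sum_{n>N} n * (bound on |a_n(0)|).
    `a[n]` holds a_n(0) for 0 <= n <= N; `tail_bounds` is a list of pairs
    (n, bound), where each bound dominates |a_n(0)| for that n > N."""
    N = len(a) - 1
    main = abs(sum(n * a[n] for n in range(1, N + 1)))
    tail = sum(n * bound for n, bound in tail_bounds)
    return main - tail
```

A positive return value certifies that the denominator in the expression for λ is bounded away from zero, which is what makes the final error bound effective.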
The main term |∑_{n=1}^N n a_n(0)| on the last line of (8), which also appears on the right-hand side of (6), can be numerically computed in specific examples. The remaining two terms are effectively bounded, and in the case of the examples we consider they are considerably smaller than the first term; in particular, this is used in the bounds for the Lyapunov exponents for the examples in the next subsection.

Bounds on the Lyapunov exponents
Finally, we can use (6) and lemma 2.2 to write down bounds on the Lyapunov exponent itself. Although the error bound in this case is significantly worse than in the previous example, it replaces an ineffective Euler bound.
Appendix A

A.1. Trace class operators and determinants

We can associate to the contractions T_1, T_2 a transfer operator L_s acting on the Hardy space H^2(D) by

L_s f(z) = p_1 (T_1′(z))^s f(T_1 z) + p_2 (T_2′(z))^s f(T_2 z),

where s ∈ R. The operators L_s are trace class and we can write d(z, s) = det(I − zL_s) [6]. Moreover, we can bound the coefficients by

|a_n(s)| ≤ ∑ A_{i_1}(L_s) ⋯ A_{i_n}(L_s)

for n ≥ 1, where the summation is over indices i_1, …, i_n ∈ N satisfying i_1 < ⋯ < i_n and, for k ∈ N,

A_k(L_s) = inf{ ‖L_s − K‖ : rank(K) ≤ k − 1 }

denotes the kth approximation number. Here the infimum is taken over all linear operators K of rank at most k − 1.

A.2. The bounds on the approximation numbers
We have two different bounds on the approximation numbers.
(a) We have the basic bound A_n(L_s) ≤ K_s θ^n, for n ≥ 1 (see lemma 2.1, and corollary 1 in [6]).

(b) We have a better bound on A_n(L_s), for n ≥ 1. Let κ_n : H^2 → H^2 be the orthogonal projection onto the n-dimensional space spanned by m_0, m_1, …, m_{n−1}, where m_k(z) = ((z − c)/r)^k (and we recall that c and r are the centre and radius of D, respectively); then A_n(L_s) ≤ ‖L_s(I − κ_n)‖. For any f(z) = ∑_{k=0}^∞ c_k m_k ∈ H^2, a simple application of the Cauchy–Schwarz inequality bounds ‖L_s(I − κ_n) f‖ in terms of ‖f‖. In particular, combining this bound with the definition of A_n(L_s) gives the bound ρ_n(s) appearing in theorem 3.2. When n ≤ M this completes the proof of theorem 3.2.

Remark 3.4.
It is possible to further improve the bounds, at the cost of introducing more complicated expressions. In particular, we can get an even more refined estimate if we introduce the following notation. Let N ≤ Q ≤ M.