Matrix Moments in a Real, Doubly Correlated Algebraic Generalization of the Wishart Model

The Wishart model of random covariance or correlation matrices continues to find ever more applications as the wealth of data on complex systems of all types grows. The heavy tails often encountered prompt generalizations of the Wishart model involving algebraic distributions instead of a Gaussian. The mathematical properties pose new challenges, particularly for the doubly correlated versions. Here we investigate such a doubly correlated algebraic model for real covariance or correlation matrices. We focus on the matrix moments and explicitly calculate the first and the second one; the computation of the latter is non-trivial. We solve the problem by relating it to the Aomoto integral and by extending the recursive technique to calculate Ingham-Siegel integrals. We compare our results with the Gaussian case.


Introduction
The Wishart model for random correlation or covariance matrices plays a central rôle in statistical inference. Its original version [1] correlates the rows of the random data matrices, i.e., the time series; more recently [2,3,4,5,6], correlations of the columns, i.e., of the position series, were also included. A crucial assumption is that a Gaussian form of this multivariate distribution is justified, a viewpoint that can often be backed by arguments related to the Central Limit Theorem. Nevertheless, the enormous amount of data on various systems has revealed that such a line of reasoning has its limitations. Finance is probably the field with the best evidence, often provided by physicists, who have also produced a large body of work on random matrices in this context, including non-Gaussian models [7,8,9,10,11,12,13,14,15,16,17,18,19,20]. It remains a challenge to systematically explore the statistical properties of other systems in this respect.
Here, we study an algebraic extension of the doubly correlated Wishart model that we used recently [21]. It involves a determinant and thus all matrix invariants, while the Gaussian Wishart model relies on the trace only. Our model [21] generalizes previously introduced ones by Forrester and Krishnapur [22] as well as by Wirtz, Waltner, Kieburg and Kumar [23]. Importantly, we consider real matrices, i.e., the case also referred to as orthogonal, because it is the most commonly encountered one in the application to data. We mention in passing that Ref. [21] also extends a new interpretation [24,25] of the Wishart model, namely as a random matrix model for non-stationarity, conceptually different from the usage in statistical inference. Various multivariate algebraic amplitude distributions were modeled and calculated.
Not surprisingly, an algebraic extension of the doubly correlated Wishart model is mathematically considerably more complicated than the Gaussian version, which triggered the present study. The difficulties are particularly severe, as we employ the orthogonal, not the much more convenient unitary symmetry. We compute the first and the second matrix moment of our algebraic model, as they can be used in data analysis to fix parameters of the above-mentioned multivariate algebraic amplitude distributions and thus also of the algebraic distribution for the random correlation or covariance matrices. The second moment poses serious challenges, which we meet by further developing some techniques: more precisely, we manage to establish helpful relations to the Aomoto integral [26,27] and, furthermore, we extend the recursive method [28,29] to compute integrals of the Ingham-Siegel type. The strategy is to avoid or circumvent group integrals, which are, as is well known, quite cumbersome for the orthogonal symmetry relevant here, much more complicated than for the unitary one. We have two goals. First, we wish to present explicit formulae for the first two moments and the resulting matrix variance, to be used in data analysis. We also compare with the Gaussian case. Second, we wish to contribute new pieces to the tool box of Random Matrix Theory that are useful for algebraic extensions of the Wishart model and in other random matrix models.
The paper is organized as follows. In Sec. 2, we define the problem mathematically and set up proper generating functions. In Sec. 3, we compute the first matrix moment, preparing for the much more involved calculation of the second matrix moment in Sec. 4. We conclude in Sec. 5. Some details are worked out in the Appendix.

Determinantal Distribution, Matrix Moments and Generating Function
In Sec. 2.1, we introduce our algebraic extension of the doubly correlated Wishart model. In Sec. 2.2, we define the matrix moments and introduce as well as compute a generating function.

Determinantal Distribution
Consider K × N real data matrices X with elements X_k(n). If one views the K rows of X as time series and its N columns as position series, the K × K matrix XX†/N and the N × N matrix X†X/K are the sample covariance matrices of the time and position series, respectively. We always use the dagger symbol to indicate the transpose. In a statistical model, one draws the data matrices X from a distribution. There are many such random matrix applications in the theory of complex systems. In recent years, interest arose in algebraic distributions of covariances or correlations to model heavy tails found in data. A natural choice is the determinantal distribution This distribution exists, in the sense that it is integrable, if the condition holds. The distribution (2) was introduced in Ref. [21], generalizing related ones defined in Refs. [22,23]. We give the distribution (2) in two equivalent forms in Eq. (2): one with the argument of the determinant being real-symmetric, and another one in which this is not the case. The fixed matrices Σ and Ξ are positive definite of dimensions K × K and N × N, respectively. They are, as discussed below, different from, but related to, the corresponding covariance matrices. The distribution (2) depends on two shape parameters L and M. It converges to the Gaussian in the limit L, M → ∞. It is seldom possible to directly compare a distribution such as (2) or (5) with data; one usually analyzes other observables f(ρ|X) which depend on some arguments ρ and parametrically on the time or position series. To carry out the data comparison, one calculates the ensemble average in which the invariant measure or volume element reads Obviously, the parameters L and M in the algebraic case must have values that ensure the convergence of the integral (7).
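The Gaussian limit can be made concrete by direct sampling. The sketch below draws matrices X = A G B† with AA† = Σ, BB† = Ξ and G filled with independent standard normal entries, which corresponds to the convention w_G ∝ exp(−tr Σ⁻¹XΞ⁻¹X†/2); other normalizations rescale the result. The concrete Σ and Ξ are illustrative choices, not taken from the paper.

```python
import numpy as np

# Monte Carlo sketch of the Gaussian limit: X = A G B^T with A A^T = Sigma,
# B B^T = Xi, G iid standard normal. This assumes the convention
# w_G ~ exp(-tr(Sigma^{-1} X Xi^{-1} X^T)/2); other conventions rescale.
rng = np.random.default_rng(0)
K, N, draws = 3, 5, 20000

# Illustrative positive-definite matrices (not from the paper)
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
Xi = np.diag([0.5, 1.0, 1.5, 2.0, 2.5])

A = np.linalg.cholesky(Sigma)   # A A^T = Sigma
B = np.linalg.cholesky(Xi)      # B B^T = Xi

est = np.zeros((K, K))
for _ in range(draws):
    G = rng.standard_normal((K, N))
    X = A @ G @ B.T
    est += X @ X.T / N
est /= draws

# First matrix moment in this convention: <X X^T / N> = (tr Xi / N) * Sigma
target = (np.trace(Xi) / N) * Sigma
print(np.max(np.abs(est - target)))  # small Monte Carlo error
```

In this convention the sample covariance estimator XX†/N averages to (tr Ξ/N) Σ, anticipating the relation between Σ and the covariance matrix discussed in the following sections.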

Matrix Moments and Generating Function
To make this program meaningful, one has to wisely choose the matrices Σ and Ξ or, even better, to determine them from data. If the distribution (2) were Gaussian, this would be an easy task, because Σ and Ξ could simply be obtained from the data as the sample covariance matrices. This is not so in the present algebraic case. To determine the matrices Σ and Ξ, we have to find out their relation to the covariance matrices. This motivates the study of the ν-th matrix moments for Y = G, A, provided they exist in the algebraic case. Due to the algebraic structure of w_A(X|Σ, Ξ), there are highest possible matrix moments, depending on the parameters of the distribution. The first matrix moment, ν = 1, yields the desired relations to the covariance matrices for the time and position series, respectively, where the left-hand sides are to be identified with the sample covariance matrices. The evaluation of the matrix integrals (9) and (10) is possible for ν = 1 by a proper change of the integration matrix, which makes the Σ and Ξ dependence trivial and thus the remaining integrals much easier. The calculation of the higher-order moments, ν = 2, 3, . . ., turns out to be an interesting and non-trivial challenge in mathematical physics. Here, we tackle the calculation of the second matrix moment. For ν > 1, the above-mentioned change of the integration matrix does not remove Σ or Ξ from the relevant integrals. An alternative approach is called for. We introduce the generating function depending on the K × K real-symmetric source matrix J. This generating function is normalized, We define the matrix gradient corresponding to the source matrix, in which the factors 1/2 account for the symmetry of J and XX†/N. We have The function Z_Y(J) generates the matrix moments (9) and is defined similarly to a characteristic function. We notice that the normalization (12) is scalar and thus not identical to the zeroth moment, which is equal to the unit matrix 1_K.
To generate the matrix moments (10), XX † /N has to be replaced by X † X/K and J has to be a real-symmetric N × N matrix. However, once the moments (9) are computed, the moments (10) can be inferred without further calculation by replacing parameters and dimensions.
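The symmetry factors 1/2 in the matrix gradient can be checked on the simplest case: applied to tr ΣJ, the gradient must return the matrix Σ itself. The finite-difference sketch below treats the tied entries J_kl = J_lk as one variable and weights the off-diagonal derivatives by 1/2; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
K = 3
Sigma = rng.standard_normal((K, K))
Sigma = Sigma + Sigma.T  # real-symmetric, like the argument of tr(Sigma J)

def f(J):
    return np.trace(Sigma @ J)

# Matrix gradient for a real-symmetric source J: perturb J_kl and J_lk
# together, then weight off-diagonal derivatives by the symmetry factor 1/2.
eps = 1e-6
J0 = np.zeros((K, K))
grad = np.zeros((K, K))
for k in range(K):
    for l in range(K):
        E = np.zeros((K, K))
        E[k, l] = eps
        E[l, k] = eps  # for k == l this is a single bump of size eps
        deriv = (f(J0 + E) - f(J0)) / eps
        grad[k, l] = 0.5 * deriv if k != l else deriv

print(np.max(np.abs(grad - Sigma)))  # zero up to rounding, since f is linear
```

With this convention the matrix gradient of tr ΣJ at J = 0 is exactly Σ, which is the mechanism that turns derivatives of the generating function into matrix moments.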
A natural object involving the first and the second matrix moments is the matrix variance as it is, among other things, invariant under a shift by a fixed matrix Ω, say, To avoid confusion, we emphasize that Eq. (15) defines the variance of the matrix XX†/N that serves as estimator for the sample covariances. Hence, it defines the variance of the covariances, and its dimension is that of the time series to the fourth power. Nevertheless, it has all the properties expected of a variance; in particular, it is by construction a positive semidefinite matrix. We first consider the algebraic case and use the generating function to cast the matrix integrals (9) into a form better adjusted to explicit evaluation. To this end, we employ the Ingham-Siegel integral [28,29] where the matrices S and R are N × N real-symmetric and S has to be positive definite, indicated by S > 0. Convergence is guaranteed if q ≥ (N + 1)/2. We write w_A(X|Σ, Ξ) as an Ingham-Siegel integral, where we take the real-symmetric matrix argument of the determinant in the second form of Eq. (2). The entire dependence on X and X† in the exponent can be written as owing to the corresponding trace relation. The real matrices F and G have dimensions N × N and K × K, respectively. The integral over X is then of simple Gaussian form. Thus we find Expanding the large determinant in the integrand, we arrive at the result up to second order in J. We observe that the invariance of the integration measure allows us to replace the matrix Ξ by the diagonal matrix of its eigenvalues. This is true for the generating function (19) to all orders in J, but only from now on will it lead to simplifications. We also compute the Gaussian case for later comparison. It is easily inferred from the above derivation. We find which trivially also depends only on the eigenvalues of Ξ.
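For N = 1 the real Ingham-Siegel integral (17) collapses to a Gamma integral, since q − (N + 1)/2 = q − 1 there: ∫_0^∞ s^(q−1) e^(−rs) ds = Γ(q) r^(−q), which also makes the convergence condition q ≥ (N + 1)/2 transparent. A quick numerical check of this one-dimensional case:

```python
import numpy as np
from math import gamma

# N = 1 case of the Ingham-Siegel integral (17):
#   int_0^inf s^(q - 1) exp(-r s) ds = Gamma(q) / r^q,  q > 0, r > 0.
q, r = 2.5, 1.7

# Midpoint rule on [0, 60]; the integrand is negligible beyond the cutoff.
n = 1_000_000
h = 60.0 / n
s = (np.arange(n) + 0.5) * h
numeric = np.sum(s**(q - 1.0) * np.exp(-r * s)) * h

exact = gamma(q) / r**q
print(numeric, exact)
```

The higher-dimensional versions replace s by the eigenvalues of S and r by the matrix R, with det R^(−q) taking the role of r^(−q).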

First Matrix Moment
There are various straightforward ways to calculate the first matrix moment, including the one mentioned in Sec. 2.2. Here, we choose a method that prepares for the much more demanding computation of the second matrix moment in Sec. 4. In Sec. 3.1, we reduce the non-invariant matrix integral to be evaluated to an invariant one, allowing us to apply the Aomoto integral in Sec. 3.2, which quickly yields the final result.

Reduction to an Invariant Integral
We consider the algebraic case and apply Eq. (14) for ν = 1 to our formula (21). The matrix gradient of tr ΣJ is simply the matrix Σ, and we find for the first matrix moment with the function The advantage of working with the generating function instead of evaluating Eq. (9) starts becoming visible. The matrix structure of the first moment results directly from the matrix gradient; the remaining integral is scalar. While this simplification is not decisive for the first matrix moment, it will turn out very helpful for the second one. We write the trace tr ΘS⁻¹ as a sum, pull the Θ_n out of the integral and are left with N integrals, each containing one diagonal element [S⁻¹]_nn. Obviously, they cannot depend on the index n, because with simple changes of variables, permuting the basis, we can map any diagonal element of S⁻¹ onto any other one. Put differently, all integrals must give the same result. Hence we may make the following replacement under the integral and find with tr Θ = tr Ξ. Importantly, the integral in Eq. (27) is invariant and may be written as an integral over the eigenvalues of S only; the integral over the orthogonal group which diagonalizes S is trivial. In contrast, the original integral in Eq. (25) also requires a nontrivial integration over the group and is thus much more complicated. In other words, we achieved a decoupling of the non-Markovian effects. We change to eigenvalue-angle coordinates, where s_n > 0 and the matrix U parameterizes the orthogonal group. The volume element reads being the Vandermonde determinant. The invariant Haar measure dµ(U) is normalized to unity. We arrive at which reduces the problem to the calculation of an N-dimensional eigenvalue integral.
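The permutation argument, that the average of any diagonal element [S⁻¹]_nn equals (1/N) times the average of tr S⁻¹ whenever the weight is invariant under orthogonal (in particular, permutation) conjugations, can be illustrated numerically. The sketch below uses a standard Wishart weight as a stand-in invariant distribution; it is not the paper's actual integrand, merely some rotation-invariant one.

```python
import numpy as np

rng = np.random.default_rng(1)
N, dof, draws = 4, 9, 40000

# S = H H^T with iid Gaussian H is orthogonally invariant, hence a
# stand-in for any invariant weight (NOT the paper's actual integrand).
mean_diag = np.zeros(N)
mean_tr = 0.0
for _ in range(draws):
    H = rng.standard_normal((N, dof))
    Sinv = np.linalg.inv(H @ H.T)
    mean_diag += np.diag(Sinv)
    mean_tr += np.trace(Sinv)
mean_diag /= draws
mean_tr /= draws

# Every diagonal element averages to (1/N) <tr S^{-1}>; for this Wishart
# choice the common value is known to be 1/(dof - N - 1) = 1/4.
print(mean_diag, mean_tr / N)
```

This is exactly the replacement made under the integral: each [S⁻¹]_nn may be traded for (1/N) tr S⁻¹, which then decouples the non-invariant matrix Θ from an invariant remaining integral.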

Application of the Aomoto Integral and Final Result
The integral in Eq. (30) can be worked out with various techniques; we find it convenient to apply the Aomoto integral [26,27]. The calculation is carried out in Appendix A, and we find which eventually yields for the first matrix moment. The singularity in the prefactor is a clear indication that the first moment only exists if the condition is fulfilled. With the choice M = 2L − 1 − (K + N), we can enforce existence of the first matrix moment, which then coincides exactly with the result for the Gaussian model for random covariance or correlation matrices. Hence, the setting M = 2L − 1 − (K + N) allows us to interpret Σ directly as the (average) sample covariance matrix when comparing with data.

Second Matrix Moment
In Sec. 4.1, we decouple the non-Markovian effects, i.e., one of the fixed input matrices, from the matrix integrals. In Sec. 4.2, we trace two of the three resulting integrals back to invariant ones and infer them from the Aomoto integral. As the problem cannot be fully solved in this way, we carry out a non-trivial extension of the recursive technique to calculate integrals of the Ingham-Siegel type in Sec. 4.3. We give our final results, including the matrix variance, and compare with the Gaussian case in Sec. 4.4.

Matrix Integral and Decoupling of the Non-Markovian Effects
We begin with the algebraic case and calculate the squared matrix gradient of our formula (21). There are two contributions, where the last equality sign is best verified by a tedious, but straightforward, computation in terms of the matrix elements. We thus find from Eq. (14) for ν = 2 where Ξ = Θ due to the invariance of the measure. We write out the traces and observe once more, after reinserting in Eqs. (38) and (39) and pulling the Θ_n out of the integrals, that the latter cannot depend on the indices n or m of the matrix elements [S⁻¹]_nm. We just use fixed ones, n = 1 and m = 2, carry out the remaining sums and arrive at with the integrals which are independent of Ξ = Θ. As in the case of the first matrix moment, we have managed to decouple the non-Markovian effects from the matrix integrals to be calculated.

Invariant Integrals and Application of the Aomoto Integral
For certain combinations of the unknown integrals, invariant integrals can be constructed. We simply put Ξ = Θ = 1_N in Eqs. (42) and (43), use Eqs. (38) and (39) and find The difference can be cast into the form of an Aomoto integral, which yields the details are given in Appendix A. The eigenvalue integrals as they stand in Eqs. (47) and (48) cannot, because of the squares in the trace terms, be calculated with the Aomoto integral, but they can be treated with other standard techniques, in particular with the method of integration over alternate variables and the theory of Pfaffians [30]. Nevertheless, the two Eqs. (47) and (48) are not sufficient to obtain all three integrals Ψ_d, Ψ_p and Ψ_m individually. While we managed in Sec. 3 to reduce the computation of the first matrix moment to the evaluation of an invariant integral, it is, from a more general viewpoint, quite inconceivable that such a reduction is possible to all orders; otherwise, the original problem would be equivalent to an invariant integral.

Extending the Method to Calculate Integrals of Ingham-Siegel Type
We now put forward a method that does not rely on invariant integrals and allows us to directly compute Ψ_d and Ψ_p. Combined with the result (50), we then obtain all three integrals. Our method extends the one given in Refs. [28,29] to compute the Ingham-Siegel integral (17). We begin with the integral (44). Observing that an element of an inverse matrix can always be expressed as the ratio of the adjugate and the determinant, we have where the (N − 1) × (N − 1) adjugate S_11 with respect to the first row and column appears in the block decomposition being an (N − 1) component vector. The determinant of S may then be written as while the trace is given by Since the matrix S is positive definite, the same holds for the adjugate S_11, implying that both determinants are positive. Hence, the bracket in Eq. (53) must be positive as well, and we obtain T†S_11⁻¹T < S_11 < ∞ as limits for the S_11 integration. Collecting everything, we find The S_11 integral is again of Ingham-Siegel type (17), and is our final result for Ψ_d.
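The two determinant identities driving this step, the element of the inverse as a ratio of the leading minor and det S, and the Schur-complement factorization whose positivity produces the lower integration limit, can be verified on a random positive definite matrix. The sketch uses generic names (Shat for the block obtained by deleting the first row and column); it illustrates the identities, not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 5
H = rng.standard_normal((N, N + 2))
S = H @ H.T        # a generic positive definite test matrix

s11 = S[0, 0]      # the scalar element S_11
T = S[1:, 0]       # (N-1)-component vector of the first column
Shat = S[1:, 1:]   # S with first row and column deleted

# (i) Element of the inverse as ratio of minor determinant and determinant
lhs = np.linalg.inv(S)[0, 0]
rhs = np.linalg.det(Shat) / np.linalg.det(S)

# (ii) Schur-complement factorization of det S; positivity of the bracket
#      yields the integration limit  T^T Shat^{-1} T < S_11 < infinity
bracket = s11 - T @ np.linalg.solve(Shat, T)
detS = np.linalg.det(S)
print(lhs, rhs, detS, np.linalg.det(Shat) * bracket, bracket > 0.0)
```

Both identities hold for any positive definite S, which is what allows the recursion: the remaining integral over the (N − 1) × (N − 1) block is again of Ingham-Siegel form.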
To compute the integral Ψ_p, we need the announced extension of the methods in Refs. [28,29]. We write both matrix elements of the inverse S⁻¹ in terms of their adjugates S_11 and S_22, which both are (N − 1) × (N − 1) matrices, To obtain a convenient parameterization of the integration manifold, we introduce a block decomposition by slicing off the first and the second row and column, where the upper left block is 2 × 2 and Ŝ is an (N − 2) × (N − 2) matrix. Moreover, we set where T̃_1 and T̃_2 are (N − 2) component vectors and T̃ is an (N − 2) × 2 matrix. We also need block decompositions of the (N − 1) × (N − 1) adjugates, which are consistent with Eqs. (58) and (59). The block decompositions (60) lead to the expressions while the block decomposition (58) yields which is different, as it involves two determinants on the right hand side. The latter is 2 × 2 and reads Since all determinants are positive, we may infer the limits for the integrals over S_11, S_22 and S_12, with the shorthand notation Inserting everything in the integral (57), we obtain after obvious shifts of the integration variables S_11, S_22 and S_12. With the rescaling x → √(y_1 y_2) x, the integrals over y_1, y_2 and x become elementary. Moreover, the T̃_1 and T̃_2 integrals can be done, and we are left with where the remaining integral is, once more, of Ingham-Siegel type (17). Collecting everything, we end up with our final result for the integral Ψ_p. With Eq. (50), we also find for the integral Ψ_m. We notice the relations between these integrals.

Results
We evaluate formula (42) with Ψ_d given in Eq. (56). We plug this into Eq. (37) and arrive at which is our final result for the second matrix moment. Its existence is guaranteed if the condition holds, as the derivation shows. The Gaussian case is readily obtained by applying formulae (35) and (36) to the generating function (23); we find In the limit L, M → ∞ under the condition (6), the second matrix moment (75) in the algebraic case yields the second matrix moment (77) in the Gaussian case, as it should. We finally also provide the results for the matrix variances in the algebraic and in the Gaussian case. In the former, a straightforward calculation gives which coincides with the latter, namely in the limit L, M → ∞ under the condition (6).

Conclusions
We studied an algebraic extension of the doubly correlated Wishart model recently introduced by us, generalizing earlier models by other authors. In contrast to the Gaussian doubly correlated Wishart model, it is based on a determinant in the denominator and thus includes, in an expansion, all matrix invariants. Our model is motivated by data featuring algebraic tails, which are often not captured by the Gaussian version, even though mechanisms related to the Central Limit Theorem work in favor of the latter. The mathematics of our algebraic model is more demanding than that of the Gaussian version, in particular as we consider real random data matrices, implying real-symmetric correlation and covariance matrices. We calculated the first and second matrix moments, and thereby encountered and solved non-trivial mathematical problems, because the real setup ruled out the application of group integrals, about which much less is known in the orthogonal case than in the unitary one. To circumvent group integrals, we developed an approach that, first, decouples the matrices breaking the rotation invariance from the matrix integrals, allowing us to reduce them to invariant matrix integrals, which we solve by mapping them onto the Aomoto integral. Nevertheless, some non-invariant matrix integrals remain, which one can view as generalizations of the real Ingham-Siegel integral. We succeeded, second, in calculating them by extending the recursive method for the standard Ingham-Siegel integral. We hope that our new technical developments might also be helpful in other problems.
Naturally, the question arises whether our approach also facilitates the computation of higher matrix moments. Although we do not see a fundamental mathematical obstacle, even the computation of the third matrix moment becomes drastically more involved, which moves tackling such computations beyond the scope of this contribution. At present, we do not see competitive alternative methods, but we definitely do not exclude the possibility that they exist.

Appendix A. Application of the Aomoto Integral
Generalizing the Selberg integral [31], the N-dimensional Aomoto integral [26,27] for positive constants a, b and γ as well as 0 ≤ m ≤ N is defined as and we again have to set a = L − (N + K + 1)/2. Formula (A.7) may also be obtained differently, but the present derivation seems to be a particularly convenient one.
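For reference, the standard textbook form of the Aomoto integral on the unit cube reads as follows; this is the usual normalization found in the literature and is quoted here as an assumption about the convention of [26,27], since the paper's version on a different integration domain follows by a change of variables.

```latex
% Aomoto's extension of the Selberg integral (standard normalization):
\int_{[0,1]^N} \prod_{i=1}^{m} t_i
  \prod_{i=1}^{N} t_i^{\,a-1}(1-t_i)^{\,b-1}
  \prod_{1\le i<j\le N} |t_i-t_j|^{2\gamma}\, dt_1\cdots dt_N
  = S_N(a,b,\gamma)\,\prod_{j=1}^{m}
    \frac{a+(N-j)\gamma}{a+b+(2N-j-1)\gamma},
% with the Selberg integral
S_N(a,b,\gamma) = \prod_{j=0}^{N-1}
  \frac{\Gamma(a+j\gamma)\,\Gamma(b+j\gamma)\,\Gamma(1+(j+1)\gamma)}
       {\Gamma(a+b+(N+j-1)\gamma)\,\Gamma(1+\gamma)} .
```

Setting m = 0 recovers the Selberg integral itself; the extra product over the first m variables is what produces the trace-type insertions needed for the matrix moments.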