Consecutive level spacings in the chiral Gaussian unitary ensemble: From the hard and soft edge to the bulk

The local spectral statistics of random matrices forms distinct universality classes, strongly depending on the position in the spectrum. Surprisingly, the spacing between consecutive eigenvalues at the spectral edges has received little attention, where the density diverges or vanishes, respectively. This different behaviour is called hard or soft edge. We show that the spacings at the edges are almost indistinguishable from the spacing in the bulk of the spectrum. We present analytical results for consecutive spacings between the $k$th and $(k+1)$st smallest eigenvalues in the chiral Gaussian unitary ensemble, both for finite- and large-$n$. The result depends on the number of the generic zero modes $\nu$ and the number of flavours $N_{\rm f}$, which are given in terms of characteristic polynomials, as motivated by Quantum Chromodynamics (QCD). We find that the convergence in $n$ is very rapid. The same can be said separately about the limit $k\to\infty$ (limit to the bulk) and $\nu\to\infty$ (limit to the soft edge). Interestingly, the Wigner surmise is a very good approximation for all these cases and, apart from $k=1$, shows a deviation below one percent. These findings are corroborated with Monte-Carlo simulations. We finally compare for $k=1$ with data from QCD on the lattice, being in this symmetry class.


Introduction
One of the cornerstones in applying random matrix theory is the local level spacing distribution. It provides a statistical description of the spacing between consecutive eigenvalues. For that purpose, the mean level density around the point where we zoom in has to be unfolded, meaning it is mapped to be approximately constant, see [1]. Then, the energies are measured in units of this constant. This procedure applies to the bulk of the spectrum. The spacing distribution continues to be a popular tool in today's applications, ranging from quantum spin chains [2], quantum circuits [3], chemical isotopes [4] and graphene [5] to classical integrable systems [6] and theoretical studies [7], which are all from past year. We refer to [1,8] for a list of more topics and references therein.
Let us also look back at the classical papers where the spacing distribution has been proposed in the study of quantum systems displaying chaotic behaviour, the Bohigas-Giannoni-Schmit conjecture [9], cf. [10]: quantum systems that are fully chaotic follow random matrix statistics. This field of research has been subsumed and influenced by Fritz Haake's famous book [11] in several editions. Let us mention an important mathematical contribution of Fritz Haake in this context. Together with Barbara Dietz [12], he has carried out a precise expansion of the true spacing distribution for the classical random matrix ensembles for comparison with data, that we will use as well. This expansion is the one developed by Padé [13], who introduced an expansion in rational functions instead of polynomials, as it is common for the Taylor expansion. This approximation for the level spacing distribution by Dietz and Haake goes beyond the frequently used Wigner surmise. At the same time, it proves a practical expression compared to a truncation of the infinite product of Fredholm determinant eigenvalues in Mehta's book [14], given in terms of an integral over spheroidal functions. In parallel, Fritz Haake has pursued an impressive research programme to prove random matrix statistics from a semi-classical expansion, cumulating in [15]. This has lead to a much deeper understanding of chaotic quantum systems.
A second example where the random matrix approximation is very well understood is the low-energy Dirac operator spectrum in Quantum Chromodynamics (QCD) introduced in [16], which has motivated our study. Here, the spectral edge representing a hard edge is focussed on, zooming into the origin where the lowest eigenvalues are located. Without going into detail of the vast literature on the application to QCD, and on comparing to lattice data from first principle lattice QCD simulations, cf. [17,18], let us focus on random matrix questions. In this field, typically the local density correlation functions, given by the universal Besselkernel [19], and the distributions of the smallest eigenvalues [20,21,22] are employed for comparison. However, few works have applied the level spacing distribution as a measure for random matrix statistics, both at the hard edge [23,24,25] and in the bulk of the spectrum [26]. Averaging over several consecutive spacings, a good agreement was found at the hard edge below the chiral phase transition. Only in the vicinity of the phase transition deviations from the bulk spacing were found [23,24,27,28]. It came as a surprise to us, that no analytical results were available at the hard edge for the spacing distribution, and that at least on average the bulk spacing leads to a good description.
In the present article, we will provide a closed form analytical expression for the spacing distribution in the chiral Gaussian unitary ensemble (χGUE) that is relevant for QCD with three colours in the fundamental representation, and other strongly interacting field theories which lie in the same universality class. We consider the spacing between two consecutive eigenvalues close to the origin, beginning with the smallest and second smallest eigenvalue. Our derivation follows ideas from the computation of the distribution of individual eigenvalues at the hard edge, starting with the smallest one [20,21,22]. The resulting spacings depend on several parameters of the ensemble, the number of exact zero eigenvalues ν, that is related to the gauge field topology, and a fixed number of (light) quark flavours N f with masses m f . Our results hold for finite matrix size n and in the asymptotic microscopic scaling limit n → ∞. It provides an alternative to the somewhat heavy machinery of analysing the Fredholm determinant of the underlying integral kernel, that typically leads to an expression including Painlevé V differential equations [29]. In contrast, we will obtain expressions in terms of k-fold integrals over a determinant of Bessel-functions of size N f + ν + k for the kth spacing.
Moreover, we will find that, both at finite (even small) matrix dimension n, as well as asymptotically at large n, the spacings at the hard edge are very close to the GUE bulk spacing distribution, including a very weak dependence on ν and N f . The deviations range from one percent to only a few per mill when varying the parameters. Hence a comparison to QCD lattice data, that we also undertake, does not allow to discriminate between the two, even for the smallest to second smallest eigenvalue spacing at k = 1. The situation at the soft edge is very similar, where the random matrix eigenvalue density vanishes as a square root. Here, apparently so far only the local Airy-kernel [30] and distribution of the largest eigenvalue [31] were studied, but not the spacing distribution. Even if we cannot offer analytical results in the soft-edge case, our numerical investigation suggests that also the spacing between the largest and second largest eigenvalue is very close to the bulk spacing. This is numerically corroborated at the inner soft edge when ν 1, too. Let us emphasise that we are studying the transition between two different local random matrix statistics, that is in our case the Bessel-kernel and sine-kernel statistics. Those statistics also appear in the transition ensembles between the χGUE and GUE, where the former ensemble has chiral symmetry. Such a transition has been studied on the level of correlation functions [32,33], where it is very pronounced and relevant for the Wilson Dirac operator spectrum [34]. In contrast, for the observable of consecutive level spacings the difference between the two ensembles turns out to be tiny at the origin (the hard edge), and we are almost immediately in the bulk. The transition between different random matrix symmetry classes is a classical question and has been studied in the bulk [35] and at the edges [36]. We will not touch upon another transition, between quantum chaotic and integrable behaviour which is generically Poisson, compare [11].
The remaining article is organised as follows. In Section 2, we study the limit from the hard and soft edge to the bulk, by taking the large argument asymptotic of the universal Bessel-and Airy-kernel, respectively, that yields the universal sine-kernel in the bulk. This provides a convergence rate at the level of the density correlation functions. Section 3 discusses general expressions for the gap probability and kth spacing distributions in terms of a k-fold integral over the ratio of partition functions with N f + ν + 2k flavours. These are then evaluated in Section 4 for the χGUE at finite matrix dimension n, yielding compact determinantal expressions of the size of N f + ν + k flavours. In Section 5, we take the large-n limit at fixed ν and N f which is known to be universal. The convergence of our analytical results for the hard edge spacing towards the bulk spacing is illustrated for quenched (N f = 0) and unquenched settings (N f = 0), and compared to the approximation of the bulk spacing through the Wigner surmise. Section 6 contains the comparison to data from lattice QCD, and in Section 7 we present our conclusions. In Appendix A we collect the exact expression for the GUE bulk spacing distribution from [12].

Transition from Bessel-and Airy-to the Sine-Kernel
The spacing between consecutive eigenvalues of random matrices with unitary symmetry in the bulk of the spectrum is well-approximated by Wigner's surmise [11,1] for 2 × 2 GUE matrices, We will also give a much more precise expression in Appendix A, cf. [14], which is universal. In the present case of the χGUE also called complex Wishart-Laguerre ensemble, deviations should show up at the hard edge because there the k-point correlation functions differ from the ones in the bulk of the spectrum. Before we address this question in terms of the nearest neighbour level spacing distribution in detail, see Section 3, let us investigate the transition on the level of the kernel of the underlying limiting determinantal point process, that is the transition from the Bessel-kernel at the hard edge to the sine-kernel in the bulk. Additionally, we also discuss the transition from the Airy-kernel at the soft edge to the sine-kernel in the bulk for completeness. Although we do not have analytical results at the soft edge, we will numerically study this transition for the spacing as well in Section 5.3.
The eigenvalue statistics of the χGUE follows a determinantal point process. This means the k-point correlation function of the eigenvalues has the form with correlation kernel K(x a , x b ). This holds already at finite matrix size, see (48) in Section 4, as well as in the asymptotic scaling limit in the bulk, at the hard and soft edge. For example, the microscopic level density is given by ρ(x) = R 1 (x). The difference between the eigenvalue statistics at different locations in the spectrum can be seen in the kernel which is the sine-kernel [14] inside the bulk of the spectrum, and the Bessel-kernel [30,29] at the hard edge of the spectrum, where J ν is the Bessel function of the first kind and the index ν counts the number of zero eigenvalues, measuring the strength of the repulsion from the origin. The Airy-kernel [30,31] is obtained at the soft edge of the spectrum, with Ai(x) and Ai (x) the Airy function (of the first kind) and its first derivative. All three kernels are universal for a broad class of systems and hold for a much larger class of unitarily invariant ensembles of random matrices than Gaussian, see the review by Kuijlaars [8,Chapt. 6]. For an extension to orthogonal and symplectic ensembles see the book by Deift and Gioev [37].
In principle, all spectral information is contained in the kernels in the respective region of the spectrum, including the spacing distribution. To assess the latter, one typically uses the gap probability, the probability P [a,b] that an interval [a, b] does not contain eigenvalues, expressed in terms of the Fredholm determinant of the corresponding kernel, where the integral operator is for an arbitrary (L 2 -integrable) test function φ. The spacing distribution then follows by differentiation, as it is explained in more detail in Section 3. However, we will not follow this route of Fredholm determinants and compute the spacing distribution in a different manner. The second information that can be retrieved from the kernel is the mean level spacing. It follows from the microscopic level density at large argument, which yields the macroscopic level density at that point in the spectrum, which gives the inverse mean spacing. This quantity is important for the unfolding of the spacing distribution. From the sine-kernel we have that is the mean level spacing is already normalised to unity. For the Bessel density we obtain after using l'Hôpital's rule implying that the mean level spacing is π in this normalisation of the Bessel-kernel. We will come back to this point in Section 5. The soft edge is an exception, as here the macroscopic density vanishes at the point we zoom in. However, also here is it possible to unfold the spectrum and we refer to [38]. The Bessel-kernel can yield the Airy-as well as the sine-kernel in certain limits, which is rather expected. For instance, a large index ν pushes the spectral edge away from the origin so that it becomes a soft edge, see e.g. [39], where this transition is studied on a microscopic level. This limit is not in our main focus in the present work. On the other hand, when taking the limit of large arguments in the Bessel-kernel, moving away from the hard edge, the sine-kernel is obtained. We are interested how much the hard edge statistics differs from the bulk statistics and will quantify this below in such a large-argument expansion.
To understand how the Bessel and the sine-kernel are related, one needs to exploit the following asymptotic expansion of the Bessel function [40,Eq. (10.17.3)] The aim is to asymptotically expand the Bessel-kernel for x a , x b → ∞, under the condition of a fixed difference x a − x b . In the kernel (4), we only encounter the product of two Bessel functions, such that we consider the asymptotic expansion Plugging these terms into the kernel, we either have to symmetrise or anti-symmetrise these terms in x a and x b . Thence, we arrive at the expansion Let us point out that we have taken x a , x b 1 to be of the same order. Below we will also assume that The sine-kernel can be easily identified as the first term in (12). The question is whether the 1/(x a + x b ) can be cancelled by an unfolding transformation. For that we underline that, after unfolding, the bulk level density is simply equal to unity, highlighting the translation invariance, cf. [1]. In the current situation, the level density corresponding to the kernel (12) is equal to We note that we understand x a = x 0 + δx a and x b = x 0 + δx b , with δx a and δx b being kept fixed in the asymptotic x 0 1. The unfolding works along the formula and ditto for y b and x b = x 0 + δx b . This relation is born out of the requirement dy a = ρ(x 0 + δx a )dδx a so that the level density in y is ρ(y) = 1, as it is for the sine-kernel. This relation can be inverted by recursively inserting δx a in the non-linear part, which leads to From the first term we see that in this way we rescale the mean spacing of the Bessel-kernel from 1/π to unity. Plugging this into (12) and multiplying the kernel with the Jacobian of this transformation, we eventually arrive at 1 The calculation above shows that indeed the rate of convergence of the Bessel-kernel to the sine-kernel goes with 1/x 0 which is not very fast. Thus, it is rather amazing how close the level spacing distribution at the hard edge is to the one in the bulk, as we will see in Section 5.2. It looks more like a 1/x 2 0 convergence than 1/x 0 . The reason is that one fixes the distance s = y a − y b and integrates over the the center of massȳ = y a + y b . This strongly suppresses the pre-factor of the leading correction -it might be equal to zero which we have not checked in the present work. Indeed, for the 2-point correlation function, we obtain 1 When integrating overȳ = y a + y b as it has to be done for the level spacing distribution, we see that the leading order correction vanishes, This heuristic argument works in general for all k-point correlation functions, so that their average starts from 1/x 2 0 , and as the level spacing distribution between two consecutive eigenvalues is described by such an average, see Section 3. We expect that this may explain our numerical observation of a faster convergence of the level spacing distribution from the hard edge to the bulk limit.
Since the analytical expression which we derive in the ensuing sections are too complicated to actually see whether this argument regarding the rate of convergence is indeed the case, it remains an open problem.
Finally, let us briefly describe the limit from the Airy-to the sine-kernel. In the scaling limit leading to the Airy-kernel (5), the bulk statistics is attained by taking the limit of large negative arguments. Therefore, we consider arguments x a = −x, x b = −y in (5) and take the limit x, y → ∞. The asymptotic expansion of the Airy function and its derivative read Ai (−y) = cf. [40, Eqs. (9.7.9), (9.7.10)], respectively. Rather than starting with the unfolded Airy-kernel [38, Eq. (VI.16)], we will depart directly from (5) and attain the sine-kernel through a change of variables. Multiplying out, we have for x, y → ∞ When changing variables from x, y to ζ, η, we need to take into account the Jacobian (symmetrised in both entries) (2/3) 1 3 (ζη) − 1 6 , which needs to be multiplied with the kernel, Here, the correction term is to be understood using ζ, η ≈ ζ 0 1. The first term does note quite yet look like the sine-kernel. However, an expansion about the base point ζ 0 yields the correct terms, where only the difference between ζ and η (or powers thereof) is of order unity. Thus, the first terms of (12) and (22) do agree. The second term, whose prefactor is approximately 1 3(ζη) can be absorbed via unfolding using the same route as starting from (12). Thus, we expect the same rate of convergence from the soft edge to the bulk statistics like at the hard edge.
To highlight and get a feeling how close the two edge statistics are, we need to unfold both. This is particularly important as it is known [39], that the Bessel-kernel becomes the Airy-kernel when taking the limit ν → ∞, while the edge will be located at x = ν, see [42]. To make both kernels, especially the Bessel-kernel with its varying parameter ν comparable, we need to properly unfold the microscopic spectrum. For the Airykernel (5), we have already seen that the unfolding is given by x = − sgn (λ)(3π|λ|/2) 2/3 with λ ∈ R. The factor π guarantees that the mean level spacing is equal to 1, and the minus sign reflects the spectrum at the origin so that the bulk will be to the right hand side as it is for the Bessel-kernel. The sign sgn (λ) of λ has to be dealt with separately, because of the root that has to be taken. Taking the limit x a , x b → − sgn (λ)(3π|λ|/2) 2/3 by applying l'Hopital's rule, and multiplying the kernel with the Jacobian [2π 2 /(3|λ|)] 1/3 , the unfolded microscopic Airy level density is given by [38] R (∞) 1,unf (λ) = sgn (λ) The superscript ∞ indicates that this expression agrees with the properly unfolded Bessel-kernel density (9) in the limit ν → ∞. Evidently there will be a singularity at the origin, which is a relict of the unfolding and has no deeper meaning, cf., Figure 1.
To get the proper unfolding for all ν of the hard edge statistics, we start from the integral representation of the Bessel function for an arbitrary integer ν ∈ N 0 . For large x 1 and ν 1, regardless what kind of relation they have, it is clear that the saddle point equation gives x cos(ϕ) = ν. This can be only solved when x ≥ ν, hinting at the creation of a soft edge. In the regime x ≥ ν, via standard stationary phase approximation the Bessel function becomes as follows In particular limits, this agrees with the asymptotic in [39,Theorem 1.3]. A simple comparison with (10) shows that the proper unfolding is equal to As a check, the Taylor expansion gives the behaviour λ ∝ (x/ν − 1) 3/2 in the limit ν/x ≈ 1, which is the unfolding of the Airy-kernel. To mimic the sign sgn (λ) that has appeared for the unfolding of the soft edge statistics, we choose One can check that g(x) is monotonously increasing on R + . Indeed again a Taylor expansion about ν/x ≈ 1 gives the proper unfolding of the soft edge statistics. To invert the relation between x and λ we have exploited the integral formula where Θ is the Heaviside step function. The Jacobian of this substitution is Thus, the unfolded Bessel-kernel is given by It indeed satisfies the point-wise limit lim ν→∞ R 1,unf (λ) for fixed λ, see (25) and Figure 1. The unfolded microscopic level densities (25) and (32) are illustrated in Figure 1. All curves from ν = 0 to ν = ∞ almost perfectly agree, starting from the fourth eigenvalue. Hence we would not expect a big difference in any of the level spacing distributions. For ν = 20 the Bessel-kernel seems to agree with the Airy-kernel even from the first eigenvalue, judged with the bare eye. Hence, we also expect a quick convergence in increasing ν. The biggest deviation for the level spacing distribution might be seen between the first and second eigenvalue for ν = 0.

Gap probabilities and Level Spacing Distributions
Let us underscore that most of the ensuing discussion in the present section holds for arbitrary positive real spectra with a finite number of eigenvalues. Thus our particular choice of an ensemble can be readily generalised.
We start from a generic joint probability density P n ({x}) of the set of n unordered eigenvalues {x} = . . , n. The emphasis on "unordered" is important, as some combinatorial constants depend on whether the eigenvalues have been ordered or not. In our case, the normalised joint probability density of the χGUE [16,41] with N f inserted characteristic polynomials is given by The properly unfolded microscopic level densities of the Bessel-kernel result (32) for ν = 0 (blue solid curve), ν = 1 (red finely dashed curve), ν = 20 (green coarsely dashed curve), and ν = ∞ (black solid curve). The latter agrees with the unfolded Airy-kernel result (25) reflected at the origin. As a guideline we have added the asymptotic unfolded level density, which is equal to the constant 1 (solid black horizontal line) and the half-sided picket fence spectrum (spectrum of the harmonic oscillator), with eigenvalues at n + 1/2 (dashed horizontal lines). These auxiliary lines indicate the proper unfolding of the spectra. The singularity at the origin is a relic of the unfolding, as it is not smooth at the macroscopic (as well as mesoscopic) spectral edge, cf. [38].
with the Vandermonde determinant defined as ∆ n ({x}) = 1≤a<b≤n (x b − x a ). This density depends on several external parameters: , the quark masses in applications to QCD, and the topological charge ν ∈ N counting the number of zero eigenvalues of the Dirac operator (rectangularity of the n × (n + ν) random matrix). For more details in this relation to QCD we refer to [18]. The normalisation constant (partition function) is denoted by As already mentioned, one common technique to compute the level spacing distribution is via gap probabilities using Fredholm determinants, see [8]. These yield the probability that a certain interval is void of eigenvalues. As we aim for the level spacing distribution between a specific pair of consecutive eigenvalues, we define the kth gap probability as follows. It implements the condition that k eigenvalues lie below the gap, and the remaining n − k eigenvalues are above the gap. When the gap is the interval [a, b] ∈ R + , this conditional gap probability is defined by the n-fold integral The combinatorial pre-factor takes care of the fact that the eigenvalues are unordered and ensures the proper normalisation. For k = 0, only the second set of integrals is present, that is all eigenvalues are larger than or equal to b. The gap probability can be also expressed in terms of the kernel of the corresponding orthogonal polynomials, that is the generalised Laguerre polynomials for the χGUE at N f = 0, see Section 4. However, we will directly compute the spacing distribution in a different way to be introduced now, circumventing the problem to determine the orthogonal polynomials when introducing N f flavours in the weight function. When taking the mixed second derivative of Eq. (35) in a and b for k ≥ 1, we obtain the joint density that the kth eigenvalue sits at the position a and the (k + 1)st eigenvalue at the position b. Defining the spacing s = b − a, and integrating over their starting point a = x k , yields the level spacing distribution between the kth and (k + 1)st eigenvalue, Here, we explicitly spell out the arguments of the joint density P n ({x}). When k = 1 the set of integrals over [0, x k ] is absent. It can be shown that the spacing distribution is properly normalised for all k, Our aim is to study the spacing for the smallest eigenvalues. Therefore, we fix the smallest k + 1 in the large n-limit. This means that the integral over the remaining n − k − 1 eigenvalues can be understood as being proportional to a partition function of n − k − 1 eigenvalues, depending on an extended number of shifted masses. Indeed, a standard trick [22] is to define new variables y j = These can be interpreted as eigenvalues of a positive definite Hermitian random matrix of dimension (n − k − 1) × (n − k − 1 + 2). The expression for the level spacing distribution in terms of the shifted variables only changes marginally Once again we have explicitly spelled out the arguments of the joint probability density. At first glance, the expression (37) looks more cumbersome than the one we started from. Yet, for the joint probability density (33) of the χGUE with N f flavours this is a considerable simplification, because it relates two joint probability densities of the same random matrix ensemble, but for different matrix sizes and different numbers of masses. In particular, using that the Vandermonde determinant is invariant under translation of its variables, and the fact that it can be split in terms of two sets of variables as follows, it holds that Here, we have defined an extended number of flavours The enlarged set of N f (shifted) mass parameters is reading The set of new masses (x k +s) is ν-fold degenerate, whereas the masses (x k + s) 2 − x 2 j are two-fold degenerate for j = 1, . . . , k. Obviously the last two masses can be simplified to s(2x k + s). Notice that the shift has promoted the ν zero-eigenvalues in (33) to become degenerate mass-terms, whereas the squared Vandermonde determinant has created a fixed number of zero-eigenvalues ν = 2 in the new variables y 1 , . . . , y n−k−1 .
This relation has been exploited several times in the literature [21,20,22]. In our case, it creates a short-cut as the pre-factor in front of the joint probability density P , the ratio of two partition functions, is y-independent. Hence, the integral of P ( N f ) 2,n−k−1 ({y}) over all its eigenvalues y 1 , . . . , y n−k−1 yields unity, so that we end up with a k-fold integral for the level spacing distribution of the χGUE with N f flavours Notice that the last line is proportional to s 2 , due to the term with j = k. Certainly, part of the difficulty has been moved into the evaluation of the new partition function Z Yet, its explicit form is known in principle and will be spelled out in the following section.
One last remark is in order which applies to arbitrary ensembles. The level spacing distribution (37) still has to be unfolded, meaning its first moment does not necessarily equal to 1 yet. This is especially the case for the result (42). Under the assumption that the macroscopic or mesoscopic level density is already in a constant form (the microscopic level density asymptotes to a constant when taking the limit into the bulk), we only need to rescale the spacing distribution to make it comparable with standard distributions such as the Wigner surmise (1), where the first moment is unity. This means, one needs to consider the rescaled spacing distribution p k,n (s) =s k,n p k,n (s k,n s) .
Note that when we rescale the spacing between two eigenvalues we also have to rescale all other eigenvalues, as well as the masses m j with the same scaling factor. For practical purposes one can also compute the mean spacings k,n through the difference of the mean positions of the kth and (k + 1)st smallest eigenvalue.

Orthogonal polynomials and partition function
To compute the partition functions Z  Table 18 is the definition of the generalised Laguerre polynomials in their standard normalisation. Notice that in this definition the parameter ν can be continued to negative integer values, meaning that the sum will be cut off from below, to start at l > −ν only. For later convenience we also introduce generalised Laguerre polynomials in monic normalisation, L n (z) = z n + O(z n−1 ), with its corresponding norms The corresponding kernel determines all k-point correlation functions of the χGUE with N f = 0 through (2) at finite-n, In contrast to Section 2, we use here squared variables for the (pre-)kernel K , as we later have to differentiate with respect to these. Also note that we have not included the weight function into the kernel. It only contains the orthogonal polynomials and is given by where in the second equality we have used the Christoffel-Darboux identity for x 1 = x 2 , see [40,Eq. 18.2.12].
Note that the determinant (48) is invariant under a rescaling of the kernel for any non-zero function g(x). Two kernels related in this way are called equivalent kernels, leading to the same k-point correlation functions and, thence, the same spectral statistics. When is employed to bring (50) into the form In this form, compared to (50), the leading order asymptotic of the Laguerre polynomials for large-n is no longer cancelling and thus non-vanishing. Equation (53) implies in particular the following expression for the spectral density at finite matrix dimension n, R 1,n (x) = x 2ν+1 e −x 2 K (χGUE) ν,n (x 2 , x 2 ) from (48). It is normalised to the number of points n.
When interpreting the N f flavours added to the χGUE in (33) as expectation values of characteristic polynomial in the χGUE at N f = 0, the following expression is well known, see e.g. [18] for a derivation, The factor n! in front reflects that the eigenvalues have not been ordered. For N f = 0 the first product and last ratio are absent, and we recover The preparation of (54) for the large n-analysis to be done in Section 5 is based on the relation [40, Eq. (18.9.14)] applied recursively. Taking linear combinations of columns in the determinant of (54) allows us to rewrite (54) as Now, the columns no longer become degenerate to leading order in the large-n asymptotic, as they would have done in (54). The normalisation of (57) can be readily checked in the quenched limit m 1 , . . . , m N f → ∞.

Partition function of shifted masses
In order to determine the second partition function Z with an extended number of shifted masses (41), which are partly degenerate, we will use the following theorem from [44]. It expresses the expectation value of the product of characteristic polynomials in several equivalent forms, in terms of the determinant of a matrix of kernels and orthogonal polynomials. Adapted to our setting of the χGUE, it reads for Notice that in [44] the expression on the right hand side is given here in terms of orthogonal polynomials in monic normalisation and their norms (47). The point is that we may split the K + L characteristic polynomials or masses in two groups as we like, resulting in many equivalent forms of determinants of different size. In [44], valid for Hermitian and non-Hermitian ensembles of random matrices, this split was introduced when considering the product of K characteristic polynomials and L complex conjugated ones in the non-Hermitian case.
Considering the N f new masses in (41), it is unavoidable to apply l'Hôpital's rule to the ν-fold degenerate mass x k + s. However, some simplification can be achieved by dividing the k two-fold degenerate masses into the two groups of characteristic polynomials with parameters {v 2 } and {u 2 }, thus choosing K = N f + ν + k and L = k in (58). Before taking into account this degeneracy, let us take one more step of preparation, in sending v 2 i → −v 2 i and u 2 j → −u 2 j in (58), to achieve the correct signs in the mass terms, by switching polynomials and kernels inside the determinant, and by moving from monic to standard generalised Laguerre polynomials where we have multiplied by (55). Regarding the degeneracy of some of the parameters v j , it is not difficult to see that for any set of suitably differentiable functions f 1 , . . . , f ν , the following limit applies: where f a (x) denotes the bth derivative with respect to the argument. The size of the determinant can be trivially extended, as long as the first ν rows depend in the same way on the arguments z b and thus become degenerate in the limit.
Let us turn to the evaluation of the partition function Z ( N f ) 2,n−k−1 ({m}). In order to apply (59), we split the N f = N f + ν + 2k masses into two sets, which is born out of (41), Notice that in the last term for j = k, we obtain which is proportional to √ s. These are the variables that will appear inside the polynomials and kernels on the right hand side of (59). In the Vandermonde determinants of parameters {u 2 } and {v 2 }, due to their translational invariance, the following simplifications occur: and with In the last line, we have already applied the degeneracy of the first ν masses. Before taking the degenerate limit of the first ν parameters v i , we multiply the first ν rows of the determinant by s 4 for later convenience, and thus divide the pre-factor by s 4ν . Putting all this together, we obtain the following expression for the partition function of N f shifted masses with n − k − 1 eigenvalues and 2 zero-modes Notice that all factors of (−1) have cancelled, and we recall the definition (61). In the same way as we simplified (54) to (57), in order to prepare the large-n limit, we can apply here (56) to modify the first N f + ν column as follows: applied to z → − s 2 . We would like to point out the subtlety that, to be applicable, the power in z ν and the order of the generalised Laguerre polynomials ν have to agree. This is the reason why we multiplied with s 4 in one of the previous steps. After differentiation the power and order are lowered by one unit, which allows to iterate. Therefore, we recombine the first ν rows beginning from the highest derivative, so that we obtain shifted derivative operators, (1 + ∂s2) a−1 instead of ∂ a−1 s 2 , making (68) in z → − s 2 applicable. We evaluate for the Laguerre polynomials, and for the kernel (49) In the last line, we have defined which includes the original kernel for a = 1, K In such a way, the factors s 4 and factorials in (69) and (70) can be pulled out of the first ν rows of the determinant in (67), and we obtain as the final answer for the partition function

Hard edge kth spacing distribution at finite-n
Inserting the expression (72) as well as (57) into (42), we obtain the following relatively compact expression for the spacing distribution at the hard edge between the kth and (k + 1)st smallest eigenvalue for arbitrary k, with N f flavours and topology ν at fixed n. Recalling the definition of the shifted variables from (61), we have The expression (73) is our first main result and will serve as the starting point for the asymptotic analysis in the next section. Before we come to this, let us make some comments on the duality between the index ν and the number of flavours with mass 0. Indeed, when taking the limit m N f → 0, the result (73) reduces to the case N f → N f − 1 and ν → ν + 1. This can be readily checked, while the other limit m N f → ∞ leads to the partially quenched case N f → N f − 1 and ν → ν, meaning the index ν stays the same.
Let us give some simple examples. In the absence of flavours (N f = 0) and zero-modes (ν = 0), we obtain the following expression for the kth spacing in the χGUE with square matrices Here, all shifted variables are explicitly written out. In particular, for the spacing between the first and second eigenvalue (k = 1) we obtain a one-fold integral representation × L (2) n−1 (−s(2x 1 + s))L (2) n−2 (−s(2x 1 + s)) − L n−2 (−s(2x 1 + s))L (1) n−1 (−s(2x 1 + s)) , where we inserted (53) for the kernel at equal arguments. This result still has to be unfolded, as explained at the end of Section 4. 1,n (σ) between 1st and 2nd eigenvalue (k = 1) at the hard edge, rescaled to first moment of unity, for different values of n = 2 (blue), n = 6 (orange), n = 10 (green), and n = 20 (red). The dashed curve is the bulk spacing distribution (A.1) for comparison. Top Right: Because the curves are almost indistinguishable we also compare the difference ∆p k (s) between the asymptotic spacing (90) to be derived in the next section, and our expression at finite-n (53), for n = 2 (blue), n = 4 (orange), n = 6 (green), n = 8 (red), n = 10 (purple), and n = 20 (dashedblack). Bottom Row: Differences of the finite n spacing (73) to their asymptotic counterpart (86) for k = 2, n = 4, 6, 8, 10, 20 (Left), and k = 3, n = 4, 8, 20 (Right). In both plots the dashed line denotes the difference for n = 20, both with their first moment set to unity. These plots illustrate the rapid convergence already at relatively small values of n. Please note the different scales of the y-axis.
In Figure 2 (top row), we compare (75) for small values of n and the convergence to the asymptotic limit. The result for k = 1 at n = 2 is already very close to the limiting result, showing the optimal rate of convergence 1/n 2 at the hard edge, as it was shown in [46]. This certainly is also true when increasing k to k = 2 and k = 3, see bottom row of Figure 2. The reason why the optimal rate of convergence applies here is that we unfold via the formula (44), since in [46] it was shown that already a scaling is sufficient to reach this rate.
Since the deviations of the finite n-results and the asymptotic ones lie in the per mill regime, we have plotted the difference ∆p k (s) =p

The large-n limit at fixed k, N f and ν
In the asymptotic large-n limit of the χGUE at the hard edge, with our convention of an n-independent weight function, it is well known that the eigenvalues, and therefore also the spacings and masses scale with √ n, see [18]. We therefore define the following variables where σ and the µ f remain finite in the large-n limit. In view of the integrations to be performed in (73), and the definition (61) of the shifted variables therein, we also redefine the integration variables in the following fashion: We begin with the large-n asymptotic of the generalised Laguerre polynomials, using [40,Eq. (18.15.19)] with I ν (z) the modified Bessel function of the first kind. In foresight, we already allow here for a general degree j of the polynomial, to be able to take the limit under the sum in (71). When j = O(n) we can set t = 1 in the above expression, as it will be the case for the unintegrated Laguerre polynomials. We apply this first to the mass dependent determinant of Laguerre polynomials in the denominator in the first line of (73). In view of the scaling (76), we also include the mass dependent pre-factor Next, we move to the building blocks of the large determinant in the last line of (73). We begin with distinguishing the Laguerre polynomials of different variables, multiplied by different powers. Inserting Eqs. (76) and (77), we obtain from the limit (78) In view of these results, it is convenient to define again a set of limiting shifted variables, in analogy to (61), in order to compactify the final answer for the limiting spacing distribution: We are now ready to take the limit of the kernel (71). For any e = 1, . . . , k − 1, it holds where we have defined after changing variables. The same result holds for e = k, replacing λ e by χ. Finally, for a = 1 we obtain for the limit of the χGUE kernel: for b = 1, . . . , N f and e = 1, . . . , k − 1, and likewise for different arguments µ b → λ c and λ e → χ. In the last step we have performed the integral, corresponding to the limit of the second line in (49), whereas the first line follows from replacing the sum with an integral. Finally, for equal arguments we have and likewise for λ c → χ. If we had taken the limit for Laguerre polynomials with positive argument in (78), this would be proportional to the Bessel density at ν = 2, see (9). The factor of two in the argument of the Bessel functions indicates that in our scaling limit (85) the mean level spacing is π/2. We will come back to this below. We now have all ingredients together to take the limit of the spacing distribution (73), where in view of the scaling (76) we define In the quenched case N f = 0 of the χGUE with ν = 0 it reduces to In the simplest case k = 1 (spacing between the first and second smallest eigenvalue), we thus obtain the following single integral representation where we have spelled out all shifted variables explicitly, as well as the kernel from (85). As explained at the end of Section 3, we still have to unfold, by computing the first moment and rescaling the spacing distribution accordingly. Here and in the setting with more parameters, we were only able to do this numerically. In the present case this means leading to the final answer (s)−p bulk (s) (red higher curve) is still considerably larger than the difference between the exact bulk spacing and the Wigner surmise (black, dashed curve).
As it was argued after (85), the approximate mean level spacing we expect here is π/2 ≈ 1.571. It is very close toσ 1 , but given the very small deviation from the bulk spacing to be discussed below, we better use its exact valueσ 1 to set the first moment ofp (0,0) 1 (σ) exactly to unity.

Comparison to the bulk spacing distribution
We now turn to the comparison between the hard edge spacing distribution for different parameter values k, N f and ν, and the bulk spacing distribution (A.1). We will also compare with the Wigner surmise (1), which is often used in comparison to real data, in order to illustrate the closeness between the hard edge and bulk spacing.
We begin with the quenched spacing between the first two smallest eigenvalues (k = 1) with proper unfolding, meaning the first moment is unity, see Figure 3. The deviations of this level spacing distribution from the Wigner surmise (1) or from the bulk spacing distribution (A.1) is about one percent and can be discerned with the bare eye at the maximum of the distribution. This has to be seen in contrast to the difference between the Wigner surmise and the bulk spacing distribution which is only about a few per mill, see [12]. As already mentioned at the end of Section 2, the case k = 1 and ν = N f = 0 will be the one that should show the strongest deviations. One approach to the bulk level spacing is certainly given when increasing k, as we literally move into the bulk. For simplicity we restrict ourselves to the quenched case (N f = 0) and a single flavour (N f = 1), see Figure 4. We find, that the deviation from the bulk spacing decreases very quickly. Already for k = 3 the difference to the bulk spacing distribution is of the same order as the difference between the bulk level spacing distribution (A.1) and Wigner's surmise (1). This agrees with our observation for the microscopic level density at the end of Section 2. Introducing flavours only suppresses this difference even more.
A warning is in order. The numerical evaluation of the multiple k-fold integral representation (87) becomes rapidly unstable when going to higher values of k, ν and N f . One reason is the increasing number of integrals to be carried out. Another reason is the ratio of the determinants in the integrand, which can become numerically unstable as the size of the determinant increases. It may happen that big as well as tiny numbers have to cancel. Metropolis-algorithms are also hard to implement, though they are in principle possible, because the functions in the integrand are quite involved.
To overcome this problem we have employed Monte-Carlo simulations of the random matrix ensemble to study the change of the level spacing distribution when increasing k, in particular for k ≥ 4. For this purpose, we only consider parameter sets with N f = 0, since otherwise we would need to rely on the Metropolis-Hastings algorithm. This again would exceed feasible computation time for a sufficiently large matrix dimension and  number of configurations. Furthermore, due to the duality between N f and ν, it is equivalent to only consider ν = 0 or N f = 0 with masses very close to zero, as discussed in Section 4.
As the eigenvalue statistics at the hard and the soft edge are fairly different, as can be seen in Figure 1, where the microscopic spectral densities of those two regions of the spectrum are displayed, we further want to study the level spacings obtained from numerical simulations, first for relatively small k, and secondly for the transition from the bulk to the soft edge. To get the clearest signal, we have first considered ν = 0, as for larger ν we are closer to the soft edge, cf., Figure 1. Figure 4 highlights that for k ≥ 3 the deviation between the level spacing distributions at the hard edge (Eq. (86)) and those in the bulk (Eq. (A.1)) is comparable to the deviation between the latter and the Wigner surmise (1). We have further quantified this result via a χ 2 -test in Table 1, where we have computed the L 2 -distance between level spacing distributions (86) and (A.1) for various k with Monte Carlo simulations.
It can be seen, that for the transition from the hard edge to the bulk region of the spectrum, the deviation between the level spacing and the bulk spacing decreases, as it should, and increases again when reaching the upper edge at k = n − 1 = 49 which is a soft edge and, thence, has to follow the Airy statistics.
If we now look at the level spacing distribution at the soft edge (see Figure 5) and the transition from the bulk to the latter, we find this transition to be equally fast. The differences obtained from the χ 2 -test between the numerical data and the bulk spacing are of the same order as for a spacing in the middle of the bulk, from the third spacing k = 3 upwards and from k = 47 downwards, as can also be seen in Table 1.  Table 1. The χ 2 -test for the level spacing distributions between the kth and (k + 1)st eigenvalue for ν = 0 at matrix dimension n = 50, in comparison to the bulk spacing distribution (A.1). We have chosen a bin size of roughly 0.2 and an ensemble size n conf. = 10 6 to keep the statistical error below one percent.

Asymptotics to the Airy statistics (ν 1)
Another limit is the large-ν limit which should approach the level spacing distributions at the soft edge. Actually this also holds when the number of flavours N f is sent to infinity, while the corresponding masses are of the order of 1/ √ n, as we have exploited this scaling in our computations. In general, we can understand the effect of non-vanishing flavours and topology as follows. Both push the eigenvalues further away from the origin, making the effect of the hard wall less relevant.
In the top plots of Figure 6 we have numerically evaluated (86) for k = 1, N f = 0 and ν = 0, 1, 2. Hence, it is the level spacing between the two smallest eigenvalues for a different number of zero eigenvalues. As mentioned before, we are unable to go much beyond this setting. Thence, we have generated Monte Carlo simulations with matrix size n = 100 for larger values of ν. Those are fairly good approximations of the asymptotic result, as the rate of convergence for the unfolded spacing distribution is 1/n 2 at the hard edge, cf. [46]. As all unfolded level spacing distributions are very close together, compare the upper left plot in Figure 6, we have studied the differences to the bulk spacing distribution p bulk (s), see (A.1), as a function of ν. For the histograms from the Monte Carlo simulations one needs to integrate p bulk (s) over the length of the bins, so that one can take such a difference. Thus, the general procedure has been as follows: (1) creating the histogram of the level spacing distribution p  One can readily see in Figure 6 (bottom plots), that the distance between the level spacing distribution of the two smallest eigenvalues and the one in the bulk first decreases with increasing ν (left plot). However, at about ν = 15 it again increases as the distribution slowly approaches the level spacing distribution of the soft edge, which we have added for comparison, see lower right plot in Figure 6. The spacing distribution of the two largest eigenvalues is also Monte Carlo simulated for the matrix dimension n = 100. However, one needs to be careful with this result, as the finite size error is quite big at the soft-edge. In [47] it was found that the optimal rate of convergence is n −2/3 . Thus, in the present case the systematic error is about 20% and, therefore, much larger than the deviations to the bulk level spacing distribution (A.1) we are comparing with. To have it of the same order one needs to generate matrices of the size n = 10 5 which is out of reach for our modest computing power.
As already discussed in Sec. 1, it is well known [39], that for large ν the Bessel-kernel converges to the Airy-kernel, as the hard edge of the Marchenko-Pastur law deforms into a second soft edge. To get a statistical quantification of this effect, we have performed another Monte Carlo simulation of the random matrix ensemble with ν ∼ n for increasing ν, and calculated their spacings, differences to the bulk and empirical densities to observe this effect (see Figure 7). We further quantify this transformation again in terms of the χ 2 -test comparing the statistics to the bulk spacing in Table 2. In the latter, it becomes apparent, that the deviation to the bulk spacing increases rapidly for decreasing values of q = n/(n + ν), which shows a rapid deformation of the hard edge into a soft edge. This has been already seen in Figure 1 for the microscopic level density, when ν = 20.
If we compare this to spacings at the soft edge for ν = 0 (Table 1), we find distances of the same order from the χ 2 -test, which shows that for larger ν the hard edge transforms into a soft edge.
In Figure 7 as well as Figure 6, it becomes apparent, that when moving from the hard to the soft edge the peak of the spacing shifts from the right to the left of the maximum of the bulk spacing distribution. This is expected as a stiffer level spacing, as it is the case for the hard edge, should have a maximum closer to 1.
In contrast, for a softer level spacing, like for the Airy statistics, the eigenvalues have more freedom to move, which shows in a maximum further away from 1. Since this effect is barely noticeable, the value of the χ 2 -test is additionally given in Figure 7. We omitted to show the difference to the bulk spacing distribution as the number of generated configurations is smaller here. This is because the matrix size in the second setting of Monte Carlo simulations has been increased already. This results in a statistical error of one percent, which q \ n 200 600 1000 0.98 0.01286 0.00283 0.00527 0.91 0.03533 0.0324 0.04262 0.83 0.01056 Table 2. The χ 2 -test for the level spacing of the smallest two eigenvalues for a few decreasing values of q = n/(n + ν) and increasing matrix dimension. The number of configurations is varying between n conf. = 10 5 and n conf. = 10 6 , where for the lowest q we only went up to n = 200. strongly overshadows the deviation from the bulk statistics.

Comparison to Data from Lattice QCD
Finally, we would like to compare with edge statistics from data, notably at the hard edge where we have derived analytical predictions. For that reason we consider empirical data from lattice QCD Dirac operators. It is known [17] that strongly interacting quantum field theories in the deepest infrared limit, the ε-regime, agree with the statistics obtained from random matrix theory. In QCD, as it appears in the standard model, namely with the gauge group SU(3) in the fundamental representation and a four dimensional Euclidean spacetime, the corresponding random matrix theory is the χGUE. For our purposes this means that the χGUE provides a description for the statistics of the smallest eigenvalues and their spacings for the QCD Dirac operator. This carries over to most lattice discretisations of this theory.
We gathered data from the JLQCD collaboration described in [48]. This data is obtained from numerical simulations on a Euclidean 16 3 × 32 space-time lattice at β = 2.30, with a lattice spacing a ∼ 0.12fm, via two different algorithms, namely domain wall and overlap fermions. From those two algorithms the 50 smallest eigenvalues of the overlap Dirac operator with a total number of configurations of about n conf. = 1000 were computed for different values of Fermion masses, where we will focus on the data with N f = 2, see top part of table 1 in [48].
Although we have derived detailed expressions for the spacing distribution as a function of the number of flavours N f and rescaled quark masses µ f (86), we have also seen that the difference between quenched and unquenched prediction is very small, see Fig. 4. Furthermore, it is evident from the seminal paper [9], that even for about 2000 spacings only a rather coarse comparison to the (bulk) spacing can be made. To further increase the statistics from [48] which is already exceptionally high for a given set of parameters, we have therefore decided to average over all given mass configurations resulting in n conf. = 6046. The mass range in units of lattice spacing of the data [48] is m ud = 0.015, 0.025, 0.035, 0.05, 0.07, 0.1, which maps to a range of about µ f = 1 − 50. Thus not all of these masses are large enough to apply the quenched approximation N f = 0 to the χGUE in (86).
In Figure 8, we find a quite large deviation to the quenched spacing distribution (88) at ν = 0 at the hard edge, of up to 8%. The L 2 -distance given by the χ 2 -test is given by d χ 2 = 0.44616. Here, one has to keep in mind, that this is to a certain extend due to the number of configurations of lattice simulations availabledespite averaging over different masses. The statistical error exceeds by one order of magnitude the difference between the spacing distributions in the bulk, at the hard and soft edge. Therefore, we can conclude that the level spacing distribution is not at all a good measure in lattice QCD to discern hard edge from bulk statistics. This is both because the effect of unquenching is so small and the number of configurations to detect this is exceedingly high.
Only deviations due to a mobility edge are visible as found in [24,23,27,28]. Here, the mobility edge denotes the boundary between localised states, associated with Poisson statistics, and delocalised states related to random matrix statistics. In QCD it is reached by increasing the temperature towards the chiral phase transition, as studied in [24,23,27,28]. Close to the mobility edge, the global symmetries of the states are a mixture with Poisson statistics, which diminishes the level repulsion drastically and yields deviations from the bulk level spacing above the percent threshold, that can be easily discerned by the bare eye.

Conclusion
In the application of random matrices we have learned to appreciate the predictive power for spectral correlations, depending on the location within the spectrum. One example is the microscopic Bessel-density close to the origin, that is relevant when chiral symmetry is important, e.g. in comparison to data from lattice QCD. It depends heavily on the number of zero-modes ν, that characterise different topological sectors of the theory with broken chiral symmetry, and on the number of light flavours N f when the quark masses are sufficiently small. A second example is the Tracy-Widom distribution for the largest eigenvalue, that is relevant in the vicinity of a soft spectral edge. It describes successfully the fluctuations of growth processes for example.
In this work we have learned, that in contrast to that, the spacing distribution between consecutive smallest or largest eigenvalues is a rather inefficient measure to quantify the specific properties of these spectral regions. We could provide compact expressions for the spacing to the kth smallest eigenvalue, that explicitly depend on k, the number of zero-modes ν and number of flavours N f . However, once evaluated these turned out to be very close to the bulk spacing for k = 1 and ν = 0 = N f already, converging very rapidly as close to the exact spacing as the Wigner surmise, almost independently of ν and N f . We have demonstrated this lack of distinction directly upon comparing with QCD lattice data, that are known to follow the predictions of the χGUE for the microscopic density and individual eigenvalues distributions in the ε-regime.
We expect that the very same feature persists when considering chiral ensembles of random matrices with orthogonal or symplectic symmetry. The spacing distribution in these two ensembles is different from the GUE in all parts of the spectrum. When comparing the respective spacing distributions at the hard and soft edge to the bulk, the result is probably equally close as found here in the unitary symmetry class. Substantial deviations from the bulk spacing at the spectral edges can only be expected when considering ensembles with heavy tails or other substantial deformations.