Missing levels in intermediate spectra

We derive an expression for the nearest-neighbor spacing distribution P(s) of the energy levels of quantum systems with intermediate dynamics between regularity and chaos and missing levels due to random experimental errors. The expression is based on the Brody distribution, the most widely used for fitting mixed spectra as a function of one parameter. By using Monte Carlo simulations of intermediate spectra based on the β-Hermite ensemble of random matrix theory (RMT), we evaluate the quality of the formula and its suitability for fitting purposes. Estimations of the Brody parameter and the fraction of missing levels can be obtained by a least-square two-parameter fitting of the experimental P(s). The results should be important to distinguish the origins of deviations from RMT in experimental spectra.


Introduction
Statistical analysis of spectra is a very important tool for understanding the dynamics of quantum and wave systems [1,2,3].Quantifying the level repulsion, for example, it is possible to study the transition between integrable and chaotic quantum or wave systems [4,5,6,7,8,9,10,11,12,13,14], between localized and extended states in disordered systems [15,16,17,18] and between ergodic and many-body localized phases in many body strongly correlated systems [19,20,21,22,23].This relationship is based on the identification of complex wave or quantum spectra with Random Matrix Theory (RMT) [24].These ideas initially came from nuclear physics but are now backed up with a semiclassical justification and a great body of numerical evidence behind them [3,25].Experimental studies of the spectral statistics of different quantum and wave systems from this perspective are numerous [2,26,27,28,13].However, there are intrinsic limitations that have prevented a more wide use of these tools in experimental spectra.For a meaningful statistical analysis one needs complete sequences with no missing levels, no mixing of symmetries and long enough to have sufficient statistics.When the spectrum to be analyzed does not fulfill these conditions the reliability of the statistical analysis can be compromised, leading to incorrect conclusions, as explained next.
In chaotic systems with time-reversal symmetry the appropriate RMT ensemble is the GOE (Gaussian Orthogonal Ensemble) whereas regular systems show spectral fluctuations described by Poisson statistics, corresponding to uncorrelated spectra [29].Thus a transition from chaos to regularity manifests in the spectral fluctuations as a transition from GOE to Poisson statistics.But this kind of transition can also be induced in a GOE spectrum by the loss of correlations caused by missing levels or mixed symmetries, taking its spectral statistics also towards Poisson.Thus, when statistical analysis of a spectrum throws an intermediate result between GOE and Poisson it is quite difficult and requires a complex analysis probably taking into account different statistics [30] to distinguish the actual origin of the intermediate behavior, whether it is due to a true mixed dynamics between chaos and regularity or to missing levels and/or mixed symmetries in an actual GOE spectrum.Moreover, if it is due to both reasons it is even more difficult (when not impossible, unless one has some previous theoretical or experimental information on the dynamics of the system or the completeness of the spectrum) to find out what is the main one or estimate the weight of each, that is, estimate the degree of chaos and the number of missing levels or mixed symmetries independently.
There has been a line of research that tries to circumvent some of these limitations and even take advantage of them in order to estimate the number of missing levels and the number of mixed symmetries in a particular experimental level sequence [5,31,32,33,34].These approaches are based on RMT and the key point in order to be able to extract reliable information about missing levels or mixed symmetries is to assume that the spectral statistics coincide with the GOE (or the corresponding RMT ensemble for each symmetry class), that is, the actual experimental spectrum is chaotic.For example, analyzing the spectral statistics of chaotic nuclei it is possible to estimate how isospin symmetry is broken [35].The number of missing levels in experimental sequences can also be estimated using these techniques.It is possible, then, to correct the value of experimentally obtained level densities [36,37,38,39].
To sum up, there are tools available to determine the degree of chaos assuming that the spectrum is complete (no missing levels) and tools to estimate the number of missing levels assuming that the spectrum is chaotic (GOE).But to the best of our knowledge, there are no tools with which one can obtain both parameters at the same time: there has been no attempt to study the effect of missing levels on the spectral statistics of a mixed system between chaos and regularity.It is the purpose of this paper to fill the gap.We have been able to derive a two-parameter formula for the nearest-neighbor spacing distribution P (s) that allow an independent estimation of the degree of chaos and the fraction of experimentally observed levels.Through Monte Carlo simulations of mixed spectra based on the β-Hermite ensemble we study the accuracy of our formula and perform some tests to prove its usefulness by fitting the P (s) of these simulated spectra.We prove then that our formula is useful for estimating at the same time the chaoticity and the fraction of missing levels of an experimental level sequence when this fraction is not very large.

P (s) for missing levels in intermediate systems
The two extremes (regular-Poisson and chaotic-GOE) in the nearest-neighbor spacing distribution (NNSD) are universal.However, there is not a universal transition for the intermediate dynamics.Two general behaviors in this transition can be distinguished in terms of level repulsion: fractional level repulsion and level repulsion of only a fraction of levels [40,41].Fractional level repulsion when the NNSD behaves as P (s) ∼ s q for small values of s is the most common.It can be described phenomenologically quite well by the Brody [4] and the Izrailev [6] distributions.Both describe this kind of repulsion at small spacings but are slightly different at large spacings.They are phenomenological distributions, although the Brody distribution can be derived from a power-law ansatz for the level repulsion function [42].We choose the Brody distribution in this paper as it is the more extensively used for fitting results in intermediate systems.The Brody's formula is given by: where , and q is the so called Brody's parameter or mixing parameter, which allows to interpolate between Poisson distribution for q = 0 (regularity): and Wigner distribution for GOE for q = 1 (chaos): The NNSD is the first of a series of spacing distributions between neighbors of n−order, p(n, s), n = 0 being the NNSD.These distributions are defined after the unfolding of the spectrum where the level density is locally normalized to unity [3].A proper unfolding is not an easy task [43] but, for our purposes, we assume that the spectrum has been properly unfolded.Moreover, for our calculations here with β-ensemble spectra the unfolding is straightforward as the level densities are known from RMT.For unfolded levels these distributions need to be normalized and centered with the following conditions: When levels are missing from a sequence the higher order spacing distributions start to play a role.Using simple statistical considerations, Bohigas and Pato derived the general formula for the NNSD of a sequence with only a fraction of observed levels f (the fraction of missing levels would be then 1 − f ) as a function of the p(n, s) distributions and assuming that the missing levels are randomly distributed [31]: The interpretation of this formula is easy as the prefactor (1 − f ) k for k ≥ 1 is the probability that there are k missing levels between two given levels.An observed nearest neighbor spacing (left hand side) can really be a 2nd neighbor spacing with 1 missing level in between, a 3rd neighbor spacing with 2 missing levels in between,... a (k + 1)th neighbor spacing with k missing levels in between... in the original spectrum (right hand side).One, then, needs to use s/f instead of s, given that the level sequence with a fraction f of observed levels is unfolded to < s >= 1, so the average spacing in the original sequence without missing levels is a factor 1/f higher.From Eq. ( 6) and the form of the higher-order spacing distibution for GOE one can see that the small spacing behavior of the transition from GOE to Poisson when we have fractional behavior (P (s) ∼ s q ) is very different from the transition to Poisson due to missing levels where we still have P (s) ∼ s.This can be seen in Fig. 1, where we show the evolution with the type of dynamics of Eq. ( 1) for different values of q compared to the evolution with missing levels of Eq. ( 6) for different values of f .This gives us hope that we can derive a two-parameter distribution that can distinguish the origin of the transition through a fit to experimental spectra.Given Brody's nearest-neighbor spacing distribution (1), Abul-Magd and Simbel [44] derived a general expression for the level-spacing distributions for higher order neighbors p(n, s) using a statistic treatment proposed by Engel, Main and Wunner [42].They obtained the following generalization of Brody's formula: for n ≥ 1.Here a n = [(n + 1)q + 1]b n and both constants, a n and b n , are determined by p(n, s) normalization conditions ( 4) and ( 5).This expression presents many complications when trying to calculate high-order spacings.This is due to the fact that an exact solution for b n can only be found in the case f=0.1 f=0.3 f=0.6 f=0.9 f=1 (GOE) Figure 1: Comparison of the evolution of P (s) from GOE to Poisson with the mixing parameter q of Eq. ( 1) for the transition with the type of dynamics (a) and with the fraction f of observed levels of Eq. ( 6) (b). in which the system is regular (q = 0) where b n = 1, and in all the other cases it has to be calculated by solving the integrals numerically.Those calculations are too heavy for practical purposes but we have found that Gaussian approximations for n ≥ 3 work very well.This trick was already used in the original work of Bohigas and Pato for missing levels in GOE [31].Then, we have proceed as follows.
For n = 1, a generalization of the Brody formula for the next-nearest-neighbor distribution is obtained by substituting equation (1) for p B (0, s) into equation (7).
where a 1 = (2q + 1)b 1 .The parameter b 1 was parametrized by Abul-Magd and Simbel and has the form: For n = 2, following the same steps and substituting equation ( 8) into equation (7), we find that Here a 2 = (3q+1)b 2 and b 2 has to be obtained numerically from the normalization conditions.As Abul-Magd and Simbel did for b 1 , we have calculated using Monte Carlo simulations the values of b 2 for a scan of values of q (q = 0.1, 0.2, . . .0.9, 1) and parametrized the result in the form: b 2 (q) = 1 1 + 6.7q + 1.3q 2 + 51q 3 . (11) For n ≥ 3, we found that the nth-order level-spacing distribution follows mostly a Gaussian distribution making the calculation of b n not worth the time it takes, especially taking into account that one should keep many terms to have a good approximation to the infinite sum of expression (6).In this work we have kept up to 150 terms.Thus instead of using the expression (7) to describe the p(n, s) distribution, we use now a Gaussian approximation given by where µ is the mean of the distribution, µ = s = n + 1, and σ(n, q) is its standard deviation.From now on, σ(n, q) will be calculated using spline interpolation from a battery of σ(n, q) values previously obtained from spectra with different q values (q = 0.1, 0.2, . . .0.9, 1).
Once we have an expression for p(n, s), we are ready to propose an expression to describe the nearest-neighbor spacing distribution, P (s), associated with incomplete spectra of intermediate and chaotic systems.Substituting equations ( 1), ( 8), ( 10) and ( 12) into equation ( 6), we obtain: where all the parameters have already been defined.This is the central result of this work.Despite its complicated appearance it is suitable for calculations and fitting purposes, as shown in the next sections.But besides its practical interest, the two-parameter formula is interesting in its own as a first step to start to unravel the puzzle of the actual origin of intermediate behavior of fluctuations in a given spectrum.

Comparison with β-ensemble results
The β-Hermite random matrix ensemble was proposed as a continuous generalization for all values of the repulsion parameter β > 0 of the classical random matrix ensembles corresponding to integer values β = 1 (GOE), 2 (GUE) and 4 (GSE) [45].It allows a transition between the Poisson integrable results corresponding to β = 0 and the chaotic GOE results β = 1 with fractional level repulsion.Moreover, due to its simple form the use of the β-Hermite ensemble results in an unrivalled speed-up of numerical simulations.These features make the β-Hermite ensemble the best choice for checking our two-parameter formula for P (s) with a huge amount of matrices of different sizes with a fine scan of values of β and fractions f of observed levels.The precise value of P (s) for β-Hermite spectra does not follow exactly the Brody distribution but a more complicated two parameter formula, though the Brody formula is a good description except for low values of β [46].However, this fact also fits our purpose as we want a practical distribution that can be used to obtain a good estimation of the fractional level repulsion and the fraction of missing levels in any type of system without worrying if it follows exactly the Brody distribution.We summarize below how the matrices belonging to the β-Hermite ensemble are constructed.
The matrices in the β-Hermite ensemble are real symmetric tridiagonal matrices whose matrix elements are constructed as: That is, the diagonal matrix elements are random variables with a Gaussian distribution N(µ, σ 2 ) of zero mean and variance σ 2 = 2 while the non-diagonal matrix elements come from a χ ν distribution with ν = β(N − n), N being the dimension of the matrix.The level density of the β-Hermite ensemble is the semicircle law as in the case of the GOE, so the unfolding is easy in this case: In order to check the accuracy of the formula for P (s, f, q), Eq. ( 13), we need spectra of as high dimension as possible.Thus, we have generated ensembles of 1000 spectra from β-Hermite matrices of size N = 10000 (equivalent to have spectra of dimension 10 7 ) and values of the repulsion parameter β = 0, 0.1, 0.2, . . ., 0.9, 1 where we take out a certain number of levels N(1 − f ) in order to have a fraction f of observed levels.
In Fig. 2 we show the comparison of our formula P (s, f, q) (with q = β) with the P (s) obtained from these ensemble averages in two extreme cases (very low and very high values of β and f ) and four cases with intermediate values, which are the ones of practical interest.In Table 1 we show the values of χ 2 for all the distributions we have calculated.We have found that the agreement is excellent in most cases.We make additional comments on two cases below.
First, looking at the figures in the region of very low values of β and f the P (s) distributions are very close to Poisson statistics but, being the β-ensembles compatible with fractional level repulsion, the transition from β > 0 to β = 0 cannot be smooth and that is why, even though the values of χ 2 shown in Table 1 in this region are small, the curves are not as similar to the one for Poisson statistics as expected.In any case, we have scanned all these values of β and f for the sake of completeness but this region is of no much practical interest.No one can expect reasonable estimations with too low fractions of observed levels in too uncorrelated spectra.
Second, looking at Table 1 there is another region of values which could be strikingly higher than the rest: f ≥ 0.9, q ≤ 0.3, where χ 2 is of order tenths whereas the rest of values are of order hundredths or less.However, this is also expected as this region correspond to the fitting of Brody's formula for low values of β for a practically complete spectrum (f ≃ 1), and, as we have mentioned before, the P (s) of the β-ensembles deviate from the Brody distribution in this case [46].
In summary, in this section we made use of the β-Hermite ensemble to show that the derivation of the two-parameter formula is correct by checking its accuracy with very high dimension.In the next section we explore if the formula could be used for realistic spectra to gain information confidently about the chaoticity and the number of missing levels simultaneously by fitting data with our expression.

Distributions of fitted results
In this section we again take advantage of the β-ensemble in order to show that the twoparameter formula is suitable for fitting purposes and the possible limitations one should be careful with.We generate ensembles of matrices with certain values of the dimension N and the parameters β and f as in the previous section, but now we perform fits of these data with our two-parameter formula and we represent in a two-dimensional histogram the distribution of the results for q and f , so we can analyze the probability of obtaining the correct values of the parameters when analyzing a single spectrum of interest.Thus, the uncertainty in the estimation of the parameters is not only related to errors in the fit but to the variance of the q f 0. for the NNSD of β-ensembles of 1000 matrices of dimension N = 10000 generated with mixing parameter q = β and fraction of observed levels f with respect to the formula of Eq. 13.
parameter distribution in the proper ensemble [34].This variance increases when the matrix size is reduced.We want to stress that we do not intend to give a general recipe on how to use the formula.We have used the β-ensemble as a simple RMT model to simulate the transition from regularity to chaos.However, the transition is not universal and can, then, be different in actual experimental spectra.So we will not present here results for a fine scan of both parameters as in the previous section, but only show some representative cases to explain important aspects to take into account and give a general recommendation.That is, we make available the formula and await for experimental results to evaluate its real applicability and usefulness when approaching each case with its particular features.
For few missing levels and dynamics close to chaos (high values of f and β), the spectra are still very correlated and we can obtain reliable estimations of both parameters.We show an example in Fig. 3 where we have represented the results for the joint distribution of fitted parameters for an ensemble of 10000 spectra of dimension N = 1000 with f = 0.8 and β = 0.9.Here we find a narrow peak of values of the parameters concentrated around the correct estimations.The ensemble variance is small in this case, so the probability of obtaining the correct values of the parameters q and f from a fit of the data is high.
However, it is obvious that when spectra are very uncorrelated (very close to Poisson statistics) little information can be gained.The loss of correlations can occur when there are many missing levels (low values of f ) or when the dynamics is very close to regularity (low values of β).This intuition should be reflected in a very high ensemble variance of the parameters.For example, in Fig. 4 we show the results of the distribution of the fitted parameters of the two-parameter formula for P (s) in an ensemble of 10000 spectra of dimension N = 1000 with f = 0.6 and β = 0.3.As can be seen the results for (q, f ) are spread over a wide region.One obtains with a high probability values around (f ≃ 1, q ≃ 0.3)  of missing levels and dynamics between chaos and regularity with β = 0.6, which we can represent here by one individual member of a β-ensemble generated with f = 0.7 and β = 0.6.We can perform a fit to the two-parameter formula to obtain a first estimation of the parameters q and f .But in view of the previous figures 4 and 3 it seems clear that the best estimation of the uncertainty of the parameters deals with a simulation of an ensemble of spectra of the same dimension of the experimental one to analyze the dispersion of values of the parameters more than with any error estimation from the fit.We represent the results for this ensemble, for N = 1000, in the left panel of Fig. 5 and we can see that the spread of values is wider than the one for higher values shown in Fig. 3, and one could have a first estimation of the uncertainties of the parameters from the widths of this distribution.Moreover, for lower dimensions one obtains even wider distributions of the parameters, as shown in the central panel of Fig. 5 (N = 500) and the right panel of Fig. 5 (N = 200).That is, the probability of obtaining the correct values of the parameters from a single fit decreases, as expected, confirming that a safe estimation of the parameters and their errors cannot be obtained from a single fit but analyzing the distributions of several simulations of ensembles.In these two panels we can also observe the same effect as in Fig. 4 for the extreme of very uncorrelated spectra when the P (s) can be well-fitted in some realizations of the ensemble with a combination of f and q such that one of them is close to unity and the other one is representative of the actual intermediate behavior of the NNSD.That is, the conclusion from the fit in these realizations would be that the intermediate statistics is due to just one reason, missing levels or intermediate dynamics.
Thus, in view of these examples, our conclusion and our recommendation when analyzing a spectrum of interest would be not only performing the fit but simulating several β-ensembles of the same dimension and observe the histograms of the distribution of the fitted results in order to try to obtain the most reliable estimation of the parameters and their uncertainties.For example, when having a spectrum like the ones in the left panel of Fig. 5 one could most probably obtain parameters near the center of the distribution and then start simulating ensembles with similar values of f and β and estimating their uncertainties from the width of the distributions.On the other hand, for a spectrum of lower dimension like the ones in the right panel of Fig. 5 one could obtain parameters around the right centered peak or parameters nearer the two extreme peaks around f ≃ 1 or q ≃ 1.Then for lower dimensions, when obtaining a value close to unity for any of the two parameters one should take this result with more caution and try to perform more simulations and checks to be sure of its reliability and make the best error estimation.
Apart from intrinsic limitations from low values of the parameters or low dimension, here we only have simulated spectra and have assumed that not any previous information on them is known, but in practical cases if one has some previous information about the type of dynamics of the system or the estimated fraction of missing levels, the limitations become less.In order to have the safest estimations of the uncertainties of the parameters we recommend to perform simulations of several ensembles in any case, but if the values can be restricted to some ranges from previous known information, much narrower distributions will be obtained with only one peak instead of two or three, even for low dimensions.

Conclusions
Quantum spectra with statistical properties intermediate between the GOE result of RMT and the Poisson uncorrelated spectra can be due to genuine intermediate properties between chaos and integrability or because there are missing levels destroying correlations in the level fluctuations induced by chaotic properties.But what happens when both things occur at the same time?Is it possible to distinguish the different causes of intermediate behavior and to quantify them?
In this work, we have tried to answer this question.We have developed a two-parameter formula for fitting short-range spectral correlations with one of the parameters based on the Brody distribution accounting for the chaoticity of the spectra, q, and the other for the fraction of observed levels, f .The formulas work perfectly well when comparing with RMT results of the β−Hermite ensemble.This proofs the correctness of the formula for describing fractional level repulsion and missing levels in the spectra at the same time.
The theoretical interest of the formula is clear, as a first step to start to unravel the puzzle of the actual origin of intermediate behavior of fluctuations in a spectrum.Now, the next step would be to judge its applicability for actual experimental individual spectra.In this work we have shown some examples by simulating individual spectra from matrices of β-Hermite ensembles (q = β) and analyzed the accuracy and precision of the fit by computing the full distribution of the fitted results of the whole ensemble.From these distributions one can evaluate the probability to obtain reliable results depending on the values of β and f and the size of the spectrum, and estimate the uncertainties of the parameters.So this is our recommendation on how to proceed when using the two-parameter formula.
In view of the distributions of fitted results, we conclude that reliable results of both parameters at the same time can be obtained when the fraction of missing levels is not very large f > 0.6 and the total number of levels in the analyzed sequence is large enough N > 1000, assuming no previous information on the spectrum of interest.Though, in each particular case one can have some previous information on the values of q and f of the spectrum, which can help to shorten their ranges and improve the results even for lower dimensions.The formula would certainly be applicable to estimate the fraction of missing levels if we know in advance what is the value of the repulsion parameter q or viceversa.
A similar procedure as the one we describe here for obtaining the two-parameter formula could be implemented for the Izrailev distribution, as it is also suitable to describe fractional level repulsion.We expect very similar results as the ones we present here.It should also be possible to start with the Berry-Robnik distribution that is able to fit the spectral statistics in the so-called 'far-semiclassical regime' which sometimes show a different behavior for small spacings, P (0) = 0 and its value is given by the fraction of the classical regular phase space.Thus, in these systems there is not fractional level repulsion but full level repulsion only for a fraction of levels of the spectrum.In order to obtain a suitable two-parameter formula for the P (s) in this case one should consider the Berry-Robnik distribution as a starting point [40,41].
We have taken advantage of the β-Hermite ensemble for testing our formula as it is a simple way to scan the transition chaos (β = 1)−regularity (β = 0) and its simple form results in an unrivalled speed-up of numerical simulations, but actual experimental spectra can be different from the β-ensemble, as the transition chaos-regularity is not universal.Thus, we make available this two-parameter formula and await results from experimental spectra to evaluate its real applicability and usefulness.The difficulties we have found in the estimation, specifically when the number of levels is limited, point also to the fact that, whenever possible, one should analyze together all the information available including short and long-range statistics and statistics of the widths in order to have more reliable conclusions regarding the chaoticity and the completeness of experimental spectra.

Figure 3 :
Figure 3: Histogram of the joint distribution of fitted values of the parameters (q, f ) of an ensemble of 10000 β-Hermite matrices with f = 0.8 and β = 0.9.The matrix sizes are N = 1000.

Figure 4 :
Figure 4: Histogram of the joint distribution of fitted values of the parameters (q, f ) of an ensemble of 10000 β-Hermite matrices with f = 0.6 and β = 0.3.The matrix sizes are N = 1000.

Figure 5 :
Figure 5: Histogram of the joint distribution of fitted values of the parameters (q, f ) of ensembles of 10000 β-Hermite matrices with f = 0.7 and β = 0.6.The matrix sizes are from left to right N = 1000, N = 500, and N = 200.