A shortcut through the Coulomb gas method for spectral linear statistics on random matrices

In the last decade, spectral linear statistics on large dimensional random matrices have attracted significant attention. Within the physics community, a privileged role has been played by invariant matrix ensembles for which a two dimensional Coulomb gas analogy is available. We present a critical revision of the Coulomb gas method in Random Matrix Theory (RMT) borrowing language and tools from Large Deviations Theory. This allows us to formalize an equivalent, but more effective and quicker route toward RMT free energy calculations. Moreover, we argue that this more modern viewpoint is likely to shed further light on the interesting issues of weak phase transitions and evaporation phenomena recently observed in RMT.


Introduction
This article contains a critical revision of the two dimensional (2D) Coulomb gas analogy in Random Matrix Theory (RMT) and presents an alternative method to compute large deviation functions for spectral linear statistics on invariant matrix models. The Coulomb gas technique has been used in several physical problems for many decades. A more modern viewpoint on the subject, based on concepts and tools borrowed from Large Deviations Theory (LDT) seems worthwhile, as certain fundamental identities have been so far overlooked in the vast majority of previous works on this topic.
The goal of this paper is twofold: (i) to present an effective shortcut to the standard Coulomb gas technique for the evaluation of probabilities of linear statistics on invariant random matrix models. The method relies on classical thermodynamic identities ((53) and (68) below) based on the Legendre duality [11].
(ii) to argue -based on existing evidence from the huge literature on this subjectthat this more modern viewpoint on an otherwise well-established technique has the potential to shed further light on phase transitions and evaporation phenomena in RMT.
This paper is organized as follows. Section 2 illustrates the well-known connection between invariant matrix models and 2D Coulomb gases. For further details we refer to [2,26,40]. Section 3 presents a brief review of the previous works. Section 4 contains a few rigorous results from LDT that set the natural stage for our discussion. The reader interested in the details of the proofs is advised to look at [22]. The large-N limit of matrix integrals is discussed in Section 5 and the general ideas of the Coulomb gas method are presented in Section 6 through the prism of LDT. Section 7 contains the thermodynamic identities for linear spectral statistics and their use to compute large deviation functions; this section also contains simple, convincing examples illustrating the power of the shortened method and a brief discussion on the thermodynamical meaning of the Legendre duality. In addition to the shortcut, our investigations naturally prompt a corpus of new ideas on the issues of phase transitions and evaporation phenomena in Coulomb gas systems. These ideas and related open problems are discussed at length in Section 8.

Random matrices and 2D Coulomb gases
The definition of a matrix ensemble consists of: i) The choice of a set of matrices M β . We shall consider the following sets: real N × N symmetric matrices; complex N × N Hermitian matrices; quaternion N × N self-dual matrices. They are conveniently labeled by the Dyson index β = 1, 2 and 4, respectively.
ii) The choice of a probability measure on M β . Each of the sets with β = 1, 2, 4 is naturally equipped with its own Lebesgue measure µ L . However, for these sets µ L is infinite and hence has no probabilistic meaning. Instead, we shall consider the probability measure These ensembles are usually called invariant ensembles. For the physical motivations behind the choices of M β as in i) we refer to the second Chapter of Mehta's book [40]. Matrices with β = 1, 2, 4 are diagonalizable with real spectrum. As a consequence of the 'rotational' invariance (2), eigenvalues and eigenvectors of M are statistically independent, and any basis is 'equally likely' to diagonalize M. Therefore, the eigenvectors of invariant ensembles are quite immaterial. On the other hand, with the choice (1) as probability measure on M β the joint law of the eigenvalues (λ 1 , . . . , λ N ) is [40] dP (λ 1 , . . . , with where Z N,β =´dλ 1 · · · dλ N e −βE(λ 1 ,...,λ N ) is a normalization constant and hereafter, unless otherwise specified, all summations run from 1 to N. It is evident that (3) is the canonical distribution of a system of identical particles on the line labeled by their positions λ 1 , . . . , λ N at inverse temperature β. These N particles interact (logarithmic repulsion) and experience a confinement by the external potential NV (λ). Note that the logarithmic interaction is solution of the 2D Poisson equation. For this reason, the system is referred to as a 2D Coulomb gas constrained on the line (the eigenvalues are real) and the normalization constant Z N,β acquires the meaning of a (positional) partition function. This Coulomb gas picture is a useful and physically intuitive device for making educated guesses on the behavior of invariant ensembles. In view of certain applications to physics (see, e.g., [26,45]), it is also worth studying the canonical measure (3)-(4) of a particle system with logarithmic pair repulsion at inverse temperature β > 0 independently of a concrete matrix model realization. Note that explicit (non invariant) matrix models with arbitrary non-quantized index β > 0 are known in RMT [19].
Let us first start with a quick review of the method and its wide range of applications so far.

Review of the Coulomb gas method
The basic idea of the Coulomb gas picture goes back to the works of Wigner [51] in the 1950s. However, it was with the formidable series of papers by Dyson [20] in the 1960s that the exact correspondence between the eigenvalue distributions of some random matrix models and the statistical mechanics of classical 2D Coulomb gases attracted the attention of physicists. At the heart of the Coulomb gas method lies the following observation [20]: the probability law of a spectral statistics F = f (λ 1 , . . . , λ N ) on an invariant matrix ensemble can be expressed as a ratio of two partition functions The numerator in (6) is the partition function of the associated Coulomb gas with the constraint F ∈ B; the denominator is the partition function of the unconstrained Coulomb gas. Dyson made the assumption that for large N the Coulomb gas obeys the laws of classical thermodynamics. He obtained many results based more on well-rooted statistical mechanics intuitions rather than arguments of a mathematically rigorous kind. Later, several aspects of the problem have been put on a more formal ground thanks to the development of modern ideas of LDT. However, this program is far from being completed, as we will try to argue below.
The seminal idea of Dyson has been rediscovered several times in the past and pushed forward to compute the large deviation functions of spectral statistics on matrix models. The first remarkable "Coulomb gas calculations" appear in the works [4,31,18] on the large-N approximation in various gauge theories. More recently, a version of the technique with hard-wall constraints has been engineered to compute large deviations of the extreme eigenvalues of Gaussian matrices [16]. Scientists from several areas have recognized a 2D Coulomb gas picture in several unrelated problems such as spectral statistics in classical matrix models [8,25,49,34,35,36,37,39,12]; distribution of entanglement measures in bipartite quantum systems [43,23,17,42,24]; distribution of linear statistics on the transmission matrix and time-delay matrix of mesoscopic cavities [50,46,30,13]; distribution of maximal displacements for vicious random walkers in one dimension [41]; non-local correlation functions in domino tilings [9]; thermodynamics of the six-vertex model [10]; fluctuations in the 2D one-component plasma [1,15], and many others.
However, a revision of the existing literature reveals that the full arsenal of statistical mechanics tools (some of which were at the heart of Dyson's calculations) has been severely underemployed thus far (with the exception of [12,30,13,15]). Since many of these stat-mech considerations have been lately formalized as LDT theorems, it seems appropriate to first summarize the relevant rigorous results. Next, we will use them to offer a modern take on various statements scattered in the physics literature.

Notions of large deviation theory
Large deviations theory is an asymptotic theory which deals with probability of 'rare events'. See [22,32,47]. Let X be a Polish space, i.e. a complete separable metric space. A sequence of probability measures Pr N on X satisfies a large deviation principle (LDP for short) with speed v N and rate function I if: i) v N → ∞ (as N → ∞); ii) I: X → [0, ∞] is lower semicontinuous; and iii) for all Borel measurable sets B of X Here,B andB are the interior and the closure of B, respectively, and we used the notation I(S) = inf x∈S I(x). In the following we will use the shorter notation for expressing the chain of inequalities (7). For a sequence of random variables X N on X we will loosely speak of 'LDP of X N ', meaning the LDP for the sequence of probability laws of X N . A simple example of Polish space is the familiar X = R n . A class of Polish spaces arising naturally in applications is obtained by taking a Polish space X and then considering the set of probability measures M(X ) on it (equipped with the Lévy metric). In fact, in what follows we shall concern with the cases X = R n and X = M(R n ).
We recall here two practical ideas which are commonly used to prove a LDP. The first one is quite general: a LDP can be inferred from another one.
Lemma 1 (Contraction principle) Let X and Y be Polish spaces and g: X → Y be a measurable map. If the X -valued random variables X N satisfies a LDP with speed v N and rate function I and g is continuous on {x ∈ X : I(x) < ∞}, then Y N = g(X N ) satisfy a LDP with same speed and rate function Notice that the above lemma is nothing but an instance of the so-called saddle-point approximation.
A somehow converse statement would be the following. Suppose that for a large class of functions F : X → R the limit exists ( · denotes the expectation with respect to the probability law of the random variable X N ). Then we might expect that X N satisfies a LDP with speed v N . In fact an exhaustive class of functions F is known [5], but for practical purposes that is too large. Amazingly, in the finite dimensional case X = R n , Gärtner [27] and Ellis [21] have shown that it is sufficient to check (10) for linear functions only (provided some regularity is assumed). And in this case it is also possible to reconstruct the rate function.
Lemma 2 (Gärtner-Ellis theorem for random vectors) Let X N be a real random vector in R n . Let D be the set of s ∈ R n such that the following limit exists [Note that 0 ∈ D (hence D = ∅) and J( s) is concave (by Hölder inequality)]. If i) 0 ∈D, ii) J( s) is differentiable inD and iii) ∇J( s) → ∞ on the boundary points of D, then X N satisfies a LDP with speed v N . The rate function Ψ( x) is the Legendre-Fenchel transform of J( s): The function J( s) is known as scaled cumulant generating function (cumulant GF for short) of X N . Both the cumulant generating function J( s) and the rate function Ψ( x) are generically called large deviation functions.
The rate function Ψ( x) provides information not only about the large deviations, but also about typical fluctuations and the most probable values which correspond to the zeros (global minima) of Ψ( x). In the case where Ψ( x) has only one global minimum, the most probable value is also the typical value, that is X N satisfy a Law of Large Numbers for large N. Moreover, if J( s) is analytic at s = 0, the mixed cumulant of X N = (X N,1 , . . . , X N,n ) at leading order in N are given by where · c denotes the connected average (this is in fact the origin of the name 'cumulant GF'). The requirement iii) in Gärtner-Ellis theorem is known as steepness condition, and it has to be checked whenever D = R n (if J( s) exists for all s ∈ R n , then condition iii) is trivially true). Although the steepness condition might appear a rather technical hypothesis, it is in fact a very natural requirement valid, for instance, if J( s) is everywhere differentiable in its domain D (not only inD). It may however happen that ∇J( s) does not diverge as s approaches the boundary of D. In these cases, only a 'local version' of the Gärtner-Ellis theorem holds and, in general, the Legendre-Fenchel transform of J( s) provides only the 'convex envelope' of Ψ( x). One should always check whetherJ( s) is steep or not, as the failure of this condition might have deep consequences on the statistics of X as we will discuss later for the case of spectral statistics.
As a final remark, we remind that the Legendre-Fenchel transform (12) yields a convex function. A nontrivial fact is that a rate function Ψ( x) obtained from the Gärtner-Ellis theorem is necessarily strictly convex [44].
We are now ready to briefly revisit the standard route followed in the physics literature thus far in the light of the rigorous results above. This should lead quite naturally to the proposed shortened route.

Limit of large size matrices
One of the central problem in RMT is to find methods to compute averages with respect to the probability measure (1) on the invariant ensemble M β Suppose that F (M) = f (λ 1 , . . . , λ N ) depends only on the spectrum (λ 1 , . . . , λ N ) of M.
The 'rotational' invariance (2) of the measure allows us to integrate out the angular degrees of freedom and get In the Coulomb gas picture, this corresponds to computing averages of thermodynamic quantities. In the majority of cases, such averages are hard to compute for finite N. It is quite natural, therefore, to look for simpler and more convenient evaluations of these averages for large N.
Clearly, at zero temperature β → ∞ the only contribution to the matrix integral (15) comes from the configurations of the gas which minimize the energy function E(λ 1 , . . . , λ N ). At finite temperature β < ∞, one has to look for asymptotic formulae valid when the number of particles of the Coulomb gas is large. In the thermodynamic limit, the task of computing averages is replaced by most probable values assumed to be approximately equal to the corresponding averages. In the large N limit, a concentration of measure phenomenon indeed makes the thermodynamic variables close to their averages: the most probable value is also the typical value.
These "classical" arguments can now be stated rigorously as a LDP for the configurations of the Coulomb gas. The eigenvalues (λ 1 , . . . , λ N ) (microstates of the system) are conveniently described in terms of the normalized empirical measure (macrostates) For large N the energy function has a mean-field limit where the mean-field energy density functional is This functional enjoys many nice analytical properties. Most importantly, under some mild assumptions on V (λ), there exists a unique probability measure ̺ * that minimizes In this case the following result due to Ben Arous and Guionnet [3,32,2] holds (see also [7]).
Theorem 1 (LDP for the empirical measure) If lim inf |λ|→∞ V (λ)/ log |λ| > 1, then ̺ N satisfies a large deviation principle with speed βN 2 and rate function A physical translation of Theorem 1 is as follows. If in (4) the external potential V (λ) grows faster than the logarithmic repulsion (so that a confinement of the Coulomb gas really occurs), then the probability of a configuration ̺ N is Pr This probability is exponentially suppressed with speed βN 2 and the precise measure of "unlikeliness" of an out-of-equilibrium configuration is given by the energy penalty ∆E[̺] with respect to the equilibrium configuration in the thermodynamic limit ̺ * . It is remarkable that for a system of N particles, the rate for this suppression is O(N 2 ). Hence, from the LDP, the matrix integral (15) can be estimated as the value of f computed at the minimizers of the energy function E(λ 1 , . . . , λ N ). The accuracy of this saddle-point approximation increases exponentially in N 2 . In fact, once a LDP for the empirical measure of the gas is established, using the contraction principle (Lemma 1) one can deduce a LPD for observables of the gas, i.e. spectral statistics.
In the following, we will be mainly concerned with linear spectral statistics (or linear statistics for short), i.e. functions on the spectrum of M of the form for some continuous real-valued f (x). We denote by P N (x) the probability distribution of F (M). Note that a linear spectral statistics is a linear functional of the empirical measure (16) The generalization to more linear statistics is immediate. For n > 1 linear statistics continuous on the probability measures ̺ with finite mean-field energy, a LDP with speed v N = βN 2 and joint rate function holds. How to compute these rate functions?

Coulomb gas method and linear spectral statistics
The strategy to compute the rate function of linear statistics is the following. For definiteness we shall discuss the case of a single linear statistics F (M). The idea is to compute first its cumulant GF, as in (11), and then recover the rate function Ψ(x) as Legendre-Fenchel transform by the Gärtner-Ellis theorem (Lemma 2). As in Section 3, the Laplace transform P(s) of the probability law of F (M) can be cast as a ratio of two partition functions where we have suitably rescaled the Laplace variable for later convenience. Here is the partition function of the original Coulomb gas subject to an additional singleparticle potential sNf (λ): Note that Z N,β (0) is the partition function in the absence of the additional potential, and therefore coincides with the normalization constant Z N,β in (3). Thus the cumulant GF J(s) may be expressed as excess free energy of the 2D constrained Coulomb gas with respect to the unperturbed system The existence of the N → ∞ limit above is ensured by the LDP in Theorem 1 and Lemma 1. Hence, the computation of the joint cumulant GF J(s) is tantamount to evaluating the leading order in N of the partition function Z N,β (s). The LDP of the empirical measure ̺ N states that for large N the partition function Z N,β (s) in (29) is dominated by ̺ * s (λ), the unique minimizer of an effective mean-field energy functional E s [̺] in the space of normalized distributions: The effective energy functional is for some constant C, and for all λ belonging to the support of ̺ * s , namely λ ∈ supp̺ * s . At equilibrium, the 2D Coulomb gas arranges itself in such a way that each particle has equal electrostatic energy (the left hand side of (36)). Otherwise stated, there is no net electric field among the charges (a necessary condition for electrostatic equilibrium). There exist many techniques to solve (36) and we shall not discuss them here, referring instead to [48,26]. Notice that (36) is not a linear integral equation. Indeed, the unknown equilibrium measure ̺ * s (λ) must satisfy the identity (36) for λ in its support supp̺ * s , which is itself unknown. The meaning of the equilibrium density is the following: ̺ * s (λ) corresponds to the typical configuration of eigenvalues yielding a prescribed value of F (M) as given by In the language of statistical mechanics, ̺ * s (λ) is the macrostate of the system that realizes the (unlikely) event F (M) = x in the most likely way, where x =´f (λ)d̺ * s (λ). Hence, a possible route to evaluate J(s) is to find the saddle-point density ̺ * s (λ) as a function of s (as solution of (36)) and to insert it back into the energy functional (35) to evaluate the leading order of Z N,β (s) as excess free energy in (32). Using (33) we have Once J(s) is known, the rate function Ψ(x) of the LDP P N (x) ≈ exp(−βN 2 Ψ(x)) can be computed as Legendre-Fenchel transform (12) of J(s). This technique has been exploited in the last decade in many physical problems to compute the large deviations of single observables. However, this route entails the explicit computation of the mean-field energy at the saddle-point density E s [̺ * s ] = − 1 2˜l og |λ − λ ′ |d̺ * s (λ)d̺ * s (λ ′ ) +´[V (λ) + sf (λ)]d̺ * s (λ), which is not necessarily an easy task. The situation gets even worse in the case of joint statistics. For n > 1 linear statistics (24) one has where s ∈ R n and ̺ * s (λ) is the minimizer of the effective energy functional with f (λ) = (f 1 (λ), . . . , f n (λ)).
Can we bypass at all the explicit and often cumbersome evaluation of the integrals in E s [̺ * s ] or E s [̺ * s ]?

The Coulomb gas method via Legendre duality
It is indeed possible to introduce a shortcut (based on classical thermodynamic identities, discussed extensively in [11]) which outmaneuvers unwieldy integrals altogether. We discuss first the case of one linear statistics n = 1. The extension to n > 1 linear statistics as well as a discussion on the thermodynamic meaning of certain identities is then presented.

Shortcut
If J(s) satisfies the hypotheses of the Gärtner-Ellis theorem (Lemma 2), then the rate function of the LDP of the linear statistics F (M) is given by where we applied (in order) (12), (38), (33), (34), and (20). On the other hand, from the LDP of the empirical measure of the Coulomb gas (Theorem 1) and the contraction principle (Lemma 1), the rate function is given by (23): where the infimum is taken over the measures with a prescribed value We see that the formulation in the Laplace space (44), is nothing but the implementation of the constrained minimization problem (45) with a Lagrange multiplier s (the Laplace variable) that takes into account the constraint F [̺] = x.
We have already remarked that J(s) is concave. This implies that J(s) is continuous inD, and is differentiable almost everywhere. In the following we assume further that J(s) is strictly concave and everywhere differentiable. Since Ψ(x) in (41) is always strictly convex we can apply again the Legendre-Fenchel transform to get In the case of regular functions, the Legendre-Fenchel transform (47) is given by where x * (s) is the unique solution of Ψ ′ (x) = −s. Analogously, the transform (41) reads where s * (x) is the solution of J ′ (s) = x. Therefore, we get and the function s * (x) is the inverse of x * (s), i.e. the solution of s * (x * (s)) = s. We stress the fact that the strict concavity of J(s) implies the existence and uniqueness of the solution s * (x) of J ′ (s) = x. Only in this case one has a one-to-one correspondence (50) between the slopes of the cumulant GF J(s) and the slopes of the rate function Ψ(x).
What is the meaning of these quantities? The answer is provided by (34)- (38) and the stationarity condition δE s [̺ * s ]/δ̺ = 0. Indeed, (51) as anticipated in (37). The identity (47) can be informally written in the (almost) symmetric form This equation should be handled with care. In fact, in (52), there is only one independent variable: either s or x. The relation between the conjugate variables x and s is ruled by (50), and the identity (52) is to be interpreted as either (48) or as (49). In any case we can stipulate that dJ(s) = x * (s)ds, with x 0 = x * (0). These equalities show that, when a large parameter (N in our case) is involved, the Laplace and the Legendre transforms are intimately connected through the saddle-point approximation of the thermodynamic limit. The thermodynamic relations (52) and (53) provide an alternative method to compute the large deviation functions. The steps of this shortened method are (i) Solve the saddle-point equation (36) to get ̺ * s ; (ii) Make the relation between x and s explicit by evaluating x * (s) = F [̺ * s ] in (37), and its inverse s * (x); (iii) Compute the large deviation functions J(s) and Ψ(x) using the relations (53).
Hence we have This procedure is usually many times faster than the standard route, as the following examples clearly demonstrate.
Example 1: Coulomb gas in a box. As a first example we consider a Coulomb gas confined in the interval [−1, 1]. The canonical measure of the N-particle gas at inverse temperature β > 0 is Let us take the center of mass of the gas F = N −1 i λ i as our natural observable. Clearly −1 ≤ F ≤ 1. We will derive the rate function Ψ(x) of F in the large N asymptotics P N (x) ≈ exp(−βN 2 Ψ(x)) using the shortened route. Note that Ψ(x) = +∞ (zero probability) if x < −1 or x > 1.
The first step is to compute the saddle-point density ̺ * s (λ) of the effective energy functional (35) with f (λ) = λ and V = 0. This optimization can be performed using, for instance, Tricomi's formula [48]. Skipping details, one eventually finds that By inserting the expressions of a and b from (57) into (58) one gets the explicit form of x * (s) = F [̺ * s ] and its inverse s * (x): According to formulae (54), a straighforward integration of (59)-(60) provides the large deviation functions Note that both the cumulant GF J(s) and the rate function Ψ(x) are not analytic. More precisely, the third derivative of the free energy J(s) (the rate function Ψ(x)) of the Coulomb gas is discontinuous at s = ±1 (at x * (±1) = ∓1/2). These non-analyticities correspond to phase transitions of the Coulomb gas. We will come back to this point in the last section.
Example 2: Planar approximation of quantum field theories. Using Wick calculus it is possible to show that the cumulants (connected averages) of powers of traces of complex Gaussian matrices have a 1/N expansion in terms of maps with given genus. See [52]. To leading order in N, these cumulants are given by enumeration of planar diagrams (maps of genus zero). Let M be a Gaussian matrix dµ(M) ∝ exp(−βN Tr M 2 /4) and let F (M) = N −1 Tr M 4 . For β = 2 (complex Gaussian model) the cumulants of F (M) to leading order in N enumerate the connected planar vacuum diagrams of the ϕ 4 -theory. In [4] the cumulant GF of these numbers (among other results) was computed using a Coulomb gas picture. Here we reproduce their result using the shortened version of the Coulomb gas method (beware of different notation: our Laplace parameter s corresponds to 2g in the convention of [4]).
The Gaussian ensemble corresponds to a quadratic potential V (λ) = λ 2 /4 in (4). The spectral statistics we are interested in is We compute the sum of connected vacuum diagrams J(s) (the cumulant GF of the linear statistics F ) using the Legendre duality. Notice that the Laplace transform P(s) =´e −βN 2 sF (M ) dµ(M) is finite only for s ≥ 0. Hence, a priori, the first hypothesis of the Gärtner-Ellis theorem (Lemma 2) is not satisfied.
The saddle-point density ̺ * s (the minimizer of E s [̺]) can be computed explicitly using standard methods for |λ| ≤ 2 L(s) with L(s) = (1/48s)( √ 1 + 96s − 1) > 0. Using residue calculus as in the previous example one obtains Insisting on the validity of the Legendre duality, an elementary integration gives: This cumulant GF was computed in [4,Eq. (20) and (

Joint linear statistics and thermodynamic identities
The generalization to n > 1 linear statistics F (M) in (24) is immediate. Now, J( s) and Ψ( x) are functions of n real variables. Again we assume J( s) strictly concave and differentiable so that the Legendre-Fenchel transform reduces to the simplest Legendre transform. Let ̺ * s be the minimizer of the mean-field energy functional (40). The multidimensional analogue of (52) is where x = x * ( s) has components and s = s * ( x) is its inverse map. We can write the differential equations supplemented by the 'initial conditions' Hence the computation of J( s) and Ψ( x) amounts to finding the saddle-point density ̺ * s , computing F i [̺ * s ] (i = 1, . . . , n) in (67) and then integrating the differential forms (68). The identities (68) have been employed [12,30,13] and explicitly stated [29,30,13] in a few previous works on joint linear statistics on matrix models. However, their precise LDT conditions of applicability, as well as the implications for weak phase transitions and evaporation phenomena (discussed below in Section 8) do not seem to have been examined elsewhere.

Back to thermodynamics
In order to convey the main ideas of this method and show its relation with thermodynamics, we discuss the n = 2 case, i.e. the case of two linear statistics F 1 (M) and F 2 (M). Now x = (x 1 , x 2 ) and s = (s 1 , s 2 ). Once the variational problem δE s 1 ,s 2 [̺ * s 1 ,s 2 ]/δ̺ = 0 is solved, one gets and the pair of inverse maps s * 1 (x 1 , x 2 ), s * 2 (x 1 , x 2 ). The differential relations (68) in this case read or, for short, The reader should have recognized the familiar Maxwell relations for thermodynamic potentials [33]. This should not come as a surprise, as LDT may be really seen as the proper mathematical framework in which statistical mechanics can be formulated rigorously (several authors have elaborated on this idea; see e.g. [22]). Let us then elaborate briefly on this point. In thermodynamics, temperature T and pressure p say, are the control parameters of (the Lagrange multipliers associated with) entropy S and volume V , respectively. A classical (p, V, T ) system is conveniently described in terms of the internal energy U(S, V ) or the Gibbs free energy G(T, p). Bearing in mind the different sign convention in the thermodynamic tradition, the analogues of (71) are [33]: or for short (cf. (72)), where

Discussion and Outlook
In the previous sections we have presented a revision of the Coulomb gas method for linear spectral statistics on random matrices. Thanks to the contraction principle (Lemma 1) it is possible to retrieve information from the behavior of the 'gas of eigenvalues' and establish a LDP for linear statistics. In order to compute the large deviation functions through the Gärtner-Ellis theorem (Lemma 2), we have made an extensive use of the Legendre duality. The bottleneck of the standard method summarized in Section 6 is the evaluation of the energy functional (40) at the saddle-point density, which entails the computation of many integrals. Here we have illustrated a computationally simpler approach. This shortened method only requires the explicit relation between the real variables x and the Laplace variables s from (67) and its inverse. The workload can be drastically reduced by exploiting the differential relations (68), with the welcome consequence that the otherwise daunting task of finding joint large deviations is also made possible (see e.g. the recent works [12,13,30]). The method can also be applied almost verbatim to real spectral statistics on invariant normal non-Hermitian ensembles (see [15] for an application on complex Ginibre matrices [28]).
As already mentioned, the rate function Ψ( x) of a spectral linear statistics provides an overall description of the fluctuations about the most probable values. If J( s) is analytic at s = 0, the mixed cumulant of F 1 (M), . . . , F n (M) to leading order in N are given by In fact, this is not true in general and stronger conditions are required to infer a Central Limit Theorem [6].
In what follows, we first briefly discuss the regularity properties of the cumulant GF and the associated phase transitions within the underlying Coulomb gas framework. Then we conclude with a discussion of some subtleties of the Coulomb gas method.

Phase transitions in 2D Coulomb gases
The regularity properties of Ψ( x) and J( s) are ultimately determined by the smoothness of the map R n ∋ s −→ ̺ * s ∈ M(R). Coming back to a single statistics (n = 1) for simplicity, it has been observed in several problems that the cumulant GF J(s) is not real analytic (see Example 1). Since J(s) is the excess free energy (38) of the associated 2D Coulomb gas, the non-analyticity points s cr correspond to phase transitions of the Coulomb gas system. The order of the phase transition at s cr (in the sense of Ehrenfest) is the smallest integer ℓ ∈ N such that ∂ ℓ s J(s) is discontinuous at s cr . Phase transitions in matrix models abound. These transitions are remarkably weak, usually of third-order (the third derivative of J(s) being discontinuous at some s cr ; in the Example 1 we have seen such a discontinuity of J(s) at s cr = ±1). See [9,17,18,23,24,31,42,50]. Similar third-order phase transitions, driven by the very same mechanism (known as 'hard-wall mechanism', see the broad survey [38]) have been also observed for joint (n > 1) linear statistics, revealing rich phase diagrams [13]. Third-order phase transitions associated to the merging/splitting of the gas density in multiple supports have also been studied [50]. Recently a fourth-order transition (driven by a change of topology mechanism) has been detected in the Ginibre ensemble [15].
For a single statistics, the identity J ′ (s) = x * (s) shows that the Coulomb gas undergoes an ℓ-th order phase transition at s cr if ∂ ℓ−1 s x * (s) is discontinuous at s cr where x * (s) =´f (λ)d̺ * s (λ). For instance, in our Example 1, x * (s) in Eq. (59) has a continuous first derivative; its second derivative ∂ 2 s x * (s) is discontinuous at s cr = ±1 and hence the non-analyticities of the free energy J(s) are of third-order. Also note that, since −Ψ ′ (x) = s * (x), the rate function Ψ(x) and the cumulant GF J(s) have the same regularity.
The study of phase transitions in invariant matrix models is very much a topic of current research. A natural question is to know whether the free energy J(s) associated to a linear statistics on a invariant matrix models has critical points and to characterize the order of the transitions. Physically one certainly expects a relation between the particular behavior of the gas density ̺ * s and the arising non-analyticities of the corresponding thermodynamic observables (see Example 1). However, this relation is not completely understood in general, and each particular case requires explicit working. Interestingly, the identities discussed in this article can be used to get new insights on the emergence and characterization of these weak phase transitions [14].

Breakdown of the method: discontinuous statistics and evaporation phenomena
At this point it is worth mentioning the limitations of the method based on Legendre duality. The procedure outlined in Section 7 relies on the Contraction Principle (Lemma 1) and the Gärtner-Ellis theorem (Lemma 2).
The method does not work in general for discontinuous statistics -a paradigmatic example is the fraction of eigenvalues F (M) = N −1 i χ I (λ i ) in a fixed interval I. This is a fundamental obstruction because in these cases it is not even guaranteed that a LDP exists: the contraction principle does not work in general for F (M) = N −1 Tr f (M) when f is not continuous. In many cases [20,35,39], the hallmark of this obstruction is a logarithmic contribution to the variance Var(F (M)) ∼ (log N)/N 2 , in contrast to the O(N −2 ) general behavior for smooth linear statistics discussed above.
More interesting limitations of the method are related to the properties of the cumulant GF J(s). The Gärtner-Ellis theorem (Lemma 2) is not applicable if the so-called steepness condition ∇J( s) → ∞ (as s → ∂D) is not verified. In this case, only a 'local' version of the Gärtner-Ellis theorem holds since the map s * ( x) is not globally defined. This is the case for many if not all previous works on linear statistics [12,17,42,24,46,30] where a phenomenon of evaporation of eigenvalues in random matrices was claimed.
To simplify the discussion consider a single statistics F (M) = N −1 Tr f (M) (n = 1) and suppose again that J(s) is strictly concave (so that J ′ (s) = x * (s) is a strictly decreasing function). For concreteness suppose that J(s) is defined for s ≥ s 0 and that J ′ (s + 0 ) = C (failure of the steepness condition). Hence J ′ (s) = x * (s) ≤ C. The inverse function s * (x) = −Ψ ′ (x) is therefore defined only for x ≤ C and the rate function Ψ(x) cannot be recovered by integration (54) when x is larger than C. This means that the rate function computed as Legendre transform of the non-steep cumulant GF provides only a partial description (for values x ≤ C) on the limiting distribution P N (x) ≈ exp(−βN 2 Ψ(x)) of F (M).
Note that F (M) = N −1 i f (λ i ) is a sum of strongly correlated random variables. Several papers in the physics literature ran into a similar obstruction: the constrained Coulomb gas calculation provides the rate function Ψ(x) of a LDP with speed O(N 2 ) as long as x is less (say) than a critical value C. In this regime, the typical eigenvalue distribution constrained by F [̺] = x ≤ C is the solution of a variational problem like (36), and all the eigenvalues 'democratically' contribute to satisfy the constraint. Guided by numerics and physical arguments, in several problems it has been argued that the statistics of values of F (M) larger than the critical value C is instead driven by evaporation phenomena: in the large N limit, a single eigenvalue splits off from the continuous density of M. In this scenario, large values of the sum F (M) are dominated by the contribution of a single summand. The hallmark of this is a change of speed in the large deviation estimate, usually from O(N 2 ) in the 'democratic regime' to O(N) in the regime where a single eigenvalue carries a macroscopic contribution to F (M).
A thorough review of these works, in the modern viewpoint adopted in this paper, reveals that the aforementioned obstruction can be essentially traced back to the nonsteepness of the cumulant GF of F (M). Therefore, a failure of the steepness condition may correspond to a 'change of speed' in the LDP driven by evaporation phenomena. This phenomenon is not restricted to random matrices and is naturally explained by a careful consideration of LDT detailed prescriptions.