Superconductivity, generalized random phase approximation and linear scaling methods

The superfluid weight is an important observable of superconducting materials since it is related to the London penetration depth of the Meissner effect. It can be computed from the change in the grand potential (or free energy) in response to twisted boundary conditions in a torus geometry. Here we review the Bardeen-Cooper-Schrieffer mean-field theory emphasizing its origin as a variational approximation for the grand potential. The variational parameters are the effective fields that enter in the mean-field Hamiltonian, namely the Hartree-Fock potential and the pairing potential. The superfluid weight is usually computed by ignoring the dependence of the effective fields on the twisted boundary conditions. However, it has been pointed out in recent works that this can lead to unphysical results, particularly in the case of lattice models with flat bands. As a first result, we show that taking into account the dependence of the effective fields on the twisted boundary conditions leads in fact to the generalized random phase approximation. Our second result is providing the mean-field grand potential as an explicit function of the one-particle density matrix. This allows us to derive the expression for the superfluid weight within the generalized random phase approximation in a transparent manner. Moreover, reformulating mean-field theory as a well-posed minimization problem in terms of the one-particle density matrix is a first step towards the application to superconducting systems of the linear scaling methods developed in the context of electronic structure theory.


I. INTRODUCTION
The most striking manifestation of superconductivity is the vanishing of electrical resistence below the critical temperature. Quantum mechanics predicts that in a perfectly crystalline solid the resistance is zero, at least at zero temperature where phononic excitations are frozen out, however this ideal situation is never realized in practice since impurities and defects in the crystal structure are present in any material. Impurities and crystalline defects, hereafter collectively called disorder, are responsible for the finite resistence in the zero temperature limit in normal metals. On the other hand, superconducting materials are perfect conductors below the critical temperature as if disorder vanished altogether. Nevertheless, disorder has a significant effect on many of the observable properties of superconductors and when increased above a certain threshold it drives a transition to the normal state, which can be metallic or even insulating [1][2][3][4][5][6]. The interplay between superconductivity and disorder is a vast research subject and many questions remain open [7][8][9].
The phenomenon of perfect conductivity and the electromagnetic response of supercon-ducting materials are well described by combining Maxwell's equations with the constitutive relation J = −D s A introduced by Heinz and Fritz London [10]. This is a peculiar relation since a gauge invariant quantity, the current density J, is proportional to the vector potential A, which is not gauge invariant. This is not wrong since the London constitutive relation is valid only in a specific gauge, the London gauge [10]. The coefficient D s is known as the superfluid weight and it has been argued that a nonzero superfluid weight is the defining property of a superconductor [11,12]. Indeed, beside dissipationless transport, the phenomenon that defines superconductivity is the Meissner effect, the expulsion of the magnetic field from the bulk of a superconductor. In the superconducting state, the magnetic field can only penetrate up to a characteristic length scale λ L away from the surface of the sample. This so-called London penetration depth is simply the superfluid weight expressed in different units since the two quantities are related by λ L = (µ 0 D s ) −1/2 , with µ 0 the magnetic constant. Moreover, the superfluid weight determines the critical temperature in a two-dimensional superfluid according to the Nelson-Kosterliz relation [13]. As explained below in Section V, the superfluid weight is obtained as a specific limit of the current-current response functions [11,12]. An alternative and equivalent definition is as the change of the grand potential (or free energy) under a twist of the boundary conditions [12,14,15] (see Eq. (50) below). Twisted boundary conditions can be introduced by means of a constant vector potential A in a torus geometry, which cannot be gauged away since it amounts to nonzero magnetic fluxes through the holes of the torus (see Section II).
In general, one has to resort to approximations since computing exactly either the currentcurrent response function or the grand potential of a quantum many-body system is an intractable task. The standard approximation used to describe superconducting materials is mean-field theory, known also as Bardeen-Cooper-Schrieffer (BCS) theory [16][17][18][19]. Within BCS theory, the quartic interaction term in the many-body Hamiltonian is replaced by a quadratic one (see Eq. (11) below), which depends on effective fields Γ i,j , ∆ i,j that have to be computed self-consistently. In particular, the pairing term in the mean-field Hamiltonian has the form 1 2 i,j ∆ i,jĉ † iĉ † j +∆ * i,jĉ jĉi and is essential for superconducting systems since the pairing potential ∆ i,j plays the role of the order parameter of the superconducting transition.
Typically, the mean-field approximation for D s is obtained by replacing the many-body HamiltonianĤ with the mean-field HamiltonianĤ m.f. in the current-current response function, which can then be evaluated using Wick's theorem [12,20]. This procedure is equivalent to taking the partial derivatives of the mean-field grand potential Ω m.f. (A, ∆, Γ) with respect to the constant vector potential A associated to twisted boundary conditions, more precisely D s,lm ∝ ∂ 2 Ω m.f. (A, ∆, Γ)/∂A l ∂A m | A=0 (in anisotropic systems the superfluid weight is a tensor in general). However, it is known that the current-current response function computed usingĤ m.f. breaks gauge invariance [12] and, as noted in recent works [21,22], it can give clearly unphysical results in the case of multiorbital lattice models, in particular when flat bands are present in the band structure.
The problem is that the effective fields Γ i,j , ∆ i,j are obtained by minimizing the meanfield grand potential for each separate value of A, which means that they are themselves functions of A, but this dependence is usually neglected [12,20,23]. In Ref. 21, it was shown how the full second derivative of Ω m.f. A, Γ(A), ∆(A) with respect to A can be evaluated in terms of the solution of the mean-field problem for A = 0 only. Here we go a step further than Ref. 21 since the first main result of this work is to show that computing the superfluid weight as the full second derivative of the mean-field grand potential (see Eq. (51) below) is in fact equivalent to another well-known gauge invariant approximation, the generalized random phase approximation for superconducting systems introduced by Anderson [24] and Rickayzen [25]. The expression for the superfluid weight in the generalized random phase approximation is given below in Eq. (66) and is a result not contained in previous works, in particular in Refs. 21, 22, where the importance of taking into account the A-dependence of the effective fields has been pointed out.
In order to derive the expression of the superfluid weight in the generalized random phase approximation (66), we start by reviewing the BCS mean-field theory of superconductivity in Section III, using as a starting point the variational principle at finite temperature. At finite temperature the variational principle is encoded in the Bogoliubov inequality (Eq. (10) below), from which it follows that the mean-field grand potential is an upper bound for the exact grand potential Ω(A) ≤ Ω m.f. A, Γ, ∆ (23). Thus the mean-field solution, that is the choice of the effective fields Γ i,j , ∆ i,j minimizing Ω m.f. , gives the best possible upper bound on the exact grand potential. The fact that mean-field theory is a variational approximation for the grand potential makes it clear that the latter is a better starting point for computing the superfluid weight rather than the current-current response function. This is important to justify the results of Section V. Whereas the results of Section III are standard, namely the self-consistency equations of BCS theory (18)- (20), their derivation is more rigorous than the one usually found in standard textbooks [18,26] and provides a better starting point for the derivation of the original results of this work, contained in Sections IV and V.
To derive the self-consistency equations of BCS theory in Section III, we find it convenient to perform a Legendre transform and switch from the effective fields j , ĉ jĉi as the variational parameters. These are the matrix elements of the one-particle density matrix P . In particular the anomalous (not numberconserving) expectation values, ĉ iĉj and ĉ † iĉ † j are essential to describe superconducting systems. In Section IV, we go a step further and obtain the mean-field grand potential as an explicit function of the one-particle density matrix (see Eqs. (41)-(42) below), which is a natural generalization of the functional used by Mermin to prove the extension of the Hohenberg-Kohn theorems to finite temperature [27]. Moreover, we identify precisely the class of one-particle density matrices within which the variational optimization is performed. In this way the mean-field problem is recast as a mathematically well-posed minimization problem in terms of the one-particle density matrix. This is the second original result of this work.
It turns out that the derivation of the expression for the superfluid weight in the generalized random phase approximation (66) is particularly transparent if the starting point is the mean-field potential Ω m.f. A, P (A) as a function of the one-particle density matrix, where P (A) is the density matrix which minimizes Ω m.f. (A, P ) for a given value of the constant vector potential A. This is the approach used in Section V. Equivalently, one can start from the grand potential expressed in the terms of the effective fields Ω m.f. A, Γ(A), ∆(A) , as done in Ref. 21, but then the derivation of (66) becomes rather cumbersome, which is probably the reason why the connection with the random phase approximation has not been unveiled in previous works. This illustrates the advantage of reformulating mean-field theory in terms of the one-particle density matrix.
The original results presented in Section IV are potentially important for another reason, namely they are a first step towards the application of linear scaling methods to superconducting systems. Linear scaling means that the computational resources (time and memory) required for the numerical solution of a problem scale linearly ∼ O(N) with the number of constituents N, for instance the lattice sites in a tight-binding model. The nu-merical solution of the mean-field problem generally scales as O(N 3 ) since it requires the full diagonalization of the mean-field Hamiltonian. This scaling severely limits the accessible system size [9]. This is a problem when studying disordered superconductors since it is generally necessary to consider rather large systems. The need to simulate large systems has been further exacerbated with the discovery of superconductivity in moiré heterostructures [28,29], such as magic-angle twisted bilayer graphene, which can contain thousands of atoms in a moiré unit cell. The naïve application of mean-field theory in this case is computationally very demanding and one has to introduce further approximations, such as restricting to a limited number of moiré minibands and/or renormalization procedures, which have been shown to lead to problems when evaluating important observables such as the superfluid weight [30][31][32]. Moreover, the simulation of moiré heterostructures in the presence of disorder is currently out of reach.
In the field of electronic structure theory, many different approaches have been developed to improve the scaling of density functional theory and empirical tight-binding calculations. An important class of methods, known as O(N)-or linear scaling methods [33,34], takes advantage of the fact that the density matrix is ranged, i.e., it decays with distance The basic approximation behind linear scaling methods is to neglect the matrix elements for large separation, for instance |r − r ′ | > R, and thus represent P as a sparse matrix. This approximation is justified only if the decay in (1) is sufficiently fast. Quantifying the exact decay behavior of the density matrix is a nontrivial problem that has received a lot of attention and definite results have been obtained. For instance, in insulators one has an exponential decay P (r, r ′ ) ∼ e −γ|r−r ′ | , where the decay rate γ can be related to the energy gap [33]. If the range R is fixed, the number of nonzero matrix elements of P scales linearly with the system size and the computational effort required to find the optimal value of these parameters scales linearly as well. From a numerical point of view, for achieving linear scaling it is necessary to use standard sparse matrix methods for matrix multiplication and for the solution of linear systems of equations. Linear scaling methods can take advantage of massively parallelized high performance computing and have been successfully applied to systems with millions of atoms, which are relevant for material science and biology. On the other hand, linear scaling methods have not been employed so far in the field of superconductivity. To this end, it is essential to reformulate standard BCS mean-field theory as a minimization problem in which the independent variable is the one-particle density matrix P , as done in Section IV. The essential new ingredient is the inclusion in the density matrix of the anomalous (not number-conserving) expectation values, ĉ iĉj and ĉ † iĉ † j , that are the trademark of BCS theory. In Section IV we also discuss why it should be possible to directly apply to superconductive systems the linear scaling methods that have been developed so far.
The structure of this work is as follows. In Section II, we introduce the class of lattice models we are interested in, we define the current density operator and explain how to represent twisted boundary conditions by means of a constant vector potential. The concept of gauge invariance is discussed as well. In Section III, we derive BCS mean-field theory starting from the Bogoliubov inequality to emphasize its origin as a variational approximation for the grand potential. In Section IV, we express the mean-field grand potential as an explicit function of the one-particle density matrix: the main result is given by (41)- (42). We conclude the section by arguing that, based on our results, the application of linear scaling method to superconducting systems should be rather straightforward. In Section V, we compute the superfluid weight by taking the full second derivative of the mean-field grand potential with respect to the constant vector potential and show that this leads to a gauge invariant result (66), which we identify with the generalized random phase approximation. This is compared in Appendix A to the standard mean-field formula for the superfluid weight found in the literature. In Section VI, we summarize our results and discuss possible directions for future work.

II. BASIC DEFINITIONS
Here we consider a generic tight-binding Hamiltonian with a static interaction term. More precisely, the non-interacting part of the Hamiltonian takes the form whereĉ i andĉ † i are the usual fermionic fields operators and K is the hopping matrix collecting all the hopping matrix elements (K i,j for i = j) and the on-site potential (K i,i ). Here i, j are collective indices labeling all the degrees of freedom in the model (spatial, orbital, spin, . . . ). We adopt the convention that symbols with a hat, such asĤ free andĉ i , denote operators in the many-body Fock space, while operators on the single-particle Hilbert space, such as the hopping matrix K, do not have hats.
The interaction term is quadratic in the occupation number operatorsn i =ĉ † iĉ î The factor 1/2 in the above equation eliminates double-counting since we require the interaction potential to be symmetric V i,j = V j,i and we also set V i,i = 0. One may consider other types of interaction terms, and we leave to the interested reader the task of extending our arguments to more general settings. The full many-body Hamiltonian is then the sum of (2) and (3)Ĥ =Ĥ free +Ĥ int .
Linear scaling methods take advantage of the locality of the hopping matrix (2) and of the density matrix (1). Locality entails some notion of distance between different degrees of freedom, which is introduced by embedding the tight-binding model in space. In the case of infinitely extended systems or a finite system with open boundary conditions, a position vector r j is assigned to the degree of freedom labeled by j and the distance d(i, j) = |r i,j | is the length of the displacement vector r i,j = r i − r j . The position vectors take values in a periodic lattice L, which can be a composite (non-Bravais) lattice as shown in Figure 1. In the following we find convenient to use the word "site"as an alternative to "degrees of freedom,"but it is understood that site i and site j may by assigned to the same lattice point r i = r j . Physically, this is the case when the two sites are in fact different electronic orbitals on the same atom or the two different spin states of the same atomic orbital. For simplicity, the geometric arrangement of the sites is given by a periodic lattice. However the Hamiltonian is not necessary periodic, in particular disorder is usually introduced as a random perturbation of the matrix K.
In the case of periodic boundary conditions, the points of the lattice L are identified if they differ by integer linear combinations of two Bravais lattice vectors R 1 and R 2 (we consider only two-dimensional systems here), which define the size and shape of the finite system, called also the the "supercell."An equivalence class [r j ] obtained under this identification, that is a point on a torus, is attached to each site j. In the case of periodic boundary conditions, the distance d(i, j) and the displacement vectors r i,j are defined in the caption of Figure 1. We provide a precise definition of the displacement vectors in Figure 1 since they are essential to introduce twisted boundary condition and the current density operator, see below.
In realistic cases, the hopping matrix elements between atomic orbitals decrease exponentially with distance, namely K i,j ∼ e −αd(i,j) for some α > 0, therefore it is practical to consider only hopping matrices with finite range R, that is It is assumed in the following that the hopping matrix has a range of the order of a few lattice constants and, in the case of periodic boundary conditions, the range is much smaller than the linear size of the supercell R ≪ |R 1 |, |R 2 |.
Using the continuity equation for the occupation number operatorsn j , one identifies the microscopic current operator whose expectation value gives the particle current flowing from site j to site i. We set = 1 from now on. One is mainly interested in the Fourier transform of the current density, which in the long-wavelength limit (q → 0) readŝ For periodic boundary conditions, A = |R 1 × R 2 | is the area of the supercell (for dimension d = 2) and the wavevector q belongs to the reciprocal lattice of the Bravais lattice generated by R 1 , R 2 , meaning that q · R i = 2πm i with m i an integer. The conservation of current is a consequence of the gauge symmetry of electromagnetism, which in a discrete lattice model consists in the invariance under unitary transformationŝ U g acting on the fields operators asÛ where θ j are arbitrary site-dependent phases. Some operators change form under gauge transformations (Ĵ(q),Ĥ free ), while others do not (n j ,Ĥ int ). Observable quantities (expectation values) are always gauge invariant. Consider the following modified noninteracting HamiltonianĤ The Peierls phase e iqA·r i,j accounts for the presence of a constant vector potential A.
The particle charge is set to q = 1 from now on. In the case of an infinitely extended system or a system with open boundary conditions, a constant vector potential is a pure gauge, which does not affect the observable properties in any way. Indeed, since r i,j = r i − r j , the Hamiltonian (8) is obtained by performing a gauge transformation of . Denote by [r] = r + m 1 R 1 + m 2 R 2 m 1 , m 2 ∈ Z the equivalence classes obtained by this identification. In the case of periodic boundary conditions, an equivalence class [r i ] is attached to each site i and the distance between the sites is defined as d(i, j) = min |r| r ∈ [r i − r j ] . This is known as the flat torus distance. The displacement vector going from site j to site i is the vector r i,j ∈ [r i − r j ] with the property that d(i, j) = |r i,j |, if this vector is unique. If there are multiple vectors with this property, the displacement vector is not defined. If d(i, j) ≪ |R 1 |, |R 2 | the displacement vector r i,j is always defined. As shown in the figure, the displacement vector from site 1 to site 2 is On the other hand, there is in general no analogous gauge transformation in the case of periodic boundary conditions. The reason is that even a constant vector potential amounts to nonzero magnetic fluxes through the holes of the torus in which the lattice is embedded: these fluxes are given by Φ i = A · R i with i = 1, 2. The magnetic flux in lattice models is defined up to integer multiples of the magnetic flux quantum φ 0 = 2π /q (= 2π in our units) and is invariant under gauge transformations. It follows that two Hamiltonianŝ H free (A) andĤ free (A ′ ) can be mapped into each other by a gauge transformation if and only if the respective magnetic fluxes differ by an integer multiple of the magnetic flux quantum: and m i arbitrary integers. Using a gauge transformation, it can be shown that a constant vector potential is equivalent to twisted boundary conditions parametrized by the two magnetic fluxes Φ i=1,2 [15].
For our purposes, it is useful to note that the average (q = 0) current density operator for nonzero A is given by the relation This definition coincides with (6) for A = 0.

III. MEAN-FIELD THEORY FOR SUPERCONDUCTING SYSTEMS FROM THE BOGOLIUBOV INEQUALITY
The idea of mean-field theory is to approximate an interacting system by means of an effective noninteracting one. At zero temperature, this means finding among the wavefunctions that are represented as a single Slater determinant the one that minimizes the expectation value Ĥ of the many-body Hamiltonian (4). Indeed, mean-field theory is an application of the variational principle of quantum mechanics [26]. In the case of Hartree-Fock mean-field theory, the search is restricted to the space of wavefunctions with fixed particle number. The key contribution of Bardeen, Cooper and Schrieffer [16][17][18][19] was to understand that mean-field theory can be successfully applied to superconducting systems, if one relaxes the condition of fixed particle number. As a consequence, anomalous expectation values of the form ĉ iĉj and ĉ † iĉ † j can be nonzero and play the role of the order parameter of the superconductive state.
In the following, we derive the self-consistency equations of BCS mean-field theory (18)-(20) using the Bogoliubov inequality as a starting point. Beside the advantage of providing a self-contained and pedagogical exposition, there are two main reasons for using this approach: i) it emphasizes the fact that BCS theory is based on a rigorous variational principle, even at finite temperature. This is essential to justify the result for the superfluid weight presented in Sec. V. ii) It allows to recast the mean-field problem as the minimization of a suitable function of the one-particle density matrix, as detailed in Sec. IV. Our derivation is similar to the one presented in Ref. 22, but is more general since we consider a wider class of interaction terms and no unnecessary assumptions on the symmetries of the system are made (for instance time-reversal symmetry is not required).
Consider a generic Hamiltonian which is the sum of two partsĤ =Ĥ 0 +λĤ 1 , where λ is a parameter. It is proven in Ref. 35 Using the concavity property, it is easy to obtain the Bogoliubov inequality [35][36][37] where Tr e −βĤ (0) and the expectation value on the right hand side is taken with respect to the HamiltonianĤ 0 , that is · = Tr · e −βĤ 0 /Z 0 . The left hand side of the inequality is the free energy of the many-body system which we would like to approximate, the right hand side is the variational mean-field free energy which needs to be minimized to obtain the best possible approximation. This is similar to the zero temperature case, in which the optimization is carried out over all possible wavefunctions of noninteracting fermions. According to the variational principle, the best wavefunction is the one with the lowest energy since any energy expectation value taken on the manybody Hamiltonian is bound to be larger than the exact ground state energy. In fact, the variational principle can be recovered from (10): in the zero temperature limit the free energy becomes the ground state energy F → E GS and F 0 → E GS,0 = Ĥ 0 = Ψ GS,0 |Ĥ 0 |Ψ GS,0 where |Ψ GS,0 is the ground state ofĤ 0 (a nondegenerate ground state is assumed), thus one obtains E GS ≤ Ĥ = Ψ GS,0 |Ĥ|Ψ GS,0 . SinceĤ 0 can be an arbitrary Hamiltonian, the wavefunction |Ψ GS,0 is also arbitrary and one recovers the variational principle. This shows that the Bogoliubov inequality is the right tool to generalize the variational principle to the finite temperature case. We apply the Bogoliubov inequality withĤ =Ĥ free +Ĥ int −µN , where the noninteracting partĤ free is given by (2) or (8), the interaction termĤ int by (3) and we have added a chemical potential term −µN (N = in i is the total particle number operator). The chemical potential is necessary to control the particle density in the case of Hamiltonians that do not conserve the particle number. This means that we are working in the grand canonical ensemble and the free energies F, F 0 appearing in (10) are in fact grand potentials Ω, Ω 0 and will be referred to as such from now on.
For the variational HamiltonianĤ 0 we take [38] H The coefficients Γ i,j and ∆ i,j are variational parameters with the only constraints Γ * i,j = Γ j,i , sinceĤ 0 must be Hermitian, and ∆ i,j = −∆ j,i because of the fermionic anticommutation relations {ĉ † i ,ĉ † j } = {ĉ i ,ĉ j } = 0. The last term on the right hand side of (11) is simply the most general quadratic Hamiltonian, also including terms of the formĉ † iĉ † j andĉ jĉi that break particle number conservation, which are crucial in superconducting systems. It is convenient to separate inĤ 0 the quadratic termĤ free − µN that appears also in the full many-body Hamiltonian. In this way the coefficients Γ i,j and ∆ i,j can be interpreted as effective fields that describe the averaged effect of interactions: Γ i,j is the Hartree-Fock potential and ∆ i,j is known as the pairing (or pair) potential [39].
Since the variational HamiltonianĤ 0 is quadratic, the expectation value Ĥ −Ĥ 0 in (10) can be evaluated using Wick's theorem [40] Ĥ The minimization of the right hand size of (10) is most easily performed not with respect to the effective fields, but rather with respect to their conjugate variables, that is the expectation values of quadratic operators appearing in (12), which are obtained as derivatives of the grand potential Ω 0 = −β −1 ln Tr e −βĤ 0 (denoted by F 0 in Eq. (10)) associated to the as one can show using (A1) in Appendix A. Assuming that the mapping from the effective fields to the expectation values Γ i,j , j , ĉ jĉi , as given by (13)- (14), can be inverted, one can perform the Legendre transform of the grand potential Ω 0 , which is The Legendre transform Ω 0 is a function of the expectation values ĉ † iĉ j , ĉ † iĉ † j , ĉ jĉi , which are the new independent variables, while the effective fields are now dependent variables obtained as the derivatives of Ω 0 We have now reduced the problem to minimizing the function Ω m.f. ĉ † iĉ j , ĉ † iĉ † j , ĉ jĉi on the right hand side of (10) (the mean-field grand potential), given by By setting the partial derivatives of Ω m.f. to zero and using (16) one obtains These are known as the self-consistency equations of BCS mean-field theory.
It is convenient at this point to define the mean-field HamiltonianĤ m.f. aŝ where, as a reminder, the expectation value is taken with respect to the density matrix ρ 0 = e −βĤ 0 /Z 0 . On the other hand,Ĥ m.f. andĤ 0 differ by a constant, therefore all averages taken with respect to the statistical ensemble defined byĤ m.f. are the same as the averages taken with respect toĤ 0 . In other words, the two Hamiltonians give the same density matrixρ 0 = e −βĤ m.f. / Tr e −βĤ m.f. . For this reason, we do not introduce any new notation to indicate the averages taken with respect toĤ m.f. . The mean-field Hamiltonian has the property that Ĥ = Ĥ m.f. .
Using the mean-field Hamiltonian (21), the Bogoliubov inequality (10) can be rewritten as with Ω m.f. the mean-field grand potential introduced in (17). These last two equations provide a better understanding of the finite temperature variational principle: the variational mean-field HamiltonianĤ m.f. is optimized so as to minimize the grand potential Ω m.f. with the only constraint being that the expectation value of the many-body HamiltonianĤ computed with respect to the statistical ensemble defined byĤ m.f. is the same as the expectation value Ĥ m.f. , which is computed with the same ensemble.

IV. MERMIN-GIBBS FUNCTIONAL FOR SUPERCONDUCTING SYSTEMS AND LINEAR SCALING METHODS
In the previous section, mean-field theory for superconductive systems has been formulated as the minimization of the mean-field grand potential Ω m.f. (17), which, after a Legendre transform, becomes a function of the expectation values ĉ † iĉ j , ĉ † iĉ † j , ĉ jĉi . These expectation values replace the effective fields Γ i,j and ∆ i,j as the variational parameters. This change of variables has been useful to derive the self-consistency equations in a straightforward manner. The goal of this section is to explicitly write down the mean-field grand potential as a function of the expectation values ĉ † iĉ j , ĉ † iĉ † j , ĉ jĉi and identify the domain of this function. This is the first main result of this work and it is important in order to have a mathematically well-defined minimization problem, which is the starting point for the application of linear scaling methods. We start by reviewing the Nambu space formalism.
The idea of Nambu was to write the variational HamiltonianĤ 0 (11) in the "row multiplies matrix multiplies column"format [18,38] where K, ∆ and Γ are square matrices with matrix elements K i,j , ∆ i,j and Γ i,j , respectively, µ is the chemical potential andĉ (ĉ † ) is the column (row) vector of annihilation (creation) operatorsĉ i (ĉ † i ). We indicate with A T the transpose of the matrix A and with v T the row vector which is the transpose of the column vector v.
The single-particle Hamiltonian appearing in (24) is known as the Bogoliubov-de Gennes (BdG) Hamiltonian [39] and in general it has no constraints other than h = K − µ + Γ = h † and ∆ = −∆ T . The BdG Hamiltonian acts on an enlarged Hilbert space called the particle-hole space, or alternatively the Nambu space. This space is the tensor product of the original single-particle Hilbert space with the socalled particle-hole degree of freedom. The block structure of the BdG Hamiltonian can be conveniently summarized by the condition together with H 0 = H † 0 . Here Σ x is a unitary matrix that acts as the σ x Pauli matrix on the particle-hole degree of freedom.
The BdG Hamiltonian can be diagonalized by a unitary transformation G that respects the block structure encoded by (26) [38]. Specifically, we have H 0 = G † DG with We use the unitary transformation G to introduce new field operatorsd i as and from (24) we obtain From this last result, we see that the eigenvalues E i of H 0 can be interpreted as excitation energies of fermionic quasiparticles on top of the mean-field ground state. Using (30), the grand potential Ω 0 becomes On the other hand, it is not difficult to prove the identity Combining this last result with (31) leads to In this way the grand potential Ω 0 has been conveniently expressed in terms of the singleparticle BdG Hamiltonian H 0 . As a consequence, (13), (14) and (33) together give Thus, it is clearly possible to express these expectation values in terms of the matrix elements of the single-particle operator P = (e βH 0 + 1) −1 . The eigenvalues p i of P are the occupation numbers of the eigenstates of the BdG Hamiltonian, which in the fermionic case satisfy the inequalities 0 ≤ p i ≤ 1. One can equivalently express this condition as the matrix inequalities 0 ≤ P ≤ 1, meaning that 0 ≤ ψ|P |ψ ≤ 1 for any normalized single-particle state |ψ . Moreover, P is constrained by the particle-hole symmetry of H 0 (26), which leads to Together with P = P † this implies P = P 11 P 12 P 21 P 22 = P 11 P 12 −P * 12 1 − P T
From (34), (35) and (37) one obtains the useful relations Moreover, the relation gives explicitly the mapping ĉ † iĉ j , ĉ † iĉ † j , ĉ jĉi → Γ i,j , ∆ i,j , ∆ * i,j , which has been used to perform the Legendre transform of the grand potential Ω 0 in (15). Using this last result one can easily show that the mapping is one-to-one, therefore the Legendre transform is well-defined.
In order to obtain the Legendre transform Ω 0 (P ) as an explicit function of P , we write the second term on the right hand side of (15) as Here, we have identified the (von Neumann) entropy ] of a gas of noninteracting fermionic (quasi-)particles (note the factor 1/2, which is absent in the usual definition of the von Neumann entropy). Thus minimizing Ω 0 is the same as maximizing the entropy with the constraint that the expectation value of the energy (the second term in (41)) is fixed. Indeed, the inverse temperature β = (k B T ) −1 is the Lagrange multiplier corresponding to the energy constraint. In the presence of interactions, the function to be minimized is (see (17)) where V (P ) = Ĥ int is the quadratic form defined as the second term in the first line. The minimization of Ω m.f. (P ) has to be carried out within the domain of single-particle operators P which satisfy the conditions 0 ≤ P ≤ 1 and the symmetry constraint (36). At zero temperature (β = +∞) the entropy is zero, namely the eigenvalues of P can be only zero or one, therefore one has the idempotency constraint P 2 = P on the one-particle density matrix. From (41) and (42) it is clear that gauge invariance is respected by the mean-field approximation. Indeed, under a gauge transformation (7) a single-particle operator O, corresponding to a quadratic many-body operatorÔ = i,jĉ † i O i,jĉj , transforms as It is then evident that both Ω 0 and Ω m.f. are invariant under the transformations The explicit form of the mean-field grand potential Ω m.f. (P ) as a function of the singleparticle density matrix P as given by (41) and (42) is one of the main results of this work and serves as the starting point for the application of linear scaling methods for superconducting systems. Here, we have carried out the derivation in the case of a BdG Hamiltonian of the form in (25), which is the most general structure that a BdG Hamiltonian can have. It is straightforward to specialize (41)- (42) to systems with additional symmetries, such as timereversal symmetry, which belong to different classes within the topological classification of noninteracting fermionic Hamiltonians [41,42].
The result in (41)-(42) is aesthetically appealing since it takes the same form as the Mermin-Gibbs functional for a system of noninteracting fermions [27,43,44] with the only difference that the single-particle density matrix P includes the anomalous expectation values ĉ † iĉ † j , ĉ jĉi in the off-diagonal blocks (38). Linear scaling method are generally implemented at zero temperature, in which case the density matrix satisfies the idempotency condition P 2 = P . Ensuring that the minimization of the total energy is performed within the space of idempotent density matrices is the main technical difficulty and a number of ingenious techniques have been developed to address this problem [33,34]. In the case of superconducting systems one would also need to enforce the additional constraint given by particle-hole symmetry (36). However, this is a much simpler constraint to deal with compared to idempotency since it simply reflects a redundancy in the matrix elements of the density matrix. For this reason, we expect that all the linear scaling methods developed so far for normal (non-superconducting) system can be readily implemented and used for superconducting systems as well. Linear scaling methods have been employed also at nonzero temperature in a few cases [43] and again we do not see any fundamental difficulty in doing the same for superconducting systems. Being able to work at finite temperature is obviously important in the context of superconductivity.
The key fact behind the success of linear scaling methods is that the one-particle density matrix is ranged (1). There are no available results regarding the decay of the density matrix with distance in the case of superconducting systems. However, given that the BdG Hamiltonian H 0 is generally gapped, it is reasonable to expect an exponential decay as in insulators. The important problem of the localization properties of the density matrix in superconducting systems will be the subject of future studies. We note in passing that superconductivity is the manifestation of off-diagonal long range-order in the two-particle density matrix. In the case of a conventional s-wave superconductor this means ĉ for |r i − r j | → ∞ (note that we have introduced the spin here). Thus the anomalous expectation values keep track of this off-diagonal long-range order, while there is no off-diagonal long-range order in the one-particle density matrix. Any form of off-diagonal long-range order in P would hinder the application of linear scaling methods, which are based on the approximation of setting the matrix elements P i,j to zero, if d(i, j) > R for a given range R.

V. SUPERFLUID WEIGHT AND GENERALIZED RANDOM PHASE APPROXIMATION
In this Section, we present one of the main results of this work, namely we show that taking the full derivatives of the mean-field grand potential with respect to the constant vector potential A, which implements twisted boundary conditions (see Section II), leads to the superfluid weight in the generalized random phase approximation. The derivation is based on the results of the previous Section IV, in particular the explicit expression for the mean-field grand potential as a function of the one-particle density matrix (41)- (42). We first explain how the superfluid weight can be calculated in different but equivalent ways.
The superfluid weight is defined in terms of a response function χ lm (q, q ′ , ω), that of the current induced in the system to linear order in the vector potential [18,45] Here J(r, t) = q dω 2π J(q, ω)e i(q·r−ωt) is the current density with J(q, ω) its Fourier transform, and similarly A(q, ω) is the Fourier transform of the vector potential A(r, t). We do not assume translational invariance here, therefore the response function depends on two wavevectors (q, q ′ ). The response function is the sum of two parts χ lm = χ d lm + χ p lm , the diamagnetic response χ d lm and the paramagnetic response χ p lm , given by (in the long-wavelength limit) In the above equationsĴ l (q, t) = e iĤtĴ l (q)e −iĤt is the l = x, y component of the current operator (6) in the Heisenberg picture, while we denote with [r i,j ] l the l component of the displacement vector r i,j , which is defined in the caption of Figure 1. In (46) and (47) the expectation values are taken with respect to the many-body Hamiltonian, that is · = Tr · e −βĤ /Z, contrary to the convention of Sections III-IV. The superfluid weight is defined as the following limit of the response function where the wavevector q ′ = q ′ m + q ′ ⊥ is decomposed into the collinear (q ′ ) and perpendicular (q ′ ⊥ ) components with respect to the m = x, y axis (m is the second index appearing in both D s,lm and χ lm andm is the unit vector along the same direction). As a consequence of gauge invariance, the alternative limit of the response function is zero [12,18] lim q ′ →0 The fact that these two limits do not commute is a consequence of the presence of long-range correlations in the current-current response function [12]. An equivalent definition of the superfluid weight is as the second derivative of the grand potential with respect to the fluxes Φ i parametrizing twisted boundary conditions [14]. As discussed in Sec. II, twisted boundary conditions are equivalent to a constant vector potential A that cannot be gauged away for periodic boundary conditions, thus we have where Ω(A) = −β −1 ln Tr e −βĤ(A) ,Ĥ(A) =Ĥ free (A) +Ĥ int andĤ free (A) is given by (8). It is understood in the above definition that Ω(A) is the grand potential of a finite system with area A and that the thermodynamic limit (A → ∞) is performed after the derivative is taken. The equivalence of the two definitions (48) and (50) is discussed for instance in Refs. 12 and 46. There is ambiguity in the literature as whether one should use the free energy (fixed particle number) or the grand potential (fixed chemical potential) in (50). The two definitions usually lead to the same result, but this is not guaranteed in general. For simplicity, we choose to use the grand potential here and refer to Ref. 21 for an in depth discussion regarding this issue. Evaluating the exact response function (45)-(47) is a very difficult task, therefore one has to resort to approximations such as the mean-field approximation presented in Section III. The two equivalent definitions (48) and (50) suggest two possible strategies for evaluating the superfluid weight within the mean-field approximation. The first one is to replace the manybody HamiltonianĤ with the mean-field HamiltonianĤ m.f. when evaluating the expectation values in (46)-(47). This approach (or equivalent ones) has been used in Ref. 12 (see Sec. III in this reference) and Refs. 20, 23, but has the disadvantage that the response function does not respect gauge invariance since the gauge constraint (49) is not satisfied [12]. Moreover, the procedure of replacing the many-body Hamiltonian with the mean-field Hamiltonian in the response function is a rather uncontrolled approximation since the latter Hamiltonian is simply a collection of variational parameters, as explained in Section III, and there is no a priori reason to use it to compute response functions. Given that mean-field theory can be formulated as a variational approximation for the grand potential, a more justified approach consists in replacing the exact grand potential Ω(A) in (50) with its mean-field approximation Ω m.f. (A). The second main result of this work is to show that this approach leads to a well-known gauge-invariant approximation, the generalized random phase approximation.
The effective fields Γ and ∆ are chosen so as to minimize Ω m.f. (A, Γ, ∆) for a given value of A, therefore they become themselves functions of the vector potential. This means that the mean-field grand potential depends on A either directly through the noninteracting HamiltonianĤ free (A) in (11) or indirectly through the effective fields. Using (50), we have that an approximation for the superfluid weight is given by In this equation and in the following, we denote with d/dA l the full derivative with respect to A, including both the direct and the indirect dependence, while the partial derivative ∂Ω m.f. A, Γ(A), ∆(A) ∂A l denotes the derivative with respect to the first argument only (direct dependence). We note that replacing the full derivative d/dA l with the partial derivative ∂/∂A l in (51), as done for instance in Ref. 23, is the same as using the response functions (46)- (47) computed with the mean-field Hamiltonian (see Appendix A). In two recent works [21,22] it has been pointed out that it is important to take into account the dependence of the effective fields on the vector potential A when using (51), otherwise one may obtain unphysical results in the case of multiorbital lattices, in particular when flat bands are present. We mention in passing that in Ref. 21 an alternative approach based on a modified form of linear response theory and equivalent to (51) has been proposed. In Ref. 21 it has been shown that it is possible evaluate (51) by solving the mean-field problem for A = 0 only. The basic idea is the following: the variational parameters Γ(A) and ∆(A) are chosen so as to minimize the mean-field grand potential, as a consequence Taking the total derivative with respect to A of these equations leads to implicit equations for the derivatives ∂Γ(A)/∂A and ∂∆(A)/∂A , which require only the knowledge of correlation functions at A = 0. Using this approach and the results of Ref. 21 as a starting point it is possible to arrive to our final result (66) after some cumbersome algebra. We do not present the details of the derivation here since it is simpler to use the same idea in the case where the mean-field grand potential is a function of the one-particle density matrix P (A), namely Ω m.f. (A, P (A)) in (51), as shown in the following.
To simplify the notation we use a variation of the Einstein summation convention when we take derivatives with respect to the matrix elements of P . For instance we have The factor 1/2 in the last two terms takes into account the fact that the derivative with respect to [P 12 ] i,j appears twice in the summation since [P 12 ] i,j = −[P 12 ] j,i (see (37) and (38) Different subscripts a, b, c, . . . added to P as in the definition (53) are used to keep track of multiple partial derivatives with respect to the matrix elements of the one-particle density matrix.
The first full derivative of the mean-field grand potential is simply the average current density In the second equality we have used the fact that the one-particle density matrix P (A) minimizes the mean-field grand potential for each value of A, thus the collection of first derivatives with respect to the matrix elements of P vanishes This equation is the same as (52) but with the effective fields replaced by the one-particle density matrix. The third equality in (54) is a consequence of the fact that in (41)-(42) the only term that gives a direct dependence on A is Tr[K(A)P 11 ] = Ĥ free (A) , while the last equality comes from (9). The result in (54) is valid for arbitrary A, therefore, for the second full derivative of the grand potential we have To proceed we need to express the partial derivative ∂P (A)/∂A m in terms of quantities evaluated at A = 0. To this end, we use the method proposed in Ref. 21 and take the full derivative of both sides of (55) (compare with the discussion around (52)) The first term on the right hand side is easy to evaluate since On the other hand, we have from (42) It is straightforward to evaluate the second term on the right from the definition of the quadratic form V (P ). Because of (16), the first term is identified with the Jacobian matrix of the mapping ĉ where in the matrix on the right hand side each entry stands for the collection of derivatives labelled by all possible site indices. Here and in the following, we denote by ∆ a = (Γ i,j , ∆ i,j , ∆ * i,j ) the collection of all independent effective fields, and we use a similar notation as in (53) Since the inverse mapping Γ i,j , j , ĉ jĉi exists, we denote its Jacobian, the inverse of the matrix in (61), as ∂P a ∂∆ b and we have in our notation where δ c a is the Kronecker delta. It is also clear from (13)- (14) that Note that here we denote with ∂ 2 Ω 0 ∂P b ∂P a −1 the inverse of ∂ 2 Ω 0 ∂P b ∂P a as a matrix and not the inverse of a single matrix element. The relation in (64) is useful since ∂ 2 Ω 0 ∂∆ b ∂∆ a is a collection of correlation functions similar to the one found in (47) with the crucial difference that they are evaluated on the statistical ensemble given by the meanfield HamiltonianĤ m.f. . These objects are easy to compute and their explicit expression is provided in Appendix A. From (57), (60) and (64), we can now express the derivative of the density matrix as Inserting this into (56) gives our main result for the superfluid weight In the last equality we have identified the diamagnetic response function (46), with the only difference that here the expectation value is evaluated on the mean-field statistical ensemble. On the other hand, the remaining term is not simply the mean-field version of the paramagnetic response (47). Infact, the usual mean-field paramagnetic response is obtained by setting ∂ 2 V /∂P b ∂P a = 0 in the above result, as explained in Appendix A. This corresponds to the response function computed in Ref. 12 and 20 for instance. On the other hand, the response function χ p,GRPA lm has the same structure as the well-known result for the density-density response function within the random phase approximation [45] χ RPA (q, ω) = 1 where χ −1 0 (q, ω) is the density-density response function in the absence of interactions (the Lindhard response function [45]) and v(q) the Fourier transform of the interaction potential (translational invariance has been assumed). Indeed, the superfluid weight within the usual (not generalized) random phase approximation is obtained from (67) by retaining only the "Hartree term"in the quadratic form in (42), namely by taking V (P ) = 1 2 i,j V i,j n i n j . Retaining also the "Fock"or "exchange term", that is if jĉ i , leads to the time-dependent Hartree-Fock approximation, which is discussed in Ref. 45. The time-dependent Hartree-Fock approximation is the next step beyond the time-dependent Hartree approximation, which is an alternative name for the standard random phase approximation (68). By extension, one may call (67) the time-dependent Hartree-Fock-Bogoliubov approximation since all terms obtained from the application of Wick's theorem in (12) are retained in V (P ), including the "pairing"or "Bogoliubov term" ĉ † iĉ † j ĉ jĉi . The generalization of the random phase approximation for superconductors was originally developed originally by Anderson [24] and Rickayzen [25] (see also Ref. 18), however the result for the superfluid weight in the simple form of (66) has never been presented before. Most importantly, it was not appreciated before our work that the generalized random phase approximation can be obtained by simply replacing the exact grand potential with the mean-field one in (50) and taking into account the A-dependence of the effective fields (see (51)). We have argued that this procedure is more theoretically sound since mean-field theory is a variational approximation for the grand potential, as explained in Sec. III, and gauge invariance is preserved. Indeed, one of the main reasons for introducing the generalized random phase approximation is to cure the problem of the loss of gauge invariance occurring when the superfluid weight and other observables are evaluated by replacing the exact Hamiltonian with the mean-field one in the correlation functions (46) and (47).
We expect our gauge invariant result (66) to be useful to deal with the problems pointed out in Refs. 22 and 21 regarding the evaluation of the superfluid weight in multiorbital lattices. It would also be interesting to investigate whether (66) can be implemented numerically in a way which scales linearly with the system size. This would be the case if it turns out that the matrix of correlation functions ∂ 2 Ω 0 ∂∆ b ∂∆ a , or better its inverse − ∂ 2 Ω 0 ∂P b ∂P a , can be approximately represented by a sparse matrix. To this end, it is important to estimate the decay behavior with distance of the elements of these matrices, as in the case of the one-particle density matrix (1), for which little is known in the case of superconductive systems. We leave the answers to these interesting questions for the future.

VI. CONCLUSION AND PERSPECTIVES
In this work, we have begun exploring the advantages of reformulating mean-field theory for superconducting systems in terms of the one-particle density matrix. There is the promise of significant computational advantages since (41)-(42) provide the starting point for applying to superconducting systems the linear scaling methods developed in the context of electronic structure theory [33,34]. Whether this approach will allow for the simulation of large supercells of disordered superconductors, larger than it is currently possible, ultimately depends on how fast the off-diagonal matrix elements of the density matrix decay with distance. As mentioned before, further work in this direction is required in the case of superconducting systems. The application of linear scaling methods to superconductors with many atoms in the unit cell, such as the recently discovered twisted bilayer graphene and other moiré materials, is another important motivation for the present work. As discussed in Section IV, it should be rather straightforward to adapt to superconducting systems the linear scaling algorithms that are currently available.
By writing the mean-field grand potential as an explicit function of the density matrix, we have also elucidated the relation between two popular methods used to study superconducting systems: the mean-field approximation and the generalized random phase approximation, which is usually considered a beyond mean-field approximation. Indeed, we have calculated the superfluid weight as the full second derivative of the grand potential and found a general and gauge invariant result (66), which differs from the standard mean-field result used in the literature (A4) (with the exception of Refs. 21 and 22). The difference is in the paramagnetic current-current response function (67), which has the form typical of the random phase approximation (68). This is an example of how formulating mean-field theory in terms of the density matrix can be advantageous for analytical calculations since our derivation of (66) is very transparent. Our result could be useful for solving the problems encountered when evaluating the superfluid weight in multiorbital systems, in particular when flat bands are present, as pointed out in Refs. 21 and 22.
From a numerical point of view it is an interesting question whether (66) can be used to compute the superfluid weight with a computational cost that scales linearly with the system size. This depends on whether the Jacobian matrix (61) has matrix elements that decay rapidly with distance, in the same way as the density matrix (1), another interesting and important question for the future. Finally, it would be also important to extend some of our results to include more realistic interactions, such as the retarded electron-phonon interaction, which is responsible for the formation of Cooper pairs in many superconducting materials. To this end, it could be interesting to combine, if possible, our formulation with Eliashberg theory, used for superconductors with strong electron-phonon coupling [47].
At a general level we hope that our work will stimulate the adoption of density matrixbased methods in the context of superconductivity.