Resta-like preconditioning for self-consistent field iterations in the linearized augmented planewave method

Convergence in self-consistent-field cycles can be a major computational bottleneck of density-functional theory calculations. We propose a Resta-like preconditioning method for full-potential all-electron calculations in the linearized augmented planewave (LAPW) method to smoothly converge to self-consistency. We implemented this preconditioner in the exciting code and apply it to the two semiconducting systems of MoS$_2$ slabs and P-rich GaP(100) surfaces as well as the metallic system Au(111), containing a sufficiently large amount of vacuum. Our calculations demonstrate that the implemented scheme performs reliably as well as more efficiently regardless of system size, suppressing long-range charge sloshing. While the suitability of this preconditioning higher for semiconducting systems, the convergence for metals is only slightly decreased and thus still trustworthy to apply. Furthermore, a mixing algorithm with the preconditioner shows an improvement over that with the Kerker preconditioner for the investigated semiconducting systems.


INTRODUCTION
Kohn-Sham density-functional theory (DFT) has become one of the standard ab initio approaches to predict electronic properties of materials [1,2]. A major challenge in determining the ground-state charge density arises from the nonlinearity of the Kohn-Sham equations. This obstacle can be solved by considering a fixed-point iteration technique, also referred to as the self-consistentfield (SCF) iteration. During the SCF cycle, the Kohn-Sham potential, v in KS , is calculated using a given charge density ρ in subsequently one can solve the Kohn-Sham equations. From this, a new density ρ out is generated, serving as the input for the next SCF iteration. This iterative way is terminated when self-consistency to a given convergence threshold is achieved.
A tactical approach for efficient SCF convergence is mixing ρ in i and ρ out i of previous steps to construct a new density ρ in i+1 in the next step. In the last decades, various mixing methods and their variations have been reported for electronic-structure theory [3][4][5][6][7][8][9][10][11][12][13][14][15][16], and they have proven to be successful in many systems. However, one often faces poor convergence -and hence large computational costs -due to fundamental challenges. A common problem is the charge sloshing instability that induces oscillations of the charge density during the SCF iteration [17][18][19][20][21][22][23][24][25][26][27]. This issue occurs when a unit cell becomes large, and it is generally more pronounced in metallic systems. One of the major sources of the charge sloshing is an element of G −2 in the Hartree potential at long-wavelengths.
Different methods exist to overcome this shortcoming [8,9,[18][19][20][21][28][29][30][31][32]. Among them, the Kerker preconditioner improves the convergence to alleviate the charge sloshing [19]. This preconditioner is known that it is suit-able for metallic systems. However, the question arises to what extent the preconditioner is applicable in the case of insulating or semiconducting, or inhomogeneous systems. References 32 and 33 have reported that the Kerker preconditioner does not ensure the convergence for those systems because this preconditioner cannot properly describe their long-range screening [25]. An alternative preconditioner to render the self-consistency procedure efficient for such systems is the Resta preconditioner [34]. In practice, both preconditioners are computationally less demanding compared to other methods for large unit cells. They were originally devised for planewave based methods. Recent theoretical studies have shown a formulation of the Kerker preconditioner as well as its implementation in full-potential (FP) calculations with the linearized augmented planewave (LAPW) basis set [26,27]. The Resta preconditioner, however, is not yet reformulated to implement in the FP-LAPW method.
In this paper, we develop a Resta-like preconditioning scheme to be applicable in the FP-LAPW method for stable and rapid SCF convergence. The Resta-like preconditioner is implemented in exciting code [35]. We examine the performance of our Resta scheme and compare it with another preconditioner using examples of insulator and semiconducting systems: MoS 2 slabs and P-rich, p(2 × 2) GaP(100) surfaces. We find that the implemented preconditioner improves robustness and accelerates overall convergence of the self-consistency iterations, which would extend the range of systems accessible to FP-LAPW with a given computational power.

PRECONDITIONERS: KERKER AND RESTA
The simplest mixing approach is to linearly mix the previous input and output charge density. In this case, a arXiv:2205.11209v1 [cond-mat.mtrl-sci] 23 May 2022 new guess at (i+1)th iteration, ρ in i+1 , is defined as where α is a damping parameter and appropriate values lie in the range between zero and one [36]. α generally relies on the considered material, for instance, a small value of α is suitable for metallic systems [37]. Using this simple method, SCF convergence can already be improved in some cases. However, in many structures, particularly metallic, large-scale, and inhomogeneous ones, poor convergence or even divergence is observed, even if one uses the optimal α. In practice, strong fluctuations in ρ out i caused by a small change of ρ in i prevent the convergence during the self-consistency iteration. This phenomenon is well-known as charge sloshing [17][18][19][20][21][22][23][24][25][26][27], which is more severe in metals with large supercells. To avoid this problem, one can add an effective preconditioner in Eq. 1 in the following manner: with P being the preconditioner. A number of preconditioners have been proposed to suppress the charge sloshing and to improve the SCF convergence [9,18,19,[29][30][31][32]34]. In this paper, we focus on two types of preconditioner: Kerker and Resta preconditioners [19,34]. Such preconditioners can be simply applied to common mixing schemes as multipliers and are computationally cheaper.
The Kerker preconditioner is typically used to capture the long-range screening behavior, causing a stable performance of the self-consistency procedure, especially for metallic systems with large unit cells. This preconditioner has been developed on base of the Thomas-Fermi screening model, and it reads where λ is a parameter that controls the screening at long wavelengths. For instance, P(G) Kerker corresponds to one when λ is zero. On the other hand, the value of P(G) Kerker is prone to reduce with increasing λ at short wavevectors G. The Thomas-Fermi screening wave vector k TF has been proposed as this parameter [25], and it can be written by with N (ε F ) being the density of states at the Fermi energy ε F . As discussed above, the ideal systems for the Kerker preconditioner are commonly metals, which can be treated in the picture of a homogeneous electron gas [25,32]. Unfortunately, the use of such a preconditioner does not guarantee achieving smooth convergence for insulators or semiconductors, or inhomogeneous systems like surfaces [32,33] as it does not capture properly the incomplete screening in those systems. Thus, one considers the Resta preconditioner, which can be used for those systems. The Resta preconditioner is given by where β, κ and γ are parameters. According to previous studies [25,38,39], those parameters are related to the screening length, Fermi-momentum-related quantity, and static dielectric constant, respectively. As suggested by Resta [34], one can obtain the static dielectric constant using other parameters as follows: Here, κ can be expressed as where k F is the valence electron Fermi momentum, and is defined as with n 0 being the valence electron density. Finally, we estimate β as the lattice constant of a system. Since those parameters are already determined for the specific system, no further adjustment to choose their optimal values is required.
In the Kerker and Resta preconditoner cases, it is straightforward and efficient to implement them in planewave-based methods, since they are initially devised in reciprocal space, but that is not straightforward for other methods like the FP-LAPW method. To implement these preconditioners in the FP-LAPW scheme, we firstly discuss the expression of the charge density. Here, the space of a unit cell consists of an interstitial region, I, as well as atomic spheres around the centers of atoms, which are called muffin-tin spheres. The muffintin part of the charge density is expanded in terms of spherical harmonics, while the density is represented by planewaves in the interstitial part: To make such preconditioners compatible with the FP-LAPW method, we need to transform them into the real space representation. For this reason, we can rewrite Eqs. 3 and 5 as P(r) Kerker and respectively. It should be noted that we approximate in Eq. 5 via a Taylor expansion, and this is truncated before the quadratic term. A further challenge that makes the proconditioner in Eqs. 10 and 11 difficult to develop is how to apply the operator, (∇ 2 − λ 2 (or κ 2 )) −1 , directly for the FP-LAPW case. Alternatively, we consider the screened Coulomb potential V (r) that is the solution of the screened Poisson equation: The input density ρ(r) in Eq. 12 can be expressed it in terms of the residual obtained from a mixing approach, We can solve Eq. 12 by employing Weinert's pseudocharge method [40]. According to this method, a smooth charge densitỹ ρ α (r) is used instead of the input charge density in muffin-tin spheres, therefore, the modified charge density can be expressed as where the first term of the right-hand side is the density in the interstitial region. Equation 13 then allows to perform a Fourier transformation. Now, one can calculate the screened potential in the interstitial region: The Fourier component ofρ(r) is analytically defined as with j l and i l being the spherical Bessel functions and modified spherical Bessel functions of the first kind. We, in turn, determine the potential inside muffin-tin spheres by solving the Dirichlet boundary-value problem, and it reads Here, G(r) is a Green-function that is given as where k l denote the modified spherical Bessel functions of the second kind. Detailed derivations of the pseudocharge method for the screened Poisson equation are provided in Refs. 27, 41, and 42.

PULAY MIXING
In general, Pulay mixing [5,6], which is called direct inversion in the iterative subspace (DIIS), performs better than linear mixing. The main difference between Pulay and linear mixings is that an iterative history of the input densities ρ in i (r) and residuals R in i (r), which are the difference between ρ out i (r) and ρ in i (r), is stored for the former scheme, determining optimum densities and residuals: Minimizing the norm of the residual is required for the optimum residual. To do so, weights ω i in Eq. 18 satisfy the constraint as These weights can be obtained by solving a system of linear equations [17]: with λ being a Lagrange multiplier. With a preconditioner P, the new input density for the next iteration in the Pulay mixing is updated as follows: where α is the same parameter discussed for the linear mixing case. In this work, we also consider Kerker and Resta-like preconditioners for this mixing. It should be noted that one can exploit the same approach of the pseudo-charge method explained in Sec. when these preconditioners are involved in the second term of the righthand side of Eq. 21.

COMPUTATIONAL DETAILS
As test cases, the two systems of MoS 2 slabs and Prich, p(2 × 2) GaP(100) surfaces are considered. We depict both structures in Fig 1. In particular, we model 10 layers (30 atoms) and 20 layers (60 atoms) for the former, which has an AB stacking pattern, and 15 layers (64 atoms) and 19 layers (80 atoms) for the latter. In the latter structures, a surface reconstruction is formed, containing buckled P-P dimers on top, together with one hydrogen atom per dimer. For the sake of comparison, we also consider the Au(111) surface (with 15 layers) as generic example for a metallic system. Such structures are constructed with lattice constants of 3.19 Å, 5.45 Å, and 4.19 Å for the MoS 2 , GaP(100), and Au(111), respectively. As mentioned before, an elongated unit cell tends to exhibit slow SCF convergence due to the charge sloshing. We therefore set around 30 Å of vacuum along the perpendicular direction that also prevents spurious interactions between neighboring replica for both systems.
The aforementioned methods in Secs. and are implemented in the full-potential all-electron densityfunctional theory (DFT) code exciting. All calculations are performed using this code. We employ the muffin-tin radii R M T of 2.4, 2.1, 1.7, 1.5, 0.9, 1.9 bohr for Mo, S, Ga, P, H, and Au atoms, respectively. For the exchangecorrelation functional, we adopt the generalized gradient approximation in the Perdew-Burke-Ernzerhof (PBE) parametrization [43]. A basis-cutoff of R M T G max = 6. GaP(100), and Au(111) surfaces. We assume that calculations are converged when the root mean squared (RMS) change of residual matrices is smaller than 10 −6 e/bohr 1.5 between two consecutive iterations. The RMS is defined as where Ω denotes the unit cell volume. In our investigated systems, the mixing parameter α is set to 0.4, and we fix a history length of densities as well as residuals obtained from previous steps to 15 for the case of Pulay mixing (m = 15). We set λ = 0.529 bohr −1 of the Kerker preconditioner for the MoS 2 and GaP(100) systems. In the case of Au(111), Eq. 4 is used for the determination of λ.

RESULTS AND DISCUSSION
To verify the efficiency of our implemented methods, we first start with 10 layers of MoS 2 slab. In this case, we use linear mixing and the corresponding mixings that include Kerker and Resta-like preconditioners (termed Kerker mixing and Resta mixing, respectively). Figure 2 exhibits the convergence behavior of the SCF iteration in terms of RMS change of residuals between two consecutive steps, ∆RMS. We find that the performance is highly sensitive to the type of preconditioner in this structure. It is clear that standard linear mixing (no preconditioner) does not reach the target threshold within 100 iterations. This scheme shows that the RMS fluctuation remains mainly around 10 −1 e/bohr 1.5 . On the contrary, its performance is improved by employing preconditioners. Both preconitioners show smoothly varying lines as shown in Fig. 2. The Kerker mixing, however, does not achieve the self-consistency during the first 100 steps. This mixing decays ∆RMS well until 35 steps, but it remains almost unchanged after that. Compared to other two methods, the Resta mixing converges to the target precision in 40 steps. Unfortunately, convergence is rather too slow to be practical. To further improve the efficiency, the Pulay mixing method with preconditioners is required. We observe that the slow convergence issue is tackled by Pulay-based approaches. This is displayed in the top panel of Fig. 3. Like for our previous results, computed by linear, Kerker, and Resta mixing methods (see Fig. 2), we compare the convergence of ∆RMS for three different cases, i.e., the Pulay without and with Kerker and Restalike preconditioners. In contrast to linear and Kerker mixings, those three approaches achieve the convergence within 50 steps. The Pulay approach with the Restalike preconditioner outperforms the other methods. This method converges in only 15 steps, and it is faster by at least 50 % than the standard Pulay (no preconditioner), but also faster than with the Kerker preconditioner.
To analyze the dependence of system size on the SCF convergence, we increase the number of layers to 20. The RMS convergence behavior is depicted in the bottom panel of Fig. 3. Compared to the 10 layers case, the behavior of the Pulay mixing without any preconditioner is significantly changed, showing no convergence. This is attributed to the further elongated unit cell that causes the above-mentioned charge sloshing. Interestingly, the Pulay mixing methods with both preconditioners are insensitive to the system size. For instance, the required number of steps for the 20 layers case to reach the tar- get precision is 14 in the Resta-like preconditioning approach, which is only one step less than in the case of 10 layers MoS 2 . For the Kerker one, the difference of the convergence steps between both structures is three. This implies that the long-wavelength instability is magnificently suppressed. Overall, the Pulay with the Resta-like preconditioner is the most effective and efficient among discussed methods for our insulating system. Next, we choose P-rich, p(2 × 2) GaP(100) surfaces consisting of 15 and 19 layers as the second benchmark system with a classical surface reconstruction. In Fig. 4, for both structures, we present the RMS convergence using different methods, i.e., the Pulay without and with Kerker and Resta-like preconditioners during the selfconsistency cycle. Note that our results exhibit a similar tendency compared to those for the MoS 2 slabs. The conventional Pulay mixing has a dependence on the size of the system, while others are independent of it. We demonstrate that Pulay mixings with both preconditioner are practical since they converge in 15-30 steps, almost irrespective of the number of layers. The Pulay mixing combined with the Resta preconditioner is indeed superior to that with the Kerker one, saving around 30 % of the number of iterations. This indicates that such a scheme can capture the incomplete screening feature well in these surfaces. We conclude that the Resta-like preconditioner is more appropriate for use with insulators or semiconductors with large unit cells.
In cases, the Pulay mixings with Kerker and Resta-like preconditioners exhibit robust convergence. On the other hand, unlike our previous calculations, the Kerker case is slightly faster than the Resta-like one. The difference of their steps is 4, corresponding to 15 % more steps until convergence is reached. This is not surprising because the Resta model has been originally designed to describe screening in semiconductors. In principle, the static dielectric constant of metals is ideally infinite, yet our calculation yields a finite value that leads to deteriorating the performance. Despite the slightly slower convergence compared to Kerker, the Resta-like preconditioner is also applicable to metallic systems with a moderately slower convergence.

SUMMARY AND CONCLUSIONS
In this work, we have established a formulation of the Resta-like preconditioner that can be directly applied in the FP-LAPW method, and evaluated its performance for insulating and semiconducting systems. This algorithm has been implemented in the all-electron fullpotential code exciting. We have demonstrated that using this preconditioner in mixing methods leads to a more stable and faster SCF convergence than without preconditioner or with the Kerker preconditioner, and which is not sensitive to the system size. The performance improvement for insulating and semiconducting systems is significant, while the performance loss for metallic systems is only moderate. The increased robustness and efficiency extend the range of systems accessible in the FP-LAPW method, especially for materials of semiconducting or insulating nature as well as inhomomgeneous systems.