Linear scaling approach for optical excitations using maximally localized Wannier functions

We present a theoretical method for calculating optical absorption spectra based on maximally localized Wannier functions, which is suitable for large periodic systems. For this purpose, we calculate the exciton Hamiltonian, which determines the Bethe–Salpeter equation for the macroscopic polarization function and optical absorption characteristics. The Wannier functions are specific to each material and provide a minimal and therefore computationally convenient basis. Furthermore, their strong localization greatly improves the computational performance in two ways: first, the resulting Hamiltonian becomes very sparse and, second, the electron–hole interaction terms can be evaluated efficiently in real space, where large electron–hole distances are handled by a multipole expansion. For the calculation of optical spectra we employ the sparse exciton Hamiltonian in a time-domain approach, which scales linearly with system size. We demonstrate the method for bulk silicon—one of the most frequently studied benchmark systems—and envision calculating optical properties of systems with much larger and more complex unit cells, which are presently computationally prohibitive.


I. INTRODUCTION
Simulations of optical properties such as UV-vis-NIR absorption or reflection spectra are crucial for designing or improving opto-electronic devices with novel materials.In this context, accurate theoretical predictions help to find suitable materials much faster and at lower cost, thus complementing and guiding experimental efforts.However, calculating optical properties is computationally demanding, which limits calculations to small systems with only a few atoms per unit cell.
The reason is that optical properties are inherently affected by many-body effects.For example, the optical response of semiconductors and insulators is determined by the Coulomb interaction between electrons and holes in a material, which leads to the formation of bound electron-hole states called excitons [1][2][3].For the calculation of optical properties such as UV-vis-NIR absorp-tion spectra it is therefore necessary to describe two-particle states of electrons and holes that are created upon optical excitation.A suitable description of such many-body effects can be derived in terms of a Bethe-Salpeter equation (BSE) [3][4][5][6][7][8][9][10] for the polarization function.For almost all real materials, however this BSE is too difficult to solve.Important simplifications can be obtained for non-spin-polarized systems, where the BSE splits into singlet and triplet parts, which can be treated independently [3].Optical transitions, described by transition matrix elements that are diagonal in spin space, cannot induce spin-flips, and it is sufficient to calculate the singlet case only, which is already a huge simplification.Furthermore, the singlet-BSE can be rewritten into a generalized eigenvalue problem and further simplified by performing the Tamm-Dancoff approximation for electronically gapped systems [3,11,12].The resulting Hamiltonian matrix is still very large and dense but can in principle be diagonalized for small system sizes using popular simulation packages [13][14][15][16].In addition, very dense k-meshes are needed in order to obtain converged results, a problem that is known from the independent particle picture [17] and which becomes more severe for excitons.This has lead to strategies like the use of hybrid meshes [18,19], where specific parts of the Brillouin zone are sampled with higher precision.Despite all these works on different computational aspects, it is still challenging to include exciton effects in the calculation of optical absorption spectra, in particular for systems with many atoms per unit cell.
In this paper we present an approach based on maximally localized Wannier functions (MLWF) [20,21], which can deal with large and/or complex systems.MLWF are directly obtained from underlying quasi-particle wave functions and represent a minimal basis set that is adapted to the specific material.Moreover, they can be obtained for specific bands, e.g., near the band gap, making the calculation independent of the number of atoms in a unit cell.Furthermore, we show that the resulting representation has important computational advantages, namely that the Hamiltonian matrix becomes very sparse, and can therefore be solved very efficiently, thus enabling optical calculations of large systems.For convenience, we use the term LSWO (linear scaling Wannier optics) for the presentation of the entire approach.

A. General formalism
We start from the two-particle eigenvalue problem in Tamm-Dancoff approximation [3,11,12], where c and v label the conduction and valence bands, respectively, and A describes the exciton amplitude.The crystal momentum k is the same for electron and hole because only vertical excitations are considered in the optical limit.The hermitian singlet-exciton Hamiltonian H is given by and consists of effective single-particle contributions from conduction and valence band structures (first term), which are diagonal with respect to k, and two-particle contributions from screened electron-hole interactions H SC and local field effects H LFE , which couple different k and k ′ via Coulomb interaction.While the occurrence of a screened electron-hole interaction is intuitively plausible, the local field effects (LFE) term seems less obvious and some comments are appropriate.LFE arise when the system is inhomogeneous on the microscopic scale, i.e. the microscopic dielectric function ϵ GG ′ is not diagonal with respect to reciprocal lattice vectors G [22][23][24].By including LFE in the Hamiltonian, it is ensured that one can later calculate the macroscopic rather than the microscopic dielectric tensor directly from E Λ and A Λ .Note that the LFE matrix elements are in the form of electron-hole pair exchange interactions.[25] H SC and H LFE can be obtained from single-particle Bloch functions for conduction ϕ ck (x) and valence states ϕ vk (x).A natural choice for ϕ ck (x) and ϕ vk (x) are Kohn-Sham orbitals leading to where W (x − x ′ ) is the screened Coulomb interaction and Ṽ (|q + G|) = 4πe 2 ϵ 0 1 |q+G| 2 is the Fourier transformed bare Coulomb potential.The screening might be obtained from different approaches, including a GW calculation or using a model screening function or just a constant relative permittivity.Here, we use a model dielectric function ϵ −1 (q) = 1 − (η + αq 2 /q 2 TF ) −1 that has been shown to yield good results for typical semiconductors [26].The parameter η = (1 − ϵ −1 ∞ ) −1 with the electronic dielectric constant ϵ ∞ of the material, and q TF is the Thomas-Fermi wave vector.
no time dependence, which is the most frequent approach.However, we note that current efforts also investigate extensions to the frequency dependence of screening [27,28].By taking the Fourier transform we obtain the corresponding potential in real space, which is the superposition of a screened Coulomb and a Yukawa potential.A more detailed derivation can be found in Section B of the appendix.
Independently of the type of screening, the numerical evaluation of Eq. ( 1) can be quite expensive because a very fine k-mesh is usually required to obtain converged results and the Hamiltonian matrix that needs to be diagonalized is very large and, in general, a dense matrix.Furthermore, the underlying Bloch functions, that are needed for the evaluation of Eq. ( 3) and Eq. ( 4), are delocalized which leads to additional challenges for numerical calculations.These obstacles are circumvented by transforming above equations into a localized basis of Wannier functions which will be explained here below.

B. Exciton-Hamiltonian in basis of MLWF
For an efficient treatment of the exciton problem in Eq. ( 1), it is advantageous to employ a localized basis of MLWF w mR (x).MLWF are routinely used to investigate single-particle observables [21,29] and have been shown to be advantageous for many-body first-principles calculations, including electron-electron interactions and screening [30,31], spin excitations [32] or quadratic optical response [33].They are directly related to the underlying Bloch functions ϕ nk (x) by the transformation, where R represents a unit cell vector and U (k) is a unitary matrix.It can be chosen such that the obtained Wannier functions are maximally localized, i.e. their spread ⟨x 2 ⟩ − ⟨x⟩ 2 is minimal.
To be more precise, U (k) disentangles the individual energy bands in case of band crossings or degeneracies and fixes the k-dependent gauge phase e iθ(k) that each Bloch function has.U (k) can be obtained from an optimization algorithm [20,21] for specific groups of bands, e.g.all valence bands.The obtained MLWF are orthogonal to each other and must be real valued [20].Owing to translational symmetry, MLWF at different unit cells R have the same shape and are related to each other by w mR (x) = w m0 (x − R) , which is known as shift property.
For the LSWO approach it is advantageous to obtain MLWF for conduction and valence bands near the fundamental band gap separately.Therefore, the obtained MLWF keep the character of either an electron or a hole.We denote them as conduction-WF and valence-WF in the following.
Even though the conduction and valence MLWF are obtained separately, they are orthogonal since valence and conduction states are non-degenerate for all k-points.Hence, they represent a suitable basis for the excitonic two-particle Hilbert space.
As mentioned above, only a subspace of the two-particle Hilbert space in which electrons and holes have the same momentum is relevant for the calculation of optical properties.This means we need to transform the Bloch representation with the indexes cvk into a real-space description of MLWF with indexes mnS.This mapping is achieved by a unitary transformation of the two particle basis using the matrix where the U matrices are obtained from Wannier transformations of valence and conduction bands and the unit cell vector S = R − L is the distance between electron unit cell R and hole unit cell L. Excitonic wave functions in the optical subspace (i.e. at vanishing photon momentum q → 0) are obtained by We have used that MLWF are real and therefore the excitonic wave function fulfills ξ mnS = ξ * mnS .Eq. ( 8) is a manifestation of the convolution theorem in terms of Bloch functions and corresponding MLWF.At this point we should mention that the use of the variable R (electron unit cell) as summation index by no means introduces any asymmetry in the treatment of electrons and holes.
The same result can also be expressed by centre of mass and relative coordinates.The centre of mass motion is not relevant for optics due to translational symmetry of the crystal and only the relative distance S between electron and hole remains in ξ mnS .
We also use F mnS, cvk to transform Eq. ( 1) into the Wannier basis, where the exciton eigenvector is obtained as and the exciton Hamiltonian becomes According to Eq. ( 2) the single-particle band contributions are obtained as where H cond. m ′ m (S −S ′ ) and H val. nn ′ (S −S ′ ) are the single-particle Wannier Hamiltonians for conduction and valence bands, respectively.They are directly accessible from the Wannier transformation of the first-principles electronic structure.[20,21] The screened electron-hole interaction can be obtained by virtue of Eq. ( 8) and by applying the shift property of MLWF (see appendix), with the general Coulomb matrix elements which depend on three different unit cell vectors (corresponding to three k-vectors in reciprocal space).HSC mnS, m ′ n ′ S ′ only depends on two unit cell vectors because electrons and holes have the same momentum.For a more intuitive and physically comprehensible description, we introduce the unit cell vectors R c , R v , and R D , which correspond to the relative shifts between conduction WFs, between valence WFs, and to the electron-hole distance, respectively.We substitute 14) and use the shift property of MLWF to obtain where ρ mm ′ Rc (x) = w m0 (x)w m ′ Rc (x) and ρ nn ′ Rv (x) = w n0 (x)w n ′ Rv (x) are (overlap) densities of two electrons and (overlap) densities of two holes, respectively.
Before we come to the integration strategy in Sect.III, we comment on the distance dependence of these matrix elements.Since the overlap between two different MLWF is exponentially suppressed with increasing distance, it is clear that the overlap densities vanish for large values of R c and R v .Therefore, the corresponding Coulomb integrals Eq. ( 15) also vanish rapidly for large displacements R c or R v .This substantially reduces the number of calculations required and constitutes a significant advantage over a plane wave basis set.In contrast, R D is associated with long-range Coulomb interactions, which always yields contributions that decay very slowly.Substituting back the original variables S, S ′ , and A, we see that finite Coulomb integrals contribute only to matrix elements HSC mnS, m ′ n ′ S ′ near the diagonal and R D corresponds to the position along the diagonal.The matrix representation is therefore very sparse.This is a great advantage for numerical computations, since diagonalization or alternative treatments can be performed very efficiently and with low memory requirements.It is thus not surprising that other localized basis sets leading to sparse representations of Coulomb interactions have shown large performance advantages for GW calculations in the past.[34,35] The diagonal elements for which m = m ′ , n = n ′ , and R c = R v = 0 (or alternatively A = 0 and S = S ′ = −R D ) are expected to yield the largest contributions to HSC .They represent interactions of classical charge densities with total charge of one, because MLWF are normalized.The non-diagonal elements of HSC correspond to interactions where at least one density is an overlap density, i.e. ρ mm ′ Rc or ρ nn ′ Rv contains two different MLWF.Such overlap densities have zero total charge because MLWF are orthogonal.We therefore expect the non-diagonal elements to be significantly smaller.Finally, contributions from LFE, Eq. ( 4), are calculated in analogy to Eq. ( 13), This matrix is, like HSC , very sparse since the overlap between MLWF is exponentially suppressed with increasing distance.Consequently, only matrix elements with small values S and S ′ , where electron and hole have closest distance, are affected by LFE.In the limiting case of strongly localized Wannier functions only matrix elements with S = S ′ = 0 would contribute.We thus have a complete description of the singlet exciton Hamiltonian in the Wannier basis Eq. ( 9) that can be used to calculate optical properties.

C. Optical properties
The macroscopic dielectric function ϵ M (q, ω) could be calculated within the original Bloch representation directly from the solutions of Eq. ( 1) and the optical transition matrix elements M cvk (q) that can be obtained from conduction and valence Bloch functions, The macroscopic dielectric function is given as [3] ϵ Like in the previous section we transform these expressions into the basis of MLWF by utilizing the matrix F mnS, cvk to calculate ϵ M (q, ω) directly from B Λ mnS and corresponding transition matrix elements.The transformation is applied to the scalar product in Eq. ( 18), cvk where was defined in the last step.Using Eq. ( 8) we can rewrite the transition matrix elements in terms of MLWF, Taylor expanding the exponential up to linear order (higher orders are irrelevant in the optical limit q → 0) [36,37] we get From Eq. ( 21) we can see that the transition matrix elements are proportional to transition dipole moments, i.e. dipole moments of electron-hole overlap densities, which nicely connects to expectations from finite systems.The evaluation of transition dipole moments does not cause any problems (like one would have with delocalized Bloch functions) since Wannier functions are localized in real space.Finally, the macroscopic dielectric function becomes With Eqs. ( 22),( 21) and ( 11) the entire problem is formulated in the Wannier basis.The remaining task is to evaluate all required matrix elements for the screened Coulomb interaction and LFE in this basis, which will be discussed below.

III. NUMERICAL EVALUATION OF TWO-PARTICLE MATRIX ELEMENTS AND MACROSCOPIC DIELECTRIC FUNCTION A. Evaluating Coulomb matrix elements in the basis of MLWF
For the numerical evaluation of the screened Coulomb interaction we insert the model-screened potential Eq. ( 5) into Eq.( 15) and evaluate the Coulomb and Yukawa potentials separately, While the integral with the Yukawa potential (second term of Eq. ( 23)) can be solved efficiently in reciprocal space, the numerical evaluation of the Coulomb integral (first term of Eq. ( 23)) is quite challenging, because the potential diverges in both real and reciprocal space for x → 0 and q → 0. However, the integral is nevertheless finite as can be shown on general grounds.
The problem is still complicated by the fact that MLWF are typically obtained numerically from DFT or GW calculations and analytic forms are usually unknown.Strategies to circumvent such issues include expansions of MLWF using spherical harmonics and appropriate radial functions [38,39], where the Coulomb integrals can be rewritten and partly solved analytically, or attempts to expand MLWF around the origin in k-space by a suitable Taylor expansion.While the latter is numerically inconvenient, the expansion in spherical harmonics can provide satisfactory results for simple systems [38], especially when the Wannier functions are expressed in a form of atomic orbitals and only a small number of expansion coefficients are needed.This, however, may not be the case, which means that in general an extreme large set of spherical harmonics becomes necessary, especially when satellite structures far away from the charge centre exist.Alternatively, one might consider choosing a different system of functions where the Coulomb integrals can be solved analytically.A well-known example is Gaussian basis functions, which are routinely used in quantum chemistry codes [40].However, an expansion of MLWF in terms of such basis functions is usually very complicated and requires sophisticated optimization and fitting algorithms.Despite some proof of principle studies [41], there are no commonly available tools to perform such an elaborated task.Here, we want to use a numerical method that yields satisfactory results for all types of MLWF and is easily applicable.This method follows the ab-initio philosophy in the sense that we avoid any fitting.
The numerical evaluation of the first term of Eq. ( 23) is performed in multiple steps.We start by introducing auxiliary densities ρ aux mm ′ Rc (x) and ρ aux nn ′ Rv (x) for each ρ mm ′ Rc (x) and ρ nn ′ Rv (x), respectively.These auxiliary densities are Gaussian functions with the constraint that they have the same charge as the corresponding overlap density, i.e., The centre and variance of each Gaussian function is in general not important, albeit specific choices might be numerically favourable.We continue by adding and subtracting auxiliary densities for each integral and separate four different terms, where the individual contributions are given by, The last term I 4 can be evaluated analytically because only Gaussian functions are involved.For instance, chosing radial symmetrical Gaussians ρ aux mm ′ Rc (x) = α π 3/2 e −α|x−B| 2 and ρ aux nn ′ Rv (x) = γ π 3/2 e −γ|x−C| 2 , one obtains [40], The remaining three terms I 1 , I 2 and I 3 are solved in Fourier space.This is demonstrated for I 1 , which, in Fourier space reads where the Fourier transformed quantities are and the Fourier transformed potential Ṽscr (q) ∝ q −2 .The divergence at q → 0 is integrable, i.e.
the integral is finite for all finite regions including volumes around the origin.
Since the auxiliary densities have the same charge as the corresponding overlap densities (cf. Eq. ( 24)), it becomes clear that f mm ′ Rc (q = 0) = f nn ′ Rv (q = 0) = 0 by construction.For a discrete numerical evaluation of the integral Eq. ( 28), this means that the q = 0 term can be omitted, since it must be zero (finite value times zero).The only remaining task is to perform the q-sum for all q ̸ = 0, where no problems occur, and we obtain Integrals I 2 and I 3 are solved in full analogy.After summation and back substitution we obtain the desired (screened) Coulomb matrix elements Eq. ( 14).

B. Evaluating LFE in the basis of MLWF
The numerical calculation of LFE matrix elements in Eq. ( 16) is much easier than the screened Coulomb interaction because the used potential is not divergent (G = 0 is not contained).The potential in Fourier space is obtained as, The overlap densities are now between conduction and valence WF and are known as transition densities.We denote their Fourier transform as Finally, Eq. ( 16) becomes which can be easily evaluated numerically with a Fast Fourier algorithm.

C. Time domain approach for calculating the macroscopic dielectric function
We have now everything at hand to construct the exciton Hamiltonian in the basis of MLWF.
The remaining task would be to solve the eigenvalue equation and use Eq. ( 22) to obtain the macroscopic dielectric function ϵ M .Numerically this could be done by using a sparse matrix diagonalization algorithm.However, we want to use a time-domain approach [42] which allows us to calculate ϵ M without a formal high-scaling diagonalization or restrictions to a few number of eigenvalues.Therefore, we rewrite Eq. ( 22) in the time domain by taking a Fourier transform.We start with the dielectric function in the Cartesian direction êj , This is equivalent to a time-domain formulation [42], where the time-initial state is given by ψ (j) mnS (t = 0) = M * mnS (ê j ) and is propagated with the exciton Hamiltonian,

IV. COMPUTATIONAL DETAILS
To demonstrate our approach for the example of silicon crystals, which has been frequently studied experimentally and theoretically in the past, [11,13,42,43] we proceed in multiple steps.
First, electronic states are obtained using density functional theory (DFT) with the PBE exchangecorrelation functional and PAW pseudo potentials [44,45] as implemented in the vasp code [46,47].We use an energy cut-off of 350 eV and a 11 × 11 × 11 Monkhost-Pack k-points grid for converged DFT calculations.From these results, we calculate four MLWF which correspond to all valence bands and six MLWF for the lowest-energy conduction bands separately by utilizing the wannier90 code [48].It was carefully checked that all obtained MLWF are real-valued and reproduce the DFT band structure very accurately.The obtained Wannier functions are very localized with maximal spreads of 2.18 Å2 for valence WF and 5.25 Å2 for conduction WF.Since the underlying DFT-GGA calculations do not provide the correct band gap, we apply a scissors shift of 0.9 eV which is is similar to previously calculated quasi-particle shifts [3].The Wannier Hamiltonians for valence and conduction bands provide all single-particle contributions of the exciton Hamiltonian Eq. ( 12).The two-particle integrals entering HSC and HLFE are evaluated on a regular grid in Fourier space as described in Section III A and III B, which captures a supercell of 11 × 11 × 11 primitive unit cells.The grid is determined by the Fourier space grid of the vasp calculation.(Overlap-)densities and auxiliary functions are also constructed on this real space grid and Fourier transformations (c.f.Eqs. ( 29), ( 30) and ( 33)) are performed using the FFTW library [49].For the screening model introduced in Sect.II A we use ϵ ∞ = 11.68 for Si.From the obtained single-particle and two-particle contributions we construct the exciton Hamiltonian Eq. ( 11) in a sparse matrix format where S, S ′ are running over 61 lattice vectors in each direction for converged results.To test the capability of the LSWO approach we also performed calculations with 111 lattice vectors in each direction, which is equivalent to 1.37 million k-points.
The time evolution for the calculation of ϵ M (c.f.Section III C) is performed by a Chebyshev polynomial expansion [50,51] of the time evolution operator, which has proven to be very accurate and efficient in the past [52][53][54].We set the maximum time to 14.77 ps, use 2000 time steps and 16 polynomials.When calculating the spectrum we assumed a broadening of η = 65 meV.Fig. S-2 shows the time-autocorrelation function which enters Eq. (36).
We also carefully tested the implementation of the LSWO approach at multiple levels.This includes the comparison to an analytic Wannier-Mott exciton model and the reproduction of its energies.The interested reader is referred to Section C 1 of the appendix for more details.

A. Overlap densities and Coulomb integrals
Before discussing the optical absorption of bulk Si, we investigate more closely the distancedependence of the two-particle contributions of the exciton Hamiltonian.We start by discussing the overlap densities ρ mm ′ Rc (x) and ρ nn ′ Rv (x), which contribute to the screened Coulomb interaction via Eq.( 15).Furthermore, ) is more sensitive to r v than r c because valence WFs are more localized than conduction WFs.In both cases, the overlap densities ρ mm ′ Rc and ρ nn ′ Rv vanish for large separations where the Coulomb integrals become zero.As a consequence, the corresponding screened Coulomb operator HSC is very sparse and the largest values contribute to the diagonal of the Hamiltonian matrix, as suggested.Similar results can be found for HLFE (not shown), which leads to a very sparse total exciton Hamiltonian.
We next turn to the diagonal elements of the Hamiltonian that correspond to electron-hole interaction of classical charge densities.They are shown in Fig. 2 for different distances between electrons and holes, which depends on the unit cell distance R D and the positions of the MLWF (charge centres) within a unit cell.The Coulomb integrals W mm nn (0, 0, R D ) become smaller with increasing distance and can be approximated for distances larger than 10 Å by the monopole-monopole interaction (grey dashed line).Notable deviations from the monopole-monopole approximation are and valence WF.The interaction is dominated by the monopole-monopole interaction (dashed line).Only interactions between overlapping densities with small distances differ significantly.
found here only when electron and hole densities start overlapping at smaller distances.As a result of the multipole expansion, only a relatively small fraction of the Coulomb integrals need to be calculated numerically, which reduces the computational effort substantially.For example, in the present study, we only need to compute 2496 out of 5.4 million density-density Coulomb integrals in full detail (less than 0.5 % for a 61 × 61 × 61 supercell with 4 valence and 6 conduction WFs) and assume the monopole-monopole approximation for the vast majority of terms.In general, the value of 10 Å does not have to be universal and deviations from the leading monopole-monopole term could occur also at larger distances, for instance in systems with Wannier functions that are less strongly localized.However, we are confident that systems with larger orbital spreads can also be treated very efficiently.

B. Optical absorption spectrum
With the obtained exciton Hamiltonian we calculate the optical absorption spectrum of Si.Fig. 3(a) shows a comparison the LSWO approach (black solid line) to experimental data (orange dashed line).The spectrum contains the peaks E 1 and E 2 (naming convention from Ref. [56]), in good agreement with experiment.Most importantly, the characteristic (direct) exciton peak at E 1 = 3.5 eV is a clear sign of bound exciton states that arise from electron-hole interactions.This peak is not present at GW or DFT theory level as shown by the dotted gray line.Compared to Peak labels are in agreement with previous conventions [56].(b): Comparison with other theoretical calculations.References: Gajdos [14], Puschnig [13], Schmidt [42], Arnaud [43], Marini [57] the quasiparticle spectrum, the excitonic effects result in a significantly redshifted spectrum, as generally expected which is a consequence of the electron-hole interaction.Residual deviations of the exciton spectrum to experiment might be related to the screening model (which is frequently used but still remains an approximation) or missing quasi-particle corrections in the band structure that go beyond a scissors shift.Fig. 3(b) compares LSWO results to other theoretical calculations.
The height of the E 1 exciton peak varies significantly among different methods, which might be related to different treatments of the screening.Our results are closely comparable to the approach by Marini [57] and performs better than others in the literature.

C. Scaling and performance of the LSWO approach
Finally, we discuss the performance and scaling with respect to the size of the exciton Hamiltonian, which depends on the number of valence and conduction states and the number of k-points (or equivalently S-points in Eq. ( 9)).The overall performance depends on two parts, i.e., firstly the calculation of all required matrix elements of the Hamiltonian and secondly the evaluation of the optical absorption spectrum using the time evolution approach.Fig. 4 shows the scaling of both parts for various numbers of k-points.All computations are performed on a single CPU core and normalized to a reference computation.Note that in the current implementation we do not exploit the symmetry of the crystal.
The most time-consuming part for the construction of the exciton Hamiltonian, which is shown in Fig. 4(a), is the evaluation of the Coulomb and LFE integrals that enter ĤSC and ĤLFE .In contrast, the time required to generate the single-particle contributions of the Hamiltonian, i.e. valence and conduction bands, is negligible.As a result, the computing time scales with the number of two-particle integrals that need to be evaluated numerically on a real space grid.As we have shown in the previous section, the majority of such integrals either vanish if R c or R v deviate sufficiently from zero, or become analytical monopole-monopole interactions for larger values of R D .Consequently, only a finite number of integrals need to be evaluated, leading to a saturation of CPU time in Fig. 4(a).This plateau is already reached for a supercell of 7 × 7 × 7-unit cells (corresponding to a k-lattice of the same dimensions) which can be done with moderate effort.Once all integrals have been obtained, one can proceed to even denser k-grids (corresponding to very large supercells S) without additional effort for the computation of H.
The second step that is crucial to the performance of the LSWO method is the time evolution with the exciton Hamiltonian, which is shown in Fig. 4(b).This time propagation is performed in a step-by-step fashion, where each time step has the computational complexity of a sparse matrix-vector multiplication.Such operations can be performed very efficiently in linear scaling as shown in the figure.For comparison, the time-evolution approach in a Bloch representation, where the Hamiltonian is dense, would scale with O(N 2 ) [42], which is similar to implementations that use a Lanczos-Haydock approach as implemented in the Yambo code [58].Note that a direct diagonalization of the Hamiltonian scales with O(N 2 ) in the case of a sparse matrix or with O(N 3 ) in the case of a dense matrix.

VI. CONCLUSION AND OUTLOOK
We have presented a method for describing the exciton Hamiltonian of the Bethe-Salpeter equation using maximally localized Wannier functions, which represent a minimal, spatially localized and material-specific basis set that accurately represents the quasiparticle band structure.The electron-hole interaction, i.e., local field effects and screened Coulomb attraction, are evaluated numerically in this basis, where the required number of two-particle matrix elements to be computed is greatly reduced due to the localized character of Wannier functions.Moreover, Coulomb integrals where electron and hole densities have large distances can be treated very efficiently in monopole approximation.Therefore this description in real space leads to a very sparse exciton Hamiltonian that can be calculated and used with high efficiency and offers intuitive user control over the simulations.With this implementation at hand, the macroscopic dielectric function for optical properties is calculated in the time domain using a linear-scaling algorithm.We have demonstrated the approach for a Si crystal where the optical subspace was constructed with millions of simple unit cells (corresponding to millions of k-points).The calculated absorption spectrum agrees well with experimental results.
In the future, we expect that the described LSWO approach will be very efficient for materials with many atoms per unit cell, which are not accessible with alternative current implementations.
We hope that excitonic effects in optical spectra, which are relevant in a large number of crystalline systems, become more easily accessible.
We finally show that the hermiticity relation of the Hamiltonian can be traced back to relations between single Coulomb integrals W mm ′ nn ′ (A, S, S ′ ).For this we substitute A → −A.
Performing the sum over A on both sides, we obtain the hermiticity relation of the Hamiltonian.

Appendix B: Model screening potential
We start from the screened potential Coulomb interaction as defined in Section II A and define α ′ = α/q 2 TF for simplicity, A simple rearrangement of the terms yield the Coulomb and Yukawa potential in reciprocal space, The Fourier transform then yields Eq. ( 5).36) and is calculated using Eq. ( 37).

Implementation test
We have carefully and extensively tested all implementations, of which we want to discuss one particular test case that demonstrates the ability to compute excitons.For this purpose, we propose a simple test system that can be solved analytically.It consists of one orbital per unit cell in a cubic lattice of length L and nearest neighbor transfer integrals for electrons and holes.The electronic structure is given by a tight-binding model, We construct the exciton Hamiltonian and include the electron-hole interaction.For simplicity we chose a static screening with ϵ ∞ and do not include local field effects.The resulting model is given by where Ṽ (k −k ′ ) is the bare Coulomb potential in k-space.The model system is therefore similar to the Wannier-Mott exciton model [59].To obtain an analytical solution of this model, we perform a Taylor expansion of the band energies around k = 0 By expanding the exciton Hamiltonian up to second order we obtain the hydrogen-like problem, with an effective mass µ = ℏ 2 2(t el +t h )L 2 and E g = E 0 − 6(t el + t h ) the band gap without electron-hole interaction.The exciton energies follow a Rydberg series, where the exciton Rydberg energy R ex and exciton Bohr radius a B are, We note that this result can be further improved by calculating the energy shifts due to the k 4 term in Eq.(C5), which would correspond to a relativistic correction of the hydrogen atom (fine structure without spin-orbit coupling).In complete analogy, they can be calculated using perturbation theory (more details on the derivation can be found in Ref.The analytical model will be compared with results of our Wannier implementation.Towards this end, the exciton Hamiltonian is set up in real space using the tight-binding models for valence and conduction bands (c.f.Eq. (C1)) and statically screened monopole-monopole interaction.The results can then compared for various model parameters (L, t el , t h or ϵ ∞ ).For converged numerical results, it is necessary to ensure that the size of the supercell (corresponding to the number of k points) is large enough to host the eigenfunctions (hydrogen-like wavefunctions).More specifically, it must be much larger than the exciton Bohr radius a B .To avoid discretization errors, the spacing of the lattice points must be small compared to a B so that the eigenfunction can be represented on a real space lattice.By varying the parameters, one can obtain converged numerical results that are arbitrary close to the analytical result.On example is shown in Fig. S-3, where the parameters are L = 5 Å, t el = t h = 8 eV, and ϵ ∞ = 1.The calculations are performed in a 700 × 700 × 700 supercell and we have used an efficient Lanczos algorithm to calculate the density of states (DOS).
The figure shows perfect agreement between the numerical and analytical results, demonstrating the correctness of our implementation and the ability to simulate various excitons.

FIG. 1 .
FIG. 1. (a): Examples of overlap densities ρ nn ′ Rv for valence WF with n = n ′ and different R v .Yellow colours represent positive and blue negative values.All densities are plotted for the same iso-value magnitude of 0.001 and blue lines indicate the Si crystal.(b): Coulomb integrals for different hole-hole distances r v and electron-electron distances r c in the corresponding overlap densities ρ nn ′ Rv and ρ mm ′ Rc .While R v and R c are only unit cell vectors, r v and r c also consider the position of Wannier centres within their unit cell.

Fig. 1 (
a) shows selected overlap densities ρ nn ′ Rv of the valence WF (with n = n ′ and different R v ).In this case, the overlap density for R v = 0 is a classical charge density in the shape of σ-bonded combination of sp 3 hybrid orbitals.The density is positive everywhere (yellow colour) with total charge of one.On the other hand, finite shifts R v introduce negative regions (blue colour) in ρ nnRv and result in a total charge of zero.It is clearly seen that large values of R v lead to smaller overlaps as expected.The implications of the decay of the Coulomb integralsW mm ′ nn ′ (R c , R v , R D )with distance are shown in Fig. 1(b).Blue stars denote data with varying distance between conduction WF r c and orange dots show data with varying distance between valence WF r v .The distances r c and r v depend on the unit cell separation R c and R v , respectively, and on the position of the Wannier centres within the unit cell.It is clearly visible that already small separations in the overlap densities of a few angstroms lead to much smaller values in the Coulomb integral.The largest Coulomb integrals are observed for r c = r v = 0, where classical charge densities (with total charge of one) interact with each other.Our above discussion has therefore been confirmed numerically.

FIG. 4 .
FIG. 4. Scaling behaviour for (a) construction of the exciton Hamiltonian and (b) calculation of the optical absorption spectrum.N is the rank of the Hamiltonian N = N el • N h • N k .For comparison, a direct diagonalization of the exciton Hamilton in the Bloch basis (dense matrix) scales with O(N 3 ).Using the time evolution approach of Ref. [42] scales with O(N 2 ).The legend is shared for both figures.Calculations are performed on a single CPU core.

3 .
FIG. S-3.Rydberg series of Wannier excitons as a test case.Analytical results are shown as vertical dashed lines.