Floquet formulation of the dynamical Berry-phase approach to non-linear optics in extended systems

We present a Floquet scheme for the ab-initio calculation of nonlinear optical properties in extended systems. This entails a reformulation of the real-time approach based on the dynamical Berry-phase polarisation [Attaccalite&Gr\"uning, PRB 88, 1-9 (2013)] and retains the advantage of being non-perturbative in the electric field. The proposed method applies to periodically-driven Hamiltonians and makes use of this symmetry to turn a time-dependent problem into a self-consistent time-independent eigenvalue problem. We implemented this Floquet scheme at the independent particle level and compared it with the real-time approach. Our reformulation reproduces real-time-calculated $2^{nd}$ and $3^{rd}$ order susceptibilities for a number of bulk and two-dimensional materials, while reducing the associated computational cost by one or two orders of magnitude.


Introduction
Nonlinear optical spectroscopies are an invaluable tool for investigating materials properties. For example, second harmonic generation (SHG), due to its sensitivity to electric fields and symmetry properties, has been traditionally used on surfaces and interfaces [1][2][3] and more recently for the characterisation and imaging of twodimensional flakes [4][5][6][7]. In addition, SHG is being used as a highly sensitive probe of magnetic ordering in atomically thin materials and in multiferroics [8][9][10]. Due to the sensitivity to changes in the electric polarisation, SHG can also probe the dynamic of excited systems, tracking-for instance-the formation of excitons, excitonphonon coupling and demagnetisation of antiferromagnets [11,12]. Further, nonlinear optical properties of materials are intensely investigated for applications to quantum technologies, optical frequency metrology and optoelectronics. Recent developments have highlighted the crucial role low dimensionality can play in this field [13][14][15][16], for instance, strengthening the nonlinear response as well as facilitating the integration into devices. Other advances include epsilon-near-zero media [17], the new generation of infrared nonlinear optical materials [18] and the use of structured light [19]. The development of ab-initio approaches to nonlinear optical properties is then of utmost importance for interpreting experimental measurements and guiding the design of functional materials. While the ab-initio prediction of optical properties in the linear regime is well-established [20,21], the theoretical description of nonlinear processes still presents challenges concerning its complexity and the associated computational cost.
The theoretical methods available in the literature for the calculation of nonlinear optical properties are either perturbative or non-perturbative. Perturbative approaches are often extensions of frameworks that proved successful for linear optics. For instance, Sipe et al. presented a scheme for the calculation of nonlinear optical response of semiconductors within the independent particle approximation (IPA) and derived expressions for the second order susceptibility, χ (2) [22]. A study by Dal Corso et al. introduced a Sternheimer approach for the second order response of insulators within the local density approximation (LDA) [23]. Luppi et al. derived perturbative expressions for χ (2) in extended systems from time-dependent density functional theory (TD-DFT) [24]. The latter included excitonic effects by means of a long-range contribution to the exchange-correlation kernel and proved valid for weakly bound excitons [24]. Finally, the inclusion of many-body effects at the Bethe-Salpeter equation (BSE) level warranted a few attempts up to the second order [25,26]. Unfortunately, perturbative approaches require a specific formulation for each order in the response one intends to calculate and their generalisation to higher orders is not straightforward. Indeed, the resulting expressions for nonlinear susceptibilities become extremely complex with increasing orders in the perturbation and increasing levels of theory as regards correlation.
At variance, non-perturbative approaches involve explicit time propagation and can describe nonlinear phenomena to several orders in the electric field simultaneously, thus offering a convenient workaround to the shortcomings described above. Moreover, they are flexible in the sense that including many-body effects amounts to just adding the relevant operators into the effective Hamiltonian. In these methods, the integration of an equation of motion (EOM) allows for the calculation of the dynamical polarisation, from which susceptibilities to any order (in principle) can be extracted. The quantity evolved in the EOMs varies among the different time-propagation methods. For instance, TD-DFT implies the time evolution of the electron density and is typically applied to isolated systems [27][28][29][30][31]. Propagating the Green's function was proposed in the so-called Kadanoff-Baym equations (KBE) [32]. A simplification of the KBE using the time-diagonal of said Green's function, i.e., the density matrix, was proposed by Attaccalite, Grüning and Marini [33]. Subsequently, Attaccalite and Grüning proposed a scheme based on evolving the periodic part of the Bloch functions [34]. Crucially, this method is valid for systems with periodic boundary conditions (PBCs) since it is based on the modern theory of polarisation [35][36][37] and uses the Berry phase formulation of the dynamical polarisation [38] (see Section 2). This real-time approach [34] has been successfully applied for the calculation of nonlinear optical properties in extended systems [39][40][41][42][43].
The main drawback of time-propagation approaches lies in their elevated computational cost, which results from the short time steps and long simulated times required. This implies repeating a handful of operations, e.g., building a Hamiltonian matrix, for tens of thousands of time steps. The computational cost of these schemes renders the calculation of nonlinear optical properties prohibitively costly in many cases, certainly for large systems and complex materials. Therefore, finding alternative formulations and methods that could alleviate these computational demands is of utmost importance. In order to tackle this challenge, the use of time-periodicity and Floquet theory offers an interesting avenue to explore. In periodically-driven systems, one could avoid the explicit integration of the EOMs and reformulate them as a time-independent eigenvalue problem [44] by invoking Floquet's theorem. This has been attempted at the TD-DFT level [45,46] and intensely debated [47][48][49][50]. However, to the best of our knowledge, it has been only applied to atomic and molecular systems [45,46,51,52].
In this work, we introduce an efficient Floquet approach to nonlinear optics valid for extended systems. Our scheme works under PBCs since we use the Berry-phase polarisation, and its conjunction with Floquet theory defines the originality of our contribution. We implemented our method at the IPA level and achieved a sizeable computational advantage compared to the real-time approach [34] while retaining its main benefits, i.e., it is non-perturbative in the electric field and offers flexibility for the inclusion of many-body effects. The remainder of the manuscript is structured as follows. First, we review the real-time approach [34] in detail (Section 2). In Section 3, we apply Floquet theory to the problem at hand and present a Floquet formulation of the electron-field coupling operator derived from the Berry-phase polarisation. In this section, we also give details on the computational implementation of our scheme. We tested our method with several materials and benchmarked it against the real-time approach in Section 4. Further discussion on the performance of our scheme and its limitations follows, before reaching the conclusions in Section 5.

Theoretical background
We consider the Hamiltonian of a crystalline solid coupled to a time-dependent electric field,Ĥ whereĤ 0 is the zero-field unperturbed Hamiltonian whileĤ E represents the perturbation. We denote the Bloch eigenstates of the cell-periodic unperturbed Hamiltonian, e −ik·rĤ 0 e ik·r , as ψ kn (r) = e ik·r µ kn (r). In what follows, the periodic part of these functions will be referred to as the zero-field time-zero states, |µ kn , and will be used as a starting point for time integration or as a basis.

Real-time approach
The real-time approach to nonlinear optics, as referred to within this manuscript, was set out by Attaccalite and Grüning [34], and follows the scheme presented by Souza et al. for the Berry-phase polarisation [38]. The central objects in this formalism are the time-dependent Bloch states, |v kn , which represent the periodic part of the states, ψ kn (r, t) = e ik·r v kn (r, t). The latter are obtained upon time-evolution (with an electric field) of the Bloch eigenstates ofĤ 0 k . An EOM is then formulated for these time-dependent states as, with the effective Hamiltonian, The unperturbed Hamiltonian in Eq. 3,Ĥ 0 , is a single-particle operator that varies according to the level of theory considered [53,54]. The perturbation,Ŵ (E), represents the coupling with the external field, E, and is defined as, In Eq. 4,ŵ(E) is the electron-field coupling operator in its Berry-phase formulation as outlined in Refs. [34] and [38], with σ = ±1 and k σ i = k + σ∆k i , i.e., the next k-point in the grid along the i Cartesian direction (the definition of next depends on the sign of σ). The projector operator in Eq. 5 has the form, where m runs over the occupied bands, M. The state |ṽ k σ i m is the so-called dual of the state |v km , namely, with the overlap matrix elements The real-time approach [34] consists on integrating the EOMs given by Eq. 2 starting from the corresponding zero-field time-zero states, |µ kn . This allows us to obtain the time-dependent states |v kn at every time step t i , with which we can update the overlaps [S kk σ i ] n,m (Eq. 8). Ultimately, we can use these overlaps to calculate the polarisation in its Berry-phase formulation, with the electron charge e, occupation factor f , unit cell volume v. Eq. 9 provides the dynamical polarisation in the direction α of the lattice vector a α . The corresponding reciprocal lattice vector b α is used to determine the number of k-points in a string along its direction, N kα , as well as the number of k-points in a plane perpendicular to b α , namely N k ⊥ α . Within the regime where the dynamical polarisation is time-periodic, it can be formulated as a Fourier series, where scalar magnitudes are used for simplicity. In addition, one can consider its expansion in orders of the electric field E [55], where the tensor nature of the susceptibilities χ (n) is omitted for brevity. Comparing Eqs. 10 and 11 finally allows us to extract susceptibilities to any order. The relation between the Fourier coefficients, p (n) , and the desired susceptibilities will depend on the order, n, and the shape of the electric field, which would typically be a sine function ( e iω 0 t −e −iω 0 t 2i ). As mentioned in the introduction, this framework has two main advantages. First, as a non-perturbative scheme, it allows for the simultaneous determination of susceptibilities to different orders in the electric field. This is also facilitated by having a Berry-phase derived electron-field coupling operator that remains valid to every order in the electric field. Second, the inclusion of many-body effects is as simple as adding terms to the effective Hamiltonian used in the EOM, Eq. 2 [53].
Despite its many virtues, the real-time approach often presents challenges regarding its elevated computational cost. This is its biggest disadvantage and originates from its time-propagation nature. Arguably, there is a particular case in which much of this cost is avoidable, i.e., computing nonlinear optical susceptibilities. In these calculations, the system is driven by a periodic perturbation and the response is sampled at a handful of times within one period, i.e., only one period worth of dynamical polarisation data is needed to extract susceptibilities. However, a considerably longer time is required to dephase the response before sampling it, in order to filter out all the eigenfrequencies that are excited when the electric field is first introduced. This amounts to a total simulated time that greatly exceeds the time window actually used to probe the response. This long simulated time combined with the expensive numerical integration of the EOMs (often with short time steps) render this kind of calculations particularly costly. It would then be desirable to devise a strategy where the dephasing is not needed, numerical timeevolution is avoided and/or the problem becomes time-independent altogether. We will see in Section 3 how Floquet theory offers a framework in which all of the above are possible.

Nonlinear optics via Floquet theory and Berry-phase polarisation
In this section, we apply Floquet theory to the EOM (Eq. 2) of the real-time approach [34]. We start by defining the so-called Floquet-Kohn-Sham (FKS) basis in Section 3.1. This allows us to turn Eq. 2 into a self-consistent time-independent eigenproblem (Section 3.2). In Section 3.3, we derive an expression for the electron-field coupling operator in FKS basis. We use its Berry-phase formulation [34,38], which makes our approach valid for extended systems and distinguishes it from previous Floquet works [45]. Finally, we describe the computational implementation of our method in Section 3.4.
Within this manuscript, we will choose the IPA to be defined at the density functional theory (DFT) level plus a static quasi-particle correction. With this assumption, the effective Hamiltonian in Eq. 3 takes the form, whereĤ KS represents the Kohn-Sham (KS) Hamiltonian andĤ IPA,0 is timeindependent. While the quasi-particle correction, ∆ QP , could be determined by a GW calculation, we only considered a rigid shift to the band structure. Since the unperturbed Hamiltonian,Ĥ 0 , is formulated at the DFT level, we can also refer to the zero-field states, |µ kn , simply as KS states.

Time independent Floquet-Kohn-Sham basis
Let us assume that the effective Hamiltonian in Eq. 2 is time-periodic with a period, T = 2π ω 0 , given by the frequency of the perturbing electric field, ω 0 . Invoking Floquet's theorem, one can assert this EOM will admit solutions in the form of the so-called Floquet basis functions, i.e., e −iξαt φ α (t), where ξ α is the so-called Floquet quasi-energy and the time-dependent Floquet states, φ α (t), retain the periodicity of the Hamiltonian, φ α (t) = φ α (t + T ). The general solution to Eq. 2 would then be a linear combination of said Floquet functions, Making use of the adiabatic approximation for weak fields [50], we can assume each time-zero KS state will evolve adiabatically into a single time-periodic Floquet state and retain only one term in the summation in Eq. 13, i.e., where we replaced the label α with the index of the state at k-point k and band n.
Projecting over the zero-field KS states, |µ kn , we get, where the index i runs over both occupied and empty bands. As the coefficients d kni (t) ≡ µ ki |φ kn retain the time periodicity of the Floquet states, d kni (t) = d kni (t+T ), they can be expanded in a Fourier series, where η will be referred to as the Floquet mode. Finally, with Eqs. 15 and 16, we arrive at the representation of the time-dependent Bloch states we will use in this work, where the coefficients d kni (η) depend on the band index i and the Floquet mode η. The states given by |kni; η ≡ e −iξ kn t e −iηω 0 t |µ ki form what we will refer to as FKS space. This extended Hilbert space includes L 2 [0, T ] plus the space spanned by KS eigenvectors, i.e., H. The inner product in L 2 [0, T ] ⊗ H is defined as ·|· ≡ T 0 dt ·|· , with ·|· the usual inner product in H. The dimension of FKS space must be truncated to N b × (2η max + 1) for any practical calculation. N b is the number of bands used for the expansion in Eq. 15 while η max is the maximum Floquet mode used in the expansion in Eq. 16. We note that, within this manuscript, we define η max in relation to the FKS states, i.e., η max implies the definition,

Quasi-energy eigenproblem
Choosing the Hamiltonian in Eq. 2 at the IPA level (Eq. 12) and expanding the timedependent Bloch states in FKS basis (Eq. 17), we arrive at the EOM, Acting the operatorsĤ 0 and −i∂ t we obtain, where E IPA are the KS energies shifted by the quasi-particle corrections and the time-dependence ofŴ k (E) is shown explicitly. We multiply to the left by dt e +iξ kn t e +iηω 0 t µ ki | and arrive at where W kij (η, γ) are the matrix elements ofŴ k (t) in FKS space. It is also worth noting that the operator Ĥ 0 k − i∂ t is diagonal in FKS space, with matrix elements given by E IPA kj − ξ kn − γω 0 . Now, Eq. 21 can be rearranged as an eigenvalue problem for the Floquet quasi-energies, We define the operator on the LHS of Eq. 22 as the quasi-energy operatorK k . Its matrix elements in FKS space are, and the eigenvalue problem in Eq. 22 reduces to the shorthand notation, Formally, the matrix elements W kij (η, γ) can be obtained by expressing the time-periodic electron-field coupling operator in Floquet space, i.e.,Ŵ k (t) = +∞ ζ=−∞ e −iω 0 ζtW k (ζ), and taking the inner product, Replacing the time integral with a delta function, we arrive at, An expression for W kij (η, γ) will be obtained in Section 3.3. We can anticipate from Eq. 26 that W kij (η, γ) will couple different Floquet modes in the eigenproblem of Eq. 22.

Electron-field coupling operatorŴ (E)
In order to obtain the matrix elements W kij (η, γ) (Eq. 26), we consider a sinusoidal electric field of amplitude E 0 and frequency ω 0 , and re-write Eq. 5 as, where σ 2 = ±1. The summations in Eq. 27 add up to twelve equivalent terms. In what follows, we will work with just one of them for simplicity. Choosing the positive exponential (σ 2 = +1) and replacing k σ i by k + , we define, We start by actingP + kk + on a time-dependent Bloch state, where we have expanded the state |v kn in FKS basis as shown in Eq. 17. Multiplying the RHS of Eq. 29 to the left by dt e +iξ kn t e +iηω 0 t µ ki |, we arrive at In Eq. 30, we have extracted an expression for the matrix elements of the projector, Using Eqs. 6 and 28, we find The dual |ṽ k + m can be expressed via the overlaps matrix as in Eq. 7, Eq. 33 contains the inverse of the matrix S kk + , i.e., the time-dependent overlap already defined in Eq. 8. Transforming Eq. 8 to FKS basis (Eq. 17), we arrive at, Defining the time-zero zero-field overlaps as [S 0 kk + ] j 2 ,i 2 = µ kj 2 |µ k + i 2 and making the replacement η ′ 2 = η 2 − ζ 2 , we get to, We now define the terms in the parentheses as [ S kk + ] m,m 1 (η ′ 2 ) and obtain, which implies the need for a self-consistent solution to the eigenproblem in Eq. 22, given that the matrix elements W kij (η, γ) will depend on the solutions { d kni (η)} k,n,i,η through the overlap matrices [S kk + ] m,m 1 . According to Eq. 33, we need the inverse of the matrix S kk + . How this inversion is performed will be discussed in Section 3.4. For now, we will assume we can find an expression for [S −1 kk + ] m 1 ,m expanded in Floquet modes as, where the signs of the quasi-energy exponentials have been inverted with respect to Eq. 36. Inserting Eq. 37 into Eq. 33, we arrive at, Furthermore, we now use Eq. 17 to replace the time-dependent Bloch states in Eq. 38 as, and Inserting Eqs. 39 and 40 into Eq. 38, the quasi-energy exponentials cancel each other out and we arrive at where µ ki 3 |µ kj = δ i 3 ,j eliminates the summation over i 3 and a new zero-field overlap is formed, namely [S 0 kk + ] i,i 1 = µ ki |µ k + i 1 . Grouping all the Floquet mode summations and exponentials together, we obtain where the first parenthesis encapsulates the time dependence. This term results in the condition (γ − η − 1 + η ′ 2 + η 1 − ζ 1 ) = 0. Choosing to replace ζ 1 and thus eliminate this summation via δ ζ 1 ,γ−η−1+η ′ 2 +η 1 , we finally arrive at The expressions for the eleven remaining instances ofP σ 2 kk σ i can be derived by analogy to Eq. 43. Once all these projectors are computed, we can go back to Eqs. 27, 28 and 4 to finally obtain the matrix elements of the electron-field coupling operator W kij (η, γ) in FKS space.
To summarise, we have laid out all the steps needed to reformulate the timedependent real-time EOM (Eq. 2) into a time-independent eigenproblem (Eq. 22) in FKS basis (Eq. 17). Crucially, our scheme is valid for extended systems since we use the Berry-phase derived electron-field coupling operator. The latter depends on the solutions { d kni (η)} k,n,i,η and thus the eigenproblem in Eq. 22 must be solved self-consistently. As our Floquet reformulation is time-independent, it does not require dephasing or expensive numerical time-integrations, which will alleviate the computational burden. Nonetheless, we retain the main advantage of the real-time approach, i.e., the scheme remains non-perturbative in the electric field, allowing for the simultaneous calculation of susceptibilities to different orders in the electric field.

Computational implementation
The Floquet scheme presented here involves, for each frequency ω 0 , a self-consistency cycle where the eigenvectors calculated in one iteration (solving Eq. 22) are fed to the next one (Eqs. 35 and 43) until convergence is reached. The condition for convergence is based on the absolute error in the real and imaginary parts of the susceptibility to every order requested by the user. As the real-time approach [34], this Floquet implementation offers parallelisation in frequencies and k-points. Moreover, it works both in the nonmagnetic (spin unpolarised) and magnetic non-collinear (spinorial) formulations. One of the challenges met during this implementation concerns the inversion of the overlap matrix S kk + , which is required to obtain the coefficients [ D kk + ] m 1 ,m (η ′ 2 ) as defined in Eq. 37. To this end, two strategies were implemented and compared. The first one would entail avoiding the time domain entirely and remaining in Floquet space, i.e., one could obtain the coefficients [ D kk + ] m 1 ,m (η ′ 2 ) (see Eq. 37) directly from the coefficients [ S kk + ] m 1 ,m (η ′ 2 ) (see Eq. 36). This is indeed possible for scalar functions [56] and was extended to matrices. The alternative option is to trivially go to the time domain evaluating Eq. 36 for several sample times t i , invert the matrices [S kk + ] m 1 ,m (t i ) numerically at each t i , and Fourier transform the resulting [S −1 kk + ] m 1 ,m (t i ) back to Floquet space, i.e., solve Eq. 37 for [ D kk + ] m 1 ,m (η ′ 2 ). This numerical inversion in time domain resulted both more robust and less time-consuming than the approach based on Duffin's theorems [56]. The calculation of the polarisation via Eq. 9 proved to be another obstacle in our implementation. As before, the problem originates from the fact that what is available to us are the Floquet coefficients of each overlap matrix, [ S kk + ] m 1 ,m (η ′ 2 ), rather than the matrix itself, [S kk + ] m 1 ,m . Inserting Eq. 36 into Eq. 9, we see that we would need to calculate the logarithm of a sum, which should be linearised by expanding it into a logarithmic series if the η ′ 2 = 0 term dominates. While this does indeed lead to a manageable set of equations to solve, it would turn our scheme into a perturbative one, thus losing one of the great advantages of the real-time approach. In fact, the issue of what order in this perturbative expansion corresponds to which order of the response in the electric field does not seem to be a trivial one. As before, the alternative is to switch to the time domain, i.e., evaluate Eq. 36 for a handful of sample times t i , calculate the polarisation at each t i with Eq. 9 and proceed with the usual steps in Eq. 11 to extract the required susceptibilities.
This resembles the usual choice one has in systems with translational invariance of going back and forth from real to reciprocal space to calculate whatever operator is simpler in either basis. By analogy, we can switch to time domain to perform a series of operations and then Fourier transform back to Floquet space. It is key to understand that transforming to the time domain does not necessarily imply the long simulated times and short time steps (i.e., tens of thousands of sample times) characteristic of the real-time approach to nonlinear optics. Rather, the assumed time periodicity means that one just needs to calculate the observables across one time period only. Moreover, the number of time steps required within that period is very limited as it corresponds to the total number of Floquet modes one needs for [ D kk + ] m 1 ,m (η ′ 2 ), which happens to be (2 × 2η max + 1), i.e., η S max = 2η max . Finally, we introduced dissipation effects as a phenomenological damping term in the diagonal of the quasi-energy operator, where ν d is a positive real number that provides the broadening to the spectra. This damping term deals with the rich structure of avoided crossings characteristic of Floquet quasi-energy operators, effectively removing the singularities that appear at resonant frequencies. The term (1 − δ γ,0 ) ensures that processes to all orders are damped to the same extent. We also introduced a small imaginary contribution to some KS eigenenergies (typically ν d ×10 −4 ) in order to avoid singularities arising from degeneracies or crossings in the KS band structure. This applies to those eigenvalues that differ in less than, e.g., 1 × 10 −7 Ha, and is only added at the 0 th Floquet mode. Due to the introduction of these imaginary contributions, the quasi-energy matrix is no longer Hermitian. We used a diagonalisation routine suitable for non-Hermitian matrices and  Fig. 1 verified that the imaginary part of the Floquet quasi-energies remained negligible.

Results and Discussion
The Floquet approach to nonlinear optics presented in this manuscript has been implemented into the Yambo code [57,58] and tested with a number of well-known materials. The latter have been studied before from an ab-initio perspective [24,59,60] and within the real-time formalism in particular [34,53], which makes them ideal for validating and benchmarking our method. To this end, a systematic comparison between the real-time [34] and Floquet approaches has been conducted, where the realtime calculations were also performed using the Yambo code [57,58]. The starting KS wavefunctions and energies were computed with Quantum Espresso [61].

Second Harmonic Generation
In this section, we report selected SHG spectra for bulk AlAs, monolayer h-BN and monolayer MoS 2 calculated both via the real-time and Floquet approaches (see computational details in Table 1). Fig. 1 presents SHG spectra for bulk AlAs calculated by both methods. The agreement between the two spectra is almost complete, despite the small broadening deliberately used to highlight differences. The small discrepancies towards the 4-6 eV region are due to the choice of time-step in the real-time approach (see discussion below) and vanish with a shorter time step (see inset in Fig. 1 and details in Table 1). Fig. 2 shows the spectra for h-BN while Fig. 3 presents data for MoS 2 . It is apparent that the results produced by either method are indistinguishable from one another on this scale. This close matching between our Floquet method and the realtime approach was found across a variety of k-point grids and broadening conditions for all three materials, extending also to the linear response regime (the full set of SHG and linear response results is provided in the Supplemental Material). Both real-time and Floquet calculations were carefully converged with respect to the relevant parameters in each case (see convergence tests in Supplemental Material). The Floquet approach requires convergence with respect to the number of Floquet modes included in each calculation, i.e., η max as defined in Eq. 18. Our tests indicate that convergence with respect to η max is very fast for SHG spectra, e.g., η max = 2 is enough to compute a well-converged SHG spectrum, even at higher intensities where higherorder contributions should play a greater role (see convergence tests in Supplemental Material). In addition, our Floquet method requires an accuracy threshold to control the self-consistency loop. This threshold should be selected according to the magnitude of the response and will impact the computational cost of our approach through the number of iterations required. For SHG, we chose an accuracy threshold of 2.44 × 10 −5 pm/V, which must be satisfied individually by both the real and imaginary parts of the second order susceptibility. This resulted in off-resonant frequencies converging within three iterations, while energies close to resonance took four or five iterations.
Convergence in the case of real-time calculations must be studied with respect to two crucial parameters, the total simulated time and the time step used in the numerical integration of the EOMs. Both are system-dependent and must be subjected to convergence tests for every material. Regarding the former parameter, failure to allow for sufficient simulated time would result the response not being properly dephased. In this case, some eigenmodes of the system will still be excited since the introduction of the electric field at time zero, manifesting as oscillations in the real-time spectrum. For instance, the well converged spectrum in Fig. 3 was obtained with 85 fs of total time. Fig. 4(a) shows the same spectra of Fig. 3 alongside an underconverged spectrum calculated with a total simulated time of 65 fs, which is evidently not enough to suppress these oscillations (see blue curve in Fig. 4(a)). In terms of the time step, this needs to be sufficiently small since longer steps, albeit more efficient, will introduce unphysical features in the spectra. For instance, the well-converged h-BN spectrum in Fig. 2 was obtained with a time step of 2.5 as. A portion of this spectrum was re-calculated with a low broadening (0.04 eV) to highlight these unphysical features and is shown in Fig. 4(b). While the well-converged 2.5-as spectrum matches Floquet closely, the real-time spectrum integrated with a 10 as step (thus, four times faster) is severely underconverged, as is apparent from the blue curve in Fig. 4(b). Overall, real-time spectra tend to the Floquet solution upon progressively increasing the total simulated time and decreasing the time step, as inferred from Fig. 4. Summarising, the convergence with numerical parameters impacts the execution time of the real-time approach to a larger extent with respect to the present formalism, which is not based on explicit time integration. The computational advantage of the present approach over the real-time one is demonstrated in Section 4.3).

Third Harmonic Generation
In this section, we report selected third harmonic generation (THG) spectra of bulk Si calculated both via the real-time and Floquet approaches. The full set of THG results can be found in the Supplemental Material. Fig. 5 shows very good agreement between the spectra calculated by either method. Convergence of the Floquet approach was also fast in this case, requiring only η max = 3 for THG spectra. However, additional Floquet modes may be required at higher intensities in case one wants to capture higher-order contributions to the third order response, which are present in the real-time result. These contributions depend on the intensity of the electric field and thus gain relevance at high intensities. In order to demonstrate this, we re-calculated a portion of these spectra with a perturbation of higher intensity and compared the results in Fig. 6. The latter shows that η max = 3 suffices to reproduce the low-intensity real-time result while additional Floquet modes (η max = 5) are required at higher intensities to capture these higher-order contributions.

Computational cost
We compared the computational cost of the real-time and Floquet approaches across the entire data set of SHG calculations. Our results account for a so-called Floquet speedup of 1-2 orders of magnitude (see Fig. 7). This speed-up is calculated as the ratio of the CPU time required by either approach to perform the exact same calculation (see Supplemental Material for CPU time of each approach individually). Controlling the accuracy and convergence of the spectra played an important role in this comparison. As regards Floquet, we used η max = 2, which is well converged, and a uniform self-consistent accuracy threshold for all calculations. We then verified the latter was smaller than 0.1% of the real-time result we intended to reproduce. We believe this is in line with what a regular user would do, however we note that there would have been potential for greater speed-ups had this threshold been fine-tuned in every calculation. As shown in Section 4.1, the systematic way of converging real-time calculations implies increasing the total simulated time and decreasing the time step, i.e., two choices that increase the associated computational cost. While this was carefully tested for each material, we avoided overconverging these parameters as it would have unduly penalised the realtime approach (see Supplemental Material for computational details, convergence tests and all spectra in the data set). The number of Floquet modes is indicated in the legend as a subscript, e.g., FL 5 corresponds to η max = 5. Computational details in Fig. 5 apply, with two exceptions. A time step of 1 as was used to rule out underconvergence in the real-time spectrum as a reason for the discrepancies. The total simulated times were set in excess of those required by the convergence tests for the same reason.
The results in Fig. 7 show the influence of the real-time convergence parameters, e.g., the speed-up is higher in h-BN as it required the longest simulated times and shortest time steps (2.5 as) to closely match the Floquet spectra. MoS 2 shows an intermediate speed-up since it was well converged with a 10-as time step but also needed long simulated times. Finally, AlAs was calculated with a time step of 10 as, which is well-converged in the region of interest despite the small discrepancies at 4-6 eV (see Fig.  1). While reducing the latter with a 2.5 as time step (see Fig. 1) would have quadrupled the Floquet speed-up, it would have unduly penalised the real-time approach in our view.
One parameter that is present in both methods and plays an important role in the speed-up achieved is the broadening of the spectra. In the real-time approach, the broadening is introduced through the dephasing term and impacts its ability to filter out excited eigenfrequencies. In general, a small broadening will require a long time to properly dephase the system and converge the spectrum. Hence, the broadening has an inverse impact in the computational cost of the real-time method through the total simulated time needed. At variance, the computational cost of our Floquet approach is almost insensitive to the broadening, which is introduced via the damping term in Eq. 44. It would be expected that a smaller broadening could make convergence more difficult at some particular (resonant) frequencies. However, while it is true that the CPU time required by Floquet scales linearly with the average number of self-consistent iterations per frequency, adding one or two cycles at a handful of frequencies did not impact the total CPU time significantly. As a result, the speed-up achieved with Floquet is much larger in small-broadening calculations. In fact, Fig. 7 shows two groups of points per material. Within the data of a given material, the uppermost points correspond to small-broadening results (0.04 eV) while the lowermost ones reflect the large-broadening calculations (0.15 eV). We also performed comparisons on bulk Si THG spectra, which were calculated for just one broadening (0.15 eV) but two intensities, i.e., 1 × 10 3 and 1 × 10 7 kW/cm 2 (referred to as low and high intensity, respectively). In line with Section 4.2, we report high-intensity calculations with η max = 5 and low-intensity runs with η max = 3. This results in larger Floquet speed-ups at low intensities, as shown in Fig. 8. Since highintensity spectra require η max = 5, there is an increased computational cost related to the additional Floquet modes in comparison with low-intensity calculations, for which η max = 3 is well converged (see Supplemental Material for convergence tests). Nonetheless, the computational advantage offered by Floquet becomes clear close to convergence with respect to k-point sampling (see Fig. 8), regardless of the intensity. We point out that, with a broadening of 0.04 eV, the low-intensity speed-up for an 8×8×8 k-grid was 15.1 (not shown in Fig. 8). This allows us to reliably estimate a speed-up of around 146 for a 32×32×32 k-grid. Hence, at low intensities where η max = 3 is well converged, the speed-ups obtained for bulk Si THG spectra are comparable to those achieved for bulk AlAs SHG spectra. Low I -FL 3 Figure 8. Computational cost comparison for bulk Si THG in the form of a CPU-time ratio between a real-time and a Floquet calculation producing the same spectra. In the labels, low and high I stand for intensities of 1 × 10 3 and 1 × 10 7 kW/cm 2 , respectively. The number of Floquet modes used is denoted by a subscript in the legend, e.g., FL 5 means η max = 5.
The drop in Floquet speed-ups at high intensities, i.e., when including higher harmonics to capture higher-order processes, points to a poor scaling of our method with respect to the number of Floquet modes. This is because each additional mode enlarges the dimension of the quasi-energy matrices by 2N b , which in turns impacts the time required for their diagonalisation (see near-quadratic scaling in Supplemental Material). At the other end of the scale, linear response calculations are much more efficient with the present Floquet formalism than with the real-time approach. This is because the Floquet matrices are very small (of dimension 3N b ) and convergence very fast (typically 2 self-consistent iterations are sufficient for linear response). However, first-order Floquet is still more expensive than the usual frequency-domain responsebased approach and the latter remains the best option in the linear response regime, at least at the independent particle level.
The main contribution to the computational cost in our scheme is the diagonalisation of the quasi-energy matrices. For this particular task, we use the QR algorithm (i.e., 'full' diagonalisation). There is room for improvement in this diagonalisation since the matrices are of dimension N b × (2η max + 1) but only M eigenvectors are needed (i.e., the number of occupied bands, which is a fraction of N b ). In the cases considered here, M represents between 5 and 10% of the dimension of the corresponding matrices. This opens the possibility of exploring more efficient eigensolvers (e.g., those in the SLEPc library [62]) such as Krylov subspace methods [63] or even variational approaches [64,65], which would further improve the performance of our Floquet method. Moreover, this would reduce the scaling of the computational cost with the number of Floquet modes.

Limitations
The limitations of the Floquet approach proposed in this manuscript are mainly related to the requirement of time-periodicity in the Hamiltonian. First and foremost, this framework can only apply to continuous and monochromatic perturbations. Otherwise, the basic conditions for the application of Floquet's theorem would not be present. For instance, modelling pump-and-probe experiments is beyond what one can do with the present formalism, and falls within the broader capabilities of the real-time approach. Second, we use the adiabatic approximation in order to ensure the periodicity of the effective Hamiltonian of our method, since the Berry-phase electron-field coupling operator depends self-consistently on the solution of the quasi-energy eigenproblem. While this limits the validity of the approach to weak field intensities, we performed tests up to 1 × 10 7 kW/cm 2 and found no signatures of non-adiabaticity. However, care should be taken when using our approach in this regime and beyond. In particular, we believe our method is not well-suited for the extreme nonlinear regime, where nonadiabaticity is expected to play an important role.

Conclusions
In this work, we developed and implemented a Floquet approach to nonlinear optics for extended systems. This constitutes a reformulation of the real-time formalism [34] based on the Berry-phase dynamical polarisation [38] and thus valid under PBCs. Exploiting the time-periodicity of the Hamiltonian in the presence of a monochromatic perturbation, we invoked Floquet's theorem to reformulate this time-dependent problem as a self-consistent time-independent eigenvalue problem. Our method applies to periodically-driven systems and is valid for weak electric fields, since we make use of the adiabatic approximation.
This Floquet formulation retains the non-perturbative nature of the real-time approach, allowing for the simultaneous extraction of susceptibilities to different orders in the electric field and for treating several nonlinear phenomena within the same formalism. Also, many-body effects can be included at different levels of approximation in our scheme by changing the effective one-particle Hamiltonian, e.g., excitonic effects can be included using a screened-exchange Coulomb-hole self energy or within densitypolarisation functional theory, as it is done within the real-time approach [53,54]. On the other hand, our reformulation tackles the often prohibitive computational cost associated with real-time calculations, which originates from the expensive numerical integration of the EOMs (often requiring very short time steps). This cumbersome time propagation is entirely avoided in our time-independent formalism.
We demonstrated the validity and effectiveness of the proposed Floquet scheme by implementing it at the independent particle level and testing it extensively on a number of well-studied materials. We calculated optical absorption and SHG spectra of bulk AlAs, monolayer h-BN and monolayer MoS 2 , plus THG spectra of bulk Si. In all cases, the Floquet method produced spectra in agreement with the implementation of the real-time approach in Ref. [34,58]. The proposed scheme showed a consistent computational advantage in comparison to the real-time formalism, resulting up to two orders of magnitude faster. A further computational speed-up could be achieved by employing a more efficient eigensolver for the quasi-energy eigenproblem, as those available from the SLEPc library [62]. In light of these results, our contribution holds the promise to enable the ab-initio calculation of nonlinear optical properties for a range of complex materials that are too demanding for currently available methods.

Author Contributions
MG put forward the idea for the approach, developed it and assisted with its implementation into Yambo. MG also contributed to the analysis of the results and reviewed the manuscript. IMA contributed to the development of the approach and implemented it into Yambo. IMA also carried out the calculations, analysed the results and wrote the manuscript.