Grassmann time-evolving matrix product operators for equilibrium quantum impurity problems

Tensor-network-based methods are promising candidates to solve quantum impurity problems (QIP). They are free of sampling noises and the sign problem compared to state-of-the-art continuous-time quantum Monte Carlo methods. Recent progress made in tensor-network-based impurity solvers is to use the Feynman–Vernon influence functional to integrate out the bath analytically, retaining only the impurity dynamics and representing it compactly as a matrix product state. The recently proposed Grassmann time-evolving matrix product operator (GTEMPO) method is one of the representative methods in this direction. In this work, we systematically study the performance of GTEMPO in solving equilibrium QIPs at a finite temperature with a semicircular spectrum density of the bath. Our results show that its computational cost would generally increase as the temperature goes down and scale exponentially with the number of orbitals. In particular, the single-orbital Anderson impurity model can be efficiently solved with this method, for two orbitals we estimate that one could possibly reach inverse temperature β ≈ 20 if high-performance computing techniques are utilized, while beyond that only very high-temperature regimes can be reached in the current formalism. Our work paves the way to apply GTEMPO as an imaginary-time impurity solver.


Introduction
Quantum impurity problems (QIPs) are prototypical models for studying open quantum systems and strongly correlated phenomena [1,2].They are also the central ingredients for quantum embedding theories such as the dynamical mean field theory [3].However, the strongly correlated nature of the problem and the exponential growth of the degrees of freedom pose great challenges for their accurate numerical solutions.
In this work, we focus on the equilibrium scenario where the impurity and the bath are in thermal equilibrium with a finite temperature and aim to calculate the Matsubara Green's function.For such problems, the current method of choice is the class of continuous-time quantum Monte Carlo (CTQMC) methods, due to their efficiency and accuracy [4][5][6][7][8][9][10][11][12][13].These methods are free of bath discretization errors as the baths are integrated out exactly.Moreover, they are also free of time discretization errors and can cope with up to five orbitals.The possible drawbacks of these methods are that sampling noises are unavoidable as a general feature of Monte Carlo methods, and they could suffer from the sign problem [14].In addition, since they mostly work in the imaginary-time axis, one needs to perform analytical continuation to obtain the spectral function, which is a numerically ill-posed operation [15].
Tensor-network-based methods are a class of methods that can in principle solve these drawbacks of CTQMC methods and also be scalable to multi-orbital QIPs.Various tensor-network-based impurity solvers have been proposed to date.For example, methods by representing the impurity-bath wave function as a matrix product state (MPS) have been applied to study equilibrium QIPs in both the real-time [16][17][18] and imaginary-time axis [19].These methods usually focus on the zero temperature scenario due to the difficulty of representing a finite-temperature bath state as an MPS.We note the recent works that overcome this difficulty using the thermofield transformations [20,21] to avoid explicit preparation of thermal bath states [22,23], but are limited to a single orbital.The numerical renormalization group method [24] is able to deal with finite-temperature baths and has been demonstrated up to three orbitals [25][26][27][28], which, however, mostly focuses on the low-energy spectrum and could be less accurate in the high-energy part.Generally, these tensor-network-based methods still cannot match CTQMC methods in terms of efficiency, accuracy or flexibility.This is mainly because they need to explicitly deal with the bath, leading to bath discretization errors as well as higher numerical costs.
In the last five years, important progress has been made in tensor-network-based impurity solvers by using the Feynman-Vernon influence functional (IF) [29] to integrate out the bath analytically, which avoids explicitly treatment on the bath.This idea has been first implemented in bosonic systems under the name of the time-evolving matrix product operators (TEMPO) [30], which has achieved unprecedented accuracy and efficiency in studying the non-equilibrium dynamics for bosonic impurity problems [31][32][33][34][35][36], and then in fermionic systems [37][38][39][40] (the latter methods will be referred to as the tensor network IF methods in the following).In our previous work, we propose a Grassmann time-evolving matrix product operator method which is a full fermionic analog of the TEMPO method [41].Compared to the tensor network IF methods, GTEMPO has several advantages originating from TEMPO: 1) it is universal for both real-and imaginary-time (or even mixed-time) calculations and is transparent to implement as it only depends on the Grassmann path integral (PI).In comparison, the tensor network IF method requires further additional steps to convert the Grassmann PI back into fermionic expectation value calculations; 2) it efficiently constructs the IF as a Grassmann MPS (GMPS), referred to as the MPS-IF, using a series of GMPS multiplications.Meanwhile, the tensor network IF method uses additional transformations to build the IF as an MPS which also introduces new hyperparameters into the algorithm.For non-equilibrium dynamics of the single-orbital Anderson impurity model (AIM), it has been demonstrated that GTEMPO can reach higher precision with lower numerical cost compared to the tensor network IF methods [41].
Although the GTEMPO method is universal for very broad types of QIPs, the current study is limited to the non-equilibrium scenario [41].In this work, we systematically study the efficiency of the GTEMPO method for equilibrium QIPs at finite temperatures, focusing on a semicircular spectrum of the bath which is commonly used in literature.The article is organized as follows.In Sec. 2, we describe the method in detail, as well as techniques specific to the finite-temperature scenario to accelerate its performance.In Sec. 3, we show the accuracy of the method for the analytically solvable case of non-interacting fermions and numerically study the growth of the bond dimension of the MPS-IF.It allows us to accurately estimate the computational costs of multi-orbital AIMs.In Sec. 4, we study the single-orbital AIM for different interaction strengths and compare our results to CTQMC calculations.In Sec. 5, we present high-temperature calculations for higher-orbital AIMs which show the flexibility as well as the current limitation of the method.

GTEMPO for equilibrium QIPs
In this section, we describe the major steps of the GTEMPO method when working in the imaginary-time axis to solve equilibrium QIPs at finite temperatures.In addition, we present techniques specific to such a scenario to accelerate the numerical performance.

Description of the Hamiltonian
The Hamiltonian for a general QIP can be written as where Ĥimp denotes the impurity Hamiltonian, Ĥbath denotes the bath Hamiltonian, and Ĥhyb denotes the hybridization Hamiltonian between the impurity and the bath.The impurity Hamiltonian Ĥimp can be generally written in the form Ĥimp = p,q t p,q â † p âq + p,q,r,s where p, q, r, s are fermion flavors that contain both the spin and orbital indices.The first term on the righthand side of Eq.( 2) is the tunneling term and the second is the interaction term.Ĥbath describes flavors of non-interacting fermions, which can be written as where k labels the momentum, and ε p,k is the corresponding energy.The number of flavors in Ĥbath is exactly the same as that in Ĥimp .In this work, we consider Ĥhyb in the form of linear coupling between the impurity and the bath where V p,k is the hybridization strength.Throughout this work we assume Ĥbath and Ĥhyb to be independent of the flavors, namely ε p,k = ε k and V p,k = V k , for simplicity of notations, but we note that GTEMPO can be straightforwardly applied to more general cases with no additional efforts (as long as Ĥhyb takes the form of linear coupling).

Grassmann path integral on the imaginary-time axis
For equilibrium QIPs at inverse temperature β, the impurity partition function Z imp (β) = Tr e −β Ĥ / Tr e −β Ĥbath (we assume that the bath has zero chemical potential throughout this work) can be written as a PI over imaginary-time Grassmann trajectories [42,43]: where āp = {ā p (τ )} and a p = {a p (τ )} are Grassmann trajectories for flavor p over the continuous imaginarytime interval [0, β], ā = {ā p , āq , • • • } and a = {a p , a q , • • • } are Grassmann trajectories for all flavors.The measure is K[ā, a] encodes the bare impurity dynamics which only depends on Ĥimp .I[ā p , a p ] is the IF for flavor p, which can be written as The bath effect is fully characterized by the hybridization function ∆(τ, τ ′ ), which can be computed as Here ) is the free bath Matsubara Green's function: where n(ε) = (e βε + 1) −1 is the Fermi-Dirac distribution and Θ is the Heaviside step function.Throughout this work we will consider a semicircular spectrum (this type of spectrum was also considered in, e.g., Refs.[22,39,44,45]) and set the half bandwidth D = 1 as the unit.
To numerically evaluate the PI in Eq.( 5), we split β = (M − 1)δτ and then the continuous trajectories {ā p (τ )}, {a p (τ )} are discretized as āp = {ā p,M , . . ., āp,1 } and a p = {a p,M , . . ., a p,1 } accordingly (we still use the same notation as the continuous case to reduce the number of notations).In addition, we denote ā,j = {ā p,j , āq,j , . . .} and a ,j = {a p,j , a q,j , . . .} as the Grassmann variables (GVs) for all the flavors at discrete time step j.Under these notations, the double integral in the IF in Eq.( 7) can be discretized via the QuaPI method [46][47][48] as where ∆ ij is the hybridization matrix which can be calculated as We note that as compared to the non-equilibrium case where there are two branches in the real-time axis (forward and backward branches) [41], there is only one branch in the imaginary-time axis in the equilibrium case.As a result, the total number of GVs is reduced by half for the same M , and there is only one hybridization matrix ∆ ij after discretization.The discretized bare impurity dynamics K[ā, a] can be evaluated as where we have used Ûimp = e −δτ Ĥimp .Here the boundary condition, which corresponds to the final trace when evaluating Z imp (β) in Eq.( 5), is handled by GVs ā,1 and a ,1 .However in this situation, the GVs ā,1 , a ,1 do not belong to the same time step: ā,1 live at τ = β while a ,1 live at τ = 0, which is inconvenient for QuaPI.Thus we introduce extra GVs ā,0 and a ,0 to solely take care of the boundary condition, and equivalently write K[ā, a] as Now the first and last terms on the right-hand side of Eq.( 14) are responsible for the boundary condition, and each term in the middle is a propagator from time step j − 1 to j, defined as G j,j−1 := ⟨a ,j | Ûimp |a ,j−1 ⟩.Generally, a first-order expression of G j,j−1 is j,j−1 := e ā,j a,j−1−δτ Himp(ā,j ,a,j−1) , where the term H imp (ā ,j , a ,j−1 ) in the exponent is simply obtained by replacing in Eq.( 2) with â † p → āp,j and âp → a p,j−1 .However, a first-order expression of the propagator would often result in very large numerical errors (See Fig. 9 for example).For simple models such as the single-orbital AIM, one could easily obtain the exact analytical expression of the propagator.In general cases, calculating the exact analytical expression for G j,j−1 is tedious and not scalable.Nevertheless, we propose the following trick which can calculate numerically accurate propagators to arbitrary precision with very little numerical cost.The idea is to break δτ into m smaller time steps with δτ ′ = δτ /m.Correspondingly, one inserts m − 1 pairs of GVs between a ,j−1 and a ,j , denoted as a ,j,l and ā,j,l with 1 ≤ l ≤ m − 1 .Then one can calculate a more precise G j,j−1 by starting from a refined discretization with step size δτ ′ and then integrating out the intermediate GVs, that is, where the propagator Û ′ imp = e −δτ ′ Ĥimp in the integrand can be approximated by their first-order expressions as in Eq. (15).With this trick, the numerical error in calculating G j,j−1 decreases from O(δτ ) in Eq.( 15) to O(δτ ′ ) in Eq.( 16).For time-independent Ĥimp , the propagator is time-translationally invariant.Therefore Eq.( 16) only needs to be evaluated once and then can be used for all time steps.As a result, one can set δτ ′ to be very small and obtain K with very high precision and little numerical effort.In all our calculations, K can be built as a GMPS within a few seconds.
(a) K : K :  17), the purple solid lines represent the actions of partial IFs [41], and the red dashed lines represent the boundary actions which are responsible for the final trace in the impurity partition function.The free impurity dynamics part in K are not shown since they are compatible with IF, namely they only act in an intra-flavor manner.

Ordering of Grassmann variables
The ordering of the GVs is a crucial factor that affects the performance of GTEMPO because the entanglement in an MPS can vary significantly depending on the ordering of the local basis (GVs in our case).The first principle of designing an appropriate ordering is that the GVs should be next to (or at least close to) their conjugates.This is because one needs to integrate out all the GVs to get a scalar output at the end of the calculation, and two GVs can only be conveniently integrated out as a pair if they are conjugate of and next to each other.Besides that, there are two very different strategies to order the GVs, namely the time-local ordering and the flavor-local ordering.For time-local ordering, the GVs at the same imaginary-time step are located in nearby positions.For flavor-local ordering, the GVs from the same flavor are located in nearby positions.The two strategies as well as the actions of the interacting dynamics and the IF are shown in Fig. 1 for the specific case of two flavors, corresponding to the single-orbital AIM.The discretized interacting dynamics can be decomposed into M four-body terms, exemplified as Therefore each four-body term can be written as the addition of two separable GMPSs (the unit is the vacuum state of the Grassmann space, which is itself a separable GMPS), which is thus a GMPS with bond dimension 2. From Fig. 1(a), we can see that in the time-local ordering the four-body terms (the brown solid lines) at different times overlap at most twice.Therefore the bond dimension of K will be exactly 4 for this ordering.However, the action of IF on the two flavors completely overlaps with each other.If we assume that acting the IF on each flavor will result in a GMPS with bond dimension χ, then the bond dimension of I in Fig. 1(a) will be χ 2 .In comparison, for the flavor-local ordering in Fig. 1(b), the actions of the IFs on the two flavors do not overlap, therefore the bond dimension of I will still be χ.In the meantime, the four-body terms of the interacting dynamics completely overlap with each other, and the bond dimension of the GMPS for K will scale as 2 M .These arguments can be generalized to more flavors.In general for n flavors, in the time-local ordering, if one builds the IF as a single GMPS, its bond dimension will scale as χ n , while the bond dimension of K is a constant; in the flavor-local ordering, the IF can be built as a GMPS with bond dimension χ, but the bond dimension of K scales as 2 M .Fig. 1 thus reveals the origin of the hardness of GTEMPO: the interacting dynamics (more generally, the inter-flavor couplings) and the IF favors completely different ordering of GVs.One may think that timelocal ordering is preferred for strong interaction and flavor-local ordering is preferred for weak interaction.However, in practice, we observe that the flavor-local ordering could only be advantageous if the interaction strength is very small compared to the rest scales (such that one can truncate the bond dimension of K without significant loss of accuracy), and since one is usually interested in the long-time dynamics with large M (the low-temperature regime), the time-local ordering is almost always preferable.Therefore throughout this work, we will stick to the time-local ordering.
For multiple flavors, building the IF for different flavors as a single GMPS would itself be very expensive, due to the exponential scaling of the bond dimension with the number of flavors in the time-local ordering.
A workaround is that one can build one MPS-IF per flavor independently, and then use the zipup algorithm proposed in Ref. [41] to compute expectation values in the end.In the latter approach, one needs to build n MPS-IFs in total, each with bond dimension χ, which is usually very cheap.However, the computational cost of computing expectation values still grows exponentially, but with a smaller exponent as will be discussed in detail in Sec.2.4.
Last, we mention that the red dashed lines in Fig. 1, which correspond to the boundary terms in Eq.( 14) and are responsible for the final trace, will also increase the bond dimension of K.For two flavors there are 4 such terms in total.In the time-local ordering, two of them are short-range and the rest two are long-range.This is in comparison with the PI in the real-time axis where no such boundary terms exist.Each long-range term will increase the bond dimension of K by a factor of 2. For multi-orbital AIMs this increase in bond dimension due to these boundary terms becomes significant (for example 64x for 3 orbitals).This problem can be solved by realizing that the boundary terms are simply two-body terms similar to the form of Eq. ( 17) (see Eq.( 28) for example).Each two-body term, when acting on an existing GMPS with bond dimension χ K, results in the addition of two GMPSs, each with bond dimension χ K.For n flavors, K can be written as the addition of 2 n GMPSs resulting from splitting of the long-range boundary terms: Similar to the zipup algorithm where GMPSs are multiplied on the fly, we does not need to actually perform the addition.Instead, the expectation values can be directly computed with each Kl and then sum over all the results.Using this technique, for an n o -orbital AIM, we can equivalently transform one calculation related to a single huge GMPS for K with bond dimension χ K into 2 2no calculations related to 2 2no smaller GMPSs for the Kl s in Eq.( 18), each with bond dimension χ K = χ K /2 2no .This technique will be used throughout this work, even for cases with only a single flavor or a single orbital. (a)

The zipup algorithm for computing expectation values
In this section, we describe the zipup algorithm proposed in Ref. [41] and analyze the advantage of it compared to the original TEMPO approach in detail.In the formalism of the original TEMPO, the bare impurity dynamics and the IF are multiplied together to obtain the augmented density tensor (ADT) as a single MPS, namely Based on the ADT, one can compute multi-time correlation functions by inserting GVs at different times into the PI and integrating the resulting expression.For equilibrium QIPs, a primary class of observables of interest is the Matsubara Green's function defined as In the discretized path integral formalism, g pq (j) at time jδτ can be calculated as However, directly evaluating the ADT as in Eq.( 19) could be extremely demanding in the fermionic cases compared to the bosonic case.This is because in the fermionic case, one often needs to consider multiple orbitals and each orbital has two spins, while for the prototypical spin-boson model [49] there is only a single bath without spin.In GTEMPO, the IF and K are all built as GMPSs using a series of GMPS multiplications [41].Assuming that the bond dimension of K is χ K , and that there are n MPS-IFs corresponding to n flavors, each with bond dimension χ, then the bond dimension of A, denoted as χ A , would scale as χ A = χ K χ n in the worse case.As a result, each site tensor of A could have size 2χ 2 K χ 2n (the factor 2 is for the physical index).One would easily run into memory issues even for very moderate values such as χ = 10 and n = 4 (two orbitals).This fact also reveals the much greater computational costs of fermionic impurity problems compared to bosonic ones.
The central idea of the zipup algorithm is that one could arrange multiple GMPSs vertically as a quasitwo-dimensional tensor network whose physical indices at the same time step (on the same vertical line) share the same GV.Then one could perform a sweep from left to right (or equivalently from right to left) to accomplish the integration, during which the physical GVs at the same time step are multiplied together on the fly, as shown in Fig. 2.More in detail, the left-to-right sweep starts with a trivial environment tensor of rank-(n + 1) (n for the MPS-IFs and 1 for K) with a single element 1 (e.g., the Grassmann vacuum).The sweep can be performed iteratively by a series of elementary updates and for each update, one integrates a pair of conjugate GVs.Fig. 2(b) shows such an elementary update, during which one needs to perform 2n Grassmann tensor multiplications between a large tensor with size χ K χ n−1 × χ and a small tensor with size χ × χ (the two tensor multiplications related to K are neglected since we usually have χ K ≪ χ), therefore the computational cost of an elementary update is O(2nχ K χ n+1 ).Overall, one needs to perform O(nM ) such elementary updates during a full sweep, and therefore the time cost of calculating a single expectation value scales as in terms of the number of floating point operations (FPOs).We have taken the technique to split K as in Eq.( 18) into account in the second equality of Eq.( 22).Since only one copy of the environment tensor needs to be stored, the memory cost is (neglecting the memory costs of storing the MPS-IFs and K) We can see that compared to the original TEMPO which builds up a single ADT, the zipup algorithm approximately reduces the exponent in both the time and memory costs by half if χ A ∝ χ K χ n .Besides the multiplication of two Grassmann tensors, two additional operations are required to accomplish one elementary update.The first operation is the multiplication of several physical indices at the same time step into a single physical index, which is used in the two middle steps in Fig. 2(b).Mathematically, a Grassmann tensor A in the space spanned by n GVs from ξ 1 to ξ n can be written as [41] where the coefficient A in,in−1,...,i1 is a normal rank-n tensor, each dimension with size 2. If the ξ i s are all the same, namely ξ i = ξ, then A can be reduced into a rank-1 Grassmann tensor B = The other operation is to integrate out a pair of GVs which are conjugate of each other.Assuming that ξ 1 = ξ2 , then integrating out these two GVs (with the measure d ξ1 dξ 1 e − ξ1ξ1 ) would result in a rank-(n − 2) These two operations together with the Grassmann tensor multiplication (for which we used the package TensorKit.jl[50]) are sufficient to implement the zipup algorithm.

Parallelizability of GTEMPO
To this end, we briefly discuss the parallelizability of the GTEMPO method for equilibrium QIPs at finite temperatures.Generally, if one builds one MPS-IF per flavor, then the cost of building those MPS-IFs is negligible compared to the later zipup algorithm for computing expectation values (the calculations of n MPS-IFs can also be perfectly parallelized over n processes).For the computationally intensive part of calculating expectation values, e.g., Eq. ( 21), there are two major sources for perfect parallelization (with essentially no data communication).The first is that one could straightforwardly parallelize the calculations of g pq (j) for different p, q and j.The second is that once we decompose K as in Eq.( 18), then the calculation becomes the addition of 2 n terms, each with exactly the same numerical cost and the calculation of each term is independent of each other.Moreover, these two sources of parallelization can be carried out simultaneously.As a result, the calculation the Matsubara Green's function for all the discrete time steps can be perfectly parallelized over M × 2 n processes for fixed p and q.Taking moderate values of M = 1000 and n = 4, the GTEMPO method can be easily and perfectly parallelized over more than 10 thousands of processes, which is an interesting feature since the parallelizability of tensor-network-based algorithms is often very poor compared to the Monte Carlo methods.This feature could be very helpful in the future to significantly speed up the GTEMPO calculations for equilibrium QIPs.
In the following, we will numerically demonstrate the effectiveness of the GTEMPO method on equilibrium QIPs with increasing complexity.

Toulouse model
First, we study the non-interacting case, referred to as the Toulouse model [51] or the Fano-Anderson model [52], the impurity Hamiltonian of which is simply where ε d is the on-site energy.Here the flavor indices are suppressed since there is only one flavor.The Toulouse model is ideal for us to explore the effectiveness of the GTEMPO method since 1) the results can be compared to analytic solutions; 2) the scaling of the bond dimension of the MPS-IF, in this case, is the same as the MPS-IF per flavor for higher-orbital impurity models, thus one could use the information here to accurately predict the computational costs of more complicated cases based on Eqs.(22,23).The bare impurity dynamics of the Toulouse model can be calculated exactly: where η = e −δτ ε d .The first and last terms on the rhs correspond to the boundary terms in Eq.( 14).In the time-local ordering we have χ K = 2 (χ K = 4).The only two sources of errors in the GTEMPO method, given that K can be calculated very precisely (either in an analytical or numerically very accurate way as shown in Sec.error of the PI and the bond truncation error during compression of the MPS-IF.In the following, we denote the bond truncation tolerance of the MPS-IF per flavor as ς (which means that we throw away the singular values whose relative weights are below ς [41,53]).For our calculations of the Matsubara Green's function G pq (τ ) in all cases, we set p and q to be the first flavor and omit them.In Fig. 3, we study the accuracy of GTEMPO against the two hyperparameters δτ and ς, where we focus on the symmetric case (half filling) with ε d = 0 and β = 40.In Fig. 3(a), we show the GTEMPO results against the analytical solutions for different values of ς.We can see that the accuracy improves monotonically with decreasing ς.For ς = 10 −8 , the differences between GTEMPO results and the analytical solutions are of the order 10 −2 (less than 2%), and we will stick to this value of ς when building the MPS-IF per flavor in our rest simulations.In Fig. 3(b), we show the GTEMPO results against the analytical solutions for δτ = 0.05, 0.1, 0.5.We can see that the case δτ = 0.1 is the most accurate of all the three cases considered.For larger δτ = 0.5, the results in the middle part (with 10 ≤ τ ≤ 30) are almost as accurate as the results for δτ = 0.1, however at the boundaries, the results are clearly away from half filling.Interestingly, the results with a smaller δτ = 0.05 are less accurate in the middle part compared to the rest two cases.This reveals an intricate interplay between these two hyperparameters: for smaller δτ one generally needs to use a smaller ς as well.We also observe that a smaller δτ would result in smaller χ for the MPS-IF even under a slightly lower ς, and generally the results at the boundaries are less accurate than those at the intermediate imaginary times.These results suggest that one may use a smaller δτ at the boundaries, but a much looser δτ in the middle to accelerate the performance while maintaining accuracy.In our following simulations, we will stick to δτ = 0.1.
In Fig. 4, we study the accuracy of GTEMPO for both the symmetric and non-symmetric (with ε d = 1) cases with β = 60.We can see that in the symmetric case, the errors compared to analytical solutions are still within 2%, despite a larger value of β.In the meantime, the errors in the non-symmetric case are more peaked at the boundaries compared to the symmetric case.This is because, in the non-symmetric case, the Matsubara Green's function has different values at two boundaries.Since the IFs act on different flavors independently, for higher-orbital impurity models, if one builds one MPS-IF per flavor and computes expectation values using the zipup algorithm, then those MPS-IFs will have exactly the same bond dimension.Therefore by studying the scaling of the bond dimension χ of the MPS-IF for the Toulouse model (which has only one flavor), one can already determine the computational costs for more complicated cases (the bond dimension χ K of K could also increase exponentially with the number of orbitals n o , however, its effect is not so significant as we only consider n o ≤ 3 in this work and χ K is independent of β in the time-local ordering).
In Fig. 5 we study the scaling the χ against β, for a fixed ς = 10 linearly against β.This is in sharp comparison with the calculations in the real-time axis [41], where one observes that χ increases at the beginning and quickly saturates at a value less than 20 for the Toulouse model.The reason for the increase of bond dimension of the MPS-IF in the imaginary-time axis can be understood from the factor e −ε(τ −τ ′ ) in the hybridization function in Eq.( 8).Due to the existence of this factor, the correlation in the IF (e.g., the exponent in the IF in Eq.( 7)) will not simply decay with |τ − τ ′ | since it contains contributions of ∆(τ, τ ′ ) from both τ > τ ′ and τ < τ ′ .For example, when ε > 0, then ∆(τ, τ ′ ) will decay with increasing |τ − τ ′ | for τ > τ ′ , but will increase for τ < τ ′ .In comparison, for the real-time path integral, the correlation in the IF would generally decay with the time distance |t − t ′ |, thus the bond dimension of the MPS-IF in the real-time axis could be very moderate even for very large t.

Single-orbital impurity model
Now we consider the single-orbital AIM for which the impurity Hamiltonian is where ε d is the on-site energy of the impurity and U is the Coulomb interaction strength, σ ∈ {↑, ↓} is the spin index.The bare impurity dynamics of this model can also be calculated exactly, with each propagator G j,j−1 = e η σ āσ,j aσ,j−1 e η 2 (e −δτ U −1)ā ↑,j ā↓,j a ↓,j−1 a ↑,j−1 , where η is the same as the case of the Toulouse model.For the single-orbital or higher-orbital cases, we will focus on the half-filling case, providing a quick and straightforward check for the GTEMPO results.For single-orbital AIM the half-filling condition is In Fig. 6, we study the accuracy of GTEMPO against CTQMC for the single-orbital AIM for β = 10, 20, 40, 60 in the four panels, respectively.For each β, we consider U = 1, 2, 4, 8.Here we have calculated one MPS-IF per spin species for GTEMPO.The CTQMC results are calculated using the TRIQS package [54,55], where we have used 32 Markov chains and generated 10 7 samples from each chain.We can see that the GTEMPO results agree quite well with the CTQMC results for all the cases considered.Generally, the deviations of the GTEMPO results from the CTQMC results are slightly larger for larger β, and for each value of β, the deviations become smaller for larger U (The scale of U = 8 is quite different from the rest U s and thus the case U = 8 is not shown in the insets, but it can be seen that for this case the two sets of results are almost completely on top of each other).This latter feature is because we have chosen the time-local ordering where we can easily treat K exactly and that the weight of K in the ADT in Eq.( 19) becomes larger for larger U (the MPS-IF is the same for different U s).
To this end, we give some explicit estimations of the computational cost of the GTEMPO method for higher-orbital AIMs.We assume that the bond dimension of the bare impurity dynamics K scales as χ K ∝ 2 n .From Fig. 5, we consider χ = 74 for β = 10, χ = 138 for β = 20 and χ = 236 for β = 40.Then from Eqs. (22,23), for calculating a single expectation value of a two-orbital AIM, the time costs are C t β=10 ≈ 1.8 × 10 15 , C t β=20 ≈ 8 × 10 16 and C t β=40 ≈ 2.4 × 10 18 FPOs, and the memory costs are C s β=10 ≈ 3.6, C s β=20 ≈ 43.5 and C s β=40 ≈ 372 GB respectively.We can see that the cases β = 10, 20 are manageable for state-of-the-art high-performance computing systems (the newest NVIDIA H100 has higher than 100 TFLOPS single-precision performance with 80 GB memory), while the β = 40 case is very demanding (taking into consideration that one needs to do this calculation for O(M ) times).
One way to further accelerate the performance of GTEMPO for multi-orbital AIMs is to combine several MPS-IFs for several flavors into a single larger GMPS, and compress the resulting GMPS using another bond truncation tolerance ς ≤ ς such that its bond dimension is smaller than the product of the bond dimensions of the MPS-IFs.The price to pay is that one may lose precision and the overall performance can only be evaluated for practical applications.A natural choice in this direction is to combine the two MPS-IFs for the same spatial orbital but for different spins, namely I ↑ and I ↓ , into a single larger GMPS, denoted as I ↑↓ .Denoting the bond dimension of I ↑↓ as χ ↑↓ , the time and space costs for calculating a single expectation value of an n o -orbital AIM become and respectively.In Fig. 7, we show the scaling of the bond dimensions of the GMPSs for three cases: . Effect of the numerically accurate K calculated using Eq.( 16) with δτ ′ = 10 −3 (red dashed line), compared to a first order K with δτ = 0.1 (green dashed line).Here we have used the two-orbital AIM with impurity Hamiltonian in Eq.( 33).The CTQMC results (blue solid line) are also shown for reference.
The effectiveness of the joint MPS-IF approach is demonstrated in Fig. 8 for the half-filling singleorbital AIM at β = 10 and U = 1, where we show the Matsubara Green's function calculated by CTQMC, by GTEMPO with separate MPS-IFs for different spins (labeled by I ↑ I ↓ ), and with a joint MPS-IF under different ς labeled by I ↑↓ .We can see that the error is about 10 percent compared to the CTQMC results with ς = 10 −8 , and becomes negligible with ς = 10 −9 .Since one expects that the accuracy requirement on the MPS-IF is less stringent for larger U (see Fig. 6), this joint MPS-IF approach is expected to be most suitable to work in the strongly interacting regime.Now we can estimate the computational costs using Eqs.(31,32) under ς = 10 −9 .From Fig. 7 we have χ ↑↓ = 876, 1850 for β = 10, 20.Then for two orbitals, the time costs are 5.5 × 10 14 and 1.0 × 10 16 FPOs, and the space costs are 0.09 GB and 0.4 GB, respectively (the storage of I ↑↓ increases significantly compared two separate I ↑ and I ↓ , but is still moderate considering the memory of current computers and they do not need to be stored in the first-level memory, such as the GPU memory).We can see that the joint MPS-IF approach has a significant advantage in terms of the memory cost for multi-orbital AIMs.For three orbitals, the time and space costs for β = 10 are already 1.7 × 10 19 FPOs and 323 GB respectively, which is likely out of reach currently.The joint MPS-IF approach will be used for all our calculations for two and three orbitals in the following, with fixed ς = 10 −9 .

Multi-orbital impurity model
Now let us consider the multi-orbital Anderson impurity model [10,56]  â † y,↓ ây,↑ âx,↓ ), (33) where ε d is the on-site energy, U is the Coulomb interaction strength, J is the Hund's coupling strength, and we have used x, y for orbitals.We will consider at most three orbitals, with fixed U = J = 0.5.The half-filling conditions for the two-orbital and three-orbital cases are ε d = −(3U − 5J)/2 [57] and ε d = −(2.5U− 5J) [58] respectively.For two orbitals the GVs within time step j are arranged as a 1↑,j ā1↑,j a 1↓,j ā1↓,j a 2↑,j ā2↑,j a 2↓,j ā2↓,j (the first index in the subscript is the orbital index) in the time-local ordering, and similarly for three orbitals.
For the impurity Hamiltonian in Eq.( 33), we are not able to derive the exact analytical expression of the propagator.Therefore we use the technique in Eq.( 16) to obtained numerically accurate expression for K with δτ ′ = 10 −3 .In Fig. 9, we show the improvement of our numerically accurate expression for K compared to a bare first-order expression with δτ = 0.1.We can see that with the accurate K the GTEMPO results well agree with the CTQMC results, while with the first-order K one obtains qualitatively wrong results.
Finally, we study the two-orbital AIM at β = 10 and the three-orbital AIM at β = 2, and the results are shown in Fig. 10(a, b) respectively.We can see that the GTEMPO results in both cases agree very well with the CTQMC results, which demonstrates the flexibility of the GTEMPO method.The calculations in Fig. 10 are already very demanding.For example, in Fig. 10(a), we have used 10 threads (each with 2.9 GHz frequency) for around one week, where we have parallelized the calculations of the Matsubara Green's function at different time steps.Nevertheless, utilizing nowadays high performance computing techniques, we vision that for two-orbital AIMs one could reach β = 20, but for the three-orbital model, it would still be very demanding for larger β (see our analysis at the end of Sec. 4).

Summary
In summary, we have systematically studied the effectiveness of the recently proposed GTEMPO method by computing the Matsubara Green's function of equilibrium quantum impurity problems at finite temperatures.Several techniques are proposed to accelerate the performance of GTEMPO for finitetemperature calculations, including the technique to deal with the boundary condition in the impurity partition function, ordering of Grassmann variables, as well as the joint MPS-IF approach.Our numerical calculations reach β = 60 for the single-orbital AIM, β = 10 for the two-orbital AIM and β = 2 the three-orbital AIM, and in all cases, we show good agreement of the GTEMPO results with the CTQMC calculations.Based on rigorous estimations of the computational costs, we vision that β = 20 could be reached for the two-orbital AIM with nowadays high-performance computing techniques, while for high-orbital AIMs only very high-temperature regimes could be explored under the current formalism of GTEMPO.Our work thus paves the way to apply GTEMPO as an imaginary-time impurity solver.

Figure 1 .
Figure 1.The action of the interacting dynamics in K and the influence functional I for (a) the time-local ordering and (b) the flavor-local ordering for the single-orbital AIM.The brown solid lines represent the actions of the four-body interacting terms in the form of Eq.(17), the purple solid lines represent the actions of partial IFs[41], and the red dashed lines represent the boundary actions which are responsible for the final trace in the impurity partition function.The free impurity dynamics part in K are not shown since they are compatible with IF, namely they only act in an intra-flavor manner.

Figure 2 .
Figure 2. (a) The zipup algorithm to integrate the Grassmann path integral with one GMPS for K and n MPS-IFs denoted from I 1 to In.The left bar is a trivial Grassmann tensor with a single element 1, which is the starting point of the left-to-right sweep.(b) An elementary update during the sweep which integrates a pair of conjugate GVs denoted as a and ā, the final result has the same form as the input, namely a rank-(n + 1) Grassmann tensor.

Figure 3 .
Figure 3. Matsubara Green's function of the Toulouse model at ε d = 0 and Dβ = 40 for (a) different values of ς and (b) different values of δτ .The blue solid line in both panels is the analytic solution.In panel (a), the dashed lines top to bottom are GTEMPO results calculated with ς = 10 −6 , 10 −7 , 10 −8 respectively.In panel (b), the dashed lines from top to bottom are GTEMPO results calculated with Dδτ = 0.05, 0.5, 0.1 respectively.The inset in both panels shows the error between the GTEMPO results and the analytic solutions.

Figure 4 .Figure 5 .
Figure 4. Matsubara Green's function of the Toulouse model in the symmetric (with ε d = 0, purple lines) and non-symmetric (with ε d = 1, green lines) cases.The dotted lines are GTEMPO results and the solid lines are the corresponding analytical solutions.The inset shows the error between the GTEMPO results and the analytical solutions in both cases.

− 8 .Figure 6 .
Figure 6.Matsubara Green's function of the single-orbital Anderson impurity model for (a) Dβ = 10, (b) Dβ = 20, (c) Dβ = 40 and (d) Dβ = 60.The dashed lines in each panel from top to bottom are GTEMPO results for U/D = 1, 2, 4, 8 respectively, and the solid lines with the same colors are the corresponding CTQMC results.The inset in each panel shows the zooms at small values of Dτ .

I 8 I 9 I
(a) directly multiplying I ↑ and I ↓ together without any further bond truncation (thus the bond dimension of the resulting I ↑ I ↓ is simply χ 2 ); (b) I ↑↓ with ς = 10 −9 ; (c) I ↑↓ with ς = 10 −8 .We can see that compared to the base case (a), the bond dimension drops by about one order of magnitude in case (b) and by more than two orders of magnitude in case (c).↑ I ↓ , ς = 10 −↑↓ , ς = 10 −↑↓ , ς = 10 −8

Figure 7 .Figure 8 .
Figure 7. Scaling of the bond dimension against the inverse temperature β, for I ↑ I ↓ where the two MPS-IFs I ↑ and I ↓ are multiplied together without truncation (the red line with triangle), for I ↑↓ with ς = 10 −9 (the yellow line with circle) and with ς = 10 −8 (the green line with square).

Figure 10 .
Figure 10.Matsubara Green's function for (a) the two-orbital AIM at Dβ = 10 and (b) the three-orbital AIM at Dβ = 2.The red dashed line and blue solid line in both panels are the GTEMPO results and the CTQMC results respectively.