Discontinuous Galerkin discretization for quantum simulation of chemistry

Jarrod R McClean; Fabian M Faulstich; Qinyi Zhu; Bryan O’Gorman; Yiheng Qiu; Steven R White; Ryan Babbush; Lin Lin

doi:10.1088/1367-2630/ab9d9f

1. Introduction

Predicting properties of both molecular and extended systems from first principles has long been the goal of electronic structure in both correlated classical methods [1], including new approaches based on tensor networks [2–5], and now many approaches based on quantum computing [6–15], some of which have even been implemented on experimental devices [16–25]. A crucial aspect of such simulations is the representation of the problem in a tractable discretization scheme such as a finite-difference method, or (more commonly) a basis set, also known as a Galerkin discretization of the problem. The selection of the basis influences not only the accuracy of the calculation but also the fundamental scaling of the cost of the simulation as a function of the system size.

The design of basis sets for correlated electronic structure has a long and rich history, packing much of the essential physics and chemistry into very compact representations to exploit the power of existing methods. While these basis sets have ranged from general purpose to those optimized for individual computations, a common mainstay has been the use of Gaussian-based molecular orbitals [1, 26–28]. Molecular orbitals tend to offer compact representations of correlated problems and also present an energy ordering of orbitals that facilitates further reduction of the space through active-space methods. However, a side effect of this reduction in the number of orbitals is often a Hamiltonian with a quartic number of terms and orbitals that are delocalized in space (figure 1).

**Figure 1.** A cartoon schematic of the general objective of this work depicted in terms of the sparsity pattern of the two-electron integrals. At the top, we depict a dense two-electron integral tensor with $O\left({N}_{a}^{4}\right)$ nonzero two-electron integrals, but relatively few basis functions. This is often the case with molecular orbital or diffuse Gaussian basis sets. In the center we depict a diagonal primitive basis set, such as Gausslets, with a relatively high number of basis functions but an overall scaling in $O\left({N}_{p}^{2}\right)$ . Finally, we depict this for a discontinuous Galerkin (DG) basis set built from a diagonal primitive basis set that interpolates between the two regimes by using fewer basis functions than strictly diagonal sets, while retaining an overall $O\left({N}_{d}^{2}\right)$ scaling through its block-diagonal structure. We note that the form of block-diagonal structure is a particular sparsity pattern which depends on the arrangement of indices when considering a matrix form; the text defines this sparsity pattern precisely.
Download figure:
Standard image High-resolution image

**Figure 1.** A cartoon schematic of the general objective of this work depicted in terms of the sparsity pattern of the two-electron integrals. At the top, we depict a dense two-electron integral tensor with $O\left({N}_{a}^{4}\right)$ nonzero two-electron integrals, but relatively few basis functions. This is often the case with molecular orbital or diffuse Gaussian basis sets. In the center we depict a diagonal primitive basis set, such as Gausslets, with a relatively high number of basis functions but an overall scaling in $O\left({N}_{p}^{2}\right)$ . Finally, we depict this for a discontinuous Galerkin (DG) basis set built from a diagonal primitive basis set that interpolates between the two regimes by using fewer basis functions than strictly diagonal sets, while retaining an overall $O\left({N}_{d}^{2}\right)$ scaling through its block-diagonal structure. We note that the form of block-diagonal structure is a particular sparsity pattern which depends on the arrangement of indices when considering a matrix form; the text defines this sparsity pattern precisely.
Download figure:
Standard image High-resolution image

The utility of localizing orbitals in both the occupied and virtual space has been recognized in the development of many linear-scaling methods [29], such as those based on pair natural orbitals [30–32]. In these cases the localization helps not only in allowing the screening of some terms in the Hamiltonian, but also reducing the correlations that need to be treated to reach a certain level of accuracy. However, even the most localized orbitals that can be produced from transformations of a standard basis are often not strictly local in the sense that they have heavy tails that can extend throughout the system to enforce orthogonality between orbitals. While reasonably compatible with some methods, these tails negatively impact tensor-network methods such as the density matrix renormalization group (DMRG), where an area-law entanglement system that is solvable in modest polynomial time becomes a volume law system with exploding bond dimension in a completely delocalized basis [33]. This effect, in combination with consideration of the number of Hamiltonian terms, led to the recent development of Gausslet basis sets [34, 35], which use properties of wavelet transforms to maintain strict localization, orthogonality, and a diagonal Coulomb operator.

In the case of quantum computing-based approaches, while basis localization may impact the representational power of an ansatz-based variational approach, the structure and number of terms in the Hamiltonian represent the dominant cost factor [36–39]. Moreover, some traditional density-based truncations and localizations can be difficult to utilize in quantum computers, due to the need to measure the density and rotate the basis to maximize the benefits. Recent advances in quantum-computing approaches have shown that Hamiltonians with a diagonal structure with a quadratic number of terms allow algorithms to execute numerically exact real-time dynamics and perform full configuration interaction (assuming a reference state with nonvanishing overlap on the ground state can be prepared) with a cost in terms of basis size that scales as roughly $O\left({N}_{p}^{1/3}\right)$ in first quantization [40] or $O\left({N}_{p}^{2}\right)$ in second quantization [41]. While the first work was done for periodic systems based on basis sets related to discrete variable representations (DVR) [36], these results apply to all basis sets with these properties. This is in contrast to the most advanced methods based on Gaussian molecular orbitals on quantum computers, which have costs more like $O\left({N}_{a}^{3}\right)$ in first quantization [42] or $O\left({N}_{a}^{4}\right)$ in second quantization [38] (see figure 2 for an overview of set size abbreviations used throughout the article).

**Figure 2.** Compact description of the notation used throughout the paper in counting basis functions in different representations for the electronic structure problem. Here discontinuous Galerkin (DG) is the block basis we construct from primitive functions to represent the active-space orbitals with a block-diagonal Hamiltonian representation.
Download figure:
Standard image High-resolution image

While the scaling advantages of basis sets with diagonal representations appear at reasonable system sizes, reduced representational flexibility can mean the cost of switching representations for equivalent accuracy is still higher than current quantum computers with limited resources can afford. Thus, it is desirable to be able to split the difference between strictly diagonal basis sets and molecular orbitals. The adaptive local basis set [43–47] was recently introduced to achieve this tradeoff in the setting of density functional theory (DFT). The adaptive local basis is constructed on-the-fly to capture atomic and environmental effects, by solving eigenvalue problems restricted to local domains called elements. Each basis function is only supported on one element and is discontinuous on the level of the global domain. These basis functions are 'glued' together to approximate the continuous electron density using the interior penalty discontinuous Galerkin (DG) formalism [48, 49]. Compared to typical numerical methods for solving DFT, the DG formulation includes extra correction terms to handle the discontinuities of the basis functions at the boundary of the elements. Numerics indicate that roughly 10 to 40 basis functions per atom are often sufficient to achieve chemical accuracy for DFT calculations using pseudopotentials [50, 51].

In this work, we introduce a variation of the DG approach for simulation on quantum computers, which has two main differences compared to previous works [43, 44]. First, our target is to model electronic structure at the correlated level instead of the DFT level, and hence we can afford to obtain the best local basis functions, e.g., by starting from molecular orbitals or approximate natural orbitals. This removes a significant step of approximation in the original DG approach due to the solution of local eigenvalue problems with certain artificial boundary conditions (such as the Dirichlet or periodic boundary conditions). Second, our basis functions are represented as linear combination of primitive basis functions, which means that the basis functions are no longer strictly discontinuous in real space. We demonstrate that the two-body operator can still maintain a block-diagonal structure for efficient quantum simulation, and for simplicity we will still refer to the basis set as the DG basis set. The use of the primitive basis functions removes the need for correction terms that account for the discontinuity, which is similar in spirit to the discrete discontinuous basis projection method [52] for DFT calculations. The DG basis set enables one to interpolate between cheap diagonal representations and compact nondiagonal basis sets with blocks of specified size.

We begin by introducing the standard discretization of the electronic structure problem in a basis, and define precisely what is meant by a diagonal basis representation and strictly localized functions. We then briefly review basis sets that exhibit spatial locality and diagonal interactions that have already been used in the context of DMRG and quantum computing. This allows us to highlight the cost advantages of these representations within each approach. Then, the DG approach is introduced as a general framework for maintaining these properties by using those basis sets as a primitive building block. The block diagonal structure of the resulting Hamiltonians leads us to new cost models based on swap networks for quantum algorithms. We show that the new approach both maintains accuracy in correlated calculations, and demonstrates a crossover to dramatically lower costs at modest system sizes between 15 and 20 atoms. We finish with an outlook on how this approach will influence quantum and classical approaches to correlated electronic structure alike.

2. Discretizing the electronic structure problem

A crucial aspect of essentially all algorithms for the simulation of electronic systems is the discretization of the system into some tractable representation. This step takes the electronic Hamiltonian that acts in some continuous space, and maps it to a discrete space. The continuous electronic structure Hamiltonian is in atomic units given by

$\begin{equation}\hat{H}=-\sum _{i}\frac{{\nabla }_{{\mathbf{r}}_{i}}^{2}}{2}-\sum _{I,j}\frac{{Z}_{I}}{\vert {\mathbf{R}}_{I}-{\mathbf{r}}_{j}\vert }+\sum _{i{< }j}\frac{1}{\vert {\mathbf{r}}_{i}-{\mathbf{r}}_{j}\vert }+{E}_{\text{II}},\end{equation} \tag{ 1 }$

where r_i denotes the position of the ith electron, and we have assumed the Born–Oppenheimer approximation such that the positions of nuclei, denoted by R_I are constants (giving rise to a constant energy correction E_II for the nuclear–nuclear interaction). In the case of Galerkin discretizations, where one chooses a basis set given by some set of orthonormal (single-particle) functions {χ_i(r)} (for simplicity we assume a spin-restricted formulation), and enforces the antisymmetry of electrons in the operators. Then, one may express the Hamiltonian in the standard second quantized form. It is possible to select basis functions from a number of complete sets that allow the electron–electron interaction to be represented in a way that is entirely diagonal under a Jordan–Wigner representation in the computational basis and also exhibits a diagonal property under matricization. In this representation, the second quantized Hamiltonian is given by

$\begin{equation}{\hat{H}}^{\left(p\right)}=\sum _{\mu ,\nu =1}^{{N}_{p}}{h}_{\mu \nu }^{\left(p\right)}{\hat{b}}_{\mu }^{{\dagger}}{\hat{b}}_{\nu }+\frac{1}{2}\sum _{\mu ,\nu =1}^{{N}_{p}}{v}_{\mu \nu }^{\left(p\right)}{\hat{n}}_{\mu }{\hat{n}}_{\nu },\end{equation} \tag{ 2 }$

where ${\hat{b}}_{\mu }^{{\dagger}}$ is a creation operator in the primitive basis, and ${\hat{n}}_{\mu }={\hat{b}}_{\mu }^{{\dagger}}{\hat{b}}_{\mu }$ is a number operator. We refer to Hamiltonians written in this form as 'diagonal Hamiltonians', and use such basis sets here as our 'primitive' basis (associated with the superscript (p) in Ĥ^(p)) to efficiently construct compact bases that partially retain this property.

Generally, the coefficients in equation (2) are given by the following integrals,

$\begin{equation}\begin{aligned}\hfill {h}_{\mu \nu }^{\left(p\right)}& =\int \mathrm{d}\mathbf{r}\enspace {\overline{\chi }}_{\mu }\left(\mathbf{r}\right)\left(\frac{-{\nabla }_{\mathbf{r}}^{2}}{2}-\sum _{I}\frac{{Z}_{I}}{\vert {\mathbf{R}}_{I}-\mathbf{r}\vert }\right){\chi }_{\nu }\left(\mathbf{r}\right),\hfill \\ \hfill {v}_{\mu \sigma \gamma \nu }& =\int \mathrm{d}\mathbf{r}\mathrm{d}{\mathbf{r}}^{\prime }\enspace \frac{{\overline{\chi }}_{\mu }\left(\mathbf{r}\right){\overline{\chi }}_{\sigma }\left({\mathbf{r}}^{\prime }\right){\chi }_{\gamma }\left(\mathbf{r}\right){\chi }_{\nu }\left({\mathbf{r}}^{\prime }\right)}{\vert \mathbf{r}-{\mathbf{r}}^{\prime }\vert },\hfill \end{aligned}\end{equation} \tag{ 3 }$

and the defining property of our primitive basis sets may be written as ${v}_{\mu \sigma \gamma \nu }\to {v}_{\mu \nu }^{\left(p\right)}{\delta }_{\mu ,\gamma }{\delta }_{\sigma ,\nu }$ , i.e., v becomes a diagonal matrix when we view (μ, σ) as the row index and (γ, ν) as the column index, respectively. Note that the relation between v_μσγν and ${v}_{\mu \nu }^{\left(p\right)}{\delta }_{\mu ,\gamma }{\delta }_{\sigma ,\nu }$ is not necessarily an equality: what one requires is that solutions to the Schrödinger equation using the two different forms of the interaction can systematically approach each other as one approaches the complete basis set limit. A grid, defined via finite differences, has the diagonal property, although there is no underlying basis. Here we consider only basis sets, but the basis functions with the diagonal property are naturally associated with a uniform or nonuniform grid, with typical spacings between grid points much smaller than the interatomic distances.

For a basis set, a sufficient condition for equality of these expressions is that one has functions with strictly disjoint support, such that (formally) χ_μ(r)χ_σ(r) = 0 for all μ ≠ σ. However, such a requirement would be unrealistic because upon careful inspection it would imply a simultaneously diagonal kinetic and potential operator if evaluated in the Galerkin formulation in equation (3) (in contrast to a finite-difference or overlapping finite-element approach). Classical finite-element methods can produce near-diagonality by allowing overlapping elements between only spatially neighboring sites, which retains the favorable scaling, but generically introduces nonorthogonality in the basis [53–56]. Accordingly, methods requiring a return to orthogonality, such as quantum-computing methods, suffer a transformation to $O\left({N}_{p}^{4}\right)$ terms on orthogonalization, the introduction of unnecessary orthogonality tails, or both. More recent developments in classical discretization have extended this idea of disjoint cells to allow for truly disjoint basis sets, but at the cost of the introduction of additional surface terms and discontinuity penalties to reintroduce physicality into the problem. These methods are known as discontinuous Galerkin (DG) methods [48, 49] and were first introduced to electronic structure in the context of density functional theory [43–46]. They represent a general and rigorous framework for constructing problem representations that have this property of strict (block) locality, and we develop a variation of these methods in this work to achieve this goal for correlated electronic structure methods without the need for the introduction of surface terms.

While the condition of spatial disjointness is a sufficient condition to obtain a Hamiltonian with $O\left({N}_{p}^{2}\right)$ terms, it is not necessary. Two basis sets that have the diagonal property without the equality of the individual matrix elements, and thus without the strict spatial disjointness property, are the plane-wave dual basis [36] and Gausslet basis [34]. The plane-wave dual basis is related to periodic sinc functions and discrete variable representations (DVR) [36] and is constructed from an aliased Fourier transform of plane waves for a given box to yield a diagonal Hamiltonian. Here the diagonality originates from the plane wave dual functions (also called the periodic sinc functions) that are Lagrange interpolation functions on a uniform grid. It has the advantage that the basis is naturally periodic, and thus is well suited for the treatment of materials and other condensed phase systems. Moreover, a modification using a truncated Coulomb interaction can enable the treatment of isolated systems. Two key downsides of this approach are that it inherently reflects a uniform discretization in space of the problem, lacking the ability to adaptively refine sharp features such as the electron–nuclear cusp, and that it has long tails responsible for maintaining the orthogonality of the basis. This first property is reflected in a large overhead for representing atomic systems to a level of accuracy similar to Gaussians, and the second makes the method difficult to use with geometric entanglement based approaches such as tensor networks.

When integrated with smooth functions, Gausslets behave like δ functions, which results in the diagonal property. The Gausslets are obtained as linear combinations of arrays of Gaussians using wavelet transforms. Specific moment properties of the wavelet transforms make the Gausslets integrate polynomials up to a certain order like a δ function. The δ-function property survives under smooth coordinate transformations, so unlike plane-wave dual bases, Gausslets can have variable resolutions, with more degrees of freedom near the nuclei to represent the electron–nuclear cusp. Also in contrast to the plane-wave dual basis functions, Gausslets have strong localization characteristics, avoiding long tails. The lack of tails means these basis sets are naturally suited for tensor-network methods and, relatedly, a variational quantum ansatz may have more expressive power at shorter circuit depths. The ability to more naturally represent inhomogeneous features means the representational overhead is expected to be modest relative to plane-wave representations.

We note, however, that both methods can benefit from the introduction of pseudopotentials, and that the representational power of all single-particle basis sets are expected to be limited by the same asymptotic scaling in the limit of a very large number of basis functions. This means that for very large basis set sizes approaching the complete basis set limit, the representational overheads are expected to be negligible, and the determining factors are the other properties of the basis sets. Despite this, however, there is considerable interest in treating systems before reaching this limit, where the representational overhead for basis sets can differ considerably.

In order to meet the demands of compactness one could start from a more generic basis set and use the expressions in equation (3) to determine the Hamiltonian. However, an approach that will prove fruitful here is to use the fact that it is equivalent to start from a complete primitive basis set, such as those above, and project into a compact 'active-space' Hamiltonian. Gaussian, molecular orbitals, or other active-space constructions can been seen as a specific case of the active space we refer to here.

Since the number of primitive basis functions is typically large compared to the number of electrons, quantum chemistry calculations, especially at the correlated level, commonly utilize more physically motivated basis sets that result in a more compact description (e.g. correlation-consistent basis sets in the LCAO approach). Let ${\left\{{\varphi }_{p}\left(\mathbf{r}\right)\right\}}_{p=1}^{{N}_{a}}$ be a set of orthonormal single-particle functions; we refer to them as active-space orbitals (for instance, canonical Hartree–Fock orbitals, or natural orbitals), which can be expanded using a primitive basis set as

$\begin{equation}{\varphi }_{p}\left(\mathbf{r}\right)=\sum _{\mu }{\chi }_{\mu }\left(\mathbf{r}\right){{\Phi}}_{\mu p}.\end{equation} \tag{ 4 }$

Here ${\Phi}\in {\mathbb{C}}^{{N}_{p}{\times}{N}_{a}}$ is a matrix with orthogonal columns. While this projection step represents an approximation depending on the nature of the primitive and active bases, it is one that can be systematically controlled and understood by increasing the size of the underlying primitive basis without significantly increasing the size of the resulting block basis that we will construct subsequently. For flexibility and accuracy, we will allow quite general definitions of the active-space basis set in this work. It will pertain to traditional Hartree–Fock canonical orbitals as well as Gaussian basis sets such as the dunning cc-pVDZ basis set [26], which allows us to use Gaussians with strictly block diagonal properties by going through the primitive basis set using point sampling through the approximate delta function properties of the primitive basis set. It has been shown previously that for the Gausslet primitive basis set we use, point sampling is extremely accurate due to the delta function property of the basis [34]. We will also make use of this construction to build hybrid active spaces, where we use a weighted density matrix from multiple basis sets to define Φ through its most important natural orbitals. In particular, we will combine the expressive power of a large primitive basis, such as Gausslets, to capture static correlations through an unrestricted Hartree–Fock defined active space, while including a Gaussian basis empirically refined to express dynamic electronic correlation effects, e.g. cc-pVDZ through weighting. In this work we make use of a joined set of density matrices built from a unrestricted Hartree–Fock (UHF) solution D^UHF and Gaussian orbitals αD^Gaussian, to form D = D^UHF + αD^Gaussian in the primitive basis, and use a natural-orbital truncation of D to define Φ. Empirically we use a value of α ≈ 0.01 later in this work when we combine these basis sets, which appears to give an excellent improvement in accuracy. A more detailed description of this procedure and refinement of the value α is left to a future work. We term this the hybrid active-space approach. As we only use this hybrid approach in conjunction with the DG-blocking procedure, we offload concerns about orthogonality in the projected basis to the singular value decomposition (SVD) used in the DG procedure.

Taking as granted the construction of the matrix Φ, we define a rotated set of creation and annihilation operators in the active space as

$\begin{equation}{\hat{a}}_{p}^{{\dagger}}=\sum _{\mu =1}^{{N}_{p}}{\hat{b}}_{\mu }^{{\dagger}}{{\Phi}}_{\mu p},\quad {\hat{a}}_{p}=\sum _{\mu =1}^{{N}_{p}}{\hat{b}}_{\mu }{\overline{{\Phi}}}_{\mu p},\end{equation} \tag{ 5 }$

where ${\overline{{\Phi}}}_{\mu p}$ denotes the complex conjugate and we may project the Hamiltonian as

$\begin{equation}{\hat{H}}^{\left(a\right)}=\sum _{p,q=1}^{{N}_{a}}{h}_{pq}^{\left(a\right)}{\hat{a}}_{p}^{{\dagger}}{\hat{a}}_{q}+\frac{1}{2}\sum _{p,q,r,s=1}^{{N}_{a}}{v}_{\mathit{pqrs}}^{\left(a\right)}{\hat{a}}_{p}^{{\dagger}}{\hat{a}}_{q}^{{\dagger}}{\hat{a}}_{r}{\hat{a}}_{s},\end{equation} \tag{ 6 }$

which we refer to as the active-space Hamiltonian.

Generally, we see that our primitive basis sets have favorable scaling in number of terms in the Hamiltonian ( $O\left({N}_{p}^{2}\right)$ vs $O\left({N}_{a}^{4}\right)$ ), which often corresponds to better scaling algorithms. While N_p and N_a have the same asymptotic scaling, for modest sized calculations it is often observed that N_p ≫ N_a in order to achieve comparable accuracy. Here we will seek a way to split the difference between these two regimes by forming a more compact basis that partially retains the diagonal properties, i.e., the resulting Hamiltonian is block-diagonal.

3. Discontinuous Galerkin discretization

At a high level, we construct the block-diagonal basis by fitting spatially connected blocks of the primitive basis set to the active basis set, while preserving the properties of the primitive basis set. We therewith interpolate between the primitive basis set and the active basis set. We will refer to the general class of basis sets that achieve both completeness in some limit and have the diagonal property as primitive basis sets.

Our goal is to systematically compress the active basis set ${\left\{{\varphi }_{p}\left(\mathbf{r}\right)\right\}}_{p=1}^{{N}_{a}}$ into a set of orthonormal basis functions partitioned into elements (groups), so that basis functions associated with different elements have mutually disjoint support. Assume that the index set Ω = {1, ..., N_p} can be partitioned into N_b nonoverlapping index sets

$\begin{equation}\mathcal{K}=\left\{{\kappa }_{1},{\kappa }_{2},\enspace \dots ,\enspace {\kappa }_{{N}_{b}}\right\},\end{equation} \tag{ 7 }$

so that ${\cup }_{\kappa \in \mathcal{K}}\kappa ={\Omega}$ . Then the matrix Φ can be partitioned into N_b blocks ${{\Phi}}_{\kappa }{:=}{\left[{{\Phi}}_{\mu p}\right]}_{\mu \in \kappa }$ for $\kappa \in \mathcal{K}$ and p ∈ {1,...,N_a}. Performing the singular value decomposition for Φ_κ,

$\begin{equation}{{\Phi}}_{\kappa }\approx {U}_{\kappa }{S}_{\kappa }{V}_{\kappa }^{{\dagger}},\end{equation} \tag{ 8 }$

where U_κ is a matrix with orthonormal columns corresponding to the leading n_κ singular values up to some truncation tolerance τ, we obtain our compressed basis

$\begin{equation}{\phi }_{\kappa ,j}\left(\mathbf{r}\right)=\sum _{\mu \in \kappa }{\chi }_{\mu }\left(\mathbf{r}\right){\left({U}_{\kappa }\right)}_{\mu ,j}.\end{equation} \tag{ 9 }$

The basis set is adaptively compressed with respect to the given set of basis functions, and are locally supported (in a discrete sense) only on a single index set κ. In the absence of SVD truncation, we clearly have span{φ_p} ⊆ span{ϕ_κ,j}. We refer to this basis set {ϕ_κ,j} as the DG basis set. Note that each DG basis function ϕ_κ,j is a linear combination of primitive basis functions which are themselves continuous, so ϕ_κ,j is also technically continuous in real space. In fact, ϕ_κ,j might not be locally supported in real space if each primitive basis function χ_p is delocalized. When the primitive basis functions are localized, ϕ_κ,j can be very close to a discontinuous function. (See figure 3 for an example.) When computing the projected Hamiltonian, we do not need to evaluate the surface terms in the DG formalism. If we form a block-diagonal matrix

$\begin{equation}U=\mathrm{diag}\left[{U}_{1},\enspace \dots ,\enspace {U}_{{N}_{b}}\right],\end{equation} \tag{ 10 }$

the total number of basis functions is thus ${N}_{d}{:=}{\sum }_{\kappa \in \mathcal{K}}{n}_{\kappa }$ . We remark that the number of basis functions n_κ can be different across different elements.

**Figure 3.** Three DG functions in the X–Z plane, represented in real space. The axes are in units of Bohr and the color intensity represent the amplitude |ϕ_κ,j|. Each of the functions is localized to its block, where the block divisions are shown in dotted gray lines. The DG functions are represented by linear combination of plane wave dual basis functions. Each DG function is strictly still a continuous function, but its nodal values (defined according to the center of each plane-wave dual function) are only supported within only one block.
Download figure:
Standard image High-resolution image

To facilitate the complexity count below we may, without loss of generality, assume that n_κ is a constant and that N_d = N_bn_κ (for a schematic illustration of the subsequent procedure see figure 4). Then we have defined a new set of creation and annihilation operators

$\begin{equation}{\hat{c}}_{\kappa ,j}^{{\dagger}}=\sum _{\mu }{\hat{b}}_{\mu }^{{\dagger}}{\left({U}_{\kappa }\right)}_{\mu j},\quad {\hat{c}}_{\kappa ,j}=\sum _{\mu }{\hat{b}}_{\mu }{\left({\overline{U}}_{\kappa }\right)}_{\mu j},\end{equation} \tag{ 11 }$

with κ = 1, ..., N_b and j = 1, ..., n_κ that correspond to the DG basis set.

**Figure 4.** A schematic illustration of the compression process of delocalized active-space basis functions into DG basis functions. Beginning with a matrix representing the projection of a primitive basis onto a chosen active basis (left), with the primitive basis grouped into blocks represented by rows here. Those blocks are then reduced by a singular value decomposition (center), which finally leads to the DG basis that has a block-diagonal two-electron integral representation (right).
Download figure:
Standard image High-resolution image

Unlike equation (5), the basis set rotation in equation (11) is restricted to each element κ. We readily obtain the projected Hamiltonian in the DG basis as

$\begin{align}\hfill {\hat{H}}^{\left(d\right)}& =\sum _{\kappa ,{\kappa }^{\prime };j,{j}^{\prime }}{h}_{\kappa ,{\kappa }^{\prime };j,{j}^{\prime }}^{\left(d\right)}{\hat{c}}_{\kappa ,j}^{{\dagger}}{\hat{c}}_{{\kappa }^{\prime },{j}^{\prime }}\hfill \\ \hfill & \quad \enspace +\frac{1}{2}\sum _{\kappa ,{\kappa }^{\prime };i,{i}^{\prime },j,{j}^{\prime }}{v}_{\kappa ,{\kappa }^{\prime };i,{i}^{\prime },j,{j}^{\prime }}^{\left(d\right)}{\hat{c}}_{\kappa ,i}^{{\dagger}}{\hat{c}}_{{\kappa }^{\prime },{i}^{\prime }}^{{\dagger}}{\hat{c}}_{{\kappa }^{\prime },{j}^{\prime }}{\hat{c}}_{\kappa ,j}.\hfill \end{align} \tag{ 12 }$

The matrix elements are

$\begin{equation}{h}_{\kappa ,{\kappa }^{\prime };j,{j}^{\prime }}^{\left(d\right)}=\sum _{\mu \nu }{\left({\overline{U}}_{\kappa }\right)}_{\mu j}{h}_{\mu \nu }^{\left(p\right)}{\left({U}_{{\kappa }^{\prime }}\right)}_{\nu {j}^{\prime }},\end{equation} \tag{ 13 }$

and

$\begin{equation}{v}_{\kappa ,{\kappa }^{\prime };i,{i}^{\prime },j,{j}^{\prime }}^{\left(d\right)}=\sum _{\mu \nu }{\left({\overline{U}}_{\kappa }\right)}_{\mu i}{\left({\overline{U}}_{{\kappa }^{\prime }}\right)}_{\nu {i}^{\prime }}{v}_{\mu \nu }^{\left(p\right)}{\left({U}_{\kappa }\right)}_{\mu j}{\left({U}_{{\kappa }^{\prime }}\right)}_{\nu {j}^{\prime }}.\end{equation} \tag{ 14 }$

In general, the one-body matrix h^(d) can be a full dense matrix, but the two-body tensor v^(d) always takes a 'block diagonal' form in the following sense (it has a specific sparsity pattern). In principle, the two-body interaction in the DG basis set should take the form

$\begin{equation}\frac{1}{2}\sum _{\kappa ,{\kappa }^{\prime },\lambda ,{\lambda }^{\prime };i,{i}^{\prime },j,{j}^{\prime }}{v}_{\kappa ,i;{\kappa }^{\prime },{i}^{\prime };\lambda ,j;{\lambda }^{\prime },{j}^{\prime }}{\hat{c}}_{\kappa ,i}^{{\dagger}}{\hat{c}}_{{\kappa }^{\prime },{i}^{\prime }}^{{\dagger}}{\hat{c}}_{{\lambda }^{\prime },{j}^{\prime }}{\hat{c}}_{\lambda ,j}.\end{equation} \tag{ 15 }$

Compared to equation (12), we find that

$\begin{equation}{v}_{\kappa ,i;{\kappa }^{\prime },{i}^{\prime };\lambda ,j;{\lambda }^{\prime },{j}^{\prime }}={v}_{\kappa ,{\kappa }^{\prime };i,{i}^{\prime },j,{j}^{\prime }}^{\left(d\right)}{\delta }_{\kappa \lambda }{\delta }_{{\kappa }^{\prime }{\lambda }^{\prime }}.\end{equation} \tag{ 16 }$

In other words, v can be viewed as a block-diagonal matrix with respect to the grouped indices (κκ', λλ').

We remark that the convergence of the DG basis set is independent of the choice of the primitive basis set so long as the primitive basis has sufficient degrees of freedom to form a good approximation to the active-space functions of interest. At the end of this adaptive procedure, we expect the number of elements in the Hamiltonian to scale as $O\left({N}_{b}^{2}{n}_{\kappa }^{4}\right)$ . However, as we expect the number of basis functions required to reach a fixed accuracy within a block (i.e., n_κ) to be bounded by a constant as system size grows, and the scaling, with system size becomes $O\left({N}_{d}^{2}\right)$ . We substantiate the rapid asymptotic convergence of n_κ for real systems later in this work, however, simple arguments from spatial locality and basis set completeness lead to the same conclusion.

4. Quantum simulation with a DG basis

Here we introduce the methods used to exploit the properties of the DG basis on a quantum computer for evolution under the Hamiltonian. We first describe a method that implements evolution using a Suzuki–Trotter decomposition through swap networks, taking advantage of recent advances in quantum swap networks. An alternative Trotter approach based on low-rank decompositions is detailed in appendix A. We then describe a method for time evolution based on the linear combinations of unitaries approach, which will provide the required background for determining the degree of advantage for using DG basis sets for fault-tolerant quantum computations of chemistry.

4.1. Swap networks for block diagonal Hamiltonians

In the first work using a strictly diagonal basis in quantum computing for chemistry [36], the ability for quantum computers to perform fast Fourier transforms on quantum wavefunctions was exploited to capitalize on the representational advantages of being in either the plane-wave basis or its Fourier-transformed dual. That method was originally restricted to Hamiltonians with that particular structure in the coefficients, similar to split-operator Fourier transform methods used in classical simulation of quantum systems. However, it was soon realized that the structure of any diagonal Hamiltonian could be similarly exploited. This generalization used a linear, fermionic swap network to achieve perfect parallelization of a Trotter step with depth that scales linearly in the number of orbitals [57], even when gates are restricted to act on nearest neighbors of a line of qubits.

Fermionic swap networks are analogous to sorting networks from traditional computer science except built upon the primitive of the fermionic-swap operation,

$\begin{equation}{\hat{f}}_{\text{swap}}^{pq}=1+{\hat{a}}_{p}^{{\dagger}}{\hat{a}}_{q}+{\hat{a}}_{q}^{{\dagger}}{\hat{a}}_{p}-{\hat{a}}_{p}^{{\dagger}}{\hat{a}}_{p}-{\hat{a}}_{q}^{{\dagger}}{a}_{q},\end{equation} \tag{ 17 }$

$\begin{equation}{\hat{f}}_{\text{swap}}^{pq}{\hat{a}}_{p}^{{\dagger}}{\left({\hat{f}}_{\text{swap}}^{pq}\right)}^{{\dagger}}={\hat{a}}_{q}^{{\dagger}},\end{equation} \tag{ 18 }$

where ${\hat{f}}_{\text{swap}}^{pq}$ is the fermionic swap that swaps the labeling of modes. The fermionic-swap operation was introduced in [58] and also studied in the context of tensor networks. The difference between such a swap and a traditional swap is by swapping fermionic modes instead of assignments to qubits, nonlocal parity strings used to enforce the fermionic anticommutation relations can be avoided.

The basic idea of the linear swap network is to fermionic swap all neighboring qubits, interact them with their current neighbors, and repeat until all qubits have interacted with each other. Since the introduction and use of these linear fermionic swap networks in quantum algorithms, they have been generalized for use in nondiagonal Hamiltonians with some overhead. For example, the quantum chemistry Hamiltonian can be decomposed into a sum of diagonal Hamiltonians (each in a rotated basis) using techniques similar to Cholesky or density-fitting methods, where each diagonal Hamiltonian can then be implemented in sequence [39]. We show how to use this method in the DG representation in appendix A.

For the general Hamiltonian in quantum chemistry, which is nondiagonal, a generalized swap network that works directly with such Hamiltonians was developed [59]. This network implements time steps for generic $O\left({N}_{a}^{4}\right)$ Hamiltonians in a time that scales as $O\left({N}_{a}^{3}\right)$ , and we take advantage of it here with specializations for the block diagonal structure. To implement a Trotter step of the Hamiltonian, the swap network dynamically updates the mapping from qubits to orbitals so that for each term in the Hamiltonian, the involved orbitals are mapped to adjacent qubits. We say that a swap network 'acquaints' a set of orbitals when it brings them together at some point in this way, and represent that point by an empty box in the circuit diagrams, which acts as a placeholder for the logical gate to be executed there. Prior work utilizing swap networks has applied them to two extremal regimes with respect to the structure of the two-electron terms in the Hamiltonian: the strictly diagonal case, which can be implemented with O(N_p) depth [57]; and the fully general case, which can be implemented in $O\left({N}_{a}^{3}\right)$ [59]. Here we show how to interpolate between these to achieve $O\left({N}_{b}{n}_{\kappa }^{3}\right)=O\left({N}_{d}{n}_{\kappa }^{2}\right)$ depth for block-diagonal Hamiltonians. (For simplicity, in this section we will assume that all blocks have the same size n_κ, but the techniques generalize in a straightforward way to nonuniform block sizes.)

We focus on how to implement the quartic terms in the Hamiltonian (i.e., two-electron terms involving four distinct spin orbitals). The lower-order terms can be addressed with negligible additional resources by incorporating them into the quartic terms. The quartic terms in the block-diagonal Hamiltonian satisfy the following properties:

(a)
Two orbitals are from one block κ and two orbitals are from another block κ' (or all four from the same block when κ = κ').
(b)
The orbital spins have even parity (i.e., all up, all down, or two and two).

We will exploit both of these properties in constructing our swap network, which uses primitives originally designed for implementing unitary coupled cluster [59]. Figure 5 shows the overall swap network. Initially, the orbitals are arranged on the line in lexicographical ordering; only the block index κ and spin are indicated for concision. The logic of the strategy is as follows:

(a)
The first layer acquaints all sets of four spin orbitals within each block in which all four orbitals have the same spin. This is achieved by a 'four-complete' swap network on each half block of orbitals, denoted by ${K}_{{n}_{\kappa }/2}^{4}$ because the sets of orbitals it acquaints correspond to the edges of a complete four-uniform hypergraph; it has depth $O\left({n}_{\kappa }^{3}\right)$ . Note that the edges of the complete k-uniform hypergraph ${K}_{n}^{k}$ on n are the $\left(\genfrac{}{}{0.0pt}{}{n}{k}\right)$ sets of k vertices. The 'uniform' qualifier indicates that all of the hyperedges have the same number of vertices. See [59] for details.
(b)
The second layer acquaints all sets of four spin orbitals within each block in which two orbitals have spin up and the other two have spin down. This is achieved by a 'double bipartite' swap network on each block in depth $O\left({n}_{\kappa }^{3}\right)$ ; see figure D2.
(c)
The third layer permutes, in O(n_κ) depth, the orbitals within each block in preparation for the inter-block acquaintances to follow.
(d)
The rest of the strategy consists of N_b alternating layers that acquaint pairs of parts. In each layer, each block of qubits is paired up with an adjacent one and a 'balanced double bipartite' swap network is executed on the pair of blocks; see figure D3. Each balanced double bipartite swap network acquaints the sets of four orbitals containing two from each block and with even ('balanced') spin parity. This also has the effect of swapping the blocks, so overall a balanced double bipartite swap network is applied to every pair of blocks. Each double bipartite swap network has depth $O\left({n}_{\kappa }^{3}\right)$ .

**Figure 5.** Acquaintance strategy for block-diagonal Hamiltonian with N_b = 4 and n_κ = 10. ' ${K}_{n}^{4}$ ' indicates a four-complete swap network on n qubits, i.e., one that acquaints the $\left(\genfrac{}{}{0.0pt}{}{n}{4}\right)$ subsets of four qubits with each other. The other gates are double bipartite swap networks, explained in figures D2 and D3.
Download figure:
Standard image High-resolution image

**Figure 5.** Acquaintance strategy for block-diagonal Hamiltonian with N_b = 4 and n_κ = 10. ' ${K}_{n}^{4}$ ' indicates a four-complete swap network on n qubits, i.e., one that acquaints the $\left(\genfrac{}{}{0.0pt}{}{n}{4}\right)$ subsets of four qubits with each other. The other gates are double bipartite swap networks, explained in figures D2 and D3.
Download figure:
Standard image High-resolution image

Overall, the depth is $O\left({N}_{b}{n}_{\kappa }^{3}\right)$ , dominated by the latter swap networks that effect the inter-block interactions. The components of this approach are explained in more detail in appendix D and an alternative approach that may have asymptotic advantages in some regimes is discussed in appendix A.

4.2. LCU approaches for simulation

The quantum simulation algorithms discussed in the previous section and in appendix A are useful for implementing Trotter steps of the chemistry Hamiltonian. Such Trotter steps can be repeated to perform time-evolution for modeling dynamics or for preparing eigenstates via the phase estimation algorithm [60, 61], but they can also serve as an ansatz for composing quantum variational algorithms [15, 62]. This in conjunction with these Trotter steps requiring only minimal (linear) connectivity makes them attractive algorithms for near-term quantum computing. However, within cost models appropriate for a fault-tolerant quantum computer, Trotter steps are not the most competitive technique for chemistry simulation. In such a cost model, the key resource to minimize is the number of T gates required by the algorithm. This is because in most practical error-correcting codes (e.g., the surface code), T gates require many physical qubits for their distillation and are orders of magnitude slower to implement than Clifford gates.

Currently, the lowest T complexity quantum algorithms for simulating chemistry are all based on linear combinations of unitaries (LCU) methods [63]. This family includes Taylor series methods [64] (applied to chemistry in [14, 42]), qubitization [65] (applied to chemistry in [38, 65, 66]) and Hamiltonian simulation in the interaction picture [41] (applied to chemistry in [40, 41]). What all LCU methods have in common is that they involve simulating the Hamiltonian from a representation where it can be accessed as a linear combination of unitaries,

$\begin{equation}H=\sum _{\ell =1}^{L}{\omega }_{\ell }\enspace {U}_{\ell },\quad \lambda =\sum _{\ell =1}^{L}\left\vert {\omega }_{\ell }\right\vert ,\end{equation} \tag{ 19 }$

where U_ℓ are unitary operators, ω_ℓ are scalars, and λ is a parameter that determines the complexity of these methods. The second-quantized Hamiltonians discussed in this paper satisfy this requirement once mapped to qubits (e.g., under the Jordan–Wigner transformation) since strings of Pauli operators are unitary.

LCU methods perform quantum simulation in terms of queries to two oracle circuits defined as

$\begin{equation}\mathrm{SELECT}\vert \ell \rangle \vert \psi \rangle {\mapsto}\vert \ell \rangle {U}_{\ell }\vert \psi \rangle ,\end{equation} \tag{ 20 }$

$\begin{equation}\mathrm{PREPARE}\vert {0\rangle }^{\otimes \mathrm{log}L}{\mapsto}\sum _{\ell =1}^{L}\sqrt{\frac{{\omega }_{\ell }}{\lambda }}\vert \ell \rangle ,\end{equation} \tag{ 21 }$

where |ψ⟩ is the system register and |ℓ⟩ is an ancilla register which usually indexes the terms in equation (19) in binary and thus contains log L ancillae. Up to log factors in precision and other system parameters, LCU methods can perform time-evolution with gate complexity scaling as

$\begin{equation}O\left(\left({C}_{\mathrm{S}}+{C}_{\mathrm{P}}\right)\lambda \enspace t\right),\end{equation} \tag{ 22 }$

where C_S and C_P are the gate complexities of SELECT and PREPARE respectively, and t is time.

To implement the LCU oracles one must be able to coherently translate the index ℓ into the associated U_ℓ and ω_ℓ. In the quantum chemistry context the U_ℓ are related to the second-quantized fermion operators (e.g., ${\hat{a}}_{p}^{{\dagger}}{\hat{a}}_{q}^{{\dagger}}{\hat{a}}_{r}{\hat{a}}_{s}$ ) and the ω_ℓ are related to the molecular integrals. While the U_ℓ have a structure that is straightforward to unpack in a quantum circuit (see [38, 66] for explicit implementations), the ω_ℓ are typically challenging to compute directly from this index (unless one pursues the highly impractical strategy of computing the integrals on-the-fly, as in [14]). As a consequence, with state-of-the-art LCU methods for simulating chemistry the primary bottleneck has been implementation of PREPARE rather than SELECT [38, 66].

In the most performant LCU approaches for chemistry (see [66] for plane waves and [38] for arbitrary basis sets), PREPARE is implementing (and bottlenecked) by using a data-lookup routine referred to as QROM (quantum read-only memory) to load the ω_ℓ values into superposition. Using this approach, the cost of PREPARE is a function of the number of unique coefficients ω_ℓ in the Hamiltonian. Using the QROM of [67] (which improves on the original concept from [66]), one can (up to log factors) implement PREPARE with T complexity scaling as O(L/g + g) where L is the number of molecular integrals and g is a free parameter so long as at least g dirty ancilla are available during the implementation of PREPARE. As PREPARE acts only on the ancilla register, there are typically at least N_d dirty ancilla available (from the |ψ⟩ register).Thus (assuming $L{ >}{N}_{d}^{2}$ ) the scaling is O(L/N_d) without ancilla or $\mathcal{O}\left(\sqrt{L}\right)$ with $O\left(\sqrt{L}\right)$ ancilla (often a reasonable tradeoff within error-correction).

When simulating in a molecular orbital basis with $L=O\left({N}_{a}^{4}\right)$ , one can (up to log factors) evolve the Hamiltonian for some time t and achieve T complexity $O\left({N}_{a}^{2}\lambda t\right)$ [38]. If a straightforward extension of the approach in [38] is applied to the DG basis Hamiltonian with $L={n}_{\kappa }^{4}{N}_{b}^{2}$ then this scaling would be reduced to $O\left({n}_{\kappa }^{2}{N}_{b}\lambda t\right)$ . However, if one can index the ω_ℓ in a way that exploits symmetries in the coefficients, then one can further reduce the effective value of 'L' in both of these expressions. For instance, in [66] it is recognized that while there are ${N}_{p}^{2}$ coefficients ω_ℓ in the Hamiltonian of equation (2) (one for each value of ν and μ), there are only N_p unique coefficients due to the translational invariance of the Coulomb operator (essentially, one for each unique value of the index μ–ν), and the cost of the algorithm is reduced accordingly. As a rough rule of thumb, the scaling becomes O(J/g + g) where J can be thought of as the number of unique scalars needed to completely describe the Hamiltonian without recomputing the molecular integrals. In [38], it is shown that because only $J=O\left({N}_{a}^{3}\right)$ numbers are needed to describe the low-rank factorized Coulomb operator in an arbitrary basis [39], one can improve the scaling to $O\left({N}_{a}^{3/2}\lambda t\right)$ . By tailoring such techniques to symmetries such as periodic crystalline symmetry or other redundancies in the DG Hamiltonian (e.g., those exploited in the low-rank representation of appendix A) one can also further improve the $\mathcal{O}\left({n}_{\kappa }^{2}{N}_{b}\lambda t\right)$ scaling.

Finally, we note that the λ value associated with the molecular orbital basis Hamiltonian is likely larger than the λ value associated with the DG Hamiltonian, and we quantify the extent to which that is the later in this paper. This is yet another way in which the DG representation should lead to even more efficient implementation of LCU methods for chemistry simulation. Note also, that the value of λ is important for the scaling of quantum simulation in a near-term cost model as well. In particular, for quantum variational algorithms the number of measurements (corresponding to the number of circuit repetitions) required to estimate the energy of a Hamiltonian to within error scales as O(λ²/²) [68].

5. Numerical results

To understand the performance of different discretization schemes, we examine the costs and associated accuracy for each of the methods, using hydrogen chains of increasing lengths as test systems. We show that the crossover point occurs around 15 to 20 atoms, above which the DG representation with the block-diagonal Hamiltonian structure has significant lower costs. For two-dimensional and three-dimensional hydrogen lattices, we expect that the number of basis functions will increase slightly from our experience at the DFT level [43, 51].

The subsequent calculations are twofold. We begin with numerical simulations designed to exhibit the expected crossover behavior. We investigate hydrogen chains of increasing length (up to H₃₀) in a Gaussian cc-pVDZ basis [26] (the active space to be fit), where we choose the primitive diagonal basis set to be plane wave dual functions with refinement built to match the accuracy of the Gaussian basis set to a specified tolerance at the level of density functional theory with the Perdew–Burke–Ernzerhof (PBE) exchange–correlation functionals [69]. We are not aware of any electronic structure software package using a plane-wave basis set for practical all-electron DFT calculations. Hence we use the pseudopotential formulation based on the ONCV pseudopotential [70]. As we fit to the span of the basis set itself, rather than a density matrix generated with a fixed level of theory, these results should be representative of performance across different levels of theory including, but not limited to, correlated methods. This is done to observe the crossover without extrapolation, as by the size the crossover occurs, the more accurate correlated calculations become prohibitively expensive using traditional classical methods.

In the second set of calculations, we instead fit to a natural-orbital active space using a Gausslet basis set as the primitive diagonal basis set [34, 35] to demonstrate that this representation maintains accuracy in correlated calculations. This is possible as the properties of a Gausslet basis allow accurate DMRG calculations for comparison on these systems. Similar DMRG calculations are expected to be more expensive in basis sets with heavy tails, such as the plane-wave dual basis, which makes the one-particle Hamiltonian a dense matrix, and result in excessive bond dimension requirements for accurate descriptions. These calculations exhibit the versatility of this method and its ability to achieve accuracy, even in a correlated active space built fit to a highly nonlocal active space. Furthermore, we demonstrate that by fitting the DG basis simultaneously to two active basis sets (molecular orbitals at the UHF level and the cc-pVDZ basis functions), the DG basis set can obtain accurate results when compared to quantum Monte Carlo calculations at the complete basis set limit, while achieving a 1–2 orders of magnitude reduction of computational cost over the primitive Gausslet basis or the Gaussian basis set.

5.1. Scaling crossover in a DG plane-wave dual basis

In order to demonstrate the performance of the DG approach for a long hydrogen chain, we use the DFT-based MATLAB toolbox KSSOLV [71]. KSSOLV uses a plane-wave discretization in combination with pseudopotentials to make larger systems computationally tractable. Figure 6 shows that chemical accuracy is achieved when the kinetic-energy cutoff (E_cut) is set to around 20 hartree for H₁₀ using the ONCV pseudopotential. The kinetic-energy cutoff needs to be much larger for all-electron calculations using the plane-wave basis set. With this kinetic-energy cutoff, we can perform calculations on hydrogen chains up to H₃₀. For H₃₀ the box size is $10{\times}10{\times}118\enspace {a}_{0}^{3}$ , and the grid size is 20 × 20 × 238, or 95 200 plane-wave dual basis functions (around 3000 basis functions per atom).

**Figure 6.** Convergence of the total energy per atom for a H₁₀ system with respect to the kinetic-energy cutoff using the plane-wave basis set and the ONCV pseudopotential. Chemical accuracy (black dashed line) is achieved around E_cut = 20 hartree, which is the value used for all examples in this section.
Download figure:
Standard image High-resolution image

**Figure 7.** Potential energy surfaces for H₂₀ in DG basis with different tolerances and full plane-wave basis. At equilibrium position, the average number of DG basis per atom is given by ⟨n_κ⟩ for each fit tolerance specified by DG , where is the SVD cutoff. For comparison the number of primitive functions per atom here is approximately 3000. The primitive basis set is more expressive by design than the active-space Gaussian cc-pVDZ basis against which the DG fit is performed. This allows even fairly loose DG fits to match the accuracy of the active-space basis.
Download figure:
Standard image High-resolution image

epsilon — **Figure 7.** Potential energy surfaces for H₂₀ in DG basis with different tolerances and full plane-wave basis. At equilibrium position, the average number of DG basis per atom is given by ⟨n_κ⟩ for each fit tolerance specified by DG , where is the SVD cutoff. For comparison the number of primitive functions per atom here is approximately 3000. The primitive basis set is more expressive by design than the active-space Gaussian cc-pVDZ basis against which the DG fit is performed. This allows even fairly loose DG fits to match the accuracy of the active-space basis.
Download figure:
Standard image High-resolution image

We start by constructing an active basis set using the cc-pVDZ basis with the given molecular geometry, and sample it with an underlying grid of plane-wave dual functions. This yields a real-space discretization of the cc-pVDZ basis, which is then transformed into a block-diagonal form by means of a DG blocking procedure, also implemented in KSSOLV. As a technical note, the partitioning boundaries in the DG approach are important. The hydrogen chains that are the subject of this numerical investigation are quasi 1D-problems. This and the fact that we use a real-space discretization suggests that the ideal partitioning of the basis in terms of a DG-scheme (see section 3) corresponds to a nonuniform partitioning strategy, so that each hydrogen atom is approximately located at the center of each element. We remark that by construction the DG-blocking procedure is able to produce accurate results even if the partition is nonideal, e.g. when an hydrogen atom is located near the boundary of an element. However, this can require a larger number of basis functions per atom compared to the ideal partitioning strategy.

We first verify the accuracy of our DG-basis in figure 7 by comparing potential energy surfaces (PES) in DG discretizations of different tolerances with the corresponding PES in the primitive basis. The primitive basis has been chosen to be much more expressive than the Gaussian active-space basis, showing a lower absolute energy. This allows even a coarse DG fit to reliably match the Gaussian active-space accuracy, while being far more compact than the original primitive basis set. In fact, the energies obtained from the DG basis are slightly lower than those from the Gaussian basis set. This is because as n_κ increases, the span of the Gaussian basis set becomes approximately a subspace of the span of the DG basis set, and hence the DG basis can possibly yield lower energies due to the variational principle. Also due to the variational principle, the energies obtained from the DG basis are noticeably higher than those from the primitive plane-wave basis set, and the main limiting factor is the cc-pVDZ basis set to which DG is fitted. The average number of functions per atom ⟨n_κ⟩ is shown for each tolerance and is approximately 20, as compared to the primitive basis which is built from roughly 3000 functions per atom. The results suggest that the overall energetic accuracy is relatively insensitive even to rather aggressive singular value truncations in the DG-blocking procedure.

We then plot the average number of DG-basis functions per atom in figure 8 for fixed SVD thresholds in the DG-blocking procedure. The data confirms that, as the system size grows, ⟨n_κ⟩ approaches a constant value for a fixed SVD-truncation threshold. The combination of the energetic insensitivity beyond a kinetic-energy cutoff of 20 hartree and the asymptotically constant ⟨n_κ⟩ strongly supports the existence of a crossover regime where a DG discretization is more cost effective than other representations. We confirm the crossover directly by examining the quantities most relevant for fault-tolerant cost models of chemistry on a quantum computer, i.e., the number of nonzero two-electron integrals in each representation and the λ factor equation (B1) in figures 9 and 10, respectively (for λ factors for different bond lengths see appendix B, figure B1). A threshold of 10⁻⁶ is used to count an individual integral as 0 when calculating these quantities empirically on the systems of interest. We choose to illustrate the cost crossover for the bond length 1.7a₀ since we here detected the largest deviation with respect to the SVD-truncation threshold.

**Figure 9.** The number of nonzero two-electron integrals in different representations, for equilibrium and dissociation bond lengths with SVD-truncation tolerances of 10⁻¹, 10⁻² and 10⁻³, plotted on a log–log scale. We fit a trendline plotted with black dots from the second point onward to extract the scaling as a function of system size as N^α + c for some constant c, and list the exponent α beside each representation in the legend. As predicted, for these system sizes the number of two-electron integrals lies between the primitive and active-space representations, tending closer to the O(N²) scaling of the primitive representation, requiring fewer functions. Note that for the Gaussian basis set, certain elements of the two-electron integrals can vanish due to the symmetry of the atomic configuration of the hydrogen chain. This has a larger impact for small system than for large systems, and therefore the scaling is observed to be slightly larger than $O\left({N}_{a}^{4}\right)$ .
Download figure:
Standard image High-resolution image

**Figure 10.** λ value for Gaussian and DG basis. A core quantity in determining the cost in quantum algorithms, λ, is plotted as a function of system size for different representations. The notation DG indicates an SVD truncation threshold of in the blocking procedure. We observe an advantageous crossover before or around H₂₀ in all cases with respect to an actual value. We fit a trendline plotted with black dots from the second point onward to extract the scaling as a function of system size as N^α + c for some constant c, and list the exponent α beside each representation in the figure legend. We see the scaling for the DG basis is significantly better in all cases than for the active-space basis as well.
Download figure:
Standard image High-resolution image

Considering the cost-model in the fault-tolerant setting outlined in section 4.2, one can already observe a scaling advantage for the DG representation over simple Gaussian based active-space representations. Recall from that section that the cost using an LCU method to evolve for some time t is roughly $O\left(\sqrt{L}\lambda t\right)$ where L is the number of nonzero terms in the Hamiltonian. We consider here only the two-electron integrals as they represent the dominant cost contribution in most cases. For molecular orbital representations, this is generally $O\left({N}_{a}^{2}\lambda t\right)$ , which was subsequently improved to $O\left({N}_{a}^{3/2}\lambda t\right)$ using low-rank structure in the problem. For the hydrogen chains examined here in the Gaussian basis set, we see empirically over the system sizes considered $L\propto {N}_{h}^{4}$ and $\lambda \propto {N}_{h}^{2.5}$ , where N_h is the number of hydrogen atoms in the chain, leading to an expected cost scaling of $O\left({N}_{h}^{4.5}t\right)$ when not exploiting further low rank structure. In the DG representation for the same problem, we see that L ∝ 2.25 and λ ∝ 1.5, which suggests an empirical scaling for this physical system of $O\left({N}_{h}^{2.6}t\right)$ . The same calculation for the primitive basis suggests a scaling of $O\left({N}_{h}^{1.8}\right)$ , which is the lowest seen, however, the simulation cost and qubit counts required are many orders of magnitude higher here as seen in the figures. In both quantities we observe a constant factor crossover where the DG representation has strictly lower unique two-electron integrals and λ factor when compared to the Gaussian active-space basis at around 15 to 20 hydrogen atoms.This suggests that the DG representation is not only advantageous in the asymptotic scaling, but also for modest finite system sizes for fault-tolerant implementations.

5.2. Correlated calculations in a DG Gausslet basis

Here we demonstrate that the performance of the DG basis set for a vastly different regime, which uses a natural-orbital active space from an exact DMRG calculation fit to Gausslet primitive functions in a correlated calculation. The Gausslet basis set is a recent approach to improve the discretization of quantum chemical problems [34, 35]. It has a special focus on sparsity, spatial locality, and orthonormality, to fulfill the needs of strong-correlation methods like the DMRG, while keeping the number of basis functions lower than other grid-based bases. Our calculations illustrate the accuracy of the coupled-cluster method with single and double excitations (CCSD) method in a DG-basis with an underlying primitive Gausslet basis set with respect to the original active space. The block-diagonal form of the Hamiltonian in a DG-basis yields an asymptotic improvement in the number of nonzero two-electron integrals, which is directly related to the circuit depth and size. The calculations presented here are limited to short hydrogen chains due to the size of the Gausslet basis set and expense of correlated calculations. The goal of these calculations is to demonstrate that the DG approach maintains accuracy for correlated calculations due to its construction as a fit to the span of the active basis.

Specifically, we compare CCSD-energy results in an active-space basis set with calculations in a DG-basis fit to the same active-space basis. This serves to quantify the overhead in restricting the basis to have block locality versus the totally delocalized natural-orbital active space. The calculations are performed for H₂, H₄, H₆ and H₈ with varying symmetrically stretched bond lengths. Figure 11 shows the calculation for H₈ with optimal DG partitioning of the Gausslet basis set, the respective figures for H₂, H₄, and H₆ are presented in appendix C (figures C1–C3). From these plots one may see that enforcing a block-diagonal structure of the Hamiltonian does not reduce the accuracy further from the active-space approximation.

The Gausslet basis for computations resulting in figure 11 consists of 1336 Gausslets corresponding to an average of 167 functions per atom. This is significantly lower than the number of basis functions per atom in the plane-wave dual basis which reflects the variable resolution available with the Gausslet basis. Averaging over all values along the PES, we find that the full two-electron integral tensor for this basis has ∼2 310 318 nonzero elements. Using a DG basis it is possible to reduce this number to ∼511 449 nonzero two-electron integrals without losing accuracy compared to the respective active-space approximation. This suggests cost reductions for correlated calculations that are similar to the observations in section 5.1. Note, however, that H₈ is a small system compared to the systems considered in section 5.1, consequently, the number of nonzero two-electron integrals in the DG-basis is still larger than in the active-space approximation (∼52 346). This aligns with the results from the previous section, which showed that the improvement of the nonzero two-electron integral count for the DG-basis becomes observable for 15 to 20 atoms, depending on the imposed truncation tolerance. A naive extrapolation of the nonzero two-electron integral count in table C2 (see appendix C figure C4) suggests a crossover around 25 atoms for the coupled-cluster calculations, which is again in agreement with the computations in section 5.1 with a low truncation tolerance (see appendix B table B1). We conclude that the trial calculations for small systems performed here together with results from section 5.1 suggest that the DG approach maintains the accuracy of active-space approximations for correlation methods but with a more efficient representation on quantum devices once a certain system size is reached.

Table B1. Crossover regions for different bond lengths and different truncation tolerances.

Tolerance	10⁻³	10⁻²	10⁻¹
b = 1.0	H₁₆–H₁₈	H₁₄–H₁₆	H₈–H₁₀
b = 1.2	H₁₈–H₂₀	H₁₄–H₁₆	H₆–H₈
b = 1.4	H₂₀–H₂₂	H₁₄–H₁₆	H₆–H₈
b = 1.6	H₂₀–H₂₂	H₁₂–H₁₄	H₆–H₈
b = 1.7	H₂₀–H₂₂	H₁₀–H₁₂	H₆–H₈
b = 1.8	H₁₈–H₂₀	H₈–H₁₀	H₈–H₁₀
b = 2.0	H₁₆–H₁₈	H₈–H₁₀	H₈–H₁₀
b = 2.4	H₁₀–H₁₂	H₈–H₁₀	H₈–H₁₀
b = 2.8	H₈–H₁₀	H₈–H₁₀	H₈–H₁₀
b = 3.0	H₈–H₁₀	H₈–H₁₀	H₈–H₁₀
b = 3.2	H₈–H₁₀	H₈–H₁₀	H₆–H₈
b = 3.6	H₈–H₁₀	H₈–H₁₀	H₄–H₆

5.3. DMRG calculations in a DG basis

Here we examine the power of the DG technique to compactly and cheaply represent problems for classical, correlated DMRG calculation by constructing an active space that takes static correlation from a UHF calculation done in a flexible primitive basis set such as Gausslets, combined with contributions from a basis set that has been empirically refined to capture dynamic correlations. These calculations have the dual purpose of demonstrating the power of the DG approach even in highly-correlated calculations more generally, including calculations on a quantum computer. Here we demonstrate a hybrid active-space approach for cheaply finding a balanced active-space. Specifically, after performing a UHF calculation in a Gausslet basis, we project both the cc-pVDZ basis set and the UHF orbitals onto the primitive Gausslet basis. The UHF calculation can be performed cheaply. By including the UHF orbitals in our active-space, we ensure that there are no HF-level errors in the basis. The only lack of completeness is associated with correlation beyond HF. To (partially) capture correlation in a compact way, we include contributions from the empirical Gaussian basis with a slightly smaller factor, α = 0.01 here, then perform the DG blocking procedure to develop the basis as before. The resulting basis maintains the flexibility of the HF solution in the Gausslet basis for the low-lying orbitals, while now including the refined features of the Gaussian orbitals without losing the block-diagonal structure. Although it is difficult to compare the efficiency of primitive and DG bases in a precise manner, we find that this approach can achieve a 1–2 orders of magnitude reduction of computational cost over either the primitive Gausslet basis or Gaussian basis set, while achieving excellent accuracy with respect to the complete basis set limit.

Numerical calculations for the H₁₀ system are shown in figure 12. For comparison, we perform an unrestricted Hartree–Fock calculation in both a traditional Gaussian basis set (cc-pVDZ), and a multi-sliced Gausslet basis set. The Gaussian basis set contains 5 spatial orbitals per atom, totaling to 50 spatial orbitals with the associated nondiagonal two-electron integrals as one would expect. The Gausslet basis is formed adaptively according to pre-determined cutoffs, and the number of functions ranges from 7000–10 000 spatial orbitals for these calculations, while retaining the diagonal property, making calculations at the UHF level relatively straightforward, even with such a formidable number of basis functions. This large primitive basis gives UHF results near the complete basis set limit, well beyond the accuracy of this Gaussian basis. This high accuracy in the HF comes at little cost; the UHF is still fast compared to the correlated calculations and the large number of primitive functions do not strongly affect the size of the DG basis. For larger systems the accuracy could be reduced to keep the HF manageable.

To construct the DG basis, we construct 10 spatial blocks, 1 around each atom. We make use of the UHF calculation density matrix and keep 7 total orbitals per block, yielding 70 total spatial orbitals, a number almost identical to the number of Gaussians in the cc-pVDZ basis, but maintaining the block diagonal property of the two-electron integrals. By construction, UHF in this basis can accurately match the UHF results of the Gausslet basis set. The introduction of correlation through DMRG on this basis shows improvement as expected but a relative offset from the exact answer due to UHF's focus on static correlation. By using the weigthing procedure, in order to include UHF orbitals in the DG construction, we find excellent agreement with calculations done in the complete basis set limit. To that end, a weighting factor of α = 0.01 was imposed. The number of functions kept is 15 per block, yielding 150 functions in total, which is about three times larger than the cc-pVDZ basis. However, the structure of the interactions and spatially local construction of the DG functions allows the DMRG calculation to be done with a 1–2 order of magnitude reduction in computational cost as compared to using the DG or Gaussian basis sets alone for the current implementation. This suggests that the hybrid active-space approach with the DG-blocking procedure is a powerful technique for recovering both static and dynamic correlation in a cost-effective manner.

To elaborate on the scaling of DMRG, m be the bond dimension for the state required for the desired accuracy. Then the computational cost of the Gaussian basis sets is expected to be $O\left({N}_{a}^{3}{m}^{3}\right)$ . In contrast, for Gausslets and DG representations based on Gausslets, using matrix product operator (MPO) compression and let D be the MPO dimension, one expects the asymptotic cost to be O(N_pDm³) and O(N_dDm³), respectively. Due to the localization properties, D is expected to depend weakly on length, and be comparable for both the Gausslet and DG block representations. One can see this from considering the MPO decomposition in the Gausslet representation then transformed to the DG representation. While some expansion of the bond dimension could happen within a block, it is bounded by the Gausslet dimension of the block from the properties of a Schmidt decomposition, and no inter-block mixing occurs. As a result, the bond dimension will be comparable. Hence, it is the massive reduction in number of basis functions, from 10 000 to 150 that reduces the cost by several orders of magnitude while maintaining excellent accuracy. The Gaussian basis set, while advantageous in number of functions, suffers from lack of spatial locality, but the cost of this nonlocality in terms of the required bond dimension for equivalent accuracy has not been studied, and likely depends significantly on the completeness of the basis. It is easier to compare the scaling of the Gaussian DMRG due to the lack of diagonality, resulting in a much larger set of two-electron integrals. In this case, the block diagonality of the DG basis typically results in approximate linear scaling in the number of atoms, versus cubic dependence with Gaussian DMRG. Hence in both cases, the DG approach offers a significant reduction in computational cost for an accuracy that nearly approaches the complete basis set limit, which was obtained through accurate quantum Monte-Carlo calculations [35, 72].

6. Conclusion

The discretization problem is a crucial aspect determining the cost and effectiveness of quantum chemistry methods both for classical and quantum methods. The most popular basis sets for correlated calculations, Gaussian based molecular orbitals, are notably more compact than alternatives and have a well developed set of tools for their use. Unfortunately, in cost models for quantum computation, the overhead of a quartic number of terms in the Hamiltonian leads to poor scaling with system size. On the other side of the spectrum, representations that achieve a strictly quadratic number of terms and a diagonal representation, such as Gausslets or plane-wave dual functions exhibit excellent scaling with system size, but have overheads that make them undesirable for modest size implementations.

Here we introduced a systematic method for interpolating between the two regimes through the use of a blocking procedure, motivated from the discontinuous Galerkin (DG) method. This method is able to use any primitive basis with the diagonal property to represent a delocalized active-space basis, while maintaining the diagonal property between blocks. By choosing a plane-wave dual primitive basis and Gaussian active space, we were able to show how one can adaptively interpolate between these two regimes to attain both a scaling and constant factor advantage over the target active space.

When these empirical results are put into the context of known costs for exact quantum algorithms for chemistry, we observed a scaling improvement over Gaussian basis sets from $O\left({N}_{h}^{4.5}\right)$ to $O\left({N}_{h}^{2.6}\right)$ with a constant factor crossover around 15 to 20 hydrogen atoms. This suggests that for modest sized systems, such as those just beyond the classically tractable regime, this representation will be the optimal choice for quantum algorithms. Moreover, we showed that for high accuracy DMRG calculations, one may take advantage of this representation to achieve a high accuracy calculation with a cost reduction that is over an order of magnitude with respect to traditional representations. In all cases, one may use this methodology to scale between a compact representation and one with superior integral scaling depending on the requirements of a particular method.

Acknowledgments

This work was partially supported by the Department of Energy under Grant No. DE-SC0017867, the Quantum Algorithm Teams Program under Grant No. DE-AC02-05CH11231, the Google Quantum Research Award (LL), the Research Council of Norway under CoE Grant No. 262695, the Peder Sather Grant Program (FMF), the NASA Space Technology Research Fellowship (BO), and the Ning fellowship (QZ).

Appendix A.: Trotter step by low-rank factorization

As an alternative to the fixed swap networks used in the main text to evolve under the two-electron integral term in the DG basis, one may use the low-rank factorization strategy in [39], but applied to the block diagonal matricized tensor ${v}_{\kappa ,{\kappa }^{\prime },\lambda ,{\lambda }^{\prime };i,{i}^{\prime },j,{j}^{\prime }}^{\left(d\right)}$ . For such a matrix, we know the maximum number of Cholesky factors is given by the dimension of the matrix, ${N}_{d}^{2}$ , however, because of the block structure, each of the Cholesky factors need only have non-trivial support only within a κ, κ' block. If n_κ bounds the larger of n_κ, n_κ', then the dimension of one of these blocks is $O\left({n}_{\kappa }^{2}\right)$ . It is easy to see that the total matricized tensor has dimension $O\left({n}_{\kappa }^{2}{N}_{b}^{2}\right)$ , and a number of non-zero entries scaling as $O\left({n}_{\kappa }^{4}{N}_{b}^{2}\right)$ . Hence it matches the primitive limit as n_κ → 1 and N_b → N_d and the active space limit as n_κ → N_d and N_b → 1.

To execute a single Trotter step of the two-electron part of the Hamiltonian, one may start from a factorization that is a product over κ, κ' blocks as

$\begin{align}\hfill \mathrm{exp}\left[-\mathrm{i}{\Delta}t\sum _{\kappa ,{\kappa }^{\prime };i,{i}^{\prime },j,{j}^{\prime }}{v}_{\kappa ,{\kappa }^{\prime };i,{i}^{\prime },j,{j}^{\prime }}^{\left(d\right)}{\hat{c}}_{\kappa ,i}^{{\dagger}}{\hat{c}}_{{\kappa }^{\prime },{i}^{\prime }}^{{\dagger}}{\hat{c}}_{{\kappa }^{\prime },{j}^{\prime }}{\hat{c}}_{\kappa ,j}\right]& \approx \prod _{\kappa ,{\kappa }^{\prime }}\mathrm{exp}\left[-\mathrm{i}{\Delta}t\sum _{i,{i}^{\prime },j,{j}^{\prime }}{v}_{\kappa ,{\kappa }^{\prime };i,{i}^{\prime },j,{j}^{\prime }}^{\left(d\right)}{\hat{c}}_{\kappa ,i}^{{\dagger}}{\hat{c}}_{{\kappa }^{\prime },{i}^{\prime }}^{{\dagger}}{\hat{c}}_{{\kappa }^{\prime },{j}^{\prime }}{\hat{c}}_{\kappa ,j}\right]\hfill \\ \hfill & \equiv \prod _{\kappa {\kappa }^{\prime }}{R}_{\kappa {\kappa }^{\prime }}\hfill \end{align} \tag{ A1 }$

Within a non-trivial block $\left({v}_{\kappa ,{\kappa }^{\prime },\lambda ,{\lambda }^{\prime };i,{i}^{\prime },{j}^{\prime },{j}^{\prime }}^{\left(d\right)}={v}_{\kappa ,{\kappa }^{\prime };i,{i}^{\prime },{j}^{\prime },{j}^{\prime }}^{\left(d\right)}\right)$ we expect the following decomposition

$\begin{align}\hfill {v}_{\kappa ,{\kappa }^{\prime };i,{i}^{\prime },{j}^{\prime },{j}^{\prime }}^{\left(d\right)}\approx \sum _{\mu \nu \ell }& \left({U}_{i\mu }^{\left(\ell ,\kappa ,{\kappa }^{\prime }\right)}{U}_{{i}^{\prime }\mu }^{\left(\ell ,\kappa ,{\kappa }^{\prime }\right)}{U}_{j\nu }^{\left(\ell ,\kappa ,{\kappa }^{\prime }\right)}\cdot {U}_{{j}^{\prime }\nu }^{\left(\ell ,\kappa ,{\kappa }^{\prime }\right)}{\lambda }_{\mu }^{\left(\ell ,\kappa ,{\kappa }^{\prime }\right)}{\lambda }_{\nu }^{\left(\ell ,\kappa ,{\kappa }^{\prime }\right)}\right).\hfill \end{align} \tag{ A2 }$

Here ℓ comes from Cholesky factorization, and μ, ν comes from a second eigenvalue decomposition. As for a single block, κ, κ', we assume n_κ is independent of the system size, so are the index ranges of μ,ν, ℓ. For each ℓ, κ, κ', U^(ℓ,κ,κ') is a matrix with orthogonal columns. This decomposition allows us to apply R_κκ' using the low-rank decomposition technique. The maximum rank of these factors is L, which empirically for Gaussian basis sets we expect to scale as O(n_κ), but due to the lack of empirical data for large DG basis sets we assume the worst case, $O\left({n}_{\kappa }^{2}\right)$ . For each of these factors, the second eigenvalue decomposition will have maximal rank ρ_ℓ = O(n_κ). The depth of such circuits using a fermionic swap network scales as ∑_ℓρ_ℓ < Ln_κ, which in empirical studies on molecular orbital basis sets scaled at least as Ω(n_κ log n_κ) and in the worst case $O\left({n}_{\kappa }^{3}\right)$ . In any case, for a given error tolerance, from locality we know that n_κ converges to a constant as a function of system size, and hence each R_κκ' may be executed in constant depth for large system using the low-rank Trotter step, which simultaneously executes a fermionic swap network.

Following this idea further, if one constructs a system mapped to qubits in 1D, where the mapping to orbitals is grouped by κ, then the κκ' block may be executed for κ' = κ + 1, and at the same time, the two blocks may be swapped with this technique. Moreover, as this adjacent κκ' block is disjoint from the qubits outside of κκ' this operation may be parallelized across pairs of adjacent blocks. This corresponds exactly to lifting the original fermionic swap network for time evolution to the level of blocks, where the individual fermionic simulation operations (as they were referred to in the original work) are performed via the low-rank method. From this, we see that the total required depth scales as O(N_b) as desired. Although we have assumed here that n_κ approaches a constant as system size grows, if we account for this in the worst case, we expect a depth of $O\left({N}_{b}{n}_{\kappa }^{3}\right)$ , which matches the fixed swap networks in the main body of the text. Hence, determining the optimal choice of implementation will be done to constant factors and requires further empirical study.

Appendix B.: Scaling crossover in a DG plane wave dual basis

Besides the total number of two-electron integrals, the cost of certain quantum algorithms such as the LCU method depends also on λ, the sum of the absolute value of the two-electron integrals. Here λ is computed as

$\begin{equation}\lambda =\sum _{\left\{\kappa ,{\kappa }^{\prime };pqrs\right\}/\left\{p=r,q=s\right\}}\vert {v}_{\kappa ,{\kappa }^{\prime };p,q,r,s}\vert .\end{equation} \tag{ B1 }$

Notice that the terms {p = r, q = s} nominally diverge in a plane wave basis set with periodic boundary conditions. These values can be computed by including correction terms to the periodic boundary condition [73, 74]. However, the number of such terms in the two-body interaction scales the same as that in the one-body interaction in the Hamiltonian, so we omit such low-order terms directly for simplicity. In figure 10, we observe that the crossover point occurs around H₈ when the tolerance is set to 0.1, and around H₁₈ when the tolerance is 0.01.

Appendix C.: Correlated calculations in a DG Gausslet basis

Figures C1–C3 show the DG calculations for H₂, H₄ and H₆ using Gausslets as the primitive basis set.

**Figure B1.** The number of non-zero two-electron integrals with bond lengths = 1.0 to 3.6 in atomic units, tol = 10⁻¹ to 10⁻³, plotted on log–log scale. The crossover appears before or around H₂₂ in all cases.
Download figure:
Standard image High-resolution image

**Figure C1.** Potential energy surfaces for H₂ in a Gausslet basis computed by restricted SCF, (DG-)CCSD and benchmarked with DMRG results. The averaged number of non-zero two-electron integrals per atom (DG-element κ) is 4144. The averaged number of non-zero two-electron integrals of the CAS calculations is 205.
Download figure:
Standard image High-resolution image

**Figure C2.** Potential energy surfaces for H₄ in a Gausslet basis computed by restricted SCF, (DG-)CCSD and benchmarked with DMRG results. The averaged number of non-zero two-electron integrals per atom (DG-element κ) is 16 580. The averaged number of non-zero two-electron integrals of the CAS calculations is 3269.
Download figure:
Standard image High-resolution image

**Figure C3.** Potential energy surfaces for H₆ in a Gausslet basis computed by restricted SCF, (DG-)CCSD and benchmarked with DMRG results. The averaged number of non-zero two-electron integrals per atom (DG-element κ) is 287 762. The averaged number of non-zero two-electron integrals of the CAS calculations is 16 551.
Download figure:
Standard image High-resolution image

Table C1 shows information on the real space discretization for the used Gausslet basis set.

Table C1. Averaged numbers Gausslets used to discretize the molecular system.

Atoms	H₂	H₄	H₆	H₈
Tot. no. of Gausslet	529	801	1066	1336
Gausslets in z-axis	11–13	15–21	17–27	21–33
Gausslets in x-resp. y-axis	6.33–7.40	6.92–7.53	7.31–7.78	7.27–8.01

Table C2 compares the non-zero two-electron integral count for the DG-Gausslet basis and the active space basis. Figure C4 shows a naive extrapolation of the non-zero two-electron integral count presented in table C2.

Table C2. Averaged numerical values of non-zero two-electron integrals for H₂, H₄, H₆ and H₈.

Atoms	H₂	H₄	H₆	H₈
No of Gausslets	529	801	1066	1336
nnz-tei Gausslet	351 084	816 232	1 459 901	2 310 318
DG per elem.	6	6	10	10
nnz-tei DG	4144	16 580	287 762	511 449
CAS	4	8	12	16
nnz-tei CAS	205	3269	16 551	52 346

Appendix D.: Swap network sub-circuits

In this section, we give some more detail about the components of the swap network described in section 4.1. Recall the structure of the overall swap network:

(a)
A four-complete swap network within each half-block. This acquaints all sets of 4 orbitals with the same spin and within each block.
(b)
A double bipartite swap network on each block. This acquaints all sets of 4 orbitals with no net spin and within each block. Details of the construction are given in figure D2.
(c)
A permutation within each block. This changes the orbital to qubit mapping in preparation for the next stage.
(d)
Alternating layers of balanced double bipartite swap networks. Each balanced double bipartite swap network acquaints, for some pair of blocks, all sets of 4 orbitals with an even number of each spin and with two orbitals from each block. The N_b alternating layers ensure that every pair of blocks is involved together in some balanced double bipartite swap network. Details of the construction of a double bipartite swap network are given in figure D3.