
Local, expressive, quantum-number-preserving VQE ansätze for fermionic systems


Published 12 November 2021 © 2021 The Author(s). Published by IOP Publishing Ltd on behalf of the Institute of Physics and Deutsche Physikalische Gesellschaft
Citation: Gian-Luca R Anselmetti et al 2021 New J. Phys. 23 113010, DOI 10.1088/1367-2630/ac2cb3


1367-2630/23/11/113010

Abstract

We propose VQE circuit fabrics with advantageous properties for the simulation of strongly correlated ground and excited states of molecules and materials under the Jordan–Wigner mapping that can be implemented linearly locally and preserve all relevant quantum numbers: the number of spin up (α) and down (β) electrons and the total spin squared. We demonstrate that our entangler circuits are expressive already at low depth and parameter count, appear to become universal, and may be trainable without having to cross regions of vanishing gradient, when the number of parameters becomes sufficiently large and when these parameters are suitably initialized. One particularly appealing construction achieves this with just orbital rotations and pair exchange gates. We derive optimal four-term parameter shift rules for and provide explicit decompositions of our quantum number preserving gates and perform numerical demonstrations on highly correlated molecules on up to 20 qubits.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Hybrid quantum classical variational algorithms, including those of the variational quantum eigensolver (VQE) type [1, 2], are among the leading candidates for quantum algorithms that may yield quantum advantage in areas such as computational chemistry or machine learning already in the era of noisy intermediate scale quantum (NISQ) computing [3]. A foundational issue in VQE [1, 2], and in many of its extensions and alternatives [4–11], is finding a 'good' definition of the entangler circuit. Here the qualifier 'good' has many facets, possibly including: (1) providing an efficient approximate representation of the target quantum states in the limit of an intermediate (ideally polynomial-scaling) depth; (2) consisting of a low number of distinct physically realizable gate elements; (3) exhibiting a simple pattern of how these gate elements are applied; (4) exhibiting sparse spatial locality that is further compatible with device connectivity; (5) exhibiting simple analytical gradient recipes and robust numerical convergence behavior during optimization of the VQE energy, e.g. by mitigating the effects of barren plateaus [12]; (6) respecting exactly the natural particle and spin quantum number symmetries of the target quantum states, e.g. as notably explored in [13]; (7) providing an exact representation of the target quantum states in the limit of sufficient (usually exponential-scaling) depth.

Especially within the context of VQE for spin-1/2 fermions governed by real, spin-free Hamiltonian operators (e.g. electrons in molecules and materials, the prime application of VQE), a variety of compelling VQE entangler circuit recipes have been discussed in the literature. Prominent examples include UCCSD [1, 14, 15], k-UpCCGSD [16, 17], Jastrow-factor VQE [18], the symmetry-preserving ansätze [13], the hardware efficient ansätze [19, 20], ADAPT-VQE [21], pUCCD [22], and additional methods discussed below [23–28]. Each of these generally satisfies a subset of the 'good' facets listed above, though no extant ansatz that we are aware of obtains all of them, with the notable exception of the generalized swap network form of k-UpCCGSD of [17].

In this work we develop a VQE entangler circuit recipe for fermions in the Jordan–Wigner representation and show, or at least provide evidence, that it obtains all facets, with facet (5) partially left to future numerical studies. Perhaps the most notable property of our fabrics is the exact preservation of all relevant quantum numbers by the individual gate elements of the fabric, which is why we refer to them as quantum number preserving (QNP). This property may be critical for employment of VQE in larger systems, where contaminations from or even variational collapse onto states with different particle or spin quantum numbers can severely degrade the quality of the VQE wavefunction.

Note that after we posted the first version of our manuscript, we became aware of the generalized swap network reformulation of k-UpCCGSD of [17]. This paper refactors k-UpCCGSD to use nearest-neighbor connectivity, yielding a circuit fabric that could be written in terms of four-qubit gates containing diagonal pair exchange and orbital rotation elements in a very similar manner as our $\hat{Q}$-type QNP gate fabric discussed below. There are some tactical differences in the qubit ordering and the generalized swap network paper does not emphasize the role of quantum number symmetry as much as the present manuscript. Moreover, the origin of the $\hat{Q}$-type QNP gate fabric as a simplification of our more-complete $\hat{F}$-type QNP gate fabric of appendix D provides a markedly different approach to developing this gate fabric. In any case, we encourage any readers interested in the present manuscript to also explore [17].

2. Gate fabrics

Our VQE entangler circuit recipe draws inspiration from the well known fact that the qubit Hilbert space (without any fermionic symmetries) can be spanned by a tessellation of two-qubit gates universal for $\mathcal{SU}(4)$ in alternating layers (see figure 1). This tessellation can formally be repeated to infinite depth. However, one finds that after some finite, N-dependent critical depth of order $\mathcal{O}({2}^{2N})$, additional gate layers do not increase the expressiveness of the circuit, as formal completeness (denoted 'universality') in $\mathcal{SU}({2}^{N})$ is achieved. In practice usually shorter circuit depths are of interest. For instance, one may consider the case where the tessellation is restricted to be polynomial scaling in N, in which case universality cannot be exactly achieved. However, a good approximation of specialized (e.g. physically relevant) parts of some subgroup may still be achievable in a way that is tractable to compute even on an NISQ computer but intractable to compute with a classical device.


Figure 1. Sketch for N = 6 of a gate fabric universal for $\mathcal{SU}({2}^{N})$ providing inspiration for the fermionic QNP gate fabrics developed here. The gate fabric is a two-local-nearest-neighbor tessellation of alternating 15-parameter, two-qubit SU(4) gates. The SU(2) gate in the SU(4) gate decomposition on the bottom line is the three-parameter universal gate for the one-qubit Bloch sphere. The indicated 24-parameter decomposition of SU(4) is overcomplete for the 15-parameter $\mathcal{SU}(4)$ group.


Note that it is difficult to find a single landmark reference explicitly proving that the fabric in figure 1 is universal for $\mathcal{SU}({2}^{N})$, but every research group we have discussed the matter with acknowledges that this gate fabric is widely known in the field. Moreover, it is simple to prove that this gate fabric is universal for $\mathcal{SU}({2}^{N})$ if one starts from the well-known fact that circuits composed of arbitrary one-qubit operations and CX or CZ gates between arbitrary pairs of qubits are universal for $\mathcal{SU}({2}^{N})$ [29]. One can picture an arbitrary-depth version of the gate fabric in figure 1 where only one $\hat{S}U(2)$ or CZ gate at desired qubit indices is active in each layer (the latter possible by repeating layers of CZ gates with trivial $\hat{S}U(2)$ gates interleaving). Then the only requirement is to extend the local nearest neighbor connections of CZ gates to arbitrary pairs of qubit indices. This is easily accomplished by adding additional layers of the fabric whose gates are initialized to implement SWAP gates to expand the linear nearest-neighbor CZ gates to arbitrary connectivity. And thus one obtains a circuit with one-qubit $\mathcal{SU}(2)$ gates with free parameters at arbitrary desired qubit positions interleaved with CZ gates between arbitrary pairs of qubit positions, which is well known to be universal. This construction suffices to prove universality, but is obviously extremely wasteful. In practice, we find that unconstraining all of the $\mathcal{SU}(2)$ parameters markedly improves the expressivity of the gate fabric at a given depth, and that action or operator universality is numerically obtained when the number of free $\mathcal{SU}(2)$ parameters is similar to (strictly, greater than or equal to) the number of free parameters in the many-body unitary action or operator.
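The parameter-counting argument above can be made concrete with a small sketch. It assumes, for illustration only, that each two-qubit gate contributes 15 independent parameters (the dimension of $\mathcal{SU}(4)$, rather than the overcomplete 24-parameter decomposition) and that the fabric starts with a layer on qubit pairs (0,1), (2,3), ...; the function names are hypothetical. The result is only a counting lower bound on the depth, not a proof of universality at that depth.

```python
def su_dim(n_qubits: int) -> int:
    """Number of real parameters of SU(2^N), i.e. 4^N - 1."""
    return 4 ** n_qubits - 1


def gates_in_fabric(n_qubits: int, n_layers: int) -> int:
    """Two-qubit gates in a nearest-neighbor fabric with alternating layers."""
    even = n_qubits // 2        # gates on pairs (0,1), (2,3), ...
    odd = (n_qubits - 1) // 2   # gates on pairs (1,2), (3,4), ...
    n_even_layers = (n_layers + 1) // 2  # assumption: the first layer is an even layer
    n_odd_layers = n_layers // 2
    return n_even_layers * even + n_odd_layers * odd


def min_layers_by_counting(n_qubits: int, params_per_gate: int = 15) -> int:
    """Smallest layer count whose parameter count reaches dim SU(2^N)."""
    target = su_dim(n_qubits)
    layers = 0
    while gates_in_fabric(n_qubits, layers) * params_per_gate < target:
        layers += 1
    return layers


if __name__ == "__main__":
    for n in (2, 4, 6):
        print(n, su_dim(n), min_layers_by_counting(n))
```

Because the target grows as $4^{N}$ while each layer adds only about $15N/2$ parameters, the bound grows exponentially in N, in line with the $\mathcal{O}({2}^{2N})$ critical depth quoted above.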

Particularly striking in figure 1 is the locality (alternating nearest neighbor connectivity) and simplicity (single gate element) of the circuit, properties of what we call a 'gate fabric.' More precisely, throughout this manuscript, we define a gate fabric for a subgroup of $\mathcal{SU}({2}^{N})$ to be a tessellation of gates over N-qubits with the following properties:

  • (a)  
    Simplicity: composed of a single type of k-qubit, l-parameter gate element (with a known decomposition into elementary gates), where k and l are independent of N.
  • (b)  
    Linear locality: when the qubits are thought of as arranged on a vertical line the gate elements are arranged in layers and connect up to k contiguous qubits.
  • (c)  
    Universality: achieving universality within the target subgroup of $\mathcal{SU}({2}^{N})$ within a finite number of layers depending on N.
  • (d)  
    Symmetry: commuting with all symmetry operators used to define the subgroup of $\mathcal{SU}({2}^{N})$, i.e. $[\hat{U},\hat{N}]=0$, where $\hat{U}$ is the circuit unitary for any set of parameters and $\hat{N}$ is the symmetry operator.

Depending on the subgroup of $\mathcal{SU}({2}^{N})$ of interest it can be more or less difficult to find fabrics akin to the one shown in figure 1. In appendix A we discuss the trivial restriction to $\mathcal{SO}({2}^{N})$ and the less-trivial restriction to subspaces of definite Hamming weight.

3. Gate fabrics for fermions under the Jordan–Wigner mapping

The focus of this work is the construction of gate fabrics for the subgroup $\mathcal{F}({2}^{2M})\subset \mathcal{SU}({2}^{N})$ constrained to spin-restricted fermionic symmetry under the Jordan–Wigner representation. To make this more precise, we define M real orthogonal spatial orbitals ${\left\{\vert {\phi }_{p}\rangle \right\}}_{p=0}^{M-1}$. For each spatial orbital, we define corresponding α (β) spin orbitals |ψp ⟩ := |ϕp ⟩|α⟩ ($\vert {\psi }_{\bar{p}}\rangle {:=}\vert {\phi }_{p}\rangle \vert \beta \rangle $) for a total of N := 2M spin orbitals in a spin-restricted formalism. We associate the occupation numbers of these spin orbitals with the occupation numbers of N qubits. We number the qubits in 'interleaved' ordering ...|1β ⟩|1α ⟩|0β ⟩|0α ⟩. The fermionic creation/annihilation operators are defined in terms of the qubit creation/annihilation operators via the Jordan–Wigner mapping in 'α-then-β' ordering, ${p}^{\pm }{:=}({\hat{X}}_{p}\mp \mathrm{i}{\hat{Y}}_{p})/2{\bigotimes}_{q=0}^{p-1}{\hat{Z}}_{q}$ and ${\bar{p}}^{\pm }{:=}({\hat{X}}_{\bar{p}}\mp \mathrm{i}{\hat{Y}}_{\bar{p}})/2{\bigotimes}_{q=0}^{p-1}{\hat{Z}}_{\bar{q}}{\bigotimes}_{q=0}^{M-1}{\hat{Z}}_{q}$. We note that for the majority of applications in the space of spin-1/2 fermions, the governing Hamiltonians are real (e.g. for non-relativistic electronic structure theory), and so we restrict from complex to real unitary operators, i.e. $\mathcal{SU}({2}^{N})\to \mathcal{SO}({2}^{N})$. The spin-restricted fermionic subgroup is then defined as the subgroup of $\hat{U}\in \mathcal{SO}({2}^{N})$ that respect the commutation relations $[\hat{U},{\hat{N}}_{\alpha }]=0$, $[\hat{U},{\hat{N}}_{\beta }]=0$, and $[\hat{U},{\hat{S}}^{2}]=0$. Here the α (β) number operator is ${\hat{N}}_{\alpha }{:=}{\sum }_{p}{p}^{{\dagger}}p={\sum }_{p}(\hat{I}-{\hat{Z}}_{p})/2$ [${\hat{N}}_{\beta }{:=}{\sum }_{p}{\bar{p}}^{{\dagger}}\bar{p}={\sum }_{p}(\hat{I}-{\hat{Z}}_{\bar{p}})/2$] [30]. 
The spin-squared operator is ${\hat{S}}^{2}{:=}{\sum }_{pq}p{q}^{{\dagger}}{\bar{p}}^{{\dagger}}\bar{q}+({\hat{N}}_{\alpha }-{\hat{N}}_{\beta })/2+{({\hat{N}}_{\alpha }-{\hat{N}}_{\beta })}^{2}/4$, and does not admit a local description in terms of Pauli operators in the Jordan–Wigner basis (we provide further details in appendices B and C). We denote this real subgroup, preserving ${\hat{N}}_{\alpha }$, ${\hat{N}}_{\beta }$, and ${\hat{S}}^{2}$, as $\mathcal{F}({2}^{2M})$.
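As a hedged numerical illustration of these definitions, the sketch below builds the Jordan–Wigner operators for M = 2 spatial orbitals as dense NumPy matrices, forms ${\hat{N}}_{\alpha }$, ${\hat{N}}_{\beta }$ and ${\hat{S}}^{2}$ from the expressions above, and checks a few of their properties. For simplicity it places the qubits in the same 'α-then-β' order as the Jordan–Wigner strings (an assumption that differs from the interleaved qubit numbering of the text only by a relabeling of qubits); all variable and function names are hypothetical.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
LOWER = np.array([[0.0, 1.0], [0.0, 0.0]])  # (X + iY)/2, empties an occupied qubit


def annihilators(n_modes):
    """Jordan-Wigner annihilation operators with Z strings on all modes q < j."""
    return [
        reduce(np.kron, [Z] * j + [LOWER] + [I2] * (n_modes - j - 1))
        for j in range(n_modes)
    ]


M = 2                      # spatial orbitals, so N = 2M = 4 qubits
ops = annihilators(2 * M)
alpha, beta = ops[:M], ops[M:]   # assumption: alpha modes first, then beta modes

# Number operators; the matrices are real, so the adjoint is the transpose.
N_alpha = sum(a.T @ a for a in alpha)
N_beta = sum(b.T @ b for b in beta)

# S^2 = sum_pq p q^dag pbar^dag qbar + (Na - Nb)/2 + (Na - Nb)^2 / 4
exchange = sum(
    alpha[p] @ alpha[q].T @ beta[p].T @ beta[q]
    for p in range(M)
    for q in range(M)
)
Sz = (N_alpha - N_beta) / 2
S2 = exchange + Sz + Sz @ Sz

# N_alpha written directly in terms of Pauli Z operators, sum_p (I - Z_p)/2
dim = 2 ** (2 * M)
N_alpha_pauli = sum(
    (np.eye(dim) - reduce(np.kron, [I2] * p + [Z] + [I2] * (2 * M - p - 1))) / 2
    for p in range(M)
)

if __name__ == "__main__":
    print("S^2 eigenvalues:", np.round(np.linalg.eigvalsh(S2), 6))
```

In this small example the spectrum of ${\hat{S}}^{2}$ contains only the values S(S + 1) for S = 0, 1/2 and 1, and the three symmetry operators commute with one another.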

Naively one might expect that there should not be any local gate fabric exactly preserving all three fermionic quantum numbers, since the ${\hat{S}}^{2}$ operator is non-local. The crux of this work is thus the simple quantum-number-preserving gate fabric of figure 2. This gate fabric is composed of two-parameter four-qubit gate elements $\hat{Q}$, each composed of a one-parameter four-qubit spin-adapted spatial orbital rotation gate QNPOR(φ) and a one-parameter four-qubit diagonal pair exchange gate QNPPX(θ). We describe further related quantum-number-preserving gate fabrics for $\mathcal{F}({2}^{2M})$ in appendix D—these were the progenitors of the simpler gate fabrics shown in the main text, and may have advantageous properties in specific realizations of VQE entangler circuits.


Figure 2. Proposed gate fabric for $\mathcal{F}({2}^{2M})$ (sketched for M = 4). The spin orbitals in Jordan–Wigner representation are in 'interleaved' ordering with even (odd) qubit indices denoting α (β) spin orbitals. The Jordan–Wigner strings are taken to be in 'α-then-β' order. The gate fabric is a four-local-nearest-neighbor-tessellation of alternating even and odd spatial-orbital-pair two-parameter, four-qubit $\hat{Q}$ gates. Each $\hat{Q}$ gate has two independent parameters and contains a one-parameter, four-qubit spatial orbital rotation gate QNPOR(φ) and a one-parameter, four-qubit diagonal pair exchange gate QNPPX(θ). The order of QNPOR and QNPPX (note [QNPOR, QNPPX] ≠ 0) does not seem to substantially change expressiveness at intermediate depths. QNPOR(φ) implements the spatial orbital Givens rotation |ϕ0⟩ = c|ϕ0⟩ + s|ϕ1⟩ and |ϕ1⟩ = −s|ϕ0⟩ + c|ϕ1⟩, with the same orbital rotation applied in the α and β spin orbitals. QNPPX(θ) implements the diagonal pair Givens rotation, |0011⟩ = c|0011⟩ + s|1100⟩ and |1100⟩ = −s|0011⟩ + c|1100⟩. The real one-parameter, one-qubit rotation gate is ${\hat{R}}_{y}(\lambda ){:=}{\text{e}}^{-\text{i}\lambda \hat{Y}/2}$. In $\hat{Q}$, we include the optional constant $\hat{{\Pi}}$ gate, for which natural choices include the four-qubit identity gate, i.e. $\hat{{\Pi}}=\hat{I}$, or the fixed spin-adapted orbital rotation gate $\hat{{\Pi}}={\mathrm{Q}\mathrm{N}\mathrm{P}}_{\text{OR}}(\pi )$. In the latter case, the gate fabric with all parameters {θ = 0} and {φ = 0} promotes exchange of orbitals. We find that the choice of $\hat{{\Pi}}\in \left\{\hat{I},{\mathrm{Q}\mathrm{N}\mathrm{P}}_{\text{OR}}(\pi )\right\}$ does not appear to affect the expressiveness of the quantum circuit, but the latter choice has turned out to be advantageous during gradient-based parameter optimization. This can additionally be mixed with a non-trivial initialization of the QNPOR gates, to, e.g. angles of φ = π/2 as we illustrate below. 
Regardless of the choice of $\hat{{\Pi}}$, this gate fabric exactly preserves the real nature of the subgroup, exactly commutes with the ${\hat{N}}_{\alpha }$, ${\hat{N}}_{\beta }$, and ${\hat{S}}^{2}$ symmetry operators, and numerically appears to provide universality at sufficient parameter count.
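The symmetry claims for the two gate elements can be checked numerically on a single pair of spatial orbitals. The following is a minimal sketch, not the decomposition of appendix E: it builds the gates as exponentials of fermionic generators in dense NumPy form, assumes the half-angle convention c = cos(angle/2), s = sin(angle/2), places the qubits in α-then-β order rather than interleaved, and does not attempt to reproduce the sign conventions of the figure. All names are hypothetical.

```python
from functools import reduce

import numpy as np

I2, Z = np.eye(2), np.diag([1.0, -1.0])
LOWER = np.array([[0.0, 1.0], [0.0, 0.0]])  # (X + iY)/2


def annihilators(n):
    """Jordan-Wigner annihilation operators with Z strings on modes q < j."""
    return [reduce(np.kron, [Z] * j + [LOWER] + [I2] * (n - j - 1)) for j in range(n)]


M = 2
ops = annihilators(2 * M)
al, be = ops[:M], ops[M:]   # assumption: alpha modes first, then beta modes
N_alpha = sum(x.T @ x for x in al)
N_beta = sum(x.T @ x for x in be)
Sz = (N_alpha - N_beta) / 2
S2 = sum(al[p] @ al[q].T @ be[p].T @ be[q] for p in range(M) for q in range(M))
S2 = S2 + Sz + Sz @ Sz


def rotation(g, t):
    """exp(t g) for a real antisymmetric generator obeying g^3 = -g."""
    return np.eye(g.shape[0]) + np.sin(t) * g + (1.0 - np.cos(t)) * (g @ g)


def givens_generator(c0, c1):
    """One-body generator rotating mode c0 into mode c1."""
    return c1.T @ c0 - c0.T @ c1


def qnp_or(phi):
    """Same Givens rotation in the alpha and beta spin orbitals (the factors commute)."""
    g_a = givens_generator(al[0], al[1])
    g_b = givens_generator(be[0], be[1])
    return rotation(g_a, phi / 2) @ rotation(g_b, phi / 2)


def qnp_px(theta):
    """Rotation between the two doubly occupied determinants of the orbital pair."""
    pair_move = al[1].T @ be[1].T @ be[0] @ al[0]  # moves the pair from orbital 0 to 1
    return rotation(pair_move - pair_move.T, theta / 2)


def q_gate(theta, phi):
    """Two-parameter gate element with the constant gate chosen as the identity."""
    return qnp_or(phi) @ qnp_px(theta)


def commutes(u, op):
    return np.allclose(u @ op, op @ u)


if __name__ == "__main__":
    U = q_gate(0.3, 1.1)
    print([commutes(U, op) for op in (N_alpha, N_beta, S2)])
```

In this sketch, `qnp_or(np.pi)` maps an electron in orbital 0 onto orbital 1, which is the orbital-exchange behavior attributed to the constant gate in the caption.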


Facets (2)–(4) of gate fabrics are manifestly fulfilled for all these proposals and facet (6) holds by construction as all gates individually preserve all quantum numbers. For facets (1) and (7) we provide numerical evidence below and in appendix G. It is worth noting that these tests numerically indicate that our gate fabrics are universal in the vast bulk of quantum number irreps, i.e. they may be used for cases where S ≠ 0 and/or where Nα ≠ Nβ (including both even and odd spin cases). Note that there are a few edge case quantum number irreps where the $\hat{Q}$-type gate fabrics are not universal; see the next paragraph for discussion. We believe that the methods from [31, 32] or [23] can be used to rigorously show universality in the (bulk of the) quantum number sectors of $\mathcal{F}({2}^{2M})$ in all cases (as well as that our circuits are polynomial depth epsilon-approximate unitary t-designs and form epsilon-nets). Working out the details of a rigorous proof, which we believe has to be done spin sector by spin sector in some cases, is however beyond the scope of this work.

There are a few pathological edge case irreps for which the $\hat{Q}$-type QNP gates are not universal. These cases correspond to quantum number irreps with NΔ open-shell high-spin orbitals and then all other orbitals wholly occupied or all other orbitals wholly unoccupied. In such cases, the ${\hat{Q}}_{\text{PX}}$ gates have trivial action, as there are no pairs of orbitals with one orbital unoccupied and one orbital doubly occupied. The remaining ${\hat{Q}}_{\text{OR}}$ gates (which form a match gate circuit) then have insufficient support to move the high-spin particles into all $\left(\genfrac{}{}{0pt}{}{M}{\vert {N}_{{\Delta}}\vert }\right)$ possible configurations. Such irreps are rarely of interest in chemical physics. Moreover, universal gate fabrics for these edge case irreps do exist in the form of the extended quantum-number-preserving gate fabrics for $\mathcal{F}({2}^{2M})$ in appendix D. For complete details on these edge case irreps, the reader is referred to a more-thorough analysis in appendix G.3.

For QNPOR, QNPPX (and all other parametrized QNP gates introduced in appendix D) we provide explicit decompositions into elementary gates in appendix E. We further provide generalized parameter shift rules [33–36] for these gates in appendix F.2, enabling a computation of the gradients with respect to their circuit parameters with a maximum of four distinct circuits and without increasing circuit depth, gate count, or qubit number. We compare this gradient recipe to the generalization presented in [37], extend the variance minimization technique from [38] to the QNP gate gradients and note that our new rule can be applied to a large variety of other gates.
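To illustrate how a gradient can be obtained from four shifted evaluations, the sketch below applies one valid four-term shift rule to a toy cost function. It assumes that the cost depends on the gate parameter only through the frequencies 1/2 and 1 (as would be the case for a gate with three equidistant generator eigenvalues under a half-angle convention), and it uses the shifts π/2 and 3π/2; this is not claimed to be the optimal rule derived in appendix F.2. The toy cost and all names are hypothetical.

```python
import math

# Coefficients chosen so that the rule is exact for frequencies 1/2 and 1.
C_PLUS = (math.sqrt(2) + 1) / (4 * math.sqrt(2))
C_MINUS = (math.sqrt(2) - 1) / (4 * math.sqrt(2))


def four_term_gradient(f, x):
    """Derivative of f at x from four shifted evaluations of f."""
    s1, s2 = math.pi / 2, 3 * math.pi / 2
    return C_PLUS * (f(x + s1) - f(x - s1)) - C_MINUS * (f(x + s2) - f(x - s2))


def toy_energy(x):
    """Hypothetical cost function containing only frequencies 1/2 and 1."""
    return (0.3 + 0.7 * math.cos(x / 2) - 1.1 * math.sin(x / 2)
            + 0.4 * math.cos(x) + 0.9 * math.sin(x))


def toy_energy_derivative(x):
    """Analytic derivative of toy_energy, used only for comparison."""
    return (-0.35 * math.sin(x / 2) - 0.55 * math.cos(x / 2)
            - 0.4 * math.sin(x) + 0.9 * math.cos(x))


if __name__ == "__main__":
    x0 = 0.37
    print(four_term_gradient(toy_energy, x0), toy_energy_derivative(x0))
```

The coefficients follow from requiring $2{c}_{+}\mathrm{sin}(\omega {s}_{1})/\omega -2{c}_{-}\mathrm{sin}(\omega {s}_{2})/\omega =1$ for both ω = 1/2 and ω = 1, so the shifted circuits have the same depth and gate count as the original one.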

4. Numerical demonstrations

To numerically investigate the properties of the gate fabric from figure 2 and to collect evidence that it satisfies all facets of a 'good' entangler circuit, we consider two prototypical examples of highly correlated molecular ground states: the first is p-benzyne, which exhibits a biradical open-shell singlet ground state, with two unpaired electrons indicated by significant deviations from Hartree–Fock (HF) natural orbital occupation numbers, and four other moderate deviations from HF natural orbital occupation numbers. We use the geometry from [39], build the orbitals at RHF/cc-pVDZ, and construct a (6e, 6o) active space Hamiltonian with the orbitals ranging from HOMO − 2 to LUMO + 2. This corresponds to a case of M = 6 spatial orbitals (i.e. N = 12 qubits), and we focus on the ground state irreducible representation (Nα = 3, Nβ = 3, S = 0). For a larger test case, we consider naphthalene, which while not intrinsically biradical, has multiple natural orbitals with significant deviations from HF natural orbital occupation numbers. We build the orbitals at RHF/STO-3G, and then construct a (10e, 10o) active space Hamiltonian consisting of the π and π* orbitals. This corresponds to a case of M = 10 spatial orbitals (i.e. N = 20 qubits), and we focus on the ground state irreducible representation (Nα = 5, Nβ = 5, S = 0). In both cases, we consider VQE gate fabrics of the form of figure 2.

Our final VQE circuit starts with the preparation of an uncorrelated product state by applying local Pauli $\hat{X}$ gates to appropriate qubits of an all-zero state depending on the number of alpha and beta electrons. The qubits are chosen such that for all parameters equal to zero in the following fabric the state is transformed to the state with the energetically lowest orbitals occupied. We then consider two parameter initialization strategies: (A) the fabric is initialized with all θ = 0 and all φ = π/2 and $\hat{{\Pi}}={\mathrm{Q}\mathrm{N}\mathrm{P}}_{\text{OR}}(\pi )$ (solid lines in figure 3). (B) The fabric is initialized with all θ = 0 and all φ = π and $\hat{{\Pi}}=\hat{I}$ (dotted lines in figure 3).

In both cases we optimize the VQE ground state energy with respect to the VQE gate fabric parameters via L-BFGS. As the purpose of this study is to explore the expressive power of this gate fabric, we consider neither shot noise nor decoherence. This restriction permits the use of analytical expressions for the Hamiltonian expectation values and VQE parameter gradients thereof, greatly accelerating the classical statevector simulation of the VQE.

Figure 3 shows the salient results of this study. For the case of p-benzyne, figure 3(a) shows the VQE ground state energy vs FCI with respect to the depth of the gate fabric. The first notable point is that the fabric is able to provide higher accuracy than either HF (a fabric of QNPOR gates, redundant here due to the use of HF orbitals in the active space) or doubly-occupied configuration interaction (DOCI) (a fabric of QNPPX gates, equivalent to the pUCCD ansatz from [22]). Focusing on the early convergence behavior on the left side of the plot, even with only a few layers of the VQE gate fabric, e.g. ∼50–80 parameters, absolute accuracy of 1 kcal mol$^{-1}$ is achieved, which is commonly referred to as chemical accuracy. As the gate fabric depth is increased, roughly geometric (exponential) convergence of the absolute energy is achieved, modulo some minor aberrations due to difficulties in tightly converging the L-BFGS-based numerical optimizations of the VQE gate fabric parameters. Focusing on the later convergence behavior on the right side of the plot, as the number of parameters in the VQE gate fabric approaches the number of parameters in the FCI problem (note that in this irrep there are 175 configuration state functions (CSFs), see appendix C.3), the error convergence turns sharply downward. At 180 parameters we are able to achieve very tight convergence to errors of $\sim {10}^{-10}\,{E}_{\text{h}}$ relative to FCI, numerically indicating the onset of universality. Figure 3(b) shows the sorted power spectra of the computational basis state (determinant) amplitudes of the various VQE gate fabrics and the FCI state. The exact zeros in the FCI state amplitudes are an artifact of the ${D}_{2\text{h}}$ spatial point group symmetry of this molecule, which our VQE gate fabric was not optimized to capture. 
Even for low VQE gate fabric circuit depths, we see that all determinants are populated by nonzero amplitudes, with a compromise apparently being made to allow for some nonzero error in all amplitudes to provide for the best variational energy. As more layers are added to the gate fabrics, the precision of the amplitude spectra increases, as indicated by, e.g. significant attenuation of the symmetry-driven zero block of the amplitude spectrum. The tail of amplitudes exactly zero in FCI is exactly extinguished in the VQE state only when numerical universality is achieved at a 180-parameter VQE gate fabric. This behavior is reminiscent of the nonzero but structured tensor factorized representation of the determinant amplitudes in coupled cluster theories, where here the tensor structure is provided by the local quantum gate fabric.
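The CSF count quoted above can be reproduced with Weyl's dimension formula, a standard result that is not spelled out in the main text. The sketch below assumes the irrep with S = (Nα − Nβ)/2 and Nα ≥ Nβ; the function names are hypothetical.

```python
from math import comb


def n_csf(n_orbitals: int, n_alpha: int, n_beta: int) -> int:
    """Number of CSFs with S = (n_alpha - n_beta)/2 from Weyl's dimension formula."""
    # (2S + 1)/(M + 1) * C(M + 1, N/2 - S) * C(M + 1, N/2 + S + 1)
    two_s_plus_one = n_alpha - n_beta + 1
    product = comb(n_orbitals + 1, n_beta) * comb(n_orbitals + 1, n_alpha + 1)
    return two_s_plus_one * product // (n_orbitals + 1)


def n_determinants(n_orbitals: int, n_alpha: int, n_beta: int) -> int:
    """Number of computational basis states with fixed n_alpha and n_beta."""
    return comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)


if __name__ == "__main__":
    # (6e, 6o) singlet sector used for p-benzyne
    print(n_csf(6, 3, 3), n_determinants(6, 3, 3))
```

For the (6e, 6o) singlet sector this gives 175 CSFs among 400 determinants, so a normalized real state in this irrep has 174 free parameters, consistent with the onset of numerical universality near 180 fabric parameters.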


Figure 3. Results of the discussed VQE fabric for representative molecular test cases: (a) convergence of the VQE energy relative to the exact ground state energy ${E}_{\text{FCI}}$ of the 12-spin-orbital active space of p-benzyne as a function of the number of parameters in the fabric, (b) occupation ${\left\vert \langle I\vert {\Psi}\rangle \right\vert }^{2}$ of each computational basis state ⟨I| in the optimized VQE state |Ψ⟩ at the color-indicated parameter counts in (a). Blue area indicates the computational basis states of the full configuration interaction (FCI) ground state in the active space. Each set of computational basis states is sorted in descending order; a figure with consistent ordering between sets is shown in appendix G, (c) convergence of the VQE energy to the exact ground state energy ${E}_{\text{FCI}}$ in the 20-spin-orbital active space of naphthalene as a function of the number of parameters and for the two different initialization schemes, (d) convergence under the L-BFGS optimizer for the color coded parameter counts indicated in (c), dotted (solid) convergence lines correspond to a data point from the dotted (solid) curve in (c), inset highlights the plateaus encountered with initialization method B during the first 180 epochs.


Moving to the larger test case of naphthalene, figure 3(c) tells a similar story to the corresponding plot for p-benzyne. Here we see similar and roughly geometric convergence of energy error vs VQE gate fabric depth and parameter count, albeit with a smaller prefactor. As with p-benzyne, the VQE gate fabric rather quickly outstrips both the HF and DOCI ansätze, which its primitive gates are constituted from, and achieves chemical accuracy of ∼1 kcal mol$^{-1}$ in absolute energy at just ∼800–1000 parameters (there are 19 404 CSFs in this irrep, so universality is not reached for any of the depths explored here). Figure 3(d) considers convergence of the energy with respect to the L-BFGS epoch for a number of different VQE gate fabric depths. A first key finding is that making the gate fabric deeper decreases the epoch count needed to converge to chemical accuracy. A second key insight is that, while initialization strategy (B) has shallower circuits and ultimately achieves lower energy error at very high epoch count, plateaus are visible during the optimization with the L-BFGS optimizer (figure 3(d)). Strategy (A), which exchanges orbitals by means of the nontrivial choice $\hat{{\Pi}}={\mathrm{Q}\mathrm{N}\mathrm{P}}_{\text{OR}}(\pi )$ and also initializes all QNPOR gates with angles of π/2, appears to circumvent the plateaus entirely and for deeper circuits speeds up (the power-law like) convergence to below chemical accuracy.

The fabric presented here has favorable properties for implementation on NISQ hardware: the 12 qubit ansatz at 110 parameters is without (with) $\hat{{\Pi}}$ gates decomposable into elementary gates (two-qubit controlled Pauli and one-qubit gates) with resulting depth of 507 (617). The 20 qubit ansatz at 1080 parameters without (with) $\hat{{\Pi}}$ gates has depth 2761 (3361) in such decomposition. To put this into perspective, a single Trotter step of a naive UCCSD circuit has gate depth ≈6600 (12 qubits) and ≈57 600 (20 qubits), respectively. Another considerable advantage is that only N − 2 unique four-qubit gates have to be calibrated on hardware as the structure is repetitive after the first two layers.
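The relation between fabric depth and parameter count can be sketched with a few lines of bookkeeping. The layout assumed here is inferred from the description of figure 2 rather than quoted from it: layers alternate between $\hat{Q}$ gates on spatial-orbital pairs (0,1), (2,3), ... and on pairs (1,2), (3,4), ..., starting with the former, and each $\hat{Q}$ gate carries two parameters. The function names are hypothetical.

```python
def q_gates_in_fabric(n_spatial_orbitals: int, n_layers: int) -> int:
    """Number of four-qubit Q gates in n_layers alternating layers."""
    even = n_spatial_orbitals // 2        # pairs (0,1), (2,3), ...
    odd = (n_spatial_orbitals - 1) // 2   # pairs (1,2), (3,4), ...
    return ((n_layers + 1) // 2) * even + (n_layers // 2) * odd


def n_parameters(n_spatial_orbitals: int, n_layers: int) -> int:
    """Two parameters (theta, phi) per Q gate."""
    return 2 * q_gates_in_fabric(n_spatial_orbitals, n_layers)


def layers_for_parameters(n_spatial_orbitals: int, n_params: int) -> int:
    """Smallest number of layers providing at least n_params parameters."""
    layers = 0
    while n_parameters(n_spatial_orbitals, layers) < n_params:
        layers += 1
    return layers


if __name__ == "__main__":
    print(layers_for_parameters(6, 110))    # 12-qubit example
    print(layers_for_parameters(10, 1080))  # 20-qubit example
```

Under this assumed layout, 110 parameters on 12 qubits correspond to 22 layers and 1080 parameters on 20 qubits to 120 layers, so the quoted depth differences of 110 and 600 between the variants with and without $\hat{{\Pi}}$ gates both amount to 5 per layer.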

Simulations were done with the support of PennyLane [36] and OpenFermion [40].

5. Comparison with other entangler circuits

Having numerically demonstrated the features of the VQE gate fabric, it is worth considering the relationship of this gate fabric to other proposed VQE entangler circuits. There has been substantial prior work along these lines in the past few years.

For one instance, the hardware efficient ansatz [19, 20] is manifestly a local gate fabric, using essentially $\mathcal{SU}(4)$ entangler elements or subsets thereof from the native gate set of the underlying quantum circuit architecture. However, this gate fabric does not respect the particle or spin quantum number symmetries, and therefore is likely to encounter substantial difficulties in locating low-lying states within a target quantum number irrep, particularly in larger active spaces.

In another direction, there are myriad proposed entangler circuit constructions which either already explicitly respect, or in principle could be adapted to, real amplitudes and strict commutation with the number and/or spin-squared symmetry operators, but which are either nonlocal circuits or composed of heterogeneous gate layers. For instance, UCCSD [1, 14, 15] (here referring to the Trotterized version thereof) and its sparse and/or low-rank cousins k-UpCCGSD [9, 16], ADAPT-VQE [21], and Jastrow-factor VQE [18] all may have the power to achieve universality at sufficient depth, e.g. as proved in a recent analysis of disentangled UCC [23], and have been either partially or completely symmetrized already. However, as written, all of these ansätze require nonlocal gate elements that, e.g. mediate excitations among non-adjacent spin orbitals in UCCSD, and thus are not gate fabrics. Moreover, many of these constructions involve heterogeneous gate layers. For a canonical instance, Jastrow-factor VQE [18] involves alternating circuit layers of orbital rotations and substitutions (with the last of these being nonlocal and complex-valued in the usual formulation). Of all methods discussed in the prior literature, k-UpCCGSD is likely closest to our proposed method, with products of single and diagonal double substitution operators comprising the method. k-UpCCGSD as described in [16] involves nonlocal pair substitutions and therefore does not yield a local gate fabric. Note however that the generalized swap network reformulation of k-UpCCGSD described in and around equation (12) and figure 7 of [17] (noticed after the first version of this manuscript was posted) appears to realize k-UpCCGSD by means of a local circuit composed of four-qubit gates that is almost a fabric.

Yet another interesting direction to consider is previously proposed true gate fabrics that preserve quantum number symmetry, but do not achieve universality. Orbital rotation fabrics [41–43] (i.e. HF) are clearly one example here, but so too is doubly occupied configuration interaction (DOCI), for which a gate fabric was developed with the pUCCD ansatz [22]. Both of these ansätze have the interesting property that they can be mapped into gate fabrics requiring only M qubits, but both fail to reach FCI universality as the parameter depth is increased.

Another interesting gate fabric construction is the 'gate-efficient ansatz' presented in [24], which presents as a gate fabric that preserves total particle number ${\hat{N}}_{\alpha }+{\hat{N}}_{\beta }$, but does not appear to respect high-spin particle number ${\hat{N}}_{\alpha }-{\hat{N}}_{\beta }$ or ${\hat{S}}^{2}$ symmetry. Yet another interesting entangler is the 'qubit coupled cluster' approach presented in [25], which essentially implements a partial spin adaption of UCCSD to preserve ${\hat{N}}_{\alpha }$ and ${\hat{N}}_{\beta }$ symmetry within the single and double excitation operations, but neither preserves ${\hat{S}}^{2}$ symmetry nor attains the structure of a local gate fabric. Another related approach presented in [26] constructs fermion-adapted excitation operators which preserve particle number symmetry but not spin symmetry, and additionally are aimed at optimizing the number of CNOT gates in non-gate-fabric UCCSD methods. An entangler that has the potential to preserve all quantum number symmetries with additional spin-adaption work is the quantum approximate optimization algorithm (QAOA)-inspired Pauli-term approach, presented in [27], but this approach yields highly nonlocal circuits which do not resemble gate fabrics. Another approach which has some intersection with the present work is the correlating antisymmetric geminal power (AGP) approach explored in [28], which first implements classically-tractable AGP to provide state preparation in quantum circuits and then augments AGP with an anti-Hermitian pair hopping entangler which resembles our pair exchange gate. However, the correlating AGP is not written in the form of a local gate fabric.

Another interesting direction to explore that we propose here is alternative local gate fabrics that fully preserve quantum number symmetry, but which exhibit different gate constructions than the QNPPX and QNPOR gates used in figure 2. Examples of such gate fabrics using generic four-qubit five-parameter FCI gates and decompositions of these gates into QNPPX, QNPOR, one-hole/particle substitution, and pair-break up/down gates are described in appendix D.

One additional interesting direction is the 'symmetry preserving state preparation circuits' of [13]. This work primarily focuses on total number symmetry, but does introduce a four-qubit gate that preserves particle and spin quantum numbers via a hyperspherical parametrization.

Note that an alternative approach to the exact symmetry preservation explored in this manuscript is symmetry projection [44, 45], which often requires ancilla qubits and extra measurements due to the necessarily non-unitary nature of the projection operation.

6. Discussion

It is important to note that the main focus of this paper is the existence and performance of the QNP gate fabrics in the limit of an ideal quantum computer with infinite shot resolution. One particular topic that we defer for future consideration is the performance of methods using the QNP gate fabrics in the presence of realistic shot and/or decoherence noise channels. Here, one key concern to be studied is if one encounters the potential for variational collapse to a state with incorrect symmetry through decoherence noise channels (note that we believe that variational collapse through shot noise channels is overwhelmingly unlikely due to the non-systematic nature of this noise channel). We hypothesize that this is unlikely to be a practical issue for three reasons: (1) the forces driving a parameter search from the correct number manifold to the incorrect number manifold will be at most proportional to the decoherence noise strength, which must be fairly small for practical computation to be carried out. (2) The decoherence channels are not directly parametrized by the variational parameters of the QNP circuit, i.e. it is unlikely for the variational parameters of the decoherence-including circuit to be able to provide enough support for complete transport of population from one number manifold to another. (3) It is likely that the contaminants from incorrect quantum number blocks can be removed by efficient quantum error mitigation strategies, as was recently demonstrated for the NT (and trivially extendible to Nα /Nβ ) on a hardware implementation of HF [46]. Note that it is not yet clear how (3) could be implemented for the spin-squared operator. A promising result along the lines of this discussion was recently found in simulations and hardware deployment of the Nα /Nβ -preserving 'ASWAP' ansatz (essentially orbital rotations, which are universal for certain quantum number irreps of the H2/STO-3G test system) [47]. 
Here empirical evidence was found of noise reduction through the use of a symmetry-preserving ansatz, and no variational collapses were encountered despite the targeting of remarkably high-energy states such as the Nα = 2, Nβ = 2 double-anion state of H2/STO-3G.

On the topic of initialization strategies: a different strategy for mitigating barren plateaus was recently presented in the literature [48], and works by creating blocks of identity operators in the ansatz to limit the effective depth and reduce the issue of vanishing gradients. Our strategy instead draws from the idea that our fabric fully exchanges populations at initialised parameters of π. If we initialize halfway at π/2, all of the possible arms or light cones are populated in the beginning and have a contribution to the final gradient, which seems to help in the optimisations we show here. In the future, it might be possible to merge the ideas from these two strategies into an improved approach.

It is also worth pointing out that there is an apparent asymmetry in the implementation complexity of the ${\hat{Q}}_{\text{OR}}$ gate (4× CNOTs) and the ${\hat{Q}}_{\text{PX}}$ gate (13× CNOTs). This suggests that it might be advantageous to apply a ${\hat{Q}}_{\text{OR}}$ gate on both sides of each ${\hat{Q}}_{\text{PX}}$ gate to obtain a circuit with more parameters and fewer CNOT gates per parameter. It is also worth considering that an overhead of 4–13 CNOT gates per effective VQE parameter is decidedly higher than, e.g. the overhead of the hardware efficient ansatz. Therefore, it is highly likely that the hardware efficient ansatz and/or an intermediate ansatz that only preserves ${\hat{S}}_{z}$ symmetry might be more accurate on the extremely small experiments allowable on today's quantum hardware, i.e. in the limit of high decoherence noise, the longer circuits of the QNP ansatz may accrue more error than that from the loss of symmetry with a simple ansatz. A related finding was observed on IBM hardware experiments for LiH/STO-3G using the symmetry-preserving state preparation circuits, in which it was found that decoherence errors overwhelmed the proper ${\hat{S}}^{2}$-preserving circuits, while circuits that only preserved Nα and Nβ fared better [13]. Our opinion is that the QNP gate fabrics developed herein will likely become much more important in intermediate timescales as decoherence error rates diminish and tractable problem size increases.
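The CNOT overhead quoted above can be made concrete with a small arithmetic sketch. It uses only the two CNOT counts stated in the text (4 for the orbital rotation gate, 13 for the pair exchange gate) and assumes one variational parameter per gate; the helper name `cnots_per_parameter` is hypothetical, and the count is naive in that it ignores any merging or cancellation of adjacent gates.

```python
from fractions import Fraction

CNOTS_OR = 4   # CNOT count quoted in the text for the orbital rotation gate
CNOTS_PX = 13  # CNOT count quoted in the text for the pair exchange gate


def cnots_per_parameter(n_or: int, n_px: int) -> Fraction:
    """CNOTs per variational parameter for a block of one-parameter gates."""
    total_cnots = n_or * CNOTS_OR + n_px * CNOTS_PX
    return Fraction(total_cnots, n_or + n_px)


# One pair exchange gate combined with one orbital rotation gate.
standard_block = cnots_per_parameter(n_or=1, n_px=1)
# One pair exchange gate with an orbital rotation gate on both sides.
sandwich_block = cnots_per_parameter(n_or=2, n_px=1)

print("CNOTs per parameter, PX + OR:     ", float(standard_block))
print("CNOTs per parameter, OR + PX + OR:", float(sandwich_block))
```

Under these assumptions the sandwiched block carries more parameters at a lower CNOT cost per parameter, which is one way to read the suggestion above.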

It is also worth considering how the gate fabric concept could be extended to other types of physical symmetries than the distinguishable, Hamming-weight constrained, or spin-1/2 fermionic cases encountered here, e.g. for higher-spin fermions, bosons, nucleons, or elementary particles. For instance, one could consider the case of systems composed of higher-spin fermions. Here each single particle orbital would require effective representation with more than one qubit, as is often done in qudit-to-qubit mappings. Next, the relevant composition operators would have to be mapped to qubits in analogy with the Jordan–Wigner mapping. Finally, localized gate fabrics that preserve the symmetry operators but allow for universal exploration of the Hilbert space would have to be developed. Such gate fabrics would almost surely require larger gate elements than the four-qubit gates encountered here. Similar extensions are likely possible also for bosons and for systems with additional quantum numbers such as isospin, strangeness, charm, etc. The conceptual pathway to obtaining such gate fabrics would likely be similar to the present manuscript, but it remains to be seen if such extended gate fabrics are possible.

7. Summary and outlook

In this work, we set out to construct doppelgängers of the well known gate fabric (i.e. a potentially infinitely repeatable, simple and geometrically local pattern of gate elements that span the parent group at sufficient depth) for the unrestricted qubit Hilbert space $\mathcal{SU}({2}^{N})$ consisting of simple two-qubit gate elements $\mathcal{SU}(4)$. Our major result is the construction of a gate fabric for the important special case of spin-1/2 fermionic systems in the Jordan–Wigner representation $\mathcal{F}({2}^{2M})$ consisting of simple four-qubit gate elements $\hat{Q}$. Each two-parameter $\hat{Q}$ gate comprises a one-parameter spatial orbital rotation gate QNPOR(φ) and a one-parameter diagonal pair exchange gate QNPPX(θ). A fabric made of either of these gate elements alone does not achieve FCI universality with sufficient parameter depth, but our VQE gate fabric, being an amalgamation of the two, appears to be able to do so. Moreover, at intermediate depths, the VQE gate fabric appears to be pragmatically expressive as evidenced by tests of the ground state energy convergence in strongly correlated molecular systems. It is worth emphasizing that these properties seem to hold in the vast bulk of quantum number irreps, i.e. that these fabric circuits can be applied for cases where S ≠ 0 (including even or odd spin cases) and/or where Nα ≠ Nβ (see appendix G for details on specific high-spin edge cases that are not universal with the $\hat{Q}$-type QNP gate fabrics of the main text, but that can be addressed with elements of the $\hat{F}$-type QNP gate fabrics of appendix D). Many important questions remain regarding our QNP gate fabrics. These include: (1) how does the numerical optimization of parameters for such gate fabrics behave in the presence of shot and/or decoherence noise? (2) How can numerical optimization algorithms be adapted to exploit the knowledge that the VQE entangler circuit is a gate fabric? 
(3) Is the fixed $\hat{{\Pi}}$ gate construction or an extension thereof an effective way to mitigate barren plateaus during numerical optimization? (4) How does the VQE gate fabric perform for relative properties, for properties at different nuclear geometries, and for properties in different quantum number irreps? (5) Is the construction of the VQE gate fabric in terms of $\hat{Q}$ gates optimal, or do more elaborate constructions, e.g. using the $\hat{F}$ gates of appendix D provide additional benefits? (6) What is the scaling behavior of the error in absolute and/or relative properties as a function of parameter depth for representative interesting molecular systems? (7) Can the gate fabric be adapted to additionally exploit external symmetries such as spatial point group symmetries, e.g. as explored in [49]? Taken together, the results of this work might provide an interesting guide for the required symmetries and limiting simplicities when constructing more elaborate VQE entanglers for fermionic systems.

Acknowledgments

RMP is grateful to Dr Edward Hohenstein for many discussions on the structure of the ${\hat{S}}^{2}$ operator. The authors further thank Fotios Gkritsis for discussions and acknowledge useful comments from Will Simmons, Seyon Sivarajah and David Ramo that have led to a reduction of depth of the decompositions of our QNP gates. QC Ware Corp. acknowledges generous research funding from Covestro Deutschland AG for this project. Covestro acknowledges funding from the German Ministry for Education and Research (BMBF) under the funding program quantum technologies as part of project HFAK (13N15630). DW acknowledges funding by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy Cluster of Excellence Matter and Light for Quantum Computing (ML4Q) EXC2004/1 390534769.

Conflict of interest

The VQE gate fabrics described in this work are elements of two US provisional applications for patents both filed jointly by QC Ware Corp. and Covestro Deutschland AG. RMP owns stock/options in QC Ware Corp.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.

Appendix A.: Symmetry-constrained subgroups of $\mathcal{SU}({2}^{N})$

This section discusses some of the technical hurdles encountered in developing universal gate fabrics for certain subgroups of $\mathcal{SU}({2}^{N})$, using well-known literature results for restriction to real operators $\mathcal{SO}({2}^{N})$ and further restriction to Hamming-weight-preserving operators $\mathcal{H}({2}^{N})$.

The imposition of specific symmetries which constrain the subgroup of $\mathcal{SU}({2}^{N})$ may or may not present considerable difficulties in constructing gate fabrics of the type defined above. For an example that does not introduce significant difficulty, consider the case where we restrict the Hilbert space operators to have real value, i.e. a restriction to $\mathcal{SO}({2}^{N})$. In this case, one may simply substitute $\mathcal{SU}(4)\to \mathcal{SO}(4)$ in the gate fabric of figure 1 to construct the desired gate fabric sketched in figure 4 for $\mathcal{SO}({2}^{N})$.

Figure 4. Gate fabric universal for $\mathcal{SO}({2}^{N})$ (sketched for N = 6). The gate fabric is a two-local-nearest-neighbor tessellation of alternating even and odd qubit-pair six-parameter, two-qubit $\mathcal{SO}(4)$ gates.

For an example that does introduce significant difficulty, consider the case where we restrict $\mathcal{SO}({2}^{N})$ Hilbert space operators to preserve Hamming weight, i.e. to respect the commutation constraint $[\hat{U},\hat{P}]=0$ where $\hat{P}{:=}{\sum }_{p}(\hat{I}-{\hat{Z}}_{p})/2$ is the Hamming weight or 'particle counting' operator. We denote this subgroup as $\mathcal{H}({2}^{N})$. Here, we might be tempted to continue restricting the two-qubit $\mathcal{SO}(4)$ gates to preserve Hamming weight, mandating that we substitute $\mathcal{SO}(4)\to \mathcal{H}(4)$, where $\mathcal{H}(4)$ implements a Givens rotation between configurations |01⟩ and |10⟩ while acting as the identity on |00⟩ and |11⟩. This is sketched in figure 5. However, a tessellation of two-qubit Givens gates is not a gate fabric for $\mathcal{H}({2}^{N})$, as it does not provide universality for this subgroup. In fact, it can be shown that a tessellation of Givens gates amounts to a one-particle rotation of the qubit creation and annihilation operators ${\hat{p}}^{\pm }{:=}{\sum }_{q}{V}_{qp}{\hat{q}}^{\pm }$ for ${V}_{qp}\in \mathcal{SO}(N)$ and ${\hat{q}}^{\pm }{:=}({\hat{X}}_{q}\mp \mathrm{i}{\hat{Y}}_{q})/2$, and thus after exactly N layers and N(N − 1)/2 gates the part of Hilbert space reachable with the fabric no longer increases. Moreover, the fabric is classically simulable in polynomial time via techniques such as the match gate formalism or direct implementation with classical photons and beamsplitters. Note that $\mathcal{H}({2}^{N})$ has irreducible representations of dimension up to $\left(\genfrac{}{}{0.0pt}{}{N}{\lfloor N/2\rfloor }\right)$, so failure to reach universality can be shown by simple parameter counting. Speaking more practically, this proposed gate fabric has very limited expressive power for most irreps of $\mathcal{H}({2}^{N})$, and does not provide a good approximation to most desired actions within this space.
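The parameter-counting argument above can be sketched numerically. The snippet below assumes that universality within a Hamming-weight sector of dimension D requires the D(D − 1)/2 parameters of $\mathcal{SO}(D)$, and compares this with the N(N − 1)/2 independent parameters available to a tessellation of Givens gates. The function names are hypothetical and chosen only for this illustration.

```python
from math import comb


def so_dim(d: int) -> int:
    """Number of independent parameters of the special orthogonal group SO(d)."""
    return d * (d - 1) // 2


def givens_fabric_parameters(n_qubits: int) -> int:
    """Independent parameters of a one-particle rotation V in SO(N)."""
    return so_dim(n_qubits)


def largest_sector_dimension(n_qubits: int) -> int:
    """Dimension of the largest Hamming-weight sector, binomial(N, floor(N/2))."""
    return comb(n_qubits, n_qubits // 2)


for n in (4, 6, 8, 10):
    available = givens_fabric_parameters(n)
    required = so_dim(largest_sector_dimension(n))
    print(f"N = {n:2d}: Givens fabric provides {available:3d} parameters, "
          f"largest sector requires {required}")
```

For N = 6, for example, the Givens tessellation saturates at 15 parameters while the half-filled sector of dimension 20 would need 190, and the gap widens rapidly with N.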

Figure 5. Gate fabric attempt not universal for the Hamming-weight-preserving subgroup $\mathcal{H}({2}^{N})$ (sketched for N = 6). The gate fabric is a two-local-nearest-neighbor tessellation of alternating even and odd qubit-pair one-parameter, two-qubit Hamming-weight-preserving $\hat{H}(4)$ gates. The gate fabric exactly commutes with the Hamming weight operator $\hat{P}\equiv {\sum }_{p}(\hat{I}-{\hat{Z}}_{p})/2$, but the gate fabric does not span $\mathcal{H}({2}^{N})$ for any depth.

In fact, no gate fabric for $\mathcal{H}({2}^{N})$ is possible with two-qubit gate elements. One possible construction of a gate fabric for $\mathcal{H}({2}^{N})$ with three-qubit gate elements is sketched in figure 6. Note that this might not be a minimal representation—we will see examples of fermionic systems shortly where a much simpler gate element than the fully explicitly universal k-minimal qubit gate provides a gate fabric.

Figure 6. Gate fabric universal for the Hamming-weight-preserving subgroup $\mathcal{H}({2}^{N})$ (sketched for N = 9). The gate fabric is a three-local-nearest-neighbor tessellation of cascading qubit-triple six-parameter, three-qubit Hamming-weight-preserving $\hat{H}(8)$ gates. Each $\hat{H}(8)$ gate is composed of a three-parameter $\mathcal{SO}(3)$ rotation in the d-Hamming-weight subspace, where d ∈ [1, 2] for a total of 6 parameters. The gate fabric exactly commutes with the Hamming weight operator $\hat{P}\equiv {\sum }_{p}(\hat{I}-{\hat{Z}}_{p})/2$ and spans $\mathcal{H}({2}^{N})$ at sufficient depth.

Appendix B.: Additional details on the Jordan–Wigner mapping

This section is included to enumerate the expansion of same-spin number operators and the ${\hat{S}}^{2}$ operator into Pauli operators in the Jordan–Wigner mapping defined in the main text.

B.1. Same-spin occupation and substitution operators

B.1.1. Same-spin occupation number operators

The same-spin occupation number operators are

Equation (B1)

whereby $p^{+}p^{-}$ is a 'particle occupation number operator' (counts 1s) and $p^{-}p^{+}$ is a 'hole occupation number operator' (counts 0s). Note that the Jordan–Wigner strings cancel for these operators.
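Because the Jordan–Wigner strings cancel, the occupation number operators can be checked on a single qubit. The following sketch assumes the single-qubit ladder operators $({\hat{X}}\mp \mathrm{i}{\hat{Y}})/2$ given in appendix A and confirms that their products are the projectors onto |1⟩ and |0⟩; the variable names are illustrative only.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

create = (X - 1j * Y) / 2      # q^+ : maps |0> to |1>
annihilate = (X + 1j * Y) / 2  # q^- : maps |1> to |0>

particle_number = create @ annihilate  # counts 1s
hole_number = annihilate @ create      # counts 0s

print(np.real(np.diag(particle_number)))  # occupation of |0> and |1>
print(np.real(np.diag(hole_number)))
```

In this convention the particle operator equals $(\hat{I}-\hat{Z})/2$ and the hole operator equals $(\hat{I}+\hat{Z})/2$, and the two sum to the identity.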

B.1.2. Same-spin substitution operators

The same-spin one-particle substitution operator is

Equation (B2)

Here ${Z}_{p+1,q-1}^{{\leftrightarrow}}{:=}{\bigotimes}_{r=p+1}^{r=q-1}{\hat{Z}}_{r}$. For completeness

Equation (B3)

Technically, $p^{+}q^{-}$ is the 'one-particle substitution operator' and $p^{-}q^{+}$ is the 'one-hole substitution operator.'

With some algebra, one can show that,

Equation (B4)

(the formula for p > q is the same except for the indices on ${Z}_{p+1,q-1}^{{\leftrightarrow}}$.)

Here, the Jordan–Wigner strings only cancel partially. However, the α-then-β Jordan–Wigner ordering does provide the advantage that the remaining ${Z}_{p,q}^{{\leftrightarrow}}$ strings are supported only on the intermediate α (β) spin orbital indices for α (β) substitution operators.

B.2. Quantum number operators

B.2.1. Alpha number operator

The α number operator is

Equation (B5)

The eigenvalues of the α number operator are Nα ∈ [0, 1, ...M] with degeneracy $\left(\genfrac{}{}{0.0pt}{}{M}{{N}_{\alpha }}\right){2}^{M}$. The determinants are eigenfunctions of the α number operator, with eigenvalues given by the α population count,

Equation (B6)

and $\mathrm{p}\mathrm{o}\mathrm{p}\mathrm{c}\mathrm{o}\mathrm{u}\mathrm{n}\mathrm{t}({\overrightarrow{I}}_{\alpha })$ counts the number of ones in ${\overrightarrow{I}}_{\alpha }$.

B.2.2. Beta number operator

The β number operator is,

Equation (B7)

The eigenvalues of the β number operator are Nβ ∈ [0, 1, ...M] with degeneracy $\left(\genfrac{}{}{0.0pt}{}{M}{{N}_{\beta }}\right){2}^{M}$. The determinants are eigenfunctions of the β number operator, with eigenvalues given by the β population count,

Equation (B8)

B.2.3. Total spin squared operator

The total spin squared operator is,

Equation (B9)

with the spin lowering operator,

Equation (B10)

the spin raising operator,

Equation (B11)

and the z-spin,

Equation (B12)

After some algebra, under the chosen Jordan–Wigner mapping, this resolves to

Equation (B13)

The eigenvalues of the ${\hat{S}}^{2}$ operator can be written as S/2(S/2 + 1) with S ∈ [0, 1, 2, ...] (singlet, doublet, triplet, etc).

Appendix C.: The specific case of M = 2 in $\mathcal{F}({2}^{2M})$

This section explicitly enumerates, for the special case of M = 2, the characteristics of the Jordan–Wigner computational basis functions (representing Fock space Slater determinants), ${\hat{S}}^{2}$-pure CSF linear combinations thereof, and arbitrary linear combinations thereof which we will refer to as FCI states. This section is useful to develop the beginnings of a picture for the arbitrary M-spatial-orbital Fock space, and also directly leads to the FCI gate fabric of the next full section.

C.1. Slater determinants/Jordan–Wigner computational basis states

Table 1 enumerates the Slater determinants in the M = 2 case along with their corresponding Jordan–Wigner computational basis states, provides the Nα and Nβ eigenvalues of each determinant (all determinants are proper eigenstates of the particle number operators), identifies which determinants are CSFs (only some determinants are also CSFs), and if a CSF, provides the S eigenvalue of the determinant/CSF.

Table 1. Enumeration of characteristics of M = 2 Fock space in Jordan–Wigner representation. First column: base-2 qubit occupation string (i.e. qubit computational basis state). Second column: base-10 qubit occupation string (i.e. base-10 index for vector and matrix quantities). Third column: Slater determinantal configuration represented by this qubit computational basis state. Fourth column: number of α electrons in this configuration (always a proper eigenstate). Fifth column: number of β electrons in this configuration (always a proper eigenstate). Sixth column: is this configuration a valid CSF, i.e. a proper eigenstate of ${\hat{S}}^{2}$? Seventh column: if yes to the previous question, S eigenvalue for this simultaneous Slater determinant/CSF (S = 0—singlet, S = 1—doublet, S = 2—triplet, ...).

Base 2    Base 10   Determinant   Nα   Nβ   Is CSF?   S
|0000⟩    |#0⟩                    0    0    Y         0
|0001⟩    |#1⟩                    1    0    Y         1
|0010⟩    |#2⟩                    0    1    Y         1
|0011⟩    |#3⟩                    1    1    Y         0
|0100⟩    |#4⟩                    1    0    Y         1
|0101⟩    |#5⟩                    2    0    Y         2
|0110⟩    |#6⟩                    1    1    N
|0111⟩    |#7⟩                    2    1    Y         1
|1000⟩    |#8⟩                    0    1    Y         1
|1001⟩    |#9⟩                    1    1    N
|1010⟩    |#10⟩                   0    2    Y         2
|1011⟩    |#11⟩                   1    2    Y         1
|1100⟩    |#12⟩                   1    1    Y         0
|1101⟩    |#13⟩                   2    1    Y         1
|1110⟩    |#14⟩                   1    2    Y         1
|1111⟩    |#15⟩                   2    2    Y         0
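The Nα and Nβ columns of table 1 can be regenerated with population counts over bit masks. The sketch below assumes a bit layout read off from the table itself, namely that the rightmost bit and every second bit after it hold α spin orbitals while the remaining bits hold β spin orbitals; the mask constants and the function name are hypothetical. It also checks the degeneracy formula of appendix B.2 for M = 2.

```python
from collections import Counter
from math import comb

M = 2                # number of spatial orbitals
ALPHA_MASK = 0b0101  # assumed bit positions of the alpha spin orbitals
BETA_MASK = 0b1010   # assumed bit positions of the beta spin orbitals


def occupation_numbers(index: int) -> tuple[int, int]:
    """Return (N_alpha, N_beta) for the computational basis state |#index>."""
    n_alpha = bin(index & ALPHA_MASK).count("1")
    n_beta = bin(index & BETA_MASK).count("1")
    return n_alpha, n_beta


for index in range(2 ** (2 * M)):
    n_alpha, n_beta = occupation_numbers(index)
    print(f"|{index:04b}>  |#{index}>  N_alpha = {n_alpha}  N_beta = {n_beta}")

# Degeneracy of each N_alpha eigenvalue, compared with binomial(M, N_alpha) * 2^M.
alpha_degeneracy = Counter(occupation_numbers(i)[0] for i in range(2 ** (2 * M)))
for n_alpha, count in sorted(alpha_degeneracy.items()):
    print(n_alpha, count, comb(M, n_alpha) * 2 ** M)
```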

C.2. Quantum number operators

The ${\hat{N}}_{\alpha }$ and ${\hat{N}}_{\beta }$ operators are diagonal (as is always true in the Jordan–Wigner representation), and their diagonal values are depicted in the fourth and fifth columns of table 1, respectively.

The ${\hat{S}}^{2}$ operator is not diagonal (as is generally true in the Jordan–Wigner representation). Instead, only 14 out of the 16 rows/columns of this operator are diagonal, and their diagonal entries are given in the seventh column of table 1. The non-diagonal contributions arise from the non-CSF determinants

|#6⟩ = |0110⟩

and,

|#9⟩ = |1001⟩.

These two determinants form the seniority-2 coupling set (the set of determinants with 2× non-spin-paired electrons). In this restricted basis, the ${\hat{S}}^{2}$ operator is,

${\hat{S}}^{2}=\left(\begin{matrix}1 & -1\\ -1 & 1\end{matrix}\right)$

C.3. Configuration state functions (CSFs)

CSFs are defined as sparse linear combinations of Slater determinants that provide proper eigenstates of ${\hat{S}}^{2}$. The 14 spin-pure Slater determinants discussed above are also CSFs, with corresponding eigenvalues S.

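In the seniority-2 coupling set of the non-spin-pure Slater determinants |#6⟩ and |#9⟩, the eigenvectors of the ${\hat{S}}^{2}$ operator are,

$(\vert \#6\rangle \pm \vert \#9\rangle )/\sqrt{2}$

and the corresponding eigenvalues are,

$1\mp 1$

i.e. the + combination yields an S = 0 singlet CSF, while the − combination yields an S = 2 triplet CSF.

Thus the symmetry-adapted CSFs for this seniority coupling set are,

$\vert {{\Phi}}_{Z=2}^{S=0}\rangle {:=}(\vert 0110\rangle +\vert 1001\rangle )/\sqrt{2}$

and,

$\vert {{\Phi}}_{Z=2}^{S=2}\rangle {:=}(\vert 0110\rangle -\vert 1001\rangle )/\sqrt{2}$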

Therefore, we have a complete real, orthonormal set of 16 CSFs for $\mathcal{F}({2}^{2\ast 2})$: 5 singlets, 8 doublets, and 3 triplets. These CSFs are proper eigenfunctions of ${\hat{N}}_{\alpha }$, ${\hat{N}}_{\beta }$, and ${\hat{S}}^{2}$.
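The counts above can be checked numerically with a minimal sketch that builds ${\hat{S}}^{2}$ for M = 2 as a 16 × 16 matrix. The sketch assumes the interleaved qubit layout suggested by table 1 (qubits 0 and 2 hold α spin orbitals, qubits 1 and 3 hold β spin orbitals, with qubit 0 as the least significant bit of the basis index), the α-then-β ordering of the Jordan–Wigner strings, and the standard identity ${\hat{S}}^{2}={\hat{S}}_{-}{\hat{S}}_{+}+{\hat{S}}_{z}+{\hat{S}}_{z}^{2}$. All function and variable names are hypothetical.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
LOWER = np.array([[0.0, 1.0], [0.0, 0.0]])  # (X + iY)/2, maps |1> to |0>

N_QUBITS = 4
ALPHA = [0, 2]           # assumed qubits of the alpha spin orbitals 0 and 1
BETA = [1, 3]            # assumed qubits of the beta spin orbitals 0 and 1
JW_ORDER = ALPHA + BETA  # alpha-then-beta ordering of the Jordan-Wigner strings


def embed(single_qubit_ops):
    """Tensor product with qubit 0 as the least significant bit of the index."""
    return reduce(np.kron, single_qubit_ops[::-1])


def annihilation(qubit):
    """Jordan-Wigner annihilation operator for the spin orbital on `qubit`."""
    ops = [I2] * N_QUBITS
    for earlier in JW_ORDER[:JW_ORDER.index(qubit)]:
        ops[earlier] = Z
    ops[qubit] = LOWER
    return embed(ops)


a = [annihilation(q) for q in range(N_QUBITS)]
n_alpha = sum(a[q].T @ a[q] for q in ALPHA)
n_beta = sum(a[q].T @ a[q] for q in BETA)
s_plus = sum(a[qa].T @ a[qb] for qa, qb in zip(ALPHA, BETA))
s_z = 0.5 * (n_alpha - n_beta)
s_squared = s_plus.T @ s_plus + s_z + s_z @ s_z  # all matrices are real

# Basis states that carry off-diagonal entries of the spin-squared operator.
off_diagonal = s_squared - np.diag(np.diag(s_squared))
coupled = sorted({int(i) for i in np.argwhere(np.abs(off_diagonal) > 1e-12).ravel()})

# Eigenvalue multiplicities: singlets, doublets, triplets.
eigenvalues = np.round(np.linalg.eigvalsh(s_squared), 10) + 0.0
values, counts = np.unique(eigenvalues, return_counts=True)

print("coupled determinants:", coupled)
for value, count in zip(values, counts):
    print(f"eigenvalue {value}: multiplicity {count}")
```

With these conventions the only coupled determinants are |#6⟩ and |#9⟩, and the eigenvalues S/2(S/2 + 1) appear with the multiplicities of the singlet, doublet, and triplet counts stated above.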

C.4. Quantum number irreps

Valid solutions to the time-dependent or time-independent Schrödinger equation for spin-1/2 fermions governed by spin-free Hamiltonian operators must be definite simultaneous eigenstates of the quantum number operators $({\hat{N}}_{\alpha },{\hat{N}}_{\beta },{\hat{S}}^{2})$ with definite target eigenvalues (Nα , Nβ , S). We refer to the set of valid simultaneous eigenstates for a given set of target quantum numbers (Nα , Nβ , S) as a quantum number irrep.

Table 2 enumerates the dimensionality and our particular convention for the CSF basis for each definite (Nα , Nβ , S) irrep of the M = 2 Fock space. An arbitrary special orthogonal rotation within each irrep would also provide a faithful representation of the basis for that irrep.

Table 2. CSF irreps for M = 2 Fock space. D refers to the irrep dimension. The listed elements are our particular convention for the CSF basis functions of each irrep.

Nα   Nβ   S   D   Elements
0    0    0   1   |#0⟩
2    2    0   1   |#15⟩
2    0    2   1   |#5⟩
0    2    2   1   |#10⟩
1    1    2   1   $\vert {{\Phi}}_{Z=2}^{S=2}\rangle $
1    0    1   2   |#1⟩, |#4⟩
0    1    1   2   |#2⟩, |#8⟩
1    2    1   2   |#11⟩, |#14⟩
2    1    1   2   |#7⟩, |#13⟩
1    1    0   3   |#3⟩, |#12⟩, $\vert {{\Phi}}_{Z=2}^{S=0}\rangle $

C.5. Full configuration interaction (FCI) states

The restriction of physically valid solutions of the time-dependent or time-independent Schrödinger equation to a given target quantum number irrep severely constrains, but does not exactly determine the valid solution for most irreps. For instance, in the (Nα = 1, Nβ = 0, S = 1) irrep, the 15-parameter generic solution,

$\vert {\Psi}\rangle ={\sum }_{I=0}^{15}{c}_{I}\vert \#I\rangle $ with ${\sum }_{I}{c}_{I}^{2}=1$

is invalid because it does not respect the quantum number symmetries, but the one-parameter solution,

$\vert {\Psi}(\theta )\rangle =\mathrm{cos}(\theta )\vert \#1\rangle +\mathrm{sin}(\theta )\vert \#4\rangle $

is valid due to the fact that the dimension of the target irrep is D = 2 > 1.

We generically refer to states which exactly lie within a given target quantum number irrep, but where the remaining flexibility in the state is determined by solving an auxiliary equation such as the time-dependent or time-independent Schrödinger equation, as 'FCI' states. The motivation for this naming is the set of states that emerge from exactly diagonalizing the spin-free electronic Hamiltonian within a given quantum number irrep, i.e. the classical FCI method, though the usage within this work should be understood to be generalized to solving any linear auxiliary equation governed by a spin-free operator which is simultaneously diagonalized by the three quantum number operators.

The question that arises at this point is how to construct special orthogonal operators that respect the quantum number symmetry but have the power to move from an arbitrary quantum-number-pure trial state to an FCI state within the same quantum number irrep. The simple answer is to construct complete special orthogonal operators acting on the CSF basis of each irrep, with the property that these operators commute with all three quantum number operators. This leads to the construction of 4× one-parameter $\mathcal{SO}(2)$ operators (simple Givens rotation matrices) acting within the 4× S = 1 doublet irreps, and 1× three-parameter $\mathcal{SO}(3)$ operator acting within the (Nα = 1, Nβ = 1, S = 0) irrep. This seems to imply that the parameter dimension of $\mathcal{F}({2}^{2\ast 2})$ is 7. However, further analysis reveals that to preserve ${\hat{S}}^{2}$ symmetry, the same operator must be applied in the (Nα = 1, Nβ = 0, S = 1) and (Nα = 0, Nβ = 1, S = 1) irreps and that the same operator must be applied in the (Nα = 1, Nβ = 2, S = 1) and (Nα = 2, Nβ = 1, S = 1) irreps. This is related to the fact that the spin-free Schrödinger equation is invariant under permutation of the α and β labels in the working equations. This reduces the total number of parameters of $\mathcal{F}({2}^{2\ast 2})$ to 5, and yields the highly structured special orthogonal operators that will be encountered as M = 2 FCI gate operators in the next section.
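The parameter count in this paragraph can be reproduced from the irrep dimensions of table 2. The sketch below assigns D(D − 1)/2 parameters of $\mathcal{SO}(D)$ to each irrep, first independently and then with irreps related by exchanging the α and β labels tied to a single operator, as described above. The data structure and names are illustrative only.

```python
# (N_alpha, N_beta, S, D) for each irrep of the M = 2 Fock space, from table 2.
IRREPS = [
    (0, 0, 0, 1), (2, 2, 0, 1), (2, 0, 2, 1), (0, 2, 2, 1), (1, 1, 2, 1),
    (1, 0, 1, 2), (0, 1, 1, 2), (1, 2, 1, 2), (2, 1, 1, 2),
    (1, 1, 0, 3),
]


def so_parameters(d: int) -> int:
    """Number of independent parameters of SO(d)."""
    return d * (d - 1) // 2


# Independent special orthogonal operator in every irrep.
independent_count = sum(so_parameters(d) for _, _, _, d in IRREPS)

# Tie irreps that map onto each other under exchange of alpha and beta labels.
tied_classes = {(min(na, nb), max(na, nb), s): d for na, nb, s, d in IRREPS}
tied_count = sum(so_parameters(d) for d in tied_classes.values())

print("total dimension:", sum(d for _, _, _, d in IRREPS))
print("parameters with independent irreps:", independent_count)
print("parameters with alpha/beta-tied irreps:", tied_count)
```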

Appendix D.: Gate fabric for $\mathcal{F}({2}^{2M})$ via M = 2 FCI gates

An early iteration of the gate fabric described in the main text was developed by constructing a gate fabric comprising a five-parameter four-qubit $\hat{F}$ gate universal for M = 2 FCI as detailed in figure 7. A fabric of these $\hat{F}$ gates was found to exactly preserve quantum number symmetry, to provide universality for $\mathcal{F}({2}^{2M})$ for sufficient parameter depth, and to yield an expressive approximate representation at intermediate depths. The representation power and numerical convergence were found to be similar between $\hat{F}$ gate fabrics and the $\hat{Q}$ gate fabrics, and the latter is conceptually simpler, so we have elected to focus on the latter in the main text. In the following we describe this alternative gate fabric and additional variants and refer to and use the concepts and notation introduced in appendices B and C.

Figure 7. Gate fabric hypothesized to be universal for $\mathcal{F}({2}^{2M})$ (sketched for M = 4). The spin orbitals in Jordan–Wigner representation are physically ordered in 'interleaved' ordering with even (odd) qubit indices denoting α (β) spin orbitals. The Jordan–Wigner strings are defined in 'α-then-β' order as defined in the main text. The gate fabric is a four-local-nearest-neighbor-tessellation of alternating even and odd spatial-orbital-pair five-parameter, four-qubit $\hat{F}$ gates, constructed to be universal for M = 2 FCI. Each $\hat{F}$ gate consists of: (A) a spin-adapted one-parameter $\mathcal{SO}(2)$ rotation in the one-particle (Nα + Nβ = 1) S = 1 block denoted in red (parameter θ1p ). (B) A spin-adapted one-parameter $\mathcal{SO}(2)$ rotation in the one-hole (Nα + Nβ = 3) S = 1 block denoted in blue (parameter θ1h ). (C) A spin-adapted three-parameter $\mathcal{SO}(3)$ rotation in the (Nα = 1, Nβ = 1, S = 0) irrep denoted in green (3 parameters in the unique upper triangle of $\hat{x}$). The decomposition of $\hat{u}$ into a transformation from the determinant basis of the Jordan–Wigner computational basis to the CSF basis, followed by the application of an $\mathcal{SO}(3)$ rotation in the S = 0 CSFs, followed by backtransformation to the determinant basis is depicted below the definition of $\hat{F}$. The factoring of $\hat{F}$ into a product over a set of 5× representative one-parameter four-qubit gates, and the explicit decomposition of these gates into physically realizable forms in the standard two-qubit gate library is discussed later in the appendix.

D.1. Decomposition of $\hat{F}$ into simple gate elements

There are many possible decompositions of $\hat{F}$ into products of simpler (e.g. one-parameter) gate elements. However, the block-diagonal nature and configuration constituency of $\hat{F}$ suggest the following pragmatic choice, which admits a simple decomposition all the way down to a standard two-qubit gate library. The text refers to the $\hat{F}$ gate matrix in the second line of figure 7.

The red block (matrix entries c1p := cos(θ1p /2) and s1p := sin(θ1p /2)) corresponds to a Givens rotation between the one particle (Nα = 1, Nβ = 0, S = 1) CSFs |#1⟩ and |#4⟩ and the same Givens rotation between the (Nα = 0, Nβ = 1, S = 1) CSFs |#2⟩ and |#8⟩ (to preserve ${\hat{S}}^{2}$ symmetry). We call this operation the QNP1p(θ1p ) gate (quantum-number-preserving one-particle gate).

The blue block (matrix entries c1h := cos(θ1h /2) and s1h := sin(θ1h /2)) implements a Givens rotation between the one hole (Nα = 1, Nβ = 2, S = 1) CSFs |#11⟩ and |#14⟩ and the same Givens rotation between the (Nα = 2, Nβ = 1, S = 1) CSFs |#7⟩ and |#13⟩ (to preserve ${\hat{S}}^{2}$ symmetry). We call this operation the QNP1h(θ1h ) gate (quantum-number-preserving one-hole gate).

The green block implements an $\mathcal{SO}(3)$ rotation between the three (Nα = 1, Nβ = 1, S = 0) CSFs, |#3⟩, |#12⟩, and $\vert {{\Phi}}_{Z=2}^{S=0}\rangle $. There are three natural rotation gates (i.e. Euler-angle-like rotation gates) in this subspace: first, the QNPPX gate (quantum-number-preserving pair exchange gate) implements a Givens rotation between the two closed shell CSFs |#3⟩ and |#12⟩. Second and third, the QNPPBU and QNPPBL (QNP pair-break upper/lower gates) rotate between the upper, respectively, lower closed shell CSF and the open-shell singlet CSF $(\vert 0110\rangle +\vert 1001\rangle )/\sqrt{2}$.

Explicit decompositions of the five gates QNP1p, QNP1h, QNPPX, QNPPBU, and QNPPBL to elementary two-qubit gate operations are provided in appendix E.

D.2. Simplifications of the $\hat{F}$ gate fabric

A natural question at this point is whether there exist gate fabrics for $\mathcal{F}({2}^{2M})$ which are simpler than the $\hat{F}$-gate fabric described above. For example, a simpler gate fabric might have fewer parameters per gate element, and/or fewer QNP product gates per gate element, while still preserving quantum number symmetry and numerical efficiency. For one explicit example, it is clear that the 3× QNP product gates in the (Nα = 1, Nβ = 1, S = 0) irrep are redundant, as QNPPX and QNPPBU (or QNPPBL) are sufficient to attain any desired action in the irrep. Further, repeated application of pairs of QNPPX and QNPPBU (or QNPPBL) gates is sufficient to attain any desired operator in the irrep. So we can already reduce from a five-parameter $\hat{F}$ gate fabric to a four-parameter modified ${\hat{F}}^{\prime }$ gate fabric.

Next, we can consider the one-hole and one-particle spaces. Depending on the target irrep, only one of these rotations is generally needed, e.g. for most irreps, a gate fabric of QNP1p, QNPPX, and QNPPBU is universal. So, for most irreps, a three-parameter modified ${\hat{F}}^{{\prime\prime}}$ gate fabric is sufficient. A technical detail here is that the choice of QNP1p vs QNP1h required for universality is contingent on whether there are more particles or holes in the desired irrep of $\mathcal{F}({2}^{2M})$ for extreme edge case irreps.

As we show in the main text, it is possible to reduce ${\hat{F}}^{{\prime\prime}}$ even further to a two-parameter $\hat{Q}$ gate fabric, where the $\hat{Q}$ fabric symmetrizes the rotations between the one-particle and one-hole irreps, and additionally mixes rotations between the one-particle/hole irreps and the (Nα = 1, Nβ = 1, S = 0) irrep. To that end we consider an alternative QNP gate which is already well-known in the literature, the spatial orbital rotation gate, which we describe in the following section.

D.3. Orbital rotations

A well-known operation in both classical and quantum electronic structure methods is the spin-adapted spatial orbital rotation gate, which implements,

Equation (D1)

for ${V}_{pq}\in \mathcal{SO}(M)$, and for the particular case of M = 2 adjacent spatial orbitals. If we take Vpq to be a 2 × 2 special orthogonal matrix, i.e. a Givens rotation matrix with parameter φ, then this one-parameter, four-qubit QNPOR gate (quantum-number-preserving orbital rotation gate) is a special case of the five-parameter, four-qubit $\hat{F}$ gate from figure 7 with, c := cp1 = ch1 = cos(θ/2), s := sp1 = sh1 = sin(θ/2), and

Equation (D2)

This gate can be viewed as a simultaneous and symmetrical application of the QNP1p and QNP1h gates which also acts in a direct product manner in the (Nα = 1, Nβ = 1, S = 0) irrep. The explicit action of the QNPOR gate is depicted in figure 8.

Figure 8. Spin-adapted spatial orbital rotation gate between two adjacent spatial orbitals. The parameter φ is the argument of the Givens rotation between orbitals |ϕ0⟩ and |ϕ1⟩, with the same Givens rotation applied in the α and β spaces.

It is well-known that a fabric of $\binom{M}{2}$ ${\mathrm{QNP}}_{\text{OR}}$ gates arranged in a rectangular or triangular gate fabric pattern can exactly implement an arbitrary orbital rotation within M spatial orbitals, with a classically tractable relationship between the Vpq orbital rotation matrix and the parameters {ϕd } of the fabric being possible through the QR decomposition of the orbital rotation matrix.
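The classical side of this statement can be illustrated with a minimal sketch. It assumes a triangular elimination pattern, in which plane rotations between adjacent rows zero out the sub-diagonal entries of a special orthogonal matrix one column at a time (a Givens-based variant of a QR-type factorization). The function names and the test matrix are hypothetical, and the returned values are plain rotation angles; how they map onto the gate parameters depends on the angle convention of the gate and is not fixed here.

```python
import math
import random

def rotate_rows(A, p, q, angle):
    """Left-multiply A in place by a plane rotation acting on rows p and q."""
    c, s = math.cos(angle), math.sin(angle)
    for k in range(len(A[0])):
        top, bot = A[p][k], A[q][k]
        A[p][k] = c * top + s * bot
        A[q][k] = -s * top + c * bot

def identity(M):
    return [[1.0 if i == j else 0.0 for j in range(M)] for i in range(M)]

def givens_decompose(V):
    """Return adjacent-row rotations (row, angle) that reduce V in SO(M) to the identity."""
    M = len(V)
    A = [row[:] for row in V]
    rotations = []
    for j in range(M - 1):                 # eliminate column j below the diagonal
        for i in range(M - 1, j, -1):      # work upwards from the last row
            angle = math.atan2(A[i][j], A[i - 1][j])
            rotate_rows(A, i - 1, i, angle)   # zeroes A[i][j], keeps A[i-1][j] >= 0
            rotations.append((i - 1, angle))
    return rotations

def rebuild(rotations, M):
    """Reassemble V by undoing the recorded rotations in reverse order."""
    A = identity(M)
    for p, angle in reversed(rotations):
        rotate_rows(A, p, p + 1, -angle)
    return A

# Test matrix: a product of rotations in arbitrary (not only adjacent) planes,
# which lies in SO(M) by construction.
random.seed(0)
M = 4
V = identity(M)
for p, q in [(0, 3), (1, 2), (0, 2), (1, 3), (0, 1), (2, 3)]:
    rotate_rows(V, p, q, random.uniform(-math.pi, math.pi))

rotations = givens_decompose(V)
V_rebuilt = rebuild(rotations, M)
max_error = max(abs(V[i][j] - V_rebuilt[i][j]) for i in range(M) for j in range(M))
print(len(rotations), max_error)
```

The number of recorded rotations is M(M − 1)/2, matching the gate count quoted above, and every rotation acts on neighbouring orbitals only.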

D.4. Variants of the QNP fabrics

As with the gate fabric in the main text, it can be interesting to prepend the parametrized gate elements ${\hat{F}}^{\prime }$ or ${\hat{F}}^{{\prime\prime}}$ with a fixed gate like the $\hat{{\Pi}}$ gate, as this may improve trainability of the fabric or even expressiveness at intermediate depths. In the main text we have explored the options $\hat{{\Pi}}\in \left\{\hat{I},{\mathrm{Q}\mathrm{N}\mathrm{P}}_{\text{OR}}(\pi )\right\}$. Another natural option, inspired by the concept of fermionic swap networks, would be to take $\hat{{\Pi}}$ to be an orbital-wise fermionic swap gate. This gate is also QNP and we introduce it at the end of the following section.

Appendix E.: Explicit decompositions of the quantum number preserving gates

Here we provide explicit decompositions of all the QNP gates introduced in the main text, namely the gates: QNP1p, QNP1h, QNPPX, QNPPBU, QNPPBL, and QNPOR as well as one additional gate OFSWAP. We call these gates number preserving gates because for any gate G from the above list it holds that

Equation (E1)

The decompositions of QNP1p(θ) and QNP1h(θ) are given in terms of decompositions of two gates, each acting on just the alpha or beta space such that QNP1p(θ) = QNPA0B1(θ)QNPA1B0(θ) and QNP1h(θ) = QNPA2B1(θ)QNPA1B2(θ). None of these four gates is individually QNP.

E.1. Pair exchange gate QNPPX(θ)

For the QNPPX(θ) gate we present the following decomposition in terms of standard gates and controlled Y rotations:

Due to cancellations when expanding out the controlled Y rotations, a decomposition in terms of only standard gates has only slightly higher depth (15 → 18) and requires fewer two-qubit gates (17 → 13) even if the controlled Y rotation is a native operation:

A similar gate was considered in [26, 50].

E.2. Two orbital Givens rotation gate QNPOR

We describe the construction of the Givens rotation gate in more detail. A Givens rotation is generally a rotation in a two dimensional subspace of the form

Equation (E2)

where c := cos(θ/2) and s := sin(θ/2) for a continuous parameter θ. Under the Jordan–Wigner mapping a Givens rotation between the orbital bases can be implemented as a pair of parallel Givens gates as follows:

Equation (E3)

In the two qubit space, the Givens rotation gate G(θ) has the action

Equation (E4)

and can be decomposed into elementary gates as follows:

Equation (E5)

In the four-qubit Hilbert space, the two orbital Givens rotation gate QNPOR(θ) has the action

When applied to two neighboring spatial orbitals, this gate also preserves all three quantum numbers and has the following decomposition with gate depth 5 and just 4 CNOT gates:

An alternative decomposition into controlled Y rotation gates is:

Note that if these gates are native, the two-qubit gate count is raised to 6 while reducing the depth to 3. Expanding out the controlled Y rotations yields:

Of course the two Givens rotations of the gate commute and can be performed at the same time, giving a gate depth of 6 and a CNOT count of 8. Depending on the preceding and following gates, it may furthermore be favourable to substitute the doubled CNOT gates with a single CNOT and a SWAP gate.

E.3. Single particle and single hole gates

In the following we abbreviate R := RY(θ/8). The QNPA1B0(θ) gate can be decomposed as follows:

The QNPA0B1(θ) gate can be decomposed as follows:

The QNPA2B1(θ) gate can be decomposed as follows:

Finally, the QNPA1B2(θ) gate can be decomposed as follows:

E.4. Pair breaking gates

For the pair breaking gates we present decompositions into standard gates and controlled Y rotations. The pair-break lower gate QNPPBL has the decomposition:

The pair-break upper gate QNPPBU has the decomposition:

E.5. Fermionic orbital swap gate

Finally, the OFSWAP gate is an orbital-wise fermionic swap gate, i.e. two fermionic swap gates (a SWAP gate followed by a controlled Z gate) with action

Equation (E7)

where the −1 in the lower right corner takes into account the sign of the fermionic anti-commutation relations, applied between the alpha and beta wires respectively. Curiously OFSWAP is only up to phases representable by an orbital rotation as we have $\mathrm{O}\mathrm{F}\mathrm{S}\mathrm{W}\mathrm{A}\mathrm{P}\enspace {\hat{Z}}_{0}{\hat{Z}}_{\bar{0}}={\mathrm{Q}\mathrm{N}\mathrm{P}}_{2\text{OGR}}(\pi )$.

Appendix F.: Generalized parameter-shift rules

In order to compute the derivative of expectation values with respect to quantum gate parameters, the so-called parameter-shift rule has been established as a tool to avoid finite difference derivatives, which become unstable under the influence of noise from both measurements and circuit imperfections [51]. In addition to the original concept, multiple efforts have been made to analyze and generalize the parameter-shift rule [33, 35, 38, 52, 53]. In this appendix we introduce the concept of tuning the shift angle in parameter-shift rules for an algorithmic advantage (appendix F.1), a new four-term parameter-shift rule for gates with three distinct eigenvalues (appendix F.2) and exclude a further straightforward generalization of this type of parameter-shift rules (appendix F.2.2). We also compare our new four-term rule to the one recently presented in [37] (appendix F.2.1) and extend the variance minimization strategy from [38] to both four-term rules. This four-term shift rule is applicable to all of the QNP gates introduced above, except for the spin-adapted QNPOR gate, which can be analytically differentiated by differentiating the individual G gates and using the chain rule.

F.1. Shift tuning

We briefly recap the derivation of the standard parameter-shift rule without fixing the shift angle, leading to a free parameter in the rule. Consider a parametrized gate of the form

Equation (F1)

where $P^2 = 1$, as is the case for example for Pauli rotation gates. In a circuit with an arbitrary number of parameters, let us single out the parameter of the gate U above and write our cost function of interest as

Equation (F2)

where the part of the circuit preparing |ψ⟩(θ) from some initial state applied before the gate U has been absorbed into |ϕ⟩ and the part after U is absorbed into B. Then the derivative is, by the product rule, given by

Equation (F3)

Now look at the conjugation of B by U at an arbitrary shift angle ±α:

Equation (F4)

Subtracting $\mathcal{U}(-\alpha )(B)$ from $\mathcal{U}(\alpha )(B)$ and excluding multiples of π as values for α, we obtain the generalized two-term parameter-shift rule

Equation (F5)

Equation (F6)

where the original parameter-shift rule corresponds to choosing α = π/2. We note that the concept of shift-tuning was independently discovered in [38] and introduced in the quantum computing software package PennyLane [36].

F.1.1. Reducing the gate count

In particular, the general form of equation (F5) allows us—provided that θ is not a multiple of π—to choose α = −θ, making the first of the cost function evaluations f(0) and therefore reducing the gate count because U(0) = 1 can be skipped in the circuit. This may lead to an additional gate count reduction if the neighboring gates on both sides of U can be merged, which is true for example in circuits for the QAOA.
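As a numerical illustration of shift tuning, the following minimal sketch assumes a gate of the form U(θ) = exp(−iθP/2) with $P^2 = 1$, for which the generalized two-term rule can be written as ∂θ f = [f(θ + α) − f(θ − α)]/(2 sin α). This form is stated here as an assumption (it reduces to the familiar rule at α = π/2). The single-qubit cost function, the input state and all names are hypothetical choices.

```python
import math

# Fixed real single-qubit input state |phi> (an arbitrary choice for illustration).
PHI = (math.cos(0.3), math.sin(0.3))

def cost(theta):
    """f(theta) = <phi| RY(theta)^dagger Z RY(theta) |phi>, with RY(theta) = exp(-i theta Y / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    x0 = c * PHI[0] - s * PHI[1]
    x1 = s * PHI[0] + c * PHI[1]
    return x0 * x0 - x1 * x1

def two_term_shift(f, theta, alpha):
    """Two-term rule with a free shift angle; alpha must not be a multiple of pi."""
    return (f(theta + alpha) - f(theta - alpha)) / (2 * math.sin(alpha))

theta = 0.7
# Central finite difference, used only as an independent reference value.
reference = (cost(theta + 1e-6) - cost(theta - 1e-6)) / 2e-6

standard = two_term_shift(cost, theta, math.pi / 2)   # original parameter-shift rule
tuned = two_term_shift(cost, theta, 1.234)            # any other admissible shift
gate_skipping = two_term_shift(cost, theta, -theta)   # evaluates f(0) and f(2 * theta)
print(reference, standard, tuned, gate_skipping)
```

With α = −θ one of the two evaluations is f(0), i.e. a circuit in which the gate U can be left out, as described above.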

F.2. Four-term parameter-shift rule

Here we derive a four-term parameter-shift rule for gates that do not fulfill the two-term rule, e.g. controlled rotation gates like CRZ (θ) or many of our QNP gates with one parameter.

To this end, consider a gate

Equation (F7)

with $Q^3 = Q$ but not necessarily $Q^2 = 1$, as is true for any generator with spectrum {−1, 0, 1}. Then the exponential series can be rewritten as

Equation (F8)

and a computation similar to the one above leads to

Equation (F9)

We can then obtain the commutator by linearly combining this difference with itself for a second angle ±β, so that

Equation (F10)

Equation (F11)

which holds true if the angles α, β and the prefactors d1,2 satisfy

Equation (F12)

Equation (F13)

Therefore, we get the four-term parameter-shift rule

Equation (F14)

where we again can choose α or β such that one of the function evaluations skips the gate U. A particularly symmetric solution of equations (F12) and (F13) is

Equation (F15)

In general, any gate for which the spectrum of the generator is {−a + c, c, a + c} obeys the four-term parameter-shift rule as the shift c can be absorbed into a global phase that does not contribute to the gradient and a can be absorbed into the variational parameter of the gate.
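The four-term rule can be checked numerically with a minimal sketch. It assumes the convention U(θ) = exp(−iθQ/2), so that the cost function contains only the frequencies 1/2 and 1. Requiring the four-term combination d1[f(θ + α) − f(θ − α)] − d2[f(θ + β) − f(θ − β)] to reproduce the derivative of both frequency components gives the two linear conditions d1 sin(α/2) − d2 sin(β/2) = 1/4 and d1 sin α − d2 sin β = 1/2. These are derived here directly from the frequency content and are assumed to be equivalent to equations (F12) and (F13). The CRY test gate, the state, the observable and all names are hypothetical choices.

```python
import math

def four_term_coefficients(alpha, beta):
    """Solve the two linear conditions on (d1, d2) by Cramer's rule."""
    det = -math.sin(alpha / 2) * math.sin(beta) + math.sin(beta / 2) * math.sin(alpha)
    d1 = (-0.25 * math.sin(beta) + 0.5 * math.sin(beta / 2)) / det
    d2 = (0.5 * math.sin(alpha / 2) - 0.25 * math.sin(alpha)) / det
    return d1, d2

# Real, normalized two-qubit input state, ordered as |control, target>.
PHI = [0.6, 0.0, 0.48, 0.64]
# Arbitrary real symmetric observable that couples the two control sectors.
B = [[1.0, 0.3, 0.5, 0.2],
     [0.3, -1.0, 0.1, 0.4],
     [0.5, 0.1, 0.7, -0.6],
     [0.2, 0.4, -0.6, -0.7]]

def cost(theta):
    """Expectation value of B after CRY(theta) = exp(-i theta |1><1| (x) Y / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    a00, a01, a10, a11 = PHI
    psi = [a00, a01, c * a10 - s * a11, s * a10 + c * a11]
    return sum(psi[i] * B[i][j] * psi[j] for i in range(4) for j in range(4))

def four_term_shift(f, theta, alpha, beta):
    d1, d2 = four_term_coefficients(alpha, beta)
    return (d1 * (f(theta + alpha) - f(theta - alpha))
            - d2 * (f(theta + beta) - f(theta - beta)))

theta = 0.4
# Central finite difference, used only as an independent reference value.
reference = (cost(theta + 1e-6) - cost(theta - 1e-6)) / 2e-6
symmetric = four_term_shift(cost, theta, math.pi / 2, 3 * math.pi / 2)
generic = four_term_shift(cost, theta, 1.0, 2.0)
d1, d2 = four_term_coefficients(math.pi / 2, 3 * math.pi / 2)
print(reference, symmetric, generic, d1, d2)
```

For α = π/2 and β = 3π/2 the sketch yields d1 = (√2 + 1)/(4√2) and d2 = (√2 − 1)/(4√2); these values are computed by the code rather than copied from equation (F15).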

As an example, the four-term rule is applicable to (multi-)controlled Pauli rotations CRP (φ) for which Q is the zero matrix except for the Pauli operator P on the target qubit. For multiple control qubits and our QNP gates, this will lead to fewer circuit evaluations than using the chain rule and applying the two-term rule to the gate decomposition.

In order to find out whether an n-qubit single-parameter gate U satisfies the four-term rule, one can compute

Equation (F16)

and test if there is an $a\in \mathbb{R}$ such that ${\bar{Q}}^{3}={a}^{2}\enspace \bar{Q}$, which is a sufficient condition, as the only thing we needed for the four-term rule to apply was this assumption about the generator spectrum.
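A minimal sketch of such a test is given below. It takes a Hermitian generator as input directly instead of constructing it via equation (F16), and it determines the candidate a² from the trace identity tr(Q⁴) = a² tr(Q²), which follows from the condition itself; this shortcut and all names are choices made for this illustration. The first test generator acts like a Pauli Y in the {|01⟩, |10⟩} subspace of two qubits, as is assumed here for a single Givens gate (the sign convention does not affect the result). The second is the sum of two such generators on disjoint qubit pairs, as is assumed for the orbital rotation gate.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def add(A, B):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def four_term_condition(Q, tol=1e-12):
    """Return (holds, a_squared) for the test Q^3 = a^2 Q."""
    Q2 = matmul(Q, Q)
    Q3 = matmul(Q2, Q)
    tr_q2 = trace(Q2).real
    if tr_q2 < tol:  # Q = 0: the gate does not depend on its parameter
        return False, 0.0
    a_squared = trace(matmul(Q2, Q2)).real / tr_q2
    n = len(Q)
    residual = max(abs(Q3[i][j] - a_squared * Q[i][j]) for i in range(n) for j in range(n))
    return residual < tol, a_squared

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# Generator with spectrum {-1, 0, 0, 1} (basis order |00>, |01>, |10>, |11>).
Q_G = [[0, 0, 0, 0],
       [0, 0, -1j, 0],
       [0, 1j, 0, 0],
       [0, 0, 0, 0]]
# Sum of two commuting copies: spectrum {-2, -1, 0, 1, 2}.
Q_OR = add(kron(Q_G, I4), kron(I4, Q_G))

givens_ok, givens_a2 = four_term_condition(Q_G)
or_ok, _ = four_term_condition(Q_OR)
print(givens_ok, givens_a2, or_ok)
```

The single Givens generator passes the test with a² = 1, whereas the summed generator has five distinct eigenvalues and fails it, in line with the orbital rotation gate being differentiated via its individual G gates.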

F.2.1. Relation to other four-term rule

Previous work showed the existence of a four-term parameter-shift rule [37] for gates of the form (F7), which is implemented with only one shift angle but requires the two additional gates

There are four relevant aspects when comparing this rule to the one in (F14): first, our four-term rule does not require any additional gates like V±, which add overhead to the gradient evaluation circuits. While the authors bound the additional cost by the cost of the differentiated gate itself, it might more crucially be non-trivial to construct V± for gates that do not have an obvious fermionic representation like the gates considered in [37].

Second, the shift tuning technique for gate count reduction in (F1) can easily be extended to both our four-term rule and the rule derived in [37], provided one has access to the parametrized versions of V±. As the construction of V± for fermion-based gates is based on rotations, this access can be assumed for these gates whenever V± can be implemented.

Third, it was shown in [37] that their four-term rule reduces to a standard two-term rule up to the insertions of the V± operators whenever both the circuit of interest and the measured observables are purely real-valued. This is the case for virtually all molecular Hamiltonians and most of the circuits proposed for quantum chemistry problems—including the fabrics in this work—such that gradients of highly complex gates may be computed with just two circuit executions including the gates V± using the rule in [37].

Fourth, the variances of the derivative estimators given by the two rules can be minimized to the same value by choosing the shift angles optimally, as shown in appendix F.3. This means that for a given budget of circuit executions, the quality of the estimated derivative is the same, even though the number of distinct circuits differs.

In summary, the specialized two-term parameter-shift rule in [37] is preferable if the following three criteria hold: firstly, the circuit and observable need to be real-valued. Secondly, the auxiliary gates V± have to be available. Thirdly, the computation must happen on a simulation level in which the number of distinct circuits instead of the measurement budget is relevant, so that the reduction from four to two terms provides an advantage which is larger than the overhead of adding V±. In all other scenarios the four-term rule equation (F14) with the optimal parameters in equations (F24) and (F25) requires slightly fewer gates and the same number of circuit executions, making it preferable in particular on quantum computers.

F.2.2. Impossibility of some further shift rules

One may wonder whether a three shift rule is possible for gates whose generators have just three distinct eigenvalues and whether shift rules exist for gates with more distinct eigenvalues. We present some insights on these questions in the following.

During the derivation of the four-term parameter-shift rule we chose to first linearly combine $\mathcal{U}(\pm \alpha )(B)$ and $\mathcal{U}(\pm \beta )(B)$ with the same prefactors, respectively. Alternatively one may try to combine $\mathcal{U}({\alpha }_{i})(B)$ at three shift angles ${\left\{{\alpha }_{i}\right\}}_{i\in \left\{1,2,3\right\}}$ linearly and demand the result to fulfill

Equation (F17)

This leads to the system of equations

with ${c}_{i}=\mathrm{cos}\left(\frac{{\alpha }_{i}}{2}\right)$ and ${s}_{i}=\mathrm{sin}\left(\frac{{\alpha }_{i}}{2}\right)$, which we conjecture to not have a solution.

Considering the generalization of the (standard) two-term shift rule to the four-term rule in (F14) and their requirement on the gate generator, i.e. $Q^2 = 1$ and $Q^3 = Q$, it seems a natural question whether further generalization is possible to gates that, e.g. fulfill $Q^5 = Q$. We show next that this is not the case.

Consider the generalized condition $Q^m = Q^n$, $m \neq n$, for the generator of a d-dimensional one-parameter gate. We recall that we may absorb shifts and scaling prefactors of the spectrum of Q into a global phase gate and the variational parameter, respectively, which may be used to obtain gates satisfying the generalized condition $Q^m = Q^n$. In the eigenbasis of the Hermitian matrix Q, this condition becomes ${\lambda }_{i}^{m}={\lambda }_{i}^{n}\enspace \forall 1\leqslant i\leqslant d$, which only ever is solved by −1, 0 and 1 over $\mathbb{R}$ (in which the spectrum of Q must be contained), with the additional condition $(m-n) \bmod 2 = 0$ for λi = −1. This means that Q already satisfies $Q^3 = Q$, allowing for the four-term rule to be applied.
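The eigenvalue argument can be spelled out in one line. The following worked step assumes, without loss of generality, that m > n ≥ 1:

```latex
\lambda^{m} = \lambda^{n}
\;\Longrightarrow\;
\lambda^{n}\left(\lambda^{m-n} - 1\right) = 0
\;\Longrightarrow\;
\lambda = 0 \quad\text{or}\quad \lambda^{m-n} = 1 .
% Over the reals, \lambda^{k} = 1 with k = m - n \geq 1 forces |\lambda| = 1, hence
\lambda \in \{-1, 0, 1\},
\qquad
\lambda = -1 \ \text{only if}\ (m - n) \bmod 2 = 0 .
```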

Consequently, a direct generalization of the four-term rule is not possible. Note that this does not exclude the existence of other schemes to compute the derivative of an expectation value w.r.t. parametrized states that are based on linear combinations of shifted expectation values.

F.3. Minimizing the variance

If we approximate the physical variance of the expectation value, V, to be independent of θ, the variance of measuring f at a given parameter for sufficiently many measurements N is V/N. The resulting variance of the two-term shift rule derivative for a budget of N measurements is

Equation (F18)

where we chose the optimal allocation of N/2 measurements to each of the two terms in the shift rule. We may optimize the shift angle in the two-term rule w.r.t. this variance which yields the standard choice π/2 for the shift because

Equation (F19)

The variance can be reduced further by introducing a multiplicative bias to the estimator, as presented in [38]; the optimal choice of the prefactor depends on the value and the variance of the derivative and is given by

Equation (F20)

Note that λ* has to be estimated because V and ∂θ f are not known exactly. The optimal choice of the shift parameter remains $\frac{\pi }{2}$.

For the four-term rule in equation (F14), the optimal shot allocation is proportional to the prefactors d1,2 and leads to the variance

Equation (F21)

As for the two-term parameter-shift rule, we may minimize this variance w.r.t. α and β via d1 and d2, which are given via equations (F12) and (F13) by

Equation (F22)

Equation (F23)

This results in

Equation (F24)

Equation (F25)

and three equivalent solutions based on the symmetries of equations (F12) and (F13).

The variance then is $\sigma^2 = V/N$ as for the optimal two-term rule and again it may be further reduced by introducing a bias via a multiplicative prefactor λ, with the same optimal λ* as before.

For both the four-term rule and the specialized two-term rule in [37], the minimal variance is $\sigma^2 = V/N$ as well, as the prefactors are equally large and sum to 1.
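The variance statements can be explored with a minimal sketch under the constant-variance assumption. It uses two expressions derived for this illustration, which are assumed to agree with equations (F18) and (F21): the two-term variance V/(N sin²α) for N/2 shots per term, and the four-term variance 4V(|d1| + |d2|)²/N for a shot allocation proportional to |d1| and |d2|. The coefficients are obtained from the same two linear conditions as for U(θ) = exp(−iθQ/2), namely d1 sin(α/2) − d2 sin(β/2) = 1/4 and d1 sin α − d2 sin β = 1/2. All names and the grid resolution are hypothetical.

```python
import math

def four_term_coefficients(alpha, beta):
    """Solve the two linear conditions on (d1, d2); return None if they are singular."""
    det = -math.sin(alpha / 2) * math.sin(beta) + math.sin(beta / 2) * math.sin(alpha)
    if abs(det) < 1e-9:
        return None
    d1 = (-0.25 * math.sin(beta) + 0.5 * math.sin(beta / 2)) / det
    d2 = (0.5 * math.sin(alpha / 2) - 0.25 * math.sin(alpha)) / det
    return d1, d2

def two_term_variance(alpha):
    """Variance of the two-term estimator in units of V / N."""
    return 1.0 / math.sin(alpha) ** 2

def four_term_variance(alpha, beta):
    """Variance of the four-term estimator in units of V / N, or None if undefined."""
    coefficients = four_term_coefficients(alpha, beta)
    if coefficients is None:
        return None
    d1, d2 = coefficients
    return 4.0 * (abs(d1) + abs(d2)) ** 2

# Scan a coarse grid of shift angles in (0, 2 pi) for the smallest four-term variance.
steps = 60
best = None
for i in range(1, steps):
    for j in range(1, steps):
        alpha, beta = 2 * math.pi * i / steps, 2 * math.pi * j / steps
        value = four_term_variance(alpha, beta)
        if value is not None and (best is None or value < best[0]):
            best = (value, alpha, beta)

symmetric = four_term_variance(math.pi / 2, 3 * math.pi / 2)
print(two_term_variance(math.pi / 2), symmetric, best)
```

On this grid no pair of shift angles falls below the value 1 in units of V/N, which is also the value attained by the two-term rule at α = π/2.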

In conclusion, under the constant variance assumption, the variance for all discussed two- and four-term parameter shift rules is the same at a given measurement budget, showing that they are equally expensive on a quantum device, for which the number of measurements instead of the number of distinct circuits is relevant.

Appendix G.: Additional numerical results

G.1. Computational basis state amplitudes

In figure 3(b) of the main text the computational basis states of each trace are ordered individually, which allows one to view the shape of each tail but restricts comparability between single amplitudes. In figure 9 only the computational basis states of the true ground state and of one optimized VQE state at 110 parameters are plotted in consistent ordering. This allows for direct comparison between the amplitudes of the VQE and of FCI and demonstrates how our fabric finds a good approximation to most amplitudes while having far too few parameters to reproduce all amplitudes exactly.

Figure 9. A cutout from figure 3(b) for a VQE with 110 parameters (orange curve) and the blue shaded FCI probabilities at a vertical cut off of $10^{-7}$ with consistent ordering between the FCI and VQE computational basis states. The numbers above the columns indicate the seniority of the computational basis state at the respective index, e.g. the first column at index 0 is the Hartree–Fock determinant with seniority 0.

G.2. Numerical universality demonstration for Haar random states

The test cases in real molecular systems in the main text are somewhat complicated by the specifics of the electronic structure Hamiltonian and especially by the spatial point group symmetry of the test molecules. One notable artifact is that some of the left-most gates in our gate fabrics in real molecules are 'dead', as they perform orbital rotations and diagonal pair exchanges in the occupied or virtual subspaces of the HF starting state. The point group symmetry also seems to adversely affect the numerical convergence behavior of the VQE gate fabric parameter optimization, e.g. suggesting the $\hat{{\Pi}}$-gate pre-mixing initialization adopted in the main text. Noticeably better convergence behavior was observed when the molecules were perturbed from $D_{2h}$ symmetry to $C_1$ by random Gaussian perturbations in XYZ coordinates.

This section is included to demonstrate the numerical universality properties of our proposed gate fabric for the artificial case of Haar random statevectors. Specifically, for a number of test case irreps (M, Nα , Nβ , S), we form the full CSF basis, and then generate Haar random statevectors |A⟩ and |B⟩ within this irrep of $\mathcal{F}({2}^{2M})$ by Gaussian random sampling and normalization of the statevector in the CSF basis, and then backtransformation to the standard Jordan–Wigner computational basis. We then optimize the VQE gate fabric parameters of the VQE entangler circuit $\hat{U}$ to maximize $\vert \langle A\vert \hat{U}\vert B\rangle {\vert }^{2}$ via L-BFGS with noise-free analytical gradients. Note that we do not perform $\hat{{\Pi}}$-based pre-mixing convergence enhancement in this section.
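The sampling step can be sketched as follows. This is a minimal illustration that assumes real-valued amplitudes and works purely in an abstract CSF basis; the back-transformation to the Jordan–Wigner computational basis and the entangler circuit itself are omitted, so the overlap shown is only the starting value for $\hat{U}=\hat{I}$. The function names, the seed and the dimension are hypothetical (20 is the number of singlet CSFs for four electrons in four spatial orbitals).

```python
import math
import random

def random_real_state(dim, rng):
    """Sample a unit vector uniformly on the real sphere by Gaussian sampling and normalization."""
    amplitudes = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(a * a for a in amplitudes))
    return [a / norm for a in amplitudes]

def squared_overlap(a, b):
    """|<A|B>|^2 for real statevectors."""
    return sum(x * y for x, y in zip(a, b)) ** 2

rng = random.Random(1234)   # fixed seed so that the example is reproducible
dim = 20                    # size of the CSF basis assumed for this illustration
state_a = random_real_state(dim, rng)
state_b = random_real_state(dim, rng)

norm_a = sum(x * x for x in state_a)
start_overlap = squared_overlap(state_a, state_b)
print(norm_a, start_overlap)
```

An optimizer such as L-BFGS would then vary the fabric parameters to drive the squared overlap from this starting value towards 1.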

The results are shown for the half-filled cases for M = 4 and M = 6 in figure 10. The top panels show the bulk convergence properties with respect to circuit depth D/number of parameters Nparameter (roughly linearly proportional). The general finding here is roughly geometric convergence at low parameter depths, followed by a sharp drop to near the machine epsilon as the number of parameters crosses over the number of CSFs in the irrep, indicating the onset of universality in the action of the VQE entangler circuit. Quantum number symmetries are preserved to at least the machine epsilon for all intermediate and final parameter values. The lower panels show the numerical convergence behavior of the L-BFGS optimization procedure for each point in the top panel. There are several salient features in these plots: (1) the earliest convergence behavior appears to be roughly geometric, and self-similar between gate fabrics with different numbers of parameters (2) fabrics with smaller numbers of parameters deviate earlier from this geometric convergence and eventually 'flatline' at their non-universal terminal values (3) some minor plateaus are observed in the convergence behavior for small numbers of parameters (4) there is a distinct phase change as universality is crossed, with circuits with larger numbers of parameters than needed for universality exhibiting strongly geometric convergence behavior all the way to the machine epsilon.

Figure 10. Numerical demonstration of universality of VQE gate fabric of the form of figure 2 for Haar random states |A⟩ and |B⟩ within $\mathcal{F}({2}^{2M})$. The VQE gate fabric parameters were optimized via L-BFGS with noise-free analytical gradients to maximize $\vert \langle A\vert \hat{U}\vert B\rangle {\vert }^{2}$, where $\hat{U}$ is the VQE entangler circuit operator. Top row: convergence of overlap $\vert \langle A\vert \hat{U}\vert B\rangle \vert $ with respect to gate fabric depth D/number of parameters Nparameter (roughly linearly proportional). Bottom row: convergence behavior of L-BFGS optimization procedure for each gate fabric depth D/number of parameters Nparameter (roughly linearly proportional). Left column: results for (M = 4, Nα = 2, Nβ = 2, S = 0) eight-qubit example. Right column: results for (M = 6, Nα = 3, Nβ = 3, S = 0) 12-qubit example. Each colormapped line in the lower panel corresponds to the L-BFGS numerical convergence behavior of a single point in the upper panel. Results are single random instances within each test case, and are wholly representative of generic random instances and test cases in other quantum number irreps.

Such tests are assuredly artificial, but are free from the external artifacts present in the molecular test cases, and serve to more strongly indicate that the gate fabric developed in figure 2 is universal and quantum-number-symmetry-preserving for $\mathcal{F}({2}^{2M})$.

G.3. Non-universal edge cases

It is important to note that while the $\hat{Q}$-type QNP gate fabrics of the main text are numerically universal for the vast majority of quantum number irreps in the 'bulk' of the Hilbert space, there are a limited number of edge cases for which these gate fabrics are not universal. These cases constitute systems where, after high-spin constraints are accounted for, there are only holes or particles left in the remaining orbitals. In these cases, the QNPPX gates have trivial action in the wholly hole or particle space, and are unable to explore new configurations within the space. More tangibly, for an irrep with dimensions (M, Nα, Nβ, S), we first compute the 'unconstrained' irrep (M − S, Nα′, Nβ′, 0) where Nα′ + Nβ′ + S = Nα + Nβ and the larger of Nα′ := Nα or Nβ′ := Nβ is decremented first until Nα′ = Nβ′, and then both Nα′ and Nβ′ are decremented together (in this line, := is read as 'initialized to'). The resulting unconstrained irrep will always have Nα′ = Nβ′. If the unconstrained irrep is all holes (Nα′ = Nβ′ = 0) or all particles (Nα′ = Nβ′ = M − S), then the $\hat{Q}$-type QNP gate fabric is not universal. A trivial exception is if only a single orbital with all holes or all particles remains in the unconstrained irrep, in which case universality is still preserved.

Note that the number of irreps in the Hilbert space grows roughly as $\mathcal{O}({M}^{3})$, while the required constraints Nα′ = Nβ′ = 0 or M − S seem to indicate that the number of irreps which are not universal with $\hat{Q}$-type QNP gate fabrics will grow roughly as $\mathcal{O}(M)$. Moreover, the non-universal irreps appear at the 'edge' of the Hilbert space, and consist of cases with severe high-spin constraints which are likely to be either polynomially tractable classically, physically uninteresting, or both. Interesting cases with roughly half-and-half filling of holes and particles and moderately low total spin number will almost surely fall into irreps which are universal with $\hat{Q}$-type QNP gate fabrics. Finally, it is worth noting that any issues with these edge cases can be completely obviated by instead working with the five-parameter $\hat{F}$ gate fabrics discussed in appendix D. These do not appear to exhibit any edge case non-universalities, and are numerically universal for all cases we have tested.

Tables 3 and 4 show explicitly the irreps for M = 4 and M = 6 that were found numerically to be non-universal with $\hat{Q}$-type QNP gate fabrics via numerical studies of the same type as the previous section. The non-universality behavior was immediately apparent as discrepancies of overlap of $1-\vert \langle A\vert \hat{U}\vert B\rangle {\vert }^{2}$ of order of $10^{-2}$, while the universal irreps exhibited maximum discrepancies of overlap of order of $<10^{-13}$.
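The irrep counts and dimensions quoted in tables 3 and 4 can be cross-checked with a minimal sketch. It relies on two assumptions that are not stated in the text: the tabulated S is read as an integer equal to twice the physical spin, and the irrep dimension is taken from the standard Weyl dimension formula for the number of CSFs. The function names are hypothetical.

```python
from math import comb

def csf_dimension(M, n_alpha, n_beta, s):
    """Number of CSFs for M spatial orbitals, N = n_alpha + n_beta electrons and
    integer spin label s (assumed to equal twice the physical spin)."""
    n = n_alpha + n_beta
    return (s + 1) * comb(M + 1, (n - s) // 2) * comb(M + 1, (n + s) // 2 + 1) // (M + 1)

def irreps(M):
    """Enumerate all (n_alpha, n_beta, s, dimension) labels for M spatial orbitals."""
    labels = []
    for n_alpha in range(M + 1):
        for n_beta in range(M + 1):
            n = n_alpha + n_beta
            # s has the parity of n and is bounded by the number of particles and holes.
            for s in range(abs(n_alpha - n_beta), min(n, 2 * M - n) + 1, 2):
                labels.append((n_alpha, n_beta, s, csf_dimension(M, n_alpha, n_beta, s)))
    return labels

for M in (4, 6):
    labels = irreps(M)
    print(M, len(labels), sum(dim for *_, dim in labels))
```

Under these assumptions the enumeration gives 35 irreps with total dimension 256 for M = 4 and 84 irreps with total dimension 4096 for M = 6, and for example a dimension of 6 for (Nα = 0, Nβ = 2, S = 2) at M = 4.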

Table 3. Quantum number irreps for M = 4 for which the $\hat{Q}$-type QNP gates of the main text are not universal. Overall there are 35 unique irreps for M = 4 with total dimension $D \equiv 2^{2M} = 256$. The 6 irreps with total dimension 36 listed below are not universal due to high-spin constraints. All other irreps are numerically found to be universal to essentially machine precision.

Nα Nβ S Dimension
0 2 2 6
1 1 2 6
2 0 2 6
2 4 2 6
3 3 2 6
4 2 2 6

Table 4. Quantum number irreps for M = 6 for which the $\hat{Q}$-type QNP gates of the main text are not universal. Overall there are 84 unique irreps for M = 6 with total dimension $D \equiv 2^{2M} = 4096$. The 24 irreps with total dimension 400 listed below are not universal due to high-spin constraints. All other irreps are numerically found to be universal to essentially machine precision.

Nα Nβ S Dimension
0 2 2 15
0 3 3 20
0 4 4 15
1 1 2 15
1 2 3 20
1 3 4 15
2 0 2 15
2 1 3 20
2 2 4 15
2 6 4 15
3 0 3 20
3 1 4 15
3 5 4 15
3 6 3 20
4 0 4 15
4 4 4 15
4 5 3 20
4 6 2 15
5 3 4 15
5 4 3 20
5 5 2 15
6 2 4 15
6 3 3 20
6 4 2 15