Complete-Graph Tensor Network States: A New Fermionic Wave Function Ansatz for Molecules

We present a new class of tensor network states that are specifically designed to capture the electron correlation of a molecule of arbitrary structure. In this ansatz, the electronic wave function is represented by a Complete-Graph Tensor Network (CGTN) ansatz which implements an efficient reduction of the number of variational parameters by breaking down the complexity of the high-dimensional coefficient tensor of a full-configuration-interaction (FCI) wave function. We demonstrate that CGTN states approximate ground states of molecules accurately by comparison of the CGTN and FCI expansion coefficients. The CGTN parametrization is not biased towards any reference configuration in contrast to many standard quantum chemical methods. This feature allows one to obtain accurate relative energies between CGTN states which is central to molecular physics and chemistry. We discuss the implications for quantum chemistry and focus on the spin-state problem. Our CGTN approach is applied to the energy splitting of states of different spin for methylene and the strongly correlated ozone molecule at a transition state structure. The parameters of the tensor network ansatz are variationally optimized by means of a parallel-tempering Monte Carlo algorithm.

3 section 3. In section 4, the procedure for the variational optimization of the CGTN via paralleltempering Monte Carlo is presented. The vertical spin splittings of methylene and ozone are reported in section 5.

Novel representations of quantum many-body states
The electronic Hamiltonian in second quantization reads in Hartree atomic units ('h = m e = e = 4π 0 = 1')Ĥ which contains one-electron integrals h i j over spatial orbitals φ i (r) given in non-relativistic theory by [16] with nuclear charge number Z I of atomic nucleus I and electron-nucleus-I distance r I = |r − R I |. The nucleus-nucleus repulsion term is suppressed for the sake of brevity. The two-electron integrals V i jkl are defined as The Hamiltonian and its ingredients may also be written in terms of spin orbitals φ i (x) = φ i (r)σ , where σ is a spin-up or spin-down spin eigenfunction. The coordinate x then denotes both spatial and spin variables. The eigenstate of the electronic Hamiltonian is the total electronic wave function | (N ) A if we restrict ourselves to N electrons. In quantum chemistry, the full-configurationinteraction (FCI) expansion of the total electronic wave function in terms of spin-adapted configuration state functions (CSFs) (SU(2) eigenfunctions) exactly solves the non-relativistic time-independent Schrödinger equation in a given one-particle basis of orbitals. The FCI wave function can be understood as a linear combination of occupation number vectors in the directproduct basis of the one-particle Hilbert spaces. Occupation number vectors are generated by distributing N electrons among the k orbitals. The FCI wave function of total electronic state A then reads A,FCI = q n 1 n 2 ...n k C (A) n 1 n 2 ...n k |n 1 n 2 . . . n k , where C (A) n 1 n 2 ...n k are the (F)CI expansion coefficients of state A and |n 1 n 2 . . . n k denotes an occupation number vector. Note that we restrict the sum in (4) to those vectors that represent N electrons and thus span the N -particle Hilbert space. The sums run over the dimension q of the local Hilbert spaces of the set of orbitals {1, 2, . . . , k}. For spin orbitals holds q = 2, the occupied and unoccupied one-electron state {|1 , |0 }. In the case of spatial orbitals, the basis of the local Hilbert space {|n i } consists of four states, {| , | ↑ , | ↓ , | ↑↓ }. Each occupation number vector |n ≡ |n 1 n 2 . . . n k is constructed as a direct product from the states of the local Hilbert spaces |n ≡ |n 1 n 2 . . . n k = |n 1 ⊗ |n 2 ⊗ . . . ⊗ |n k . 4 The number of variational CI parameters required to describe an electronic state (or a quantum state in general) grows exponentially with system size, which is a direct consequence of the underlying tensor-product structure of the Hilbert space. The exponentially growing number of parameters in the FCI ansatz renders it unfeasible for molecules containing more than a few atoms. Nevertheless, because of the nature of the interactions we may hope that there exists an efficient parametrization of a class of variational wave functions such that the low-energy sector of the electronic Hamiltonian is described with sufficient accuracy [12], [17]- [20]. In addition, the huge body of numerical evidence compiled during the past 40-50 years in quantum chemistry demonstrates that various truncated configuration-interaction expansions are efficient and reliable to approximate an electronic state [21]. This latter observation indicates that provided we find an efficient parametrization of all CI coefficients in the FCI expansion, we do not need to sample all occupation number vectors but only the most important ones for the total electronic energy. Otherwise, the procedure would be as unfeasible for large molecules as the FCI approach itself is.
One way of finding an efficient parametrization of states is to approximate the highdimensional coefficient tensor C (A) n 1 n 2 ...n k by a tensor network. Tensor network states build a new class of variational wave functions. The high-dimensional coefficient tensor is broken down into low-rank tensors, which are arranged on an arbitrary network [12], [22]- [28]. The primary advantage of tensor network states compared to the standard FCI expansion is the reduced number of variational parameters, which approximately scales as O(kχ p ) where k is the number of orbitals, χ the bond dimension and p the rank of the tensor. Tensor network states can be designed in such a way to directly map the entanglement of the underlying system [29,30].
The MPS constructed by the DMRG algorithm are the simplest example of tensor network states for one-dimensional systems [25,27,31]. An MPS with open-boundary conditions is defined as where the rank-3 tensors A i are written as m × m matrices A i [n i ] for a specific local state n i [9], [32]- [34].  [35] and Verstraete et al [12] who discussed the additional flexibility when using both wave function and Hamiltonian in MPS representation). Chan et al [5] and Hachmann et al [36] rephrased the quantum chemical DMRG algorithm consistently in terms of MPSs. Other variational families of states have been proposed for strongly correlated systems which can be efficiently contracted for variational Monte Carlo calculations. These include string-bond states (SBS) [37] and subsequently entangled-plaquette states (EPS) [38] and correlator product states (CPS) [39]. In this work, we will build upon an extension of these families to treat the full electronic Hamiltonian for molecular systems. Our extension is twofold: (i) SBS, EPS and CPS have only been applied to local spin Hamiltonians so far, while we aim at the full electronic Hamiltonian as given in (1), and (ii) we include all pair correlations of the one-electron basis states and do not impose any restriction on these pairs.

Complete-graph tensor network (CGTN) ansatz
In the case of an MPS parametrization of a wave function, sites-or orbitals in a quantum chemical context-have to be mapped on a suitably chosen lattice. Then, correlations are transmitted over the one-dimensional lattice by the size of the matrices occurring in the MPS. Naturally, this ansatz is more suitable for molecular systems with an inherent linear structure rather than for those with long-range correlations. Orbital ordering on this lattice is then crucial for the convergence of the variational optimization technique employed, e.g., for the DMRG algorithm [14,15]. Molecular orbitals are mostly delocalized and therefore contain no spatial information that would dictate a unique orbital ordering-in contrast to a clear spatial structure of spin models that can be easily mapped onto a one-dimensional lattice. Furthermore, a localization procedure cannot be efficiently applied for most compact molecules under the constraint of keeping the orthogonality of the orbitals. Hence, an MPS state might be difficult to optimize for a general molecule of arbitrary structure. By contrast, in the CGTN approach to be introduced now, non-local correlations are directly embedded into the nonlinear tensor network ansatz. The CGTN replaces the high-dimensional coefficient tensor in the FCI ansatz of (4) by a network of tensors that connects all orbitals with each other, where f ≡ { f n α n β αβ } represents a rank-[k(k + 1)/2] tensor that depends on the orbitals α, β ∈ {1, 2, . . . , k}. The local states of the spin orbitals n α , n β can be either occupied or unoccupied {|1 , |0 } (i.e. for spin orbitals q = 2). The sum runs over all possible occupation number vectors |n in the N -electron Hilbert space (in principle, in the full Fock space) with the correct number of electrons, projected spin and point-group symmetry.
The above ansatz is built on the key idea that every orbital is 'connected' with every other orbital. Hence, all CI coefficients are constructed from such pair correlations optimized for all orbitals. The number of variational parameters in our ansatz depends on the number of spin orbitals k and on the bond dimension d and is given as 1 2 k(k + 1)q 2 , where d = 2. Comparing this to the number of parameters in the FCI ansatz, which scales as O(2 k ) for spin orbitals, it is clear that CGTN states are much more efficient in terms of the number of variational parameters. It is important to emphasize that we do not need to specify any reference configuration like in most post-Hartree-Fock methods. Our ansatz comprises naturally all basis states that can be generated in the Hilbert space of interest. Thus, although the CI coefficients are approximated by the reduced set of CGTN parameters, we can expect that CGTN calculations are (at least approximately) size-consistent.
Compared to the tensor networks suggested so far for local (spin) Hamiltonians (see the last section), CGTN states form a subspace of the very general CPS parametrization, which is so general that it basically covers any non-hierarchical tensor network approximation of a wave function. In particular, they correspond to two-site CPS including all long-range correlation effects. A similar parametrization of a simple variational wave function was also chosen by Huse and Elser [40] to describe ground states of frustrated quantum spin systems. However, the question arises how accurate this parametrization is for the non-local electronic Hamiltonian of (1), which shall be investigated in this work. Although undesirable from the point of view of feasibility, inaccuracies may be cured by also including three-orbital, four-orbital, etc correlations as is possible with the general CPS ansatz. Thus, we may easily increase the 6 flexibility of CGTN states by including higher-order correlators (summing over three or more indices instead of two) or by increasing the bond dimension of the tensors from scalar values to matrices. In this work, the number of pair-correlation parameters is determined by the definition of an active orbital space, which is a standard procedure in quantum chemistry [21]. The next step is to variationally optimize the nonlinear tensor network ansatz.

Variational optimization
We apply a variational Monte Carlo scheme to optimize the CGTN state. In the context of tensornetwork states, this was demonstrated by Schuch et al [37] and by Sandvik and Vidal [41] for local Hamiltonians. We augment the optimization with a parallel tempering scheme. The energy of the system is written as an expectation value of the Hamiltonian operator over an N -electron wave function | (N ) CGTN , which delivers an upper bound to the exact FCI energy in a given one-particle basis.
For the sake of brevity, the tensor product in front of the occupation number vector in our CGTN ansatz is abbreviated by a scalar function C I , which corresponds to an (unnormalized) CI-like coefficient for a given occupation number vector |I in configuration-interaction theory, Inserting (10) into the normalization integral in the denominator of (8) yields the well-known CI-like normalization condition where the sum takes the square of the weights over all possible basis states in the Hilbert space with correct symmetry. We now insert (11) into (8) to get an approximation to the energy expectation value of the electronic Hamiltonian for our wave function ansatz and then substitute the ket in the denominator by (10): After rewriting this sum as we can perform Monte Carlo sampling with strictly non-negative probabilities C 2 I . We define an energy estimator E(I ) as a function of the occupation number vector |I that reads For a given |I , the number of basis states J | contributing to this sum is bounded by the number of terms in the Hamiltonian. Since the occupation number vector |I is not an eigenstate of the Hamiltonian,Ĥ el |I produces a linear combination of occupation number vectors with coefficients constructed from the one-electron and two-electron integrals occurring in the Hamiltonian. For a molecule, the sum over J can therefore be performed exactly. The Monte Carlo expectation value of an operator is calculated using ALPS code [42].
The calculation of the energy estimator E(S) scales as O(k 4 ) where k is the number of spin orbitals. The quotient of the two weights C J /C I scales as O(1) since it only depends on the difference between them. The evaluation of the high-dimensional integral of the electronic Hamiltonian between J and I , however, scales as O(k 4 ) due to the electron-electron repulsion operator.
A variational optimization is in general not guaranteed to converge to a global minimuminstead, it may be trapped in local minima and yield incorrect results. In our ansatz, the highly nonlinear structure and the complex energy landscape of molecules make such trapping quite likely. In particular, the approach of Sandvik and Vidal [41] that applies gradient information to propose a new set of variational parameters turns out to be unreliable in our case. We therefore use a stochastic optimization procedure that works entirely without gradient information. To each choice of variational parameters f = { f n α n β αβ }, we can assign an electronic energy E( f ). After introducing an artificial temperature T (actually, a parameter with the dimension of an energy; here measured in Hartree units, E h ), we can sample the continuous variables f following a canonical ensemble with the weight of a configuration given by exp(−E( f )/T ). The limit T → 0E h will yield the desired ground state of the molecule. It should be emphasized that this ensemble does not correspond to a physical ensemble at any finite temperature.
The advantage of this approach is that we can easily control the optimization procedure by tuning the parameter T . While an accurate simulation of the ground state is only possible for T → 0E h that may get stuck in local minima, a simulation at larger T may easily surmount highenergy barriers between local minima. We therefore use a parallel tempering/replica exchange scheme [43] where simulations are run at several temperatures simultaneously. After a certain number of updates, replica i and replica i + 1 at neighboring temperatures are exchanged with a probability The set of temperatures has to be chosen in such a way that the lowest temperatures are close to T = 0E h to yield information about the ground state and the highest temperatures are sufficient to overcome energy barriers. Hence, the choice depends very much on the specific problem at hand. It might be desirable to dynamically optimize the temperatures for some specific applications [44], but for the purposes of this work, we restrict ourselves to a static choice of the temperatures. For a set of M temperatures in the range [T 1 , T M ], we choose for the remaining M − 2 temperatures T l with T 1 < T l < T M : It is, of course, possible to finally take the state obtained from the above procedure as an input state for a direct optimization using gradient information, which may yield better accuracy close to the minimum.

Results
Our primary goals in this work are: (i) to analyze the CGTN ansatz for the description of electronic energies and CI coefficients of molecules and (ii) to show that we can optimize the CGTN ansatz by means of a variational parallel-tempering Monte Carlo algorithm. Our test molecules are methylene and ozone. For these small molecules we do not need to apply sampling of the occupation number vectors since the sum in (13) can be carried out explicitly. Hence, we use the above-described sampling scheme to sample the f coefficients only. As a consequence, we avoid the sampling error of the occupation number vectors and thus obtain a reliable picture of the quality of our CGTN ansatz.

Singlet and triplet polyatomic radicals: the methylene example
The accurate calculation of different spin states is of great importance to chemistry; in particular, for chemical reactions in which a spin-crossing event occurs [45]- [53]. There is, however, no way to tell our optimization algorithm how to converge directly to the desired spin state. The optimization algorithm might easily get trapped in local minima corresponding to a spin-contaminated total state. One possible solution would be to sample in the basis of spin-adapted CSFs, which can easily be constructed as linear combinations of occupation number representations using Clebsch-Gordon coefficients, producing an SU(2) eigenstate of the Hamiltonian with a well-defined total spin. In that case, however, the complete occupation number vector basis must be constructed. Another solution, which we need to employ in our second example below, is the application of a level-shift operator as used, for instance, for the DMRG algorithm in [54]. The concept of the level-shift operator can be easily implemented in the current optimization scheme. The idea is to substitute the original Hamiltonian by an effective shifted Hamiltonian where the unwanted states with a higher multiplicity are shifted up in energy. The lowest energy state of the total system is then a spin-pure state with the correct total spin. The shifted Hamiltonian is written aŝ where we add the product of the spin ladder operator to the original Hamiltonian and is a positive constant. This prevents the occurrence of states that possess the same projected spin but a different total spin. We choose methylene as our test molecule, for which we determine the spin splitting of the singlet and triplet states. Methylene is the smallest polyatomic radical featuring a triplet ground state and several low-lying singlet states where strong correlations effects are present [55]. We are particularly interested in the energies of the triplet ground state and the lowest-lying singlet state with point-group symmetries B 1 and A 1 , respectively.
In preparatory calculations, we calculated the one-electron and two-electron integrals as well as complete-active-space (CAS) self-consistent-field (SCF) reference energies with the Molpro program package [56]. The orbitals have been expanded in Dunning's cc-pVTZ basis set [57,58]. The electronic energies of the singlet and triplet states of CH 2 were studied at a  We investigate three active spaces that are successively enlarged, starting with a CAS(4,4) of four spatial orbitals comprising four electrons that is increased in each step by an occupied and a virtual orbital around the Fermi level yielding in total CAS (4,4), CAS(6,6) and CAS (8,8). The CAS is specified in parentheses as (n,m) where n is the number of electrons in m molecular orbitals. The number of variational parameters in the CGTN states does not depend on the dimension of the N -particle Hilbert space but on the number of orbitals in the corresponding active space. The selected active spaces provide insight into the convergence behavior of the CGTN parametrization. For the CAS(4,4), the number of variational parameters is around three times larger than the size of the Hilbert space of CH 2 (resulting in an over-parametrization), whereas for CAS (6,6) it is of comparable size. For the CAS (8,8), however, there are about nine times more many-electron basis states (i.e. occupation number vectors) than variational parameters in the CGTN ansatz. Hence, while the first two smaller active spaces allow us to demonstrate that the CGTN ansatz is able to reproduce the CASSCF reference, the third CAS probes the efficiency of the reduced-parameter CGTN ansatz. In order to prevent spin contamination in the CGTN state, the energy evaluation is performed in the basis of spinadapted CSFs for the singlet and triplet calculations. The weight for the CSF is calculated as a linear combination of the weights for the occupation number vector.
In figure 1 and table 1, CASSCF energy splittings between singlet and triplet spin states are compared to those obtained in the CGTN calculations. The number of variational parameters is given for each calculation as well. The total absolute energies cannot be quantitatively reproduced by the CGTN states but they provide a qualitatively correct description of the energy difference between different spin states for a set of active spaces. For the CAS(4,4), the CGTN calculations exactly reproduce the CASSCF reference calculations and verify that the ansatz optimized with the parallel-tempering Monte Carlo optimization can indeed find the correct ground-state wave function. The next question to answer is whether the CGTN ansatz can also extract the essential features of the electronic structure for larger active spaces because even if total electronic energies are not well reproduced, it would be sufficient to reliably produce relative energies of chemical accuracy (of about 1 kcal mol −1 ). We have already found [54] that MPS as optimized by the DMRG algorithm can reproduce the energetical spin splitting in transition metal complexes and clusters although the one-dimensional MPS parametrization is not very suited for this problem. The energy difference between two states can converge much faster than the total electronic energies of the individual states. Considering that during a chemical process (reaction, spin flip) only a small number of orbitals are needed to qualitatively describe the changes in electronic structure, it can be understood that the parametrization of the total electronic states requires a balanced representation of all occupation number vectors that involve these orbitals. We may expect that this balanced description is possible with a CGTN ansatz. This is exactly what we observe in the CAS(6,6)-and CAS (8,8)-CGTN calculations. The relative energies appear to be better reproduced than the absolute energies for the different spin states. Even though the parametrization in terms of the CGTN states consists of only a small fraction of parameters compared to the dimension of the Hilbert space in the CAS (8,8) case, the vertical spin splitting is accurately reproduced.
The accuracy of the CGTN to represent the electronic structure can also be assessed by performing an analysis of the CI coefficients of the CASSCF and CGTN calculations. In figure 2 and table 2, the CI coefficients of the ten most important occupation number vectors for the singlet and triplet CGTN and CASSCF calculations are compared. According to the spectrum of the CI coefficients, the relative energy between the two spin states is accurately described even though the CGTN energies are slightly higher than the CASSCF reference energies. The CGTN parametrization of the total electronic states approximates the dominant occupation number vectors with high accuracy. However, it lacks a proper description of most of the very small  (8,8). Even though the CGTN state has about 90% less parameters than the CASSCF wave function, it finds the most important occupation number vectors and provides highly accurate CI coefficients. The CI coefficients of the CASSCF and CGTN wave functions of singlet and triplet spin symmetry are shown for the ten most important occupation number vectors. The CGTN coefficients qualitatively and even quantitatively agree with the CASSCF reference, which is the exact solution for the CAS (8,8) in the given one-particle basis. coefficients of a large set of occupation number vectors that significantly affects the total energy but has basically no influence on the energy difference. The reason for this deficiency lies in the few parameters used for the construction of the CGTN ansatz and might be systematically improved by using higher-order correlators, a strategy already described by Changlani et al [39]. Qualitatively speaking, the CGTN states 'carved' the important occupation number vectors out of the entire N -electron Hilbert space that characterizes the electronic structure of the underlying molecular system. Therefore, a qualitatively correct description of the electronic structure is provided by the CGTN wave function.

Strongly correlated molecular system: ozone
We continue our numerical study with a very difficult case selected for probing the capabilities of the CGTN ansatz: the electronic structure of ozone at the transition state structure of the O 2 + O chemical reaction is a complex multireference problem [59]. We performed CAS (8,9)-CI reference calculations for the singlet and triplet states of ozone at the transition structure reported in [59]. For this calculation, we select Dunning's cc-pVTZ basis set [57,58] and an active space consisting of the 9 − 14a 1 − 3a molecular orbitals. The Hilbert space of the singlet state is then spanned by 7 956 and the one for the triplet state by 5 268 occupation number vectors. The CGTN state contains only 684 variational parameters, which is an efficient reduction by 91% compared to the singlet CASSCF wave function. The singlet energies of the ozone molecule at a transition state structure are given in table 3 and show the highly multireference nature of the electronic structure when compared to the Hartree-Fock energy.
The vertical spin splitting between the first excited singlet and ground-state triplet state is reported in table 4. For the first excited state of singlet symmetry, a CSF basis has been constructed to obtain a spin pure state without spin contamination. For the triplet ground state, we have tested the levelshift approach as well as no spin constraints at all and found that levelshift calculations are prone to get stuck in a local minimum. This can be circumvented if no spin constraints are applied. Then, however, spin contamination might become a problem. For the levelshift calculation, = 1 was used. As in the previous study of methylene, the relative energies between the singlet and triplet states converge much faster than the total electronic energies-even for this highly multireference system. (Note that we used the same temperature range and number of replicas as in the case of methylene.) Table 4. Vertical spin-splitting energy differences between the singlet and triplet states of ozone at the transition geometry. All calculations were in an active space of nine molecular orbitals comprising eight electrons. For the singlet CGTN energy, a CSF basis was employed, whereas no spin constraints were imposed on the triplet state calculation.  It is evident that the calculation employing the levelshift operator got stuck in a local minimum. The other CGTN calculation has no spin restriction. However, since the ground state is a triplet spin state, no spin contamination is expected. This is also seen in the expectation value of the converged triplet CGTN state of Ŝ 2 = 1.99.
In figure 3, the convergence behavior for the triplet ground-state calculation is shown. It can be seen that convergence difficulties arise when the levelshift operator is applied. In the algorithm, theŜ −Ŝ+ operator translates into a high-order polynomial that features many roots and therefore many local minima.

Conclusions
In this paper, we introduced a general class of tensor network states, which we named CGTN states, to approximate the electronic wave function of a molecular system. This ansatz assumes pair correlations of one-electron states (orbitals) to construct all CI expansion coefficients of a total electronic state. Hence, instead of 2 k -or 4 k in the case of spatial orbitalsvariational parameters, we employ only k(k + 1)/2 × q 2 . CGTN states are a subclass of the most general CPS form of tensor network approximations to the FCI state. The accuracy of the CGTN approximation of total electronic states of molecules has been numerically studied for methylene and ozone. We should note that this is the first application of a tensor network parametrization for molecular wave functions employing the full non-local electronic many-particle Hamiltonian. For this purpose, the k(k + 1)/2 × q 2 CGTN parameters have been optimized with a variational Monte Carlo protocol that we have augmented by parallel tempering in order to prevent convergence to local minima of the electronic energy hypersurface in this parameter space.
In molecular physics and chemistry, we are primarily interested in obtaining accurate relative energies between two spin states or between two configurations on the same potential energy surface to describe the thermodynamics and kinetics of chemical reactions. The accurate calculation of relative energies is therefore mandatory. CGTN states provide the flexibility to describe electronic structures without relying on an a priori chosen reference configuration such as the Hartree-Fock state. The CGTN ansatz is therefore not biased to any particular Slater determinant and is capable of finding the most important occupation number vectors in the Hilbert space of the molecular system.
In particular, the calculation of the electronic structure of transition metal clusters faces diverse challenges due to many near-degenerate electronic states, resulting in the failure of most of the standard quantum chemical methods. We expect the efficient parametrization of the CGTN states to be a very promising approach for dealing with the multireference nature and the large active spaces mandatory for an accurate description of transition metal clusters.