Twisted hybrid algorithms for combinatorial optimization

Libor Caha; Alexander Kliesch; Robert Koenig

doi:10.1088/2058-9565/ac7f4f

1. Introduction

Due to their real-world interest, problems and algorithms for combinatorial optimization figure prominently in present-day theoretical computer science. For theoretical physics, the profound and immediate connections to the physics, e.g., of Ising or Potts models are particularly appealing. Combinatorial optimization also provides an intriguing potential area of application of near-term quantum devices with clear figures of merit such as approximation ratios. Yet the study of quantum algorithms for these problems is still in its infancy, especially when compared to the intensely studied area of classical algorithms. For example, for classical algorithms, an established bound [13, 14] on efficiently achievable approximation ratios for MaxCut under the unique games conjecture matches that achieved by the celebrated Goemans–Williamson algorithm [10] (see also [4]). It appears rather unlikely that under the unique games conjecture an efficient quantum algorithm can outperform the Goemans–Williamson algorithm for generic graphs. Even the more modest goal of identifying special families of instances for which a quantum algorithm outperforms comparable efficient classical algorithms appears to be out of reach. Independently of whether or not one can find a provable real-world quantum advantage in the setting of combinatorial optimization, or ends up using quantum devices as a heuristic to efficiently find approximate solutions, or finds novel classical algorithms inspired by quantum ones (as has happened before), it is natural to study to what extent existing proposals can be improved in a systematic manner with associated performance guarantees. This is what we pursue here in the context of hybrid classical-quantum algorithms.

For the problem of finding (or approximating) the maximum of a combinatorial cost function $C:{\left\{0,1\right\}}^{n}\to \mathbb{R}$ (given by polynomially many terms), typical hybrid algorithms proceed by defining the cost function Hamiltonian

$\begin{equation*}{H}_{C}=\sum\limits _{z\in {\left\{0,1\right\}}^{n}}C(z)\left\vert z\right\rangle \left\langle z\right\vert \end{equation*}$

in terms of local terms, and a parametrized family ${\left\{{U}_{G}(\theta )\right\}}_{\theta \in {\Theta}}$ of n-qubit unitary circuits. The latter might be parametrized by the underlying graph of the cost function or, in the case of hardware-efficient algorithms, tailored to the physical device [12]. The parametrized family gives rise to variational ansatz states

$\begin{equation*}\left\vert {\Psi}(\theta )\right\rangle ={U}_{G}(\theta ){\left\vert 0\right\rangle }^{\otimes n},\end{equation*}$

that can be prepared with U_G(θ) from a product state ${\left\vert 0\right\rangle }^{\otimes n}$ . Measuring Ψ(θ) in the computational basis then provides a sample z ∈ {0, 1}ⁿ from the distribution p(z) = |⟨z|Ψ(θ)⟩|² such that the expectation value of the associated cost function is equal to the energy $\mathbb{E}\left[C(z)\right]=\left\langle {\Psi}(\theta )\right\vert {H}_{C}\left\vert {\Psi}(\theta )\right\rangle$ of the state Ψ(θ) with respect to H_C. Thus the problem of maximizing C is translated to that of finding a value of the (vector of) parameters θ maximizing the energy of Ψ(θ). The latter step is envisioned to be performed e.g., by numerical gradient descent or a similar classical procedure prescribing (iteratively) what parameters θ to try. The computation of this prescription (according to obtained measurement results) is the classical processing part of the quantum algorithm leading to the term hybrid. We will refer to this form of algorithm as a 'bare' hybrid algorithm in the following.
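The identity between the expected cost of a measured sample and the energy of the state can be checked numerically on a toy instance. The following is a minimal sketch, not part of the original construction: the three-qubit cost function `cost` (cut edges of a path on three vertices) and the random state standing in for Ψ(θ) are illustrative assumptions.

```python
import random

n = 3  # number of qubits (illustrative choice)
random.seed(0)

def cost(z):
    """Hypothetical cost function: number of cut edges on the path 0-1-2."""
    bits = [(z >> i) & 1 for i in range(n)]
    return int(bits[0] != bits[1]) + int(bits[1] != bits[2])

# A random normalized state vector standing in for |Psi(theta)>.
amps = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2 ** n)]
norm = sum(abs(a) ** 2 for a in amps) ** 0.5
psi = [a / norm for a in amps]

# Born-rule distribution p(z) = |<z|Psi>|^2 and the expected cost E[C(z)].
p = [abs(a) ** 2 for a in psi]
expected_cost = sum(p[z] * cost(z) for z in range(2 ** n))

# Energy <Psi|H_C|Psi>, using that H_C is diagonal with entries C(z).
energy = sum((psi[z].conjugate() * cost(z) * psi[z]).real for z in range(2 ** n))

# Drawing a few measurement outcomes z from p(z).
samples = random.choices(range(2 ** n), weights=p, k=5)
```

Because H_C is diagonal in the computational basis, the two quantities agree up to floating-point rounding for any state.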

The potential utility of this approach hinges on a number of factors. Of primary importance, beyond questions of convergence or efficiency, is whether the family {Ψ(θ)}_θ∈Θ of states is sufficiently rich to variationally capture the (classical) correlations of high-energy states of H_C. There is an inherent tension here between the requirement of applicability using near-term devices, and the descriptive power, i.e., required complexity of these states: on the one hand, each unitary U_G(θ) is supposed to be realized by a low-depth circuit with local gates (making it amenable to experimental realization on a near-term device), and the dimensionality of the parameter or 'search' space Θ should be low to guarantee fast convergence, e.g., of gradient descent. On the other hand, states having high energy with respect to H_C and belonging to the considered family of variational states may have intrinsically high circuit complexity, and, correspondingly, may also require a large number of variational parameters to approximate. The unavoidability of this issue has been demonstrated using the MaxCut problem on expander graphs with n vertices and the quantum approximate optimization algorithm (QAOA) at level p: here the parameter space is ${\Theta}={\left[0,2\pi \right)}^{2p}$ and the corresponding circuits U_G(θ) have depth O(pd). Locality and symmetry of the ansatz imply that achievable expected approximation ratios are upper bounded by a constant (below that achieved by Goemans–Williamson) unless p = Ω(log n) [3]. In fact, the locality of the ansatz alone implies that for smaller values of p, the achieved expected approximation ratio is no better than that of random guessing for random bipartite graphs, as shown in [6].

These fundamental limitations of 'standard' hybrid algorithms are tied to the assumption that an increased complexity of the required quantum operations is unacceptable and/or infeasible in the near term. Under these circumstances, the only way forward appears to be to use alternative, possibly more powerful (e.g., non-local) efficient classical processing which could exploit the limited available quantum resources more effectively. One example where a classical post-processing is used is [7], where QAOA is combined with a greedy 'pruning' method to produce an independent set of large size. Here post-processing is needed, in particular, to ensure that the output is indeed an independent set. Another proposal in this direction is the idea of 'warm-starting' QAOA with a solution provided by the Goemans–Williamson algorithm [5] (see also [16]). The warm-starting approach has the appeal that—by construction—the Goemans–Williamson approximation ratio can be guaranteed in this approach (assuming convergence of the energy optimization). An alternative is the recursive QAOA (RQAOA) method [2, 3] which uses QAOA states to iteratively identify variables to eliminate. This effectively reduces the problem size but increases the connectivity and thus the circuit complexity of the iteratively obtained subproblems. Furthermore, analytical bounds on the expected approximation ratios are unknown except for very special examples [3]. For both warm-starting QAOA as well as RQAOA, one deviates from the original QAOA ansatz, leading to different variational states and corresponding quantum circuits.

1.1. Our contribution

Basic idea. Here we consider arguably more minimal adaptions of hybrid variational algorithms for the MaxCut-problem on three-regular graphs. For a given bare hybrid algorithm $\mathcal{A}$ involving a family {Ψ(θ)}_θ∈Θ of variational ansatz states as described above, we show how to construct a modified algorithm ${\mathcal{A}}^{+}$ which uses the same family of states {Ψ(θ)}_θ∈Θ. The algorithm ${\mathcal{A}}^{+}$ will be called twisted- $\mathcal{A}$ . Our modified algorithms are directly motivated by the work of Feige, Karpinski, and Langberg [9] (referred to as FKL in the following). These authors propose an algorithm for the MaxCut problem on three-regular graphs which proceeds by solving a semidefinite program relaxation (similar to Goemans and Williamson), and subsequently improving the rounded solution by a simple greedy post-processing technique. We also consider the improved version by Halperin, Livnat, and Zwick [11] (referred to as HLZ below) which involves a more non-local greedy procedure. For some motivation, see the following example.

Example. Consider a simple motivational example of a greedy post-processing procedure that can improve a given cut. The input will be a three-regular graph G = (V, E) and a cut C. We say that a vertex is unsatisfied when all three of its neighbours lie in the same partition of the cut as it does. The algorithm will repeatedly run through the vertices and check whether some of them are unsatisfied. If it finds an unsatisfied vertex it moves it to the opposite side of the cut and repeats the process with the updated cut until none of the vertices is unsatisfied. Since moving one vertex increases the cut size by 3 and potentially lowers the number of unsatisfied vertices by 4, one can show that this procedure improves the cut size by at least $\frac{3}{4}$ times the number of unsatisfied vertices in the initial cut. Let us apply this greedy procedure to a random cut, which has an expected approximation ratio of 1/2. A vertex will be unsatisfied with probability 2⁻³. From the linearity of expectation we have that the greedy procedure will improve the cut by at least $\frac{3}{4\cdot 8}\vert V\vert$ . Since $\vert V\vert =\frac{2}{3}\vert E\vert$ , we achieve approximation ratio at least $\frac{1}{2}+\frac{1}{16}=0.5625$ in expectation.
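The greedy procedure of this example can be sketched in a few lines. This is an illustrative sketch only: the adjacency-dictionary representation, the function names, and the choice of the cube graph as a three-regular test instance are assumptions made here, not part of the original text.

```python
from itertools import product

def cut_size(adj, cut):
    """Number of edges whose endpoints receive different colors."""
    return sum(cut[u] != cut[v] for u in adj for v in adj[u] if u < v)

def unsatisfied(adj, cut):
    """Vertices whose three neighbours all lie on their own side of the cut."""
    return [v for v in adj if all(cut[u] == cut[v] for u in adj[v])]

def greedy_flip(adj, cut):
    """Flip unsatisfied vertices one at a time until none is left."""
    cut = dict(cut)
    while True:
        bad = unsatisfied(adj, cut)
        if not bad:
            return cut
        cut[bad[0]] ^= 1  # all three incident edges become satisfied

# Illustrative instance (an assumption): the cube graph, which is three-regular.
cube = {v: [v ^ 1, v ^ 2, v ^ 4] for v in range(8)}

# For every cut, record (number of unsatisfied vertices, gain in cut size).
records = []
for bits in product((0, 1), repeat=8):
    cut = dict(enumerate(bits))
    gain = cut_size(cube, greedy_flip(cube, cut)) - cut_size(cube, cut)
    records.append((len(unsatisfied(cube, cut)), gain))

mean_gain = sum(g for _, g in records) / len(records)
```

Each flip gains exactly three edges and removes at most four vertices (the flipped vertex and its neighbours) from the unsatisfied set without creating new unsatisfied vertices, which is the counting behind the factor 3/4. Averaging over all cuts of the instance corresponds to the uniformly random cut considered in the example.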

The algorithm ${\mathcal{A}}^{+}$ proceeds by using the variational family of states defined by the algorithm $\mathcal{A}$ to obtain an approximate cut, but this step is modified or 'twisted', as discussed below. The algorithm ${\mathcal{A}}^{+}$ then attempts to enlarge the cut size of the obtained cut by applying a classical post-processing procedure: we perform either the FKL post-processing procedure (obtaining an algorithm FKL- ${\mathcal{A}}^{+}$ ) or the HLZ post-processing procedure (giving an algorithm HLZ- ${\mathcal{A}}^{+}$ ).

Let us now describe the sense in which ${\mathcal{A}}^{+}$ is a 'twisted' form of $\mathcal{A}$ and not merely a hybrid algorithm augmented by a subsequent classical post-processing step. This terminology stems from the fact that in the quantum subroutine of the algorithm, the variational parameters (angles) are not optimized with respect to the original problem Hamiltonian H_G. Instead, one can express the expected cut size produced by measuring a state Ψ(θ) and using classical post-processing by the expectation value of a modified Hamiltonian ${H}_{G}^{+}$ (for both FKL and HLZ) in the variational state Ψ(θ). The twisted algorithm ${\mathcal{A}}^{+}$ thus optimizes the angle θ with respect to the modified Hamiltonian ${H}_{G}^{+}$ . Importantly, this does not change the ansatz/variational family of states used. This allows us to make a fair comparison (in terms of quantum resources and, especially, the number of variational parameters) to the original algorithm $\mathcal{A}$ .

Improved hybrid algorithms. The modified algorithm ${\mathcal{A}}^{+}$ requires a set of quantum operations that are comparable (in number and complexity) to that of $\mathcal{A}$ . In particular, it involves preparing the states {Ψ(θ)}_θ∈Θ. In addition, ${\mathcal{A}}^{+}$ uses extra local measurements because the hybrid optimization step is modified: the energy to be optimized is given by a modified problem Hamiltonian ${H}_{G}^{+}$ rather than the MaxCut-problem Hamiltonian H_G associated with the considered graph G. The modified Hamiltonian ${H}_{G}^{+}$ is either a three- or four-local Hamiltonian and (as H_G) diagonal in the computational basis. In particular, this means that measurements of up to 4 qubits at a time in the computational basis are sufficient to determine the (expected) cost function. We note that while this can also be achieved by measuring each qubit in the computational basis and taking appropriate marginals, locality properties can be exploited at the optimization stage, see e.g. [15].

By construction, the algorithms $\mathcal{A}$ and ${\mathcal{A}}^{+}$ achieve (expected) cut sizes (for any fixed instance G) related by the inequalities

$\begin{equation}\mathbb{E}\,\left[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}\,\left(\mathcal{A}(G)\right)\right]\leqslant \mathbb{E}\,\left[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}\,\left({\mathcal{A}}^{+}(G)\right)\right]\ \end{equation} \tag{ 1 }$

for any (bare) hybrid algorithm $\mathcal{A}$ , assuming that the optimal parameters are found in the optimization step. Indeed, (1) follows because, denoting with

$\begin{equation*}{\theta }_{\ast }=\mathrm{arg}\underset{\theta }{\mathrm{max}}\left\langle {\Psi}(\theta )\right\vert {H}_{G}\left\vert {\Psi}(\theta )\right\rangle \ \end{equation*}$

the optimal parameters for the Hamiltonian H_G, we have by definition of the algorithms that

$\begin{equation}\begin{aligned}\hfill \mathbb{E}\,\left[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}\left(\mathcal{A}(G)\right)\right]& =\left\langle {\Psi}({\theta }_{\ast })\right\vert {H}_{G}\left\vert {\Psi}({\theta }_{\ast })\right\rangle \hfill \\ \hfill \mathbb{E}\,\left[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}\left({\mathcal{A}}^{+}(G)\right)\right]& =\underset{\theta }{\mathrm{max}}\left\langle {\Psi}(\theta )\right\vert {H}_{G}^{+}\left\vert {\Psi}(\theta )\right\rangle ,\hfill \end{aligned}\end{equation} \tag{ 2 }$

and

$\begin{equation*}{H}_{G}^{+}={H}_{G}+{{\Delta}}_{G},\end{equation*}$

where Δ_G is a sum of non-negative local operators. These considerations apply to any bare hybrid algorithm $\mathcal{A}$ .
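Spelled out, the chain of (in)equalities behind (1) is the following; it uses only (2), the decomposition of ${H}_{G}^{+}$, and the non-negativity of Δ_G, and is written here as a sketch in the notation introduced above.

```latex
\begin{align*}
\mathbb{E}\left[\mathsf{cutsize}\left(\mathcal{A}(G)\right)\right]
  &= \langle \Psi(\theta_\ast) | H_G | \Psi(\theta_\ast) \rangle \\
  &\leqslant \langle \Psi(\theta_\ast) | H_G | \Psi(\theta_\ast) \rangle
     + \langle \Psi(\theta_\ast) | \Delta_G | \Psi(\theta_\ast) \rangle \\
  &= \langle \Psi(\theta_\ast) | H_G^{+} | \Psi(\theta_\ast) \rangle \\
  &\leqslant \max_{\theta} \langle \Psi(\theta) | H_G^{+} | \Psi(\theta) \rangle
   = \mathbb{E}\left[\mathsf{cutsize}\left(\mathcal{A}^{+}(G)\right)\right].
\end{align*}
```

The first inequality holds because Δ_G is a sum of non-negative operators, and the second because θ_∗ is one admissible choice of parameters in the maximization.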

Lower bounds on approximation ratios. We specialize our considerations to QAOA_p and establish lower bounds on the approximation ratio for bare and twisted QAOA, i.e., we consider the algorithms QAOA_p and ${\text{QAOA}}_{p}^{+}$ . Specifically, we consider low values of p for three-regular graphs, triangle-free three-regular graphs and high girth three-regular graphs. We denote the expected approximation ratio achieved by an algorithm $\mathcal{A}$ on a graph G with maximum cut size MC(G) by

$\begin{equation*}{\alpha }_{G}\left(\mathcal{A}\right){:=}\text{MC}{(G)}^{-1}\cdot \mathbb{E}\left[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}\left(\mathcal{A}\left(G\right)\right)\right].\end{equation*}$

In the following, we will refer to the expected approximation ratio achieved by an algorithm $\mathcal{A}$ simply as the approximation ratio of $\mathcal{A}$ (omitting the term 'expected') unless specified otherwise. In the case of $\mathcal{A}={\text{QAOA}}_{p}$ , $\mathbb{E}\left[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}\left(\mathcal{A}\left(G\right)\right)\right]$ is defined as in (2), but with the level-p QAOA trial function Ψ_G(β, γ), $\beta ,\gamma \in {\left[0,2\pi \right)}^{p}$ instead of Ψ(θ).

Our results are summarized in figure 1, which gives our lower bounds on the approximation ratio for each of these methods. For comparison, we also state the following known bounds on bare QAOA for any three-regular graph G,

$\begin{equation*}\left.\begin{matrix}\hfill {\alpha }_{G}\left({\text{QAOA}}_{1}\right)\geqslant 0.6924\hfill \\ \hfill {\alpha }_{G}\left({\text{QAOA}}_{2}\right)\geqslant 0.7559\hfill \\ \hfill {\alpha }_{G}\left({\text{QAOA}}_{3}\right)\geqslant 0.792\,39\hfill \end{matrix}\right.\quad \begin{matrix}\hfill \text{established}\;\text{in}\;[\text{8}]\hfill \\ \hfill \text{conjectured}\;\text{in}\;[\text{8}]\text{,}\;\text{established}\;\text{in}\;[\text{18}]\hfill \\ \hfill \text{conjectured}\;\text{in}\;[\text{18}]\text{.}\hfill \end{matrix}\end{equation*}$

Also shown in figure 1 are the guaranteed approximation ratios of the best-known classical algorithms: this includes the Goemans–Williamson algorithm (GW) for general graphs (which is optimal when assuming the unique games conjecture [13]) which achieves

$\begin{equation*}{\alpha }_{G}(\text{GW})\geqslant 0.8785\quad \text{for}\,\text{any}\;\text{graph}\;G\enspace (\mathrm{s}\mathrm{e}\mathrm{e}\ [\mathrm{10}]).\end{equation*}$

For three-regular graphs, the best efficient classical algorithms are the algorithm by Feige et al [9] which relies on a semidefinite program whose solution is then improved by a simple greedy post-processing technique, and a refinement of this technique by Halperin et al [11]. They achieve

$\begin{equation*}\begin{matrix}\hfill {\alpha }_{G}\left(\text{FKL}\right)\geqslant 0.924\hfill \\ \hfill {\alpha }_{G}\left(\text{HLZ}\right)\geqslant 0.9326\hfill \end{matrix}\quad \text{for}\,\text{any}\;\text{three}-\text{regular}\;\text{graph}\enspace G\enspace \begin{matrix}\hfill \mathrm{s}\mathrm{e}\mathrm{e}\ [\mathrm{9}]\hfill \\ \hfill \mathrm{s}\mathrm{e}\mathrm{e}\ [\mathrm{11}].\hfill \end{matrix}\end{equation*}$

**Figure 1.** The main results of this work. We compare the provably guaranteed approximation ratios of bare QAOA_p, $\text{FKL}-{\text{QAOA}}_{p}^{+}$ and $\text{HLZ}-{\text{QAOA}}_{p}^{+}$ for three-regular graphs with girth greater than 2p + 2. Numbers written in boldface also apply to general three-regular graphs. All quantities are rounded down to four decimals. Guaranteed approximation ratios which have been established in other work are indicated with citations.

According to the table given in figure 1, the established lower bounds on the expected approximation ratios for twisted versions of QAOA at level p = 1, ..., 5 are comparable to the lower bounds on QAOA_p+1 at the higher level p + 1. This suggests that by using these twisted versions, the level p can be reduced by one while roughly maintaining the approximation ratio. We emphasize, however, that this conclusion can only be drawn when it is known that the corresponding bound on QAOA_p+1 is tight.

Let us conclude by mentioning a few open problems. One potential avenue to obtaining improved approximation ratios with hybrid algorithms is to use a different variational family of ansatz states. Here our work gives clear guidance when this is combined with classical post-processing: for a graph G, the energy of a modified cost function Hamiltonian ${H}_{G}^{+}={H}_{G}+{{\Delta}}_{G}$ should be optimized instead of that of H_G. In particular, since Δ_G is a sum of three-local terms in the case of FKL and a sum of four-local terms in the case of HLZ, this motivates introducing new terms (e.g., proportional to these terms) in the ansatz. Such a modification of the algorithm is superficially related to the fact that the classical (randomized rounding-based) algorithms of [9, 11] also use additional (three-variable) constraints in the semidefinite program (SDP) compared to the Goemans–Williamson algorithm. We note, however, that using different variational ansatz states will require a different accounting of resources (e.g., circuit depth). In contrast, our twisted algorithms use the same circuits to prepare ansatz states as their bare version.

Another promising approach may be to combine warm-starting-type ideas with classical post-processing. Here one could consider algorithms that first solve the SDP underlying the classical algorithms [9, 11], and subsequently prepare a corresponding quantum state. One may hope that—similar to [5]—suitably designed approaches give a guaranteed approximation ratio matching that of these classical algorithms.

Moving beyond combinatorial optimization problems, it is natural to ask if variational quantum algorithms for many-body quantum Hamiltonian problems (e.g., quantum analogues of MaxCut as considered in [1]) can be improved by similar greedy (quantum) post-processing procedures.

1.1.1. Outline

In section 2, we review the relevant classical post-processing methods that, in combination with randomized rounding of the solution of certain SDP relaxations, yield the best known efficient classical algorithms for MaxCut on three-regular graphs. In section 3, we review the QAOA and state a few properties relevant to our subsequent analysis. In section 4, we motivate and define the algorithm ${\mathcal{A}}^{+}$ obtained from a hybrid algorithm $\mathcal{A}$ . Finally, in section 5, we establish our lower bounds on the approximation ratio achieved by the twisted algorithm QAOA⁺.

2. Classical post-processing methods for MAXCUT

In this section, we describe the two classical post-processing procedures which we build on to define twisted versions of a given hybrid algorithm for the MaxCut problem on three-regular graphs. These post-processing procedures are subroutines of the classical algorithms for MaxCut on bounded degree graphs and graphs with maximum degree 3 by Feige et al [9], and Halperin et al [11], respectively.

Recall the definition of the MaxCut problem: we are given an (undirected, simple) graph G = (V, E) and are asked to assign two colors to the vertices, C : V → {0, 1}, which we refer to as a cut of G, such that the number $\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}(C)$ of satisfied edges is maximized. Here we say that an edge e = {u, v} is satisfied by C if and only if C(u) ≠ C(v). The maximal size $\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}(C)$ of a cut C of G is denoted MC(G).
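For small instances, these definitions can be evaluated by exhaustive search. The sketch below is illustrative only: it assumes vertices labelled 0, ..., n − 1 (rather than 1, ..., n), represents a cut as a tuple of bits, and uses the complete graph K₄ as an example of a three-regular graph; the function names are hypothetical.

```python
from itertools import product

def cutsize(edges, cut):
    """Number of satisfied edges, i.e. edges {u, v} with cut[u] != cut[v]."""
    return sum(1 for u, v in edges if cut[u] != cut[v])

def max_cut(n, edges):
    """MC(G) by brute force over all 2**n cuts (only feasible for small n)."""
    return max(cutsize(edges, bits) for bits in product((0, 1), repeat=n))

# K4 is three-regular: every vertex is adjacent to the other three.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
mc_k4 = max_cut(4, k4_edges)
```

The exponential enumeration serves only to make MC(G) concrete; it plays no role in the algorithms discussed in this section.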

The Goemans–Williamson algorithm [10] for MaxCut proceeds by solving an SDP relaxation [4] of the MaxCut problem, and subsequently uses a randomized hyper-plane rounding to obtain a cut. The algorithms of [9, 11] also proceed by first solving certain SDPs and applying randomized rounding. The obtained candidate cut is then further processed in a greedy manner in order to improve the cut size.

Here we review these post-processing procedures and corresponding performance guarantees. One of their key features is that they can be applied to any candidate cut C irrespective of whether it is produced e.g., by rounding the solution of an SDP, random guessing, or starting with a fixed cut. This means that they can also be applied to the output of a hybrid algorithm. We emphasize, however, that our modified hybrid algorithms require a modification going beyond simple post-processing of the classical measurement result, see section 4 for details.

Although the guaranteed approximation ratio achieved by HLZ is better than the one achieved by FKL, we investigate both algorithms. The reason for this lies in the locality of the procedures: while FKL considers only the direct neighborhood of a vertex in a single step and is therefore local, HLZ also considers paths and cycles in the given graph whose lengths might be unbounded and is therefore not necessarily local. We emphasize, however, that the performance of both procedures in the quantum case can be quantified by considering local operators.

Both post-processing procedures take as input a cut C. They iteratively work towards (ideally) improving the cutsize by modifying the cut. A single iteration proceeds by identifying a suitable subset W ⊂ V of vertices whose assigned color is flipped, i.e., replacing C by the modified cut

$\begin{equation*}{C}^{W}(v){:=}\begin{cases}C(v)\quad \hfill & \quad \text{for}\enspace v\notin W\hfill \\ 1-C(v)\quad \hfill & \quad \text{otherwise}\;\hfill \end{cases}.\end{equation*}$
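In code, the flipped cut is a one-liner. The following sketch assumes (for illustration) that a cut is stored as a tuple of bits indexed by vertices 0, ..., n − 1 and that W is a set of vertex indices; the name `flip` is hypothetical.

```python
def flip(cut, W):
    """Return the cut C^W: colors of vertices in W are flipped, others kept."""
    return tuple(1 - c if v in W else c for v, c in enumerate(cut))

example = flip((0, 1, 0, 1), {0, 3})
```

Flipping the same set W twice returns the original cut, so a single iteration of either post-processing procedure is always reversible.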

2.1. The Feige–Karpinski–Langberg (FKL) post-processing method

The main idea of this post-processing step is the following observation: if there are three vertices c, j, k such that one of them (say, c) is connected to both the other ones and all three vertices are assigned the same color by the cut C, then flipping the value at c, i.e., considering C^{c}, will increase the size of the cut, see figure 2.

**Figure 2.** The main motivation behind FKL. On the left, the closed neighborhood of a vertex c is shown. Now assume that we assign a cut C to G and that (c, j, k) is a good triplet for C. We distinguish two cases, depending on whether the edge {c, ℓ} is satisfied (dashed line) or unsatisfied (straight line). Top row: if {c, ℓ} is unsatisfied, flipping the value of c increases the size of the cut by three (no satisfied edges are destroyed, three satisfied edges are created). Bottom row: if {c, ℓ} is satisfied, flipping the value of c increases the size of the cut by one (one satisfied edge is destroyed, two satisfied edges are created).

To formalize this, we assume that the set V of vertices of the graph G = (V, E) is ordered. Without loss of generality, set V = [n] = {1, ..., n}. The following definitions will be central:

Definition 2.1 (triplets).

(a)
A three-tuple (c, j, k) ∈ V³ of pairwise distinct vertices with j < k is called a triplet if {c, j} ∈ E and {c, k} ∈ E. We call the vertex c the central vertex of the triplet. The set of all triplets in G will be denoted T_G.
(b)
Let C be a cut of G and (c, j, k) ∈ T_G. Then (c, j, k) is called a good triplet for C if
$\begin{equation*}C(c)=C(j)=C(k).\end{equation*}$
The set of all good triplets for C will be denoted ${\mathsf{G}\mathsf{o}\mathsf{o}\mathsf{d}}_{G}(C)$ .
(c)
Let C be a cut of G, $(c,j,k)\in {\mathsf{G}\mathsf{o}\mathsf{o}\mathsf{d}}_{G}(C)$ and v ∈ V. We say that (c, j, k) is destroyed by flipping v if (c, j, k) is not a good triplet for the cut C^{v}.
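Definition 2.1 translates directly into code. The sketch below is an illustration under assumptions made here: the graph is an adjacency dictionary with vertices 0, ..., n − 1, a cut is a tuple of bits, and the function names are hypothetical. The instance is K₄, chosen only because it is a small three-regular graph.

```python
from itertools import combinations

def triplets(adj):
    """All triplets (c, j, k): j < k are distinct neighbours of the center c."""
    return [(c, j, k)
            for c in sorted(adj)
            for j, k in combinations(sorted(adj[c]), 2)]

def good_triplets(adj, cut):
    """Triplets whose three vertices receive the same color under the cut."""
    return [(c, j, k) for c, j, k in triplets(adj)
            if cut[c] == cut[j] == cut[k]]

def is_destroyed(cut, triplet, v):
    """True if the (good) triplet is no longer good after flipping vertex v."""
    flipped = list(cut)
    flipped[v] = 1 - flipped[v]
    c, j, k = triplet
    return not (flipped[c] == flipped[j] == flipped[k])

k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
all_triplets = triplets(k4)
good = good_triplets(k4, (0, 0, 0, 1))
```

In a three-regular graph every vertex is the center of exactly three triplets, and a good triplet is destroyed by flipping v precisely when v is one of its three vertices.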

We now formulate the post-processing procedure by FKL. While the observations above show that flipping the center of a good triplet (c, j, k) will increase the cutsize, we might get even better results by flipping j or k. Furthermore, it is in our interest that the flipping does not destroy too many good triplets. Taking all this into account motivates the procedure given in algorithm 1.

Algorithm 1. The FKL improvement procedure for three-regular graphs [9].

1: function FKL(three-regular graph G = (V, E), cut C)

2: $S{\leftarrow}{\mathsf{G}\mathsf{o}\mathsf{o}\mathsf{d}}_{G}\left(C\right)$

3: while S ≠ ∅ do

4: ${\qquad V}_{\mathsf{G}\mathsf{o}\mathsf{o}\mathsf{d}}{\leftarrow}$ set of all vertices in S

5: $\qquad v{\leftarrow}\mathrm{arg}\underset{\sigma \in {V}_{\mathsf{G}\mathsf{o}\mathsf{o}\mathsf{d}}}{\mathrm{max}}\frac{\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}\left({C}^{\left\{\sigma \right\}}\right)-\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}\left(C\right)}{\left\vert S{\backslash}{\mathsf{G}\mathsf{o}\mathsf{o}\mathsf{d}}_{G}\left({C}^{\left\{\sigma \right\}}\right)\right\vert }$

6: C ← C^{v}

7: S ← triplets in S that are good for C^{v}

8: return C

The following result is proven in [9].

Lemma 2.2 (lemma 3.2 in [9]). Let G be a three-regular graph and let C be a cut of G. Then the cut C' = FKL(G, C) satisfies

$\begin{equation*}\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{\prime })\geqslant \mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}(C)+\frac{1}{3}\vert {\mathsf{G}\mathsf{o}\mathsf{o}\mathsf{d}}_{G}(C)\vert .\end{equation*}$
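A direct transcription of algorithm 1 into Python is sketched below, together with an exhaustive check of the bound of lemma 2.2 on a small instance. This is an illustration rather than the implementation of [9]: the adjacency-dictionary representation, the tie-breaking in the arg max (first maximizer in sorted vertex order), and the cube graph as test instance are assumptions made here.

```python
from fractions import Fraction
from itertools import combinations, product

def cutsize(adj, cut):
    """Number of satisfied edges of the cut."""
    return sum(cut[u] != cut[v] for u in adj for v in adj[u] if u < v)

def good_triplets(adj, cut):
    """Set of triplets (c, j, k) with C(c) = C(j) = C(k)."""
    return {(c, j, k)
            for c in adj
            for j, k in combinations(sorted(adj[c]), 2)
            if cut[c] == cut[j] == cut[k]}

def flipped(cut, v):
    """The cut C^{v}: same as cut, with the color of v flipped."""
    new = dict(cut)
    new[v] = 1 - new[v]
    return new

def fkl(adj, cut):
    """Greedy FKL improvement, following algorithm 1 line by line."""
    cut = dict(cut)
    S = good_triplets(adj, cut)
    while S:
        v_good = sorted({v for t in S for v in t})

        def score(sigma):
            # Gain in cut size per triplet of S destroyed by flipping sigma.
            # The denominator is >= 1 because sigma lies in a triplet of S.
            new = flipped(cut, sigma)
            gain = cutsize(adj, new) - cutsize(adj, cut)
            destroyed = len(S - good_triplets(adj, new))
            return Fraction(gain, destroyed)

        v = max(v_good, key=score)
        cut = flipped(cut, v)
        S &= good_triplets(adj, cut)  # keep only triplets that are still good
    return cut

# Illustrative three-regular instance (an assumption): the cube graph.
cube = {v: [v ^ 1, v ^ 2, v ^ 4] for v in range(8)}
improved = fkl(cube, {v: 0 for v in cube})

# Check cutsize(C') >= cutsize(C) + |Good_G(C)| / 3 for all 256 cuts.
lemma_holds = all(
    3 * (cutsize(cube, fkl(cube, c)) - cutsize(cube, c))
    >= len(good_triplets(cube, c))
    for c in (dict(enumerate(bits)) for bits in product((0, 1), repeat=8))
)
```

Exact rational arithmetic via `Fraction` avoids floating-point ties in the arg max. Note that S only shrinks: good triplets that are newly created by a flip are not added back, exactly as in line 7 of algorithm 1.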

Let us exemplify this improvement by using two simple examples with a three-regular graph G = (V, E). Consider first the trivial constant cut C_const which assigns the same color to all vertices. The cutsize of C_const is 0, hence the approximation ratio vanishes as well, i.e.,

$\begin{equation*}\frac{\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}_{\text{const}})}{\text{MC}(G)}=0.\end{equation*}$

Now consider the cut C' := FKL(G, C_const) obtained by applying the FKL-post-processing procedure to the trivial cut. This cut achieves approximation ratio at least

$\begin{equation*}\frac{\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{\prime })}{\text{MC}(G)}\geqslant 2/3.\end{equation*}$

This can be seen as follows: for a constant cut, every triplet is a good triplet and it is easy to see that $\left\vert {T}_{G}\right\vert =2\left\vert E\right\vert$ for a three-regular graph. Lemma 2.2 then implies that the resulting cut C' satisfies $\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{\prime })\geqslant \frac{2}{3}\left\vert E\right\vert$ and we obtain the claim with $\text{MC}(G)\leqslant \left\vert E\right\vert$ .

As another example, consider a uniformly random cut C_random of G. For such a cut, the expected approximation ratio is

$\begin{equation*}\mathbb{E}\left[\frac{\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}_{\text{random}})}{\text{MC}(G)}\right]=1/2.\end{equation*}$

Let C'' := FKL(G, C_random) be the result of applying the FKL-procedure to C_random. Then

$\begin{equation*}\mathbb{E}\left[\frac{\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{{\prime\prime}})}{\text{MC}(G)}\right]\geqslant 2/3.\end{equation*}$

To see this, note that the probability of a fixed triplet being good for C_random is equal to $\frac{1}{4}$ . By linearity of expectation, we have $\mathbb{E}[\vert {\mathsf{G}\mathsf{o}\mathsf{o}\mathsf{d}}_{G}({C}_{\text{random}})\vert ]=\frac{1}{4}\left\vert {T}_{G}\right\vert =\frac{1}{2}\left\vert E\right\vert$ . Lemma 2.2 then implies that the resulting cut C'' satisfies $\mathbb{E}[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{{\prime\prime}})]\geqslant \left(\frac{1}{2}+\frac{1}{3}\cdot \frac{1}{2}\right)\left\vert E\right\vert =\frac{2}{3}\left\vert E\right\vert \geqslant \frac{2}{3}\text{MC}(G)$ .
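The counting in this argument can be confirmed by enumerating all cuts of a small three-regular graph. The sketch below is illustrative: the cube graph and the variable names are assumptions made here, and averaging over all 2⁸ cuts plays the role of the expectation over a uniformly random cut.

```python
from fractions import Fraction
from itertools import combinations, product

# Illustrative three-regular instance (an assumption): the cube graph.
adj = {v: [v ^ 1, v ^ 2, v ^ 4] for v in range(8)}
num_edges = sum(len(nb) for nb in adj.values()) // 2
triplets = [(c, j, k) for c in adj for j, k in combinations(sorted(adj[c]), 2)]

# Average number of good triplets over all cuts (exact rational arithmetic).
total_good = 0
for cut in product((0, 1), repeat=len(adj)):
    total_good += sum(cut[c] == cut[j] == cut[k] for c, j, k in triplets)
mean_good = Fraction(total_good, 2 ** len(adj))

# Lower bound on the expected cut size after FKL, in units of |E|.
bound = Fraction(1, 2) + Fraction(1, 3) * mean_good / num_edges
```

The exact value 1/4 per triplet arises because two of the eight colorings of three distinct vertices are monochromatic.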

2.2. The Halperin–Livnat–Zwick (HLZ) post-processing method

In 2004, Halperin et al [11] improved upon the algorithm of [9], giving an algorithm for MaxCut achieving an expected (provable) approximation ratio of at least 0.9326 on graphs with vertex degree at most 3. To the best of our knowledge³, this is the best currently known efficient classical algorithm. Although their algorithm works for graphs of maximum degree 3, we will discuss a restricted and thus simpler version for triangle-free three-regular graphs. Unlike the FKL post-processing, this method employs a more non-local improvement procedure. The main point here is to illustrate the use of another post-processing method in the construction of twisted hybrid algorithms. We will refer to this procedure simply as HLZ-post-processing.

Given a cut C of a triangle-free graph G, this post-processing method proceeds as specified in algorithm 2. Specializing the results of [11] to the triangle-free case considered here gives the following statement:

Algorithm 2. The HLZ improvement procedure simplified to three-regular triangle-free graphs.

1: function HLZ(triangle-free three-regular graph G = (V, E), cut C)

2: V₃ ← vertices in V with three unsatisfied edges by cut C

3: V₂ ← vertices in V with two unsatisfied edges by cut C

4: while V₃ ∪ V₂ ≠ ∅ do

5: if V₃ ≠ ∅ then

6: v ← vertex in V₃ with the smallest number of neighbours in V₃

7: C ← C^{v}

8: else if V₂ ≠ ∅ then

9: v ← vertex in V₂

10: {v₁, ..., v_k} ← the longest path or cycle in G[V₂] containing v

11: M ← {v_i ∈ {v₁, ..., v_k}|i is odd}

12: C ← C^M

13: V₃ ← vertices in V with three unsatisfied edges by cut C

14: V₂ ← vertices in V with two unsatisfied edges by cut C

15: return C

Lemma 2.3 (lemma 3.1 in [11]). Let G be a three-regular triangle-free graph, C be a cut of G and V₂ and V₃ be the sets of vertices with two and three unsatisfied edges adjacent to them in the cut C. Then the cut C' = HLZ(G, C) satisfies⁴

$\begin{equation*}\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{\prime })\geqslant \mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}(C)+\frac{2}{5}\vert {V}_{2}\vert +\frac{17}{15}\vert {V}_{3}\vert .\end{equation*}$

As we did for FKL, let us get a feel for the impact of the procedure in certain simple scenarios, this time for a triangle-free three-regular graph G = (V, E). Once again, consider first the trivial constant cut C_const, which assigns the same color to all vertices and therefore has cutsize 0, so that its approximation ratio is 0 as well. The cut ${C}^{\prime }{:=}\text{HLZ}\left(G,{C}_{\text{const}}\right)$ obtained by applying the HLZ-post-processing procedure achieves an approximation ratio of at least

$\begin{equation}\frac{\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{\prime })}{\text{MC}(G)}\geqslant 0.7555.\end{equation} \tag{ 3 }$

To see this, note that for a constant cut every vertex has three unsatisfied edges, i.e., V₃ = V and V₂ = ∅. Lemma 2.3 implies that $\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{\prime })\geqslant \frac{17}{15}\left\vert V\right\vert$ and using that $\left\vert E\right\vert =\frac{3}{2}\left\vert V\right\vert \geqslant \text{MC}(G)$ , we obtain $\frac{\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{\prime })}{\text{MC}(G)}\geqslant \frac{\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{\prime })}{\vert E\vert }\geqslant \frac{17\cdot 2}{15\cdot 3}\approx 0.7555$ .
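The arithmetic of this example can be reproduced directly from the right-hand side of lemma 2.3. The sketch below is illustrative only: it assumes an adjacency-dictionary representation, uses the cube graph Q₃ as a convenient triangle-free three-regular instance, and evaluates the guaranteed gain rather than running the HLZ procedure itself.

```python
from fractions import Fraction

def unsatisfied_degree(adj, cut, v):
    """Number of edges at v whose two endpoints receive the same colour."""
    return sum(cut[v] == cut[u] for u in adj[v])

def hlz_guaranteed_gain(adj, cut):
    """Gain promised by lemma 2.3: (2/5)|V_2| + (17/15)|V_3|."""
    v2 = sum(unsatisfied_degree(adj, cut, v) == 2 for v in adj)
    v3 = sum(unsatisfied_degree(adj, cut, v) == 3 for v in adj)
    return Fraction(2, 5) * v2 + Fraction(17, 15) * v3

# Cube graph Q_3: vertices 0..7, neighbours differ in exactly one bit.
cube = {v: {v ^ 1, v ^ 2, v ^ 4} for v in range(8)}
num_edges = sum(len(nb) for nb in cube.values()) // 2

constant_cut = {v: 0 for v in cube}
# The constant cut has cut size 0, so the gain alone bounds cutsize(C') / |E|.
ratio_bound = hlz_guaranteed_gain(cube, constant_cut) / num_edges
print(ratio_bound)  # 34/45
```

For a cut in which every vertex has at most one unsatisfied edge, the same function returns 0, reflecting that lemma 2.3 promises no improvement in that case.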

As another example, consider a uniformly random cut C_random of G. For such a cut, the expected cutsize is $\mathbb{E}\left[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}_{\text{random}})\right]=\frac{1}{2}\left\vert E\right\vert$ , so the expected approximation ratio is at least $\frac{1}{2}$ . Considering the cut C'' := HLZ(G, C_random), the approximation ratio of this cut satisfies

$\begin{equation*}\mathbb{E}\left[\frac{\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{{\prime\prime}})}{\text{MC}(G)}\right]\geqslant 0.6944\end{equation*}$

which can be seen as follows: each of the three edges adjacent to a fixed vertex is unsatisfied independently with probability $\frac{1}{2}$ , so the probabilities of a vertex being in V₃ and V₂ are 2⁻³ and 3 · 2⁻³, respectively. By linearity of expectation, we have $\mathbb{E}\left[\left\vert {V}_{3}\right\vert \right]={2}^{-3}\left\vert V\right\vert$ and $\mathbb{E}\left[\left\vert {V}_{2}\right\vert \right]=3\cdot {2}^{-3}\left\vert V\right\vert$ . Lemma 2.3 implies that $\mathbb{E}[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{{\prime\prime}})]\geqslant \frac{\vert E\vert }{2}+\frac{2\cdot 3}{5\cdot 8}\vert V\vert +\frac{17}{15\cdot 8}\vert V\vert$ . Using that $\left\vert V\right\vert =\frac{2}{3}\left\vert E\right\vert$ , we see that the approximation ratio is lower-bounded by $\frac{1}{2}+\frac{35}{180}=\frac{25}{36}\approx 0.6944$ in expectation value.
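The expectations of |V₂| and |V₃| under a uniformly random cut can also be checked by exhaustive enumeration on a small instance. The sketch below is an illustration under stated assumptions: the cube graph Q₃ serves as a triangle-free three-regular example, the function name is hypothetical, and the final line evaluates the right-hand side of lemma 2.3 divided by |E|.

```python
from fractions import Fraction
from itertools import product

# Cube graph Q_3: triangle-free and three-regular, 8 vertices and 12 edges.
cube = {v: {v ^ 1, v ^ 2, v ^ 4} for v in range(8)}

def expected_v2_v3(adj):
    """Exact E|V_2| and E|V_3| over all 2^n equally likely cuts."""
    vertices = sorted(adj)
    n = len(vertices)
    total_2 = total_3 = 0
    for colours in product((0, 1), repeat=n):
        cut = dict(zip(vertices, colours))
        for v in vertices:
            unsatisfied = sum(cut[v] == cut[u] for u in adj[v])
            total_2 += unsatisfied == 2
            total_3 += unsatisfied == 3
    return Fraction(total_2, 2 ** n), Fraction(total_3, 2 ** n)

e_v2, e_v3 = expected_v2_v3(cube)
num_edges = sum(len(nb) for nb in cube.values()) // 2
# Expected cut size |E|/2 plus the expected gain of lemma 2.3, relative to |E|.
ratio_bound = Fraction(1, 2) + (Fraction(2, 5) * e_v2 + Fraction(17, 15) * e_v3) / num_edges
print(e_v2, e_v3, ratio_bound)
```

Since each vertex contributes independently of the graph size, the per-vertex fractions obtained here carry over to any three-regular graph without multi-edges.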

3. Quantum approximate optimization and MaxCut

Here we briefly state the relevant definitions for QAOA applied to the MaxCut problem. In section 3.2, we then discuss basic features of QAOA that we exploit to find lower bounds on approximation ratios.

3.1. Definition of the MaxCut Hamiltonian and QAOA_p

Recall that the MaxCut problem Hamiltonian for a graph G = (V, E) is given by

$\begin{equation}{H}_{G}=\frac{1}{2}\sum\limits _{\left\{u,v\right\}\in E}\left(I-{Z}_{u}{Z}_{v}\right)\ \end{equation} \tag{ 4 }$

where a single qubit is associated with each vertex u ∈ V. Measurement of a state ${\Psi}\in {({\mathbb{C}}^{2})}^{\otimes \left\vert V\right\vert }$ in the computational basis yields a string $C\in {\left\{0,1\right\}}^{\left\vert V\right\vert }$ specifying a cut C of expected size

$\begin{equation}\left\langle {\Psi}\right\vert {H}_{G}\left\vert {\Psi}\right\rangle =\mathbb{E}\left[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}(C)\right].\end{equation} \tag{ 5 }$

The variational family used in QAOA is specified by a natural number p called the level of QAOA. For a given graph G = (V, E), the level-p variational state with parameters $(\beta ,\gamma )\in {\left[0,2\pi \right)}^{p}\times {\left[0,2\pi \right)}^{p}$ is

$\begin{equation}\left\vert {\psi }_{G}(\beta ,\gamma )\right\rangle ={U}_{G}(\beta ,\gamma )\left\vert {+}^{\vert V\vert }\right\rangle \end{equation} \tag{ 6 }$

where $\left\vert +\right\rangle =\frac{1}{\sqrt{2}}(\left\vert 0\right\rangle +\left\vert 1\right\rangle )$ , $\left\vert {+}^{\vert V\vert }\right\rangle {:=}{\left\vert +\right\rangle }^{\otimes \vert V\vert }$ and where

$\begin{equation*}{U}_{G}(\beta ,\gamma ){:=}\prod\limits _{m=1}^{p}\left[\mathrm{exp}\left(-\mathrm{i}{\beta }_{m}\sum\limits _{u\in V}{X}_{u}\right)\mathrm{exp}\left(-\mathrm{i}{\gamma }_{m}{H}_{G}\right)\right]\ \end{equation*}$

is the QAOA unitary. In the following, we analyze the performance of twisted algorithms derived from QAOA_p.
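For small graphs, the state (6) and the expectation (5) can be evaluated by dense state-vector simulation. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, vertex u is stored in bit u of the basis-state index, and the layers are assumed to be applied in the order m = 1, ..., p (for p = 1 this ordering question does not arise).

```python
import cmath
import math

def cut_size(edges, z):
    """Cut size of the bit string z (bit u of z is the colour of vertex u)."""
    return sum(((z >> u) ^ (z >> v)) & 1 for u, v in edges)

def qaoa_state(n, edges, betas, gammas):
    """Amplitudes of the level-p QAOA state for the MaxCut Hamiltonian H_G."""
    dim = 2 ** n
    amp = [complex(1 / math.sqrt(dim))] * dim          # the state |+>^n
    for beta, gamma in zip(betas, gammas):
        # exp(-i*gamma*H_G) is diagonal with eigenvalue cut_size(z) on |z>.
        amp = [a * cmath.exp(-1j * gamma * cut_size(edges, z))
               for z, a in enumerate(amp)]
        # exp(-i*beta*X_u) = cos(beta) I - i sin(beta) X_u, applied qubit by qubit.
        c, s = math.cos(beta), math.sin(beta)
        for u in range(n):
            amp = [c * amp[z] - 1j * s * amp[z ^ (1 << u)] for z in range(dim)]
    return amp

def expected_cut(edges, amp):
    """<psi|H_G|psi>, i.e. the expected cut size of the measured string."""
    return sum(abs(a) ** 2 * cut_size(edges, z) for z, a in enumerate(amp))

# Illustrative instance: the ring on four vertices at level p = 1.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
state = qaoa_state(4, ring, betas=[math.pi / 8], gammas=[math.pi / 4])
print(expected_cut(ring, state))
```

The memory cost grows as 2ⁿ, so this approach is only meant for sanity checks on a handful of qubits.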

3.2. Locality and uniformity of QAOA

The analysis of QAOA typically exploits its locality and uniformity, see e.g., [8, 17, 18]. Similar arguments apply to our modified versions of QAOA. Here we state these properties in a form that will be used below to establish lower bounds on the achieved approximation ratios.

Locality of QAOA. One of the defining features of this ansatz is its locality: the reduced density operator of ψ_G(β, γ) on some subset S ⊂ [n] of qubits is uniquely determined by (β, γ) and the 'p-environment' of S, a certain subgraph of G. For the following analysis, it will be convenient to express this dependence in a more detailed form.

Let A be a local operator supported on a subset supp(A) ⊂ [n] of qubits. Conjugation of A by an operator of the form exp(−iβ_m X_u) does not change the support of A and leaves the operator invariant unless u ∈ supp(A). Similarly, conjugation of A by an operator of the form exp(iγ_m Z_u Z_v) leaves A invariant unless {u, v} ∩ supp(A) ≠ ∅, in which case the support generically becomes {u, v} ∪ supp(A). Applying this reasoning iteratively shows the following: conjugating A by the QAOA unitary U_G(β, γ) is equivalent to conjugation by the QAOA unitary ${U}_{{G}^{(p)}[\mathrm{s}\mathrm{u}\mathrm{p}\mathrm{p}(A)]}(\beta ,\gamma )$ associated with a subgraph G^(p)[supp(A)] of G. The latter is defined as follows, for any fixed subset S ⊂ V of vertices corresponding to the support of A. A length-ℓ path starting in S is a sequence (u₀, ..., u_ℓ) of vertices such that u₀ ∈ S and {u_{j−1}, u_j} ∈ E for all j = 1, ..., ℓ. The subgraph G^(p)[S] of G is the result of taking the union of all paths of length at most p starting in S. We call G^(p)[S] the p-environment of S. Succinctly, this shows that

$\begin{equation*}\left\langle {\psi }_{G}(\beta ,\gamma )\right\vert A\left\vert {\psi }_{G}(\beta ,\gamma )\right\rangle =\left\langle {\psi }_{{G}^{(p)}[\mathrm{s}\mathrm{u}\mathrm{p}\mathrm{p}(A)]}(\beta ,\gamma )\right\vert A\left\vert {\psi }_{{G}^{(p)}[\mathrm{s}\mathrm{u}\mathrm{p}\mathrm{p}(A)]}(\beta ,\gamma )\right\rangle .\end{equation*}$

In other words, to evaluate the expectation of A, it suffices to consider the QAOA-state associated with the p-environment of the support of A.
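The p-environment is easy to compute with a breadth-first search. The sketch below is one possible implementation, assuming an adjacency-dictionary representation and returning only the edge set of G^(p)[S]; it uses the observation that an edge lies on a path of length at most p starting in S exactly when one of its endpoints is within distance p − 1 of S. The function name is hypothetical.

```python
from collections import deque

def p_environment(adj, support, p):
    """Edge set of G^(p)[S]: the union of all paths of length <= p starting in S."""
    dist = {v: 0 for v in support}
    queue = deque(support)
    while queue:
        v = queue.popleft()
        if dist[v] == p:          # no need to explore beyond distance p
            continue
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    # Keep every edge that has an endpoint at distance at most p - 1 from S.
    return {frozenset((v, u)) for v, d in dist.items() if d <= p - 1 for u in adj[v]}

# Illustrative instance: the one- and two-environments of an edge of the 8-ring.
ring = {v: {(v - 1) % 8, (v + 1) % 8} for v in range(8)}
env_1 = p_environment(ring, {0, 1}, 1)
env_2 = p_environment(ring, {0, 1}, 2)
print(sorted(sorted(e) for e in env_1))
```

For the ring, the one-environment of the edge {0, 1} consists of that edge and its two neighbouring edges, and each further level adds one edge on either side.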

Uniformity of QAOA. For a generic local operator A with support S = supp(A), the quantity $\left\langle {\psi }_{{G}^{(p)}[S]}(\beta ,\gamma )\right\vert A\left\vert {\psi }_{{G}^{(p)}[S]}(\beta ,\gamma )\right\rangle$ depends on the underlying graph G only through the p-environment G^(p)[S] of S and the subgraph G[S] of G induced by S. In fact, for a fixed induced subgraph K := G[S], only the equivalence class of the p-environment G^(p)[S] matters. Here two graphs G₁ and G₂ (that both contain K as a subgraph) are called equivalent if and only if they are isomorphic with an isomorphism fixing K. This property of QAOA is an immediate consequence of its definition.

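This motivates considering equivalence classes of p-environments associated with a graph $\tilde{G}$ . We denote this set by ${\mathcal{E}}^{(p)}(\tilde{G})$ and call this the set of p-environments of $\tilde{G}$ . Modulo isomorphisms fixing $\tilde{G}$ , every element of ${\mathcal{E}}^{(p)}(\tilde{G})$ is a graph that appears as a p-environment G^(p)[S] for a graph G, where S is a subset of vertices of G with the property that the induced subgraph is $\tilde{G}=G[S]$ . We will use individual representatives of each equivalence class to denote elements of ${\mathcal{E}}^{(p)}(\tilde{G})$ . For example, the set of one-environments of a triplet is depicted in figure 4 found in appendix A. These observations allow us to reorganize expectation values that are uniform. For example,

$\begin{equation}\left\langle {\psi }_{G}(\beta ,\gamma )\right\vert {H}_{G}\left\vert {\psi }_{G}(\beta ,\gamma )\right\rangle =\sum\limits _{\tilde{G}\in {\mathcal{E}}^{(p)}(e)}{n}_{G}(\tilde{G})\left\langle {\psi }_{\tilde{G}}(\beta ,\gamma )\right\vert \frac{1}{2}\left(I-{Z}_{u}{Z}_{v}\right)\left\vert {\psi }_{\tilde{G}}(\beta ,\gamma )\right\rangle ,\end{equation} \tag{ 7 }$

where e = {u, v} denotes the graph consisting of a single edge and ${n}_{G}(\tilde{G})$ is the number of times the p-environment $\tilde{G}$ appears in G.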

Of special interest to us will be so-called p-trees. Given a graph $\tilde{G}$ and $p\in \mathbb{N}$ , ${T}^{(p)}\left(\tilde{G}\right)$ is defined as the sole tree in ${\mathcal{E}}^{(p)}\left(\tilde{G}\right)$ , see figures 6 and 7 in appendix A for examples.

4. Twisted variational hybrid algorithms for MaxCut

In this section, we define our twisted algorithm ${\mathcal{A}}^{+}$ given a hybrid algorithm $\mathcal{A}$ . We first show in section 4.1 that the effect of classical post-processing can be quantified in terms of the expectation value of a modified problem Hamiltonian. We then give the definition of the twisted algorithm ${\mathcal{A}}^{+}$ in section 4.2.

4.1. Lifting performance guarantees to hybrid algorithms

Lemmas 2.2 and 2.3 provide performance guarantees for the improvement obtained by applying the (classical) FKL- and the HLZ-algorithm to any cut C. Here we show that these results easily translate to the context of hybrid algorithms.

Concretely, consider a graph G = (V, E) with V = [n] and a variational ansatz state ${\Psi}\in {({\mathbb{C}}^{2})}^{\otimes n}$ . Measuring Ψ in the computational basis provides a cut C ∈ {0, 1}ⁿ to which we can apply either the FKL or the HLZ procedure.

Let us first consider the simpler case of FKL, i.e., suppose that C' = FKL(G, C) is the cut obtained by applying the FKL-post-processing to the cut C. To make lemma 2.2 applicable to this setting, we need an operator that accounts for good triplets. Such an operator is

$\begin{equation*}{N}_{G}{:=}\sum\limits _{(c,j,k)\in {T}_{G}}{{\Pi}}_{c,j,k},\quad \text{where}\enspace {{\Pi}}_{c,j,k}{:=}{\left(\left\vert 000\right\rangle \!\left\langle 000\right\vert +\left\vert 111\right\rangle \!\left\langle 111\right\vert \right)}_{c,j,k}\end{equation*}$

with T_G denoting the set of triplets in G. Observe that Π_c,j,k is a projector onto the subspace spanned by computational basis states $\left\vert C\right\rangle$ describing a cut C ∈ {0, 1}ⁿ such that (c, j, k) is a good triplet in C. This implies that the expectation $\left\langle {\Psi}\right\vert {N}_{G}\left\vert {\Psi}\right\rangle$ of N_G in a state Ψ is equal to the expected number of good triplets in a cut C obtained by measuring Ψ in the computational basis, i.e.,

$\begin{equation}\left\langle {\Psi}\right\vert {N}_{G}\left\vert {\Psi}\right\rangle =\sum\limits _{C\in {\left\{0,1\right\}}^{n}}\vert \langle C\vert {\Psi}\rangle {\vert }^{2}\cdot \vert {\mathsf{G}\mathsf{o}\mathsf{o}\mathsf{d}}_{G}(C)\vert =\mathbb{E}\left[\vert {\mathsf{G}\mathsf{o}\mathsf{o}\mathsf{d}}_{G}(C)\vert \right].\end{equation} \tag{ 8 }$

Correspondingly, we call N_G the good triplet number operator.

Combining (8) with (5), we obtain the following 'quantum version' of lemma 2.2:

Lemma 4.1. Let G = (V, E) be a three-regular graph with V = [n] and ${\Psi}\in {({\mathbb{C}}^{2})}^{\otimes n}$ . Let C ∈ {0, 1}ⁿ be the result of measuring Ψ in the computational basis and C' := FKL(G, C). Then

$\begin{equation*}\mathbb{E}\,\left[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}\,({C}^{\prime })\right]\geqslant \left\langle {\Psi}\right\vert \left({H}_{G}+\frac{1}{3}{N}_{G}\right)\left\vert {\Psi}\right\rangle .\end{equation*}$

This lemma shows that the 'target Hamiltonian' H_G should be modified by introducing the improvement operator

$\begin{equation}{{\Delta}}_{G}^{\text{FKL}}{:=}\frac{1}{3}{N}_{G}.\end{equation} \tag{ 9 }$
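Since both H_G and N_G are diagonal in the computational basis, the expectation of the modified Hamiltonian depends only on the measurement distribution of Ψ. The following sketch makes this explicit for small graphs; it assumes vertices labelled 0, ..., n − 1 with vertex v stored in bit v of the basis-state index, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def twisted_fkl_diagonal(adj):
    """Diagonal of H_G + N_G/3 in the computational basis, indexed by z."""
    n = len(adj)
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    trips = [(c, j, k) for c in adj for j, k in combinations(sorted(adj[c]), 2)]
    bit = lambda z, v: (z >> v) & 1
    diag = []
    for z in range(2 ** n):
        cut = sum(bit(z, u) != bit(z, v) for u, v in edges)
        good = sum(bit(z, c) == bit(z, j) == bit(z, k) for c, j, k in trips)
        diag.append(cut + Fraction(good, 3))
    return diag

def expectation(diag, probs):
    """<Psi|(H_G + N_G/3)|Psi> from the measurement distribution of Psi."""
    return sum(p * d for p, d in zip(probs, diag))

# Illustrative instance: K_{3,3} with the uniform distribution of |+>^n.
k33 = {v: ({3, 4, 5} if v < 3 else {0, 1, 2}) for v in range(6)}
diag = twisted_fkl_diagonal(k33)
uniform = [Fraction(1, 64)] * 64
print(expectation(diag, uniform))  # 6
```

The value 6 equals (2/3)|E| for |E| = 9, in agreement with the random-cut example of section 2.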

A similar treatment applies to the HLZ-procedure. Suppose that C' = HLZ(G, C) is the cut obtained by applying the HLZ-post-processing to the cut C. We now want to 'quantize' lemma 2.3 and therefore need two operators that account for the number of vertices with two and three unsatisfied edges adjacent to them, respectively. To define these operators, let A(c) be the ordered three-tuple of neighbors of c ∈ V and let $\bar{A}(c)$ denote the closed neighbourhood $\bar{A}(c){:=}(c,A{(c)}_{1},A{(c)}_{2},A{(c)}_{3})$ . Then we set

$\begin{equation*}{M}_{G}^{(2)}=\sum\limits _{c\in V}{{\Pi}}_{c,A(c)}^{(2)}\quad \text{and}\quad {M}_{G}^{(3)}=\sum\limits _{c\in V}{{\Pi}}_{c,A(c)}^{(3)},\end{equation*}$

where

$\begin{align*}\hfill {{\Pi}}_{c,A(c)}^{(2)}{:=}& \sum\limits _{b\in \left\{0,1\right\}}\left\vert b\right\rangle \!{\left\langle b\right\vert }_{c}\otimes {P}_{A(c)}^{(b)}\quad \text{with}\enspace {P}_{A(c)}^{(b)}{:=}\!\!\!\sum\limits _{\begin{subarray}{c}(x,y,z)\in {\left\{0,1\right\}}^{3},\\ (b\oplus x)+(b\oplus y)+(b\oplus z)=1\end{subarray}}\!\!\!\left\vert xyz\right\rangle \!{\left\langle xyz\right\vert }_{A(c)}\enspace \text{and}\enspace \hfill \\ \hfill {{\Pi}}_{c,A(c)}^{(3)}{:=}& {\left(\left\vert 0000\right\rangle \!\left\langle 0000\right\vert +\left\vert 1111\right\rangle \!\left\langle 1111\right\vert \right)}_{\bar{A}(c)}.\hfill \end{align*}$

Observe that ${P}_{A(c)}^{(b)}$ is a projector onto the span of the computational basis states of the three neighbours that contain exactly two bits equal to b. Furthermore, ${{\Pi}}_{c,A(c)}^{(2)}$ is a projector onto the subspace spanned by computational basis states which are associated with exactly two unsatisfied edges adjacent to c. Similarly, ${{\Pi}}_{c,A(c)}^{(3)}$ is a projector onto the subspace spanned by computational basis states which are associated with exactly three unsatisfied edges adjacent to c. By abuse of notation, we use ${{\Pi}}_{c}^{(2)}$ and ${{\Pi}}_{c}^{(3)}$ whenever the graph is known from the context.
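The supports of the two local projectors can be listed explicitly, which gives a quick consistency check of the definitions. The sketch below enumerates the 16 basis strings (b, x, y, z) of a closed neighbourhood, with b the colour of c and (x, y, z) the colours of its neighbours; the variable names are illustrative.

```python
from itertools import product

def unsatisfied_edges(b, x, y, z):
    """Number of unsatisfied edges at c: neighbours sharing the colour b of c."""
    return sum(b == t for t in (x, y, z))

closed_neighbourhood = list(product((0, 1), repeat=4))   # strings (b, x, y, z)
support_2 = [s for s in closed_neighbourhood if unsatisfied_edges(*s) == 2]
support_3 = [s for s in closed_neighbourhood if unsatisfied_edges(*s) == 3]

# Defining condition of P^(b): exactly one neighbour bit differs from b.
by_definition = [(b, x, y, z) for b, x, y, z in closed_neighbourhood
                 if (b ^ x) + (b ^ y) + (b ^ z) == 1]

print(len(support_2), len(support_3))  # 6 2
```

The two projectors therefore have ranks 6 and 2 on the 16-dimensional space of the closed neighbourhood, so in the maximally mixed state their expectations are 3/8 and 1/8.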

Using the same reasoning as for lemma 4.1, we obtain the following:

Lemma 4.2. Let G = (V, E) be a three-regular triangle-free graph with V = [n] and ${\Psi}\in {({\mathbb{C}}^{2})}^{\otimes n}$ . Let C ∈ {0, 1}ⁿ be the result of measuring Ψ in the computational basis and C' := HLZ(G, C). Then

$\begin{equation*}\mathbb{E}\left[\mathsf{c}\mathsf{u}\mathsf{t}\mathsf{s}\mathsf{i}\mathsf{z}\mathsf{e}({C}^{\prime })\right]\geqslant \left\langle {\Psi}\right\vert \left({H}_{G}+\frac{2}{5}{M}_{G}^{(2)}+\frac{17}{15}{M}_{G}^{(3)}\right)\left\vert {\Psi}\right\rangle .\end{equation*}$

Therefore, H_G should be modified by introducing the improvement operator

$\begin{equation}{{\Delta}}_{G}^{\text{HLZ}}{:=}\frac{2}{5}{M}_{G}^{(2)}+\frac{17}{15}{M}_{G}^{(3)}.\end{equation} \tag{ 10 }$

4.2. Definition of the twisted algorithm ${\mathcal{A}}^{+}$

Here we present our modified variational algorithm ${\mathcal{A}}^{+}$ which we call twisted- $\mathcal{A}$ . We formalize a variational quantum algorithm $\mathcal{A}$ as follows: it is given by a family of states

$\begin{equation*}\mathcal{A}={\left\{{{\Psi}}_{x}\left(\theta \right)\right\}}_{\theta \in {\Theta}},\end{equation*}$

where x is an input to the algorithm, i.e., a problem instance and ${\Theta}\subset {\mathbb{R}}^{k}$ for some $k\in \mathbb{N}$ . Once one has chosen θ, the state ${{\Psi}}_{x}\left(\theta \right)$ is measured to obtain the output of the algorithm.

In the case of the MaxCut problem, a problem instance is given by a graph G. A good hybrid algorithm for this problem specifies a variational family ${\left\{{{\Psi}}_{G}(\theta )\right\}}_{\theta \in {\Theta}}$ whose elements can be efficiently prepared (e.g., by a low-depth circuit) and which—ideally—contains elements with large energy (corresponding to the expected cut size) with respect to the MaxCut problem Hamiltonian H_G (see equation (4)). Given such an algorithm $\mathcal{A}$ , we obtain a twisted algorithm $\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}$ - ${\mathcal{A}}^{+}$ by the following modifications, where $\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}\in \left\{\text{FKL},\text{HLZ}\right\}$ denotes the chosen classical post-processing involved (see section 2):

(a)
In the angle optimization step, the modified cost function Hamiltonian ${H}_{G}^{+}={H}_{G}+{{\Delta}}_{G}^{\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}}$ is used. Here ${{\Delta}}_{G}^{\text{FKL}}$ and ${{\Delta}}_{G}^{\text{HLZ}}$ are the corresponding operators defined in equations (9) and (10), respectively.
(b)
The classical post-processing procedure $\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}$ is applied to the measurement result obtained by measuring the optimal state.

Algorithm 3 shows the general procedure.

Algorithm 3. The twisted algorithm $\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}$ - ${\mathcal{A}}^{+}$ where $\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}\in \left\{\text{FKL},\text{HLZ}\right\}$ and where $\mathcal{A}={\left\{\left\vert {{\Psi}}_{G}\left(\theta \right)\right\rangle \right\}}_{\theta \in {\Theta}}$ is a variational algorithm. The measurement result C ∈ {0, 1}ⁿ obtained in step 3 defines a cut of G.

1: function $\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}$ - ${\mathcal{A}}^{+}$ (three-regular graph G = (V, E) with V = [n])

2: Compute ${\theta }_{\ast }=\mathrm{arg}{\mathrm{max}}_{\theta \in {\Theta}}\left\langle {{\Psi}}_{G}\left(\theta \right)\right\vert ({H}_{G}+{{\Delta}}_{G}^{\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}})\left\vert {{\Psi}}_{G}\left(\theta \right)\right\rangle$

3: Measure ${{\Psi}}_{G}\left({\theta }_{\ast }\right)$ in the computational basis getting outcome C ∈ {0, 1}ⁿ

4: Compute ${C}^{\prime }=\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}(G,C)$

5: return C'

5. Lower bounds on approximation ratios of QAOA⁺

Here we analyze the twisted versions of QAOA in detail. For a graph G and $p\in \mathbb{N}$ , let H_G be the Hamiltonian (4) and ψ_G(β, γ) the level-p trial wavefunction defined by (6). The twisted algorithms $\text{FKL}-{\text{QAOA}}_{p}^{+}$ and $\text{HLZ}-{\text{QAOA}}_{p}^{+}$ proceed as described in algorithm 4. We prove lower bounds on the approximation ratios ${\alpha }_{G}\left(\text{FKL}-{\text{QAOA}}_{p}^{+}\right)$ and ${\alpha }_{G}\left(\text{HLZ}-{\text{QAOA}}_{p}^{+}\right)$ for certain families of three-regular graphs G.

Algorithm 4. The twisted algorithm $\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}$ - ${\text{QAOA}}_{p}^{+}$ for $\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}\in \left\{\text{FKL},\text{HLZ}\right\}$ .

1: function $\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}$ - ${\text{QAOA}}_{p}^{+}$ (three-regular graph G = (V, E) with V = [n])

2: Compute $({\beta }_{\ast },{\gamma }_{\ast })=\mathrm{arg}{\mathrm{max}}_{(\beta ,\gamma )\in {\left[0,2\pi \right)}^{p}\times {\left[0,2\pi \right)}^{p}}\left\langle {\psi }_{G}(\beta ,\gamma )\right\vert ({H}_{G}+{{\Delta}}_{G}^{\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}})\left\vert {\psi }_{G}(\beta ,\gamma )\right\rangle$

3: Measure ψ_G(β_*, γ_*) in the computational basis getting outcome C ∈ {0, 1}ⁿ

4: Compute ${C}^{\prime }=\mathsf{P}\mathsf{o}\mathsf{s}\mathsf{t}(G,C)$

5: return C'
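Step 2 of algorithm 4 can be illustrated on a small instance by brute force. The sketch below is only a toy version under several assumptions: it treats Post = FKL at level p = 1, simulates the state densely, and replaces the optimizer by a coarse grid search over [0, 2π)², which is a substitute chosen here for simplicity. The measurement and post-processing steps are not implemented, and all function names are hypothetical.

```python
import cmath
import math
from itertools import combinations

def diagonals(adj):
    """Diagonals of H_G and of H_G + N_G/3 in the computational basis."""
    n = len(adj)
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    trips = [(c, j, k) for c in adj for j, k in combinations(sorted(adj[c]), 2)]
    bit = lambda z, v: (z >> v) & 1
    cut = [sum(bit(z, u) != bit(z, v) for u, v in edges) for z in range(2 ** n)]
    good = [sum(bit(z, c) == bit(z, j) == bit(z, k) for c, j, k in trips)
            for z in range(2 ** n)]
    return cut, [c + g / 3 for c, g in zip(cut, good)]

def qaoa1_probabilities(n, cut, beta, gamma):
    """Measurement distribution of the level-1 QAOA state."""
    dim = 2 ** n
    amp = [cmath.exp(-1j * gamma * cut[z]) / math.sqrt(dim) for z in range(dim)]
    c, s = math.cos(beta), math.sin(beta)
    for u in range(n):   # exp(-i*beta*X_u) on every qubit
        amp = [c * amp[z] - 1j * s * amp[z ^ (1 << u)] for z in range(dim)]
    return [abs(a) ** 2 for a in amp]

def optimise_on_grid(adj, steps=16):
    """Maximise <H_G + Delta^FKL> over a steps x steps grid of angles."""
    n = len(adj)
    cut, twisted = diagonals(adj)
    best = (-1.0, 0.0, 0.0)
    for i in range(steps):
        for j in range(steps):
            beta, gamma = 2 * math.pi * i / steps, 2 * math.pi * j / steps
            probs = qaoa1_probabilities(n, cut, beta, gamma)
            value = sum(p * t for p, t in zip(probs, twisted))
            best = max(best, (value, beta, gamma))
    return best

# Illustrative instance: K_{3,3} with |E| = 9.
k33 = {v: ({3, 4, 5} if v < 3 else {0, 1, 2}) for v in range(6)}
value, beta, gamma = optimise_on_grid(k33)
print(value / 9, beta, gamma)
```

Because the grid contains the point (0, 0), where the state is |+⟩ⁿ, the value found is never below the random-cut baseline (2/3)|E|; a finer grid or a local optimizer started from the best grid point would refine the angles.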

A remark on the proof technique is in order here: while we rely on numerical gradient descent to determine good candidate parameters, these are used to optimize our lower bounds only. In particular, the validity of the established bounds is independent of the correctness of these numerical methods. This is especially important because we consider high-dimensional optimization problems and gradient descent may or may not converge.

5.1. Approximation ratios of FKL-QAOA⁺ for three-regular graphs

We denote the girth of a graph G, i.e., the size of the smallest cycle in G, by g(G). We present two kinds of results: for $\text{FKL}-{\text{QAOA}}_{1}^{+}$ , we give a bound applicable to all three-regular graphs. For higher levels p, we give bounds applicable to three-regular graphs with high girth.

Proposition 5.1. Let G be a three-regular graph. Then

(a)
${\alpha }_{G}\left(\text{FKL}-{\text{QAOA}}_{1}^{+}\right)\geqslant 0.7443$ .
(b)
If g(G) ⩾ 7, then ${\alpha }_{G}\left(\text{FKL}-{\text{QAOA}}_{2}^{+}\right)\geqslant 0.7887$ .
(c)
If g(G) ⩾ 9, then ${\alpha }_{G}\left(\text{FKL}-{\text{QAOA}}_{3}^{+}\right)\geqslant 0.8146$ .
(d)
If g(G) ⩾ 11, then ${\alpha }_{G}\left(\text{FKL}-{\text{QAOA}}_{4}^{+}\right)\geqslant 0.8323$ .
(e)
If g(G) ⩾ 13, then ${\alpha }_{G}\left(\text{FKL}-{\text{QAOA}}_{5}^{+}\right)\geqslant 0.8457$ .
(f)
If g(G) ⩾ 15, then ${\alpha }_{G}\left(\text{FKL}-{\text{QAOA}}_{6}^{+}\right)\geqslant 0.8564$ .

Proof.

(a)
For brevity, let us write ψ_G(θ) for the QAOA₁ state with parameters $\theta =(\beta ,\gamma )\in {\left[0,2\pi \right)}^{2}$ . Recall from lemma 4.1 that the expected approximation ratio obtained from such a state using the FKL-post-processing procedure is at least
$\begin{equation}\frac{\left\langle {\psi }_{G}(\theta )\right\vert \left({H}_{G}+{{\Delta}}_{G}^{\text{FKL}}\right)\left\vert {\psi }_{G}(\theta )\right\rangle }{\text{MC}(G)}.\end{equation} \tag{ 11 }$
We follow and simplify the approach of [8, 18] and bound the ratio (11) in terms of its local contributions.
We first rearrange and express the numerator of (11) as a sum over triplets. Notice that since the graph is three-regular, any edge lies in exactly four triplets. Hence
$\begin{equation}{H}_{G}+{{\Delta}}_{G}^{\text{FKL}}=\sum\limits _{(c,j,k)\in {T}_{G}}{T}_{(c,j,k)}\ \end{equation} \tag{ 12 }$
where T_(c,j,k) is the triplet operator defined as
$\begin{equation*}{T}_{(c,j,k)}{:=}\frac{{H}^{c,j}+{H}^{c,k}}{4}+\frac{1}{3}{{\Pi}}_{c,j,k}\quad \text{for}\enspace (c,j,k)\in {T}_{G}\end{equation*}$
and where ${H}^{a,b}{:=}\frac{1}{2}\left(I-{Z}_{a}{Z}_{b}\right)$ is the term in the MaxCut problem Hamiltonian H_G associated with the edge {a, b}.
Next consider the denominator in the expression (11), i.e., the maximum size MC(G) of a cut. We can bound this term by the expression
$\begin{equation}\text{MC}(G)\leqslant \vert E\vert -\vert {\triangle }_{G}\vert -\vert {\boxtimes }_{G}\vert ,\end{equation} \tag{ 13 }$
where ${\triangle }_{G}$ is the set of isolated triangles (triangles that do not share an edge with another triangle) in G and ${\boxtimes }_{G}$ is the set of crossed squares (consisting of two triangles sharing an edge). Inequality (13) follows immediately from the observation that in any cut of G, there is at least one unsatisfied (i.e., 'uncut') edge in each isolated triangle because of frustration. Similarly, there is at least one unsatisfied edge in each crossed square. We note that the bound (13) applies to any three-regular graph G with more than four vertices because in these graphs, any triangle is either isolated or part of a crossed square. (Observe that for the remaining graph, the complete graph G = K₄ on four vertices, we have MC(K₄) = 4, and equation (13) does not hold for this graph. In our argument, we will replace equation (13) by the relaxed equation (14) below which applies also to K₄.)
We can bound MC(G) further starting from (13) by expressing the right-hand side as a sum over edges. Since every isolated triangle has three edges, we can express the number of isolated triangles as
$\begin{equation*}\vert {\triangle }_{G}\vert =\frac{1}{3}\sum\limits _{e\in E}{\mathbf{1}}_{\triangle }^{G}(e),\end{equation*}$
where ${\mathbf{1}}_{\triangle }^{G}(e)$ is 1 if the edge e is part of an isolated triangle in the graph G and 0 otherwise. Similarly, since every crossed square has five edges, we have
$\begin{equation*}\vert {\boxtimes }_{G}\vert =\frac{1}{5}\sum\limits _{e\in E}{\mathbf{1}}_{\boxtimes }^{G}(e)\end{equation*}$
for crossed squares, where ${\mathbf{1}}_{\boxtimes }^{G}(e)$ is 1 if the edge e is part of a crossed square in the graph G and 0 otherwise.
To establish our bound, we only consider the one-environment of each edge e ∈ E, i.e., G⁽¹⁾[e]. For an edge e ∈ E which belongs to a triangle, the one-environment G⁽¹⁾[e] is not necessarily sufficient to distinguish whether the triangle is isolated or belongs to a crossed square: for example, this is the case for an edge e that belongs to a crossed square but is not shared by both triangles. The fraction of uncut edges (in any cut) is at least 1/3 for an isolated triangle, and at least 1/5 for a crossed square. Using the smaller of these two contributions per edge, i.e., pretending that each triangle is in a crossed square, yields the bound
$\begin{equation}\text{MC}(G)\leqslant \sum\limits _{e\in E}\left(1-\frac{1}{5}{\mathbf{1}}_{\text{tri}}^{G}(e)\right).\end{equation} \tag{ 14 }$
Here ${\mathbf{1}}_{\text{tri}}^{G}(e)$ indicates whether the edge e is part of a triangle, i.e., equals 1 whenever the edge e is part of a triangle in graph G and 0 otherwise. Notice that ${\mathbf{1}}_{\text{tri}}^{G}(e)={\mathbf{1}}_{\text{tri}}^{{G}^{(1)}[e]}(e)$ , therefore it is enough to examine the one-environments of edges to obtain the bound (the possible environments are showcased in figure 3 in appendix A). We note that while we have excluded G = K₄ in the proof of inequality (14), it is easy to check directly that this graph also satisfies (14).
Expression (14) motivates defining the local averaged MaxCut fraction of an edge e in G as
$\begin{equation*}{L}_{e}^{G}{:=}1-\frac{1}{5}{\mathbf{1}}_{\text{tri}}^{G}(e).\end{equation*}$
Using that every edge appears in four triplets, we can reexpress the upper bound (14) as
$\begin{align}\hfill \text{MC}(G)& \leqslant \frac{1}{4}\sum\limits _{(c,j,k)\in {T}_{G}}\left({L}_{\left\{c,j\right\}}^{G}+{L}_{\left\{c,k\right\}}^{G}\right)\hfill \\ \hfill & =\sum\limits _{(c,j,k)\in {T}_{G}}{L}_{(c,j,k)}^{G},\hfill \end{align} \tag{ 15 }$
where
$\begin{equation*}{L}_{(c,j,k)}^{G}{:=}\frac{1}{4}\left({L}_{\left\{c,j\right\}}^{G}+{L}_{\left\{c,k\right\}}^{G}\right)\end{equation*}$
denotes the local averaged MaxCut fraction of a triplet (c, j, k) ∈ T_G.
Inserting the upper bound (15) on MC(G) and expression (12) into (11) gives
$\begin{equation}\frac{\left\langle {\psi }_{G}(\theta )\right\vert \left({H}_{G}+{{\Delta}}_{G}^{\text{FKL}}\right)\left\vert {\psi }_{G}(\theta )\right\rangle }{\text{MC}(G)}\geqslant \frac{{\sum }_{(c,j,k)\in {T}_{G}}\left\langle {\psi }_{G}(\theta )\right\vert {T}_{(c,j,k)}\left\vert {\psi }_{G}(\theta )\right\rangle }{{\sum }_{(c,j,k)\in {T}_{G}}{L}_{(c,j,k)}^{G}}.\end{equation} \tag{ 16 }$
Recall that for any triplet (c, j, k) ∈ T_G, the expectation value $\left\langle {\psi }_{G}(\theta )\right\vert {T}_{(c,j,k)}\left\vert {\psi }_{G}(\theta )\right\rangle$ is equal to the local expectation $\left\langle {\psi }_{\tilde{G}}(\theta )\right\vert {T}_{(c,j,k)}\left\vert {\psi }_{\tilde{G}}(\theta )\right\rangle$ , where $\tilde{G}$ is the (appropriate) graph environment of the triplet. By its definition as a local quantity, the combinatorial quantity ${L}_{(c,j,k)}^{G}={L}_{(c,j,k)}^{\tilde{G}}$ also depends only on the corresponding graph environment. The set of equivalence classes of possible graph environments consists of 11 (equivalence classes of) graphs, see figure 4 in appendix A. Denoting—as in (7)—by n_G(G_r) the number of times the environment G_r appears in G, we can restate (16) as
$\begin{equation}\frac{\left\langle {\psi }_{G}(\theta )\right\vert \left({H}_{G}+{{\Delta}}_{G}^{\text{FKL}}\right)\left\vert {\psi }_{G}(\theta )\right\rangle }{\text{MC}(G)}\geqslant \frac{{\sum }_{r=1}^{11}{n}_{G}({G}_{r})\left\langle {\psi }_{{G}_{r}}(\theta )\right\vert {T}_{(c,j,k)}\left\vert {\psi }_{{G}_{r}}(\theta )\right\rangle }{{\sum }_{r=1}^{11}{n}_{G}({G}_{r}){L}_{(c,j,k)}^{{G}_{r}}}.\end{equation} \tag{ 17 }$
Equation (17) is valid for any choice of $\theta \in {\left[0,2\pi \right)}^{2}$ . Suppose now that we have found some angles $\bar{\theta }\in {\left[0,2\pi \right)}^{2}$ such that
$\begin{equation}\frac{\left\langle {\psi }_{{G}_{s}}(\bar{\theta })\right\vert {T}_{(c,j,k)}\left\vert {\psi }_{{G}_{s}}(\bar{\theta })\right\rangle }{{L}_{(c,j,k)}^{{G}_{s}}}\geqslant \frac{\left\langle {\psi }_{{G}_{1}}(\bar{\theta })\right\vert {T}_{(c,j,k)}\left\vert {\psi }_{{G}_{1}}(\bar{\theta })\right\rangle }{{L}_{(c,j,k)}^{{G}_{1}}}\quad \text{for}\,\text{all}\,s=2,\dots ,11.\end{equation} \tag{ 18 }$
An example of such a pair is
$\begin{equation}\bar{\theta }=(\bar{\beta },\bar{\gamma })=(1.130\,565,5.667\,705)\end{equation} \tag{ 19 }$
as can be verified by straightforward computation. The mediant inequality $\frac{a+b}{c+d}\geqslant \mathrm{min}\left\{\frac{a}{c},\frac{b}{d}\right\}$ implies (inductively) that
$\begin{equation*}\frac{{\sum }_{r=1}^{11}{n}_{r}{t}_{r}}{{\sum }_{r=1}^{11}{n}_{r}{\ell }_{r}}\geqslant \underset{r=1,\dots ,11}{\mathrm{min}}\frac{{t}_{r}}{{\ell }_{r}}\end{equation*}$
for any integers ${\left\{{n}_{r}\right\}}_{r=1}^{11}\subset {\mathbb{N}}_{0}$ that are not all zero, non-negative scalars ${\left\{{t}_{r}\right\}}_{r=1}^{11}$ and positive scalars ${\left\{{\ell }_{r}\right\}}_{r=1}^{11}$ . Combining this with (18), we conclude that
$\begin{equation}\frac{{\sum }_{r=1}^{11}{n}_{G}({G}_{r})\left\langle {\psi }_{{G}_{r}}(\bar{\theta })\right\vert {T}_{(c,j,k)}\left\vert {\psi }_{{G}_{r}}(\bar{\theta })\right\rangle }{{\sum }_{r=1}^{11}{n}_{G}({G}_{r}){L}_{(c,j,k)}^{{G}_{r}}}\geqslant \frac{\left\langle {\psi }_{{G}_{1}}(\bar{\theta })\right\vert {T}_{(c,j,k)}\left\vert {\psi }_{{G}_{1}}(\bar{\theta })\right\rangle }{{L}_{(c,j,k)}^{{G}_{1}}}\geqslant 0.7443.\end{equation} \tag{ 20 }$
From (17), evaluated at $\theta =\bar{\theta }$ , and (20) we obtain
$\begin{equation*}\frac{\left\langle {\psi }_{G}(\bar{\theta })\right\vert \left({H}_{G}+{{\Delta}}_{G}^{\text{FKL}}\right)\left\vert {\psi }_{G}(\bar{\theta })\right\rangle }{\text{MC}(G)}\geqslant 0.7443\end{equation*}$
and the claim follows by taking the maximum over $\theta \in {\left[0,2\pi \right)}^{2}$ .
Let us briefly elaborate on the choice (19) of parameters $\bar{\theta }$ in this proof. By direct computation, we numerically observe that the quantity ${\mathrm{max}}_{\theta \in {\left[0,2\pi \right)}^{2}}\frac{\left\langle {\psi }_{{G}_{r}}(\theta )\right\vert {T}_{(c,j,k)}\left\vert {\psi }_{{G}_{r}}(\theta )\right\rangle }{{L}_{(c,j,k)}^{{G}_{r}}}$ is minimal for r = 1. The parameters $\bar{\theta }\in {\left[0,2\pi \right)}^{2}$ in equation (19) are the numerically obtained angles achieving the maximum for r = 1. We note that their only required feature in our argument is property (18). This can be verified immediately. A proof that these values $\bar{\theta }$ indeed correspond to some maximum is not required.
(b)–(f)
Let ψ_G(θ) for $\theta \in {\left[0,2\pi \right)}^{2p}$ be the QAOA_p wave function. We again consider the ratio (11). We can use the trivial upper bound MC(G) ⩽ |E| on the size of the maximum cut, giving
$\begin{equation}{\alpha }_{G}\left(\text{FKL}-{\text{QAOA}}_{p}^{+}\right)\geqslant \vert E{\vert }^{-1}\cdot \left\langle {\psi }_{G}(\theta )\right\vert \left({H}_{G}+{{\Delta}}_{G}^{\text{FKL}}\right)\left\vert {\psi }_{G}(\theta )\right\rangle \end{equation} \tag{ 21 }$
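for any choice of $\theta \in {\left[0,2\pi \right)}^{2p}$ . The assumptions on the girth can be expressed as g(G) > 2p + 2 for p = 2, 3, 4, 5, 6, where p is the level of QAOA. For such high-girth graphs, all relevant graph environments of an arbitrary triplet (c, j, k) in G are isomorphic to the p-tree ${T}^{(p)}(\tilde{G})$ , where $\tilde{G}$ denotes the triplet graph with edges {c, j} and {c, k}, see figure 6 in appendix A. Therefore, using (12) and the fact that a three-regular graph has $\left\vert {T}_{G}\right\vert =2\left\vert E\right\vert$ triplets, the bound (21) becomes
$\begin{equation}{\alpha }_{G}\left(\text{FKL}-{\text{QAOA}}_{p}^{+}\right)\geqslant 2\left\langle {\psi }_{{T}^{(p)}(\tilde{G})}(\theta )\right\vert {T}_{(c,j,k)}\left\vert {\psi }_{{T}^{(p)}(\tilde{G})}(\theta )\right\rangle \end{equation} \tag{ 22 }$
for any choice of $\theta \in {\left[0,2\pi \right)}^{2p}$ . We can evaluate the right-hand side of this inequality using a tensor network algorithm and use gradient descent to optimize the angles. In particular, in each of the cases (b)–(f) we found a set of angles θ such that the right-hand side of (22) is at least the value stated in the proposition. These angles are listed in figure 8. This completes the proof.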

For the sake of comparison, we also obtained the guaranteed approximation ratios of bare QAOA for p = 4, 5, and 6 for high-girth graphs. These were computed in a similar fashion as explained at the end of the proof of proposition 5.1:

The witness angles proving the lower bounds are listed in figure 8.

5.2. Approximation ratios of HLZ-QAOA⁺ for three-regular graphs

Proposition 5.2. Let G = (V, E) be a three-regular graph. Then

(a)
If G is triangle-free (i.e. g(G) ⩾ 4), then ${\alpha }_{G}\left(\text{HLZ}-{\text{QAOA}}_{1}^{+}\right)\geqslant 0.7548$.
(b)
If g(G) ⩾ 7, then ${\alpha }_{G}\left(\text{HLZ}-{\text{QAOA}}_{2}^{+}\right)\geqslant 0.7954$.
(c)
If g(G) ⩾ 9, then ${\alpha }_{G}\left(\text{HLZ}-{\text{QAOA}}_{3}^{+}\right)\geqslant 0.8191$.
(d)
If g(G) ⩾ 11, then ${\alpha }_{G}\left(\text{HLZ}-{\text{QAOA}}_{4}^{+}\right)\geqslant 0.8358$.
(e)
If g(G) ⩾ 13, then ${\alpha }_{G}\left(\text{HLZ}-{\text{QAOA}}_{5}^{+}\right)\geqslant 0.8482$.
(f)
If g(G) ⩾ 15, then ${\alpha }_{G}\left(\text{HLZ}-{\text{QAOA}}_{6}^{+}\right)\geqslant 0.8582$.

Proof.

(a)
Recall from lemma 4.2 that the expected approximation ratio obtained using the HLZ-post-processing procedure is given by
$\begin{equation}\frac{\left\langle {\psi }_{G}(\theta )\right\vert \left({H}_{G}+{{\Delta}}_{G}^{\text{HLZ}}\right)\left\vert {\psi }_{G}(\theta )\right\rangle }{\text{MC}(G)},\end{equation} \tag{ 24 }$
where we again use ψ_G(θ) for the QAOA₁ state. We rearrange the numerator of (24) and express it as a sum over three-star subgraphs, since these are the underlying graphs of the local terms of the improvement operator ${{\Delta}}_{G}^{\text{HLZ}}$. The three-star graph with central vertex c has vertices {c, j, k, ℓ} and edges {{c, j}, {c, k}, {c, ℓ}}. Since the graph G is three-regular, any edge {a, b} ∈ E lies in exactly two stars, namely those with central vertices a and b. Hence
$\begin{equation}{H}_{G}+{{\Delta}}_{G}^{\text{HLZ}}=\sum\limits _{c\in V}{S}_{c},\end{equation} \tag{ 25 }$
where S_c is the three-star operator
$\begin{equation*}{S}_{c}{:=}\frac{{H}^{c,j}+{H}^{c,k}+{H}^{c,\ell }}{2}+\frac{2}{5}{{\Pi}}_{c}^{(2)}+\frac{17}{15}{{\Pi}}_{c}^{(3)}\quad \text{for}\enspace c\in V,\end{equation*}$
(j, k, ℓ) is the ordered neighbourhood of c in G, and $H^{a,b}$ is again the MaxCut term on edge {a, b}. Inserting the trivial upper bound MC(G) ⩽ |E| and (25) into (24) gives:
$\begin{equation}\frac{\left\langle {\psi }_{G}(\theta )\right\vert \left({H}_{G}+{{\Delta}}_{G}^{\text{HLZ}}\right)\left\vert {\psi }_{G}(\theta )\right\rangle }{\text{MC}(G)}\geqslant \frac{{\sum }_{c\in V}\left\langle {\psi }_{G}(\theta )\right\vert {S}_{c}\left\vert {\psi }_{G}(\theta )\right\rangle }{\vert E\vert }.\end{equation} \tag{ 26 }$
We can restate (26) as a sum of local expectation values over the graph environments from the set $\{{G}_{r}\}_{r=1}^{8}$ (listed in figure 5 in appendix A):
$\begin{equation}\frac{\left\langle {\psi }_{G}(\theta )\right\vert \left({H}_{G}+{{\Delta}}_{G}^{\text{HLZ}}\right)\left\vert {\psi }_{G}(\theta )\right\rangle }{\text{MC}(G)}\geqslant \frac{{\sum }_{r=1}^{8}{n}_{G}({G}_{r})\left\langle {\psi }_{{G}_{r}}(\theta )\right\vert {S}_{c}\left\vert {\psi }_{{G}_{r}}(\theta )\right\rangle }{\vert E\vert },\end{equation} \tag{ 27 }$
where n_G(G_r) is the number of times the environment G_r appears in the graph G. Suppose now that we have found some angles $\bar{\theta }\in {\left[0,2\pi \right)}^{2}$ such that
$\begin{equation}\left\langle {\psi }_{{G}_{s}}(\bar{\theta })\right\vert {S}_{c}\left\vert {\psi }_{{G}_{s}}(\bar{\theta })\right\rangle \geqslant \left\langle {\psi }_{{G}_{1}}(\bar{\theta })\right\vert {S}_{c}\left\vert {\psi }_{{G}_{1}}(\bar{\theta })\right\rangle \quad \text{for}\,\text{all}\;s=2,\dots ,8.\end{equation} \tag{ 28 }$
An example of such a pair is
$\begin{equation*}\bar{\theta }=(\bar{\beta },\bar{\gamma })=(0.102\,870,5.669\,319)\end{equation*}$
as can be verified by straightforward computation. We combine (27) with (28) and use the fact that ${\sum }_{r=1}^{8}{n}_{G}({G}_{r})=\vert V\vert =\frac{2}{3}\vert E\vert$ for three-regular graphs:
$\begin{equation*}\frac{\left\langle {\psi }_{G}(\bar{\theta })\right\vert \left({H}_{G}+{{\Delta}}_{G}^{\text{HLZ}}\right)\left\vert {\psi }_{G}(\bar{\theta })\right\rangle }{\text{MC}(G)}\geqslant \frac{2}{3}\left\langle {\psi }_{{G}_{1}}(\bar{\theta })\right\vert {S}_{c}\left\vert {\psi }_{{G}_{1}}(\bar{\theta })\right\rangle \geqslant 0.7548\end{equation*}$
and the claim follows.
(b)–(f)
We follow a similar line of reasoning as in (a) and in the proof of proposition 5.1(b)–(f). The assumptions again guarantee that the considered graphs have girth greater than 2p + 2, with p being the level of QAOA. For such high-girth graphs, all graph environments of an arbitrary star in G are isomorphic to a single tree $\tilde{G}$. Therefore,
$\begin{equation*}{\alpha }_{G}\left(\text{HLZ}-{\text{QAOA}}_{p}^{+}\right)\geqslant \frac{2}{3}\left\langle {\psi }_{\tilde{G}}(\theta )\right\vert {S}_{c}\left\vert {\psi }_{\tilde{G}}(\theta )\right\rangle ,\end{equation*}$
where ${\psi }_{\tilde{G}}(\theta )$ for $\theta \in {\left[0,2\pi \right)}^{2p}$ is the QAOA_p wave function. We obtain witness angles by numerical optimization (listed in figure 8) and the claim follows.
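The combinatorial facts used in this proof can be checked mechanically for a concrete graph. The following is a minimal sketch, not the authors' code: it verifies on the Petersen graph (chosen here only as an illustrative three-regular example) that every edge lies in exactly two stars and that |V| = 2|E|/3, and it computes the girth to see which levels p satisfy the hypothesis g(G) > 2p + 2. The helper names `girth`, `star_edge_counts`, `covered_levels` and `petersen_graph` are hypothetical.

```python
from collections import deque


def girth(adj):
    """Shortest cycle length of a simple undirected graph (inf if acyclic)."""
    best = float("inf")
    for root in adj:
        dist, parent = {root: 0}, {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v], parent[v] = dist[u] + 1, u
                    queue.append(v)
                elif parent[u] != v:
                    # A non-tree edge closes a walk of this length via the root.
                    best = min(best, dist[u] + dist[v] + 1)
    return best


def star_edge_counts(adj):
    """For every edge, count the stars (one per central vertex) containing it."""
    counts = {}
    for c, neighbours in adj.items():
        for j in neighbours:
            edge = frozenset((c, j))
            counts[edge] = counts.get(edge, 0) + 1
    return counts


def covered_levels(g):
    """Levels p in {2, ..., 6} for which the girth hypothesis g > 2p + 2 holds."""
    return [p for p in range(2, 7) if g > 2 * p + 2]


def petersen_graph():
    """Adjacency lists of the Petersen graph (illustrative example only)."""
    adj = {v: [] for v in range(10)}
    for i in range(5):
        for a, b in ((i, (i + 1) % 5), (i, i + 5), (i + 5, (i + 2) % 5 + 5)):
            adj[a].append(b)
            adj[b].append(a)
    return adj


adj = petersen_graph()
counts = star_edge_counts(adj)
num_vertices, num_edges = len(adj), len(counts)

assert all(len(nbrs) == 3 for nbrs in adj.values())  # three-regular
assert all(c == 2 for c in counts.values())          # each edge lies in two stars
assert 3 * num_vertices == 2 * num_edges             # |V| = 2|E|/3
print("girth:", girth(adj), "levels covered:", covered_levels(girth(adj)))
```

Since the Petersen graph is triangle-free but has girth below 7, only part (a) of the proposition applies to it; parts (b)–(f) require girth at least 2p + 3, that is 7, 9, 11, 13 and 15.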

We note that the lower bound of proposition 5.2(a) on the approximation ratio ${\alpha }_{G}(\text{HLZ}-{\text{QAOA}}_{1}^{+})$ of the twisted algorithm QAOA₁ is below the value 0.7555 resulting from the application of HLZ to a constant partition (see (3)). An improvement over this trivial (classical) algorithm can only be observed for levels p ⩾ 2 (cf proposition 5.2(b)–(f)). This is not surprising given that the QAOA-ansatz is very restricted, especially for small values of p. In particular, for any angles (β, γ), the QAOA-state ψ_G(β, γ) (cf (6)) with the usual cost function Hamiltonian H_G for MaxCut is different from both the all-zero state ${\left\vert 0\right\rangle }^{\otimes n}$ and the all-one state ${\left\vert 1\right\rangle }^{\otimes n}$. This is the case for any level p because of the ${\mathbb{Z}}_{2}$-symmetry of the ansatz: every state ψ_G(β, γ) is an eigenstate of the operator X^⊗n.
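The symmetry argument can be illustrated numerically on a small instance. The sketch below assumes the common convention ψ_G(β, γ) = e^{−iβB} e^{−iγH_G} |+⟩^{⊗n} with B the sum of all single-qubit X operators and H_G counting cut edges; the convention of (6) may differ in details, and the function name `qaoa1_state` as well as the choice of the complete graph on four vertices are illustrative assumptions. It checks that the state is invariant under X^⊗n, so that the all-zero and all-one bit strings carry equal weight and neither can exceed probability 1/2.

```python
import cmath
import math


def qaoa1_state(n, edges, beta, gamma):
    """Amplitudes of exp(-i*beta*B) exp(-i*gamma*H_G) |+>^n (assumed convention)."""
    dim = 1 << n
    norm = 1 / math.sqrt(dim)
    state = []
    for z in range(dim):
        # H_G is diagonal: its eigenvalue on |z> is the number of cut edges.
        cut = sum(((z >> a) & 1) != ((z >> b) & 1) for a, b in edges)
        state.append(norm * cmath.exp(-1j * gamma * cut))
    # exp(-i*beta*X) = cos(beta) I - i sin(beta) X, applied to each qubit in turn.
    c, s = math.cos(beta), math.sin(beta)
    for q in range(n):
        bit = 1 << q
        state = [c * state[z] - 1j * s * state[z ^ bit] for z in range(dim)]
    return state


n = 4
k4_edges = [(a, b) for a in range(n) for b in range(a + 1, n)]  # three-regular
psi = qaoa1_state(n, k4_edges, beta=0.102870, gamma=5.669319)

all_ones = (1 << n) - 1
flipped = [psi[z ^ all_ones] for z in range(1 << n)]  # action of X on every qubit

# X^{(x)n} psi = psi: the state is a (+1)-eigenstate of the global bit flip.
assert all(abs(x - y) < 1e-12 for x, y in zip(flipped, psi))

weight_zero = abs(psi[0]) ** 2
weight_one = abs(psi[all_ones]) ** 2
assert abs(weight_zero - weight_one) < 1e-12  # equal weight on 0...0 and 1...1
assert weight_zero <= 0.5 + 1e-12             # hence neither state is reached
print("weight on all-zero string:", weight_zero)
```

The same check passes for any angles and any graph, because both H_G and B commute with X^⊗n and the initial product state is itself invariant under the global bit flip.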

Acknowledgments

We thank Zahra Baghali Khanian for suggesting the name 'twisted QAOA'. AK and RK acknowledge support by the Army Research Office under Grant No. W911NF-20-1-0014. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. RK and LC gratefully acknowledge support by the European Research Council under Grant Agreement No. 101001976 (project EQUIPTNT). LC thanks IBM Zurich for their hospitality.