An expressive ansatz for low-depth quantum approximate optimisation

The quantum approximate optimisation algorithm (QAOA) is a hybrid quantum–classical algorithm used to approximately solve combinatorial optimisation problems. It involves multiple iterations of a parameterised ansatz that consists of a problem and mixer Hamiltonian, with the parameters being classically optimised. While QAOA can be implemented on near-term quantum hardware, physical limitations such as gate noise, restricted qubit connectivity, and state-preparation-and-measurement (SPAM) errors can limit circuit depth and decrease performance. To address these limitations, this work introduces the eXpressive QAOA (XQAOA), an overparameterised variant of QAOA that assigns more classical parameters to the ansatz to improve its performance at low depths. XQAOA also introduces an additional Pauli-Y component in the mixer Hamiltonian, allowing the mixer to implement arbitrary unitary transformations on each qubit. To benchmark the performance of XQAOA at unit depth, we derive its closed-form expression for the MaxCut problem and compare it to QAOA, Multi-Angle QAOA (MA-QAOA) (Herrman et al 2022 Sci. Rep. 12 6781), a classical-relaxed algorithm, and the state-of-the-art Goemans–Williamson algorithm on a set of unweighted regular graphs with 128 and 256 nodes for degrees ranging from 3 to 10. Our results indicate that at unit depth, XQAOA has benign loss landscapes with local minima concentrated near the global optimum, allowing it to consistently outperform QAOA, MA-QAOA, and the classical-relaxed algorithm on all graph instances and the Goemans–Williamson algorithm on graph instances with degrees greater than 4. Small-scale simulations also reveal that unit-depth XQAOA invariably surpasses both QAOA and MA-QAOA on all tested depths up to five. 
Additionally, we find an infinite family of graphs for which XQAOA solves MaxCut exactly and analytically show that for some graphs in this family, special cases of XQAOA are capable of achieving a much larger approximation ratio than QAOA. Overall, XQAOA is a more viable choice for variational quantum optimisation on near-term quantum devices, offering competitive performance at low depths.


I. INTRODUCTION
Full-fledged fault-tolerant quantum computers capable of executing quantum algorithms that can solve problems of interest are expected to involve at least millions of physical qubits, high-fidelity gate operations, and quantum error correction techniques [1]. While the physical realisation of such devices is still a long way off, noisy intermediate-scale quantum (NISQ) devices capable of running quantum algorithms with limited circuit depth are becoming more widely available [2,3]. Particularly promising are the variational quantum algorithms (VQAs) [4–8] capable of potentially realising a quantum advantage on NISQ devices. Unlike traditional quantum algorithms like Shor's algorithm [9] that use specially designed quantum circuits to solve specific problems, VQAs use parameterised quantum circuits whose objective is to drive a quantum state close to the desired state that minimises a cost function by varying the gate parameters.
The Quantum Approximate Optimisation Algorithm (QAOA) [8] is one such algorithm that can solve optimisation problems by encoding their solutions into the ground state of a quantum Hamiltonian and preparing a quantum state that approximates this ground state. QAOA involves a p-level quantum circuit described by a collection of 2p classical parameters to generate a quantum state. The classical parameters are fine-tuned to optimise the expectation of the cost for the generated quantum state. This quantum state can then be measured to obtain an approximate solution to the optimisation problem. Besides its ability to solve combinatorial optimisation problems, QAOA can be used to perform universal quantum computation [10,11]. Moreover, even at its lowest level p = 1, QAOA can efficiently generate probability distributions that likely cannot be generated efficiently by classical computers [12,13].
Several approaches have been proposed to improve the performance of low-depth QAOA by adding new parameters to the ansatz [23, 57–61]. These approaches include Multi-Angle QAOA (MA-QAOA) [57], which increases the number of classical parameters added in each layer for more precise control of the optimisation process; Free-Axis Mixer QAOA (FAM-QAOA) [58], which includes additional variational parameters in the mixer Hamiltonian that allow for rotation about an axis in the XY plane; QAOA with Adaptive Bias Fields (AB-QAOA) [59], which adds a Pauli-Z component to the mixer Hamiltonian; Adaptive Derivative Assembled Problem Tailored QAOA (ADAPT-QAOA) [60], which grows the ansatz iteratively using a gradient criterion; and QAOA+ [23], which augments the traditional QAOA ansatz with an additional multi-parameter layer that is independent of the specific problem being solved. Despite these improvements, there remains an imperative for problem-inspired quantum ansatzes with minimal computational overhead, which are not only expressive but also readily trainable, allowing for greater flexibility in the optimisation process.
This paper presents a modified version of the QAOA called eXpressive QAOA (XQAOA). It shares the same inspiration behind the recently proposed Multi-Angle QAOA (MA-QAOA) approach [57] but goes beyond it by including an additional Pauli-Y component in the mixing Hamiltonian. This modification strategically overparameterises the quantum ansatz, facilitating the exploration of all relevant directions of the Hilbert space by allowing the mixer to effectively implement arbitrary unitary operations on each qubit with just a single iteration. As a result, XQAOA does not suffer from reachability deficits [44,62]; with appropriately chosen angles, XQAOA can output any computational-basis state. To quantify the performance of the quantum algorithm, we apply it to the problem of maximum cut (MaxCut) on arbitrary graphs. We derive closed-form expressions for XQAOA, MA-QAOA, and QAOA at p = 1 for the MaxCut problem and benchmark their performance against a naive Classical-Relaxed (CR) algorithm and the state-of-the-art Goemans-Williamson (GW) [63] algorithm on unweighted D-regular graphs (graphs where every node is connected to D other nodes) with 128 and 256 nodes for 3 ≤ D ≤ 10. The benchmark reveals that at p = 1, XQAOA outperforms MA-QAOA, QAOA, and the CR algorithm on all graph instances and the GW algorithm on graphs with D > 4; interestingly, the CR algorithm also outperforms QAOA and MA-QAOA on all graphs, with QAOA matching MA-QAOA's performance for graphs with D > 5.
We find that the exceptional performance of the XQAOA ansatz is attributable to the favourable characteristics of its benign loss landscape, which is notably free of barren plateaus and spurious local minima, with any remaining local minima being concentrated around the global optimum. Lastly, we show that for unweighted triangle-free graphs with edges of odd degrees, XQAOA can solve MaxCut exactly. Here, the edge degree d(e) of an edge e = {u, v} is defined as the number of neighbours of e, i.e., $d(e) = |N(u) \cup N(v)| - 2$, where N(w) is the set of all nodes connected to the node w.
The structure of the remainder of this paper is as follows: in section II, we review the necessary background material, where we explain the MaxCut problem and the challenges in finding its optimal solution (section II A), the traditional QAOA ansatz and its application to the MaxCut problem (section II B), and the MA-QAOA ansatz and its extension to MaxCut on arbitrary graphs (section II C). In section III, we introduce XQAOA and discuss its variants and other notable properties. In section IV, we present the results of our numerical simulations. In section V, we interpret and discuss our results, and in section VI, we provide some concluding remarks.

A. Maximum Cut (MaxCut)
Many real-world problems can be phrased as combinatorial optimisation problems [64]. Here, we lay emphasis on XQAOA's application to an archetypal problem known as MaxCut, which has numerous applications in computer science and operations research, including statistical physics and circuit layout design [65], analysis of social networks [66], data clustering [67], semi-supervised learning [68], and more [69,70]. The (weighted) MaxCut problem is an optimisation problem in which we are given an undirected weighted graph and asked to partition its vertices into two disjoint sets $S$ and $\bar{S}$ such that the sum of the weights of the edges between the two sets is as large as possible.
Formally, given an undirected graph G = (V, E) and non-negative weights $w_{uv} = w_{vu}$ on the edges {u, v} ∈ E, the MaxCut problem is that of finding a set S of vertices that maximises the weight of the edges in the cut $(S, \bar{S})$; that is, the weight of the edges with one endpoint in $S$ and the other in $\bar{S}$. The MaxCut problem can be formulated as the following binary quadratic program:
$$\text{maximise} \quad \frac{1}{2} \sum_{\{u,v\} \in E} w_{uv} (1 - x_u x_v) \quad \text{subject to} \quad x_u \in \{-1, 1\} \ \text{for all } u \in V. \tag{1}$$
The optimisation problem given by eq. (1) is NP-hard¹, which suggests that it is highly plausible that no efficient algorithm exists that can solve it. However, there are approximation algorithms that can find good solutions in polynomial time for many instances of the problem. The GW algorithm holds the current record for an approximation ratio guarantee on generic graphs, achieving an approximation ratio of $r^* \approx 0.87856$ using semidefinite programming [63]. When confined to unweighted 3-regular graphs, this lower bound can be increased to $r^* \approx 0.9326$ [72]². Assuming the unique games conjecture [74]³ and that P ≠ NP, the GW ratio is the best possible approximation ratio for MaxCut on generic graphs [75–77] that polynomial-time classical algorithms can achieve. Additionally, it has been proven that it is NP-hard to approximate the MaxCut value with an approximation ratio better than 16/17 ≈ 0.94117 [78,79].
¹ Historically, the NP-hardness of MaxCut was one of the earliest results known in computational complexity theory: the decision version of the MaxCut problem was one of Karp's first NP-complete problems [71]. Here, a decision problem is a problem in which a yes-or-no answer is sought. A decision version of the MaxCut problem may be phrased as follows: given a graph G and an integer j, determine if G has a cut whose size is at least j.
² [73].
³ The unique games conjecture asserts that the problem of estimating the approximate value of a certain type of game, known as a unique game, has an NP-hard computational complexity.
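To make the MaxCut objective and the notion of an approximation ratio concrete, the following is a minimal brute-force sketch in Python. It is purely illustrative: the function names and the `(u, v, weight)` edge format are assumptions made for this example, and exhaustive search is only feasible for very small graphs because the problem is NP-hard.

```python
from itertools import product


def cut_weight(edges, assignment):
    """Sum the weights of edges whose endpoints lie on opposite sides."""
    return sum(w for (u, v, w) in edges if assignment[u] != assignment[v])


def brute_force_maxcut(n, edges):
    """Try all 2^n bipartitions; exponential cost, so only for tiny graphs."""
    best_value, best_assignment = -1.0, None
    for assignment in product((0, 1), repeat=n):
        value = cut_weight(edges, assignment)
        if value > best_value:
            best_value, best_assignment = value, assignment
    return best_value, best_assignment


def approximation_ratio(edges, assignment, optimum):
    """Ratio of the cut achieved by `assignment` to the optimal cut value."""
    return cut_weight(edges, assignment) / optimum


# Illustrative inputs: edges are (u, v, weight) triples on vertices 0..n-1.
star = [(0, leaf, 1.0) for leaf in range(1, 5)]      # 5-vertex star graph
triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]   # 3-cycle

star_opt, star_cut = brute_force_maxcut(5, star)
triangle_opt, _ = brute_force_maxcut(3, triangle)
```

For the star graph, placing the centre on one side and all leaves on the other cuts every edge, whereas a triangle can never have all three edges cut.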
Combinatorial optimisation problems can be formulated using n bits and m clauses, where each clause represents a constraint on a subset of the bits that is satisfied for certain combinations of values for those bits but not for others. We consider the case when each clause µ is associated with a cost $c_\mu \in \mathbb{R}$. The objective function defined on n-bit strings is then given by the sum of the costs of the satisfied clauses:
$$C(z) = \sum_{\mu=1}^{m} C_\mu(z), \tag{2}$$
where z is an n-bit string and $C_\mu(z) = c_\mu$ if z satisfies the clause µ and 0 otherwise. An approximate optimisation algorithm aims to find a string z that achieves a desired approximation ratio $r^*$, i.e., it seeks a string z that satisfies
$$\frac{C(z)}{C_{\max}} \geq r^*, \tag{3}$$
where $C_{\max} = \max_z C(z)$.
The QAOA algorithm consists of two operators (see fig. 1): the problem unitary and the mixing unitary, which are generated by the problem Hamiltonian and mixing Hamiltonian, respectively. The problem unitary is defined as the following unitary operator $U(C, \gamma)$, which depends on a real-valued angle γ ∈ R:
$$U(C, \gamma) = e^{-i\gamma C} = \prod_{\mu=1}^{m} e^{-i\gamma C_\mu}. \tag{4}$$
The operators $C = \sum_z C(z) |z\rangle\langle z|$ and $C_\mu = \sum_z C_\mu(z) |z\rangle\langle z|$ are the diagonal operators whose entries are the objective function values. Next, the mixing unitary is defined as the β-dependent product of commuting one-qubit unitaries
$$U(B, \beta) = e^{-i\beta B} = \prod_{j=1}^{n} e^{-i\beta X_j}, \tag{5}$$
where β ∈ [0, π) and B is the sum of all single-qubit Pauli-X operators
$$B = \sum_{j=1}^{n} X_j. \tag{6}$$
For any positive integer p ≥ 1, the QAOA algorithm generates an angle-dependent quantum state using 2p angles, $\gamma = (\gamma_1, \ldots, \gamma_p)$ and $\beta = (\beta_1, \ldots, \beta_p)$, where the subscripts of γ and β indicate the iterate number of the quantum ansatz. The quantum state has the form
$$|\gamma, \beta\rangle = U(B, \beta_p) U(C, \gamma_p) \cdots U(B, \beta_1) U(C, \gamma_1) |s\rangle, \tag{7}$$
where |s⟩ denotes the uniform superposition over all n-bit strings
$$|s\rangle = \frac{1}{\sqrt{2^n}} \sum_{z \in \{0,1\}^n} |z\rangle. \tag{8}$$
We then compute the expectation value of C for the variational state described in eq. (7),
$$\langle C \rangle = \langle \gamma, \beta | C | \gamma, \beta \rangle, \tag{9}$$
which is accomplished by repeated measurements of fresh copies of the quantum system in the computational basis.
The optimal parameters $(\gamma^*, \beta^*)$ that maximise the expectation value ⟨C⟩ are found using a classical computer:
$$(\gamma^*, \beta^*) = \arg\max_{\gamma, \beta} \langle C \rangle. \tag{10}$$
Typically, this is performed by choosing initial estimates of the parameters and then optimising them using simplex or gradient-based techniques. The approximation ratio $r^*$ is a relevant metric for assessing the performance of QAOA, where
$$r^* = \frac{\langle \gamma^*, \beta^* | C | \gamma^*, \beta^* \rangle}{C_{\max}}. \tag{11}$$
We will focus on applying QAOA to the MaxCut problem for the rest of this paper. To this end, note that the optimisation problem in eq. (1) is equivalent to finding the maximum eigenvalue of the problem Hamiltonian C for MaxCut:
$$C = \frac{1}{2} \sum_{\{u,v\} \in E} w_{uv} (1 - Z_u Z_v), \tag{12}$$
where $Z_i$ denotes the Pauli-Z matrix acting on the i-th qubit.
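The p = 1 construction described above (a cost-dependent phase applied to the uniform superposition, followed by a single-qubit X rotation on every qubit) can be sketched as a small state-vector simulation. This is a minimal pure-Python illustration and not the simulation code used in this work; the function names, the `(u, v, weight)` edge format, and the convention that bit i of the integer z labels vertex i are assumptions of this example. It uses the sign conventions $e^{-i\gamma C}$ and $e^{-i\beta X}$.

```python
import cmath
import math


def maxcut_cost(z, edges):
    """C(z): total weight of edges cut by bit string z (bit i labels vertex i)."""
    return sum(w for (u, v, w) in edges if ((z >> u) ^ (z >> v)) & 1)


def apply_x_rotation(state, qubit, beta):
    """Apply exp(-i * beta * X) to one qubit of a state vector."""
    c, s = math.cos(beta), math.sin(beta)
    mask = 1 << qubit
    new_state = list(state)
    for z in range(len(state)):
        if not z & mask:
            a0, a1 = state[z], state[z | mask]
            new_state[z] = c * a0 - 1j * s * a1
            new_state[z | mask] = -1j * s * a0 + c * a1
    return new_state


def qaoa1_expectation(n, edges, gamma, beta):
    """<C> for the p = 1 state U(B, beta) U(C, gamma) |s>."""
    dim = 1 << n
    costs = [maxcut_cost(z, edges) for z in range(dim)]
    amp = 1 / math.sqrt(dim)
    # The problem unitary is diagonal: each basis state picks up exp(-i*gamma*C(z)).
    state = [amp * cmath.exp(-1j * gamma * c) for c in costs]
    # The mixing unitary is a product of single-qubit X rotations.
    for q in range(n):
        state = apply_x_rotation(state, q, beta)
    return sum(abs(a) ** 2 * c for a, c in zip(state, costs))


single_edge = [(0, 1, 1.0)]
value = qaoa1_expectation(2, single_edge, math.pi / 2, math.pi / 8)
```

For a single edge, the choice γ = π/2, β = π/8 cuts the edge with certainty, while γ = 0 leaves the uniform superposition unchanged and each edge is cut with probability 1/2.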
Before proceeding with the rest of the section, let us make a few definitions that will be used throughout the paper. For w ∈ V, let N(w) = {x ∈ V : {x, w} ∈ E} be the set of neighbours of w, i.e. vertices which are adjacent to w. Then, for an edge {u, v} ∈ E, we have that
• e = N(v)\{u} is the set of vertices other than u that are connected to v.
• d = N(u)\{v} is the set of vertices other than v that are connected to u.
• F = N(u) ∩ N(v) is the set of vertices that form a triangle with the edge {u, v}. In other words, F is the set of vertices that are neighbours of both u and v.
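The three sets defined above can be computed directly from an edge list. The following short Python sketch is illustrative only; the helper names and the example graph are assumptions of this example.

```python
def neighbours(edges, w):
    """N(w): the set of vertices adjacent to w in an undirected edge list."""
    return {b if a == w else a for (a, b) in edges if w in (a, b)}


def edge_sets(edges, u, v):
    """Return the sets e, d and F associated with the edge {u, v}."""
    n_u, n_v = neighbours(edges, u), neighbours(edges, v)
    return {
        "e": n_v - {u},   # vertices other than u connected to v
        "d": n_u - {v},   # vertices other than v connected to u
        "F": n_u & n_v,   # vertices forming a triangle with {u, v}
    }


# Illustrative graph: a triangle 0-1-2 with a pendant vertex 3 attached to 1.
example_edges = [(0, 1), (0, 2), (1, 2), (1, 3)]
sets_01 = edge_sets(example_edges, 0, 1)
```

For the edge {0, 1} in this example, vertex 2 is the only common neighbour, so it is the single element of F and also appears in both e and d.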
The following theorem (theorem 1, eq. (13)) can be used to compute the expectation value of the cost function for QAOA at p = 1 (QAOA$_1$) for MaxCut on arbitrary weighted graphs, thereby allowing us to assess the performance of QAOA$_1$. In its statement, the weight-scaled angles are $\gamma'_{ij} = \gamma w_{ij}$.
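In appendix D 3, we give a proof of theorem 1, which we show follows as a straightforward corollary of our main theorem (theorem 3). By taking $w_{ij} = 1$ if {i, j} ∈ E and 0 otherwise, eq. (13) simplifies for unweighted graphs to:
$$\langle C_{uv} \rangle = \frac{1}{2} + \frac{1}{4} \sin 4\beta \, \sin\gamma \left( \cos^{|d|}\gamma + \cos^{|e|}\gamma \right) - \frac{1}{4} \sin^2 2\beta \, \cos^{|d| + |e| - 2|F|}\gamma \left( 1 - \cos^{|F|} 2\gamma \right), \tag{14}$$
where |d|, |e|, and |F| denote the sizes of the sets defined above. This expression has previously appeared as eq. (14) of [80]⁴.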
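As an illustration, the unweighted p = 1 expression known from [80], $\langle C_{uv} \rangle = \frac{1}{2} + \frac{1}{4}\sin 4\beta \sin\gamma\,(\cos^{|d|}\gamma + \cos^{|e|}\gamma) - \frac{1}{4}\sin^2 2\beta\,\cos^{|d|+|e|-2|F|}\gamma\,(1 - \cos^{|F|} 2\gamma)$, can be evaluated edge by edge without any quantum simulation. The sketch below assumes the conventions $e^{-i\gamma C}$ and $e^{-i\beta B}$ for the two unitaries; the function names are hypothetical.

```python
import math


def neighbours(edges, w):
    """N(w): the set of vertices adjacent to w."""
    return {b if a == w else a for (a, b) in edges if w in (a, b)}


def qaoa1_edge_expectation(edges, u, v, gamma, beta):
    """Closed-form <C_uv> at p = 1 for an unweighted graph."""
    n_u, n_v = neighbours(edges, u), neighbours(edges, v)
    d = len(n_u - {v})        # |d|: neighbours of u other than v
    e = len(n_v - {u})        # |e|: neighbours of v other than u
    f = len(n_u & n_v)        # |F|: triangles containing the edge
    cg = math.cos(gamma)
    term1 = 0.25 * math.sin(4 * beta) * math.sin(gamma) * (cg ** d + cg ** e)
    term2 = (0.25 * math.sin(2 * beta) ** 2 * cg ** (d + e - 2 * f)
             * (1 - math.cos(2 * gamma) ** f))
    return 0.5 + term1 - term2


def qaoa1_expectation_closed_form(edges, gamma, beta):
    """Sum the per-edge expectation values over all edges of the graph."""
    return sum(qaoa1_edge_expectation(edges, u, v, gamma, beta)
               for (u, v) in edges)


star = [(0, leaf) for leaf in range(1, 5)]   # 5-vertex star graph
star_value = qaoa1_expectation_closed_form(star, math.pi / 2, math.pi / 8)
```

At γ = π/2 and β = π/8, each edge of the 5-vertex star contributes 3/4, which is consistent with the 0.75 approximation ratio discussed later for star graphs.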
From theorem 1, we see that at p = 1, the expectation value $\langle C_{uv} \rangle$ of any edge in a graph depends on only the nodes and edges adjacent to it. The overall expectation value for QAOA$_1$ can then be calculated by summing the expectation values over all edges in the graph. For an n-node graph, the right-hand side of eq. (13) can be computed in linear time $O(n)$. Since the total number of edges in any graph is at most $\binom{n}{2} = O(n^2)$, computing the expectation value of QAOA$_1$ would take at most $O(n^3)$ time. However, to find an actual bit string that represents an approximate solution for an arbitrary graph, here we use the QAOA quantum circuit to generate a quantum state on which measurement is performed.
The Multi-Angle QAOA (MA-QAOA) [57] varies from the original QAOA in that it allows each summand of the problem and mixing Hamiltonians to have its own angle, as opposed to these Hamiltonians having a single angle each⁵. In this modification for p = 1 (called MA-QAOA$_1$), the problem and mixing unitaries are defined as
$$U(C, \gamma) = \prod_{\mu=1}^{m} e^{-i\gamma_\mu C_\mu} \tag{15}$$
and
$$U(B, \beta) = \prod_{\nu=1}^{n} e^{-i\beta_\nu B_\nu}, \tag{16}$$
respectively, where $C = (C_\mu)_{\mu=1,\ldots,m}$ and $B = (B_\nu)_{\nu=1,\ldots,n}$ denote collections of operators, with $B_\nu = X_\nu$ the summands of the mixing Hamiltonian in eq. (6). Thus, MA-QAOA$_1$ generates an angle-dependent quantum state of the form
$$|\gamma, \beta\rangle = U(B, \beta) U(C, \gamma) |s\rangle, \tag{17}$$
where $\gamma = [\gamma_1, \gamma_2, \ldots, \gamma_m]$ and $\beta = [\beta_1, \beta_2, \ldots, \beta_n]$. The subscript in $\gamma_\mu$ refers to the µ-th clause, and the subscript in $\beta_\nu$ refers to the ν-th qubit. In the context of MaxCut, µ and ν index the edges and vertices, respectively, of the graph involved. The approximation ratio obtained using QAOA lower bounds that of MA-QAOA, and MA-QAOA's guarantee of convergence to the exact solution as p → ∞ follows immediately from [8, eq. (10)] and from noting that MA-QAOA is a generalisation of QAOA. Herrman et al. [57] provide an analytical formula for computing the performance of MA-QAOA$_1$ on MaxCut for unweighted triangle-free graphs. We generalise their result with the following theorem, where we present an analytical formula for the expectation value of the cost function for MA-QAOA$_1$ for MaxCut on arbitrary weighted graphs, allowing for the assessment of MA-QAOA$_1$'s performance on general graphs.
In the statement of theorem 2 (eq. (18)), the weight-scaled angles are $\gamma'_{jk} = \gamma_{jk} w_{jk}$. We present, in appendix D 3, a proof of theorem 2, which again is a straightforward corollary of our main theorem (theorem 3). Like eq. (13), the expectation value $\langle C_{uv} \rangle_{\mathrm{MA}}$ for any edge in a graph depends on only its neighbouring nodes and edges, and the overall expectation value is the sum of the expectation values over all edges in the graph; hence, computing eq. (18) for an arbitrary graph has a time complexity of $O(n^3)$. However, this time complexity has a larger constant prefactor compared to that of computing eq. (13). While QAOA$_1$ involves only two hyperparameters regardless of the size of the problem, MA-QAOA$_1$ involves one angle per clause and one angle per qubit, i.e. n + m hyperparameters.
The eXpressive QAOA builds on MA-QAOA by introducing an additional α-dependent unitary operator to the mixing Hamiltonian. Let us define the α-dependent operator to be the following product of commuting one-qubit operators:
$$U(A, \alpha) = \prod_{i=1}^{n} e^{-i\alpha_i A_i}, \tag{19}$$
where $A = (A_i)_{i=1,\ldots,n}$ is the collection of single-qubit Pauli-Y operators $A_i = Y_i$, and $\alpha_i \in [0, \pi)$⁶. The mixing unitary is then given by the product of the $U(B, \beta)$ and $U(A, \alpha)$ unitary operators:
$$U(A, \alpha) \, U(B, \beta). \tag{20}$$
Thus at p = 1, XQAOA generates an angle-dependent quantum state of the form
$$|\gamma, \beta, \alpha\rangle = U(A, \alpha) U(B, \beta) U(C, \gamma) |s\rangle. \tag{21}$$
Similarly to eq. (17), the subscript in $\gamma_i$ denotes the i-th clause, and the subscripts in $\alpha_i$ and $\beta_i$ refer to the i-th qubit, which in the context of MaxCut correspond to the edges and vertices, respectively, of the graph.
⁶ Due to the α → α + π and β → β + π translational symmetries of the QAOA output state, one could without loss of generality assume that α and β lie in the interval [0, π). For the purposes of our simulations though, we do not place such an explicit restriction, since α and β repeat in intervals of π (in addition, for unweighted graphs, γ repeats in intervals of 2π) anyway. The data in fig. 5 were adjusted to fit the ranges mentioned in this paper.
One motivation for introducing the XQAOA is that, unlike QAOA and MA-QAOA, the XY mixer⁷ in eq. (20) is the most general product (with respect to the n registers in the circuit) unitary operator one could write for p = 1 XQAOA, up to an unphysical global phase incurred when the system is measured immediately after the mixer unitary is applied. This makes XQAOA a natural generalisation of QAOA to consider, as one aims to maximise the expressiveness of the ansatz given the limitations on its depth, and also gives XQAOA the ability to output any computational-basis state given appropriate angles γ, β, and α. To see this, note that if we set the angles γ = β = 0 in eq. (21), we are left with single-qubit Y-rotations on the |+⟩ states. Choosing appropriate angles $\alpha_j$ on each qubit will bring |+⟩ to |0⟩ or |1⟩. The same is true for when γ = 0 and α = β. Consequently, as we mentioned in section I, XQAOA is able to eschew any reachability deficits [44,62].
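The reachability argument above can be checked with a few lines of arithmetic. The sketch below assumes that each Y-rotation has the form $e^{-i\alpha_j Y}$, whose matrix is real, so that with γ = β = 0 every qubit is simply a rotated |+⟩ state. Under this assumption, α = π/4 prepares |1⟩ and α = 3π/4 prepares |0⟩ (up to a global sign). The function names are hypothetical.

```python
import math


def y_rotated_plus(alpha):
    """Amplitudes (amp0, amp1) of exp(-i*alpha*Y)|+>; the rotation matrix
    [[cos a, -sin a], [sin a, cos a]] is real, so the amplitudes are real."""
    c, s = math.cos(alpha), math.sin(alpha)
    amp = 1 / math.sqrt(2)
    return ((c - s) * amp, (s + c) * amp)


def angles_for_bitstring(bits):
    """Choose alpha_j so that qubit j ends in the basis state |bits[j]>."""
    return [math.pi / 4 if b else 3 * math.pi / 4 for b in bits]


def probability_of_target(bits):
    """Probability of measuring `bits` after the single-qubit Y-rotations."""
    prob = 1.0
    for bit, alpha in zip(bits, angles_for_bitstring(bits)):
        prob *= y_rotated_plus(alpha)[bit] ** 2
    return prob


target = [1, 0, 0, 1, 1]
p_target = probability_of_target(target)
```

Because the state is a product state when γ = 0, the probability of the target string factorises over the qubits, and each factor equals one for the angles chosen here.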

TABLE I: Number of classical parameters for each XQAOA ansatz (n qubits, m clauses).
Ansatz                  No. of Parameters
XQAOA (XY mixer)        2n + m
XQAOA (Y mixer)         n + m
XQAOA (X=Y mixer)       n + m

From eq. (21), it is clear that several variations of the XQAOA ansatz can be generated by placing restrictions on the allowed angles of the mixing unitaries. The MA-QAOA is a special case of the XQAOA ansatz obtained by setting all $\alpha_i = 0$. Other configurations of the XQAOA ansatz worth noting are those with the XY Mixer, Y Mixer, and the X=Y Mixer, respectively. The XY Mixer is the most general mixer and uses individual angles $\alpha_i$, $\beta_i$ for each unitary in the mixing Hamiltonian. The Y Mixer consists of only Pauli-Y gates and is obtained by setting all $\beta_i$ to zero. The X=Y Mixer includes both Pauli-X and Pauli-Y gates but uses a single angle for both, with $\alpha_i$ equal to $\beta_i$.
As we summarise in table I, the XQAOA ansatzes with the XY mixer, Y mixer, and X=Y mixer for n qubits and m clauses require the classical optimisation of 2n + m, n + m, and n + m angles, respectively.While the performances of these mixers are not known a priori, the XY mixer is expected to have a higher computational overhead than the other two mixers due to the presence of an additional n classical parameters.The X=Y mixer is expected to perform better than the Y mixer because it is able to trace a larger portion of the Bloch sphere due to its non-trivial trajectory, whereas the Y mixer is limited to the XZ plane.
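As a worked example of these parameter counts, the short sketch below tabulates the number of angles at p = 1 for the benchmark setting of D-regular graphs, where the number of clauses (edges) is m = nD/2. The dictionary keys and function names are labels chosen for this illustration.

```python
def regular_graph_edge_count(n, degree):
    """Number of edges m in a D-regular graph on n vertices (n*D must be even)."""
    if (n * degree) % 2:
        raise ValueError("n * degree must be even for a regular graph to exist")
    return n * degree // 2


def parameter_counts(n, m):
    """Number of classical angles at p = 1 for n qubits and m clauses."""
    return {
        "QAOA": 2,              # one gamma and one beta
        "MA-QAOA": n + m,       # one gamma per clause, one beta per qubit
        "XQAOA (XY)": 2 * n + m,
        "XQAOA (Y)": n + m,
        "XQAOA (X=Y)": n + m,
    }


m_128 = regular_graph_edge_count(128, 3)
counts_128 = parameter_counts(128, m_128)
```

For a 3-regular graph with 128 vertices there are 192 edges, so the XY mixer needs 448 angles while the Y and X=Y mixers need 320 each.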
In the remainder of the paper, we will use the superscript notation to indicate the specific variant of the XQAOA ansatz being used, i.e. XQAOA$^{XY}_1$, XQAOA$^{X=Y}_1$, and XQAOA$^{Y}_1$ refer to p = 1 XQAOA with the XY, X=Y, and Y mixers, respectively. The next theorem, which is the main theorem of this paper, allows us to calculate the expectation value of the cost function for XQAOA$^{XY}_1$ for MaxCut on arbitrary weighted graphs, which in turn allows us to evaluate the performance of XQAOA.
In the statement of theorem 3 (eq. (22)), the weight-scaled angles are $\gamma'_{jk} = \gamma_{jk} w_{jk}$. We present a proof of theorem 3 in appendix D 2. Like eq. (13) and eq. (18), the expectation value $\langle C_{uv} \rangle_{XY}$ of any edge in a graph is determined by its neighbouring nodes and edges, and the overall expectation value is the sum of the expectation values of all edges in the graph. Calculating eq. (22) for an arbitrary graph also has a time complexity of $O(n^3)$, but has a larger prefactor compared to both eq. (13) and eq. (18). While QAOA$_1$ requires only two parameters regardless of the problem size and MA-QAOA$_1$ requires up to $n + \binom{n}{2}$ parameters, XQAOA$^{XY}_1$ requires up to $2n + \binom{n}{2}$ parameters. In contrast, both XQAOA$^{Y}_1$ and XQAOA$^{X=Y}_1$ require up to $n + \binom{n}{2}$ parameters. In the following corollary, we show that for unweighted graphs with edges of odd edge degrees, the XQAOA$^{Y}_1$ ansatz can solve MaxCut exactly. Here, the edge degree d(e) of an edge e = {u, v} ∈ E is defined as the number of neighbours of e, i.e., $d(e) = |N(u) \cup N(v)| - 2$.
Corollary 4. Consider an unweighted graph G where the edge degree of every edge is odd. Then, when γ = π and α = π/4, the XQAOA$^{Y}_1$ state |γ, α⟩ provides the exact MaxCut solution for G, where |γ, α⟩ denotes the state in eq. (21) where all $\gamma_i = \gamma$, $\beta_i = 0$, and $\alpha_i = \alpha$.
We give a proof of corollary 4 in appendix D 4. One consequence of corollary 4 is that it allows us to identify a graph instance for which we can analytically prove a separation between XQAOA$^{Y}_1$ and QAOA$_1$. Our next corollary elucidates this result.
Corollary 5. For the unweighted 5-vertex star graph G, XQAOA$^{Y}_1$ with optimal angles (say, from corollary 4) computes the MaxCut of G with an expected (and worst-case) approximation ratio of 1, whereas the expected approximation ratio of QAOA$_1$ with optimal angles is only 0.75.
We give a proof of corollary 5 in appendix D 5. While our result above pertains to the 5-vertex star graph (see fig. 2), one can readily generalise this proof to any t-vertex star graph, where t ≥ 5 is odd (here, the oddness criterion arises because it is only for odd-vertex star graphs that the edge degrees of the graph are all odd). The statement that QAOA$_1$ achieves an optimal expected approximation ratio of 3/4 for these graphs could be considered a finite-dimensional analogue of the result in [57, Section IV] that in the limit as the number of vertices tends to infinity, the performance of QAOA$_1$ approaches 0.75 for star graphs. In terms of the expected approximation ratio that can be achieved, this infinite class of graphs instantiates a clear advantage that XQAOA has over QAOA.
FIG. 2: Diagrammatic representation of the 5-vertex star graph $S_4$, which we use to show an advantage that XQAOA has over QAOA. More specifically, we show that while XQAOA$^{Y}_1$ can find the MaxCut of $S_4$ with an approximation ratio of 1, QAOA$_1$ can achieve an approximation ratio of at most 3/4.
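The separation on the 5-vertex star graph can be checked numerically with a small state-vector simulation. The sketch below is an illustration under stated assumptions rather than the proof in the appendix: it assumes the conventions $e^{-i\gamma C}$, $e^{-i\beta X}$, and $e^{-i\alpha Y}$, uses a coarse grid search (which happens to contain γ = π/2, β = π/8) for QAOA$_1$, and all function names are hypothetical.

```python
import cmath
import math


def apply_gate(state, qubit, gate):
    """Apply a 2x2 gate ((g00, g01), (g10, g11)) to one qubit."""
    (g00, g01), (g10, g11) = gate
    mask = 1 << qubit
    out = list(state)
    for z in range(len(state)):
        if not z & mask:
            a0, a1 = state[z], state[z | mask]
            out[z] = g00 * a0 + g01 * a1
            out[z | mask] = g10 * a0 + g11 * a1
    return out


def rx(beta):
    """Matrix of exp(-i*beta*X)."""
    c, s = math.cos(beta), math.sin(beta)
    return ((c, -1j * s), (-1j * s, c))


def ry(alpha):
    """Matrix of exp(-i*alpha*Y)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return ((c, -s), (s, c))


def cut_size(z, edges):
    """Number of edges cut by bit string z (bit i labels vertex i)."""
    return sum(((z >> u) ^ (z >> v)) & 1 for (u, v) in edges)


def one_layer_expectation(n, edges, gamma, gate):
    """<C> after exp(-i*gamma*C) on |s>, then `gate` on every qubit."""
    dim = 1 << n
    costs = [cut_size(z, edges) for z in range(dim)]
    state = [cmath.exp(-1j * gamma * c) / math.sqrt(dim) for c in costs]
    for q in range(n):
        state = apply_gate(state, q, gate)
    return sum(abs(a) ** 2 * c for a, c in zip(state, costs))


def edge_degree(edges, u, v):
    """d(e) = |N(u) ∪ N(v)| - 2 for the edge e = {u, v}."""
    def nbrs(w):
        return {b if a == w else a for (a, b) in edges if w in (a, b)}
    return len(nbrs(u) | nbrs(v)) - 2


star = [(0, leaf) for leaf in range(1, 5)]   # centre 0, leaves 1..4; MaxCut = 4
degrees = [edge_degree(star, u, v) for (u, v) in star]

# Y mixer with the angles of corollary 4 (gamma = pi, alpha = pi/4, beta = 0).
xqaoa_y_ratio = one_layer_expectation(5, star, math.pi, ry(math.pi / 4)) / 4

# Coarse grid search over (gamma, beta) in [0, pi] x [0, pi] for QAOA at p = 1.
grid = [k * math.pi / 40 for k in range(41)]
qaoa_ratio = max(one_layer_expectation(5, star, g, rx(b))
                 for g in grid for b in grid) / 4
```

Every edge of the star has edge degree 3, so the odd-edge-degree condition of corollary 4 holds, and the two ratios computed here agree with the values 1 and 3/4 stated in corollary 5.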

IV. COMPUTATIONAL RESULTS
We evaluate the performance of the XQAOA algorithm by benchmarking it on the MaxCut problem on unweighted D-regular graphs that were generated using an algorithm developed by Steger and Wormald [85]. Since the three different configurations XQAOA$^{XY}_1$, XQAOA$^{X=Y}_1$, and XQAOA$^{Y}_1$ of XQAOA that we consider have performances that are not known a priori, we first benchmark them on 10 randomly generated instances of 3-regular graphs with 128 vertices. The best-performing configuration of XQAOA$_1$ is then benchmarked against MA-QAOA$_1$, QAOA$_1$, and the CR and GW algorithms on 25 instances of D-regular graphs with 128 and 256 vertices for 3 ≤ D ≤ 10. The XQAOA$_1$, MA-QAOA$_1$, QAOA$_1$, and CR algorithms are benchmarked by performing 100 runs of the classical optimiser with random initial points, whereas the GW algorithm is benchmarked by first solving the relaxed problem and then generating 100 random vectors for hyperplane rounding. For an explanation of the GW algorithm, we refer the reader to appendix A. The CR algorithm computes the MaxCut by simply running the optimiser on the relaxed version of eq. (1), i.e. by maximising a sum over the edges {u, v} ∈ E (eq. (23)), where the maximisation is performed over angles $\theta_u$ and $\theta_v$.
To compute the approximation ratio, we need to obtain the exact MaxCut values, which we obtained using the GUROBI solver [86], a widely used industry tool. Although proving optimality with GUROBI takes exponential time, it can find solutions quickly. GUROBI was able to identify optimal solutions for 128-node graphs and near-optimal solutions for 256-node graphs with at most 6% MIPGap⁸. To compute the expectation values of QAOA$_1$, MA-QAOA$_1$, and XQAOA$_1$ for large n, we used the analytical results from theorems 1, 2, and 3. The Parallel-LBFGS algorithm was used to optimise the variational parameters of QAOA$_1$, MA-QAOA$_1$, XQAOA$_1$, and the CR algorithm.
Finally, to assess the performance of quantum algorithms at increased depths, we expanded our benchmarking to include depths ranging from 1 to 5 for the most effective XQAOA variant, as well as QAOA and MA-QAOA.Due to the lack of analytical formulas for larger p values and the significant computational complexity associated with simulating deep quantum circuits, we conducted our benchmarks using the Qiskit [87] simulator on small graphs.

A. Benchmark Results for p = 1
Our comparative analysis of the three XQAOA variants on 3-regular graphs revealed that the XQAOA$^{X=Y}_1$ variant performed the best (see fig. 3). When benchmarked against MA-QAOA$_1$, QAOA$_1$, CR, and the GW algorithm on D-regular graphs with 128 and 256 vertices for 3 ≤ D ≤ 10, XQAOA$^{X=Y}_1$ consistently outperformed QAOA$_1$, MA-QAOA$_1$, and the CR algorithm on all graph instances. Notably, XQAOA$^{X=Y}_1$ demonstrated competitive performance against the GW algorithm for 3- and 4-regular graphs and exceeded it for D-regular graphs with D > 4 (see fig. 4⁹). The boxplots reveal that the […] to initial parameter choices can be attributed to its overparameterised ansatz, which is further explained in section V C.
Our analysis also highlighted a linear increase in the approximation ratio of QAOA$_1$ with the degree of the graph. Its performance nears that of MA-QAOA$_1$ for D > 5, hinting at a possible reachability deficit [44,62], limiting MA-QAOA$_1$'s ability to find an approximate solution close to the optimal. It is also important to note that the QAOA$_1$ ansatz experiences the barren plateau phenomenon [89], with the size of the plateau increasing with the degree of the graph. To mitigate this, careful selection of initial points for the classical optimiser is crucial, as outlined in appendix C. The results of this strategy are also presented in fig. 4 as QAOA*.
⁸ The MIPGap is the gap between the lower and upper objective bound divided by the absolute value of the incumbent objective value.
⁹ Here, we note that in fig. 4, the boxplots for the XQAOA$^{X=Y}$ […] computer in order to obtain samples from measuring the output states of the circuits involved. Hence, the actual solutions obtained from QAOA$_1$, QAOA$^*_1$, and MA-QAOA$_1$ may differ from the expected approximation ratios shown in the boxplots, and could be either higher or lower. For an extended discussion of the distinction between using expected approximation ratios and approximation ratios of individual samples, we refer the reader to Larkin et al. [88].

B. Quantum-Classical Transition
In our numerical simulations with the XQAOA$^{X=Y}_1$ ansatz, we observed that as the classical optimiser converged to an optimum, the optimal angles $(\gamma^*, \beta^*)$ stabilised at specific values. Specifically, $\gamma^*_{uv}$ converged to a value in {0, π, 2π}¹⁰, while the values at which the $\beta^*$ angles stabilised allow us to read off the classical bit-string. It is important to acknowledge that due to degenerate local optima, a relatively small subset of angles might not stabilise at the specified values even after convergence.
Given the benign characteristics of the XQAOA$^{X=Y}_1$ ansatz's loss landscape, such deviations are highly unlikely. However, in cases where deviations occur, we force $\gamma^*_{uv}$ to take the closest value in {0, π, 2π}¹¹.
Reviewing fig. 5 from a different perspective, we observe that the XQAOA$^{X=Y}_1$ ansatz, initially set with random angles, creates a highly entangled quantum state with a multitude of superposed states. As optimisation progresses, this entanglement gradually decreases, leading to a marked reduction in the number of superposed states. When the optimum is reached, the entangling layer disappears, leaving the system in a singular definitive state. In essence, what begins as a distinctly quantum state, through the course of optimisation, evolves into a classical state. This transition, marked by the disappearance of the entangling layer, facilitates the extraction of the solution through classical means, negating the need for quantum computers.
This quantum-to-classical transition raises a natural question of whether the entangling layer is fundamentally necessary. To answer this, we conducted further numerical simulations by setting γ = 0 and optimising only the β angles. Our numerical results showed that without any entangling layer, the performance of XQAOA$^{X=Y}_1$ was similar to that of CR (see fig. 6). In fact, when we set γ = 0, eq. (22) reduces to an expression that is the same as eq. (23) with the angles having an additional factor of 2. This demonstrates that the overparameterised entangling layer augments the landscape, making the gradient-based classical optimiser less susceptible to getting trapped in local optima.
¹⁰ In weighted graphs, the optimal angles $\gamma^*_{uv}$ are scaled by the edge weight $w_{uv}$. The effective optimal angle, discounting the weight prefactor, can be calculated as $\gamma'^*_{uv} = w_{uv} \gamma^*_{uv} \bmod 2\pi$.
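The snapping step described in this section, in which an optimised angle that has not quite stabilised is forced to the closest value in {0, π, 2π} after reducing the weight-scaled angle modulo 2π, can be sketched as follows. This is a minimal illustration; the function names and the example values are assumptions of this example and do not come from the benchmark data.

```python
import math

TARGETS = (0.0, math.pi, 2 * math.pi)


def effective_angle(gamma, weight=1.0):
    """Weight-scaled angle reduced to the interval [0, 2*pi)."""
    return (weight * gamma) % (2 * math.pi)


def snap_gamma(gamma, weight=1.0):
    """Return the value in {0, pi, 2*pi} closest to the effective angle."""
    eff = effective_angle(gamma, weight)
    return min(TARGETS, key=lambda target: abs(eff - target))


# Hypothetical optimiser outputs that stopped slightly away from the targets.
raw_angles = [0.2, 3.0, 6.2, -0.1]
snapped = [snap_gamma(g) for g in raw_angles]
```

A slightly negative angle is first wrapped to just below 2π and therefore snaps to 2π rather than 0; for unweighted graphs the two values are equivalent because γ repeats in intervals of 2π.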
C. Benchmark Results for 1 ≤ p ≤ 5
In previous sections, we showed that XQAOA$^{X=Y}_1$ achieves near-optimal solutions and surpasses QAOA$_1$ and MA-QAOA$_1$. This led us to compare XQAOA$^{X=Y}_p$'s performance with QAOA$_p$ and MA-QAOA$_p$ for p > 1.
Lacking analytical formulas for expectation values for p > 1 and constrained by the computational overhead of large-scale quantum simulation, we performed small-scale simulations using Qiskit [87], benchmarking 20 random 3-regular graph instances with 16 vertices each. Given the variability in expectation values due to the limited number of shots (1024), we utilised the Powell optimiser [90] with ten random restarts to optimise the ansatzes' parameters. The simulation results, presented in fig. 7, reveal that the median approximation ratio of XQAOA$^{X=Y}_p$ consistently outperforms QAOA$_p$ and MA-QAOA$_p$ up to p = 4, with XQAOA$^{X=Y}_1$'s median nearly reaching 1.0. Interestingly, while QAOA$_p$ shows modest improvement with increasing depth, MA-QAOA$_p$ peaks at p = 3 before declining at p ≥ 4¹². In contrast, XQAOA$^{X=Y}_p$ exhibits a gradual decline in performance beyond p = 1, particularly noticeable at p ≥ 4. This decline is attributed to barren plateaus due to the circuit's overexpressiveness, rather than a lack of effectiveness at higher depths [92,93].
V. DISCUSSION

A. Randomness in the Approximation Algorithms
We optimise the classical parameters of the XQAOA$_1$, MA-QAOA$_1$, QAOA$_1$, and CR algorithms using a gradient-based classical optimiser. We start the optimisation process with a randomly chosen initial point, which affects the quality of the solution found, especially if the optimisation landscape is non-convex and has nontrivial features. If the initial point is near a local optimum or barren plateau, the gradient-based optimiser will converge to the local optimum and return a suboptimal solution. As a result, the XQAOA$_1$, MA-QAOA$_1$, QAOA$_1$, and CR algorithms have a wide range of approximate solutions for the same problem. The maximum approximation ratio returned by MA-QAOA$_1$ is 0.82, whereas XQAOA$_1$ can often achieve an approximation ratio of 1.0 with the right choice of initial parameters. The range of approximation ratios for the CR algorithm is similar to XQAOA$_1$, but numerical simulations suggest that the cost landscape of the CR algorithm may be difficult to navigate and plagued with local optima, which may require an exponential number of initial points for the CR algorithm to match XQAOA$_1$'s performance, thus negating the benefit of having a polynomial-time approximation algorithm.
The GW algorithm also has randomness in its process.After it solves the relaxed version of the MaxCut problem, it generates an n-dimensional random vector r to perform its hyperplane rounding to find the optimal cut.While GW has an expected approximation ratio of 0.87856 in the worst case at the asymptotic limit, the distribution of the approximation ratio returned by the GW algorithm for a finite number of randomly generated r vectors and problem instances can vary significantly, which is why we see a distribution for the output of the GW algorithm.While XQAOA 1 , MA-QAOA 1 , and the CR algorithm each require solving the problem anew for each random initial point, the GW algorithm solves the relaxed MaxCut problem just once and then efficiently generates individual solutions through hyperplane rounding with randomly generated vectors, avoiding repetitive computations.

B. Classical Simulability of the XQAOA Ansatz
Computing the expectation values of QAOA$_1$, MA-QAOA$_1$, and XQAOA$_1$ for MaxCut on arbitrary graphs has a time complexity of $O(n^3)$ in each case, albeit with varying prefactors. XQAOA$_1$ is unique in that its entangling layer vanishes at the optimal solution, in contrast to QAOA$_1$ and MA-QAOA$_1$, which maintain their entangling layer post-convergence. This entangling layer in QAOA$_1$ and MA-QAOA$_1$ requires generating an entangled quantum state through quantum computation followed by measurements to assign values to variables, precluding efficient classical simulation. In contrast, the optimal state of XQAOA$_1$, characterised by all $\gamma$ angles being zero, is non-entangled and permits classical bit assignment without quantum computation. However, in cases where XQAOA$_1$ falls short, an XQAOA$_p$ with $p > 1$ might be required, necessitating quantum computation.
It remains an open question whether a simple analytical formula exists for efficiently computing the mean values of Pauli operators of the XQAOA$_1$, MA-QAOA$_1$, and QAOA$_1$ states when dealing with $k$-local Ising Hamiltonians for $k > 2$. Additionally, if the problem assignments take integer-valued arguments, the quantum circuit would require more qubits per variable, further increasing the entanglement and non-locality of the circuit, potentially allowing for quantum advantage. The $p = 1$ Recursive QAOA (RQAOA$_1$) [39] is another quantum algorithm that can solve the MaxCut problem classically in time $O(n^4)$ without requiring any quantum computation or multi-qubit measurements.
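The classical bit assignment for a non-entangled state can be sketched as follows. The sketch assumes that all $\gamma$ angles are zero, that the initial state is $|+\rangle^{\otimes n}$, that the mixer is $e^{-i\alpha_j Y_j} e^{-i\beta_j X_j}$ on each qubit, and that the MaxCut cost is $\sum_{\{u,v\}\in E} \frac{w_{uv}}{2}(1 - Z_u Z_v)$. Under these assumptions, eq. (D9) gives $\langle Z_j \rangle = -\sin 2\alpha_j$ for each qubit, independent of $\beta_j$. The function names and the example angles are hypothetical.

```python
import math


def product_state_expected_cut(edges, alpha):
    """Expected cut weight of the product state obtained when all gammas vanish.

    Each qubit j has <Z_j> = -sin(2 * alpha[j]), and <Z_u Z_v> factorises
    because the state is not entangled.
    """
    total = 0.0
    for u, v, w in edges:
        z_u = -math.sin(2 * alpha[u])
        z_v = -math.sin(2 * alpha[v])
        total += 0.5 * w * (1.0 - z_u * z_v)
    return total


def read_out_bits(alpha):
    """Assign bit 0 when <Z_j> >= 0 and bit 1 otherwise."""
    return [0 if -math.sin(2 * a) >= 0 else 1 for a in alpha]


# Hypothetical example: unweighted 4-cycle with alternating angles +-pi/4.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
alpha = [math.pi / 4, -math.pi / 4, math.pi / 4, -math.pi / 4]
expected_cut = product_state_expected_cut(edges, alpha)
bits = read_out_bits(alpha)
print(expected_cut, bits)
```

When every $\alpha_j$ equals $\pm\pi/4$, each $\langle Z_j \rangle$ is $\pm 1$, so the state is a single computational-basis string and the expected cut equals the cut of that string.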

C. Role of Overparameterisation in XQAOA
The efficacy of XQAOA largely stems from its overparameterised ansatz. Overparameterisation, which involves introducing additional parameters to increase model dimensionality, reshapes the loss landscape to facilitate easier optimisation. To understand the impact of overparameterisation, consider the role of local minima in this context: a local minimum is a point in the loss landscape with a loss value lower than or equal to those of its neighbours within an $\epsilon$ radius. While such minima are prevalent in lower dimensions, they become less likely with increasing dimensionality, as it becomes harder for their loss values to remain the lowest across all new dimensions, converting them from minima into saddle points. Saddle points differ from local minima as they are not the lowest points in all directions; thus, optimisers can navigate past them more readily. As overparameterisation turns more potential local minima into saddle points, the path to the global minimum becomes less obstructed. Although overparameterised models may not be completely devoid of local minima, the remaining local minima tend to be close to the global minimum. This proximity reduces the likelihood of settling for suboptimal solutions and facilitates more efficient convergence to the global minimum [94-98]. This advantageous effect of overparameterisation is evident in fig. 4, where the lower quartiles for XQAOA$^{X=Y}_1$ consistently surpass an approximation ratio of 0.92 on all the benchmarked graph instances, strongly hinting at a benign loss landscape devoid of barren plateaus and suboptimal local minima.
While the previous discussion intuitively explains how overparameterisation enhances the efficacy of XQAOA, it is crucial to place this within the wider context of ongoing research on the trainability$^{13}$ of parameterised quantum circuits (PQCs). These studies, which form the basis of quantum landscape theory (QLT) [99], offer a deeper understanding of quantum loss landscapes [37, 48, 53, 89, 92, 93, 100-106]. QLT defines a PQC to be overparameterised when it has sufficiently many parameters to explore all relevant directions of its state space [99]. An essential aspect of this definition is the inherent expressiveness of the PQC, which is characterised by its ability to generate a wide range of unitaries under varied parameter settings [107]. This distinction is particularly evident when comparing XQAOA$^{X=Y}_1$ and MA-QAOA$_1$ for the MaxCut problem; despite having an equal number of parameters, XQAOA$^{X=Y}_1$ consistently outperforms MA-QAOA$_1$ on all problem instances. However, high expressivity also has drawbacks, such as the barren plateau phenomenon, where circuits show vanishingly small gradients due to their expressiveness [92]. XQAOA addresses this by using a problem-specific ansatz, tailoring its circuit design to the task at hand. This approach allows XQAOA to be overparameterised with a quadratic number of parameters, in contrast to generic ansatzes that may require exponentially more parameters and deeper circuits [99, 108]. XQAOA thus balances the two regimes, avoiding both the limitations of underparameterisation, such as spurious local minima and reachability deficits [53], and the challenges of high expressivity, such as barren plateaus. This positions XQAOA in a 'Goldilocks zone' of trainability and expressivity.

D. Shallow XQAOA vs Deep QAOA
The decision to use shallow XQAOA versus deep QAOA circuits hinges on their trainability for a given problem. Trainability depends on the loss landscapes of their ansatzes, influenced by factors like parameter count, initialisation strategy, circuit depth, and hardware noise. These factors can adversely affect trainability, hindering optimisation. Thus, the choice between shallow XQAOA and deep QAOA is determined by their relative trainability under these conditions.
It is evident that to surpass the performance of XQAOA$_1$, QAOA circuits need to be sufficiently deep. For example, it is conjectured that a minimum depth of $p = 12$ is necessary for QAOA to outperform the GW algorithm on the MaxCut problem on unweighted 3-regular graphs [43]. This conjecture not only posits a depth requirement but also suggests a parameter initialisation strategy of using a set of predetermined angles for warm-starting the optimisation. However, this strategy becomes ineffective in the presence of hardware noise [48]. The noise distorts the loss landscape, leading to barren plateaus, and necessitates deeper, more noise-prone circuits. In contrast, our findings suggest that XQAOA$^{X=Y}_1$ is adequate for the MaxCut problem, questioning the need for deeper XQAOA circuits on NISQ devices. While XQAOA may require $p > 1$ for certain problems, necessitating quantum computation, its relatively shallower depth compared to QAOA makes it less susceptible to noise. With sufficiently low noise levels, XQAOA can still achieve near-optimal solutions, albeit with additional random restarts or parameter initialisation strategies [105].
$^{13}$ In this context, 'trainability' refers to the ability of a PQC to efficiently adjust its variational parameters for optimising a cost function. The terms 'train,' 'trainable,' and 'trainability' are often used interchangeably with 'optimise' and 'optimisable' in the quantum algorithms literature due to the parallel drawn between PQCs and Quantum Neural Networks, where the process of optimising network variables is commonly referred to as 'training.'
For problems unlike MaxCut on unweighted 3-regular graphs, where patterns are ambiguous [37, 106] and a priori information is limited, training deep QAOA circuits is more challenging [109]. This is due to the difficulty in determining the minimal effective depth [110] and suitable parameter initialisation strategies in scenarios where random initialisation is suboptimal [89]. For such problems, where extracting useful a priori information is challenging, XQAOA may be a preferable choice, regardless of whether the quantum computers are NISQ or fault-tolerant. An example of such a problem could be MaxCut on randomly weighted regular graphs or randomly generated graphs.

VI. CONCLUSION
In this work, we presented the XQAOA ansatz and its variants and explained how they generalise the MA-QAOA and QAOA ansatzes. Our numerical simulations reveal that a single iteration of the XQAOA ansatz, especially with the $X=Y$ mixer, outperforms a single iteration of both MA-QAOA and QAOA. This enhanced performance of XQAOA is attributed to its overparameterised ansatz, which enables exploration in all relevant directions of its state space. The incorporation of the Pauli-Y rotation gate also significantly contributes to this improved efficacy. Our benchmarks also reveal that XQAOA performs just as well as the state-of-the-art Goemans-Williamson algorithm and even outperforms it for unweighted regular graphs with degrees greater than 4. Additionally, we find that the naive Classical-Relaxed algorithm, with fewer classical parameters than MA-QAOA, performs better by a large margin and that the performance of QAOA grows arbitrarily close to that of MA-QAOA for regular graphs with increasing degrees. Finally, we find an infinite family of graphs for which XQAOA solves MaxCut exactly and show analytically that for some graphs in this family, special cases of XQAOA are capable of achieving a much larger approximation ratio than QAOA.
Interestingly, we found that as the XQAOA$^{X=Y}_1$ ansatz converges to an optimum, its entangling layer disappears, leaving behind only single-qubit unitaries, making it possible to efficiently solve and extract the solution classically. Although the entangling layer disappears as the ansatz reaches an optimal solution, it is necessary for the optimisation process; without it, the performance of the XQAOA$^{X=Y}_1$ ansatz deteriorates to that of the Classical-Relaxed algorithm. Although, for the problem of MaxCut, the efficient classical simulation of the $p = 1$ XQAOA ansatz eliminates quantum advantage, it remains open whether this is still the case for larger $p$ and for problems whose Ising formulations are 2-local with external fields or $k$-local with $k > 2$.
We have also shown that despite the XQAOA ansatz being overparameterised (with a quadratic increase in free parameters in the worst-case scenario), it is significantly easier to train compared to the underparameterised QAOA ansatz and the adequately parameterised Classical-Relaxed algorithm. The QAOA ansatz struggles with issues like spurious local minima, barren plateaus, and reachability deficits, while the Classical-Relaxed algorithm often encounters sub-optimal local minima far from the global optimum. In contrast, the XQAOA ansatz, like other overparameterised models, features a more benign loss landscape, free from barren plateaus, spurious local minima, and reachability deficits. This characteristic enables the classical optimiser to consistently converge to optimal or near-optimal solutions, independent of the parameter initialisation strategy. While the increased number of free parameters in XQAOA might suggest higher computational costs, its faster convergence rate, which eliminates the need for specific initialisation strategies or random restarts, compensates for the computational overhead of the extra parameters.
Our work opens up new avenues for further research into improving quantum optimisation algorithms as well as their impact on various applications. For example, QAOA and its variants have already found numerous potential uses in solving various optimisation problems beyond MaxCut, including problems in graph theory [111-113], finance [114], chemistry [115, 116], and others [15, 117]. Future work could extend these results by adopting and exploiting the advantages of XQAOA in various applications. In addition, due to its advantages at low depth, XQAOA could be tested and implemented on near-term quantum hardware and compared against existing experimental benchmarks [112, 117-119].

VII. ACKNOWLEDGEMENTS
VV is thankful to Ye Jun from the A*STAR Institute of High Performance Computing and the A*STAR Computational Resource Centre for supporting this work through the use of their high-performance computing facilities. VV is thankful to Aaron Tranter for the stimulating discussions and insightful suggestions. We thank Truman Ng for helpful comments on an earlier version of this manuscript. This research is supported by A*STAR C230917003 and the Australian Research Council Centre of Excellence CE170100012. DEK acknowledges funding support from the A*STAR Central Research Fund (CRF) Award for Use-Inspired Basic Research; and the National Research Foundation, Singapore and A*STAR under the Quantum Engineering Programme (NRF2021-QEP2-02-P03).

VIII. DATA AVAILABILITY
We provide a total of 420 $D$-regular graphs that were used in this paper, along with their optimal cut and their solutions in a machine-readable CSV format. We also provide the simulation data and scripts used to generate the plots presented in this paper. The benchmark and simulation dataset and the scripts for generating the plots can be found at https://github.com/vijeycreative/XQAOA-Dataset [120].
We can relax this program to a vector program by allowing the binary variables $y_i$ to be $n$-dimensional vector variables $v_i$ that lie on the $n$-dimensional unit sphere $S_n$. Replacing the product of scalar terms in eq. (A1) with the corresponding inner product, we obtain the following vector program for MaxCut.

Maximise
\[
\frac{1}{2} \sum_{\{i,j\} \in E} w_{ij} \left(1 - v_i \cdot v_j\right), \quad \text{subject to } v_i \in S_n \text{ for all } i \in V. \tag{A2}
\]
This relaxed vector program can be efficiently solved by semidefinite programming, which allows us to obtain an optimal vector $v_i^*$ for each node in the original graph. The Goemans-Williamson algorithm then uses a random $n$-dimensional vector $r$ from $S_n$ to partition the vertices into two sets by assigning $\operatorname{sign}(r \cdot v_i^*)$ to each node. The sign function returns $1$ for non-negative inputs and $-1$ elsewhere, meaning that each node's rounding depends on its position relative to the hyperplane defined by $r$ that passes through the origin. The probability of the hyperplane rounding cutting an edge $\{i, j\}$ is proportional to the angle between the vectors and can be expressed as
\[
\Pr\left[\{i,j\} \text{ is cut}\right] = \frac{\arccos\left(v_i^* \cdot v_j^*\right)}{\pi}. \tag{A3}
\]
The expected weight of the cut found by the algorithm is calculated by adding up the expected contributions of each edge, where the contribution of an individual edge is its weight multiplied by its probability of being cut. We can write the sum as follows:
\[
\mathbb{E}[W] = \sum_{\{i,j\} \in E} w_{ij} \, \frac{\arccos\left(v_i^* \cdot v_j^*\right)}{\pi}. \tag{A4}
\]
To find the approximation ratio, we need to compare the expected weight of the cut produced by the algorithm to the optimal cut. This is done by taking the ratio $\alpha$ of the individual edge contributions in eq. (A4) and eq. (A2) for each edge $\{i, j\}$ and finding the minimum value:
\[
\alpha = \min_{0 \leq \theta \leq \pi} \frac{\theta / \pi}{(1 - \cos\theta)/2} = \min_{0 \leq \theta \leq \pi} \frac{2}{\pi} \frac{\theta}{1 - \cos\theta}, \tag{A5}
\]
where $\theta = \arccos(v_i \cdot v_j)$ is the angle between the vectors $v_i$ and $v_j$. Minimising the above expression, we get
\[
\alpha \approx 0.87856. \tag{A6}
\]
Having determined that each edge's contribution to the cut is expected to be no less than 0.87856 of the optimal value, with the minimum attained at $\theta = 2.331122$, we can use the linearity of expectation to conclude that the total expected value is also no less than 0.87856 of the optimal value. If the Unique Games Conjecture [74-77] proves true, this method offers the strongest possible guarantee that any classical algorithm can achieve in polynomial time.
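The rounding step and the worst-case ratio described above can be sketched as follows. This is a minimal illustration, not the implementation used for the benchmarks: the unit vectors are taken as given (no semidefinite program is solved), the random direction is drawn from a standard Gaussian, which has the same distribution of signs as a uniformly random unit vector, and the minimisation uses a simple golden-section search on the assumed bracket $[1, \pi]$. All function names are hypothetical.

```python
import math
import random


def hyperplane_round(vectors, rng):
    """Assign +1/-1 to each node according to sign(r . v_i) for a random r."""
    dimension = len(vectors[0])
    r = [rng.gauss(0.0, 1.0) for _ in range(dimension)]
    return [
        1 if sum(r_k * v_k for r_k, v_k in zip(r, v)) >= 0 else -1
        for v in vectors
    ]


def cut_weight(signs, edges):
    """Total weight of edges whose endpoints fall on opposite sides."""
    return sum(w for i, j, w in edges if signs[i] != signs[j])


def edge_ratio(theta):
    """Per-edge ratio (theta / pi) / ((1 - cos theta) / 2)."""
    return (2.0 / math.pi) * theta / (1.0 - math.cos(theta))


def minimise_edge_ratio(lo=1.0, hi=math.pi, iterations=100):
    """Golden-section search; the ratio is unimodal on the bracket [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if edge_ratio(a) < edge_ratio(b):
            hi = b
        else:
            lo = a
    theta = 0.5 * (lo + hi)
    return theta, edge_ratio(theta)


# Two antipodal unit vectors joined by one edge: almost every hyperplane cuts it.
vectors = [[1.0, 0.0], [-1.0, 0.0]]
edges = [(0, 1, 1.0)]
signs = hyperplane_round(vectors, random.Random(7))
theta_star, alpha_gw = minimise_edge_ratio()
print(cut_weight(signs, edges), theta_star, alpha_gw)
```

Because the vectors are computed once, calling `hyperplane_round` repeatedly with fresh random directions yields many candidate cuts at negligible extra cost.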
Appendix B: Parallel-LBFGS Algorithm

FIG. 8: The Parallel-LBFGS algorithm and CGA method are used to optimise an objective function. The objective function $f(x)$, gradient function $gr(x_i)$, and initial points $x = (x_1, x_2, \ldots, x_n)$ are provided to the LBFGS algorithm, which is initialised with $p$ asynchronous processes. A gradient pool distributes the gradient calculation across $p$ available processors. When the LBFGS algorithm calls either the objective or gradient function, the wrapper interface (blue box) begins evaluating the objective function and all the gradients in parallel. The results are then combined and returned to the LBFGS algorithm. This process is repeated until the optimisation converges to a solution (i.e., when all the gradients are less than or equal to the specified gradient tolerance value).
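The idea of a wrapper that evaluates the objective and all gradient components concurrently can be sketched as below. This is an illustrative assumption rather than the Parallel-LBFGS code itself: it uses central finite differences in place of the analytical gradient function $gr(x_i)$, and a thread pool from the Python standard library in place of $p$ asynchronous processes (for CPU-bound objectives, a process pool would be the closer analogue). All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_derivative(f, x, i, h=1e-6):
    """Central finite-difference estimate of df/dx_i at the point x."""
    forward = list(x)
    backward = list(x)
    forward[i] += h
    backward[i] -= h
    return (f(forward) - f(backward)) / (2.0 * h)


def parallel_value_and_gradient(f, x, workers=4):
    """Submit the objective and every gradient component to a worker pool,
    then combine the results in coordinate order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        value_future = pool.submit(f, x)
        gradient_futures = [
            pool.submit(partial_derivative, f, x, i) for i in range(len(x))
        ]
        value = value_future.result()
        gradient = [future.result() for future in gradient_futures]
    return value, gradient


def has_converged(gradient, tolerance):
    """Stopping rule from the caption: every component within the tolerance."""
    return all(abs(g) <= tolerance for g in gradient)


def sum_of_squares(x):
    return sum(x_i * x_i for x_i in x)


value, gradient = parallel_value_and_gradient(sum_of_squares, [1.0, 2.0, 3.0])
print(value, gradient, has_converged(gradient, 1e-5))
```

The returned pair can be handed to any quasi-Newton routine that accepts a combined value-and-gradient callback.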
Applying eq. (D3) to each term of the sum $\cos x_i \cos y_i + \sin x_i \sin y_i$ gives eq. (D1). The identity in eq. (D2) follows immediately from replacing each $y_i$ in eq. (D1) with $-y_i$.
By taking the sum (difference, respectively) of eq. (D1) and eq. (D2), only the even (odd, respectively) terms remain. Hence, it follows that

Lemma 7. Let $f \in \mathbb{Z}^+$ and $x_1, x_2, \ldots, x_f, y_1, y_2, \ldots, y_f \in \mathbb{R}$. Then the identities in eq. (D4) and eq. (D5) hold.

We now make a remark about the above identities: while the right-hand sides of eq. (D1), eq. (D2), eq. (D4), and eq. (D5) each involve a sum over exponentially (in $f$) many terms (the cardinality of $\mathbb{F}_2^f$ is $2^f$), their left-hand sides involve just products of polynomially (in $f$) many terms. Hence, going from the right-hand sides of these identities to their left-hand sides results in exponential savings in computational cost. This will be useful for the expressions that we derive in appendix D 2.
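The product-to-sum structure behind these identities can be checked numerically. The sketch below assumes the identities take the following form, which follows from $\cos(x \mp y) = \cos x \cos y \pm \sin x \sin y$ and may differ in presentation from eqs. (D1), (D2), (D4), and (D5): the product $\prod_i \cos(x_i - y_i)$ equals the sum over all $z \in \{0,1\}^f$ of $\prod_i (\cos x_i \cos y_i)^{1-z_i} (\sin x_i \sin y_i)^{z_i}$, the product $\prod_i \cos(x_i + y_i)$ equals the same sum weighted by $(-1)^{|z|}$, and the even-weight and odd-weight partial sums are half the sum and half the difference of the two products. The function names and test angles are hypothetical.

```python
import math
from itertools import product


def cos_product(xs, ys, sign):
    """prod_i cos(x_i + sign * y_i): f factors."""
    return math.prod(math.cos(x + sign * y) for x, y in zip(xs, ys))


def expanded_sum(xs, ys, coefficient):
    """Sum over z in {0,1}^f of coefficient(|z|) * prod_i t_i(z_i), where
    t_i(0) = cos x_i cos y_i and t_i(1) = sin x_i sin y_i: 2^f terms."""
    total = 0.0
    for z in product((0, 1), repeat=len(xs)):
        term = 1.0
        for z_i, x, y in zip(z, xs, ys):
            if z_i:
                term *= math.sin(x) * math.sin(y)
            else:
                term *= math.cos(x) * math.cos(y)
        total += coefficient(sum(z)) * term
    return total


xs = [0.3, -1.1, 2.0, 0.7]
ys = [1.4, 0.2, -0.5, 2.6]
minus = cos_product(xs, ys, -1)
plus = cos_product(xs, ys, +1)
all_terms = expanded_sum(xs, ys, lambda k: 1.0)
signed_terms = expanded_sum(xs, ys, lambda k: (-1.0) ** k)
even_terms = expanded_sum(xs, ys, lambda k: 1.0 if k % 2 == 0 else 0.0)
odd_terms = expanded_sum(xs, ys, lambda k: 1.0 if k % 2 == 1 else 0.0)
print(minus - all_terms, plus - signed_terms)
```

The product side costs $O(f)$ operations while the expanded side costs $O(f\,2^f)$, which is the exponential saving noted in the remark above.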

Proof of theorem 3
Proof of theorem 3. Consider the XQAOA ansatz applied to MaxCut with $|s\rangle = |+\rangle^{\otimes n}$ and [...] Hence, to compute the expectation value $\langle C \rangle_{XY}$, it suffices to compute each term in the sum separately.
For the rest of this proof, we fix an edge $\{u, v\} \in E$. We shall evaluate the product in eq. (D8). The factors in the Kronecker product in eq. (D8) can be expanded as follows. For $a \in \{u, v\}$,
\[
\begin{aligned}
e^{i\beta_a X_a} e^{2i\alpha_a Y_a} e^{i\beta_a X_a} Z_a &= (\cos\beta_a + i \sin\beta_a X_a)(\cos 2\alpha_a + i \sin 2\alpha_a Y_a)(\cos\beta_a + i \sin\beta_a X_a) Z_a \\
&= \cos 2\alpha_a \cos 2\beta_a Z_a + \cos 2\alpha_a \sin 2\beta_a Y_a - \sin 2\alpha_a X_a.
\end{aligned} \tag{D9}
\]
Substituting this into eq. (D8) gives eq. (D10). Hence, by substituting this expression into eq. (D7), the expected cost function corresponding to the edge $\{u, v\} \in E$ can be written as eq. (D11), where, for single-qubit Pauli matrices $P, Q \in \{X, Y, Z\}$, the quantities $\xi(P, Q)$ and $\eta(P, Q)$ are defined in eq. (D12) and eq. (D13), respectively. Moving forward, the approach we take is as follows. Firstly, we shall evaluate the expression for $\eta(P, Q)$ in eq. (D13) for all $P, Q \in \{X, Y, Z\}$. Secondly, we shall substitute our expressions for $\eta(P, Q)$ into eq. (D12) to derive analytical expressions for $\xi(P, Q)$. Finally, we shall substitute these analytical expressions into eq. (D11) to obtain our desired expression, eq. (22).
Before we execute these three steps, we first introduce some notation to help us keep track of the neighbours of the edge $\{u, v\}$. Let
\[
N_{u \backslash\backslash v} = \{\omega_1, \ldots, \omega_b\} \tag{D14}
\]
be the set of neighbours of $u$ that are of distance 2 or greater from $v$. Similarly, let
\[
N_{v \backslash\backslash u} = \{q_1, \ldots, q_c\} \tag{D15}
\]
be the set of neighbours of $v$ that are of distance 2 or greater from $u$. Next, let
\[
N_{uv} = \{a_1, \ldots, a_f\}, \tag{D16}
\]
where $a_i = \omega_{b+i} = q_{c+i}$ for $i = 1, \ldots, f$, be the set of vertices that are neighbours of both $u$ and $v$, i.e., $N_{uv}$ comprises those nodes in $V$ that form a triangle with both $u$ and $v$. Finally, let
\[
N_{u \backslash v} = N_{u \backslash\backslash v} \cup N_{uv} = \{\omega_1, \ldots, \omega_b, a_1, \ldots, a_f\} = \{\omega_1, \ldots, \omega_b, \ldots, \omega_d\} \tag{D17}
\]
be the set of neighbours of $u$ that are not $v$ and let
\[
N_{v \backslash u} = N_{v \backslash\backslash u} \cup N_{uv} = \{q_1, \ldots, q_c, a_1, \ldots, a_f\} = \{q_1, \ldots, q_c, \ldots, q_e\} \tag{D18}
\]
be the set of neighbours of $v$ that are not $u$. We are now ready to execute our aforementioned three steps.
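The single-qubit expansion in eq. (D9) can be verified numerically with explicit $2 \times 2$ matrices. The following sketch uses only the standard Pauli matrices and the identity $e^{i\theta P} = \cos\theta\, I + i \sin\theta\, P$; the helper names and the test angles are hypothetical.

```python
import math

I2 = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Y = ((0, -1j), (1j, 0))
Z = ((1, 0), (0, -1))


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def combine(*weighted):
    """Linear combination sum_k coeff_k * matrix_k of 2x2 matrices."""
    return tuple(
        tuple(sum(coeff * m[r][c] for coeff, m in weighted) for c in range(2))
        for r in range(2)
    )


def exp_i(theta, pauli):
    """exp(i * theta * P) = cos(theta) I + i sin(theta) P for a Pauli P."""
    return combine((math.cos(theta), I2), (1j * math.sin(theta), pauli))


def d9_left(alpha, beta):
    """exp(i beta X) exp(2i alpha Y) exp(i beta X) Z."""
    m = matmul(exp_i(beta, X), exp_i(2 * alpha, Y))
    m = matmul(m, exp_i(beta, X))
    return matmul(m, Z)


def d9_right(alpha, beta):
    """cos2a cos2b Z + cos2a sin2b Y - sin2a X."""
    return combine(
        (math.cos(2 * alpha) * math.cos(2 * beta), Z),
        (math.cos(2 * alpha) * math.sin(2 * beta), Y),
        (-math.sin(2 * alpha), X),
    )


def max_abs_difference(a, b):
    return max(abs(a[r][c] - b[r][c]) for r in range(2) for c in range(2))


deviation = max_abs_difference(d9_left(0.37, -1.2), d9_right(0.37, -1.2))
print(deviation)
```

The two sides agree to machine precision for arbitrary angles, which also confirms that the middle factor acts on the same qubit $a$ as the outer ones.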
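The neighbourhood sets defined above can be computed directly from an adjacency structure. In the sketch below, the dictionary keys are hypothetical labels for $N_{uv}$, $N_{u \backslash v}$, $N_{v \backslash u}$, and the two exclusive sets, and the example graph is made up for illustration.

```python
def adjacency_from_edges(edges):
    """Build a dict mapping each vertex to the set of its neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def edge_neighbourhoods(adjacency, u, v):
    """Neighbour sets of the edge {u, v} used in the proof.

    'common'  : vertices adjacent to both u and v (triangles through {u, v})
    'u_not_v' : neighbours of u other than v
    'v_not_u' : neighbours of v other than u
    'u_only'  : neighbours of u at distance 2 or more from v
    'v_only'  : neighbours of v at distance 2 or more from u
    """
    if v not in adjacency[u]:
        raise ValueError("u and v must be joined by an edge")
    u_not_v = adjacency[u] - {v}
    v_not_u = adjacency[v] - {u}
    common = u_not_v & v_not_u
    return {
        "common": common,
        "u_not_v": u_not_v,
        "v_not_u": v_not_u,
        "u_only": u_not_v - common,
        "v_only": v_not_u - common,
    }


# Hypothetical graph: the edge {0, 1} lies in one triangle (with vertex 2).
edges = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 4), (1, 5)]
sets = edge_neighbourhoods(adjacency_from_edges(edges), 0, 1)
print(sets)
```

In the notation of the proof, this example has $b = 1$, $c = 2$, $f = 1$, $d = 2$, and $e = 3$.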
Step 1: Evaluation of $\eta(P, Q)$

First, we rewrite eq. (D13) as eq. (D19), where [...] since the term $\frac{1}{2} \sum_{\{u,v\} \in E} \gamma_{uv} w_{uv} I$ in $C_{uv}$ is cancelled by its inverse and does not contribute to eq. (D13). Next, we expand $\gamma$ [...] as in eq. (D21), where [...] contains terms involving $u$ but not $v$, and [...] contains terms involving $v$ but not $u$, and $E^{(uv)} = \{\{a, b\} \in E : a, b \notin \{u, v\}\}$ denotes the set of edges that do not contain either $u$ or $v$ as an endpoint. By substituting eq. (D21) into eq. (D19), we obtain eq. (D24), where $\Gamma_{PQ} = \eta(P, Q) P_u Q_v$. By using the fact that Pauli operators either commute or anti-commute, evaluating eq. (D24) gives eqs. (D25) to (D28). To evaluate eq. (D25), we first compute eq. (D29), where we applied the trick in eq. (D3) to each term in the product. In that expression, $|x| = \sum_{i=1}^{d} x_i$ denotes the Hamming weight of the string $x \in \{0, 1\}^d$. Similarly, we obtain eq. (D30). By taking the product of eq. (D29) and eq. (D30), and using eq. (D16) to relabel the vertices in $N_{uv}$ by $a_i$'s, we obtain the expression in eq. (D31) for eq. (D25). Next, by substituting eq. (D29) into eq. (D26), we obtain eq. (D32). Similarly, by substituting eq. (D30) into eq. (D27), we obtain eq. (D33). This completes Step 1, where $\eta(P, Q)$ is specified by eq. (D24), eq. (D31), eq. (D32), eq. (D33), and eq. (D28).

Step 2: Evaluation of $\xi(P, Q)$

There are $3^2 = 9$ different choices of $PQ \in \{X, Y, Z\}^2$. We will split these choices into four cases, as follows: [...] which differs from the expression given by eq. (D42) for $\xi(X, X)$ by only a single minus sign. As we mentioned in appendix D 1, the identities eq. (D4) and eq. (D5) allowed us to simplify the exponential sums in eq. (D40), yielding exponential savings in the computational cost in the worst case. From the above calculations, we see that for five of the nine choices of $P$ and $Q$, $\xi(P, Q)$ vanishes: $\xi(Z, Z) = \xi(Z, X) = \xi(Y, X) = \xi(X, Z) = \xi(X, Y) = 0$. The remaining four non-vanishing ones are given by eq. (D42), eq. (D43), eq. (D47), and eq. (D49).

Step 3: Derivation of eq. (22)

We are now essentially done. By substituting the expressions for $\xi(P, Q)$ that we obtained in Step 2 into eq. (D11), and noting that $N_{v \backslash u} = e$, $N_{u \backslash v} = d$, $N_{u \backslash\backslash v} = d \backslash F$, $N_{v \backslash\backslash u} = e \backslash F$, and $N_{uv} = F$, we obtain eq. (22).

Proofs of Theorems 2 and 1
The analytical expressions for the expected cost function of both MA-QAOA and QAOA can be derived from XQAOA's analytical formula. Setting all $\alpha_i$ to 0 in eq. (22) gives eq. (18) in theorem 2. Moreover, setting all $\beta_i$ to $\beta$ and $\gamma_i$ to $\gamma$ in eq. (18) and simplifying gives eq. (13) in theorem 1.
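The two specialisations above amount to tying parameters, which can be made explicit in a short sketch. The container layout (dictionaries keyed by vertex and by edge) and the function names are assumptions chosen for illustration; they are not taken from the paper's code.

```python
def qaoa_as_ma_qaoa(beta, gamma, vertices, edges):
    """QAOA as a special case of MA-QAOA: one shared beta and one shared gamma."""
    betas = {v: beta for v in vertices}
    gammas = {e: gamma for e in edges}
    return betas, gammas


def ma_qaoa_as_xqaoa(betas, gammas):
    """MA-QAOA as a special case of XQAOA: every Pauli-Y angle alpha is zero."""
    return {
        "alpha": {v: 0.0 for v in betas},
        "beta": dict(betas),
        "gamma": dict(gammas),
    }


# Hypothetical triangle graph with QAOA angles beta = 0.4 and gamma = 1.1.
vertices = [0, 1, 2]
edges = [(0, 1), (1, 2), (0, 2)]
betas, gammas = qaoa_as_ma_qaoa(0.4, 1.1, vertices, edges)
xqaoa_parameters = ma_qaoa_as_xqaoa(betas, gammas)
print(xqaoa_parameters)
```

Any routine that evaluates the XQAOA expectation from such a parameter set would then reproduce the MA-QAOA and QAOA values on these restricted inputs.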

Proof of corollary 4
Proof of corollary 4. By setting $\beta_k = 0$ for all $k$ in eq. (22), we obtain the expected cost function (corresponding to the edge $\{u, v\}$) for XQAOA$^{Y}_1$ given in eq. (D51). To prove corollary 4, we shall specialise eq. (D51) to unweighted graphs with odd edge degrees. First, we show that if every edge degree of a graph $G = (V, E)$ is odd, then the graph is necessarily two-colourable, i.e., there exists a map $\tau : V \to \{0, 1\}$ such that for all edges $\{u, v\} \in E$, $\tau(u) \neq \tau(v)$. Indeed, an instantiation of such a map $\tau$ is given by:
\[
\tau(v) = \begin{cases} 0, & \deg(v) \text{ even}, \\ 1, & \deg(v) \text{ odd}. \end{cases} \tag{D52}
\]
It now remains for us to prove that $\tau$ is indeed a two-colouring: let $\{u, v\} \in E$. Note that the edge degree of an edge $\{i, j\} \in E$ is $\deg(\{i, j\}) = |N(i) \cup N(j)| - 2 = \deg(i) + \deg(j) - 2$. By our assumption, the edge degree $\deg(\{u, v\})$ is odd, and hence $\deg(u) + \deg(v)$ is also odd. This implies that exactly one of $\deg(u)$ and $\deg(v)$ is odd, from which it follows that exactly one of $\tau(u)$ and $\tau(v)$ is equal to 1. Hence, $\tau(u)$ and $\tau(v)$ cannot be equal to each other, i.e., $\tau(u) \neq \tau(v)$. This completes our proof that $G$ is two-colourable. Now, a graph is two-colourable if and only if it does not contain an odd cycle (see [123, Theorem 5.21], for example). Hence, the graph $G$ considered in corollary 4 with all edge degrees being odd must be triangle-free, i.e., $G$ does not contain a 3-cycle. This implies that the set $F$ in eq. (D51) must be empty.
Therefore, setting $F = \emptyset$, $w_{uv} = 1$, and $\gamma'_{ij} = \gamma_{ij}$ for all edges $\{i, j\} \in E$ gives the expected XQAOA$^{Y}_1$ cost function (for the edge $\{u, v\}$) for unweighted graphs with all edge degrees being odd:
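The degree-parity colouring in eq. (D52) is easy to check on a concrete graph. The sketch below uses the edge-degree formula $\deg(i) + \deg(j) - 2$ from the proof; the function names and the example graph (the complete bipartite graph $K_{2,3}$, whose edge degrees all equal 3) are chosen for illustration.

```python
def vertex_degrees(edges):
    """Degree of every vertex that appears in the edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree


def edge_degrees(edges):
    """Edge degree deg(u) + deg(v) - 2 for every edge."""
    degree = vertex_degrees(edges)
    return {(u, v): degree[u] + degree[v] - 2 for u, v in edges}


def parity_colouring(edges):
    """tau(v) = 0 if deg(v) is even and 1 if deg(v) is odd."""
    return {v: d % 2 for v, d in vertex_degrees(edges).items()}


def is_proper_colouring(colouring, edges):
    """True when the endpoints of every edge receive different colours."""
    return all(colouring[u] != colouring[v] for u, v in edges)


# K_{2,3}: vertices 0 and 1 on one side, vertices 2, 3, 4 on the other.
edges = [(a, b) for a in (0, 1) for b in (2, 3, 4)]
tau = parity_colouring(edges)
print(edge_degrees(edges), tau, is_proper_colouring(tau, edges))
```

For a graph with an even edge degree, such as a triangle, the same map assigns equal colours to adjacent vertices, so the odd-edge-degree assumption is what makes the construction work.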

FIG. 1: a) A specific instance of a graph for which we want to identify a set of vertices that maximises the number of edges that are cut. b) A quantum circuit with a single iteration of a quantum ansatz applied to it. The quantum ansatz consists of a unitary operation specific to the problem being solved and a problem-independent mixing unitary. c) Decomposition of the problem and mixing unitaries for QAOA, MA-QAOA, and XQAOA into CNOT and single-qubit rotation gates.

FIG. 3: Improvement of the XQAOA$^{X=Y}_1$ ansatz over the XQAOA$^{XY}_1$ and XQAOA$^{Y}_1$ ansatzes for ten different instances of 3-regular graphs with 128 vertices. For each of the three ansatz variants, the Parallel-LBFGS optimiser was run 100 times with random initial values for each of the ten graph instances.

FIG. 4: This boxplot compares the sizes of cuts obtained by quantum and classical algorithms against the maximum cut size found by GUROBI. The data used to create the boxplots comes from 100 random cuts generated using the GW algorithm, 100 random initialisations for several other algorithms (XQAOA$^{X=Y}_1$, MA-QAOA$_1$, QAOA$_1$, and CR), and 100 informed initialisations for QAOA$_1$ (labelled as QAOA$^*$) applied to 25 different instances of regular graphs with 128 or 256 nodes and degree values ranging from 3 to 10. The whiskers extend up to data points within 1.5 times the interquartile range from the upper and lower quartiles, and crosses represent outliers. For clarity, outliers of QAOA$_1$ with approximation ratios less than 0.7, which resulted from barren plateaus, have been omitted. The inset scatterplot compares the best-found solutions over 100 runs for XQAOA$^{X=Y}_1$ and the GW algorithm on all the graph instances. The points in the upper-left corner show that XQAOA$^{X=Y}_1$ performs better than GW, while the opposite is true for points in the lower-right corner. The colour of the points indicates the degree of the corresponding graph.

FIG. 5: This graph illustrates the changes in the $\gamma$ and $\beta$ angles of the XQAOA$^{X=Y}_1$ ansatz as the classical optimiser reaches the optimal solution for a 3-regular graph with 32 vertices and 48 edges. Each line in the graph represents a set of evaluated parameters, with the colour indicating the corresponding approximation ratio. The solid red circles in the graph represent the final angles determined by the classical optimiser for the XQAOA$^{X=Y}_1$ ansatz.

FIG. 6: Improvement of the XQAOA$^{X=Y}_1$ ansatz over XQAOA$^{X=Y}_1$ with $\gamma = 0$ and the Classical-Relaxed algorithm for ten different instances of 3-regular graphs with 128 vertices. For each of the three ansatz variants, the Parallel-LBFGS optimiser was run 100 times with random initial values for each of the ten graph instances.

FIG. 7: Comparison of approximation ratios achieved by QAOA$_p$, MA-QAOA$_p$, and XQAOA$^{X=Y}_p$ for $1 \leq p \leq 5$, evaluated across 20 instances of 3-regular graphs, each with 16 vertices. The Powell optimiser was run 10 times with random initialisation for each graph instance across all three algorithms.
[...] by conjugating the Pauli operator $Z_u Z_v$ by the mixing unitary and then by the problem unitary. By the commutation properties of the Pauli matrices $\Lambda$ and the fact that they satisfy $e^{-i\theta\Lambda} = \cos\theta\, I - i \sin\theta\, \Lambda$, it follows that most of the terms in the mixing unitary $e^{-i\alpha \cdot Y} e^{-i\beta \cdot X} = \prod_{j=1}^{n} e^{-i\alpha_j Y_j} e^{-i\beta_j X_j}$ commute through $Z_u Z_v$ and annihilate their inverses. Hence,
\[
e^{i\beta \cdot X} e^{i\alpha \cdot Y} Z_u Z_v e^{-i\alpha \cdot Y} e^{-i\beta \cdot X} = e^{i\beta_u X_u} e^{2i\alpha_u Y_u} e^{i\beta_u X_u} Z_u \otimes e^{i\beta_v X_v} e^{2i\alpha_v Y_v} e^{i\beta_v X_v} Z_v.
\]

TABLE I: Summary of the XQAOA ansatz family and the associated number of free parameters for $p$ iterations of the ansatz for MaxCut on graphs with $n$ vertices and $m$ edges.