Brought to you by:
Paper The following article is Open access

Filtering variational quantum algorithms for combinatorial optimization

, , , , and

Published 2 February 2022 © 2022 IOP Publishing Ltd
, , Citation David Amaro et al 2022 Quantum Sci. Technol. 7 015021 DOI 10.1088/2058-9565/ac3e54

2058-9565/7/1/015021

Abstract

Current gate-based quantum computers have the potential to provide a computational advantage if algorithms use quantum hardware efficiently. To make combinatorial optimization more efficient, we introduce the filtering variational quantum eigensolver which utilizes filtering operators to achieve faster and more reliable convergence to the optimal solution. Additionally we explore the use of causal cones to reduce the number of qubits required on a quantum computer. Using random weighted MaxCut problems, we numerically analyze our methods and show that they perform better than the original VQE algorithm and the quantum approximate optimization algorithm. We also demonstrate the experimental feasibility of our algorithms on a Quantinuum trapped-ion quantum processor powered by Honeywell.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Combinatorial optimization tackles problems of practical relevance [1]. Applications include finding the shortest route via several locations for a delivery service, making optimal use of available storage space in logistics, and optimizing a manufacturing supply chain to increase the productivity of a factory. If quantum algorithms can solve such problems even just slightly faster than classical algorithms, this can have a large impact on various sectors in industry and research.

Variational quantum algorithms are a promising tool to get the most out of the current generation of gate-based quantum processors [27]. These algorithms employ parameterized quantum circuits that can be tailored to hardware constraints such as qubit connectivities and gate fidelities. In this context, a common approach for combinatorial optimization encodes the optimal solution in the ground state of a classical multi-qubit Hamiltonian [810]. Popular variational quantum algorithms such as the variational quantum eigensolver (VQE) [11] and the quantum approximate optimization algorithm (QAOA) [12] attempt to prepare this ground state by searching for the circuit parameters that minimize the energy expectation value of the corresponding quantum state. VQE imposes no restrictions on the ansatz circuit and has become a powerful method for quantum chemistry [13], condensed matter [14], and combinatorial optimization [15]. For combinatorial optimization problems, however, it tends to produce sub-optimal solutions [16]. QAOA uses a specific ansatz circuit inspired by adiabatic quantum computation [17] and the Trotterization of the time evolution corresponding to quantum annealing [18]. Despite its promising properties [1921] and considerable progress with regards to its experimental realization [22], in general the QAOA ansatz requires circuit depths that are challenging for current quantum hardware.

In this article, we introduce the quantum variational filtering (QVF) algorithm that optimizes a parameterized quantum circuit to approximate the action of a filtering operator on this circuit. We also present filtering VQE (F-VQE)—a special case of QVF—which is particularly efficient and similar to VQE. The main focus of this article is F-VQE which, due to its low quantum hardware requirements, is particularly relevant for current quantum computers. We consider filtering operators $F\equiv f(\mathcal{H};\tau )$ defined via real-valued functions f of the problem Hamiltonian $\mathcal{H}$ and a parameter τ in such a way that f2(E; τ) strictly decreases with the energy E for any τ > 0. The parameter τ plays a role similar to the time step in imaginary time evolution (ITE) and the ITE operator $\mathrm{exp}(-\tau \mathcal{H})$ is one example of a filtering operator considered in this work. The repeated action of a filtering operator on a quantum state projects out high-energy eigenstates (corresponding to sub-optimal solutions of the combinatorial optimization problem) and increases the overlap with the ground state. Importantly, QVF and F-VQE have no restrictions on the ansatz circuit and so they can employ the ansatz most suitable for the quantum hardware at hand. Furthermore we address the question which filtering operators benefit from using causal cones in the optimization, as that can drastically reduce the required number of qubits of the quantum hardware. Among the filtering operators considered in this article we find that the ITE operator is the best performing one that can be combined with causal cones. We therefore focus on the combination of ITE with causal cones for which the F-VQE method is equivalent to the hardware-efficient ITE procedure in [23] (HE-ITE).

We investigate the performance of F-VQE for various filtering operators and of HE-ITE using MaxCut problems on random three-regular weighted graphs of different sizes. Finding the optimal solution for this class of MaxCut problems is NP-hard [24, 25] and therefore no classical polynomial-time algorithm is expected to exist that achieves this goal. Given a weighted graph, the MaxCut problem consists in finding the optimal cut: a separation of the vertices into two disjoint subsets so that the cut cost, i.e. the sum of the weights of the edges between the two subsets, is maximum. The approximation ratio of a cut is defined as the cut cost divided by the cost of the optimal cut. As shown in figure 1, F-VQE consistently achieves larger approximation ratios after fewer optimization steps than VQE and QAOA. Moreover, F-VQE readily runs on actual quantum processors. The HE-ITE algorithm achieves similar results with a reduced number of qubits and gates.

Figure 1.

Figure 1. Performance of some of our algorithms. F-VQE employs the inverse filtering operator. One optimization step corresponds to one time step in HE-ITE and one update of all parameters in F-VQE, VQE, and QAOA. (a) Average approximation ratio (lines) and standard deviation (error bars for HE-ITE and shaded regions for F-VQE, VQE and QAOA) across 25 random weighted MaxCut problems. (b) Average approximation ratio (line) and standard deviation (shaded region) across 25 random weighted MaxCut problems. Inset: causal cone widths—i.e. the actual numbers of qubits required on quantum hardware—and their average frequency with standard deviation.

Standard image High-resolution image

This article is structured as follows. Section 2 introduces filtering operators, QVF and F-VQE. Section 3 presents the numerical and experimental studies. We conclude this article and discuss potential next steps in section 4. Technical details are provided in appendices at the end of this article.

In another publication [26] we analyze the performance of F-VQE in the context of the job shop scheduling problem.

2. Methods

In this section, we first define filtering operators. Then we explain the QVF and F-VQE algorithms. For F-VQE we present a procedure that dynamically updates τ during the optimization. Furthermore we address the question which filtering operators can use causal cones in F-VQE.

2.1. Filtering operators

Given an n-qubit Hamiltonian $\mathcal{H}$, we define a filtering operator $F\equiv f(\mathcal{H};\tau )$ via a real-valued function f(E; τ) of the energy E and a parameter τ > 0. We require that the function f2(E; τ) is strictly decreasing on the interval given by the complete spectrum of the Hamiltonian E ∈ [Emin, Emax]. Filtering operators are Hermitian and commute with the Hamiltonian by definition.

When a quantum state |ψ⟩ is sampled in the eigenbasis {|λx ⟩: x = 0, 1, ..., 2n − 1} of a Hamiltonian, then the probability distribution of eigenvectors is given by Pψ (λx ) = |⟨ψ|λx ⟩|2. The application of the filtering operator on the quantum state produces a new quantum state $\vert F\psi \rangle =F\vert \psi \rangle /\sqrt{{\langle {F}^{2}\rangle }_{\psi }}$ which generates a probability distribution that depends on the energy Ex of each eigenstate |λx ⟩:

Equation (1)

For non-eigenstates |ψ⟩ the action of the filtering operator increases the probability of all overlapping eigenstates |λx ⟩ ($\mathrm{w}\mathrm{h}\mathrm{e}\mathrm{r}\mathrm{e}\enspace {P}_{\psi }({\lambda }_{x}) > 0$) for which ${f}^{2}({E}_{x};\tau ) > {\langle {F}^{2}\rangle }_{\psi }$ and decreases the probability of all overlapping eigenstates |λy ⟩ for which ${f}^{2}({E}_{y};\tau )< {\langle {F}^{2}\rangle }_{\psi }$. Since f2(E; τ) is strictly decreasing as a function of E, such eigenstates exist for any non-eigenstate |ψ⟩. Hence, under the application of the filtering operator, the probability of sampling eigenstates with low energy increases and the probability of sampling eigenstates with high energy decreases. This naturally leads to a reduction of the average energy. In particular, if the ground state |λ0⟩ has a finite overlap (Pψ (λ0) > 0) with the initial state, the probability of sampling it increases with every application of the filtering operator. After sufficiently many applications the ground state is produced.

Some filtering function definitions are given in figure 2. The Chebyshev filtering operator approximates the Dirac delta operator $\delta (\mathcal{H})$ using the following expansion in terms of Chebyshev polynomials up to order τ [27]:

Equation (2)

Equation (3)

Here δr,s is the Kronecker delta and the Chebyshev polynomials are defined via the recursive formula Ts+1(x) = 2xTs (x) − Ts−1(x) with T0(x) = 1, T1(x) = x.

Figure 2.

Figure 2. Filtering operators used in this work. The value of f(E; τ)/f(E0; τ) is plotted against the energy range E ∈ [E0, 1] for ground state energy E0 = 0.001. The value of τ used in the figure is the first value τ1, selected at the first optimization step, averaged across the 25 MaxCut instances for 13 qubits. Values after ± indicate the standard deviation. For the Chebyshev filter, the value used in the figure was rounded to 5.

Standard image High-resolution image

The parameter τ in the filtering operator definition is inspired by the time step parameter of ITE. In fact, for the exponential filtering operator in figure 2 τ is precisely the imaginary time step. This parameter interpolates the action of the filtering operator between two limits. For vanishing values of τ → 0 the filtering operator becomes the identity operator and for sufficiently large values of τ the filtering operator becomes a projector onto the ground state.

For our selection of filters we took inspiration from several works. The inverse filter is inspired by the inverse iteration procedure which is a common ingredient in numerical routines that calculate eigenvalues and eigenvectors of matrices [2830]. The cosine filter was previously used in non-variational algorithms to achieve faster ground state preparation [31] and to analyze finite energy intervals [32]. The Chebyshev filter considered here was also employed in powerful tensor network algorithms for the study of thermalization [3335].

2.2. Quantum variational filtering

The QVF algorithm approximates the repeated action of a filtering operator on some initial quantum state by successively optimizing the variational parameters of a parameterized quantum circuit. The algorithm starts at optimization step t = 0 by preparing an initial state |ψ0⟩ that has finite overlap with the ground state: ${P}_{{\psi }_{0}}({\lambda }_{0}) > 0$. Then the algorithm proceeds iteratively and in each optimization step t ⩾ 1 approximates the state |Ft ψt−1⟩—that results from exactly applying the filtering operator Ft to the state |ψt−1⟩—by a state |ψt ⟩. The subscript t in Ft indicates that the filtering operator can change at each optimization step. The algorithm stops after an initially chosen number of optimization steps.

In order to approximate the application of the filtering operator, we prepare a parameterized quantum circuit ansatz |ψ( θ )⟩ that depends on a vector of m parameters θ = (θ1, ..., θm ). At optimization step t we search for the parameters that minimize the Euclidean distance between the parameterized quantum state and |Ft ψt−1⟩:

Equation (4)

The final vector of parameters obtained at the end of the minimization of equation (4) defines the quantum state |ψt ⟩ ≡ |ψ( θ t )⟩ at optimization step t. The cost function in equation (4) can be minimized with the help of the Hadamard test, which needs one additional ancilla qubit and several additional controlled operations, as we explain in appendix A.

2.3. Filtering VQE

To avoid the additional quantum resources required by the Hadamard test in QVF, in the following we develop F-VQE. The F-VQE algorithm uses a specific gradient-based procedure that requires essentially the same circuits as VQE.

The partial derivative of the cost function in equation (4) with respect to one parameter θj is derived in appendix B.1:

Equation (5)

Here the state |ψ( θ + π e j )⟩ is produced by the same ansatz circuit except that the vector of angles is shifted by an amount π along the direction e j of parameter θj . If the gradient is evaluated at the current vector of parameters θ t−1, then the parameter-shift rule [36, 37] yields:

Equation (6)

Here the three circuits |ψt−1⟩ and $\vert {\psi }_{t-1}^{j\pm }\rangle \equiv \vert \psi ({\boldsymbol{\theta }}_{t-1}\pm \frac{\pi }{2}{\boldsymbol{e}}_{j})\rangle $ are generated by the ansatz with different parameter vectors. Note that the expectation value in the denominator is the same for all partial derivatives at fixed t.

The F-VQE algorithm takes advantage of this favorable case as follows. At optimization step t, F-VQE performs a single gradient-descent update:

Equation (7)

where η > 0 is the learning rate. Then F-VQE moves on to the next cost function ${\mathcal{C}}_{t+1}(\boldsymbol{\theta })$ and proceeds identically. For each optimization step this algorithm requires the evaluation of 2m + 1 circuits.

The expectation value ${\langle {F}_{t}\rangle }_{\psi }$ of filtering operators can be efficiently evaluated by sampling the quantum state in the Hamiltonian eigenbasis. If each eigenstate |λx ⟩ is sampled Mx times from a total of M samples, the filtering operator expectation value can be approximated via the Monte Carlo estimator f(E; τ) as:

Equation (8)

In this article, we represent combinatorial optimization problems by diagonal quadratic unconstrained binary optimization (QUBO) Hamiltonians [810] for which the eigenbasis is the computational basis and energies can be efficiently computed. Therefore the expectation value of a filtering operator can be approximated by sampling the quantum state in the computational basis.

At each optimization step t the samples used to compute ${\langle {F}_{t}^{2}\rangle }_{{\psi }_{t-1}}$ in equation (6) are also used to compute the average energy ${\langle \mathcal{H}\rangle }_{{\psi }_{t-1}}$. As t increases, the average energy is expected to decrease and the probability of sampling the ground state is expected to increase. Thus F-VQE provides the average energy and a growing chance of sampling a low energy eigenstate or even the ground state at no extra cost during the optimization.

The gradient in F-VQE is equivalent to the one in VQE under certain assumptions. We derive the VQE gradient in appendix B.2. If the Hamiltonian in the VQE gradient of equation (B7) is replaced by −Ft , the new VQE gradient evaluated at the point |ψ( θ )⟩ = |ψt−1⟩ coincides with the F-VQE gradient in equation (6) up to a positive multiplicative factor. However, we emphasize that the corresponding VQE cost function −⟨ψ( θ )|Ft |ψ( θ )⟩ is different from the F-VQE cost function in equation (4) where the dependence on the parameters is of the form −Re⟨ψt−1|Ft |ψ( θ )⟩. We note that the F-VQE gradient in equation (5), in general, coincides only at the point |ψ( θ )⟩ = |ψt−1⟩ with the gradient of the modified VQE cost function. We can also see in equations (B6) and (B8) that the second derivatives do not coincide, not even at that point |ψ( θ )⟩ = |ψt−1⟩. Therefore both algorithms explore parameter landscapes with different curvatures.

2.4. Adapting τ

Both the cost function in equation (4) and its gradient in equation (6) depend on the parameter τ via the expectation value of the filtering operator in equation (8). We dynamically adapt τ to keep the gradient norm as close as possible to some desired large and fixed value at every optimization step. This can prevent the gradient from vanishing and enable us to determine its value more accurately with a fixed number of measurements.

In F-VQE we employ the following heuristic to dynamically adapt τ. At each optimization step the dependence $g(\tau )\equiv {\Vert}\nabla {\left.{\mathcal{C}}_{t}(\tau )\right\vert }_{{\boldsymbol{\theta }}_{t-1}}{{\Vert}}^{2}$ between the gradient norm and τ is used to select the value τt that returns a gradient norm as close as possible below a certain threshold gc > 0. For each optimization step such value τt is obtained by solving the implicit equation g(τt ) = gc . Note that g(0) = 0 since for τ = 0 the filtering operator becomes the identity operator and the gradient norm vanishes. For large values of τ the gradient norm saturates at a finite value that is determined by the overlap of the gradient circuits with the ground state. Taking this into account, we select τt in the following way. We evaluate the gradient norm for increasing values of τ until either (i) an upper bound τu > τt is found such that g(τu ) > gc or (ii) g(τ) converges to a constant. In the first case (i) we search for τt in the range [0, τu ] up to a certain precision. In the second case (ii) we select from the tried values the one that provided a gradient norm closest below the threshold. Note that for the Chebyshev filtering operator only positive integer values of τ are allowed.

We emphasize that our heuristic is different from a simple re-scaling of the gradient by a constant. As shown in figure 3 each partial derivative changes non-trivially as a function of τ. Moreover, in the simulations we observe that the gradient norm has a consistent dependence on τ across different optimization steps and problems.

Figure 3.

Figure 3. Gradients as a function of the parameter τ. For one 13-qubit MaxCut problem we plot the gradient norm (solid blue) and the absolute value of partial derivatives in equation (6) with respect to four parameters in the ansatz circuit shown in figure 4(a). Partial derivatives correspond to the first parameter on qubit 1 (green dashed), the last parameter on qubit 1 (magenta dotted), the first parameter on qubit 13 (solid yellow), and the last parameter on qubit 13 (red dashed-dotted). The value of τt selected at these optimization steps is marked with a gray vertical line.

Standard image High-resolution image

2.5. Causal cones

The F-VQE algorithm is based on expectation value computations of filtering operators and it is natural to ask whether this method can benefit from using causal cones. The causal cone of an observable is the quantum circuit composed of only those qubits and gates in the ansatz circuit that have an actual effect on the expectation value. In general, causal cones allow us to simplify the computation of expectation values for local observables. The simplification follows from the fact that outside the causal cone of a local operator unitary gates cancel with their adjoints. Causal cones are a crucial ingredient in various tensor network methods, e.g. based on the multiscale entanglement renormalization ansatz [38]. We have previously used them to make variational quantum algorithms more hardware-efficient for the simulation of the time evolution of quantum many-body systems [23].

Figures 4(b) and (c) show the causal cones for observables with support on two neighboring qubits and two non-neighboring qubits, respectively. Only the qubits and gates inside the causal cone need to be prepared experimentally. Note that for observables with support on distant qubits, the causal cone splits into two separable causal cones that can be independently realized in hardware. Therefore causal cones can reduce the required number of gates and qubits when the observables have small support.

Figure 4.

Figure 4. Parameterized quantum circuit defined by a vector θ = (θ1, ..., θm ) of m parameters, here for m = 19. (a) For a number p of circuit layers the block of gates inside the dashed rectangle is repeated p times. Here we show the circuit for p = 1. The F-VQE algorithm samples the entire circuit to evaluate a global observable indicated by the blue rectangle. (b) and (c) Highlighted qubits and gates constitute the causal cone that HE-ITE uses to evaluate two-local observables on two neighboring qubits and two non-neighboring qubits, respectively.

Standard image High-resolution image

Inspecting the filtering operators in figure 2, we see that the exponential, power, and Chebyshev filters can make use of causal cones. The exponential filter $\mathrm{exp}(-\tau \mathcal{H})$ is equivalent to a product of two-local terms that can be processed independently if additional approximations are made [23]. The power filter ${(1-\mathcal{H})}^{\tau }$ for integer values of τ is equivalent to a sum of at most 2τ-local terms so that the entire expectation value can be determined from the sum of simpler expectation values. Similarly the expectation value of the Chebyshev filter can be calculated using the sum of expectation values of at most 2τ-local observables, as the Chebyshev filtering operator is a polynomial in $\mathcal{H}$ of degree τ.

3. Results

In this Section, we describe the MaxCut Hamiltonians, parameterized quantum circuits, as well as the simulation and experimental settings that are being considered in this article. The settings are summarized in table 1. Then we analyze the performance of the algorithms. Simulations and the experiment on the Quantinuum H1 trapped-ion quantum processor [39] are compiled by TKET [40].

Table 1. Simulation and experimental settings. (a) The cost function for VQE and QAOA is the average energy for their respective ansatz circuits |ψ( θ )⟩ and |ψ( γ , β )⟩. The learning rate for F-VQE is the inverse of the cost function's Hessian diagonal. (b) For the 23-qubit problems HE-ITE uses ${2}^{{n}_{\text{cone}}+2}$ measurement shots for each causal cone, where ncone is the number of qubits in the causal cone. (c) Different settings for the nine-qubit experiment. In the ansatz circuit all rotation gates except the first and the last ones on each qubit are removed. Additional details corresponding to the experiment are provided in appendix C.

(a) AlgorithmF-VQEHE-ITEVQEQAOA
Cost functionEquation (4)[23] ${\langle \mathcal{H}\rangle }_{\psi (\boldsymbol{\theta })}$ ${\langle \mathcal{H}\rangle }_{\psi (\boldsymbol{\gamma },\boldsymbol{\beta })}$
Ansatz circuitFigure 4(a)Figure 4(a)Figure 4(a)Equation (12)
Initial param.|+⟩n |+⟩n |+⟩n Random
Adaptive τ yes, gc = 0.1no, fix 1.0
Learning rate η inv. Hess. d.1.01.0
opt. steps70707070
(b) Size n (qubits)579111323
AlgorithmsAllAllAllAllAllHE-ITE
Layers p in HE-ITE111111
Layers p in all except HE-ITE23456
Measurement shots M 1050100150200 ${2}^{{n}_{\text{cone}}+2}$
(c) Nine-qubit experiment on Quantinuum H1
Layers p 1
Ansatz circuitFigure 9(b)
Adaptive τ Yes, gc = 0.2
Measurement shots M 500
Optimization steps9

3.1. Weighted MaxCut Hamiltonians

We use 25 random MaxCut instances for each problem size of n ∈ {5, 7, 9, 11, 13, 23} qubits. These problem sizes n correspond to graphs with N = n + 1 vertices. Each instance is defined on a random three-regular weighted simple undirected and connected graph $\mathcal{G}(\mathcal{V},\mathcal{E},\mathcal{W})$ where $\mathcal{V}=\left\{1,\enspace 2,\enspace \dots ,\enspace N\right\}$ is the set of vertices, $\mathcal{E}\subset \mathcal{V}\times \mathcal{V}$ is the set of edges between different vertices, and $\mathcal{W}=\left\{{w}_{e}\in [0,1]:\enspace e\in \mathcal{E}\right\}$ is the set of random weights uniformly distributed in the range [0, 1] for all edges. A cut is represented by variables zv = {+1, −1}, with $v\in \mathcal{V}$, that are +1 for the vertices in one subset of the cut and −1 for the vertices in the other subset of the cut. This formulation has the obvious symmetry of swapping labels +1, −1 for the two subsets. We break this symmetry by assigning +1 to the last vertex v = N and thereby reduce the number of variables to n = N − 1. Then the MaxCut problem consists in solving the optimization problem $({z}_{1}^{\ast },{z}_{2}^{\ast },\dots ,{z}_{n}^{\ast })={\mathrm{a}\mathrm{r}\mathrm{g}\mathrm{m}\mathrm{a}\mathrm{x}}_{({z}_{1},{z}_{2},\dots ,{z}_{n})}C({z}_{1},{z}_{2},\dots ,{z}_{n})$, with cost function

Equation (9)

The Hamiltonian formulation of the MaxCut problem is obtained via any real coefficients a and b > 0 as

Equation (10)

where each variable zu in the MaxCut cost function is replaced by the Pauli operator Zu acting on qubit u ∈ {1, ..., n}. A ground state of $\mathcal{H}$ in equation (10) is the computational state $\vert {\lambda }_{0}\rangle ={\bigotimes}_{v=1}^{n}\vert (1-{z}_{v}^{\ast })/2\rangle $. The approximation ratio of an n-qubit quantum state |ψ⟩ is defined as

Equation (11)

Before we apply filtering operators to MaxCut Hamiltonians, we re-scale the energy range to [0, 1] using the coefficients a and b in equation (10). To achieve this, we compute lower and upper bounds of the MaxCut cost function in equation (9). We choose the upper bound to be the optimum cost of the semidefinite programming relaxation of the MaxCut problem [41]. We fix the lower bound to the minimum cost 0, which corresponds to the trivial cut zu = +1 for all u ∈ {1, ..., n}.

3.2. Setup

F-VQE. This algorithm uses the parameterized quantum circuit shown in figure 4(a). An initial state |ψ0⟩ = |+⟩n is prepared by setting to π/2 the parameters in the last rotation on each qubit and setting the remaining parameters to 0. Parameters are iteratively updated using analytical gradient descent as described in section 2.3. At each optimization step, the value of the parameter τ is adapted according to the procedure explained in Section IID. We choose a threshold of gc = 0.1 for the gradient norm and solve the implicit equation with a precision 0 < gc g(τt ) < 0.01. For the learning rate η we choose the inverse of the Hessian's diagonal. As shown in appendix B.1, at each optimization step t all diagonal elements of the Hessian have the same value which can be computed by means of the circuit |ψt−1⟩. This is a quasi-Newton method [42, 43] that uses only the diagonal of the Hessian matrix and can be realized without additional cost.

HE-ITE. In relation to using causal cones with F-VQE, we have chosen to concentrate our analysis on the exponential filter. In this case the F-VQE method is equivalent to the HE-ITE algorithm called angle update in [23]. We adapt this algorithm to the general QUBO Hamiltonians with long-range interactions considered here. We use the ansatz circuit depicted in figure 4(a) with p = 1 layer. Figures 4(b) and (c) show examples of causal cones in HE-ITE. A total of 70 time steps are performed with a fixed imaginary time step τ = 1.0. For the 23-qubit problems, we choose the number of measurement shots dependent on the number ncone of qubits in the cone as ${2}^{{n}_{\text{cone}}+2}$.

VQE. For this algorithm we employ the same ansatz as in F-VQE and HE-ITE, shown in figure 4(a). The computation of the analytical gradient for this ansatz requires two quantum circuits per parameter as described in appendix B.2. For the VQE cost function the diagonal of the Hessian can be obtained using the same circuits that are needed for the analytical gradient, similar to F-VQE. Thus it is possible to apply the same quasi-Newton method to the VQE optimization. However, we found that this heuristic performs worse than simply fixing a learning rate for all optimization steps and partial derivatives. More specifically, in the simulations we found that the Hessian diagonal elements frequently vanish so that the parameter update often diverges with this heuristic. The analysis of this phenomenon goes beyond the scope of this work and therefore we choose to fix a learning rate η for all optimization steps and partial derivatives. Comparing the performance of the values η = 1, 0.1 and 0.01 using our five-qubit MaxCut instances, we conclude that η = 1 performs best and hence use this fixed value for η in our simulations and experiments.

QAOA. The parameterized quantum circuit for QAOA is:

Equation (12)

Equation (13)

Here $\vert +\rangle =(1/\sqrt{2})(\vert 0\rangle +\vert 1\rangle )$, Xq denotes the Pauli operator X acting on qubit q, and the ansatz is defined by the m = 2p parameters in the vectors γ = (γ1, ..., γp ) and β = (β1, ..., βp ). We initialize the parameters randomly in the range [0, π]. We optimize the parameters using analytical gradient descent. As shown in appendix B.3, the computation of the analytical gradient for the QAOA ansatz requires just the ansatz circuit with various parameter sets. In the QAOA simulations we do not use the quasi-Newton method to determine the learning rate, as this would need additional circuits. Instead we fix a learning rate η for all optimization steps and partial derivatives. We choose η = 1 as it is the best performing learning rate for the five-qubit MaxCut instances where we compared the performance of η = 1, 0.1, and 0.01.

3.3. Performance

In the following, we present and analyze the numerical and experimental results of the F-VQE algorithms and compare them with those of HE-ITE, VQE and QAOA. For each MaxCut instance we pay special attention to two benchmark quantities: the approximation ratio and the probability of measuring the ground state. These quantities have some dependence. A large probability of sampling the ground state Pψ (λ0) ≈ 1 implies a large approximation ratio ⟨αψ ≈ 1. However, a quantum state |ψ⟩ given by a superposition of low-energy excited states can exhibit a large approximation ratio ⟨αψ ≈ 1 but low ground state probability Pψ (λ0) ≈ 0.

We compare the performance of various filtering operators in F-VQE in figure 5. Here filtering operators are sorted from left to right by performance and the inverse filter is the best performing one. Figure 5(a) shows for each algorithm considered in this article the final approximation ratio averaged over all 25 MaxCut instances for each problem size. We observe that the best performing filters achieve the largest approximation ratios and are more reliable in obtaining such approximation ratios, which can be gathered from the small values of the corresponding standard deviations. Figure 5(b) shows the distribution of the minimum number of optimization steps required to achieve an approximation ratio above 0.75. Here all filters show a similar performance: they require five or fewer optimization steps with little deviation from the median. Figure 5(c) shows the fraction of MaxCut instances where the algorithms obtain a probability of sampling the ground state above 0.25. The probability of sampling it at least once with M measurement shots is then 1 − (1 − 0.25)M . The best filters achieve this probability for a larger fraction of MaxCut instances.

Figure 5.

Figure 5. Simulation results for 25 random weighted MaxCut problems of various sizes. (a) Average approximation ratio (circles) and standard deviation (error bars). Note the different range for QAOA. (b) Optimization steps: median (lines) and all instances (dots). The number of instances that achieve an approximation ratio of 0.75 is 25 for all algorithms and problem sizes except for QAOA. For QAOA these numbers are 17, 22, 21, 9, and 3 for n = 5, 7, 9, 11, and 13, respectively. (c) Fraction of instances that achieve a probability of measuring the ground state above 0.25.

Standard image High-resolution image

Let us now compare the best performing filtering operator, the inverse filter, with VQE and QAOA. F-VQE requires less optimization steps than VQE and QAOA to achieve larger and more consistent approximation ratios. This can be seen in figures 5(a) and (b) as well as in figure 1(a). Additionally, as shown in figure 5(c), F-VQE converges to the ground state more often for all problem sizes.

The HE-ITE algorithm also achieves approximation ratios close to the optimum and frequently converges to the ground state. Moreover, the evolution of the approximation ratio shown in figure 1(a) almost overlaps with that for F-VQE. Similarly figure 1(b) shows that HE-ITE obtains an average approximation ratio close to the optimum for 25 MaxCut instances of 23 qubits. Importantly, the quantum circuits never use more than six qubits. The inset in figure 1(b) shows the average fraction of circuits used for each qubit count. We observe that the majority of circuits require just four qubits.

The performance of HE-ITE depends on the imaginary time step τ and to analyze it we have run HE-ITE for a long total imaginary time of 100. Figure 6 compares for three values of τ the final approximation ratios and the fraction of MaxCut instances where a probability of sampling the ground state above 0.25 is achieved. We conclude that HE-ITE improves systematically by choosing smaller values of τ.

Figure 6.

Figure 6. Simulation results for the HE-ITE algorithm applied to 25 random weighted MaxCut problems of various sizes using different imaginary time steps τ and total imaginary time 100. (a) Approximation ratio: average (circles) and standard deviation (error bars). (b) Fraction of instances that achieve a probability of measuring the ground state above 0.25.

Standard image High-resolution image

To demonstrate the experimental feasibility, we run F-VQE with the inverse filter on the Quantinuum H1 trapped-ion quantum processor and solve a random nine-qubit MaxCut instance. Experimental settings (see table 1(c)) are similar to those used for the numerical simulations, with some differences described in detail in appendix C. The main difference is that we use only p = 1 layer and remove all rotation gates except the first and the last ones for each qubit. Figure 7 shows the approximation ratio and the probability of measuring the ground state at each optimization step. The final approximation ratio is 0.9844 ± 0.0062 and the probability of sampling the ground state after the final optimization step is 0.928 ± 0.024. Here the value after ± indicates a 95% confidence interval.

Figure 7.

Figure 7. Approximation ratio and probability of sampling the ground state for a single random weighted MaxCut instance solved on the Quantinuum H1 trapped-ion quantum processor [39]. Error bars indicate one standard deviation. Details are provided in appendix C.

Standard image High-resolution image

4. Conclusions and outlook

We have introduced variational quantum algorithms that make use of filtering operators to solve combinatorial optimization problems. The QVF algorithm uses a parameterized quantum circuit to approximate the repeated action of a filtering operator. F-VQE is a particularly efficient version of QVF and similar to VQE. These algorithms impose no restrictions on the ansatz circuit so that we can choose the one that performs optimally on hardware. We have also tested a HE-ITE algorithm introduced in [23]. This algorithm approximates the action of the ITE operator. Using causal cones, HE-ITE can require circuits that are significantly smaller than the problem size.

We have compared F-VQE and HE-ITE with VQE and QAOA using a set of random weighted MaxCut problems of various sizes. The F-VQE algorithm achieved larger and more consistent approximation ratios as well as more reliable convergence to the optimal solutions than VQE and QAOA via fewer optimization steps. The HE-ITE algorithm obtained similar approximation ratios and ground state convergence with drastically reduced qubit count. Moreover, F-VQE successfully solved a nine-qubit MaxCut problem on the Quantinuum H1 trapped-ion quantum processor [39]. We conclude that F-VQE and HE-ITE are powerful algorithms to solve combinatorial optimization problems on noisy quantum computers.

Owing to the high flexibility of F-VQE, various promising strategies can be considered to further improve the performance. F-VQE can readily be combined with the conditional value-at-risk cost function [16, 44, 45] to provide new filtering operators with additional capabilities. Local cost functions and shallow ansatz circuits can be used to avoid barren plateaus [46]. F-VQE might also benefit from the original QAOA ansatz [12] or generalizations thereof such as the Hamiltonian variational ansatz [47, 48], the quantum alternating operator ansatz [49], the hardware-efficient mixer-phaser ansatz [50] or the depth optimized QAOA ansatz [51]. The ansatz can be specifically selected to minimize experimental noise on quantum hardware [52]. To this end, the quantum autoencoder is a powerful concept [53, 54]. It is enticing to combine F-VQE with the holographic ansatz [55] that has led to impressive results on the Quantinuum quantum computer [56, 57]. Any ansatz can be enhanced by extending it with classical neural networks [58]. The classical optimizer for the parameters can simultaneously adjust the ansatz circuit structure to alleviate the effects of experimental noise [59, 60]. It might also be beneficial to grow the ansatz during the optimization as in ADAPT-VQE [61], Adapt-QAOA [62], layerwise learning [63] or Layer VQE [64]. Advanced gradient-descent techniques like stochastic gradient descent may reduce the number of optimization steps and avoid local minima [65]. Finally, different heuristic adaptations of the parameter τ and the learning rate can be explored to gain more information from circuit samples.

An interesting future application for QVF and F-VQE is the computation of states different from the ground state. Our filtering algorithms can target any state of energy Etarget, e.g. a specific excited state, simply by using the slightly modified filtering operator $f(\vert \mathcal{H}-{E}_{\text{target}}\vert ;\tau )$.

Another interesting application for QVF and F-VQE is the optimization based on black-box cost functions. Our filtering algorithms (as well as VQE) can optimize a variational ansatz given any black-box cost function ${\mathcal{H}}_{\text{black}-\text{box}}(\boldsymbol{x})$ that takes as input a bitstring of binary variables x and returns the associated cost function value. Therefore it is not necessary to represent the problem in terms of a QUBO Hamiltonian. A different problem representation might, e.g., reduce the required qubit count. We note that one of the first successful approaches for the optimization corresponding to such black-box cost functions is based on simulated annealing [66] which made it possible to tackle real-world problems that could not be tackled before [67]. An interesting future project is the comparison of our filtering algorithms with simulated annealing.

It is exciting to think about using filtering operators for the computation of ground states of quantum Hamiltonians. This can be achieved with some filtering operators. For example, the ITE filter is often used in combination with quantum Hamiltonians, for details see [23] and references therein. Using the circuits described in this article, it is also straightforward to realize the power and Chebyshev filters for integer values of τ in QVF as well as F-VQE. With respect to the inverse filter—the best performing filtering operator of this work—there exists a proposal to realize it with quantum Hamiltonians by means of a Fourier approximation [68]. With the help of the Fourier transform, also other filtering operators can be applied to quantum Hamiltonians [69].

The insights gained here are also useful for the design of new quantum-inspired classical algorithms that can obtain significant speedups compared to traditional classical methods [7072]. Here one interesting question is whether tensor network methods for ground state minimization can benefit from the use of filtering operators. Faster and more accurate ground state algorithms are crucial, e.g., for the construction of systematically improved functionals for density functional theory [73], or to further improve fast solvers for the boundary value problem of general nonlinear partial differential equations [74]. Such algorithms can also give us new answers to traditional tensor network questions regarding quantum phases and phase transitions in strongly correlated quantum systems [75, 76].

Acknowledgments

We thank Brian Neyenhuis and all the Quantinuum quantum hardware team for their availability and support with the H1 device. We acknowledge the cloud computing resources received from the 'Microsoft for Startups' program. We also thank Seyon Sivarajah, Alec Edginton, Richie Yeung, Ross Duncan, and all the TKET development team for their technical support.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.

Appendix A.: Hadamard test for QVF

In this Appendix we describe how to minimize the QVF cost function in equation (4) with the help of a Hadamard test.

Evaluating the cost function in equation (4) requires computing a quantity of the form ⟨ψ( ϕ )|Ft |ψ( θ )⟩, where |ψ( θ )⟩ and |ψ( ϕ )⟩ are n-qubit quantum ansatz circuits shown in figure 4(a) with parameter vectors θ and ϕ . The computation of such a quantity is also needed for the gradient calculation in equation (5). This quantity can be computed with a Hadamard test by evaluating the expectation value of the diagonal and Hermitian observable ZancFt with the specific circuit W( θ , ϕ ) in figure 8—composed of one ancilla qubit labeled by anc and an n-qubit register—that implements the following operation:

Equation (A1)

Figure 8.

Figure 8. Hadamard test circuit to compute Re⟨ψ( ϕ )|Ft |ψ( θ )⟩ for the parameterized quantum circuit in figure 4(a). The H gates acting on the ancilla qubit at the top are Hadamard gates. For an ansatz of p layers the block of gates inside the dashed rectangle is applied p times. This figure shows p = 1 layer. The parameter vectors satisfy θ ' = ϕ θ .

Standard image High-resolution image

Appendix B.: Analytical derivatives

In this Appendix we derive the analytical gradient in equation (5) and the diagonal elements of the Hessian matrix for the cost function in equation (4) corresponding to the QVF algorithm. We use the parameter-shift rule [36, 37] to derive the analytical gradient in equation (6) for the F-VQE algorithm. This procedure is also used to derive analytical gradients for VQE and QAOA.

B.1. QVF and F-VQE

In the quantum ansatz circuits |ψ( θ )⟩ of figure 4(a) parameters θ = (θ1, θ2, ..., θm ) are present only in rotation gates of the form RG (θj ) = exp(−j G/2), where G is a single-qubit Pauli operator or a tensor product of Pauli operators. As a function of a single parameter, the circuit can be expressed in terms of two fixed unitaries VA , VB and the rotation gate corresponding to the parameter:

Equation (B1)

where |0⟩ = |0⟩n is the initial register state. Hence the first and second derivatives with respect to a parameter θj are:

Equation (B2)

Equation (B3)

where we have used that −iG = RG (π).

Then the first and second derivatives of the cost function in equation (4) are:

Equation (B4)

Equation (B5)

The first equation corresponds to equation (5). Since the filtering operator Ft is Hermitian, when the first equation is evaluated for the vector of parameters θ t−1 that produces the state |ψt−1⟩ = |ψ( θ t−1)⟩ we can use the parameter-shift rule to express the numerator as a sum of two circuits, which leads to equation (6). When the second equation is evaluated for θ t−1, it results in:

Equation (B6)

which requires only the quantum circuit for |ψt−1⟩.

B.2. VQE

The cost function is the average energy ${\langle \mathcal{H}\rangle }_{\psi (\boldsymbol{\theta })}$ and the ansatz circuit |ψ( θ )⟩ is the same as the one employed by F-VQE shown in figure 4(a). Given that each parameter is present in only one rotation gate, the parameter-shift rule can be applied to express the analytical gradient as:

Equation (B7)

As with F-VQE, the circuits $\vert {\psi }_{t-1}^{j\pm }\rangle \equiv \vert \psi \left({\boldsymbol{\theta }}_{t-1}\pm \frac{\pi }{2}{\boldsymbol{e}}_{j}\right)\rangle $ are implemented by shifting the parameter θj by an amount ±π/2.

Using the same methods, the second derivative with respect to each parameter can be evaluated as

Equation (B8)

where |ψt−1⟩ = |ψ( θ t−1)⟩ is the state with no shifts, and $\vert {\psi }_{t-1}^{\enspace j++}\rangle =\vert \psi \left({\boldsymbol{\theta }}_{t-1}\pm \pi {\boldsymbol{e}}_{j}\right)\rangle $ is the state with a + π shift.

B.3. QAOA

The QAOA ansatz in equation (12) is not of the previous form: here parameters multiply sums of tensor products of Pauli operators so that the partial derivatives become sums of circuit evaluations. Given a Hamiltonian $\mathcal{H}={\sum }_{k=1}^{K}{h}_{k}{Z}_{{Q}_{k}}$ with real coefficients hk and ${Z}_{{Q}_{k}}={\bigotimes}_{q\in {Q}_{k}}{Z}_{q}$, the ansatz derivatives are:

Equation (B9)

Equation (B10)

Here ${\tilde{U}}_{a,b}=U({\boldsymbol{\gamma }}_{b},{\boldsymbol{\beta }}_{b})\dots U({\boldsymbol{\gamma }}_{a+1},{\boldsymbol{\beta }}_{a+1})U({\boldsymbol{\gamma }}_{a},{\boldsymbol{\beta }}_{a})$, where U( γ j , β j ) is defined in equation (13), contains all QAOA circuit layers from a to b. Therefore the partial derivatives of the QAOA cost function are:

Equation (B11)

Equation (B12)

If the parameter-shift rule is applied to every term individually, we obtain the analytical gradient for the QAOA ansatz in terms of ansatz circuits for various parameter sets:

Equation (B13)

Equation (B14)

The evaluation of all gradient components requires the following 2p(n + K) circuits that are defined by inserting a rotation gate in the ansatz circuit:

Equation (B15)

Equation (B16)

Appendix C.: Experimental details

This Appendix provides further details on the nine-qubit experimental results, shown in figure 7, obtained with the Quantinuum H1 trapped-ion quantum processor solving a MaxCut problem.

The considered MaxCut problem is defined on the ten-node three-regular weighted graph depicted in figure 9(a). The corresponding MaxCut weights are given in table 2. The solution divides the nodes into the two sets of black and white nodes shown in figure 9(a), equivalent to the ten-variable solution vector (1, −1, −1, 1, 1, −1, 1, −1, −1, −1) which corresponds to the nine-qubit solution computational state |011001011⟩ in our experiment, as explained in section 3.1.

Figure 9.

Figure 9. Experimental details relating to figure 7. (a) Ten-node three-regular weighted graph for the MaxCut problem with weights given in table 2. The corresponding solution is the division into black and white nodes. (b) Parameterized quantum circuit employed by F-VQE and (c) its compiled counterpart expressed in terms of Quantinuum H1 native gates.

Standard image High-resolution image

Table 2. MaxCut weights for the MaxCut problem considered in our experiment based on the graph in figure 9(a).

EdgeWeight
(1, 2)0.0609
(1, 8)0.1574
(1, 9)0.1392
(2, 5)0.6131
(2, 10)0.2670
(3, 5)0.4156
(3, 7)0.2020
(3, 8)0.0120
(4, 8)0.6738
(4, 9)0.5296
(4, 10)0.9927
(5, 6)0.2617
(6, 7)0.7781
(6, 10)0.0202
(7, 9)0.3973

Figure 9(b) shows the parameterized quantum circuit that is used in the experiment. This circuit is simpler than the ones that are used for the simulations—cf figure 4(a)—as it contains just p = 1 layer with two single-qubit rotation gates per qubit, one at the beginning and one at end of the circuit.

Before the circuit of figure 9(b) is run on the Quantinuum H1 processor, it is compiled into the circuit shown in figure 9(c) which is composed of the native gates of the processor. There are three native gates [39]: the standard single-qubit rotation gate around the Z axis, a single-qubit rotation gate around an axis in the XY plane PX(θ, ϕ) = Rz (ϕ)Rx (θ)Rz (−ϕ), and the two-qubit interaction gate Uzz = exp(−iπZZ/4). The compilation of the original two-qubit gates in figure 9(b) changes the rotation angles of the single-qubit gates to

Please wait… references are loading.
10.1088/2058-9565/ac3e54