Variational learning algorithms for quantum query complexity

Quantum query complexity plays an important role in the study of quantum algorithms and captures most known quantum algorithms, such as search and period finding. A query algorithm applies $U_tO_x\cdots U_1O_xU_0$ to some input state, where $O_x$ is the oracle dependent on some input variable $x$ and the $U_i$ are unitary operations independent of $x$, followed by measurements for readout. In this work, we develop variational learning algorithms to study quantum query complexity by formulating the $U_i$ as parameterized quantum circuits and introducing a loss function given directly by the error probability of the query algorithm. We apply our method to analyze various cases of quantum query complexity, including a new algorithm solving the Hamming modulo problem with $4$ queries for the case of $5$-bit modulo $5$, answering an open question raised in arXiv:2112.14682; the result is further confirmed by a Semidefinite Programming (SDP) algorithm. Compared with the SDP algorithm, our method can be readily implemented on near-term Noisy Intermediate-Scale Quantum (NISQ) devices and is more easily adapted to other cases such as fractional query models.


I. INTRODUCTION
Query models (also known as the decision tree model [1]) play an important role in analyzing quantum algorithms. They capture most of the known quantum algorithms, such as search [2], period-finding [3], Simon's algorithm [4], and the Deutsch-Jozsa algorithm [5]. They can also be used as a tool to analyze lower bounds for quantum algorithms [6][7][8]. A large body of work, e.g., [9][10][11][12][13][14][15][16][17][18], explores the advantage of quantum computing relative to classical computing in the query model. It has been shown that exponential separations between quantum and classical query complexity can be obtained for computing partial Boolean functions, whereas quantum query algorithms can only achieve polynomial speed-up over classical counterparts for computing total Boolean functions [6].
A quantum query algorithm (QQA) is defined by an initial state, which, without loss of generality, can be chosen as the all-zero state $|0\rangle^{\otimes n}$, and a sequence of transformations $U_0, O_x, U_1, \ldots, O_x, U_t$ [8]. Here $O_x$ is the oracle that depends on some input variable $x$, and the $U_i$ are unitary operations independent of $x$. The algorithm applies $U_t O_x \cdots U_1 O_x U_0$ to the input state $|0\rangle^{\otimes n}$ and measures the result, as shown in FIG. 1. The quantum query complexities in the bounded-error and exact settings are denoted by $Q_\varepsilon(f)$ and $Q_E(f)$ for a Boolean function $f$, respectively. In the bounded-error setting, $Q_\varepsilon(f)$ represents the minimum number of queries required to solve a problem with a bounded probability of error, while in the exact setting, $Q_E(f)$ signifies the minimum number of queries needed to obtain a solution with zero error.
* zengb@ust.hk
Determining the quantum query complexity of a particular function $f$ has motivated the development of distinct methods. Two notable methodologies are the polynomial method [6] and the adversary method [19]. The former hinges on representing or approximating an algorithm computing a Boolean function $f$ with a real-valued polynomial, while the latter exploits the limited information gained from oracle calls representing the function $f$. These methods have proven effective in characterizing bounds for quantum query complexities in both the bounded-error and exact settings.
The work of Barnum et al. [20] offers a precise determination of $Q_E(f)$ and $Q_\varepsilon(f)$ through a Semidefinite Programming (SDP) characterization of the quantum query complexity. This method, a variant of the adversary method, has paved the way for a numerical approach to analyzing and constructing quantum query algorithms. This numerical avenue was further explored by Montanaro et al. [21], who, inspired by numerical results, designed an extended form of the Deutsch-Jozsa algorithm and provided numerical results on the optimal success probabilities of quantum algorithms computing all Boolean functions on up to 4 bits, and all symmetric Boolean functions on up to 6 bits.
Despite these advancements, many questions about exact quantum query complexity remain open. For instance, the exact quantum query complexity $Q_E(f)$ for many Boolean functions, whether symmetric or asymmetric, remains unknown. Furthermore, the best known quantum speed-up factor had been stagnant at 2 for many years [22], until a breakthrough by Ambainis [10] introduced a total Boolean function with a superlinear advantage of exact quantum algorithms over their classical counterparts. The pursuit of total Boolean functions with acceleration factors surpassing 2 remains a significant challenge in the field of exact quantum query complexity.
The SDP method, while powerful, has limitations when addressing these open questions, especially for total Boolean functions, where the exponentially growing number of optimization parameters renders the SDP method infeasible for larger-scale Boolean functions. To overcome this challenge, in this paper, we propose a variational heuristic called VarQQA that recasts the SDP problem as an unconstrained optimization problem with an explicit loss function. The loss function can attain an optimal value of zero when the number of queries $t$ is at least $Q_E(f)$, so that exact quantum query algorithms appear as heuristic solutions. VarQQA also addresses the redundancy of the workspace register dimension in the SDP method by introducing it as a tunable hyperparameter in the optimization problem. This adjustment allows the workspace register dimension to be reduced after $Q_E(f)$ has been determined, thereby further optimizing the quantum query algorithm.
To demonstrate the advantages of our method, in this paper, we employ VarQQA to compute $Q_E(f)$ for two Boolean functions. The first one is the Hamming weight modulo function $\mathrm{MOD}^n_m$. In the paper by Cornelissen et al. [23], an open problem regarding this function was proposed, further conjecturing that $Q_E(\mathrm{MOD}^p_p) = p - 1$ for all prime numbers $p$. In the case of $p = 5$, our method affirmed that its exact query complexity is 4. Additionally, our algorithm discovered that only one qubit is required as a workspace register. We postulate that this is the optimal quantum query algorithm for computing $\mathrm{MOD}^5_5$, since it requires the fewest qubits and queries. Moreover, we numerically verified the cases for $p = 7$ and $p = 11$, thus lending credence to Cornelissen's conjecture. We also observed that the dimension of the workspace register grows exponentially with $p$, providing numerical insight for the analytical construction of such quantum query algorithms.
The other problem we addressed is the $\mathrm{EXACT}^n_{k,l}$ problem. Ambainis et al. [24] provided bounds for $Q_E(\mathrm{EXACT}^n_{k,l})$ and determined values for some specific cases. However, for many instances, the exact value of $Q_E(\mathrm{EXACT}^n_{k,l})$ within the known upper and lower bounds remains undetermined. We employed our method to compute $Q_E(\mathrm{EXACT}^n_{k,l})$ for some of the instances not previously determined. Notably, in some of the cases, we treated $n = 16$, which far exceeds the computational reach of the SDP method.
Our method presents a promising alternative numerical tool for studying quantum query complexity. The remainder of this paper is organized as follows: In Sec. II, we review the quantum query algorithm and the SDP method. In Sec. III, we introduce our variational learning algorithm. In Sec. IV, we apply our method to study several cases of quantum query complexity. We present our conclusions in Sec. V.

A. The quantum query model
In a query complexity model (classical or quantum), we wish to compute a function $f : S \to T$, where:
• $S \subseteq \Sigma^n$, with $\Sigma$ a finite set and $n$ a positive integer.
• $T$ is a finite set.
The input domain consists of points $x = (x_1, \ldots, x_n)$, where $x_i \in \Sigma$, and the output domain is $T$.
Consider a Quantum Query Algorithm (QQA) to compute a Boolean function $f : S \subseteq \{0, 1\}^n \to T$. The QQA requires:
• the input register to hold the input $x \in \{0, 1\}^n$.
• the query register to hold an integer between 0 and $n$.
• the workspace register, which has no restriction on dimension.
The total Hilbert space can be expressed as $H = H_{\mathrm{input}} \otimes H_Q \otimes H_W$. The tensor product of the query register and the workspace register, $H_A = H_Q \otimes H_W$, is called the accessible space. The dimensions of $H_{\mathrm{input}}$, $H_Q$, and $H_W$ are $d_{\mathrm{input}}$, $d_q$, and $d_w$ respectively, where $d_{\mathrm{input}} = 2^n$, $d_q = n + 1$, and $d_w \ge 1$. A quantum state in the Hilbert space $H$ can be represented as:
$$|\Psi\rangle = \sum_{x,i,w} \alpha_{x,i,w}\, |x\rangle |i\rangle |w\rangle. \tag{1}$$
To compute $f(x)$, the QQA works as follows:
a. Input the quantum state $|x\rangle|0\rangle|0\rangle$.
b. Sequentially apply the unitaries $U_i$, which act only on the accessible space, and the oracle $O$.
c. Measure the qubits in the accessible space.
The complexity of a quantum query algorithm is determined by the number of queries it makes.
Here, the oracle $O$ satisfies
$$O\, |x\rangle |i\rangle |w\rangle = (-1)^{x_i}\, |x\rangle |i\rangle |w\rangle. \tag{2}$$
Since the oracle $O$ leaves the state in the input Hilbert space $H_{\mathrm{input}}$ unchanged, an equivalent QQA works only on the accessible space, where the query operator now depends on the input $x$ and is given by:
$$O_x\, |i\rangle |w\rangle = (-1)^{x_i}\, |i\rangle |w\rangle, \tag{3}$$
where $x_0$ is not part of the input and is defined to be the constant 0. Note that, as mentioned in [20], the index $i = 0$ is needed in the model for an important technical reason, without which the model is not even capable of computing some simple functions. However, in many cases, such as Grover's algorithm, the index $i = 0$ is not needed. Then, the initial state can be set with $\alpha_{x,i,w} = 0$ for $i = 0$ in Eq. (1), and the unitary operations are constructed to act trivially on the corresponding dimension. The QQA is then given by:
a. Input the initial state $|0\rangle|0\rangle$ in the accessible space $H_A$.
b. Sequentially apply the unitaries $U_i$, which are independent of $x$, and the oracle $O_x$ for a specific $x$.
c. Measure the qubits in the accessible space.
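A $t$-step quantum query algorithm operating in the accessible space is depicted in Figure 1. Let $|\Psi_x^{(j)}\rangle$ denote the quantum state after the application of $U_j$ for a given input $x$, which can be expressed as
$$|\Psi_x^{(j)}\rangle = U_j O_x U_{j-1} \cdots O_x U_0\, |0\rangle|0\rangle. \tag{4}$$
This state can be uniquely decomposed as:
$$|\Psi_x^{(j)}\rangle = \sum_{i=0}^{n} |i\rangle\, |\Psi_{x,i}^{(j)}\rangle, \tag{5}$$
where $|\Psi_{x,i}^{(j)}\rangle \in H_W$.
Note that while $|\Psi_{x,i}^{(j)}\rangle$ is a vector in $H_W$, it may not be normalized.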
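The three steps above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: the helper names (`oracle`, `random_unitary`, `run_qqa`) are hypothetical, the unitaries are random rather than optimized, and the basis ordering (query index major, workspace index minor) is an assumption chosen so that a simple reshape recovers the unnormalized workspace components.

```python
import numpy as np

def oracle(x, d_w):
    """Phase oracle O_x on H_Q (x) H_W; x_0 is the constant 0 (assumed basis order: i major, w minor)."""
    bits = np.concatenate(([0], np.asarray(x)))   # prepend x_0 = 0
    phases = (-1.0) ** bits                        # (-1)^{x_i} for i = 0..n
    return np.diag(np.repeat(phases, d_w))         # identity on the workspace register

def random_unitary(d, rng):
    """Random unitary obtained from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

def run_qqa(unitaries, x, d_w):
    """Return U_t O_x ... O_x U_0 |0>|0> for the given list [U_0, ..., U_t]."""
    d = unitaries[0].shape[0]
    state = np.zeros(d, dtype=complex)
    state[0] = 1.0                                 # initial state |0>|0>
    state = unitaries[0] @ state
    o_x = oracle(x, d_w)
    for u in unitaries[1:]:
        state = u @ (o_x @ state)                  # one query followed by one unitary
    return state

n, d_w, t = 3, 2, 2
rng = np.random.default_rng(0)
us = [random_unitary((n + 1) * d_w, rng) for _ in range(t + 1)]
psi = run_qqa(us, [1, 0, 1], d_w)
blocks = psi.reshape(n + 1, d_w)   # row i holds the unnormalized workspace vector for query index i
```

The rows of `blocks` are generally not unit vectors, but their squared norms sum to 1 because the full state is normalized.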
In the quantum query algorithm, results are extracted by performing a projective measurement on the output state. This measurement is defined using a Complete Set of Orthogonal Projectors (CSOP). Specifically, the CSOP for the algorithm is the indexed set $\{\Pi_{f(x)} : f(x) \in T\}$ of projectors. These projectors are pairwise orthogonal and satisfy $\sum_{f(x) \in T} \Pi_{f(x)} = I_A$, where $I_A$ denotes the identity operator on the accessible space. Given a direct sum decomposition of the accessible space as $\bigoplus_{f(x) \in T} H_{f(x)}$ into orthogonal subspaces, there exists a unique CSOP $\{\Pi_{f(x)} : f(x) \in T\}$. In this context, $\Pi_{f(x)}$ acts as the identity on $H_{f(x)}$ and annihilates $H_{f(x')}$ for all $f(x') \neq f(x)$.
In the QQA, the output for a given input $x$ is governed by the probability $\langle \Psi_x^{(t)} | \Pi_{f(x)} | \Psi_x^{(t)} \rangle$. This is the probability of obtaining the correct output $f(x)$ upon measurement of the output quantum state.
A QQA is termed exact when it always produces the correct output for every input, mathematically expressed as:
$$\langle \Psi_x^{(t)} | \Pi_{f(x)} | \Psi_x^{(t)} \rangle = 1 \quad \text{for all } x \in S. \tag{6}$$
However, in practical scenarios, a QQA might not always be exact. It is said to compute the function $f$ within error $\varepsilon$ if the probability of obtaining the correct output is at least $1 - \varepsilon$ for every input, as given by:
$$\langle \Psi_x^{(t)} | \Pi_{f(x)} | \Psi_x^{(t)} \rangle \ge 1 - \varepsilon \quad \text{for all } x \in S. \tag{7}$$
This criterion allows for a certain degree of tolerance in the algorithm's accuracy while still being considered effective.
The exact quantum query complexity, $Q_E(f)$, is the smallest number of queries any quantum algorithm needs to compute $f(x)$ correctly for all $x$. Similarly, the bounded-error quantum query complexity, $Q_\varepsilon(f)$, is the minimal number of queries needed to compute $f$ with error probability at most $\varepsilon$.
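As a small illustration of a CSOP, the sketch below builds projectors onto consecutive coordinate blocks of the accessible space and evaluates the measurement probabilities for a test state. Taking the subspaces to be coordinate blocks is an assumption made for simplicity (a final unitary can rotate any orthogonal decomposition into this form); the function names are hypothetical.

```python
import numpy as np

def csop_from_dims(dims):
    """Projectors onto consecutive coordinate blocks whose sizes are given by dims."""
    d = sum(dims)
    projectors, start = [], 0
    for size in dims:
        p = np.zeros((d, d))
        p[start:start + size, start:start + size] = np.eye(size)
        projectors.append(p)
        start += size
    return projectors

def outcome_probability(state, projector):
    """Probability <state| projector |state> of the outcome associated with the projector."""
    return float(np.real(np.vdot(state, projector @ state)))

# A 12-dimensional accessible space split into five orthogonal subspaces.
projs = csop_from_dims([2, 4, 1, 1, 4])
state = np.ones(12, dtype=complex) / np.sqrt(12)
probs = [outcome_probability(state, p) for p in projs]
```

For the uniform test state, each outcome probability equals the subspace dimension divided by 12, and the probabilities sum to 1 because the projectors resolve the identity.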

B. Semidefinite Programming Formulation for Quantum Query Algorithms
In this section, we revisit the method of Barnum et al. [20], specifically addressing how quantum query algorithms can be represented as semidefinite programs.
We begin by introducing the concept of the Gram matrix. Consider an indexed family of vectors $(|\Psi_x\rangle : x \in S)$ within the Hilbert space $H$. The associated Gram matrix, denoted $\mathrm{Gram}(|\Psi_x\rangle : x \in S)$, is an $|S| \times |S|$ matrix $M$ defined by:
$$M[x, y] = \langle \Psi_x | \Psi_y \rangle. \tag{8}$$
We now define the Gram matrices that arise during the QQA. The states defined in Eqs. (4) and (5) form a sequence linked by the application of two successive unitaries: the oracle $O_x$ corresponding to the input $x$, and the input-independent unitary operator $U_j$. The final step of the algorithm (measurement) is specified by the orthogonal projectors $\Pi_z$, each corresponding to a different output $z \in T$.
We define the following symmetric matrices $M^{(j)}$, $M_i^{(j)}$, and $\Gamma_z$ with the matrix elements
$$M^{(j)}[x, y] = \langle \Psi_x^{(j)} | \Psi_y^{(j)} \rangle, \tag{9}$$
$$M_i^{(j)}[x, y] = \langle \Psi_{x,i}^{(j)} | \Psi_{y,i}^{(j)} \rangle, \tag{10}$$
$$\Gamma_z[x, y] = \langle \Psi_x^{(t)} | \Pi_z | \Psi_y^{(t)} \rangle. \tag{11}$$
With the symmetric matrices defined, we are now equipped to present the SDP formulation for the quantum query algorithm.
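The Gram matrices can be computed directly from a table of state vectors. The sketch below is illustrative only: it uses random normalized vectors in place of actual algorithm states, assumes the basis ordering "query index major, workspace index minor", and uses hypothetical helper names. It checks the identity that the full Gram matrix is the sum of the per-index Gram matrices, which follows from the decomposition of each state over the query index.

```python
import numpy as np

def gram(states):
    """Gram matrix M[x, y] = <Psi_x | Psi_y> for states stored as rows."""
    states = np.asarray(states)
    return states.conj() @ states.T

def gram_blocks(states, n, d_w):
    """Return [M_0, ..., M_n] with M_i[x, y] = <Psi_{x,i} | Psi_{y,i}>."""
    blocks = np.asarray(states).reshape(len(states), n + 1, d_w)
    return [gram(blocks[:, i, :]) for i in range(n + 1)]

rng = np.random.default_rng(1)
n, d_w, num_inputs = 2, 2, 4
shape = (num_inputs, (n + 1) * d_w)
raw = rng.normal(size=shape) + 1j * rng.normal(size=shape)
states = raw / np.linalg.norm(raw, axis=1, keepdims=True)   # stand-ins for the |Psi_x>

M = gram(states)
Ms = gram_blocks(states, n, d_w)
```

Each `Ms[i]` is positive semidefinite, and its rank is bounded by the workspace dimension `d_w`, which is how the rank of these matrices relates to the size of the workspace register.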
Definition 1 (SDP Formulation for QQA) Given a Boolean function $f$ and a number of queries $t$, solving the SDP problem with conditions (12)-(16), which yields the optimal value $\varepsilon^\star$, is equivalent to finding a $t$-step QQA that computes $f$ within error $\varepsilon^\star$. In these conditions, the index $j$ ranges over $1 \le j \le t - 1$, $\odot$ denotes the element-wise product, and $E_0$ is the constant 1 matrix.
Theorem 2 (Barnum, Saks, and Szegedy [20]) There exists a $t$-query QQA($f$) that computes the function $f$ within error $\varepsilon$ if and only if SDP($f, t, \varepsilon$) with conditions (12), (13), (14), (15), and (16) is feasible. Furthermore, letting $r = \max(\mathrm{rank}(M_i^{(j)}))$, the necessary dimension of the workspace register is $r$.
Solving SDP($f, t, \varepsilon$) to obtain the optimal value $\varepsilon^\star$ demonstrates that a quantum algorithm with $t$ queries can compute the function $f$ with an error rate of $\varepsilon^\star$. Therefore, determining the exact quantum query complexity of the function $f$ translates to identifying the minimum number of queries, $t_{\min}$, for which the optimal value $\varepsilon^\star$ of SDP($f, t_{\min}, \varepsilon$) is zero. Montanaro et al. [21] considered an $\varepsilon^\star$ value less than 0.001 as indicative of the existence of an exact quantum query algorithm. Thus, the goal of computing $Q_E(f)$ with SDP is to determine the smallest $t$ for which $\varepsilon^\star$ falls below this threshold.
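For orientation, the semidefinite program of Barnum, Saks, and Szegedy is commonly stated in the following form. This is a sketch recalled from the general literature rather than a transcription of conditions (12)-(16): the grouping of the constraints, the symbol $\Delta_z$, and the explicit form of $E_i$ are assumptions in that sense.

```latex
\begin{aligned}
\text{minimize}\quad & \varepsilon \\
\text{subject to}\quad
& \sum_{i=0}^{n} M_i^{(0)} = E_0, \\
& \sum_{i=0}^{n} M_i^{(j)} = \sum_{i=0}^{n} E_i \odot M_i^{(j-1)}, \qquad 1 \le j \le t-1, \\
& M^{(t)} = \sum_{i=0}^{n} E_i \odot M_i^{(t-1)}, \\
& M^{(t)} = \sum_{z \in T} \Gamma_z, \\
& \Delta_z \odot \Gamma_z \ge (1-\varepsilon)\, \Delta_z, \qquad z \in T, \\
& M_i^{(j)} \succeq 0, \quad \Gamma_z \succeq 0 .
\end{aligned}
```

Here $E_i[x, y] = (-1)^{x_i + y_i}$, so that $E_0$ is the all-ones matrix because $x_0 = 0$, and $\Delta_z$ is the diagonal 0/1 matrix with $\Delta_z[x, x] = 1$ exactly when $f(x) = z$. The recursion reflects that an oracle call multiplies the $i$-th Gram block entrywise by $E_i$, while an input-independent unitary leaves the total Gram matrix unchanged.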
SDP benefits from the robust framework of convex optimization but encounters significant challenges in practical applications. A critical issue is the exponential growth in the number of decision variables, which correspond to the optimization parameters. This growth is directly linked to the cardinality of the input set for Boolean functions, presenting difficulties in solving even moderately sized problems. Specifically, the memory requirements for SDP($f, t, \varepsilon$), primarily due to $M_i^{(j)}$ as indicated in Eq. (13), scale as $O(nt|S|^2)$. With $|S| = 2^n$ for total Boolean functions, the memory demand escalates to $O(tn4^n)$. Our numerical tests show that this scaling renders SDP($f, t, \varepsilon$) inefficient for Boolean functions with an input length of $n = 10$, even on a server equipped with 256 GB of memory.
Furthermore, the SDP method does not optimize the dimension of the workspace register, $d_w$, in the computation of $Q_E(f)$. This often results in a situation where $\max(\mathrm{rank}(M_i^{(j)}))$ is approximately equal to $|S|$, suggesting that the derived quantum query algorithms (QQAs) typically require the maximum possible $d_w$. However, our investigations using the VarQQA approach indicate that the actual $d_w$ necessary for these computations is frequently lower than the maximum predicted by SDP.

III. THE VarQQA METHOD
In addressing quantum query complexity, we pivot from the traditional Semidefinite Programming (SDP) approach to a variational learning strategy, commonly employed in machine learning (ML) for tackling complex optimization challenges. As illustrated in FIG. 1, we parameterize the unitaries using free parameters, transforming the constrained convex optimization of SDP into an unconstrained landscape. Transitioning to an unconstrained optimization framework not only aligns with established ML optimization strategies but also gives access to a wide range of techniques tailored for such problems. Specifically, we can use auto-differentiation for efficient gradient computations, a hallmark of modern ML methodologies. Additionally, for the optimization itself, we employ the Limited-memory BFGS method [25], known for its efficiency in handling large-scale problems. It is important to clarify that the VarQQA framework operates as a heuristic that uses these ML techniques to explore quantum query problems, rather than as an algorithm with guaranteed optimal solutions.

A. Formulation of VarQQA
Given a Boolean function and a specified number of queries $t$, our objective is to find the optimal quantum query algorithm. Here, by "algorithm," we specifically refer to the configuration of each unitary in the circuit. The term "optimal" is used in the sense of achieving the minimal value of the loss function. VarQQA is designed to address this challenge. It parameterizes the unitary operations with trainable parameters and defines a loss function directly linked to the error rate. Through iterative refinement of these parameters, VarQQA seeks to minimize the loss function, thereby identifying the quantum query algorithm configuration that yields the lowest error for the Boolean function $f$ with $t$ queries.
Central to our VarQQA method are two components: the parameterization of unitaries and the formulation of the loss function. For the unitary parameterization, we employ a direct exponential of Hermitian matrices, which gives a compact and efficient representation of our quantum operations. A detailed discussion of this parameterization technique is provided in Appendix A. The loss function, in turn, is the pivotal element in our variational framework, offering a quantifiable metric for the accuracy of the quantum query algorithm. We now describe it in detail.
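The exponential parameterization can be sketched as follows. The text specifies only that each unitary is the exponential of a Hermitian matrix; the particular packing of $d^2$ real parameters into a Hermitian matrix used here (real parts above the diagonal, imaginary parts stored below it, real diagonal) and the function names are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import expm

def hermitian_from_params(theta, d):
    """Map d*d real parameters to a d x d Hermitian matrix (assumed packing)."""
    a = np.asarray(theta, dtype=float).reshape(d, d)
    upper = np.triu(a, 1)                 # real parts of the strictly upper triangle
    lower = np.tril(a, -1)                # imaginary parts, stored in the lower triangle
    h = upper + 1j * lower.T              # strictly upper-triangular complex matrix
    return h + h.conj().T + np.diag(np.diag(a))

def unitary_from_params(theta, d):
    """U(theta) = exp(i H(theta)), unitary for every real parameter vector."""
    return expm(1j * hermitian_from_params(theta, d))

rng = np.random.default_rng(0)
d = 4
theta = rng.normal(size=d * d)
H = hermitian_from_params(theta, d)
U = unitary_from_params(theta, d)
```

Because every real parameter vector yields a valid unitary, the optimizer can move freely in parameter space without any unitarity constraint, which is what makes the problem unconstrained.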
Definition 3 (Loss Function for VarQQA) Consider a $t$-step quantum query algorithm where each unitary operation is parameterized as $U_0(\theta_0), \ldots, U_t(\theta_t)$. The output state of the algorithm for an input $x$ is represented by:
$$|\Psi_x(\theta)\rangle = U_t(\theta_t)\, O_x\, U_{t-1}(\theta_{t-1}) \cdots O_x\, U_0(\theta_0)\, |0\rangle|0\rangle.$$
Given the complete set of orthogonal projectors $\{\Pi_{f(x)} \mid f(x) \in T\}$, we introduce the loss function $E$ as:
$$E(\theta) = \frac{1}{|S|} \sum_{x \in S} \langle \Psi_x(\theta) | \Pi^{\perp}_{f(x)} | \Psi_x(\theta) \rangle,$$
where $\Pi^{\perp}_{f(x)}$ is defined as:
$$\Pi^{\perp}_{f(x)} = I_A - \Pi_{f(x)},$$
and it denotes the projector onto the orthogonal complement of the space associated with $f(x)$.
The loss function $E$, defined as the average error rate over all inputs $x$, is non-negative because every individual error rate is non-negative. As highlighted by Eq. (6), an "exact" quantum query algorithm is one where the error rate for each input $x$ is zero. Therefore, $E$ reaches its minimum value of zero if and only if the algorithm yields no error on any input, since an average of non-negative terms vanishes only when every term is zero.
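To make the loss concrete, the sketch below evaluates the average error for explicitly given unitaries. It assumes coordinate-block projectors described by a `labels` array (basis vector `k` is assigned to output `labels[k]`), which is an illustrative choice rather than the authors' convention. The example is the 2-bit XOR function, for which the well-known one-query construction (query both bits in superposition and read out the relative sign) should give zero loss.

```python
import itertools
import numpy as np

def loss(unitaries, f, n, d_w, labels):
    """Average error (1/|S|) * sum_x <Psi_x| (I - Pi_{f(x)}) |Psi_x> over all n-bit inputs."""
    d = (n + 1) * d_w
    inputs = list(itertools.product([0, 1], repeat=n))
    total = 0.0
    for x in inputs:
        phases = np.repeat((-1.0) ** np.array((0,) + x), d_w)   # oracle diagonal, x_0 = 0
        psi = np.zeros(d, dtype=complex)
        psi[0] = 1.0
        psi = unitaries[0] @ psi
        for u in unitaries[1:]:
            psi = u @ (phases * psi)
        p_correct = np.sum(np.abs(psi[labels == f(x)]) ** 2)
        total += 1.0 - p_correct
    return total / len(inputs)

def xor(x):
    return x[0] ^ x[1]

s = 1 / np.sqrt(2)
V = np.array([[1, 0, 0], [0, s, s], [0, s, -s]])   # real, symmetric, V @ V = I
P = np.eye(3)[[1, 0, 2]]                           # swaps basis vectors 0 and 1
labels = np.array([0, 0, 1])                       # coordinates 0, 1 -> output 0; coordinate 2 -> output 1

exact_loss = loss([V @ P, V], xor, n=2, d_w=1, labels=labels)
trivial_loss = loss([np.eye(3), np.eye(3)], xor, n=2, d_w=1, labels=labels)
```

With identity unitaries the state never leaves the first basis vector, so the algorithm always answers 0 and is wrong on the two inputs with odd parity, giving an average error of one half.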
We are now prepared to present our VarQQA, as detailed in the learning process outlined in Algorithm 1.

B. Searching for $Q_E(f)$ with VarQQA
Given a Boolean function $f$, we can apply VarQQA to search for its $Q_E(f)$. The methodology is similar to that of SDP methods: we gradually increase the number of queries $t$ until the minimal error rate $E^\star$ falls below a threshold $\epsilon$. In this work, we choose $\epsilon = 10^{-5}$ as the threshold. This means that when we find a quantum query algorithm with an error rate $E < 10^{-5}$, we consider it an exact quantum query algorithm.
VarQQA does not inherently ensure convergence to the global minimum due to its non-convex nature.However, we can employ several strategies to enhance the likelihood of reaching the global optimum.Utilizing multiple random initializations allows us to explore various regions of the solution space, increasing the probability of finding the global minimum.Advanced optimizers, particularly those with adaptive learning rates or momentum-based techniques, can significantly refine the search process.Additionally, the incorporation of domain-specific knowledge can guide the optimization process more effectively.For example, when the lower bound of Q E (f ) is known, VarQQA can leverage this information to streamline the search, particularly for large-scale problems where SDP is not viable.
By implementing these strategies, we aim to mitigate the limitations inherent to VarQQA and improve the performance of our algorithm. As demonstrated in the results section (see Sec. IV), VarQQA exhibits a strong capability to find the global minimum, with most algorithms converging to it within a few random initializations.
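The search procedure described in this subsection can be sketched as a loop over the number of queries with several random restarts per value of $t$. This is a minimal illustration under stated assumptions, not the released implementation: it uses SciPy's L-BFGS-B with finite-difference gradients instead of auto-differentiation, a fixed coordinate-block measurement given by `labels`, and hypothetical function names. The 2-bit XOR function is used as a toy target; it has a well-known exact one-query algorithm, so the search would be expected to stop at $t = 1$ when the optimizer converges.

```python
import itertools
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

def unitary(theta, d):
    """exp(i H) with H Hermitian, built from d*d real parameters (assumed packing)."""
    a = theta.reshape(d, d)
    h = np.triu(a, 1) + 1j * np.tril(a, -1).T
    return expm(1j * (h + h.conj().T + np.diag(np.diag(a))))

def loss(theta, f, n, d_w, t, labels):
    """Average error of the t-query algorithm defined by the flat parameter vector theta."""
    d = (n + 1) * d_w
    us = [unitary(th, d) for th in theta.reshape(t + 1, d * d)]
    inputs = list(itertools.product([0, 1], repeat=n))
    err = 0.0
    for x in inputs:
        phases = np.repeat((-1.0) ** np.array((0,) + x), d_w)
        psi = us[0][:, 0]                           # U_0 applied to |0>|0>
        for u in us[1:]:
            psi = u @ (phases * psi)
        err += 1.0 - np.sum(np.abs(psi[labels == f(x)]) ** 2)
    return err / len(inputs)

def search_qe(f, n, d_w, labels, t_max, restarts=5, tol=1e-5, seed=0):
    """Increase t until some restart reaches a loss below tol; return (t, best loss)."""
    rng = np.random.default_rng(seed)
    d = (n + 1) * d_w
    best = np.inf
    for t in range(t_max + 1):
        best = np.inf
        for _ in range(restarts):
            theta0 = rng.normal(size=(t + 1) * d * d)
            res = minimize(loss, theta0, args=(f, n, d_w, t, labels), method="L-BFGS-B")
            best = min(best, res.fun)
        if best < tol:
            return t, best
    return None, best                               # no exact algorithm found up to t_max

def xor(x):
    return x[0] ^ x[1]

labels = np.array([0, 0, 1])
t_found, best = search_qe(xor, n=2, d_w=1, labels=labels, t_max=2)
print(t_found, best)
```

With zero queries the final state cannot depend on the input, so the average error for XOR is exactly one half regardless of the parameters, and the loop necessarily moves on to $t = 1$. A `None` result only means that the heuristic did not find a solution, not that none exists.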

IV. RESULTS
In this section, we investigate the feasibility of employing our VarQQA method to compute the exact quantum query complexity $Q_E(f)$ for specific Boolean functions $f$. We select two total functions, $\mathrm{MOD}^n_m$ and $\mathrm{EXACT}^n_{k,l}$, whose $Q_E(f)$ values are currently open problems. By applying VarQQA to these cases and contrasting the results with those obtained via the SDP method, we aim to demonstrate the algorithm's capability and potential advantages in solving such problems.
The implementation of VarQQA and numerical results in this section can be accessed at [28].

A. Hamming Weight Modulo Function
An $n$-bit mod $m$ Hamming weight modulo function is defined as:
$$\mathrm{MOD}^n_m(x) = |x| \bmod m,$$
where $|x|$ denotes the Hamming weight of $x$.
Recently, Cornelissen et al. [23] demonstrated that for $\mathrm{MOD}^n_m$ where the prime factors of $m$ are only 2 or 3, the exact quantum query complexity is $\lceil n(1 - \frac{1}{m}) \rceil$. Furthermore, they established that the exact quantum query complexity for any $1 < m \le n$ is at least this amount. Following these findings, they proposed Conjecture 1, suggesting that this lower bound is indeed tight.
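A short sketch of the function and of the query bound discussed above, read here as $\lceil n(1 - 1/m) \rceil$ (this reading of the bound and the function names are assumptions for illustration). The bound is computed with integer arithmetic, using $\lceil n - n/m \rceil = n - \lfloor n/m \rfloor$.

```python
def mod_function(x, m):
    """MOD^n_m(x) = |x| mod m, where |x| is the Hamming weight of the bit tuple x."""
    return sum(x) % m

def mod_query_bound(n, m):
    """ceil(n * (1 - 1/m)), evaluated exactly as n - floor(n / m)."""
    return n - n // m

for p in (2, 3, 5, 7, 11):
    # For n = m = p the bound equals p - 1.
    print(p, mod_query_bound(p, p))
```

For $n = m = p$ the bound reduces to $p - 1$, which is the value examined numerically for $p = 5, 7, 11$.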

Conjecture 1 The exact quantum query complexity of $\mathrm{MOD}^n_m$ is $\lceil n(1 - \frac{1}{m}) \rceil$ for all $1 < m \le n$.
The recursive proof method used in their work suggests that Conjecture 1 could be resolved if we can prove $Q_E(\mathrm{MOD}^p_p) = p - 1$ for prime numbers $p$. Given that the cases $p = 2$ and $p = 3$ have been resolved, a pressing question emerges: is it possible to construct a 4-query quantum algorithm that exactly computes $\mathrm{MOD}^5_5$? We employed the VarQQA method to identify a 4-step QQA capable of computing $\mathrm{MOD}^5_5$ exactly. VarQQA found a QQA that computes $\mathrm{MOD}^5_5$ with an error rate below $10^{-5}$, using a workspace dimension $d_w = 2$. We first tried $d_w = 1$, but an exhaustive search did not yield a viable solution. With the workspace dimension increased to $d_w = 2$, the accessible dimension is $d_A = 2 \times (n + 1) = 12$. The accessible space was then partitioned into orthogonal subspaces with dimensions (2, 4, 1, 1, 4). We posit that the QQA discovered through VarQQA for $\mathrm{MOD}^5_5$ is optimal, as it uses the minimal number of queries and the smallest workspace dimension found to suffice.
To corroborate our results, the SDP method was also applied to seek a 4-step QQA for $\mathrm{MOD}^5_5$. The SDP approach confirmed the feasibility of solving the Hamming weight modulo problem with an error rate $\varepsilon$ below $10^{-5}$. However, in terms of workspace dimension, the QQA given by SDP required a workspace register with dimension 22, since $\max(\mathrm{rank}(M_i^{(j)})) = 22$. Further analysis considered the Gram matrix of the output state, denoted as $M^{(4)}[x, y] = \langle \psi_x^{(4)} | \psi_y^{(4)} \rangle$.

FIG. 2. Heatmaps of the Gram matrices $M^{(1)}$, $M^{(2)}$, $M^{(3)}$, and $M^{(4)}$, respectively. Given that the input domain for $\mathrm{MOD}^5_5$ is $\{0, 1\}^5$, each Gram matrix is a $32 \times 32$ matrix. The heatmaps capture the magnitude of the inner products $|\langle \psi_x | \psi_y \rangle|$.

In a further exploration, we applied VarQQA to search for a 4-step QQA that can "exactly" compute $\mathrm{MOD}^5_5$ and successfully identified such an algorithm with an error rate below $10^{-7}$. The workspace dimension is $d_w = 2$, thus the accessible dimension is $d_A = 2 \times (n + 1) = 12$. The accessible space is partitioned into orthogonal subspaces with dimensions (2, 4, 1, 1, 4). When we initially explored the case with $d_w = 1$, our exhaustive search failed to find a solution, suggesting that a workspace dimension of at least $d_w = 2$ is necessary. We believe the QQA found by VarQQA for $\mathrm{MOD}^5_5$ is optimal, given that it uses the minimal number of queries and minimal workspace dimension. To validate our findings, we also applied the SDP method to search for a 4-step QQA that computes $\mathrm{MOD}^5_5$. The SDP method confirms that a 4-step QQA can solve the Hamming weight modulo problem with an error rate $\varepsilon$ below $10^{-5}$. However, the SDP results only imply that the workspace dimension is at most 22, as indicated by $\max(\mathrm{rank}(M_i^{(j)})) = 22$. The Gram matrix analysis at each stage after calling the oracle provides insights into the output state of the QQA. We found that inputs with the same Hamming weight modulo value have the same output state, differing only by a global phase. The progression of the Gram matrix after each oracle call is depicted in FIG. 2, which may provide guidance for constructing analytical algorithms.
The question of whether a 6-step QQA can exactly compute the $\mathrm{MOD}^7_7$ Hamming weight modulo function was also addressed. Indeed, VarQQA found a solution, with an error rate of approximately $10^{-7}$. The accessible space was partitioned into subspaces with dimensions (2, 4, 7, 12, 12, 7, 4), with a workspace dimension of $d_w = 6$. The SDP method also identified a 6-step QQA with an error rate $\varepsilon$ below $10^{-5}$, corroborating our findings.

Table I. Evaluation of $Q_E(\mathrm{MOD}^p_p)$ for primes $p = 5, 7, 11$, with the conjectured exact complexity being $p - 1$. The columns $Q_{\mathrm{SDP}}$ and $Q_{\mathrm{Var}}$ indicate the minimum number of queries required to maintain an error rate $\varepsilon$ below $10^{-5}$, as determined by the SDP and VarQQA approaches, respectively. The value $r = \max(\mathrm{rank}(M_i^{(j)}))$ denotes the maximum workspace dimension deduced from SDP outcomes, whereas $d_w$ specifies the workspace dimension ascertained through VarQQA. Additionally, $D_A$ details the dimensions of each orthogonal subspace within the accessible space, in accordance with the complete set of orthogonal projectors.
The dimension of the Gram matrix grows exponentially with $n$, which prevents the SDP method from going further. Applying VarQQA to the $\mathrm{MOD}^{11}_{11}$ case yielded a 10-step QQA that calculates $\mathrm{MOD}^{11}_{11}$ with an error rate around $10^{-6}$. The workspace dimension $d_w$ equals 61, and the accessible space is partitioned into subspaces with dimensions (2, 6, 18, 60, 120, 161, 161, 120, 60, 18, 6). These results are summarized in Table I.
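The reported dimensions can be cross-checked with a few lines of arithmetic: for $\mathrm{MOD}^p_p$ the accessible dimension is $d_A = (p + 1)\, d_w$, the subspace dimensions must sum to $d_A$, and there is one subspace per output value. The dictionary below simply restates the values quoted in the text.

```python
# Workspace dimension d_w and subspace dimensions reported for MOD^p_p.
results = {
    5:  {"d_w": 2,  "dims": [2, 4, 1, 1, 4]},
    7:  {"d_w": 6,  "dims": [2, 4, 7, 12, 12, 7, 4]},
    11: {"d_w": 61, "dims": [2, 6, 18, 60, 120, 161, 161, 120, 60, 18, 6]},
}

accessible_dims = {}
for p, r in results.items():
    d_a = (p + 1) * r["d_w"]          # query register has p + 1 levels (indices 0..p)
    assert len(r["dims"]) == p        # one subspace per output value 0..p-1
    assert sum(r["dims"]) == d_a      # the subspaces fill the accessible space
    accessible_dims[p] = d_a

print(accessible_dims)
```

The three cases give accessible dimensions 12, 48, and 732, consistent with the listed partitions.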
We believe that Conjecture 1 generally holds true, and we further conjecture that the workspace dimension of a QQA capable of exactly computing $\mathrm{MOD}^n_n$ scales exponentially with $n$.

B. $\mathrm{EXACT}^n_{k,l}$ function
Consider the following $n$-bit function with $0 \le k < l \le n$:
$$\mathrm{EXACT}^n_{k,l}(x) = \begin{cases} 1, & \text{if } |x| \in \{k, l\}, \\ 0, & \text{otherwise,} \end{cases}$$
i.e., the function returns 1 only when exactly $k$ or $l$ of the bits $x_i$ are 1. Ambainis et al. [24] proved upper and lower bounds on $Q_E(\mathrm{EXACT}^n_{k,l})$. While previous studies have established a lower bound for $Q_E(\mathrm{EXACT}^n_{k,l})$, the question of whether these bounds are tight remains open. We apply VarQQA to study the remaining gap for $Q_E(\mathrm{EXACT}^n_{k,l})$. We claim that an exact quantum query algorithm is very likely to exist when the error rate of the algorithm found by VarQQA is below $10^{-5}$. Montanaro et al. [21] applied SDP methods to determine the exact quantum query complexity of all symmetric Boolean functions on up to 6 bits. Therefore, we focus on $n$-bit $\mathrm{EXACT}^n_{k,l}$ functions with $n \ge 7$.
For the cases where $l - k = 1$ and $l + k \neq n$, we used VarQQA to search for $Q_E(\mathrm{EXACT}^n_{k,l})$ up to 13 qubits. The QQAs found by VarQQA are listed in Table II. In most cases, $Q_{\mathrm{Var}}$ meets the theoretical lower bound $Q_L = \max\{n - k, l\} - 1$. However, exceptions occur when $k$ equals $(n + 1)/2$, including $\mathrm{EXACT}^7_{4,5}$, $\mathrm{EXACT}^9_{5,6}$, and $\mathrm{EXACT}^{11}_{6,7}$. VarQQA could not find an algorithm to exactly compute $\mathrm{EXACT}^7_{4,5}$ with 4 queries, despite exploring several decompositions of the accessible space and increasing the workspace dimension to 10. Similar behavior was observed when searching for a 5-query algorithm for $\mathrm{EXACT}^9_{5,6}$ and a 6-query algorithm for $\mathrm{EXACT}^{11}_{6,7}$. We also examined the result with SDP and found that the minimum error rate for $\mathrm{EXACT}^7_{4,5}$ with 4 queries converged to approximately 0.001, and for $\mathrm{EXACT}^9_{5,6}$ with 5 queries, the error rate converged to around 0.003. Therefore, according to the results from both SDP and VarQQA, $Q_E(\mathrm{EXACT}^7_{4,5})$ and $Q_E(\mathrm{EXACT}^9_{5,6})$ should be 5 and 6, respectively, indicating that the lower bound is not tight for these cases.
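For reference, the function and the lower bound $Q_L = \max\{n - k, l\} - 1$ quoted above can be written in a few lines; the function names are illustrative. The loop evaluates the bound for the three exceptional instances discussed in the text.

```python
def exact_kl(x, k, l):
    """EXACT^n_{k,l}(x): 1 if the Hamming weight of x is exactly k or l, else 0."""
    return 1 if sum(x) in (k, l) else 0

def lower_bound(n, k, l):
    """Q_L = max{n - k, l} - 1."""
    return max(n - k, l) - 1

for n, k, l in [(7, 4, 5), (9, 5, 6), (11, 6, 7)]:
    print(n, k, l, lower_bound(n, k, l))
```

The bound evaluates to 4, 5, and 6 queries for these instances, which are exactly the query counts at which no exact algorithm was found numerically.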
For $l - k \ge 2$, we studied the case where $n$ is even and $k$, $l$ are symmetrically distributed around $n/2$. Both cases highlight the efficiency of VarQQA, revealing that the workspace dimension required by a QQA is significantly smaller than the estimates provided by SDP. Furthermore, VarQQA offers insights into the scaling of the workspace dimension $d_w$ relative to $n$, providing a new angle from which to understand the complexity of QQAs beyond just the number of queries $t$.

C. Numerical analysis of algorithm
The computational resources utilized in this study are detailed as follows: The semidefinite programming (SDP) calculations were performed on a CPU platform, specifically an Intel Xeon Gold 6248R CPU with 256 GB of RAM.We employed the Splitting Conic Solver (SCS) for these computations and cross-verified the results with the Mosek solver.Conversely, the VarQQA computations were executed on an Nvidia V100 GPU, utilizing the "L-BFGS-B" optimization algorithm.
For SDP, the number of optimization parameters scales as $O(t \cdot n \cdot |S|^2)$, where $t$ represents the number of queries, $n$ is the bit length of the input domain, and $|S|$ denotes the cardinality of the input domain. For a total Boolean function, $|S|$ is equal to $2^n$, leading to a scaling of the number of parameters as $O(t \cdot n \cdot 4^n)$. In contrast, VarQQA parameterizes the unitary operations within the accessible space. A unitary in a $d$-dimensional Hilbert space requires $O(d^2)$ parameters for full parameterization. For an $n$-bit Boolean function, the dimension of the accessible space is $d_q \times d_w$, where $d_q = n + 1$ and $d_w$ is specific to the Boolean function $f$. Hence, VarQQA exhibits an optimization parameter scaling of $O(t \cdot n^2 \cdot d_w^2)$. If the dimension of the workspace register scales as $O(\mathrm{poly}(n))$, VarQQA requires significantly fewer parameters than the SDP method. Our investigation indicates that VarQQA already outperforms the SDP method for certain Boolean functions analyzed in this study when simulated on a classical computer. The SDP method demands an extensive amount of time to solve a 10-bit Boolean function and is incapable of addressing an 11-bit Boolean function due to memory constraints. In contrast, VarQQA has successfully determined the exact query complexity for $\mathrm{EXACT}^n_{k,l}$ functions, even for cases with up to 16 bits.
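The two scalings can be compared with a rough count. The constants used here are assumptions for illustration: the SDP is counted as $t\,(n + 1)$ matrices of size $2^n \times 2^n$, and VarQQA as $t + 1$ unitaries with $d^2$ real parameters each, where $d = (n + 1)\, d_w$.

```python
def sdp_parameter_count(n, t):
    """t * (n + 1) matrices M_i^(j), each with (2^n)^2 entries (total Boolean function)."""
    return t * (n + 1) * 4 ** n

def varqqa_parameter_count(n, t, d_w):
    """(t + 1) unitaries on the accessible space of dimension d = (n + 1) * d_w."""
    d = (n + 1) * d_w
    return (t + 1) * d * d

# MOD^11_11 with t = 10 queries and the workspace dimension d_w = 61 reported above.
n, t, d_w = 11, 10, 61
print(sdp_parameter_count(n, t), varqqa_parameter_count(n, t, d_w))
```

Under these assumptions the SDP count is 503,316,480 against 5,894,064 for VarQQA, roughly a factor of 85, even though the workspace dimension in this instance is not small.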
While our method does not inherently assure convergence to a global optimum, a distinguishing feature of convex optimization, this limitation can be mitigated through various strategies. Employing multiple random initializations can enhance the likelihood of locating the global minimum by exploring different regions of the solution space. Furthermore, the application of advanced optimizers, such as those with adaptive learning rates or momentum-based methods, can significantly refine the search process. Additionally, heuristic methods or the incorporation of domain-specific insights can steer the optimization more effectively. By implementing these strategies, we aim to offset the inherent limitations and improve the overall efficacy of our algorithm.

V. DISCUSSION
In this study, we introduce a Variational Quantum Query Algorithm (VarQQA) designed to investigate the exact quantum query complexities of Boolean functions. This approach optimizes both the number of oracle calls and the workspace dimension requirements, facilitating the efficient construction of quantum query algorithms. We have applied VarQQA to explore the quantum query complexities of notable Boolean functions, including the Hamming weight modulo function and the $\mathrm{EXACT}^n_{k,l}$ function. Our investigation reveals the limitations of the Semidefinite Programming (SDP) approach, primarily its exponential scaling in optimization parameters and significant memory demands, which confine its applicability to smaller problem instances. In contrast, VarQQA demonstrates scalability and resource efficiency by optimizing a reduced number of parameters, thus enabling the analysis of more complex Boolean functions. Our results attest to VarQQA's superior performance over SDP in managing higher-dimensional problems, underscoring its potential to enhance quantum query complexity analysis.
The application of VarQQA to the Hamming weight modulo function $\mathrm{MOD}^n_m$ not only supports the conjecture of Cornelissen et al. [23] for prime numbers but also highlights the algorithm's minimal resource requirement in terms of queries and qubits. This suggests an optimal quantum query strategy for $\mathrm{MOD}^5_5$, and our extension to larger primes furthers our understanding of quantum query complexities for this function class. Moreover, our exploration of the $\mathrm{EXACT}^n_{k,l}$ problem broadens the known limits of quantum query complexities. VarQQA's ability to address instances previously beyond the computational reach of SDP methods shows its potential to close significant gaps in our understanding.
These findings not only corroborate existing conjectures but also pave new pathways for the analytical construction of quantum algorithms, emphasizing VarQQA's critical role in propelling forward the domain of quantum computing.A vital direction for future research is the identification of Boolean functions with acceleration factors greater than 2, a key question in the realm of exact quantum query complexity.
Note added: Recently, Ye [29] confirmed Conjecture 1 by construction.His algorithm computes the n-bit mod n Hamming weight modulo function with n − 1 queries and requires a 2 n dimensional workspace, which aligns with our numerical results.Despite exhibiting the same asymptotic behavior in terms of the workspace dimension, the algorithm discovered by VarQQA for MOD 5  5 , MOD 7  7 , and MOD 11  11 necessitates a smaller workspace.This reduction in workspace could potentially save on qubit overhead when utilized as a sub-algorithm for larger scale problems.
When designing Quantum Query Algorithms (QQAs) without prior blueprints, the choice of parameterization for the unitary group $U(d)$ is pivotal. The goal is to capture the full expressiveness of $U(d)$, so that any element can be represented exactly. Among the proposed methods are the Cayley transform [30], the Weyl group generators [31], recursive constructions [32], and the direct matrix exponential of a Hermitian matrix.
The Cayley transform, while efficient and simple to implement, tends to get trapped in local minima during optimization. The recursive method, designed to sample unitaries from the Haar measure, becomes increasingly complex with scale, potentially leading to the barren plateau phenomenon [33] and hindering the training process. The Weyl group approach also suffers from local minima and is slower to compute than the direct exponential method. The direct matrix exponential method is not only fast but also presents a more favorable optimization landscape. In our experiments with QQAs, we often achieve convergence to the global optimum with fewer than five random initializations.
Our methodology employs the direct matrix exponential technique, using a Hermitian parameterization to represent unitary matrices with real parameters. This technique is central to optimization tasks that require finding an optimal set of $t$ independent unitary matrices $\{U_1, U_2, \ldots, U_t\}$ within the unitary group $U(d)$. Each $U_j$ is parameterized as $U_j = e^{iH_j}$, where $H_j$ is a $d \times d$ Hermitian matrix given by:

$$H_j = \left(A_j + A_j^T\right) + i\left(B_j - B_j^T\right).$$

Here, $A_j$ and $B_j$ are real $d \times d$ matrices, which ensures that $H_j$ is Hermitian, with real diagonal elements and complex-conjugate pairs off the diagonal. The construction of $H_j$ via $A_j$ and $B_j$ allows for $d^2$ independent real parameters for each $H_j$, matching the dimension of the space of $d \times d$ Hermitian matrices. This parameterization is achieved as follows:
• $A_j + A_j^T$ forms a real symmetric matrix from a real upper triangular matrix $A_j$, offering $\frac{d(d+1)}{2}$ independent parameters.
• $i(B_j - B_j^T)$ is $i$ times the real skew-symmetric matrix $B_j - B_j^T$, built from a real upper triangular matrix $B_j$ with zeros along its diagonal, contributing $\frac{d(d-1)}{2}$ independent parameters.
The optimization process involves simultaneously adjusting the parameters of all $H_j$ matrices to minimize the objective function. This is typically achieved using gradient-based optimization methods, where the continuous and differentiable nature of the mapping from $\mathbb{R}^{d^2}$ to $U(d)$ plays a crucial role in the effectiveness of these techniques. To eliminate the redundancy associated with the global phase, which does not affect our optimization goals, we set $\mathrm{tr}(A_j) = 0$ for each $A_j$. This parameterization strategy not only ensures efficient gradient computation but also facilitates the optimization process within computational frameworks such as PyTorch, streamlining the search for an optimal set of unitary matrices.
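The parameterization described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not our implementation: the function names, the flat parameter ordering, the removal of the trace by subtracting the mean diagonal entry of $A_j$, and the Taylor-series matrix exponential with scaling and squaring are all choices made for this example. In practice, a framework routine such as PyTorch's `torch.linalg.matrix_exp` would replace the hand-written exponential.

```python
import random


def build_hermitian(params, d, traceless=True):
    """Map d*d real parameters to H = (A + A^T) + i(B - B^T)."""
    if len(params) != d * d:
        raise ValueError("expected d*d real parameters")
    it = iter(params)
    A = [[0.0] * d for _ in range(d)]
    B = [[0.0] * d for _ in range(d)]
    for r in range(d):                 # A: upper triangle incl. diagonal
        for c in range(r, d):
            A[r][c] = next(it)
    for r in range(d):                 # B: strictly upper triangle
        for c in range(r + 1, d):
            B[r][c] = next(it)
    if traceless:                      # enforce tr(A) = 0 (global phase)
        mean_diag = sum(A[r][r] for r in range(d)) / d
        for r in range(d):
            A[r][r] -= mean_diag
    return [[complex(A[r][c] + A[c][r], B[r][c] - B[c][r])
             for c in range(d)] for r in range(d)]


def matmul(X, Y):
    d = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(d))
             for c in range(d)] for r in range(d)]


def unitary_from_hermitian(H, squarings=10, terms=20):
    """Approximate U = exp(iH) by a Taylor series with scaling and squaring."""
    d = len(H)
    scale = 2.0 ** squarings
    S = [[1j * H[r][c] / scale for c in range(d)] for r in range(d)]
    U = [[complex(r == c) for c in range(d)] for r in range(d)]
    term = [row[:] for row in U]
    for k in range(1, terms + 1):      # accumulate S^k / k!
        term = [[x / k for x in row] for row in matmul(term, S)]
        U = [[U[r][c] + term[r][c] for c in range(d)] for r in range(d)]
    for _ in range(squarings):         # undo the scaling by repeated squaring
        U = matmul(U, U)
    return U


def is_unitary(U, tol=1e-9):
    """Check that U U^dagger equals the identity within tol."""
    d = len(U)
    return all(
        abs(sum(U[r][k] * U[c][k].conjugate() for k in range(d)) - (r == c)) < tol
        for r in range(d) for c in range(d)
    )


rng = random.Random(1)
d = 3
H = build_hermitian([rng.uniform(-1, 1) for _ in range(d * d)], d)
print(is_unitary(unitary_from_hermitian(H)))
```

Because every real parameter vector maps to a valid unitary, a gradient-based optimizer can update the parameters freely without any projection back onto $U(d)$.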

Figure 1. The circuit structure of a $t$-step quantum query algorithm. The oracle acts on the query register, and each unitary acts on the whole accessible space.

… $\mathrm{MOD}^5_5$ sheds light on the characteristics of the QQA's output states. It was observed that inputs sharing the same Hamming weight modulo value yielded identical output states, differing only by a global phase, i.e., $|\langle \psi^{(x)} | \psi^{(y)} \rangle| = 1$. This insight could offer valuable direction for the development of analytical algorithms. The evolution of the Gram matrix after each oracle call is illustrated in FIG. 2.
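The global-phase observation can be checked numerically through the magnitude of the overlap between two state vectors, which is the quantity stored in each Gram matrix entry. The sketch below is illustrative only: the function names and the two-dimensional example states are assumptions, and the sorting helper simply groups inputs by Hamming weight modulo value and then by binary value.

```python
import cmath
import math


def overlap_magnitude(psi, phi):
    """Return |<psi|phi>| for state vectors given as lists of complex amplitudes."""
    return abs(sum(a.conjugate() * b for a, b in zip(psi, phi)))


def sort_inputs_by_hamming_mod(n, m):
    """Order all n-bit inputs by (Hamming weight mod m), then by binary value."""
    return sorted(range(2 ** n), key=lambda x: (bin(x).count("1") % m, x))


# Hypothetical states: phi equals psi up to a global phase, chi is orthogonal.
s = 1 / math.sqrt(2)
psi = [s, 1j * s]
phi = [cmath.exp(0.7j) * a for a in psi]
chi = [s, -1j * s]

print(overlap_magnitude(psi, phi))  # magnitude is insensitive to the phase
print(overlap_magnitude(psi, chi))
print(sort_inputs_by_hamming_mod(3, 2))
```

An overlap magnitude of 1 signals equality up to a global phase, and a magnitude of 0 signals orthogonality, so perfectly distinguishable clusters appear as a block-diagonal pattern in the sorted Gram matrix.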

Figure 2. Heatmaps of $|\langle \psi_i^{(x)} | \psi_i^{(y)} \rangle|$ for $i = 1, 2, 3, 4$, indicating the degree of quantum state overlap after 1, 2, 3, and 4 oracle calls. The inputs $x$ and $y$ are organized in the heatmaps according to their Hamming weight modulo 5, forming five distinct clusters. Within each cluster, the inputs are sorted by increasing binary value. Notably, as illustrated in subfigure (d), $|\langle \psi_4^{(x)} | \psi_4^{(y)} \rangle|$ equals 1 if $x$ and $y$ share the same Hamming weight modulo value, signifying identical output states up to a global phase, and 0 otherwise, indicating perfect orthogonality for inputs with different Hamming weight modulo values.

Table II. $Q_E(\mathrm{EXACT}^n_{k,l})$ for $l - k = 1$. Here, $Q_L$ represents the theoretical lower bound of the exact quantum query complexity. The columns $Q_{\mathrm{SDP}}$ and $Q_{\mathrm{Var}}$ indicate the minimum number of queries required to keep the error rate $\varepsilon$ below $10^{-5}$, as determined by the SDP and VarQQA approaches, respectively. The value $r = \max(\mathrm{rank}(M_i^{(j)}))$ denotes the maximum workspace dimension deduced from the SDP outcomes, whereas $d_w$ specifies the workspace dimension found by VarQQA. Additionally, $D_A$ lists the dimensions of each orthogonal subspace within the accessible space, in accordance with the complete set of orthogonal projectors.

Table III. $Q_E(\mathrm{EXACT}^n_{k,l})$ for $l - k > 2$, where $n$ is even and $k$, $l$ are centered around $n/2$. The notation used in this table is consistent with that in Table II.

… depicted in Table III, all of the cases that we have investigated satisfy $Q_{\mathrm{Var}}$