Privacy-preserving quantum federated learning via gradient hiding

Distributed quantum computing, particularly distributed quantum machine learning, has gained substantial prominence for its capacity to harness the collective power of distributed quantum resources, transcending the limitations of individual quantum nodes. Meanwhile, the critical concern of privacy within distributed computing protocols remains a significant challenge, particularly in standard classical federated learning (FL) scenarios where data of participating clients is susceptible to leakage via gradient inversion attacks by the server. This paper presents innovative quantum protocols with quantum communication designed to address the FL problem, strengthen privacy measures, and optimize communication efficiency. In contrast to previous works that leverage expressive variational quantum circuits or differential privacy techniques, we consider gradient information concealment using quantum states and propose two distinct FL protocols, one based on private inner-product estimation and the other on incremental learning. These protocols offer substantial advancements in privacy preservation with low communication resources, forging a path toward efficient quantum communication-assisted FL protocols and contributing to the development of secure distributed quantum machine learning, thus addressing critical privacy concerns in the quantum computing era.


I. INTRODUCTION
Quantum computing has advanced rapidly in recent years. Within this landscape, distributed quantum computing, including quantum machine learning (QML) [1][2][3][4][5][6][7][8][9], has garnered considerable attention for its capability to harness the collective power of distributed quantum resources, surpassing the limitations of individual quantum nodes. Distributed quantum computation usually involves generating and transmitting quantum states across multiple nodes, leveraging advances in quantum communication technologies [10]. Remarkably, distributed quantum computing protocols offer hope in addressing privacy concerns in the presence of adversaries [10][11][12][13][14], whereas traditional classical methods have struggled to ensure the confidentiality of sensitive information during distributed processes. These adversaries include not only third-party attackers, who can be tackled with well-established quantum communication technologies such as quantum key distribution [10,11], but also untrusted computing nodes [12,13].
A critical example of this vulnerability lies in classical federated learning (FL) [15,16], where multiple clients collaboratively train a machine learning model for a given task while keeping their training data distributed, without moving it to a single server or data center. A central server is responsible for aggregating the client model updates, typically the gradients of the model cost function that the clients compute on their local data. However, this opens up the possibility of leaking a client's sensitive data to the server through gradient inversion attacks [17][18][19][20][21]. While techniques employing homomorphic encryption or differential privacy [22,23] have been introduced to tackle the problem, they usually demand additional computational and communication overhead or come at the expense of reduced model accuracy. To this end, quantum technologies could provide a natural embedding of privacy. To counteract the gradient inversion attack, one recent proposal [9] replaced the classical neural network in the FL model with variational quantum circuits built using expressive quantum feature maps, such that a successful attack requires solving high-degree multivariate Chebyshev equations. Other quantum-based proposals include adding a certain level of noise to the gradient values to reduce the probability of a successful gradient inversion attack [24], leveraging blind quantum computing [25], and others [26][27][28][29]. An alternative to the aforementioned methods is to encode the clients' classical gradient values into quantum states and to use quantum communication between the clients and the server to transmit these states. This provides opportunities to hide the gradient values of individual clients from the server while still allowing the server to perform the model aggregation using appropriate quantum operations on its end. In this case, the transmitted quantum states offer an inherent privacy advantage even without additional privacy mechanisms: the classical information can be encoded in a logarithmic number of qubits and, by Holevo's bound, the server can extract at most a logarithmic number of bits of classical information during each round of communication [6]. Moreover, we remark that the approach can be naturally integrated with quantum cryptographic techniques [10,30] to become robust against third-party attacks.
* changhao.li@jpmchase.com
In this work, we introduce protocols for the above approach, aiming to advance the capability of distributed quantum computing with quantum communication in the context of FL. Specifically, we propose two types of protocols: one based on private inner-product estimation to perform model aggregation, and the other based on the concept of incremental learning to encode the aggregated model sum in the phase of a quantum state. For the former, we transform the secure model aggregation task into a correlation estimation problem and generalize the recently developed blind quantum bipartite correlator (BQBC) algorithm [7] to multi-party scenarios. For $m$ clients with $d$ model parameters to be updated, the protocol involves a quantum communication cost of $\tilde{O}(md/\epsilon)$, where $\epsilon$ is the standard model update error, and it is quadratically better in $m$ than the analogous method based on classical secret sharing [31]. For the second type of protocol, similar to incremental learning, the clients perform multi-party computation sequentially or simultaneously without involving the server until the end of the protocol, at which point the server extracts the aggregated gradient information. For one of our proposed protocols within the framework of incremental learning, the secure multi-party summation algorithm achieves a quantum communication cost similar to that of the BQBC, with complexity $\tilde{O}(md/\epsilon)$.
arXiv:2312.04447v1 [quant-ph] 7 Dec 2023
These protocols are designed not only to bolster privacy but also to come with an explicit accounting of their quantum communication costs. Through the application of quantum algorithms, this work aims to provide strategies that safeguard sensitive information in distributed quantum computing while optimizing communication efficiency. Furthermore, the suggested protocols can be integrated with quantum key distribution protocols, thereby ensuring information-theoretic security against external eavesdropper attacks. Our work sheds light on designing efficient quantum communication-assisted federated learning algorithms and paves the way for secure distributed quantum machine learning protocols.

A. Federated Learning setup
We present the settings of the quantum communication-based federated learning scheme involving $m$ clients and a central server. Consider the setup in which each client $i \in [m]$ holds $N_i$ local samples, such that the total number of samples across all the clients is $N = \sum_{i=1}^{m} N_i$. The aim is to learn a single, global statistical model such that the client data is processed and stored locally, with only the intermediate model updates being communicated periodically to a central server. In particular, the goal is typically to minimize a central objective cost function,
$$\min_{\theta} L(\theta), \qquad L(\theta) = \sum_{i=1}^{m} w_i L_i(\theta), \tag{1}$$
where $\theta = \{\theta_1, \cdots, \theta_d\} \in \mathbb{R}^d$ is the set of $d$ trainable parameters of the FL model and $L_i$ is the cost function evaluated on the data of client $i$. The user-defined term $w_i \ge 0$ determines the relative impact of each client in the global minimization procedure, with the most natural setting being $w_i = N_i/N$. That is, the weight $w_i$ depends on the local data size of the individual client and is known to both the server and the clients.
In the standard federated learning setup, at the $t$-th iteration, the clients each receive the parameter values $\theta^t \in \mathbb{R}^d$ from the server, and their task is to compute the gradients with respect to $\theta^t$ and send them back to the server. Here the superscript denotes the iteration step. Upon performing a single batch of training, each client computes its $d$ gradient updates $\nabla L_i$ and shares them with the server. The server's task is then to perform the gradient aggregation within a standard error bound $\epsilon$ and to update the next set of parameters $\theta^{t+1}$ using the rule
$$\theta^{t+1} = \theta^{t} - \alpha \sum_{i=1}^{m} w_i \nabla L_i(\theta^{t}), \tag{2}$$
where, for the rest of the work, we assume the relative impact $w_i = N_i/N$, and $\alpha$ is the learning rate hyperparameter chosen by the server. The parameters $\theta^{t+1}$ are then communicated back to the clients, and the protocol repeats until a desired stopping criterion is reached.
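As a purely classical illustration of the update rule described above, the following minimal sketch performs one aggregation round with weights $w_i = N_i/N$. The function name `aggregate_and_update` and the array layout are assumptions made for this example; no privacy mechanism is applied here, so this is the baseline that the later protocols aim to protect.

```python
import numpy as np

def aggregate_and_update(theta, client_grads, client_sizes, alpha):
    """One FL round: theta_next = theta - alpha * sum_i w_i * grad_i, w_i = N_i / N.

    All names here are hypothetical; the server in this sketch sees every
    individual gradient, which is exactly the leakage the paper addresses.
    """
    grads = np.asarray(client_grads, dtype=float)   # shape (m, d)
    sizes = np.asarray(client_sizes, dtype=float)   # shape (m,)
    weights = sizes / sizes.sum()                   # w_i = N_i / N
    aggregated = weights @ grads                    # weighted sum, shape (d,)
    return np.asarray(theta, dtype=float) - alpha * aggregated

# Two clients, two parameters, data sizes 1 and 3.
theta_next = aggregate_and_update([0.0, 0.0], [[1.0, 2.0], [3.0, 4.0]], [1, 3], alpha=1.0)
```

With these inputs the weights are 0.25 and 0.75, so the aggregated gradient is (2.5, 3.5).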
We note that in many cases one is interested in learning $\sum_{i=1}^{m} \nabla L_i(\theta^t) \bmod 2\pi$, as the model parameters can have a period of $2\pi$, particularly in quantum circuits. We point out that the local model of both the server and the clients could be either classical or quantum, but all parties must be capable of encoding their local data into quantum states. Further, we consider a quantum communication channel between the server and the $m$ clients in order to facilitate the transmission of quantum states.

B. Data leakage in classical FL
The existing classical FL setup was built on the premise that sharing gradients with the server would not leak the local data to the server. However, this notion of privacy has been challenged by the wider community [19]. Specifically, led by the work of [20] and followed up by [17,[32][33][34], it has been shown that an honest-but-curious server (one that strictly follows the protocol but is interested in learning the clients' private data) can extract input data from model gradients. In fact, using the results of [18], we showcase in Appendix A how to easily invert the gradients generated from a fully connected neural network model to learn the data. While classical techniques including homomorphic encryption [22] and secret sharing [31] have been employed to tackle the challenge, they usually impose a significant overhead in communication and computation cost, limiting their applicability to federated learning tasks. On the other hand, randomization approaches employing differential privacy [23], while simple to implement, usually lead to reduced model accuracy and utility (see Appendix B for details).
In this work, we address the concern of data leakage originating from gradients that are generated by either a classical neural network-based model or a variational quantum circuit-based model [35]. The primary objective is to facilitate a secure global parameter update without divulging the individual clients' gradient information $\nabla L_i(\theta^t)$ to the server, thereby mitigating the risk of gradient inversion attacks. In order to hide the individual gradient information while still performing the model parameter update in Eq. 2, one can implement privacy in either the multiplication between the weights $w_i$ and the local gradients $\nabla L_i(\theta^t)$, or the summation among the weighted gradients. In what follows, we present protocols along these two directions: secure inner product estimation and secure weighted gradient summation (in analogy with incremental learning). Before diving into the details, we summarize the proposed protocols by listing their main privacy mechanisms, quantum communication complexities, and requirements in Table I.

III. PROTOCOL I: SECURE INNER PRODUCT ESTIMATION
In this section, we consider converting the model aggregation problem into a task of distributed inner product estimation between the server and the clients, where algorithms such as the quantum bipartite correlator (QBC) [7,8] can be employed. From the federated learning parameter update rule in Eq. 2, we note that for each parameter index $j \in [d]$, the task of the server is to multiply each weight $w_i$ by the local gradient $\nabla L_{i,j}(\theta)$ and then sum all the weighted gradients to obtain $\theta^{t+1}_j$.
In the following, we start from a baseline approach in which the secure inner product is performed with the assistance of classical secret sharing (CSS). We then utilize the blind quantum bipartite correlator algorithm and propose a scheme for secure inner product estimation whose communication cost is quadratically lower in $m$.

A. Baseline: Classical secret sharing assisted inner-product estimation
In this section, we start with a purely classical strategy to hide the gradients of the clients prior to sending the masked gradients to the server. We use this as a baseline to compare against the quantum gradient hiding strategies we develop over the next sections. The baseline strategy is built using the masking technique with one-time pads as introduced in Protocol 0 in Ref. [31]. For this protocol to succeed, we assume that each client is switched "on" during the entirety of the protocol and, further, has pairwise secure classical communication channels with each of the $m - 1$ other clients.
The protocol starts with each client $i$ sampling $m - 1$ random values $s_{i,k} \in [0, R)$, one for every other client indexed by $k$. Here $R$ is the upper limit of the interval, as agreed upon by all the clients. Similarly, all other clients generate random values in $[0, R)$ for every other client. Next, clients $i$ and $k$ exchange $s_{i,k}$ and $s_{k,i}$ over their secure channel and compute the corresponding pairwise perturbations. The clients repeat the above procedure a total of $d$ times (to mask each of the $d$ gradient values $\nabla L_{i,j}(\theta)$).
Next, for every parameter to be updated, each client sends its masked gradient value $y_i$ to the server. Note that we drop the parameter index $j$ hereafter for simplicity. The task of the server is then to perform a weighted aggregation of the gradients in order to obtain the next set of parameter values. It can be trivially checked that an honest server always succeeds in performing the correct aggregation, i.e., $\bar{y} = \sum_{i=1}^{m} w_i \nabla L_i(\theta)$. Further, privacy is guaranteed by the one-time pad masking of the gradients, which provides information-theoretic security against a malicious server. The above scheme requires a total of $O(m(m-1)d)$ classical bits of communication among the clients and a further $O(md)$ bits of communication between the clients and the server to achieve secure aggregation. Thus the total classical communication complexity required is $O(m^2 d)$. We remark that the above classical secret sharing based scheme could be augmented using quantum resources to provide a minor improvement in the total communication cost (Fig. 1a). Specifically, after obtaining the masked gradients as in Eq. 3, the clients can collaboratively encode their masked gradients in an amplitude-encoded quantum state $\frac{1}{N_c}\sum_{i=1}^{m} y_i |i\rangle$, where $N_c$ is the normalization factor. This state is then sent to the server, which can recover the weighted aggregate sum by performing SWAP test-based discrimination [36] with its local state $|\phi_s\rangle = \frac{1}{N_s}\sum_{i=1}^{m} w_i |i\rangle$. Since the state in Eq. 6 requires only $O(\log(m))$ qubits, the amount of communication between the server and the clients can be reduced to $O(\log(m)/\epsilon^2)$, where $\epsilon$ is the error incurred in estimating the aggregated sum using the SWAP test. The total communication complexity of this scheme nevertheless remains quadratic in $m$, since the pairwise exchange among the clients is unchanged.
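The cancellation of pairwise one-time pads can be illustrated with a short sketch in the spirit of the masking idea of Ref. [31]. It is not a reproduction of the equations of this section: it assumes, for illustration only, that each client masks an integer (fixed-point) encoding of its already weighted gradient, that the perturbation is $p_{i,k} = s_{i,k} - s_{k,i} \bmod R$, and that the server simply adds the masked values modulo $R$. All function names are hypothetical.

```python
import random

def pairwise_masks(m, R, rng):
    """Assumed perturbations p[i][k] = (s[i][k] - s[k][i]) mod R, so p[i][k] + p[k][i] = 0 mod R."""
    s = [[rng.randrange(R) if i != k else 0 for k in range(m)] for i in range(m)]
    return [[(s[i][k] - s[k][i]) % R for k in range(m)] for i in range(m)]

def mask_inputs(values, R, rng):
    """Each client adds all of its pairwise perturbations to its integer-encoded value."""
    m = len(values)
    p = pairwise_masks(m, R, rng)
    return [(values[i] + sum(p[i])) % R for i in range(m)]

def server_sum(masked, R):
    """The server only ever sees masked values; the pads cancel in the total."""
    return sum(masked) % R

R = 2 ** 16
values = [3, 10, 7, 5]            # hypothetical integer-encoded weighted gradients
masked = mask_inputs(values, R, random.Random(0))
total = server_sum(masked, R)     # equals sum(values) mod R
```

Each pair of clients exchanges one random value per parameter, which is the origin of the quadratic scaling in $m$ discussed above.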

B. Model aggregation with blind quantum bipartite correlator algorithm
To reduce the communication cost, in this section we propose a method for model updating based on the quantum bipartite correlator algorithm [7,8], which is designed to estimate the inner product between remote vectors. The essential idea is a generalization of the recently proposed blind quantum bipartite correlator algorithm [7]. First, each client converts its gradient information into binary floating-point numbers. Then, at each round of communication, the server passes the index qubit state that encodes the weight information to each honest or honest-but-curious client and lets them privately encode the gradient information into the phase of the corresponding index qubits. Finally, the server receives back the index qubits and performs the quantum counting algorithm to extract the desired aggregated gradient.
We now proceed to the implementation details of the algorithm. As mentioned, the goal is to have the server perform the inner product estimation using the known weight information and the gradient information that is held only locally by each client. For the $k$-th client ($k \in [m]$), both the weight $w_k$ and the gradient $\nabla L_k$ can be expanded as binary bitstrings $a_k$ and $b_k$, both of size $l_k$, such that $w_k \cdot \nabla L_k$ can be obtained from the inner product $\sum_{j=1}^{l_k} a_{kj} b_{kj}$. One example of such an expansion is the IEEE standard for floating-point arithmetic [37], in which the highest digits $u$ and $v$ of $w_k$ and $\nabla L_k$, respectively, are constants known to both the server and the clients. In the following, we assume $l_k = l_0$ for all $1 \le k \le m$ for simplicity. The goal is then to design a private inner product protocol that lets the server evaluate $\sum_{k=1}^{m}\sum_{j=1}^{l_0} a_{kj} b_{kj}$. We thus consider the following protocol. Initially, the server prepares a quantum state with $\lceil \log(m l_0) \rceil$ index qubits $|i\rangle$ and then applies controlled gates to encode all the $w_k$ information on a single qubit $o_a$. The final state is
$$\sum_{k=1}^{m}\sum_{i=1}^{l_0} |k, i\rangle |a_{ki}\rangle_{o_a},$$
where the index $k$ denotes the $k$-th client and $i$ is the index within the bitstring of size $l_0$. We omit the normalization factor for the above and following states in the protocol for simplicity. Then, as shown in the diagram in Fig. 1b, the server delivers the above $\lceil \log(m l_0) \rceil + 1$ qubits to the first client. Note that a malicious server could prepare a state $\sum_{k=1}^{m}\sum_{i=1}^{l_0} c_{ki} |k, i\rangle |a_{ki}\rangle_{o_a}$ with a non-uniform amplitude distribution $c_{ki}$ to extract the clients' information of interest. To detect such attacks, the first client first decodes the ancillary qubit $o_a$ (as the encoded weight information is known globally) and then measures the index qubits in the X basis. In the honest-server case, where $c_{ki}$ has a uniform distribution, the measurement outcomes should all be $+1$. That is, if a malicious server tries to extract certain gradient information by increasing the amplitude of the corresponding bitstrings, the index qubit state without the ancilla will not be $|+\rangle^{\otimes \lceil \log(m l_0) \rceil}$. After successful verification and re-encoding of the weight information in the ancillary qubit $o_a$, the first client encodes its local gradient information $\nabla L_1$ into the phase of the first part of the index qubits, which leads to
$$\sum_{i=1}^{l_0} (-1)^{a_{1i} b_{1i}} |1, i\rangle |a_{1i}\rangle_{o_a} + \sum_{k=2}^{m}\sum_{i=1}^{l_0} |k, i\rangle |a_{ki}\rangle_{o_a}.$$
This can be done using a CZ gate between the qubit $o_a$ and a local qubit held by the first client that encodes $\nabla L_1$.
The first client then passes the above state to the second client, who encodes its local gradient information $\nabla L_2$ into the phase of the second part of the index qubits. The resulting state is
$$\sum_{k=1}^{2}\sum_{i=1}^{l_0} (-1)^{a_{ki} b_{ki}} |k, i\rangle |a_{ki}\rangle_{o_a} + \sum_{k=3}^{m}\sum_{i=1}^{l_0} |k, i\rangle |a_{ki}\rangle_{o_a}.$$
The above process is repeated until all the clients have encoded their local gradient information in the phase, resulting in the state
$$\sum_{k=1}^{m}\sum_{i=1}^{l_0} (-1)^{a_{ki} b_{ki}} |k, i\rangle |a_{ki}\rangle_{o_a}.$$
Finally, the state is returned to the server by the last client. The server then runs the quantum counting algorithm [7,8] to evaluate $\sum_{k=1}^{m}\sum_{i=1}^{l_0} a_{ki} b_{ki}$, for which $O(1/\epsilon)$ rounds of communication are needed, where $\epsilon$ is the standard estimation error. We remark that the quantum counting algorithm is based on Grover's search algorithm and is advantageous compared with SWAP test-based algorithms [38] in terms of the error complexity.
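The sequential phase encoding can be mimicked with a small classical state-vector sketch. It assumes that the phase applied to index $|k, i\rangle$ is $(-1)^{a_{ki} b_{ki}}$ (what a CZ gate between the weight qubit and a gradient qubit would produce), tracks the ancilla classically, and does not implement quantum counting. It only checks that the count of marked indices, and hence the inner product, is imprinted on the returned state. The function name is hypothetical.

```python
import numpy as np

def phase_encoded_overlap(a, b):
    """a, b: (m, l0) arrays of bits. Returns (overlap with uniform state, sum of a*b).

    Toy model: amplitudes of the index register only, uniform at the start,
    with each client flipping the sign of its own block where a_ki * b_ki = 1.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    m, l0 = a.shape
    n = m * l0
    amps = np.full(n, 1.0 / np.sqrt(n))            # uniform superposition over |k, i>
    for k in range(m):                             # clients act one after another
        block = slice(k * l0, (k + 1) * l0)
        amps[block] *= (-1.0) ** (a[k] * b[k])     # assumed phase (-1)^(a_ki b_ki)
    uniform = np.full(n, 1.0 / np.sqrt(n))
    overlap = float(uniform @ amps)                # equals 1 - 2 * sum(a*b) / n
    return overlap, int(np.sum(a * b))

a = [[1, 0, 1], [1, 1, 0]]   # hypothetical weight bits for m = 2, l0 = 3
b = [[1, 1, 0], [0, 1, 1]]   # hypothetical gradient bits
overlap, inner = phase_encoded_overlap(a, b)
```

In this toy model the fraction of sign-flipped indices is the quantity that a counting routine would estimate; the server never reads an individual $b_{ki}$ from the overlap alone.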
We summarize the takeaways of this method here. First, the privacy is encoded in the index qubit states. When the server measures the index qubits, the probability of getting a specific index is simply $\frac{1}{m l_0}$, which is small when the number of clients $m$ is large. Moreover, the server cannot amplify the amplitude of a specific index by preparing a non-uniform superposition state, as the first client is capable of detecting it. Furthermore, even if there are multiple rounds of communication and the server performs a collective attack, increasing $l_0$ or adding a random pad on the phase keeps it hard for the server to obtain an individual client's information [7]. Note that here the privacy comes from the phase encoding, rather than from the summation of gradients as in the incremental learning protocols.
The quantum communication complexity is the total number of qubits transmitted in order to estimate the aggregated gradient, which scales as $\tilde{O}(md/\epsilon)$ (Eq. 14). Again, here $m$ is the number of clients, and $l_0$ is related to the precision of the gradient via $m l_0 = O(m \log(1/\epsilon_0)) = O(m \log(m/\epsilon))$, where $\epsilon_0$ is the inner product estimation error bound for a single client. This is better than classical secret sharing, which has a total complexity of $O(m^2)$. We note that in the absence of random phase padding, the technique here does not require classical communication at each round. The incorporation of random, one-time phase pads for privacy enhancement necessitates an additional classical communication cost of $\tilde{O}(m)$, as each client would need to send its padding information to the server at the end.

C. Redundant encoding
The privacy of the protocol above can be further enhanced with redundant encoding of the gradient data into binary bitstrings [7]. In particular, we state the following theorem.
Theorem 1 (Efficient redundant encoding). In the BQBC-based QFL protocol, given a fixed estimation error $\epsilon$, there exists a redundant encoding method with a redundancy parameter $r$ such that the probability that the server learns a client's information decreases polynomially in $r$, while the communication complexity increases only polylogarithmically in $r$.
Proof. Following Ref. [7], we consider the following redundant encoding approach, aimed at reducing the probability that a malicious server acquires a specific bit $b_i$, with $i$ being the pertinent index of interest. Both the client $k$ and the server encode their single-bit local information $b_{ki}$ and $a_{ki}$ into bitstrings of size $r$, where $r$ is an integer and $r > 1$. The total number of bits then increases from $m l_0$ to $r m l_0$. For the weight information, we consider the encoding rule
$$a'_{ki,j} = a_{ki}; \quad k = 1, 2, \ldots, m; \quad i = 1, 2, \ldots, l_0; \quad j = 1, 2, \ldots, r, \tag{15}$$
which simply copies the bit $a_{ki}$ $r$ times. Here $k$ is the index of the client and $i$ is the index within the bitstring held by each client. On the other hand, for $b'_k$, the $k$-th client can hide the information $b_{ki}$ randomly in one of the $r$ digits and set the other $r - 1$ digits to be all zero or all one. That is, for bit index $i$, the $k$-th client chooses either
$$b'_{ki,j} = b_{ki}\,\delta_{j,R_{ki}} \tag{16}$$
or
$$b'_{ki,j} = b_{ki}\,\delta_{j,R_{ki}} + \left(1 - \delta_{j,R_{ki}}\right), \tag{17}$$
where $R_{ki} \in [r]$ is a random number and $\delta_{j,R_{ki}}$ is the Kronecker delta function.
Then, according to the above rules, by running the QBC algorithm, for each $k$ the server would get
$$\frac{1}{r l_0}\sum_{i=1}^{l_0} a_{ki} b_{ki} \quad \text{or} \quad \frac{1}{r l_0}\sum_{i=1}^{l_0} a_{ki} b_{ki} + \frac{r-1}{r l_0}\sum_{i=1}^{l_0} a_{ki}, \tag{18}$$
depending on whether the $k$-th client chooses the encoding method of Eq. 16 or Eq. 17. The difference between the two extracted values is $\frac{r-1}{r l_0}\sum_{i=1}^{l_0} a_{ki}$, which is known to the server. Note that this choice can vary between clients. At the end of the protocol, each client sends a one-bit message via a classical channel to tell the server which encoding was used, after which the server can extract $\sum_{k=1}^{m}\sum_{j=1}^{l_0} a_{kj} b_{kj}$. This process incurs a classical communication cost of $O(m)$.
We remark that at each communication round, the probability that the server samples a specific bit reduces from $\frac{1}{m l_0}$ to $\frac{1}{r m l_0}$ with $r > 1$. Even though $r$ times more communication rounds are needed to achieve the same error bound $\epsilon$ as in the original QBC case, the server does not know which digit encodes the correct $b_{ki}$ information, as the $R_{ki}$ are random numbers. Therefore, the probability that the server successfully obtains a specific bit $b_{ki}$ is
$$\frac{1}{r m l_0} \times \frac{r}{\epsilon} \times \frac{1}{r} = \frac{1}{r m l_0 \epsilon}, \tag{19}$$
where the second term $\frac{r}{\epsilon}$ is the total number of communication rounds and the third term $\frac{1}{r}$ is due to the randomness in $R_{ki}$.
Clearly, a larger value of $r$ corresponds to a decreased probability that the server successfully extracts valuable information from the client through this attack strategy. The flexibility for each client to independently choose its encoding method also protects the majority value of $b$: for example, the client may choose Eq. 16 to encode the data if the majority of the bits of $b$ are 1, to decrease the probability that the 1s are detected. Nevertheless, the trade-off for employing this redundant encoding approach compared with the original one in Sec. III B is an increased quantum communication complexity, as the number of transmitted qubits goes from $\log(m l_0)$ to $\log(r m l_0)$. In summary, with the above redundant encoding method, the communication complexity increases logarithmically in $r$, while the probability that the server successfully obtains a specific bit $b_{ki}$ decreases polynomially in $r$.
With Theorem 1, we show that one can design a protocol in which the privacy improves polynomially with the redundant encoding parameter while the communication cost grows only logarithmically or linearly with this parameter.
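The redundant encoding rules can be checked with a small classical sketch. It assumes the reading given in the proof: the weight bit is copied $r$ times, the gradient bit is placed at a random position among $r$ digits, and the remaining $r - 1$ digits are all zero (`pad = 0`) or all one (`pad = 1`). The function names are hypothetical, and the sketch works with the unnormalized inner product.

```python
import random

def encode_weights(a_bits, r):
    """Copy every weight bit r times (the server's side of the encoding)."""
    return [bit for bit in a_bits for _ in range(r)]

def encode_gradient(b_bits, r, pad, rng):
    """Hide each gradient bit at a random position among r digits; fill the rest with `pad`."""
    out = []
    for bit in b_bits:
        chunk = [pad] * r
        chunk[rng.randrange(r)] = bit
        out.extend(chunk)
    return out

def decode_inner_product(a_bits, a_enc, b_enc, r, pad):
    """Remove the known offset (r - 1) * sum(a) that all-one padding adds."""
    raw = sum(x * y for x, y in zip(a_enc, b_enc))
    return raw - pad * (r - 1) * sum(a_bits)

a = [1, 0, 1, 1]      # hypothetical weight bits
b = [1, 1, 0, 1]      # hypothetical gradient bits, true inner product is 2
r = 3
results = []
for pad in (0, 1):    # the client's one-bit choice, revealed at the end
    a_enc = encode_weights(a, r)
    b_enc = encode_gradient(b, r, pad, random.Random(1))
    results.append(decode_inner_product(a, a_enc, b_enc, r, pad))
```

Because the offset depends only on the weights, which the server already knows, the one-bit message per client is sufficient to recover the original inner product in both cases.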

IV. PROTOCOL II: INCREMENTAL LEARNING
The aforementioned protocols entail secure inner product estimation between the server and the clients. An alternative approach involves the formulation of protocols with secure weighted gradient summation, ensuring that the server exclusively receives aggregated gradients rather than weighted gradients from individual clients. This concept aligns with the principles of incremental learning, wherein a model or parameter is iteratively trained or updated with new data, assimilating fresh information while preserving knowledge acquired from prior data. In this section, we introduce two such protocols: the first involves secure aggregation through a globally entangled state, while the second is grounded in secure multiparty gradient summation.

A. Secure aggregation based on global entanglement
We start by discussing a secure aggregation protocol using a globally entangled state distributed among the clients and the server (Fig. 2a). Similar to Ref. [39], at each time step and for each parameter to be updated, we consider a global $(m+1)$-qubit GHZ state in which each qubit is held by one party. This can be achieved by letting the server prepare a GHZ state locally and then distribute $m$ of its qubits to the $m$ clients via quantum channels. Alternatively, each party can hold one local qubit, with remote entanglement generated via quantum photonic channels.
After the preparation of the GHZ state, for a given parameter, the $k$-th client encodes its local weighted gradient information $\nabla L_k(\theta^t)$ into the phase of its local qubit by applying a phase gate. The clients can do this either sequentially or simultaneously. The distributed qubits are then sent back to the server via the quantum channel. The resulting state now reads
$$\frac{1}{\sqrt{2}}\left(|0\rangle_s |0\rangle^{\otimes m} + e^{i\sum_{i=1}^{m}\nabla L_i(\theta^t)} |1\rangle_s |1\rangle^{\otimes m}\right).$$
The server first disentangles the $m$ received qubits by performing sequential CNOT gates between its local qubit $s$ and the remaining $m$ qubits, leading to
$$\frac{1}{\sqrt{2}}\left(|0\rangle_s + e^{i\sum_{i=1}^{m}\nabla L_i(\theta^t)} |1\rangle_s\right) \otimes |0\rangle^{\otimes m}.$$
Then, as in a typical Ramsey interferometry experiment [40], the server estimates the phase term $\sum_{i=1}^{m}\nabla L_i(\theta^t)$ by applying a Hadamard gate on its local qubit followed by a projective measurement in the computational basis. The probability of getting zero is simply
$$P(0) = \frac{1}{2}\left[1 + \cos\left(\sum_{i=1}^{m}\nabla L_i(\theta^t)\right)\right].$$
The above process is repeated $O(1/\epsilon^2)$ times until the desired error bound $\epsilon$ is met. The procedure is applied iteratively to update all $d$ parameters.
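The readout step can be verified with a small state-vector sketch for a handful of clients. It assumes that each client applies the phase gate $\mathrm{diag}(1, e^{i\phi_k})$ to its qubit, with $\phi_k$ standing in for that client's weighted gradient, and that qubit 0 belongs to the server. The function name is hypothetical, and the simulation is noiseless.

```python
import numpy as np

def ghz_phase_readout(phases):
    """Return P(server measures 0) after phase encoding, CNOT disentangling and a Hadamard."""
    m = len(phases)
    n = m + 1                                         # qubit 0 is the server
    state = np.zeros((2,) * n, dtype=complex)
    state[(0,) * n] = 1 / np.sqrt(2)                  # GHZ: (|0...0> + |1...1>) / sqrt(2)
    state[(1,) * n] = 1 / np.sqrt(2)
    for k, phi in enumerate(phases, start=1):         # client k applies diag(1, e^{i phi})
        idx = [slice(None)] * n
        idx[k] = 1
        state[tuple(idx)] *= np.exp(1j * phi)
    for k in range(1, n):                             # CNOT: server controls, client k is target
        idx = [slice(None)] * n
        idx[0] = 1
        sub = state[tuple(idx)]                       # server = 1 branch, axes are clients 1..m
        state[tuple(idx)] = np.flip(sub, axis=k - 1).copy()
    hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    state = np.tensordot(hadamard, state, axes=([1], [0]))  # Hadamard on the server qubit
    return float(np.sum(np.abs(state[0]) ** 2))

phases = [0.3, 0.5, 0.2]          # hypothetical weighted gradients of three clients
p_zero = ghz_phase_readout(phases)
```

The outcome probability depends only on the sum of the phases, which is why repeated measurements reveal the aggregate but not the individual contributions.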
We now analyze the security of the gradient information. As all the local gradients are aggregated in the phase of the GHZ state, the server cannot extract the gradient of a single client. In the malicious-server setting, in contrast to the semi-honest assumption in Ref. [39], if the GHZ state is distributed by the server, the server could simply prepare a state in which only the $j$-th client's qubit is entangled with the server qubit while the others are not, for example a Bell pair shared with client $j$ with the remaining qubits in a product state. The malicious server could then extract the gradient information of client $j$ by measuring the phase of its local qubit. To counter this adversary, the GHZ state could be distributed by a trusted client. Alternatively, the GHZ state can be generated by allowing communication among the clients. That is, the honest (or honest-but-curious) clients could jointly prepare an $m$-qubit entangled state and then communicate with the server to reach the state in Eq. 20.
The total communication complexity for the aforementioned distributed entangled state scenario is determined by the qubit distribution at each round and the number of communication rounds needed to estimate the phase. Specifically, the total quantum communication cost is $O(md/\epsilon^2)$, with $\epsilon$ being the error bound for the phase estimation.

B. Secure multiparty gradient summation
We next introduce a gradient-hidden quantum federated learning protocol using phase accumulation and estimation. Inspired by the secure multiparty quantum summation protocol proposed in Ref. [41], we consider the following data encoding method. At each time step $t$, the gradient information for parameter $k$ and client $l$ is $\nabla_k L_l(\theta^t) \in \{0, \delta, 2\delta, \ldots, 2\pi\}$. Note that here we set the upper bound of each individual gradient to be $2\pi$ for simplicity; this condition can be relaxed.
As shown in the diagram in Fig. 2b, the protocol starts from the first client, who encodes its local gradient information for a given parameter into an $h$-qubit state $|\nabla L_1\rangle$ with $h = \lceil \log(2\pi/\delta) \rceil$ (we take $\log(2\pi/\delta)$ to be an integer for simplicity in the following). A subsequent quantum Fourier transform (QFT) yields, up to normalization, the state
$$\sum_{l} e^{il\delta \nabla_k L_1} |l\rangle_1.$$
Then, the first client prepares an $h$-qubit ancillary state that encodes the same information as $|l\rangle_1$ above. This can be achieved by simply applying CNOT gates between the first $h$ qubits in Eq. 25 and the $h$ ancillary qubits. The resulting state reads
$$\sum_{l} e^{il\delta \nabla_k L_1} |l\rangle_1 |l\rangle_{1,a},$$
where the subscript $a$ denotes the ancilla. The first client then sends the $h$-qubit ancillary state to the second client via quantum communication. Like the first client, the second client first encodes its local gradient information in another $h$-qubit state $|\nabla L_2\rangle$. In order to perform the summation of $\nabla L_1$ and $\nabla L_2$, the second client applies phase gates on its local $h$ qubits conditioned on the state of the received ancilla qubits, such that the resulting state of the whole system reads
$$\sum_{l} e^{il\delta (\nabla_k L_1 + \nabla_k L_2)} |l\rangle_1 |l\rangle_{1,a} \otimes |\nabla_k L_2\rangle.$$
Note that the local state $|\nabla_k L_2\rangle$ is not entangled with the rest of the system after the above operation. The second client then passes the received $h$ qubits to the next client, and the above process repeats until all $m$ clients have encoded their local gradient information in the phase:
$$\sum_{l} e^{il\delta \sum_{i=1}^{m} \nabla_k L_i} |l\rangle_1 |l\rangle_{1,a}.$$
The $m$-th client returns the ancillary qubits to the first client, who subsequently performs verification on the state to detect potential dishonesty of the involved parties. Specifically, the first client first uncomputes the ancillary qubits with CNOT gates, leading to
$$\sum_{l} e^{il\delta \sum_{i=1}^{m} \nabla_k L_i} |l\rangle_1 |0\rangle_{1,a}.$$
Then the ancillary qubits are measured. In the absence of a malicious client that tries to extract the phase information of previous clients by performing projective measurements on the ancillary qubits in the computational basis, the measurement should yield 0 for all $h$ qubits. For example, if a malicious client applies an inverse QFT on the ancillary qubits to extract the aggregated phase, the first client can detect this anomaly, given that the ancillary qubits cannot be reset to $|0\rangle_{1,a}$ in such a scenario. Upon successful verification, the first client sends the $|l\rangle_1$ state to the server, who can then apply an inverse QFT to extract the accumulated gradient information.
With the state $|\sum_{i=1}^{m} \nabla_k L_i \bmod 2\pi\rangle_1$ outlined above, the server can perform model aggregation and update the model accordingly. The same protocol applies to the other parameters to be updated and to different time windows.
We remark that, as the state received by the server is $\sum_{l} e^{il\delta \sum_{i=1}^{m} \nabla_k L_i} |l\rangle_1$ and the gradient aggregation has already been performed incrementally, the server cannot extract the local gradient information held by individual clients. Moreover, the verification procedure ensures that a malicious client cannot simply measure the phase of the received qubits to extract the previously aggregated gradient information. To this end, the protocol relies on one trusted client node that can prepare the entangled state in Eq. 26 and send the final $h$-qubit state to the server. The efficiency of this incremental learning protocol might be improved by pre-assigning the clients to multiple batches, each of which contains at least one trusted node.
We now discuss the quantum communication cost of the designed protocol. As discussed above, the h-qubit states are transmitted among all the m clients for each parameter to be updated. For p iterations of the process that are needed to yield a standard error ϵ on the phase, the communication complexity of this secure multiparty summation protocol is given by where the $\log\frac{m}{\epsilon}$ term is associated with the error that comes from assigning $\nabla_k L(\theta_t) \in \{0, \delta, 2\delta, \ldots, 2\pi\}$.
Alternatively, one may consider encoding all d gradient components in a superposition state with ⌈log d⌉ qubits to reduce the communication cost in terms of the number of parameters d. However, as the server would need to update the d parameters separately, at least d samplings are required to query the encoded gradient information, and hence the total communication complexity in d would still be Õ(d).

V. DISCUSSIONS
We remark that the above protocols leveraging quantum communication can be integrated with common quantum cryptography techniques [10,30,42,43] to be secure against external attacks. As an example, we consider using decoy states [42] to detect eavesdropping attacks: when a quantum state with n data qubits is sent to another party via a quantum channel during the QFL protocols, decoy states are randomly inserted and sent along with the data qubits.
More specifically, when an n-qubit state is transmitted, we consider $n_d = O(n)$ decoy qubits that are randomly drawn from {|0⟩, |1⟩, |+⟩, |−⟩} by the sender. The receiver receives the data and decoy qubits from the quantum channel, as well as the positions and encoding bases of the decoy qubits from a separate classical channel. After measuring the decoy qubits in the instructed bases, the receiver transmits the measurement results to the sender, who then calculates the error rate and detects the potential presence of an external eavesdropper. In this simple case, for a given decoy state, the probability that the eavesdropper performs a measurement on it without being detected is simply 3/4, and the probability drops exponentially, as $(3/4)^{n_d}$, when there are $n_d$ uncorrelated decoy qubits. Advanced decoy-state quantum key distribution techniques [10] can be implemented to enhance the protocol's resilience against third-party attacks.
In this scenario, while the protocol demonstrates the capability to detect eavesdropper attacks, it incurs an additional cost in communication complexity. Specifically, there is an extra classical and quantum communication cost of $O(n_d)$ between each sender and receiver pair.
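The 3/4 figure for a single decoy qubit can be checked by direct enumeration. The sketch below assumes a simple intercept-and-resend eavesdropper who measures every decoy qubit in a uniformly random basis (Z or X) over a noiseless channel; this attack model and the function names are illustrative assumptions rather than part of the protocol specification.

```python
from fractions import Fraction


def undetected_probability_single():
    """Exact probability that measuring one decoy qubit goes unnoticed."""
    total = Fraction(0)
    for decoy_basis in ("Z", "X"):      # sender's basis: {|0>,|1>} or {|+>,|->}
        for eve_basis in ("Z", "X"):    # eavesdropper's uniformly random guess
            weight = Fraction(1, 4)     # each basis pair occurs with prob. 1/4
            if eve_basis == decoy_basis:
                p_pass = Fraction(1)    # decoy state is left undisturbed
            else:
                p_pass = Fraction(1, 2) # receiver's outcome becomes random
            total += weight * p_pass
    return total


def undetected_probability(n_d):
    """Probability of evading detection on n_d uncorrelated decoy qubits."""
    return undetected_probability_single() ** n_d


p1 = undetected_probability_single()
p20 = undetected_probability(20)
```

Under these assumptions the single-qubit value is exactly 3/4, and with 20 decoy qubits the chance of an undetected attack is already below one percent.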
It is important to emphasize that the proposed QFL protocols do not depend on a variational quantum circuit for gradient generation. Instead, gradient information can be produced using a classical neural network, thereby reducing the quantum capability demands on both the server and the clients. Furthermore, in contrast to numerous classical federated learning algorithms that may face a trade-off between privacy loss and utility loss [44], the quantum protocols presented in this study do not pay for privacy with diminished utility, such as reduced accuracy.

VI. CONCLUSION
In conclusion, we design gradient-hidden protocols for secure federated learning to protect against gradient inversion attacks and safeguard clients' local information. The proposed algorithms involve quantum communication among a server and clients, and we analyze both privacy and communication costs. The secure inner product estimation protocol based on BQBC relies on transmitting a logarithmic number of qubits to reduce the information the server could query. We devise an efficient redundant encoding method to improve privacy further. For the incremental learning protocols, we consider both phase encoding based on a globally entangled state and secure multi-party summation of gradient information to prevent the server from learning individual gradients from clients. We further discuss the quantum and classical communication costs involved in each protocol.
Our present study suggests numerous potential avenues for future research. Firstly, while the proposed protocols primarily address adversaries in the form of a malicious or honest-but-curious server, there is a need to develop secure protocols tailored to scenarios involving a dishonest majority, encompassing malicious clients [45,46]. Secondly, the protocols proposed herein can be extended and applied to other secure distributed quantum computing tasks, such as quantum e-voting protocols [47,48]. Furthermore, our work may motivate subsequent efforts aimed at achieving quantum communication advantages while preserving privacy advantages over classical counterparts in practical distributed machine learning tasks [6,49]. Overall, our work sheds light on the design of efficient quantum communication-assisted distributed machine learning algorithms and the study of inherently quantum privacy mechanisms, and paves the way for secure distributed quantum computing protocols.

FIG. 1. Diagram of QFL protocols based on secure inner product estimation. a. CSS-assisted QFL protocol. The clients jointly prepare a state in which the amplitudes encode the masked gradients and then send it to the server. The gradient masking is achieved via classical secret sharing. b. BQBC-based QFL protocol. We consider a central server with m clients, connected by quantum channels. During each round of communication, each client encodes its local gradient information in specific phases of the received state and then sends it back to the server.
b k .In order to perform the estimation algorithm, O( 1 ϵ b k in the protocol, which reads as

FIG. 2. Diagram of QFL protocols that are similar to incremental learning. a. Secure gradient aggregation based on global entanglement among clients. We consider GHZ states that are distributed by the server or a trusted client. After each client encodes its local gradient information, the server performs a measurement on the phase of the state. b. Quantum federated learning with secure multiparty gradient summation. The ancillary h qubits (in purple) are sent by the first client to the remaining (m − 1) clients for gradient summation, after which the first client sends the other h-qubit state (in orange) to the server.

TABLE I. Privacy and communication complexity of proposed gradient-hidden quantum federated learning protocols.