Sifting attacks in finite-size quantum key distribution

A central assumption in quantum key distribution (QKD) is that Eve has no knowledge about which rounds will be used for parameter estimation or key distillation. Here we show that this assumption is violated for iterative sifting, a sifting procedure that has been employed in some (but not all) of the recently suggested QKD protocols in order to increase their efficiency. We show that iterative sifting leads to two security issues: (1) some rounds are more likely to be key rounds than others, (2) the public communication of past measurement choices changes this bias round by round. We analyze these two previously unnoticed problems, present eavesdropping strategies that exploit them, and find that the two problems are independent. We discuss some sifting protocols in the literature that are immune to these problems. While some of these would be inefficient replacements for iterative sifting, we find that the sifting subroutine of an asymptotically secure protocol suggested by Lo et al (2005 J. Cryptol.18 133–65), which we call LCA sifting, has an efficiency on par with that of iterative sifting. One of our main results is to show that LCA sifting can be adapted to achieve secure sifting in the finite-key regime. More precisely, we combine LCA sifting with a certain parameter estimation protocol, and we prove the finite-key security of this combination. Hence we propose that LCA sifting should replace iterative sifting in future QKD implementations. More generally, we present two formal criteria for a sifting protocol that guarantee its finite-key security. Our criteria may guide the design of future protocols and inspire a more rigorous QKD analysis, which has neglected sifting-related attacks so far.


INTRODUCTION
Quantum key distribution (QKD) allows for unconditionally secure communication between two parties (Alice and Bob). A recent breakthrough in the theory of QKD is the treatment of finite-key scenarios, pioneered by Renner and collaborators (see [1], for example). This has made QKD theory practically relevant, since the asymptotic regime associated with infinitely many exchanged quantum signals is an insufficient description of actual experiments. In practice, Alice and Bob have limited time, which in turn limits the number of photons they can exchange. For example, in satellite-based QKD [2] where, say, Bob is on the satellite and Alice is on the ground, the time allotted for exchanging quantum signals corresponds to the time it takes the satellite to pass over Alice's laboratory. Even if such considerations did not play a role, the necessity of error correction would force the consideration of finite-size QKD, because error-correcting codes operate on blocks of fixed finite length.
Finite-key analysis attempts to rigorously establish the security of finite-size keys extracted from finite raw data. A systematic framework for such analysis was developed by Tomamichel et al. [3] involving the smooth entropy formalism. This framework was later extended to a decoy-state protocol by Lim et al. [4]. An alternative framework was developed by Hayashi and collaborators [5,6]. Other extensions of the finite-key framework include the treatment of device independence by Tomamichel et al. [7], Curty et al. [8] and Lim et al. [9], and continuous-variable protocols by Furrer et al. [10] and Leverrier [11]. The framework used in the aforementioned works, relying on some fairly technical results,¹ represents the current state of the art in the level of mathematical rigor for QKD security proofs. These theoretical advances have led to experimental implementations [12][13][14] with finite-key analysis.
For practical reasons, it is important to consider not only a protocol's security but also its efficiency. Ideally, a protocol should use as little quantum communication as possible for a given length of the final secret key. For example, it was noted by Lo, Chau and Ardehali [15] that, in the asymptotic regime, protocols with biased basis-choice probabilities can dramatically decrease the necessary amount of quantum communication per bit of the raw key. This is because a bias increases the probability that Alice and Bob measure in the same basis. As a consequence, when Alice and Bob perform the sifting step of the protocol, where they discard the outcomes of all measurements that have been made in different bases, they lose less data (see Figure 2 and the discussion in Section IV).
Some authors have adopted this bias in the basis choice in finite-key protocols and combined it with another measure to further decrease the amount of data that is lost through sifting. In the resulting sifting scheme, which we call iterative sifting, Alice and Bob announce previous basis choices while the quantum communication is still in progress, and they terminate the quantum communication as soon as they have collected sufficiently many measurement outcomes in identical bases. This way, less quantum communication takes place, while at the same time they always make sure that they collect enough data. The implicit assumption here is that the knowledge of previous basis choices, but not of upcoming ones, does not help a potential eavesdropper.
As we show in this article, this assumption is wrong. Iterative sifting breaks the security proofs that have been presented for these protocols. This sifting scheme was part of theoretical protocols [3,4,8,9] and has found experimental implementations [12]. Therefore, many (but not all) of the recently suggested protocols in QKD have serious security flaws.

Summary of the results
The issue with iterative sifting that we point out is as follows. Typical QKD protocols involve randomly choosing some rounds to be used for parameter estimation (PE) (i.e. testing for the presence of an eavesdropper Eve) and other rounds for key generation (KG). Naturally, if Eve knows ahead of time whether a round will be used for PE, i.e., if Eve knows which rounds will form the sample for testing for an eavesdropper's presence, then she can adjust her attack appropriately and the protocol is insecure. Hence a central assumption in the QKD security analysis is that Eve has no knowledge about the sample. We show that this assumption is violated for iterative sifting.
To be more precise, the iterative sifting scheme has two problems which, to our knowledge, have been neither addressed nor noted in the literature:
• Non-uniform sampling: The sampling probability, according to which the key bits and the encoding basis are chosen, is not uniform.² In other words, there is an a priori bias: Eve knows ahead of time that some rounds are more likely to end up in the sample than others.
• Basis information leak: Alice and Bob's public communication about their previous basis choices (which, in iterative sifting, happens before the quantum communication is over) allows Eve to update her knowledge about which of the upcoming (qu)bits end up in the sample. As a consequence, the quantum information that passes the channel thereafter can be correlated to this knowledge of Eve.
It is conceivable that these two problems become smaller as the size of the exchanged data increases. This would remain to be shown. More importantly, however, the protocols in question are designed to be secure for finite key lengths. In the light of these two problems, the analysis in the literature currently does not account for these finite-size effects. This is not a purely theoretical objection but a practically very relevant issue, as we present some eavesdropping attacks that exploit the problems.
As we discuss in Section IV, the basis information leak can trivially be avoided by fixing the number of rounds in advance, and only announcing the basis choices after all quantum communication has taken place. We examine some sifting protocols from the literature with this property. In contrast to protocols that use iterative sifting, they often use fresh uniform randomness for the choice of the sample, and therefore trivially sample uniformly. This means that they are secure with respect to our concerns. However, we find that there is room for improvement over these protocols regarding efficiency.
Concretely, we note that one aspect that makes iterative sifting very efficient is the parameter estimation protocol that is used with it: after sifting, it simply uses the Z-bits as the sample for parameter estimation and the X-bits for the raw key, which is why we call it the single-basis parameter estimation (SBPE). This is efficient because the sample choice requires no additional randomness and no authenticated communication. While SBPE is insecure when used in conjunction with iterative sifting, it turns out to be secure when used with a sifting subroutine of a protocol suggested by Lo, Chau and Ardehali, which we call LCA sifting. The combination of LCA sifting and SBPE is essentially as efficient as iterative sifting. It trivially has no basis information leak and, as we prove, samples uniformly (see 2). We therefore suggest this combination in future QKD protocols.
More generally, we find clear and explicit mathematical criteria that are sufficient for a sifting protocol to be secure in combination with SBPE.In contrast, current literature on QKD does not state such assumptions explicitly, but rather uses them implicitly.
In our formulation, they take the form of two equations,
$$P_\Theta(\vartheta) = P_\Theta(\vartheta') \quad \text{for all partitionings } \vartheta, \vartheta', \tag{1}$$
$$\rho_{A^l B^l \Theta^l} = \rho_{A^l B^l} \otimes \rho_{\Theta^l}. \tag{2}$$
Here, Equation (1) expresses the absence of non-uniform sampling, i.e., that the probability $P_\Theta(\vartheta)$ for a partitioning $\vartheta$ of the total rounds into sample rounds and key-generation rounds is independent of $\vartheta$. Equation (2) expresses the absence of basis information leak, which is formally expressed by stating that the classical communication $\Theta^l$ associated with the sifting process is uncorrelated (i.e., in a tensor product state) with Alice's and Bob's quantum systems $A^l B^l$. (The precise details of these two equations will be explained in Section V.) We find that the two problems are in fact independent. Hence, security from one of the two problems does not imply security from the other. The two formal criteria can be used to check whether a candidate protocol is subject to the two problems or not.

Outline of the paper
We introduce the iterative sifting protocol in Section I, where we also explain our conventions and notation. We give a detailed description of the two problems with iterative sifting in Section II. We show how these problems can be exploited in Section III by presenting some intercept-resend attack strategies.
In Section IV, we discuss some sifting protocols that are immune to these problems. We study how ideas of existing protocols can be combined to get new secure protocols that are more efficient. As a result, we suggest the aforementioned combination of LCA sifting and SBPE, and prove its security.
In Section V, we give a more general answer to the question of how the two problems can be avoided by presenting formal mathematical criteria that a sifting protocol needs to satisfy in order to avoid the problems. We conclude with a summary in Section VI.

I. ITERATIVE SIFTING AND PARAMETER ESTIMATION
A typical QKD protocol consists of the following subroutines [3]: (i) Preparation, distribution, measurement and sifting, which we collectively refer to as "sifting", (ii) Parameter estimation, (iii) Error correction, (iv) Privacy amplification.
What we discuss in this paper refers to the subroutines (i) and (ii), whereas subroutines (iii) and (iv) are not of our concern. Even though the word sifting usually only refers to the process of discarding part of the data acquired in the measurements, we refer to the preparation, distribution, measurement and sifting together as "sifting", because they are intertwined in iterative sifting.
Our focus in this article is on a particular sifting scheme that we call iterative sifting. It has been formulated in slightly different ways in the literature, where the differences lie mostly in the choice of the wording and in whether it is realized as a prepare-and-measure protocol [3,4,8,12] or as an entanglement-based protocol [9]. These details are irrelevant for the problems that we describe. Another difference is that some of the above-mentioned references take into consideration that sometimes, a measurement may not take place (no-detection event) or may have an inconclusive outcome. This is done by adding a third symbol ∅ to the set of possible outcomes, turning the otherwise dichotomic measurements into trichotomic ones with symbols {0, 1, ∅}. We choose not to do so, because the problems that we describe arise independently of whether no-detection events or inconclusive measurements take place. Incorporating them would not solve the problems that we address but rather complicate things and distract from the main issues that we want to point out.
The essence of the iterative sifting protocol is shown in Protocol I. There, and in the rest of the paper, we use the notation $[r] := \{1, 2, \ldots, r\}$ for all $r \in \mathbb{N}_+$. Our formulation of this protocol is close to the one described in [3], with the main difference that we choose an entanglement-based protocol instead of a prepare-and-measure protocol. This has the advantage that the formal criteria in Section V are easier to formulate, but a prepare-and-measure protocol would otherwise be equally valid to demonstrate our points.
In the protocol, Alice iteratively prepares qubit pairs in a maximally entangled state (Step 1) and sends one half of the pair to Bob (Step 2).³ Then, Alice and Bob each measure their qubit with respect to a basis $a_i, b_i \in \{0, 1\}$, respectively, where 0 stands for the X-basis and 1 stands for the Z-basis (Steps 3 and 4). Alice and Bob make their basis choices independently, where for each of them, 0 (X) is chosen with probability $p_x$, and 1 (Z) with probability $p_z$. These probabilities $p_x$ and $p_z$ are parameters of the protocol. The important and problematic parts of the protocol are Step 5 and the subsequent check of the termination condition (TC): after each measurement, Alice and Bob communicate their basis choice over an authenticated classical channel. With this information at hand, they then check whether the termination condition is satisfied: if for at least $n$ of the qubit pairs they had so far, they both measured in the X-basis, and for at least $k$ of them, they both measured in the Z-basis, the termination condition is satisfied and they enter the final phase of the protocol by continuing with Step 6. These quotas $n$ and $k$ are parameters of the protocol. If the condition is not met, they repeat Steps 1 to 5 (which we call the loop phase of the protocol) until they meet this condition. Because of this iteration, whose termination condition depends on the history⁴ of the protocol run up to that point, we call it the iterative sifting protocol. Its number of rounds is a random variable that we denote by $M$. We denote possible values of $M$ by $m$ (see the TC and Step 6).
³ Choosing a maximally entangled state as the state that Alice prepares maximizes the probability that the correlation test in the parameter estimation (after sifting) is passed, i.e. the maximally entangled state maximizes the robustness of the protocol. However, for the security of the protocol, which is the concern of the present article, the choice of the state that Alice prepares is irrelevant.

Number of rounds: Random variable $M$, determined by reaching the termination condition (TC) after Step 5.
Loop phase: Starting with round $r = 1$, Alice and Bob do:
Step 1: (Preparation): Alice prepares a qubit pair in a maximally entangled state.
Step 2: (Channel use): Alice uses the quantum channel to send half of the qubit pair to Bob.
Step 3: (Random bit generation): Alice and Bob each (independently) generate a random classical bit $a_r$ and $b_r$, respectively, where 0 is generated with probability $p_x$ and 1 with probability $p_z$.
Step 4: (Measurement): Alice measures her share in the X-basis (if $a_r = 0$) or in the Z-basis (if $a_r = 1$), and stores the outcome in a classical bit $y_r$. Likewise, Bob measures his share in the X-basis (if $b_r = 0$) or in the Z-basis (if $b_r = 1$), and stores the outcome in a classical bit $y'_r$.
Step 5: Alice and Bob communicate their basis choices $a_r$ and $b_r$ over an authenticated classical channel. Let $u(r)$ and $v(r)$ denote the sets of rounds $j \in [r]$ with $a_j = b_j = 0$ and with $a_j = b_j = 1$, respectively.
TC: If $|u(r)| \geq n$ and $|v(r)| \geq k$, Alice and Bob set $m := r$ and continue with Step 6. Otherwise, they increase $r$ by one and repeat from Step 1.
Final phase: The following steps are performed only once:
Step 6: (Random discarding): Alice and Bob choose a subset $u \subseteq u(m)$ of size $n$ at random, i.e. each subset of size $n$ is equally likely to be chosen. Analogously, they choose a subset $v \subseteq v(m)$ of size $k$ at random. Then they discard the bits $a_r$, $b_r$, $y_r$ and $y'_r$ for which $r \notin u \cup v$.
Protocol I. The iterative sifting protocol.
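The basis-choice layer of Protocol I can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `iterative_sifting` is hypothetical, measurement outcomes are ignored, and only the loop phase, the termination condition and the random discarding of Step 6 are modelled. It requires 0 < p_x < 1 so that the loop terminates.

```python
import random


def iterative_sifting(n, k, p_x, rng):
    """Simulate the basis choices of iterative sifting (assumed model).

    Returns (theta, m): the sifted basis string as a tuple (0 = X, 1 = Z)
    and the number m of rounds used in the loop phase.
    """
    u, v = [], []  # rounds with an X-agreement / a Z-agreement so far
    r = 0
    # Termination condition: at least n X-agreements and k Z-agreements.
    while len(u) < n or len(v) < k:
        r += 1
        a = 0 if rng.random() < p_x else 1  # Alice's basis bit
        b = 0 if rng.random() < p_x else 1  # Bob's basis bit
        if a == b == 0:
            u.append(r)
        elif a == b == 1:
            v.append(r)
        # Rounds with different bases are discarded later anyway.
    # Step 6: keep n random X-agreements and k random Z-agreements.
    keep_x = set(rng.sample(u, n))
    keep_z = set(rng.sample(v, k))
    kept_rounds = sorted(keep_x | keep_z)
    theta = tuple(0 if j in keep_x else 1 for j in kept_rounds)
    return theta, r
```

Running this function many times and counting how often each value of the sifted basis string occurs gives an empirical estimate of the sampling probabilities discussed in Section II A.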
After the loop phase of the protocol, in which all of the data is generated, Alice and Bob enter the final phase of the protocol, in which this data is processed. This processing consists of discarding data of rounds in which Alice and Bob measured in different bases, as well as randomly discarding a surplus of data for rounds where both measured in the same basis, where a "surplus" refers to having more than $n$ ($k$) rounds in which both measured in the X (Z) basis, respectively. This discarding of surplus is done to simplify the analysis of the protocol, which is easier if the number of bits where both measured in the X (Z) basis is fixed to a number $n$ ($k$). Since after the loop phase, Alice and Bob can end up with more bits measured in the same basis, they throw away the surplus at random. Finally, after throwing away the surplus, Alice and Bob locally output the remaining bit strings $(s_i)_{i=1}^l$ and $(t_i)_{i=1}^l$ of measurement outcomes and publicly output the remaining bit string $(\vartheta_i)_{i=1}^l$ of basis choices.
Iterative sifting is problematic, but to fully understand why, one needs to see how the output of the iterative sifting protocol is processed in the subsequent subroutine (ii), the parameter estimation, where Alice and Bob check for the presence of an eavesdropper. Protocols that use iterative sifting use a particular protocol for parameter estimation. To make clear what we are talking about, we have written it out in Protocol II.
Alice and Bob start the protocol with the strings $(s_i)_{i=1}^l$, $(t_i)_{i=1}^l$ and $(\vartheta_i)_{i=1}^l$ that they got from sifting. Then, in a first step, they communicate the test bits. The test bits are those bits $s_i$, $t_i$ that resulted from measurements in the Z-basis, i.e. the bits $s_i$, $t_i$ with $i$ such that $\vartheta_i = 1$. Then, they determine the fraction of the test bits that are different for Alice and Bob, i.e. they determine the test bit error rate. If it is higher than a certain protocol parameter $q_{\mathrm{tol}} \in [0, 1]$, they abort. Otherwise, they locally output the raw keys, which are the bits $s_i$, $t_i$ that result from measurements in the X-basis, i.e. those $s_i$, $t_i$ with $i$ for which $\vartheta_i = 0$.
Input: For $l = n + k$, the inputs are:
Alice: $l$-bit string $(s_i)_{i=1}^l \in \{0, 1\}^l$ (measurement outcomes, sifted),
Bob: $l$-bit string $(t_i)_{i=1}^l \in \{0, 1\}^l$ (measurement outcomes, sifted),
public: $l$-bit string $(\vartheta_i)_{i=1}^l \in \{0, 1\}^l$ with $\sum_i \vartheta_i = k$ (basis choices, sifted), where 0 means X-basis and 1 means Z-basis.

The protocol
Step 1: (Test bit communication): Alice and Bob communicate their test bits, i.e. the bits $s_i$ and $t_i$ with $i$ for which $\vartheta_i = 1$, over a public authenticated channel.
Step 2: (Correlation test): Alice and Bob determine the test bit error rate
$$\lambda_{\mathrm{test}} = \frac{1}{k} \sum_{i:\, \vartheta_i = 1} s_i \oplus t_i,$$
where ⊕ denotes addition modulo 2, and do the correlation test: if $\lambda_{\mathrm{test}} \leq q_{\mathrm{tol}}$, they continue the protocol and move on to Step 3. If $\lambda_{\mathrm{test}} > q_{\mathrm{tol}}$, they abort.
Step 3: (Raw key output): Let $i_j$ be the $j$-th element of $\{i \in [l] \mid \vartheta_i = 0\}$. Then Alice outputs the $n$-bit string $(x_j)_{j=1}^n$ and Bob outputs the $n$-bit string $(x'_j)_{j=1}^n$, where $x_j = s_{i_j}$ and $x'_j = t_{i_j}$.
Protocol II. The parameter estimation protocol.

It is important to emphasize that if the output of iterative sifting serves as the input of the parameter estimation protocol as in Protocol II, then the bits that result from measurements in the X-basis are used for the raw key, and the bits that result from measurements in the Z-basis are used for parameter estimation (i.e. they form the sample for the parameter estimation). Hence, the sample is determined by the basis choice; no additional randomness is injected to choose the sample. We call this the single-basis parameter estimation (SBPE), because the parameter estimation is done in only one basis. This is not necessarily a problem by itself. However, as we will show in Section II A, in iterative sifting, some rounds are more likely to end up in the sample than other rounds. This leads to non-uniform sampling, which is a problem since uniform sampling is one of the assumptions that enter the analysis of the parameter estimation. This seems to have gone unnoticed so far, as we found that protocols in the literature that use iterative sifting as a subroutine use SBPE as a subroutine for parameter estimation (or something equivalent) [3,4,8,9,12]. In contrast, the LCA sifting protocol that we discuss in Section IV does sample uniformly, even if bits from X-measurements are used for the raw key and bits from Z-measurements are used for parameter estimation, without injecting additional randomness.
We will discuss randomness injection for the sample choice in more detail in Section IV.
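The three steps of Protocol II can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `sbpe` and the convention of returning `None` on abort are assumptions made here, and the public communication of the test bits is not modelled.

```python
def sbpe(s, t, theta, q_tol):
    """Single-basis parameter estimation on sifted data (assumed sketch).

    s, t   : sifted outcome bits of Alice and Bob (lists of 0/1)
    theta  : sifted basis choices (0 = X-basis, 1 = Z-basis)
    q_tol  : tolerated test bit error rate
    Returns (raw_key_alice, raw_key_bob), or None if the test aborts.
    """
    test_idx = [i for i, basis in enumerate(theta) if basis == 1]
    key_idx = [i for i, basis in enumerate(theta) if basis == 0]
    # Step 2: fraction of Z-basis (test) bits on which Alice and Bob differ.
    errors = sum(s[i] ^ t[i] for i in test_idx)
    lambda_test = errors / len(test_idx)
    if lambda_test > q_tol:
        return None  # correlation test failed: abort
    # Step 3: the X-basis bits, in their original order, form the raw keys.
    return [s[i] for i in key_idx], [t[i] for i in key_idx]
```

Note that the sample (the indices in `test_idx`) is read off from `theta` alone, which is exactly why the distribution of the sifted basis string matters for the security analysis.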
The idea behind the parameter estimation is the following: if the correlation test passes, then the likelihood that Eve knows much about the raw key is sufficiently low. The exact statement of this is subtle, and involves more details than are necessary for our purposes. We refer to [3] for more details. Here, what is important is that this estimate of Eve's knowledge is done via estimating another probability that we call the tail probability $p_{\mathrm{tail}}(\mu)$ which, for $\mu \in [0, 1]$, is given by
$$p_{\mathrm{tail}}(\mu) := P\left[\Lambda_{\mathrm{key}} \geq \Lambda_{\mathrm{test}} + \mu \mid \Lambda_{\mathrm{test}} \leq q_{\mathrm{tol}}\right].$$
Here, $\Lambda_{\mathrm{test}}$ is the random variable of the test bit error rate $\lambda_{\mathrm{test}}$ determined in the parameter estimation protocol,
$$\Lambda_{\mathrm{test}} = \frac{1}{k} \sum_{i:\, \Theta_i = 1} S_i \oplus T_i.$$
The random variable $\Lambda_{\mathrm{key}}$ is the random variable of a quantity that is not actually measured: it is the random variable of the error rate on the raw key bits if they had been measured in the Z-basis. Since in the actual protocol, the raw key bits have been measured in the X-basis, the random variable $\Lambda_{\mathrm{key}}$ is the result of a Gedankenexperiment rather than an actually measured quantity. We will define $\Lambda_{\mathrm{key}}$ formally in Section V. The usual analysis, as in Reference [3], aims at proving that
$$p_{\mathrm{tail}}(\mu) \leq \frac{1}{p_{\mathrm{pass}}} \exp\left(-2\, \frac{kn}{l}\, \frac{k}{k+1}\, \mu^2\right), \tag{6}$$
where $p_{\mathrm{pass}} = P[\Lambda_{\mathrm{test}} \leq q_{\mathrm{tol}}]$. Inequality (6) is turned into an inequality about the eavesdropper's knowledge about the raw key using an uncertainty relation for smooth entropies [3,16].

Notation and terminology
In the following sections, we will have a closer look at the probabilities of certain outputs of the iterative sifting protocol in Protocol I. For example, in Section II A we will consider the probability that iterative sifting with parameters $n = 1$, $k = 2$ outputs the string $\vartheta = (\vartheta_i)_{i=1}^3 = (1, 1, 0)$. Since the output of the protocol is probabilistic, the output string becomes a random variable. We denote random variables by capital letters and their values by lower case letters. For example, the random variable for the output string $\vartheta$ is denoted by $\Theta$, and the probability of the output string to have a certain value $\vartheta$ is $P[\Theta = \vartheta]$. We write strings as $\vartheta_1 \vartheta_2 \ldots \vartheta_l$ instead of $(\vartheta_1, \vartheta_2, \ldots, \vartheta_l)$, i.e. we omit the brackets and commas. For example, we write $110 \in \{0, 1\}^3$ instead of $(1, 1, 0) \in \{0, 1\}^3$, so the probability that we calculate in Section II A is $P[\Theta = 110]$. Other random variables that we consider include the random variable $A_1$ ($B_1$) of Alice's (Bob's) first basis choice $a_1$ ($b_1$) or the random variable $M$ of the number $m$ of total rounds performed in the loop phase of the iterative sifting protocol.
To simplify the calculations, it is convenient to introduce the following terminology. For a round $r$ in the loop phase of the iterative sifting protocol, $r$ is an X-agreement if $a_r = b_r = 0$, a Z-agreement if $a_r = b_r = 1$, and a (basis) agreement if it is either of the two; otherwise, $r$ is a disagreement.
For calculations with random variables like $\Theta$, $A_1$, $B_1$ or $M$, the sample space of the relevant underlying probability space is the set of all possible histories of the iterative sifting protocol. This set is hard to model, as it contains not only all possible strings $(a_r)_r$, $(b_r)_r$, $(y_r)_r$ and $(y'_r)_r$ of the loop phase (which can be arbitrarily long) but also a record of the choice of the subsets $u$ and $v$ in the random discarding during the final phase. It is, however, not necessary for our calculations to have the underlying sample space explicitly written out. In order to avoid unnecessarily complicating things, we therefore only deal with the relevant events, random variables and their probability mass functions directly, assuming that the reader understands what probability space they are meant to be defined on. In contrast, the LCA sifting protocol, which we discuss in Section IV, has a simpler set of histories, and we will derive a probability space model for it in Appendix C.
We often write expressions in terms of probability mass functions instead of in terms of probability weights of events, e.g. we write $P_\Theta(\vartheta)$ instead of $P[\Theta = \vartheta]$.

II. THE PROBLEMS

A. Non-uniform sampling
To show that iterative sifting leads to non-uniform sampling, we calculate the sampling probabilities for some example parameters $k, n \in \mathbb{N}_+$ as functions of the probabilities $p_x$ and $p_z$. By a sampling probability, we mean the probability that some subset of $k$ of the $l = n + k$ bits is used as a sample for the parameter estimation, i.e. the sampling probabilities are $P_\Theta(\vartheta)$ for $\vartheta \in \{0, 1\}^l_k$, where
$$\{0, 1\}^l_k := \left\{ \vartheta \in \{0, 1\}^l \;\middle|\; \sum_{i=1}^{l} \vartheta_i = k \right\}$$
is the set of all $l$-bit strings with Hamming weight $k$.
We say that sampling is uniform if $P_\Theta(\vartheta)$ is the same for all $\vartheta \in \{0, 1\}^l_k$, and non-uniform otherwise. While non-uniform sampling already arises in the case of the smallest possible parameters $k = n = 1$, the results are even more interesting in cases where $k \neq n$. Let us consider iterative sifting (Protocol I) with $n = 1$, $k = 2$ and arbitrary $p_x, p_z \in [0, 1]$. Let $\Theta$ denote the random variable of the string $\vartheta = (\vartheta_i)_{i=1}^3 = \vartheta_1 \vartheta_2 \vartheta_3$ of sifted basis choices which is generated by the protocol. The possible values of $\Theta$ are 110, 101 and 011. The probabilities of these strings are given as follows (see Appendix A for a proof).
Proposition 1: For the iterative sifting protocol as in Protocol I with $n = 1$ and $k = 2$, it holds that
$$P_\Theta(110) = g_z^2, \quad \text{where} \quad g_z := \frac{p_z^2}{p_x^2 + p_z^2}.$$
For the other two possible values of $\Theta$, it holds that
$$P_\Theta(101) = P_\Theta(011) = \frac{1 - g_z^2}{2}.$$
Hence, different samples have different probabilities, in general. In order for the sampling probability $P_\Theta$ to be uniform, in the case where $n = 1$ and $k = 2$, we need to have $P_\Theta(\vartheta) = 1/3$ for $\vartheta = 011, 101, 110$. This holds if and only if $g_z = g_z^*$, where
$$g_z^* = \frac{1}{\sqrt{3}},$$
which in turn holds if and only if $p_z = p_z^*$, where
$$p_z^* = \frac{1}{1 + \sqrt{\sqrt{3} - 1}} \approx 0.54.$$
This is bad news for iterative sifting: it means that iterative sifting leads to non-uniform sampling for all values of $p_z$ except $p_z = p_z^*$. Interestingly, the value of $p_z^*$ does not seem to be a probability that has been considered in the QKD literature. In particular, $p_z^*$ corresponds neither to the symmetric case $p_z = 1/2$ nor to a certain asymmetric probability which has been suggested in order to maximize the key rate [3].
The value $g_z$ can be interpreted as the probability that in a certain round of the loop phase, Alice and Bob have a Z-agreement, given that they have an agreement in that round (this conditioning is why $p_z^2$ is renormalized with the factor $1/(p_z^2 + p_x^2)$). Hence, $g_z^2$ is the probability that Alice and Bob's first two basis agreements are Z-agreements. Therefore, $P_\Theta(110) = g_z^2$ is what one would intuitively expect: to end up with $\Theta = 110$, the first two basis agreements need to be Z-agreements, and conversely, whenever the first two basis agreements are Z-agreements, Alice and Bob end up with $\Theta = 110$.
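The reasoning above can be checked by summing over sequences of basis agreements directly. The following Python sketch is our own illustration and not the proof from Appendix A; the function name `sampling_probs_n1_k2` and the truncation parameter `s_max` are hypothetical. It assumes that only agreement rounds matter, that each agreement is a Z-agreement with probability g_z, and that, unless the first two agreements are both Z-agreements, the loop phase ends at the second Z-agreement after s ≥ 1 X-agreements, one of which is kept uniformly at random.

```python
def sampling_probs_n1_k2(g_z, s_max=500):
    """Approximate P[Theta = 110], P[Theta = 101], P[Theta = 011]
    for n = 1, k = 2 by a truncated sum over agreement sequences."""
    g_x = 1.0 - g_z
    # Theta = 110 exactly when the first two agreements are Z-agreements.
    probs = {"110": g_z ** 2, "101": 0.0, "011": 0.0}
    for s in range(1, s_max + 1):      # s = total number of X-agreements
        weight = g_x ** s * g_z ** 2   # probability of X^a Z X^(s-a) Z
        for a in range(0, s + 1):      # a = X-agreements before the first Z
            # The kept X-agreement is uniform among the s candidates.
            probs["011"] += weight * a / s        # kept X precedes both Z
            probs["101"] += weight * (s - a) / s  # kept X lies between the Z
    return probs
```

For g_z = 1/2 (the symmetric case p_x = p_z), the truncated sum gives 1/4 for the string 110 and 3/8 for each of the other two strings, so the three samples are not equally likely.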
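More generally, it turns out that for $n = 1$ and for $k \in \mathbb{N}_+$ arbitrary, the iterative sifting protocol leads to
$$P_\Theta(\vartheta) = \begin{cases} g_z^k & \text{if } \vartheta = 1 \ldots 1 0, \\ \dfrac{1 - g_z^k}{k} & \text{otherwise.} \end{cases}$$
This is a uniform probability distribution if and only if $g_z = g_z^*$, where
$$g_z^* = (k+1)^{-1/k},$$
which is true if and only if $p_z = p_z^*$, where
$$p_z^* = \frac{1}{1 + \sqrt{(k+1)^{1/k} - 1}}.$$
Hence, we conclude that iterative sifting does not lead to uniformly random sampling, unless $p_x$ and $p_z$ are chosen in a very particular way. This particular choice does not seem to correspond to anything that has been considered in the literature so far.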
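As a numerical illustration of the n = 1 case, the sketch below evaluates the sampling probabilities and the special basis probability at which they become uniform. It assumes the closed forms P_Θ(1…10) = g_z^k and (1 − g_z^k)/k for every other string, with g_z = p_z²/(p_x² + p_z²); the function names are hypothetical and chosen only for this example.

```python
import math


def g_z(p_z):
    """Probability of a Z-agreement, given a basis agreement."""
    p_x = 1.0 - p_z
    return p_z ** 2 / (p_z ** 2 + p_x ** 2)


def sampling_probabilities(k, p_z):
    """For n = 1: (P[Theta = 1...10], P[Theta = any other valid string]),
    under the closed forms assumed in the lead-in."""
    g = g_z(p_z)
    return g ** k, (1.0 - g ** k) / k


def p_z_star(k):
    """Value of p_z at which the assumed n = 1 distribution is uniform."""
    g_star = (k + 1) ** (-1.0 / k)
    return 1.0 / (1.0 + math.sqrt((1.0 - g_star) / g_star))
```

For k = 1 this returns the symmetric value 1/2, while for k = 2 it returns roughly 0.539; any other choice of p_z makes the two entries of `sampling_probabilities` differ.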

B. Basis information leak
In iterative sifting, information about Alice's and Bob's basis choices reaches Eve in every round of the loop phase. In Step 5 of round $r$, Alice and Bob communicate their basis choices $a_r$, $b_r$ of that round. They do so because they want to condition their upcoming action on the strings $a_1 \ldots a_r$ and $b_1 \ldots b_r$: if they have enough basis agreements, they quit the loop phase; otherwise they keep looping.
What seems to have remained unnoticed in the literature is that Eve can also condition her actions on $a_1 \ldots a_r$ and $b_1 \ldots b_r$. This means that if there is a round $r + 1$, Eve can correlate the state of the qubit that Alice sends to Bob in round $r + 1$ with $a_1 \ldots a_r$ and $b_1 \ldots b_r$. Hence, the state of the qubit that Bob measures is correlated with the classical register that keeps the information about the basis choice. Note that the basis information leak tells Eve how close Alice and Bob are to meeting their quotas for each basis. Eve can tailor her attack on future rounds based on this information. For example, if Alice and Bob have already met their Z-quota, but not their X-quota, then Eve can measure in the X-basis, knowing that, if Alice and Bob happen to both measure Z, the round may be discarded anyway.
We want to emphasize that the basis information leak is not resolved by injecting additional randomness for the choice of the sample. As we will discuss in Section IV, such additional randomness can ensure that the sampling is uniform, but it does not help against the basis information leak. Randomness injection for the sample is effectively equivalent to performing a random permutation on the qubits [17]. This does not remove the correlation between the classical basis information register and the qubits.
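The bookkeeping that the announcements make possible for Eve can be sketched as follows. This is a hypothetical illustration of the example in the text, not a strategy taken from the paper: the helper names `remaining_quotas` and `eve_basis_hint` are assumptions, and the rule only covers the case where exactly one quota is still open.

```python
def remaining_quotas(a_hist, b_hist, n, k):
    """How many X- and Z-agreements Alice and Bob still need,
    computed from the publicly announced basis bits (0 = X, 1 = Z)."""
    x_agreements = sum(1 for a, b in zip(a_hist, b_hist) if a == b == 0)
    z_agreements = sum(1 for a, b in zip(a_hist, b_hist) if a == b == 1)
    return max(n - x_agreements, 0), max(k - z_agreements, 0)


def eve_basis_hint(a_hist, b_hist, n, k):
    """Return 0 (X) or 1 (Z) if the announcements single out a basis for
    Eve's next measurement, or None if they do not."""
    need_x, need_z = remaining_quotas(a_hist, b_hist, n, k)
    if need_x > 0 and need_z == 0:
        return 0  # Z-quota met: further Z-agreements may be discarded
    if need_z > 0 and need_x == 0:
        return 1  # X-quota met: further X-agreements may be discarded
    return None
```

For instance, with n = k = 1 and announced histories `[1, 0]` for Alice and `[1, 1]` for Bob, the Z-quota is already met, so the hint for the next round is the X-basis.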
We will see more concretely how the basis information leak is a problem when we present an eavesdropping attack in Section III A and when we treat the problem more formally in Section V.

III. EAVESDROPPING ATTACKS
A detailed analysis of the effect of non-uniform sampling and basis information leak on the key rate is beyond the scope of the present paper. It would involve developing a new security analysis for a whole protocol involving iterative sifting. Instead of attempting to find a modified analysis for iterative sifting, we will discuss alternative protocols in Section IV.
However, to give an intuitive idea of the effect, we will calculate another figure of merit: the error rate for an intercept-resend attack. We devise a strategy for Eve to attack the iterative sifting protocol during its loop phase and calculate the expected value of the error rate that results from this attack,
$$E := \left\langle \frac{1}{l} \sum_{i=1}^{l} S_i \oplus T_i \right\rangle.$$
Here, ⊕ denotes addition modulo 2 and $S_i$ and $T_i$ are the random variables of the bits $s_i$ and $t_i$, respectively, which are generated by the protocol. One would typically expect an error rate no lower than 25% for an intercept-resend attack [18], which is why our results below are alarming.

A. Attack on non-uniform sampling
Let us first consider an attack on non-uniform sampling, i.e., on the fact that not every possible value of $\Theta$ is equally likely. It will be a particular kind of intercept-resend attack, i.e. Eve intercepts all the qubits that Alice sends to Bob during the loop phase, measures them in some basis and afterwards prepares another qubit in the eigenstate associated with her outcome and sends it to Bob. We will then show that the attack strategy leads to an error rate below 25%.
For the error rate calculation, we assume that the X- and Z-bases are the same for Alice, Bob and Eve, and that they are mutually unbiased. This way, if Alice and Bob measure in the same basis, but Eve measures in the other basis, then Eve introduces an error probability of 1/2 on this qubit. Moreover, for simplicity, we make this calculation for the easiest possible choice of parameters. Consider the iterative sifting protocol (Protocol I) with the parameters $k = n = 1$. From Equations (15) and (16), we get that the sampling probabilities in this case are
$$P_\Theta(01) = \frac{p_x^2}{p_x^2 + p_z^2}, \qquad P_\Theta(10) = \frac{p_z^2}{p_x^2 + p_z^2}.$$
These sampling probabilities are uniform for the symmetric case $p_x = p_z$, but are non-uniform for all other values. In the following, we assume $p_x > 1/2$, which makes the sample $\Theta = 01$ more likely than the sample $\Theta = 10$. We choose the following attack: in the first round of the loop phase, Eve attacks in the X-basis, and in all the other rounds, she attacks in the Z-basis. We choose the attack this way because we know that the first non-discarded basis agreement is more likely to be an X-agreement, whereas the second one is more likely to be a Z-agreement.⁵ We calculate the expected error rate for this attack in Appendix B 1. The black curve in Figure 1 shows $E$ as a function of $p_x$ for this attack. Notice that $E$ falls below 25% for $1/2 < p_x < 1$, and reaches a minimum of $E \approx 22.8\%$ for $p_x \approx 0.73$.
The concerned reader might worry that the 25% error rate associated with the intercept-resend attack was derived under the assumption of equal weighting for the two bases X and Z, whereas it seems here that we choose unequal weightings. However, for the protocol under consideration, the a priori probability distribution $\{p_x, p_z\}$ is not the relevant quantity. Rather, the fact that $n = k$ in our example ensures that the X and Z bases enter with equal weighting.
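The shape of this curve can be reproduced with a few lines of Python. The sketch below is our own illustration, not the calculation of Appendix B 1: it assumes the model stated above (n = k = 1, an error probability of 1/2 whenever Eve measured in the wrong basis, and uniformly random discarding of surplus agreements) and sums the resulting geometric series in closed form. The function name `expected_error_nonadaptive` is hypothetical.

```python
import math


def expected_error_nonadaptive(p_x: float) -> float:
    """Expected error rate for n = k = 1 when Eve measures round 1 in X
    and every later round in Z. Requires 0 < p_x < 1.

    Assumptions (ours): wrong-basis measurement -> error probability 1/2,
    surplus agreements are discarded uniformly at random.
    """
    p_z = 1.0 - p_x
    agree_x = p_x * p_x          # probability of an X-agreement in a round
    agree_z = p_z * p_z          # probability of a Z-agreement in a round
    disagree = 2.0 * p_x * p_z   # probability of a basis disagreement
    x = agree_x / (agree_x + agree_z)  # an agreement is an X-agreement
    z = 1.0 - x                        # an agreement is a Z-agreement

    # Round 1 is an X-agreement: N_x is geometric, and only round 1 was
    # measured correctly, so the kept X-bit is wrong with prob. (N_x-1)/N_x.
    mean_wrong_x = 1.0 + z * math.log(z) / x      # E[(N_x - 1) / N_x]
    e_first_x = 0.5 * (0.5 * mean_wrong_x)

    # Round 1 is a Z-agreement: the kept X-bit is always attacked in Z,
    # and the kept Z-bit is round 1 (attacked in X) with prob. 1/N_z.
    mean_inv_nz = -x * math.log(x) / z            # E[1 / N_z]
    e_first_z = 0.5 * (0.5 + 0.5 * mean_inv_nz)

    # Round 1 is a disagreement: Eve attacks every kept round in Z.
    e_first_dis = 0.5 * 0.5

    return agree_x * e_first_x + agree_z * e_first_z + disagree * e_first_dis


if __name__ == "__main__":
    grid = [i / 1000 for i in range(500, 1000)]
    best = min(grid, key=expected_error_nonadaptive)
    print(f"minimum E = {expected_error_nonadaptive(best):.4f} at p_x = {best:.3f}")
```

Under these assumptions the function returns exactly 1/4 at $p_x = 1/2$ and dips below 1/4 for larger $p_x$, with a shallow minimum close to the values quoted above.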

B. Attack on basis information leak
We now give an eavesdropping strategy that exploits the basis information leak. It is an adaptive strategy, in which Eve's action in round $r+1$ depends on the past communication of the strings $a_1 \ldots a_r$ and $b_1 \ldots b_r$. Again, we consider the simple case of $n = k = 1$. To make sure our attack is really exploiting the basis information leak and not the non-uniform sampling, we set $p_x = p_z = 1/2$. In this case, from Eq. (18), the sampling is uniform:

$$P_\Theta(01) = P_\Theta(10) = \frac{1}{2}.$$

Before we define Eve's strategy, we want to give some intuition. Suppose that during the protocol, Eve learns that Alice and Bob just had their first basis agreement. If this first agreement is a Z-agreement, say, what does this mean for Eve? She knows that the protocol will now remain in the loop phase until they end up with an X-agreement. Suppose that she now decides that she will measure all the remaining qubits in the X-basis. Then, if the next basis agreement of Alice and Bob is an X-agreement, Eve knows the raw key bit perfectly, and her measurement on that bit did not introduce an error. If the next basis agreement is a Z-agreement, she may introduce an error on that test bit. However, there is a chance that Alice and Bob discard this test bit, because they have a total of two (or more, in the end) Z-agreements, and the protocol forces them to discard all Z-agreements except $k = 1$ of them. Hence, learning that the first basis agreement was a Z-agreement brings Eve into a favorable position: she knows that attacking in the X-basis for the rest of the loop phase will necessarily tell her the raw key bit, while she has quite some chance to remain undetected.

Footnote 5: The attentive reader may point out that the attack of Section III A could be improved by making Eve's basis choice dependent on the communication between Alice and Bob. This is correct, but we intentionally design that attack such that Eve ignores Alice and Bob's communication. That allows one to see the effect of non-uniform sampling alone and to compare it to attacks on basis information leak alone, see Sections III B and III C.
This intuition inspires the following intercept-resend attack. Before the first round of the loop phase, Eve flips a fair coin. Let $F$ be the random variable of the coin flip outcome and let 0 and 1 be its possible values. If $F = 0$, then in the first round, Eve attacks in the X-basis, and if $F = 1$, she attacks in the Z-basis. In the subsequent rounds, she keeps attacking in that basis until Alice and Bob first reach a basis agreement. If it is an X-agreement (equivalent to $\Theta = 01$), Eve attacks in the Z-basis in all remaining rounds, and if it is a Z-agreement (equivalent to $\Theta = 10$), she attacks in the X-basis in all remaining rounds (footnote 6). We calculate the expected error rate for this attack in Appendix B 2. We find that $E \approx 16.3\%$. Hence, the basis information leak allows Eve to go far below the typical expected error rate of 25% for intercept-resend attacks [19]. The blue curve in Figure 1 shows, more generally, $E$ as a function of $p_x$ for this attack.
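The adaptive strategy is easy to simulate. The following Monte Carlo sketch is our own illustration under the assumptions already used in this section (n = k = 1, error probability 1/2 whenever Eve's basis differs from the agreed basis, surplus agreements discarded uniformly at random); the function names are hypothetical. To reduce statistical noise, each run returns the error rate averaged over the random discarding and over Eve's measurement outcomes, rather than sampling individual bit errors.

```python
import random


def run_adaptive_attack(p_x: float, rng: random.Random) -> float:
    """One run of iterative sifting (n = k = 1) under the adaptive attack.

    Returns the error rate of the run, averaged over the random discarding
    and over Eve's outcomes (wrong basis -> error probability 1/2).
    """
    eve = "X" if rng.random() < 0.5 else "Z"   # fair coin F
    wrong = {"X": [], "Z": []}                 # per agreement: was Eve's basis wrong?
    while not (wrong["X"] and wrong["Z"]):     # loop until both quotas are met
        alice = "X" if rng.random() < p_x else "Z"
        bob = "X" if rng.random() < p_x else "Z"
        if alice != bob:
            continue                           # basis disagreement, round discarded
        first_agreement = not wrong["X"] and not wrong["Z"]
        wrong[alice].append(eve != alice)
        if first_agreement:
            # Eve learns the announced basis and switches to the other one.
            eve = "Z" if alice == "X" else "X"
    err_x = 0.5 * sum(wrong["X"]) / len(wrong["X"])
    err_z = 0.5 * sum(wrong["Z"]) / len(wrong["Z"])
    return 0.5 * (err_x + err_z)               # l = 2 sifted bits


def estimate_error_rate(p_x: float, runs: int, rng: random.Random) -> float:
    """Monte Carlo estimate of the expected error rate E."""
    return sum(run_adaptive_attack(p_x, rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    print(estimate_error_rate(0.5, 100_000, random.Random(2024)))
```

For $p_x = 1/2$ the estimate should scatter closely around the 16.3% quoted above; other values of $p_x$ can be passed to explore the more general curve.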

C. Independence of the two problems
Are non-uniform sampling and basis information leak really two different problems, or is one a consequence of the other? We will argue now that the two problems are in fact independent. To this end, we describe a protocol that suffers from non-uniform sampling but not from basis information leak, and another protocol that suffers from basis information leak but not from non-uniform sampling.
We have already seen an instance of a protocol that suffers from basis information leak but not from non-uniform sampling: in Section III B, we looked at the iterative sifting protocol with $n = k = 1$ and $p_x = p_z = 1/2$, in which case the sampling is uniform. Hence, there was no exploitation of non-uniform sampling, but the attack strategy exploited basis information leak.
What about the other way round? Can non-uniform sampling occur without basis information leak? A closer look at the attack on non-uniform sampling presented in Section III A hints that this is possible: the attack strategy works even though it completely ignores the communication between Alice and Bob, so it does not make any use of the basis information leak due to this communication.
A more dramatic example shows clearly that non-uniform sampling can occur without basis information leak. To this end, we forget about iterative sifting for a moment and look at a different protocol. Consider a sifting protocol in which Alice and Bob agree in advance that they will measure the first $n = 100$ qubits in the X-basis, and that they will measure the second $k = 100$ qubits in the Z-basis, without any communication during the protocol. Of course, there is no hope for this protocol to be useful for QKD, but it serves well to demonstrate our point. It leads to a very dramatic form of non-uniform sampling, because $P_\Theta(0 \ldots 0 1 \ldots 1) = 1$ and $P_\Theta(\vartheta) = 0$ for all other $\vartheta \in \{0,1\}^l_k$. If Eve attacks the first 100 rounds in X and the second 100 rounds in Z, then she knows the raw key perfectly, without introducing any error. At the same time, there is no communication between Alice and Bob during the protocol, so no information about the basis choice is leaked during the protocol. Instead, Eve (who is always assumed to know the protocol) already had this information before the first round.
Hence, we conclude that the problems of non-uniform sampling and basis information leak are independent. They just happen to occur simultaneously for iterative sifting, but they can occur separately in general. We will see the independence of the two problems more formally in Section V.

D. Attack on both problems
Since the two problems are independent, it is interesting to devise an attack that exploits both of them. Let us again consider $k = n = 1$ and suppose $p_x > 1/2$ to ensure that we have non-uniform sampling. Suppose Eve begins in the same way as in the attack on non-uniform sampling, measuring in the X-basis. However, as in the attack on the basis information leak, she makes her attack adaptive by following the rule that she switches to the Z-basis when Alice and Bob announce that they had an X-agreement. If Alice and Bob announce a Z-agreement, Eve keeps attacking in the X-basis.
We give an expression for the error rate induced by this attack in Appendix B 3. The red curve in Figure 1 shows a plot of this error rate as a function of $p_x$. As one can see, the error rate attains its minimum of $E \approx 15.8\%$ for $p_x \approx 0.57$. Hence, this combined attack on both problems performs much better than the one on non-uniform sampling alone (with a minimal error rate of about 22.8%) and even better than the attack on the basis information leak alone (with a minimal error rate of about 16.3%).
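These two numbers can be cross-checked numerically. Under the assumptions used throughout this section (error probability 1/2 for a wrong-basis measurement, uniformly random discarding), the same counting argument as for the other attacks yields a compact closed form, $E = \frac{1}{4}(1 + z \ln z)$, where $z = p_z^2/(p_x^2 + p_z^2)$ is the probability that the first basis agreement is a Z-agreement. This expression is our own reconstruction and is not quoted from Appendix B 3; the function name below is hypothetical.

```python
import math


def expected_error_combined(p_x: float) -> float:
    """Expected error rate (n = k = 1) for the combined attack.

    Eve attacks in X until an X-agreement is announced, then in Z.
    Closed form reconstructed by us under the stated assumptions:
    E = (1 + z ln z) / 4 with z = P(first agreement is a Z-agreement).
    Requires 0 < p_x < 1.
    """
    p_z = 1.0 - p_x
    z = p_z * p_z / (p_x * p_x + p_z * p_z)
    return 0.25 * (1.0 + z * math.log(z))


if __name__ == "__main__":
    grid = [i / 1000 for i in range(500, 1000)]
    best = min(grid, key=expected_error_combined)
    print(f"minimum E = {expected_error_combined(best):.4f} at p_x = {best:.3f}")
```

Since $z \ln z$ is minimal at $z = 1/e$, the minimum of this expression is $(1 - 1/e)/4$, which agrees with the quoted 15.8% and with the quoted location $p_x \approx 0.57$. At $p_x = 1/2$ it coincides with the value for the attack on the basis information leak alone, as expected from the symmetry of that attack.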

IV. SOLUTIONS TO THE PROBLEMS
How can these problems be avoided? Roughly speaking, we can say that protocols with iterative sifting are characterized by three properties that make them efficient: (1) asymmetric basis choice probabilities and quota, $p_x > p_z$ and $n > k$, (2) single-basis parameter estimation (Protocol II), (3) communication in Step 5 of the loop phase. As we have seen, it is the communication which causes the basis information leak.
An obvious fix to this problem is to take this communication out of the loop phase and to postpone it to the final phase, when all the quantum communication is over. Then there is no classical communication during the loop phase, and hence, there cannot be a termination condition that depends on classical communication. Instead, the number of rounds in the loop phase is set to a fixed number $m \in \mathbb{N}_+$. This number $m$ then becomes a parameter of the protocol.
Fixing the number of rounds introduces a new issue: there is no guarantee that the quotas for X- and Z-agreements will be met after $m$ rounds. In order to perform the parameter estimation, however, the quotas $n$ and $k$ must be met. Otherwise, Inequality (6) is not applicable, because the numbers of X- and Z-agreements in the loop phase are random numbers that can be below $n$ and $k$, respectively. Thus, unless one wants to introduce a new tail probability analysis as well, there is a strictly positive probability that Alice and Bob have to abort the sifting protocol because they have too many basis disagreements. If the sifting scheme is modified in this way, it no longer involves any communication about the basis choices during its loop phase. Thus, it is trivially true that there is no basis information leak.
Many protocols in the QKD literature have such a fixed number $m$ of rounds (which is often denoted by $N$ instead) and a corresponding abort event. It seems that before iterative sifting was introduced, the sifting procedure was either not clearly written out in the protocols, or it had such a fixed round number. For example, in the original BB84 paper [20], the sifting scheme is not written out in enough detail to say whether this is the case, but the protocol for which Shor and Preskill showed asymptotic security uses a fixed number of rounds [21]. In addition, they use symmetric basis choice probabilities and quota, i.e. $p_x = p_z = 1/2$ and $k = n$. Alice sends $4n+\delta$ qubits to Bob (where $\delta$ is a positive but small overhead) without any intermediate classical communication. Afterwards, they compare their bases and check whether they have at least $n$ X-agreements and at least $n$ Z-agreements. If not, they abort; otherwise they choose $n$ X-agreements and $n$ Z-agreements and discard the rest.
With the remaining $2n$ bits, they continue with parameter estimation. However, instead of performing SBPE, they choose $n$ bits at random (i.e. with fresh randomness) for parameter estimation and use the rest for the raw key. Hence, this protocol shares none of the three properties with iterative sifting that we listed above.
This scheme trivially has no basis information leak. In addition, it trivially samples uniformly, as the whole sample is chosen with fresh randomness that is injected for that purpose. Thus, it is secure with respect to the concerns raised in this article. However, it is unnecessarily inefficient: speaking in expectation values, half of the bits are discarded because they were determined in different bases, and another quarter of the bits is used for parameter estimation, leaving only a quarter of the original bits for the raw key, see Figure 2 a).
A similar protocol has recently been suggested by Tomamichel and Leverrier with a complete proof of its security, modelling all its subroutines [22]. They also use symmetric basis choice probabilities $p_x = p_z$ and randomness injection for the sample choice. However, they use less than half of the sifted bits for parameter estimation. Their protocol also samples uniformly, because additional randomness is injected for the choice of the sample.
To increase the efficiency, Lo, Chau and Ardehali (LCA) suggested using asymmetric basis choice probabilities and quota, i.e. $p_x > p_z$ and $n > k$. As shown in Figure 2 b), this decreases the number of expected disagreements from a value of $m/2$ to a value of $2 p_x p_z m$. This is great for efficiency: for larger block lengths, relatively smaller samples are required to gain the same confidence that Alice's and Bob's bits are correlated (see footnote 7). In the limit where $m \to \infty$, the probability $p_x$ can be chosen to be arbitrarily close to one, and the fraction of data lost due to basis disagreements converges to zero. We call this protocol LCA sifting. It shares property (1) with iterative sifting.
As in the protocol of Shor and Preskill, Lo, Chau and Ardehali did not consider SBPE. Their parameter estimation also requires some randomness injection for the choice of the sample: the Z-agreements form one half of the sample, and the other half is chosen at random from the X-agreements. Then, not just one but two error rates are determined, namely on the X-part and the Z-part of the sample separately. Only if both error rates are below a fixed error tolerance do they continue the protocol using the rest as the raw key (for details, see their article [15]). The LCA protocol trivially has no basis information leak.

Footnote 7: This can be seen from Inequality (6).
In addition, it turns out that it also samples uniformly. This is in fact non-trivial, and to our knowledge, it was not proved in the literature. We fill this gap: the uniform sampling property of the LCA protocol turns out to be a corollary of Proposition 2 below. Thus, the LCA protocol could be used as a secure replacement for iterative sifting.

Protocol III: LCA sifting

Output: For $l = n + k$, the outputs are:
Alice: $l$-bit string $(s_i)_{i=1}^l \in \{0,1\}^l$ (measurement outcomes, sifted) or $s = \perp$ (if the protocol aborts),
Bob: $l$-bit string $(t_i)_{i=1}^l \in \{0,1\}^l$ (measurement outcomes, sifted) or $t = \perp$ (if the protocol aborts),
public: $l$-bit string $(\vartheta_i)_{i=1}^l \in \{0,1\}^l$ with $\sum_i \vartheta_i = k$ (basis choices, sifted), where 0 means X-basis and 1 means Z-basis, or $\vartheta = \perp$ (if the protocol aborts).

Number of rounds: Fixed number $m$ (protocol parameter)

The protocol

Loop phase: Steps 1 to 4 are repeated $m$ times (round index $r = 1, \ldots, m$). Starting with round $r = 1$, Alice and Bob do the following:
Step 1: (Preparation): Alice prepares a qubit pair in a maximally entangled state.
Step 2: (Channel use): Alice uses the quantum channel to send one share of the qubit pair to Bob.
Step 3: (Random bit generation): Alice and Bob each (independently) generate a random classical bit $a_r$ and $b_r$, respectively, where 0 is generated with probability $p_x$ and 1 is generated with probability $p_z$.
Step 4: (Measurement): Alice measures her share in the X-basis (if $a_r = 0$) or in the Z-basis (if $a_r = 1$), and stores the outcome in a classical bit $y_r$. Likewise, Bob measures his share in the X-basis (if $b_r = 0$) or in the Z-basis (if $b_r = 1$), and stores the outcome in a classical bit $y'_r$.

Final phase: The following steps are performed in a single run:
Step 5': (Quota Check): Alice and Bob determine the sets $u(m) = \{ r \in [m] \mid a_r = b_r = 0 \}$ and $v(m) = \{ r \in [m] \mid a_r = b_r = 1 \}$. They check whether the quota condition ($|u(m)| \geq n$ and $|v(m)| \geq k$) holds. If it holds, they proceed with Step 6. Otherwise, they abort.
Step 6: (Random Discarding): Alice and Bob choose a subset $u \subseteq u(m)$ of size $n$ at random, i.e. each subset of size $n$ is equally likely to be chosen. Analogously, they choose a subset $v \subseteq v(m)$ of size $k$ at random. Then they discard the bits $a_r$, $b_r$, $y_r$ and $y'_r$ for which $r \notin u \cup v$.
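The classical bookkeeping of Protocol III can be sketched in a few lines of Python. This is only an illustration of Steps 3, 5' and 6: the quantum part (preparation, channel use, measurement outcomes) is left out, and the function name `lca_sifting` and its signature are our own hypothetical choices rather than anything specified by the protocol.

```python
import random
from typing import List, Optional


def lca_sifting(m: int, n: int, k: int, p_x: float,
                rng: random.Random) -> Optional[List[int]]:
    """Basis bookkeeping of LCA sifting (no quantum part simulated).

    Returns the sifted basis choice string theta (0 = X, 1 = Z) of length
    n + k with exactly k ones, or None if the protocol aborts.
    """
    # Step 3, for all m rounds: independent basis bits, 0 with probability p_x.
    a = [0 if rng.random() < p_x else 1 for _ in range(m)]
    b = [0 if rng.random() < p_x else 1 for _ in range(m)]

    # Step 5': quota check, performed only after the loop phase is over.
    u_m = [r for r in range(m) if a[r] == b[r] == 0]   # X-agreements
    v_m = [r for r in range(m) if a[r] == b[r] == 1]   # Z-agreements
    if len(u_m) < n or len(v_m) < k:
        return None                                    # abort

    # Step 6: random discarding; every subset of the right size is equally likely.
    u = set(rng.sample(u_m, n))
    v = set(rng.sample(v_m, k))
    kept = sorted(u | v)
    return [0 if r in u else 1 for r in kept]


if __name__ == "__main__":
    rng = random.Random(0)
    print(lca_sifting(m=30, n=4, k=2, p_x=0.7, rng=rng))
```

Note that the number of rounds `m` is fixed before the loop and no basis information is exchanged until all rounds are complete, which is the structural difference from iterative sifting.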
On the one hand, we suggest using the sifting part of the LCA protocol. To be clear about the details of the sifting scheme, we have written it out in our notation in Protocol III. On the other hand, we find that the parameter estimation part of the LCA protocol is unnecessarily complicated and inefficient: it needs randomness injection for part of the sample choice, and it requires the estimation of two error rates instead of one. What if, instead, LCA sifting is followed by SBPE, i.e., only the error rate on the Z-agreements is determined? The critical question is whether this would still lead to uniform sampling. As the following proposition shows, this is indeed the case.
Proposition 2: The combination of LCA sifting (Protocol III) and SBPE (Protocol II) samples uniformly. In other words, the LCA sifting protocol satisfies

$$P_\Theta(\vartheta) = P_\Theta(\vartheta') \quad \text{for all } \vartheta, \vartheta' \in \{0,1\}^l_k.$$

In contrast to protocols that use randomness injection for the sample choice, the uniform sampling property is non-trivial to prove for LCA sifting with SBPE. We prove Proposition 2 in Appendix C (see the corollary of Proposition 8). This shows that the combination of LCA sifting and SBPE is secure and can therefore be used to replace iterative sifting (see footnote 8). This is good news for efficiency, as no randomness injection is required for the choice of the sample. Since this random sample choice would need to be communicated between Alice and Bob in an authenticated way, this also uses up less secret key from the initial key pool (see [23] for a discussion of the key cost of classical postprocessing). One can see in Figure 2 that in the finite-key regime, this also leads to a larger raw key. Together with Proposition 3, which we will discuss in Section V, this also establishes security of the protocol in the finite-key regime. In contrast, the original work of LCA [15] only establishes asymptotic security.

Footnote 8: For protocols that use these subroutines, the abort probability $p_{\mathrm{abort}}$ of the sifting step is important because it affects the key rate of the QKD protocol. We calculate $p_{\mathrm{abort}}$ in Appendix C as well (Proposition 8).
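For small parameters, the uniform sampling property can also be checked by brute force. The sketch below is an illustration and not a substitute for the proof in Appendix C: it enumerates all basis strings and all admissible subsets with exact rational arithmetic, following our reading of Protocol III. The function name and the chosen parameters are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations, product


def sampling_distribution(m: int, n: int, k: int, p_x: Fraction) -> dict:
    """Exact distribution of the sifted basis string theta for LCA sifting.

    Keys are tuples theta (0 = X, 1 = Z) or None for the abort event.
    Cost grows like 4**m, so this is only meant for small m.
    """
    p = {0: p_x, 1: 1 - p_x}
    dist = defaultdict(Fraction)
    for a in product((0, 1), repeat=m):
        for b in product((0, 1), repeat=m):
            prob = Fraction(1)
            for a_r, b_r in zip(a, b):
                prob *= p[a_r] * p[b_r]
            u_m = [r for r in range(m) if a[r] == b[r] == 0]   # X-agreements
            v_m = [r for r in range(m) if a[r] == b[r] == 1]   # Z-agreements
            if len(u_m) < n or len(v_m) < k:
                dist[None] += prob                             # quota check fails
                continue
            subsets_u = list(combinations(u_m, n))
            subsets_v = list(combinations(v_m, k))
            weight = prob / (len(subsets_u) * len(subsets_v))  # uniform discarding
            for u in subsets_u:
                for v in subsets_v:
                    theta = tuple(0 if r in u else 1 for r in sorted(u + v))
                    dist[theta] += weight
    return dict(dist)


if __name__ == "__main__":
    dist = sampling_distribution(m=5, n=2, k=1, p_x=Fraction(3, 4))
    for theta, prob in dist.items():
        print(theta, prob)
```

In such a run, every non-abort string $\vartheta$ receives the same probability even though $p_x \neq p_z$, and these probabilities add up to $1 - p_{\mathrm{abort}}$ rather than to 1.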

Suggestion: Use LCA sifting (Protocol III) and SBPE (Protocol II).
Let us briefly remark on the efficiency of LCA sifting in comparison to that of iterative sifting. They differ in that LCA sifting has no communication during the loop phase, see property (3) above. The question is whether this necessarily means that the efficiency is strongly reduced in comparison with iterative sifting. We define the efficiency $\eta$ of a sifting protocol as $\eta = R/M$, where $R$ is the random variable of the number of rounds that are kept after sifting and $M$ is the random variable of the total number of rounds performed in the loop phase of the protocol. We explain this in more detail in Appendix D. A plot of the expected efficiency for iterative sifting and for LCA sifting is shown in Figure 3 for the special case of symmetric probabilities $p_x = p_z$ and identical quota $n = k$ (this special case is computationally much easier to handle; for other choices, the computation becomes very hard). We find that iterative sifting is more efficient, as expected, but the difference between the two efficiencies becomes insignificant for practically relevant quota sizes $n$ and $k$.
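A rough numerical feel for this comparison can be obtained as follows. The sketch assumes the efficiency is the ratio $R/M$ of kept rounds to performed rounds, with $R = n + k$ on success and $R = 0$ on abort. It is not the computation of Appendix D: in particular, the number of rounds `m` used for LCA sifting is a free parameter here, and all function names are hypothetical.

```python
import random
from math import comb


def lca_abort_probability(m: int, n: int, k: int, p_x: float) -> float:
    """Probability that fewer than n X-agreements or fewer than k
    Z-agreements occur in m independent rounds (trinomial sum)."""
    agree_x = p_x * p_x
    agree_z = (1.0 - p_x) ** 2
    disagree = 1.0 - agree_x - agree_z
    p_ok = 0.0
    for i in range(n, m + 1):              # number of X-agreements
        for j in range(k, m - i + 1):      # number of Z-agreements
            p_ok += (comb(m, i) * comb(m - i, j)
                     * agree_x ** i * agree_z ** j * disagree ** (m - i - j))
    return 1.0 - p_ok


def lca_expected_efficiency(m: int, n: int, k: int, p_x: float) -> float:
    """Expected R/M for LCA sifting: R = n + k unless the protocol aborts."""
    return (n + k) * (1.0 - lca_abort_probability(m, n, k, p_x)) / m


def iterative_expected_efficiency(n: int, k: int, p_x: float,
                                  runs: int, rng: random.Random) -> float:
    """Monte Carlo estimate of the expected R/M for iterative sifting,
    where the loop runs until both quotas are met."""
    total = 0.0
    for _ in range(runs):
        x_agree = z_agree = rounds = 0
        while x_agree < n or z_agree < k:
            rounds += 1
            alice_x = rng.random() < p_x
            bob_x = rng.random() < p_x
            if alice_x and bob_x:
                x_agree += 1
            elif not alice_x and not bob_x:
                z_agree += 1
        total += (n + k) / rounds
    return total / runs


if __name__ == "__main__":
    n = k = 20
    print("iterative:", iterative_expected_efficiency(n, k, 0.5, 5000, random.Random(1)))
    for m in (80, 100, 120, 140):          # hypothetical choices of m
        print("LCA, m =", m, lca_expected_efficiency(m, n, k, 0.5))
```

Scanning `m` shows the trade-off that fixes the LCA efficiency: small `m` makes aborts likely, while large `m` wastes rounds.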

V. FORMAL CRITERIA FOR GOOD SIFTING
In Section II, we have seen that iterative sifting leads to problems. In Section IV, we showed that these problems can be avoided by using LCA sifting (Protocol III) and SBPE (Protocol II). In this section, we give a more complete answer to the question of how these problems can be avoided by presenting two simple formal criteria that are sufficient for a sifting protocol to lead to a correct parameter estimation. More precisely, we describe two formal properties of the state produced by a sifting protocol which guarantee that if the protocol is followed by SBPE (Protocol II), then Inequality (6) holds. As indicated in the introduction, the two properties take the form of equalities, see Equations (1) and (2). We prove the sufficiency of these two criteria by deriving (6) from them in Proposition 3 below.
In order to state the two criteria and the random variable $\Lambda_{\mathrm{key}}$ in (6) formally, we need to define a certain kind of quantum state $\rho_{A^l B^l \Theta^l}$ associated with a sifting protocol. To explain what this state is, we describe it for LCA sifting. It is best described in a variation of the protocol. Suppose that Alice and Bob run the protocol, but they skip the measurement in every round. Instead, they keep each qubit system in their lab without modifying its state. With current technology, this is practically impossible, but since $\rho_{A^l B^l \Theta^l}$ is a purely mathematical construct, we do not worry about the technical feasibility. Notice that Alice and Bob still make basis choices, compare them and discard rounds; they just do not actually perform the measurements. Let us compare the output of this modified protocol with the output of the original protocol:

original protocol: Alice holds the $l$-bit string $(s_i)_{i=1}^l$, Bob holds the $l$-bit string $(t_i)_{i=1}^l$, and the public output is the $l$-bit string $(\vartheta_i)_{i=1}^l$;
modified protocol: Alice holds the $l$-qubit system $A^l$, Bob holds the $l$-qubit system $B^l$, and the public output is again the $l$-bit string $(\vartheta_i)_{i=1}^l$.

Hence, if we model the classical bit string $\vartheta$ as the state of a classical register $\Theta^l$, we can say that the output of the modified protocol is a quantum-quantum-classical (QQC) state $\rho_{A^l B^l \Theta^l}$. More generally, the state $\rho_{A^l B^l \Theta^l}$ associated with a sifting protocol is its output state in the case where all the measurements are skipped. This state still carries all the probabilistic information of the original protocol. To see this, let $X = \{X_0, X_1\}$ and $Z = \{Z_0, Z_1\}$ be the POVMs describing Alice's X- and Z-measurement, let $X' = \{X'_0, X'_1\}$ and $Z' = \{Z'_0, Z'_1\}$ be the POVMs describing Bob's X- and Z-measurement, and let $M = \{M_0, M_1\}$ be the projective measurement on $\Theta$ with respect to which the state of the register $\Theta$ is diagonal. Define the operators [...]. Then, the probability distribution over the output of the protocol is [...], where $\rho_{(AB\Theta)^l}$ is the same state as $\rho_{A^l B^l \Theta^l}$, but with the registers reordered in the obvious way, and where [...].

With the state $\rho_{A^l B^l \Theta^l}$ associated with a sifting protocol at hand, it is easy to define the random variable $\Lambda_{\mathrm{key}}$ associated with the protocol. The relevant probability space is the discrete probability space $(\Omega_{ZZ'\Theta}, P_{ZZ'\Theta})$, where $\Omega_{ZZ'\Theta}$ is the sample space [...] (26) and where $P_{ZZ'\Theta}$ is the probability mass function [...] (27). The probability mass function $P_{ZZ'\Theta}$ corresponds to a Gedankenexperiment in which Alice and Bob measure all qubits in the Z-basis. Now we are able to formally say what the random variable $\Lambda_{\mathrm{key}}$ of a sifting protocol is. Let $\rho_{A^l B^l \Theta^l}$ be the state associated with the sifting protocol, and let $(\Omega_{ZZ'\Theta}, P_{ZZ'\Theta})$ be the probability space as in Equations (26) and (27). Then $\Lambda_{\mathrm{key}}$ is the random variable [...] (28), which is the key bit error rate. Analogously, we have the test bit error rate $\Lambda_{\mathrm{test}}$ [...] (29). This allows us to formally define the tail probability $p_{\mathrm{tail}}$. We define it via the same formula as in (4), which we repeat here for the reader's convenience: [...]. The difference is that now, we have formally defined all the components of the equality. The following proposition states the tail probability bound in a formal way.

Proposition 3 (Tail probability estimate): Let $\rho_{A^l B^l \Theta^l}$ be a density operator of a system $A^l B^l \Theta^l$ where $A$ and $B$ are qubit systems and $\Theta$ is a classical system, let $\{Z_0, Z_1\}$ and $\{Z'_0, Z'_1\}$ be POVMs on the quantum systems $A$ and $B$, respectively, let $\{M_0, M_1\}$ be the read-out measurement of the classical system $\Theta$, let $\Lambda_{\mathrm{key}}$, $\Lambda_{\mathrm{test}}$ be random variables on the discrete probability space $(\Omega_{ZZ'\Theta}, P_{ZZ'\Theta})$ as defined in Equations (26) to (29) and let $p_{\mathrm{tail}}$ be as in Equation (4). Let $\rho_{A^l B^l}$ and $\rho_{\Theta^l}$ denote the corresponding reduced states of $\rho_{A^l B^l \Theta^l}$ and let $P_\Theta$ denote the corresponding marginal of $P_{ZZ'\Theta}$. If the two conditions

$$P_\Theta(\vartheta) = P_\Theta(\vartheta') \quad \text{for all } \vartheta, \vartheta' \in \{0,1\}^l_k, \qquad (1)$$
$$\rho_{A^l B^l \Theta^l} = \rho_{A^l B^l} \otimes \rho_{\Theta^l} \qquad (2)$$

hold, then [...], where [...].

We prove Proposition 3 in Appendix E. The formulation of Proposition 3 allows us to see the formal requirements on a sifting protocol to lead to a correct parameter estimation when followed by SBPE. Condition (1) is exactly the statement that the sampling probability does not depend on the sample, i.e. the protocol leads to uniform sampling. There is one thing that we want to point out here: while it is sufficient for the sampling probabilities to be the inverse of the number of possible samples, i.e. $P_\Theta(\vartheta) = \binom{l}{k}^{-1}$ for all $\vartheta \in \{0,1\}^l_k$, condition (1) is strictly weaker. In the case where there is a non-zero probability that the protocol aborts during the sifting phase (as is the case for LCA sifting), the sampling probabilities do not add up to 1 but rather to $1 - p_{\mathrm{abort}}$, where $p_{\mathrm{abort}}$ is the probability that the protocol aborts during the sifting phase. Condition (2) is the formal statement of what it means for a protocol that the basis choice register is uncorrelated with Alice's and Bob's qubits before measuring. Proposition 3 states that if these two conditions are satisfied, then the correlation test of the SBPE protocol leads to the right conclusion. Hence, these are the two conditions that a sifting protocol needs to satisfy in order to be a good sifting protocol.
We point out that the digression to a classical probability space, Equations (26) to (29) and (4), is a mere change of notation. However, the fact that it is possible to express Proposition 3 in terms of a classical probability space shows that this part of a QKD security analysis is purely classical.

VI. CONCLUSION
In recent years QKD has emerged as a commercial technology, with the prospect of global QKD networks on the horizon [19]. All QKD implementations have finite size, and yet only recently has finite-key analysis approached mathematical rigor [3][4][5][6][8][9][10][11]. In this work, we showed that further modifications of the protocols and/or their analysis are needed to make finite-key analysis rigorous.
We pointed out that sifting, a stage of QKD that is often overlooked with respect to security analysis, is actually crucial for security. A carelessly designed sifting subroutine can jeopardize the security of an otherwise reliable protocol. We found that iterative sifting, a sifting protocol that has both been proposed theoretically [3,4,8,9] and been implemented experimentally [12], violates two assumptions in the typical security analysis. We showed how the violation of these assumptions can be exploited by an eavesdropper, leading to intercept-resend attacks with unexpectedly low error rates (see Fig. 1).
We presented an alternative scheme, LCA sifting and SBPE, and proved that it solves the two problems. We derived an expression for its abort probability and therefore provided everything that is needed for its future use as a subroutine. We argued that this scheme is more economical and efficient than some other previously proposed protocols, as it does not require an additional random seed for the sample and at the same time allows for asymmetric basis choice probabilities. As we explained, the latter allows for a significantly higher sifting efficiency [15].
We gave the precise mathematical form of the two assumptions that are needed for secure sifting in Eqs. (1) and (2). In doing so, we have provided a guide for the construction of future protocols: when designing a sifting protocol, one just needs to check these two conditions in order to make sure that the usual analysis of the parameter estimation based on Inequality (6) is correct and the protocol is secure. This may require a mathematical model for the state $\rho_{A^l B^l \Theta^l}$ or for the probabilities of the output strings $(\vartheta_i)_{i=1}^l$, $(s_i)_{i=1}^l$ and $(t_i)_{i=1}^l$ generated by the sifting protocol. Such models are rarely provided in the literature. In the case of iterative sifting, the absence of such a model to check the desired properties has led to a wrong security analysis.
This points to a deeper problem in QKD security analysis: there is often a gap between the physical protocols that are written down as instructions for Alice and Bob and the mathematics of the security proof. This is not a purely pedantic issue, but rather a very practical one which can be exploited by eavesdroppers. In the future, we advocate that each step in the physical QKD protocol be explicitly mathematically modeled. In particular, we emphasize that sifting protocols must be proved to (rather than assumed to) satisfy the desired assumptions of the analysis. We believe our work will ultimately inspire more complete security proofs of finite-size QKD.
The validity of (B9) can be seen as follows. On the second bit of $S$ and $T$, there is no error because it comes from a round in which all parties have measured in the Z-basis. Hence, the left-hand side of (B9) is the probability of getting an error on the first bit of $S$ and $T$, divided by the total number of bits, 2. Hence, we need to determine the error probability of the first bit. If $N_x = 1$, then the first bit comes from the first round of the loop phase, in which Alice, Bob and Eve have measured in the X-basis and hence, there is no error. However, for $N_x = n_x$, the first bit of $S$ and $T$ is chosen at random from one of the $n_x$ X-agreements. In only one of these $n_x$ rounds has Eve measured in the X-basis, and in $n_x - 1$ rounds, she measured in the Z-basis. Hence, the probability that Eve measured in the wrong basis on the first bit of $S$ and $T$ is $(n_x - 1)/n_x$, and therefore the error probability of the first bit is $\frac{1}{2} \cdot \frac{n_x - 1}{n_x}$. Thus, [...]. Similarly, we get [...] and [...]. Taking Equations (B8), (B9), (B12) and (B13) together, we get that [...]. In a similar way, we get [...]. Equations (B3), (B14) and (B15) taken together result in [...] (B16). Figure 1 in the main article shows a plot of $E$ as in (B16) as a function of $p_x$. As one can see, $E$ achieves a minimum of $E \approx 22.8\%$ for $p_x \approx 0.73$.

2. Attack that exploits basis-information leak
Now we calculate the expected error rate of iterative sifting for the attack which exploits basis-information leak as described in Section III B. As before, let $E$ be the expected value of the error rate as defined in Equation (17). Again, we assume that the X- and Z-bases are the same for Alice, Bob and Eve and that they are mutually unbiased. Recall the strategy of Eve's intercept-resend attack: before the first round of the loop phase, Eve flips a fair coin. Let $F$ be the random variable of the coin flip outcome and let 0 and 1 be its possible values. If $F = 0$, then in the first round, Eve attacks in the X-basis, and if $F = 1$, she attacks in the Z-basis. In the subsequent rounds, she keeps attacking in that basis until Alice and Bob first reach a basis agreement. If it is an X-agreement (equivalent to $\Theta = 01$), Eve attacks in the Z-basis in all remaining rounds, and if it is a Z-agreement (equivalent to $\Theta = 10$), she attacks in the X-basis in all remaining rounds.
The calculation of $E$ goes as follows: [...] (B17)-(B19). Equality (B17) is just a decomposition of $E$ into conditional expectations. Equality (B18) follows from the fact that the problem is symmetric under the exchange of X and Z, i.e. under the exchange of 0 and 1. The only quantity that is not trivial to calculate in Equation (B19) is the expected value of the error rate, given that Eve first measures in X and that the first basis agreement is an X-agreement. It is calculated as follows: [...], where $\ln$ denotes the logarithm to base $e$. Therefore, $E \approx 16.3\%$. (B26)
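The numerical value in (B26) can be cross-checked by a short calculation. The following is our own condensed version under the assumptions stated above ($n = k = 1$, $p_x = p_z = 1/2$, error probability 1/2 for a wrong-basis measurement, uniformly random discarding); its intermediate steps and notation are not necessarily those of (B17) to (B25).

```latex
% Condition on F = 0 (Eve starts in the X-basis); F = 1 follows by symmetry.
% Every basis agreement is an X- or a Z-agreement with probability 1/2 each.
\begin{align*}
\Pr[N_x = n_x \mid F = 0,\ \Theta = 01] &= 2^{-n_x}, \qquad n_x \geq 1, \\
\langle \text{error rate} \mid F = 0,\ \Theta = 01 \rangle
  &= \frac{1}{2} \sum_{n_x \geq 1} 2^{-n_x} \cdot \frac{1}{2} \cdot \frac{n_x - 1}{n_x}
   = \frac{1}{4} \Bigl( 1 - \sum_{n_x \geq 1} \frac{2^{-n_x}}{n_x} \Bigr)
   = \frac{1 - \ln 2}{4}, \\
\langle \text{error rate} \mid F = 0,\ \Theta = 10 \rangle
  &= \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}, \\
E &= \frac{1}{2} \cdot \frac{1 - \ln 2}{4} + \frac{1}{2} \cdot \frac{1}{4}
   = \frac{2 - \ln 2}{8} \approx 16.3\%.
\end{align*}
```

In the case $\Theta = 01$, only the first of the $n_x$ X-agreements was attacked in the correct basis, which gives the factor $(n_x - 1)/n_x$; in the case $\Theta = 10$, every Z-agreement was attacked in the X-basis, so the kept test bit carries an error with probability 1/2 while the key bit is error-free.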

3. Attack that exploits both problems
Here we present the error rate induced by the intercept-resend attack presented in Section III D, which exploits both non-uniform sampling and basis information leak. Let us recall the attack strategy. In the first round of the loop phase of the iterative sifting protocol, Eve attacks in the X-basis. She keeps doing that in subsequent rounds until Alice and Bob announce a basis agreement. If they announce an X-agreement, Eve attacks in the Z-basis in all the following rounds. Otherwise, she keeps attacking in the X-basis.
The calculation of the error rate is similar to the calculations done in Appendices B 1 and B 2. We only show the result here: [...] (B27). A plot of (B27) is shown in Figure 1 as a function of $p_x$. As one can see, the expected error rate has a minimum of $E \approx 15.8\%$ for $p_x \approx 0.57$. Hence, this combined attack on both problems performs much better than the one on non-uniform sampling alone (with a minimal expected error rate of $\approx 22.8\%$, see Section III A) and even better than the attack on the basis information leak alone (with a minimal expected error rate of $\approx 16.3\%$, see Section III B).
2. Formalization of (ΩABUV , PABUV ) According to what we said in the last subsection, the probability space that is relevant for our proof of uniform sampling of LCA sifting is the space (Ω ABU V , P ABU V ), which describes the probabilities of the basis choice strings a and b of Alice and Bob, as well as the choices u and v of the rounds that are used for the raw key and for parameter estimation, respectively.We are going to formalize this space in this subsection.
We start by determining the sample space $\Omega_{ABUV}$. In the loop phase of the protocol, Alice and Bob generate basis choice strings $a = (a_i)_{i=1}^m \in \{0,1\}^m$ and $b = (b_i)_{i=1}^m \in \{0,1\}^m$. This happens in every run, no matter whether Alice and Bob abort the protocol in the final phase. Hence,

$$(a, b) \in \{0,1\}^m \times \{0,1\}^m .$$

In the final phase of the protocol, Alice and Bob do a quota check, in which they determine the rounds in which both measured in the X-basis (X-agreements) and the rounds in which both measured in the Z-basis (Z-agreements). If they had fewer than $n$ X-agreements or fewer than $k$ Z-agreements, they abort. In this case, Alice and Bob do not choose subsets $u$ and $v$ of their X- and Z-agreements, respectively. We model this by setting $u = v = \perp$, where $\perp$ is just a symbol indicating that Alice and Bob abort. If the quota check is successful, Alice and Bob choose random subsets $u \subseteq u(m)$ of size $n$ and $v \subseteq v(m)$ of size $k$. We represent these subsets by bit strings $u \in \{0,1\}^m_n$, $v \in \{0,1\}^m_k$, where

$$\{0,1\}^m_n := \Big\{ u \in \{0,1\}^m \;\Big|\; \sum_{i=1}^m u_i = n \Big\} .$$

They are to be interpreted as follows: for $u \in \{0,1\}^m_n$ and $i \in [m]$, $u_i = 1$ means that $i$ is contained in the subset $u \subseteq u(m)$, and $u_i = 0$ means that $i$ is not contained, and likewise for $v \in \{0,1\}^m_k$. The requirement that the subsets $u$ and $v$ have size $n$ and $k$ translates into the conditions that the string components sum up to $n$ and $k$, respectively. Taking the two possibilities (the protocol aborts or the quota check is successful) together, we have that

$$u \in \{0,1\}^m_n \cup \{\perp\}, \qquad v \in \{0,1\}^m_k \cup \{\perp\},$$

and hence

$$\Omega_{ABUV} = \{0,1\}^m \times \{0,1\}^m \times \big(\{0,1\}^m_n \cup \{\perp\}\big) \times \big(\{0,1\}^m_k \cup \{\perp\}\big) .$$

This is the sample space of the probability space $(\Omega_{ABUV}, P_{ABUV})$ that we are looking for.

Next, we determine the probability mass function $P_{ABUV}$. We can write

$$P_{ABUV}(a, b, u, v) = P_{AB}(a, b)\, P_{UV|AB}(u, v \mid a, b),$$

where

$$P_{AB}(a, b) = p_x^{m-|a|}\, p_z^{|a|} \cdot p_x^{m-|b|}\, p_z^{|b|},$$

where for a string $a \in \{0,1\}^m$, we write

$$|a| := \sum_{i=1}^m a_i .$$

The conditional probability distribution $P_{UV|AB}$ is a bit more tricky to write down. What is crucial for this conditional probability is whether the strings $a$ and $b$ have at least $n$ X-agreements and at least $k$ Z-agreements. We express this condition as a formula as follows. Imagine Alice and Bob want to count their X- and Z-agreements. To do so, they can first determine the string $a \wedge b$, given by

$$a \wedge b := (a_i b_i)_{i=1}^m . \qquad \text{(C28)}$$

The $i$-th entry $a_i b_i$ of $a \wedge b$ is 1 if the corresponding bits $a_i$ and $b_i$ are both 1, i.e. if they had a Z-agreement in round $i$, and 0 otherwise. Hence, to count their Z-agreements, they can sum up the components of $a \wedge b$:

$$|a \wedge b| = \sum_{i=1}^m a_i b_i .$$

Therefore, the condition that Alice and Bob had at least $k$ Z-agreements can be expressed as $|a \wedge b| \geq k$. Likewise, the condition that they had at least $n$ X-agreements can be written as $|\bar{a} \wedge \bar{b}| \geq n$, where $\bar{a} := (1 - a_i)_{i=1}^m$ denotes the bitwise complement of $a$. If at least one of these two conditions is violated, the protocol aborts with certainty, so that

$$P_{UV|AB}(u, v \mid a, b) = \chi(u = \perp \wedge v = \perp),$$

where $\chi$ is the indicator function, which evaluates to 1 if its argument is true and to 0 if its argument is false.

For $(a, b) \in \{0,1\}^m \times \{0,1\}^m$ such that both conditions hold, i.e. such that condition (C33) is satisfied, the conditional probability $P_{UV|AB}$ is a little more difficult to write down. In that case, both $u = \perp$ and $v = \perp$ are impossible. Moreover, only those $u \in \{0,1\}^m_n$ are possible which are subsets of Alice and Bob's X-agreements, i.e. which satisfy $u \wedge \bar{a} \wedge \bar{b} = u$; the analogous statement holds for $v$ and the Z-agreements.

With the notation $\alpha_i(u, v)$ for the index in $[m]$ of the $i$-th round that is contained in $u$ or in $v$, we can determine $\vartheta$ from $u$ and $v$ as follows: for $i \in [l]$, we have that $\vartheta_i = 0$ if $u_{\alpha_i(u,v)} = 1$ and $\vartheta_i = 1$ if $v_{\alpha_i(u,v)} = 1$. (Note that for $i \in [l]$, exactly one of $u_{\alpha_i(u,v)} = 1$ and $v_{\alpha_i(u,v)} = 1$ holds, so this is well-defined.) We can write this in terms of a helper function $h$ as

$$\vartheta_i = h\big(u_{\alpha_i(u,v)}, v_{\alpha_i(u,v)}\big), \qquad i \in [l],$$

where $h(1, 0) := 0$ and $h(0, 1) := 1$. This determines $\Theta$ for all $(u, v) \in \{0,1\}^m_n \times \{0,1\}^m_k$ such that $|u \wedge v| = 0$. Since these are the only pairs $(u, v)$ for which a sifted basis choice string $\vartheta \in \{0,1\}^l_k$ is generated, we just let $\Theta$ send all other pairs $(u, v)$ to $\perp$:

$$\Theta(u, v) := \begin{cases} \big(h(u_{\alpha_i(u,v)}, v_{\alpha_i(u,v)})\big)_{i=1}^l & \text{if } (u, v) \in \{0,1\}^m_n \times \{0,1\}^m_k \text{ and } |u \wedge v| = 0, \\ \perp & \text{otherwise.} \end{cases} \qquad \text{(C87)}$$

This way, some pairs $(u, v)$ that cannot occur in the protocol are also mapped to $\perp$ (e.g. $(\perp, v)$ with $v \in \{0,1\}^m_k$). This is unproblematic, because for these pairs, $P_{UV}(u, v) = 0$, so according to equation (C81), they do not contribute to $P_\Theta$.

Definition 7: We define the sifted basis choice string random variable $\Theta$ on $\Omega_{UV}$ by equation (C87). Its associated probability mass function $P_\Theta$ is given by (C81).
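The construction of $(a, b, u, v)$ and of the map $\Theta$ described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `lca_sift` and `theta` are hypothetical, `None` plays the role of $\perp$, and the bit 0 is taken to mean the X-basis and the bit 1 the Z-basis, as in the text.

```python
import random

def lca_sift(m, n, k, p_x, rng):
    """One run of the sifting described above; u = v = None models an abort."""
    # Independent basis choices: 0 = X (probability p_x), 1 = Z.
    a = [0 if rng.random() < p_x else 1 for _ in range(m)]
    b = [0 if rng.random() < p_x else 1 for _ in range(m)]
    x_agree = [i for i in range(m) if a[i] == 0 and b[i] == 0]
    z_agree = [i for i in range(m) if a[i] == 1 and b[i] == 1]
    # Quota check: at least n X-agreements and at least k Z-agreements.
    if len(x_agree) < n or len(z_agree) < k:
        return a, b, None, None
    # Random subsets of size n and k, encoded as bit strings of length m.
    chosen_x = set(rng.sample(x_agree, n))
    chosen_z = set(rng.sample(z_agree, k))
    u = [1 if i in chosen_x else 0 for i in range(m)]
    v = [1 if i in chosen_z else 0 for i in range(m)]
    return a, b, u, v

def theta(u, v):
    """Sifted basis choice string for (u, v); None stands for the abort symbol."""
    if u is None or v is None:
        return None
    if any(ui and vi for ui, vi in zip(u, v)):
        return None  # overlapping subsets cannot occur in the protocol
    # Walk through the retained rounds in order: 0 if kept via u, 1 if via v.
    return [0 if ui else 1 for ui, vi in zip(u, v) if ui or vi]
```

For instance, `theta([1, 0, 0, 1], [0, 0, 1, 0])` keeps rounds 1, 3 and 4 and returns `[0, 1, 0]`.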
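We are ready to state the result.

Proposition 8: For LCA sifting (Protocol III), we have that

$$P_\Theta(\perp) = p_{\mathrm{abort}}, \qquad \text{(C88)}$$

$$P_\Theta(\vartheta) = \frac{1 - p_{\mathrm{abort}}}{\binom{l}{k}} \quad \text{for all } \vartheta \in \{0,1\}^l_k, \qquad \text{(C89)}$$

where $p_{\mathrm{abort}}$ denotes the probability that the quota check fails.

Before we prove Proposition 8, let us point out its importance. Equation (C88) is the probability that the sifting protocol aborts because Alice and Bob did not reach the quota on the X- and Z-agreements, and is therefore a performance parameter of the protocol. Equation (C89) is the sampling probability for each $\vartheta \in \{0,1\}^l_k$. Since (C89) is independent of $\vartheta \in \{0,1\}^l_k$, we get uniform sampling as a corollary of Proposition 8.

Corollary: The combination of LCA sifting (Protocol III) and SBPE (Protocol II) samples uniformly. In other words, the LCA sifting protocol satisfies

$$P_\Theta(\vartheta) = P_\Theta(\vartheta') \quad \text{for all } \vartheta, \vartheta' \in \{0,1\}^l_k,$$

which is what we wanted to show.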
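For small parameters, the uniformity claim can be checked by brute force. The following sketch enumerates all basis strings and all admissible subset choices with exact rational arithmetic and accumulates the distribution of the sifted basis choice string. The function name `theta_distribution` is hypothetical, and the key `None` stands for $\perp$; this is a sanity check under the model described above, not a substitute for the proof.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def theta_distribution(m, n, k, p_x):
    """Exact distribution of the sifted string; the key None stands for abort."""
    p_x = Fraction(p_x)
    p_z = 1 - p_x
    dist = {None: Fraction(0)}
    for a in product((0, 1), repeat=m):
        for b in product((0, 1), repeat=m):
            # Probability of the basis strings (0 = X, 1 = Z).
            zeros = a.count(0) + b.count(0)
            p_ab = p_x ** zeros * p_z ** (2 * m - zeros)
            x_agree = [i for i in range(m) if a[i] == b[i] == 0]
            z_agree = [i for i in range(m) if a[i] == b[i] == 1]
            if len(x_agree) < n or len(z_agree) < k:
                dist[None] += p_ab  # quota check fails
                continue
            # Each pair of subsets is chosen with equal probability.
            weight = p_ab / (comb(len(x_agree), n) * comb(len(z_agree), k))
            for xs in combinations(x_agree, n):
                for zs in combinations(z_agree, k):
                    kept = sorted([(i, 0) for i in xs] + [(i, 1) for i in zs])
                    th = tuple(bit for _, bit in kept)
                    dist[th] = dist.get(th, Fraction(0)) + weight
    return dist

dist = theta_distribution(4, 1, 1, Fraction(4, 5))
strings = [t for t in dist if t is not None]
# Every sifted string carries the same weight, even for biased basis choices.
assert len({dist[t] for t in strings}) == 1
```

The running time grows as $4^m$, so this is only practical for very small $m$.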
For the case of the LCA sifting protocol, the expected efficiency $\eta_L$ is given by equation (D14).

The calculation of the expected efficiencies (D8) and (D14) is computationally demanding. We wrote programs that compute numerical lower bounds on $\eta_I$ and $\eta_L$ for the case where the probabilities are symmetric ($p_x = p_z = 1/2$) and where the quotas coincide ($n = k$). A plot of these lower bounds is shown in Figure 3. In order to plot the lower bound on $\eta_L$, a choice for $m$ had to be made for each value of $n = k$. Our program chooses an $m$ which is likely to maximize the expected efficiency for the given value of $n = k$. Note that 1/2, being the expected fraction of basis agreements, is an upper bound on the expected efficiencies. Hence, Figure 3 indicates that the difference in the expected efficiencies becomes insignificant for practically relevant values of the block length $n + k$. This means that replacing iterative sifting by LCA sifting is unlikely to have a significant effect on the key rate of a QKD protocol.
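The choice of $m$ involves a trade-off: a larger $m$ makes an abort less likely but spends more rounds. As a small illustration, the abort probability of the quota check can be evaluated exactly from the multinomial distribution of X-agreements, Z-agreements and disagreements, which occur in each round with probabilities $p_x^2$, $p_z^2$ and $2 p_x p_z$. The function name `abort_probability` is hypothetical, and the sketch does not reproduce the efficiency computation behind Figure 3.

```python
from fractions import Fraction
from math import comb

def abort_probability(m, n, k, p_x):
    """Exact probability that fewer than n X- or fewer than k Z-agreements occur."""
    p_x = Fraction(p_x)
    p_z = 1 - p_x
    success = Fraction(0)
    # Sum over all agreement counts (n_x, n_z) that pass the quota check.
    for n_x in range(n, m - k + 1):
        for n_z in range(k, m - n_x + 1):
            n_d = m - n_x - n_z  # number of basis disagreements
            success += (comb(m, n_x) * comb(m - n_x, n_z)
                        * (p_x ** 2) ** n_x
                        * (p_z ** 2) ** n_z
                        * (2 * p_x * p_z) ** n_d)
    return 1 - success

# Symmetric probabilities and identical quotas n = k = 2, for growing m.
for m in (4, 8, 16, 32):
    print(m, float(abort_probability(m, 2, 2, Fraction(1, 2))))
```

For $m < n + k$ the sum is empty and the function returns 1, as expected.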
5: (Interim report): Alice and Bob communicate their basis choices $a_r$ and $b_r$ over a public authenticated channel. Then they determine the sets $u(r) := \{ j \in [r] \mid a_j = b_j = 0 \}$, $v(r) := \{ j \in [r] \mid a_j = b_j = 1 \}$.
TC: If the condition ($|u(r)| \geq n$ and $|v(r)| \geq k$) is reached, Alice and Bob set $m := r$ and proceed with Step 6. Otherwise, they increment $r$ by one and repeat from Step 1.
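The interim report and the termination condition can be sketched as a simple loop. This is a minimal illustration, assuming that each round consists only of independent basis choices (0 for X with probability $p_x$, 1 for Z) and ignoring the quantum measurement itself; the function name `iterative_sifting_loop` is hypothetical, and Step 6 is not modelled.

```python
import random

def iterative_sifting_loop(n, k, p_x, rng):
    """Repeat rounds until n X-agreements and k Z-agreements have been seen.

    Requires 0 < p_x < 1, otherwise one quota can never be reached.
    Returns (m, a, b, u, v) with u, v as lists of 1-based round indices.
    """
    a, b = [], []
    u, v = [], []  # u(r): X-agreements so far, v(r): Z-agreements so far
    r = 0
    while True:
        r += 1
        a_r = 0 if rng.random() < p_x else 1
        b_r = 0 if rng.random() < p_x else 1
        a.append(a_r)
        b.append(b_r)
        # Interim report: both parties learn a_r and b_r and update the sets.
        if a_r == 0 and b_r == 0:
            u.append(r)
        elif a_r == 1 and b_r == 1:
            v.append(r)
        # Termination condition (TC).
        if len(u) >= n and len(v) >= k:
            return r, a, b, u, v

m, a, b, u, v = iterative_sifting_loop(3, 3, 0.5, random.Random(0))
print(m, u, v)
```

One consequence visible in the sketch is that the final round $m$ is always a basis agreement that completes one of the two quotas, so the number of rounds and the position of the last agreement are not independent of the basis choices.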

FIG. 3. Efficiency comparison of the two sifting protocols. The plots show lower bounds on the expected efficiencies for symmetric probabilities $p_x = p_z = 1/2$ and for identical quotas $n = k$. The solid red curve shows a lower bound on the expected value of the efficiency for the iterative sifting protocol as a function of $n = k$. For the LCA sifting protocol, an optimization over the additional parameter $m$ has been made for each value of $n = k$.

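$$p_{\mathrm{abort}} = P_\Theta(\perp) = 1 - \sum_{n_x = n}^{m-k} \; \sum_{n_z = k}^{m - n_x} \binom{m}{n_x} \binom{m - n_x}{n_z} \, 2^{m - n_x - n_z} \, p_x^{m + n_x - n_z} \, p_z^{m - n_x + n_z}$$

$$\ldots \; P[N_x \geq n \wedge N_z = n_z \wedge N_d = n_d] \; \ldots \; (p_x^2)^{m - n_z - n_d} \, (p_z^2)^{n_z} \, (2 p_x p_z)^{n_d} \binom{m}{n_d} \binom{m - n_d}{n_z} . \qquad \text{(D14)}$$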
[15] 2. Comparison of the expected sifting efficiencies. a) In the protocol of Shor and Preskill [21], only about a quarter of the measurement results end up in the raw key. Moreover, a relatively large amount of randomness needs to be injected for the sample choice, which in turn increases the length of preshared secret key that Alice and Bob use for authenticated communication. b) The protocol by Lo, Chau and Ardehali [15] allows for a bias, $p_x > p_z$. This way, the expected fraction of bits with basis disagreements shrinks from one half to $2 p_x p_z$. The proportions drawn in this figure correspond to $p_x = 0.8$. However, it still requires randomness injection for the choice of the sample. c) If, instead, LCA sifting and SBPE are used, as we suggest, then no randomness injection is required for the choice of the sample. Moreover, fewer bits are consumed for parameter estimation in the finite-key regime, resulting in a longer raw key.