Security analysis of the decoy method with the Bennett–Brassard 1984 protocol for finite key lengths

This paper provides a formula for the sacrifice bit-length for privacy amplification with the Bennett–Brassard 1984 protocol for finite key lengths, when we employ the decoy method. Using the formula, we can guarantee the security parameter for a realizable quantum key distribution system. The key generation rates with finite key lengths are numerically evaluated. The proposed method improves the existing key generation rate even in the asymptotic setting.


Background
The quantum key distribution (QKD) protocol proposed by Bennett and Brassard [1] is one of the most applicable protocols in quantum information. The conventional BB84 QKD protocol generates keys from the pulses with matched bases; these keys are called raw keys and are trivially shown to be secure with a noiseless channel and a perfect single-photon source. However, in a realistic setting, there are two obstacles to security. The first is the noise of the quantum communication channel. Due to the presence of the noise, the eavesdropper can obtain a part of the information of the raw keys behind the noise. The second is the imperfection of the photon source. If the sender sends a two-photon state instead of a single-photon state, the eavesdropper can take one photon and thereby obtain the information perfectly. Many practical QKD systems have been realized with weak coherent pulses. In this case, the photon number of the transmitted pulses obeys the Poisson distribution whose average is given by the intensity µ of the pulse. The first problem can be resolved by applying error correction and random privacy amplification to the raw keys [2,3,4,5]. In the privacy amplification stage, we amplify the security of the raw keys by sacrificing a part of them. The security of the final keys depends on the number of bits removed in the privacy amplification stage, which is called the sacrifice bit-length. Shor-Preskill [2] and Mayers [3] showed that this method gives secure keys asymptotically when the rate of the sacrifice bit-length is greater than a certain amount. In order to solve the second problem, Gottesman-Lo-Lütkenhaus-Preskill (GLLP) [6] extended their result to the case when the photon source is imperfect. However, GLLP's result assumes that the fractions of the respective photon-number pulses among the received pulses are known. Indeed, there is a possibility that the eavesdropper controls the receiver's detection rate depending on the photon number, because pulses with different photon numbers can be distinguished by the eavesdropper. In order to solve this problem, we need to estimate the detection rate of the single-photon pulses. Hwang proposed the decoy method to estimate the detection rate [7]. This method has been improved by many researchers [8,9,10,11,12,13,14,15,16]. In this method, in order to estimate the detection rates, the sender randomly chooses among several kinds of pulses with different intensities. The first kind of pulses are the signal pulses, which generate the raw keys. The other kind of pulses are the decoy pulses, which are used for estimating the operation of the eavesdropper and have a different intensity from the signal pulses.
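As a quick illustration of the Poisson photon-number statistics mentioned above, the following sketch computes the probabilities that a weak coherent pulse of intensity µ contains zero, one, or at least two photons. The function name `photon_number_probs` and the example intensity µ = 0.5 are illustrative assumptions, not values taken from this paper.

```python
import math


def photon_number_probs(mu):
    """Return (P[n = 0], P[n = 1], P[n >= 2]) for a Poisson photon number with mean mu."""
    p_vacuum = math.exp(-mu)
    p_single = mu * math.exp(-mu)
    p_multi = 1.0 - p_vacuum - p_single
    return p_vacuum, p_single, p_multi


# Hypothetical signal intensity, chosen only for illustration.
p0, p1, p_multi = photon_number_probs(0.5)
print(f"vacuum={p0:.4f}  single={p1:.4f}  multi={p_multi:.4f}")
```

Even at this modest intensity, roughly 9% of the pulses carry more than one photon, which is why the multi-photon contribution has to be accounted for separately.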
However, we still cannot realize a truly secure QKD system in the real world due to the finiteness of the coding length. Most of the above results assume the asymptotic setting, except for Mayers [3]. Also, their privacy amplification requires a large amount of computation. Renner [17] proposed to use universal$_2$ hash functions for privacy amplification and showed the security under this kind of hash functions. Universal$_2$ hash functions have been recognized as a fundamental tool for information-theoretic security [18,19,20,21]. His security proof differs from the traditional Shor-Preskill formalism in the following points. He focused on the trace norm of the difference between the true state and the ideal state as the security parameter because the trace norm is universally composable [38]. In the following, we call the trace norm the universal composability criterion. Another difference is that he employed the leftover hash lemma (privacy amplification), while the traditional Shor-Preskill formalism employs error correction. On the other hand, in the context of the traditional Shor-Preskill formalism, it was shown that the leaked information can be evaluated only by the phase error probability [5,22,23,24,25], which implies that the phase error correction guarantees the security. Using this fact, a previous paper [26] showed the security under a wider class of hash functions, which are called ε-almost dual universal$_2$ hash functions.
In order to treat the finiteness problem in the single-photon case, when n is the block length of our code, another previous paper [5] considers the asymptotic expansion of the coding length up to the order $\sqrt{n}$ § with the Gaussian approximation by using the above phase error correction formalism. Scarani et al. [31] and Sano et al. [32] also treated the finiteness problem, but only for the collective attack. Recently, using Renner's formalism, Tomamichel et al. [33] derived an upper bound formula for the security parameter with finite coding length. However, these results assume a single-photon source. Furrer et al. [35] gave a finite-length analysis of continuous-variable quantum key distribution, which works with weak coherent pulses. While continuous-variable quantum key distribution can be implemented with inexpensive homodyne detection, the decoy method with the BB84 protocol can achieve the longest distance with current technology [36,37]. Hence, we treat the security of the BB84 protocol with finite coding length when we use weak coherent pulses and the decoy method.
In the single-photon case, using the phase error correction formalism, another previous paper [34] derived better upper bound formulas for the security with finite coding length, which attain the key generation rate given in [5] up to the order $\sqrt{n}$. They also treated the security with the universal composability based on the phase error correction formalism when the coding length depends on the outcomes of Alice and Bob. The phase error correction formalism provides an upper bound of the leaked information only from the decoding phase error probability. Hence, we employ the phase error correction formalism for our security analysis of the BB84 protocol with finite coding length when we use weak coherent pulses and the decoy method.
§ Analysis of this type of asymptotic expansion is called the second-order analysis and has attracted attention in the information theory community due to its relation with the analysis of finite coding length [27,28,29,30].

Our formula for sacrifice bit-length with the finite-length setting
When the raw keys are generated by the BB84 protocol with weak coherent pulses and the decoy method, we apply error correction and privacy amplification to the raw keys. The security of the final keys can be evaluated by the amount of the sacrifice bit-length. The aim of this paper is to provide a calculation formula for the sacrifice bit-length guaranteeing a given security level with the universal composability. Since the generated pulses contain vacuum pulses, single-photon pulses, and multi-photon pulses, we need to estimate their ratios among the pulses generating the raw keys. Note that the vacuum pulses also generate a part of the raw keys. The flow of our analytical framework is illustrated in Fig. 1. First, using the relation between the phase error and the security, we give a formula for the sacrifice bit-length based on three numbers among the detected pulses constituting the raw keys: the number of detected pulses originating from vacuum emissions by Alice, the number originating from single-photon emissions, and the number originating from multi-photon emissions. In the following, we call these numbers the partition of the detected pulses generating the raw keys. When each component of the partition is divided by the total pulse number, we obtain the fractions. For the finite-length analysis, we need the partition instead of the fractions.
In order to estimate the partition of the detected pulses generating the raw keys, we need to estimate the detection rates of the respective kinds of pulses and the phase error probability of the single-photon pulses, which characterize Eve's operations and can be regarded as parameters of the quantum communication channel. For this purpose, Alice sends pulses with different intensities. This method is called the decoy method, and enables us to estimate the above detection rates and the phase error probability of the single-photon pulses. This estimation can be divided into two parts. The first part is the derivation of the channel parameters from the detection rates, the phase error rates, and the partition of the respective transmitted pulses by solving joint inequalities, which follow from the non-negativity of several channel parameters. The second part is the treatment of statistical fluctuation. If we could treat an infinite number of pulses, we would not have to deal with the statistical fluctuation. However, our finite-length setting requires its treatment. In contrast with the previous papers [34,5], this paper deals with the statistical fluctuation by interval estimation ‖ and percent points ¶. The interval estimation is employed for deriving the detection rates and the phase error rates of the transmitted pulses with the respective intensities from the observed detection rates and the observed phase error rate. The percent points are employed for deriving the partitions of the transmitted pulses with the respective intensities. Similarly, we employ percent points for deriving the partition of the detected pulses generating the raw keys from the channel parameters.
‖ Interval estimation is a statistical method that gives an interval of possible (or probable) values of an unknown parameter from sample data, in contrast to point estimation, which gives a single number. The method for the binomial case is explained in Appendix B.
¶ Precisely, the percent point means the lower percent point or the upper percent point depending on the context. When we focus on ε percent, the lower percent point of the random variable X is the value $x_1$ such that the probability that X is less than $x_1$ is ε/100. For example, the lower 5% point of a standard normal distribution is -1.645.
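The percent-point convention described in the footnote can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper names `lower_percent_point` and `upper_percent_point` are hypothetical and introduced only for this illustration.

```python
from statistics import NormalDist


def lower_percent_point(eps_percent, dist=NormalDist()):
    """Value x1 such that P[X < x1] = eps_percent / 100 for X ~ dist."""
    return dist.inv_cdf(eps_percent / 100.0)


def upper_percent_point(eps_percent, dist=NormalDist()):
    """Value x2 such that P[X > x2] = eps_percent / 100 for X ~ dist."""
    return dist.inv_cdf(1.0 - eps_percent / 100.0)


print(round(lower_percent_point(5), 3))  # -1.645
print(round(upper_percent_point(5), 3))  # 1.645
```

For a continuous symmetric distribution such as the standard normal, the two percent points are mirror images; for the discrete binomial case used later in the analysis, a rounding convention has to be fixed as well.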

Roles of percent points and interval estimation
In our analysis, we focus on the universal composability criterion. Our calculation formula for the sacrifice bit-length employs only the basic formulas for percent points and for the interval estimation of the binomial distribution, which can be evaluated numerically by many software packages. Hence, it does not contain any optimization process, and it requires a relatively short calculation time. Then, using our formula, we numerically calculate the key generation rate per pulse in several cases. In our numerical calculations, we require that the universal composability criterion is less than $2^{-80}$. Under this requirement, the required error probabilities are too small for the exact percent points and the exact interval estimation to be calculated.
For this purpose, we employ the Chernoff bound, which is an upper bound of the error probability and requires quite a small amount of calculation, as summarized in the Appendix. Using the Chernoff bound, we can derive upper and lower estimates of the true parameter. Since the Chernoff bound is not a tight bound of the error probability, these upper and lower estimates are looser than the exact interval estimation. However, even when the required error probabilities are very small, these upper and lower estimates are sufficiently close to the exact interval estimation when the size of the obtained data is sufficiently large +.
Further, similar to Wang et al. [15,16], in Section 6, we discuss our key generation rate with finite length when the intensities are not fixed and obey certain probability distributions. In Subsection 6.2, we numerically calculate the above key generation rate when the intensities obey Gaussian distributions, because the fluctuations of the intensities are usually caused by thermal noise.
+ The reason is the following. The ratio of the Chernoff bound to the true error probability behaves polynomially with respect to the size of the data. In particular, in the binary case, the ratio behaves linearly with respect to the size of the data. Hence, even when the required error probabilities are very small, these upper and lower estimates are sufficiently close to the exact interval estimation when the size of the obtained data is sufficiently large.
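To make the role of the Chernoff bound concrete, the following sketch compares it with the exact upper tail of a binomial distribution. It assumes the standard relative-entropy form of the bound, $P[X \ge k] \le \exp(-N D(k/N \| p))$ for $k/N \ge p$; the formulas actually used in the Appendix may be stated differently, and the function names and the parameters (N = 200, p = 0.1, k = 40) are hypothetical.

```python
import math


def kl_bernoulli(q, p):
    """Relative entropy D(q || p) between Bernoulli(q) and Bernoulli(p), in nats."""
    total = 0.0
    if q > 0.0:
        total += q * math.log(q / p)
    if q < 1.0:
        total += (1.0 - q) * math.log((1.0 - q) / (1.0 - p))
    return total


def chernoff_upper_tail(n, p, k):
    """Chernoff upper bound on P[X >= k] for X ~ Bin(n, p); trivial (1.0) if k/n <= p."""
    q = k / n
    if q <= p:
        return 1.0
    return math.exp(-n * kl_bernoulli(q, p))


def exact_upper_tail(n, p, k):
    """Exact P[X >= k] by direct summation; only practical for small n."""
    return sum(math.comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(k, n + 1))


n, p, k = 200, 0.1, 40  # illustrative values only
print("exact tail    :", exact_upper_tail(n, p, k))
print("Chernoff bound:", chernoff_upper_tail(n, p, k))
```

The bound needs only a logarithm and an exponential, so it stays computable for very large N and very small tail probabilities, where direct summation is no longer practical.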
Here, we summarize the physical assumptions. The photon source generates coherent states, and the phase factor of each coherent state is completely randomized. The receiver uses a threshold detector. We do not consider other types of device imperfection. In particular, we assume no side-channel attack, i.e., Eve cannot directly see the phase modulator on Alice's side. Further, we do not assume perfect vacuum pulses. That is, we allow a non-vacuum state to be mixed into the vacuum pulses if the probability of erroneous emission of a non-vacuum state is sufficiently small. We do not assume the collective attack although we employ the binomial distribution. That is, our security proof works for the coherent attack. The reason why the binomial distribution can be used instead of the hypergeometric distribution is given in Section 7.

Organization
The remaining part of this paper is organized as follows. As a preparation, Section 2 reviews the result for the universal composability criterion of the final keys when we know the partition of the received pulses and the phase error probability among the single-photon pulses. Then, Section 2 derives the leaked information from the partition of the detected pulses of the raw keys by using the relation between the phase error and the security, i.e., Step (5) in Fig. 1. Section 3 describes a concrete protocol of the decoy method. Section 4 explains how the eavesdropper's operation can be described. Section 5 gives two formulas for the sacrifice bit-length. Subsection 5.2 gives a shorter sacrifice bit-length by improving the formula given in Subsection 5.1. So, we call the formula given in Subsection 5.2 the improved formula and the formula given in Subsection 5.1 the non-improved formula. Since the improved formula is complicated, we first give the non-improved formula in Subsection 5.1. After describing the whole structure of the non-improved formula, we give the improved formula in Subsection 5.2. Then, we present a numerical result with the improved formula. In Section 6, we treat the finite sacrifice bit-length when the source intensity is not fixed. Then, we present a numerical result with Gaussian distributions.
The remaining sections are devoted to the security proofs of the formulas given in Section 5. For this purpose, Section 7 summarizes fundamental knowledge on random variables, because the notation explained in Section 7 is used in the later sections. Since the improved formula is complicated, we first show the security proof of the non-improved formula in Sections 8, 9, and 10. After the security proof of the non-improved formula, we give the security proof of the improved formula in Section 11. Section 8 briefly describes our security proof and the outline of the discussions in the later sections. It also gives the sacrifice bit-length from the leaked information, i.e., Step (6) in Fig. 1. Section 9 gives the estimate of the channel parameters when the partition of the generated sources is given. Subsection 9.1 gives the partition of the detected pulses of the raw keys from the channel parameters by using percent points, i.e., Step (4) in Fig. 1. Subsection 9.2 estimates the channel parameters from the partitions and the detection rates of several kinds of transmitted pulses by solving joint inequalities, i.e., Step (3) in Fig. 1, and Subsection 9.3 derives the detection rates of the decoy pulses from the observed data based on interval estimation, i.e., Step (1) in Fig. 1. In Section 10, we treat the statistical fluctuation of the photon number of the sources. In particular, Subsection 10.2 gives the partitions of several kinds of transmitted pulses by using percent points, i.e., Step (2) in Fig. 1. Then, combining the discussions in Sections 8, 9, and 10, we show the security under the sacrifice bit-length given by the non-improved formula in Subsection 5.1. In Section 11, we give the security proof of the improved formula given in Subsection 5.2 by moving several probabilities outside the square root.
In Appendices A and B, we summarize the basic knowledge on the tail probability and the interval estimation under the binomial distribution. In Appendix C, we summarize the calculations required for the numerical calculation in Subsection 6.2.

Relation between security evaluation and decoding phase error probability
An evaluation method that uses the trace norm of the difference between the true state and the ideal state is known as a universally composable security criterion in QKD [38]. Hence, we call it the universal composability criterion. When the length m of the final keys is not fixed, we need a more careful treatment. We denote the final state of the composite system and Eve's final state by $\rho_{AE|m}$ and $\rho_{E|m}$, respectively, when the length of the final keys is m. Alice's ideal state is the uniform distribution $\rho_{\mathrm{mix}|m}$ on m bits. Hence, the ideal composite state is $\rho_{\mathrm{mix}|m} \otimes \rho_{E|m}$. We denote the state indicating that the length of the final keys is m by $|m\rangle\langle m|$, and its probability by P(m). Then, the state of the composite system is $\rho_{AE} := \sum_m P(m) |m\rangle\langle m| \otimes \rho_{AE|m}$, and its ideal state is $\rho_{\mathrm{ideal}} := \sum_m P(m) |m\rangle\langle m| \otimes \rho_{\mathrm{mix}|m} \otimes \rho_{E|m}$. Hence, the averaged universal composability criterion of the obtained keys is written as the trace norm of the difference between the real state $\rho_{AE}$ of the composite system and its ideal state $\rho_{\mathrm{ideal}}$ as [39]
$$\|\rho_{AE} - \rho_{\mathrm{ideal}}\|_1 = \sum_m P(m) \|\rho_{AE|m} - \rho_{\mathrm{mix}|m} \otimes \rho_{E|m}\|_1. \qquad (1)$$
Thus, a smaller trace norm guarantees more secure final keys. On the other hand, when we apply surjective universal$_2$ linear hash functions as the privacy amplification [23], [34, (10)], the above value is bounded by the averaged virtual decoding phase error probability $P_{ph}$ as
$$\|\rho_{AE} - \rho_{\mathrm{ideal}}\|_1 \le 2\sqrt{2}\sqrt{P_{ph}}. \qquad (2)$$
Then, the security analysis of QKD can be reduced to the evaluation of $P_{ph}$.
In the following, we consider the protocol containing the privacy amplification with the sacrifice bit-length S over the raw keys with length M. When the phase error occurs in E bits among the M-bit raw keys and we apply the minimum length decoding, the averaged virtual decoding phase error probability $P_{ph}$ is evaluated as ♯
$$P_{ph} \le 2^{M h(\min(\frac{E}{M}, \frac{1}{2})) - S}, \qquad (3)$$
where h denotes the binary entropy function. Hence, we can guarantee the security of the final keys when the sacrifice bit-length S is sufficiently larger than $M h(\min(\frac{E}{M}, \frac{1}{2}))$. However, the number E of bits having the phase error does not take a deterministic value, and it obeys a probability distribution Q(E). Then, when we apply the minimum length decoding, the averaged virtual decoding phase error probability $P_{ph}$ is evaluated as
$$P_{ph} \le \sum_E Q(E) \min(2^{M h(\min(\frac{E}{M}, \frac{1}{2})) - S}, 1). \qquad (4)$$
When we use an imperfect photon source, the M transmitted pulses generate the M-bit raw keys. Then, each of the M transmitted pulses takes one of the following three types of states. The first is the vacuum state, the second is the single-photon state, and the third is the multi-photon state. In the following, we assume that the M transmitted pulses consist of $J^{(0)}$ pulses with the vacuum state, $J^{(1)}$ pulses with the single-photon state, and $J^{(2)}$ pulses with the multi-photon state. This assumption guarantees the relation $M = J^{(0)} + J^{(1)} + J^{(2)}$. That is, the triplet $(J^{(0)}, J^{(1)}, J^{(2)})$ gives the partition of the M transmitted pulses. When we send a pulse with the vacuum state, no information can be leaked to Eve. That is, the leaked information in this case equals the information leaked to Eve when we send single-photon pulses with phase error probability 0.
On the other hand, in the multi-photon case, we have to consider that all information is leaked to Eve. Hence, the leaked information in the multi-photon case equals the information leaked to Eve when we send single-photon pulses with phase error probability 1/2. In the following, we assume that the phase error occurs in $J_e^{(1)}$ bits among the $J^{(1)}$ bits. As is shown in [23, (19)] and [26], when we apply a proper class of hash functions in the privacy amplification ††, the averaged virtual decoding phase error probability $P_{ph}$ is evaluated as †
$$P_{ph} \le 2^{\phi(J^{(0)}, J^{(1)}, J_e^{(1)}) - S}, \qquad (5)$$
where we define
$$\phi(J^{(0)}, J^{(1)}, J_e^{(1)}) := J^{(1)} h(\min(\tfrac{J_e^{(1)}}{J^{(1)}}, \tfrac{1}{2})) + J^{(2)} = J^{(1)} h(\min(\tfrac{J_e^{(1)}}{J^{(1)}}, \tfrac{1}{2})) + M - J^{(0)} - J^{(1)}$$
because $J^{(2)} = M - J^{(0)} - J^{(1)}$, which provides Step (5) in Fig. 1. Due to Eq. (5), we can regard $\phi(J^{(0)}, J^{(1)}, J_e^{(1)})$ as the leaked information.
♯ It is easy to see that Inequality (5) holds when completely random matrices (a type of universal$_2$ hash functions) are used for the privacy amplification, as in Koashi's case [24].
†† More precisely, when we apply ε-almost dual universal$_2$ hash functions, $P_{ph}$ is evaluated as $P_{ph} \le \varepsilon \cdot 2^{\phi(J^{(0)}, J^{(1)}, J_e^{(1)}) - S}$. As is explained in [26], several practical hash functions, e.g., the concatenation of a Toeplitz matrix and the identity matrix, are 1-almost dual universal$_2$.
† In the derivation of [23, (19)], we considered that the $J^{(1)}$ qubits have the phase error rate $\min(\frac{J_e^{(1)}}{J^{(1)}}, \frac{1}{2})$ and the $J^{(2)}$ $(= M - J^{(0)} - J^{(1)})$ qubits have the phase error rate 1/2.
In the general case, the sacrifice bit-length S also does not take a deterministic value, and is stochastically determined. In such a case, the values $J^{(0)}$, $J^{(1)}$, $J_e^{(1)}$, and S obey a joint distribution $Q(J^{(0)}, J^{(1)}, J_e^{(1)}, S)$, and the averaged virtual decoding phase error probability $P_{ph}$ is evaluated by
$$P_{ph} \le \sum_{J^{(0)}, J^{(1)}, J_e^{(1)}, S} Q(J^{(0)}, J^{(1)}, J_e^{(1)}, S) \min(2^{\phi(J^{(0)}, J^{(1)}, J_e^{(1)}) - S}, 1).$$
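The bound of the form $2^{\phi - S}$ can be illustrated with a small numerical sketch. It assumes the reading of $\phi$ suggested by the derivation above: the single-photon pulses contribute $J^{(1)} h(\min(J_e^{(1)}/J^{(1)}, 1/2))$, every multi-photon pulse contributes one full bit, and vacuum pulses contribute nothing. The function names, the extra argument `M`, and all numbers are hypothetical.

```python
import math


def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def phi(M, J0, J1, J1e):
    """Leaked information (bits) for M raw-key bits with partition (J0, J1, M - J0 - J1)."""
    J2 = M - J0 - J1  # multi-photon pulses: treated as phase error rate 1/2
    single = J1 * h(min(J1e / J1, 0.5)) if J1 > 0 else 0.0
    return single + J2


def log2_phase_error_bound(M, J0, J1, J1e, S):
    """log2 of min(2^(phi - S), 1), kept in the log domain to avoid underflow."""
    return min(phi(M, J0, J1, J1e) - S, 0.0)


# Illustrative partition: 1000 raw-key bits, 100 vacuum, 600 single-photon
# (30 of them with phase error), hence 300 multi-photon.
print(phi(1000, 100, 600, 30))
print(log2_phase_error_bound(1000, 100, 600, 30, S=600))
```

Working with the exponent rather than the probability itself is a design choice for numerical stability: security targets such as $2^{-80}$ are representable as floats, but intermediate values can be far smaller.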

Protocol of decoy method
In the following, we assume that $M_s$-bit raw keys are generated by $N_s$ signal pulses generated by an imperfect photon source. Now, we assume that, among these pulses, $N_s^{(0)}$ pulses take the vacuum state, $N_s^{(1)}$ pulses take the single-photon state, and $N_s^{(2)}$ pulses take multi-photon states. In the following discussion, the partition of the $N_s$ signal pulses is described by the triplet $(N_s^{(0)}, N_s^{(1)}, N_s^{(2)})$, and plays an important role. Now, we prepare three parameters $q^{(0)}$, $q^{(1)}$, and $b_\times^{(1)}$ as follows. The parameter $q^{(0)}$ is the detection rate of the vacuum pulses, i.e., the ratio of the vacuum pulses detected on Bob's side to the vacuum pulses transmitted from Alice's side. The parameter $q^{(1)}$ is the detection rate of the single-photon pulses, i.e., the ratio of the single-photon pulses detected on Bob's side to the single-photon pulses transmitted from Alice's side. The parameter $b_\times^{(1)}$ is the ratio of the single-photon pulses detected with phase error on Bob's side to the single-photon pulses transmitted from Alice's side. We call $b_\times^{(1)}$ the phase-error detection rate of the single-photon pulses. Then, the numbers $J^{(0)}$, $J^{(1)}$, and $J_e^{(1)}$ can be estimated as
$$J^{(0)} \sim N_s^{(0)} q^{(0)}, \quad J^{(1)} \sim N_s^{(1)} q^{(1)}, \quad J_e^{(1)} \sim N_s^{(1)} b_\times^{(1)}.$$
However, it is not easy to estimate the partition of the $N_s$ pulses, i.e., $(N_s^{(0)}, N_s^{(1)}, N_s^{(2)})$. Now, we consider the case when $N_s$ weak coherent pulses of intensity $\mu_1$ are transmitted.
Then, we obtain the expansion (10) of the pulse with respect to the photon-number states, and the partition can be estimated from the coefficients of this expansion. Hence, we need to estimate the parameters $q^{(0)}$, $q^{(1)}$, and $b_\times^{(1)}$. For this purpose, we shuffle $\mu_1$-intensity coherent pulses and $\mu_2$-intensity coherent pulses. This method is called the decoy method [7,8,9,12,13] ‡ because the $\mu_2$-intensity pulses work as "decoys" for estimating the parameters $q^{(0)}$, $q^{(1)}$, and $b_\times^{(1)}$. Hence, the pulses of intensity $\mu_1$, which are used to generate the raw keys, are called the signal pulses, and the pulses of the other intensity $\mu_2$ are called the decoy pulses. In the following, we assume that $\mu_1 < \mu_2$. Then, the $\mu_2$-intensity coherent pulse has the analogous expansion (13). Using the difference between the coefficients in the two expansions (10) and (13), we can estimate the detection rates $q^{(0)}$ and $q^{(1)}$ in the way explained in Section 6.
‡ In a wider sense, we can regard the check bits estimating the phase error probability as another kind of decoy state.
In the following, we give the details of our protocol, in which both the $\mu_1$-intensity pulses with the bit basis and the $\mu_2$-intensity pulses with the bit basis are used for generating the raw keys.
(1) Transmission: Alice (the sender) sends vacuum pulses, $\mu_1$-intensity coherent pulses, and $\mu_2$-intensity coherent pulses randomly with a certain rate. Here, she chooses the bit basis and the phase basis with the ratio 1 − λ : λ among the $\mu_1$-intensity coherent pulses and the $\mu_2$-intensity coherent pulses.
(2) Detection: Bob (the receiver) chooses the bit basis and the phase basis with the ratio 1 − λ : λ and measures the received pulses. Then, he records the existence or non-existence of the detection, his basis, and the measured bit. For the details, see Remark 2.
(3) Verification of basis: Using the public channel, Alice sends Bob all information with respect to the basis and the intensity for all pulses. Using the public channel, Bob informs Alice which pulses have matched bases. Then, as is illustrated in Table 1, they decide the numbers $N_0$, $N_1$, $N_2$, $N_{s,1}$, and $N_{s,2}$ as follows. $N_0$ is the number of vacuum pulses, $N_1$ is the number of $\mu_1$-intensity pulses with the phase basis on both sides, $N_2$ is the number of $\mu_2$-intensity pulses with the phase basis on both sides, $N_{s,1}$ is the number of $\mu_1$-intensity pulses with the bit basis on both sides, and $N_{s,2}$ is the number of $\mu_2$-intensity pulses with the bit basis on both sides.
(4) Parameter estimation: Alice and Bob announce all bit information with respect to the $N_1 + N_2$ pulses with the phase basis on both sides. Then, as is illustrated in Table 2, they decide the following numbers. $M_0$ is the number of vacuum pulses detected by Bob. For i = 1, 2, they count the $\mu_i$-intensity coherent pulses that are detected by Bob and have the phase basis on both sides, separately for agreeing bit values and for disagreeing bit values. (However, they will not use $M_4$.) $M_{s,1}$ is the number of $\mu_1$-intensity coherent pulses that are detected by Bob and have the bit basis on both sides. $M_{s,2}$ is the number of $\mu_2$-intensity coherent pulses that are detected by Bob and have the bit basis on both sides.
In the following, we describe the key distillation protocol for the $M_{s,1}$-bit raw keys generated by the $\mu_1$-intensity coherent pulses. The key distillation protocol for the $M_{s,2}$-bit raw keys generated by the $\mu_2$-intensity coherent pulses is obtained when $N_{s,1}$ and $M_{s,1}$ are replaced by $N_{s,2}$ and $M_{s,2}$, respectively.
(5) Error correction: First, Alice and Bob choose a suitable $M_{s,1}$-bit classical code $C_1$ that can correct errors of the expected bit error rate $p_+$. For decoding, they prepare a set {s /C 1 . They also prepare another set {s Then, they exchange their information [s] − s [s ′ −s] in C 1 .
(6) Privacy amplification: Using the method explained later, Alice and Bob determine the sacrifice bit-length S for the privacy amplification of [26]. Then, they obtain the final keys.
(7) Error verification: Alice and Bob apply a suitable hash function to the final keys. They exchange the exclusive OR of the above hash value and other prepared secret keys. If the exchanged values agree, their keys agree with high probability [40,41].
Table 1: Numbers of transmitted pulses: $N_0$ (vacuum), $N_1$ and $N_2$ (phase basis), $N_{s,1}$ and $N_{s,2}$ (bit basis).
Table 2: Numbers of detected pulses: $M_0$ (vacuum), the phase-basis counts by bit agreement, $M_{s,1}$ and $M_{s,2}$ (bit basis).
In the error correction, we lose more than $M_{s,1} h(p_+)$ bits. When we lose $\eta M_{s,1} h(p_+)$ bits in the error correction, the final key length is $M_{s,1} - \eta M_{s,1} h(p_+) - S$. In a realistic case, we choose η to be 1.1. In the above protocol, it is possible to restrict the intensity generating the raw keys to $\mu_1$ or $\mu_2$. In this case, we restrict the intensity with the bit basis to $\mu_1$ or $\mu_2$. When we restrict the intensity with the bit basis to $\mu_2$, the numbers $N_{s,1}$ and $M_{s,1}$ become 0.
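The key-length accounting stated above, $M_{s,1} - \eta M_{s,1} h(p_+) - S$, can be sketched as follows. The function name `final_key_length`, the clamping at zero, the rounding down to an integer, and the example numbers are assumptions made for this illustration; only the formula itself and η = 1.1 come from the text.

```python
import math


def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def final_key_length(M_s, p_plus, S, eta=1.1):
    """Final key length M_s - eta * M_s * h(p_plus) - S, rounded down and clamped at 0."""
    length = M_s - eta * M_s * h(p_plus) - S
    return max(0, math.floor(length))


# Hypothetical run: 10^6 raw-key bits, 3% expected bit error rate,
# and a sacrifice bit-length of 3 * 10^5 bits.
print(final_key_length(1_000_000, 0.03, 300_000))
```

With these example numbers, the error-correction cost (about 21% of the raw key) and the sacrifice bit-length together consume a little over half of the raw key.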
In the following discussion, we denote the number of transmitted pulses for the generation of the raw keys, the number of raw keys, and the signal intensity by $N_s$, $M_s$, and $\mu_s$, respectively. That is, when we discuss the security of the final keys generated from the raw keys with the intensity $\mu_i$, the numbers $N_s$, $M_s$, and $\mu_s$ are chosen to be $N_{s,i}$, $M_{s,i}$, and $\mu_i$ for i = 1, 2.
Remark 1 In the above protocol, the raw keys are generated from the bit basis. However, this assumption is not essential. For example, our analysis can be applied to the case when the raw keys are generated from both bases as follows. First, we replace Step (3) by the following Step (3').
(3') Verification of basis: Using the public channel, Alice sends Bob all information with respect to the basis and the intensity for all pulses. Using the public channel, Bob informs Alice which pulses have matched bases. Then, as is illustrated in Table 1, they decide the numbers $N'_1$, $N'_2$, $N'_{s,1}$, and $N'_{s,2}$ as follows. $N'_1$ is the number of $\mu_1$-intensity pulses with the phase basis on both sides, $N'_2$ is the number of $\mu_2$-intensity pulses with the phase basis on both sides, $N'_{s,1}$ is the number of $\mu_1$-intensity pulses with the bit basis on both sides, and $N'_{s,2}$ is the number of $\mu_2$-intensity pulses with the bit basis on both sides. Then, we decide numbers $N_1$, $N_2$, $N_{s,1}$, and $N_{s,2}$ smaller than $N'_1$, $N'_2$, $N'_{s,1}$, and $N'_{s,2}$, respectively. We apply Step (4) and the following steps to the remaining $N_{s,1} + N_{s,2} + N_1 + N_2$ pulses and the $N_0$ vacuum pulses while exchanging the roles of the bit and the phase bases. In this case, we may choose the classical error correcting code C based on the observed error rate in Step (5).
Remark 2 When the receiver uses the threshold detector, in Step (2) (Detection), the receiver might detect both events. In this case, we use the following type of detector [48].
Detector: When the receiver detects both events, the receiver deterministically chooses 0 as the bit value.
In fact, since the encoding does not depend on the choice of the detector, the formula (2) holds with the averaged virtual decoding phase error probability $P_{ph}$ based on any virtual decoder of Bob employing any detector of Bob, as long as Bob's detection event does not depend on the choice of the basis. Hence, our security analysis is still valid with the above detector.

Description of Eve
In the following, we describe the strategy of Eve. For this purpose, we treat only the vacuum pulses and the pulses with matched bases, i.e., the $N_0 + N_1 + N_2 + N_s$ pulses given in Table 1. We do not treat other kinds of pulses. Eve cannot perfectly distinguish pulses with the intensities $\mu_1$ and $\mu_2$. Instead, we assume that Eve can choose her strategy depending on the number of photons because she can distinguish the number of photons. That is, Eve is assumed to be able to distinguish the states $|0\rangle\langle 0|$, $|1\rangle\langle 1|$, $\rho_2$, and $\rho_3$. We assume the following partition of the pulses given in Table 1:
• There are $N_1^{(0)}$ pulses with the vacuum state and $N_1^{(1)}$ pulses with the single-photon state among the $N_1$ $\mu_1$-intensity pulses with the phase basis.
• There are N s pulses with the single-photon state among N s µ s -intensity pulses with the bit basis.
For a simplicity, we employ the notations N s := (N 2 ), and N := (N 1 , N 2 ).In the above partition, there are s pulses with the vacuum state, N (1) s pulses with the single-photon state, N (2) pulses with the state ρ 2 and the phase basis, and N (3) 2 pulses with the state ρ 3 and the phase basis, where 2 .Note that the average state with the bit basis is not the same as the average state with the phase basis in the case of the multi-photon state.
Then, Eve is assumed to be able to control the detection rates q(0) , q(1) , q(2) × , and pulses of the state ρ 2 with the phase basis, and pulses of the state ρ 3 with the phase basis, respectively.Similarly, Eve is assumed to be able to control the phase-error detection rates b(1) × , and b(3 pulses of the state ρ 2 with the phase basis, and pulses of the state ρ 3 with the phase basis, respectively.In the following discussion, we use the parameters ā(1) × .For a simplicity, we employ the notations ā := (ā × ).Eve is also assumed to be able to control the parameters q(0) , ā and b dependently on the partition of the total N 0 + N 1 + N 2 + N s pulses.Further, Eve is assumed to choose these values stochastically.Hence, the joint distribution conditioned with N and N s can be written as Q e (q (0) , ā, b| N , N s ).Since our analysis depends only on N , we use the conditional distribution Q e (q (0) , ā, b| N ) := Ns P s (N s )Q e (q (0) , ā, b| N , N s ), where P s is the distribution of N s and cannot be controlled by Eve.

Non-improved formula
The aim of this section is to give formulas of the sacrifice bit-length $S$ satisfying
$$\| \rho_{A,E} - \rho_{\mathrm{ideal}} \|_1 \le 2^{-\beta} \qquad (15)$$
as a function of $\beta$, $\mu_s$, $\mu_1$, $\mu_2$, $N_s$, $N_0$, $N_1$, $N_2$, and $M$, where $\rho_{A,E}$ is the final state and $\rho_{\mathrm{ideal}}$ is the ideal state. This section gives two formulas, the non-improved formula and the improved formula. While the improved formula gives a shorter sacrifice bit-length than the non-improved formula, the non-improved formula is simpler. Hence, we give the non-improved formula first and the improved formula in the next subsection. For this purpose, we prepare a fundamental definition concerning the behavior of random variables.
Definition 1 When the random variable $k$ is subject to the distribution $P$, we write $k \sim P$. When the true distribution is the $N$-trial binomial distribution with success probability $p$, which is denoted by $\mathrm{Bin}(N, p)$, we denote the upper percent point with probability $\alpha$ by $X^{+}_{\mathrm{per}}(N, p, \alpha)$ and the lower percent point with probability $\alpha$ by $X^{-}_{\mathrm{per}}(N, p, \alpha)$. Then, we define $p^{+}_{\mathrm{per}}(N, p, \alpha) := X^{+}_{\mathrm{per}}(N, p, \alpha)/N$ and $p^{-}_{\mathrm{per}}(N, p, \alpha) := X^{-}_{\mathrm{per}}(N, p, \alpha)/N$. When we observe the value $k$ subject to the binomial distribution $\mathrm{Bin}(N, p)$ with $N$ trials and probability $p$, we denote the lower confidence limit of the lower one-sided interval estimation with the confidence level $1 - \alpha$ by $p^{-}_{\mathrm{est}}(N, k, \alpha)$. Similarly, we denote the upper confidence limit of the upper one-sided interval estimation with the confidence level $1 - \alpha$ by $p^{+}_{\mathrm{est}}(N, k, \alpha)$. Then, we define $X^{\pm}_{\mathrm{est}}(N, k, \alpha) := N p^{\pm}_{\mathrm{est}}(N, k, \alpha)$.
When $N$ is not so large (e.g., 10,000) or $\alpha$ is not so small (e.g., 0.001), the percent point $X^{\pm}_{\mathrm{per}}(N, p, \alpha)$ can be calculated by a mathematical software package (e.g., Mathematica). As is summarized in Appendix B.1, the interval estimation $p^{\pm}_{\mathrm{est}}(N, k, \alpha)$ is described by the F distribution and can be calculated in the same way in this case. However, when $N$ is too large and $\alpha$ is too small, these calculations cannot be done by a usual mathematical software package. In that case, since $N$ is large enough, the formulas given in Appendices A and B yield good lower and upper bounds of these values, which are close enough to the exact values for our purpose. These formulas can be evaluated with a small amount of computation.
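For moderate $N$, the quantities of Definition 1 can be evaluated directly from the binomial distribution. The sketch below is one possible reading, not the authors' implementation. It assumes that the upper percent point is the smallest $x$ with $\Pr\{X > x\} \le \alpha$, that the lower percent point is the largest $x$ with $\Pr\{X < x\} \le \alpha$, and that the one-sided confidence limits are the exact (Clopper-Pearson) limits, which is the construction usually expressed through the F distribution. All function names are hypothetical.

```python
import math


def binom_pmf(n: int, k: int, p: float) -> float:
    """Pr{X = k} for X ~ Bin(n, p), evaluated in log space to avoid overflow."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def x_per_upper(n: int, p: float, alpha: float) -> int:
    """Smallest x with Pr{X > x} <= alpha (assumed convention)."""
    tail, x = 0.0, n  # tail holds Pr{X > x}
    while x > 0:
        mass = binom_pmf(n, x, p)
        if tail + mass > alpha:
            break
        tail += mass
        x -= 1
    return x


def x_per_lower(n: int, p: float, alpha: float) -> int:
    """Largest x with Pr{X < x} <= alpha (assumed convention)."""
    tail, x = 0.0, 0  # tail holds Pr{X < x}
    while x < n:
        mass = binom_pmf(n, x, p)
        if tail + mass > alpha:
            break
        tail += mass
        x += 1
    return x


def _solve(f, target: float, increasing: bool, iters: int = 200) -> float:
    """Bisection on [0, 1] for a monotone function f with f(root) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def p_est_lower(n: int, k: int, alpha: float) -> float:
    """Lower one-sided confidence limit: the p with Pr_p{X >= k} = alpha."""
    if k == 0:
        return 0.0
    return _solve(lambda p: sum(binom_pmf(n, j, p) for j in range(k, n + 1)),
                  alpha, increasing=True)


def p_est_upper(n: int, k: int, alpha: float) -> float:
    """Upper one-sided confidence limit: the p with Pr_p{X <= k} = alpha."""
    if k == n:
        return 1.0
    return _solve(lambda p: sum(binom_pmf(n, j, p) for j in range(0, k + 1)),
                  alpha, increasing=False)
```

The cost of this direct evaluation grows with $N$, and a tail probability as small as $2^{-80}$ is dominated by terms far in the tail, so the sketch is only meant for small instances where the definitions can be checked by hand.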
Indeed, in order to guarantee the unconditional security, we have to use the hypergeometric distribution instead of the binomial distribution. However, the hypergeometric distribution can be partially replaced by the binomial distribution. Section 7 explains in which cases this replacement is allowed. This replacement greatly simplifies the calculation of the sacrifice bit-length. Now, we give the non-improved formula of the sacrifice bit-length $S$ as a function of $\beta$, $\mu_s$, $\mu_1$, $\mu_2$, $N_s$, $N_0$, $N_1$, $N_2$, and $M = (M_s, M_0, M_1, M_2, M_3)$. The whole structure of our formulas is summarized in Fig. 2. As is shown later, when the sacrifice bit-length is given in the following way, the final key satisfies (15).
Step (1) We estimate the detection rates of decoy pulses from the observed data based on interval estimation:
Step (2) We estimate the partitions of several kinds of transmitted pulses by using percent points:
Step (3) We estimate the channel parameters from the partitions and the detection rates of several kinds of transmitted pulses by solving joint inequalities, where $[x]_+ := \max(x, 0)$:
Step (4) We estimate the partition of detected pulses of raw keys from the channel parameters by using percent points: $\hat{J}^{(1)} := X^{-}_{\mathrm{per}}(N_s, e^{-\mu_s}\mu_s(\hat{a} \ldots$
Step (5) We estimate the leaked information from the partition of detected pulses of raw keys by using the relation between the phase error and the security. The resulting estimate $\phi_2$ is given in (31).
Step (6) We give the sacrifice bit-length from the leaked information: $S := \phi_2 + 2\beta + 5$ when Conditions 1, 2, and 3 hold, and $S := \dim C_1$ otherwise. That is, when one of Conditions 1, 2, and 3 does not hold, we abort the protocol.
Conditions 1, 2, and 3 are given as follows. In order to give these conditions, we define the set $\Omega_1$ as the set of $N$ satisfying
Condition 2 For any $N \in \Omega_1$, all of the following values are positive.
Remark 3 (Adjustment of q(0) for non-improved formula) When the vacuum pulse has a possibility to contain a non-vacuum state, we cannot apply the above formula for q(0) . Hence, we need to adjust it. Assume that the vacuum pulse becomes a non-vacuum state with a probability $q$. In this case, we replace q(0) by
Here, we should remark that Condition 1 is given for the initial parameters $\beta$, $\mu_1$, $\mu_2$, $N_1$, $N_2$, while Conditions 2 and 3 are given for the observed values $M = (M_s, M_0, M_1, M_2, M_3)$ as well as the initial parameters $\beta$, $\mu_1$, $\mu_2$, $N_0$, $N_1$, $N_2$. Hence, it is required to choose the initial parameters $\beta$, $\mu_1$, $\mu_2$, $N_1$, $N_2$ satisfying Condition 1. Further, we need to choose the initial parameters $\beta$, $\mu_1$, $\mu_2$, $N_0$, $N_1$, $N_2$ so that Conditions 2 and 3 hold with high probability. Now, we consider the case when there might exist an eavesdropper. In this case, even if we choose $\mu_1$, $\mu_2$, $N_0$, $N_1$, $N_2$ suitably, the eavesdropper might control the channel parameters $q^{(0)}$, $\bar{a}$, and $b$ so that Conditions 2 and 3 do not hold. Hence, we need to prepare a method to smoothly decide whether Conditions 2 and 3 hold.
We will show that the non-improved formula satisfies the condition (15) in Sections 8, 9, and 10.The following table (Table 3) explains which equations in Sections 8, 9, and 10 correspond to the above steps in the non-improved formula.
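The final step of the formula, in which the sacrifice bit-length is $\phi_2 + 2\beta + 5$ when Conditions 1, 2, and 3 hold and $\dim C_1$ otherwise, amounts to a simple decision rule. The following sketch illustrates that rule only. The function name `sacrifice_bit_length` and its arguments are hypothetical, and the leaked-information estimate `phi_2` as well as the outcome of the three condition checks are taken as given inputs because their defining equations are not reproduced here.

```python
def sacrifice_bit_length(phi_2: float, beta: int, dim_c1: int,
                         conditions_hold: bool) -> float:
    """Decision rule of Step (6).

    phi_2           : estimate of the leaked information (assumed to be given)
    beta            : security exponent, i.e. the target trace norm is 2**(-beta)
    dim_c1          : dimension of the code C_1
    conditions_hold : True only if Conditions 1, 2, and 3 all hold
    """
    if not conditions_hold:
        # Sacrificing dim C_1 bits corresponds to aborting the protocol.
        return float(dim_c1)
    return phi_2 + 2 * beta + 5


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    print(sacrifice_bit_length(phi_2=1000.0, beta=80, dim_c1=10**6,
                               conditions_hold=True))
    print(sacrifice_bit_length(phi_2=1000.0, beta=80, dim_c1=10**6,
                               conditions_hold=False))
```

Keeping the rule separate from the estimation steps makes it easy to swap in the improved estimate of the leaked information without touching the abort logic.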

Improved formula
However, the above construction is too restrictive. We can replace Steps (1), (2), (4), Condition 1, and the definition of the set $\Omega_1$ as follows. That is, Conditions 2 and 3 are replaced by the conditions based on the improved version of $\Omega$. The formula given here for the sacrifice bit-length is called the improved formula. Section 11 explains why the improvement is possible. That is, Section 11 shows that the improved formula also guarantees the condition (15).
Step (1) We replace the estimated detection rates of decoy pulses in the following way:
Step (2) We replace the estimated partitions of several kinds of transmitted pulses in the following way:
Step (4) We replace the estimated partition of detected pulses and the estimated phase error rate of the single photon of raw keys in the following way: $\hat{J}^{(1)} := X^{-}_{\mathrm{per}}(N_s, e^{-\mu_s}\mu_s(\hat{a} \ldots$
The definition of the set $\Omega_1$ is replaced as the set of $N$ satisfying
Condition 1 is replaced as follows.

Numerical analysis
Next, we treat numerical analysis with the improved formula of the sacrifice bit-length.
In the following, we consider only the case when the perfect vacuum state is available, the signal intensity is $\mu_2$, and the decoy intensity is $\mu_1$, i.e., $\mu_s = \mu_2$, $N_s = N_{s,2}$, and $M_s = M_{s,2}$. This is because this case is better than the opposite case in the asymptotic setting, as is shown in the paper [50]. We also choose the parameters as
It is natural to assume that the measured values $M_0$, $M_1$, $M_2$, $M_3$, and $N_{s,2}$ are given as functions of $M_{s,2}$ in the following way.
We also assume that the channel parameters, i.e., the detection rates $p_{i,\times}$ and $p_{i,+}$ of $\mu_i$-intensity pulses with the bases $\times$ and $+$, and the rates $s_{i,\times}$ and $s_{i,+}$ of the detected $\mu_i$-intensity pulses having phase error relative to the transmitted $\mu_i$-intensity pulses with the bases $\times$ and $+$, are given as follows [51,52].
where $\alpha$ is the total transmission including the quantum efficiency of the detector, and $s$ is the error due to the imperfection of the optical system. In the following, we choose $\alpha = 1.0 \times 10^{-3}$, $p_0 = 4.0 \times 10^{-7}$, and $s = 0.03$. Then, we consider the key generation rate with finite length:
where $S$ is the sacrifice bit-length and $\eta = 1.1$. Since the required value $2^{-80}$ is too small and the sizes $N_0$, $N_1$, $N_2$, $M_s$ are too large, the exact calculation of the values $X^{\pm}_{\mathrm{per}}(N, p, \alpha)$, $X^{\pm}_{\mathrm{est}}(N, k, \alpha)$, and $p^{\pm}_{\mathrm{est}}(N, k, \alpha)$ takes too much time. So, instead of the exact calculation, we employ bounds of these values based on the Chernoff bound, which require a smaller amount of calculation and are summarized in Appendices A and B. Indeed, when $N$ is large enough, the exact values of $X^{\pm}_{\mathrm{per}}(N, p, \alpha)$, $X^{\pm}_{\mathrm{est}}(N, k, \alpha)$, and $p^{\pm}_{\mathrm{est}}(N, k, \alpha)$ are sufficiently close to the values based on the Chernoff bound for our purpose, because the difference between the exact values $X^{\pm}_{\mathrm{per}}(N, p, \alpha)$, $X^{\pm}_{\mathrm{est}}(N, k, \alpha)$ and their values based on the Chernoff bound is of order $\log N$.
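The idea of replacing an exact percent point by a Chernoff-type bound can be sketched as follows. The bounds actually used in Appendices A and B are not reproduced here; the sketch uses the textbook form $\Pr\{X \ge Nq\} \le e^{-N D(q\|p)}$ for $q \ge p$, where $D$ is the binary relative entropy, and assumes the convention that the upper percent point is the smallest $x$ with $\Pr\{X > x\} \le \alpha$. Function names are hypothetical, and `exact_upper_percent_point` is included only so that the two values can be compared for a small instance.

```python
import math


def kl_bernoulli(q: float, p: float) -> float:
    """Relative entropy D(q||p) between Bernoulli(q) and Bernoulli(p), in nats."""
    def term(a: float, b: float) -> float:
        return 0.0 if a == 0.0 else a * math.log(a / b)
    return term(q, p) + term(1.0 - q, 1.0 - p)


def chernoff_upper_percent_point(n: int, p: float, alpha: float) -> int:
    """Upper bound on the smallest x with Pr{X > x} <= alpha, for 0 < p < 1."""
    target = math.log(1.0 / alpha) / n
    if kl_bernoulli(1.0, p) <= target:
        return n  # the bound is vacuous; fall back to the trivial value
    lo, hi = p, 1.0  # invariant: D(hi||p) >= target
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if kl_bernoulli(mid, p) >= target:
            hi = mid
        else:
            lo = mid
    # Pr{X > floor(n*hi)} <= Pr{X >= n*hi} <= exp(-n*D(hi||p)) <= alpha.
    return min(n, math.floor(n * hi))


def exact_upper_percent_point(n: int, p: float, alpha: float) -> int:
    """Exact value by summing the binomial tail from the top (moderate n only)."""
    tail, x = 0.0, n
    while x > 0:
        log_pmf = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
                   + x * math.log(p) + (n - x) * math.log1p(-p))
        mass = math.exp(log_pmf)
        if tail + mass > alpha:
            break
        tail += mass
        x -= 1
    return x


if __name__ == "__main__":
    # Small instance where both values can be computed.
    print(exact_upper_percent_point(2000, 0.1, 1e-6),
          chernoff_upper_percent_point(2000, 0.1, 1e-6))
    # Large instance where only the bound is cheap (illustrative parameters).
    print(chernoff_upper_percent_point(10**7, 1e-3, 2.0**-80))
```

The bisection costs a fixed number of relative-entropy evaluations regardless of $N$, which is the practical advantage over summing the binomial tail.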
As is illustrated in Figs. 3 and 4 with $\mu_1 = 0.1$, $\alpha = 1/1000$, $p_0 = 0.0000004$, and $\eta = 1.1$, the key generation rate approaches the asymptotic key generation rate $R_2(\mu_1, \mu_2)$ as the length of the code $M_{s,2}$ increases. As is shown in [50], the asymptotic key generation rate is monotonically decreasing with respect to $\mu_1$. However, as is illustrated in Fig. 5 with $\alpha = 1/1000$, $p_0 = 0.0000004$, and $\eta = 1.1$, the key generation rate is not monotonically decreasing with respect to $\mu_1$ when the length of the code $M_{s,2}$ is not sufficiently large. That is, too small a value of $\mu_1$ does not give a good key generation rate. This is because a smaller $\mu_1$ yields a larger estimation error.
6. Sacrifice bit-length when the intensities are not fixed with the finite-length case

Derivation of modified formula
Unfortunately, many realized quantum key distribution systems have fluctuating intensities. The formulas of the secure sacrifice bit-length given in Section 5 can guarantee the security (15) when the partitions of the $N_1$ pulses and the $N_2$ pulses obey the Poisson distribution with a fixed intensity. However, when the intensities fluctuate, we have to derive the sacrifice bit-length by taking this factor into account. That is, we need to discuss the distribution of $N$ in a different way. In this section, we discuss the sacrifice bit-length by taking into account the statistical fluctuation of the intensities. Since the definition of $\rho_2$ given in (10) depends on the intensity $\mu_1$, we need to modify the definition of $\rho_2$ properly.

6.1.1. Modifications of ρ 2 , ρ 3 , ω 2 , and ω 3
In the following, we assume that the intensities $\mu_1$ and $\mu_2$ independently obey the independent and identical distributions $P_1$ and $P_2$ satisfying the following condition. For any integer $n \ge 3$, the relation ] holds, where $E$ denotes the expectation under the distributions $P_1$ and $P_2$. Under the above assumption, we have expansions for two kinds of pulses. where |n n| (64) (66) Indeed, our analysis in the previous sections uses the expansions (10) and (13) and their coefficients. Hence, replacing the expansions (10) and (13) by the expansions (62) and (63), we can apply the discussion with suitable modifications in the following way. (A similar idea was used in Wang [15,16].)

6.1.2. Modifications of the set Ω 1 and the estimate N1 and N2
We redefine the set $\Omega_1$ as the set of $N$ satisfying ]. We also redefine N1 and N2 in the following way.
Condition 1 Any element N in the modified set Ω 1 satisfies ) .
Conditions 2 and 3 are redefined in the term of Ω 1 defined above.
Condition 2 For any element N in the modified set Ω 1 , all of 2 , and A (2) 2 are negative.
6.1.5. Extension to the case when the distributions of µ 2 and µ 1 are unknown
Next, we treat the case when there are several candidates for the distributions of $\mu_2$ and $\mu_1$, while $\mu_2$ and $\mu_1$ obey independent and identical distributions. The possible distributions are denoted by $P_{\theta,1}$ and $P_{\theta,2}$, and the expectation is written as $E_\theta$. Then, we denote the set $\Omega_1$ under the distribution $P_\theta$ by $\Omega_{1,\theta}$.
In this case, Conditions 1 and 2 must be satisfied for any $\theta$. Hence, Condition 1 is redefined as follows. That is, the following relations hold for any $\theta$.
where $\omega_{2|\theta}$ is $\omega_2$ with the distribution $P_{\theta,1}$. Further, we redefine Condition 2 as the condition that all of A (0) 2 , and A (2) 2 are negative for $N \in \cup_\theta \Omega_{1,\theta}$. We define $\phi_{2,\theta}$ to be $\phi_2$ given in (31) when the true distributions are $P_{\theta,1}$ and $P_{\theta,2}$. Finally, we define the sacrifice bit-length $S$ by $\sup_\theta \phi_{2,\theta} + 2\beta + 5$ when the modified Conditions 1, 2, and 3 hold. Otherwise, we set $S$ to be $\dim C_1$. Then, letting $\rho_{A,E|\theta}$ be the final state with the true distributions $P_{\theta,1}$ and $P_{\theta,2}$, and $\rho_{\mathrm{ideal}|\theta}$ be the ideal state, we obtain
$$\| \rho_{A,E|\theta} - \rho_{\mathrm{ideal}|\theta} \|_1 \le 2^{-\beta}.$$
That is, the inequality holds for any $\theta$.
In the following, we consider the case when the pulses are generated with the mixture of the plural independent and identical distributions P θ,1 and P θ,2 , respectively.In this case, we define Conditions 1, 2, and 3 in the above way.Then, the intensities of N s,1 +N 1 pulses are described by (µ 1,1 , . . ., µ 1,N s,1 +N 1 ) and are subject to the distribution , where P ×N s,1 is the N s,1 -fold independent and identical distribution of P .Similarly, the intensities of N s,2 + N 2 pulses are described by (µ 2,1 , . . ., µ 2,N s,2 +N 2 ) and are subject to the distribution Hence, the universal composability criterion is upper bounded by 2 −β .

Numerical analysis with Gaussian distribution
Next, we treat numerical analysis when the two intensities $\mu_1$ and $\mu_2$ independently and identically obey Gaussian distributions with the averages $\bar{\mu}_1$ and $\bar{\mu}_2$ and the standard deviations $\bar{\mu}_1 t$ and $\bar{\mu}_2 t$, respectively, because these fluctuations are usually caused by thermal noise. That is, we assume that the value $t$ is independent of the intensity. This assumption holds if the weak pulses are obtained from strong light pulses with a well-calibrated attenuator; the error then originates mainly from the intensity fluctuation of the light source. In the following, we consider only the case when the signal intensity is $\mu_2$ and the decoy intensity is $\mu_1$, i.e., $\mu_s = \mu_2$, $N_s = N_{s,2}$, and $M_s = M_{s,2}$. We also choose the parameters as $N_0 = N_1 = N_2 = N_{s,2}/10$ and $\beta = 80$, i.e., the trace norm is less than $2^{-80}$. In order to calculate the sacrifice bit-length given above, we need E[e … and $\omega_2$, which can be easily calculated from the formulas given in Appendix D. In this case, due to (C.6), it is natural to assume that the measured values $M_0$, $M_1$, $M_2$, and $M_3$ are given by (58) and (59) when $p_{i,+}$, $p_{i,\times}$, $s_{i,+}$, and $s_{i,\times}$ are given as
Hence, we choose $N_{s,2}$ to be $M_{s,2}/p_{2,+}$. Under this assumption, substituting the sacrifice bit-length given above into the key generation rate $R_{2,f}$ given in (61), we obtain the numerical results in Figs. 6 and 7. These numerical results suggest that, when the standard deviation is less than 10% of the average, the fluctuations of the intensities do not cause a serious decrease of the key generation rate. Here, similar to Subsection 5.3, we employ the bounds of $X^{\pm}_{\mathrm{per}}(N, p, \alpha)$, $X^{\pm}_{\mathrm{est}}(N, k, \alpha)$, and $p^{\pm}_{\mathrm{est}}(N, k, \alpha)$ given in Appendices A and B.
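Averages of Poisson weights over a Gaussian intensity can be written in closed form. The sketch below is not a transcription of Appendix D, whose formulas are not reproduced here; it uses only the standard moment generating function of a Gaussian variable $\mu$ with mean $\bar{\mu}$ and standard deviation $\sigma = \bar{\mu} t$, which gives $E[e^{-\mu}] = e^{-\bar{\mu} + \sigma^2/2}$ and $E[\mu e^{-\mu}] = (\bar{\mu} - \sigma^2)\, e^{-\bar{\mu} + \sigma^2/2}$. The function names are hypothetical, and the Gaussian is integrated over the whole real line, so the small probability of a negative intensity is ignored.

```python
import math


def avg_vacuum_prob(mu_bar: float, t: float) -> float:
    """E[exp(-mu)] for mu ~ Normal(mu_bar, (mu_bar * t)**2)."""
    sigma = mu_bar * t
    return math.exp(-mu_bar + 0.5 * sigma**2)


def avg_single_photon_prob(mu_bar: float, t: float) -> float:
    """E[mu * exp(-mu)] for mu ~ Normal(mu_bar, (mu_bar * t)**2)."""
    sigma = mu_bar * t
    return (mu_bar - sigma**2) * math.exp(-mu_bar + 0.5 * sigma**2)


def numeric_average(f, mu_bar: float, t: float,
                    half_width: float = 10.0, steps: int = 20000) -> float:
    """Trapezoidal check of E[f(mu)] over mu_bar +/- half_width * sigma (t > 0)."""
    sigma = mu_bar * t
    a = mu_bar - half_width * sigma
    h = 2.0 * half_width * sigma / steps
    total = 0.0
    for i in range(steps + 1):
        x = a + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        density = math.exp(-0.5 * ((x - mu_bar) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += weight * f(x) * density
    return total * h


if __name__ == "__main__":
    mu_bar, t = 0.5, 0.1  # illustrative values: 10% relative fluctuation
    print(avg_vacuum_prob(mu_bar, t),
          numeric_average(lambda m: math.exp(-m), mu_bar, t))
    print(avg_single_photon_prob(mu_bar, t),
          numeric_average(lambda m: m * math.exp(-m), mu_bar, t))
```

For $t = 0$ the closed forms reduce to the fixed-intensity Poisson weights $e^{-\bar{\mu}}$ and $\bar{\mu} e^{-\bar{\mu}}$, which is a quick sanity check on the expressions.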

Preparation for behavior of random variables
In this section, we explain that we can use the binomial distribution even when the true distribution is the hypergeometric distribution. In this paper, we also treat the hypergeometric distribution $\mathrm{HG}(L, K, N)$ with $N$ draws and $L$ samples containing $K$ successes. In fact, the outcome obeys the binomial distribution in the case of sampling with replacement, and the hypergeometric distribution in the case of sampling without replacement. Then, we study the stochastic behavior of the measured values $M = (M_s, M_0, M_1, M_2, M_3)$ under the assumption that the parameters $q^{(0)}$, $\bar{a}$, and $b$ are unknown but fixed to certain values. For this purpose, we introduce the random variables $\tilde{M}_s, \tilde{M}_0, \tilde{M}_1, \tilde{M}_2, \tilde{M}_3$ subject to the binomial distributions with the same numbers of draws and the same success probabilities as $M_s, M_0, M_1, M_2, M_3$ by sampling with replacement. The number of vacuum pulses is
Then, the number $M_0$ of detected pulses among these $N_0$ vacuum pulses obeys the hypergeometric distribution HG(N 0 + N s ), N 0 ). For a real number $R > q^{(0)}$, the probability $\Pr\{ M_0 / N_0 > R \}$ is smaller than the probability $\Pr\{ \tilde{M}_0 / N_0 > R \}$. The reason is as follows. Let $L$ be an arbitrary integer less than $N_0 - 1$. If the observed detection rate of the initial $L$ transmitted pulses is greater than $R$, the detection probability of the $(L+1)$-th pulse is less than $q^{(0)}$ in the case of sampling without replacement. Thus, we obtain
where $\Pr_{M, J | q^{(0)}, \bar{a}, b, N}$ is the distribution of the random variables $M$, $J$ when $q^{(0)}$, $\bar{a}$, $b$, and $N$ are fixed. That is, we obtain
where $\bar{M}_0$ is the expectation of $M_0$, which equals $q^{(0)} N_0$.
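The comparison between sampling with and without replacement can be checked numerically for a concrete instance. The sketch below evaluates both upper tails exactly with rational arithmetic. The function names and the parameter values are hypothetical and chosen only for illustration, with thresholds well above the mean. It is an illustration for the listed thresholds, not a proof, and it does not cover thresholds close to the mean or very small samples.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(population: int, successes: int,
                         draws: int, threshold: int) -> Fraction:
    """Pr{X > threshold} when `draws` items are sampled without replacement."""
    total = comb(population, draws)
    top = min(draws, successes)
    ways = sum(comb(successes, x) * comb(population - successes, draws - x)
               for x in range(threshold + 1, top + 1))
    return Fraction(ways, total)


def binom_upper_tail(draws: int, p: Fraction, threshold: int) -> Fraction:
    """Pr{X > threshold} when `draws` items are sampled with replacement."""
    tail = Fraction(0)
    for x in range(threshold + 1, draws + 1):
        tail += comb(draws, x) * p**x * (1 - p) ** (draws - x)
    return tail


if __name__ == "__main__":
    # Illustrative setting: 1000 pulses, 100 of which would be detected,
    # and 200 of them are drawn; the mean number of detections is 20.
    population, successes, draws = 1000, 100, 200
    p = Fraction(successes, population)
    for threshold in (30, 35, 40):
        without = hypergeom_upper_tail(population, successes, draws, threshold)
        with_repl = binom_upper_tail(draws, p, threshold)
        print(threshold, float(without), float(with_repl))
```

Exact fractions avoid any doubt about rounding when the two tails are compared, at the price of slower arithmetic; for the pulse numbers of a real system the Chernoff-type bounds mentioned in the text are the practical route.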
Remark 5 Here, we should remark that the above analysis does not imply that the non-replacement case can be reduced to the replacement case perfectly. Let $M_1^{(0)}$ be the number of detected pulses among the $N_1^{(0)}$ transmitted vacuum pulses, and let $\tilde{M}_1^{(0)}$ be the random variable subject to the binomial distribution with the same number of draws and the same success probability as $M_1^{(0)}$ by sampling with replacement. Since $\tilde{M}_0$ and $\tilde{M}_1^{(0)}$ are independent of each other due to sampling with replacement, we have
However, since $M_0$ and $M_1^{(0)}$ are sampled without replacement, $M_0$ and $M_1^{(0)}$ are not independent of each other. Hence, the relation (76) does not hold for M 0 and N 0 + a} cannot be bounded by the RHS of (76). Instead of the RHS of (76), we have a weaker evaluation, because
That is, the above discussion cannot yield the better bound (RHS of (76)) but can yield a weaker bound (RHS of (77)).
Next, we consider a more complicated case, i.e., we focus on the $N_1$ $\mu_1$-intensity pulses, which contain pulses with the single-photon state, and $N_1^{(2)}$ pulses with the state $\rho_2$. Then, the expectation $\bar{M}_1$ of $M_1$ is function satisfying this requirement. The final aim is to figure out the structure of our formula of the sacrifice bit-length, which gives the detail of Figs. 1 and 2. Now, we recall the definition of the set $\Omega_1$ and Condition 1. Condition 1 is equivalent to the condition that any element $N \in \Omega_1$ satisfies Conditions 4, 5, and 6. In the following, we assume that there exists a real-valued function $\phi_b$ of the measured value $M$ that satisfies that under Condition 1. In Section 10, we will give a concrete function $\phi_4$ satisfying the above condition. Note that the value $\phi_b(M)$ does not depend on the partition $N$. Then, we can show the following theorem.
Theorem 1 When Condition 1 holds and the function $\phi_b$ satisfies (92), we obtain
Proof. The definition of $\Omega_1$ yields that which implies the desired argument. Therefore, when $\rho_{A,E}$ is the final state with the sacrifice bit-length Thus, the relation (2) implies
In summary, since Theorem 1 requires Condition 1, we need to choose the parameters $\mu_1$, $\mu_2$, $N_0$, $N_1$, and $N_2$ so that Condition 1 holds. That is, we need to choose sufficiently large integers $N_0$, $N_1$, and $N_2$. Otherwise, we cannot apply Theorem 1, i.e., we cannot guarantee the security.
The later sections give a formula of the sacrifice bit-length $S$ as a function of $\beta$, $\mu_s$, $\mu_1$, $\mu_2$, $N_s$, $N_0$, $N_1$, $N_2$, and $M$ by giving a concrete example of $\phi_b$. In order to apply interval estimation and percent points, we have to decide whether the upper or the lower bound is to be used in the respective steps. These decisions are made based on the derivatives with respect to the respective variables. Hence, the calculation of these derivatives is the main issue in the later sections.

Estimation of another kind of channel parameters M
In this subsection, we treat the estimation of the channel parameters M that is required to estimate the channel parameters q(0) , ā × , and b(1) × when the partition N = (N 1 , N 2 ) of pulses is known.That is, we consider the method to estimate M0 , M1 , M2 , and M3 from the measured value M 0 , M 1 , M 2 , and M 3 .
For this purpose, we introduce the following assumption for M .
In the following, we employ as estimates of $\bar{M}_0$, $\bar{M}_1$, $\bar{M}_2$, and $\bar{M}_3$, which give Step (1) in Figs. 1 and 2. Then, we define the function $\phi_3$ as $\phi_3(M, N) := \phi_2(\bar{M}(M), N)$, which satisfies the condition (91) for $\phi_a$, as is guaranteed by the following theorem.
Theorem 2 When the partition $N$ belongs to $\Omega_1$ and satisfies Conditions 4 and 5, and when $\bar{M}(M)$ satisfies Condition 7, the relation
Proof. The definition of $\phi_2$ given in (121) yields Hence, the above calculations of the partial derivatives and Conditions 4 and 5 imply Thus, it follows from the relations (74), (80), (83), and (85) that Using (141) and (106), we obtain
Remark 7 When the vacuum pulse has a possibility to contain a non-vacuum state, we adjust the estimate q(0) (M 0 ) as follows where the vacuum pulse becomes a non-vacuum state with a probability $q$. Let $N_0^{(1)}$ be the number of non-vacuum pulses among the $N_0$ pulses. Then, 0 , 2 −2β−8 )}. Hence, the probability of {q 0 < q(0) (M 0 )} is less than $2 \cdot 2^{-2\beta-8}$. Thus, in the proof of Theorem 2, we replace the right-hand side of (142) by 5
In this section, we define the upper bound $\phi_b(M)$ satisfying (92) as a function of the measured value $M$. Note that the upper bound $\phi_b(M)$ does not depend on the partition $N$. In this subsection, for this purpose, we recall Conditions 2 and 3, which are conditions for $\mu_1$, $\mu_2$, $N_0$, $N_1$, $N_2$ and the observed data $M$. Condition 3 plays an alternative role of Condition 7.
Substituting M into M , and applying the relations (122), (123), and (124), we can calculate the above values as due to (95), we obtain $\| \rho_{A,E} - \rho_{\mathrm{ideal}} \|_1 \le 2^{-\beta}$. Note that $\phi_2(\bar{M}(M), N)$ is given in (121), which is the same as $\phi_2$ given in (31). Since M + 2β + 5 is greater than $\dim C_1$, the formula (152) implies the abort of the protocol when one of Conditions 1, 2, and 3 does not hold. Hence, we obtain (15) under the formula given in Subsection 5.1.

Security proof of improved formula
Up to the previous section, based on (95) given in Section 8, we evaluate the universal composability criterion with the finite-length setting.However, the above given evaluation can be improved by removing the square root for a part of probabilities.
The improved formula for the sacrifice bit-length given in Subsection 5.2 is derived by removing the square root for a part of probabilities.The purpose of this section is to show the following theorem.
Theorem 3 The improved formula for the sacrifice bit-length given in Subsection 5.2 satisfies the condition (15), i.e., $\| \rho_{A,E} - \rho_{\mathrm{ideal}} \|_1 \le 2^{-\beta}$.
Now, we will show the above theorem. For this purpose, we discuss the formula (2) more deeply. Let $s = (s_1, \ldots, s_{N_s + N_0 + N_1 + N_2})$ be the indicators of the kinds of initial states of the pulses received by Bob. The indicators are decided as follows. If the $i$-th received state is the vacuum state, $s_i$ is 0. If the $i$-th received state is the single-photon state, $s_i$ is 1. If the $i$-th received state is the state $\rho_2$, $s_i$ is 2. Otherwise, $s_i$ is 3. The information $s$ contains all of the information for $N$ and $(J^{(0)}, J^{(1)}, J^{(2)})$. That is, $s$ decides $N$ and $(J^{(0)}, J^{(1)}, J^{(2)})$. However, it cannot decide $J_e$. Once we apply (2) to the case when $s$ is fixed, we obtain
where $\rho_{A,E|s}$, $\rho_{\mathrm{ideal}|s}$, and $P_{\mathrm{ph}|s}$ are the final true composite state, the ideal final state, and the averaged virtual decoding phase error probability conditioned on $s$. Hence, the final true composite state $\rho_{A,E}$ and the ideal final state $\rho_{\mathrm{ideal}}$ are written as Hence, for a set $\Omega$ of $s$, we have We choose the set $\Omega$ as Hence, using (155), we obtain Since $\phi_4(M) \ge \phi_3(\bar{M}(M), N)$ for $N \in \Omega_1$, as was shown in (151), the relations (141), (142), and (105) guarantee that J (1) ∪{q J (1) ≥ p + per (J (1) , b J (1) ≥ p + per J (1) , b Thus, using (156), we obtain because $2\sqrt{2} + \frac{13}{16}\ (\approx 3.64) \le 4$. Indeed, in order to take a probability out of the square root, the event corresponding to the probability must be defined by $s$, i.e., the probability conditioned on $s$ must take the value 1 or 0. Hence, the probabilities corresponding to the sets { J (1) e J (1) ≥ p + per (J (1) , b
In summary, when the parameters $\mu_1$, $\mu_2$, $N_0$, $N_1$, and $N_2$ satisfy Condition 1 modified in Subsection 5.2, and when we choose the sacrifice bit-length $S(M) = \phi_4(M) + 2\beta + 5$ by using the choice of $\phi_4(M)$ given in (150) with the modification given in Subsection 5.2, we obtain $\| \rho_{A,E} - \rho_{\mathrm{ideal}} \|_1 \le 2^{-\beta}$.

Conclusion and further improvement
In this paper, under the BB84 protocol with the decoy method, based on several observed values, we have derived the required sacrifice bit-length $S(M) = \phi_2(M) + 2\beta + 5$, where $\phi_2(M)$ is given in Step (6). Under the above sacrifice bit-length, we have shown that the final keys satisfy the security condition $\| \rho_{A,E} - \rho_{\mathrm{ideal}} \|_1 \le 2^{-\beta}$ when the parameters $\mu_1$, $\mu_2$, $N_s$, $N_0$, $N_1$, and $N_2$ satisfy Condition 1. Hence, in order to apply our formula, we need to choose the parameters $\mu_1$, $\mu_2$, $N_s$, $N_0$, $N_1$, and $N_2$ so that Condition 1 holds. This is an essential requirement for our analysis. However, when we choose sufficiently large integers $N_s$, $N_0$, $N_1$, and $N_2$ for the two values $\mu_1$ and $\mu_2 - \mu_1$, Condition 1 holds. Indeed, when the two positive values $\mu_1$ and $\mu_2 - \mu_1$ are quite small, we need to choose quite large integers $N_s$, $N_0$, $N_1$, and $N_2$. As the second requirement, we need to choose the parameters $\mu_1$, $\mu_2$, $N_s$, $N_0$, $N_1$, and $N_2$ so that Conditions 2 and 3 hold with a high probability when there is no eavesdropper. This requirement is also satisfied when the integers $N_s$, $N_0$, $N_1$, and $N_2$ are sufficiently large and the noise in the channel is sufficiently small. Indeed, it is not so difficult to realize sufficiently large $N_s$, $N_0$, $N_1$, and $N_2$ for these requirements because a universal$_2$ hash function (or an $\varepsilon$-almost dual universal$_2$ hash function) with a large size can be implemented with a small cost [49].
Since the decoy method has so many parameters, it is quite difficult to derive a tight evaluation. The proposed method might be improved by modifying several points. However, such a modification might make the protocol more complex. For example, while we treat the decoding phase error probability and the estimation error probability separately, the paper [34] treated them jointly. In order to keep the analysis simple, it is better to treat these terms separately. Further, in Section 7, we proposed to treat the probability based on the hypergeometric distribution by using the binomial distribution. If we treat the probabilities given in Section 9 with the hypergeometric distribution, we obtain a better evaluation, but our analysis becomes much harder. Therefore, we have to consider the trade-off between the complexity and the tightness of our evaluation. This kind of trade-off cannot be ignored from an industrial viewpoint. If the protocol is more complex, the cost of maintenance becomes higher. In particular, when we change the arrangement of the total system or a parameter of the system, we have to rewrite the program for calculating the sacrifice bit-length. If the protocol is simple, the change can be made easily. Otherwise, it incurs some additional cost. Hence, we have to take this trade-off into account. This paper has treated this trade-off heuristically.
However, a systematic treatment might be partially possible in the following sense. Assume that we employ Renner's formalism instead of the phase error correction formalism. If we parametrize the channel with more parameters to be estimated, the asymptotic key generation rate becomes better. One might expect that, if the number of parameters describing the model increases, we obtain a better estimation of the model. However, this is not considered to be true in statistics, because if we do not have enough data to characterize so many parameters, we obtain a larger error. In order to resolve this problem, we have to treat the trade-off between the error and the number of parameters. Such a problem is called model selection. In order to treat this problem quantitatively, we can use several information criteria, e.g., the Akaike information criterion (AIC) [43], the Takeuchi information criterion (TIC) [44], and the minimum description length principle (MDL) [45]. If we employ Renner's formalism and increase the number of channel parameters for a precise description of the channel, we need to consider this kind of trade-off. Currently, it is not known what kind of information criterion is suitable for the above trade-off.
For this purpose, when we fix an integer $k$ and define the constants That is, $\frac{n_2}{n_1 f_1^{*} + n_2}$ is the lower confidence limit $p^{-}_{\mathrm{est}}(N, k, \alpha)$ of the lower one-sided interval estimation with the confidence level $1 - \alpha$ when we observe the value $k$.
Similarly, when we fix an integer $k$ and define the constants That is, $\frac{m_2}{m_1 f_2^{*} + m_2}$ is the upper confidence limit $p^{+}_{\mathrm{est}}(N, k, \alpha)$ of the upper one-sided interval estimation with the confidence level $1 - \alpha$ when we observe the value $k$.

Appendix B.2. Application of Chernoff inequality
Assume that we observe the random variable X subject to the binomial distribution Bin(N, p) with N trials and probability p.For a fixed integer k, we have pulses among N s transmitted pulses.Then, the remaining N

• There are $N_2^{(0)}$ pulses with the vacuum state, $N_2^{(1)}$ pulses with the single-photon state, and $N_2^{(2)}$ pulses with the state $\rho_2$ among the $N_2$ $\mu_2$-intensity pulses with the phase basis. There are $N_s^{(0)}$ pulses with the vacuum state and $N_s^{(1)}$ …

Figure 2. Outline of our derivation of the sacrifice bit-length S.
Figure 3. The above graphs describe the key generation rate $R_{2,f}$ given in (61) as functions of the signal intensity $\mu_2$ when the decoy intensity $\mu_1$ is 0.1. The pink line is the case when the bit-length of raw keys $M_{s,2}$ is $10^6$. The orange line is the case with $M_{s,2} = 2 \times 10^6$. The red line is the case with $M_{s,2} = 3 \times 10^6$. The green line is the case with $M_{s,2} = 5 \times 10^6$. The purple line is the case with $M_{s,2} = 10^7$. The yellow line is the case with $M_{s,2} = 10^8$. The blue line is the asymptotic case.
Figure 4. The above graphs describe the key generation rate $R_{2,f}$ given in (61) as functions of the bit-length of raw keys $M_{s,2}$ when the signal intensity $\mu_2$ is 0.5. The orange line is the case when the decoy intensity $\mu_1$ is 0.01. The red line is the case with $\mu_1 = 0.05$. The purple line is the case with $\mu_1 = 0.1$. The yellow line is the case with $\mu_1 = 0.15$. The pink line is the case with $\mu_1 = 0.2$. The blue line is the case with $\mu_1 = 0.25$.

Figure 5. The above graphs describe the key generation rate $R_{2,f}$ given in (61) as functions of the signal intensity $\mu_2$ when the bit-length of raw keys $M_{s,2}$ is $10^7$. The orange line is the case when the decoy intensity $\mu_1$ is 0.01. The red line is the case with $\mu_1 = 0.05$. The purple line is the case with $\mu_1 = 0.1$. The yellow line is the case with $\mu_1 = 0.15$. The pink line is the case with $\mu_1 = 0.2$. The blue line is the case with $\mu_1 = 0.25$. The green line is the asymptotic case with $\mu_1 \to 0$.

Condition 3 Any element $N$ in the modified set $\Omega_1$ satisfies the conditions in the original Condition 3.

6.1.4. Modifications of sacrifice bit-length S
Next, in order to modify the sacrifice bit-length $S$, we modify Ĵ(0) , Ĵ(1) , r

= −e −µs µ s N s (1 … we obtain Theorem 2 in this adjustment.

10. Derivation of upper bound φb of leaked information
10.1. Characterizations of Conditions 2 and 3

Table 3. Detailed descriptions of the respective steps.

All graphs give the key generation rate $R_{2,f}$ when the bit-length of raw keys $M_{s,2}$ is $10^7$ and the decoy intensity $\mu_1$ is 0.1 and is smaller than the signal intensity $\mu_2$. The horizontal axis describes the signal intensity $\mu_2$. The green line is the rate $R_{2,f}$ with $t = 0\%$. The blue line is the rate $R_{2,f}$ with $t = 10\%$. The red line is the rate $R_{2,f}$ with $t = 30\%$.

All graphs give the key generation rate $R_{2,f}$ with the bit-length of raw keys $M_{s,2} = 10^6$ when the decoy intensity $\mu_1$ is 0.1 and is smaller than the signal intensity $\mu_2$. The horizontal axis describes the signal intensity $\mu_2$. The green line is the rate $R_{2,f}$ with $t = 0\%$. The blue line is the rate $R_{2,f}$ with $t = 10\%$. The red line is the rate $R_{2,f}$ with $t = 30\%$.