Quantum Capacity of a dephasing channel with memory

We show that the amount of coherent quantum information that can be reliably transmitted down a dephasing channel with memory is maximized by separable input states. In particular, we model the channel as a Markov chain or a multimode environment of oscillators. While in the first model the maximization is achieved for the maximally mixed input state, in the latter it is convenient to exploit the presence of a decoherence-protected subspace generated by memory effects. We explicitly compute the quantum channel capacity for the first model while numerical simulations suggest a lower bound for the latter. In both cases memory effects enhance the coherent information. We present results valid for arbitrary size of the input.


Introduction
Quantum communication channels [1,2] use quantum systems to transfer classical or quantum information. In the first case, we can encode classical bits by means of quantum states. In the latter case, we may want to transfer an unknown quantum state between different units of a quantum system, for instance of a quantum computer, or to distribute entanglement between communicating parties. In both cases, the fundamental question is what maximum rate of classical or quantum information can be faithfully transmitted. The classical and quantum capacities, defined as the maximum number of bits or qubits that can be reliably transmitted per channel use, provide the answer to this question.
Quantum channels with memory are the natural theoretical framework for the study of any noisy quantum communication system in which correlation times are longer than the time between consecutive uses. This scenario applies to optical fibers, which may show a birefringence fluctuating with a characteristic time longer than the separation between successive light pulses [3], and to solid-state implementations of quantum hardware, where memory effects due to low-frequency impurity noise [4] produce substantial dephasing [5].
Some theoretical results on quantum channels with memory have already been obtained for the transmission of both classical and quantum information. With regard to classical information transmission down a memory channel, it was pointed out that it can be enhanced by using entangled input states [6,7,8], and coding theorems have recently been proved for classes of quantum memory channels [9,10]. Concerning quantum capacity, a lower bound has been found for some classes of channels with memory [11], and subsequently specific model environments (structured in two parts, one responsible for memory effects and the other acting as a memoryless environment) have been studied [12,13,14]. In particular, coding theorems for quantum capacity have been proved in [14] for the so-called forgetful channels, for which memory effects decay exponentially with time.
The problem is formalized by considering the N-use Hilbert space $\mathcal{H}_N = \mathcal{H}^{\otimes N}$ and defining the system S, described by the reduced density matrix (RDM) $\rho$ for N uses. The input state is $\rho = \sum_{i=1}^{K} p_i \rho_i$; that is, states chosen from the ensemble $\{\rho_1, \dots, \rho_K\}$, with a priori probabilities $\{p_1, \dots, p_K\}$, are sent down the channel. Due to the coupling to further uncontrollable degrees of freedom, the transmission of S may be noisy. The output is therefore described by a linear, completely positive, trace-preserving (CPT) map $\mathcal{E}_N(\rho)$, corresponding to N uses (the single use is defined in $\mathcal{H}$ and described by $\mathcal{E}$). The map $\mathcal{E}_N(\rho)$ can always be represented starting from an enlarged vector space including a suitably chosen environment E, initially in a pure state $w_0$:
$$\rho' = \mathcal{E}_N(\rho) = \mathrm{Tr}_E\big[U(\rho \otimes w_0)U^\dagger\big], \tag{1}$$
where U is a suitable unitary evolution of S+E referring to N uses. The conditional (depending on $\rho$) evolution of the environment can also be considered. It is described by the environment RDM and allows us to define the conjugate CPT map, $w = \mathrm{Tr}_S[U(\rho \otimes w_0)U^\dagger] =: \tilde{\mathcal{E}}_N(\rho)$. The quantum capacity Q refers to the coherent transmission of quantum information (measured in number of qubits), and it is related to the dimension of the largest subspace of $\mathcal{H}_N$ reliably transmitted down the channel, in the limit of large N. The value of Q can be computed, for memoryless channels, as [15,16,17,18,19]
$$Q = \lim_{N\to\infty}\frac{Q_N}{N}, \qquad Q_N = \max_{\rho} I_c(\mathcal{E}_N, \rho), \qquad I_c(\mathcal{E}_N, \rho) = S[\mathcal{E}_N(\rho)] - S_e(\mathcal{E}_N, \rho). \tag{2}$$
Here $S(\rho) = -\mathrm{Tr}[\rho\log_2\rho]$ is the von Neumann entropy and $S_e(\mathcal{E}_N, \rho) = S(w)$ is the entropy exchange [20]. The quantity $I_c(\mathcal{E}_N, \rho)$ is called coherent information [21] and must be maximized over all input states $\rho$.
The limit $N \to \infty$ in (2) makes the evaluation of Q difficult. On the other hand, this regularization is necessary, since in general $I_c$ is not subadditive. Indeed, for entangled input states $\rho$ [16] we may have $I_c(\mathcal{E}_N, \rho) > \sum_{k=1}^{N} I_c(\mathcal{E}, \rho^{(k)})$, where $\rho^{(k)} = \mathrm{Tr}_{S-(k)}(\rho)$ refers to the individual transmission of the k-th unit of information; therefore in general it cannot be excluded that $Q_N/N > Q_1$. The regularization is not necessary if the final state $w$ of E can be reconstructed from the final state $\rho'$ of the system. In this case, referred to as degradable channels [22,23,24,25], there exists a CPT map T such that $\tilde{\mathcal{E}} = T \circ \mathcal{E}$. It turns out [22] that for degradable channels the coherent information $I_c(\mathcal{E}_N, \rho)$ reduces to a suitable conditional entropy [1], which is subadditive and concave in the input state $\rho$, and therefore the quantum capacity is given by the "single-letter" formula $Q = Q_1$.
In this work we focus on dephasing channels with memory. Dephasing channels are characterized by the property that, when N qubits are sent through the channel, the states of a preferential orthonormal basis $\{|j\rangle \equiv |j_1, \dots, j_N\rangle,\ j_1, \dots, j_N = 0, 1\}$ are transmitted without errors, which implies that a conservation law holds [26]. Therefore, dephasing channels are noiseless from the viewpoint of the transmission of classical information, since the states of the preferential basis can be used for encoding classical information. Of course, superpositions of basis states may decohere, thus corrupting the transmission of quantum information. Dephasing channels are relevant for systems in which relaxation is much slower than dephasing [27,4]. When memory effects are taken into account, we have $\mathcal{E}_N \neq \mathcal{E}^{\otimes N}$, i.e. the channel does not act on each carrier independently.
We show that the coherent information is maximized by input states that are separable and diagonal in the reference basis $\{|j\rangle\}$. In particular, we calculate the coherent information for two models of dephasing channels. For a Markov chain we show that the coherent information is maximized by maximally mixed input states and compute Q. For an environment modeled by a bosonic bath, we propose a coding strategy based on the existence of a decoherence-protected subspace generated by memory effects and use numerical results to suggest a lower bound for Q. It turns out that in both cases memory effects increase the coherent information.

The dephasing channel and quantum capacity
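The unitary representation of the generalized dephasing channel [22] reads
$$U\,|j\rangle|0\rangle_E = |j\rangle|\phi_j\rangle, \tag{4}$$
where $|\phi_j\rangle$ are environment states, in general not mutually orthogonal, describing the conditional evolution. The map $\mathcal{E}_N$ can be written in the Kraus representation [1,2] as
$$\mathcal{E}_N(\rho) = \sum_\alpha A_\alpha\,\rho\,A_\alpha^\dagger, \tag{5}$$
where the system operators $(A_\alpha)_{jl} = {}_E\langle\alpha|\phi_j\rangle\,\delta_{jl}$ are diagonal in the reference basis (here $\{|\alpha\rangle_E\}$ is an orthonormal basis for the environment). It is easily shown that this channel is degradable [22]. Indeed, for a generic input $\rho = \sum_{j,l}\rho_{jl}|j\rangle\langle l|$, equation (4) yields
$$\rho' = \sum_{j,l}\rho_{jl}\,\langle\phi_l|\phi_j\rangle\,|j\rangle\langle l|, \qquad w = \sum_j \rho_{jj}\,|\phi_j\rangle\langle\phi_j|.$$
Since $w$ only depends on the populations $\rho_{jj}$, which are conserved by the channel, we can equally write $\tilde{\mathcal{E}}_N = \tilde{\mathcal{E}}_N \circ \mathcal{E}_N$, thus proving degradability.
We now show that for a generalized dephasing channel the coherent information $I_c(\mathcal{E}_N, \rho)$ is maximized by input states diagonal in the reference basis. To this end we introduce
$$\rho_k = \tfrac{1}{2}\Big(\rho_{k-1} + \Sigma_z^{(k)}\rho_{k-1}\Sigma_z^{(k)}\Big), \qquad k = 1, \dots, N, \tag{7}$$
where $\rho_0 = \rho$ and the local operator $\Sigma_z^{(k)}$ acts non-trivially only on the k-th qubit, by the Pauli operator $\sigma_z^{(k)}$, which has eigenvectors $|j_k\rangle$. We can easily see that $\rho_N$ is the diagonal part of $\rho$ by using the standard representation of the N-qubit density matrix,
$$\rho = \frac{1}{2^N}\sum_{i_1,\dots,i_N} c_{i_1\dots i_N}\,\sigma_{i_1}^{(1)}\otimes\cdots\otimes\sigma_{i_N}^{(N)}, \qquad i_k \in \{0, x, y, z\},$$
where $\sigma_0 = \mathbb{1}$: the k-th step of (7) removes every term containing $\sigma_x^{(k)}$ or $\sigma_y^{(k)}$. We now study the action of the operators $\Sigma_z^{(k)}$. For any k and $\rho$, since $\Sigma_z^{(k)}$ commutes with the Kraus operators in (5), the output for the input $\Sigma_z^{(k)}\rho\,\Sigma_z^{(k)}$ is $\Sigma_z^{(k)}\mathcal{E}_N(\rho)\Sigma_z^{(k)}$, which has the same entropy as $\mathcal{E}_N(\rho)$; moreover, the populations, and hence the environment state $w$ and the entropy exchange, are the same as for $\rho$. We can therefore conclude that $I_c(\mathcal{E}_N, \Sigma_z^{(k)}\rho\,\Sigma_z^{(k)}) = I_c(\mathcal{E}_N, \rho)$. This latter relation, together with the concavity of the coherent information for degradable channels (a direct consequence of the concavity of the conditional von Neumann entropy), implies that
$$I_c(\mathcal{E}_N, \rho_N) \ge I_c(\mathcal{E}_N, \rho_{N-1}) \ge \cdots \ge I_c(\mathcal{E}_N, \rho).$$
Hence, diagonal input states maximize the coherent information. These states are separable, since they can be written in the form $\rho_N = \sum_j \rho_{jj}\,|j_1\rangle\langle j_1|\otimes\cdots\otimes|j_N\rangle\langle j_N|$ with $\rho_{jj} \ge 0$.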
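The statement that diagonal inputs maximize the coherent information can be checked by hand in the simplest case, a single use (N = 1) of a memoryless dephasing channel with Kraus operators $\sqrt{1-p_z}\,\mathbb{1}$ and $\sqrt{p_z}\,\sigma_z$. The following is a minimal sketch under that assumption; the function names `h2`, `qubit_entropy` and `coherent_information` are illustrative and do not come from the paper. It compares an input with Bloch vector (x, 0, z) to its diagonal part (0, 0, z).

```python
import math

def h2(q):
    """Binary Shannon entropy in bits, with the convention 0*log(0) = 0."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def qubit_entropy(r):
    """Entropy of a 2x2 density matrix with eigenvalues (1 +/- r)/2."""
    return h2((1 + min(r, 1.0)) / 2)

def coherent_information(x, z, p_z):
    """I_c = S(output) - S(exchange) for one use of the dephasing channel.

    The input qubit has Bloch vector (x, 0, z); p_z is the sigma_z probability.
    """
    p0 = 1 - p_z
    g = p0 - p_z                      # damping of the off-diagonal element
    s_out = qubit_entropy(math.hypot(g * x, z))
    # W_ab = Tr(A_a rho A_b^dagger) = [[p0, s*z], [s*z, p_z]], s = sqrt(p0*p_z);
    # its eigenvalues are (1 +/- r_env)/2 with r_env as below.
    r_env = math.sqrt((p0 - p_z) ** 2 + 4 * p0 * p_z * z * z)
    s_exchange = qubit_entropy(r_env)
    return s_out - s_exchange

if __name__ == "__main__":
    p_z = 0.1
    for x, z in [(0.6, 0.3), (0.9, 0.1), (0.2, -0.7)]:
        print(x, z, coherent_information(x, z, p_z),
              coherent_information(0.0, z, p_z))
```

In this special case the entropy exchange depends only on z, while removing the coherence x shortens the output Bloch vector and so raises the output entropy, which is the single-qubit content of the argument above.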

Forgetful channels
Interesting results on the quantum capacity of dephasing channels with memory can be obtained for forgetful channels, for which the memory dies out exponentially with time. Forgetfulness is defined in [14], according to a model in which the environment is structured in two parts: a memoryless one and one responsible for memory effects (see also [12]). A key feature of forgetfulness is that it permits, with a negligible error, the mapping of the memory channel into a memoryless one. This may be clarified by referring to the double-blocking strategy [14]: we consider blocks of N + L uses of the channel and do the actual coding and decoding for the first N uses, ignoring the remaining L idle uses. The resulting CPT map $\bar{\mathcal{E}}_{N+L}$ acts on density matrices $\rho$ on $\mathcal{H}^{\otimes N}$. If we consider M uses of such blocks, the corresponding CPT map $\bar{\mathcal{E}}_{M(N+L)}$ can be approximated by the memoryless setting $(\bar{\mathcal{E}}_{N+L})^{\otimes M}$. This is possible because correlations among different blocks decay during the idle uses. This property can be expressed as follows [14]:
$$\big\|\bar{\mathcal{E}}_{M(N+L)}(\rho) - (\bar{\mathcal{E}}_{N+L})^{\otimes M}(\rho)\big\|_1 \le h\,(M-1)\,c^{-L} \tag{11}$$
for any input state $\rho$ in $\mathcal{H}^{\otimes MN}$, where $c > 1$, $\|\cdot\|_1$ is the trace norm, which defines the trace distance [1], and h is some constant depending on the memory model (note that c and h are independent of the input state). This equation states that, even though the error committed by replacing the memory channel with the corresponding memoryless channel grows with the number M of blocks, it goes to zero exponentially fast with the number L of idle uses in a single block. Equation (11) permits the proof of coding theorems for forgetful quantum memory channels, by mapping them into the corresponding memoryless channels, for which quantum coding theorems hold [14]. In particular, the quantum capacity is $Q = \lim_{N\to\infty} Q_N/N$. Equation (11) by itself is a sufficient condition to prove coding theorems. Therefore, in the following we will use the wording forgetful channel for any system satisfying inequality (11), independently of the model from which memory arises.
Now we focus on two specific, physically significant models.

Markovian model
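The first model is a quantum channel that maps an arbitrary N-qubit input state $\rho$ onto
$$\rho' = \mathcal{E}_N(\rho) = \sum_{i_1,\dots,i_N} A_{i_1\dots i_N}\,\rho\,A^\dagger_{i_1\dots i_N}, \tag{12}$$
where the Kraus operators $A_{i_1\dots i_N}$ are defined in terms of the Pauli operators $\sigma_0 = \mathbb{1}$ and $\sigma_z$:
$$A_{i_1\dots i_N} = \sqrt{p_{i_1\dots i_N}}\;B_{i_1\dots i_N}, \qquad B_{i_1\dots i_N} = \sigma^{(1)}_{i_1}\otimes\cdots\otimes\sigma^{(N)}_{i_N}, \qquad i_k \in \{0, z\}, \tag{13}$$
with $\sigma^{(k)}_{i_k}$ acting on the k-th qubit ‡. The quantity $p_{i_1\dots i_N}$ can be interpreted as the probability that the ordered sequence $\sigma^{(1)}_{i_1}, \dots, \sigma^{(N)}_{i_N}$ of Pauli operators is applied to the N qubits crossing the channel. We define the single-qubit marginal probability $p_{i_q} = \sum_{\{i_k,\,k\neq q\}} p_{i_1\dots i_N}$ and similarly the two-qubit marginal probability $p_{i_{q'} i_q}$, and assume that $\{p_{i_q}\} = \{p_0, p_z\} = \{1-p_z, p_z\}$ for all $q = 1, \dots, N$. Under these conditions the maximum of the coherent information in model (12) is obtained for the totally unpolarized input state $\rho_{\rm unp} \equiv (1/2^N)\,\mathbb{1}^{\otimes N}$.
To prove this statement, we construct the same iterative transformation as in (7), but with $\Sigma_x^{(k)}$ (which acts on the k-th qubit by the Pauli operator $\sigma_x^{(k)}$) in place of $\Sigma_z^{(k)}$, and notice that $\rho_N = \rho_{\rm unp}$ is obtained starting from an input state $\rho_0$ diagonal in the reference basis. Moreover, it can be proven that in this case $I_c(\mathcal{E}_N, \Sigma_x^{(k)}\rho_{k-1}\Sigma_x^{(k)}) = I_c(\mathcal{E}_N, \rho_{k-1})$. Indeed, since $\rho_{k-1}$ is diagonal and $\mathcal{E}_N$ only changes off-diagonal matrix elements, $S[\mathcal{E}_N(\Sigma_x^{(k)}\rho_{k-1}\Sigma_x^{(k)})] = S(\rho_{k-1}) = S[\mathcal{E}_N(\rho_{k-1})]$. We can also prove that $\tilde{\mathcal{E}}_N(\Sigma_x^{(k)}\rho_{k-1}\Sigma_x^{(k)}) = \tilde\Sigma_z^{(k)}\,\tilde{\mathcal{E}}_N(\rho_{k-1})\,\tilde\Sigma_z^{(k)}$. Here $\tilde\Sigma_z$ is defined as $\Sigma_z$ but acts on the environment. Therefore, the entropy exchange is unchanged as well. Taking again advantage of the concavity of the coherent information for degradable channels, we finally obtain $I_c(\mathcal{E}_N, \rho_{\rm unp}) \ge I_c(\mathcal{E}_N, \rho_0)$.
We can explicitly compute the quantum capacity when the joint probabilities in equation (13) are described by a Markov chain [6,11]:
$$p_{i_1\dots i_N} = p_{i_1}\,p_{i_2|i_1}\cdots p_{i_N|i_{N-1}},$$
where
$$p_{i_k|i_{k-1}} = (1-\mu)\,p_{i_k} + \mu\,\delta_{i_k, i_{k-1}}. \tag{16}$$
Here $\mu \in [0, 1]$ measures the partial memory of the channel: it is the probability that the same operator (either $\mathbb{1}$ or $\sigma_z$) is applied for two consecutive uses of the channel, whereas $1-\mu$ is the probability that the two operators are uncorrelated. The limiting cases $\mu = 0$ and $\mu = 1$ correspond to memoryless channels and channels with perfect memory, respectively. In this noise model $\mu$ might depend on the time interval between two consecutive channel uses.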
If the two qubits are sent at a time interval $\tau \ll \tau_c$, where $\tau_c$ denotes the characteristic memory time scale of the environment, then the same operator is applied to both qubits ($\mu = 1$), while the opposite limit corresponds to the memoryless case ($\mu = 0$).
‡ The Kraus operators (13) define a generalized dephasing channel in the sense of equation (4), with $|\phi_j\rangle = \sum_{i_1,\dots,i_N}\sqrt{p_{i_1\dots i_N}}\,(-1)^{\sum_k j_k\delta_{i_k,z}}\,|i_1,\dots,i_N\rangle_E$.
The Markov chain model is forgetful, since condition (11) is fulfilled. We first consider a sequence of two blocks of N + L channel uses, for which
$$\rho' = \sum_I p_I\,B_I\,\rho\,B_I^\dagger,$$
where the index I stands for $i_1, \dots, i_N, i_{N+L+1}, \dots, i_{2N+L}$, $p_I$ is the corresponding marginal probability, and the operators $B_I$ are defined in equation (13). The output state $\rho'$ can be approximated by
$$\tilde\rho' = \sum_I \tilde p_I\,B_I\,\rho\,B_I^\dagger,$$
with the factorized probability distribution $\tilde p_I \equiv p_{i_1,\dots,i_N}\,p_{i_{N+L+1},\dots,i_{2N+L}}$. Taking advantage of the strong convexity of the trace distance [1], we obtain $\|\rho' - \tilde\rho'\|_1 \le 2D(p_I, \tilde p_I)$, where the Kolmogorov distance between the probability distributions $\{p_I\}$ and $\{\tilde p_I\}$ is defined as $D(p_I, \tilde p_I) = \tfrac{1}{2}\sum_I |p_I - \tilde p_I|$. Using the properties of stationary Markov chains and equation (16), we obtain $D(p_I, \tilde p_I) = 2p_0p_z\,\mu^{L+1}$. This implies $\|\rho' - \tilde\rho'\|_1 \le 4p_0p_z\,\mu^{L+1}$, from which equation (11) readily follows §.
The forgetfulness of the Markov chain model allows us to compute the quantum capacity from the regularized coherent information (2) [14]. To this end, we consider the input state $\rho_{\rm unp}$ and evaluate the coherent information $I_c(\mathcal{E}_N, \rho_{\rm unp})$. In this case $S[\mathcal{E}_N(\rho_{\rm unp})] = S(\rho_{\rm unp}) = N$. We now take advantage of the formula $(S_e)_N = S(W)$, where the density operator W has components $W_{\alpha\beta} = \mathrm{Tr}(A_\alpha\,\rho\,A_\beta^\dagger)$ [20]. Here W is diagonal and $(S_e)_N = H(X_1, \dots, X_N)$, where $H(X_1, \dots, X_N)$ is by definition the Shannon entropy of the collection of random variables $X_1, \dots, X_N$ (characterized by the joint probabilities $p_{i_1\dots i_N}$).
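The exponential decay of correlations between blocks can be checked by brute-force enumeration for small block sizes. The sketch below assumes the stationary Markov chain with transition probabilities $(1-\mu)p_i + \mu\delta_{ij}$ described above, labels the operators 0 (identity) and 1 ($\sigma_z$), and computes the unnormalized sum $\sum_I|p_I - \tilde p_I|$ (any factor 1/2 in the Kolmogorov distance is a matter of convention). The helper names `markov_joint` and `block_distance` are hypothetical.

```python
from itertools import product

def markov_joint(seq, p_z, mu):
    """Joint probability of a label sequence (0 = identity, 1 = sigma_z)
    for the stationary chain p(i|j) = (1 - mu) * p_i + mu * delta_ij."""
    p = (1 - p_z, p_z)
    prob = p[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        prob *= (1 - mu) * p[cur] + (mu if cur == prev else 0.0)
    return prob

def block_distance(n, idle, p_z, mu):
    """Sum over I of |p_I - p~_I| for two blocks of n coding uses
    separated by `idle` idle uses, obtained by full enumeration."""
    joint = {}
    for seq in product((0, 1), repeat=2 * n + idle):
        key = seq[:n] + seq[n + idle:]        # marginalize the idle uses
        joint[key] = joint.get(key, 0.0) + markov_joint(seq, p_z, mu)
    dist = 0.0
    for key, p_joint in joint.items():
        # factorized distribution: product of the two block marginals
        p_fact = markov_joint(key[:n], p_z, mu) * markov_joint(key[n:], p_z, mu)
        dist += abs(p_joint - p_fact)
    return dist

if __name__ == "__main__":
    p_z, mu = 0.2, 0.6
    for idle in range(5):
        print(idle, block_distance(2, idle, p_z, mu))
```

For this chain the enumeration decreases by a factor $\mu$ for every additional idle use, which is the behaviour needed for inequality (11) with $c = 1/\mu$.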
For a stationary Markov chain, we have [28]
$$\lim_{N\to\infty}\frac{1}{N}H(X_1, \dots, X_N) = H(X_2|X_1) = p_0H(q_0) + p_zH(q_z),$$
where $q_{0,z} \equiv (1-\mu)p_{0,z} + \mu$ are the conditional probabilities that the channel acts on two subsequent qubits via the same Pauli operator, and $H(q_0)$, $H(q_z)$ are binary Shannon entropies, defined by $H(q) = -q\log_2 q - (1-q)\log_2(1-q)$. Therefore, the quantum capacity is given by
$$Q = 1 - p_0H(q_0) - p_zH(q_z). \tag{24}$$
It is interesting to point out that Q increases with the degree of memory of the channel. In particular, for $\mu = 0$ we recover the capacity $Q = Q_1 = 1 - H(p_0)$ of the memoryless dephasing channel, while for perfect memory ($\mu = 1$) $Q = 1$, that is, the channel is asymptotically noiseless [12]. We also note that the right-hand side of (24) is known [11] to be a lower bound for the quantum capacity of the Markov chain dephasing channel. Our results prove that this bound is tight.
§ It is worth reminding the reader that the Markov chain model can also be formulated in terms of a structured environment [12,14].
In order to illustrate the convergence of $Q_N/N$ to its limiting value Q, we first compute the entropy exchange for the N-qubit input state $\rho_{\rm unp}$. It is easy to check that $(S_e)_N = (S_e)_{N-1} + p_0H(q_0) + p_zH(q_z)$. Using this recurrence relation we obtain $(S_e)_N = (S_e)_1 + (N-1)[p_0H(q_0) + p_zH(q_z)]$, where $(S_e)_1 = H(p_0)$. Therefore
$$\frac{Q_N}{N} = 1 - \frac{H(p_0)}{N} - \frac{N-1}{N}\big[p_0H(q_0) + p_zH(q_z)\big].$$
A plot of $Q_N/N$ for various N as a function of the memory factor $\mu$ is shown in figure 1. It is clear that the convergence of $Q_N/N$ is faster when the memory factor is smaller. Indeed, it is easy to prove that
$$\epsilon_N \equiv Q - \frac{Q_N}{N} = \frac{1}{N}\big[H(p_0) - p_0H(q_0) - p_zH(q_z)\big]$$
is a growing function of $\mu$, with $\epsilon_N(\mu = 0) = 0$ and $\epsilon_N(\mu = 1) = H(p_0)/N$. Moreover, for $\mu \ll 1$ we obtain $\epsilon_N \approx \mu^2/(2N\ln 2)$.
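The capacity formula (24) and the finite-N rate are simple enough to evaluate directly. The following sketch assumes the definitions $p_0 = 1 - p_z$ and $q_{0,z} = (1-\mu)p_{0,z} + \mu$ given in the text, and uses the chain rule $H(X_1,\dots,X_N) = H(X_1) + (N-1)H(X_2|X_1)$ for a stationary Markov chain; the function names are illustrative only.

```python
import math

def h2(q):
    """Binary Shannon entropy in bits, with the convention 0*log(0) = 0."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def conditional_entropy(p_z, mu):
    """H(X_2|X_1) = p_0 H(q_0) + p_z H(q_z) for the Markov chain model."""
    p0 = 1 - p_z
    q0 = (1 - mu) * p0 + mu
    qz = (1 - mu) * p_z + mu
    return p0 * h2(q0) + p_z * h2(qz)

def capacity(p_z, mu):
    """Quantum capacity of the Markov chain dephasing channel, equation (24)."""
    return 1 - conditional_entropy(p_z, mu)

def rate_finite_n(p_z, mu, n):
    """Q_N / N evaluated on the totally unpolarized input state."""
    entropy_exchange = h2(p_z) + (n - 1) * conditional_entropy(p_z, mu)
    return 1 - entropy_exchange / n

if __name__ == "__main__":
    p_z = 0.3
    for mu in (0.0, 0.25, 0.5, 0.75, 1.0):
        rates = [round(rate_finite_n(p_z, mu, n), 4) for n in (1, 2, 10, 100)]
        print(mu, round(capacity(p_z, mu), 4), rates)
```

Printing the rates for increasing N shows $Q_N/N$ approaching Q from below, with a gap that shrinks as 1/N and is largest near perfect memory.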

Spin-boson model
The second model of dephasing channel is defined by a system (qubits)-environment Hamiltonian, equation (30), which contains the free Hamiltonian $H_E$ of the environment, the coupling of the qubits to an environment operator $X_E$, and a counterterm [29]. The k-th qubit has a switchable coupling to the environment via its Pauli operator $\sigma_z^{(k)}$, controlled by a function $f_k(t)$, where $f_k(t) = 1$ when the qubit is inside the channel and $f_k(t) = 0$ otherwise. We call $\tau_p$ the time each carrier takes to cross the channel and $\tau$ the time interval that separates two consecutive qubits entering the channel. The Hamiltonian (30) is expressed in the interaction picture with respect to the qubits. If initially the system and the environment are not entangled, the state of the system at time t is given by the map (1), where
$$U(t) = T\exp\Big[-\frac{i}{\hbar}\int_0^t ds\,H(s)\Big] \tag{33}$$
is the time-ordered evolution operator. In particular, we are interested in the final state $\rho' = \rho(t = \tau_N)$, where $\tau_N = \tau_p + (N-1)\tau$ is the transit time across the channel for the N-qubit train.
To treat this problem we choose the factorized basis states $\{|j\rangle|\alpha\rangle_E\}$, where, as above, $\{|j\rangle = |j_1, \dots, j_N\rangle\}$ are the common eigenvectors of the operators $\sigma_z^{(k)}$. The dynamics preserves the qubit configuration $|j\rangle$ and therefore the evolution operator (33) is diagonal in the system indices: $\langle j|U(t)|l\rangle = \delta_{jl}\,U(t|j)$, where $U(t|j) = \langle j|U(t)|j\rangle$ expresses the conditional evolution operator of the environment alone. Therefore
$$(\rho')_{jl} = \rho_{jl}\,\mathrm{Tr}_E\big[U(t|j)\,w_0\,U^\dagger(t|l)\big]. \tag{35}$$
In this basis representation the environment only changes the off-diagonal elements of $\rho$, while populations are preserved. If the environment is initially in the pure state $w_0 \equiv |0\rangle_E\langle 0|$, then equations (4)-(5) are recovered. At any rate, it is sufficient to consider a purification of $w_0$ in an enlarged Hilbert space to write our model as a generalized dephasing channel (4). For a multimode environment of oscillators initially at thermal equilibrium, $w_0 \propto \exp(-\beta H_E)$, the off-diagonal elements can be computed explicitly, equation (36), in terms of the power spectrum $S(\omega)$ of the coupling operator $X_E$. A central question is whether, and under which conditions, a spin-boson environment gives a forgetful channel.
Even though we cannot give a rigorous proof, we conjecture on physical grounds that an exponential time decay of the bath symmetrized autocorrelation function $C(t) = \tfrac{1}{2}\langle X_E(t)X_E(0) + X_E(0)X_E(t)\rangle$ is a sufficient condition for forgetfulness. To support this conjecture, we prove inequality (11) in the particular case in which two single channel uses (N = 1) are separated by an idle time $L\tau$. We consider two qubits (M = 2 in equation (11)), prepared in a generic input state $\rho$. Then we compute the output state $\rho'$ from equation (35), that is, taking into account memory effects, and the output $\tilde\rho'$ in the memoryless limit. We obtain, for a generic monotonically decaying autocorrelation function, a bound on the distance between $\rho'$ and $\tilde\rho'$, equation (37), which involves the single-qubit dephasing factor g; this factor is such that $(\rho')_{01} = g\,\rho_{01}$ and is readily derived from (36) by letting N = 1. In particular we consider a Lorentzian power spectrum $S(\omega) = 2\tau_c/[1 + (\omega\tau_c)^2]$. In this case, the autocorrelation function is $C(\tau) = e^{-\tau/\tau_c}$ and equation (37) is replaced by inequality (38). Inequality (38) is inequality (11) in the particular case N = 1 and M = 2 (we can set $h = 4\lambda^2g^2\tau_c^2$ by noting that $(1 - e^{-\tau_p/\tau_c})^2 < 1$). We conjecture that (11) also holds for any N and M, since the correlations between blocks of N qubits decay exponentially with the delay time $L\tau$.
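The correspondence between the Lorentzian power spectrum and the exponentially decaying autocorrelation function can be verified numerically. The sketch below assumes the Fourier convention $C(t) = \int \frac{d\omega}{2\pi}\,S(\omega)\,e^{i\omega t}$, which is not stated explicitly in the text but is the one that reproduces $C(t) = e^{-t/\tau_c}$ with the normalization of $S(\omega)$ given above. The cutoff `omega_max` and the number of steps are arbitrary numerical choices, and the function names are illustrative.

```python
import math

def lorentzian(omega, tau_c):
    """Power spectrum S(omega) = 2 tau_c / [1 + (omega tau_c)^2]."""
    return 2 * tau_c / (1 + (omega * tau_c) ** 2)

def autocorrelation(t, tau_c, omega_max=400.0, steps=200_000):
    """C(t) = (1/pi) * integral_0^omega_max S(w) cos(w t) dw (trapezoidal rule).

    S is even in omega, so the full-line integral with weight 1/(2 pi)
    reduces to a half-line cosine transform with weight 1/pi.
    """
    d = omega_max / steps
    total = 0.5 * (lorentzian(0.0, tau_c)
                   + lorentzian(omega_max, tau_c) * math.cos(omega_max * t))
    for k in range(1, steps):
        w = k * d
        total += lorentzian(w, tau_c) * math.cos(w * t)
    return total * d / math.pi

if __name__ == "__main__":
    tau_c = 1.0
    for t in (0.5, 1.0, 2.0):
        print(t, autocorrelation(t, tau_c), math.exp(-t / tau_c))
```

The truncation of the frequency integral limits the accuracy at very short times, so the comparison is made at times of order $\tau_c$.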
A remarkable feature of model (30) is that in the limit of perfect memory ($\tau_c \to \infty$) there exists, for any number N of qubits, a decoherence-free subspace $\mathcal{H}_N^{(f)}$, corresponding to qubit trains with an equal number of $|0\rangle$ and $|1\rangle$ states. Since the dimension d of this subspace is such that $\log_2 d \approx N - \tfrac{1}{2}\log_2 N$ at large N, the channel is asymptotically noiseless, that is, Q = 1. A coding strategy naturally appears when blocks of $\bar N \gg 1$ qubits can be sent within the memory time scale $\tau_c$: if the quantum information is encoded in the decoherence-protected subspace $\mathcal{H}_{\bar N}^{(f)}$ in such a way that the input state $\rho$ is maximally mixed within this subspace, then a lower bound for the coherent information per channel use can be estimated as $\log_2 d/\bar N$, with d the dimension of $\mathcal{H}_{\bar N}^{(f)}$. The memoryless dephasing channel is instead recovered in the limit $\tau_c \to 0$; in this case the coherent information is maximized by the totally unpolarized input state $\rho_{\rm unp}$ and the channel capacity is $Q = Q_1 = 1 - H(p_0)$, where $p_0 = (1 + g)/2$.
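The asymptotic estimate of the subspace dimension can be illustrated with a few lines of arithmetic. The sketch below assumes that the subspace with equally many $|0\rangle$ and $|1\rangle$ states has dimension given by the central binomial coefficient $\binom{N}{N/2}$ (N even) and reports the rate $\log_2 d / N$; the function names are hypothetical.

```python
import math

def dfs_log2_dimension(n):
    """log2 of the number of n-qubit basis states with n/2 zeros and n/2 ones."""
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    return math.log2(math.comb(n, n // 2))

def dfs_rate(n):
    """Qubits per channel use carried by the protected subspace."""
    return dfs_log2_dimension(n) / n

if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        estimate = n - 0.5 * math.log2(n)
        print(n, dfs_log2_dimension(n), estimate, dfs_rate(n))
```

The exact value and the estimate $N - \tfrac{1}{2}\log_2 N$ differ by a constant of order one (Stirling's formula gives $\tfrac{1}{2}\log_2(\pi/2)$), so the rate tends to 1 as N grows.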
Even though we could not compute the channel capacity for generic values of $\tau$, $\tau_p$, and $\tau_c$, we show in figure 2 numerical results for the coherent information $I_c$ for a Lorentzian power spectrum $S(\omega)$ and for the input state $\rho_{\rm unp}$, as a function of the degree of memory of the channel, measured by the parameter $\xi \equiv \tau_c/(\tau + \tau_c)$. We fix $\tau_c$ and $\tau_p$ and vary $\tau$, so that the memoryless and perfect memory limits correspond to $\xi \to 0$ ($\tau \to \infty$) and $\xi \to 1$ ($\tau \to 0$), respectively. The curves in figure 2 show that memory effects enhance the coherent information $I_c/N$ and that $I_c/N$ grows monotonically with N. Furthermore, these numerical data strongly suggest that $I_c/N$ converges, for $N \to \infty$, to a limiting value larger than the memoryless capacity $Q_1$. This value would provide, assuming the forgetfulness conjectured above for the model, a lower bound for the quantum capacity. Therefore, using the previously mentioned double-blocking strategy, it is possible to increase the transmission rate if the quantum information is encoded in arbitrarily long blocks, separated by time intervals larger than $\tau_c$.

Conclusion
In summary, we have shown that the coherent information in a dephasing channel with memory is maximized by separable input states, computed the quantum capacity Q for a Markov chain noise model and suggested a numerical lower bound for Q in the case of a bosonic bath where memory effects decay exponentially with time. These results also rely on the concept of forgetfulness, which we prove for the first model and strongly support on physical grounds for the second one. It would be relevant to further clarify the connection between the decay of environment autocorrelation functions and forgetfulness. It is important to point out that differently from previous works on quantum memory channels [6], we have carried out the limit in which the number of channel uses N → ∞. It would be interesting to investigate to what extent the results presented in this work could be applied to other physically relevant degradable noise models such as the amplitude damping channel [30]. Another physically relevant question is whether our results could be generalized to environments with algebraically decaying memory effects, which may model typical low-frequency noise in the solid state.
Note: After completion of our work we became aware of a related paper [31], in which, in particular, the quantum capacity of a Markov chain dephasing channel is provided. Their derivation, not reported in that paper, is based on a method different from ours (S. Virmani and M. Plenio, private communication).