Quasi-inversion of quantum and classical channels in finite dimensions

We introduce the concept of the quasi-inverse of quantum and classical channels, prove general properties of these inverses and determine them for a large class of channels acting in an arbitrary finite dimension. We thereby extend the previous results of [1] to channels of arbitrary dimension and to the classical domain. We demonstrate how application of the proposed scheme can increase, on average, the fidelity between a given random pure state and its image transformed by the quantum channel followed by its quasi-inverse.


Introduction
It is generally understood that quantum resources offer significant improvements over classical ones in most information processing tasks [2]. However, these resources are usually fragile under the noise caused by inevitable interactions with the environment, which may drastically neutralize the quantum advantage mentioned above. In other words, an open quantum system usually interacts with an environment, so its dynamics cannot be described by a unitary evolution, $\rho \to U\rho U^\dagger$. This unitary dynamics is an idealization which almost never occurs in reality. There are always inevitable and unknown couplings with the environment which destroy the coherence, decrease the purity of the state, and deteriorate the information encoded into a quantum system [3,4]. One of the central results in quantum theory is that a general non-unitary dynamics of an open quantum system can be characterized by operators acting entirely within the quantum system [5,6]. The latter dynamics has long been studied and by now there is an extensive literature on the subject. The simplest way to describe a non-unitary dynamics is to use the Kraus form of a channel acting on a density matrix $\rho$ of order $d$,
$$\mathcal{E}(\rho) = \sum_{\alpha=1}^{r} K_\alpha \rho K_\alpha^\dagger, \tag{1}$$
which can be interpreted as a generalization of the unitary dynamics. The standard unitarity condition $U^\dagger U = I$ is replaced here by the identity resolution $\sum_{\alpha=1}^{r} K_\alpha^\dagger K_\alpha = I$. Here $K_\alpha$ denotes a Kraus operator of size $d$. The number of these operators, $r$, may vary, but it is always possible to find representations with $r \le d^2$. Any map of this class is completely positive and trace-preserving and is called a quantum operation or a quantum channel [5,6]. It captures the effect of errors (noise, decoherence and dissipation) in a quantum system caused by interaction with the environment.
It is easy to see that any unitary evolution can be explicitly inverted by replacing $U$ with $U^{-1}$, so one can recover the original state by running the dynamics backward. Even if we set aside the practical aim of reducing the effect of noise and errors, for which error correcting codes and other methods [7-16] exist, it is a natural mathematical question whether a general quantum channel can be inverted too [17,18]. It is, however, an established fact that a quantum channel can be exactly inverted only if it is a unitary transformation.
By inversion of a given quantum channel we mean here the use of another quantum channel, i.e. a map which is physically realizable. One may therefore ask whether a quantum channel can be quasi-inverted, in the sense that another quantum channel $\mathcal{E}^{qi}$ exists such that $\mathcal{E}^{qi} \circ \mathcal{E}$ is as close to the identity map as possible. This was the approach taken in [1], where the case of qubit channels was studied in detail. As qubit channels are completely characterized and classified in [19,20], the authors of [1] managed to find the quasi-inverse of all qubit channels. It was shown that the quasi-inverse of every qubit channel, except for a set of measure zero, is nothing but a suitably defined unitary map [1]. An explicit formula for this unitary map was also derived.
A related issue was considered earlier by Koenig et al [21], who analyzed the following problem: given a bipartite quantum state $\rho_{AB}$, one wishes to convert it as closely as possible to a maximally entangled state by applying a quantum channel only on the system $B$. Due to the Jamiołkowski isomorphism, any quantum channel acting on a $d$-dimensional system can be treated as a bipartite state $\rho_{AB}$ of size $d^2$; see [22]. Thus the maximal fidelity optimized over all possible channels can be related to the conditional min-entropy of the state $\rho_{AB}$. A similar approach was later used by Chiribella and Ebler [23], who demonstrated that an unknown unitary channel can be optimally inverted with an average fidelity of $1/d^2$.
Extending the results obtained in [1] for single-qubit channels to higher dimensions is not straightforward, as very little is known about the structure of the set of channels acting in dimensions $d \ge 3$. In addition to the quartic increase of the number of parameters with the dimension $d$, which defies any kind of geometrical picture for these channels, certain important and simplifying theorems which hold for qubit channels do not extend to the higher dimensional case. For instance, for $d = 2$ any unital channel, i.e. a channel which leaves the maximally mixed state invariant, $\mathcal{E}(I) = I$, belongs to the class of mixed unitary channels, so it can be represented as a mixture of unitary operations [24]. This property does not hold for general $d$ [25,26], and already for $d \ge 3$ there exists the unital channel of Landau and Streater (LS), which is not mixed unitary [24]; see the recent study [27] for further information on this map. This is also related to another important difference, which concerns the extreme points of a convex set, i.e. those points which cannot be written as a convex combination of other points of the set. While the extreme points of the convex set of unital qubit channels are unitary maps, this is no longer the case for higher dimensional channels.
All these properties make the study of the quasi-inverse of quantum channels a non-trivial task. Nevertheless, we obtain here some general results on higher dimensional channels and their quasi-inverses, and substantiate these results by several examples of general families of $d$-dimensional channels. We provide upper and lower bounds for the performance of a channel after it is compensated by its quasi-inverse, i.e. $\mathcal{E}^{qi} \circ \mathcal{E}$. In particular, we show that the quasi-inverse of a channel of an arbitrary dimension $d$ need not be a unitary map. We also study a class of self-quasi-inverse quantum channels, including the interesting case of LS [24].
In the second part of this work we study an analogous problem posed for classical channels, namely for stochastic matrices which map the set of probability vectors into itself. This question, left open in [1], is of special interest in view of the correspondence between quantum channels and their classical counterparts [28]. We show that several results originally formulated for the quantum domain find their natural parallels in the classical scenario.
The structure of this paper is as follows: in section 2, we collect the preliminary ingredients, in section 3 we define the quasi-inverse channel whose general properties and examples are respectively explored in sections 4 and 5. These studies are extended to the classical domain in sections 6 and 7. We conclude the paper with a discussion. Some of the detailed calculations and proofs are collected in the appendices.

Preliminaries
Let $\mathcal{H}_d$ be a complex $d$-dimensional Hilbert space for which we choose the computational basis $\{|m\rangle, m = 1, \ldots, d\}$. Let $L(\mathcal{H}_d)$ be the space of linear operators on this Hilbert space. The state of a $d$-dimensional quantum system (a qudit) is described by a density matrix $\rho$, which is a positive operator of unit trace acting on this Hilbert space. The space of all density matrices is denoted by $D(\mathcal{H}_d)$. This is a convex subset of $L(\mathcal{H}_d)$. Any point of this convex set is described by $d^2 - 1$ real parameters. Any linear operator $A \in L(\mathcal{H}_d)$ can be uniquely mapped to a vector $|A\rangle \in \mathcal{H}_d \otimes \mathcal{H}_d$ in the form
$$|A\rangle = \sqrt{d}\,(A \otimes I)|\phi^+\rangle, \tag{2}$$
where $|\phi^+\rangle = \frac{1}{\sqrt{d}}\sum_i |i,i\rangle$ is a maximally entangled state. This is called vectorization of a matrix $A = \sum_{i,j} A_{i,j}|i\rangle\langle j|$ into a vector $|A\rangle = \sum_{i,j} A_{i,j}|i,j\rangle$, with the correspondence between the inner products:
$$\langle A|B\rangle = \mathrm{Tr}(A^\dagger B). \tag{3}$$
Taking $d^2 - 1$ traceless and Hermitian matrices $\Gamma_i$ along with the identity $I$ as a complete basis, one can write any density matrix as
$$\rho = \frac{1}{d}\Big(I + \sum_{i=1}^{d^2-1} x_i \Gamma_i\Big). \tag{4}$$
We normalize the $\Gamma_i$ matrices to satisfy
$$\mathrm{Tr}(\Gamma_i \Gamma_j) = d(d-1)\,\delta_{ij}; \tag{5}$$
a concrete choice for this basis is given in appendix A. We will then have
$$\mathrm{Tr}(\rho^2) = \frac{1}{d}\big(1 + (d-1)\,\mathbf{r}\cdot\mathbf{r}\big). \tag{6}$$
The $(d^2-1)$-dimensional vector $\mathbf{r} = (x_1, x_2, \ldots, x_{d^2-1})^T$ is a real vector called the generalized Bloch vector. In contrast to the qubit case $d = 2$, the convex set of physical states (positive matrices with unit trace) is no longer a unit ball. In fact the geometry of this convex set can be quite complicated and only partial facts are known about low dimensions, e.g. for $d = 3$ [22,29]. The set of pure states $\rho = |\psi\rangle\langle\psi|$, where $|\psi\rangle = \sum_{i=1}^{d} \psi_i |i\rangle$, is a sphere of dimension $2d - 2$ which we denote by $S^{2d-2}$. On the other hand, any pure state has the property $\mathrm{Tr}(\rho^2) = 1$, which in view of equations (5) and (6) is equivalent to $\mathbf{r}\cdot\mathbf{r} = 1$. This is a sphere $S^{d^2-2}$, which for $d > 2$ has a higher dimension than the set of pure states. Hence the set of pure states is a subset of this larger sphere. This larger sphere contains other points which are not states at all; see appendix A for concrete examples.
The necessary and sufficient condition for the Bloch vector to describe a pure quantum state can be found in [30]. The necessary condition for the Bloch vector to produce a general mixed state is provided in [31].
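The vectorization map and its inner-product correspondence can be checked numerically. The following is a minimal sketch, assuming a row-major convention $|A\rangle = \sum_{i,j} A_{i,j}|i,j\rangle$; the helper name `vec` and the random test matrices are illustrative choices, not part of the original text.

```python
import numpy as np

def vec(A):
    """Row-major vectorization |A> = sum_ij A_ij |i,j> (hypothetical helper)."""
    return np.asarray(A, dtype=complex).reshape(-1)

d = 3
rng = np.random.default_rng(0)
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# Maximally entangled state |phi+> = (1/sqrt(d)) sum_i |i,i>
phi_plus = vec(np.eye(d)) / np.sqrt(d)

# Check |A> = sqrt(d) (A x I)|phi+>
vec_from_phi = np.sqrt(d) * np.kron(A, np.eye(d)) @ phi_plus
vec_ok = np.allclose(vec_from_phi, vec(A))

# Check <A|B> = Tr(A^dagger B); np.vdot conjugates its first argument
inner_ok = np.isclose(np.vdot(vec(A), vec(B)), np.trace(A.conj().T @ B))

print(vec_ok, inner_ok)
```

With this convention, NumPy's default `reshape` ordering coincides with the index ordering $|i,j\rangle$, so no explicit index bookkeeping is needed.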
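Consider now a quantum channel $\mathcal{E}$ represented by its Kraus operators $K_\alpha$ acting on states of dimension $d$ through equation (1). The correspondence (2) leads to the following representation of channels by matrices acting on the vectorized form of density matrices,
$$|\mathcal{E}(\rho)\rangle = \Phi_{\mathcal{E}}|\rho\rangle, \tag{7}$$
where
$$\Phi_{\mathcal{E}} = \sum_\alpha K_\alpha \otimes K_\alpha^* \tag{8}$$
is called the superoperator of the map $\mathcal{E}$ [22], in which $*$ denotes complex conjugation. Although a quantum channel has many different sets of Kraus operators, connected by $L_\alpha = \sum_\beta U_{\alpha,\beta} K_\beta$, where $U$ is a unitary matrix, it is straightforward to see that $\Phi_{\mathcal{E}}$ is unique. It also has the property that
The quantum channel $\mathcal{E}$ induces an affine transformation on the generalized Bloch vector $\mathbf{r}$,
$$\mathbf{r} \to \mathbf{r}' = M\mathbf{r} + \mathbf{t},$$
where
$$M_{ij} = \frac{1}{d(d-1)}\mathrm{Tr}\big(\Gamma_i\,\mathcal{E}(\Gamma_j)\big), \qquad t_i = \frac{1}{d(d-1)}\mathrm{Tr}\big(\Gamma_i\,\mathcal{E}(I)\big).$$
The matrix $M$ of order $d^2 - 1$ is called the distortion matrix. Writing the basis $\{I, \Gamma_i\}$ in vectorized form $\{|I\rangle, |\Gamma_i\rangle\}$, which are vectors of dimension $d^2$, we have the normalization condition
$$\langle I|I\rangle = d, \qquad \langle I|\Gamma_i\rangle = 0, \qquad \langle\Gamma_i|\Gamma_j\rangle = d(d-1)\,\delta_{ij}.$$
The $d^2$-dimensional vector $|\rho\rangle$ can be written in terms of the normalized basis vectors $|\tilde{I}\rangle = |I\rangle/\sqrt{d}$ and $|\tilde{\Gamma}_i\rangle = |\Gamma_i\rangle/\sqrt{d(d-1)}$ as
$$|\rho\rangle = \frac{1}{\sqrt{d}}\begin{pmatrix} 1 \\ \sqrt{d-1}\,\mathbf{r} \end{pmatrix},$$
where the first component is the coefficient of $|\tilde{I}\rangle$ and the second component encapsulates the coefficients of $|\tilde{\Gamma}_i\rangle$ as a vector. The quantum channel $\mathcal{E}$ turns this into the vector
$$|\mathcal{E}(\rho)\rangle = \frac{1}{\sqrt{d}}\begin{pmatrix} 1 \\ \sqrt{d-1}\,(M\mathbf{r} + \mathbf{t}) \end{pmatrix}.$$
This means that in this basis the superoperator can be written as
$$\Phi_{\mathcal{E}} = \begin{pmatrix} 1 & 0 \\ \sqrt{d-1}\,\mathbf{t} & M \end{pmatrix}. \tag{14}$$
In the case of unital channels the translation vector vanishes, $\mathbf{t} = 0$. If the entire matrix $M$ also vanishes, the corresponding map $\Phi_*$ represents the completely depolarizing channel, which sends any state $\rho$ into the maximally mixed state, $\mathcal{E}_*(\rho) = I/d$. The above form of the map $\Phi$, which represents the evolution of the Bloch vector $\mathbf{r}$, is also called its Liouville representation [32]. On the other hand, it can be interpreted as the Fano form [33] of the bipartite state representing the map in the Choi-Jamiołkowski isomorphism. Such a state is proportional to the Choi matrix $C_{\mathcal{E}}$, which is a positive operator with $\mathrm{Tr}(C_{\mathcal{E}}) = d$. It is related to the superoperator of the channel through the simple relation [22]
$$C_{\mathcal{E}} = \Phi_{\mathcal{E}}^{R}, \tag{15}$$
where $A^R$ denotes the following reshuffling of the entries of a matrix $A$ of order $d^2$: $\langle i,j|A^R|k,l\rangle = \langle i,k|A|j,l\rangle$.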
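To make the relation between Kraus operators, the superoperator and the Choi matrix concrete, here is a minimal numerical sketch. It assumes the row-major vectorization used above, the form $\Phi = \sum_\alpha K_\alpha \otimes K_\alpha^*$, and the reshuffling convention $\langle i,j|A^R|k,l\rangle = \langle i,k|A|j,l\rangle$; the helper names `random_kraus`, `superoperator` and `reshuffle` are hypothetical.

```python
import numpy as np

def random_kraus(d, r, rng):
    """Hypothetical helper: r random Kraus operators with sum_a K_a^dagger K_a = I."""
    G = rng.normal(size=(r, d, d)) + 1j * rng.normal(size=(r, d, d))
    S = sum(g.conj().T @ g for g in G)            # positive definite almost surely
    vals, vecs = np.linalg.eigh(S)
    S_inv_sqrt = (vecs / np.sqrt(vals)) @ vecs.conj().T
    return [g @ S_inv_sqrt for g in G]

def superoperator(kraus):
    """Phi = sum_a K_a (x) conj(K_a), acting on row-major vectorized states."""
    return sum(np.kron(K, K.conj()) for K in kraus)

def reshuffle(A, d):
    """Reshuffling <i,j|A^R|k,l> = <i,k|A|j,l> (assumed convention)."""
    return A.reshape(d, d, d, d).transpose(0, 2, 1, 3).reshape(d * d, d * d)

d, r = 3, 4
rng = np.random.default_rng(1)
kraus = random_kraus(d, r, rng)
Phi = superoperator(kraus)
C = reshuffle(Phi, d)                              # Choi matrix

# A random density matrix to test the action of Phi
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = X @ X.conj().T
rho = rho / np.trace(rho)

out_direct = sum(K @ rho @ K.conj().T for K in kraus)
acts_ok = np.allclose(Phi @ rho.reshape(-1), out_direct.reshape(-1))
choi_hermitian = np.allclose(C, C.conj().T)
choi_min_eig = np.linalg.eigvalsh(C).min()         # non-negative for a CP map
choi_trace = np.trace(C).real                      # equals d for a TP map
```

The reshuffling is a pure index permutation, so it is implemented with `reshape` and `transpose` rather than an explicit loop.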

Average fidelity of a channel
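The performance of a quantum channel $\mathcal{E}$ can be studied in different ways. Among these we choose the input-output fidelity averaged over a uniform distribution of pure states, $F(\mathcal{E})$, and the entanglement fidelity $F_e(\mathcal{E})$ [34]. These are respectively defined as follows:
$$F(\mathcal{E}) = \int d\psi\, \langle\psi|\mathcal{E}(|\psi\rangle\langle\psi|)|\psi\rangle, \tag{18}$$
where $\int d\psi = 1$ and $d\psi = d(U\psi)$ for any unitary $U$, and
$$F_e(\mathcal{E}) = \langle\phi^+|(\mathcal{E}\otimes \mathrm{id})(|\phi^+\rangle\langle\phi^+|)|\phi^+\rangle. \tag{19}$$
The average fidelity measures how similar the input and output states are, and the entanglement fidelity measures how much a maximally entangled state is affected if the channel $\mathcal{E}$ acts on one part of this state. As we will see, the two quantities are related to each other in a simple way. Below we calculate these quantities by different methods, each of which sheds light on them from a different angle. First consider the average fidelity, which can be written as
$$F(\mathcal{E}) = \mathrm{Tr}(L\,\Phi_{\mathcal{E}}), \tag{20}$$
where
$$L = \int d\psi\, |\psi\rangle\langle\psi| \otimes |\psi^*\rangle\langle\psi^*| \tag{21}$$
is the isotropic state and $\Phi_{\mathcal{E}}$ is defined in equation (8). Since $L$ is an isotropic operator in the sense that $[L, U\otimes U^*] = 0\ \forall U$, it can be written in the form [35]
$$L = \frac{1}{d(d+1)}\big(I\otimes I + d\,|\phi^+\rangle\langle\phi^+|\big). \tag{22}$$
Inserting (22) in (20) and using (16) and (17), one finds the following formula for the average fidelity:
$$F(\mathcal{E}) = \frac{d + \mathrm{Tr}(\Phi_{\mathcal{E}})}{d(d+1)}. \tag{23}$$
More details on these calculations are given in appendix B. From (8) and (14) we see that this can also be written as
$$F(\mathcal{E}) = \frac{d + \sum_\alpha |\mathrm{Tr}\,K_\alpha|^2}{d(d+1)} = \frac{d + 1 + \mathrm{Tr}\,M}{d(d+1)}.$$
Entanglement fidelity is derived in a simpler way, e.g. by direct expansion of the state $|\phi^+\rangle = \frac{1}{\sqrt{d}}\sum_i |i,i\rangle$, and one finds
$$F_e(\mathcal{E}) = \frac{\mathrm{Tr}(\Phi_{\mathcal{E}})}{d^2}. \tag{24}$$
Therefore, we find the simple relation [36]
$$F(\mathcal{E}) = \frac{d\,F_e(\mathcal{E}) + 1}{d+1}, \tag{25}$$
showing that the quantities (18) and (19) are monotone with respect to each other.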
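As an illustration of the two fidelities, the sketch below evaluates them from a Kraus representation, assuming the standard closed forms $F = (d + \mathrm{Tr}\,\Phi)/(d(d+1))$ and $F_e = \mathrm{Tr}\,\Phi/d^2$ with $\mathrm{Tr}\,\Phi = \sum_\alpha |\mathrm{Tr}\,K_\alpha|^2$. The test channel is a depolarizing channel built from Weyl (clock-and-shift) unitaries; this construction and all function names are illustrative assumptions rather than the authors' procedure.

```python
import numpy as np

def weyl_operators(d):
    """The d^2 Weyl (clock-and-shift) unitaries X^k Z^l; index 0 is the identity."""
    omega = np.exp(2j * np.pi / d)
    X = np.roll(np.eye(d), 1, axis=0)              # X|j> = |j+1 mod d>
    Z = np.diag(omega ** np.arange(d))
    return [np.linalg.matrix_power(X, k) @ np.linalg.matrix_power(Z, l)
            for k in range(d) for l in range(d)]

def trace_superoperator(kraus):
    """Tr(Phi) = sum_a |Tr K_a|^2 for Phi = sum_a K_a (x) conj(K_a)."""
    return sum(abs(np.trace(K)) ** 2 for K in kraus)

def average_fidelity(kraus, d):
    return (d + trace_superoperator(kraus)) / (d * (d + 1))

def entanglement_fidelity(kraus, d):
    return trace_superoperator(kraus) / d**2

d, q = 3, 0.3
W = weyl_operators(d)
# rho -> (1-q) rho + q Tr(rho) I/d, using (1/d^2) sum_W W rho W^dagger = Tr(rho) I/d
kraus = [np.sqrt(1 - q + q / d**2) * W[0]] + [np.sqrt(q) / d * w for w in W[1:]]

F_avg = average_fidelity(kraus, d)
F_ent = entanglement_fidelity(kraus, d)

# For this channel the fidelity is the same for every pure input state
rng = np.random.default_rng(3)
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi = psi / np.linalg.norm(psi)
rho_out = sum(K @ np.outer(psi, psi.conj()) @ K.conj().T for K in kraus)
f_psi = np.vdot(psi, rho_out @ psi).real
```

Because the depolarizing channel treats all pure states alike, the single-state fidelity `f_psi` can be compared directly with the closed-form average.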

Definition of the quasi-inverse channel
In the quantum communication context, it is usually the case that Alice (the information source) generates a message state $\rho_M$ from a distribution corresponding to her language alphabet, which is known to Bob (the receiver). Let us denote by $\mu$ this probability measure over $D(\mathcal{H}_d)$, the set of all quantum states. The state is then fed into a quantum channel $\mathcal{E}$ to reach Bob. The received state is denoted by $\rho_R$. Assuming that the characteristics of the channel are well known to Bob, and before performing any quantum error correction, he may want to pass the received state through some other channel in order to process it into a state more similar to the one Alice has sent. This latter channel clearly cannot depend on $\rho_M$, since Bob does not know it. It depends only on the channel $\mathcal{E}$ and the probability measure $\mu$. The second channel is applied to invert the action of $\mathcal{E}$ as far as possible; it is therefore natural to call it the quasi-inverse of $\mathcal{E}$ [1], and it is expected that this channel performs the inversion in the best possible way, given the constraint of being a completely positive trace-preserving (CPT) map. Therefore, we have the formal definition of the quasi-inverse.
Definition 1. The quasi-inverse of a channel $\mathcal{E}$ is defined as a CPT map $\mathcal{E}^{qi}$ fulfilling the following condition [1]:
$$F(\mathcal{E}^{qi}\circ\mathcal{E}) \ge F(\mathcal{E}'\circ\mathcal{E}) \quad \text{for every CPT map } \mathcal{E}', \tag{26}$$
where $F$ is the average of a proper fidelity function between the output state and the input pure state.

Remark 1.
As we will see in the sequel, the quasi-inverse of a quantum channel is unique except when the channel falls on a set of measure zero. In the simplest case of unital qubit channels, in which every channel is unitarily equivalent to a Pauli channel, $\mathcal{E}(\rho) = \sum_{i=0}^{3} p_i \sigma_i \rho \sigma_i$, only the channels with two or more equal weights $p_i$ have non-unique quasi-inverses.
We now prove our first theorem, whose validity depends neither on the specific form of the fidelity function nor on the restriction to pure input states; rather, it depends on two very general properties of the fidelity measure, as described in the proof.

Theorem 1.
For any proper similarity function F, probability measure μ and quantum channel E, a quasi-inverse E qi can be found on the boundary of the set of allowed channels.

Proof.
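A proper similarity measure $F$ should satisfy the following two conditions: (a) the state most similar to a state $\rho$ is the state $\rho$ itself,
$$F(\rho,\rho) \ge F(\rho,\sigma) \quad \forall\sigma, \tag{27}$$
and (b) concavity,
$$F\big(\rho,\, p\sigma_1 + (1-p)\sigma_2\big) \ge p\,F(\rho,\sigma_1) + (1-p)\,F(\rho,\sigma_2), \qquad 0 \le p \le 1. \tag{28}$$
One such measure is the well-known fidelity, but the following argument is independent of the particular form of this measure. We proceed with a proof by contradiction. Let us denote the set of all possible quantum channels on $\mathcal{H}_d$ by $C_d$. This is a convex set. Assume that the quasi-inverse of the channel $\mathcal{E} \equiv (M, \mathbf{t})$, defined by definition 1, is a quantum channel $\mathcal{E}'$ in the interior of $C_d$, figure 1. Now take the inverse of the affine map $(M, \mathbf{t})$, which is given by $(M^{-1}, -M^{-1}\mathbf{t})$. This affine map may not correspond to a legitimate quantum channel and may be outside $C_d$. Denote it by $\mathcal{E}^{-1}$. However, since $\mathcal{E}'$ is an interior point, for small enough $\varepsilon > 0$ the map $(1-\varepsilon)\mathcal{E}' + \varepsilon\mathcal{E}^{-1}$ is a CPT map in $C_d$. We now note that this channel performs at least as well as $\mathcal{E}'$ in quasi-inverting the channel $\mathcal{E}$, since using (27) and (28) we find, for every input state $\rho$,
$$F\big(\rho,\, [(1-\varepsilon)\mathcal{E}' + \varepsilon\mathcal{E}^{-1}]\circ\mathcal{E}(\rho)\big) \ge (1-\varepsilon)\,F\big(\rho,\, \mathcal{E}'\circ\mathcal{E}(\rho)\big) + \varepsilon\,F(\rho,\rho) \ge F\big(\rho,\, \mathcal{E}'\circ\mathcal{E}(\rho)\big).$$
This means we have found a linear path along which the average fidelity increases or stays constant as we go toward the boundary of $C_d$. Clearly, a quasi-inverse can be found at the end of such a path and hence on the boundary of $C_d$.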
To complete the proof, we need to consider the case of singular channels. This corresponds to $\det M = 0$, and the set of such channels therefore has unit co-dimension. It means that for any singular channel $\mathcal{E}$ we may come up with a sequence of nonsingular channels $\mathcal{E}_n \to \mathcal{E}$. So far we have proved the statement, in terms of the average $\langle F\rangle$ over $\mu$, for every nonsingular channel $\mathcal{E}_n$. Using the continuity of $F$, implied by its concavity, the proposition is proved for the singular channels as well. Note that in this proof we have not assumed any particular form for the fidelity measure, except the two natural properties (27) and (28), nor have we assumed the average fidelity to be defined only for pure states.
Hereafter we assume the standard fidelity measure, $F(\rho,\rho') = \big(\mathrm{Tr}\sqrt{\sqrt{\rho}\,\rho'\sqrt{\rho}}\big)^2$, of Uhlmann and Jozsa [37,38], and use the average input-output fidelity (18) for the performance of the channel. In such a case the average value can be compared with the mean fidelity between two random quantum states averaged over the set of all mixed states with an appropriate measure [39]. Under these assumptions, the quasi-inverse is defined as the channel $\mathcal{E}'$ maximizing $F(\mathcal{E}'\circ\mathcal{E})$. In view of equation (23), the practical method for finding the quasi-inverse is through one of the following maximization problems:
$$\mathcal{E}^{qi} = \arg\max_{\mathcal{E}'}\, \mathrm{Tr}(\Phi_{\mathcal{E}'}\Phi_{\mathcal{E}}) = \arg\max_{\mathcal{E}'}\, \mathrm{Tr}(M'M). \tag{29}$$
These equations immediately imply that for a given channel, the left and right quasi-inverses are the same, i.e. $F(\mathcal{E}^{qi}\circ\mathcal{E}) = F(\mathcal{E}\circ\mathcal{E}^{qi})$. This is an interesting fact with a practical benefit: either Bob can apply the quasi-inverse after receiving the state, or Alice can apply it before sending the state to Bob. Furthermore, note that for $\mathcal{E}$ and $\mathcal{E}'$ denoted by $(M, \mathbf{t})$ and $(M', \mathbf{t}')$, respectively, their concatenation $\mathcal{E}'\circ\mathcal{E}$ is represented by $(M'M, M'\mathbf{t} + \mathbf{t}')$. This implies that the translation vector $\mathbf{t}$, which determines the non-unitality of the channel, plays no direct role in the fidelity or in the fidelity after correction. However, it affects the range of the allowed values of the distortion matrix elements. This is one of the features of the quasi-inverse for qubit channels [1] which survives in higher dimensions.
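The composition rule for affine maps and the equality of left and right corrections can be illustrated with a few lines of linear algebra. This is a sketch under the assumption that the average fidelity depends on the channel only through $\mathrm{Tr}\,\Phi = 1 + \mathrm{Tr}\,M$; the pairs $(M, \mathbf{t})$ below are random illustrative affine maps and are not checked for complete positivity.

```python
import numpy as np

def compose(second, first):
    """Affine composition: (M', t') applied after (M, t) gives (M'M, M't + t')."""
    M2, t2 = second
    M1, t1 = first
    return M2 @ M1, M2 @ t1 + t2

def avg_fidelity_from_M(M, d):
    """F = (d + 1 + Tr M) / (d (d + 1)), assuming Tr(Phi) = 1 + Tr(M)."""
    return (d + 1 + np.trace(M)) / (d * (d + 1))

d = 3
n = d * d - 1
rng = np.random.default_rng(2)
# Illustrative affine maps only; NOT verified to be valid quantum channels
E = (0.3 * rng.normal(size=(n, n)), 0.1 * rng.normal(size=n))
Ep = (0.3 * rng.normal(size=(n, n)), 0.1 * rng.normal(size=n))

# The composed pair reproduces the step-by-step action on a Bloch-like vector
r = rng.normal(size=n)
M_c, t_c = compose(Ep, E)
step_by_step = Ep[0] @ (E[0] @ r + E[1]) + Ep[1]
compose_ok = np.allclose(M_c @ r + t_c, step_by_step)

# Left and right compositions share the same trace, hence the same fidelity,
# and the translation vectors never enter
F_left = avg_fidelity_from_M(compose(Ep, E)[0], d)
F_right = avg_fidelity_from_M(compose(E, Ep)[0], d)
```

The equality of `F_left` and `F_right` is just the cyclicity of the trace, $\mathrm{Tr}(M'M) = \mathrm{Tr}(MM')$.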
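It is worth mentioning that one may find a relation between the input-output fidelity after correction and the conditional min-entropy. The latter quantity for a bipartite state $\rho_{AB}$ is defined as [40]
$$H_{\min}(B|A)_\rho = -\log_2 \min_{\sigma_A} \min\{\lambda : \rho_{AB} \le \lambda\, \sigma_A\otimes I_B\}, \tag{30}$$
where $\sigma_A$ is a quantum state. Indeed, it has been proven that [21]
$$2^{-H_{\min}(B|A)_\rho} = d\, \max_{\mathcal{E}'} \langle\phi^+|(\mathcal{E}'\otimes\mathrm{id})(\rho_{AB})|\phi^+\rangle,$$
where the maximum is taken over all quantum channels $\mathcal{E}'$ acting on the subsystem $A$. Let us assume that $\rho_{AB}$ is the Jamiołkowski state (the normalised form of the Choi matrix (15)) assigned to a quantum channel and denoted by $J_{\mathcal{E}} = C_{\mathcal{E}}/d$. Then we get
$$2^{-H_{\min}(B|A)_{J_{\mathcal{E}}}} = d\, \max_{\mathcal{E}'} F_e(\mathcal{E}'\circ\mathcal{E}). \tag{32}$$
Here $F_e(\mathcal{E}'\circ\mathcal{E})$ is the entanglement fidelity (24) of the composed map $\mathcal{E}'\circ\mathcal{E}$. So the above equation shows that the entanglement fidelity after correction (and thus the input-output fidelity after correction, see (25)) is directly related to the conditional min-entropy of the Choi matrix of the channel.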

General properties of the quasi-inverse channel
In this section we elaborate on some general properties of the quasi-inverse of quantum channels and discuss the similarities and the crucial differences with the qubit case [1]. Thus far, we have seen in theorem 1 that for any proper similarity measure a quasi-inverse can be found on the boundary of the set of quantum channels. In what follows, we show that if this similarity measure is linear with respect to quantum channels, we can locate the quasi-inverse in the set of extreme channels, a subset of the boundary points. Recall that a point of a convex set $\Omega$ is called extreme if it cannot be written as a convex combination of two other points of $\Omega$. We will use the fact that the set $C_d$ of quantum channels of dimension $d$ is convex [22] for any $d$.

Theorem 2. The quasi-inverse of a quantum channel can always be taken to be an extreme channel.
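Proof. Assume that the quasi-inverse of a channel $\mathcal{E}$ is of the form
$$\mathcal{E}^{qi} = \sum_i p_i\, \mathcal{E}_i, \qquad p_i \ge 0, \quad \sum_i p_i = 1,$$
where the $\mathcal{E}_i$ are extreme channels. Let $\mathcal{E}_m$ be the element in the above set for which
$$F(\mathcal{E}_m\circ\mathcal{E}) = \max_i F(\mathcal{E}_i\circ\mathcal{E}).$$
Then, using the linearity of the average fidelity in the correcting channel, we will have
$$F(\mathcal{E}^{qi}\circ\mathcal{E}) = \sum_i p_i\, F(\mathcal{E}_i\circ\mathcal{E}) \le F(\mathcal{E}_m\circ\mathcal{E}),$$
which means, according to definition 1, that the quasi-inverse can always be taken to be an extreme channel.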
The crucial difference between the qubit case and the higher dimensional case is that for qubit channels quasi-inversion is unital and the extreme points of the set of one-qubit unital maps are unitary channels, while this is no longer the case for d > 2. Even more than that, not all extreme points of the set of channels are known for d > 2, not even for unital channels.
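Note that the linearity of $F(\mathcal{E}'\circ\mathcal{E})$ in $\mathcal{E}'$ implies that if $\mathcal{E}^{qi}_1$ and $\mathcal{E}^{qi}_2$ are both quasi-inverses of $\mathcal{E}$, then any convex combination of them, $p\,\mathcal{E}^{qi}_1 + (1-p)\,\mathcal{E}^{qi}_2$, is a quasi-inverse as well. In accordance with the above theorem we arrive at the following result.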

Corollary 1. The quasi-inverse is either unique or an infinite number of them exist where at least two of them are extreme channels.
An immediate result of theorem 2 is that the quasi-inversion is not an involution, i.e. $(\mathcal{E}^{qi})^{qi} \ne \mathcal{E}$ for a general $\mathcal{E}$. This is because a quasi-inverse map can always be taken to be an extreme point. So even if one takes into account the non-uniqueness of the quasi-inverse, see remark 1 and corollary 1, there are always non-extremal maps which are not the quasi-inverse of any other map.
This inequality is saturated if we take E equal to the identity map.
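On the other hand, let $C_{\mathcal{E}}$ denote the set of quantum channels for which $\mathcal{E}$ defines the quasi-inverse. We argue that such a set is convex, because for any $\mathcal{E}_1, \mathcal{E}_2 \in C_{\mathcal{E}}$ and any quantum channel $\mathcal{E}'$,
$$F\big(\mathcal{E}\circ[p\,\mathcal{E}_1 + (1-p)\,\mathcal{E}_2]\big) = p\,F(\mathcal{E}\circ\mathcal{E}_1) + (1-p)\,F(\mathcal{E}\circ\mathcal{E}_2) \ge F\big(\mathcal{E}'\circ[p\,\mathcal{E}_1 + (1-p)\,\mathcal{E}_2]\big),$$
where $0 \le p \le 1$. In this sense, $C_I$ is a special convex subset of quantum channels which are not correctable, i.e. channels whose quasi-inverse is the identity map. According to the above proposition, by applying the quasi-inverse we actually send a given quantum channel to this special subset, since we cannot improve the fidelity afterward.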
As an example consider the tetrahedron of Pauli channels, $\Phi = \sum_i p_i \Phi_i = \sum_{i=0}^{3} p_i\, \sigma_i\otimes\sigma_i^*$, where $\sigma_0 = I$ and the other $\sigma_i$ are the Pauli matrices. According to the results of [1], the subset of Pauli channels for which $p_0 \ge p_i$ for all $i$ belongs to $C_I$. This is also a consequence of the result of example 1 in the next section, since the $\sigma_i$ satisfy orthogonality. The set of non-correctable Pauli channels is presented in figure 2.
Figure 2. The set of Pauli channels. The blue region shows $C_I$, the convex subset of the channels whose quasi-inverse is the identity map. $\Phi_*$ is the center of the tetrahedron.
Two channels $\mathcal{E}_1$ and $\mathcal{E}_2$ are called unitarily equivalent if $\mathcal{E}_2 = \mathcal{U}\circ\mathcal{E}_1\circ\mathcal{V}$ for some unitary channels $\mathcal{U}$ and $\mathcal{V}$. For such channels
$$\max_{\mathcal{E}'} F(\mathcal{E}'\circ\mathcal{E}_2) = \max_{\mathcal{E}'} F(\mathcal{V}\circ\mathcal{E}'\circ\mathcal{U}\circ\mathcal{E}_1) = \max_{\mathcal{E}''} F(\mathcal{E}''\circ\mathcal{E}_1),$$
where we have used the fact that the set of all quantum channels, $C_d$, is invariant under unitary transformations. This relation proves that the quasi-inverses of $\mathcal{E}_1$ and $\mathcal{E}_2$ are also unitarily equivalent, and that they reach the same fidelity after correction. This fact extends the result of [1] on unitarily equivalent qubit channels. Here a crucial difference between qubit channels and higher dimensional channels shows up. In the case of qubits, a complete characterization of the channels exists, and it is known that any qubit channel is unitarily equivalent to a canonical channel with a diagonal distortion matrix $M_c$. The signed singular values [22] of the matrix $M_c$, collected in a vector $\lambda$, are confined inside a tetrahedron $\Delta$ whose extreme points are the unitary operations $\rho \to \sigma_i\rho\sigma_i^\dagger$ ($i = 0, 1, 2, 3$). Therefore the task of finding the quasi-inverse of any qubit channel is considerably easier than for higher dimensional channels, where such a canonical decomposition does not exist; even if it did, with a presumably diagonal matrix $M$, we would be faced with a highly complex characterization of the vector $\lambda$. It is known that the structure of the convex set of higher dimensional channels is far more complex than that of a simple tetrahedron, and in particular that the extreme points of this set are not necessarily unitary channels. A well-known counterexample is the LS channel [24], which will be discussed in section 5. Let us now put general bounds on the improved average fidelity.

Theorem 3. The average input-output fidelity of a channel after correction has the following upper and lower bounds:
$$\frac{1 + f}{d+1} \le F(\mathcal{E}^{qi}\circ\mathcal{E}) \le \frac{1 + p_m}{d+1}, \tag{37}$$
where $f = \max_{|\beta\rangle} \langle\beta|C_{\mathcal{E}}|\beta\rangle$ is the fully entangled fraction of the Choi matrix $C_{\mathcal{E}}$ of the channel $\mathcal{E}$ and $|\beta\rangle$ denotes a maximally entangled state, while $p_m$ is the maximal eigenvalue of $C_{\mathcal{E}}$. Before proceeding with the proof, let us mention that, in view of equation (25) and the definition of the fully entangled fraction, the lower bound in the above equation is actually an upper bound for the input-output fidelity $F(\mathcal{E})$ before we correct it with the quasi-inverse map.

Proof.
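To prove the upper bound, we note that for any channel $\mathcal{E}'$
$$\mathrm{Tr}(\Phi_{\mathcal{E}'}\Phi_{\mathcal{E}}) = \mathrm{Tr}(\tilde{C}_{\mathcal{E}'}\, C_{\mathcal{E}}) \le p_m\, \mathrm{Tr}(\tilde{C}_{\mathcal{E}'}) = d\, p_m,$$
where $S = \sum_{i,j} |i,j\rangle\langle j,i|$ is the swap operator, $\tilde{C}_{\mathcal{E}'} = (S\, C_{\mathcal{E}'}\, S)^T$, and $p_m$ is the largest eigenvalue of the Choi matrix of the channel $\mathcal{E}$. In writing these equations we have used equations (16) and (17), and the fact that $C_{\mathcal{E}}$ and $\tilde{C}_{\mathcal{E}'}$ are Hermitian and positive matrices. Together with (23), the above inequality leads to the following upper bound for the improved average fidelity:
$$F(\mathcal{E}^{qi}\circ\mathcal{E}) \le \frac{d + d\,p_m}{d(d+1)} = \frac{1 + p_m}{d+1}.$$
In order to obtain a lower bound, we note that in view of (26) the quasi-inverse performs at least as well as any other channel, and we choose $\tilde{C}_{\mathcal{E}'}$ to be equal to $d\,|\beta\rangle\langle\beta|$, where $|\beta\rangle$ is a maximally entangled state. This gives $\mathrm{Tr}(\Phi_{\mathcal{E}'}\Phi_{\mathcal{E}}) = d\,\langle\beta|C_{\mathcal{E}}|\beta\rangle$, and maximizing over $|\beta\rangle$ yields the lower bound.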
One of the main differences between the qubit case and the higher dimensional channels is that the singular values of M in the qubit case are always less than or equal to one. As a result, the Bloch vector cannot be stretched by applying the distortion matrix M in the qubit case, while as we will show in an explicit example, this is not necessarily the case for higher dimensional channels.
There are certain channels whose distortion matrix M in equation (14) can stretch the generalized Bloch vector r. This is due to the non-spherical shape of the space of quantum states in higher dimensions, see figure 3. We will elaborate on this point and its consequences in appendix C.

Proposition 2. The quasi-inverse of the tensor product of quantum channels is the tensor product of the quasi-inverses
Proof. Let $\rho_{AA'BB'} = \rho_{AB}\otimes\rho_{A'B'}$; then one can show that the min-entropy (30) is additive [21], i.e. $H_{\min}(BB'|AA') = H_{\min}(B|A) + H_{\min}(B'|A')$. The proof of the proposition then becomes straightforward using equation (32) and the fact that for the tensor product of quantum channels, the Jamiołkowski state has tensor-product form.
Applying this proposition, we show that for $N$ copies of a given quantum channel in a multipartite setting
$$F\big((\mathcal{E}^{qi}\circ\mathcal{E})^{\otimes N}\big) \le \big[F(\mathcal{E}^{qi}\circ\mathcal{E})\big]^N. \tag{41}$$
To prove this equation we note that for any quantum channel $Q$
$$\mathrm{Tr}(\Phi_{Q^{\otimes N}}) = \big[\mathrm{Tr}(\Phi_Q)\big]^N.$$
Let us write $\mathrm{Tr}(\Phi_Q) = x$ for simplicity of notation. In view of equation (23), we prove that whenever $1 \le x \le d^2$ the following inequality holds:
$$\frac{d^N + x^N}{d^N(d^N+1)} \le \left[\frac{d + x}{d(d+1)}\right]^N. \tag{43}$$
To prove this relation we note that both sides are increasing functions of $x$. When $x = 1$ or $x = d^2$ both sides are equal. Comparing their derivatives with respect to $x$ at $x = 1$, we see that the right-hand side grows faster at $x = 1$, which proves the inequality (43). To prove equation (41), it remains to show that $1 \le \mathrm{Tr}(\Phi_{\mathcal{E}^{qi}\circ\mathcal{E}}) \le d^2$ for the composition of any quantum channel and its quasi-inverse. The latter is, however, a consequence of equation (37) and the fact that the fully entangled fraction of the Choi matrix is at least $1/d$.

Examples
Taking into account the upper and lower bounds given in (37) and the general theorems of the previous section, in this section we will consider a few classes of examples, as the optimization problem (29) cannot be solved analytically in the general case. In the case of single-qubit systems a complete classification of quantum channels [19,20] leads to an explicit description of their quasi-inverse which can be unitary [1]. For higher dimensional channels, the quasi-inverse may not necessarily be a unitary map. Thus identifying the quasi-inverse is related to finding the extreme points of the set C d of quantum channels, which remains an open problem.

Example 1 (Mixed unitary channels with orthogonal unitaries).
A mixed unitary channel is defined [41,42] by
$$\mathcal{E}(\rho) = \sum_{\alpha=1}^{r} q_\alpha V_\alpha \rho V_\alpha^\dagger, \tag{44}$$
where $\{V_\alpha\}_{\alpha=1}^{r}$ is an arbitrary set of $r$ unitary transformations. We restrict ourselves to the case where the unitaries are orthogonal with respect to the Hilbert-Schmidt scalar product, $\mathrm{Tr}(V_\alpha^\dagger V_\beta) = d\,\delta_{\alpha\beta}$. In analogy to the construction of an approximate time reversal proposed in [17], the quasi-inverse of $\mathcal{E}$ is then the unitary channel
$$\mathcal{E}^{qi}(\rho) = V_m^\dagger \rho V_m,$$
where $V_m$ corresponds to the largest weight $q_m$ in the mixture (44). To see this, note that the Choi matrix of this channel reads
$$C_{\mathcal{E}} = \sum_{\alpha=1}^{r} q_\alpha |V_\alpha\rangle\langle V_\alpha|,$$
where we have used the correspondence (2). In view of the orthogonality of the vectors $|V_\alpha\rangle$, this is the spectral decomposition of the Choi matrix $C_{\mathcal{E}}$ with eigenvalues equal to $p_\alpha = d\,q_\alpha$ (note that $|V_\alpha\rangle$ is not normalized). Let $q_m = p_m/d$ be the largest of the coefficients in (44). In view of the diagonal nature of the Choi matrix, $p_m$ is its largest eigenvalue. Moreover, by taking $|\beta\rangle = |V_m\rangle/\sqrt{d}$, one sees that $f = p_m$. So the upper and lower bounds of (37) coincide, and we find that the quasi-inverse is the unitary map induced by $V_m^\dagger$. Let us emphasize again that this result is valid only for a mixture of unitary maps corresponding to unitary matrices which are mutually orthogonal in the sense of the Hilbert-Schmidt scalar product. If this assumption is not satisfied, the quasi-inverse need not be the inverse of one of the unitaries, as example 5 shows.
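The claim of this example can be checked on a small instance. The sketch below mixes four Weyl (clock-and-shift) unitaries in $d = 3$, which are mutually orthogonal in the Hilbert-Schmidt sense, and compares the fidelity after correction by $V_m^\dagger$ with the value $(1 + p_m)/(d+1)$, assumed here to be the upper bound of theorem 3. The weights, the choice of unitaries and the helper names are illustrative assumptions.

```python
import numpy as np

def weyl_operators(d):
    """The d^2 Weyl unitaries X^k Z^l, mutually orthogonal: Tr(W_a^dagger W_b) = d delta_ab."""
    omega = np.exp(2j * np.pi / d)
    X = np.roll(np.eye(d), 1, axis=0)
    Z = np.diag(omega ** np.arange(d))
    return [np.linalg.matrix_power(X, k) @ np.linalg.matrix_power(Z, l)
            for k in range(d) for l in range(d)]

def avg_fidelity(kraus, d):
    """F = (d + sum_a |Tr K_a|^2) / (d (d + 1)) (assumed closed form)."""
    return (d + sum(abs(np.trace(K)) ** 2 for K in kraus)) / (d * (d + 1))

def choi(kraus):
    """C = sum_a |K_a><K_a| with row-major vectorization."""
    return sum(np.outer(K.reshape(-1), K.reshape(-1).conj()) for K in kraus)

d = 3
V = weyl_operators(d)[:4]                      # four orthogonal unitaries
q = np.array([0.15, 0.50, 0.25, 0.10])         # illustrative weights
kraus = [np.sqrt(qa) * Va for qa, Va in zip(q, V)]

m = int(np.argmax(q))
corrected = [V[m].conj().T @ K for K in kraus]  # Kraus operators of E^qi o E

F_before = avg_fidelity(kraus, d)
F_after = avg_fidelity(corrected, d)
p_m = np.linalg.eigvalsh(choi(kraus)).max()
upper = (1 + p_m) / (d + 1)
```

For these weights the largest Choi eigenvalue is $d\,q_m = 1.5$, and the corrected fidelity coincides with `upper`, so no other channel can do better on this instance.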
Example 2 (Uniform mixture of orthogonal conjugations). In the following two examples, we present some quantum channels which are their own quasi-inverse.

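Theorem 4. Let $\mathcal{E}$ be a unital channel obtained by the uniform mixture of conjugations (not necessarily unitary ones) by matrices which are orthogonal to each other. Such a channel is specified by
$$\mathcal{E}(\rho) = \sum_{\alpha=1}^{q} X_\alpha \rho X_\alpha^\dagger, \qquad \mathrm{Tr}(X_\alpha^\dagger X_\beta) = \frac{d}{q}\,\delta_{\alpha\beta}. \tag{47}$$
The quasi-inverse of this map is given by its dual, i.e. $\mathcal{E}^{qi}(\rho) = \sum_{\alpha=1}^{q} X_\alpha^\dagger \rho X_\alpha$.
Proof. The Choi matrix of the channel $\mathcal{E}$ (47) is
$$C_{\mathcal{E}} = \sum_{\alpha=1}^{q} |X_\alpha\rangle\langle X_\alpha|,$$
where $|X_\alpha\rangle$ is based on the correspondence of equation (2) and fulfills $\langle X_\alpha|X_\beta\rangle = \frac{d}{q}\,\delta_{\alpha\beta}$. Thus the above equation is indeed the spectral decomposition of the degenerate Choi matrix, with the $q$-fold degenerate largest eigenvalue equal to $p_m = d/q$. The superoperator of this channel is given by
$$\Phi_{\mathcal{E}} = \sum_{\alpha=1}^{q} X_\alpha\otimes X_\alpha^*.$$
Note that since $\mathcal{E}$ is assumed to be unital, $\Phi_{\mathcal{E}}^\dagger$ is also a valid quantum channel, corresponding to the dual of $\mathcal{E}$. Composing $\Phi_{\mathcal{E}}$ and $\Phi_{\mathcal{E}}^\dagger$, we get
$$\mathrm{Tr}(\Phi_{\mathcal{E}}^\dagger \Phi_{\mathcal{E}}) = \sum_{\alpha,\beta=1}^{q} \big|\mathrm{Tr}(X_\alpha^\dagger X_\beta)\big|^2 = \frac{d^2}{q} = d\,p_m.$$
So by such a composition the upper bound of equation (37) is attained, which completes the proof.
The fidelity after correction for the channel (47) then reads $F(\mathcal{E}^\dagger\circ\mathcal{E}) = \frac{d+q}{q(d+1)}$, while for the case with $\mathrm{Tr}(X_\alpha) = 0$ we have $F(\mathcal{E}) = \frac{1}{d+1}$ before applying the quasi-inverse. This allows a significant improvement, especially in higher dimensions and when $q$ is not large. Note that for $q = 1$ the average fidelity after correction is 1, as expected. Moreover, it is obvious that if the operators $X_\alpha$ are Hermitian, $\mathcal{E}$ is its own quasi-inverse. Two explicit examples for this case are provided in what follows.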

Example 2.1 (LS channel).
Consider the LS channel [24],
$$\mathcal{E}_{LS}(\rho) = \frac{1}{j(j+1)}\sum_{i=1}^{3} J_i\, \rho\, J_i, \tag{52}$$
where $J_i$ are the Hermitian generators of SU(2) in its irreducible representation of dimension $d = 2j + 1$, and they satisfy $\mathrm{Tr}(J_m^\dagger J_n) = \frac{1}{3} j(j+1)(2j+1)\,\delta_{mn}$. It is clear then that the LS channel is a special case of equation (47) with $q = 3$. It is worth mentioning that the LS channel (52) is an extreme point of $C_d$ when $d \ge 3$ [24]. However, for $d = 2$ the LS channel is not an extreme point of $C_2$, so in view of theorem 2 there exist several quasi-inverse channels.

Example 2.2 (Generalized LS channel).
Let $G$ be a Lie group of dimension $n$, with Lie algebra generators $A_\alpha$, where $1 \le \alpha \le n$. Let $D_\mu$ be an irreducible unitary representation of the Lie algebra. We define the generalized LS channel as
$$\mathcal{E}(\rho) = \frac{1}{c_\mu}\sum_{\alpha=1}^{n} D_\mu(A_\alpha)\, \rho\, D_\mu(A_\alpha)^\dagger,$$
where $c_\mu$ is the value of the second Casimir operator in this representation, $\sum_{\alpha} D_\mu(A_\alpha)^\dagger D_\mu(A_\alpha) = c_\mu I$. The generators can be made orthogonal so that $\mathrm{Tr}\big(D_\mu(A_\alpha)^\dagger D_\mu(A_\beta)\big) = \frac{d\,c_\mu}{n}\,\delta_{\alpha\beta}$. Such a channel satisfies the assumption of theorem 4 and is hence its own quasi-inverse.
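As a numerical check of theorem 4 on the simplest nontrivial case, the sketch below builds the LS channel for spin $j = 1$ ($d = 3$, $q = 3$) from the standard spin-1 matrices and compares the fidelities before and after self-correction with $1/(d+1)$ and $(d+q)/(q(d+1))$. The closed form used for the average fidelity, $F = (d + \sum_\alpha |\mathrm{Tr}\,K_\alpha|^2)/(d(d+1))$, is an assumption of this sketch, as are the variable names.

```python
import numpy as np

# Standard spin-1 matrices (hbar = 1)
s = 1 / np.sqrt(2)
Jx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Jy = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]])
Jz = np.diag([1.0, 0.0, -1.0]).astype(complex)

j, d, q = 1, 3, 3
# Kraus operators of the LS channel: X_i = J_i / sqrt(j (j + 1))
X = [J / np.sqrt(j * (j + 1)) for J in (Jx, Jy, Jz)]

# Trace preservation and Hilbert-Schmidt orthogonality Tr(X_a^dagger X_b) = (d/q) delta_ab
trace_preserving = np.allclose(sum(x.conj().T @ x for x in X), np.eye(d))
gram = np.array([[np.trace(a.conj().T @ b) for b in X] for a in X])
orthogonal = np.allclose(gram, (d / q) * np.eye(q))

def avg_fidelity(kraus, d):
    """F = (d + sum_a |Tr K_a|^2) / (d (d + 1)) (assumed closed form)."""
    return (d + sum(abs(np.trace(K)) ** 2 for K in kraus)) / (d * (d + 1))

F_before = avg_fidelity(X, d)
# Kraus operators of the dual channel composed with the channel itself
F_after = avg_fidelity([a.conj().T @ b for a in X for b in X], d)
```

Here the fidelity doubles, from $1/4$ to $1/2$, when the channel is followed by itself.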

Example 3 (The transverse-depolarizing and depolarizing channel).
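The transverse-depolarizing channel is defined as $E^{td}_w(\rho) = w\,\mathrm{Tr}(\rho)\,\frac{I_d}{d} + (1-w)\,\rho^T$, with $\frac{d}{d+1} \le w \le \frac{d}{d-1}$ to satisfy complete positivity. The superoperator of this channel is given by $\Phi^{td}_w = w\,|\phi^+\rangle\langle\phi^+| + (1-w)\,S$, where $|\phi^+\rangle$ is the maximally entangled state, and the Choi matrix is equal to $C^{td}_w = \frac{w}{d}\,I_{d^2} + (1-w)\,S$. Here S is the swap operator introduced in the proof of theorem 3. It is a Hermitian unitary, so its eigenvalues are ±1. Thus, the largest eigenvalue of $C^{td}_w$ is $\frac{w}{d} + |1-w|$. Let $E_\pm(\rho) = \frac{1}{d\pm 1}\left(\mathrm{Tr}(\rho)\,I_d \pm \rho^T\right)$. The channel E + is the transverse-depolarizing channel with $w = \frac{d}{d+1}$, and E − , also called the Werner-Holevo channel [43], is equal to $E^{td}_w$ for $w = \frac{d}{d-1}$. Indeed, any transverse-depolarizing channel is a convex combination of E + and E − . Now it is straightforward to see that $F(E_+ \circ E^{td}_w)$ saturates the upper bound of equation (37) for w ≤ 1, while for w > 1 the upper bound (37) is achievable by $E_- \circ E^{td}_w$, confirming that E − is the quasi-inverse in this interval and that it can indeed improve the fidelity by $\Delta F = \frac{2(w-1)}{d+1}$. The parameter space of the transverse-depolarizing channel is shown in figure 4.
We can also consider the depolarizing channel defined as $E^{d}_q(\rho) = (1-q)\,\rho + q\,\mathrm{Tr}(\rho)\,\frac{I_d}{d}$, where q is a probability. The superoperator and the Choi matrix are given by $\Phi^{d}_q = (1-q)\,I_{d^2} + q\,|\phi^+\rangle\langle\phi^+|$ and $C^{d}_q = d(1-q)\,|\phi^+\rangle\langle\phi^+| + \frac{q}{d}\,I_{d^2}$. It is now obvious that the largest eigenvalue of the Choi matrix is given by $p_m = d(1-q) + \frac{q}{d}$, which is actually equal to $\mathrm{Tr}(\Phi^{d}_q)/d$. This implies that the quasi-inverse is the identity map, so the depolarizing channel is not correctable.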
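The statements of example 3 can be illustrated with a short numerical sketch. It rests on two assumptions that are not spelled out in the surviving text: the parametrization E_w(ρ) = w Tr(ρ) I/d + (1 − w) ρ^T (chosen because it reproduces E ± at w = d/(d ± 1)), and the relation F = (1 + ⟨φ⁺|C|φ⁺⟩)/(d + 1) between the average fidelity and a Choi matrix normalized to trace d. All function names are ours.

```python
import numpy as np

def choi(channel, d):
    """Choi matrix sum_ij |i><j| (x) E(|i><j|), normalized so that Tr C = d."""
    C = np.zeros((d * d, d * d), dtype=complex)
    for i in range(d):
        for j in range(d):
            unit = np.zeros((d, d), dtype=complex)
            unit[i, j] = 1.0
            C += np.kron(unit, channel(unit))
    return C

def average_fidelity(channel, d):
    """Average fidelity from the overlap of the Choi matrix with |phi+> (assumed relation)."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)   # maximally entangled vector
    overlap = np.real(phi @ choi(channel, d) @ phi)
    return (1 + overlap) / (d + 1)

def transverse_depolarizing(w, d):
    return lambda rho: w * np.trace(rho) * np.eye(d) / d + (1 - w) * rho.T

def werner_holevo(d):
    return lambda rho: (np.trace(rho) * np.eye(d) - rho.T) / (d - 1)

d, w = 3, 1.3
E = transverse_depolarizing(w, d)
E_minus = werner_holevo(d)

p_max = np.linalg.eigvalsh(choi(E, d)).max()
gain = average_fidelity(lambda rho: E_minus(E(rho)), d) - average_fidelity(E, d)
print(p_max, w / d + abs(1 - w))
print(gain, 2 * (w - 1) / (d + 1))

# Complete positivity fails just outside the interval [d/(d+1), d/(d-1)]
lowest_outside = np.linalg.eigvalsh(choi(transverse_depolarizing(1.6, d), d)).min()
print(lowest_outside)
```

The last printed value is negative, showing that w = 1.6 lies outside the completely positive range for d = 3.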

Example 4 (Covariant channels).
A quantum channel E is called covariant with respect to a group G if the following property holds: $E\big(U(g)\,\rho\,U(g)^\dagger\big) = V(g)\,E(\rho)\,V(g)^\dagger$ for all g ∈ G, in which U(g) and V(g) are two not necessarily equivalent representations of G. From the vectorization (2), we find This property has implications for the quasi-inverse. To see this we note that from which we obtain Equivalently this means that Therefore the quasi-inverse of a covariant channel is also covariant, except that the order of the two representations of the group is reversed.

Example 5 (Mixed unitary channels with commuting unitaries).
In contrast to the qubit case where a classification of quantum channels facilitates the study of various aspects of them including their quasi-inverses, for higher dimensional channels, many aspects do not easily yield an analytical treatment. In this section we pose a simple problem whose solution, as we will see, is quite nontrivial and yet instructive. We have seen in example 1, that the quasi-inverse of a mixed unitary channel of the form when U k s are orthogonal to each other, is the unitary channel E qi (ρ) = U † max ρU max , where U max is the unitary corresponding to the maximum probability p max in (65). When the unitaries are not orthogonal to each other, then we know the complete answer only for qubit channels [1]. For higher dimensional channels we can tackle the simplified version of the problem, if all unitary matrices commute, so that they are diagonal in a certain basis. Under such assumptions we will prove that for qutrit channels the quasi-inverse is a unitary map. Our analysis also reveals certain facts about higher dimensional channels which may be of interest in their own right. At the end of this section we will consider a concrete case for d = 3 and the reader can follow the general arguments here by looking at that special case.
We aim to find the quasi-inverse of the channel in (65) when [U k , U l ] = 0 ∀k. In the basis in which all the unitaries are diagonal, U k = diag(e iθ (k) 1 , e iθ (k) 2 , . . . , e iθ (k) d ), the superoperator of the channel reads where Let G be the unitary group that consists of all diagonal unitary matrices in this basis 4 . Since U k 's commute with any diagonal matrix, the channel According to property (64), this implies that the quasi-inverse of this channel is also Gcovariant. Expressed in terms of the superoperators, this means that the superoperator Φ E must commute with the superoperator of all unitary maps ρ −→ UρU † , where U ∈ G. The superoperator of the latter is of the diagonal form Let the superoperator of the quasi-inverse be given by Equation (68) now restricts the form of this superoperator to the following simple form The Choi-matrix of the quasi-inverse is obtained by reshuffling the entries of the superoperator, see (16), which amounts to the quasi-inverse is the channel which maximizes the following quantity Here we have used the equality w ii = 1 ∀i, subject to the condition that C E qi in (72) designates the Choi matrix of a legitimate quantum channel, i.e. it is a positive matrix with partial trace equal to the identity C E qi 0 and Tr 1 C E qi = I d .
In view of the block-diagonal form of the Choi matrix, it turns out that its eigenvalues are of the form The second condition in (74) leads to the following set of equalities which can be rewritten as Note that r i = j , being the eigenvalues of the Choi matrix are non-negative. Therefore in order to maximize the right-hand side of (73), we can take all of them to be zero, reducing (77) to which further simplifies the expression (73) and reduces our problem to maximization of the expression subject to the positivity of the following matrix The set of all matrices X, denoted by Ω is a convex set. Therefore the linear function F takes its maximum at the extreme points of the set Ω.
In general, the problem of finding the extreme points of d-dimensional channels is difficult and rather non-trivial. It is only known that the extreme points of the set of two-dimensional unital channels are unitary maps. Here we show that for channels defined by the Choi matrix (72), the extreme points are unitary maps if d ≤ 3. We also show a stronger result: for any dimension d the rank of any extreme point of this set is less than √d. To this end, we first need to clarify a few definitions and a lemma. In what follows, S is a vector space and Ω ⊂ S is a convex subset of S.
A basic property of extreme points of a convex set Ω is depicted in figure 5. In this figure X 0 and X′ 0 are extreme points while Y 0 is not. We note that any line segment (no matter how short) passing through an extreme point like X 0 or X′ 0 contains points which do not belong to Ω, while for the non-extreme point Y 0 there exists a sufficiently short line segment (namely the one lying on the edge) which lies entirely in Ω. We present this more formally in the following statement.

Definition 2.
Let Y 0 be a non-extreme point of Ω. Then there is a Z ∈ S, and ε > 0, such that Y 0 + t(Z − Y 0 ) ∈ Ω for all t ∈ (−ε, ε). We call Z a witness of non-extremality of Y 0 .
This definition implies its equivalent form.

Lemma 1. Let X 0 be an extreme point of Ω and let Z ∈ S be an arbitrary element in
Now we can state and prove the following result.
Proof. Let Y 0 be any element of Ω with rank r. We show that if $r > \sqrt{d^2 - m}$, then we can always find an element Z ∈ Ω different from Y 0 such that the sufficiently short line segment Y 0 + t(Z − Y 0 ) belongs entirely to Ω. This shows that such points cannot be extreme points of Ω. To this end, let us expand Y 0 in its eigenbasis as where and let t be so small that Y 0 + t(Z − Y 0 ) ≥ 0. The only other requirement that is needed for this matrix to belong to Ω is to satisfy the d² − m linear homogeneous equations which define the subspace containing Ω. This is a system of d² − m linear homogeneous equations in r² variables, and if d² − m < r² it always has a non-zero solution. This means that the point Y 0 is not an extreme point of the set Ω. Hence the rank of any extreme point of this set should be less than or equal to $\sqrt{d^2 - m}$.

Corollary 2. As a corollary we find that the extreme points of the set of matrices (80) which is a subset of a d 2 − d dimensional space, have rank r < √ d. This means that for d = 3, the extreme points of the set (80) have unit rank and hence the quasi-inverse of mixed unitary channels with commuting unitaries is a unitary channel. This theorem by itself does not preclude the existence of quasi-inverses which are unitary maps in higher dimensions.
Consider now the case of d = 3. Here, after setting r ij = 0 and satisfying the constraint (78), the superoperator is a diagonal matrix given by Φ E qi = diag[1, q 12 , q 13 , q 21 , 1, q 23 , q 31 , q 32 , 1] and its Choi matrix is as follows: The Hermitian matrices W and X, which appear in equation (79), are now three dimensional and take the form Having proved that the matrix X 0 which maximizes the expression (79) is of unit rank, we can write it as X = |φ⟩⟨φ|, where |φ⟩ = (q 1 , q 2 , q 3 )^T maximizes Tr(W|φ⟩⟨φ|), with q i being unimodular complex numbers. Note that the maximum will be smaller than the largest eigenvalue of W if the corresponding eigenvector is not built of unimodular entries. Therefore the quasi-inverse is the unitary map E qi (ρ) = UρU † , where the unitary operator U is given by U = diag(q 1 , q 2 , q 3 ). In view of (29) and (79), the final average fidelity becomes
As a concrete example, consider a spin-1 particle subject to a magnetic field in the z direction, where the strength of the magnetic field or the exposure time is random. The evolution of the state is given by the following channel for some distribution f(τ ). For this channel we have In view of the form of X, the quasi-inverse will be given by and the average fidelity after application of the quasi-inverse is given by (90)
Figure 6 shows the average fidelity and the increase in average fidelity for a simple discrete distribution P(τ = 0) = 1 − p and P(τ = τ 0 ) = p.
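The spin-1 example with the two-point distribution P(τ = 0) = 1 − p, P(τ = τ 0 ) = p can be explored numerically. The sketch below is a brute-force illustration, not the analytic solution of the text: it assumes that the channel is the mixture of the diagonal unitaries exp(−iτ J z ) = diag(e^{−iτ}, 1, e^{iτ}), uses the standard Kraus expression F = (d + Σ_k p_k |Tr K_k|²)/(d(d+1)) for the average fidelity, and scans a grid of diagonal unitary corrections. The values of p and τ 0 are arbitrary choices.

```python
import numpy as np

def fidelity_after_diagonal_correction(probs, phases, correction):
    """Average fidelity of rho -> sum_k p_k (V U_k) rho (V U_k)^dagger.

    phases[k] holds the diagonal phases of U_k and `correction` those of the
    diagonal unitary V applied after the channel.
    """
    d = len(correction)
    traces = np.exp(1j * (np.asarray(phases) + np.asarray(correction))).sum(axis=1)
    return (d + np.dot(probs, np.abs(traces) ** 2)) / (d * (d + 1))

p, tau0 = 0.8, 2.0
probs = np.array([1 - p, p])
phases = np.array([[0.0, 0.0, 0.0],           # tau = 0: identity
                   [-tau0, 0.0, tau0]])       # tau = tau0: exp(-i tau0 J_z)

f_uncorrected = fidelity_after_diagonal_correction(probs, phases, np.zeros(3))

# The global phase of V is irrelevant, so the middle phase is fixed to zero.
grid = np.linspace(0.0, 2 * np.pi, 181)
f_corrected, a_opt, b_opt = max(
    (fidelity_after_diagonal_correction(probs, phases, np.array([a, 0.0, b])), a, b)
    for a in grid for b in grid
)
print(f_uncorrected, f_corrected)
print(a_opt, (2 * np.pi - b_opt))             # both close to tau0 when p > 1/2
```

For this choice of parameters the best correction found on the grid is close to the inverse of the more probable unitary; a finer grid or a local optimizer would sharpen the estimate.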

Quasi-inversion of classical channels
There are several known parallels between probability distributions and stochastic matrices on the one hand and their quantum counterparts, namely density matrices and quantum channels, on the other hand. For instance, stochastic and bi-stochastic matrices acting on probability vectors form classical analogues of quantum channels and unital quantum channels. Furthermore, the discrete group of permutations is the analog of the continuous group of unitary channels. As the concept of convexity is critical in both domains, it is instructive to analyze how the notion of quasi-inverse works in the classical setup. In more concrete terms, the state of a classical stochastic system of dimension d is a real vector p whose entries are non-negative and add up to one. The set of all probability vectors of length d forms the simplex Δ d , which is a (d − 1)-dimensional compact and convex set. For the sake of simplicity, let us use the bra-ket notation here to denote a probability vector. In this sense, let |i⟩ for i ∈ {1, . . . , d} denote the pure probability vector all of whose components are zero except the ith, which is equal to 1. Therefore, a general mixed probability vector p = (p 1 , . . . , p d ) T can be written as a convex combination of {|i⟩}, i.e. |p⟩ = Σ i p i |i⟩.
A classical channel is represented by a stochastic transition matrix T of order d with nonnegative elements where the sum of all elements in each column is equal to 1. This is the analog of trace-preserving property. The space of stochastic matrices of order d is a (d 2 − d)-dimensional convex and compact set which will be denoted by S d .
Making use of the analogy to the quantum case consider the generalized Bloch representation (4) of a diagonal density matrix ρ diag = diag(p). Let us order the generators of SU(d) matrices used in (5) in such a way that Γ i are diagonal for i = 1, . . . , d − 1. Then any diagonal matrix ρ diag , representing the classical state p, is represented in the Bloch form (4), where the Bloch vector t cl has now only d − 1 components.
Using such a Bloch representation for a point p of the probability simplex Δ d one can represent the action of an arbitrary stochastic matrix in the form [44], Note that this representation mimics the Liouville form (14) of a quantum operation, with the only difference that the classical distortion matrix M cl forms a (d − 1)-dimensional truncation of the quantum distortion matrix M of order d² − 1, while the classical translation vector t cl consists of d − 1 components of the original translation vector t of size d² − 1. Hence in the Bloch basis the classical transition matrix T forms a block of a matrix representing a quantum operation Φ E , which decoheres to it [28]. Let us mention explicitly two distinguished classical maps. The first is the permutation The next one is the flat, van der Waerden matrix, denoted by T * , which sends all input states to the uniform state. It implies (T * ) ij = 1/d for any 1 ≤ i, j ≤ d. As a result, the entries of each column of a given matrix A sum to 1 if and only if T * A = T * . Accordingly, if A is an invertible matrix with this property, the entries of each column of A −1 also sum to 1.
Finally we note the fidelity of two probability vectors |p and |q which is defined as Proceeding in the same way that we did for the quantum case, here we should define the average fidelity of a classical channel to be the fidelity of an output state with the input pure state averaged over all input states. Thus we define Under these assumptions, the quasi-inversion is the channel (the stochastic matrix) increasing equation (93) as much as possible: To emphasize even further similarity to the quantum case, consider the Bloch representation (91) of the classical map T involving its distortion matrix M cl . Then the average fidelity of the corrected classical transformation T qi T can be expressed as the maximum over the set of allowed classical distortion matrices, which is in analogy to equation (29). However, the difference in prefactor with respect to equation (29) is due to the averaging over the set of classical probabilities of a dimension smaller than the set of quantum states. All the arguments of theorem 1, based on the linearity of the fidelity function and its two properties (27) and (28) are also valid here and therefore theorems 1 and 2 and the corollary 1 hold true also if we replace the quantum channels with classical ones. In particular, the fact that the quasi-inverse of a quantum channel can be taken to correspond to an extreme point is very important, since compared with the quantum case, we have a much better knowledge of the convex set of classical channels and its extreme points. In the general case, our basic theorem is the following:

Theorem 6. Let the stochastic matrix T be such that in each row i, the element in the a i th column be the maximum. Then its quasi-inverse is found by replacing that single element by 1 and setting all the other elements in that row equal to zero and then transposing the matrix.
Proof. Let the matrix T be written as T = (t 1 , t 2 , . . . , t d ) T where t T i denotes the ith row as a vector. Denote its quasi-inverse as a matrix T qi = (x 1 , x 2 , . . . , x d ), where x i is the ith column as a vector. The aim is to maximize Tr(TT qi ) = Σ i t i · x i . We can maximize this sum if we maximize each inner product t i · x i independently. Since the components of each vector x i are non-negative and add up to one, the maximization is achieved if we choose each x i = (0, 0, . . . , 1, 0, 0) such that 1 stands in the position of the maximum component of t i . This proves the theorem.
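The construction of theorem 6 translates directly into a few lines of code. The sketch below assumes the column-stochastic convention used in the text and the classical average fidelity F(T) = Tr(T)/d; the 3 × 3 matrix is an arbitrary illustration of ours, and when a row has several maximal entries `argmax` simply picks the first one.

```python
import numpy as np

def classical_quasi_inverse(T):
    """Quasi-inverse of a column-stochastic matrix following theorem 6."""
    T = np.asarray(T, dtype=float)
    d = T.shape[0]
    selector = np.zeros_like(T)
    # In every row keep a single maximal entry, replaced by 1 ...
    selector[np.arange(d), T.argmax(axis=1)] = 1.0
    # ... and transpose, so that column i of the result is the unit vector e_{a_i}.
    return selector.T

def average_fidelity(T):
    return np.trace(T) / T.shape[0]

T = np.array([[0.2, 0.5, 0.3],
              [0.3, 0.1, 0.6],
              [0.5, 0.4, 0.1]])      # every column sums to one
T_qi = classical_quasi_inverse(T)

print(T_qi)
print(average_fidelity(T))           # (0.2 + 0.1 + 0.1) / 3
print(average_fidelity(T_qi @ T))    # (0.5 + 0.6 + 0.5) / 3, the sum of the row maxima over d
```

In this example the quasi-inverse happens to be a permutation; if two rows had their maxima in the same column, the result would still be stochastic but no longer a permutation.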
For example, assume a general two-dimensional stochastic matrix parameterized as $T = \begin{pmatrix} 1-x & y \\ x & 1-y \end{pmatrix}$ with 0 ≤ x, y ≤ 1. In this case, the quasi-inverse is either I, σ x , or a mixture of these two for x + y < 1, x + y > 1 and x + y = 1, respectively. This is depicted in figure 7. In higher dimensions, however, more options are possible.

Example 6.
As examples, the quasi-inverse of the stochastic matrices The second example shows that the quasi-inverse of a classical map is not necessarily a permutation.
In the examples above, the average fidelities of the stochastic maps T 1 and T 2 increase from 1/3 and 5/18 to 35/72 and 43/72, respectively.

Corollary 3.
If T denotes a stochastic matrix for which ∀i, j : T ii ≥ T ij , then its quasi-inverse is I d , which means it is not possible to increase the average fidelity.
As can be seen, these quasi-inverses are extreme points of the space of stochastic matrices. To have quasi-inverses which are not necessarily extreme points, we should consider the case where there is more than one maximum entry in a row. In this case, if we follow the argument leading to theorem 6, we see that the quasi-inverse can be any convex combination of the quasi-inverses which we construct when we consider only one of these elements. The next example illustrates this point. has as its quasi-inverse Here the average fidelity increases from the value 1/3 to 35/72, which is, as expected, independent of λ.

Theorem 7. Among the stochastic matrices with a unique quasi-inverse, only symmetric permutations are their own quasi-inverse.
Proof. Uniqueness of the quasi-inverse implies that in each row of T there exists a single entry which is strictly larger than the other elements in the row. Hence, we have exactly one nonzero entry (which is equal to 1) in each column of T qi . Suppose that the classical channel T is a self-quasi-inverse stochastic matrix, i.e. T qi = T. This equality implies the existence of exactly one non-zero element, equal to unity, in each row of T qi , since otherwise T would have more than one leading value in some row, which would violate the uniqueness of T qi . So T qi and consequently T are the same permutation matrix. However, the quasi-inverse of a permutation matrix is its inverse, which is equal to its transpose. So we have T qi = T = T^T , which means that T is a symmetric permutation matrix.

On the commutativity of super-decoherence and quasi-inverse
Now that we have discussed the quasi-inverses of both quantum and classical channels, a natural question is whether the quasi-inverse of the classical channel T E can be obtained through super-decoherence [28] of the quasi-inverse of a quantum channel E. In other words, we want to see if the action of taking the quasi-inverse commutes with super-decoherence. As we will see below, in general the answer is negative. For convenience we first recall the concept of super-decoherence [28].
The decoherence channel removes off-diagonal elements of density matrices and sends any quantum state ρ into its diagonal, D(ρ) = ρ d = Σ i ρ ii |i⟩⟨i|, i.e. a classical state. Defining an analogous process in the space of quantum channels, one may extract a classical map, a stochastic matrix, from any quantum channel. This process is called super-decoherence, denoted by Δ, to emphasize that it acts on quantum channels and not states. For a quantum channel E, the assigned classical transition matrix obtained by super-decoherence is defined by The last equality above shows that this stochastic matrix is actually obtained by decohering the Choi matrix and clarifies why it is called super-decoherence. It is straightforward to see that stochasticity of T E is guaranteed by the fact that E is a positive and trace-preserving map. This is related to the fact that the classical distortion matrix M cl of size (d − 1) used in (91) forms a block of the quantum distortion matrix M of size d² − 1 present in the Liouville form (14) of any corresponding quantum operation [44]. Moreover, one can show that if the channel E is described by the set of Kraus operators {K α }, then $T_E = \sum_\alpha K_\alpha \odot K_\alpha^*$, where ⊙ denotes the Hadamard (entry-wise) product. Through this relation it is easy to see that by super-decohering a unital channel we get a bistochastic matrix, while a unistochastic matrix is obtained if the input channel is unitary.
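The Hadamard-product relation for T E can be tried out on a few familiar qubit channels. The sketch below is illustrative: the three channels (a rotation, dephasing and amplitude damping) and their parameters are our own choices, and `super_decohere` is a hypothetical helper name.

```python
import numpy as np

def super_decohere(kraus):
    """Classical transition matrix T_E = sum_a K_a (Hadamard product) conj(K_a)."""
    return sum((K * K.conj()).real for K in kraus)

# Unitary channel: the result is unistochastic, T_ij = |U_ij|^2
theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]], dtype=complex)
T_unitary = super_decohere([U])

# Dephasing channel (unital): the result is bistochastic, here the identity
p = 0.25
dephasing = [np.sqrt(1 - p) * np.eye(2), np.sqrt(p) * np.diag([1.0, -1.0])]
T_dephasing = super_decohere(dephasing)

# Amplitude damping (non-unital): stochastic but not bistochastic
g = 0.4
damping = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - g)]]),
           np.array([[0.0, np.sqrt(g)], [0.0, 0.0]])]
T_damping = super_decohere(damping)

for T in (T_unitary, T_dephasing, T_damping):
    print(T, T.sum(axis=0), T.sum(axis=1))   # column sums are 1; row sums are 1 only if bistochastic
```

The column sums equal one in all three cases, reflecting trace preservation, while the row sums equal one only for the two unital channels.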
However, a simple counterexample shows that the answer to the question posed at the beginning of this subsection is negative. Consider the case of a single-qubit channel, where the quasi-inverse of a channel is in general a unitary map, $U = e^{i\theta\, n\cdot\sigma}$, where θ and n depend on the channel. On the other hand, as shown in figure 7, the quasi-inverse of any two-dimensional classical map is either the identity I or the permutation σ x , which arise from super-decoherence of only a small subset of the unitary channels [45].

Further results and examples
Although theorem 6 gives a complete prescription for finding the quasi-inverse of any stochastic matrix, it is instructive to consider a few special classes. These examples illustrate further the parallels between classical and quantum notions of maps and their quasi-inverses. The first result is the analog of unitarily equivalent quantum maps.
Consider two transition matrices T 1 and T 2 which are related by two arbitrary permutations P α and P β in the following way: T 2 = P α • T 1 • P β . Let us call them permutationally equivalent stochastic matrices. Pursuing the same approach adopted in obtaining equation (36), and noting that the set of stochastic matrices, S d , is invariant under permutations, we get $T_2^{qi} = P_\beta^{-1} \circ T_1^{qi} \circ P_\alpha^{-1}$. Moreover, T 1 and T 2 have the same fidelity after correction. The next example illustrates the connection with its quantum analog, example 1.

Example 8 (Convex combination of orthogonal permutations).
A bi-stochastic matrix T is a stochastic matrix with the extra property that the sum of entries of each row equals unity. It is a well-known result due to Birkhoff's theorem that any such matrix can be written as a convex combination of permutations. For dimension d, there are d! such permutations which form the extreme points of the convex set of these matrices, conventionally called the Birkhoff polytope. Note however that a bi-stochastic matrix has (d − 1) 2 independent parameters and hence the convex decomposition of an arbitrary bi-stochastic matrix in terms of d! permutations is not unique. Consider now a special class of bi-stochastic matrices which are convex combination of orthogonal permutations. These permutations are defined by the property that Equation (103) (105) as m j|P m |k = m δ j,k+m = 0, 1. In this example all the permutations commute with each other. As an example consisting of non-commuting but orthogonal permutations consider the following: with matrix representations One can see that P 1 and P 2 are orthogonal while P 1 and P 3 are not. Also one can see that satisfying the condition (105) while violates (105). Furthermore one can check that in the group S 3 with generators σ 1 = 1 2 3 2 1 3 and σ 2 = 1 2 3 1 3 2 , each of the two sets of even and odd permutations, respectively given by (I, σ 1 σ 2 , σ 2 σ 1 ) and (σ 1 , σ 2 , σ 1 σ 2 σ 1 ) comprise orthogonal permutations. Consider now a bi-stochastic matrix of the form where {P m } are orthogonal. In what follows we will show the quasi-inverse of any such bistochastic matrix is P −1 m 0 where m 0 refers to the index of the greatest coefficient λ m . To see this we note that Using equation (105) for orthogonal permutations, we find where we used the fact that for every stochastic matrix, the sum of its elements is equal to the dimension d. Now, it is straightforward to see T qi = P −1 m 0 recovers the average fidelity of T to this upper bound.
Thus the quasi-inverse of convex combination of orthogonal permutations is the inverse of the single permutation which has the largest share in the convex combination. This is however not the case if the permutations are not orthogonal. This is shown in the next example.
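This conclusion can be checked on a small instance. The sketch below uses the d cyclic shifts as a hypothetical family of mutually orthogonal (and commuting) permutations, forms a convex combination with arbitrary weights of our choosing, and applies the row-maximum construction of theorem 6. The helper `classical_quasi_inverse` is our own name for that construction.

```python
import numpy as np

def classical_quasi_inverse(T):
    """Row-maximum construction of theorem 6 for a column-stochastic matrix."""
    T = np.asarray(T, dtype=float)
    d = T.shape[0]
    selector = np.zeros_like(T)
    selector[np.arange(d), T.argmax(axis=1)] = 1.0
    return selector.T

d = 4
# Cyclic shifts P_m |k> = |k + m mod d>; distinct shifts share no non-zero entry
shifts = [np.roll(np.eye(d), m, axis=0) for m in range(d)]
overlaps = np.array([[np.trace(P @ Q.T) for Q in shifts] for P in shifts])

weights = np.array([0.10, 0.50, 0.15, 0.25])      # convex weights lambda_m
T = sum(lam * P for lam, P in zip(weights, shifts))

m0 = int(weights.argmax())
T_qi = classical_quasi_inverse(T)

print(np.array_equal(T_qi, shifts[m0].T))         # quasi-inverse equals the inverse of P_{m0}
print(np.trace(T) / d, np.trace(T_qi @ T) / d)    # fidelity before and after correction
```

After correction the average fidelity equals the largest weight, while before correction it equals the weight of the identity permutation.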

Example 9 (Convex combination of non-orthogonal permutations).
Consider the following permutations where $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. These permutations are obviously non-orthogonal in the sense that $\mathrm{Tr}(P_i P_j^T) \neq 0$. Consider now the following convex combination where λ 1 + λ 2 + λ 3 = 1 and all λ i < 1/2. In explicit form this bi-stochastic matrix is given by Since λ i < 1/2 for all i, we find λ i + λ j > 1/2 for all i ≠ j (due to the requirement that λ 1 + λ 2 + λ 3 = 1). Thus, according to theorem 6, the quasi-inverse of P is obtained by replacing the largest entry of each row with unity, setting the other entries to zero, and then transposing the resulting matrix. Hence P qi = σ x ⊕ I ⊕ I, which is not equal to the inverse of any of the permutations P 1 , P 2 or P 3 .

Concluding remarks
We have extended the concept of quasi-inversion of qubit channels [1] to quantum channels in arbitrary dimensions and to the classical domain, i.e. Markov processes in discrete time. In both cases, a quasi-inverse is a map which when combined with the original channel increases the average input-output fidelity in an optimal way. While the complete classification of qubit channels [19,20], makes a complete characterization of their quasi-inverse possible, in higher dimensions the lack of such classification makes the problem a highly non-trivial one. The most notable difference is that in the qubit case, the extreme points of all unital channels are the unitary maps while in higher dimensions this is not the case any more and no general theorem is known on extreme points. Therefore in this paper, we have established certain general theorems on the nature of the quasi-inverse, and have provided certain bounds on the average fidelity after quasi-inversion. Applying these general results to some concrete cases, we have found in example 1 through example 5 the quasi-inverse for a large class of quantum channels. Moreover, we have done a parallel analysis for the classical channels, represented by stochastic matrices.
As shown in appendix D, we have also obtained exact expressions for the average input-output fidelity for classical channels in any dimension d and have shown that quasi-inversion increases this quantity from $d^{-1}$ to, approximately, $d^{-1/2}$. In the quantum case, we analyzed numerically in appendix E the improvement of the average fidelity of a random channel after correction with quasi-inversion and after application of the best possible unitary evolution (see figure 10). If the dimension of the system increases, the non-unitarity of the quasi-inversion becomes larger. In other words, for a generic channel acting in higher dimensions, its quasi-inverse is usually non-unitary.
It is worth mentioning that by applying quasi-inversion, we aim to get as close as possible to the identity map in the sense of maximizing the average fidelity, and hence of reducing the average Bures distance. We do not expect the notion of quasi-inversion to improve other properties of channels for which the identity map is not the most distinguished channel. There are other figures of merit, like the average output purity of a channel or the unitarity u(E) of a channel [46], which can also be studied in the same way, leading to a different version of the quasi-inverse, both in the quantum and the classical domain. We are aware of examples which show that these notions of quasi-inverse are different. That is, a quasi-inverse which increases the average fidelity can increase, decrease or keep constant the unitarity of a quantum channel. One can also study the effect of quasi-inversion on the cohering power of quantum channels as defined in [47][48][49]. Finally, an interesting by-product of our study is appendix C, where we have constructed special types of channels with affine maps (M, t), where the distortion matrix M can stretch the Bloch vectors. Such channels are possible only in dimensions higher than two. These are of course different from channels of the form E(ρ) = |ψ⟩⟨ψ|, where the distortion matrix M vanishes and the stretching is due only to the translation vector t.
We hope that this study can be pursued in different directions, i.e. in obtaining more information about the extreme points of the space of quantum channels, and hence the quasi-inverse of larger classes of channels, those channels which do not have unique quasi-inverses, and those which are their own quasi-inverse. And also more importantly in finding connections with the recovery maps [50][51][52][53].
The last line shows that L = T |00⟩⟨00| is an isotropic state obtained by twirling the state |00⟩. In general for any state ρ, its twirling gives [35]: where f ρ = Tr(P + ρ) and P + = |φ + ⟩⟨φ + | is the projector onto the maximally entangled state. So for the state |00⟩⟨00| of a d-dimensional system, one has: and which coincides with equation (22) in the text. One can also write the average fidelity (20) in the form where is a symmetric Werner state This will also lead to the same result as in (23). Finally let us calculate the average fidelity in yet another way, by expressing the pure states as From and equation (126), one finds If we now assume a uniform distribution of pure states on the sphere $S^{d^2-2}$ (on which the pure states are of course not dense) and hence use the relations we arrive at which coincides with the result (23) which we obtained by integrating over the invariant volume dψ of all pure states. Here we have not used any specific measure of volume over the sphere $S^{d^2-2}$ and have only assumed that pure states are distributed symmetrically (albeit in a sparse way) on this sphere.

Appendix C. Quantum channels which stretch the Bloch vector
In two dimensions, to satisfy the positivity condition, the affine matrix M of any positive map, and thus of any quantum channel, fulfills M T M ≤ I. As a result, the Bloch vector corresponding to a qubit state is never stretched when M acts on it. Here we show that this is no longer the case in higher dimensions. This is one of the strange or unexpected properties of higher-dimensional channels which makes the study of their quasi-inverses, among other things, difficult.
As describing this class of quantum maps is not straightforward we construct an example of such a channel in this appendix.
Let H d be a d-dimensional Hilbert space and let P 1 , P 2 , Q 1 and Q 2 be orthogonal projectors of rank d 1 , d 2 , m 1 and m 2 respectively, with (d = d 1 + d 2 = m 1 + m 2 ), i.e.
The following operators are Hermitian and traceless Consider now the following measure-and-prepare CPT map By expanding the projectors as |i i|, a set of Kraus operators for this map is given by and Note that hence the channel is non-unital, unless m 1 = d 1 and m 2 = d 2 .
From the above relations and the fact that P 1 + P 2 = Q 1 + Q 2 = I, we find and Consider now a state where x represents the Bloch vector of this state, since x = Tr(ρA). We will then find where Straightforward calculation completes the reasoning. Hence we find that if m 1 m 2 > d 1 d 2 , then the Bloch vector can be stretched. It is interesting to note that the inhomogeneous translation does not compensate the stretching of x, rather it enhances it. Obviously, this is not possible in dimensions d = 2, 3, but it is possible in dimensions d ≥ 4, where for example we can take (m 1 , m 2 ) = (2, 2) and (d 1 , d 2 ) = (1, 3). Moreover we see that if the channel is unital,
To see how the existence of these channels may affect quasi-inversion, let us denote by M s the subset of quantum channels whose distortion matrices M s can only shrink the Bloch vector, i.e. M s = {E s | M T s M s ≤ I}. Let us highlight two remarks related to the set M s . First, according to the Russo-Dye theorem any linear positive map attains its norm at the identity [54]. Thus, the singular values of any unital map are less than or equal to one, which implies that all unital maps belong to M s . Moreover, for any M s included in the set M s it is always possible to find unitary operators U 1 and U 2 such that M s = (U 1 + U 2 )/2. For now, let us assume that the quasi-inverse of a quantum channel E belongs to M s . In that case, one has (29) This inequality imposes an upper bound on the corrected fidelity whenever the quasi-inverse lies in the set M s . An example is a qubit channel, where not only is the quasi-inverse a unital and unitary map in M s , but also the entire set of qubit channels is contained in M s . However, in higher dimensions M s is a nontrivial subset of quantum channels, as our example above shows. The question is then whether the quasi-inverse of any quantum channel belongs to the set M s .
In what follows, we show with an example it is possible to exceed the bound in (144), which implies a negative answer to the question. Moreover, as the bound (144) is valid when quasi-inversion is a unital map, one infers in higher dimensions, against qubit channel, quasi-inverse is not necessarily unital. Before presenting the example, we mention explicitly taking (131), along with the identity and d 2 − 3 other Hermitian and traceless operators orthogonal to A and B as the set of basis, the distortion matrix M and the translation vector t of the channel E in the beginning of this section are given by their entries as Now assume λ > 0 is sufficiently small so the d-dimensional map E described by the following affine parameters is a quantum channel Note that Tr(MM ) = m 1 m 2 d 1 d 2 λ. This amount is certainly larger than Tr|M | = λ if m 1 m 2 > d 1 d 2 , i.e. if the affine matrix M can stretch the Bloch vector and E is not contained in M s . So for the channel E specified in equation (146) we have through equation (29) For an even d one has max m 1 m 2 for an odd d. In any case, assuming d > 3 we get

Appendix D. Statistical properties of classical channels and their quasi-inverses
In this section we discuss some of the properties of classical channels and their quasi-inverses. A random classical channel is a stochastic matrix $T$ whose columns are picked independently from the uniform ensemble over the probability simplex $\Delta_{d-1} = \{p \in \mathbb{R}^d : p_i \geq 0,\ \sum_{i=1}^{d} p_i = 1\}$. Let the probability density function (PDF) of any component $p_i$ be denoted by $f_d(x)$, that is, $P[x \leq p_i \leq x + dx] = f_d(x)\,dx$ for all $i$. As figure 8 illustrates for the case $d = 3$, this PDF is proportional to the area of the narrow slab on the triangle which defines the probability simplex. For the general case, this function is given by $f_d(x) = (d-1)(1-x)^{d-2}$. It is also useful to have an expression for the corresponding cumulative distribution function (CDF), $\int_0^x f_d(y)\,dy = 1 - (1-x)^{d-1}$.

The average fidelity for any classical channel $T$ is given by $F(T) \equiv \frac{1}{d}\mathrm{Tr}\,T$. If we now take the average over the uniform ensemble of all stochastic matrices, we find $\langle F(T) \rangle = \frac{1}{d}\sum_{i=1}^{d} \int_0^1 x f_d(x)\,dx = \frac{1}{d}$, where we have used the statistical independence of the columns of $T$ in the uniform ensemble. Moreover, we can also find the variance of this average fidelity, which turns out to be $\frac{d-1}{d^3(d+1)}$.
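The ensemble statistics described above can be checked by direct sampling. The sketch below is only illustrative: it assumes that "uniform over the simplex" means the flat Dirichlet distribution, and the function names are hypothetical. The reference values $1/d$ and $(d-1)/(d^3(d+1))$ printed alongside the estimates follow from the standard moments of the flat Dirichlet distribution together with the independence of the columns.

```python
import numpy as np

def random_stochastic(d, rng):
    # Columns drawn independently and uniformly from the probability simplex
    # (flat Dirichlet distribution), so every column sums to one.
    return rng.dirichlet(np.ones(d), size=d).T

def fidelity(T):
    # Average fidelity of a classical channel, F(T) = Tr(T) / d.
    return np.trace(T) / T.shape[0]

def sample_fidelities(d, n_samples=20000, seed=0):
    rng = np.random.default_rng(seed)
    return np.array([fidelity(random_stochastic(d, rng)) for _ in range(n_samples)])

if __name__ == "__main__":
    for d in (2, 3, 5):
        f = sample_fidelities(d)
        # Columns: d, sample mean, 1/d, sample variance, (d-1)/(d^3 (d+1))
        print(d, f.mean(), 1 / d, f.var(), (d - 1) / (d**3 * (d + 1)))
```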
The interesting question is how much this average fidelity increases for classical channels when we apply the quasi-inversion. To answer it, we note that the average fidelity after quasi-inversion is $\frac{1}{d}\sum_{i} \max_{j} T_{ij}$, that is, the average of the maximum element in each row of $T$. Assuming that the largest elements of two different rows do not occur in the same column (which can happen only for a subset of measure zero in the ensemble), the ensemble average of the improved average fidelity becomes $\langle p_m \rangle$, where $p_m$ is the largest element in a single probability vector chosen uniformly. To find the average of this quantity for the uniform ensemble, we again invoke the proposition that the maximum values of two different rows may occur in the same column only on a set of measure zero. We therefore first find the CDF, i.e. the probability that the maximum values in all columns are less than $x$, and from it the PDF, i.e. the probability that the maximum value lies between $x$ and $x + dx$. Averaging $x$ over this PDF yields the result (156).

Figure 9 shows the average fidelity after quasi-inversion versus dimension. The curve fits the equation $c\,d^{x}$ with $c \approx 1.055$ and the exponent $x \approx -0.5042$. Therefore, the average fidelity of a classical channel appears to increase due to quasi-inversion from $1/d$ to approximately $1/\sqrt{d}$.
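The quantity $p_m$, the largest element of a single uniformly chosen probability vector, is easy to estimate by sampling. The sketch below is a Monte Carlo illustration under the assumption that the uniform ensemble is the flat Dirichlet distribution. The function names are hypothetical, and the power-law form `c * d**x` used for comparison is an assumption about the fitted curve. Only the parameters $c$ and $x$ are taken from the text.

```python
import numpy as np

def mean_largest_component(d, n_samples=50000, seed=0):
    rng = np.random.default_rng(seed)
    # Each row is one probability vector drawn uniformly from the simplex.
    p = rng.dirichlet(np.ones(d), size=n_samples)
    # Average of the largest component p_m over the sample.
    return p.max(axis=1).mean()

def fitted_curve(d, c=1.055, x=-0.5042):
    # Fit parameters quoted in the text; the form c * d**x is assumed here.
    return c * d**x

if __name__ == "__main__":
    for d in (2, 3, 4, 5, 6):
        # Columns: d, Monte Carlo estimate of <p_m>, fitted value, uncorrected 1/d
        print(d, mean_largest_component(d), fitted_curve(d), 1 / d)
```

The range of dimensions over which the fit in figure 9 was made is not stated here, so the comparison is most meaningful for small $d$.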

Appendix E. A note on statistical properties of quantum channels and their quasi-inverses
Investigating the same problem in the set of quantum channels seems a highly difficult task, since in the quantum case the quasi-inverse is not generally known. Nonetheless, numerical analysis is possible in low dimensions, and it shows an improvement in fidelity for a typical channel. To see this, note that for random quantum channels distributed uniformly in the set of quantum channels [44] the average superoperator satisfies $\langle \Phi_E \rangle = \Phi_*$, where $\Phi_*$ is the maximally depolarizing channel, see section 2. This implies $\mathrm{Tr}\,\langle \Phi_E \rangle = 1$. Thus, the average input-output fidelity over random channels distributed uniformly in dimension $d$ is given by (23).

To obtain the fidelity after correction, averaged over the set of uniformly distributed quantum channels, we searched for the quasi-inverse numerically. Our method is based on the observation that complete positivity can be expressed as a family of linear constraints. To approximate the quasi-inverse of a channel numerically, we first approximate the set of CP operators by including only a finite subset of such constraints, obtained from random entangled states in $H_d \otimes H_d$. The quasi-inverse is then calculated as the solution of a finite linear programming problem using the simplex method or other efficient algorithms. The results for dimensions $d = 2, 3, 4, 5$ are presented in figure 10, and they confirm an improvement in the average fidelity after correction for random channels.
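The link between $\mathrm{Tr}\,\Phi = 1$ and the average input-output fidelity can be illustrated with a short computation. The sketch below assumes the standard relation $F = (d + \mathrm{Tr}\,\Phi)/(d(d+1))$ with $\mathrm{Tr}\,\Phi = \sum_\alpha |\mathrm{Tr}\,K_\alpha|^2$, which may differ in form from equation (23). The function names and the particular Kraus representation of the maximally depolarizing channel are illustrative choices.

```python
import numpy as np

def average_fidelity(kraus):
    """Average fidelity over pure input states, from a list of Kraus operators."""
    d = kraus[0].shape[0]
    # Trace of the superoperator, assuming Phi = sum_a K_a (x) conj(K_a)
    tr_phi = sum(abs(np.trace(K)) ** 2 for K in kraus)
    return (d + tr_phi) / (d * (d + 1))

def depolarizing_kraus(d):
    """Kraus operators |i><j| / sqrt(d) of the map rho -> Tr(rho) I / d."""
    ops = []
    for i in range(d):
        for j in range(d):
            K = np.zeros((d, d), dtype=complex)
            K[i, j] = 1 / np.sqrt(d)
            ops.append(K)
    return ops

if __name__ == "__main__":
    for d in (2, 3, 4, 5):
        # Columns: d, fidelity of the maximally depolarizing channel, 1/d
        print(d, average_fidelity(depolarizing_kraus(d)), 1 / d)
```

Under this relation the maximally depolarizing channel has average fidelity $1/d$, which is the baseline against which the corrected fidelity in figure 10 can be compared.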
One may also think about applying a unitary evolution to correct the input-output fidelity. However, our numerical results show that the best unitary quantum channel, denoted by $U_E$, does not achieve the optimal improvement in fidelity, see figure 10 for a comparison. Indeed, we can take the purity of the Jamiołkowski state as a signature of the unitarity of a given channel $E$: this quantity is equal to 1 if and only if the channel is unitary, and any deviation from 1 indicates non-unitarity. With this measure, the unitarity of the quasi-inverse of random channels, averaged over the set of uniformly distributed CPT maps, is equal to 0.99 ± 0.01 for $d = 3$, 0.68 ± 0.05 for $d = 4$, and 0.52 ± 0.05 for $d = 5$. Counter-intuitively, this shows that in higher dimensions the quasi-inverse of a typical channel is far from unitary. However, if we measure unitality by $1 - |t|^2$ ($t$ is the translation vector, see (9)) averaged over the set of random channels, we find for $d = 3, 4, 5$ the values 0.98 ± 0.01, 0.97 ± 0.01, and 0.97 ± 0.01, respectively, which confirms that the quasi-inverse of a typical channel is close to unital.
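The unitarity signature described above can be computed directly from a Kraus representation. The following is a minimal sketch with hypothetical function names. It assumes the Jamiołkowski state is $J = \frac{1}{d}\sum_\alpha |K_\alpha\rangle\rangle\langle\langle K_\alpha|$, so that its purity is $\frac{1}{d^2}\sum_{\alpha\beta} |\mathrm{Tr}(K_\alpha^\dagger K_\beta)|^2$.

```python
import numpy as np

def jamiolkowski_purity(kraus):
    """Purity Tr(J^2) of the Jamiolkowski state of a channel given by Kraus operators."""
    d = kraus[0].shape[0]
    vecs = np.array([K.reshape(-1) for K in kraus])  # one vectorized K per row
    gram = vecs.conj() @ vecs.T                      # gram[a, b] = Tr(K_a^dagger K_b)
    return float(np.sum(np.abs(gram) ** 2) / d**2)

def random_unitary(d, rng):
    # QR decomposition of a complex Gaussian matrix gives a unitary factor
    # (not necessarily Haar-distributed without a phase correction).
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 3
    U, V = random_unitary(d, rng), random_unitary(d, rng)
    print(jamiolkowski_purity([U]))                            # unitary channel
    print(jamiolkowski_purity([U / np.sqrt(2), V / np.sqrt(2)]))  # mixture of two unitaries
```

A single unitary Kraus operator gives purity 1, while any genuine mixture lowers it, which is the sense in which the reported values 0.68 and 0.52 indicate non-unitary quasi-inverses.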